RCGraph - A Tool to Integrate Readme and Commits through Temporal Knowledge Graphs


Tool Description

RCGraph constructs a Temporal Knowledge Graph for the Readme file of a GitHub repository. A project's Readme file is an important source of information about the project, such as the dependencies involved and the methodology followed. Assessing changes in Readme files can provide insight into the evolution of the project and can consequently help in modifying the underlying environment or dependencies to work with the project. Linking the Readme content with the timestamps of the changes made to it, and then querying the linked data, makes these changes easier to assess. A temporal knowledge graph specific to the Readme file thus links the Readme with its temporal changes and eases querying of the Readme.

Installation

Source code is available at rishalab/RCGraph

Create and activate a new python3 virtual environment:

python(3) -m venv <path_to_env/env_name>

Clone or download this GitHub repository:

git clone https://github.com/rishalab/github-kg.git

Get into the main directory:

cd github-kg

Download standard-core-nlp-4.3.0 from this Drive Link. Place this folder in the root directory of this tool.

Install the requirements:

pip install -r requirements.txt

Generate Knowledge Graph:

python(3) main.py "<username/reponame>"

Approach diagram

RCGraph constructs a Temporal Knowledge Graph for the Readme file of a GitHub repository. It first constructs a Knowledge Graph from the current Readme file of the given repository. Then, an individual Knowledge Graph is constructed from the changes made by each commit. These individual Knowledge Graphs are combined to obtain the commit-based Knowledge Graph.

To construct a Temporal Knowledge Graph tuple, each tuple in the Readme KG is mapped to the equivalent tuple in the Commits KG. Using this mapping, the corresponding commit timestamp and commit SHA are extracted and embedded into the Readme KG tuple to produce the resulting Temporal KG tuple.
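The mapping step can be illustrated with a minimal Python sketch. It is not RCGraph's actual implementation: the type names, the `normalize` equivalence rule (case and whitespace are ignored), the choice of the earliest matching commit, and the SHA values are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    entity1: str
    relation: str
    entity2: str


class CommitTriple(NamedTuple):
    triple: Triple
    sha: str
    timestamp: str  # "YYYY-MM-DD HH:MM:SS", so string order equals time order


class TemporalTriple(NamedTuple):
    triple: Triple
    sha: Optional[str]
    timestamp: Optional[str]


def normalize(triple: Triple) -> Triple:
    """Assumed equivalence rule: ignore case and extra whitespace."""
    return Triple(*(" ".join(part.split()).lower() for part in triple))


def build_temporal_kg(
    readme_kg: List[Triple], commits_kg: List[CommitTriple]
) -> List[TemporalTriple]:
    # Index commit tuples by normalized form. When several commits carry an
    # equivalent tuple, keep the earliest one (an assumption of this sketch).
    index = {}
    for ct in commits_kg:
        key = normalize(ct.triple)
        if key not in index or ct.timestamp < index[key].timestamp:
            index[key] = ct

    temporal = []
    for triple in readme_kg:
        match = index.get(normalize(triple))
        # Readme tuples without an equivalent commit tuple get no timestamp.
        temporal.append(
            TemporalTriple(
                triple,
                match.sha if match else None,
                match.timestamp if match else None,
            )
        )
    return temporal


# Hypothetical data: the SHAs and the second timestamp are made up.
readme_kg = [Triple("PowerToys", "allows", "various commands in PowerToys Run")]
commits_kg = [
    CommitTriple(readme_kg[0], "def456", "2021-10-02 09:00:00"),
    CommitTriple(readme_kg[0], "abc123", "2021-09-21 17:07:55"),
]
print(build_temporal_kg(readme_kg, commits_kg)[0].timestamp)  # 2021-09-21 17:07:55
```

Indexing the Commits KG once keeps the lookup for each Readme tuple constant-time; a different equivalence rule can be swapped in by changing `normalize` alone.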

Use Case

[Figure: use case example, panels A and B]
Here, we present an example use case for RCGraph. The inputs to the tool are the repository whose Readme we wish to turn into a temporal knowledge graph and the query of interest. In this use case, we consider the Microsoft/PowerToys repository. Based on the Readme, we wish to obtain the timestamp from which PowerToys started allowing various commands, as highlighted in Figure A above. The query for this requirement is: 'select timestamp where entity1 = "PowerToys" and relation = "allows" and entity2 = " various commands in PowerToys Run" '. With this input, the tool generates a Readme-specific knowledge graph and a commit-specific knowledge graph as intermediate outputs, which are further processed to generate the temporal knowledge graph for the Readme file. A visual representation of the knowledge graph generated for the query of interest is shown in Figure B. The parameters 'entity1', 'relation' and 'entity2' are matched against the tuple, and the corresponding timestamp is extracted from the temporal knowledge graph. Thus, the query on the temporal knowledge graph returns the timestamp '2021-09-21 17:07:55'.
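The matching described in this use case can be sketched in a few lines of Python. This is an illustrative reading of the query format shown above, not the parser RCGraph ships: the function names, the dictionary layout of a temporal tuple, the SHA values, and the decision to strip whitespace around quoted values are all assumptions.

```python
import re


def parse_query(query):
    """Split 'select <field> where <name> = "<value>" and ...' into its parts."""
    m = re.match(r"\s*select\s+(\w+)\s+where\s+(.*)", query, flags=re.IGNORECASE | re.DOTALL)
    if m is None:
        raise ValueError('expected: select <field> where <name> = "<value>" ...')
    field, where = m.groups()
    # Assumption: surrounding whitespace inside the quotes is not significant.
    conditions = {
        name: value.strip()
        for name, value in re.findall(r'(\w+)\s*=\s*"([^"]*)"', where)
    }
    return field, conditions


def run_query(temporal_kg, query):
    """Return the selected field of every tuple that satisfies all conditions."""
    field, conditions = parse_query(query)
    return [
        row[field]
        for row in temporal_kg
        if all(row.get(name) == value for name, value in conditions.items())
    ]


# Hypothetical temporal KG: the SHAs and the second row are made up.
temporal_kg = [
    {
        "entity1": "PowerToys",
        "relation": "allows",
        "entity2": "various commands in PowerToys Run",
        "timestamp": "2021-09-21 17:07:55",
        "sha": "abc123",
    },
    {
        "entity1": "PowerToys",
        "relation": "includes",
        "entity2": "a hypothetical module",
        "timestamp": "2020-01-01 00:00:00",
        "sha": "def456",
    },
]

query = (
    'select timestamp where entity1 = "PowerToys" and relation = "allows" '
    'and entity2 = " various commands in PowerToys Run" '
)
print(run_query(temporal_kg, query))  # ['2021-09-21 17:07:55']
```

Because the selected field is taken from the query, the same sketch would return the commit SHA for a query that begins with 'select sha', provided the tuples store it under that key.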

Results

We have run the KG generator on 10 repositories. For each repository, we generated a KG of the final Readme and a KG of the commit history.

Demo Video

Contributors

Akhila Sri Manasa Venigalla - Mir Sameed Ali - Nikhil M - Sridhar Chimalakonda

Preprint