kkteru / grail

Inductive relation prediction by subgraph reasoning, ICML'20

Python 93.75% Shell 6.25%
grail's Introduction

GraIL - Graph Inductive Learning

This is the code necessary to run experiments with the GraIL algorithm described in the ICML'20 paper Inductive relation prediction by subgraph reasoning.

Requirements

All the required packages can be installed by running pip install -r requirements.txt.

Inductive relation prediction experiments

All train-graph and ind-test-graph pairs can be found in the data folder. We use WN18RR_v1 as a running example for illustrating the steps.

GraIL

To start training a GraIL model, run the following command.

python train.py -d WN18RR_v1 -e grail_wn_v1

To test GraIL run the following commands.

  • python test_auc.py -d WN18RR_v1_ind -e grail_wn_v1
  • python test_ranking.py -d WN18RR_v1_ind -e grail_wn_v1

The trained model and the logs are stored in the experiments folder. Note that to ensure a fair comparison, we test all models on the same negative triplets. To do that in the current setup, we store the sampled negative triplets while evaluating GraIL and later use the same triplets to evaluate the baseline models.
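The shared-negatives idea can be sketched as follows. This is a stdlib illustration, not the repository's actual sampling code or serialization format; `sample_negatives` is a hypothetical helper, and tail-only corruption with a fixed seed is an assumption.

```python
import pickle
import random

def sample_negatives(triples, entities, n_per_pos, seed=0):
    """Corrupt the tail of each positive triple with random entities."""
    rng = random.Random(seed)  # fixed seed -> reproducible negatives
    negatives = []
    for h, r, t in triples:
        corrupted = []
        while len(corrupted) < n_per_pos:
            t_neg = rng.choice(entities)
            if t_neg != t:  # never re-create the positive triple
                corrupted.append((h, r, t_neg))
        negatives.append(corrupted)
    return negatives

pos = [('e1', 'rel', 'e2')]
neg = sample_negatives(pos, ['e1', 'e2', 'e3', 'e4'], n_per_pos=2)

# Serialize once; in practice the blob would be written to disk alongside the
# experiment logs, then loaded when evaluating each baseline model.
blob = pickle.dumps(neg)
assert pickle.loads(blob) == neg
```

Because every baseline scores exactly the same stored negatives, AUC and ranking numbers stay comparable across models.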

RuleN

RuleN operates in two steps. Rules are first learned from a training graph and then applied on the test graph. Detailed instructions can be found here.

  • Learn rules: source learn_rules.sh WN18RR_v1
  • Apply rules:
    • To get AUC: source auc_apply_rules.sh WN18RR_v1 WN18RR_v1_ind num_of_samples_to_score(=1000)
    • To get ranking score: source auc_apply_rules.sh WN18RR_v1 WN18RR_v1_ind num_of_samples_to_score(=1000)

Neural-LP and DRUM

We use the implementations provided by the authors of the respective papers to evaluate these models.

Transductive experiments

The full transductive datasets used in these experiments are present in the data folder.

GraIL

The training and testing protocols for GraIL remain the same.

KGE models

We use the comprehensive implementation provided by the authors of RotatE. This gives state-of-the-art results on all datasets. The best configurations can be found here. To train these KGE models, navigate to the kge folder and run the commands as shown in the above reference. For example, to train TransE on FB15k-237, run the following command.

bash run.sh train TransE FB15k-237 0 0 1024 256 1000 9.0 1.0 0.00005 100000 16

This will store the trained model and the logs in a folder named experiments/kge_baselines/TransE_FB15k-237.

Ensembling instructions

Once the KGE models are trained, to get ensembling results with GraIL, navigate to the ensembling folder and run the following command.

source get_ensemble_predictions.sh WN18RR TransE

To get an ensemble of different KGE models, run the following command from the ensembling folder.

source get_kge_predictions.sh WN18RR TransE ComplEx
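Conceptually, such ensembling combines per-triplet scores from two models. One common recipe (a sketch of the general idea, not necessarily what the repository's scripts compute) is to min-max normalize each model's scores so they share a scale, then average them:

```python
def minmax(scores):
    """Rescale a list of scores to the [0, 1] range."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]

def ensemble(scores_a, scores_b):
    """Average min-max-normalized scores from two models, triplet by triplet."""
    return [(a + b) / 2 for a, b in zip(minmax(scores_a), minmax(scores_b))]

# Hypothetical per-triplet scores from GraIL and TransE on three test triplets.
grail_scores = [0.9, 0.1, 0.5]
transe_scores = [12.0, 2.0, 4.0]
combined = ensemble(grail_scores, transe_scores)
```

Normalization matters here because GraIL's subgraph-classifier scores and a KGE model's distance-based scores live on very different scales; averaging raw values would let one model dominate.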

If you make use of this code or the GraIL algorithm in your work, please cite the following paper:

@article{Teru2020InductiveRP,
  title={Inductive Relation Prediction by Subgraph Reasoning.},
  author={Komal K. Teru and Etienne Denis and William L. Hamilton},
  journal={arXiv: Learning},
  year={2020}
}

grail's People

Contributors: kkteru
grail's Issues

accuracy reproduction

Hi, I'm very interested in your work and I'm quite new to knowledge graphs. I was reproducing the results in the paper with the default code and the datasets provided. The WN18RR datasets seem to work well with the given command line, and the results are always a bit higher than the paper's.

But when I use nell_v1 for training and nell_v1_ind for testing, the Hits@10 and auc_pr are much lower than the paper's results. I want to make sure this is the right way to run the code for this dataset (the given command line with only the dataset name replaced). Should I tune other parameters? If so, could you give me a hint about which parameters influence the performance the most?

Thanks for your kind response!

(Screenshots of the reproduced results, taken 2022-11-21, were attached.)

Is the data in the paper inconsistent with the data provided by the repository?

I computed statistics over all the dataset versions provided in the repository and found some inconsistencies with the numbers in Table 13 of the paper. Can you give a brief explanation?

(Tables comparing the paper's Table 13 with my statistics were attached; the inconsistent entries were marked in red and bold.)

Looking forward to your reply!

The statistics code I used is:

    # Count the distinct relations, distinct entities, and total triples
    # across the train/valid/test splits of one dataset directory.
    root_path = 'data/WN18RR_v2_ind'
    file_list = [root_path + '/train.txt', root_path + '/valid.txt', root_path + '/test.txt']
    relation_set = set()
    entity_set = set()
    count = 0
    for file_path in file_list:
        with open(file_path) as f:
            triplets = [line.split() for line in f.read().split('\n')[:-1]]
            count += len(triplets)
            for head, relation, tail in triplets:
                entity_set.update((head, tail))
                relation_set.add(relation)
    print(root_path[root_path.rfind('/') + 1:])
    print(len(relation_set))
    print(len(entity_set))
    print(count)

Do you mind sharing the code used to generate the dataset splits?

Nice work! Thank you very much for the open-source code. In the original paper, the FB15k-237, NELL-995 and WN18RR datasets are divided into versions v1 to v4. For the needs of my own paper, I want to split the FB15k dataset similarly, generating four versions such as FB15k_v1, FB15k_v1_ind, etc. If it is convenient, would you consider open-sourcing the code that generates the different dataset versions? Looking forward to your reply.
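The paper's Appendix F describes generating inductive splits with disjoint entity sets; the authors' actual procedure is not in this repository, but the core idea can be sketched as follows (a simplified illustration; `inductive_split`, the BFS growth strategy, and the size cap are all assumptions, not the paper's exact algorithm):

```python
from collections import deque

def inductive_split(triples, train_seed_entity, max_train_entities):
    """Grow a train entity set by BFS from a seed entity, then build the
    ind-test graph only from triples among the remaining entities, so the
    two graphs share no entities."""
    adj = {}
    for h, r, t in triples:  # undirected adjacency over entities
        adj.setdefault(h, set()).add(t)
        adj.setdefault(t, set()).add(h)
    train_entities, frontier = {train_seed_entity}, deque([train_seed_entity])
    while frontier and len(train_entities) < max_train_entities:
        for nbr in adj.get(frontier.popleft(), ()):
            if nbr not in train_entities and len(train_entities) < max_train_entities:
                train_entities.add(nbr)
                frontier.append(nbr)
    train = [tr for tr in triples
             if tr[0] in train_entities and tr[2] in train_entities]
    ind_test = [tr for tr in triples
                if tr[0] not in train_entities and tr[2] not in train_entities]
    return train, ind_test

# Two disconnected components: a-b-c goes to train, x-y-z to ind-test.
triples = [('a', 'r', 'b'), ('b', 'r', 'c'), ('x', 'r', 'y'), ('y', 'r', 'z')]
train, ind_test = inductive_split(triples, 'a', 3)
```

Triples with one endpoint in each entity set are dropped entirely, which is what keeps the train and ind-test graphs fully disjoint.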

Can you explain why A_incidence += A_incidence.T?

Dear author,
May I ask a basic question: why is incidence_matrix(A_list) turned into an undirected graph in the function subgraph_extraction_labeling in grail/subgraph_extraction/graph_sampler.py, by adding A_incidence.T to A_incidence?
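For intuition (a stdlib sketch of the general idea, not the repository's code): when extracting the k-hop subgraph around a head and tail, edge direction is typically ignored so that neighbors reachable via incoming edges are included too. Adding the transpose symmetrizes the adjacency structure, which is the matrix form of the following:

```python
from collections import deque

def k_hop_neighbors(edges, start, k):
    """Nodes within k undirected hops of `start`, given directed (u, v) edges."""
    # Symmetrize: treat every directed edge as if it ran both ways.
    # This mirrors the effect of A_incidence += A_incidence.T.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == k:
            continue
        for nbr in adj.get(node, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, dist + 1))
    return seen

# With only directed edges a->b->c, node c still reaches a in 2 undirected hops.
assert k_hop_neighbors([('a', 'b'), ('b', 'c')], 'c', 2) == {'a', 'b', 'c'}
```

Without symmetrization, the BFS from `c` would see no neighbors at all, and the extracted subgraph around a target link could miss most of its relevant context.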

Inductive Datasets

Hi, in the paper it says the following, but we observed several overlapping entities (e.g., /m/080knyg in fb237_v1_ind). Are we maybe missing steps in the data processing, or could you please detail what you mean by inductive setting?

"F. Inductive Graph Generation
The inductive train and test graphs examined in this paper do not have overlapping entities."

Thank you already!
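A quick way to check the claim yourself is to compute the entity overlap between the two graphs. This is a stdlib sketch; the file paths in the comment are illustrative and the toy triples stand in for the real data files.

```python
def entities_of(triples):
    """Set of all head and tail entities in a list of (h, r, t) triples."""
    ents = set()
    for h, _, t in triples:
        ents.update((h, t))
    return ents

# In practice the two lists would be read from files such as
# data/fb237_v1/train.txt and data/fb237_v1_ind/train.txt;
# here we use inline toy triples with one deliberately shared entity.
train_graph = [('e1', 'r1', 'e2'), ('e2', 'r2', 'e3')]
ind_test_graph = [('e3', 'r1', 'e4'), ('e5', 'r2', 'e6')]

overlap = entities_of(train_graph) & entities_of(ind_test_graph)
# A non-empty overlap contradicts a fully disjoint inductive split.
```

Running this on the actual train-graph and ind-test-graph files makes the question concrete: either the intersection is empty, or the specific offending entities (like the /m/080knyg mentioned above) are listed.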

return_array=False

There is an error: AssertionError: For return_array=False, there should be one and only one edge between u and v, but get 0 edges. Please use return_array=True instead

runtime error of _prepare_subgraphs for negative data

Do you encounter the following problem when running the code?

subgraphs_neg.append(self._prepare_subgraphs(nodes_neg, r_label_neg, n_labels_neg))

Exception has occurred: AssertionError
For return_array=False, there should be one and only one edge between u and v, but get 0 edges. Please use return_array=True instead

How are the embeddings stored/how can one access them?

I really appreciate this work, but I am having trouble understanding how I can access the embeddings produced by the algorithm. Are they in the mdb files? I have been trying to read data out of them, and can read the key/value pairs, but the values seem to be binary encoded, and I am not sure where they came from. Any insight is appreciated! Thanks again!
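The .mdb files are LMDB databases, and their values are serialized Python objects rather than embeddings; a GNN like GraIL typically computes node representations on the fly from the trained model checkpoint in the experiments folder rather than storing one vector per entity. The following is a sketch of the generic decode step only; that each LMDB value is a pickled object is an assumption here, and GraIL's real on-disk format may instead be struct-packed arrays.

```python
import pickle

# Hypothetical record shape -- stand-in for whatever object was written
# into an LMDB value (GraIL stores serialized subgraph data, not embeddings).
record = {'nodes': [0, 3, 7], 'r_label': 2}

# What you would read out of an .mdb value: opaque bytes.
value_bytes = pickle.dumps(record)

# First thing to try on an unknown binary value; if this raises, the data
# was likely packed with struct/numpy and needs the matching dtype instead.
decoded = pickle.loads(value_bytes)
assert decoded == record
```

With the py-lmdb package, the bytes themselves come from iterating a read transaction's cursor over the database; the decoding step above is then applied per value.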
