RMPI

Code and Data for the submission: "Relational Message Passing for Fully Inductive Knowledge Graph Completion".

In this work, we propose RMPI, a method that uses a novel Relational Message Passing network for fully Inductive knowledge graph completion (KGC), where the KG is completed with unseen entities and unseen relations that newly emerge during testing.
RMPI passes messages directly between relations to make full use of relation patterns for subgraph reasoning. It introduces new techniques for graph transformation, graph pruning, relation-aware neighbourhood attention and handling empty subgraphs, among others, and it can utilize the relation semantics defined in the KG's ontological schema.
Extensive evaluation on multiple benchmarks shows the effectiveness of RMPI's techniques and its better performance compared with state-of-the-art methods that support fully inductive KGC as well as traditional partially inductive KGC.

Requirements

The model is developed using PyTorch with environment requirements provided in requirements.txt.

Dataset Illustrations

Each benchmark consists of a training graph and a testing graph.

In partially inductive KGC, the training graph is denoted as "XXX_vi" and the testing graph as "XXX_vi_ind", where "XXX" identifies the KG (WN18RR, FB15k-237 or NELL-995) and "i" is the version index.
In fully inductive KGC, the training graph is denoted as "XXX_vi", and the testing graph is denoted as "XXX_vi_ind_vj_semi" for testing with semi unseen relations or "XXX_vi_ind_vj_fully" for testing with fully unseen relations, where "j" indicates which version of the partially inductive benchmark the testing graph comes from.

For example, for the dataset NELL-995.v2.v3 in the paper, "nell_v2" is the training graph, while "nell_v2_ind_v3_semi" is the testing graph in the setting that tests with semi unseen relations.
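The naming scheme above can be sketched as a small helper. This is only an illustration: the function name `benchmark_dirs` and its parameters are hypothetical and not part of the repository, and only the "nell" prefix is confirmed by the example above, so prefixes for the other KGs should be checked against the data folder.

```python
def benchmark_dirs(prefix, i, j=None, setting=None):
    """Return (training_graph, testing_graph) names for one benchmark.

    prefix  -- KG prefix as used in the folder names, e.g. "nell"
    i       -- version index of the training graph
    j       -- version index of the testing graph (fully inductive only)
    setting -- "semi" or "fully" (fully inductive only)
    """
    train = f"{prefix}_v{i}"
    if j is None:
        # Partially inductive KGC: XXX_vi / XXX_vi_ind
        return train, f"{train}_ind"
    if setting not in ("semi", "fully"):
        raise ValueError("setting must be 'semi' or 'fully'")
    # Fully inductive KGC: XXX_vi / XXX_vi_ind_vj_semi or XXX_vi_ind_vj_fully
    return train, f"{train}_ind_v{j}_{setting}"


# NELL-995.v2.v3 with semi unseen relations
print(benchmark_dirs("nell", 2, 3, "semi"))
# NELL-995.v2 in the partially inductive setting
print(benchmark_dirs("nell", 2))
```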

Model Illustrations

We provide the code for RMPI and for the TACT baseline. The versions augmented with ontological schemas are contained in the folders RMPI_S and TACT_Base_S, respectively.

Basic Training and Testing of RMPI and its variants

To train the model (taking NELL-995.v2 as an example):

# for RMPI-base
python RMPI/train.py -d nell_v2 -e nell_v2_RMPI_base --ablation 0
# for RMPI-NE with summation-based fusion function
python RMPI/train.py -d nell_v2 -e nell_v2_RMPI_NE --ablation 1
# for RMPI-NE with concatenation-based fusion function
python RMPI/train.py -d nell_v2 -e nell_v2_RMPI_NE_conc --ablation 1 --conc
# for RMPI-TA
python RMPI/train.py -d nell_v2 -e nell_v2_RMPI_TA --ablation 0 --target2nei_atten
# for RMPI-NE-TA with summation-based fusion function
python RMPI/train.py -d nell_v2 -e nell_v2_RMPI_NE_TA --ablation 1 --target2nei_atten
# for RMPI-NE-TA with concatenation-based fusion function
python RMPI/train.py -d nell_v2 -e nell_v2_RMPI_NE_conc_TA --ablation 1 --conc --target2nei_atten

-e sets the name under which the trained model is saved; it varies with the variant being trained.
-d sets the target benchmark.
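To train all variants in one go, the flag combinations listed above can be collected in a table. The sketch below only builds the command lines; the names `VARIANT_FLAGS` and `train_command` are hypothetical helpers, while the script path and flags are copied from the commands above.

```python
# Flag combinations for each RMPI variant, copied from the training commands.
VARIANT_FLAGS = {
    "RMPI_base": ["--ablation", "0"],
    "RMPI_NE": ["--ablation", "1"],
    "RMPI_NE_conc": ["--ablation", "1", "--conc"],
    "RMPI_TA": ["--ablation", "0", "--target2nei_atten"],
    "RMPI_NE_TA": ["--ablation", "1", "--target2nei_atten"],
    "RMPI_NE_conc_TA": ["--ablation", "1", "--conc", "--target2nei_atten"],
}


def train_command(dataset, variant):
    """Build the argument list for training one variant on one benchmark."""
    return [
        "python", "RMPI/train.py",
        "-d", dataset,
        "-e", f"{dataset}_{variant}",
        *VARIANT_FLAGS[variant],
    ]


for variant in VARIANT_FLAGS:
    print(" ".join(train_command("nell_v2", variant)))
```

Each argument list can be passed to `subprocess.run(..., check=True)` from the repository root to launch the training.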

To test the model (taking RMPI-base and testing with semi unseen relations as an example):

Fully inductive case & Triple classification

python RMPI/test_auc_F.py -d nell_v2_ind_v3_semi -e nell_v2_RMPI_base --ablation 0

Fully inductive case & Entity Prediction

python RMPI/test_ranking_F.py -d nell_v2_ind_v3_semi -e nell_v2_RMPI_base --ablation 0

Partially inductive case & Entity Prediction

python RMPI/test_ranking_P.py -d nell_v2_ind -e nell_v2_RMPI_base --ablation 0

RMPI_S is trained and tested in a similar way.
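The choice of test script depends on the inductive case and the task. A minimal sketch of that mapping follows; `TEST_SCRIPTS` and `build_test_command` are hypothetical names, and only the three combinations listed above are included because no other script is documented here.

```python
# (inductive case, task) -> test script, as listed in the test commands.
TEST_SCRIPTS = {
    ("fully", "triple_classification"): "RMPI/test_auc_F.py",
    ("fully", "entity_prediction"): "RMPI/test_ranking_F.py",
    ("partially", "entity_prediction"): "RMPI/test_ranking_P.py",
}


def build_test_command(case, task, test_graph, experiment, ablation=0):
    """Build the argument list for one evaluation run."""
    try:
        script = TEST_SCRIPTS[(case, task)]
    except KeyError:
        raise ValueError(f"no test script listed for {case!r} / {task!r}") from None
    return [
        "python", script,
        "-d", test_graph,
        "-e", experiment,
        "--ablation", str(ablation),
    ]


print(" ".join(build_test_command(
    "fully", "entity_prediction", "nell_v2_ind_v3_semi", "nell_v2_RMPI_base")))
```

The `--ablation` value should match the one used when the model named by `-e` was trained.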

Basic Training and Testing of TACT and TACT-base

To train the model (taking NELL-995.v2 as an example):

# for TACT-base
python TACT/train.py -d nell_v2 -e nell_v2_TACT_base --ablation 3
# for TACT full model
python TACT/train.py -d nell_v2 -e nell_v2_TACT --ablation 0

To test the model (taking "TACT-base & Fully inductive case & testing with semi unseen relations & Triple classification" as an example):

python TACT/test_auc_F.py -d nell_v2_ind_v3_semi -e nell_v2_TACT_base --ablation 3

TACT_S is trained and tested in a similar way.

Pre-training of Ontological schema

The pre-training is implemented by running the open-source TransE code provided in RotatE on the schema graph.
The schema graph (Schema-NELL.csv) and the pre-trained embeddings are included in the folder data/external_rel_embeds. Thanks to KZSL for the resource.
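For readers unfamiliar with TransE, its scoring idea can be sketched in a few lines: a schema triple (head, relation, tail) is plausible when head + relation lies close to tail in the embedding space. The function below is an illustration only and is not the RotatE code; the function name, the L1 distance and the margin value `gamma` are assumptions made for the example.

```python
def transe_score(head, relation, tail, gamma=12.0):
    """Return gamma minus the L1 distance between (head + relation) and tail.

    head, relation, tail -- embedding vectors as equal-length lists of floats
    gamma                -- margin; the default here is an arbitrary assumption
    A higher score means a more plausible triple.
    """
    distance = sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))
    return gamma - distance


# A triple whose tail equals head + relation gets the maximum score (gamma).
print(transe_score([1.0, 0.0], [0.5, 0.5], [1.5, 0.5]))
# A mismatched tail is penalised by its L1 distance.
print(transe_score([0.0, 0.0], [0.0, 0.0], [1.0, 2.0]))
```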

Some Results for Supplementing Main Paper

1. Entity Prediction on WN18RR in the partially inductive KGC.

Method	WN18RR.v1				WN18RR.v2
Method	MRR	Hits@1	Hits@5	Hits@10	MRR	Hits@1	Hits@5	Hits@10
TACT-base	80.62	77.93	82.45	82.45	78.11	76.76	78.68	78.68
TACT	79.56	76.33	82.45	82.45	78.55	76.87	78.68	78.68
RMPI-base	79.69	76.60	82.18	82.45	78.02	76.53	78.68	78.68
RMPI-NE	81.58	75.53	88.03	89.63	81.07	78.68	82.31	83.22
RMPI-TA	69.73	58.51	82.45	82.45	78.13	76.76	78.68	78.68
RMPI-NE-TA	81.74	77.13	86.44	87.77	81.34	79.82	81.97	82.43

Method	WN18RR.v3				WN18RR.v4
Method	MRR	Hits@1	Hits@5	Hits@10	MRR	Hits@1	Hits@5	Hits@10
TACT-base	54.42	50.58	57.19	58.84	73.35	72.29	73.34	73.34
TACT	54.21	50.00	57.19	58.60	73.28	72.04	73.41	73.41
RMPI-base	55.93	52.56	58.18	58.68	73.43	72.32	73.41	73.41
RMPI-NE	64.85	60.17	68.43	70.33	77.14	74.95	78.13	79.81
RMPI-TA	56.23	52.81	57.93	58.84	73.68	72.15	73.41	73.41
RMPI-NE-TA	65.62	60.08	70.08	73.14	77.68	74.84	79.50	81.42

2. Entity Prediction on NELL-995.v1 in the partially inductive KGC.

Method	NELL-995.v1
Method	MRR	Hits@1	Hits@5	Hits@10
TACT-base	49.76	44.00	53.00	56.50
TACT	47.68	43.50	48.00	51.50
RMPI-base	53.43	48.00	57.00	59.50
RMPI-NE	54.05	49.50	55.00	60.50
RMPI-TA	48.97	44.00	52.50	53.00
RMPI-NE-TA	54.24	50.00	55.50	60.50
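For reference, MRR and Hits@k in the tables above are ranking metrics computed from the rank of the correct entity among the candidates. The sketch below shows the standard definitions, scaled to percentages on the assumption that the tables report percentages; `ranking_metrics` is a hypothetical helper and not the evaluation code in this repository.

```python
def ranking_metrics(ranks, ks=(1, 5, 10)):
    """Compute MRR and Hits@k (as percentages) from 1-based ranks."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    n = len(ranks)
    # MRR: mean of the reciprocal ranks of the correct entities.
    metrics = {"MRR": 100.0 * sum(1.0 / r for r in ranks) / n}
    for k in ks:
        # Hits@k: share of test triples whose correct entity ranks in the top k.
        metrics[f"Hits@{k}"] = 100.0 * sum(r <= k for r in ranks) / n
    return metrics


# Four hypothetical test triples with the correct entity ranked 1st, 2nd, 4th and 20th.
print(ranking_metrics([1, 2, 4, 20]))
```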

Computation Efficiency Analysis

The running time mainly consists of subgraph preparation and subgraph prediction. Since subgraphs can be extracted and saved in advance, we focus on the computation cost of graph message passing, which depends heavily on the size of the extracted subgraph. We therefore measure the processing time of different models on subgraphs of different sizes.

Specifically, we use the number of edges in the entity-view subgraph (i.e., the number of nodes in the transformed relation-view subgraph) to describe the graph size, and run RMPI-base, RMPI-TA and RMPI-NE (with summation-based fusion function) on CPU to test triples in the partially inductive setting with subgraph sizes of around 100, 1000, 5000 and 20000. The averaged inference times (in seconds) are listed in the table below.

Since the model can be trained offline, we are mainly concerned with the model inference time here.

Subgraphs of size 20000 cause out-of-memory errors on our GPU device (GeForce GTX 1080 with 12GB RAM), so we report the inference time on CPU (Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz). On GPU, the inference time on the other graph sizes is slightly lower than on CPU. In the future, we will test the large-size graphs with GPU.

Method	Graph Size (averaged inference time in seconds)
Method	100	1000	5000	20000
RMPI-base	0.031	0.053	0.132	7.131
RMPI-TA	0.036	0.058	0.202	21.311
RMPI-NE	0.053	0.059	0.159	8.539
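A measurement of this kind can be sketched with a small timing helper. This is not the script used to produce the table: `average_inference_time` and the stand-in `predict` callable are hypothetical, and a real run would pass the model's forward call and the pre-extracted subgraphs.

```python
import time


def average_inference_time(predict, subgraphs, repeats=1):
    """Return the mean wall-clock time (seconds) of predict(subgraph)."""
    elapsed = []
    for subgraph in subgraphs:
        for _ in range(repeats):
            start = time.perf_counter()
            predict(subgraph)
            elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / len(elapsed)


# Stand-in workload: each "subgraph" is a list of edge ids and "predict" sums them.
dummy_subgraphs = [list(range(100)), list(range(1000))]
print(average_inference_time(sum, dummy_subgraphs, repeats=3))
```

Grouping the subgraphs by size before timing gives one average per column, as in the table above.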
