
powerful-gnns's Introduction

How Powerful are Graph Neural Networks?

This repository is the official PyTorch implementation of the experiments in the following paper:

Keyulu Xu*, Weihua Hu*, Jure Leskovec, Stefanie Jegelka. How Powerful are Graph Neural Networks? ICLR 2019.

arXiv OpenReview

If you make use of the code, experiments, or the GIN algorithm in your work, please cite our paper (BibTeX below).

@inproceedings{xu2018how,
  title={How Powerful are Graph Neural Networks?},
  author={Keyulu Xu and Weihua Hu and Jure Leskovec and Stefanie Jegelka},
  booktitle={International Conference on Learning Representations},
  year={2019},
  url={https://openreview.net/forum?id=ryGs6iA5Km},
}

Installation

Install PyTorch following the instructions on the [official website](https://pytorch.org/). The code has been tested with PyTorch versions 0.4.1 and 1.0.0.

Then install the other dependencies.

pip install -r requirements.txt

Test run

Unzip the dataset file

unzip dataset.zip

and run

python main.py

The default parameters are not the best-performing hyper-parameters used to reproduce our results in the paper. Hyper-parameters need to be specified through the command-line arguments. Please refer to our paper for the details of how we set the hyper-parameters. For instance, for the COLLAB and IMDB datasets, you need to add --degree_as_tag so that the node degrees are used as input node features.

To see which hyper-parameters can be specified, type

python main.py --help
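For example, a hypothetical invocation with the degree-as-tag option: only `--degree_as_tag` is named in this README, so the other flag names shown here are assumptions and should be checked against the `--help` output.

```shell
# Hypothetical example run; verify flag names with `python main.py --help`.
# --dataset is an assumed flag name; --degree_as_tag is confirmed above.
python main.py --dataset IMDBBINARY --degree_as_tag
```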

Cross-validation strategy in the paper

The cross-validation in our paper uses only training and validation sets (no test set) due to the small dataset sizes. Specifically, after obtaining the 10 validation curves corresponding to the 10 folds, we first averaged the validation curves across the folds (obtaining a single averaged validation curve), and then selected the single epoch that achieved the maximum averaged validation accuracy. Finally, the standard deviation over the 10 folds was computed at the selected epoch.
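The protocol above can be sketched with synthetic numbers (this is an illustration, not the repository's actual code): average the 10 per-fold validation curves, pick the epoch that maximizes the averaged accuracy, then report the standard deviation across folds at that epoch.

```python
# Illustrative sketch of the cross-validation reporting protocol,
# using synthetic accuracies in place of real validation curves.
import numpy as np

rng = np.random.default_rng(0)
curves = rng.uniform(0.6, 0.9, size=(10, 50))  # 10 folds x 50 epochs

mean_curve = curves.mean(axis=0)            # averaged validation curve
best_epoch = int(mean_curve.argmax())       # epoch with max averaged accuracy
std_at_best = curves[:, best_epoch].std()   # std over the 10 folds there
```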

powerful-gnns's People

Contributors

keyulu, weihua916


powerful-gnns's Issues

About node attributes

Hello, your program only uses the node label as input. If I want to add node attributes as input as well, how should the program handle the labels?

Dropout in last layer

Thank you for your work.
You have used dropout prior to computing the output from each layer. What is the role of this dropout?
See:

score_over_layer += F.dropout(self.linears_prediction[layer](pooled_h), self.final_dropout, training = self.training)

Data preprocessing

Hi,
I want to use your GIN implementation for my own dataset, but I don't understand how to prepare the initial .txt file for it. Can you explain it, please?
Thanks

Inconsistent dataset description and actual data

Hi,
I'm looking at Table 1 in the paper, and the number of classes associated with the datasets does not match the description of the data in the appendix. For example, the MUTAG dataset has 2 classes according to the table (and the actual data labels that I checked, which are either 1 or -1), versus in the appendix it says that the dataset has 7 discrete labels. Was wondering if you could clarify the disagreement.

Thank you!

Custom dataset creation

Hi, thank you for such impressive work. I would like to apply your algorithm to my task, so I need to create a dataset that fits your code. I have, say, 150 graphs of 200 nodes each, where all the nodes are identical.


I'm trying to understand your txt files, but have some issues with that.
For example:

10 0
0 3 1 2 9
0 3 0 2 9
0 4 0 1 3 9
0 3 2 4 5
0 3 3 5 6
0 5 3 4 6 7 8
0 4 4 5 7 8
0 3 5 6 8
0 3 5 6 7
1 3 0 1 2

My assumption is that each block corresponds to a graph: 10 is the number of nodes and 0 is the graph class label. Each row corresponds to a node, where the first value is (the node label, I guess?), the second is the number of links, and the rest are the connections. Is that right? Does the row number correspond to the node index?
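Assuming that reading of the format is correct, one graph block could be parsed as follows. This is a sketch based on the interpretation in this issue, not the repository's `util.py`; `parse_graph_block` is a hypothetical helper name.

```python
# Sketch of a parser for one graph block, under the interpretation above:
# line 0 is "<num_nodes> <graph_label>"; each following line is
# "<node_tag> <num_neighbors> <neighbor indices...>", with the row number
# serving as the node index.
def parse_graph_block(lines):
    n_nodes, graph_label = map(int, lines[0].split())
    node_tags, edges = [], []
    for i, line in enumerate(lines[1:1 + n_nodes]):
        vals = list(map(int, line.split()))
        tag, n_neighbors = vals[0], vals[1]
        node_tags.append(tag)
        edges.extend((i, j) for j in vals[2:2 + n_neighbors])
    return n_nodes, graph_label, node_tags, edges

# The sample block from this issue:
block = """\
10 0
0 3 1 2 9
0 3 0 2 9
0 4 0 1 3 9
0 3 2 4 5
0 3 3 5 6
0 5 3 4 6 7 8
0 4 4 5 7 8
0 3 5 6 8
0 3 5 6 7
1 3 0 1 2""".splitlines()

n, label, tags, edges = parse_graph_block(block)
```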

About node smoothing

Hello, excellent job!

I have a little question about GIN.
Does GIN make nodes smoother and smoother like GCN?
If not, does node feature represent the structure around this node?

Thanks.

dataset

Hello Weihua, thanks for sharing this code. I have just started learning about graph neural networks, so some things are not very clear to me. I would like to ask what the MUTAG dataset is, e.g. what its nodes and edges represent. Thanks ^_^

Reproduce Issues

Hi, I used the same code and datasets and tuned the parameters as provided in the paper. The random seed is set to 0. The following are the results:

(screenshot: results table, where the first line shows the results from the paper and the second line shows the experimental results I obtained)

As you can see, I cannot reproduce the results of the paper on many datasets. Could you tell me how to reproduce your results?

Possible bug in `load_data()`

In lines 52 & 71 of `util.load_data()`, the current node is assumed to be `j` instead of `row[0]`; isn't this a bug? When reading NCI1.txt, wrong data will be assigned once line 25 is reached.

Think about graph spectral

Hi! I am interested in your GIN work, but can you provide a spectral view of GIN's operation, as exists for GCN?

Low accuracy

Hi:
I use the code you published for this paper, but I can't reproduce the results. For example, on MUTAG the test accuracy is very low, about 70 percent, while the training accuracy is near 1. I think overfitting occurs. Have you encountered this before, and how can it be fixed?

Ask for information of discrete labels

Hello authors,

Thank you for the great work!

I would like to know about the information of discrete labels for all datasets.

In each dataset, I can only see the .txt file, which includes only discrete numbers. Therefore, it's hard for me to guess which category corresponds to which chemical element.

Thank you,

Incorrect reference to updated row length when meant to compare old row length

In line 66 of util.py, `if tmp > len(row):`, I think the intention is to compare `tmp` against the length of the row just after reading a line of the data file.

However, in line 60, `row` is updated to be a subarray of length `tmp`. This happens in the else branch of the `tmp == len(row)` check.

This means line 66 will never be satisfied.

result of paper

I am a beginner. I want to ask how you obtained the results in your paper. Is the result of each validation the maximum, and then the maximum over the 10 folds?

COLLAB

Dear Author,

I could reproduce the results for MUTAG, PTC, NCI1, PROTEINS, IMDB-B, IMDB-M, and RDT-M5K using the parameters from your paper; however, I could not reproduce the results for RDT-B and COLLAB using those parameters.

Might I get your advice or suggestions on reproducing the results for COLLAB and RDT-B?

Regards

Question about input data

Hi. Thanks for great work.
I have a question about input data, specifically for node id numbering.
Is a node id unique across all graphs? or unique in a graph?

I mean, suppose graph 1 contains nodes A, B, C, and graph 2 contains nodes A, D, E.
Then, in the inductive setting, the node ids for graphs 1 and 2 would both be 1, 2, 3.
In the transductive setting, the node ids would be 1, 2, 3 for graph 1 and 1, 4, 5 for graph 2.

Is that correct?

Apply GIN to node classification

Hi and thanks for sharing your code.
When applying GIN to a node classification task, for example on the Cora dataset, the accuracy is low.
You said in the paper that with mean aggregation and a linear function, GIN is GCN. I use the DGL implementation of GIN for node classification, but I can't get accuracy near GCN's.
Is there a need for some preprocessing when applying GIN to node classification?
