
cripac-dig / grace

[GRL+ @ ICML 2020] PyTorch implementation for "Deep Graph Contrastive Representation Learning" (https://arxiv.org/abs/2006.04131v2)

License: Apache License 2.0

Language: Python 100.00%
Topics: contrastive-learning, graph-representation-learning, deep-learning, machine-learning

grace's Introduction

GRACE

[Figure: GRACE model overview]

This is the code for the paper Deep GRAph Contrastive rEpresentation Learning (GRACE).

For a thorough resource collection of self-supervised learning methods on graphs, you may refer to this awesome list.

Usage

Train and evaluate the model by executing

python train.py --dataset Cora

The --dataset argument should be one of [ Cora, CiteSeer, PubMed, DBLP ].

Requirements

  • torch 1.4.0
  • torch-geometric 1.5.0
  • sklearn 0.21.3
  • numpy 1.18.1
  • pyyaml 5.3.1

Install all dependencies using

pip install -r requirements.txt

If you encounter problems while installing torch-geometric, please refer to the installation manual on its official website.

Citation

Please cite our paper if you use the code:

@inproceedings{Zhu:2020vf,
  author = {Zhu, Yanqiao and Xu, Yichen and Yu, Feng and Liu, Qiang and Wu, Shu and Wang, Liang},
  title = {{Deep Graph Contrastive Representation Learning}},
  booktitle = {ICML Workshop on Graph Representation Learning and Beyond},
  year = {2020},
  url = {http://arxiv.org/abs/2006.04131}
}

grace's People

Contributors

linyxus, opilgrim, sxkdz


grace's Issues

Unfair comparison with other models.

In eval.py, the train/test split follows a 90% / 10% scheme instead of the public split, while the baseline models (e.g., DGI) use the public split for evaluation.
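For context, the difference between the two protocols can be sketched as follows (a minimal illustration using PyTorch Geometric's Planetoid loader; the 90% / 10% split below is a reconstruction of what the issue describes, not the repository's exact eval.py):

import torch
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='data', name='Cora')
data = dataset[0]

# Public (fixed) Planetoid split: 140 train / 500 val / 1,000 test nodes on Cora.
public_train_mask, public_test_mask = data.train_mask, data.test_mask

# Random 90% / 10% split over all nodes, as the issue describes.
perm = torch.randperm(data.num_nodes)
n_train = int(0.9 * data.num_nodes)
random_train_idx, random_test_idx = perm[:n_train], perm[n_train:]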

About function batched_semi_loss()

Thank you for your efforts! When I use this model on a large graph, I observe that the space complexity of the batched loss is still O(N^2): GPU memory usage steadily increases while the loop 'for i in range(num_batches):' executes. Looking forward to your reply!
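For readers hitting the same problem, the sketch below (a simplified reconstruction, not the repository's exact batched_semi_loss) illustrates why memory grows: each iteration materializes a (batch_size x N) similarity matrix, and the autograd graphs of all batches stay alive until backward() is called.

import torch
import torch.nn.functional as F

def batched_info_nce(z1, z2, batch_size, tau=0.5):
    # Simplified sketch of a batched contrastive loss over two views z1, z2.
    z1, z2 = F.normalize(z1), F.normalize(z2)
    n = z1.size(0)
    losses = []
    for i in range(0, n, batch_size):
        zb = z1[i:i + batch_size]                    # (B, d) slice of view 1
        refl_sim = torch.exp(zb @ z1.t() / tau)      # (B, N) intra-view similarities
        between_sim = torch.exp(zb @ z2.t() / tau)   # (B, N) inter-view similarities
        idx = torch.arange(zb.size(0))
        pos = between_sim[idx, i + idx]              # positives: same node, other view
        # Exclude the self-similarity term from the intra-view denominator.
        denom = refl_sim.sum(1) + between_sim.sum(1) - refl_sim[idx, i + idx]
        losses.append(-torch.log(pos / denom))
    # Every (B, N) intermediate stays referenced through `losses` until backward(),
    # so peak memory approaches O(N^2) even though each batch is only O(B * N).
    return torch.cat(losses).mean()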

loss function

I wonder whether the loss function in the paper was first proposed by you. Thank you very much!

Scaling to larger datasets

Thanks for your awesome work! I am trying to apply GRACE to larger datasets, but according to your code, training is conducted in a full-batch manner, which hinders scalability. Your paper mentions that eight GPUs were used; could you kindly share how you implemented this? As far as I know, PyG only supports multi-graph distributed computation. I would also be grateful for any other suggestions. Looking forward to your reply!

How to use graph augmentation when learning on large graphs

Hello! Thanks for the code! I have a question about how to apply the augmentations from the paper, including RE and MF, to a large graph. Currently, I randomly select a minibatch of nodes from the large graph and generate a subgraph by sampling 15, 10, and 5 neighbours at the first, second, and third hops. However, I am not sure how to apply the graph augmentations to obtain two views. Should I generate two views of the raw large graph first and then sample subgraphs, or should I produce two views of the sampled subgraphs? Could you please upload code for large-graph training? Thank you very much!
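One workable pattern, sketched below, is to sample the subgraph first and then apply edge removing (RE) and feature masking (MF) twice to it. This is an assumption on my part, not an official recommendation from the authors; dropout_adj is the PyG 1.5 utility for random edge removal.

import torch
from torch_geometric.utils import dropout_adj

def augment(x, edge_index, p_edge=0.2, p_feat=0.3):
    # Removing edges (RE): randomly drop a fraction p_edge of the edges.
    edge_index, _ = dropout_adj(edge_index, p=p_edge)
    # Masking node features (MF): zero out a random subset of feature dimensions.
    mask = torch.rand(x.size(1), device=x.device) < p_feat
    x = x.clone()
    x[:, mask] = 0.0
    return x, edge_index

# Given a sampled subgraph (x_sub, edge_index_sub), generate two views:
# x1, ei1 = augment(x_sub, edge_index_sub)
# x2, ei2 = augment(x_sub, edge_index_sub)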

How to process the dataset?

Hi, author.
I am confused about how the dataset is processed. I could not find or understand this step in your project. Could you please push your dataset-processing code? I would appreciate your generosity.

question about hidden dim

In your implementation setting, e.g. on Cora, the hidden dimension is 128, but in your code you double it to 2 * out_channels. Is this reasonable? Apparently the actual hidden dimension in your code is 256.

import torch
from torch import nn
from torch_geometric.nn import GCNConv

class Encoder(torch.nn.Module):
    def __init__(self, in_channels: int, out_channels: int, activation,
                 base_model=GCNConv, k: int = 2):
        super(Encoder, self).__init__()
        self.base_model = base_model

        assert k >= 2
        self.k = k
        # Every layer except the last uses a width of 2 * out_channels;
        # only the final layer projects down to out_channels.
        self.conv = [base_model(in_channels, 2 * out_channels)]
        for _ in range(1, k - 1):
            self.conv.append(base_model(2 * out_channels, 2 * out_channels))
        self.conv.append(base_model(2 * out_channels, out_channels))
        self.conv = nn.ModuleList(self.conv)

        self.activation = activation

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor):
        # Stack of k graph convolutions, each followed by the activation.
        for i in range(self.k):
            x = self.activation(self.conv[i](x, edge_index))
        return x
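For reference, tracing this code with the default k = 2 and out_channels = 128 gives layer widths of in_channels -> 256 -> 128: the 256-dimensional tensor is only the intermediate representation, while the final embedding remains 128-dimensional, so the reported hidden dimension plausibly refers to the output size.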

about citeseer result

Hello, thank you very much for providing the source code. I am having problems reproducing the CiteSeer results: no matter how many times I run it, the best F1 score is only about 68%, which does not reach the results reported in the paper.

Question about PubMed performance

Hi,

In your paper, GRACE achieves 86.7% on PubMed and DGI achieves 86% on PubMed. However, in the DGI paper, DGI only achieves 76.8% on PubMed. I also notice that you follow the DGI setting in your experiments. How did you obtain an improvement of almost 10% over DGI?

Question about dataset split

Hi,
Is there any reason you use a random split instead of the public split? With the same hyperparameter settings, I cannot match the random-split performance on the Cora public split (80.5 vs. 83.3 accuracy).

Your requirements are incomplete

I think you should list the required versions of all dependencies needed to run your code. There is no version information for the packages below, so I cannot run it:
torch-scatter
torch-sparse
torch-cluster
torch-spline-conv

As the screenshot shows, torch-sparse is already installed, but an import error is still raised. I think this is a version mismatch; kindly state the correct versions. I want to use your work in my research.

[Screenshot: import error for torch-sparse]
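For what it's worth, with torch pinned to 1.4.0 these companion packages were typically installed from PyG's prebuilt wheel index, along the lines of the commands below (a hedged suggestion based on the PyG installation docs of that era; the index URL and package versions may need adjusting for your CUDA setup):

pip install torch-scatter torch-sparse torch-cluster torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-geometric==1.5.0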
