
cripac-dig / grace

[GRL+ @ ICML 2020] PyTorch implementation for "Deep Graph Contrastive Representation Learning" (https://arxiv.org/abs/2006.04131v2)

License: Apache License 2.0

Language: Python 100.00%
Topics: contrastive-learning, graph-representation-learning, deep-learning, machine-learning

grace's Introduction

GRACE

[Figure: GRACE model overview]

This is the code for the paper Deep GRAph Contrastive rEpresentation Learning (GRACE).

For a thorough resource collection of self-supervised learning methods on graphs, you may refer to this awesome list.

Usage

Train and evaluate the model by executing

python train.py --dataset Cora

The --dataset argument should be one of [ Cora, CiteSeer, PubMed, DBLP ].

Requirements

  • torch 1.4.0
  • torch-geometric 1.5.0
  • sklearn 0.21.3
  • numpy 1.18.1
  • pyyaml 5.3.1

Install all dependencies using

pip install -r requirements.txt

If you encounter problems while installing torch-geometric, please refer to the installation manual on its official website.

Citation

Please cite our paper if you use the code:

@inproceedings{Zhu:2020vf,
  author = {Zhu, Yanqiao and Xu, Yichen and Yu, Feng and Liu, Qiang and Wu, Shu and Wang, Liang},
  title = {{Deep Graph Contrastive Representation Learning}},
  booktitle = {ICML Workshop on Graph Representation Learning and Beyond},
  year = {2020},
  url = {http://arxiv.org/abs/2006.04131}
}

grace's People

Contributors

linyxus, opilgrim, sxkdz


grace's Issues

Unfair comparison with other models.

In eval.py, the train/test split follows a 90% / 10% scheme instead of the public split, while the baseline models (e.g., DGI) use the public split for evaluation.
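For context, the difference between the two protocols can be sketched as follows (a minimal illustration using PyTorch Geometric's Planetoid loader; the 90% / 10% split below is a reconstruction of what the issue describes, not the repository's exact eval.py):

import torch
from torch_geometric.datasets import Planetoid

dataset = Planetoid(root='data', name='Cora')
data = dataset[0]

# Public (fixed) Planetoid split: 140 train / 500 val / 1,000 test nodes on Cora.
public_train_mask, public_test_mask = data.train_mask, data.test_mask

# Random 90% / 10% split over all nodes, as the issue describes.
perm = torch.randperm(data.num_nodes)
n_train = int(0.9 * data.num_nodes)
random_train_idx, random_test_idx = perm[:n_train], perm[n_train:]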

About function batched_semi_loss()

Thank you for your efforts! When I use this model on a large graph, I observe that the space complexity of the batched loss is still O(N^2): GPU memory usage steadily increases while the loop 'for i in range(num_batches):' executes. Looking forward to your reply!
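For readers hitting the same problem, the sketch below (a simplified reconstruction, not the repository's exact batched_semi_loss) illustrates why memory grows: each iteration materializes a (batch_size x N) similarity matrix, and the autograd graphs of all batches stay alive until backward() is called.

import torch
import torch.nn.functional as F

def batched_info_nce(z1, z2, batch_size, tau=0.5):
    # Simplified sketch of a batched contrastive loss over two views z1, z2.
    z1, z2 = F.normalize(z1), F.normalize(z2)
    n = z1.size(0)
    losses = []
    for i in range(0, n, batch_size):
        zb = z1[i:i + batch_size]                    # (B, d) slice of view 1
        refl_sim = torch.exp(zb @ z1.t() / tau)      # (B, N) intra-view similarities
        between_sim = torch.exp(zb @ z2.t() / tau)   # (B, N) inter-view similarities
        idx = torch.arange(zb.size(0))
        pos = between_sim[idx, i + idx]              # positives: same node, other view
        # Exclude the self-similarity term from the intra-view denominator.
        denom = refl_sim.sum(1) + between_sim.sum(1) - refl_sim[idx, i + idx]
        losses.append(-torch.log(pos / denom))
    # Every (B, N) intermediate stays referenced through `losses` until backward(),
    # so peak memory approaches O(N^2) even though each batch is only O(B * N).
    return torch.cat(losses).mean()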

loss function

I wonder whether the loss function in the paper was first proposed by you. Thank you very much!

Scaling to larger datasets

Thanks for your awesome work! I am trying to apply GRACE to larger datasets, but according to your code, training is conducted in a full-batch manner, which hinders scalability. Your paper mentions that eight GPUs were used; could you kindly share how you implemented this? As far as I know, PyG only supports multi-graph distributed computation. I would also be grateful for any other suggestions. Looking forward to your reply!

How to use graph augmentation when learning on large graphs

Hello! Thanks for the code! I have a question about how to apply the augmentations from the paper, including RE and MF, to a large graph. Currently, I randomly select a minibatch of nodes from the large graph and generate a subgraph by sampling 15, 10, and 5 neighbours at the first, second, and third hops. However, I am not sure how to apply the graph augmentations to obtain two views. Should I generate two views of the raw large graph first and then sample subgraphs, or should I produce two views of the sampled subgraphs? Could you please upload code for large-graph training? Thank you very much!
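One workable pattern, sketched below, is to sample the subgraph first and then apply edge removing (RE) and feature masking (MF) twice to it. This is an assumption on my part, not an official recommendation from the authors; dropout_adj is the PyG 1.5 utility for random edge removal.

import torch
from torch_geometric.utils import dropout_adj

def augment(x, edge_index, p_edge=0.2, p_feat=0.3):
    # Removing edges (RE): randomly drop a fraction p_edge of the edges.
    edge_index, _ = dropout_adj(edge_index, p=p_edge)
    # Masking node features (MF): zero out a random subset of feature dimensions.
    mask = torch.rand(x.size(1), device=x.device) < p_feat
    x = x.clone()
    x[:, mask] = 0.0
    return x, edge_index

# Given a sampled subgraph (x_sub, edge_index_sub), generate two views:
# x1, ei1 = augment(x_sub, edge_index_sub)
# x2, ei2 = augment(x_sub, edge_index_sub)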

How to process the dataset?

Hi, author.
I am confused about how the dataset is processed. I could not find or understand this step in your project. Could you please push your dataset-processing code? I would appreciate your generosity.

question about hidden dim

In your implementation setting, e.g. on Cora, the hidden dimension is 128, but in your code you double it to 2 * out_channels. Is this reasonable? Apparently the actual hidden dimension in your code is 256.

import torch
from torch import nn
from torch_geometric.nn import GCNConv

class Encoder(torch.nn.Module):
    def __init__(self, in_channels: int, out_channels: int, activation,
                 base_model=GCNConv, k: int = 2):
        super(Encoder, self).__init__()
        self.base_model = base_model

        assert k >= 2
        self.k = k
        # Every layer except the last uses a width of 2 * out_channels;
        # only the final layer projects down to out_channels.
        self.conv = [base_model(in_channels, 2 * out_channels)]
        for _ in range(1, k - 1):
            self.conv.append(base_model(2 * out_channels, 2 * out_channels))
        self.conv.append(base_model(2 * out_channels, out_channels))
        self.conv = nn.ModuleList(self.conv)

        self.activation = activation

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor):
        # Stack of k graph convolutions, each followed by the activation.
        for i in range(self.k):
            x = self.activation(self.conv[i](x, edge_index))
        return x
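For reference, tracing this code with the default k = 2 and out_channels = 128 gives layer widths of in_channels -> 256 -> 128: the 256-dimensional tensor is only the intermediate representation, while the final embedding remains 128-dimensional, so the reported hidden dimension plausibly refers to the output size.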

about citeseer result

Hello, thank you very much for providing the source code. I am having problems reproducing the CiteSeer results: no matter how many times I run it, the best F1 score is only about 68%, which does not reach the results reported in the paper.

Question about PubMed performance

Hi,

In your paper, GRACE achieves 86.7% on PubMed and DGI achieves 86% on PubMed. However, in the DGI paper, DGI only achieves 76.8% on PubMed. I also notice that you follow the DGI setting in your experiments. How did you obtain an improvement of almost 10% over DGI?

Question about dataset split

Hi,
Is there any reason you use a random split instead of the public split? With the same hyperparameter settings, I cannot match the random-split performance on the Cora public split (80.5 vs. 83.3 accuracy).

Your requirements are incomplete

I think you should list the required versions of all dependencies needed to run your code. There is no version information for the packages below, so I cannot run it:
torch-scatter
torch-sparse
torch-cluster
torch-spline-conv

As the screenshot shows, torch-sparse is already installed, but an import error is still raised. I think this is a version mismatch; kindly state the correct versions. I want to use your work in my research.

[Screenshot: import error for torch-sparse]
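For what it's worth, with torch pinned to 1.4.0 these companion packages were typically installed from PyG's prebuilt wheel index, along the lines of the commands below (a hedged suggestion based on the PyG installation docs of that era; the index URL and package versions may need adjusting for your CUDA setup):

pip install torch-scatter torch-sparse torch-cluster torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.4.0.html
pip install torch-geometric==1.5.0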
