g2gnn's Introduction

This repository is the first official PyTorch (Geometric) implementation of GNN-based machine learning models for handling imbalanced graph classification.

For more details, see the paper 'Imbalanced Graph Classification via Graph-of-Graph Neural Networks'.

If you use this code, please consider citing:

@inproceedings{wang2022imbalance,
author = {Wang, Yu and Zhao, Yuying and Shah, Neil and Derr, Tyler},
title = {Imbalanced Graph Classification via Graph-of-Graph Neural Networks},
year = {2022},
booktitle = {Proceedings of the 31st ACM International Conference on Information \& Knowledge Management},
pages = {2067–2076},
numpages = {10},
series = {CIKM '22}
}

Requirements

  • PyTorch 1.11.0+cu113
  • PyTorch Geometric 2.0.4
  • torch-scatter 2.0.9
  • torch-sparse 0.6.15
  • torch-cluster 1.6.0

Note that the versions of PyTorch Geometric, torch-scatter, torch-sparse, and torch-cluster used here are not the latest ones. The versions currently used can be installed via:

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install torch-geometric==2.0.4
pip install torch-scatter -f https://data.pyg.org/whl/torch-1.11.0+cu113.html
pip install torch-sparse -f https://data.pyg.org/whl/torch-1.11.0+cu113.html
pip install torch-cluster -f https://data.pyg.org/whl/torch-1.11.0+cu113.html
pip install torch-spline-conv -f https://data.pyg.org/whl/torch-1.11.0+cu113.html
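
After installing, it can help to confirm that the environment matches the versions listed under Requirements. The following is a minimal sketch using only the Python standard library; the `check_versions` helper and the `REQUIRED` table are illustrative names, not part of this repository, and the comparison assumes that any local build suffix after `+` can be ignored.

```python
from importlib import metadata

# Versions copied from the Requirements list above.
REQUIRED = {
    "torch": "1.11.0+cu113",
    "torch-geometric": "2.0.4",
    "torch-scatter": "2.0.9",
    "torch-sparse": "0.6.15",
    "torch-cluster": "1.6.0",
}


def check_versions(required=REQUIRED, lookup=metadata.version):
    """Return {package: (wanted, found_or_None, matches)} for each requirement."""
    report = {}
    for name, wanted in required.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        # Assumption: compare only the part before "+", so a wheel tagged
        # "2.0.9+pt111cu113" still counts as version "2.0.9".
        ok = found is not None and found.split("+")[0] == wanted.split("+")[0]
        report[name] = (wanted, found, ok)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_versions().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:16s} wanted={wanted:14s} found={found} [{status}]")
```

Any package reported as `MISMATCH` can be reinstalled with the matching `pip install` command above.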

Implemented GNN backbones

  • [ICLR 2019] GIN: How Powerful Are Graph Neural Networks? [paper]
  • [ICLR 2020] InfoGraph: InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization [paper] - coming soon!
  • [NeurIPS 2020] GraphCL: Graph Contrastive Learning with Augmentations [paper] - coming soon!

Implemented strategies for handling the imbalance issue in graph classification

  • [ICML 1997] Upsampling: Addressing the curse of imbalanced training sets: one-sided selection [paper]
  • [IJCNN 2012] Reweight: Sampling + reweighting: Boosting the performance of AdaBoost on imbalanced datasets [paper]
  • [JAIR 2002] SMOTE: SMOTE: Synthetic Minority Over-sampling Technique [paper]
  • [CIKM 2022] GoG: Imbalanced Graph Classification via Graph-of-Graph Neural Networks [paper]
  • [CIKM 2022] Data-augmentation: Imbalanced Graph Classification via Graph-of-Graph Neural Networks [paper]
    • Edge Removal + consistency regularization
    • Node Mask + consistency regularization
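
The two augmentations listed above can be illustrated with a minimal, dependency-free sketch. It assumes a graph stored as a plain edge list plus a list of per-node feature vectors; the function names `drop_edges` and `mask_nodes` and the rates used are hypothetical and do not reflect this repository's API. The consistency-regularization term, which would compare model predictions across the augmented views, is not shown.

```python
import random


def drop_edges(edges, drop_rate, rng):
    """Edge removal: keep each edge independently with probability 1 - drop_rate."""
    return [e for e in edges if rng.random() >= drop_rate]


def mask_nodes(features, mask_rate, rng):
    """Node mask: zero out a node's whole feature vector with probability mask_rate."""
    return [
        [0.0] * len(x) if rng.random() < mask_rate else list(x)
        for x in features
    ]


rng = random.Random(0)  # fixed seed so the augmented views are reproducible
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

# Two augmented views of the same graph: one with edges removed,
# one with node features masked.
view_a = (drop_edges(edges, 0.25, rng), features)
view_b = (edges, mask_nodes(features, 0.25, rng))
```

If an undirected graph stores each edge in both directions, the two directions would need to be dropped together to keep the graph symmetric; this sketch treats every listed edge independently.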

Run

Note that, compared to the previous version of this repository, we have moved the K-nearest-neighbor search in topological space into batch processing, which speeds it up through parallel preparation. Furthermore, to resolve the non-determinism issue, we have replaced the original scatter-based message passing/pooling with sparse-matrix-multiplication-based message passing and segment_csr-based pooling; see more details [here].
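
To illustrate why segment-based pooling helps with reproducibility, here is a minimal pure-Python sketch of sum pooling over a CSR-style pointer array. It is not this repository's code and does not call torch-scatter; `segment_sum_csr` is a hypothetical helper. The idea it shows is that each graph's nodes occupy a contiguous range and are summed in a fixed order, whereas scatter-style reductions on a GPU typically accumulate with atomic adds whose order can vary between runs.

```python
def segment_sum_csr(node_values, indptr):
    """Sum rows of node_values over the contiguous ranges given by indptr.

    Graph g owns nodes indptr[g] .. indptr[g + 1] - 1, as in a CSR pointer array.
    """
    dim = len(node_values[0]) if node_values else 0
    pooled = []
    for start, end in zip(indptr[:-1], indptr[1:]):
        total = [0.0] * dim
        for row in node_values[start:end]:  # fixed left-to-right order
            total = [t + v for t, v in zip(total, row)]
        pooled.append(total)
    return pooled


# Two graphs batched together: nodes 0-2 belong to graph 0, nodes 3-4 to graph 1.
node_values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
indptr = [0, 3, 5]
pooled = segment_sum_csr(node_values, indptr)
# pooled == [[9.0, 12.0], [16.0, 18.0]]
```

Because the accumulation order depends only on `indptr`, repeated calls on the same batch give bit-identical results.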

To reproduce results in Table 2, please run the following code:

bash run_{dataset}.sh

To reproduce results in Figure 2, please run the following code:

python experiment.py

g2gnn's People

Contributors

yuwanguo

g2gnn's Issues

Evaluation for missing datasets

Hi, thank you for a well-written repo. I have some questions that I hope you can help me with:

  1. I notice that no scripts are provided for replicating the Table 2 benchmark on the Reddit-B and NCI1 datasets. Will you be adding them?

  2. You mention that Table 2 was obtained by observing 50 different data splits. Could you please elaborate on why so many splits were used and what you observed across them? How would one test them using your code?

Thank you

Overall_reweight and Batch_reweight are not running!!!

Hello Dear,
Thank you for sharing the source code for your research paper here. I think the code only runs with "reweight"; otherwise we get the error below:

G2GNN/learn.py", line 192, in train
loss.backward()
UnboundLocalError: local variable 'loss' referenced before assignment

Please correct me if I am wrong.

Thank you so much.
