snap-stanford / graphgym
Platform for designing and evaluating Graph Neural Networks (GNN)
License: Other
Hi! I'm a little confused about whether it's actually possible to use my own dataset in PyG format or not.
The load_pyg function kind of suggests that the dataset name can only be one of the fixed ones (CiteSeer, PPI, Cora, etc.), while the README states that a PyG upload should be possible: "GraphGym currently accepts a list of NetworkX graphs or PyG datasets."
If PyG format is not possible for custom data, then I guess it must be NetworkX. Is there any example of what those graphs should look like? In particular, I'm interested in graph classification, if that matters.
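For concreteness, a minimal sketch of what such a list of NetworkX graphs might look like (the attribute names node_feature and graph_label follow the DeepSNAP conventions visible elsewhere in this thread; the sizes and random features are purely illustrative, not a confirmed recipe):

import networkx as nx
import torch

graphs = []
for label in (0, 1):
    G = nx.gnp_random_graph(10, 0.3)                 # toy 10-node graph
    for v in G.nodes:
        G.nodes[v]['node_feature'] = torch.rand(16)  # per-node feature vector
    G.graph['graph_label'] = torch.tensor([label])   # graph-level target
    graphs.append(G)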
Olga
I have so far tried torch 1.4.0, 1.5.0, and lately 1.7.0 on my Ubuntu 18.04 machine. I successfully installed them, but faced errors when I ran step 6, "Test the installation". Version 1.7.0 gets a step closer, but it still produced the following error on 'bash run_single.sh'. I searched and could not find libcusparse.so anywhere.
Traceback (most recent call last):
  File "main.py", line 11, in <module>
    from graphgym.loader import create_dataset, create_loader
  File "/home/hdd2nd/dev/projects/gnn/GraphGym/graphgym/loader.py", line 6, in <module>
    from deepsnap.dataset import GraphDataset
  File "/home/hdd2nd/dev/projects/gnn/DeepSNAP/deepsnap/__init__.py", line 5, in <module>
    import deepsnap.graph
  File "/home/hdd2nd/dev/projects/gnn/DeepSNAP/deepsnap/graph.py", line 9, in <module>
    from torch_geometric.utils import to_undirected
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/__init__.py", line 5, in <module>
    import torch_geometric.data
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/data/data.py", line 8, in <module>
    from torch_sparse import coalesce, SparseTensor
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_sparse/__init__.py", line 15, in <module>
    f'{library}_{suffix}', [osp.dirname(__file__)]).origin)
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch/_ops.py", line 105, in load_library
    ctypes.CDLL(path)
  File "/home/hdd2nd/dev/miniconda3/envs/graphgym/lib/python3.7/ctypes/__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcusparse.so.10: cannot open shared object file: No such file or directory
I also listed the installation steps below for reference. I followed the instructions on https://github.com/snap-stanford/GraphGym closely; note that I installed torch-geometric following https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html because the commands from GraphGym didn't work.
>conda create -n graphgym python=3.7
>conda activate graphgym
>(graphgym) $ pip install torch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0
>(graphgym) $ pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html
>(graphgym) $ pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html
>(graphgym) $ pip install torch-cluster -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html
>(graphgym) $ pip install torch-spline-conv -f https://pytorch-geometric.com/whl/torch-1.7.0+cu102.html
>(graphgym) $ pip install torch-geometric
>(graphgym) $ git clone https://github.com/snap-stanford/DeepSNAP
>(graphgym) $ cd DeepSNAP
>(graphgym) $ pip install -e .
>(graphgym) $ cd ..
>(graphgym) $ git clone https://github.com/snap-stanford/GraphGym
>(graphgym) $ cd GraphGym/
>(graphgym) $ pip install -r requirements.txt
>(graphgym) $ pip install -e .
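A quick sanity check for this class of error (the expected values are assumptions that follow from the +cu102 wheels above, since those wheels need CUDA 10.2's libcusparse.so.10 to be resolvable at load time):

import torch
print(torch.__version__)          # expect 1.7.0
print(torch.version.cuda)         # expect '10.2' to match the +cu102 wheels
print(torch.cuda.is_available())  # False often points to a missing or mismatched CUDA runtime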
Thank you
Hey there,
it seems like there is a weird flow-of-control issue when registering data loaders.
If you simply register another loader function in contrib/loader/example.py
def load_dataset_example2(format, name, dataset_dir):
    pass
register_loader('example2', load_dataset_example2)
everything works fine and register.loader_dict is properly populated in loader.load_dataset.
However, if instead one creates a separate file contrib/loader/myloader.py with the same contents, the loader is not available in load_dataset.
Stepping through with the debugger shows that register_loader is indeed called for both loaders, but loader_dict is actually empty when either loader is added.
A wild guess is that register.py is run twice, and as such register_dict is being reset to {}.
Can anyone reproduce this? Or am I misunderstanding something?
Cheers,
Ben
Hi, I can't find MetaLink
Hi,
When I run the script 'run_metalink.sh' (in branch meta_link), an ImportError occurs. The traceback is shown below.
Traceback (most recent call last):
  File "main.py", line 8, in <module>
    from graphgym.config import cfg, dump_cfg, load_cfg, set_out_dir, set_run_dir
ImportError: cannot import name 'set_out_dir' from 'graphgym.config' (/usr/local/lib/python3.8/dist-packages/graphgym/config.py)
How can I fix the issue?
How hard would it be to implement a custom data loader module to support offline RL methods?
I encountered the issue when I tried run_single.sh. After checking the DeepSNAP documentation, I think this line should be commented out:
Line 211 in b36bdaf
Hello,
I am interested in reproducing ID-GNN's results on node classification. I looked at the code and have a few quick questions:
1. When doing node classification, are all configurations in https://github.com/snap-stanford/GraphGym/blob/master/run/grids/IDGNN/graph.txt?
2. Is it correct that dataset.augment_feature = 'node_identity' here means ID-GNN-Fast?
3. If 2 is true, may I know where the node identity, i.e. the cycle information, is used (where in the code) and how it is used at the code level?
Thank you so much for taking the time out of your busy schedule.
When I make a run with cfg.dataset.task_type = 'regression', the code crashes at the end of the run. The error message is:
Traceback (most recent call last):
  File "main_pyg.py", line 55, in <module>
    agg_runs(cfg.out_dir, cfg.metric_best)
  File "~/Code/GraphGym/graphgym/utils/agg_runs.py", line 100, in agg_runs
    [stats[metric] for stats in stats_list])
  File "~/Code/GraphGym/graphgym/utils/agg_runs.py", line 100, in <listcomp>
    [stats[metric] for stats in stats_list])
KeyError: 'accuracy'
The problem seems to be that accuracy is not a metric logged for regression tasks. Here are the relevant lines in agg_runs.py:
if metric_best == 'auto':
    metric = 'auc' if 'auc' in stats_list[0] else 'accuracy'
Here's a fix:
if metric_best == 'auto':
    if cfg.dataset.task_type == 'classification':
        metric = 'auc' if 'auc' in stats_list[0] else 'accuracy'
    elif cfg.dataset.task_type == 'regression':
        metric = 'mse'
I am running a graph classification model on ogb-molhiv.
My model's forward function is being passed a batch with the following fields/shapes:
Batch(G=[128], batch=[3512], edge_feature=[7498, 3], edge_index=[2, 7498], edge_label_index=[2, 7498], graph_label=[128, 1], node_feature=[3512, 128], node_label_index=[3512], task=[128])
This dataset has ~40,000 graphs with ~25 nodes per graph. The batch assignment is present in the batch.batch tensor, but how do I do batched training here, processing each graph independently, i.e. predicting y0 given nodeset0 and edgeset0?
I forked the repo and include a comparison here: master...jkamalu:gmt
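For reference, a minimal sketch of per-graph pooling with the batch vector (the shapes mirror the printout above; using torch_scatter here is my suggestion, chosen because it is already a GraphGym dependency, not necessarily what the repo does internally):

import torch
from torch_scatter import scatter_add

num_graphs = 128
x = torch.randn(3512, 128)                         # stand-in for batch.node_feature
batch_vec = torch.randint(0, num_graphs, (3512,))  # stand-in for batch.batch

# sum node embeddings per graph: one row per graph, aligned with graph_label
graph_emb = scatter_add(x, batch_vec, dim=0, dim_size=num_graphs)  # [128, 128]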
Hello,
I am trying to load a dataset while keeping the existing dataset split, as the masks already exist.
I realized there exists an argument that controls this:
@staticmethod
def pyg_to_graphs(
    dataset,
    verbose: bool = False,
    fixed_split: bool = False,
    tensor_backend: bool = False,
    netlib=None,
) -> List[Graph]:
r"""
Transform a :class: torch_geometric.data.Dataset object to a
list of :class:deepsnap.grpah.Graph objects.
Args:
dataset (:class:`torch_geometric.data.Dataset`): A
:class:`torch_geometric.data.Dataset` object that will be
transformed to a list of :class:`deepsnap.grpah.Graph`
objects.
verbose (bool): Whether to print information such as warnings.
fixed_split (bool): Whether to load the fixed data split from
the original PyTorch Geometric dataset.
tensor_backend (bool): `True` will use pure tensors for graphs.
netlib (types.ModuleType, optional): The graph backend module.
Currently DeepSNAP supports the NetworkX and SnapX (for
SnapX only the undirected homogeneous graph) as the graph
backend. Default graph backend is the NetworkX.
Returns:
list: A list of :class:`deepsnap.graph.Graph` objects.
"""
However, when I run it with fixed_split=True, it always fails with errors. Now I'm not sure whether it is implemented all the way through or is yet to be done.
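For concreteness, the call in question looks like this (pyg_dataset stands for a torch_geometric.data.Dataset instance; whether fixed_split is implemented end-to-end is exactly the open question):

from deepsnap.dataset import GraphDataset

graphs = GraphDataset.pyg_to_graphs(pyg_dataset, fixed_split=True)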
I would appreciate your help and instructions on how I can accomplish this.
Best,
I installed GraphGym step by step as the README describes, but I got this error:
Traceback (most recent call last):
  File "main.py", line 5, in <module>
    from torch_geometric import seed_everything
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_geometric/__init__.py", line 1, in <module>
    import torch_geometric.utils
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_geometric/utils/__init__.py", line 3, in <module>
    from .scatter import scatter
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_geometric/utils/scatter.py", line 7, in <module>
    import torch_geometric.typing
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_geometric/typing.py", line 37, in <module>
    import torch_sparse  # noqa
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_sparse/__init__.py", line 40, in <module>
    from .tensor import SparseTensor  # noqa
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_sparse/tensor.py", line 13, in <module>
    class SparseTensor(object):
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch/jit/_script.py", line 974, in script
    _compile_and_register_class(obj, _rcb, qualified_name)
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch/jit/_script.py", line 67, in _compile_and_register_class
    torch._C._jit_script_class_compile(qualified_name, ast, defaults, rcb)
RuntimeError:
Tried to access nonexistent attribute or method 'crow_indices' of type 'Tensor'.:
  File "/data02/home/scv9476/.conda/envs/graphgym_env/lib/python3.7/site-packages/torch_sparse/tensor.py", line 109
    def from_torch_sparse_csr_tensor(self, mat: torch.Tensor,
                                     has_value: bool = True):
        rowptr = mat.crow_indices()
                 ~~~~~~~~~~~~~~~~ <--- HERE
        col = mat.col_indices()
How can I solve this issue?
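A later report in this thread hits the same crow_indices error and resolves it by downgrading torch-sparse; that fix may apply here as well:

pip install torch-sparse==0.6.12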
Thank you for the nice library, but I ran into some troubles with MetaLink and would sincerely appreciate some help:
Some default settings in run/config/MetaLink/mol_classification.yaml cannot be found in graphgym/config.py, such as cfg.dataset.subgraph = False and cfg.kg.xxx = xxx. I copied some 'kg' settings from graphgym/contrib/config/metalink.py to graphgym/config.py for now, but I wonder if there is another config.py for MetaLink?
I have downloaded several datasets like tox21 and toxcast, but they seem unsuitable: I don't have some of the CSV files required by the function load_mol_datasets() in graphgym/contrib/loader/molecule.py, line 311. May I ask if you have more details about the dataset processing, especially about the CSV file 'tox21.csv'? Where can I download datasets in the suitable format?
Thank you so much, looking forward to hearing from you. @JiaxuanYou
The execution goes pretty smoothly and then at the very end outputs this:
train: {'epoch': 399, 'eta': 0.0, 'loss': 0.6025, 'lr': 0.0, 'params': 265218, 'time_iter': 0.0837, 'accuracy': 0.688, 'precision': 0.5556, 'recall': 0.0324, 'f1': 0.0612, 'auc': 0.6239}
val: {'epoch': 399, 'loss': 0.6162, 'lr': 0, 'params': 265218, 'time_iter': 0.0465, 'accuracy': 0.6667, 'precision': 0.6, 'recall': 0.0361, 'f1': 0.0682, 'auc': 0.6527}
test: {'epoch': 399, 'loss': 0.6214, 'lr': 0, 'params': 265218, 'time_iter': 0.0448, 'accuracy': 0.6545, 'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'auc': 0.6245}
Check point saved: results/example/1/ckpt/399.ckpt
Task done, results saved in results/example/1
359
{'epoch': 359, 'loss': 0.6233, 'lr': 0, 'params': 265218, 'time_iter': 0.0455, 'accuracy': 0.6585, 'precision': 0.0, 'recall': 0.0, 'f1': 0.0, 'auc': 0.6213}
{'epoch': 359, 'eta': 210.5677, 'loss': 0.6055, 'lr': 0.0003, 'params': 265218, 'time_iter': 0.0848, 'accuracy': 0.689, 'precision': 0.6667, 'recall': 0.0194, 'f1': 0.0377, 'auc': 0.6138}
{'epoch': 359, 'loss': 0.6175, 'lr': 0, 'params': 265218, 'time_iter': 0.0472, 'accuracy': 0.6707, 'precision': 0.75, 'recall': 0.0361, 'f1': 0.069, 'auc': 0.655}
239
{'epoch': 239, 'eta': 2.2639, 'loss': 0.0281, 'lr': 0.0035, 'params': 632328, 'time_iter': 0.0143, 'accuracy': 0.9991}
{'epoch': 239, 'loss': 0.4795, 'lr': 0, 'params': 632328, 'time_iter': 0.0079, 'accuracy': 0.8819}
359
{'epoch': 359, 'eta': 0.573, 'loss': 0.0139, 'lr': 0.0003, 'params': 632328, 'time_iter': 0.0154, 'accuracy': 1.0}
{'epoch': 359, 'loss': 0.4917, 'lr': 0, 'params': 632328, 'time_iter': 0.0073, 'accuracy': 0.8875}
Traceback (most recent call last):
  File "main.py", line 60, in <module>
    agg_runs(get_parent_dir(out_dir_parent, args.cfg_file), cfg.metric_best)
  File "/nfs/data/patients_networks/olga_scripts/GraphGym/graphgym/utils/agg_runs.py", line 110, in agg_runs
    results[key][i] = agg_dict_list(results[key][i])
  File "/nfs/data/patients_networks/olga_scripts/GraphGym/graphgym/utils/agg_runs.py", line 47, in agg_dict_list
    value = np.array([dict[key] for dict in dict_list])
  File "/nfs/data/patients_networks/olga_scripts/GraphGym/graphgym/utils/agg_runs.py", line 47, in <listcomp>
    value = np.array([dict[key] for dict in dict_list])
KeyError: 'precision'
A results folder is produced, but it contains only accuracy metrics and no "test" folder. I suspect it may be related to the precision being equal to 0 on the test set.
I tried to execute run_batch.sh. It seems that there is no mechanism to limit the number of concurrently spawned trials: more and more processes kept running on my GPUs, and I observed CUDA out-of-memory errors.
Passing a NetworkX list of graphs as an input
Hello,
I am interested in reproducing ID-GNN's results on graph classification. I looked at the code and have a few quick questions:
1. Are all configurations listed in https://github.com/snap-stanford/GraphGym/blob/master/run/grids/IDGNN/graph_enzyme.txt, i.e. is the major argument I need to change dataset.augment_feature? I am mainly interested in reproducing the results in Table 6.
2. There are ID-GNN and ID-GNN-Fast. Are both implemented in this repository?
3. How is the heterogeneous message passing implemented?
Thank you very much!!
Thank you for sharing this great work. It seems that you use only even numbers for the number of message-passing layers, i.e. 2, 4, 6, 8. May I ask the reason for doing so?
I don't know why the configuration files generated by grid search are different from the initial settings; I didn't enumerate those changes in my grid.
I want to apply GNNs to new applications, i.e. Scenario 2 ("You want to apply GNN to your exciting applications.").
I see all the models are for predictive tasks. I am wondering whether you are planning to include generative models in the future?
In particular, there are no examples of any Graph2Seq-based models. It would be awesome to consider including any of the following Graph2Seq-based generative models:
Graph-to-Sequence Learning using Gated Graph Neural Networks, ACL’18
https://github.com/beckdaniel/acl2018_graph2seq
Using MXNet
Densely Connected Graph Convolutional Networks for Graph-to-Sequence Learning, ACL’19
https://github.com/Cartus/DCGCN
Using MXNet 1.3.0
Gated Graph Sequence Neural Networks, Y. Li, D. Tarlow, M. Brockschmidt, and R. Zemel.
Heterogeneous Graph Transformer for Graph-to-Sequence Learning, ACL’20
https://github.com/QAQ-v/HetGT
Including any other generative model would be a great starting point as well.
I am really interested to know your thoughts in this regard.
I see that Section 7.2 (Experimental Setup) in the paper says:
For all the experiments in Sections 7.3 and 7.4, we use a consistent setup, where results on three random 80%/20% train/val splits are averaged, and the validation performance in the final epoch is reported.
So in Sections 7.3 and 7.4, the performance used in the ranking analysis is always validation performance.
First, "the validation performance in the final epoch" means the run goes through all the epochs and the last one is the final epoch, am I right?
Second, I am wondering why early stopping isn't used together with the test performance mentioned below (i.e., the test performance at the best validation epoch), given the discussion of how to report the performance (e.g., final epoch or the best validation epoch) in Section 7.1.
GraphGym/graphgym/models/gnn.py
Line 157 in cf65c93
When cfg.gnn.layers_mp == 1, the GNN should have one message-passing layer, but the code here will not build self.mp.
The correct code may be:
if cfg.gnn.layers_mp > 0:
Hi, in the paper "Relational Multi-Task Learning: Modeling Relations between Data and Tasks", the authors said the code has been released in this repository, but we could not find it. Could we know which part of the code is about MetaLink? Thanks.
I am a beginner with GNNs. How do I choose a library to study?
In graphgym/contrib/network/metalink.py, the class MetaLink seems to lack a forward function.
torch_geometric.data.Data.keys() is a method. Since the code mentioned above tries to iterate over the method itself instead of calling it and iterating over the result, I was getting TypeError: 'method' object is not iterable when I ran run_single_pyg.sh.
Please modify the code to resolve this issue.
Since this is a minor bug and I'm still learning this wonderful library, I thought it was worth pointing out.
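A minimal illustration of the incompatibility (assuming a recent PyG where Data.keys is a method rather than a property; the Data contents are arbitrary):

import torch
from torch_geometric.data import Data

data = Data(x=torch.randn(3, 4), edge_index=torch.tensor([[0, 1], [1, 2]]))

# Data.keys was a property on older PyG releases and became a method later,
# so 'for key in data.keys' raises TypeError on the newer versions.
# A version-agnostic way to iterate:
keys = data.keys() if callable(data.keys) else data.keys
for key in keys:
    print(key)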
Hi,
this is not technically an issue but more of a general question; I apologize if this is not the best place to ask.
Looking at the GeneralConv layer, it seems that the representation for a given node is, by default, only computed from the neighbours and not the node itself:
GraphGym/graphgym/contrib/layer/generalconv.py, Lines 85 to 86 in 77f1e7a
This is only true if normalize=False in the layer, because the norm staticmethod has a call to add_remaining_self_loops that effectively adds nodes into their own neighbourhood:
GraphGym/graphgym/contrib/layer/generalconv.py, Lines 51 to 52 in 77f1e7a
If normalize=False, the call to norm is skipped, and so a node is not a neighbour of itself.
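To make the difference concrete, a small sketch using the PyG utility in question (the utility name matches PyG; the toy graph is mine):

import torch
from torch_geometric.utils import add_remaining_self_loops

edge_index = torch.tensor([[0, 1], [1, 2]])  # edges 0->1 and 1->2

# normalize=True path: norm() adds self-loops, so every node aggregates itself
edge_index_sl, _ = add_remaining_self_loops(edge_index, num_nodes=3)
print(edge_index_sl)  # now also contains 0->0, 1->1, 2->2

# normalize=False path: edge_index is used as-is, so a node never sees its own features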
I was wondering if this is intentional, and if so what is the rationale behind this design choice.
I am re-implementing this layer for another GNN library and I need to make a choice about whether to process nodes as part of their neighbourhood.
I know that 1) in the paper and 2) in the code there is no ambiguity about this, but it sounds strange to me, so I wanted to double-check with you :D
Thanks
Thanks for the exciting program!
I am struggling to find the optimal GNN model for the node classification task. The problem is caused by too much freedom of choice within and between layers; in other words, there are too many models to choose from and too many hyperparameters to optimize.
Referring to ogb's leaderboard to find the optimal model is a potential solution, but as the paper showed,
the best GNN designs for different tasks differ drastically.
From my understanding, GraphGym has provided the idea that similar tasks can share the optimal model design.
As mentioned,
GraphGym provides a simple interface to try out thousands of GNNs in parallel and understand the best designs for your specific task.
GraphGym also recommends a "go-to" GNN design space, after investigating 10 million GNN model-task combinations.
I would like to know if there are off-the-shelf model-task combinations that I can use directly, without using the interface to try out GNN designs.
Hi! Thank you for the great tool for working with GNN! However, it seems to me that there is probably an issue with the create_dataset() function in loader.py. Specifically, when calling GraphDataset(), it assigns "cfg.dataset.resample_disjoint" to "resample_disjoint". However, the GraphDataset does not have an attribute "resample_disjoint". I am wondering whether that should be "edge_train_mode" instead.
Hi,
I'm getting the following error when I try to use the TU_IMDB dataset:
Traceback (most recent call last):
  File "/Users/psanchez/Documents/GitHub/transformer_message_passing/run/main.py", line 42, in <module>
    datasets = create_dataset()
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/graphgym-0.3.1-py3.9.egg/graphgym/loader.py", line 197, in create_dataset
    graphs = load_dataset()
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/graphgym-0.3.1-py3.9.egg/graphgym/loader.py", line 111, in load_dataset
    graphs = load_pyg(name, dataset_dir)
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/graphgym-0.3.1-py3.9.egg/graphgym/loader.py", line 74, in load_pyg
    graphs = GraphDataset.pyg_to_graphs(dataset_raw)
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/deepsnap/dataset.py", line 1276, in pyg_to_graphs
    return [
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/deepsnap/dataset.py", line 1277, in <listcomp>
    Graph.pyg_to_graph(
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/deepsnap/graph.py", line 2027, in pyg_to_graph
    Graph.add_node_attr(G, key, value)
  File "/Users/psanchez/miniconda3/envs/transformer_mp/lib/python3.9/site-packages/deepsnap/graph.py", line 1911, in add_node_attr
    attr_dict = dict(zip(node_list, node_attr))
TypeError: 'int' object is not iterable
I'm running main.py with the following dataset configuration file:
out_dir: results
dataset:
  format: PyG
  name: TU_IMDB
  task: graph
  task_type: classification
  transductive: False
  split: [0.8, 0.2]
  augment_feature: []
  augment_feature_dims: [10]
  augment_feature_repr: position
  augment_label: ''
  augment_label_dims: 5
  transform: none
train:
  batch_size: 32
  eval_period: 20
  ckpt_period: 100
model:
  type: gnn
  loss_fun: cross_entropy
  edge_decoding: dot
  graph_pooling: add
gnn:
  layers_pre_mp: 1
  layers_mp: 2
  layers_post_mp: 1
...
Any idea why this might be happening?
Thanks a lot in advance.
Hello!
This project is truly amazing, thank you. That said, I'm finding it difficult to apply it to my own datasets. Naturally, I would like to customize the grid search; however, I'm not sure what the valid options are for each field in the configuration. The valid options I know come from the example configs and grids in the repo, but a comprehensive list for each field would be greatly appreciated. Is there any existing documentation on this matter?
I'm also unsure about how to register my datasets. At which point in the pipeline should the customized version of graphgym/contrib/loader/example.py be run? I'm guessing before the config-generation script, since the configs must include the dataset information. Still, I'm unsure how this piece of code fits into the pipeline.
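For what it's worth, a minimal registration sketch modeled on contrib/loader/example.py (the dataset name 'mydata' and the loading body are hypothetical):

from graphgym.register import register_loader

def load_dataset_mydata(format, name, dataset_dir):
    # hypothetical: return the graphs when the config names this dataset
    if format == 'PyG' and name == 'mydata':
        pass  # load and return your dataset here

register_loader('mydata', load_dataset_mydata)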
Thank you in advance.
Hello!
Thank you for producing awesome library.
I'm new to Graph Neural Networks, and I am using Windows 10.
I found that some files in the guideline are not usable on Windows (like the .sh files).
Is there any constraint on actually using GraphGym on Windows?
Lines 22 to 25 in fa133e5
When the shape of the label vector true is [node_num, 1] instead of [node_num], lines 22-23 will misinterpret the task as a multi-task binary classification task rather than a multi-class classification task, so the pred vector will be wrongly flattened. The correct code should be:
pred = pred.squeeze(-1) if pred.ndim > 1 else pred
true = true.squeeze(-1) if true.ndim > 1 else true
if true.ndim > 1 and cfg.model.loss_fun == 'cross_entropy':
    pred, true = torch.flatten(pred), torch.flatten(true)
This moves lines 24-25 in front of lines 22-23.
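A tiny shape check illustrating the problem (a toy example with node_num = 5):

import torch

true = torch.zeros(5, 1)       # label vector shaped [node_num, 1]
# with the original ordering, true.ndim > 1 already triggers the multi-task
# branch; squeezing first restores [node_num] so the multi-class path is taken
print(true.squeeze(-1).shape)  # torch.Size([5])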
I am not sure how to use the modules I register. Thank you
Thank you for the nice library. I noticed that under configs/IDGNN/graph_enzyme.yaml, the name of the dataset is ba500. I guess this is a bug? Would you like to update the yaml file? Thank you!
-- EDIT --
Hi, I experienced an issue when following your instructions and testing the installation.
I'm using Ubuntu 20.04 and installed the CPU version of PyTorch.
When testing the single experiment (bash run_single_cpu.sh), I got the following error:
Traceback (most recent call last):
  File "main.py", line 5, in <module>
    from torch_geometric import seed_everything
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/__init__.py", line 4, in <module>
    import torch_geometric.data
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/data/__init__.py", line 1, in <module>
    from .data import Data
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_geometric/data/data.py", line 9, in <module>
    from torch_sparse import SparseTensor
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_sparse/__init__.py", line 41, in <module>
    from .tensor import SparseTensor  # noqa
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_sparse/tensor.py", line 13, in <module>
    class SparseTensor(object):
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch/jit/_script.py", line 974, in script
    _compile_and_register_class(obj, _rcb, qualified_name)
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch/jit/_script.py", line 67, in _compile_and_register_class
    torch._C._jit_script_class_compile(qualified_name, ast, defaults, rcb)
RuntimeError:
Tried to access nonexistent attribute or method 'crow_indices' of type 'Tensor'.:
  File "/home/flo/miniconda3/envs/graphgym/lib/python3.7/site-packages/torch_sparse/tensor.py", line 109
    def from_torch_sparse_csr_tensor(self, mat: torch.Tensor,
                                     has_value: bool = True):
        rowptr = mat.crow_indices()
                 ~~~~~~~~~~~~~~~~ <--- HERE
        col = mat.col_indices()
I found a seemingly closely related problem, see this post: rusty1s/pytorch_sparse#207
and tried to resolve it by downgrading to an older torch-sparse version:
pip install torch-sparse==0.6.12
This seemed to fix the issue.
Hi,
In the current version, there is no option for sampling neighbours during message passing, and no support for multi-label node classification on datasets such as PPI, right? Could you let me know if I am missing something?
Thank You.
Hi @JiaxuanYou,
Thank you for making this code base available!
I do not have an NVIDIA GPU, so I tried to run the example on CPU. However, I ran into errors because the logger was calling the get_current_gpu_usage() function in device.py.
To fix this, I wrapped this function in the same conditional as the auto_select_device() function, like so:
def get_current_gpu_usage():
    if cfg.device != 'cpu' and torch.cuda.is_available():
        result = subprocess.check_output(
            [
                'nvidia-smi', '--query-compute-apps=pid,used_memory',
                '--format=csv,nounits,noheader'
            ], encoding='utf-8')
        current_pid = os.getpid()
        used_memory = 0
        for line in result.strip().split('\n'):
            line = line.split(', ')
            if current_pid == int(line[0]):
                used_memory += int(line[1])
        return used_memory
    else:
        return 0
Just wanted to let you know. You may have a better way to do it :)
Kyle
When I run bash run_single.sh, some errors appear, but when I run bash run_batch.sh, it is OK. I am not sure whether the installation was successful.
When installing PyTorch, there is no check for a missing g++ compiler.