[FEATURE] The WeChat code is out of date.

As titled. Hopefully there can be more discussion groups on WeChat or QQ. Thanks for the great work.

doc example: dataset.slices.update [BUG]

Describe the bug
Following the documentation example on feature engineering (FE):


# 1. Choose a dataset.
from autogl.datasets import build_dataset_from_name
data = build_dataset_from_name('cora')

# 2. Compose a feature engineering pipeline
from autogl.module.feature import BaseFeature,AutoFeatureEngineer
from autogl.module.feature.generators import GeEigen
from autogl.module.feature.selectors import SeGBDT
from autogl.module.feature.graph import SgNetLSD
# you may compose feature engineering bases through BaseFeature.compose
fe = BaseFeature.compose([
    GeEigen(size=32),
    SeGBDT(fixlen=100),
    SgNetLSD()
])
# or just through '&' operator
fe = fe & AutoFeatureEngineer(fixlen=200,max_epoch=3)

# 3. Fit and transform the data
fe.fit(data)
data1=fe.transform(data,inplace=False)

I always get

Traceback (most recent call last):
File "", line 1, in
File "/home/ruhkopf/miniconda3/envs/AutoGL/lib/python3.9/site-packages/autogl-0.2.0rc0-py3.9.egg/autogl/module/feature/base.py", line 113, in fit
dataset = self._rebuild(dataset, _dataset)
File "/home/ruhkopf/miniconda3/envs/AutoGL/lib/python3.9/site-packages/autogl-0.2.0rc0-py3.9.egg/autogl/module/feature/base.py", line 55, in _rebuild
dataset.slices.update(slices)
AttributeError: 'NoneType' object has no attribute 'update'
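
For reference, the failure in the traceback can be reproduced without AutoGL: it is a call to .update on an attribute that is None. Below is a minimal sketch of the pattern and one possible guard. FakeDataset and merge_slices are hypothetical stand-ins, not AutoGL's classes or its actual fix, and the idea that slices is None for a single-graph dataset is an assumption.

```python
class FakeDataset:
    """Hypothetical stand-in for a dataset whose `slices` may be None."""

    def __init__(self, slices=None):
        self.slices = slices


def merge_slices(dataset, new_slices):
    """Merge `new_slices` into `dataset.slices`, tolerating a None value."""
    if dataset.slices is None:
        # Assumption: a single-graph dataset may carry no slice index at all.
        dataset.slices = {}
    dataset.slices.update(new_slices)
    return dataset


single_graph = FakeDataset(slices=None)
merge_slices(single_graph, {"x": [0, 2708]})
print(single_graph.slices)
```

Without the None check, the last call raises the same AttributeError as in the report.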

Environment (please complete the following information):

OS: Linux Ubuntu 20.04 LTS
python version: 3.9.7
autogl version: 0.2.0-pre
pip list: conda list

packages in environment at /home/ruhkopf/miniconda3/envs/AutoGL:

pip list
Package Version

astor 0.8.1
autogl 0.2.0rc0
bayesian-optimization 1.2.0
Bottleneck 1.3.2
brotlipy 0.7.0
certifi 2021.10.8
cffi 1.14.5
charset-normalizer 2.0.4
chocolate 0.0.2
click 8.0.3
cloudpickle 2.0.0
colorama 0.4.4
configparser 5.1.0
contextlib2 21.6.0
cryptography 35.0.0
cycler 0.11.0
dill 0.3.4
docker-pycreds 0.4.0
filelock 3.3.2
fonttools 4.28.1
future 0.18.2
gitdb 4.0.9
GitPython 3.1.24
googledrivedownloader 0.4
hyperopt 0.1.2
idna 3.2
isodate 0.6.0
Jinja2 3.0.2
joblib 1.1.0
json-tricks 3.15.5
kiwisolver 1.3.2
lightgbm 3.3.1
littleutils 0.2.2
MarkupSafe 2.0.1
matplotlib 3.5.0rc1
mkl-fft 1.3.1
mkl-random 1.2.2
mkl-service 2.4.0
NetLSD 1.0.2
networkx 2.6.3
nni 2.5
numexpr 2.7.3
numpy 1.21.2
ogb 1.3.2
outdated 0.2.1
packaging 21.2
pandas 1.3.4
pathtools 0.1.2
Pillow 8.4.0
pip 21.2.4
prettytable 2.4.0
promise 2.3
protobuf 3.19.1
psutil 5.8.0
pycparser 2.21
pymongo 3.12.1
pyOpenSSL 21.0.0
pyparsing 2.4.7
PySocks 1.7.1
python-dateutil 2.8.2
python-louvain 0.15
PythonWebHDFS 0.2.3
pytz 2021.3
PyYAML 6.0
rdflib 6.0.2
requests 2.26.0
responses 0.15.0
schema 0.7.4
scikit-learn 1.0.1
scipy 1.7.1
sentry-sdk 1.4.3
setuptools 58.0.4
setuptools-scm 6.3.2
shortuuid 1.0.8
simplejson 3.17.5
six 1.16.0
smmap 5.0.0
subprocess32 3.5.4
tabulate 0.8.9
termcolor 1.1.0
threadpoolctl 2.2.0
tomli 1.2.2
torch 1.10.0
torch-cluster 1.5.9
torch-geometric 2.0.2
torch-scatter 2.0.9
torch-sparse 0.6.12
torch-spline-conv 1.2.1
tqdm 4.62.3
typing-extensions 3.10.0.2
urllib3 1.26.7
wandb 0.12.6
wcwidth 0.2.5
websockets 10.0
wheel 0.37.0
yacs 0.1.6
yaspin 2.1.0

AttributeError: 'CoraDataset' object has no attribute 'data' [BUG]

Describe the bug
When computing the accuracy as shown in the documentation on the Cora dataset, I get this error:
AttributeError: 'CoraDataset' object has no attribute 'data'

To Reproduce
Steps to reproduce the behavior:

import torch
from autogl.datasets import build_dataset_from_name
from autogl.solver import AutoNodeClassifier
from autogl.module.train import Acc

cora_dataset = build_dataset_from_name('cora')

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

solver = AutoNodeClassifier(
    feature_module='deepgl',
    graph_models=['gcn', 'gat'],
    hpo_module='anneal',
    ensemble_module='voting',
    device=device
)

solver.fit(cora_dataset, time_limit=3600)

predicted = solver.predict_proba()
print('Test accuracy: ', Acc.evaluate(predicted,
    cora_dataset.data.y[cora_dataset.data.test_mask].cpu().numpy()))

Expected behavior
It should print the accuracy:
Test accuracy: 0.824

Environment (please complete the following information):

OS: [Google Colab / Linux]
python version: Python 3.7.13
autogl version: 0.3.1

Additional Info (Optional)
Torch version is ('1.12.1', 'cu113')

I installed the following:
!pip install torch-scatter -f https://data.pyg.org/whl/torch-{tver}+{cuver}.html
!pip install torch-sparse -f https://data.pyg.org/whl/torch-{tver}+{cuver}.html
!pip install torch-geometric
!pip install dgl-cu113 dglgo -f https://data.dgl.ai/wheels/repo.html
!pip install netlsd
!pip install chocolate
!python -m pip install git+https://github.com/fmfn/BayesianOptimization
!pip install autogl

[UserWarning]: 'data.DataLoader' is deprecated, use 'loader.DataLoader' instead

Please update the code so that this UserWarning no longer appears.

[BUG] cannot run examples

Describe the bug
FileNotFoundError: [Errno 2] No such file or directory: '../configs/nodeclf_nas_gasso.yml'
To Reproduce
Steps to reproduce the behavior:

Go to 'AutoGl'
Run 'python examples/gasso_test.py'

Expected behavior
I expect to be able to reproduce the results in the GASSO paper.

Environment (please complete the following information):

OS: PopOS 21.10
python version:3.8
autogl version: 0.2.0-pre
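
The path in the error is relative ('../configs/...'), so it presumably resolves only when the script is started from inside the examples directory; this is a guess from the error message, not something confirmed in the report. A minimal sketch of resolving a config path against the script's own location instead of the current working directory follows. resolve_config and the /repo layout are hypothetical.

```python
from pathlib import Path


def resolve_config(script_file, relative_path):
    """Resolve `relative_path` against the script's directory, not the CWD."""
    script_dir = Path(script_file).resolve().parent
    return (script_dir / relative_path).resolve()


# Hypothetical layout: examples/gasso_test.py next to a sibling configs/ folder.
config = resolve_config(
    "/repo/examples/gasso_test.py", "../configs/nodeclf_nas_gasso.yml"
)
print(config)
```

Inside a script, passing `__file__` as script_file gives the same result regardless of where python is invoked from.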

[BUG] AutoGL import freezes when the backend is PyG

Importing autogl freezes when PyG is selected as the backend.
When DGL is selected as the backend, the import works fine.

Environment (please complete the following information):

GNU/Linux (kernel=5.10.131)
python version: 3.7.13
autogl version: branch of dev
Pytorch: 1.12
Pytorch-Geometric: 2.0.4
DGL: 0.9.0

To Reproduce
Steps to reproduce the behavior:

Run the Python interpreter with the PyG backend selected: AUTOGL_BACKEND=pyg python.
Run import autogl in the Python shell; the import then freezes.
Press Ctrl + C to interrupt the process; you should see the trace below.
See the logs.

Trace
Traceback (most recent call last):
File "backend.py", line 2, in
import autogl
File "/home/featurize/PycharmProjects/AutoGL-develop/AutoGL-dev/autogl/__init__.py", line 1, in
from . import (
File "/home/featurize/PycharmProjects/AutoGL-develop/AutoGL-dev/autogl/datasets/__init__.py", line 14, in
from ._ogb import (
File "/home/featurize/PycharmProjects/AutoGL-develop/AutoGL-dev/autogl/datasets/_ogb.py", line 4, in
from ogb.nodeproppred import NodePropPredDataset
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/ogb/nodeproppred/__init__.py", line 1, in
from .evaluate import Evaluator
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/ogb/nodeproppred/evaluate.py", line 1, in
from sklearn.metrics import roc_auc_score
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/sklearn/__init__.py", line 82, in
from .base import clone
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/sklearn/base.py", line 17, in
from .utils import _IS_32BIT
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/sklearn/utils/__init__.py", line 25, in
from . import _joblib
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/sklearn/utils/_joblib.py", line 7, in
import joblib
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/joblib/__init__.py", line 113, in
from .memory import Memory, MemorizedResult, register_store_backend
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/joblib/memory.py", line 32, in
from ._store_backends import StoreBackendBase, FileSystemStoreBackend
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/joblib/_store_backends.py", line 15, in
from .backports import concurrency_safe_rename
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/joblib/backports.py", line 7, in
from distutils.version import LooseVersion
File "", line 983, in _find_and_load
File "", line 963, in _find_and_load_unlocked
File "", line 906, in _find_spec
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/_distutils_hack/__init__.py", line 97, in find_spec
return method()
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/_distutils_hack/__init__.py", line 108, in spec_for_distutils
mod = importlib.import_module('setuptools._distutils')
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/setuptools/__init__.py", line 16, in
import setuptools.version
File "/environment/miniconda3/envs/AutoGL-develop/lib/python3.7/site-packages/setuptools/version.py", line 1, in
import pkg_resources
File "", line 202, in _lock_unlock_module
File "", line 98, in acquire
KeyboardInterrupt

Screenshots
(omitted)
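
The Ctrl + C trace only shows the main thread, which is waiting on an import lock. A possible way to gather more information is the standard library's faulthandler, which can dump the stack of every thread after a timeout. The sketch below is only a diagnostic aid under that assumption; import_with_watchdog and the log file name are hypothetical, and it is demonstrated on a stdlib module rather than on autogl.

```python
import faulthandler
import importlib
import os
import tempfile


def import_with_watchdog(module_name, log_path, timeout=30.0):
    """Import a module; if the import takes longer than `timeout` seconds,
    dump the stack of every thread to `log_path` without killing the process."""
    with open(log_path, "w") as log:
        faulthandler.dump_traceback_later(timeout, exit=False, file=log)
        try:
            return importlib.import_module(module_name)
        finally:
            # Disarm the watchdog once the import returns or raises.
            faulthandler.cancel_dump_traceback_later()


log_path = os.path.join(tempfile.gettempdir(), "import_watchdog.log")
# Demonstrated on a stdlib module; swap in "autogl" to inspect the freeze.
module = import_with_watchdog("json", log_path, timeout=10.0)
print(module.__name__)
```

If the import really freezes, the log file should show which thread holds the lock that the main thread is waiting for.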

[BUG] from this import d

It seems that the first line of https://github.com/THUMNLab/AutoGL/tree/main/autogl/module/model/pyg/robust/gnnguard.py on the main branch should be deleted:

from this import d

Otherwise, 'The Zen of Python' is printed every time the examples are run:

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
The Zen of Python, by Tim Peters
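
To catch stray imports like this before they land, one option is a small static check over the source text. The sketch below uses the standard ast module, so the scanned code is parsed but never executed; find_this_imports is a hypothetical helper, not part of AutoGL.

```python
import ast


def find_this_imports(source):
    """Return line numbers of statements that import the `this` module."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            if any(alias.name == "this" for alias in node.names):
                hits.append(node.lineno)
        elif isinstance(node, ast.ImportFrom):
            # level == 0 excludes relative imports such as `from .this import x`.
            if node.module == "this" and node.level == 0:
                hits.append(node.lineno)
    return sorted(hits)


sample = "from this import d\nimport torch\n"
print(find_this_imports(sample))
```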

Todo list for v0.2

@THUMNLab/aglteam
We are now very close to the new version. Thanks to all the developers for your hard work and contributions!
We still need the following things done before we finally reach v0.2:

tutorial/documents (docstring) on sampling @CoreLeader
examples on sampling @CoreLeader
tutorial/documents (docstring) on nas @general502570 @wondergo2017
examples on nas @general502570 @wondergo2017
update dependency and version number @Frozenmad
update readme and solver tutorial @Frozenmad

Let's keep working on new versions!

[BUG] autogl 0.1.1 is not compatible with torch-geometric 1.6.3

Describe the bug
When I run graph_classification.py, I get the following error.
How can I fix it? Any advice will be greatly appreciated.
Thanks

```
~/src/AutoGL/examples$ python graph_classification.py

Traceback (most recent call last):
File "graph_classification.py", line 19, in
cv_split=10,
File "/home/user/src/AutoGL/autogl/solver/classifier/graph_classifier.py", line 294, in fit
self.feature_module.fit(dataset.train_split)
File "/home/user/src/AutoGL/autogl/module/feature/base.py", line 116, in fit
dataset = self._rebuild(dataset, _dataset)
File "/home/user/src/AutoGL/autogl/module/feature/base.py", line 57, in _rebuild
data, slices = dataset.collate(datalist)
File "/home/user/miniconda3/envs/grapl/lib/python3.7/site-packages/torch_geometric/data/in_memory_dataset.py", line 128, in collate
dim=data.cat_dim(key, item))
RuntimeError: Sizes of tensors must match except in dimension 0. Got 1 and 2 in dimension 1 (The offending index is 5)

```

Environment (please complete the following information):

OS: [Ubuntu20.04]
python version: [Python 3.7.9]
autogl version: [0.1.1]

Could you please provide an unsupervised learning example, especially for heterogeneous graphs?

The online docs only have some content on SSL, and the guidance is not sufficient. I found it difficult to figure this out by reading the source code myself. Could you please provide a clear sample of using 'GraphCLUnsupervisedTrainer' and make the online docs more complete? Thanks a lot.

Can't run examples

Hi, I just tried to run examples/node_classification.py, but it reports the error ModuleNotFoundError: No module named 'networkx.algorithms.efficiency_measures'. I'm sure that I have installed networkx 2.3. Could you tell me what I should do to run the example successfully?

Performance check for refactored branch

@CoreLeader @lihy96
Hi all, I've merged dev-szx/sampling-reproduction into branch refactor_test, a copy of branch dev, and added the performance check code under benchmark.
@lihy96 Please help run the comparison experiments on the new branch.
@CoreLeader I noticed a bug when running pc_link_prediction_trainer.py. The error message is:

Traceback (most recent call last):
  File "pc_link_prediction_trainer.py", line 75, in <module>
    trainer.train(dataset, keep_valid_result=True)
  File "../autogl/module/train/link_prediction.py", line 270, in train
    self.train_only(data)
  File "../autogl/module/train/link_prediction.py", line 200, in train_only
    z = self.model.model.lp_encode(data)
  File "../autogl/module/model/gcn.py", line 228, in lp_encode
    data.x = self.__sequential_encoding_layers[i](data)
  File "/home/guancy/.pyenv/versions/miniconda3-4.3.30/envs/autograph/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "../autogl/module/model/gcn.py", line 54, in forward
    edge_weight: _typing.Optional[torch.Tensor] = getattr(data, "edge_weight")
AttributeError: 'Data' object has no attribute 'edge_weight'

Can you help fix this bug? You can fix it directly on this branch (refactor_test), since all the code inside benchmark will be kept for future development. When the branch is ready, you can merge it into your dev branch~
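
The last frame of the traceback calls getattr(data, "edge_weight") without a default, which raises when the attribute is absent even though the annotation is Optional. A minimal sketch of the difference, using a hypothetical FakeData stand-in rather than the real Data class:

```python
class FakeData:
    """Hypothetical stand-in for a graph data object without `edge_weight`."""

    def __init__(self, x):
        self.x = x


data = FakeData(x=[1.0, 2.0])

# Two-argument getattr raises when the attribute is missing,
# which matches the AttributeError in the traceback.
try:
    getattr(data, "edge_weight")
    raised = False
except AttributeError:
    raised = True

# Three-argument getattr returns the default instead.
edge_weight = getattr(data, "edge_weight", None)
print(raised, edge_weight)
```

Whether a None default is the right behavior for the GCN layer is a design decision for the maintainers; the sketch only shows why the current call fails.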

[FEATURE] How to save the solver to a file on disk and load it?

[FEATURE] Clear description of building own dataset

Hello!
Thanks for your project!

I think an extended description of how to create your own dataset is needed, one that does not rely on references to other sources, since the descriptions there also leave much to be desired.
Alternatively, the simplest complete example from scratch would do.

I see this need because the main use case seems to be running autogl on the user's own graphs, and many people do not want to dig into the intricacies of how other libraries build datasets (this might not be a big problem if the documentation of those libraries had proper examples).

Thanks!

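[OSError: [WinError 127] The specified procedure could not be found] caused by installing pytorch-sparse

Describe the bug
After dgl and autogl were installed successfully, the verification import fails. With pytorch-sparse installed, python.exe cannot locate the entry point; after pytorch-sparse is uninstalled, the import reports that the module does not exist. The problem goes in circles around torch_sparse.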

To Reproduce

import autogl
At first, torch_sparse could not be found, so I installed the wheel.
After it installed successfully, I ran import autogl again.
This produced another error: OSError: [WinError 127] The specified procedure could not be found.

import autogl
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\All Users\anaconda3\envs\autogl\lib\site-packages\autogl\__init__.py", line 1, in
from . import (
File "C:\Users\All Users\anaconda3\envs\autogl\lib\site-packages\autogl\datasets\__init__.py", line 14, in
from ._ogb import (
File "C:\Users\All Users\anaconda3\envs\autogl\lib\site-packages\autogl\datasets\_ogb.py", line 8, in
from torch_sparse import SparseTensor
File "C:\Users\llzhang\AppData\Roaming\Python\Python38\site-packages\torch_sparse\__init__.py", line 18, in
torch.ops.load_library(spec.origin)
File "C:\Users\All Users\anaconda3\envs\autogl\lib\site-packages\torch\_ops.py", line 255, in load_library
ctypes.CDLL(path)
File "C:\Users\All Users\anaconda3\envs\autogl\lib\ctypes\__init__.py", line 373, in __init__
self._handle = _dlopen(self._name, mode)
OSError: [WinError 127] The specified procedure could not be found.
A warning dialog also pops up, as shown in the figure below:

The procedure entry point could not be located in the dynamic link library \torch_sparse\_version_cpu.pyd

Expected behavior
import autogl should succeed. What is the best way to install torch_sparse so that it installs and runs without errors?

Environment (please complete the following information):

OS: [Windows 10]
python version: [3.8.17]
autogl version: [0.4.0]
pip list: [# packages in environment at C:\Users\All Users\anaconda3\envs\autogl:

Additional Info (Optional)

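Several copies of the dynamic link library _version_cpu that the warning says cannot be located exist on my machine. Replacing the file at the reported location with them did not solve the problem.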

GraphClassification not working.

""""""""""""""""""""""""""""""""""""""""
HPO Search Phase:

0%| | 0/10 [00:00<?, ?it/s]WARNING:root:Ignore passed dec since enc is a whole model
0%| | 0/10 [00:00<?, ?it/s]

ModuleNotFoundError Traceback (most recent call last)
in <cell line: 2>()
1 # train
----> 2 autoClassifier.fit(dataset , evaluation_method=[Acc])
3 #autoClassifier.get_leaderboard().show()
4
5 #print("best single model:\n", autoClassifier.get_leaderboard().get_best_model(0))

4 frames
/usr/local/lib/python3.10/dist-packages/autogl/datasets/utils/_general.py in graph_get_split(dataset, mask, is_loader, batch_size, num_workers, shuffle)
369 raise RuntimeError("Unsupported backend")
370 elif _backend.DependentBackend.is_dgl():
--> 371 from dgl.dataloading.pytorch import GraphDataLoader
372 return GraphDataLoader(
373 sub_dataset,

ModuleNotFoundError: No module named 'dgl.dataloading.pytorch'

""""""""""""""""""""""""""""""""""""""""""""""""""""'''

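The traceback above shows that the import path dgl.dataloading.pytorch is missing in the installed DGL. One way to tolerate such moves between library versions is to try several import locations in order. The sketch below is demonstrated with stdlib names only; import_first is a hypothetical helper, and the DGL candidates named in the comment are an untested assumption.

```python
import importlib


def import_first(candidates):
    """Return the first attribute importable from a list of (module, name) pairs."""
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}.{attr_name}: {exc}")
    raise ImportError("no candidate could be imported:\n" + "\n".join(errors))


# Demonstrated with stdlib names. For this issue the candidates might be
# ("dgl.dataloading.pytorch", "GraphDataLoader") followed by
# ("dgl.dataloading", "GraphDataLoader"), which is an untested assumption.
loads = import_first([("json.no_such_submodule", "loads"), ("json", "loads")])
print(loads("[1, 2]"))
```
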
[BUG] License

Hey guys,
I am still intrigued by your package and am strongly considering using it.
Unfortunately, there is a potentially boring hindrance:
you need to specify the copyright holder in the License:

Copyright [yyyy] [name of copyright owner]

[BUG] Error building wheels for gensim

 note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for gensim
  Running setup.py clean for gensim
Successfully built autogl
Failed to build gensim
ERROR: Could not build wheels for gensim, which is required to install pyproject.toml-based projects

Is there a possibility of Python 3.11 support, or of bumping the gensim dependency to its latest version?

[FEATURE] validate & simplify hyperparameter space

Is your feature request related to a problem? Please describe.
When writing models into the interface, defining the hyperparameter space is tedious.

Describe the solution you'd like
To validate that the model works appropriately under the search space, it would be nice to be able to sanity-check the hyperparameter space.

Describe alternatives you've considered
A concise way of declaring complex hyperparameter spaces is the ConfigSpace package.

It also allows sampling from the space and declaring distributions explicitly.
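
To make the request concrete, here is a rough sketch of what a sanity check plus sampling could look like. The list-of-dicts schema (name, type, low, high, choices) is hypothetical and deliberately simplified; it is neither AutoGL's search space format nor the ConfigSpace API.

```python
import random


def validate_space(space):
    """Return a list of problems found in a list-of-dicts search space."""
    problems = []
    seen = set()
    for index, param in enumerate(space):
        name = param.get("name")
        if not name:
            problems.append(f"entry {index}: missing 'name'")
            continue
        if name in seen:
            problems.append(f"{name}: duplicate name")
        seen.add(name)
        if param.get("type") == "float":
            low, high = param.get("low"), param.get("high")
            if low is None or high is None or low >= high:
                problems.append(f"{name}: need low < high")
        elif param.get("type") == "choice":
            if not param.get("choices"):
                problems.append(f"{name}: empty 'choices'")
        else:
            problems.append(f"{name}: unknown type {param.get('type')!r}")
    return problems


def sample(space, rng):
    """Draw one configuration from a space that passed validate_space."""
    config = {}
    for param in space:
        if param["type"] == "float":
            config[param["name"]] = rng.uniform(param["low"], param["high"])
        else:
            config[param["name"]] = rng.choice(param["choices"])
    return config


space = [
    {"name": "dropout", "type": "float", "low": 0.2, "high": 0.8},
    {"name": "act", "type": "choice", "choices": ["relu", "elu", "tanh"]},
]
print(validate_space(space))
config = sample(space, random.Random(0))
print(config)
```

A model could then be instantiated on a few sampled configurations as a smoke test before a full search is started.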

Wrong returned type of autogl.datasets.utils.conversion.general_static_graphs_to_pyg_dataset

The returned value should be an instance of torch_geometric.data.InMemoryDataset.
The current returned value is of type autogl.data._dataset._dataset.InMemoryDataset.
Please make it inherit from the correct class. @CoreLeader

Errors occur when I run the examples from the source code in PyCharm.

Run file: node_classification.py
Begin: error output in the Python console
WARNING:root:The OGB package is out of date. Your version is 1.2.3, while the latest version is 1.2.4.
Using backend: pytorch
Downloading https://github.com/kimiyoung/planetoid/raw/master/data/ind.cora.x
Traceback (most recent call last):
File "D:/autocogdl/AutoGL-main/examples/node_classification.py", line 42, in
dataset = build_dataset_from_name(args.dataset)
File "D:\autocogdl\AutoGL-main\autogl\datasets\__init__.py", line 118, in build_dataset_from_name
dataset = DATASET_DICT[dataset_name](path)
File "D:\autocogdl\AutoGL-main\autogl\datasets\pyg.py", line 60, in __init__
Planetoid(path, dataset)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch_geometric\datasets\planetoid.py", line 55, in __init__
super(Planetoid, self).__init__(root, transform, pre_transform)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch_geometric\data\in_memory_dataset.py", line 54, in __init__
pre_filter)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch_geometric\data\dataset.py", line 89, in __init__
self._download()
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch_geometric\data\dataset.py", line 141, in _download
self.download()
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch_geometric\datasets\planetoid.py", line 105, in download
download_url('{}/{}'.format(self.url, name), self.raw_dir)
File "D:\Anaconda3\envs\pytorch\lib\site-packages\torch_geometric\data\download.py", line 31, in download_url
data = urllib.request.urlopen(url)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 222, in urlopen
return opener.open(url, data, timeout)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 563, in error
result = self._call_chain(*args)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 755, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 531, in open
response = meth(req, response)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 641, in http_response
'http', request, response, code, msg, hdrs)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 569, in error
return self._call_chain(*args)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 503, in _call_chain
result = func(*args)
File "D:\Anaconda3\envs\pytorch\lib\urllib\request.py", line 649, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

Process finished with exit code 1

End of error output

There may be a problem with the sample code in the documentation

In the Hyper Parameter Optimization documentation (https://autogl.readthedocs.io/en/latest/docfile/tutorial/t_hpo.html#add-your-hpoptimizer),
the custom function provided contains the line
loss, self.is_higher_better = current_trainer.get_valid_score(dset) in function fn.
First of all, I think the evaluation metric obtained here is not a loss but ACC or AUC, according to the code in ./solver/classifier/node_classifier.py:

fit

    if evaluation_method == "infer":
        if hasattr(dataset, "metric"):
            evaluation_method = [dataset.metric]
        else:
            num_of_label = dataset.num_classes
            if num_of_label == 2:
                evaluation_method = ["auc"]
            else:
                evaluation_method = ["acc"]

I think that unless LogLoss is explicitly specified as the evaluation metric in the fit function, the returned value should not be a loss by default.
Also, I'm not sure why there is a parameter dset here. I checked the function 'get_valid_score', and it does not need a dataset argument. If I reached the wrong conclusion because I did not read the code carefully, please point out my mistake. Thank you.
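
To illustrate the naming point: whatever metric the trainer returns, an optimizer that minimizes only needs the value together with is_higher_better. The sketch below shows that conversion with a hypothetical helper; it is not part of AutoGL, and the numbers are made up.

```python
def to_minimization_objective(score, is_higher_better):
    """Turn a validation score into a value where smaller is always better."""
    return -score if is_higher_better else score


# Accuracy-like metric: higher is better, so the sign is flipped.
acc_objective = to_minimization_objective(0.82, is_higher_better=True)
# Loss-like metric: already lower-is-better, so it is left unchanged.
loss_objective = to_minimization_objective(0.45, is_higher_better=False)
print(acc_objective, loss_objective)
```

Under this reading, the variable in the tutorial would be better named score or metric, since it is a loss only when a loss is chosen as the evaluation method.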

Own Dataset

How do I run the entire code on my own dataset? I have 44k graphs in networkx format.

[Internal] checklist for v0.3

Heterogeneous trainer and solver adjustment
Heterogeneous solver test file
Add the full space of encoder and decoder for test file
Final performance check
Test the performance of all examples
Build the docs

Known issues:

May not be compatible with the sampling trainer in v0.2. (solved)
Cannot support cross-module search space dependencies. For example, in the previous sampling trainer, some hyperparameters of the trainer may depend on hyperparameters of the model. In the new version, since the spaces of different modules are wrapped in a dict, this is no longer possible. (solved; the logic is currently handled by the trainer)

[BUG] KeyError: 'NormalizeFeatures'

Describe the bug
The key 'NormalizeFeatures' is missing from FEATURE_DICT. The error is raised at line 764, in from_config, in the file
"/home/scotthoang/anaconda3/envs/autogl/lib/python3.8/site-packages/autogl/solver/classifier/node_classifier.py".
To Reproduce
Steps to reproduce the behavior:

Go to 'examples'
Run "python gasso_test.py"
See error

Expected behavior
To run normally without error.

Screenshots

Traceback (most recent call last):

  File "gasso_test.py", line 20, in <module>
    solver = AutoNodeClassifier.from_config(args.config)
  File "/home/scotthoang/anaconda3/envs/autogl/lib/python3.8/site-packages/autogl/solver/classifier/node_classifier.py", line 764, in from_config
    fe_list_ele.append(FEATURE_DICT[name](**feature_engineer))
KeyError: 'NormalizeFeatures'

Environment (please complete the following information):

OS: PopOS 21.10
python version: 3.8.5
autogl version: 0.2.0-pre
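
A bare KeyError gives no hint about which names the registry does accept. As a suggestion only, a lookup wrapper could list the available keys. In the sketch below, build_from_registry and fake_registry are hypothetical; the real contents of FEATURE_DICT are not shown in this report.

```python
def build_from_registry(registry, name, **kwargs):
    """Instantiate `registry[name]`, listing known names when the key is missing."""
    try:
        factory = registry[name]
    except KeyError:
        known = ", ".join(sorted(registry)) or "<empty>"
        raise KeyError(f"unknown feature module {name!r}; available: {known}") from None
    return factory(**kwargs)


# Hypothetical registry with a single entry, used only for the demonstration.
fake_registry = {"identity": dict}

try:
    build_from_registry(fake_registry, "NormalizeFeatures")
except KeyError as exc:
    message = exc.args[0]
print(message)
```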

[BUG] AttributeError: 'DGLHeteroGraph' object has no attribute 'size'

Describe the bug
When running /autogl/test/nas/node_classification.py with AUTOGL_BACKEND=dgl, I got the following error.

Traceback (most recent call last):
  File "node_classification.py", line 116, in <module>
    model = algo.search(space, dataset, esti)
  File "/villa/zhangyp/anaconda3/envs/graproj/lib/python3.7/site-packages/autogl-0.3.0rc0-py3.7.egg/autogl/module/nas/algorithm/random_search.py", line 69, in search
    metric, loss, hardware_metric = self._infer(mask="val")
  File "/villa/zhangyp/anaconda3/envs/graproj/lib/python3.7/site-packages/autogl-0.3.0rc0-py3.7.egg/autogl/module/nas/algorithm/random_search.py", line 86, in _infer
    metric, loss = self.estimator.infer(self.arch._model, self.dataset, mask=mask)
  File "/villa/zhangyp/anaconda3/envs/graproj/lib/python3.7/site-packages/autogl-0.3.0rc0-py3.7.egg/autogl/module/nas/estimator/one_shot.py", line 35, in infer
    pred = model(dset)[mask]
  File "/villa/zhangyp/anaconda3/envs/graproj/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/villa/zhangyp/anaconda3/envs/graproj/lib/python3.7/site-packages/autogl-0.3.0rc0-py3.7.egg/autogl/module/nas/space/autoattend.py", line 202, in forward
    stem_out = bk_gconv(op, data, drop(input))
  File "/villa/zhangyp/anaconda3/envs/graproj/lib/python3.7/site-packages/autogl-0.3.0rc0-py3.7.egg/autogl/module/nas/backend.py", line 16, in bk_gconv
    return op(data,feat)
  File "/villa/zhangyp/anaconda3/envs/graproj/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/villa/zhangyp/anaconda3/envs/graproj/lib/python3.7/site-packages/torch_geometric/nn/conv/arma_conv.py", line 110, in forward
    edge_index, edge_weight, x.size(self.node_dim),
AttributeError: 'DGLHeteroGraph' object has no attribute 'size'

It happens in the 'random + autoattend', 'rl + autoattend', and 'darts + graphnas' stages.

To Reproduce
Steps to reproduce the behavior:

Go to '/AutoGL/autogl/test/nas/'
export AUTOGL_BACKEND=dgl
python node_classification.py
See error

Environment (please complete the following information):

OS: Ubuntu 20.04
python version: 3.7.11
autogl version: 0.3.0-pre
pip list:

alabaster                     0.7.12
astor                         0.8.1
attrs                         21.4.0
autogl                        0.3.0rc0
autopep8                      1.6.0
Babel                         2.9.1
bayesian-optimization         1.2.0
brotlipy                      0.7.0
certifi                       2021.10.8
cffi                          1.15.0
charset-normalizer            2.0.4
chocolate                     0.0.2
click                         8.1.2
cloudpickle                   2.0.0
colorama                      0.4.4
contextlib2                   21.6.0
cryptography                  36.0.0
cycler                        0.11.0
dgl-cu113                     0.8.0.post2
dglgo                         0.0.1
dill                          0.3.4
docutils                      0.17.1
filelock                      3.6.0
fonttools                     4.30.0
future                        0.18.2
hyperopt                      0.1.2
idna                          3.3
imagesize                     1.3.0
importlib-metadata            4.11.3
isort                         5.10.1
Jinja2                        3.0.3
joblib                        1.1.0
json-tricks                   3.15.5
jsonlines                     3.0.0
kiwisolver                    1.3.2
lightgbm                      3.3.2
littleutils                   0.2.2
MarkupSafe                    2.1.0
matplotlib                    3.5.1
mkl-fft                       1.3.1
mkl-random                    1.2.2
mkl-service                   2.4.0
NetLSD                        1.0.2
networkx                      2.6.3
nni                           2.6.1
numpy                         1.21.2
numpydoc                      1.2.1
ogb                           1.3.3
outdated                      0.2.1
packaging                     21.3
pandas                        1.3.5
Pillow                        9.0.1
pip                           21.2.2
prettytable                   3.2.0
psutil                        5.9.0
pycodestyle                   2.8.0
pycparser                     2.21
pydantic                      1.9.0
Pygments                      2.11.2
pymongo                       4.0.2
pyOpenSSL                     22.0.0
pyparsing                     3.0.7
PySocks                       1.7.1
python-dateutil               2.8.2
PythonWebHDFS                 0.2.3
pytz                          2021.3
PyYAML                        6.0
requests                      2.27.1
responses                     0.19.0
ruamel.yaml                   0.17.21
ruamel.yaml.clib              0.2.6
schema                        0.7.5
scikit-learn                  1.0.2
scipy                         1.7.3
setuptools                    58.0.4
simplejson                    3.17.6
six                           1.16.0
snowballstemmer               2.2.0
Sphinx                        4.5.0
sphinxcontrib-applehelp       1.0.2
sphinxcontrib-devhelp         1.0.2
sphinxcontrib-htmlhelp        2.0.0
sphinxcontrib-jsmath          1.0.1
sphinxcontrib-qthelp          1.0.3
sphinxcontrib-serializinghtml 1.1.5
tabulate                      0.8.9
threadpoolctl                 3.1.0
toml                          0.10.2
torch                         1.11.0
torch-geometric               2.0.4
torch-scatter                 2.0.9
torch-sparse                  0.6.13
torchaudio                    0.11.0
torchvision                   0.12.0
tqdm                          4.63.0
typeguard                     2.13.3
typer                         0.4.1
typing-extensions             3.10.0.2
urllib3                       1.26.8
wcwidth                       0.2.5
websockets                    10.2
wheel                         0.37.1
zipp                          3.7.0

Additional Info
Since this file runs correctly with AUTOGL_BACKEND=pyg, the error may be caused by mixing PyG methods with a DGL dataset.

request

I hope a requirements.txt can be provided, with pins such as pandas<2.0 and lightgbm<4.0, among others.
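
Until such a file exists, the upper bounds named in this request can be checked at runtime. The following is a minimal sketch, assuming pandas<2.0 and lightgbm<4.0 are the bounds that matter; the names UPPER_BOUNDS, version_tuple, violates and check_upper_bounds are hypothetical and not part of AutoGL.

```python
import re
from importlib import metadata

# Upper bounds taken from the request above; other pins are unknown here.
UPPER_BOUNDS = {"pandas": "2.0", "lightgbm": "4.0"}


def version_tuple(version):
    """Turn '1.3.5' or '2.1.2+pt21cu121' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def violates(installed, upper_bound):
    """Return True when the installed version is not strictly below the bound."""
    a, b = version_tuple(installed), version_tuple(upper_bound)
    width = max(len(a), len(b))
    # Pad with zeros so that '2' and '2.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_upper_bounds(bounds=UPPER_BOUNDS):
    """Return {package: installed_version} for packages at or above their bound."""
    problems = {}
    for name, bound in bounds.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue  # not installed, nothing to check
        if violates(installed, bound):
            problems[name] = installed
    return problems
```

Calling check_upper_bounds() before importing autogl would return an empty dict when every installed package is below its bound, and the offending versions otherwise.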

Support for edge classification tasks

Good toolkit!
I also wonder whether AutoGL supports multi-class edge classification tasks. Thanks a lot.

[Internal] Performance Consistency Check Leaderboard

This issue tracks whether the models in this library match the performance of their native implementations.

WARNING: This is not the evaluation results of this library. For benchmarking of AutoGL, please see the examples provided.

Guide to developers

What do we mean when we are checking performance?

First, remember that a performance inconsistency may not be caused by our implementation. Sometimes you need to increase the number of repeats, or change the range of seeds, to see whether the performances match under the "same" setting.

If the rules above do not apply, check carefully whether your code contains any unintended implementation differences. The performance check code itself may also be incorrect, in which case you should point this out to @Frozenmad.

Note

All the performance check results are listed below. All performance inconsistencies are shown in bold in the table.

[BUG]Dependency issues

Describe the bug
The gensim package does not appear to support Python 3.10 when running pip install autogl under Python 3.10.
The function graph_number_of_cliques is deprecated since networkx 3.1.
DataFrame.append is deprecated since pandas 1.4.0.

Environment (please complete the following information):

OS: Ubuntu 18.04.6 LTS
python version: 3.9.18
autogl version: 0.4.0
pip list:

Package                  Version
------------------------ ----------------
astor                    0.8.1
asttokens                2.4.1
autogl                   0.4.0
backcall                 0.2.0
bayesian-optimization    1.4.3
certifi                  2023.11.17
charset-normalizer       3.3.2
chocolate                0.0.2
cloudpickle              3.0.0
colorama                 0.4.6
contextlib2              21.6.0
contourpy                1.2.0
cycler                   0.12.1
debugpy                  1.6.7
decorator                5.1.1
deeprobust               0.2.9
dill                     0.3.7
dnspython                2.5.0
entrypoints              0.4
executing                2.0.1
filelock                 3.13.1
fonttools                4.47.2
fsspec                   2023.12.2
future                   0.18.3
gensim                   3.8.3
hyperopt                 0.1.2
idna                     3.6
imageio                  2.33.1
importlib-metadata       7.0.1
importlib-resources      6.1.1
ipykernel                6.14.0
ipython                  8.4.0
jedi                     0.19.1
Jinja2                   3.1.3
joblib                   1.3.2
json-tricks              3.17.3
jupyter-client           7.0.6
jupyter_core             5.7.1
kiwisolver               1.4.5
lazy_loader              0.3
lightgbm                 4.2.0
littleutils              0.2.2
llvmlite                 0.41.1
MarkupSafe               2.1.4
matplotlib               3.8.2
matplotlib-inline        0.1.6
mpmath                   1.3.0
nest_asyncio             1.6.0
NetLSD                   1.0.2
networkx                 3.0
nni                      2.8
numba                    0.58.1
numpy                    1.26.3
nvidia-cublas-cu12       12.1.3.1
nvidia-cuda-cupti-cu12   12.1.105
nvidia-cuda-nvrtc-cu12   12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12        8.9.2.26
nvidia-cufft-cu12        11.0.2.54
nvidia-curand-cu12       10.3.2.106
nvidia-cusolver-cu12     11.4.5.107
nvidia-cusparse-cu12     12.1.0.106
nvidia-nccl-cu12         2.18.1
nvidia-nvjitlink-cu12    12.3.101
nvidia-nvtx-cu12         12.1.105
ogb                      1.3.6
outdated                 0.2.2
packaging                23.2
pandas                   1.3.5
parso                    0.8.3
pexpect                  4.8.0
pickleshare              0.7.5
pillow                   10.2.0
pip                      23.3.1
platformdirs             4.1.0
prettytable              3.9.0
prompt-toolkit           3.0.42
protobuf                 4.25.2
psutil                   5.9.0
ptyprocess               0.7.0
pure-eval                0.2.2
Pygments                 2.17.2
pymongo                  4.6.1
pyparsing                3.1.1
python-dateutil          2.8.2
PythonWebHDFS            0.2.3
pytz                     2023.3.post1
PyYAML                   6.0.1
pyzmq                    19.0.2
requests                 2.31.0
responses                0.24.1
schema                   0.7.5
scikit-image             0.22.0
scikit-learn             1.4.0
scipy                    1.12.0
setuptools               68.2.2
simplejson               3.19.2
six                      1.16.0
smart-open               6.4.0
stack-data               0.6.2
sympy                    1.12
tabulate                 0.9.0
tensorboardX             2.6.2.2
texttable                1.7.0
threadpoolctl            3.2.0
tifffile                 2023.12.9
torch                    2.1.2
torch_geometric          2.4.0
torch-scatter            2.1.2+pt21cu121
torch-sparse             0.6.18+pt21cu121
torchvision              0.16.2
tornado                  6.1
tqdm                     4.66.1
traitlets                5.14.1
triton                   2.1.0
typeguard                4.1.5
typing_extensions        4.9.0
tzdata                   2023.4
urllib3                  2.1.0
wcwidth                  0.2.13
websockets               12.0
wheel                    0.41.2
zipp                     3.17.0

Issues when loading personal data

Hello, and thanks for this great autoML framework!

I'm encountering some issues when I try to use my own data with AutoGL. I have a list of PyTorch Geometric Data objects and tried to initialise a dataset following your documentation, but it does not behave the same as the prebuilt datasets with random_splits_mask_class():

# Import my own dataset
data_list = [Data(...), ..., Data(...)]
class MyDataset(InMemoryDataset):
    def __init__(self, datalist) -> None:
        super().__init__()
        self.data, self.slices = self.collate(datalist)
myData = MyDataset(data_list)

# Use a prebuilt dataset
cora_dataset = build_dataset_from_name('cora')

# Trying to use AutoNodeClassifier
solver = AutoNodeClassifier(
    feature_module='deepgl',
    graph_models=['gcn', 'gat'],
    hpo_module='anneal',
    ensemble_module='voting',
    device=device
)
solver.fit(myData, train_split=0.8, val_split=0.2, time_limit=3600)

With myData I got the error AssertionError: the total number of samples from every class used for training and validation is larger than the total samples in class 0, which I believe comes from a different dataset structure:

# line 119 and 129 of autogl/datasets/utils.py
data = dataset[0]
num_classes = data.y.max().cpu().item() + 1

print(cora_dataset)
# Cora()
print(cora_dataset[0])
# Data(edge_index=[2, 10556], test_mask=[2708], train_mask=[2708], val_mask=[2708], x=[2708, 1433], y=[2708])
print(cora_dataset.data)
# Data(edge_index=[2, 10556], test_mask=[2708], train_mask=[2708], val_mask=[2708], x=[2708, 1433], y=[2708])

print(myData)
# MyDataset(375)
print(myData[0])
# Data(edge_index=[2, 264], idx=[1], pair="XXX_X--XXX_X", x=[39, 24], y=[1])
print(myData.data)
# Data(edge_index=[2, 123456], idx=[375], pair=[375], x=[13950, 24], y=[375])

(pair is a custom metadata field in my data)

Am I missing something, or is this unexpected behaviour from the suggested MyDataset?

Thanks in advance,
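
One way to read the printouts above: in cora_dataset[0] the label tensor y has one entry per node (2708 labels for 2708 rows of x), whereas in myData[0] it has a single entry for a 39-node graph, which looks like a graph-level label. A minimal sketch of that comparison, using plain integers instead of tensors; label_level is a hypothetical helper, not an AutoGL or PyTorch Geometric function.

```python
def label_level(num_nodes, num_labels):
    """Guess whether labels are per node or per graph from their counts."""
    if num_labels == num_nodes:
        return "node"
    if num_labels == 1:
        return "graph"
    return "unknown"


# Shapes copied from the printouts above.
print(label_level(2708, 2708))  # cora_dataset[0]: x=[2708, 1433], y=[2708]
print(label_level(39, 1))       # myData[0]: x=[39, 24], y=[1]
```

If the labels are indeed per graph, a node-level split that counts samples per class inside dataset[0] sees only one label, which would be consistent with the assertion above. This is an interpretation of the printouts, not a confirmed diagnosis.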

[BUG]v0.4.1 'DataFrame' has no attribute 'append'. Did you mean '_append'?

Describe the bug
Traceback (most recent call last):
File "/home/xxx/autoGL/ai1.py", line 6, in
custom_static_homogeneous_graph = GeneralStaticGraphGenerator.create_homogeneous_static_graph(
File "/home/xxx/.pyenv/versions/3.10.13/lib/python3.10/site-packages/autogl-0.4.1-py3.10.egg/autogl/data/graph/_general_static_graph/_general_static_graph_generator.py", line 73, in create_homogeneous_static_graph
_heterogeneous_edges_aggregation[('', '', '')] = (
File "/home/xxx/.pyenv/versions/3.10.13/lib/python3.10/site-packages/autogl-0.4.1-py3.10.egg/autogl/data/graph/_general_static_graph/_general_static_graph_default_implementation.py", line 494, in setitem
self._set_edges(edge_t, edges)
File "/home/xxx/.pyenv/versions/3.10.13/lib/python3.10/site-packages/autogl-0.4.1-py3.10.egg/autogl/data/graph/_general_static_graph/_general_static_graph_default_implementation.py", line 683, in _set_edges
self.__heterogeneous_edges_data_frame.append(
File "/home/xxx/.pyenv/versions/3.10.13/lib/python3.10/site-packages/pandas-2.1.4-py3.10-linux-x86_64.egg/pandas/core/generic.py", line 6204, in getattr
return object.getattribute(self, name)
AttributeError: 'DataFrame' object has no attribute 'append'. Did you mean: '_append'?

To Reproduce
Steps to reproduce the behavior:

Use the example code from the AutoGL documentation to build a customized dataset.
The older version installed through pip does not have this problem.

After changing line 683 of _general_static_graph_default_implementation.py, in _set_edges, from:
self.__heterogeneous_edges_data_frame.append(
to:
self.__heterogeneous_edges_data_frame._append(
the error no longer occurs.
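
DataFrame.append was removed in pandas 2.0, and _append is a private method, so a workaround built on it may break in a later release. The public replacement is pandas.concat. A minimal sketch on a toy frame; the column names and the append_rows helper are hypothetical and not taken from the AutoGL source.

```python
import pandas as pd


def append_rows(frame, rows):
    """Return a new DataFrame with `rows` appended, without DataFrame.append."""
    return pd.concat([frame, rows], ignore_index=True)


# Toy edge table standing in for the internal edges data frame.
edges = pd.DataFrame({"source": [0, 1], "target": [1, 2]})
new_edges = pd.DataFrame({"source": [2], "target": [0]})
edges = append_rows(edges, new_edges)
print(edges.shape)
```

Like the old append, pandas.concat returns a new frame rather than modifying the original, so the result has to be assigned back.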

Install Error

I installed AutoGL through pip with the command 'pip install auto-graph-learning -i https://pypi.org/simple',
but it fails with 'ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.'
How can I solve this? Thank you!

[BUG] Error 502 Bad Gateway

It seems like the documentation website is down?

[BUG]

This problem occurs when I run graphnas.py from the examples:

<autogl.data.graph._general_static_graph._general_static_graph_default_implementation._HeterogeneousNodeView object at 0x7fb67443d970>
Traceback (most recent call last):
File "/AutoGL/examples/graphnas.py", line 24, in
label = dataset[0].y
AttributeError: 'GeneralStaticGraphImplementation' object has no attribute 'y'

The same happens when using DGL:
Namespaces are one honking great idea -- let's do more of those!
Traceback (most recent call last):
File "/AutoGL/examples/graphnas.py", line 27, in
label = dataset[0].ndata['label']
AttributeError: 'GeneralStaticGraphImplementation' object has no attribute 'ndata'

[Dependency] NNI V2.9 deprecates nni.nas.pytorch

Describe the bug
With NNI V2.9, the nni.nas.pytorch interface is no longer used, which causes an attribute error when running the examples.

Suggestion
Pin nni==2.8; this works for now.

[FEATURE] Use PyTorch Lightning for BaseTrainer.

[BUG]: Provide a full example of hyperparameter optimization for a graph classification task on a custom dataset using K-fold cross-validation

Support for NAS in node classification

Sorry to bother you. I have a problem with the NAS examples you provide for semi-supervised node classification.
For example, I ran gasso_test.py with all parameters left at their defaults, but got an accuracy of 0.8150, which is clearly lower than the accuracy reported in the GASSO paper.
I checked the hyper-parameters given in the appendix and found no mismatch.
I do not know what causes this. I would appreciate any suggestions.
