tkipf / keras-gcn

Keras implementation of Graph Convolutional Networks

License: MIT License

Python 100.00%

keras-gcn's Issues

Deprecated Keras API

Using keras-1.2.2 raises this warning on python train.py:

Using Theano backend.
Using gpu device 0: TITAN X (Pascal) (CNMeM is enabled with initial size: 25.0% of memory, cuDNN 5110)
/home/bjohnson/.anaconda/lib/python2.7/site-packages/theano/sandbox/cuda/__init__.py:600: UserWarning: Your cuDNN version is more recent than the one Theano officially supports. If you see any problems, try updating Theano or downgrading cuDNN to version 5.
  warnings.warn(warn)
Loading cora dataset...
Dataset has 2708 nodes, 5429 edges, 1432 features.
/home/bjohnson/.anaconda/lib/python2.7/site-packages/keras/engine/topology.py:379: UserWarning: The `regularizers` property of layers/models is deprecated. Regularization losses are now managed via the `losses` layer/model property.
  warnings.warn('The `regularizers` property of layers/models '

Experiments also show that the W_regularizer parameter in GraphConvolution does not have the expected effect. It looks like Keras changed the way regularizers are attached to weight matrices, but I'm not sure what the best practices for the new API are. Presumably the approach could be copied from

https://github.com/fchollet/keras/blob/master/keras/layers/core.py#L654

All inputs to the layer should be tensors.

Hi,
I'm trying to use GCN for graph learning, but when I run your train.py script I get the following error:

Using local pooling filters...
Traceback (most recent call last):
File "train.py", line 55, in
H = GraphConvolution(16, support, activation='relu', kernel_regularizer=l2(5e-4))([H]+G)
File "/Users/aloreggia/Documents/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 575, in call
self.assert_input_compatibility(inputs)
File "/Users/aloreggia/Documents/anaconda2/lib/python2.7/site-packages/keras/engine/topology.py", line 448, in assert_input_compatibility
str(inputs) + '. All inputs to the layer '
ValueError: Layer graph_convolution_1 was called with an input that isn't a symbolic tensor. Received type: <class 'theano.sparse.basic.SparseVariable'>. Full input: [if{}.0, SparseVariable{csr,float32}]. All inputs to the layer should be tensors.

Can you help me solve this issue?

Thank you
Andrea

Doing graph-level classification?

My input is such that each subject has their own graph. This is different from the example given in train.py where there is only 1 graph (a citation network). In the tensorflow implementation of gcn, you suggest doing graph-level classification by combining the adjacency matrices of all the graphs in the input sample into one large adjacency matrix (as a sparse block-diagonal matrix). The part I am not sure how to implement in keras is the pooling of the output to produce 1 classification per graph. Any tips would be greatly appreciated!
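A minimal sketch of the block-diagonal batching and per-graph pooling described above, using NumPy and SciPy only. The helper names `batch_graphs` and `mean_pool` are hypothetical, and mean pooling is just one possible readout; in a Keras model the same pooling would have to be expressed with backend ops, which is not shown here.

```python
import numpy as np
import scipy.sparse as sp


def batch_graphs(adjs, feats):
    """Stack several graphs into one block-diagonal graph."""
    A = sp.block_diag(adjs, format="csr")
    X = np.vstack(feats)
    # graph_ids[i] is the index of the graph that node i belongs to
    graph_ids = np.concatenate(
        [np.full(a.shape[0], g) for g, a in enumerate(adjs)]
    )
    return A, X, graph_ids


def mean_pool(H, graph_ids, n_graphs):
    """Average node representations per graph (one output row per graph)."""
    sums = np.zeros((n_graphs, H.shape[1]))
    np.add.at(sums, graph_ids, H)
    counts = np.bincount(graph_ids, minlength=n_graphs)[:, None]
    return sums / counts


# Two toy graphs with 2 and 3 nodes and one feature per node
adjs = [sp.csr_matrix(np.ones((2, 2))), sp.csr_matrix(np.ones((3, 3)))]
feats = [np.array([[1.0], [3.0]]), np.array([[2.0], [4.0], [6.0]])]
A, X, graph_ids = batch_graphs(adjs, feats)
pooled = mean_pool(X, graph_ids, n_graphs=2)
```

Here `pooled` has one row per graph, so a dense softmax layer on top of it would yield one classification per graph.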

multiple-graph

I would like to use GCN with multiple kinds of edges. How can I include several different adjacency matrices?

Thank you
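One common way to handle several edge types is to keep one adjacency matrix per type and give each its own weight matrix, then sum the propagated results. The sketch below illustrates that idea in plain NumPy; it is not taken from this repository, and the function name and the two example relations are hypothetical. Since the tracebacks on this page show the layer being called as `GraphConvolution(units, support, ...)([H]+G)` with a list of graph inputs, passing one normalized adjacency matrix per edge type as a separate support may be a route worth trying, but that is an assumption.

```python
import numpy as np


def relational_gcn_layer(adjs, X, weights):
    """Compute relu(sum_r A_r X W_r) over all relation types r."""
    out = np.zeros((X.shape[0], weights[0].shape[1]))
    for A_r, W_r in zip(adjs, weights):
        out += A_r @ X @ W_r
    return np.maximum(out, 0.0)


rng = np.random.default_rng(0)
n_nodes, n_feats, n_units = 4, 3, 2

# Two hypothetical edge types, each with its own (already normalized) adjacency
A_cites = np.eye(n_nodes)
A_coauthor = np.full((n_nodes, n_nodes), 1.0 / n_nodes)

X = rng.normal(size=(n_nodes, n_feats))
weights = [rng.normal(size=(n_feats, n_units)) for _ in range(2)]
H = relational_gcn_layer([A_cites, A_coauthor], X, weights)
```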

Infinity Loss on Default Training

Hi,
I tried to run the package and after reading the other issues, I got it to work, using the other fork.
However, using the default parameters and the cora data set, I get an infinity loss:

/home/sasse/.local/lib/python2.7/site-packages/kegra-0.0.1-py2.7.egg/kegra/utils.py:77: RuntimeWarning: divide by zero encountered in log
Epoch: 0001 train_loss= inf train_acc= 0.1643 val_loss= inf val_acc= 0.1600 time= 1.5656
Epoch: 0002 train_loss= inf train_acc= 0.1071 val_loss= inf val_acc= 0.1167 time= 0.0403
Epoch: 0003 train_loss= inf train_acc= 0.1643 val_loss= inf val_acc= 0.1600 time= 0.0467
Epoch: 0004 train_loss= inf train_acc= 0.1643 val_loss= inf val_acc= 0.1600 time= 0.0410
Epoch: 0005 train_loss= inf train_acc= 0.1643 val_loss= inf val_acc= 0.1600 time= 0.0405
Epoch: 0006 train_loss= inf train_acc= 0.1643 val_loss= inf val_acc= 0.1600 time= 0.0426
Epoch: 0007 train_loss= inf train_acc= 0.1643 val_loss= inf val_acc= 0.1600 time= 0.0462
Epoch: 0008 train_loss= inf train_acc= 0.1643 val_loss= inf val_acc= 0.1600 time= 0.0430
Epoch: 0009 train_loss= inf train_acc= 0.1643 val_loss= inf val_acc= 0.1600 time= 0.0420
/home/sasse/.local/lib/python2.7/site-packages/kegra-0.0.1-py2.7.egg/kegra/utils.py:77: RuntimeWarning: invalid value encountered in log
Epoch: 0010 train_loss= nan train_acc= 0.1571 val_loss= nan val_acc= 0.1167 time= 0.0531
Epoch: 0011 train_loss= inf train_acc= 0.1643 val_loss= inf val_acc= 0.1600 time= 0.0459
Epoch 11: early stopping
Test set results: loss= inf accuracy= 0.1400
This is probably because of the encountered zeros in log but I don't know why they occur.
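The warning points at a `log` call receiving exact zeros. A minimal sketch of a cross-entropy that clips predictions before taking the logarithm is shown below; the function name and the `eps` value are assumptions, not the contents of `kegra/utils.py`. Clipping only keeps the reported loss finite. It does not explain why the model outputs exact zeros (or NaNs) in the first place, which would still need to be investigated.

```python
import numpy as np


def categorical_crossentropy(preds, labels, eps=1e-7):
    """Mean cross-entropy with predictions clipped away from 0 and 1."""
    clipped = np.clip(preds, eps, 1.0 - eps)
    return float(np.mean(-np.sum(labels * np.log(clipped), axis=1)))


labels = np.array([[1.0, 0.0], [0.0, 1.0]])
# The first row assigns probability 0 to the true class
preds = np.array([[0.0, 1.0], [0.0, 1.0]])
loss = categorical_crossentropy(preds, labels)
```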

Positional offsets argument error in train.py file

Running the train.py file using Python 3.4.2 yielded the following error

Using Theano backend.
Loading cora dataset...
Dataset has 2708 nodes, 5429 edges, 1432 features.
Using local pooling filters...
Traceback (most recent call last):
  File "C:/Users/sagars/PycharmProjects/GCN/kegra/train.py", line 31, in <module>
    A_ = preprocess_adj(A, SYM_NORM)
  File "C:\Users\sagars\PycharmProjects\GCN\kegra\utils.py", line 42, in normalize_adj
    d = sp.diags(np.power(np.array(adj.sum(1)), -0.5).flatten())
TypeError: diags() missing 1 required positional argument: 'offsets'
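The error says that the installed SciPy requires the `offsets` argument of `sp.diags`. A sketch of a normalization helper that passes the main-diagonal offset `0` explicitly, which should work whether or not the argument is optional, is given below. It is a rewrite under assumptions rather than the repository's exact `normalize_adj`; the handling of isolated nodes is an addition.

```python
import numpy as np
import scipy.sparse as sp


def normalize_adj(adj, symmetric=True):
    """Return D^-1/2 A D^-1/2 (symmetric) or D^-1 A."""
    degree = np.asarray(adj.sum(1)).flatten().astype(float)
    power = -0.5 if symmetric else -1.0
    with np.errstate(divide="ignore"):
        scale = np.power(degree, power)
    scale[np.isinf(scale)] = 0.0  # isolated nodes: avoid inf entries
    d = sp.diags(scale, 0)  # 0 selects the main diagonal
    return (d @ adj @ d if symmetric else d @ adj).tocsr()


adj = sp.csr_matrix(np.array([[0.0, 1.0], [1.0, 0.0]]))
norm = normalize_adj(adj)
```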

ValueError: setting an array element with a sequence.

Loading cora dataset...
Dataset has 2708 nodes, 5429 edges, 1433 features.
Using local pooling filters...
WARNING:tensorflow:From /Users/manohar/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py:422: The name tf.global_variables is deprecated. Please use tf.compat.v1.global_variables instead.

ValueError Traceback (most recent call last)
in ()
66 epochs=1,
67 shuffle=False,
---> 68 verbose=0)
69
70 # Predict on full dataset

~/anaconda3/lib/python3.6/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_freq, max_queue_size, workers, use_multiprocessing, **kwargs)
1237 steps_per_epoch=steps_per_epoch,
1238 validation_steps=validation_steps,
-> 1239 validation_freq=validation_freq)
1240
1241 def evaluate(self,

~/anaconda3/lib/python3.6/site-packages/keras/engine/training_arrays.py in fit_loop(model, fit_function, fit_inputs, out_labels, batch_size, epochs, verbose, callbacks, val_function, val_inputs, shuffle, initial_epoch, steps_per_epoch, validation_steps, validation_freq)
194 ins_batch[i] = ins_batch[i].toarray()
195
--> 196 outs = fit_function(ins_batch)
197 outs = to_list(outs)
198 for l, o in zip(out_labels, outs):

~/anaconda3/lib/python3.6/site-packages/tensorflow/python/keras/backend.py in call(self, inputs)
3275 tensor_type = dtypes_module.as_dtype(tensor.dtype)
3276 array_vals.append(np.asarray(value,
-> 3277 dtype=tensor_type.as_numpy_dtype))
3278
3279 if self.feed_dict:

~/anaconda3/lib/python3.6/site-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
83
84 """
---> 85 return array(a, dtype, copy=False, order=order)
86
87

ValueError: setting an array element with a sequence.

I am trying to run the code as is but getting this error. The code ran fine 2 weeks back, but I am having trouble now.

Do you think it is caused by the TensorFlow or Keras version?

Extraction of node-level embeddings

Hello -- thank you so much for your work! I've been working with the keras implementation of GCN, and was wondering if there was a way to extract node-level embeddings from this framework. Thank you in advance!

Manan
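In a two-layer GCN, the output of the hidden layer can serve as a node-level embedding. The NumPy sketch below shows which intermediate quantity that is; the weights are random stand-ins and `gcn_forward` is a hypothetical name. In Keras, the usual equivalent would be a second `Model` that shares the trained model's inputs and outputs the hidden tensor (the `H` returned by the first `GraphConvolution` call in train.py), though that is not verified against this repository.

```python
import numpy as np


def gcn_forward(A_hat, X, W0, W1):
    """Two-layer GCN forward pass that also returns the hidden layer."""
    hidden = np.maximum(A_hat @ X @ W0, 0.0)  # node embeddings, one row per node
    logits = A_hat @ hidden @ W1
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)
    return hidden, probs


rng = np.random.default_rng(0)
n_nodes, n_feats, n_hidden, n_classes = 6, 5, 4, 3
A_hat = np.eye(n_nodes)  # stand-in for the normalized adjacency
X = rng.normal(size=(n_nodes, n_feats))
W0 = rng.normal(size=(n_feats, n_hidden))
W1 = rng.normal(size=(n_hidden, n_classes))
embeddings, probs = gcn_forward(A_hat, X, W0, W1)
```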

Errors when testing the train.py with tensorflow as backend

Sorry, the previous issue report was closed. So I created a new one.

The error message shows up again when I change the FILTER method from 'localpool' to 'chebyshev', even when running "python train.py" from the terminal.

ValueError: Dimensions must be equal, but are 4299 and 1433 for 'graph_convolution_1/MatMul' (op: 'MatMul') with input shapes: [?,4299], [1433,16].
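Note that 4299 = 3 × 1433. One reading of the error is that the layer concatenates the features propagated through three Chebyshev basis matrices, while its kernel was built for a single copy of the 1433 input features. The shape arithmetic can be sketched in NumPy as follows; the identity matrices are stand-ins for the real polynomials, and the variable names are illustrative only.

```python
import numpy as np

n_nodes, n_feats, n_units, support = 5, 1433, 16, 3
rng = np.random.default_rng(0)
X = rng.normal(size=(n_nodes, n_feats))
basis = [np.eye(n_nodes) for _ in range(support)]  # stand-ins for T_0, T_1, T_2

# Concatenating the propagated features widens the input to support * n_feats
supports = np.concatenate([T_k @ X for T_k in basis], axis=1)

# The kernel therefore needs support * n_feats rows, not n_feats
kernel = rng.normal(size=(n_feats * support, n_units))
output = supports @ kernel
```

If the installed layer builds its kernel with `n_feats` rows only, the product fails with exactly the shapes quoted in the error.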

Problems running GCN in a loop

When I try to put the GCN model construction in a loop (to test various randomly generated hyperparameters), I get the following error: ValueError: Initializer for variable graph_convolution_22/kernel/ is from inside a control-flow construct, such as a loop or conditional. When creating a variable inside a loop or conditional, use a lambda as the initializer

Has anyone come across this error?

mse

Got error when fitting

I got the following error when running model.fit. How do I fix it?

environment: python3.7, tensorflow2.1.0, scipy1.4.1

2020-05-05 20:56:17.600919: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Traceback (most recent call last):
File "~/.../train.py", line 77, in
batch_size=A.shape[0], epochs=1, shuffle=False, verbose=0)
File "/usr/local/lib/python3.7/site-packages/keras/engine/training.py", line 1239, in fit
validation_freq=validation_freq)
File "/usr/local/lib/python3.7/site-packages/keras/engine/training_arrays.py", line 196, in fit_loop
outs = fit_function(ins_batch)
File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/keras/backend.py", line 3721, in call
value = ops.convert_to_tensor(value, dtype=tensor.dtype)
File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1314, in convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/constant_op.py", line 317, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/constant_op.py", line 258, in constant
allow_broadcast=True)
File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/constant_op.py", line 266, in _constant_impl
t = convert_to_eager_tensor(value, ctx, dtype)
File "/usr/local/lib/python3.7/site-packages/tensorflow_core/python/framework/constant_op.py", line 96, in convert_to_eager_tensor
return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: TypeError: sparse matrix length is ambiguous; use getnnz() or shape[0]
Traceback (most recent call last):

File "/usr/local/lib/python3.7/site-packages/scipy/sparse/base.py", line 295, in len
raise TypeError("sparse matrix length is ambiguous; use getnnz()"

TypeError: sparse matrix length is ambiguous; use getnnz() or shape[0]
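The final `TypeError` comes from calling `len()` on a SciPy sparse matrix, which happens here when the backend tries to convert the fed value into a tensor. A possible workaround, sketched below under the assumption that the graph is small enough to hold densely, is to convert sparse inputs to dense arrays before passing them to `fit`. The helper name is hypothetical, and the model's graph input would presumably also have to be declared as a dense input for this to work.

```python
import numpy as np
import scipy.sparse as sp


def to_dense_inputs(inputs):
    """Convert any SciPy sparse matrices in a list of model inputs to dense arrays."""
    return [x.toarray() if sp.issparse(x) else np.asarray(x) for x in inputs]


X = np.ones((3, 2), dtype=np.float32)
A = sp.identity(3, format="csr", dtype=np.float32)

# Reproduce the underlying failure: len() is undefined for sparse matrices
try:
    len(A)
    length_error = None
except TypeError as err:
    length_error = str(err)

dense_inputs = to_dense_inputs([X, A])
```

For a graph with 2708 nodes, a dense float32 adjacency matrix takes roughly 29 MB, so this is feasible for Cora but does not scale to large graphs.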

Regression Task

Hi,

I'm having a bit of trouble getting the GCN to work for me. I am currently trying to accomplish the following task:
given X in R^N and Y in R^N find a function mapping X to Y. Here I know both X and Y correspond to values on the vertices of a graph and I have a dataset of many rows mapping X to Y.

I have installed the keras-gcn module that you've so graciously provided and done the updates suggested by buashh which update the code for keras 2.0.5 and python 3.5. Using that I was able to get train.py to run correctly.

I have modified the code for the regression task I have at hand and gotten a little bit confused.

If I have say 1000 rows of X and Y data, should I also be passing the Laplacian 1000 times into the graph input layer?

So I would pass in X_in = [<nx1000 array> X, <nxnx1000 array> T_0, <nxnx1000 array> T_1... etc]?

how to install it? No module named 'kegra'

I am not familiar with Python. I created an Anaconda environment to install Keras and TensorFlow, then ran "pip setup.py install" to install kegra. I can find the kegra-0.0.1-py3.5.egg folder in site-packages, but I can't import it in a Python file.

SGD

Have you ever pursued methods for training these models with batches smaller than the entire size of the data?

I understand that making a prediction for a node involves access to its n-th degree neighbors, but have you ever tried to implement these methods in practice and/or are you aware of pitfalls involved?
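One simple way to form a mini-batch is to pick a few seed nodes and extract the subgraph of everything within k hops, where k is the number of graph convolution layers. The sketch below shows that extraction for an undirected graph; `k_hop_nodes` is a hypothetical helper, not part of this repository. A known pitfall is that the k-hop neighborhood can grow very quickly with k on dense graphs, so the "mini" batch may end up covering most of the graph.

```python
import numpy as np
import scipy.sparse as sp


def k_hop_nodes(adj, seeds, k):
    """Return sorted indices of all nodes within k hops of the seed nodes."""
    reached = np.zeros(adj.shape[0], dtype=bool)
    reached[seeds] = True
    frontier = reached.copy()
    for _ in range(k):
        # Nodes adjacent to the current frontier (adj is assumed symmetric)
        neighbours = (adj @ frontier.astype(adj.dtype)) > 0
        frontier = neighbours & ~reached
        reached |= neighbours
    return np.flatnonzero(reached)


# Path graph 0-1-2-3-4
rows = [0, 1, 1, 2, 2, 3, 3, 4]
cols = [1, 0, 2, 1, 3, 2, 4, 3]
adj = sp.csr_matrix((np.ones(8), (rows, cols)), shape=(5, 5))

batch_nodes = k_hop_nodes(adj, seeds=[0], k=2)
sub_adj = adj[batch_nodes][:, batch_nodes]  # adjacency restricted to the batch
```

Only the seed nodes see their full k-hop neighborhood inside `sub_adj`, so the loss for a batch would normally be computed on the seeds alone.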

Cannot use GPU when output.shape[1] * nnz(a) > 2^31

When I use the GCN layer, I always run into this problem.

InvalidArgumentError: 2 root error(s) found.
(0) Invalid argument: Cannot use GPU when output.shape[1] * nnz(a) > 2^31
[[node graph_convolution_23/SparseTensorDenseMatMul/SparseTensorDenseMatMul (defined at /home/labadmin/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:1083) ]]
[[loss_3/add_2/_677]]
(1) Invalid argument: Cannot use GPU when output.shape[1] * nnz(a) > 2^31
[[node graph_convolution_23/SparseTensorDenseMatMul/SparseTensorDenseMatMul (defined at /home/labadmin/anaconda2/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py:1083) ]]
0 successful operations.
0 derived errors ignored.
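The condition in the error message can be checked before building the model. The sketch below only restates the arithmetic quoted in the message; the nnz values are made-up examples, and the function names are hypothetical.

```python
GPU_LIMIT = 2 ** 31


def exceeds_gpu_limit(nnz, output_dim):
    """True if output_dim * nnz is above the 2^31 bound quoted in the error."""
    return output_dim * nnz > GPU_LIMIT


def max_output_dim(nnz):
    """Largest output width that stays within the bound for a given nnz."""
    return GPU_LIMIT // nnz


# Hypothetical sizes: a small graph and a very large one, both with 16 output units
small = exceeds_gpu_limit(nnz=13_000, output_dim=16)
large = exceeds_gpu_limit(nnz=200_000_000, output_dim=16)
widest = max_output_dim(200_000_000)
```

When the bound is exceeded, the options are a narrower layer, a sparser or smaller adjacency matrix (for example by splitting the graph), or running that operation on the CPU.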

Error in train.py

I used the command python train.py from Anaconda prompt and I got the following error

C:\gcn-master\gcn>python train.py
Traceback (most recent call last):
  File "train.py", line 59, in <module>
    model = model_func(placeholders, input_dim=features[2][1], logging=True)
  File "C:\Users\Ibrahim\Anaconda3\lib\site-packages\gcn-1.0-py3.6.egg\gcn\models.py", line 145, in __init__
    self.build()
  File "C:\Users\Ibrahim\Anaconda3\lib\site-packages\gcn-1.0-py3.6.egg\gcn\models.py", line 42, in build
    self._build()
  File "C:\Users\Ibrahim\Anaconda3\lib\site-packages\gcn-1.0-py3.6.egg\gcn\models.py", line 170, in _build
    logging=self.logging))
  File "C:\Users\Ibrahim\Anaconda3\lib\site-packages\gcn-1.0-py3.6.egg\gcn\layers.py", line 184, in __init__
    self.phase_train = placeholders['phase_train']
KeyError: 'phase_train'

Also, train.py can't resolve the line
from gcn.utils import *
By the way, I'm using Spyder 3.2.4 (python 3.6.2)

Thanks

How to use the model for online predict

Because the training process is only semi-supervised learning, the model can't be used for online prediction.

build symmetric adjacency matrix

Hi,

Just curious about this line in utils/load_data
adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)

It seems to fail because the comparison makes the result boolean. Is it even needed? As I see it, the adjacency matrix is already symmetric, with a shape of (2708, 2708).
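The quoted expression evaluates to the element-wise maximum of `adj` and `adj.T`: wherever `adj.T > adj` the entry becomes `adj.T`, otherwise it stays `adj`. A square shape does not by itself mean the matrix is symmetric, so the step matters whenever the edge list is directed. A sketch of an equivalent formulation that avoids mixing boolean and float sparse matrices is shown below; `symmetrize` is a hypothetical name.

```python
import numpy as np
import scipy.sparse as sp


def symmetrize(adj):
    """Element-wise max(adj, adj.T): keep an edge if either direction exists."""
    return adj.maximum(adj.T).tocsr()


# Directed toy graph with edges 0->1 and 2->1 only
adj = sp.coo_matrix((np.ones(2), ([0, 2], [1, 1])), shape=(3, 3)).tocsr()
sym = symmetrize(adj)
```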

Error converting string to float in train.py

Ran into the following issue when running the train.py file. I am using python 3.4.2 and Keras 1.1.0

Using Theano backend.
Loading cora dataset...
Traceback (most recent call last):
  File "C:/Users/sagars/PycharmProjects/GCN/kegra/train.py", line 22, in <module>
    X, A, y = load_data(dataset=DATASET)
  File "C:\Users\sagars\PycharmProjects\GCN\kegra\utils.py", line 20, in load_data
    features = sp.csr_matrix(idx_features_labels[:, 1:-2], dtype=np.float32)
  File "C:\Python34\lib\site-packages\scipy\sparse\compressed.py", line 69, in __init__
    self._set_self(self.__class__(coo_matrix(arg1, dtype=dtype)))
  File "C:\Python34\lib\site-packages\scipy\sparse\coo.py", line 204, in __init__
    self.data = self.data.astype(dtype)
ValueError: could not convert string to float: "b'0'"

error occur when save and reload the model

Hi Thomas,
A problem occurred while I was learning GCN. I tried to save the model and reload it at the end of train.py:

model.save('gcn_model.h5')
print('save successfully')
model2 = load_model('gcn_model.h5')
preds = model.predict(graph, batch_size=A.shape[0])
print(preds)

and the result is as follows:

...
Epoch: 0196 train_loss= 0.4453 train_acc= 0.9714 val_loss= 0.8001 val_acc= 0.8233 time= 0.0692
Epoch: 0197 train_loss= 0.4434 train_acc= 0.9714 val_loss= 0.7983 val_acc= 0.8233 time= 0.0692
Epoch: 0198 train_loss= 0.4415 train_acc= 0.9714 val_loss= 0.7969 val_acc= 0.8200 time= 0.0672
Epoch: 0199 train_loss= 0.4394 train_acc= 0.9714 val_loss= 0.7952 val_acc= 0.8200 time= 0.0672
Epoch: 0200 train_loss= 0.4373 train_acc= 0.9714 val_loss= 0.7932 val_acc= 0.8167 time= 0.0762
Test set results: loss= 0.8567 accuracy= 0.8000
Traceback (most recent call last):
  File "train.py", line 107, in <module>
    model.save('gcn_model.h5')
  File "D:\Anaconda3\lib\site-packages\keras\engine\topology.py", line 2553, in save
    save_model(self, filepath, overwrite, include_optimizer)
  File "D:\Anaconda3\lib\site-packages\keras\models.py", line 107, in save_model
    'config': model.get_config()
  File "D:\Anaconda3\lib\site-packages\keras\engine\topology.py", line 2326, in get_config
    layer_config = layer.get_config()
  File "D:\Anaconda3\lib\site-packages\kegra-0.0.1-py3.5.egg\kegra\layers\graph.py", line 80, in get_config
AttributeError: 'VarianceScaling' object has no attribute '__name__'

I ran the code on Windows 10 with keras==2.0.5 and tensorflow==1.3.0. I have tried many other methods, but they didn't work.
I look forward to your reply!

Negative values in the adjacency matrix and the feature matrix

Hello @tkipf

Are we allowed to have negative values in the adjacency matrix?
Does the same apply to the feature matrix?

About the conv operation

In the GraphConvolution layer, does the conv operation in call() only update the first example's features? I saw that in your TensorFlow GCN the conv operation is implemented on every column of "adj", so the conv operation produces a new feature matrix X. Is the Keras version different from the TensorFlow one? I hope to hear from you!

Tensorflow bug

Hi Thomas,

I've tried to run the train.py code. When using Theano as a backend it works but when I try to use Tensorflow it breaks. I've changed nothing in the source code but the code still breaks. Do you happen to know why this is the case? Or is Tensorflow just not supported yet?

python train.py
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.5.1.5 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so.8.0 locally
Loading cora dataset...
Dataset has 2708 nodes, 5429 edges, 1432 features.
Using local pooling filters...
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:936] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX 960M
major: 5 minor: 0 memoryClockRate (GHz) 1.176
pciBusID 0000:01:00.0
Total memory: 3.95GiB
Free memory: 3.54GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:906] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:916] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 960M, pci bus id: 0000:01:00.0)
Traceback (most recent call last):
  File "train.py", line 76, in <module>
    batch_size=A.shape[0], nb_epoch=1, shuffle=False, verbose=0)
  File "/home/wolf/anaconda2/envs/tf_gpu/lib/python2.7/site-packages/keras/engine/training.py", line 1124, in fit
    callback_metrics=callback_metrics)
  File "/home/wolf/anaconda2/envs/tf_gpu/lib/python2.7/site-packages/keras/engine/training.py", line 842, in _fit_loop
    outs = f(ins_batch)
  File "/home/wolf/anaconda2/envs/tf_gpu/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 1040, in __call__
    updated = session.run(self.outputs + [self.updates_op], feed_dict=feed_dict)
  File "/home/wolf/anaconda2/envs/tf_gpu/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 766, in run
    run_metadata_ptr)
  File "/home/wolf/anaconda2/envs/tf_gpu/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 945, in _run
    raise ValueError('Tensor %s may not be fed.' % subfeed_t)
ValueError: Tensor Output("Const:0", shape=(2,), dtype=int64) may not be fed.

Error while training using CORA dataset

Hi,

I wanted to check if the Windows 10 installation was successful using the train.py script that came with the package.

Here are the messages and the error:

\keras-gcn-master\kegra>python train.py
Using Theano backend.
WARNING (theano.configdefaults): g++ not available, if using conda: conda install m2w64-toolchain
\AppData\Roaming\Python\Python27\site-packages\theano-1.0.1+unknown-py2.7.egg\theano\configdefaults.py:560: UserWarning: DeprecationWarning: there is no c++ compiler.This is deprecated and with Theano 0.11 a c++ compiler will be mandatory
warnings.warn("DeprecationWarning: there is no c++ compiler."
WARNING (theano.configdefaults): g++ not detected ! Theano will be unable to execute optimized C-implementations (for both CPU and GPU) and will default to Python implementations. Performance will be severely degraded. To remove this warning, set Theano flags cxx to an empty string.
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
Loading cora dataset...
Dataset has 2708 nodes, 5429 edges, 1433 features.
Using local pooling filters...
Traceback (most recent call last):
File "train.py", line 55, in
H = GraphConvolution(16, support, activation='relu', kernel_regularizer=l2(5e-4))([H]+G)
File "Anaconda2\lib\site-packages\keras\engine\topology.py", line 573, in call
self.assert_input_compatibility(inputs)
File "Anaconda2\lib\site-packages\keras\engine\topology.py", line 446, in assert_input_compatibility
str(inputs) + '. All inputs to the layer '
ValueError: Layer graph_convolution_1 was called with an input that isn't a symbolic tensor. Received type: <class 'theano.sparse.basic.SparseVariable'>. Full input: [if{}.0, SparseVariable{csr,float32}]. All inputs to the layer should be tensors.

Thanks.

use keras gcn+ lstm, with dynamic graph G at each time step

Hi Thomas, this work is great and very useful to me!
I noticed that the current implementation constructs the graph G from a real-valued input adjacency matrix A. If the graph is dynamic, this adjacency matrix also changes over time, so I am thinking we may need to build the tensor G from an adjacency tensor A instead of a single real-valued matrix.
My idea may well be wrong. I look forward to your suggestions!

Graph Convolution Layer Correct?

Hey,
I may be misunderstanding your code for the graph convolution layer. Is graph convolution implemented within graph.py? I know that in your TF version there's a chunk of code marked with the comment #convolve in layers.py. What is the corresponding part in graph.py? Thank you for your time.

Differences on the Cora dataset

The labels here in keras-gcn do not seem to correspond to the labels in the gcn repository when you load the data. The indices are the same, but the values are not.
Also, if you sum y_train here, there aren't 20 labels per class.

Are the two datasets actually different as it seems?
What's the reason for that and which one should we use to replicate the paper results?

index dislocation when spliting loading features

Thanks for your enlightening work!

I think there is a minor mistake in function load_data, file utils.py, line 15:

features = sp.csr_matrix(idx_features_labels[:, 1:-2], dtype=np.float32)

idx_features_labels[:, 1:-2] should be replaced by idx_features_labels[:, 1:-1], otherwise the last column of word attributes will be omitted.
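The effect of the two slices can be checked on a toy array laid out like a cora.content row: an ID column, the word columns, and a label column. The IDs, the four word columns, and the labels below are made up for illustration.

```python
import numpy as np

# One row per paper: [paper_id, w_1, ..., w_4, label] with 4 hypothetical word columns
idx_features_labels = np.array([
    ["1", "0", "1", "0", "1", "class_a"],
    ["2", "1", "0", "0", "1", "class_b"],
])

dropped_last = idx_features_labels[:, 1:-2].astype(np.float32)  # loses w_4
all_features = idx_features_labels[:, 1:-1].astype(np.float32)  # keeps w_1..w_4
labels = idx_features_labels[:, -1]
```

This is consistent with the logs elsewhere on this page, where some runs report 1432 features and others 1433.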

Face TypeError during load data

I generated data in the same format as the Cora dataset (cora.cites & cora.content). However, after modifying my data a few times, I get the error below.
Traceback (most recent call last):
File "train.py", line 26, in
X, A, y = load_data(cites_path, content_path)
File "/data/zhangqifan/projects/keras-gcn-master/kegra/utils.py", line 26, in load_data
edges = np.array(list(map(idx_map.get, edges_unordered.flatten())), dtype=np.int64).reshape(edges_unordered.shape)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

Could you give me some ideas about addressing this problem?
Thanks!
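The `NoneType` in the error indicates that `idx_map.get` returned `None` for at least one edge endpoint, which would happen if the cites file mentions an ID that is absent from the content file (or if the two files store IDs with different types). A sketch for locating and dropping such edges follows; the function names and the sample IDs are hypothetical.

```python
import numpy as np


def find_unknown_ids(node_ids, edges_unordered):
    """Return edge endpoints that do not appear in the node id list."""
    known = set(node_ids.tolist())
    return sorted(set(edges_unordered.flatten().tolist()) - known)


def drop_unknown_edges(node_ids, edges_unordered):
    """Keep only edges whose two endpoints are both known nodes."""
    keep = np.isin(edges_unordered, node_ids).all(axis=1)
    return edges_unordered[keep]


node_ids = np.array([10, 20, 30])
edges_unordered = np.array([[10, 20], [20, 99], [30, 10]])

missing = find_unknown_ids(node_ids, edges_unordered)
clean_edges = drop_unknown_edges(node_ids, edges_unordered)
```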

Parallel processing argument for train.py (feature request)

Is there a way to limit the number of CPUs?

Using a `tf.Tensor` as a Python `bool` is not allowed in Graph execution

Traceback:
File "train.py", line 55, in
H = GraphConvolution(16, support, activation='relu', kernel_regularizer=l2(5e-4))([H]+G)
File "/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py", line 489, in call
output = self.call(inputs, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/kegra-0.0.1-py3.6.egg/kegra/layers/graph.py", line 76, in call
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py", line 450, in bool
return bool(self.read_value())
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 765, in bool
self._disallow_bool_casting()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 534, in _disallow_bool_casting
self._disallow_in_graph_mode("using a tf.Tensor as a Python bool")
File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py", line 523, in _disallow_in_graph_mode
" this function with @tf.function.".format(task))
tensorflow.python.framework.errors_impl.OperatorNotAllowedInGraphError: using a tf.Tensor as a Python bool is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.

I am using TF==1.15 and TF==2.0.0 (both with keras==2.3.1); the problem occurs in both environments.

Any help or suggestions?
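The traceback goes from line 76 of graph.py straight into a variable's boolean conversion, which suggests a truth-value test such as `if self.bias:` on a backend variable. The source of that line is not shown in the traceback, so this is a guess. The pure-Python sketch below uses a hypothetical stand-in class to show why an identity check against `None` avoids the problem.

```python
class FakeTensor:
    """Hypothetical stand-in for a backend variable that refuses truth-value tests."""

    def __init__(self, value):
        self.value = value

    def __bool__(self):
        raise TypeError("using a tensor as a Python bool is not allowed")


def add_bias_unsafe(output, bias):
    if bias:  # calls __bool__ on the tensor and raises
        output = output + bias.value
    return output


def add_bias_safe(output, bias):
    if bias is not None:  # identity check, never calls __bool__
        output = output + bias.value
    return output


bias = FakeTensor(0.5)
safe_result = add_bias_safe(1.0, bias)

try:
    add_bias_unsafe(1.0, bias)
    unsafe_failed = False
except TypeError:
    unsafe_failed = True
```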

Error running train.py

ValueError: Dimensions must be equal, but are 16 and 1432 for 'graph_convolution_2/MatMul' (op: 'MatMul') with input shapes: [?,16], [1432,7].

This is from

File "kegra/train.py", line 57, in <module>
  Y = GraphConvolution(y.shape[1], support, activation='softmax')([H]+G)
File "/Users/owang/Library/Python/2.7/lib/python/site-packages/keras/engine/topology.py", line 596, in __call__
  output = self.call(inputs, **kwargs)
File "kegra/graph.py", line 143, in call
  output = K.dot(supports, self.W)

Errors occur during the training using chebyshev way

Hi
I am very interested in your work.
Training works well with the default filter (localpool),
but it fails to run when I change the code to:

FILTER = 'chebyshev'  # 'chebyshev'

The environment I use is:

tensorflow==1.5.0
tensorflow-tensorboard==1.5.1
Keras==2.1.4

And here is the error:

(py36) liu@pattaya:~/download/keras-gcn/kegra$ python train.py
/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
Loading cora dataset...
Dataset has 2708 nodes, 5429 edges, 1433 features.
Using Chebyshev polynomial basis filters...
Calculating largest eigenvalue of normalized graph Laplacian...
Calculating Chebyshev polynomials up to order 2...
Traceback (most recent call last):
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 686, in _call_cpp_shape_fn_impl
    input_tensors_as_shapes, status)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Dimensions must be equal, but are 4299 and 1433 for 'graph_convolution_1/MatMul' (op: 'MatMul') with input shapes: [?,4299], [1433,16].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 55, in <module>
    H = GraphConvolution(16, support, activation='relu', kernel_regularizer=l2(5e-4))([H]+G)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/keras/engine/topology.py", line 617, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/kegra/layers/graph.py", line 73, in call
    output = K.dot(supports, self.kernel)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 1072, in dot
    out = tf.matmul(x, y)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 2022, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 2516, in _mat_mul
    name=name)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3162, in create_op
    compute_device=compute_device)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3208, in _create_op_helper
    set_shapes_for_outputs(op)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2427, in set_shapes_for_outputs
    return _set_shapes_for_outputs(op)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2400, in _set_shapes_for_outputs
    shapes = shape_func(op)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2330, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 627, in call_cpp_shape_fn
    require_shape_fn)
  File "/home/liu/miniconda3/envs/py36/lib/python3.6/site-packages/tensorflow/python/framework/common_shapes.py", line 691, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Dimensions must be equal, but are 4299 and 1433 for 'graph_convolution_1/MatMul' (op: 'MatMul') with input shapes: [?,4299], [1433,16].
(py36) liu@pattaya:~/download/keras-gcn/kegra$

Look forward to your reply.
Thank you very much.

Edge parameters

Hello, thanks for the great work!

The Cora dataset contains only link existence (0 or 1), but can I somehow add edge category as an additional feature? I have edges of different types.

difference between evaluate_preds and model.evaluate

Thanks for your excellent work. Your codes are really helpful.

In your code for evaluating the GCN model, I am confused by the difference between utils.evaluate_preds (your implementation) and model.evaluate (Keras API). Here are my changes to evaluate GCN using the model.evaluate function:

Add the accuracy metric to model.compile for accuracy logging:

model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.01),
                        metrics=['accuracy'])

evaluate the train data using model.evaluate:

train_loss, train_acc = model.evaluate(
    graph, y_train, sample_weight=train_mask, batch_size=X.shape[0], verbose=0)

the rest of the code:

print('evaluate: train_loss={:.4f}, train_acc:{:.4f}'.format(train_loss, train_acc))

preds = model.predict(graph, batch_size=X.shape[0])
train_val_loss, train_val_acc = utils.evaluate_preds(preds, [y_train, y_val],
                                                             [idx_train, idx_val])
print("predict:  train_loss={:.4f}, train_acc={:.4f}".format(train_val_loss[0], train_val_acc[0]))

And here are the outputs I got after 10 loops:

evaluate: train_loss=1.9505, train_acc:0.0240
predict:  train_loss=1.9389, train_acc=0.4286

evaluate: train_loss=1.9400, train_acc:0.0222
predict:  train_loss=1.9310, train_acc=0.4143

evaluate: train_loss=1.9294, train_acc:0.0233
predict:  train_loss=1.9216, train_acc=0.4429

evaluate: train_loss=1.9191, train_acc:0.0233
predict:  train_loss=1.9114, train_acc=0.4500

evaluate: train_loss=1.9091, train_acc:0.0229
predict:  train_loss=1.9007, train_acc=0.4429

evaluate: train_loss=1.8993, train_acc:0.0229
predict:  train_loss=1.8895, train_acc=0.4429

evaluate: train_loss=1.8895, train_acc:0.0240
predict:  train_loss=1.8777, train_acc=0.4643

evaluate: train_loss=1.8797, train_acc:0.0240
predict:  train_loss=1.8655, train_acc=0.4643

evaluate: train_loss=1.8697, train_acc:0.0240
predict:  train_loss=1.8529, train_acc=0.4643

evaluate: train_loss=1.8595, train_acc:0.0236
predict:  train_loss=1.8398, train_acc=0.4571

Test set results: loss= 1.8782 accuracy= 0.3590
[Finished in 19.3s]

According to the Keras documentation, regularization mechanisms such as Dropout and L1/L2 weight regularization are turned off at testing time.

So why is the loss returned by model.evaluate not exactly the same as the one from utils.evaluate_preds?

What I have tried:

I tried to implement the categorical_crossentropy loss function following the Keras TensorFlow backend. Here is my code:

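import numpy as np

def categorical_crossentropy_keras(preds, labels):
    # Scale preds so that the class probabilities of each sample sum to 1
    # (computed out of place, so the caller's array is not modified).
    preds = preds / np.sum(preds, axis=-1, keepdims=True)
    # Clip to avoid log(0); 1e-7 is the same value as the original 10e-8.
    _epsilon = 1e-7
    output = np.clip(preds, _epsilon, 1. - _epsilon)
    loss = -np.sum(labels * np.log(output), axis=-1)
    return np.mean(loss)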

But the results of this function are exactly the same as those of utils.categorical_crossentropy.
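
To make the accuracy gap concrete, here is a small NumPy sketch of how a metric averaged only over the masked (training) nodes differs from one that is mask-weighted but divided by the total node count. This is an assumption about where the gap might come from, not a confirmed description of what model.evaluate does internally; the arrays and both helper names are made up for illustration.

```python
import numpy as np

def masked_accuracy(preds, labels, mask):
    """Accuracy over the nodes selected by a boolean mask only."""
    correct = np.argmax(preds, axis=1) == np.argmax(labels, axis=1)
    return correct[mask].mean()

def mask_weighted_accuracy_over_all(preds, labels, mask):
    """Accuracy weighted by the mask but divided by the total node count."""
    correct = np.argmax(preds, axis=1) == np.argmax(labels, axis=1)
    return (correct * mask).sum() / len(mask)

preds = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.3, 0.7]])
labels = np.array([[1, 0], [1, 0], [0, 0], [0, 0]])  # unlabeled rows are all zero
mask = np.array([True, True, False, False])

acc_masked = masked_accuracy(preds, labels, mask)
acc_all = mask_weighted_accuracy_over_all(preds, labels, mask)
print(acc_masked, acc_all)
```

The logged numbers are at least consistent with this reading: 0.4500 × 140 = 63 and 0.0233 × 2708 ≈ 63, so both values may be counting the same correct training nodes and dividing by 140 and 2708 respectively.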

Trying to set up network for regression task

Could you possibly point me to where to start with this, so I can set it up for a regression task? I plan on using an L2 loss.

I have multiple samples of data that lie on the same graph; however, the feature at each node is a vector. I was wondering how to get around this, since I'm dealing with a 3D tensor:

X : (Samples x Nodes x features)
Y: (Samples x Nodes x features) as well

Most of the operations in both implementations seem to only work for X being a matrix, unless I'm mistaken.

Of course, my adjacency and Laplacian will remain 2D matrices for all operations (so the Chebyshev polynomials will be 2D as well).
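
A minimal NumPy sketch of the propagation step for this setting, assuming one normalized adjacency matrix shared by all samples. It only shows the tensor shapes involved; it is not the kegra layer, and the function name `batched_graph_conv` is hypothetical.

```python
import numpy as np

def batched_graph_conv(adj_norm, x, weight):
    """Apply A_hat @ X_s @ W to every sample s in a batch.

    adj_norm: (N, N) normalized adjacency shared by all samples
    x:        (S, N, F) node features per sample
    weight:   (F, F_out) layer weights
    """
    # A_hat @ X_s for each sample s, without looping over samples.
    propagated = np.einsum('ij,sjf->sif', adj_norm, x)
    # The matrix product broadcasts over the leading sample axis.
    return propagated @ weight

rng = np.random.default_rng(0)
adj_norm = np.eye(4)
x = rng.normal(size=(5, 4, 3))
weight = rng.normal(size=(3, 2))
out = batched_graph_conv(adj_norm, x, weight)
print(out.shape)
```

For regression, the output of the last such layer could be compared to Y of shape (Samples x Nodes x features) with a mean squared error.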

Errors when testing the train.py with tensorflow as backend

Hi,
I was trying to test the keras-gcn package with the cora dataset.
But there is always an error when I try to run the command: H = GraphConvolution(16, support, activation='relu', kernel_regularizer=l2(5e-4))([H]+G)

Output message:
Loading cora dataset...
Dataset has 2708 nodes, 5429 edges, 1433 features.

InvalidArgumentError: Dimensions must be equal, but are 4299 and 1433 for 'graph_convolution_2/MatMul' (op: 'MatMul') with input shapes: [?,4299], [1433,16].

The dataset has a shape of (2708, 1433) for input and (2708, 7) for output.

Do you have a clue what causes this error and how I can fix it?
My Keras version is 2.1.1, with TensorFlow (version 1.4.0) as the backend.

Thanks!
Best,
Yu.
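
One observation, offered as a guess rather than a diagnosis: 4299 = 3 × 1433, so the left operand looks like three propagated feature matrices concatenated side by side, while the kernel has rows for only one. The sketch below reproduces that arithmetic in NumPy with small made-up sizes; it does not use the kegra layer itself.

```python
import numpy as np

num_nodes, num_features, units = 6, 4, 2
features = np.ones((num_nodes, num_features))
bases = [np.eye(num_nodes) for _ in range(3)]   # three support matrices

# Propagate the features through every basis and concatenate column-wise.
supports = np.concatenate([b @ features for b in bases], axis=1)
print(supports.shape)            # (6, 12): width is 3 * num_features

kernel_for_one = np.zeros((num_features, units))
kernel_for_three = np.zeros((3 * num_features, units))

out = supports @ kernel_for_three          # inner dimensions agree
try:
    supports @ kernel_for_one              # (6, 12) @ (4, 2) does not agree
    mismatch = False
except ValueError:
    mismatch = True
print(out.shape, mismatch)
```

If this reading is right, the number of adjacency-type inputs passed to the layer and the `support` value used to size its kernel would need to match.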

Model.call() fails on GraphConvolution layer, cannot connect to other models

Dear Thomas,

Thank you for sharing this interesting package. When trying to link a GraphConvolution model to other models via the call() function, I ran into an error. Here is the minimal code for reproducing it, with the same input structure as in train.py:

from keras.layers import Input
from keras.models import Model

from keras.optimizers import Adam

from kegra.layers.graph import GraphConvolution

featureInput = Input(shape=(1,))
adjacencyInput = Input(shape=(None, None), batch_shape=(None,None), sparse=False)
support=1

output = GraphConvolution(1, support, activation='linear')([featureInput,adjacencyInput])

# Compile model
graphConvModel = Model(inputs=[featureInput, adjacencyInput], outputs=output)
graphConvModel.compile(loss='mean_squared_error', optimizer=Adam(lr=1e-4))

The model compiles successfully, and I can train it, and predict with it. However, when I try to run the call function, for example like this: graphConvModel([featureInput,adjacencyInput]), I get the following error message:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-826-c1aa401b2630> in <module>()
----> 1 graphConvModel([featureInput,adjacencyInput])

~\Anaconda3\lib\site-packages\keras\engine\base_layer.py in __call__(self, inputs, **kwargs)
    455             # Actually call the layer,
    456             # collecting output(s), mask(s), and shape(s).
--> 457             output = self.call(inputs, **kwargs)
    458             output_mask = self.compute_mask(inputs, previous_mask)
    459 

~\Anaconda3\lib\site-packages\keras\engine\network.py in call(self, inputs, mask)
    562             return self._output_tensor_cache[cache_key]
    563         else:
--> 564             output_tensors, _, _ = self.run_internal_graph(inputs, masks)
    565             return output_tensors
    566 

~\Anaconda3\lib\site-packages\keras\engine\network.py in run_internal_graph(self, inputs, masks)
    759                                 'and output masks. Layer ' + str(layer.name) + ' has'
    760                                 ' ' + str(len(output_tensors)) + ' output tensors '
--> 761                                 'and ' + str(len(output_masks)) + ' output masks.')
    762                     # Update model updates and losses:
    763                     # Keep track of updates that depend on the inputs

Exception: Layers should have equal number of output tensors and output masks. Layer graph_convolution_90 has 1 output tensors and 2 output masks.

With multiple GraphConvolution layers, the error always occurs at the first layer.

Changing the node count makes no difference. I suspect that the batch shape difference between the two inputs might be why there are 2 output masks, but I couldn't find a change to the shape and batch_shape arguments of the inputs that would compile successfully and avoid the issue.

Setup details:
Keras version: 2.2.4
Tensorflow version: 1.12.0

Sincerely,
Robert Beck

About the input Matrix A.

In this training setup, the adjacency matrix of the training data contains only 0 or 1.
But my adjacency matrix is an N×N similarity matrix with values in [0, 1].
Should I change the normalize_adj method?
Looking forward to your reply.
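
For reference, a minimal sketch of symmetric normalization applied to a weighted matrix, assuming non-negative similarities. The formula $D^{-1/2} A D^{-1/2}$ does not require binary entries; the function name `normalize_weighted_adj` and the `add_self_loops` flag are hypothetical and not part of kegra.

```python
import numpy as np

def normalize_weighted_adj(sim, add_self_loops=True):
    """Symmetrically normalize a non-negative (N, N) similarity matrix."""
    a = np.asarray(sim, dtype=np.float64)
    if add_self_loops:
        a = a + np.eye(a.shape[0])
    degree = a.sum(axis=1)                 # weighted degree of each node
    d_inv_sqrt = np.zeros_like(degree)
    nonzero = degree > 0                   # guard against isolated nodes
    d_inv_sqrt[nonzero] = degree[nonzero] ** -0.5
    # Equivalent to D^{-1/2} @ A @ D^{-1/2} without building D explicitly.
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

sim = np.array([[0.0, 0.5],
                [0.5, 0.0]])
norm = normalize_weighted_adj(sim)
print(norm)
```

If the similarity matrix already has ones on its diagonal, `add_self_loops=False` avoids counting each node twice.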

ImportError: cannot import name initializations

from kegra.layers.graph import GraphConvolution, GraphInput
gives the error
ImportError: cannot import name initializations

The error is due to
----> 3 from keras import activations, initializations
and it occurs with either TensorFlow or Theano as the backend.

some problem need help

@tkipf Thanks for your code. I have a problem: I wrote a Keras implementation in a Jupyter notebook, but when I run model.compile, it throws an error. Can you help me? My notebook URL is https://github.com/searchlink/gcn/blob/master/keras_gcn_implement.ipynb

About the call() in the graph.py

I have several questions about call() in graph.py:
1> Does "basis" represent $\hat{A}$ ($\hat{A} = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2}$)?
2> I saw that "support" = 1. Does that mean "supports" only contains one example's new features? If so, where are the other examples' new features?

Besides, for the first graph layer, the "inputs" contain $\hat{A}$, but for the second graph layer, how is $\hat{A}$ contained in the "inputs"?
Thank you very much! I look forward to your reply!

Convert string error. Any comments?

When I run python train.py, I get the following error. It seems to be a SciPy problem, but I can't figure it out.

The information:

Using TensorFlow backend.
Loading cora dataset...
Traceback (most recent call last):
  File "train.py", line 22, in <module>
    X, A, y = load_data(dataset=DATASET)
  File "/Users/Tiger/anaconda/envs/rllab3/lib/python3.5/site-packages/kegra-0.0.1-py3.5.egg/kegra/utils.py", line 20, in load_data
    features = sp.csr_matrix(idx_features_labels[:, 1:-2], dtype=np.float32)
  File "/Users/Tiger/anaconda/envs/rllab3/lib/python3.5/site-packages/scipy/sparse/compressed.py", line 79, in __init__
    self._set_self(self.__class__(coo_matrix(arg1, dtype=dtype)))
  File "/Users/Tiger/anaconda/envs/rllab3/lib/python3.5/site-packages/scipy/sparse/coo.py", line 182, in __init__
    self.data = self.data.astype(dtype, copy=False)
ValueError: could not convert string to float: "b'0'"

About the adjacency Matrix A coding

Hello, I'm puzzled about the adjacency matrix A.
The paper says that $\tilde{A} = A + I_N$, but the code computes the adjacency matrix as "adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)". Where is the $I_N$? Did I miss it?
Thank you very much! Looking forward to your reply!
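
A small dense NumPy sketch of what the quoted line appears to do: it symmetrizes the matrix (taking the larger of each entry and its transpose) and does not add self-loops. If I read the repository correctly, the identity is added in a later preprocessing step, but treat that as an assumption. The toy graph below is made up.

```python
import numpy as np

adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]], dtype=np.float64)   # directed toy graph

# Dense equivalent of the quoted line: take the entry from the transpose
# wherever the transpose is larger, which yields a symmetric matrix.
larger_in_transpose = adj.T > adj
sym = adj + adj.T * larger_in_transpose - adj * larger_in_transpose

# Self-loops (the I_N term) are a separate step.
adj_tilde = sym + np.eye(adj.shape[0])
print(adj_tilde)
```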

GCN apply for my own dataset but training_acc is low

My data is supervised, and I don't know whether that influences the result. My training_acc is 0.0911, but val_acc is always 0, and training_loss is decreasing while val_loss is increasing. I don't know how to fix it. If anyone can help me, thank you a lot.

TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed.

I have multiple graph instances, but they have different numbers of nodes.

Different graphs mean adjacency matrices of different sizes. How do I deal with that?
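
One common workaround, sketched here as an assumption rather than something this repository provides: stack the graphs into a single block-diagonal adjacency matrix, so that graphs of different sizes become disconnected components of one large graph. The helper name `block_diagonal` is hypothetical.

```python
import numpy as np

def block_diagonal(adjs):
    """Place each (n_i, n_i) adjacency matrix on the diagonal of one big matrix."""
    sizes = [a.shape[0] for a in adjs]
    total = sum(sizes)
    out = np.zeros((total, total))
    offset = 0
    for a, n in zip(adjs, sizes):
        out[offset:offset + n, offset:offset + n] = a
        offset += n
    return out

graph_a = np.array([[0.0, 1.0],
                    [1.0, 0.0]])          # 2 nodes
graph_b = np.ones((3, 3)) - np.eye(3)     # 3 nodes, fully connected
big = block_diagonal([graph_a, graph_b])
print(big.shape)
```

The node feature matrices would be concatenated row-wise in the same order, so that row i of the features matches row i of the stacked adjacency.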

Model Fails when we change batch size

For the Cora dataset, if I change the batch size to, say, 100, the model fails to run.

Traceback (most recent call last):
File "/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/anaconda3/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot multiply A and B because inner dimension does not match: 2708 vs. 100. Did you forget a transpose? Dimensions of A: [100, 2708). Dimensions of B: [100,1433]
[[{{node graph_convolution_1/SparseTensorDenseMatMul/SparseTensorDenseMatMul}}]]
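
The shapes in this message can be reproduced with a small NumPy sketch, under the assumption that a mini-batch slices the rows of both the adjacency matrix and the feature matrix. The sizes below are made up; the point is only that the product needs all N rows of the features at once.

```python
import numpy as np

num_nodes, num_features, batch = 8, 3, 4
adj = np.eye(num_nodes)
features = np.ones((num_nodes, num_features))

full = adj @ features                       # (8, 8) @ (8, 3) works

# Row-slicing both inputs to a mini-batch breaks the inner dimension.
adj_batch, feat_batch = adj[:batch], features[:batch]
try:
    adj_batch @ feat_batch                  # (4, 8) @ (4, 3)
    failed = False
except ValueError:
    failed = True
print(full.shape, failed)
```

This would explain why the original script uses a batch size equal to the number of nodes.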

name 'Y' is not defined

It seems that there is no predefined "Y" in train.py

Hi Thomas, A problem occur when i learning the gcn.I tried to save the model and reload it at the end of the train.py,

Hi Thomas,
A problem occurred while I was learning GCN. I tried to save the model and reload it at the end of train.py,