hanjun-dai / graph_comb_opt


Implementation of "Learning Combinatorial Optimization Algorithms over Graphs"

Home Page: https://arxiv.org/abs/1704.01665

License: MIT License

Shell 5.45% Python 18.33% C++ 75.06% C 1.15%

graph_comb_opt's Introduction

graph_comb_opt

Implementation of "Learning Combinatorial Optimization Algorithms over Graphs" (https://arxiv.org/abs/1704.01665)

Step-by-step demo of MVC solution found by different methods. From left to right: (1) S2V-DQN (our method), (2) node-degree heuristic, (3) edge-degree heuristic

1. Build

**** The steps below use MVC as the example. All other problems follow a similar pipeline. ****

Get the source code, and install all the dependencies.

git clone --recursive https://github.com/Hanjun-Dai/graph_comb_opt

Build the graphnn library by following the instructions here:
  https://github.com/Hanjun-Dai/graphnn

For each task, build the dynamic library. For example, to build the Minimum Vertex Cover library:

cd code/s2v_mvc/mvc_lib/
cp Makefile.example Makefile

Customize your Makefile if necessary
(add CXXFLAGS += -DGPU_MODE to the Makefile if you want to run in GPU mode)

make -j

Now you are all set with the C++ backend.

2. Experiments on synthetic data

Generate synthetic data

To generate the synthetic data for the Minimum Vertex Cover task, do the following:

cd code/data_generator/mvc
modify parameters in run_generate.sh
./run_generate.sh

The above code will generate 1000 test graphs under /path/to/the/project/data/mvc

Training with n-step Q-learning

Navigate to the MVC folder and run the training script. Modify the script to change the parameters.

cd code/s2v_mvc
./run_nstep_dqn.sh

By default, the model files and logs are saved under currentfolder/results. Note that the training code generates its data on the fly, including the validation dataset, so it does not rely on the data generator.

Test the performance

Navigate to the MVC folder and run the evaluation script. Modify the script to change the parameters. Make sure the parameters are consistent with your training script.

cd code/s2v_mvc
./run_eval.sh

The above script loads the 1000 test graphs you generated earlier and writes the solutions to a CSV file in the same results folder. Format of the CSV for MVC:

cover size, cover_size a_1 a_2 a_3 ...., time in seconds

Here the second column shows a solution found by S2V-DQN, with the nodes listed in the order in which they were picked.
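A minimal sketch of how one line of this CSV could be read back, assuming the three fields are comma-separated and the second field is a space-separated list that starts with the cover size and continues with the picked node indices. The function name `parse_mvc_result_line` and the sample line are hypothetical and are not taken from the repository.

```python
def parse_mvc_result_line(line):
    """Parse one line of the form 'cover size, cover_size a_1 a_2 ..., time'.

    Assumption: commas separate the three columns and spaces separate the
    entries of the second column.
    """
    size_field, solution_field, time_field = [p.strip() for p in line.split(",")]
    tokens = solution_field.split()
    return {
        # int(float(...)) tolerates both '3' and '3.0'
        "cover_size": int(float(size_field)),
        "listed_size": int(tokens[0]),
        # nodes stay in the order in which they were picked
        "nodes": [int(t) for t in tokens[1:]],
        "seconds": float(time_field),
    }


# Hypothetical sample line, for illustration only.
result = parse_mvc_result_line("3, 3 4 0 7, 0.0125")
print(result["nodes"], result["seconds"])
```

Keeping the node list as an ordered list rather than a set preserves the pick order, which is what the second column is meant to convey.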

3. Experiments on real-world data

For TSP, we test on a subset of the TSPLIB instances; for MVC and SCP, we use the MemeTracker dataset; for MAXCUT, we test on the Optsicom dataset.

All the data can be found through the Dropbox link below. Code folders that start with 'realworld' are for this set of experiments.

Reproducing the results reported in the paper

Here is the link to the dataset that was used in the paper:

https://www.dropbox.com/sh/r39596h8e26nhsp/AADRm5mb82xn7h3BB4KXgETsa?dl=0

Reference

Please cite our work if you find our code or paper useful.

@article{dai2017learning,
  title={Learning Combinatorial Optimization Algorithms over Graphs},
  author={Dai, Hanjun and Khalil, Elias B and Zhang, Yuyu and Dilkina, Bistra and Song, Le},
  journal={arXiv preprint arXiv:1704.01665},
  year={2017}
}


graph_comb_opt's Issues

errors on training

When I run ./run_nstep_dqn.sh in s2v_tsp2d, it returns an error.

Some problems also occur when I run ./run_nstep_dqn.sh in s2v_mvc.

Can you help me?

How to implement with pytorch_structure2vec?

Hi, thanks for your code! What should I do if I want to use pytorch_structure2vec for this work (TSP instances)?

Testing tsp instances

If I want to test on the TSP instances, do I have to re-train the DQN model?

How to understand the NN part, e.g., qnet.cpp

I find it hard to understand how the NN works, e.g., the functions 'QNet::SetupGraphInput()' and 'QNet::BuildNet()'. Can you give more detailed instructions about the code? Thanks a lot!

Questions about understanding s2v_mvc

Hi, thanks for the great paper and sharing your code!

I really liked your paper and am currently trying to re-implement it in PyTorch / Deep Graph Library (https://www.dgl.ai/). I would be grateful if you could help me with some questions regarding the code (specifically s2v_mvc).

For the n-step q net fitting in s2v_mvc folder,

it says n=5 for minimum vertex cover in the paper and in "evaluate.sh", but "run_nstep_dqn.sh" uses n=2 for training. Would I be able to obtain the results in Figure D.2 in the appendix of the paper if I switch to n=5 without changing the other hyper-parameters?

For the implementation of q net, I am trying to understand the differences between the code and the paper (which I am totally happy with). Could you confirm my following understandings?
2. The implemented q_net takes input as an "uncovered" subgraph with respect to the currently selected nodes with node features=1.
3. The network takes an additional 3-dimensional input "aux_feat" containing a) ratio of covered nodes, b) ratio of covered edges and c) a bias term.

It would also be very helpful if you could point out other MVC-specific parts of the implementation. Thanks very much in advance!
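The 3-dimensional auxiliary input described in this question can be sketched as follows. This only illustrates the poster's reading of the code, which the thread does not confirm. The function name `mvc_aux_features` is hypothetical, and interpreting "ratio of covered nodes" as the fraction of nodes already selected is an assumption.

```python
def mvc_aux_features(num_nodes, edges, selected):
    """Return [node ratio, covered-edge ratio, bias] for a partial cover.

    Assumptions (not confirmed by the repository):
      * node ratio = selected nodes / all nodes
      * an edge counts as covered if at least one endpoint is selected
      * the bias term is the constant 1.0
    """
    selected = set(selected)
    covered_edges = sum(1 for u, v in edges if u in selected or v in selected)
    node_ratio = len(selected) / float(num_nodes) if num_nodes else 0.0
    edge_ratio = covered_edges / float(len(edges)) if edges else 0.0
    return [node_ratio, edge_ratio, 1.0]


# Path graph 0-1-2-3 with node 1 selected: edges (0,1) and (1,2) are covered.
print(mvc_aux_features(4, [(0, 1), (1, 2), (2, 3)], [1]))
```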

MaxCut loss always 0

Hi,
Thanks for the code. I am trying to run the MaxCut algorithm, but it seems that the loss is always 0. Any idea what could be going wrong?
Thanks

run error in realworld_s2v_mvc

I tested on real-world data, using the Infonet data from the link in the README.
I compiled mvc_lib, which produced many warnings.
When I try to run realworld_s2v_mvc, the following error comes up.
Please help me.

File "/home/juseong/graph_comb_opt/code/realworld_s2v_mvc/../memetracker/meme.py", line 50, in build_full_graph
for edge in g.edges_iter(data=True):
AttributeError: 'Graph' object has no attribute 'edges_iter'
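`Graph.edges_iter` was removed in networkx 2.0, where `Graph.edges(...)` takes its place and accepts the same `data` and `default` keywords. A small compatibility helper, sketched below, would let the same call site run against either version. The name `iter_edges` is hypothetical and not part of the repository.

```python
def iter_edges(graph, **kwargs):
    """Iterate over edges on both networkx 1.x and 2.x+.

    networkx 1.x exposes Graph.edges_iter(...); from 2.0 on, that method is
    gone and Graph.edges(...) returns an iterable view instead.
    """
    edges_iter = getattr(graph, "edges_iter", None)
    if edges_iter is not None:
        return edges_iter(**kwargs)          # networkx 1.x
    return iter(graph.edges(**kwargs))       # networkx 2.x and later


# Usage, mirroring the failing call:
#   for edge in iter_edges(g, data=True): ...
#   edges = list(iter_edges(g, data='weight', default=1))
```

Pinning networkx to a 1.x release is the alternative if the code should stay untouched.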

Test the evaluation can't load model

I'm able to run the synthetic data generation and the training session. I use Python 2.7.18 for both training and testing, and I load with cPickle. However, evaluate.py can't load the model generated by the training session. The error is:

Traceback (most recent call last):
File "evaluate.py", line 64, in <module>
g = cp.load(f)
File "/usr/lib/python2.7/pickle.py", line 1384, in load
return Unpickler(file).load()
File "/usr/lib/python2.7/pickle.py", line 864, in load
dispatch[key](self)
File "/usr/lib/python2.7/pickle.py", line 892, in load_proto
raise ValueError, "unsupported pickle protocol: %d" % proto
ValueError: unsupported pickle protocol: 5

How to fix this issue? Thanks for your help in advance.
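Pickle protocol 5 is only written by Python 3.8 or newer, while Python 2.7 reads protocols 0 to 2, so the file being loaded was presumably produced under Python 3. One possible workaround is to re-save the file with `protocol=2` from a Python 3 interpreter. The sketch below assumes that; `resave_for_python2` and the paths are hypothetical. Even with a readable protocol, objects such as graphs may fail to unpickle if the library versions differ between the two interpreters.

```python
import pickle


def resave_for_python2(src_path, dst_path):
    """Re-write a pickle using protocol 2, the newest one Python 2.7 can read.

    Run this under the Python 3 interpreter that produced the original file.
    """
    with open(src_path, "rb") as f:
        obj = pickle.load(f)
    with open(dst_path, "wb") as f:
        pickle.dump(obj, f, protocol=2)


# Example (hypothetical file names):
#   resave_for_python2("test_graphs.pkl", "test_graphs_py2.pkl")
```

Generating the data with the same Python 2 interpreter used for evaluation avoids the conversion altogether.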

Error on evaluation of s2v_maxcut

I managed to execute run_nstep_dqn.sh of s2v_maxcut.
But, I got this error when executing run_eval.sh:
Traceback (most recent call last):
File "evaluate.py", line 69, in <module>
api.InsertGraph(g, is_test=True)
File "/home/abcd/graph_comb_opt/code/s2v_maxcut/maxcut_lib/maxcut_lib.py", line 45, in InsertGraph
n_nodes, n_edges, e_froms, e_tos, weights = self.__CtypeNetworkX(g)
File "/home/abcd/graph_comb_opt/code/s2v_maxcut/maxcut_lib/maxcut_lib.py", line 24, in __CtypeNetworkX
edges = list(g.edges_iter(data='weight', default=1))
AttributeError: 'Graph' object has no attribute 'edges_iter'

There is an error in running tsp. How can I solve it? Thank you

I'm running the TSP sample in \code\s2v_tsp2d\main.py, and an error is displayed:
“Could not find module 'C:\Users\admin\Downloads\graph_comb_opt-master\code\s2v_tsp2d\tsp2d_lib\build\dll\libtsp2d.so' (or one of its dependencies). Try using the full path with constructor syntax.”
How can I solve it? Thank you!

How to use the trained model

Great work! I have been trying to use the trained model. I figured that only a few of the arguments are needed to initialize MvcLib so I tried api = MvcLib(['test.py', '-dev_id', 0, '-mem_size', 5000, '-num_env', 10, '-max_n', 20, '-batch_size', 64, '-n_step', 5]) to initiate api. However, it seems to be throwing a Segmentation Fault.

When I run the evaluation script, ./run_eval.sh, it throws this error:

mem_size = 500000
num_env = 10
n_step = 5
min_n = 15
max_n = 20
max_iter = 100000
dev_id = 1
max_bp_iter = 5

batch_size = 64
embed_dim = 64
learning_rate = 0.0001
w_scale = 0.01
l2_penalty = 0
momentum = 0.9
terminate called after throwing an instance of 'thrust::system::system_error'
  what():  out of memory
./run_eval.sh: line 66: 86715 Aborted                 python2 evaluate.py -n_step $n_step -dev_id $dev_id -data_test $data_test -min_n $min_n -max_n $max_n -num_env $num_env -max_iter $max_iter -mem_size $mem_size -g_type $g_type -learning_rate $learning_rate -max_bp_iter $max_bp_iter -net_type $net_type -max_iter $max_iter -save_dir $save_dir -embed_dim $embed_dim -batch_size $batch_size -reg_hidden $reg_hidden -momentum 0.9 -l2 0.00 -w_scale $w_scale

I followed the instructions to build graphnn as well as the maxcut and mvc libraries. My training scripts worked and created a results folder with model_name.model files in it. I want to load a model and use it for my problem. Is there a convenient way to use the models?

how to build graphnn

Could you tell me how to build graphnn? Also, should I use the PyTorch version of graphnn?

Error on training

When I enter the command:
./run_nstep_dqn.sh
it returns:

Traceback (most recent call last):
File "main.py", line 77, in <module>
api = MvcLib(sys.argv)
File "/home/fanchangjun/Code/graph_comb_opt/code/s2v_mvc/mvc_lib/mvc_lib.py", line 11, in __init__
self.lib = ctypes.CDLL('%s/build/dll/libmvc.so' % dir_path)
File "/home/fanchangjun/anaconda2/lib/python2.7/ctypes/__init__.py", line 366, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/fanchangjun/Code/graph_comb_opt/code/s2v_mvc/mvc_lib/build/dll/libmvc.so: cannot open shared object file: No such file or directory

Do you know the reason? Thanks!
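"cannot open shared object file: No such file or directory" generally means that nothing exists at the given path, which would be consistent with the `make` step in the task's library folder not having produced `libmvc.so`. A hedged sketch of a loader that reports this more directly is below. The helper name `load_backend` is hypothetical and not part of the repository.

```python
import ctypes
import os


def load_backend(lib_path):
    """Load a compiled backend, failing early with a hint if it is missing."""
    if not os.path.isfile(lib_path):
        raise OSError(
            "%s not found; build the dynamic library first "
            "(cp Makefile.example Makefile && make in the *_lib folder)" % lib_path
        )
    return ctypes.CDLL(lib_path)


# Example (path taken from the traceback pattern, adjust to your checkout):
#   lib = load_backend(os.path.join(dir_path, "build/dll/libmvc.so"))
```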

Building graphNN

Hello,

When I try to build the graphNN library I get this error.

make: *** No rule to make target `build_cpuonly/objs/cxx/src//nn/hit_at_k.o', needed by `build_cpuonly/lib/libgnn.a'. Stop.

Thanks!

Understanding TSP2D

Hello,

I was going through the code for s2v_tsp2d and was wondering if you could clarify something. When you call the predict function, you use arg_max to select the best node even though it is a minimization problem. I also tried setting int sign = -1; in tsp2d_env and still got minimum tour lengths. Could you explain where the minimization occurs?

Thank you for your time!

How to install MKL now?

Excuse me! Thank you very much for your excellent work. When I try to run this code, I cannot find a working Intel MKL download link in graphnn's environment instructions: "https://github.com/Hanjun-Dai/graphnn/tree/bdf51e66231d51bc2b9a560b2be255bc642d4a03#download-and-install-intel-mkl"
Could you help me? Thank you very much.

trap divide error

Hi!
I'm training on another combinatorial problem and it works well. But sometimes, when I run ./run_nstep_dqn.sh, it stops without any error message.

The system log shows this:

kernel: traps: python[7276] trap divide error ip:7f96bd2cea0b sp:7fffbb710190 error:0 in libcapmds.so[7f96bd202000+106000]

I think this occurs because the code is trying to divide a floating-point number by zero, but what could be causing it?
Thanks.

Are there some tricks not mentioned in the paper?

Hi, thank you for the code. I tried to rewrite this code in TensorFlow according to the paper, but failed to get correct results. May I ask if there are some tricks not mentioned in the paper?
Sorry for my slow-witted question, but I can barely read C++ code.

Is discount factor wrongly implemented?

Hello, I have a question about the decay factor (discount factor) in n-step DQN (paper and implementation).
I found the following equation in the paper (page 6, first paragraph):

As far as I know, the discount factor should be applied as follows:

I checked the implementation of tsp2d (code/s2v_tsp2d/tsp2d_lib/src/tsp2d_lib.cpp), and in the Fit() function the former version of the equation is implemented.

double Fit(const double lr)
{
    NStepReplayMem::Sampling(cfg::batch_size, sample);
    bool ness = false;
    for (int i = 0; i < cfg::batch_size; ++i)
        if (!sample.list_term[i])
        {
            ness = true;
            break;
        }
    if (ness)
        PredictWithSnapshot(sample.g_list, sample.list_s_primes, list_pred);
    
    list_target.resize(cfg::batch_size);
    for (int i = 0; i < cfg::batch_size; ++i)
    {
        double q_rhs = 0;
        if (!sample.list_term[i])
            q_rhs = cfg::decay * max(sample.g_list[i]->num_nodes, list_pred[i]->data());
        q_rhs += sample.list_rt[i];
        list_target[i] = q_rhs;
    }

    return Fit(lr, sample.g_list, sample.list_st, sample.list_at, list_target);
}

I'm currently studying DRL, so I might be wrong. But since I couldn't find the reason why the discount factor is used in that way, I opened this issue.
Thank you for providing nice implementation! I've really learned a lot from it!
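To make the two variants in this question concrete, the sketch below contrasts the target as it appears in the Fit() snippet with the textbook n-step form. It assumes that `sample.list_rt[i]` holds the plain, undiscounted sum of the n intermediate rewards, which the snippet itself does not show. Both function names are hypothetical.

```python
def nstep_target_as_in_fit(rewards, bootstrap_q, decay):
    """Target as in the Fit() snippet: r_sum + decay * max_a Q(s', a).

    Assumption: the stored reward is the undiscounted sum of the n rewards.
    """
    return sum(rewards) + decay * bootstrap_q


def nstep_target_standard(rewards, bootstrap_q, decay):
    """Textbook n-step target: sum_i decay^i * r_i + decay^n * max_a Q(s', a)."""
    n = len(rewards)
    discounted = sum((decay ** i) * r for i, r in enumerate(rewards))
    return discounted + (decay ** n) * bootstrap_q


rewards, bootstrap_q = [-1.0, -1.0], -3.0
print(nstep_target_as_in_fit(rewards, bootstrap_q, 0.5))   # differs from the line below
print(nstep_target_standard(rewards, bootstrap_q, 0.5))
print(nstep_target_as_in_fit(rewards, bootstrap_q, 1.0))   # both coincide when decay == 1
print(nstep_target_standard(rewards, bootstrap_q, 1.0))
```

The two forms only differ when decay is below 1, so the distinction does not matter for runs configured with a decay of 1.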

Data generator for TSP?

Hello, I'm going to run s2v_tsp2d. However, I didn't find a data generator for the TSP problem, so the program cannot run without a data file. Is that my mistake?

Understanding s2v_mc

Hi,
Thank you for sharing your code.

I am trying to understand how to write the equation for Q in s2v_mvc, given the additional auxiliary input. Following the notation in the paper, is it like this :

where a^{T} is the auxiliary input and θ_8 is 3-dimensional. Am I right? If not, can you explain how to write the equation for s2v_mvc?

Thank you!

understanding variable names

In code/s2v_mvc/mvc_lib/src/lib/qnet.cpp, line 66, what is "aux_input" used for? And in code/s2v_mvc/mvc_lib/include/config.h, what do "cfg::reg_hidden" and "cfg::aux_dim" indicate?

New clone fails due to lack of graphnn permission

When I run the first command,
[..]> git clone --recursive https://github.com/Hanjun-Dai/graph_comb_opt
I get this error, which seems to be related to outdated ssh keys online?

Cloning into 'graph_comb_opt'...
remote: Enumerating objects: 6, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 271 (delta 1), reused 3 (delta 1), pack-reused 265
Receiving objects: 100% (271/271), 2.28 MiB | 2.52 MiB/s, done.
Resolving deltas: 100% (134/134), done.
Submodule 'graphnn' (git@github.com:Hanjun-Dai/graphnn) registered for path 'graphnn'
Cloning into '[..]/graph_comb_opt/graphnn'...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@github.com:Hanjun-Dai/graphnn' into submodule path '[..]/graph_comb_opt/graphnn' failed
Failed to clone 'graphnn'. Retry scheduled
Cloning into '[..]/graph_comb_opt/graphnn'...
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
fatal: clone of 'git@github.com:Hanjun-Dai/graphnn' into submodule path '[..]/graph_comb_opt/graphnn' failed
Failed to clone 'graphnn' a second time, aborting
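"Permission denied (publickey)" is what GitHub returns when a repository is fetched over SSH (a `git@github.com:` URL) without an SSH key registered on the account; the same public repository can be fetched over HTTPS without a key. One workaround is to rewrite the submodule URL to its HTTPS form before updating submodules. The helper below is a hypothetical sketch of that rewrite and is not part of the repository.

```python
def ssh_to_https(url):
    """Turn 'git@github.com:owner/repo' into 'https://github.com/owner/repo'.

    URLs that are not in the GitHub SSH shorthand form are returned unchanged.
    """
    prefix = "git@github.com:"
    if url.startswith(prefix):
        return "https://github.com/" + url[len(prefix):]
    return url


print(ssh_to_https("git@github.com:Hanjun-Dai/graphnn"))
```

The rewritten URL can then be placed in the `url` entry of `.gitmodules`, followed by `git submodule sync` and `git submodule update --init`.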

Error when running run_eval.sh in s2v_mvc

Can't run s2v_mvc

I ran the training script, run_nstep_dqn.sh, in s2v_mvc and got this error:
File "main.py", line 77, in <module>
api = MvcLib(sys.argv)
File "/home/abcd/S2V-DQN/code/s2v_mvc/mvc_lib/mvc_lib.py", line 11, in __init__
self.lib = ctypes.CDLL('%s/build/dll/libmvc.so' % dir_path)
File "/home/abcd/anaconda2/envs/tensorflow-gpu/lib/python2.7/ctypes/__init__.py", line 366, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/abcd/S2V-DQN/code/s2v_mvc/mvc_lib/build/dll/libmvc.so: undefined symbol: _ZTVN3fmt11FormatErrorE

Data for TSP is unreadable

Hi Hanjun,
I downloaded the synthetic data for TSP from the Dropbox link you provided. However, the content of the .gz file is unreadable. It is neither a folder nor a file of any known extension. I wonder if there is something wrong with the file, or if there is a special way to extract it.

Thanks a lot!
Yuhan

tbb error?

/tmp/ccXHgDI5.o: In function LoadModel': /home/anqichen/graph_comb_opt/code/s2v_mvc/mvc_lib/src/mvc_lib.cpp:29: undefined reference to gnn::ParamSet<gnn::CPU, float>::Load(std::__cxx11::basic_string<char, std::char_traits, std::allocator >)'
/tmp/ccXHgDI5.o: In function SaveModel': /home/anqichen/graph_comb_opt/code/s2v_mvc/mvc_lib/src/mvc_lib.cpp:36: undefined reference to gnn::ParamSet<gnn::CPU, float>::Save(std::__cxx11::basic_string<char, std::char_traits, std::allocator >)'
build/lib/nn_api.o: In function Predict(std::vector<std::shared_ptr<Graph>, std::allocator<std::shared_ptr<Graph> > >&, std::vector<std::vector<int, std::allocator<int> >*, std::allocator<std::vector<int, std::allocator<int> >*> >&, std::vector<std::vector<double, std::allocator<double> >*, std::allocator<std::vector<double, std::allocator<double> >*> >&)': /home/anqichen/graph_comb_opt/code/s2v_mvc/mvc_lib/src/lib/nn_api.cpp:31: undefined reference to gnn::FactorGraph::FeedForward(std::vector<std::shared_ptrgnn::Variable, std::allocator<std::shared_ptrgnn::Variable > >, std::map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, void*, std::less<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, void*> > >, gnn::Phase, unsigned int)'
build/lib/nn_api.o: In function Fit(std::vector<std::shared_ptr<Graph>, std::allocator<std::shared_ptr<Graph> > >&, std::vector<std::vector<int, std::allocator<int> >*, std::allocator<std::vector<int, std::allocator<int> >*> >&, std::vector<int, std::allocator<int> >&, std::vector<double, std::allocator<double> >&)': /home/anqichen/graph_comb_opt/code/s2v_mvc/mvc_lib/src/lib/nn_api.cpp:80: undefined reference to gnn::FactorGraph::FeedForward(std::vector<std::shared_ptrgnn::Variable, std::allocator<std::shared_ptrgnn::Variable > >, std::map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, void*, std::less<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, void*> > >, gnn::Phase, unsigned int)'
build/lib/qnet.o: In function void __gnu_cxx::new_allocator<gnn::GraphVar>::construct<gnn::GraphVar, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(gnn::GraphVar*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)': /NIRAL/work/jprieto/install/include/c++/5.3.0/ext/new_allocator.h:120: undefined reference to gnn::GraphVar::GraphVar(std::__cxx11::basic_string<char, std::char_traits, std::allocator >)'
build/lib/qnet.o: In function void __gnu_cxx::new_allocator<gnn::TensorVarTemplate<gnn::CPU, gnn::CSR_SPARSE, float> >::construct<gnn::TensorVarTemplate<gnn::CPU, gnn::CSR_SPARSE, float>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(gnn::TensorVarTemplate<gnn::CPU, gnn::CSR_SPARSE, float>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)': /NIRAL/work/jprieto/install/include/c++/5.3.0/ext/new_allocator.h:120: undefined reference to gnn::TensorVarTemplate<gnn::CPU, gnn::CSR_SPARSE, float>::TensorVarTemplate(std::__cxx11::basic_string<char, std::char_traits, std::allocator >)'
build/lib/qnet.o: In function void __gnu_cxx::new_allocator<gnn::TensorVarTemplate<gnn::CPU, gnn::DENSE, float> >::construct<gnn::TensorVarTemplate<gnn::CPU, gnn::DENSE, float>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(gnn::TensorVarTemplate<gnn::CPU, gnn::DENSE, float>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)': /NIRAL/work/jprieto/install/include/c++/5.3.0/ext/new_allocator.h:120: undefined reference to gnn::TensorVarTemplate<gnn::CPU, gnn::DENSE, float>::TensorVarTemplate(std::__cxx11::basic_string<char, std::char_traits, std::allocator >)'
build/lib/qnet.o: In function gnn::TensorVarTemplate<gnn::CPU, gnn::DENSE, float>::TensorVarTemplate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<int, std::allocator<int> >)': /home/anqichen/graph_comb_opt/code/s2v_mvc/mvc_lib/../../../graphnn/include/nn/variable.h:204: undefined reference to gnn::TensorVarTemplate<gnn::CPU, gnn::DENSE, float>::TensorVarTemplate(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::vector<unsigned long, std::allocator >)'
build/lib/qnet.o: In function void __gnu_cxx::new_allocator<gnn::MatMul<gnn::CPU, float> >::construct<gnn::MatMul<gnn::CPU, float>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(gnn::MatMul<gnn::CPU, float>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)': /NIRAL/work/jprieto/install/include/c++/5.3.0/ext/new_allocator.h:120: undefined reference to gnn::MatMul<gnn::CPU, float>::MatMul(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, gnn::Trans, gnn::Trans, gnn::PropErr)'
build/lib/qnet.o: In function void __gnu_cxx::new_allocator<gnn::ReLU<gnn::CPU, float> >::construct<gnn::ReLU<gnn::CPU, float>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(gnn::ReLU<gnn::CPU, float>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)': /NIRAL/work/jprieto/install/include/c++/5.3.0/ext/new_allocator.h:120: undefined reference to gnn::ReLU<gnn::CPU, float>::ReLU(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, gnn::PropErr)'
build/lib/qnet.o: In function void __gnu_cxx::new_allocator<gnn::ElewiseAdd<gnn::CPU, float> >::construct<gnn::ElewiseAdd<gnn::CPU, float>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(gnn::ElewiseAdd<gnn::CPU, float>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)': /NIRAL/work/jprieto/install/include/c++/5.3.0/ext/new_allocator.h:120: undefined reference to gnn::ElewiseAdd<gnn::CPU, float>::ElewiseAdd(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::vector<float, std::allocator >, gnn::PropErr)'
build/lib/qnet.o: In function void __gnu_cxx::new_allocator<gnn::ConcatCols<gnn::CPU, float> >::construct<gnn::ConcatCols<gnn::CPU, float>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(gnn::ConcatCols<gnn::CPU, float>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)': /NIRAL/work/jprieto/install/include/c++/5.3.0/ext/new_allocator.h:120: undefined reference to gnn::ConcatCols<gnn::CPU, float>::ConcatCols(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, gnn::PropErr)'
build/lib/qnet.o: In function void __gnu_cxx::new_allocator<gnn::SquareError<gnn::CPU, float> >::construct<gnn::SquareError<gnn::CPU, float>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(gnn::SquareError<gnn::CPU, float>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)': /NIRAL/work/jprieto/install/include/c++/5.3.0/ext/new_allocator.h:120: undefined reference to gnn::SquareError<gnn::CPU, float>::SquareError(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, gnn::PropErr)'
build/lib/qnet.o: In function void __gnu_cxx::new_allocator<gnn::ReduceMean<gnn::CPU, float> >::construct<gnn::ReduceMean<gnn::CPU, float>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&>(gnn::ReduceMean<gnn::CPU, float>*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)': /NIRAL/work/jprieto/install/include/c++/5.3.0/ext/new_allocator.h:120: undefined reference to gnn::ReduceMean<gnn::CPU, float>::ReduceMean(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, int, bool, gnn::PropErr)'
build/lib/qnet.o: In function gnn::Node2NodeMsgPass<gnn::CPU, float>::Node2NodeMsgPass(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool)': /home/anqichen/graph_comb_opt/code/s2v_mvc/mvc_lib/../../../graphnn/include/nn/msg_pass.h:71: undefined reference to gnn::IMsgPass<gnn::CPU, float>::IMsgPass(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, bool)'
build/lib/qnet.o: In function gnn::SubgraphMsgPass<gnn::CPU, float>::SubgraphMsgPass(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool)': /home/anqichen/graph_comb_opt/code/s2v_mvc/mvc_lib/../../../graphnn/include/nn/msg_pass.h:180: undefined reference to gnn::IMsgPass<gnn::CPU, float>::IMsgPass(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, bool)'
build/lib/qnet.o:(.data.rel.ro._ZTVN3gnn17TensorVarTemplateINS_3CPUENS_5DENSEEfEE[_ZTVN3gnn17TensorVarTemplateINS_3CPUENS_5DENSEEfEE]+0x58): undefined reference to non-virtual thunk to gnn::TensorVarTemplate<gnn::CPU, gnn::DENSE, float>::ZeroGrad()' build/lib/qnet.o:(.data.rel.ro._ZTVN3gnn17TensorVarTemplateINS_3CPUENS_5DENSEEfEE[_ZTVN3gnn17TensorVarTemplateINS_3CPUENS_5DENSEEfEE]+0x60): undefined reference to non-virtual thunk to gnn::TensorVarTemplate<gnn::CPU, gnn::DENSE, float>::OnesGrad()'
../../../graphnn/build_cpuonly/lib/libgnn.a(cpu_dense_tensor.o): In function tbb::interface7::task_arena::initialize()': /home/anqichen/intel/tbb/include/tbb/task_arena.h:250: undefined reference to tbb::interface7::internal::task_arena_base::internal_initialize()'
../../../graphnn/build_cpuonly/lib/libgnn.a(cpu_dense_tensor.o): In function tbb::interface7::task_arena::terminate()': /home/anqichen/intel/tbb/include/tbb/task_arena.h:281: undefined reference to tbb::interface7::internal::task_arena_base::internal_terminate()'
../../../graphnn/build_cpuonly/lib/libgnn.a(cpu_dense_tensor.o): In function tbb::interface7::task_arena::current_thread_index()': /home/anqichen/intel/tbb/include/tbb/task_arena.h:369: undefined reference to tbb::interface7::internal::task_arena_base::internal_current_slot()'
../../../graphnn/build_cpuonly/lib/libgnn.a(cpu_dense_tensor.o): In function void tbb::interface7::task_arena::execute_impl<void, tbb::flow::interface10::graph::wait_functor const>(tbb::flow::interface10::graph::wait_functor const&)': /home/anqichen/intel/tbb/include/tbb/task_arena.h:213: undefined reference to tbb::interface7::internal::task_arena_base::internal_execute(tbb::interface7::internal::delegate_base&) const'
collect2: error: ld returned 1 exit status

Which file provides the information of network?

I hope to read your code and understand some details of your algorithm. However, it seems like the neural network introduced in your paper is encapsulated in the self.lib = ctypes.CDLL('%s/build/dll/libtsp2d.so' % dir_path). I wonder how I may get more information.

What causes the following error? Assertion `idx_map[act] >= 0...'

This problem always occurs when the program is executed. I don't know whether it is a problem with my own data, or how to avoid it.
QNet::SetupGraphInput(std::vector&, std::vector<std::shared_ptr >&, std::vector<std::vector>&, const int): Assertion `idx_map[act] >= 0 && act >= 0 && act num_nodes' failed.
idx_map[act]=-1

When testing, how is S0 (the first state) chosen? If the wrong choice is made in the first step, the final result will be bad. I can't understand how the first state is selected in the program.

Originally posted by @NH333 in #19 (comment)

How to debug?

GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
......
(gdb) b simlulator.cpp:57
No source file named simlulator.cpp.
This message comes up when I'm debugging, and I don't know how to deal with it.
Could you give me some advice on debugging?

When I run ./run_nstep_dqn.sh and mnist, the problem appears.

Assertion failed when running run_nstep_dqn.sh

Thanks for this great work!

I tried to reproduce the results but encountered a problem when training with n-step Q-learning.

I am running the code on Ubuntu 18.04 with Python 3.6.7 (anaconda). (I modified the Python 2 code to run under Python 3.) GraphNN compiled successfully and its examples run correctly. The dynamic libraries also built successfully, with only some minor warnings. When I ran

cd code/s2v_mvc
./run_nstep_dqn.sh

I got

mem_size = 500000
num_env = 1
n_step = 2
min_n = 15
max_n = 20
max_iter = 1000000
dev_id = 0
max_bp_iter = 5
batch_size = 64
embed_dim = 64
learning_rate = 0.0001
w_scale = 0.01
l2_penalty = 0
momentum = 0.9
generating validation graphs
100%|██████████| 100/100 [00:00<00:00, 4412.46it/s]generating new training graphs
100%|██████████| 1000/1000 [00:00<00:00, 4410.58it/s]iter 0 eps 1.0 average size of vc:  10.79
iter 300 eps 0.9715 average size of vc:  16.5
Assertion `fid` failed in src/nn/param_set.cpp line 32: file  is not found
/home/sven/Study/GML/graph_comb_opt/code/s2v_mvc/mvc_lib/build/dll/libmvc.so(_ZN3gnn8ParamSetINS_3GPUEfE4SaveENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x1a5)[0x7f6e72430b25]
/home/sven/Study/GML/graph_comb_opt/code/s2v_mvc/mvc_lib/build/dll/libmvc.so(SaveModel+0x8b)[0x7f6e723bc40b]
/home/sven/Apps/miniconda3/lib/python3.6/lib-dynload/../../libffi.so.6(ffi_call_unix64+0x4c)[0x7f6e81440ec0]
/home/sven/Apps/miniconda3/lib/python3.6/lib-dynload/../../libffi.so.6(ffi_call+0x22d)[0x7f6e8144087d]
/home/sven/Apps/miniconda3/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(_ctypes_callproc+0x2ce)[0x7f6e83ce4e4e]
/home/sven/Apps/miniconda3/lib/python3.6/lib-dynload/_ctypes.cpython-36m-x86_64-linux-gnu.so(+0x13885)[0x7f6e83ce5885]
python(_PyObject_FastCallDict+0x8b)[0x563bd22be5bb]
python(+0x19cd6e)[0x563bd2347d6e]
python(_PyEval_EvalFrameDefault+0x30a)[0x563bd236a71a]
python(+0x196d8b)[0x563bd2341d8b]
terminate called without an active exception

Would you please help me to solve this problem? Thanks in advance.
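The assertion message prints an empty file name, and the code here was ported to Python 3. One possible cause, stated as an unverified assumption, is that a Python 3 `str` reaches a C `char*` parameter; in Python 3, `ctypes.c_char_p` accepts `bytes`, not `str`. Another possibility is that the target directory does not exist when the model is saved. The sketch below guards against both. The helper name `prepare_model_path` is hypothetical, and the thread does not show how the library expects its arguments.

```python
import ctypes
import os


def prepare_model_path(path):
    """Create the parent directory and return the path as a C string argument.

    Assumption: the C++ side receives the file name as a plain char*.
    """
    directory = os.path.dirname(path)
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
    # Python 3 str must be encoded before it can be passed as char*.
    encoded = path if isinstance(path, bytes) else path.encode("utf-8")
    return ctypes.c_char_p(encoded)


# Example (hypothetical call site):
#   lib.SaveModel(prepare_model_path("results/iter_300.model"))
```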