bowenliu16 / rl_graph_generation
License: BSD 3-Clause "New" or "Revised" License
How long should the command mpirun -np 8 python run_molecule.py 2>/dev/null take to terminate? It has been running on my machine for more than 2 days, on a GeForce GTX 1080 Ti with 4 CPU cores.
Hello, I have a question about this special action space, and I hope I can get some help here.
In my own environment, I am trying to train an agent with the PPO algorithm (without the supervised learning used in this paper), but I think training fails because my action space is similar to the paper's (MultiCatCategoricalPd). If an action A is composed of [A1, A2, A3] and another action B is [B1, B2, B3], how should the loss terms between them (neglogp, kl) be calculated? The code seems to treat each sub-action as independent, but if A1 differs from B1, it is meaningless to calculate the similarity between A2 and B2. In other words, if the second sub-action is conditioned on the first, how can the calculation be performed without taking the first sub-action into account?
More importantly, my second sub-action space is masked based on the first sub-action, so if A1 != B1, the valid spaces of A2 and B2 differ (I have to set the logits of invalid actions to -1e6). As a result, the neglogp between two actions under the same state becomes huge whenever their first sub-actions differ. This issue has been bothering me for several months, so I would greatly appreciate any help.
Thanks in advance!
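One way to read the question above is as an autoregressive factorization, log p(a1, a2 | s) = log p(a1 | s) + log p(a2 | s, a1). The sketch below is a minimal pure-Python illustration of that idea, not the repository's MultiCatCategoricalPd. The lookup tables `logits_second_by_first` and `mask_by_first` are hypothetical stand-ins for a network head and a masking rule that depend on the first sub-action. Under this reading, the PPO ratio compares the same stored action under the old and new policy, so each evaluation uses the mask selected by that action's own first sub-action.

```python
import math

NEG_INF = -1e6  # masking constant mentioned in the post


def log_softmax(logits):
    """Numerically stable log-softmax over a plain list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def apply_mask(logits, valid):
    """Replace the logits of invalid choices with a large negative value."""
    return [x if ok else NEG_INF for x, ok in zip(logits, valid)]


def joint_logp(action, logits_first, logits_second_by_first, mask_by_first):
    """log p(a1, a2) = log p(a1) + log p(a2 | a1).

    The second term uses the logits and the mask selected by the a1 stored
    in the action (hypothetical lookup tables, see the lead-in).
    """
    a1, a2 = action
    logp1 = log_softmax(logits_first)[a1]
    second = apply_mask(logits_second_by_first[a1], mask_by_first[a1])
    logp2 = log_softmax(second)[a2]
    return logp1 + logp2


def ppo_ratio(action, old_params, new_params):
    """Probability ratio of ONE stored action under the new vs. old policy."""
    return math.exp(joint_logp(action, *new_params) - joint_logp(action, *old_params))


# Toy example: 2 choices for a1, 3 for a2, each a1 allows a different subset.
mask = {0: [True, True, False], 1: [False, True, True]}
second = {0: [0.0, 0.0, 0.0], 1: [0.0, 0.0, 0.0]}
old = ([0.0, 0.0], second, mask)
new = ([1.0, 0.0], second, mask)
print(ppo_ratio((0, 1), old, new))
```

Because both policies are evaluated on the same (a1, a2) pair, the -1e6 entries only affect choices that could not have been sampled, so the ratio stays finite. A KL term could be handled the same way, as the KL of the first head plus the KL of the second head conditioned on the sampled a1, though that is an assumption about the setup rather than something the source confirms.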
I ran the script run_molecule.py directly, with no modification, but only 3614 molecules were generated. Is this normal?
Hello
The code has many bugs.
Can you tell me the Python version and the packages you used?
Hi,
Just wondering what the function load_conditional() does in gym_molecule/envs/molecule.py.
Sorry, I'm very new to this field.
Thanks.
Cheers,
Lesley
Hello,
Thank you for sharing your code. I am very interested in your model, but I cannot get it to work.
I have fixed a lot of bugs (too many to summarize), but there are many others that would require rewriting parts of the code.
Do you have an up-to-date version that works?
Thanks again.
Hi,
I am trying to run this model on a new dataset. I know that one of the input data files is a SMILES file, which defaults to the ZINC 250K dataset.
When I checked the code, I saw that it also requires other files, such as opt.test.logP-SA/zinc_plogp_sorted.csv. These files contain 800 molecules with what I guess are logP values, but I am not sure how to generate such files from another dataset. Do you provide a helper function, or could you suggest how to do that and how to get the penalized logP values?
Also, what other data files are required to run the model?
The code contains some hard-coded values, such as those used for normalization. Do I need to change them for a new dataset, and do any other components of the code need to be updated?
Thanks.
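For reference, penalized logP is commonly defined as logP minus the synthetic accessibility (SA) score minus a penalty for rings larger than six atoms. The sketch below shows only that arithmetic and a sort. It is an assumption that the repository's file follows this definition, and the function names, the input tuple layout, and the sort order are hypothetical. The sketch also leaves out any normalization with hard-coded dataset statistics. In practice the inputs would likely come from RDKit (for example `Crippen.MolLogP(mol)`, `sascorer.calculateScore(mol)`, and `mol.GetRingInfo().AtomRings()`).

```python
def penalized_logp(logp, sa_score, ring_sizes):
    """Unnormalized penalized logP: logP - SA - large-ring penalty.

    ring_sizes: sizes of all rings in the molecule (empty if acyclic).
    The penalty is the largest ring size minus 6, floored at 0.
    """
    largest_ring = max(ring_sizes, default=0)
    cycle_penalty = max(largest_ring - 6, 0)
    return logp - sa_score - cycle_penalty


def rank_by_penalized_logp(records):
    """Return (smiles, score) pairs sorted from lowest to highest score.

    records: iterable of (smiles, logp, sa_score, ring_sizes) tuples.
    The ascending order is an assumption, not the repository's format.
    """
    scored = [(smiles, penalized_logp(logp, sa, rings))
              for smiles, logp, sa, rings in records]
    return sorted(scored, key=lambda pair: pair[1])


# Made-up descriptor values, for illustration only.
example = [
    ("MOL_A", 2.0, 3.0, [6]),      # six-membered ring: no penalty
    ("MOL_B", 2.0, 3.0, [5, 8]),   # eight-membered ring: penalty of 2
    ("MOL_C", 1.0, 1.0, []),       # acyclic: no penalty
]
ranked = rank_by_penalized_logp(example)
```

Taking the lowest-scoring rows of such a ranking would give a candidate set for an optimization experiment on a new dataset, but whether that matches how zinc_plogp_sorted.csv was built is not confirmed by the source.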
Although the code natively supports the gdb13 dataset, choosing it as the dataset option raises an exception after a few iterations.
The error is raised in the molecule env, in the get_observation method, at line 548:
F[0,n:n+n_shift,:] = auxiliary_atom_features
It tries to broadcast shape (5,5) into shape (4,5). This occurs because the last slice of the array runs from 16 to 21, but the array only has size 20.
Could you please share a fix so that the code also runs on a dataset you already support?
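The shape mismatch described above can be reproduced with NumPy alone. The sketch below uses the sizes from the report (20 rows, a 5x5 auxiliary block, n = 16) and shows that the slice is silently clipped to four rows before the assignment fails. The helper `make_feature_array` is a hypothetical workaround that reserves extra rows for the auxiliary block. It is not the authors' fix, and whether the rest of the model accepts the larger array is an open assumption.

```python
import numpy as np

max_atoms, n_feat = 20, 5
auxiliary_atom_features = np.eye(5)      # shape (5, 5), as in the report
n = 16                                   # atoms already placed
n_shift = auxiliary_atom_features.shape[0]

F = np.zeros((1, max_atoms, n_feat))
# The slice 16:21 is clipped to 16:20, so the target has only 4 rows.
target_rows = F[0, n:n + n_shift, :].shape[0]

try:
    F[0, n:n + n_shift, :] = auxiliary_atom_features
    failed = False
except ValueError:
    failed = True                        # (5,5) cannot broadcast into (4,5)


def make_feature_array(max_atoms, n_aux, n_feat):
    """Hypothetical workaround: reserve n_aux extra rows for the aux block."""
    return np.zeros((1, max_atoms + n_aux, n_feat))


F_padded = make_feature_array(max_atoms, n_shift, n_feat)
F_padded[0, n:n + n_shift, :] = auxiliary_atom_features  # now fits
```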
I read your paper and I am interested in the ideas, but when I deployed the project I ran into too many problems. Even the first command failed, so could you provide the details of the environment, such as the versions of Python, MuJoCo, and other dependencies?
I hit too many problems, and several days of debugging ended in failure.
Hello, thank you very much for sharing the code. I have a question about line 525 of ppo1.pposgd_simple_gcn.py, where the parameter pi is passed to the function traj_segment_generator. I think oldpi should be passed to the function instead of pi. Is that correct? Please let me know. Thank you, I look forward to your reply.
Best wishes!
Anny
Here is the error log
[10:{'node':` <gym.core.Space object at 0x7effce1cd6a0>, 'adj': <gym.core.Space object at 0x7effce1cd438>}
WARN: Could not seed environment <MoleculeEnv<molecule-v0>>
ob_adj (?, 3, ?, ?) ob_node (?, 1, ?, 10)
logits_first (?, ?) logits_second (?, ?) logits_edge (?, 3)
ac_edge (?,)
ob_adj (?, 3, ?, ?) ob_node (?, 1, ?, 10)
logits_first (?, ?) logits_second (?, ?) logits_edge (?, 3)
ac_edge (?,)
[10:54:02] Explicit valence for atom # 2 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 4 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 4 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 9 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 22 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 7 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 20 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 0 Br, 2, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 20 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 3 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 19 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 24 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 0 N, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 30 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 30 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 19 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 5 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 2 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 16 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 6, is greater than permitted
[10:54:02] Explicit valence for atom # 13 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 27 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 22 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 14 N, 5, is greater than permitted
/home/bowen/anaconda3/envs/rl_graph_generation_apr_11_2018/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Traceback (most recent call last):
  File "run_molecule.py", line 86, in main
    train(args,args.env, num_timesteps=args.num_timesteps, seed=args.seed,writer=writer)
  File "run_molecule.py", line 46, in train
    schedule='linear', writer=writer
  File "/home/bowen/pycharm_deployment_directory/rl_graph_generation/rl-baselines/baselines/ppo1/pposgd_simple_gcn.py", line 254, in learn
    seg = seg_gen.__next__()
  File "/home/bowen/pycharm_deployment_directory/rl_graph_generation/rl-baselines/baselines/ppo1/pposgd_simple_gcn.py", line 74, in traj_segment_generator
    info = env.get_info()
  File "/home/bowen/pycharm_deployment_directory/rl_graph_generation/gym-molecule/gym_molecule/envs/molecule.py", line 281, in get_info
    info['reward_sa'] = calculateScore(m) * self.sa_ratio # lower better
  File "/home/bowen/pycharm_deployment_directory/rl_graph_generation/gym-molecule/gym_molecule/envs/sascorer.py", line 59, in calculateScore
    2) #<- 2 is the *radius* of the circular fingerprint
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdMolDescriptors.GetMorganFingerprint(NoneType, int)
did not match C++ signature:
    GetMorganFingerprint(RDKit::ROMol mol, int radius, boost::python::api::object invariants=[], boost::python::api::object fromAtoms=[], bool useChirality=False, bool useBondTypes=True, bool useFeatures=False, bool useCounts=True, boost::python::api::object bitInfo=None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_molecule.py", line 93, in <module>
    main()
  File "run_molecule.py", line 88, in main
    writer.export_scalars_to_json("./all_scalars.json")
AttributeError: 'NoneType' object has no attribute 'export_scalars_to_json'
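The log above shows two separate failures: calculateScore received None instead of a molecule, and the cleanup path then called a method on a writer that was also None. A minimal sketch of guarding both is given below. The helper names `safe_sa_reward` and `export_if_available` and the fallback value are hypothetical, and the sketch does not explain why the molecule was None in the first place.

```python
def safe_sa_reward(mol, score_fn, sa_ratio, fallback=0.0):
    """Return score_fn(mol) * sa_ratio, or a fallback when mol is None.

    score_fn stands in for calculateScore; the fallback value is an
    assumption and may need tuning for a real reward.
    """
    if mol is None:
        return fallback
    return score_fn(mol) * sa_ratio


def export_if_available(writer, path):
    """Export scalars only when a writer object was actually created."""
    if writer is None:
        return False
    writer.export_scalars_to_json(path)
    return True
```

With guards like these, the first error would surface as a fallback reward rather than a crash, and the second exception would no longer hide the original one.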
After running for 12 days with 64 CPUs and a few available GPUs (which I guess were not used by this method), training finally finished. It generated a number of molecules at each iteration. When I tried to reproduce the results with the given evaluation code (which first had to be fixed for test data, since the authors only used a sample of the training data for evaluation), the results differed from the training set, showed no improvement, and were below the input data statistics.
I tried to get the penalized logP values (model trained with conditions=True and has_scaffold=True) using the given reward_penalized_logp function in molecule.py. The top 3 largest values have a maximum of 3.34, and the minimum is -89.43, which are nowhere near the values reported in the paper.
Either the full training code is not given, or the evaluation was done differently with code that is not provided. I don't know how to get the same results as in the paper.