rl_graph_generation's People

Contributors

bowenliu16, jiaxuanyou

rl_graph_generation's Issues

Takes too long to complete

How long should the command mpirun -np 8 python run_molecule.py 2>/dev/null take to terminate? It has been running on my machine for more than 2 days. I am running the program on a GeForce GTX 1080 Ti with 4 CPU cores.

Confusion about MultiCatCategoricalPd

Hello, I have a question about this special action space, and I hope I can get some help here.

In my own environment, I am trying to train an agent with the PPO algorithm (without the supervised learning used in this paper), but I think the training fails because my action space is similar to the paper's (MultiCatCategoricalPd). If an action A is composed of [A1, A2, A3] and another action B is [B1, B2, B3], how should the loss between them (neglogp, kl) be calculated? In the code, it seems that each sub-action is treated as independent, but if A1 is not the same as B1, it is meaningless to calculate the similarity between A2 and B2. In other words, if the second sub-action is conditioned on the first, how can the calculation ignore the first sub-action?

More importantly, my second sub-action space is masked based on the first sub-action, so if A1 != B1, the valid spaces of A2 and B2 are not the same (I have to set the logits of invalid actions to -1e6). As a result, the neglogp between two actions under the same state will be huge if their first sub-actions differ. This issue has been bothering me for several months, so I would appreciate any help a lot.

Thanks in advance!
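
One way to make the loss respect the dependence between sub-actions is to factor the joint probability as p(a1, a2) = p(a1) * p(a2 | a1) and to evaluate the second term only under the mask selected by the first sub-action that was actually taken. The NumPy sketch below illustrates that idea for two sub-actions. It is not the repository's MultiCatCategoricalPd; the names joint_neglogp, logits_second (one row of logits per first sub-action) and mask_second are assumptions made for illustration.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def masked_log_softmax(logits, mask):
    # Invalid entries get a very negative logit, as described in the issue.
    return log_softmax(np.where(mask, logits, -1e6))

def joint_neglogp(logits_first, logits_second, mask_second, action):
    """-log p(a1, a2) with p(a1, a2) = p(a1) * p(a2 | a1).

    logits_first:  shape (n_first,)
    logits_second: shape (n_first, n_second), row i is conditioned on a1 == i
    mask_second:   boolean, same shape, True where a2 is valid given a1
    action:        tuple (a1, a2)
    """
    a1, a2 = action
    logp_first = log_softmax(logits_first)[a1]
    # Only the row and mask belonging to the sampled a1 are used.
    logp_second = masked_log_softmax(logits_second[a1], mask_second[a1])[a2]
    return -(logp_first + logp_second)
```

The PPO ratio compares the same stored action under the old and the new policy, not two different actions, so a valid action is always scored under its own mask and the -1e6 logits should not enter its neglogp.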

Only 3614 molecules generated.

I ran the script run_molecule.py directly, with no modification, but only 3614 molecules were generated. Is this expected?

Package Version

Hello
The code has many bugs.
Can you tell me the Python version and the packages you used?

load_conditional()

Hi,
Just wondering what the function load_conditional() does in gym_molecule/envs/molecule.py.

Sorry, I'm very new to this field.

Thanks.

Cheers,
Lesley

many bugs

Hello,
thank you for sharing your code. I am very interested in your model, but I cannot get it to work.
I have fixed a lot of bugs (too many to summarize), but there are many others that would require rewriting some parts.
Do you have an up-to-date version that works?
Thanks again.

Running this model on new dataset

Hi,

I am trying to run this model on a new dataset. I know that one of the data files I have to provide as input is a SMILES file, which by default is the ZINC 250K dataset.
When I checked the code, it also requires other files, such as opt.test.logP-SA/zinc_plogp_sorted.csv. These files contain 800 molecules with what I guess are logP values, but I am not sure how to generate such files from another dataset. Do you provide a helper function, or could you suggest how to do that and how to get the penalized logP values?

Also, what other data files are required to run the model?

The code contains some hard-coded values, such as those used for normalization. Do I need to change them for a new dataset, and are there any other components in the code that need to be updated?

Thanks.
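
For readers with the same question, here is a minimal sketch of one common definition of penalized logP: logP minus the synthetic accessibility (SA) score minus a penalty for rings larger than six atoms, with each term optionally normalized by a dataset mean and standard deviation. The function names, the (mean, std) parameters and the choice of keeping the k lowest-scoring molecules are assumptions for illustration; the exact contents of zinc_plogp_sorted.csv are not confirmed by this thread. The inputs are assumed to be precomputed, for example logP with RDKit's Crippen.MolLogP and SA with the sascorer.calculateScore function that ships with the repository.

```python
def penalized_logp(logp, sa_score, largest_ring_size,
                   logp_stats=(0.0, 1.0),
                   sa_stats=(0.0, 1.0),
                   cycle_stats=(0.0, 1.0)):
    """Combine precomputed components into a penalized logP value.

    Each *_stats argument is a hypothetical (mean, std) pair computed on
    your own dataset; the defaults leave the terms unnormalized.
    """
    def normalize(value, stats):
        mean, std = stats
        return (value - mean) / std

    # Rings of up to six atoms are not penalized.
    cycle_penalty = max(largest_ring_size - 6, 0)
    return (normalize(logp, logp_stats)
            - normalize(sa_score, sa_stats)
            - normalize(cycle_penalty, cycle_stats))

def lowest_scoring(records, k=800):
    """Return the k records with the smallest score.

    records is a list of (smiles, score) tuples. Selecting the lowest
    scores is an assumption about how a starting set could be chosen.
    """
    return sorted(records, key=lambda record: record[1])[:k]
```

If normalization is used, the (mean, std) pairs would presumably need to be recomputed on the new dataset rather than reused from ZINC.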

running model on gdb doesn't work

Although the code natively supports the gdb13 dataset, choosing it as the dataset option raises an exception after a few iterations.
The error is raised in the molecule env, in the get_observation method, at line 548:
F[0,n:n+n_shift,:] = auxiliary_atom_features
The assignment tries to broadcast shape (5,5) into shape (4,5). This occurs because the slice runs from 16 to 21 but the array only has size 20.
Can you please share a fix so that the code also runs on a dataset you already support?
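
The failure described above can be reproduced with NumPy alone: a slice that runs past the end of an array is silently truncated, so the right-hand side no longer fits. The sketch below shows the mismatch and one possible guard, which is to size the buffer as the maximum atom count plus the number of auxiliary rows. The helpers make_feature_buffer and append_aux are hypothetical and are not taken from molecule.py.

```python
import numpy as np

def make_feature_buffer(max_atom, n_aux, n_features):
    # Reserve room for the auxiliary rows on top of the atom rows.
    return np.zeros((1, max_atom + n_aux, n_features))

def append_aux(F, n, aux):
    """Write aux into rows n .. n + n_shift of F, failing early if they do not fit."""
    n_shift = aux.shape[0]
    if n + n_shift > F.shape[1]:
        raise ValueError(
            "need rows up to %d but the buffer has only %d" % (n + n_shift, F.shape[1])
        )
    F[0, n:n + n_shift, :] = aux
    return F

aux = np.eye(5)

# Buffer with 20 rows: the slice 16:21 is truncated to 4 rows,
# so assigning a (5, 5) block raises a ValueError.
too_small = np.zeros((1, 20, 5))
try:
    too_small[0, 16:21, :] = aux
except ValueError as err:
    print("reproduced:", err)

# Buffer sized as max_atom + n_aux: the same write succeeds.
padded = append_aux(make_feature_buffer(20, 5, 5), 16, aux)
```

Whether the right fix in the repository is a larger buffer or a smaller maximum atom count for gdb13 depends on how the environment derives those sizes, which this thread does not show.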

env problem

I read your paper and am interested in the ideas, but I ran into many problems when deploying the project, starting with the very first command. Can you provide the details of the environment, such as the versions of Python, MuJoCo and the other dependencies?

I hit too many problems, and several days of debugging have not resolved them.

About oldpi in function traj_segment_generator

Hello, thank you very much for sharing the code. I have a question about line 525 of ppo1.pposgd_simple_gcn.py, where the parameter pi is passed to the function traj_segment_generator. I think oldpi should be passed to the function instead of pi. Is that right? Please let me know. Thank you, and I look forward to your reply.

Best wishes!
Anny
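
For context, in the upstream OpenAI Baselines ppo1 implementation the rollout generator also receives pi, and oldpi is synchronized to pi after sampling and before the optimization epochs, so both policies are identical at the moment the data is collected. Whether the modified pposgd_simple_gcn.py keeps that order should be checked in the file itself. The toy sketch below, a 3-armed bandit with a softmax policy, only illustrates the ordering; ppo_iteration and its parameters are assumptions, not code from the repository.

```python
import numpy as np

def softmax(logits):
    z = np.exp(logits - logits.max())
    return z / z.sum()

def ppo_iteration(logits, rng, n_samples=256, epochs=4, lr=1.0, clip=0.2):
    """One PPO iteration on a toy bandit; returns (new_logits, first_ratio)."""
    n_actions = logits.shape[0]

    # 1. Collect a batch with the current policy pi.
    actions = rng.choice(n_actions, size=n_samples, p=softmax(logits))
    adv = (actions == 0).astype(float)  # reward 1 for arm 0, else 0
    adv -= adv.mean()                   # mean baseline as a crude advantage

    # 2. Snapshot pi as oldpi before any gradient step.
    old_probs = softmax(logits)[actions]

    onehot = np.eye(n_actions)[actions]
    first_ratio = None
    for _ in range(epochs):
        probs = softmax(logits)
        ratio = probs[actions] / old_probs
        if first_ratio is None:
            first_ratio = ratio.copy()
        # 3. Clipped surrogate: no gradient where the ratio already left the
        #    trust region in the direction the advantage pushes it.
        clipped = ((ratio > 1 + clip) & (adv > 0)) | ((ratio < 1 - clip) & (adv < 0))
        weight = np.where(clipped, 0.0, adv * ratio)
        grad = (weight[:, None] * (onehot - probs)).mean(axis=0)
        logits = logits + lr * grad
    return logits, first_ratio
```

Because the snapshot is taken after sampling, the ratio in the first epoch is exactly 1 for every sample, which is why sampling with pi or with a freshly synchronized oldpi gives the same data in this ordering.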

a runtime error after a few hundred iterations

Here is the error log

[10:{'node':` <gym.core.Space object at 0x7effce1cd6a0>, 'adj': <gym.core.Space object at 0x7effce1cd438>}
WARN: Could not seed environment <MoleculeEnv<molecule-v0>>
ob_adj (?, 3, ?, ?) ob_node (?, 1, ?, 10)
logits_first (?, ?) logits_second (?, ?) logits_edge (?, 3)
ac_edge (?,)
ob_adj (?, 3, ?, ?) ob_node (?, 1, ?, 10)
logits_first (?, ?) logits_second (?, ?) logits_edge (?, 3)
ac_edge (?,)
[10:54:02] Explicit valence for atom # 2 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 4 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 5 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 4 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 9 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 22 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 7 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 20 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 0 Br, 2, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 20 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 14 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 3 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 8 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 11 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 19 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 24 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 0 N, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 30 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 30 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 19 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 2 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 6 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 5 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 2 N, 4, is greater than permitted
[10:54:02] Explicit valence for atom # 7 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 16 O, 3, is greater than permitted
[10:54:02] Explicit valence for atom # 3 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 1 C, 6, is greater than permitted
[10:54:02] Explicit valence for atom # 13 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 27 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 22 C, 5, is greater than permitted
[10:54:02] Explicit valence for atom # 14 N, 5, is greater than permitted
/home/bowen/anaconda3/envs/rl_graph_generation_apr_11_2018/lib/python3.5/site-packages/tensorflow/python/ops/gradients_impl.py:100: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
Traceback (most recent call last):
  File "run_molecule.py", line 86, in main
    train(args,args.env, num_timesteps=args.num_timesteps, seed=args.seed,writer=writer)
  File "run_molecule.py", line 46, in train
    schedule='linear', writer=writer
  File "/home/bowen/pycharm_deployment_directory/rl_graph_generation/rl-baselines/baselines/ppo1/pposgd_simple_gcn.py", line 254, in learn
    seg = seg_gen.__next__()
  File "/home/bowen/pycharm_deployment_directory/rl_graph_generation/rl-baselines/baselines/ppo1/pposgd_simple_gcn.py", line 74, in traj_segment_generator
    info = env.get_info()
  File "/home/bowen/pycharm_deployment_directory/rl_graph_generation/gym-molecule/gym_molecule/envs/molecule.py", line 281, in get_info
    info['reward_sa'] = calculateScore(m) * self.sa_ratio  # lower better
  File "/home/bowen/pycharm_deployment_directory/rl_graph_generation/gym-molecule/gym_molecule/envs/sascorer.py", line 59, in calculateScore
    2)  #<- 2 is the *radius* of the circular fingerprint
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdMolDescriptors.GetMorganFingerprint(NoneType, int)
did not match C++ signature:
    GetMorganFingerprint(RDKit::ROMol mol, int radius, boost::python::api::object invariants=[], boost::python::api::object fromAtoms=[], bool useChirality=False, bool useBondTypes=True, bool useFeatures=False, bool useCounts=True, boost::python::api::object bitInfo=None)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run_molecule.py", line 93, in <module>
    main()
  File "run_molecule.py", line 88, in main
    writer.export_scalars_to_json("./all_scalars.json")
AttributeError: 'NoneType' object has no attribute 'export_scalars_to_json'
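
Reading the traceback, GetMorganFingerprint received NoneType, which means calculateScore(m) was called with m set to None; the second exception shows that writer was also None when export_scalars_to_json was called. RDKit parsers such as Chem.MolFromSmiles return None when a molecule cannot be sanitized, which would be consistent with the valence warnings above, although the log does not show where m is created. A minimal sketch of guarding both calls follows; safe_reward, export_scalars and the invalid_reward default are hypothetical names, not part of the repository.

```python
def safe_reward(mol, score_fn, invalid_reward=0.0):
    """Score mol with score_fn, or return a fallback when mol is None.

    score_fn stands in for a scorer such as sascorer.calculateScore;
    the fallback value is an assumption and should suit the reward scale.
    """
    if mol is None:
        return invalid_reward
    return score_fn(mol)

def export_scalars(writer, path):
    """Export scalars only when a writer was actually created."""
    if writer is None:
        return False
    writer.export_scalars_to_json(path)
    return True
```

With guards like these, an invalid molecule would yield the fallback reward instead of aborting training, and the cleanup step would no longer hide the original error behind a second exception.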

Cannot reproduce results

After running for 12 days with 64 CPUs and a few available GPUs (which I guess were not used by this method), training finally finished. It generated a number of molecules at each iteration. I then tried to produce the results with the provided evaluation code, which first had to be fixed to work on test data (the authors evaluated only on a sample of the training data). The results differ from the training set, show no improvement, and are lower than the input data statistics.
I computed penalized logP (model trained with conditions=True and has_scaffold=True) using the provided reward_penalized_logp function in molecule.py and checked the top 3 largest values. The maximum is 3.34 and the minimum is -89.43, which are nowhere near the values reported in the paper.
Either the full training code is not provided, or the evaluation was done differently with code that is not provided. I don't know how to get the same results as in the paper.
