
desert's People

Contributors

longlongman

desert's Issues

question regarding encoder-decoder

Hi longlongman,

I've tried using your encoder-decoder on CASP ligands, but I wasn't able to fully decode (recover) the ligands after encoding: molecules are cut off and some atoms are missing.

I suspect the patch size for the ShapePretrainingEncoder is too small, or maybe maxlen_coef is too small?

Any ideas on how to recover the full molecule?

thanks
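For diagnosing cut-off decodes like this, a quick first check is to compare heavy-atom counts of the input and decoded SMILES. With RDKit installed, `Chem.MolFromSmiles(...).GetNumHeavyAtoms()` is the robust way; the crude regex counter below is a dependency-free sketch (helper names are mine, not the repo's):

```python
import re

# Crude heavy-atom counter for SMILES strings (illustrative only; a real
# check should parse with RDKit instead of a regex).
ATOM_RE = re.compile(
    r"\[[^\]]+\]"      # bracket atoms like [NH+], [C@@H]
    r"|Cl|Br"          # two-letter organic-subset atoms
    r"|[BCNOSPFI]"     # one-letter organic-subset atoms
    r"|[bcnops]"       # aromatic organic-subset atoms
)

def heavy_atom_count(smiles: str) -> int:
    """Approximate number of heavy atoms in a SMILES string."""
    return len(ATOM_RE.findall(smiles))

def looks_truncated(original: str, decoded: str, tol: int = 0) -> bool:
    """Flag a decode whose heavy-atom count fell below the original's."""
    return heavy_atom_count(decoded) + tol < heavy_atom_count(original)
```

Running this over a batch of (input, decode) pairs would at least show whether the truncation correlates with molecule size, which is what you'd expect if the decoder's maximum length (governed by maxlen_coef) is the bottleneck.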

Dataset unable to download for preprocessing

I want to reproduce the results of this paper for my research, but I am unable to download the datasets from ZINC20 or ZINC15 for data preprocessing. The links I am using are https://zinc20.docking.org/substances/ and https://zinc15.docking.org/substances/.
The downloads keep failing after some time. Is there a Google Drive/OneDrive link where the data is already available, or any other reference that would help? Please help me sort out this issue.

Thanks in advance.

question about the vocab file

Dear Longlongman,

Is the vocab file you provided a partial vocab file? If it is, could you provide the full vocab file? Some fragments aren't being decoded properly compared to the originals, and maybe this is the reason why?

thanks

issue with get_training_data

After running get_fragment_vocab I get two .pkl files.

In get_training_data I put in the path to BRICS_RING_R.vocab.pkl, and there's an error:

Traceback (most recent call last):
  frag_idx = vocab[frag_smi][2]
TypeError: 'Mol' object is not subscriptable
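That error means the loaded pickle maps SMILES to RDKit Mol objects (which is what the raw fragment dump looks like) rather than to vocab entries, so indexing with `[2]` fails. A small hedged helper to inspect a pickle before wiring it into get_training_data (names here are mine, not the repo's):

```python
import pickle

def peek_pickle(path, n=3):
    """Load a pickle and show the types of the first few values, to check
    whether it is the vocab (SMILES -> tuple/list, so vocab[smi][2] works)
    or the raw fragment dump (SMILES -> rdkit Mol, which is not
    subscriptable)."""
    with open(path, "rb") as fr:
        obj = pickle.load(fr)
    print(type(obj).__name__)
    if isinstance(obj, dict):
        for k, v in list(obj.items())[:n]:
            print(repr(k), "->", type(v).__name__)
    return obj
```

If the values print as `Mol`, the path is pointing at the wrong one of the two .pkl files.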

how to reduce batch size during generation

hi, my GPU keeps running out of memory when I try to generate.

horovodrun -np 8 bycha-run \
    --config configs/generating.yaml \
    --lib shape_pretraining \
    --task.mode evaluate \
    --task.data.train.path data \
    --task.data.valid.path.samples /home/kiwoong/DESERT/data/sample_shapes.pkl \
    --task.data.test.path.samples /home/kiwoong/DESERT/data/sample_shapes.pkl \
    --task.dataloader.train.max_samples 1 \
    --task.dataloader.valid.sampler.max_samples 1 \
    --task.dataloader.test.sampler.max_samples 1 \
    --task.model.path /home/kiwoong/DESERT/trainer/save_model_dir/1WW_30W_5048064.pt \
    --task.evaluator.save_hypo_dir /home/kiwoong/DESERT/trainer/save_hypo_di

This is the bash file I'm running right now.
How do you reduce the batch size? In the generation config the only option seems to be max_samples, which I set to 1, but GPU memory usage still increases steadily. thanks.
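If the framework does not expose a smaller batch setting, one hedged workaround is to shard the sampled-shapes pickle and run generation once per shard, so each run holds fewer samples. This assumes sample_shapes.pkl is a pickled list (the helper and file layout are mine, not the repo's):

```python
import os
import pickle

def split_samples(path, chunk_size, out_dir):
    """Split a pickled list of sampled shapes into smaller pickles, so a
    separate generation run can be launched per chunk. Adapt the layout
    to however sample_shapes.pkl is actually structured."""
    with open(path, "rb") as fr:
        samples = pickle.load(fr)
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i in range(0, len(samples), chunk_size):
        out = os.path.join(out_dir, f"sample_shapes.{i // chunk_size}.pkl")
        with open(out, "wb") as fw:
            pickle.dump(samples[i:i + chunk_size], fw)
        paths.append(out)
    return paths
```

Each chunk path would then be passed as --task.data.test.path.samples in its own invocation.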

Some issues during installing and using

Hello longlongman,
I'm new to deep learning. Please bear with me for some dumb questions.

  1. Could you provide more details of the installation, e.g. which Python version and which pip
     version to use?
     I used Python 3.7.1, and at the last step of installing, 'pip install pybel scikit-image pebble
     meeko==0.1.dev1 vina pytransform3d', an error message prints:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow 2.4.0 requires typing-extensions~=3.7.4, but you have typing-extensions 4.7.1 which is incompatible.
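If the conflict bites at runtime (as it does in the train.sh error below), one hedged workaround, assuming TensorFlow 2.4.0 really is required by the environment, is to repin typing-extensions after the other packages are installed:

```shell
# Repin typing-extensions to the range tensorflow 2.4.0 declares.
# Caveat: other packages may in turn require typing-extensions>=4, so
# this is a workaround, not a guarantee; verify with `pip check`.
pip install "typing-extensions~=3.7.4"
pip check
```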

  2. At the pretraining stage, after I executed 'python get_training_data.py', an error occurred:
File "get_training_data.py", line 31, in <module>
    with open(vocab_path, 'rb') as fr:
IsADirectoryError: [Errno 21] Is a directory: '/home/ubuntu/user_space/DESERT/preparation/vocab'

The vocab directory contains two files produced by 'get_fragment_vocab.py': 'BRICS_RING_R.vocab.pkl' and 'BRICS_RING_R.494789.pkl'
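The traceback shows get_training_data.py opening vocab_path directly, so it must point at the vocab file itself, not the directory containing it. A tiny hedged helper (my code, not the repo's; the default file name is the one reported above) that resolves a directory to the vocab pickle inside it:

```python
import os

def resolve_vocab_path(path: str,
                       default_name: str = "BRICS_RING_R.vocab.pkl") -> str:
    """If `path` is a directory, point at the vocab pickle inside it;
    otherwise return the path unchanged."""
    return os.path.join(path, default_name) if os.path.isdir(path) else path
```

In other words, passing '/home/ubuntu/user_space/DESERT/preparation/vocab/BRICS_RING_R.vocab.pkl' instead of the directory should get past this error.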

  3. Since I was stuck on step 2, I tried to use the training data and vocab uploaded online to train shape2mol. What's the difference between 0.pkl and 1.pkl? Do I need to use both? I'm a bit confused about how to fill out the blanks in training.yaml:
    train:
      class: ShapePretrainingDatasetShard
      path: ---TRAINING DATA PATH---
      vocab_path: ---VOCAB PATH---
      sample_each_shard: 500000
      shuffle: True
    valid:
      class: ShapePretrainingDataset
      path: 
        samples: ---VALID DATA PATH---
        vocab: ---VOCAB PATH---
    test:
      class: ShapePretrainingDataset
      path: 
        samples: ---TEST DATA PATH---
        vocab: ---VOCAB PATH---

The ---TRAINING DATA PATH--- is simply the path where I got 0.pkl and 1.pkl.
But what about the valid and test data paths? (I just set them to the same as the training data path.)
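One way the blanks might be filled in, assuming the downloaded shards 0.pkl and 1.pkl live in a single directory and a small held-out pickle is reused for valid and test (all paths below are hypothetical, not the repo's):

```yaml
    train:
      class: ShapePretrainingDatasetShard
      path: /data/desert/train            # directory holding 0.pkl, 1.pkl
      vocab_path: /data/desert/vocab.pkl
      sample_each_shard: 500000
      shuffle: True
    valid:
      class: ShapePretrainingDataset
      path:
        samples: /data/desert/valid.pkl   # small held-out sample file
        vocab: /data/desert/vocab.pkl
    test:
      class: ShapePretrainingDataset
      path:
        samples: /data/desert/valid.pkl   # reusing valid data here
        vocab: /data/desert/vocab.pkl
```

Pointing valid/test at the full training shards would likely make evaluation between epochs very slow, so a small sample file seems the safer choice.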

When executing train.sh, another error occurred:
pkg_resources.DistributionNotFound: The 'typing-extensions~=3.7.4' distribution was not found and is required by tensorflow
I believe this is related to the first error during installation.
  4. For the sketching process, where do I fill in the path of the input ligand .sdf in sketch.py?
When sketching the pocket with sketch.py, the following error occurred:

File "sketching.py", line 2, in <module>
    from shape_utils import get_atom_stamp
File "/home/ubuntu/user_space/DESERT/sketch/shape_utils.py", line 3, in <module>
    from common import ATOM_RADIUS, ATOMIC_NUMBER, ATOMIC_NUMBER_REVERSE
ImportError: cannot import name 'ATOM_RADIUS' from 'common' (/home/ubuntu/miniconda3/envs/DESERT/lib/python3.7/site-packages/common/__init__.py)
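The traceback shows `common` resolving to site-packages, i.e. a PyPI package named `common` is shadowing DESERT's local sketch/common.py. Besides `pip uninstall common`, a hedged workaround is to force the local directory to win the import search (helper name is mine, not DESERT's):

```python
import importlib
import sys

def import_local(module_name: str, local_dir: str):
    """Import `module_name` from `local_dir`, even if an installed package
    of the same name (here, the PyPI 'common' package) would otherwise
    shadow the local file."""
    sys.path.insert(0, local_dir)
    sys.modules.pop(module_name, None)  # forget any previously imported copy
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

Running the scripts from inside the sketch directory has the same effect, since Python puts the script's own directory first on sys.path.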
  5. For the generating process:
bycha-run \
   --config configs/generating.yaml \
   --lib shape_pretraining \
   --task.mode evaluate \
   --task.data.train.path data \
   --task.data.valid.path.samples ❗❗❗FILL_THIS(MOLECULE SHAPES SAMPLED FROM CAVITY)❗❗❗ \
   --task.data.test.path.samples  ❗❗❗FILL_THIS❗❗❗ \
   --task.dataloader.train.max_samples 1 \
   --task.dataloader.valid.sampler.max_samples 1 \
   --task.dataloader.test.sampler.max_samples 1 \
   --task.model.path ❗❗❗FILL_THIS❗❗❗ \
   --task.evaluator.save_hypo_dir ❗❗❗FILL_THIS❗❗❗

What should the path for task.data.test.path.samples be?
Is task.model.path the path to 1WW_30W_5048064.pt?

Sorry for bothering you.

unable to generate

Hi !
thanks for sharing your excellent work!

After modifying part of your code I was able to train successfully and get pockets, but I am unable to run the final generation step. Could you please upload the latest, complete code?

Thanks!

about Preprocessing of data

Hello, I have a question. When running get_fragment_vocab.py, the fragment vocab is saved. Why does get_training_data.py need to re-acquire the fragments, align them with the previously saved vocab fragments, and finally compute a rotation matrix? Why do that? Thank you very much.
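One plausible reading (my interpretation, not confirmed by the repo): the vocab stores each fragment in a single canonical pose, so when building training data, each fragment occurrence must be re-extracted and aligned against its canonical vocab copy; the alignment yields the rotation that places the canonical fragment into the molecule's frame, which is what the model learns to predict. The standard tool for such an alignment is the Kabsch algorithm; a sketch:

```python
import numpy as np

def kabsch(P: np.ndarray, Q: np.ndarray) -> np.ndarray:
    """Rotation matrix R minimizing sum ||R @ p_i - q_i||^2 for point
    sets P, Q of shape (n, 3) (classic Kabsch algorithm). Illustrative
    sketch of the alignment step, not DESERT's actual code."""
    P = P - P.mean(axis=0)          # remove translation
    Q = Q - Q.mean(axis=0)
    H = P.T @ Q                     # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T
```

Here P would be the canonical vocab fragment's atom coordinates and Q the same fragment's coordinates inside the molecule; the returned R is the per-occurrence rotation matrix stored with the training sample.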

fail to work on mac(m1)

Hi!
Thank you for sharing your excellent work!
When I try to run this on my MacBook Pro (M1 Pro), it just doesn't work.
Could you please share a Dockerfile or something else that works on Mac?
thanks!
