molecularai / mmp_project

Code for paper

License: Apache License 2.0

Please note: this repository is no longer being maintained.

Additivity and Nonadditivity for ML in Drug Design

This repository contains the code for the paper:

Karolina Kwapien, Eva Nittinger, Jiazhen He, Christian Margreitter, Alexey Voronov, and Christian Tyrchan. Implications of Additivity and Nonadditivity for Machine Learning and Deep Learning Models in Drug Design. ACS Omega 2022, 7 (30), 26573-26581. DOI: 10.1021/acsomega.2c02738

The code runs hyper-parameter optimization for four algorithms: random forest (RF), support vector regression (SVR), XGBoost, and partial least squares (PLS). The data is not included in this repository.

Directories

  • root - Python and shell scripts for running hyper-parameter optimization.
  • notebooks - Jupyter Notebooks for splitting data and computing test scores.
  • data-initial - Initial data, not included in this repo.
  • data - Main data: random split of initial data into train and test data, not included in this repo.
  • downsampled-10-percent - Down-sampled 10% of main data, not included in this repo.
  • optuna-storage - Auxiliary storage used by the Optuna library to track hyper-parameter optimization progress, not included in this repo.
  • best-models - Models with best hyper-parameters, not included in this repo.
  • pred_values - Predicted vs expected values for models with best hyper-parameters.
  • fill-gaps-configs - Build configurations for the best-found hyper-parameters, used for "filling gaps" (see paper).
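
The random train/test split and the 10% down-sampling described for the data and downsampled-10-percent directories could look roughly like the sketch below. This is not the code from the notebooks: the function names, the 20% test fraction, the fixed seed, and the toy integer "rows" are all assumptions made for illustration.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=0):
    """Randomly split rows into (train, test) lists.

    test_fraction and seed are illustrative assumptions, not values
    taken from the repository.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def downsample(rows, fraction=0.1, seed=0):
    """Keep a random subset of rows (10% by default), without replacement."""
    rows = list(rows)
    if not rows:
        return []
    rng = random.Random(seed)
    n_keep = min(len(rows), max(1, round(len(rows) * fraction)))
    return rng.sample(rows, n_keep)


# Toy stand-in for the initial data, which is not part of the repository.
rows = list(range(1000))
train, test = split_train_test(rows)
small_train = downsample(train)
print(len(train), len(test), len(small_train))
```

Fixing the seed keeps the split reproducible, so that models optimized on the full and on the down-sampled data are scored against the same test set.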

Workflow

  1. First, split the initial data into training and test datasets using a Jupyter Notebook.
  2. Then run all 32 optimization jobs using the script submit_all_to_slurm_on_full_data.sh.
  3. If any of the jobs fail:
    • Prepare down-sampled data using a Jupyter Notebook.
    • Re-submit the failed optimization jobs using the down-sampled data.
    • Prepare "fill-gaps" build configurations for the best-found hyper-parameters using a Jupyter Notebook.
    • Submit the "fill-gaps" build jobs.
  4. Finally, prepare the summary table using a Jupyter Notebook.
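
The branching in the workflow above can be summarized in a few lines of Python. This is only a sketch of the decision logic, not a script from the repository: the function name, the job names, and the shape of the status dictionary are hypothetical.

```python
def plan_next_steps(job_status):
    """Return the remaining workflow steps, in order.

    job_status maps a (hypothetical) job name to True if its
    full-data optimization job succeeded and False if it failed.
    """
    failed = sorted(name for name, ok in job_status.items() if not ok)
    steps = []
    if failed:
        # Step 3 of the workflow only applies when at least one job failed.
        steps.append("prepare down-sampled data")
        steps.extend(f"re-submit {name} on down-sampled data" for name in failed)
        steps.append('prepare "fill-gaps" build configurations')
        steps.append('submit "fill-gaps" build jobs')
    steps.append("prepare summary table")
    return steps


# Hypothetical job names; the real ones are defined by the submission script.
status = {"job_a": True, "job_b": False}
for step in plan_next_steps(status):
    print(step)
```

When every job succeeds, the function skips straight to the summary table, which mirrors going from step 2 to step 4.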

Dependencies

This code uses QPTUNA to set up hyper-parameter optimization.

Optimization jobs are started using SLURM, but they can be started without SLURM too.
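
One way to support both modes is to choose the launcher at run time. The sketch below assumes each optimization job is wrapped in a shell script that can either be passed to sbatch or executed directly with bash; the helper name and the script name run_optimization.sh are hypothetical and do not come from the repository.

```python
import shutil


def build_launch_command(script, use_slurm=None):
    """Build the command used to start one optimization job.

    If use_slurm is None, SLURM is used only when the sbatch
    executable is found on PATH; otherwise the script is run with bash.
    """
    if use_slurm is None:
        use_slurm = shutil.which("sbatch") is not None
    return ["sbatch", script] if use_slurm else ["bash", script]


# Hypothetical script name, used only for illustration.
print(build_launch_command("run_optimization.sh", use_slurm=False))
```

The returned list can be passed to subprocess.run(cmd, check=True). Without SLURM the jobs run one after another on the local machine, so the full set may take considerably longer.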

License

Apache 2.0.
