Reinforcement Learning with Active Learning

Project Description


This repository contains the source code for running Reinforcement Learning in tandem with Active Learning.

This project was created to alleviate the rising computational burdens associated with the detailed affinity calculations required for in-silico screening of small-molecule drug candidates.

With this in mind, we designed an active learning system that iteratively selects informative samples to guide the training of a reinforcement learning agent. The agent is trained only on the most pertinent and informative examples, which reduces redundancy and wasted computation.
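As a rough illustration of the idea, the sketch below runs a few rounds in which only a small budget of candidates is sent to an expensive oracle, while the rest are scored by a cheap surrogate. Everything in it is hypothetical and not taken from this repository: the toy `expensive_oracle`, the 1-nearest-neighbour `surrogate_predict`, the greedy selection rule, and the numeric "candidates" standing in for molecules.

```python
import random

def expensive_oracle(x):
    # Hypothetical stand-in for a costly affinity calculation.
    return -(x - 0.7) ** 2

def surrogate_predict(x, labelled):
    # 1-nearest-neighbour surrogate: reuse the score of the closest labelled point.
    nearest = min(labelled, key=lambda known: abs(known - x))
    return labelled[nearest]

def active_learning_round(candidates, labelled, budget):
    """Label at most `budget` new candidates with the oracle, predict the rest."""
    unlabelled = sorted(set(candidates) - set(labelled))
    # Greedy acquisition (an assumption): take the candidates the surrogate rates highest.
    ranked = sorted(unlabelled, key=lambda x: surrogate_predict(x, labelled), reverse=True)
    for x in ranked[:budget]:
        labelled[x] = expensive_oracle(x)  # the only oracle calls in this round
    # Every candidate receives a score: the true value if labelled, a prediction otherwise.
    return [labelled[x] if x in labelled else surrogate_predict(x, labelled)
            for x in candidates]

rng = random.Random(0)
# Seed the surrogate with two labelled points.
labelled = {0.0: expensive_oracle(0.0), 1.0: expensive_oracle(1.0)}
for _ in range(3):
    batch = [round(rng.random(), 3) for _ in range(20)]  # stand-in for sampled molecules
    scores = active_learning_round(batch, labelled, budget=4)

print(f"oracle calls: {len(labelled)} for {3 * 20} scored candidates")
```

In a full system, the scores returned each round would serve as the reward signal for the agent, and the surrogate would be retrained on the growing labelled set.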

Training on a smaller set of high-quality samples speeds up convergence. This has resulted in substantial efficiency gains with respect to CPU wall time per lead identified.

Table of Contents


  1. Project Description
  2. Installation Instructions
  3. Project Usage
  4. Licence

Installation Instructions


We recommend using Miniconda (see the conda documentation) to set up a virtual environment for installing dependencies.

To clone this repository, you'll need Git installed on your computer.

From your command line:

  • Open your terminal/command line
  • Change the current working directory to the location where you want the cloned directory to be made.
cd location/to/your/folder
git clone https://github.com/MolecularAI/reinforcement-learning-active-learning
cd reinforcement-learning-active-learning

ReInvent


To install ReInvent, we create an environment and install a single dependency:

cd reinvent/reinventcli
conda env create -f reinvent.yml

Once the reinvent.v3.2 environment is set up, install the reinvent scoring package:

cd reinvent/reinvent_scoring
conda activate reinvent.v3.2
pip install -e . 

For more information please refer to the main repository: Reinvent

Icolos


An Icolos build is included with this repository. Create the Icolos environment:

cd icolos
conda env create -f environment_min.yml

For more information please refer to the main repository: Icolos

Project Usage


There have been two significant modifications to the ReInvent code base: the integration of an active learning system, and the introduction of a relevance weighting mechanism for Reinforcement Learning (RL) policy updates.

The configuration files included in this release reference the docking and ROCS query files that were used during system testing. These can serve as practical examples or starting points for your own experiments.

Reinvent Scoring


reinvent_scoring.reinvent_scoring.scoring.score_components.active_learning

The active learning score component provides the tools needed to approximate a given oracle. It currently supports ROCS and AutoDock Vina (ADV) oracles.

  • retrospectiveReinvent.py:

    • This Python script forms the main execution code for the active learning algorithm. It initiates and controls the learning process.
  • acquisition_functions.py:

    • This module houses various strategies for compound selection. It includes classes implementing diverse selection methods such as random, upper confidence bound, greedy, and uncertainty-based selection.
  • molecular_representations.py:

    • This file includes methods for converting compounds into features suitable for training machine learning models. The supported representations are physchem properties, ECFP, hash_ecfp, Avalon, and MACCS keys.
  • oracles.py:

    • This script is responsible for interacting with Icolos and retrieving the values needed for model training. Currently, it supports ADV and ROCS oracles.
  • surrogate_models.py:

    • This module manages the training and inference of various machine learning models. It supports a variety of models, including Random Forest, XGBoost, Support Vector Regression (SVR), Gaussian Processes, and K-Nearest Neighbours.
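To make the selection strategies listed above concrete, here is a minimal sketch of the four acquisition rules operating on surrogate predictions. The function names, the `beta` parameter, and the convention that a higher predicted score is better are assumptions for illustration; they are not the classes defined in acquisition_functions.py.

```python
import random

def _top_k(values, k):
    # Indices of the k largest values, best first.
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)[:k]

def select_random(means, stds, k, seed=0):
    # Ignore the predictions and draw k distinct indices uniformly at random.
    rng = random.Random(seed)
    return rng.sample(range(len(means)), k)

def select_greedy(means, stds, k):
    # Exploit: take the compounds with the highest predicted score.
    return _top_k(means, k)

def select_uncertainty(means, stds, k):
    # Explore: take the compounds the surrogate is least sure about.
    return _top_k(stds, k)

def select_ucb(means, stds, k, beta=1.0):
    # Upper confidence bound: predicted score plus beta times the uncertainty.
    return _top_k([m + beta * s for m, s in zip(means, stds)], k)

# Hypothetical surrogate output for four compounds.
means = [0.2, 0.9, 0.5, 0.4]
stds = [0.05, 0.01, 0.30, 0.60]

greedy = select_greedy(means, stds, 2)          # [1, 2]
uncertain = select_uncertainty(means, stds, 2)  # [3, 2]
ucb = select_ucb(means, stds, 2, beta=1.0)      # [3, 1]
picked_at_random = select_random(means, stds, 2)
```

Greedy selection and uncertainty selection are the two extremes; in this sketch `beta` moves UCB between them, with `beta=0` reducing to greedy.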

Reinvent Model


reinvent.reinventcli.running_modes.reinforcement_learning

We present an enhancement for REINVENT that assigns each SMILES a weight for every score component. The per-component weights of a SMILES are multiplied together to give a final cumulative weight. This final weight is then applied to that SMILES's contribution to the loss, resulting in a weighted loss update.

This lets users fine-tune how much a specific SMILES contributes to the gradient update.
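The weighting described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the code in the reinforcement_learning running mode: the function names are hypothetical, the per-SMILES losses are given as plain numbers, and averaging over the batch is an assumed reduction.

```python
import math

def final_weights(component_weights):
    """Multiply the per-component weights of each SMILES into one weight."""
    return [math.prod(row) for row in component_weights]

def weighted_loss(per_smiles_loss, component_weights):
    """Scale each SMILES's loss by its final weight, then average (assumed reduction)."""
    weights = final_weights(component_weights)
    weighted = [w * loss for w, loss in zip(weights, per_smiles_loss)]
    return sum(weighted) / len(weighted)

# Three SMILES, two score components each (hypothetical values).
per_smiles_loss = [4.0, 1.0, 2.0]
component_weights = [
    [1.0, 0.5],  # first SMILES: down-weighted by one component
    [1.0, 1.0],  # second SMILES: full contribution
    [0.5, 0.5],  # third SMILES: down-weighted by both components
]

weights = final_weights(component_weights)  # [0.5, 1.0, 0.25]
loss = weighted_loss(per_smiles_loss, component_weights)
```

A weight of 1.0 on every component leaves a SMILES's contribution unchanged, and a weight of 0.0 on any component removes that SMILES from the update entirely.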

Example


Example configuration files are provided in the configs directory, and the prepared files for docking and chemical structure comparison are found in the data folder.

Licence

