BERE

Implementation of the paper [BERE: A novel machine learning framework for accurate biomedical entity relation extraction].

Environments

Tested on a Linux server with a GeForce GTX 1080 GPU in the following environment:

  • Python 3.5.2

  • PyTorch 1.0.0

  • sklearn 0.20.2

  • numpy 1.15.4

  • CUDA 9.0
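
A quick way to compare an installed environment against the versions listed above is sketched below. It is an illustrative helper, not part of the repository: the function names are hypothetical, and it assumes each package exposes a `__version__` attribute under the import names `torch`, `sklearn` and `numpy`.

```python
import importlib

# Versions listed above; the keys are the import names of each package.
TESTED_VERSIONS = {
    "torch": "1.0.0",
    "sklearn": "0.20.2",
    "numpy": "1.15.4",
}


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def compare_versions(tested, lookup=installed_version):
    """Map each module name to a (tested_version, installed_version) pair."""
    return {name: (version, lookup(name)) for name, version in tested.items()}


if __name__ == "__main__":
    for name, (tested, found) in sorted(compare_versions(TESTED_VERSIONS).items()):
        status = "ok" if found == tested else "differs"
        print("{:<8} tested={} installed={} [{}]".format(name, tested, found, status))
```

A version reported as "differs" is not necessarily a problem; it only indicates that the setup deviates from the one the code was tested on.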

Installation Guide

  1. Download the pretrained word embedding PubMed-and-PMC-w2v.bin from http://evexdb.org/pmresources/vec-space-models/ and put it in ./data/.

  2. Download the complete DTI dataset from https://cloud.tsinghua.edu.cn/d/1bdb3bed3031479c8aa9/ and put it in ./data/dti/.
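
Before running any script, it can help to confirm that the two downloads ended up where the steps above expect them. The following is a minimal sketch; `missing_paths` is a hypothetical helper, and it only checks that the paths exist, not that the files are complete.

```python
import os

# Paths named in the installation steps, relative to the repository root.
REQUIRED_PATHS = [
    os.path.join("data", "PubMed-and-PMC-w2v.bin"),
    os.path.join("data", "dti"),
]


def missing_paths(root, required=REQUIRED_PATHS):
    """Return the required paths that do not exist under root."""
    return [p for p in required if not os.path.exists(os.path.join(root, p))]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All required data paths are present.")
```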

How to Run

[DDI Experiment] (less than 1 h per training run)

  1. Run ./data/ddi/data_prepare.py to preprocess the DDI dataset.

  2. Run ./train_ddi.py to train BERE with different learning rates.

  3. Run ./test_ddi.py to test BERE with the best model.

 

[DTI Experiment] (10 to 20 h until convergence)

  1. Run ./data/dti/data_prepare.py to preprocess the DTI dataset.

  2. Run ./train_dti.py to train BERE with different learning rates.

  3. Run ./test_dti.py to test BERE with the best model.

 

[Demo of DTI Prediction]

  1. Train the model on the DTI dataset.

  2. Run ./data/dti/transform.py to preprocess the pmc_nintedanib dataset.

  3. Run ./predict.py to predict the targets of the drug nintedanib by the well-trained model.
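
The step lists above can also be driven from a single script. The sketch below is illustrative only: `PIPELINES`, `build_commands` and `run_pipeline` are hypothetical names, and it assumes every script is launched from the repository root without arguments, which the README does not state explicitly.

```python
import subprocess
import sys

# Script order taken from the steps above; paths are relative to the repository root.
PIPELINES = {
    "ddi": ["data/ddi/data_prepare.py", "train_ddi.py", "test_ddi.py"],
    "dti": ["data/dti/data_prepare.py", "train_dti.py", "test_dti.py"],
    # The demo assumes a model has already been trained on the DTI dataset.
    "demo": ["data/dti/transform.py", "predict.py"],
}


def build_commands(task, python=sys.executable):
    """Return the commands (as argument lists) for one task, in execution order."""
    if task not in PIPELINES:
        raise ValueError("unknown task: {!r}".format(task))
    return [[python, script] for script in PIPELINES[task]]


def run_pipeline(task, repo_root="."):
    """Run each script in order; check_call raises on the first failing step."""
    for command in build_commands(task):
        print("Running: " + " ".join(command))
        subprocess.check_call(command, cwd=repo_root)
```

Calling `run_pipeline("ddi")` would then execute the three DDI steps in sequence. If a preprocessing script expects to be started from its own directory, the working directory for that step would need to be adjusted.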

Data Description

  • PubMed-and-PMC-w2v.bin: The pretrained word embedding.
  • train.json, valid.json, test.json: The original dataset.
  • label2id.json: The label file.
  • pmc_nintedanib.json: The data for DTI Prediction demo.
  • tree_examples.json: The data for visualization demo.
  • config.py: The hyper-parameter settings.
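
As an illustration of how the label file might be consumed, the sketch below loads a label-to-id mapping and builds its inverse. It assumes that label2id.json holds a single JSON object mapping label strings to integer ids, which the file name suggests but the README does not confirm; `load_label_maps` is a hypothetical helper.

```python
import json


def load_label_maps(path):
    """Load a label-to-id JSON object and build the inverse id-to-label map."""
    with open(path, encoding="utf-8") as handle:
        label2id = json.load(handle)
    id2label = {index: label for label, index in label2id.items()}
    # If two labels share an id, the inverse map would silently lose one of them.
    if len(id2label) != len(label2id):
        raise ValueError("label ids are not unique")
    return label2id, id2label
```

The inverse map is useful for turning predicted class indices back into readable relation names.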

File Description

  • ./data/: This directory contains the DDI dataset, the DTI dataset and the pretrained word embedding.

  • ./network/: This directory contains the code of our model.

  • ./checkpoint/(generated): This directory contains the model checkpoints saved during training.

  • ./result/(generated): This directory contains the test results and prediction results.

  • ./output/(generated): This directory contains the prediction results in a more readable form.

  • ./train_ddi.py: This is a demo for training BERE on the DDI dataset.

  • ./train_dti.py: This is a demo for training BERE on the DTI dataset.

  • ./test_ddi.py: This is a demo for testing BERE on the DDI dataset.

  • ./test_dti.py: This is a demo for testing BERE on the DTI dataset.

  • ./predict.py: This is a demo for predicting the targets of the drug nintedanib.

  • ./plot_pr.py: This file is used to plot the precision-recall curve of the results in ./result/.

  • ./visualize.py(optional): This file is used for the visualization of word attention, sentence attention and sentence tree structures.
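
For readers unfamiliar with precision-recall curves, the sketch below shows how the points of such a curve can be computed from ranked predictions. It is a generic illustration in plain Python, not the code of ./plot_pr.py; the input format (one confidence score and one 0/1 label per candidate) is an assumption and may differ from the files in ./result/.

```python
def precision_recall_points(scores, labels):
    """Return (precision, recall) pairs, one per prediction, in descending score order.

    scores: predicted confidence for each candidate
    labels: 1 if the candidate is a true relation, else 0
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    # Rank candidates from most to least confident; ties keep their input order.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = []
    true_positives = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        points.append((true_positives / rank, true_positives / total_positives))
    return points
```

Each point corresponds to keeping only the top-ranked predictions up to that position, so plotting precision against recall over all positions yields the curve.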

Notes

  • A full backup of the data, code and results is available at https://cloud.tsinghua.edu.cn/d/0e3a253403914c33b3dd/ and can be used for reproduction.
  • The full datasets for discovering novel DTIs are available from the corresponding authors upon request.
  • If you have any other questions or comments, please feel free to email Lixiang Hong (honglx17[at]mails[dot]tsinghua[dot]edu[dot]cn) and/or Jianyang Zeng (zengjy321[at]tsinghua[dot]edu[dot]cn).
