This repo contains the source code for extracting drug-protein relations in the BioCreative VII DrugProt Track. We propose a novel sequence labeling framework for drug-protein relation extraction. Our method achieved the top performance in the BioCreative VII DrugProt Track. Please refer to our paper for more details:
The code has been tested with Python 3.7 on CentOS, on both CPU and GPU, and uses the following dependencies:
To run this code, first download the model file (the best single model, i.e., BioM-ELECTRAL with P->D), then unzip it and put the model folder into the root folder.
You can use our trained model to extract drug-protein relations from biomedical texts with the /src/DrugProt_Tagging_PD.py script.
The script requires two parameters:
- --input, -i: input file
- --output, -o: output file to save the extracted relations
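For readers who want to wrap or reuse the command-line interface, the two options above could be declared with argparse roughly as follows. This is only a minimal sketch: the function name build_parser and the required=True setting are assumptions made for illustration, and the actual parsing code in DrugProt_Tagging_PD.py may differ.

```python
import argparse


def build_parser():
    """Build a parser with the two documented options.

    Hypothetical reconstruction: the function name and required=True
    are assumptions, not taken from the real script.
    """
    parser = argparse.ArgumentParser(
        description="Extract drug-protein relations from biomedical text."
    )
    parser.add_argument("--input", "-i", required=True, help="input file")
    parser.add_argument(
        "--output",
        "-o",
        required=True,
        help="output file to save the extracted relations",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch
# runs anywhere; the paths mirror the example command in this README.
args = build_parser().parse_args(
    ["-i", "../example/example_input.tsv", "-o", "../example/example_out.tsv"]
)
print(args.input)
print(args.output)
```

Both the long and short forms of each option map to the same attribute, so -i and --input can be used interchangeably.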
The input file must provide the text and named entity recognition (NER) information. There is one example in the /example/ folder.
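Before preparing your own input, it can help to look at how the example file is laid out. The sketch below prints nothing on its own; it only defines a small helper that returns the first rows of a file. It assumes the file is tab-separated (suggested by the .tsv extension) and UTF-8 encoded, and makes no assumption about what the columns mean. The helper name preview_tsv is hypothetical and not part of this repo.

```python
import csv
from itertools import islice


def preview_tsv(path, max_rows=5):
    """Return the first max_rows rows of a tab-separated file.

    Each row is a list of strings. The tab delimiter and UTF-8
    encoding are assumptions about the example file.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps quote characters in biomedical text as-is.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        return list(islice(reader, max_rows))
```

For instance, calling preview_tsv("../example/example_input.tsv") from the /src/ folder would return the first five rows of the provided example.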
Example (run from the /src/ folder):
$ python DrugProt_Tagging_PD.py -i ../example/example_input.tsv -o ../example/example_out.tsv