
kyzhouhzau / clinical-ner


Named entity recognition for Chinese electronic medical records

License: Apache License 2.0

Perl 3.95% Python 3.35% Shell 1.32% Makefile 0.23% Roff 15.12% C 76.02%
wapiti crf ner ccks 2017 clinical

clinical-ner's Introduction

Workflow:

@Author zhoukaiyin

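Task description

This evaluation task is named entity recognition for Chinese electronic medical records: given a set of plain-text electronic medical record documents, the goal is to identify and extract entity mentions related to clinical medicine and assign each one to a pre-defined category, such as symptom, drug, or surgery.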
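As a rough illustration of the task, an entity mention can be represented as a character span with a category, and a span list can be turned into per-character BIO tags. This is a minimal sketch, not the repository's code: the function name `spans_to_bio`, the example sentence, and the category name `SYMPTOM` are all assumptions made for illustration, and the sketch tags single characters rather than segmented words.

```python
from typing import List, Tuple

# (start, end, category) with an exclusive end index; the layout is hypothetical.
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Return one BIO tag per character of `text` for the given entity spans."""
    tags = ["O"] * len(text)
    for start, end, category in spans:
        tags[start] = f"B-{category}"      # first character of the mention
        for i in range(start + 1, end):
            tags[i] = f"I-{category}"      # remaining characters of the mention
    return tags


# Hypothetical sentence: "the patient has had a headache for three days".
text = "患者头痛三天"
spans = [(2, 4, "SYMPTOM")]                # covers "头痛" (headache)
tags = spans_to_bio(text, spans)
for ch, tag in zip(text, tags):
    print(ch, tag)
```

Characters outside any mention keep the tag `O`, so the tag sequence always has the same length as the text.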

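Step 1: Data processing (Linux)

$ python raw2bio.py -1 # segment the training data and attach dictionary features
$ python raw2bio.py -2 # segment the label data and attach the labels
$ python raw2bio.py -3 # save the labels as a pickle file so they can later be merged with the training data
$ python raw2bio.py -4 # join the labels with the training text to build the training file
$ python raw2bio.py -1 test # process the test data into the required format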
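The exact output of `raw2bio.py` is not shown here. CRF toolkits such as Wapiti and CRF++ read column-oriented text: one token per line, whitespace-separated feature columns with the label last, and a blank line between sequences. The sketch below writes that layout, assuming three columns (character, a dictionary-match flag, BIO tag); the column choice, the `Y`/`N` flag values, and the helper name `to_column_format` are assumptions.

```python
from typing import Iterable, List, Tuple

# (character, dictionary feature, BIO tag); this column layout is an assumption.
Token = Tuple[str, str, str]


def to_column_format(sentences: Iterable[List[Token]]) -> str:
    """Serialize sentences as one token per line, blank line between sentences."""
    blocks = []
    for sentence in sentences:
        lines = [" ".join(token) for token in sentence]
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


# Two hypothetical sentences; "Y"/"N" marks whether the character hit a dictionary entry.
sentences = [
    [("头", "Y", "B-SYMPTOM"), ("痛", "Y", "I-SYMPTOM")],
    [("三", "N", "O"), ("天", "N", "O")],
]
column_text = to_column_format(sentences)
print(column_text, end="")
```

Test data would use the same layout without the final label column.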

Step 2: Model training (Linux)

$ bash wapiti_ccks.sh # train the model; the model is stored in /eval/bio_ccks
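The contents of `wapiti_ccks.sh` are not reproduced here. For orientation, a typical Wapiti run consists of `wapiti train` with a pattern file and `wapiti label` with the trained model. The sketch below only builds those two command lines from Python; every file name (`pattern.txt`, `train.bio`, `model.wap`, `test.bio`, `test.out`) is a placeholder, and the options the script actually passes may differ.

```python
import shutil
import subprocess
from typing import List


def wapiti_train_cmd(pattern: str, train: str, model: str) -> List[str]:
    """Command line for training: wapiti train -p PATTERN TRAIN MODEL."""
    return ["wapiti", "train", "-p", pattern, train, model]


def wapiti_label_cmd(model: str, data: str, output: str) -> List[str]:
    """Command line for labeling: wapiti label -m MODEL DATA OUTPUT."""
    return ["wapiti", "label", "-m", model, data, output]


def run(cmd: List[str]) -> bool:
    """Run the command if the wapiti binary is on PATH; otherwise just report it."""
    if shutil.which(cmd[0]) is None:
        print("wapiti not found on PATH; would run:", " ".join(cmd))
        return False
    subprocess.run(cmd, check=True)
    return True


# Placeholder file names, not the paths used by the repository.
train_cmd = wapiti_train_cmd("pattern.txt", "train.bio", "model.wap")
label_cmd = wapiti_label_cmd("model.wap", "test.bio", "test.out")
print(" ".join(train_cmd))
print(" ".join(label_cmd))
```

Calling `run(train_cmd)` followed by `run(label_cmd)` would execute the two steps once the input files exist.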

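Step 3: Obtaining results (Linux)

$ python get_result.py # extract the result files; results are saved in CCKS_result in BIO format and in finall in the official label format
$ python onefile.py # convert the results to the submission format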
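Converting BIO output back into entities amounts to grouping each `B-` tag with the `I-` tags of the same category that follow it. The sketch below shows that decoding step only; the helper name `bio_to_spans` and the `(start, end, category)` result layout are assumptions, and the official label format produced by `get_result.py` is not shown here.

```python
from typing import List, Optional, Tuple


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Group BIO tags into (start, end, category) spans with an exclusive end."""
    spans = []
    start: Optional[int] = None
    category: Optional[str] = None
    # A trailing "O" sentinel closes a span that runs to the end of the sequence.
    for i, tag in enumerate(list(tags) + ["O"]):
        if start is not None and tag == f"I-{category}":
            continue                       # still inside the current mention
        if start is not None:
            spans.append((start, i, category))
            start, category = None, None
        if tag.startswith("B-"):
            start, category = i, tag[2:]   # open a new mention
        # A stray "I-" tag with no open mention is ignored.
    return spans


predicted = ["O", "O", "B-SYMPTOM", "I-SYMPTOM", "O", "O"]
entities = bio_to_spans(predicted)
print(entities)
```

Ignoring stray `I-` tags is one possible policy; a stricter decoder could treat them as the start of a new mention instead.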

Result file

Flyon\CCKS_CRF\eval\result.txt

Wapiti ( http://wapiti.limsi.fr ) is a simple and fast discriminative sequence labeling toolkit, similar to CRF++.

Note: pretrained models such as BERT and ALBERT are also worth trying; see NLPGNN.
