Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.

Home Page: https://ctakes.apache.org

License: Apache License 2.0


Apache cTAKES™

Introduction

The Apache™ clinical Text Analysis and Knowledge Extraction System (cTAKES™) focuses on extracting knowledge from clinical text through Natural Language Processing (NLP) techniques.

cTAKES is engineered in a modular fashion and employs leading-edge rule-based and machine learning methods.

cTAKES has standard features for biomedical text processing software, including the ability to extract concepts such as symptoms, procedures, diagnoses, medications and anatomy with attributes and standard codes.

More powerful components can perform tasks as complex as identifying temporal events, dates and times – resulting in placement of events in a patient timeline.

Components are trained on gold standards from the biomedical as well as the general domain. This affords usability across different types of clinical narrative (e.g. radiology reports, clinical notes, discharge summaries) in various institution formats as well as other types of health-related narrative (e.g. twitter feeds), using multiple data standards (e.g. Health Level 7 (HL7), Clinical Document Architecture (CDA), Fast Healthcare Interoperability Resources (FHIR), SNOMED-CT, RxNORM).

cTAKES is the NLP platform for many initiatives across the world covering a variety of research purposes and large datasets. Contributors include professionals at medical and commercial institutions, NLP and Machine Learning researchers, Medical Doctors, and students of many disciplines and levels. We encourage people from all backgrounds to get involved! (link)


Supported Environments

  1. Java 1.8 is required to run cTAKES. Run this command to check your Java version:
$ java -version
  2. Maven 3 is required to build cTAKES. Run this command to check your Maven version:
$ mvn -version
  3. A license for the Unified Medical Language System (UMLS) is required to use the named entity recognition module (dictionary lookup) with the default dictionary.
  4. Python 3 is required to use the cTAKES Python Bridge to Java (PBJ). Run this command to check your Python version:
$ python -V
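
The Java check above can also be scripted. What follows is a minimal sketch, not part of cTAKES: it assumes the usual `java -version` banner, in which the version appears in double quotes, and it handles both the legacy "1.x" numbering (so "1.8.0_282" means Java 8) and the newer numbering (so "17.0.2" means Java 17). The function names `java_major_version` and `installed_java_major` are hypothetical.

```python
import re
import subprocess


def java_major_version(version_output):
    """Extract the major Java version from `java -version` style text.

    Returns an int such as 8 or 17, or None if no version string is found.
    """
    match = re.search(r'version "([^"]+)"', version_output)
    if match is None:
        return None
    parts = match.group(1).split(".")
    # Legacy scheme: "1.8.0_282" -> 8. Newer scheme: "17.0.2" -> 17.
    if parts[0] == "1" and len(parts) > 1 and parts[1].isdigit():
        return int(parts[1])
    leading_digits = re.match(r"\d+", parts[0])
    return int(leading_digits.group()) if leading_digits else None


def installed_java_major():
    """Run `java -version` and parse its banner.

    The banner is normally written to stderr, so both streams are read.
    Raises FileNotFoundError if no `java` executable is on the PATH.
    """
    result = subprocess.run(
        ["java", "-version"], capture_output=True, text=True, check=False
    )
    return java_major_version(result.stderr + result.stdout)


if __name__ == "__main__":
    print(java_major_version('openjdk version "1.8.0_282"'))
    print(java_major_version('openjdk version "17.0.2" 2022-01-18'))
```

Calling `installed_java_major()` on a machine with Java installed returns the major version, which can then be compared with the version your cTAKES release requires.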

Getting Started

New Users

The easiest way for new users to get a jump start running cTAKES is to use the Standard Pipeline Installation Facility. The Standard Pipeline Installation Facility is a tool that can install cTAKES configured to run the most popular cTAKES pre-built pipelines. You can then use the Piper File Submitter GUI to submit jobs or submit them from the command line.

For access to all cTAKES capabilities, download a zip or tar.z file containing a fully-built installation of the most recent cTAKES release. Then, after obtaining a UMLS license, use the UMLS Package Fetcher GUI to install a copy of the default dictionary for Named Entity Recognition (NER) using cTAKES Fast Dictionary Lookup.

New Developers

Notice: cTAKES 6.0.0-SNAPSHOT requires jdk 17 to build and run.

All source code for cTAKES versions 5+ is available from the cTAKES GitHub repository.

  1. Clone this repository
$ git clone https://github.com/apache/ctakes.git
  2. Open your local copy of the repository in an IDE of your choice.
  3. Run directly from the code (link).
    or
  4. Build a binary installation (link), and
  5. Run a binary installation (link).

More information

Much more information can be found on the cTAKES wiki.

You can also write to the cTAKES user and developer mailing lists (user at ctakes.apache.org and dev at ctakes.apache.org), and you can find answers to previously asked questions by searching the user and developer mail archives.

ctakes's Issues

Issue clean install using maven for ctakes-ytex

I was wondering if anybody had any advice for me on the following error. I am trying to run a clean install from the head of the main branch in git.

Below are my commands. The error starts with commit d998331.
The same commands work for the previous commit, a97258b.

steps to reproduce
git clone the ctakes repo

mvn clean install -ff -DskipTests=true;

error message
/root/projects/ctakes/ctakes-ytex/scripts/build-setup.xml:149: The following error occurred while executing this line:
[ERROR] /root/projects/ctakes/ctakes-ytex/scripts/data/build.xml:148: The following error occurred while executing this line:
[ERROR] /root/projects/ctakes/ctakes-ytex/scripts/data/build.xml:531: Warning: Could not find file /root/projects/ctakes/ctakes-ytex/scripts/data/${project.basedir}/conn.xml.template to copy.

mvn -v
Apache Maven 3.6.3
Maven home: /usr/share/maven
Java version: 1.8.0_282, vendor: AdoptOpenJDK, runtime: /opt/java/openjdk/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "5.10.147+", arch: "amd64", family: "unix"
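
One detail worth noting in the error message above: the missing path contains the literal text `${project.basedir}`, which suggests the property was never expanded before the path was used. A minimal sketch, in Python, for scanning build output for such unexpanded placeholders follows. The function name `find_unexpanded_properties` is hypothetical and the script is not part of cTAKES or Maven.

```python
import re

# Matches ${...} placeholders that survived into the build output verbatim.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def find_unexpanded_properties(log_text):
    """Return the sorted, de-duplicated property names found as literal ${...}."""
    return sorted(set(PLACEHOLDER.findall(log_text)))


if __name__ == "__main__":
    sample = (
        "[ERROR] /root/projects/ctakes/ctakes-ytex/scripts/data/build.xml:531: "
        "Warning: Could not find file /root/projects/ctakes/ctakes-ytex/scripts/"
        "data/${project.basedir}/conn.xml.template to copy."
    )
    print(find_unexpanded_properties(sample))  # prints ['project.basedir']
```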

Deprecate ctakes-utils

ctakes-utils is relatively small. Some classes are still used, and those should be migrated to ctakes-core.

Is there a piper file that includes the smoking status?

I've been using the DefaultFastPipeline.piper configuration on version 4.0.0.1 and want to add the published Smoking Status pipeline to it (or run it as a separate pipeline if that makes more sense). Is there an example available? Or a suggestion on how to convert a descriptor file like SimulatedProdSmokingTAE.xml to a piper file? Thanks!

cTAKES custom dictionary setup documentation

I have a CSV file of dictionary vocabulary and concept codes. Is there any documentation on how to set up a custom dictionary for use with bin/runClinicalPipeline.sh?

Thanks
