fastent

The fastent Python library is a tool for end-to-end creation of custom models for named-entity recognition.

Custom Models

To train a model for a new type of entity, you just need a list of examples.

You are not limited to only predefined types like person, location and organization.

How It Works

fastent covers the whole pipeline: dataset generation, annotation, contextualization and model training.

You can also use fastent modules as standalone tools.

Made for Prod

fastent includes integrations with tools like spaCy, fastText pre-trained models and NLTK.

fastent is built to scale to very large text datasets in many languages.


Installation

fastent is developed for Python 3 on Unix systems.

Clone this repo or install from PyPI:

pip install fastent

Download NLTK data:

python -m nltk.downloader stopwords

Install and set up CouchDB:

wget -O - https://raw.githubusercontent.com/fastent/fastent/master/install.sh | bash

Downloading data files

TODO: fastText stuff

How To

Generation

fastent can generate a dataset from a list of examples.

TODO

fastent can even generate a list from one or two examples.

from fastent import dataset_pseudo_generator

# Load the spaCy model by name
model = dataset_pseudo_generator.spacy_initialize('en_core_web_lg')
# Generate a list from the two seed examples
dataset_pseudo_generator.dataset_generate(model, ['cocaine', 'heroin'], 100)

The equivalent on the command line:

python dataset_pseudo_generator.py -m en_core_web_lg -s cocaine,heroin
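
As a rough illustration of how a list might be grown from a couple of seed words, the sketch below ranks candidate words by cosine similarity to the average of the seed vectors. This is an assumption about the general technique, not fastent's actual implementation: the `expand_seeds` helper and the toy three-dimensional vectors are hypothetical, standing in for the word vectors that a model such as en_core_web_lg would provide.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand_seeds(vectors, seeds, size):
    """Return up to `size` non-seed words closest to the seed centroid.

    Hypothetical helper for illustration only; not part of fastent.
    """
    dims = len(vectors[seeds[0]])
    # Average the seed vectors component by component.
    centroid = [sum(vectors[s][i] for s in seeds) / len(seeds) for i in range(dims)]
    candidates = [w for w in vectors if w not in seeds]
    ranked = sorted(candidates, key=lambda w: cosine(vectors[w], centroid), reverse=True)
    return ranked[:size]


# Toy vectors (made up) standing in for real pre-trained embeddings.
toy_vectors = {
    "cocaine": [0.9, 0.1, 0.0],
    "heroin": [0.8, 0.2, 0.0],
    "morphine": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.1, 0.9],
    "paris": [0.1, 0.9, 0.1],
}

similar = expand_seeds(toy_vectors, ["cocaine", "heroin"], 2)
print(similar)
```

With real embeddings the candidate pool would be the model's vocabulary, so a similarity threshold or manual review would likely be needed to keep unrelated words out of the generated list.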

Annotation

TODO

Contextualization

TODO

Training

To train a model from the annotated and contextualized dataset:

For now the only supported learning framework is spaCy.
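
To make the expected input more concrete, here is a minimal sketch of turning a sentence and a list of entity examples into character-offset annotations, in the `(text, {"entities": [(start, end, label)]})` shape used by spaCy v2-style training examples. The `annotate` helper and the `DRUG` label are hypothetical and not part of fastent; matching is a naive case-insensitive whole-word search.

```python
import re


def annotate(text, examples, label):
    """Mark every whole-word occurrence of each example in `text`.

    Hypothetical helper for illustration; not fastent's annotation module.
    """
    entities = []
    for example in examples:
        pattern = r"\b" + re.escape(example) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            # Store character offsets: start inclusive, end exclusive.
            entities.append((match.start(), match.end(), label))
    entities.sort()
    return (text, {"entities": entities})


sentence = "Police seized heroin and cocaine at the port."
training_example = annotate(sentence, ["cocaine", "heroin"], "DRUG")
print(training_example)
```

A list of such tuples, one per sentence, is the kind of structure a spaCy training loop can consume. Overlapping matches are not resolved in this sketch.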

Request support for a new learning framework

TODO: sample output

Testing

Coming soon!

Integrations

fastent includes integrations for downloading datasets and pre-trained models.

TODO

More

See how fastent performs on benchmarks

Try the tutorial or fork examples

Browse frequently asked questions

Report bugs or request new features

Contributors

osoblanco, bittlingmayer
