humansignal / label-studio-transformers

Label data using HuggingFace's transformers and automatically get a prediction service

Home Page: https://labelstud.io/

License: Apache License 2.0

Python 100.00%
label-studio nlp transformers natural-language-processing natural-language-understanding bert pytorch-transformers text-labeling data-labeling

label-studio-transformers's Introduction

Label Studio for Hugging Face's Transformers

Transfer learning for NLP models by annotating your textual data without any additional coding.

This package provides a ready-to-use container that links together Label Studio and Hugging Face's Transformers.

Quick Usage

Install Label Studio and other dependencies

pip install -r requirements.txt

Create ML backend with BERT classifier

label-studio-ml init my-ml-backend --script models/bert_classifier.py
cp models/utils.py my-ml-backend/utils.py

# Start ML backend at http://localhost:9090
label-studio-ml start my-ml-backend

# Start Label Studio in a new terminal with the same Python environment
label-studio start

  1. Create a project with Choices and Text tags in the labeling config.
  2. Connect the ML backend in the Project settings with http://localhost:9090

Create ML backend with BERT named entity recognizer

label-studio-ml init my-ml-backend --script models/ner.py
cp models/utils.py my-ml-backend/utils.py

# Start ML backend at http://localhost:9090
label-studio-ml start my-ml-backend

# Start Label Studio in a new terminal with the same Python environment
label-studio start

  1. Create a project with Labels and Text tags in the labeling config.
  2. Connect the ML backend in the Project settings with http://localhost:9090
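
Both setups depend on a labeling config that contains the named tags. The sketch below is illustrative only: the two configs, the choice and label values, and the helper names tag_names and data_key are assumptions, not taken from this repository. It holds one config per setup as a string and checks which tags and which task data key each one uses.

```python
import xml.etree.ElementTree as ET

# Hypothetical config for the BERT classifier setup: Text plus Choices.
CLASSIFIER_CONFIG = """
<View>
  <Text name="text" value="$text"/>
  <Choices name="label" toName="text" choice="single">
    <Choice value="Positive"/>
    <Choice value="Negative"/>
  </Choices>
</View>
"""

# Hypothetical config for the NER setup: Labels plus Text.
NER_CONFIG = """
<View>
  <Labels name="label" toName="text">
    <Label value="PER"/>
    <Label value="ORG"/>
  </Labels>
  <Text name="text" value="$text"/>
</View>
"""


def tag_names(config):
    """Return the set of tag names used anywhere in a labeling config."""
    return {element.tag for element in ET.fromstring(config.strip()).iter()}


def data_key(config):
    """Return the task data key that the Text tag reads (value="$key")."""
    text_tag = ET.fromstring(config.strip()).find(".//Text")
    return text_tag.get("value").lstrip("$")


if __name__ == "__main__":
    print(sorted(tag_names(CLASSIFIER_CONFIG)))
    print(sorted(tag_names(NER_CONFIG)))
    print(data_key(NER_CONFIG))
```

The key returned by data_key is the one each imported task must provide under its data field.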

Training and inference

The browser opens at http://localhost:8080. Upload your data on the Import page, then annotate it from the Labeling page. Once you've annotated a sufficient amount of data, go to the Model page and press the Start Training button. When training finishes, the model automatically starts serving predictions to Label Studio, and all model checkpoints are saved in the my-ml-backend/<ml-backend-id>/ directory.
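
As a rough illustration of what can be uploaded on the Import page, the sketch below builds a JSON list of tasks in the same shape as the sample quoted in one of the issues further down (a list of objects with a "text" field). The helper names build_tasks and dump_tasks and the default key "text" are assumptions; the key must match whatever the labeling config references.

```python
import json


def build_tasks(texts, key="text"):
    """Wrap each raw string in a one-field task object."""
    return [{key: text} for text in texts]


def dump_tasks(texts, key="text"):
    """Serialize the tasks as a JSON array ready to save to a file."""
    return json.dumps(build_tasks(texts, key), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    sample = ["To have faith is to trust yourself to the water"]
    # Save this output to a .json file and upload it on the Import page.
    print(dump_tasks(sample))
```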

Read more about how to use the Machine Learning backend and build Human-in-the-Loop pipelines with Label Studio on the project website (https://labelstud.io/).

License

This software is licensed under the Apache 2.0 License. © Heartex, 2020

label-studio-transformers's People

Contributors

dependabot[bot], farioas, makseq, niklub

label-studio-transformers's Issues

KeyError: 'ner' when predicting on NER data with the NER samples

When predicting on NER data, the following error occurs. I printed the tasks and found that the data key, which should be 'ner', is '$undefined$' instead.

tasks data:

tasks:[{'id': 17, 'data': {'$undefined$': 'This work proposes a novel adaptation of a pretrained sequence-to-sequence model to the task of document ranking.'}, 'meta': {}, 'created_at': '2021-07-05T02:30:36.230799Z', 'updated_at': '2021-07-05T02:30:36.230834Z', 'is_labeled': True, 'overlap': 1, 'project': 10, 'file_upload': 6, 'annotations': [{'id': 17, 'created_username': ' [email protected], 1', 'created_ago': '0\xa0minutes', 'completed_by': 1, 'result': [{'value': {'start': 54, 'end': 74, 'text': 'sequence-to-sequence', 'labels': ['ORG']}, 'id': 'FA_HTHugoH', 'from_name': 'label', 'to_name': 'text', 'type': 'labels'}], 'was_cancelled': False, 'ground_truth': False, 'created_at': '2021-07-05T02:38:39.940285Z', 'updated_at': '2021-07-05T02:38:39.940321Z', 'lead_time': 11.491, 'task': 17}], 'predictions': []}]

error:

[2021-07-05 10:38:40,048] [ERROR] [label_studio_ml.exceptions::exception_f::53] Traceback (most recent call last):
  File "/workspace/label-studio/label-studio-ml-backend/label_studio_ml/exceptions.py", line 39, in exception_f
    return f(*args, **kwargs)
  File "/workspace/label-studio/label-studio-ml-backend/label_studio_ml/api.py", line 31, in _predict
    predictions, model = _manager.predict(tasks, project, label_config, force_reload, try_fetch, **params)
  File "/workspace/label-studio/label-studio-ml-backend/label_studio_ml/model.py", line 274, in predict
    predictions = m.model.predict(tasks, **kwargs)
  File "/workspace/label-studio/label-studio-transformers/ner-backend-test/ner.py", line 369, in predict
    texts = [task['data'][self.value] for task in tasks]
  File "/workspace/label-studio/label-studio-transformers/ner-backend-test/ner.py", line 369, in <listcomp>
    texts = [task['data'][self.value] for task in tasks]
KeyError: 'ner'
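
The traceback points at the lookup task['data'][self.value]: the model asks for the key 'ner', while the task printed above stores its text under '$undefined$'. A minimal sketch of a more defensive lookup follows. The helper extract_texts is hypothetical and not part of ner.py; it only shows how the mismatch could be reported with the keys that are actually present.

```python
def extract_texts(tasks, value_key):
    """Collect task['data'][value_key] for every task.

    Raises a KeyError that names the task and lists the keys it does have,
    which makes a config/data mismatch easier to spot than a bare KeyError.
    """
    texts = []
    for task in tasks:
        data = task["data"]
        if value_key not in data:
            raise KeyError(
                f"task {task.get('id')} has no data key {value_key!r}; "
                f"available keys: {sorted(data)}"
            )
        texts.append(data[value_key])
    return texts


if __name__ == "__main__":
    # Shape taken from the task printed in this report, shortened.
    tasks = [{"id": 17, "data": {"$undefined$": "This work proposes ..."}}]
    try:
        extract_texts(tasks, "ner")
    except KeyError as error:
        print(error)
```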

Ner.py pretrained_config_archive_map not found for any model

During initialization:
label-studio-ml init smdia-backend-ner --script models/ner.py --force

I receive this error for all the models:
AttributeError: type object 'BertConfig' has no attribute 'pretrained_config_archive_map'


Traceback (most recent call last):
  File "/usr/local/bin/label-studio-ml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/label_studio_ml/server.py", line 119, in main
    create_dir(args)
  File "/usr/local/lib/python3.6/dist-packages/label_studio_ml/server.py", line 73, in create_dir
    model_classes = get_all_classes_inherited_LabelStudioMLBase(script_path)
  File "/usr/local/lib/python3.6/dist-packages/label_studio_ml/utils.py", line 29, in get_all_classes_inherited_LabelStudioMLBase
    module = importlib.import_module(module_name)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/labelstudio/label-studio-transformers/models/ner.py", line 36, in <module>
    [list(conf.pretrained_config_archive_map.keys()) for conf in (BertConfig,CamembertConfig, RobertaConfig, DistilBertConfig)],
  File "/labelstudio/label-studio-transformers/models/ner.py", line 36, in <listcomp>
    [list(conf.pretrained_config_archive_map.keys()) for conf in (BertConfig,CamembertConfig, RobertaConfig, DistilBertConfig)],
AttributeError: type object 'BertConfig' has no attribute 'pretrained_config_archive_map'

I tried to downgrade transformers to 2.0.0, but then the transformers import fails.

Could someone check this issue?
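
The traceback shows models/ner.py reading pretrained_config_archive_map from each config class at import time, and the installed transformers release evidently does not define that attribute. One possible workaround, sketched here under the assumption that the list is only used to enumerate known model names, is to tolerate the missing attribute. The helper archive_map_keys and the two stand-in classes are hypothetical and are used so the example runs without transformers installed.

```python
def archive_map_keys(config_classes):
    """Collect model names from each class's pretrained_config_archive_map.

    Classes without the attribute contribute nothing instead of raising
    AttributeError.
    """
    names = []
    for conf in config_classes:
        archive_map = getattr(conf, "pretrained_config_archive_map", None) or {}
        names.extend(archive_map.keys())
    return names


class OldStyleConfig:
    """Stand-in for a config class that still carries the attribute."""
    pretrained_config_archive_map = {"bert-base-uncased": "placeholder-url"}


class NewStyleConfig:
    """Stand-in for a config class without the attribute."""


if __name__ == "__main__":
    print(archive_map_keys([OldStyleConfig, NewStyleConfig]))
```

With this approach the import no longer fails, but the list of names may be empty, so any code that validates a model name against it would need a separate fallback.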

Can't make predictions: ML backend returns an error (ner.py)

Steps to reproduce:

  1. Started the server with Docker: docker-compose up --build
  2. Imported a sample with three tasks: [{"text":"To have faith is to trust yourself to the water"},{"text":"To have faith is to trust yourself to the water"},{"text":"To have faith is to trust yourself to the water"}]
  3. Completed two tasks and trained the Hugging Face transformer from ner.py.
  4. Went to the UI to get a prediction for the third task.
  5. No prediction was shown.

Requirements:
torch==1.5.0
transformers==2.4.1
tensorboardX==1.9
label-studio>=0.7.0

Full logs are here:

[2020-08-31 15:07:49,882] [ERROR] [label_studio.utils.models::make_predictions::528] Can't make predictions: ML backend returns an error: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>500 Internal Server Error</title>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.</p>

Can you help me please with this issue?

--ml-backend-url option changed; where should I put --ml-backend-name?

The current label-studio start command in the docker-compose.yml contains these options.
https://github.com/heartexlabs/label-studio-transformers/blob/9450322/docker-compose.yml#L15-L16

      --ml-backend-url http://label-studio-ml-backend:9090
      --ml-backend-name my_model

However, the current label-studio doesn't have these options.

I ran label-studio start -h and found that the option changed to --ml-backend.
After fixing this, I could see localhost:8200.

But where should I put the model name?

not showing predictions after training

Describe the bug
There are several problems:

  1. After training the ML backend model, I cannot find the model predictions in the UI when labeling.
  2. Training often cannot complete all 100 epochs; the system crashes partway through, at 30-70 epochs, although the dataset is small (50) and a GPU is available.
  3. The error shown is: get latest job results from work dir doesn't exist
  4. Sometimes, when problem 3 does not occur, a different error appears: unable to load weight from pytorch checkpoint file.

To reproduce
Steps to reproduce the behaviour

  1. Import pre annotated data
  2. Manually label some of them
  3. Go to ML UI in Setting, connect model (BERT classifier) and start training
  4. After finishing, come back to Label UI. In prediction tab, only the pre annotated predictions are shown.

Expected behaviour
ML training should be completed and new predictions should be shown in UI

No module named label_studio_ml.api while starting ML backend

Was able to create ML backends successfully based on bert_classifier.py with:
"label-studio-ml init my-ml-backend-bert --script models/bert_classifier.py"

but while starting it with the command "label-studio-ml start my-ml-backend-bert" I'm getting the following error:

  File "././my-ml-backend-bert/_wsgi.py", line 30, in
    from label_studio_ml.api import init_app
ImportError: No module named label_studio_ml.api

Also tried with other classifiers from this source
"https://github.com/heartexlabs/label-studio-ml-backend/tree/master/label_studio_ml/examples"
but each of them gives me the same error while starting.
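
An error like this usually means the interpreter that runs _wsgi.py cannot see the label_studio_ml package. The unquoted message format also resembles Python 2's ImportError, which may indicate that a different interpreter than expected is in use. The sketch below is a generic check to run with the same interpreter that starts the backend. The helper module_available is hypothetical, and the script needs Python 3.6 or later.

```python
import importlib.util
import sys


def module_available(name):
    """Return True if the dotted module name can be located for import."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of the dotted name is missing.
        return False


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    for name in ("label_studio_ml", "label_studio_ml.api"):
        print(name, "->", "found" if module_available(name) else "missing")
```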

label-studio requirement incorrect, also getting old

The README example does not work as is -- label-studio==1.0.0 does not provide the command label-studio-ml, and does not expose LabelStudioMLBase.

It works OK with label-studio==0.7, but that's not what's specified in requirements.txt.

(NB: it also doesn't work with the current head of label-studio-ml-backend.)

Error with ner.py

When using the quick start for BERT NER:
label-studio-ml init my-ml-backend --script models/ner.py

This error occurs:
AttributeError: type object 'BertConfig' has no attribute 'pretrained_config_archive_map'
