
lightning-text-classification's Introduction

Minimalist Implementation of a BERT Sentence Classifier

This repo is a minimalist implementation of a BERT Sentence Classifier. Its goal is to show how to combine three of my favourite libraries to supercharge your NLP research.

My favourite libraries:

  • PyTorch Lightning
  • Transformers (Hugging Face)
  • PyTorch-NLP
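
At its core, the classifier is a PyTorch Lightning module wrapping a Hugging Face encoder. A minimal sketch of that combination (illustrative only, not the repo's exact classifier.py):

import torch
import pytorch_lightning as pl
from transformers import AutoModel

class BERTClassifier(pl.LightningModule):
    """Sketch: a pretrained BERT encoder plus a linear classification head."""

    def __init__(self, encoder_model="google/bert_uncased_L-2_H-128_A-2", n_classes=2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(encoder_model)
        self.head = torch.nn.Linear(self.bert.config.hidden_size, n_classes)

    def forward(self, tokens, mask):
        # [0] is the sequence of hidden states; classify on the [CLS] position.
        word_embeddings = self.bert(tokens, attention_mask=mask)[0]
        return self.head(word_embeddings[:, 0])

    def training_step(self, batch, batch_idx):
        tokens, mask, labels = batch
        return torch.nn.functional.cross_entropy(self(tokens, mask), labels)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=3e-4)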

Requirements:

This project uses Python 3.6.

Create a virtual env with (outside the project folder):

virtualenv -p python3.6 sbert-env
source sbert-env/bin/activate

Install the requirements (inside the project folder):

pip install -r requirements.txt

Getting Started:

Train:

python training.py

Available commands:

Training arguments:

optional arguments:
  --seed                      Training seed.
  --batch_size                Batch size to be used.
  --accumulate_grad_batches   Accumulated gradients runs K small batches of \
                              size N before doing a backwards pass.
  --val_percent_check         If you don't want to use the entire dev set, set \
                              how much of the dev set you want to use with this flag.
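
For reference, --accumulate_grad_batches multiplies the effective batch size without increasing memory: with --batch_size 8 and --accumulate_grad_batches 4, the optimizer steps on an effective batch of 8 * 4 = 32 while only 8 examples sit on the GPU at once. A minimal sketch of how such a flag maps onto the Lightning Trainer (wiring assumed, not the repo's exact code):

from pytorch_lightning import Trainer

# Backward runs on every small batch, but optimizer.step() only fires every
# 4 batches, so gradients from 4 batches are accumulated before each update.
trainer = Trainer(accumulate_grad_batches=4)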

Early Stopping/Checkpoint arguments:

optional arguments:
  --metric_mode             Whether to 'min' or 'max' the monitored quantity.
  --min_epochs              Limits training to a minimum number of epochs.
  --max_epochs              Limits training to a maximum number of epochs.
  --save_top_k              The best k models according to the quantity \
                            monitored will be saved.
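
These flags typically feed Lightning's EarlyStopping and ModelCheckpoint callbacks. A minimal sketch of that wiring (assumed, not the repo's exact code; the monitored metric name "val_loss" is illustrative):

from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint

metric_mode = "min"  # --metric_mode: minimise the monitored quantity
early_stop = EarlyStopping(monitor="val_loss", mode=metric_mode)
checkpoint = ModelCheckpoint(monitor="val_loss", mode=metric_mode, save_top_k=1)  # --save_top_k

trainer = Trainer(
    min_epochs=1,   # --min_epochs
    max_epochs=10,  # --max_epochs
    callbacks=[early_stop, checkpoint],
)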

Model arguments:

optional arguments:
  --encoder_model             BERT encoder model to be used.
  --encoder_learning_rate     Encoder specific learning rate.
  --nr_frozen_epochs          Number of epochs we will keep the BERT parameters frozen.
  --learning_rate             Classification head learning rate.
  --dropout                   Dropout to be applied to the BERT embeddings.
  --train_csv                 Path to the file containing the train data.
  --dev_csv                   Path to the file containing the dev data.
  --test_csv                  Path to the file containing the test data.
  --loader_workers            How many subprocesses to use for data loading.
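
Together, --nr_frozen_epochs and the two learning rates amount to freezing BERT at the start of training and giving the encoder and the classification head separate optimizer parameter groups. An illustrative sketch (assumed, not the repo's exact code; the head and class count are hypothetical):

import torch
from transformers import AutoModel

encoder = AutoModel.from_pretrained("google/bert_uncased_L-2_H-128_A-2")
head = torch.nn.Linear(encoder.config.hidden_size, 2)  # 2 classes, assumed

# During the first --nr_frozen_epochs epochs, encoder weights get no gradients.
for p in encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam([
    {"params": encoder.parameters(), "lr": 1e-5},  # --encoder_learning_rate
    {"params": head.parameters(), "lr": 3e-4},     # --learning_rate
])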

Note: After BERT, several BERT-like models were released. You can test different-sized models such as Mini-BERT and DistilBERT, which are much smaller.

  • Mini-BERT contains only 2 encoder layers with a hidden size of 128. Use it with the flag: --encoder_model google/bert_uncased_L-2_H-128_A-2
  • DistilBERT contains only 6 layers with a hidden size of 768. Use it with the flag: --encoder_model distilbert-base-uncased
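
To compare model sizes yourself, a quick parameter count (downloads each checkpoint on first run):

from transformers import AutoModel

for name in ("google/bert_uncased_L-2_H-128_A-2", "distilbert-base-uncased"):
    model = AutoModel.from_pretrained(name)
    print(name, sum(p.numel() for p in model.parameters()))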

Training command example:

python training.py \
    --gpus 0 \
    --batch_size 32 \
    --accumulate_grad_batches 1 \
    --loader_workers 8 \
    --nr_frozen_epochs 1 \
    --encoder_model google/bert_uncased_L-2_H-128_A-2 \
    --train_csv data/MP2_2022_train.csv \
    --dev_csv data/MP2_2022_dev.csv

Testing the model:

python test.py --experiment experiments/version_{date} --test_data data/MP2_2022_dev.csv

Tensorboard:

Launch tensorboard with:

tensorboard --logdir="experiments/"

Code Style:

To make sure all the code follows the same style, we use Black.
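
With Black installed (pip install black), the whole project can be formatted in place with:

black .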

lightning-text-classification's Issues

CUDA out of memory

I'm getting CUDA out of memory errors even with --batch_size 1, but the error is only raised after the first epoch. Any idea/advice on how to solve such an issue?
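
One plausible cause (an assumption, not confirmed in the thread): if --nr_frozen_epochs is non-zero, the encoder is unfrozen once those epochs are done, and gradient plus optimizer-state memory for every BERT weight only appears from that point on, so usage can jump between epochs even at --batch_size 1. A hypothetical illustration:

from transformers import AutoModel

encoder = AutoModel.from_pretrained("bert-base-uncased")

def trainable(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

for p in encoder.parameters():  # during the frozen epochs
    p.requires_grad = False
print(trainable(encoder))  # 0 -- no gradient memory for the encoder

for p in encoder.parameters():  # after --nr_frozen_epochs
    p.requires_grad = True
print(trainable(encoder))  # ~110M parameters now need gradient memory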

Multi-GPU half-precision training enhancement request

Currently, an error is thrown when using multiple GPUs and 16-bit precision.

To set this up, the following hparams are added to the trainer:

    parser.add_argument("--gpus", type=str, default="0,1,2", help="Which gpus")
    parser.add_argument("--precision", default=16, type=int)

    trainer = Trainer(
        ...
        precision=hparams.precision,
    )

Error trace:

  File "/path_to/lightning-text-classification/classifier.py", line 183, in forward
    word_embeddings = self.bert(tokens, mask)[0]
  File "/home/me/.virtualenvs/sbert-env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/me/.virtualenvs/sbert-env/lib/python3.8/site-packages/transformers/modeling_bert.py", line 824, in forward
    embedding_output = self.embeddings(
  File "/home/me/.virtualenvs/sbert-env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/me/.virtualenvs/sbert-env/lib/python3.8/site-packages/transformers/modeling_bert.py", line 207, in forward
    inputs_embeds = self.word_embeddings(input_ids)
  File "/home/me/.virtualenvs/sbert-env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/me/.virtualenvs/sbert-env/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 124, in forward
    return F.embedding(
  File "/home/me/.virtualenvs/sbert-env/lib/python3.8/site-packages/torch/nn/functional.py", line 1814, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: arguments are located on different GPUs at /pytorch/aten/src/THC/generic/THCTensorIndex.cu:403

Checking this at the call word_embeddings = self.bert(tokens, mask)[0], both tokens and mask are on the same GPU, so it appears to be an issue with the embeddings that I haven't been able to isolate exactly yet.

Running with multiple GPUs and 32-bit precision works fine, as does a single GPU with 16-bit precision. The error occurs with both dp and ddp distributed modes.

Error during training: Can't pickle local object

Hi,

I really like how you combined PyTorch Lightning with the Transformers library. I tried to upgrade all dependencies to their latest versions. There were some smaller issues, like renamed keyword arguments in PyTorch Lightning, and so on. Eventually, I ran into the following error message.

Error:
AttributeError: Can't pickle local object 'LayerSummary._register_hook.<locals>.hook'
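
For context, this class of error comes from trying to pickle a function defined inside another function, which Python's pickle module cannot do; Lightning's LayerSummary registers exactly such a local hook. A minimal, self-contained reproduction (illustrative, unrelated to the repo's own code):

import pickle

def register():
    def hook():  # a local object, like LayerSummary._register_hook.<locals>.hook
        pass
    return hook

pickle.dumps(register())
# AttributeError: Can't pickle local object 'register.<locals>.hook'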

Environment:
python = "^3.7"
torch = "^1.5.1"
torchvision = "^0.6.1"
transformers = "^2.11.0"
pytorch-lightning = "^0.8.1"
pytorch-nlp = "^0.5.0"
test-tube = "^0.7.5"
pandas = "^1.0.5"
sklearn = "^0.0"

Do you have any idea what the error might be? Is it a Lightning problem or a transformers problem? Any way to fix it?

Best
Dominique
