hasanhuz / spanemo

SpanEmo

License: Other

Python 100.00%

multi-label-classification emotion-recognition span-prediction emotion-analysis eacl2021 natural-language-processing english arabic spanish emotion-detection

spanemo's Introduction

SpanEmo

Source code for the paper "SpanEmo: Casting Multi-label Emotion Classification as Span-prediction" in EACL2021.

Dependencies

We used Python=3.6, torch=1.2.0. Other packages can be installed via:

pip install -r requirements.txt

The model was trained on an Nvidia GeForce GTX1080 with 11GB memory, Ubuntu 18.10.

Usage

First, download the dataset for the language of your choice (English, Arabic or Spanish) and place the files in the data/ directory.

Next, run the main script, which does the following:

data loading and preprocessing
model creation and training

Training

python scripts/train.py --train-path {} --dev-path {}

Options:
    -h --help                         show this screen
    --loss-type=<str>                 which loss to use cross-ent|corr|joint. [default: cross-entropy]
    --max-length=<int>                text length [default: 128]
    --output-dropout=<float>          prob of dropout applied to the output layer [default: 0.1]
    --seed=<int>                      fixed random seed number [default: 42]
    --train-batch-size=<int>          batch size [default: 32]
    --eval-batch-size=<int>           batch size [default: 32]
    --max-epoch=<int>                 max epoch [default: 20]
    --ffn-lr=<float>                  ffn learning rate [default: 0.001]
    --bert-lr=<float>                 bert learning rate [default: 2e-5]
    --lang=<str>                      language choice [default: English]
    --dev-path=<str>                  file path of the dev set [default: '']
    --train-path=<str>                file path of the train set [default: '']
    --alpha-loss=<float>              weight used to balance the loss [default: 0.2]
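
The --loss-type and --alpha-loss options suggest that the joint loss blends two terms. The following is a minimal pure-Python sketch, not the repository's implementation. It assumes the joint loss is a weighted sum of binary cross-entropy and a label-correlation term that compares the scores of negative and positive labels. The function names and the use of probabilities rather than logits are assumptions made for illustration.

```python
import math

def bce(probs, targets):
    """Mean binary cross-entropy over the labels of one example."""
    eps = 1e-12  # guards against log(0)
    losses = [
        -(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps))
        for p, t in zip(probs, targets)
    ]
    return sum(losses) / len(losses)

def corr_loss(probs, targets):
    """Penalise negative labels that score close to or above positive ones."""
    pos = [p for p, t in zip(probs, targets) if t == 1]
    neg = [p for p, t in zip(probs, targets) if t == 0]
    if not pos or not neg:
        return 0.0  # no positive/negative pairs to compare
    pairs = [math.exp(n - p) for n in neg for p in pos]
    return sum(pairs) / len(pairs)

def joint_loss(probs, targets, alpha=0.2):
    """Assumed form: (1 - alpha) * cross-entropy + alpha * correlation term."""
    return (1 - alpha) * bce(probs, targets) + alpha * corr_loss(probs, targets)

# Hypothetical scores for three emotion labels of a single tweet.
probs = [0.9, 0.2, 0.6]
targets = [1, 0, 1]
print(joint_loss(probs, targets, alpha=0.2))
```

Under this assumed form, alpha = 0 reduces to plain cross-entropy and alpha = 1 uses only the correlation term, so the default of 0.2 would keep cross-entropy dominant.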

Once the above step is done, you can then evaluate on the test set using the trained model:

Evaluation

python scripts/test.py --test-path {} --model-path {}

Options:
    -h --help                         show this screen
    --model-path=<str>                path of the trained model
    --max-length=<int>                text length [default: 128]
    --seed=<int>                      seed [default: 0]
    --test-batch-size=<int>           batch size [default: 32]
    --lang=<str>                      language choice [default: English]
    --test-path=<str>                 file path of the test set [default: ]

Citation

Please cite the following paper if you find it useful. Thanks :)

@inproceedings{alhuzali-ananiadou-2021-spanemo,
    title = "{S}pan{E}mo: Casting Multi-label Emotion Classification as Span-prediction",
    author = "Alhuzali, Hassan  and
      Ananiadou, Sophia",
    booktitle = "Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
    month = apr,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.eacl-main.135",
    pages = "1573--1584",
}

spanemo's Issues

Dataset

Can you please list which portions of the dataset you are using, such as EI-reg, EI-oc, etc.? I was not able to fully determine how you collated the dataset based on your README and your paper.

Thanks!

I tried to run the code after downloading the dataset, but I got the error "TypeError: linear(): argument 'input' (position 1) must be Tensor, not str". I checked the code and found two lines that may be wrong.

I changed them to

output = self.bert(input_ids=input_ids)
return output.last_hidden_state

but I still got the same error, and I don't know why.
Here is my colab code
I would appreciate it if you could help me again!
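
One possible cause of a "must be Tensor, not str" error, offered as an assumption rather than a confirmed diagnosis: a dict-like model output is tuple-unpacked somewhere in the code. Unpacking a dict yields its keys, which are strings. The sketch below uses a plain OrderedDict as a stand-in for the model output, so it runs without any deep-learning library. The field names are illustrative.

```python
from collections import OrderedDict

# Stand-in for a dict-like model output; a real output would hold tensors,
# plain lists are used here so the example runs anywhere.
output = OrderedDict(
    last_hidden_state=[[0.1, 0.2]],
    pooler_output=[0.3],
)

# Tuple-unpacking a dict iterates over its keys, so both names receive strings.
hidden, pooled = output
print(type(hidden).__name__, hidden)

# Accessing the field by name returns the stored value instead.
hidden_ok = output["last_hidden_state"]
print(type(hidden_ok).__name__)
```

If this is the cause, every place that unpacks the encoder output would need the same change, which might explain why editing only two lines did not remove the error.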

Problem with test.py

Hey @hasanhuz!
I am a student and want to use your state-of-the-art model for a school project to analyze emotions in tweets and customer reviews.
Whenever I run the train.py file on Google Colab, I receive the following error:

Traceback (most recent call last):
  File "/content/SpanEmo_MLEC/scripts/test.py", line 442, in <module>
    args = docopt(__doc__)
  File "/usr/local/lib/python3.7/dist-packages/docopt.py", line 558, in docopt
    DocoptExit.usage = printable_usage(doc)
  File "/usr/local/lib/python3.7/dist-packages/docopt.py", line 466, in printable_usage
    usage_split = re.split(r'([Uu][Ss][Aa][Gg][Ee]:)', doc)
  File "/usr/lib/python3.7/re.py", line 215, in split
    return _compile(pattern, flags).split(string, maxsplit)
TypeError: expected string or bytes-like object

I am not very comfortable with Python and don't know what's wrong.
Preprocessing and model creation went well so far.
Appreciate your help a lot!
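
The traceback ends in re.split receiving something that is not a string, which is consistent with __doc__ being None when docopt(__doc__) is called. That would be the case if the script has no module docstring as its first statement. This is an assumption about the modified file, not something the report confirms. The sketch below reproduces the failure with the same regular expression; the helper name is hypothetical.

```python
import re

def split_usage(doc):
    # Same pattern as the re.split call shown in the traceback.
    return re.split(r'([Uu][Ss][Aa][Gg][Ee]:)', doc)

# A module without a docstring has __doc__ set to None.
try:
    split_usage(None)
except TypeError as exc:
    error = str(exc)
    print("TypeError:", error)

# With a usage string in place, the split succeeds.
parts = split_usage("Usage:\n    test.py --test-path=<str>")
print(parts[1])
```

Checking that the usage text is still the very first statement of the script, before any imports or inserted code, is a quick way to rule this out.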

questions on overfitting issue and co-existing emotions percentage calculation

Hello,

Thank you for sharing your work on multi-label emotion recognition. I successfully ran the code and got some results, using the same parameters as described in the paper (with loss-type set to joint). However, after four epochs the model seems to overfit heavily.

I was wondering whether you observed this behaviour and what could be causing it. If you didn't, please let me know if I'm doing something wrong.

Apart from the above, I also have a question about what the co-existing percentage refers to in the paper and how it was calculated.

I'd greatly appreciate some clarifications.

Thank you!

Type error during test

When running test.py file, I got error below:

TypeError: default_collate: batch must contain tensors, numpy arrays, numbers, dicts or lists; found object

any idea on this?
Thanks!

Cannot reproduce results, could you please provide the pretrained weights?

Hi! I am trying to reproduce the results of the paper, and the best results I can get on the dev set during training are:
Val_loss  F1-Ma.  F1-Mi   JS
0.3648    0.4966  0.6917  0.5657

These values are reached at the second epoch. I'm using the joint loss as the paper suggests, with alpha = 0.2. Is there anything else I should do?

Could you please provide the pretrained weights?

Thanks in advance.

AttributeError: 'NoneType' object has no attribute 'update'

Salam Hassan

I was trying to run your code for SpanEmo paper. Interesting work by the way. However, I am not sure why when I try to run it on an Arabic dataset, it gives me the following errors and warnings.

AttributeError: 'NoneType' object has no attribute 'update'

Details are below

It looks like it is something related to fastprogress, but I tried every single possibility including upgrading and downgrading some libraries

Any help is appreciated

Thanks

usr/local/lib/python3.9/site-packages/google/colab/data_table.py:30: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.
from IPython.utils import traitlets as _traitlets
Currently using GPU: cuda:0
/usr/local/lib/python3.9/site-packages/ekphrasis/classes/tokenizer.py:225: FutureWarning: Possible nested set at position 2190
self.tok = re.compile(r"({})".format("|".join(pipeline)))
Reading twitter_2018 - 1grams ...
Reading twitter_2018 - 2grams ...
/usr/local/lib/python3.9/site-packages/ekphrasis/classes/exmanager.py:14: FutureWarning: Possible nested set at position 42
regexes = {k.lower(): re.compile(self.expressions[k]) for k, v in
Reading twitter_2018 - 1grams ...
PreProcessing dataset ...: 0% 0/178 [00:00<?, ?it/s]/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:2323: FutureWarning: The pad_to_max_length argument is deprecated and will be removed in a future version, use padding=True or padding='longest' to pad to the longest sequence in the batch, or use padding='max_length' to pad to a max length. In this case, you can give a specific length with max_length (e.g. max_length=45) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
warnings.warn(
PreProcessing dataset ...: 100% 178/178 [00:01<00:00, 122.84it/s]
The number of training batches: 6
Reading twitter_2018 - 1grams ...
Reading twitter_2018 - 2grams ...
Reading twitter_2018 - 1grams ...
PreProcessing dataset ...: 0% 0/178 [00:00<?, ?it/s]/usr/local/lib/python3.9/site-packages/transformers/tokenization_utils_base.py:2323: FutureWarning: The pad_to_max_length argument is deprecated and will be removed in a future version, use padding=True or padding='longest' to pad to the longest sequence in the batch, or use padding='max_length' to pad to a max length. In this case, you can give a specific length with max_length (e.g. max_length=45) or leave max_length to None to pad to the maximal input size of the model (e.g. 512 for Bert).
warnings.warn(
PreProcessing dataset ...: 100% 178/178 [00:01<00:00, 89.23it/s]
The number of validation batches: 6
Some weights of the model checkpoint at asafaya/bert-base-arabic were not used when initializing BertModel: ['cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight']

This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
/usr/local/lib/python3.9/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
warnings.warn(
ahmed aleroud
<IPython.core.display.HTML object>
Traceback (most recent call last):
File "/content/SpanEmo/scripts/train.py", line 79, in <module>
learn.fit(
File "/content/SpanEmo/scripts/learner.py", line 91, in fit
for step, batch in enumerate(progress_bar(self.train_data_loader, parent=pbar)):
File "/usr/local/lib/python3.9/site-packages/fastprogress/fastprogress.py", line 39, in __iter__
if self.total != 0: self.update(0)
File "/usr/local/lib/python3.9/site-packages/fastprogress/fastprogress.py", line 56, in update
self.update_bar(0)
File "/usr/local/lib/python3.9/site-packages/fastprogress/fastprogress.py", line 76, in update_bar
else: self.on_update(val, f'{100 * val/self.total:.2f}% [{val}/{self.total} {elapsed_t}<{remaining_t}{end}]')
File "/usr/local/lib/python3.9/site-packages/fastprogress/fastprogress.py", line 126, in on_update
elif self.parent is not None: self.parent.show()
File "/usr/local/lib/python3.9/site-packages/fastprogress/fastprogress.py", line 168, in show
self.out.update(HTML(self.html_code))
AttributeError: 'NoneType' object has no attribute 'update'

Question about evaluation on SemEval2018

Hi!

I've trained on English SemEval2018 train and dev splits with default configurations, but haven't been able to get similar test results on SemEval2018 test split (E-c-En-test-gold.txt specifically).

For average of three models with the same default setting, I got
F1-Micro 0.6953, F1-Macro 0.53, JS 0.567
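
For readers comparing these numbers, here is a minimal sketch of how the three metrics can be computed for multi-label predictions. It assumes JS is the Jaccard index averaged over examples; the function names are hypothetical and this is not the repository's evaluation code.

```python
def f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def multilabel_scores(y_true, y_pred):
    n_labels = len(y_true[0])
    # Per-label counts of true positives, false positives, false negatives.
    counts = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    # Macro: average the per-label F1. Micro: pool the counts first.
    macro = sum(f1(*c) for c in counts) / n_labels
    micro = f1(sum(c[0] for c in counts),
               sum(c[1] for c in counts),
               sum(c[2] for c in counts))
    # Jaccard per example: |intersection| / |union| of the label sets.
    jaccards = []
    for t, p in zip(y_true, y_pred):
        inter = sum(1 for a, b in zip(t, p) if a == 1 and b == 1)
        union = sum(1 for a, b in zip(t, p) if a == 1 or b == 1)
        # Both sets empty counts as a perfect match (a choice made in this sketch).
        jaccards.append(inter / union if union else 1.0)
    js = sum(jaccards) / len(jaccards)
    return {"f1_micro": micro, "f1_macro": macro, "js": js}

# Two examples, three labels (toy data).
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1]]
scores = multilabel_scores(y_true, y_pred)
print(scores)
```

Macro F1 weights every label equally, so it drops sharply when rare emotions are predicted poorly. That may be why the macro figure shows the largest gap here.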

The results reported in your paper are higher, especially F1-Macro, where the difference is around 4%. I wanted to reach out and ask whether there is anything beyond the default settings that I should be aware of to reproduce the results.

Thank you so much.

Spanish test accuracy pretty low (> 10%)

Hello authors,
I am trying to reproduce these results on the Spanish dataset, using all default parameters. I understand that accuracy should be lower without the joint loss than with it, but the difference is around 10 to 20 percent. I checked for overfitting and it does not seem to be an issue; in addition, models are saved only at the best validation loss. Would you recommend changing any of the default parameters?

Here are results on the test set:
F1-Micro: 0.4789 (expected 0.654)
F1-Macro: 0.3418 (expected 0.534)
JS: 0.3665 (expected 0.481)

Thanks for your time!

Thanks in advance.

I want to ask a few questions.

The paper is great. There is just one thing I don't fully understand.

$$H_i = \mathrm{Encoder}([CLS] + |C| + [SEP] + s_i)$$

Does |C| here refer to an emotion category that has already been chosen? Is C fixed?
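
One reading of the formula, sketched below as an assumption rather than the authors' confirmed answer: |C| stands for the full list of emotion class names, which is the same for every input, and it is prepended to each sentence s_i. The helper name, the three-emotion subset, and the one-token-per-class simplification are all illustrative.

```python
def build_input(class_names, sentence_tokens):
    """Concatenate [CLS], the class names, [SEP], and the sentence tokens."""
    tokens = ["[CLS]"] + list(class_names) + ["[SEP]"] + list(sentence_tokens)
    # Positions of the class names, assuming each name is a single token.
    label_positions = list(range(1, 1 + len(class_names)))
    return tokens, label_positions

# Illustrative subset of emotion classes, identical for every sentence.
EMOTIONS = ["anger", "joy", "sadness"]

tokens, positions = build_input(EMOTIONS, ["i", "love", "this"])
print(tokens)
print(positions)
```

Under this reading, the encoder sees all class names alongside the sentence, and the hidden states at the class positions can then be scored to decide which emotions apply.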
