emotion_dataset's Introduction

Emotion Dataset

This is a dataset that can be used for emotion classification. It has already been preprocessed based on the approach described in our paper. It is also stored as a pandas dataframe and ready to be used in an NLP pipeline.

Note that the version of the data provided here is a six-emotion variant intended for educational and research purposes.

Download

Hugging Face: https://huggingface.co/datasets/emotion

Download link: https://www.icloud.com/iclouddrive/084E9TMZ_lykn3QhU-kIX1DDQ#merged_training

Papers with Code Public Leaderboard: https://paperswithcode.com/sota/text-classification-on-emotion

Load the Dataset Using Pandas

import pandas as pd

df = pd.read_pickle("merged_training.pkl")
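
Once the data is loaded, a quick look at the class balance is usually a useful first step before training. The sketch below is not part of the original repository: `label_distribution` is a hypothetical helper, the toy labels stand in for the real label column, and the column name `emotions` mentioned in the comment is an assumption (check `df.columns` to confirm what the pickle actually contains).

```python
from collections import Counter
from typing import Dict, Iterable


def label_distribution(labels: Iterable[str]) -> Dict[str, float]:
    """Return each label's share of the total, most frequent first."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.most_common()}


# Toy labels standing in for the label column of the loaded DataFrame.
# With the real data this might be called as label_distribution(df["emotions"]),
# where the column name "emotions" is an assumption, not confirmed by this README.
toy_labels = ["joy", "sadness", "joy", "anger", "joy", "fear", "sadness", "love"]
shares = label_distribution(toy_labels)

for label, share in shares.items():
    print(f"{label}: {share:.3f}")
```

The helper accepts any iterable of labels, so a pandas Series can be passed in directly without converting it first.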

Notebooks

Here is a notebook showing how to use it for fine-tuning a pretrained language model for the task of emotion classification.

Here is another notebook which shows how to fine-tune a T5 model for emotion classification along with other tasks.

Here is also a hosted fine-tuned model on Hugging Face which you can use directly for inference in your NLP pipeline.

Feel free to reach out to me on Twitter for more questions about the dataset.

Usage

The dataset should be used for educational and research purposes only. If you use it, please cite:

@inproceedings{saravia-etal-2018-carer,
    title = "{CARER}: Contextualized Affect Representations for Emotion Recognition",
    author = "Saravia, Elvis  and
      Liu, Hsien-Chi Toby  and
      Huang, Yen-Hao  and
      Wu, Junlin  and
      Chen, Yi-Shin",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    month = oct # "-" # nov,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D18-1404",
    doi = "10.18653/v1/D18-1404",
    pages = "3687--3697",
    abstract = "Emotions are expressed in nuanced ways, which varies by collective or individual experiences, knowledge, and beliefs. Therefore, to understand emotion, as conveyed through text, a robust mechanism capable of capturing and modeling different linguistic nuances and phenomena is needed. We propose a semi-supervised, graph-based algorithm to produce rich structural descriptors which serve as the building blocks for constructing contextualized affect representations from text. The pattern-based representations are further enriched with word embeddings and evaluated through several emotion recognition tasks. Our experimental results demonstrate that the proposed method outperforms state-of-the-art techniques on emotion recognition tasks.",
}

emotion_dataset's People

Contributors

mario-rc, omarsar, patil-suraj

emotion_dataset's Issues

tweet id

How can we get the tweet IDs of the tweets in the dataset so that we can retrieve their information through the Twitter API?

How did you use LIWC to measure emotion?

Could you please provide details of how you used LIWC 2007 to detect 8 emotions (anger, anticipation, disgust, fear, joy, sadness, surprise, trust) as described in the paper?

Here are the text features I got from LIWC 2007:

['text', 'labels', 'Segment', 'WC', 'WPS', 'Sixltr', 'Dic', 'funct',
'pronoun', 'ppron', 'i', 'we', 'you', 'shehe', 'they', 'ipron',
'article', 'verb', 'auxverb', 'past', 'present', 'future', 'adverb',
'preps', 'conj', 'negate', 'quant', 'number', 'swear', 'social',
'family', 'friend', 'humans', 'affect', 'posemo', 'negemo', 'anx',
'anger', 'sad', 'cogmech', 'insight', 'cause', 'discrep', 'tentat',
'certain', 'inhib', 'incl', 'excl', 'percept', 'see', 'hear', 'feel',
'bio', 'body', 'health', 'sexual', 'ingest', 'relativ', 'motion',
'space', 'time', 'work', 'achieve', 'leisure', 'home', 'money', 'relig',
'death', 'assent', 'nonfl', 'filler', 'AllPunc', 'Period', 'Comma',
'Colon', 'SemiC', 'QMark', 'Exclam', 'Dash', 'Quote', 'Apostro',
'Parenth', 'OtherP']

I really don't know how you used LIWC to detect some of the emotions, such as anticipation, disgust, surprise, and trust.
Thank you!
