
amby602 / sarcasm-detection-using-nn


This project forked from rishabhmisra/sarcasm-detection-using-nn


This is the PyTorch implementation of the work presented in 'Modelling Context with User Embeddings for Sarcasm Detection in Social Media' (https://arxiv.org/pdf/1607.00976.pdf). We further extend the approach by proposing a hybrid NN architecture and performing experiments on a newly collected dataset.

Shell 0.70% Python 94.59% Jupyter Notebook 4.71%

sarcasm-detection-using-nn's Introduction

Sarcasm-Detection-using-HNN

This is the PyTorch implementation of the work presented in 'Modelling Context with User Embeddings for Sarcasm Detection in Social Media' (https://arxiv.org/pdf/1607.00976.pdf). The neural network takes a tweet (content) and the corresponding user embedding (context) as input, and classifies the tweet as sarcastic or non-sarcastic. We further provide an implementation of our improved framework proposed in 'Sarcasm Detection using Hybrid Neural Network' (https://arxiv.org/abs/1908.07414).

System requirements

  • Python 2.7
  • PyTorch 0.3.1
  • python package gensim
  • python package yandex.translate
  • python package ipdb

Running the code

1. Pre-requisites

  1. Get pre-trained word embeddings (e.g. Skip-gram)

    • Download the .bin file from this link
    • Unzip the .bin.gz file and run the IPython notebook get_word2vec_embeddings.ipynb
    • Place the resulting .txt file in DATA/embeddings/ and rename it to words.txt
  2. Get pre-trained user embeddings. The embeddings we used can be found here. Place the embeddings in DATA/embeddings/ and name the file usr2vec.txt

  3. Execute the IPython notebook get_data.ipynb. This utility downloads the tweets corresponding to the tweet IDs and then preprocesses the tweet messages.
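Before training, it can help to confirm that the embedding file placed in DATA/embeddings/ is readable. The following is a minimal sketch, not part of the repository. It assumes that words.txt follows the plain word2vec text layout (an optional "<vocab_size> <dim>" header line, then one "token v1 v2 ... vd" line per token); the actual output of get_word2vec_embeddings.ipynb may differ. The function names load_text_embeddings and embedding_dimension are hypothetical, and the sketch is written for Python 3, whereas the repository itself targets Python 2.7.

```python
import io


def load_text_embeddings(path):
    """Read 'token v1 v2 ... vd' lines into a dict of token -> list of floats.

    Assumption: the file uses the word2vec text layout, optionally starting
    with a '<vocab_size> <dim>' header line.
    """
    vectors = {}
    with io.open(path, encoding="utf-8") as handle:
        for line_no, line in enumerate(handle):
            parts = line.rstrip().split(" ")
            # A two-field first line is treated as the optional header.
            # (This would misread a 1-dimensional embedding on line 0.)
            if line_no == 0 and len(parts) == 2:
                continue
            # Ignore blank or malformed lines that carry no vector.
            if len(parts) < 2:
                continue
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def embedding_dimension(vectors):
    """Return the common vector length, or raise if the lengths disagree."""
    dims = {len(vector) for vector in vectors.values()}
    if len(dims) > 1:
        raise ValueError("inconsistent embedding sizes: %s" % sorted(dims))
    return dims.pop() if dims else 0
```

Under these assumptions, calling load_text_embeddings("DATA/embeddings/words.txt") followed by embedding_dimension(...) gives a quick check that every token has a vector of the same length.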

2. Training and Evaluation

a. To run the original code

Run python train_CUE_CNN.py

b. To run the RNN + CNN hybrid model on the new dataset

Run python Headlines_RNN.py

Output, results and visualization

The code generates a progress folder that contains a sub-folder for every run. Inside every run folder, the following two files are generated:

  1. logs.txt, which contains the loss and accuracy on the train/test/validation sets after every epoch
  2. stats.jpg, which plots
    • train/test/validation loss on a single plot
    • train/test/validation accuracy on a single plot
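The output layout described above can be checked with a short script. This is a minimal sketch under stated assumptions, not code from the repository: it assumes the folder is literally named progress and sits in the working directory, and the helper name summarize_runs is hypothetical. It only checks that logs.txt and stats.jpg exist in each run folder; it does not parse logs.txt, whose exact line format is not documented here.

```python
import os

# File names taken from the description above.
EXPECTED_FILES = ("logs.txt", "stats.jpg")


def summarize_runs(progress_dir="progress"):
    """Map each run sub-folder name to the list of expected files it lacks."""
    summary = {}
    for name in sorted(os.listdir(progress_dir)):
        run_dir = os.path.join(progress_dir, name)
        # Only sub-folders count as runs; stray files are skipped.
        if not os.path.isdir(run_dir):
            continue
        missing = [
            filename
            for filename in EXPECTED_FILES
            if not os.path.isfile(os.path.join(run_dir, filename))
        ]
        summary[name] = missing
    return summary
```

A run whose entry is an empty list has both files; a run that lists stats.jpg as missing most likely stopped before the plot was written.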

Note:

Utility files, pre-trained user embeddings, and raw tweet IDs were obtained from the original CUE-CNN.

Cite

Please cite the following articles in a suitable format if you use the dataset:

Text Format:

1. Misra, Rishabh and Prahal Arora. "Sarcasm Detection using News Headlines Dataset." AI Open (2023).
2. Misra, Rishabh and Jigyasa Grover. "Sculpting Data for ML: The first act of Machine Learning." ISBN 978-0-578-83125-1 (2021).

BibTeX Format:

@article{misra2023Sarcasm,
  title = {Sarcasm Detection using News Headlines Dataset},
  journal = {AI Open},
  volume = {4},
  pages = {13-18},
  year = {2023},
  issn = {2666-6510},
  doi = {https://doi.org/10.1016/j.aiopen.2023.01.001},
  url = {https://www.sciencedirect.com/science/article/pii/S2666651023000013},
  author = {Rishabh Misra and Prahal Arora},
}

@book{misra2021sculpting,
  author = {Misra, Rishabh and Grover, Jigyasa},
  year = {2021},
  month = {01},
  pages = {},
  title = {Sculpting Data for ML: The first act of Machine Learning},
  isbn = {978-0-578-83125-1}
}

sarcasm-detection-using-nn's People

Contributors

prarora, rishabhmisra
