Efficient bottom-up learning of tree-structured neural networks for sentiment classification

We present a novel approach for efficient loss weighting in tree-structured neural networks. Current methods either consider only the top-node prediction for the loss calculation, or weight all nodes equally, yielding a strongly imbalanced class loss. Our method progresses through the tree, starting at the word level, to focus the loss on misclassified nodes. We propose three different heuristics for determining such misclassifications and investigate their effect and performance on the Stanford Sentiment Treebank with a binary Tree-LSTM model. The results show a significant improvement over previous models with respect to accuracy and overfitting. The figure below visualizes the concept.

(Figure: visualization of the bottom-up loss weighting concept.)
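The core idea, weighting the node-level loss toward misclassified nodes, can be sketched as follows. This is an illustrative simplification in PyTorch, not the paper's exact heuristics; the function name and the constant weights (0.1 / 1.0) are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def misclassification_weighted_loss(node_logits, node_labels):
    """Sketch: focus the loss on misclassified tree nodes.

    node_logits: (num_nodes, num_classes) predictions for every tree node
    node_labels: (num_nodes,) gold sentiment label per node
    """
    # Per-node cross-entropy, kept unreduced so we can reweight it.
    per_node_loss = F.cross_entropy(node_logits, node_labels, reduction="none")
    # Simplest misclassification heuristic: a node counts as "wrong" if its
    # argmax prediction disagrees with the gold label.
    misclassified = (node_logits.argmax(dim=1) != node_labels).float()
    # Misclassified nodes get full weight; correct nodes still contribute a
    # small amount so their gradients do not vanish entirely.
    weights = 0.1 + 0.9 * misclassified
    return (weights * per_node_loss).sum() / weights.sum()
```

Unlike a plain mean over all nodes, this weighting lets the many easy, already-correct inner nodes fade from the loss while hard nodes dominate.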

This paper was written in the context of the second practical assignment (Sentiment Classification with Deep Learning) for the course "Natural Language Processing 1" at the University of Amsterdam. The full paper can be found here.

Code structure

The code is structured into two Jupyter notebooks.

Mandatory_model.ipynb

This notebook summarizes all experiments and models from the mandatory assignment part (BOW, CBOW, Deep CBOW, LSTM, Tree-LSTM). It also creates all plots shown in the paper. Results for the other models are read from the text files provided alongside the notebooks.

TreeLSTMs.ipynb

This notebook contains all experiments for our proposed models. Please note that executing this notebook takes a long time, as training is set to 50,000 iterations. We therefore advise using the pretrained models (see below) for testing.

Pretrained models

Pretrained models can be found here. Please download the checkpoint folder and save it in the notebook folder. For plotting, the required test predictions/accuracies are provided in text files, which are saved in the notebook folder as well. If the checkpoint folder is not downloaded, the benchmark models need to be trained and saved first, because the plot evaluation loads these models and runs them over the test set.
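Assuming the checkpoints are standard PyTorch state dicts (the repository's actual model class and file names will differ; `TinyClassifier` below is a hypothetical stand-in), the save/load pattern behind the plot evaluation might look like:

```python
import os
import tempfile
import torch
import torch.nn as nn

# Hypothetical stand-in model; the notebooks' actual Tree-LSTM class differs.
class TinyClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)

def save_checkpoint(model, path):
    # Store only the parameters, not the full pickled module.
    torch.save({"state_dict": model.state_dict()}, path)

def load_checkpoint(model, path):
    checkpoint = torch.load(path, map_location="cpu")
    model.load_state_dict(checkpoint["state_dict"])
    model.eval()  # disable dropout etc. for test-set evaluation
    return model
```

Saving only the state dict (rather than the whole module) keeps checkpoints loadable even if the surrounding notebook code changes.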

Contact

For questions regarding the code, models, or paper, please contact [email protected].
