Giter Site home page Giter Site logo

athene_system's Introduction

2010-07-07_ukp_banner

aiphes_logo - small tud_weblogo

Introduction

The repository was developed as a part of the Fake News Challenge Stage 1 (FNC-1 http://www.fakenewschallenge.org/) by team Athene: Andreas Hanselowski, Avinesh PVS, Benjamin Schiller and Felix Caspelherr. In the project, we worked in close collaboration with Debanjan Chaudhuri.

Our new paper in COLING 2018: A Retrospective Analysis of the Fake News Challenge Stance Detection Task

Our Blog Post on the Fake News Challenge.

Prof. Dr. Iryna Gurevych, AIPHES-Ubiquitous Knowledge Processing (UKP) Lab, TU-Darmstadt, Germany

Requirements

  • Software dependencies

      python >= 3.4.0 (tested with 3.4.0)
    

Installation

  1. Install required python packages.

     python3.4 -m pip install -r requirements.txt --upgrade
    
  2. In order to reproduce the the results of our best submission to the FNC-1, please go to Athene_FNC-1 Google Drive and download the features.zip and model.zip and unzip them in respective folders.

     unzip  features.zip athene_system/data/fnc-1/features
     unzip  model.zip athene_system/data/fnc-1/mlp_models
    
  3. Parts of the Natural Language Toolkit (NLTK) might need to be installed manually.

     python3.4 -c "import nltk; nltk.download('stopwords'); nltk.download('punkt'); nltk.download('wordnet')"
    
  4. Copy Word2Vec GoogleNews-vectors-negative300.bin.gz in folder athene_system/data/embeddings/google_news/

  5. Download Paraphrase Database: Lexical XL Paraphrases 1.0 and extract it to the ppdb folder.

     gunzip ppdb-1.0-xl-lexical.gz athene_system/data/ppdb/
    
  6. To use the Stanford-parser an instance has to be started in parallel: Download Stanford CoreNLP, extract anywhere and execute following command:

     wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip
     java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9020
    

Additional notes

  • In order to reproduce the classification results of the best submission at the day of the FNC-1, it is mandatory to use tensorflow v0.9.0 (ideally GPU version) and the exact library versions stated in requirements.txt, including python 3.4.

  • Setup tested on Anaconda3 (tensorflow 0.9 gpu version)*

      conda create -n env_python3.4 python=3.4 anaconda
    
      source activate env_python3.4
    
      env_python3.4/bin/python3.4 -m pip install -r requirements.txt --upgrade
    
      env_python3.4/bin/python3.4 -m pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0rc0-cp34-cp34m-linux_x86_64.whl
    

To Run

To run the pre trained model and test

python pipeline.py -p ftest

For more details

python pipeline.py --help         
    
    e.g.: python pipeline.py -p crossv holdout ftrain ftest
    
    * crossv: runs 10-fold cross validation on train / validation set and prints the results
    * holdout: trains classifier on train and validation set, tests it on holdout set and prints the results
    * ftrain: trains classifier on train/validation/holdout set and saves it to athene_systems/data/fnc-1/mlp_models
    * ftest: predicts stances of unlabeled test set based on the model (see Installation, step 2) 

After ftest was executed, the labeled stances will be saved to disk:

cat athene_system/data/fnc-1/fnc_results/submission.csv

System description

A more detailed description of the system including the features, which have been used, can be found in the document: system_description_athene.pdf

athene_system's People

Contributors

hanselowski avatar avineshpvs avatar v1nc3nt27 avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.