athene_system's Introduction

Introduction

This repository was developed as part of the Fake News Challenge Stage 1 (FNC-1, http://www.fakenewschallenge.org/) by team Athene: Andreas Hanselowski, Avinesh PVS, Benjamin Schiller and Felix Caspelherr. We worked on the project in close collaboration with Debanjan Chaudhuri.

Our new paper in COLING 2018: A Retrospective Analysis of the Fake News Challenge Stance Detection Task

Our Blog Post on the Fake News Challenge.

Prof. Dr. Iryna Gurevych, AIPHES-Ubiquitous Knowledge Processing (UKP) Lab, TU-Darmstadt, Germany

Requirements

  • Software dependencies

      python >= 3.4.0 (tested with 3.4.0)
    

Installation

  1. Install the required Python packages.

     python3.4 -m pip install -r requirements.txt --upgrade
    
  2. To reproduce the results of our best FNC-1 submission, go to the Athene_FNC-1 Google Drive, download features.zip and model.zip, and unzip them into their respective folders.

     unzip features.zip -d athene_system/data/fnc-1/features
     unzip model.zip -d athene_system/data/fnc-1/mlp_models
    
  3. Parts of the Natural Language Toolkit (NLTK) might need to be installed manually.

     python3.4 -c "import nltk; nltk.download('stopwords'); nltk.download('punkt'); nltk.download('wordnet')"
    
  4. Copy the Word2Vec file GoogleNews-vectors-negative300.bin.gz into the folder athene_system/data/embeddings/google_news/

  5. Download the Paraphrase Database (PPDB) Lexical XL Paraphrases 1.0 and extract it into the ppdb folder.

     gunzip -c ppdb-1.0-xl-lexical.gz > athene_system/data/ppdb/ppdb-1.0-xl-lexical
    
  6. The Stanford parser needs a CoreNLP server instance running in parallel. Download Stanford CoreNLP, extract it anywhere, and start the server from the extracted directory:

     wget http://nlp.stanford.edu/software/stanford-corenlp-full-2016-10-31.zip
     unzip stanford-corenlp-full-2016-10-31.zip
     cd stanford-corenlp-full-2016-10-31
     java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9020
    
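Before running the pipeline it can help to confirm that the files from the steps above are in place and that the CoreNLP server is reachable. The following is a minimal sketch, not part of the repository: the helper names `missing_paths` and `port_is_open` are hypothetical, and the path list only mirrors the folders named in the installation steps.

```python
import socket
from pathlib import Path

# Paths named in the installation steps, relative to the repository root.
EXPECTED_PATHS = [
    "athene_system/data/fnc-1/features",
    "athene_system/data/fnc-1/mlp_models",
    "athene_system/data/embeddings/google_news/GoogleNews-vectors-negative300.bin.gz",
    "athene_system/data/ppdb",
]


def missing_paths(repo_root, expected=EXPECTED_PATHS):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in expected if not (root / p).exists()]


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for path in missing_paths("."):
        print("missing: {}".format(path))
    # 9020 is the port passed to StanfordCoreNLPServer in step 6.
    if not port_is_open("localhost", 9020):
        print("CoreNLP server not reachable on port 9020")
```

Run it from the repository root. It only reports what is missing and does not download or start anything.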

Additional notes

  • To reproduce the classification results of our best submission on the day of the FNC-1, you must use TensorFlow v0.9.0 (ideally the GPU version) and the exact library versions listed in requirements.txt, including Python 3.4.

  • Setup tested on Anaconda3 (TensorFlow 0.9, GPU version):

      conda create -n env_python3.4 python=3.4 anaconda
    
      source activate env_python3.4
    
      env_python3.4/bin/python3.4 -m pip install -r requirements.txt --upgrade
    
      env_python3.4/bin/python3.4 -m pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.9.0rc0-cp34-cp34m-linux_x86_64.whl
    
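Because the results depend on the exact interpreter and TensorFlow versions, a quick check of the active environment can catch mismatches early. This is a rough sketch under the assumption that comparing major and minor version numbers is enough. The helpers `parse_version` and `environment_problems` are hypothetical and not part of the repository.

```python
import sys


def parse_version(text):
    """Turn a version string such as '0.9.0rc0' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def environment_problems(python_version, tensorflow_version):
    """Return a list of mismatches against the tested setup (Python 3.4, TensorFlow 0.9)."""
    problems = []
    if tuple(python_version[:2]) != (3, 4):
        problems.append("Python {}.{} found, 3.4 expected".format(*python_version[:2]))
    if tensorflow_version is None:
        problems.append("TensorFlow is not installed")
    elif parse_version(tensorflow_version)[:2] != (0, 9):
        problems.append("TensorFlow {} found, 0.9.x expected".format(tensorflow_version))
    return problems


if __name__ == "__main__":
    try:
        import tensorflow
        tf_version = tensorflow.__version__
    except ImportError:
        tf_version = None
    for problem in environment_problems(sys.version_info, tf_version):
        print(problem)
```

An empty output means both versions match the tested setup. The check does not compare the other packages pinned in requirements.txt.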

To Run

To run the pre-trained model on the test set:

python pipeline.py -p ftest

For more details

python pipeline.py --help         
    
    e.g.: python pipeline.py -p crossv holdout ftrain ftest
    
    * crossv: runs 10-fold cross-validation on the train/validation set and prints the results
    * holdout: trains the classifier on the train and validation sets, tests it on the holdout set, and prints the results
    * ftrain: trains the classifier on the train/validation/holdout sets and saves it to athene_system/data/fnc-1/mlp_models
    * ftest: predicts stances for the unlabeled test set using the saved model (see Installation, step 2)
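The help text above shows that `-p` accepts several modes at once and runs them in the order given. The sketch below illustrates one way such an option could be declared with `argparse`. It is not the repository's actual pipeline.py: the long option name `--pipeline_type`, the `run` helper, and the handlers are hypothetical, and only the four mode names come from the help text.

```python
import argparse

# Mode names taken from the help text above.
MODES = ["crossv", "holdout", "ftrain", "ftest"]


def build_parser():
    """Build a parser whose -p option takes one or more mode names."""
    parser = argparse.ArgumentParser(description="Sketch of a multi-mode pipeline CLI")
    parser.add_argument(
        "-p", "--pipeline_type",  # long name is an assumption
        nargs="+",
        choices=MODES,
        required=True,
        help="one or more modes to run, in the order given",
    )
    return parser


def run(modes, handlers):
    """Call the handler for each requested mode and collect the results in order."""
    return [handlers[mode]() for mode in modes]


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["-p", "crossv", "ftest"])
# Placeholder handlers that only report which mode would run.
handlers = {mode: (lambda m=mode: "would run " + m) for mode in MODES}
for line in run(args.pipeline_type, handlers):
    print(line)
```

With `nargs="+"` and `choices`, argparse rejects unknown mode names before any training or prediction starts.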

After ftest has run, the predicted stances are saved to disk:

cat athene_system/data/fnc-1/fnc_results/submission.csv
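To get a quick overview of the predictions, the label distribution in the submission file can be counted. This is a small sketch that assumes the file has a header row with a `Stance` column, as in the FNC-1 submission format. The column name and the helper `count_stances` are assumptions, and the rows below are made up for illustration.

```python
import csv
import io
from collections import Counter

# The four FNC-1 stance labels.
STANCES = ("agree", "disagree", "discuss", "unrelated")


def count_stances(csv_file, stance_column="Stance"):
    """Count how often each stance label occurs in an open CSV file."""
    reader = csv.DictReader(csv_file)
    counts = Counter(row[stance_column] for row in reader)
    unknown = set(counts) - set(STANCES)
    if unknown:
        raise ValueError("unexpected stance labels: {}".format(sorted(unknown)))
    return counts


# Tiny in-memory example with made-up rows, standing in for submission.csv.
example = io.StringIO(
    "Headline,Body ID,Stance\n"
    "Some headline,1,unrelated\n"
    "Another headline,2,discuss\n"
    "Third headline,3,unrelated\n"
)
print(count_stances(example))
```

For the real file, pass `open("athene_system/data/fnc-1/fnc_results/submission.csv", newline="")` instead of the in-memory example.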

System description

A more detailed description of the system, including the features used, can be found in system_description_athene.pdf.

Contributors

avineshpvs, hanselowski, v1nc3nt27


athene_system's Issues

Probability of a label as output

Hello
Thank you for your code.
I'm trying to get the label probabilities out instead of the softmaxed label.
Do you have a simple method to recover them?
I admit to losing myself a little in your code.

Thank you in advance,
Enzo

submission file

Hi,
Can I have the submission file you used to get your results? I'm doing some comparisons, and it would be great if I could have it without running the system.

Thanks
