This project forked from sahitya0000/relation-classification

Relation Classification - SEMEVAL 2010 task 8 dataset

License: MIT License

Relation-Classification

Relation Classification - SEMEVAL 2010 task 8 dataset (Master's Thesis)

  • Relation classification is the task of assigning predefined relation labels to pairs of entities that occur in text.
    • Example:
    • Sentence: [People]_e1 have been moving back into [downtown]_e2
    • Relation: Entity-Destination(e1,e2) where e1 = people, e2 = downtown
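The bracket notation in the example can be read mechanically. The following is a minimal sketch, assuming sentences are written exactly in the "[text]_e1 ... [text]_e2" form shown above; the helper name extract_entities is hypothetical and is not part of the notebooks, and the corpus files themselves may mark entities differently.

```python
import re

# Matches spans written as [text]_e1 or [text]_e2, the notation used in the example above.
ENTITY_PATTERN = re.compile(r"\[([^\]]+)\]_(e[12])")


def extract_entities(tagged_sentence):
    """Return ({'e1': ..., 'e2': ...}, plain_sentence) for a bracket-tagged sentence."""
    # findall yields (text, tag) tuples because the pattern has two groups.
    entities = {tag: text for text, tag in ENTITY_PATTERN.findall(tagged_sentence)}
    # Strip the brackets and the _e1/_e2 suffix, keeping only the entity text.
    plain = ENTITY_PATTERN.sub(r"\1", tagged_sentence)
    return entities, plain


entities, plain = extract_entities("[People]_e1 have been moving back into [downtown]_e2")
print(entities)  # {'e1': 'People', 'e2': 'downtown'}
print(plain)     # People have been moving back into downtown
```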

Cite Us As

@MastersThesis{Sahitya:2018,
 author = { {Sahitya Patel} and Harish Karnick}, 
 title = {Multi-Way Classification of Relations Between Pairs of Entities}, 
 school = {Indian Institute of Technology Kanpur (IITK)}, 
 address = {India},
 year = 2018, 
 month = 6
}

Presentation

Relation-Classification-github.pdf

Dataset

Paper: SemEval-2010 Task 8: Multi-Way Classification of Semantic Relations Between Pairs of Nominals

Zip: SemEval2010_task8_all_data.zip

Files

Preprocessing of data

  • 01_create_train_test_attn
  • 02_train_val_split
  • 03_data_preprocess

Model training

  • 04_CBGRU_MEA_Model

Environment

  • Python 3.5.4 | Anaconda custom (64-bit)
  • Keras 2.1.5
  • Tensorflow 1.4.0
  • CUDA compilation tools, release 8.0, V8.0.44 (nvcc --version)
  • CuDNN 6.0.21
  • Perl

Running the model without preprocessing

  1. Get preprocessed data. Download "data_all.npy" from this-link (94.6 MB) and put it in the folder "./data/".
  2. Run "04_CBGRU_MEA_Model"

Running the model with preprocessing

01_create_train_test_attn

Description: Pre-processing of dataset files

Reads:

  • "./corpus/SemEval2010_task8_training/TRAIN_FILE.TXT"
  • "./corpus/SemEval2010_task8_testing_keys/TEST_FILE_FULL.TXT"

Creates:

  • "./files/train_attn.txt"
  • "./files/test_attn.txt"

To Do:

  1. Set the following path in "01_create_train_test_attn":
     os.environ['CLASSPATH'] = "H:/Relation-Classification/stanford/stanford-postagger-2017-06-09"
  2. Run "01_create_train_test_attn"
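Because the path above is specific to one machine, it can help to fail early when the directory is missing. This is a minimal sketch, assuming the Stanford POS tagger has been unpacked somewhere on disk; the helper name set_tagger_classpath is hypothetical and does not appear in the notebook.

```python
import os
from pathlib import Path


def set_tagger_classpath(tagger_dir):
    """Point CLASSPATH at the tagger directory, failing early if it does not exist."""
    tagger_dir = Path(tagger_dir)
    if not tagger_dir.is_dir():
        raise FileNotFoundError("Stanford POS tagger directory not found: %s" % tagger_dir)
    # Same assignment as in the notebook, with the path supplied by the caller.
    os.environ["CLASSPATH"] = str(tagger_dir)
    return os.environ["CLASSPATH"]


# Example call (the path is the one quoted above; adjust it to your machine):
# set_tagger_classpath("H:/Relation-Classification/stanford/stanford-postagger-2017-06-09")
```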

02_train_val_split

Description: Splitting of the training data into training and validation data

Reads:

  • "./files/train_attn.txt"
  • "./files/test_attn.txt"

Creates:

  • "./files/train_attn_sp.txt"
  • "./files/val_attn_sp.txt"
  • "./files/test_attn_sp.txt"

To Do:

  1. Run "02_train_val_split"
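The idea of the split can be sketched as follows. This is an illustration only: the validation fraction, the random seed, and the function name split_train_val are assumptions, since this README does not state how "02_train_val_split" chooses the validation examples.

```python
import random


def split_train_val(examples, val_fraction=0.1, seed=42):
    """Shuffle indices with a fixed seed and hold out a fraction for validation.

    val_fraction and seed are illustrative defaults, not values from the notebook.
    """
    rng = random.Random(seed)
    indices = list(range(len(examples)))
    rng.shuffle(indices)
    n_val = int(round(len(examples) * val_fraction))
    val_indices = set(indices[:n_val])
    # Preserve the original order inside each part.
    train = [ex for i, ex in enumerate(examples) if i not in val_indices]
    val = [ex for i, ex in enumerate(examples) if i in val_indices]
    return train, val


train, val = split_train_val(["example %d" % i for i in range(100)])
print(len(train), len(val))  # 90 10
```

Fixing the seed keeps the split reproducible across runs, which matters when validation scores from different runs are compared.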

03_data_preprocess

Description: Generating a single input file for the model

Creates:

  • "./data/data_all.npy"

Steps:

  1. Place "GoogleNews-vectors-negative300.bin" in "./word_embeddings" folder. (Download-Link, Website-word2vec)
  2. Run "./word_embeddings/GoogleNews-vectors-negative300_bin_to_txt.py" to create "./word_embeddings/GoogleNews-vectors-negative300.txt"
  3. Run "03_data_preprocess"
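The text file produced in step 2 uses the plain word2vec text layout: a header line with the vocabulary size and the vector dimension, followed by one line per word holding the word and its values. The following is a minimal sketch of a reader for that layout; the function name read_word2vec_text and the optional vocabulary filter are hypothetical and are not taken from "03_data_preprocess".

```python
import io


def read_word2vec_text(handle, vocabulary=None):
    """Read word2vec text-format vectors from an open text file.

    Returns (vectors, dim), where vectors maps each word to a list of floats.
    If vocabulary is given, only words in it are kept, which saves memory.
    """
    header = handle.readline().split()
    dim = int(header[1])  # header is "<vocab_size> <dim>"
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip malformed or empty lines
        word = parts[0]
        if vocabulary is not None and word not in vocabulary:
            continue
        vectors[word] = [float(value) for value in parts[1:]]
    return vectors, dim


# Tiny in-memory stand-in for the real file, which has 300-dimensional vectors.
sample = io.StringIO("2 3\npeople 0.1 0.2 0.3\ndowntown 1.0 2.0 3.0\n")
vectors, dim = read_word2vec_text(sample, vocabulary={"downtown"})
print(dim, vectors)  # 3 {'downtown': [1.0, 2.0, 3.0]}
```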

04_CBGRU_MEA_Model

Description: Model training. The best model is saved in the "./model" folder.

Steps:

  1. Run "04_CBGRU_MEA_Model"

Creates:

  • "./model/model.keras" - Model
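The four steps form a chain in which each step's output files feed the next. The following is a minimal sketch that lists which steps have not yet produced their outputs; the output paths are copied from the "Creates" lists above, while the PIPELINE table and the pending_steps helper are hypothetical and not part of the repository.

```python
from pathlib import Path

# (step name, files it creates), copied from the "Creates" lists in this README.
PIPELINE = [
    ("01_create_train_test_attn", ["files/train_attn.txt", "files/test_attn.txt"]),
    ("02_train_val_split", ["files/train_attn_sp.txt", "files/val_attn_sp.txt",
                            "files/test_attn_sp.txt"]),
    ("03_data_preprocess", ["data/data_all.npy"]),
    ("04_CBGRU_MEA_Model", ["model/model.keras"]),
]


def pending_steps(root="."):
    """Return the names of steps whose output files are not all present under root."""
    root = Path(root)
    return [name for name, outputs in PIPELINE
            if not all((root / rel).is_file() for rel in outputs)]


print(pending_steps("."))
```

A step reported as pending is not necessarily required: with a downloaded "data_all.npy" in "./data/", only "04_CBGRU_MEA_Model" needs to run.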

Model CBGRU-MEA

