lukecq1231 / nli
Enhanced LSTM for natural language inference
License: Apache License 2.0
Hello, I'm following your work and trying to reimplement ESIM in TensorFlow.
I noticed that in your lstm_layer() you mask both c and h. I'm wondering how much the mask improves the model compared with a plain, unmasked LSTM.
And how much does ortho_weight help?
Thank you so much.
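For context, here is my reading of the two pieces being asked about (a numpy sketch, not the author's exact Theano code): ortho_weight typically builds an orthogonal matrix from the SVD of a random Gaussian matrix, and the mask carries the previous c/h forward on padded timesteps so padding doesn't corrupt the state:

```python
import numpy as np

def ortho_weight(ndim, rng=np.random.RandomState(0)):
    # Orthogonal initialization: U from the SVD of a random Gaussian
    # matrix is orthonormal, which helps recurrent gradient flow.
    W = rng.randn(ndim, ndim)
    u, _, _ = np.linalg.svd(W)
    return u

def masked_step(h_new, h_prev, m):
    # Masked recurrent step: where the mask is 0 (padding), keep the
    # previous state instead of the update computed on the pad token.
    m = m[:, None]  # (batch,) -> (batch, 1), broadcast over hidden dim
    return m * h_new + (1.0 - m) * h_prev

W = ortho_weight(4)
print(np.allclose(W @ W.T, np.eye(4)))  # True: rows are orthonormal

h_prev = np.zeros((2, 3))
h_new = np.ones((2, 3))
m = np.array([1.0, 0.0])  # second sequence is padding at this step
h = masked_step(h_new, h_prev, m)
print(h)  # row 0 takes the update, row 1 keeps the old state
```

Without the mask, variable-length sequences in a batch would keep updating their states on padding tokens, which is presumably why both c and h are gated this way.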
Hello,
I want to run this code, but first I want to reduce the number of samples, for example to 1000.
What changes should I make to the code, and in which files?
Thanks for your help.
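One generic way to do this without touching the training code (a sketch; the file names below are placeholders for whatever the data-preparation script actually reads) is to truncate each split file to its first N lines before preprocessing:

```python
import os
import tempfile

def truncate_file(src, dst, n=1000):
    # Copy only the first n lines of src into dst.
    with open(src) as fin, open(dst, 'w') as fout:
        for i, line in enumerate(fin):
            if i >= n:
                break
            fout.write(line)

# Demo on a throwaway 5-line file, kept to 3 lines.
tmp = tempfile.mkdtemp()
src, dst = os.path.join(tmp, 'full.txt'), os.path.join(tmp, 'small.txt')
with open(src, 'w') as f:
    f.write(''.join('line %d\n' % i for i in range(5)))
truncate_file(src, dst, n=3)
with open(dst) as f:
    kept = f.readlines()
print(len(kept))  # 3
```

Applied to the SNLI train/dev/test files, this shrinks the dataset uniformly and leaves the rest of the pipeline unchanged.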
Hi,
Since I don't have access to a GPU, I can't run your code, but there is another implementation on GitHub that builds your model with the Keras library. Can you confirm whether the following code is correct?
"""
Implementation of ESIM(Enhanced LSTM for Natural Language Inference)
https://arxiv.org/abs/1609.06038
"""
import numpy as np
from keras.layers import *
from keras.activations import softmax
from keras.models import Model
def StaticEmbedding(embedding_matrix):
in_dim, out_dim = embedding_matrix.shape
return Embedding(in_dim, out_dim, weights=[embedding_matrix], trainable=False)
def subtract(input_1, input_2):
minus_input_2 = Lambda(lambda x: -x)(input_2)
return add([input_1, minus_input_2])
def aggregate(input_1, input_2, num_dense=300, dropout_rate=0.5):
feat1 = concatenate([GlobalAvgPool1D()(input_1), GlobalMaxPool1D()(input_1)])
feat2 = concatenate([GlobalAvgPool1D()(input_2), GlobalMaxPool1D()(input_2)])
x = concatenate([feat1, feat2])
x = BatchNormalization()(x)
x = Dense(num_dense, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(dropout_rate)(x)
x = Dense(num_dense, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(dropout_rate)(x)
return x
def align(input_1, input_2):
attention = Dot(axes=-1)([input_1, input_2])
w_att_1 = Lambda(lambda x: softmax(x, axis=1))(attention)
w_att_2 = Permute((2,1))(Lambda(lambda x: softmax(x, axis=2))(attention))
in1_aligned = Dot(axes=1)([w_att_1, input_1])
in2_aligned = Dot(axes=1)([w_att_2, input_2])
return in1_aligned, in2_aligned
def build_model(embedding_matrix, num_class=1, max_length=30, lstm_dim=300):
q1 = Input(shape=(max_length,))
q2 = Input(shape=(max_length,))
# Embedding
embedding = StaticEmbedding(embedding_matrix)
q1_embed = BatchNormalization(axis=2)(embedding(q1))
q2_embed = BatchNormalization(axis=2)(embedding(q2))
# Encoding
encode = Bidirectional(LSTM(lstm_dim, return_sequences=True))
q1_encoded = encode(q1_embed)
q2_encoded = encode(q2_embed)
# Alignment
q1_aligned, q2_aligned = align(q1_encoded, q2_encoded)
# Compare
q1_combined = concatenate([q1_encoded, q2_aligned, subtract(q1_encoded, q2_aligned), multiply([q1_encoded, q2_aligned])])
q2_combined = concatenate([q2_encoded, q1_aligned, subtract(q2_encoded, q1_aligned), multiply([q2_encoded, q1_aligned])])
compare = Bidirectional(LSTM(lstm_dim, return_sequences=True))
q1_compare = compare(q1_combined)
q2_compare = compare(q2_combined)
# Aggregate
x = aggregate(q1_compare, q2_compare)
x = Dense(num_class, activation='sigmoid')(x)
return Model(inputs=[q1, q2], outputs=x)
GitHub link: https://gist.github.com/namakemono/b74547e82ef9307da9c29057c650cdf1
Hi,
I am checking the implementation and couldn't find the parts related to the tree-LSTM. Are you planning to release that part too?
Thanks.
Is there a way to replace the GloVe embeddings in the ESIM stack with BERT or other dynamic embeddings?
Hello, I am interested in this model and would like to know its training time for one epoch on the SNLI dataset. Also, how many epochs does it need to converge?
Hello!
I have Python 2 and Theano 0.8.2, and I want to run the project in Google Colab.
I encounter the following error:
Theano does not recognise this flag: CUDA_DIR
warnings.warn('Theano does not recognise this flag: {0}'.format(key))
When I set device=cuda0, I see this error instead:
ERROR (theano.sandbox.gpuarray): pygpu was configured but could not be imported
So I ran the commands below:
!wget -c https://repo.continuum.io/archive/Anaconda2-5.1.0-Linux-x86_64.sh
!chmod +x Anaconda2-5.1.0-Linux-x86_64.sh
!bash ./Anaconda2-5.1.0-Linux-x86_64.sh -b -f -p /usr/local
!conda install theano pygpu
but I still get the following error:
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
....
Could you please check whether the project runs on Google Colab? I want to run the Kim and ESIM projects there, and both give the same errors.
Please help me.
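Not a Colab-specific fix, but for reference: CUDA_DIR is not a Theano flag (hence the first warning); in the Theano config the CUDA root is cuda.root. And on Theano 0.8.x, device=cuda0 selects the new gpuarray backend, which requires pygpu, while device=gpu uses the older bundled CUDA backend. A typical flag setup (paths here are assumptions, adjust to the machine):

```shell
# Assumption: Theano 0.8.x with the old CUDA backend; device=cuda*
# would instead require a working libgpuarray/pygpu install.
export THEANO_FLAGS="device=gpu,floatX=float32,cuda.root=/usr/local/cuda"
python train.py  # train.py is a placeholder for the repo's entry script
```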
Hi, I am getting a "NaN detected" error while just running your scripts with minor adaptations.
Training runs fine until Epoch 5, Update 91000 ...
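Not the author's fix, but a common way to localize this kind of late-training blowup is to lower the learning rate or tighten gradient clipping, and to guard updates so a single non-finite gradient doesn't poison the parameters. A numpy-level sketch of such a guard (my own, not from these scripts):

```python
import numpy as np

def guarded_update(param, grad, lr=0.001):
    # Skip the update (and flag it) if the gradient went non-finite,
    # instead of propagating NaN into the parameters.
    if not np.all(np.isfinite(grad)):
        return param, False
    return param - lr * grad, True

p, ok = guarded_update(np.ones(3), np.array([0.1, np.nan, 0.2]))
print(ok)  # False: NaN detected, parameters unchanged
print(p)   # [1. 1. 1.]

p2, ok2 = guarded_update(np.ones(3), np.full(3, 0.1), lr=1.0)
print(p2)  # [0.9 0.9 0.9]
```

Logging which parameter first goes NaN usually narrows the problem to one layer (often the softmax/log or a batch-norm statistic).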