lukecq1231 / nli
Enhanced LSTM for natural language inference
License: Apache License 2.0
Hello, I'm following your work and trying to reimplement ESIM in TensorFlow.
I noticed that in your lstm_layer() you mask both c and h. I'm wondering how much the mask improves the model compared with a plain, unmasked LSTM.
And how much does ortho_weight help?
Thank you so much.
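For context, here is my reading of the two pieces being asked about (a numpy sketch, not the author's exact Theano code): ortho_weight typically builds an orthogonal matrix from the SVD of a random Gaussian matrix, and the mask carries the previous c/h forward on padded timesteps so padding doesn't corrupt the state:

```python
import numpy as np

def ortho_weight(ndim, rng=np.random.RandomState(0)):
    # Orthogonal initialization: U from the SVD of a random Gaussian
    # matrix is orthonormal, which helps recurrent gradient flow.
    W = rng.randn(ndim, ndim)
    u, _, _ = np.linalg.svd(W)
    return u

def masked_step(h_new, h_prev, m):
    # Masked recurrent step: where the mask is 0 (padding), keep the
    # previous state instead of the update computed on the pad token.
    m = m[:, None]  # (batch,) -> (batch, 1), broadcast over hidden dim
    return m * h_new + (1.0 - m) * h_prev

W = ortho_weight(4)
print(np.allclose(W @ W.T, np.eye(4)))  # True: rows are orthonormal

h_prev = np.zeros((2, 3))
h_new = np.ones((2, 3))
m = np.array([1.0, 0.0])  # second sequence is padding at this step
h = masked_step(h_new, h_prev, m)
print(h)  # row 0 takes the update, row 1 keeps the old state
```

Without the mask, variable-length sequences in a batch would keep updating their states on padding tokens, which is presumably why both c and h are gated this way.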
Hello,
I want to run this code, but first I want to reduce the number of samples, for example to 1000.
What changes should I make to the code, and in which files?
Thanks for your help.
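One generic way to do this without touching the training code (a sketch; the file names below are placeholders for whatever the data-preparation script actually reads) is to truncate each split file to its first N lines before preprocessing:

```python
import os
import tempfile

def truncate_file(src, dst, n=1000):
    # Copy only the first n lines of src into dst.
    with open(src) as fin, open(dst, 'w') as fout:
        for i, line in enumerate(fin):
            if i >= n:
                break
            fout.write(line)

# Demo on a throwaway 5-line file, kept to 3 lines.
tmp = tempfile.mkdtemp()
src, dst = os.path.join(tmp, 'full.txt'), os.path.join(tmp, 'small.txt')
with open(src, 'w') as f:
    f.write(''.join('line %d\n' % i for i in range(5)))
truncate_file(src, dst, n=3)
with open(dst) as f:
    kept = f.readlines()
print(len(kept))  # 3
```

Applied to the SNLI train/dev/test files, this shrinks the dataset uniformly and leaves the rest of the pipeline unchanged.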
Hi,
Since I don't have access to a GPU, I can't run your code, but there is another implementation on GitHub that builds your model with the Keras library. Can you confirm whether the following code is correct?
"""
Implementation of ESIM(Enhanced LSTM for Natural Language Inference)
https://arxiv.org/abs/1609.06038
"""
import numpy as np
from keras.layers import *
from keras.activations import softmax
from keras.models import Model
def StaticEmbedding(embedding_matrix):
in_dim, out_dim = embedding_matrix.shape
return Embedding(in_dim, out_dim, weights=[embedding_matrix], trainable=False)
def subtract(input_1, input_2):
minus_input_2 = Lambda(lambda x: -x)(input_2)
return add([input_1, minus_input_2])
def aggregate(input_1, input_2, num_dense=300, dropout_rate=0.5):
feat1 = concatenate([GlobalAvgPool1D()(input_1), GlobalMaxPool1D()(input_1)])
feat2 = concatenate([GlobalAvgPool1D()(input_2), GlobalMaxPool1D()(input_2)])
x = concatenate([feat1, feat2])
x = BatchNormalization()(x)
x = Dense(num_dense, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(dropout_rate)(x)
x = Dense(num_dense, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(dropout_rate)(x)
return x
def align(input_1, input_2):
attention = Dot(axes=-1)([input_1, input_2])
w_att_1 = Lambda(lambda x: softmax(x, axis=1))(attention)
w_att_2 = Permute((2,1))(Lambda(lambda x: softmax(x, axis=2))(attention))
in1_aligned = Dot(axes=1)([w_att_1, input_1])
in2_aligned = Dot(axes=1)([w_att_2, input_2])
return in1_aligned, in2_aligned
def build_model(embedding_matrix, num_class=1, max_length=30, lstm_dim=300):
q1 = Input(shape=(max_length,))
q2 = Input(shape=(max_length,))
# Embedding
embedding = StaticEmbedding(embedding_matrix)
q1_embed = BatchNormalization(axis=2)(embedding(q1))
q2_embed = BatchNormalization(axis=2)(embedding(q2))
# Encoding
encode = Bidirectional(LSTM(lstm_dim, return_sequences=True))
q1_encoded = encode(q1_embed)
q2_encoded = encode(q2_embed)
# Alignment
q1_aligned, q2_aligned = align(q1_encoded, q2_encoded)
# Compare
q1_combined = concatenate([q1_encoded, q2_aligned, subtract(q1_encoded, q2_aligned), multiply([q1_encoded, q2_aligned])])
q2_combined = concatenate([q2_encoded, q1_aligned, subtract(q2_encoded, q1_aligned), multiply([q2_encoded, q1_aligned])])
compare = Bidirectional(LSTM(lstm_dim, return_sequences=True))
q1_compare = compare(q1_combined)
q2_compare = compare(q2_combined)
# Aggregate
x = aggregate(q1_compare, q2_compare)
x = Dense(num_class, activation='sigmoid')(x)
return Model(inputs=[q1, q2], outputs=x)
GitHub link: https://gist.github.com/namakemono/b74547e82ef9307da9c29057c650cdf1
Hi,
I am checking the implementation and couldn't find the parts related to the tree-LSTM. Are you planning to release that part too?
Thanks.
Is there a way to replace the GloVe embeddings in the ESIM stack with BERT or other dynamic embeddings?
Hello, I am interested in this model and would like to know its training time for one epoch on the SNLI dataset. Also, how many epochs does it need to converge?
Hello!
I have Python 2 and Theano 0.8.2, and I want to run the project in Google Colab.
I encounter the following error:
Theano does not recognise this flag: CUDA_DIR
warnings.warn('Theano does not recognise this flag: {0}'.format(key))
When I set device=cuda0, I see this error instead:
ERROR (theano.sandbox.gpuarray): pygpu was configured but could not be imported
So I ran the commands below:
!wget -c https://repo.continuum.io/archive/Anaconda2-5.1.0-Linux-x86_64.sh
!chmod +x Anaconda2-5.1.0-Linux-x86_64.sh
!bash ./Anaconda2-5.1.0-Linux-x86_64.sh -b -f -p /usr/local
!conda install theano pygpu
but I still get the following error:
ERROR (theano.gpuarray): Could not initialize pygpu, support disabled
....
Could you please check whether the project runs on Google Colab? I want to run the Kim and ESIM projects there, and both give the same errors.
Please help me.
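Not a Colab-specific fix, but for reference: CUDA_DIR is not a Theano flag (hence the first warning); in the Theano config the CUDA root is cuda.root. And on Theano 0.8.x, device=cuda0 selects the new gpuarray backend, which requires pygpu, while device=gpu uses the older bundled CUDA backend. A typical flag setup (paths here are assumptions, adjust to the machine):

```shell
# Assumption: Theano 0.8.x with the old CUDA backend; device=cuda*
# would instead require a working libgpuarray/pygpu install.
export THEANO_FLAGS="device=gpu,floatX=float32,cuda.root=/usr/local/cuda"
python train.py  # train.py is a placeholder for the repo's entry script
```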
Hi, I am getting a "NaN detected" error while just running your scripts with minor adaptations.
Training runs fine until Epoch 5, Update 91000 ...
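Not the author's fix, but a common way to localize this kind of late-training blowup is to lower the learning rate or tighten gradient clipping, and to guard updates so a single non-finite gradient doesn't poison the parameters. A numpy-level sketch of such a guard (my own, not from these scripts):

```python
import numpy as np

def guarded_update(param, grad, lr=0.001):
    # Skip the update (and flag it) if the gradient went non-finite,
    # instead of propagating NaN into the parameters.
    if not np.all(np.isfinite(grad)):
        return param, False
    return param - lr * grad, True

p, ok = guarded_update(np.ones(3), np.array([0.1, np.nan, 0.2]))
print(ok)  # False: NaN detected, parameters unchanged
print(p)   # [1. 1. 1.]

p2, ok2 = guarded_update(np.ones(3), np.full(3, 0.1), lr=1.0)
print(p2)  # [0.9 0.9 0.9]
```

Logging which parameter first goes NaN usually narrows the problem to one layer (often the softmax/log or a batch-norm statistic).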