
keras-elmo's Introduction

ELMo Embeddings with TensorFlow Hub

This notebook presents a brief demonstration of how to integrate ELMo embeddings from TensorFlow Hub into a custom Keras layer that can be dropped directly into a Keras or TensorFlow model.

A similar process can be used with other TF Hub models to easily integrate state-of-the-art pre-trained models into your own workflows.

See the accompanying blog post for a more detailed description.
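
As a taste of the approach, here is a minimal sketch (assuming TF 1.x, Keras 2.x, and the public ELMo module on TF Hub; the layer-based variant appears in the issues below):

import tensorflow as tf
import tensorflow_hub as hub
from keras import layers
from keras.models import Model

def ELMoEmbedding(x):
    # Load the hub module and return the 1024-d pooled sentence embedding.
    elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=False)
    return elmo(tf.squeeze(tf.cast(x, tf.string), axis=1),
                signature="default", as_dict=True)["default"]

input_text = layers.Input(shape=(1,), dtype="string")
embedding = layers.Lambda(ELMoEmbedding, output_shape=(1024,))(input_text)
pred = layers.Dense(1, activation="sigmoid")(embedding)
model = Model(inputs=[input_text], outputs=pred)
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])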

keras-elmo's People

Contributors

jacobzweig


keras-elmo's Issues

Is there a way to save this trained model?

I used model.save(path), but there appear to be a lot of issues. Could you let me know a good way of saving it? Thanks!

P.S. You can replicate these issues by adding model.save('temp.h5') at the end, although you will probably want to limit the number of training/testing sentences to, say, 50, to avoid hours of training time.
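
One commonly suggested workaround (an assumption on my part, not confirmed in this thread) is to skip full-model serialization, save only the weights, and rebuild the architecture in code, since the hub module is fetched again when the layer is rebuilt:

# Save only the weights; 'elmo_model_weights.h5' is a placeholder filename.
model.save_weights('elmo_model_weights.h5')

# Later: recreate the graph with the same code, then restore the weights.
# build_model is a hypothetical function that rebuilds the architecture.
new_model = build_model()
new_model.load_weights('elmo_model_weights.h5')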

Load Model NoneType Error

Hi,

When I try to load the model, I always get a NoneType error.
I assume this means that the model is not loading properly.
I have tried saving the model architecture and the weights both separately and together.

The only difference between my code and this one (apart from the corresponding output-shape change) is that I am using the ['elmo'] output instead of ['default'].

Any suggestions?

Weights are not trainable.

The weights need to be registered as trainable weights for Keras.

Solution:

import tensorflow_hub as hub
from keras import backend as K
from keras.engine import Layer

class ElmoEmbeddingLayer(Layer):
    def __init__(self, **kwargs):
        self.dimensions = 1024
        super(ElmoEmbeddingLayer, self).__init__(trainable=True, **kwargs)

    def build(self, input_shape):
        self.elmo = hub.Module('https://tfhub.dev/google/elmo/2', trainable=self.trainable,
                               name="{}_module".format(self.name))

        # Register the hub module's variables so Keras actually updates them.
        self.trainable_weights += K.tf.trainable_variables(scope="^{}_module/.*".format(self.name))
        super(ElmoEmbeddingLayer, self).build(input_shape)

    def call(self, x, mask=None):
        # Sequence length = index of the first '--PAD--' token in each row.
        lengths = K.cast(K.argmax(K.cast(K.equal(x, '--PAD--'), 'uint8')), 'int32')
        # The 'tokens' signature takes pre-tokenized input plus per-row lengths;
        # 'elmo' is the per-token (batch, seq_len, 1024) output.
        result = self.elmo(inputs=dict(tokens=x, sequence_len=lengths),
                           as_dict=True,
                           signature='tokens')['elmo']
        return result

    def compute_mask(self, inputs, mask=None):
        # Expose a padding mask to downstream mask-aware layers.
        return K.not_equal(inputs, '--PAD--')

    def compute_output_shape(self, input_shape):
        return input_shape + (self.dimensions,)
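
For context, here is a minimal usage sketch under the same assumptions (TF 1.x; pre-tokenized input, right-padded with '--PAD--'; the fixed length of 4 is arbitrary):

import numpy as np
from keras import layers
from keras.models import Model

# Each element is a single token string; rows are right-padded with '--PAD--'.
tokens = np.array([['the', 'cat', 'sat', '--PAD--'],
                   ['dogs', 'bark', '--PAD--', '--PAD--']], dtype=object)

input_tokens = layers.Input(shape=(4,), dtype='string')
embedded = ElmoEmbeddingLayer()(input_tokens)  # (batch, 4, 1024) per-token vectors
model = Model(inputs=input_tokens, outputs=embedded)
# Note: the hub module's variables/tables may need initializing in the active
# session before the first call (e.g. via K.get_session()).
print(model.predict(tokens).shape)  # expected: (2, 4, 1024)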

Loading hdf5 file instead of hub

Hey,
Is there an equivalent way to load a pretrained ELMo from local files (hdf5 and options.json) instead of loading it from TensorFlow Hub?

What I want to replace is

def build(self, input_shape):
    self.elmo = hub.Module('https://tfhub.dev/google/elmo/2', trainable=self.trainable,
                           name="{}_module".format(self.name))
    self.trainable_weights += K.tf.trainable_variables(scope="^{}_module/.*".format(self.name))
    super(ElmoEmbeddingLayer, self).build(input_shape)

with, for example, something like this:

def build(options, hdf5):
    elmo = allennlp.modules.elmo.Elmo(ELMo_options, ELMo_hdf5File, 2, dropout=0)
    return elmo
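
For reference, AllenNLP's Elmo is a PyTorch module, so it is not a drop-in replacement for this Keras layer, but it does load from exactly those local files. A hedged sketch with placeholder paths:

# PyTorch/AllenNLP, not Keras: load ELMo from local options/weights files.
from allennlp.modules.elmo import Elmo, batch_to_ids

options_file = 'elmo_options.json'  # placeholder path
weight_file = 'elmo_weights.hdf5'   # placeholder path
elmo = Elmo(options_file, weight_file, num_output_representations=2, dropout=0)

# batch_to_ids converts tokenized sentences to character ids.
character_ids = batch_to_ids([['First', 'sentence', '.'], ['Another', '.']])
embeddings = elmo(character_ids)['elmo_representations']  # list of (batch, seq_len, 1024) tensors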

Other cell types on top of Elmo layer

Hi, thanks for the code!

Similarly to the 'GRU on top of ELMo embedding layer' issue, I am having trouble changing the architecture to suit other cell types. Specifically, I'm trying to add a BiLSTM followed by a CRF, but I'm hitting issues with compute_mask.

It would be amazing if you could post a follow-up to your other tutorial addressing, in general, how to use other cell types with this.

Here's my Stack Overflow post with more details.

efficiency question

Hi, thanks for the example.

I'm wondering: does wrapping the embedding in a Lambda layer make the process less efficient? That is, during each training epoch, does the model end up re-encoding the same sentences over and over? Would it be more efficient to encode the data once up front? Or am I missing some benefit of the dynamic embedding approach implemented here, perhaps that it lets the embedding be trainable? But even then, we could do static encoding and use Keras's Embedding layer for training, no?
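
For reference, static pre-encoding is straightforward when the embeddings stay frozen. A hedged sketch (TF 1.x; sentences is a placeholder list of raw strings):

import tensorflow as tf
import tensorflow_hub as hub

# Encode the whole dataset once up front; only valid if the ELMo weights
# are not fine-tuned during training.
elmo = hub.Module('https://tfhub.dev/google/elmo/2', trainable=False)
embeddings = elmo(sentences, signature='default', as_dict=True)['default']

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    encoded = sess.run(embeddings)  # (num_sentences, 1024), reusable every epoch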

cannot load saved model

File "main.py", line 166, in
model = load_model('ElmoModel.h5', custom_objects={'ElmoEmbeddingLayer': ElmoEmbeddingLayer})
File "/Users/wei/ELMo_keras/venv/lib/python3.6/site-packages/keras/engine/saving.py", line 419, in load_model
model = _deserialize_model(f, custom_objects, compile)
File "/Users/wei/ELMo_keras/venv/lib/python3.6/site-packages/keras/engine/saving.py", line 317, in _deserialize_model
model._make_train_function()
File "/Users/wei/ELMo_keras/venv/lib/python3.6/site-packages/keras/engine/training.py", line 509, in _make_train_function
loss=self.total_loss)
File "/Users/wei/ELMo_keras/venv/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/Users/wei/ELMo_keras/venv/lib/python3.6/site-packages/keras/optimizers.py", line 478, in get_updates
grads = self.get_gradients(loss, params)
File "/Users/wei/ELMo_keras/venv/lib/python3.6/site-packages/keras/optimizers.py", line 94, in get_gradients
raise ValueError('An operation has None for gradient. '
ValueError: An operation has None for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.
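
A possible workaround (my assumption, not confirmed in this thread): the failure occurs while load_model recompiles and the optimizer requests gradients through the non-differentiable K.argmax length computation, so loading without compiling and then compiling manually may sidestep the deserialization path:

from keras.models import load_model

# Skip recompilation during deserialization, then compile fresh.
model = load_model('ElmoModel.h5',
                   custom_objects={'ElmoEmbeddingLayer': ElmoEmbeddingLayer},
                   compile=False)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])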

Does compute_mask() work?

Hi, thanks for your code!
I have a question about compute_mask():
Since the inputs are sentences instead of tokens, how does K.not_equal(inputs, '--PAD--') work?
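
For reference, the comparison only yields a meaningful mask when the input is token-level, i.e. each tensor element is a single token string. A tiny sketch of the distinction:

import numpy as np

# Token-level input: elementwise comparison gives a per-token boolean mask.
tokens = np.array([['the', 'cat', '--PAD--']], dtype=object)
mask = tokens != '--PAD--'  # [[True, True, False]]

# Sentence-level input (one whole sentence per element) would compare the
# entire string against '--PAD--', which is rarely what you want.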

Results change with different batch size

I am trying to use the code to produce sentence embeddings and have hit a strange issue: I get different embeddings when I change the batch size. With the suggested batch size of 32 (and anything up to 35), I get one set of results, which does not match what TF Hub gives without the Keras wrapper. With batch size 36 or larger, I get a different set of results, which does match TF Hub without the Keras wrapper. Please advise how to find the source of my problem.

# Function to build model
def build_model():
    input_text = layers.Input(shape=(1,), dtype="string")
    embedding = ElmoEmbeddingLayer()(input_text)
    # dense = layers.Dense(128, activation='relu')(embedding)
    # pred = layers.Dense(6, activation='softmax')(dense)

    model = Model(inputs=[input_text], outputs=embedding)

    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    model.summary()

    return model
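
One way to localize the discrepancy (a hedged diagnostic sketch; sentences is a placeholder list of raw strings): embed the same batch directly through TF Hub and diff it against the wrapper's output at each batch size.

import numpy as np
import tensorflow as tf
import tensorflow_hub as hub

elmo = hub.Module('https://tfhub.dev/google/elmo/2')
ref_op = elmo(sentences, signature='default', as_dict=True)['default']

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    reference = sess.run(ref_op)

wrapped = build_model().predict(np.array(sentences, dtype=object)[:, np.newaxis],
                                batch_size=32)
print(np.abs(reference - wrapped).max())  # a large gap implicates the wrapper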

Elmo+BiLSTM ERROR

input_text = Input(shape=(1,), dtype="string")
embedding = ElmoEmbeddingLayer()(input_text)
lstm_output = Bidirectional(LSTM(120))(embedding)

Error:

File "E:/workspaces/python/elmo/code/elmo-own.py", line 98, in build_model
  lstm_output = Bidirectional(LSTM(units=120))(embedding)
File "e:\ProgramData\Anaconda3\lib\site-packages\keras\engine\topology.py", line 528, in __call__
  self.build(input_shapes[0])
File "e:\ProgramData\Anaconda3\lib\site-packages\keras\layers\wrappers.py", line 232, in build
  self.forward_layer.build(input_shape)
File "e:\ProgramData\Anaconda3\lib\site-packages\keras\layers\recurrent.py", line 959, in build
  self.input_dim = input_shape[2]
IndexError: tuple index out of range
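
A hedged guess at the cause: if the layer returns the module's 2-D 'default' output (one pooled 1024-d vector per sentence), LSTM has no time axis to recur over. Switching the layer to the per-token 'elmo' output yields the 3-D tensor recurrent layers expect, for example:

import tensorflow as tf  # for tf.string below

# Inside ElmoEmbeddingLayer: return per-token embeddings instead of the
# pooled sentence vector, so downstream RNNs see (batch, timesteps, 1024).
def call(self, x, mask=None):
    return self.elmo(K.squeeze(K.cast(x, tf.string), axis=1),
                     as_dict=True, signature='default')['elmo']

def compute_output_shape(self, input_shape):
    return (input_shape[0], None, self.dimensions)  # variable-length time axis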

data type "string" not understood

While fitting the model I am getting the following error:
data type "string" not understood

Traceback (most recent call last):
File "/home/manish/Downloads/NER-latest/NER/Semi_Supervised/train.py", line 246, in
  batch_size=32)
File "/root/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 1042, in fit
  validation_steps=validation_steps)
File "/root/anaconda3/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 199, in fit_loop
  outs = f(ins_batch)
File "/root/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2661, in __call__
  return self._call(inputs)
File "/root/anaconda3/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 2614, in _call
  dtype=tensor.dtype.base_dtype.name))
File "/root/anaconda3/lib/python3.6/site-packages/numpy/core/numeric.py", line 492, in asarray
  return array(a, dtype, copy=False, order=order)
TypeError: data type "string" not understood
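
A hedged workaround: numpy does not recognize TensorFlow's 'string' dtype name, and this error tends to disappear when the text is fed as a plain object array of Python strings, e.g.:

import numpy as np

# texts is a placeholder list of raw sentence strings; train_y your labels.
train_x = np.array(texts, dtype=object)[:, np.newaxis]  # shape (n, 1), dtype=object
model.fit(train_x, train_y, batch_size=32)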

GRU on top of ELMo embedding layer

Hi,
I replaced my embedding layer with the ELMo embedding layer. The code looks like this:

embedding_layer = ElmoEmbeddingLayer()

# Embedded version of the inputs
encoded_left = embedding_layer(left_input)
encoded_right = embedding_layer(right_input)

# Since this is a siamese network, both sides share the same GRU
shared_gru = GRU(n_hidden, name='gru')

left_output = shared_gru(encoded_left)
right_output = shared_gru(encoded_right)

But I am running into this error: Input 0 is incompatible with layer gru: expected ndim=3, found ndim=2. The architecture worked well with the default embedding layer. Any idea what I am doing wrong?
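
Likely the same cause as the Elmo+BiLSTM issue above: the layer emits a 2-D pooled sentence vector, while GRU needs a 3-D sequence. With the per-token modification sketched under that issue, the siamese setup should line up:

# Assuming ElmoEmbeddingLayer now returns (batch, timesteps, 1024):
encoded_left = embedding_layer(left_input)   # 3-D, so GRU accepts it
encoded_right = embedding_layer(right_input)

shared_gru = GRU(n_hidden, name='gru')       # n_hidden as in the original snippet
left_output = shared_gru(encoded_left)
right_output = shared_gru(encoded_right)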
