zimmerrol / attention-is-all-you-need-keras
Implementation of the Transformer architecture described by Vaswani et al. in "Attention Is All You Need"
License: MIT License
Hello, I've been trying to train this implementation, but the results are very poor.
While the loss decreases during training and the accuracy is excellent, I am not able to get meaningful results at inference.
I have tried various datasets and parsing methods; these results are with the original model, parser and dataset.
Thank you for your work.
I was just trying to create a model using this code:
training_model, inference_model = create_model(source_vocabulary_size=98,target_vocabulary_size=98,max_length=20, share_word_embedding=False,n=6, h=8, d_k=64, d_v=64, d_model=512, optimizer="adam", null_token_value=98)
model.summary()
I get the following error:
File "aisayn.py", line 5, in <module>
n=6, h=8, d_k=64, d_v=64, d_model=512, optimizer="adam", null_token_value=98)
File "/home/hebagamal/keras-retinanet/modelcl.py", line 172, in create_model
n=n, h=h, d_k=d_k, d_v=d_v,d_model=d_model, optimizer=optimizer, null_token_value=null_token_value)
File "/home/hebagamal/keras-retinanet/modelcl.py", line 145, in build_transformer
enc_output = enc(enc_input)
File "/home/hebagamal/keras-retinanet/modelcl.py", line 88, in call
x = layer(x)
File "/home/hebagamal/keras-retinanet/modelcl.py", line 33, in call
x = self._ln_a(y)
File "/usr/lib64/python3.6/site-packages/keras/engine/topology.py", line 592, in call
self.build(input_shapes[0])
File "/usr/lib/python3.6/site-packages/kulc/layer_normalization.py", line 45, in build
trainable=True
File "/usr/lib64/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/lib64/python3.6/site-packages/keras/engine/topology.py", line 413, in add_weight
weight = K.variable(initializer(shape),
File "/usr/lib64/python3.6/site-packages/keras/initializers.py", line 46, in call
return K.constant(1, shape=shape, dtype=dtype)
File "/usr/lib64/python3.6/site-packages/keras/backend/tensorflow_backend.py", line 423, in constant
return tf.constant(value, dtype=dtype, shape=shape, name=name)
File "/usr/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 208, in constant
value, dtype=dtype, shape=shape, verify_shape=verify_shape))
File "/usr/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 380, in make_tensor_proto
if shape is not None and np.prod(shape, dtype=np.int64) == 0:
File "/usr/lib64/python3.6/site-packages/numpy/core/fromnumeric.py", line 2585, in prod
initial=initial)
File "/usr/lib64/python3.6/site-packages/numpy/core/fromnumeric.py", line 83, in _wrapreduction
return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'
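The last frames of the traceback show np.prod being called on the weight shape, and the message is the one Python raises for int(None). That suggests, though the traceback alone does not prove it, that the shape passed to add_weight contained a None dimension. Below is a minimal plain-Python sketch of that failure mode and a check for it. The helper names unknown_dims and element_count and the example shapes are hypothetical and are not part of Keras, kulc or this repository.

```python
def unknown_dims(shape):
    """Return the indices of dimensions that are None (unknown size)."""
    return [i for i, dim in enumerate(shape) if dim is None]


def element_count(shape):
    """Multiply all dimensions, converting each one to int first.

    This imitates an integer product over a shape: a None dimension
    fails in int(), with the same kind of message as in the traceback.
    """
    total = 1
    for dim in shape:
        total *= int(dim)  # int(None) raises TypeError
    return total


# Hypothetical shapes for illustration only; they are not taken from the report.
known_shape = (512,)
unknown_shape = (None,)

print(element_count(known_shape))   # product of a fully known shape
print(unknown_dims(unknown_shape))  # positions of the None dimensions

try:
    element_count(unknown_shape)
except TypeError as exc:
    print("TypeError:", exc)
```

If a check like unknown_dims reports a None in the shape handed to the layer's build step, the place to look is whatever determines the last dimension of the tensor fed into the layer normalization, rather than NumPy or TensorFlow.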
Hi,
Thanks for sharing your code. Can I use this code for text-to-text tasks rather than translation?