mesolitica / nlp-models-tensorflow

Gathers machine learning and Tensorflow deep learning models for NLP problems, 1.13 < Tensorflow < 2.0

License: MIT License

Jupyter Notebook 96.96% Python 3.04%

nlp machine-learning deep-learning lstm attention lstm-seq2seq-tf neural-machine-translation optical-character-recognition dnc-seq2seq pos-tagging

nlp-models-tensorflow's Issues

Explain the time taken column

Hi, could you document a bit more what the time taken column means?

Tensorflow 1.1 not compatible with CUDA 9 or 10

ImportError Traceback (most recent call last)
~/anaconda3/envs/research3.5/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py in ()
40 sys.setdlopenflags(_default_dlopen_flags | ctypes.RTLD_GLOBAL)
---> 41 from tensorflow.python.pywrap_tensorflow_internal import *
42 from tensorflow.python.pywrap_tensorflow_internal import __version__

~/anaconda3/envs/research3.5/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in ()
27 return _mod
---> 28 _pywrap_tensorflow_internal = swig_import_helper()
29 del swig_import_helper

~/anaconda3/envs/research3.5/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in swig_import_helper()
23 try:
---> 24 _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
25 finally:

~/anaconda3/envs/research3.5/lib/python3.5/imp.py in load_module(name, file, filename, details)
242 else:
--> 243 return load_dynamic(name, filename, file)
244 elif type_ == PKG_DIRECTORY:

~/anaconda3/envs/research3.5/lib/python3.5/imp.py in load_dynamic(name, path, file)
342 name=name, loader=loader, origin=path)
--> 343 return _load(spec)
344

ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

ImportError Traceback (most recent call last)
in ()
1 import json
2 import numpy as np
----> 3 import tensorflow as tf
4 import collections
5 from sklearn.cross_validation import train_test_split

~/anaconda3/envs/research3.5/lib/python3.5/site-packages/tensorflow/__init__.py in ()
22
23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *
25 # pylint: enable=wildcard-import
26

~/anaconda3/envs/research3.5/lib/python3.5/site-packages/tensorflow/python/__init__.py in ()
49 import numpy as np
50
---> 51 from tensorflow.python import pywrap_tensorflow
52
53 # Protocol buffers

~/anaconda3/envs/research3.5/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py in ()
50 for some common reasons and solutions. Include the entire stack trace
51 above this error message when asking for help.""" % traceback.format_exc()
---> 52 raise ImportError(msg)
53
54 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
File "/home/mandarin/anaconda3/envs/research3.5/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow.py", line 41, in
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/mandarin/anaconda3/envs/research3.5/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/mandarin/anaconda3/envs/research3.5/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/home/mandarin/anaconda3/envs/research3.5/lib/python3.5/imp.py", line 243, in load_module
return load_dynamic(name, filename, file)
File "/home/mandarin/anaconda3/envs/research3.5/lib/python3.5/imp.py", line 343, in load_dynamic
return _load(spec)
ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory

Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
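The last ImportError names the actual problem: the dynamic loader cannot open libcublas.so.8.0, which suggests this TensorFlow build was linked against CUDA 8.0 while only CUDA 9 or 10 is installed. A minimal sketch for checking which cuBLAS versions the loader can open, using only the standard library. The candidate file names other than libcublas.so.8.0 are assumptions, not taken from the traceback:

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


# libcublas.so.8.0 comes from the traceback; the other names are assumed
# examples of what a CUDA 9 or CUDA 10 install might provide.
candidates = ["libcublas.so.8.0", "libcublas.so.9.0", "libcublas.so.10.0"]
availability = {name: can_load(name) for name in candidates}

for name, found in availability.items():
    print(name, "found" if found else "missing")
```

If only a newer version is found, the usual options are to install a TensorFlow build that matches the installed CUDA version, or to install the CUDA version the current build expects.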

attention/1.bahdanau.ipynb imports functions from utils, but the utils file cannot be found

attention/1.bahdanau.ipynb contains the statement
from utils import *
but I could not find the Python file that implements utils.

[ASK]

I tried using your code for lemmatization. My problem is that I can't save the model, so I have to retrain whenever I predict on new data. Can you show us how to save the model? Thank you.

Could you please provide the installation guide for augmentation in Speech2text parts?

Hi,
I am trying to run the examples in 'speech-to-text', but caching.ipynb needs the augmentation module.
This does not seem to be the module installed by 'pip install augmentation', which does not include methods such as 'change_pitch_speech' and 'change_amplitude'.
Could you please provide some information about the module?
Many thanks.

bug in skip-thought.ipynb

NLP-Models-Tensorflow/unsupervised-summarization/skip-thought.ipynb

bugs:

for i in range(5):
    pbar = tqdm(range(0, len(middle), batch_size), desc='train minibatch loop')
    for p in pbar:

should be

for k in range(5):
    pbar = tqdm(range(0, len(middle), batch_size), desc='train minibatch loop')
    for i in pbar:
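The proposed rename matters if the loop body indexes batches with i (an assumption here, since the body is not quoted in the issue): with the original names, i would be the epoch counter rather than the batch offset. A minimal sketch with explicit names, using a hypothetical toy list in place of the notebook's data and leaving out tqdm:

```python
middle = list(range(10))   # hypothetical stand-in for the notebook's data
batch_size = 4
num_epochs = 2

seen_batches = []
for epoch in range(num_epochs):
    # batch_start is the offset of each minibatch; it must not reuse the epoch variable
    for batch_start in range(0, len(middle), batch_size):
        batch = middle[batch_start:batch_start + batch_size]
        seen_batches.append((epoch, batch))

print(seen_batches[:3])
```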

problem in data download in OCR

I tried to download the data from the given link:
!wget http://baidudeeplearning.bj.bcebos.com/image_contest_level_1.tar.gz
but I got this error:

--2023-02-06 09:42:01-- http://baidudeeplearning.bj.bcebos.com/image_contest_level_1.tar.gz
Resolving baidudeeplearning.bj.bcebos.com (baidudeeplearning.bj.bcebos.com)... 103.235.46.61, 2409:8c04:1001:1002:0:ff:b001:368a
Connecting to baidudeeplearning.bj.bcebos.com (baidudeeplearning.bj.bcebos.com)|103.235.46.61|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2023-02-06 09:42:03 ERROR 403: Forbidden.

How can I get this data? Please help me.

Spelling Correction- Shape must be rank 2 but is rank 3 for 'cls/predictions/MatMul' (op: 'MatMul') with input shapes: [?,?,768], [768,30522].

@huseinzol05 Can you please help me solve the below error??

Versions:
Python: 3.6.10
Tensorflow: 1.13.1
Bert: 2.2.0

Code Source: https://github.com/huseinzol05/NLP-Models-Tensorflow/blob/master/spelling-correction/3.bert-base-fast.ipynb

I am running exactly the code from the link above, but the following chunk raises the error shown below.

Code:
tf.reset_default_graph()
sess = tf.InteractiveSession()
model = Model()
sess.run(tf.global_variables_initializer())
var_lists = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope = 'bert')

Error:

InvalidArgumentError Traceback (most recent call last)
~/anaconda3/envs/projectenv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
1658 try:
-> 1659 c_op = c_api.TF_FinishOperation(op_desc)
1660 except errors.InvalidArgumentError as e:

InvalidArgumentError: Shape must be rank 2 but is rank 3 for 'cls/predictions/MatMul' (op: 'MatMul') with input shapes: [?,?,768], [768,30522].

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in
1 tf.reset_default_graph()
2 sess = tf.InteractiveSession()
----> 3 model = Model()
4
5 sess.run(tf.global_variables_initializer())

in __init__(self)
32 initializer = tf.zeros_initializer(),
33 )
---> 34 logits = tf.matmul(input_tensor, tf.transpose(embedding))
35 self.logits = tf.nn.bias_add(logits, output_bias)

~/anaconda3/envs/projectenv/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py in matmul(a, b, transpose_a, transpose_b, adjoint_a, adjoint_b, a_is_sparse, b_is_sparse, name)
2453 else:
2454 return gen_math_ops.mat_mul(
-> 2455 a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
2456
2457

~/anaconda3/envs/projectenv/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py in mat_mul(a, b, transpose_a, transpose_b, name)
5331 _, _, _op = _op_def_lib._apply_op_helper(
5332 "MatMul", a=a, b=b, transpose_a=transpose_a, transpose_b=transpose_b,
-> 5333 name=name)
5334 _result = _op.outputs[:]
5335 _inputs_flat = _op.inputs

~/anaconda3/envs/projectenv/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
786 op = g.create_op(op_type_name, inputs, output_types, name=scope,
787 input_types=input_types, attrs=attr_protos,
--> 788 op_def=op_def)
789 return output_structure, op_def.is_stateful, op
790

~/anaconda3/envs/projectenv/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py in new_func(*args, **kwargs)
505 'in a future version' if date is None else ('after %s' % date),
506 instructions)
--> 507 return func(*args, **kwargs)
508
509 doc = _add_deprecated_arg_notice_to_docstring(

~/anaconda3/envs/projectenv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in create_op(failed resolving arguments)
3298 input_types=input_types,
3299 original_op=self._default_original_op,
-> 3300 op_def=op_def)
3301 self._create_op_helper(ret, compute_device=compute_device)
3302 return ret

~/anaconda3/envs/projectenv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in __init__(self, node_def, g, inputs, output_types, control_inputs, input_types, original_op, op_def)
1821 op_def, inputs, node_def.attr)
1822 self._c_op = _create_c_op(self._graph, node_def, grouped_inputs,
-> 1823 control_input_ops)
1824
1825 # Initialize self._outputs.

~/anaconda3/envs/projectenv/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs)
1660 except errors.InvalidArgumentError as e:
1661 # Convert to ValueError for backwards compatibility.
-> 1662 raise ValueError(str(e))
1663
1664 return c_op

ValueError: Shape must be rank 2 but is rank 3 for 'cls/predictions/MatMul' (op: 'MatMul') with input shapes: [?,?,768], [768,30522].
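The error says that MatMul received a rank-3 input of shape [?, ?, 768] where it expects rank 2. One common way around this kind of mismatch is to flatten the batch and sequence axes before the multiplication and restore them afterwards. The sketch below illustrates only the shape arithmetic, with NumPy rather than TensorFlow; the small sizes and random values are hypothetical, and this is not necessarily the fix the notebook author would apply:

```python
import numpy as np

# Toy sizes; the issue has hidden = 768 and vocab = 30522.
batch_size, seq_len, hidden, vocab = 2, 5, 8, 11

input_tensor = np.random.rand(batch_size, seq_len, hidden)   # rank 3: [batch, seq, hidden]
embedding = np.random.rand(vocab, hidden)                     # rank 2: [vocab, hidden]

# Flatten to rank 2 so a plain matrix multiply applies: [batch * seq, hidden]
flat = input_tensor.reshape(-1, hidden)
flat_logits = flat @ embedding.T                              # [batch * seq, vocab]

# Restore the batch and sequence axes: [batch, seq, vocab]
logits = flat_logits.reshape(batch_size, seq_len, vocab)
print(logits.shape)
```

In TensorFlow 1.x the same pattern would wrap tf.matmul in two tf.reshape calls.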

TF version

Could you document which TensorFlow version you are using?
Thanks for the clear code.

Something wrong with the loss

Hello, I have run your 'chatbot' code on a conversation dataset, but the loss seems abnormally low. The results on the DailyDialog dataset from the paper 'DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset' show a perplexity above 30 and a loss above 3. The perplexity obtained with your code is lower than 3, which looks wrong. Could you provide some advice? Thank you.
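For reference, perplexity is the exponential of the average per-token cross-entropy (in nats), so a loss above 3 corresponds to a perplexity above about 20, and a perplexity below 3 corresponds to a loss below about 1.1. One thing worth checking, offered as an assumption rather than a diagnosis of the notebook, is whether padding positions are included in the average, since easily predicted padding tokens pull the loss down. A minimal sketch with hypothetical per-token losses:

```python
import math


def masked_perplexity(token_losses, mask):
    """Perplexity from per-token cross-entropy (nats), ignoring masked-out positions."""
    kept = [loss for loss, keep in zip(token_losses, mask) if keep]
    mean_loss = sum(kept) / len(kept)
    return math.exp(mean_loss)


# Hypothetical example: three real tokens followed by three padding positions.
token_losses = [3.2, 3.6, 3.4, 0.01, 0.01, 0.01]
real_token_mask = [1, 1, 1, 0, 0, 0]

with_padding = masked_perplexity(token_losses, [1] * len(token_losses))
without_padding = masked_perplexity(token_losses, real_token_mask)
print(round(with_padding, 2), round(without_padding, 2))
```

Averaging over all six positions gives a much lower perplexity than averaging over the three real tokens only.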

AssertionError: assert not np.isnan(cost). How can I solve this NaN error? Thanks in advance.

Could you share the image_contest_level_1/ files used in "ocr/cnn-rnn-lstm"?

embedded-util

Hello, where is the utils package in embedded?

Could you add more details for the spell correction section?

thank you

Thank you for your contribution. Can you add some necessary notes in your notebooks?
Thank you very much!

Could you code a predict function for transfer-learning-albert-base.ipynb

Pardon me if my English is not good. I ran your code to train the model, but afterwards I could not write a predict function for new data. Could you help me? Thank you so much.

1.lstm-seq2seq-greedy.ipynb In [17] missing 1 required positional argument: 'maxlen'

In [17]:
def pad_sentence_batch(sentence_batch, pad_int, maxlen):
In [18]:
batch_x, _ = pad_sentence_batch(train_X[k: min(k+batch_size,len(train_X))], PAD)
batch_y, _ = pad_sentence_batch(train_Y[k: min(k+batch_size,len(train_X))], PAD)
error:
TypeError: pad_sentence_batch() missing 1 required positional argument: 'maxlen'

Maybe:
def pad_sentence_batch(sentence_batch, pad_int):
    padded_seqs = []
    seq_lens = []
    max_sentence_len = max([len(sentence) for sentence in sentence_batch])
    for sentence in sentence_batch:
        padded_seqs.append(sentence + [pad_int] * (max_sentence_len - len(sentence)))
        seq_lens.append(len(sentence))
    return padded_seqs, seq_lens
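The two-argument version suggested in this issue can be checked on a toy batch. In the sketch below the function body follows the issue's suggestion, while the PAD value and the sample batch are hypothetical:

```python
def pad_sentence_batch(sentence_batch, pad_int):
    """Pad every sentence to the longest length in the batch; also return true lengths."""
    padded_seqs = []
    seq_lens = []
    max_sentence_len = max(len(sentence) for sentence in sentence_batch)
    for sentence in sentence_batch:
        padded_seqs.append(sentence + [pad_int] * (max_sentence_len - len(sentence)))
        seq_lens.append(len(sentence))
    return padded_seqs, seq_lens


PAD = 0  # hypothetical padding id
batch_x, lengths = pad_sentence_batch([[5, 6, 7], [8], [9, 10]], PAD)
print(batch_x)   # [[5, 6, 7], [8, 0, 0], [9, 10, 0]]
print(lengths)   # [3, 1, 2]
```

With this signature, the In [18] calls that pass only the batch and PAD no longer raise the TypeError.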

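Judging from the text classification test results, fastText seems to give the best value for cost: acc 0.76, time taken 0.49499. Why does training fastText directly achieve such good results?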

missing embed_seq()

https://github.com/huseinzol05/NLP-Models-Tensorflow/blob/master/entity-tagging/7.attention-is-all-you-need.ipynb

def learned_position_encoding(inputs, mask, embed_dim):
    T = tf.shape(inputs)[1]
    outputs = tf.range(tf.shape(inputs)[1]) # (T_q)
    outputs = tf.expand_dims(outputs, 0) # (1, T_q)
    outputs = tf.tile(outputs, [tf.shape(inputs)[0], 1]) # (N, T_q)
    outputs = embed_seq(outputs, T, embed_dim, zero_pad=False, scale=False)
    return tf.expand_dims(tf.to_float(mask), -1) * outputs

Why do you have so many chatbot notebooks?

https://github.com/huseinzol05/NLP-Models-Tensorflow/tree/master/chatbot

Do you have BERT?

embedded for data?

Hello, could you describe this data? My data looks like {a, b, c}, where each element is an article.
I am told that the similarity of a and b is greater than that of a and c (distance(a, b) > distance(a, c)). I know I should use triplet loss, but I don't know how to map my data to your positive and negative data.

Thanks in advance.
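One way to read the question above: if a is known to be more similar to b than to c, then a can serve as the anchor, b as the positive, and c as the negative. The sketch below assumes that reading, uses Euclidean distance on hypothetical toy embeddings with an arbitrary margin, and is not taken from the repository's notebooks:

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: push d(anchor, positive) below d(anchor, negative) by a margin."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


# Hypothetical embeddings for articles a, b, c, where b is the one more similar to a.
a, b, c = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
print(triplet_loss(anchor=a, positive=b, negative=c))   # 0.0, since 1 - 5 + 1 < 0
print(triplet_loss(anchor=a, positive=c, negative=b))   # 5.0, since 5 - 1 + 1 = 5
```

The loss is zero when the positive is already closer to the anchor than the negative by at least the margin, and positive when the roles are swapped.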