
just_another_seq2seq's Issues

Running python3 extract_tmx.py runs out of memory

Hi, I have 16 GB of RAM and it gets maxed out; the final write to the output file fails with a MemoryError. This is my first time using ElementTree. I'd like to extract only a subset of the tu tags, but from the API I only see the findall method. How can I do that, or is there some other way to reduce memory usage?

Thread error

python3 train.py

2019-01-23 09:16:28.072380: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
epoch 1 loss=1.902081 lr=0.001000: 92%|#######################################################################9 | 119/129 [00:47<00:04, 2.48it/s]
Exception in thread <generator object batch_flow_bucket at 0x7fd488384eb8>:
Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "../threadedgenerator.py", line 43, in _run
    for value in self._iterator:
  File "../data_utils.py", line 197, in batch_flow_bucket
    data_batch = random.sample(ind_data[choice_ind], batch_size)
  File "/usr/lib/python3.5/random.py", line 315, in sample
    raise ValueError("Sample larger than population")
ValueError: Sample larger than population

epoch 1 loss=1.866029 lr=0.001000: 100%|##############################################################################| 129/129 [00:51<00:00, 2.54it/s]
epoch 2 loss=1.381537 lr=0.001000: 16%|############2 | 20/129 [00:07<00:44, 2.44it/s]
Traceback (most recent call last):
  File "train.py", line 163, in <module>
    main()
  File "train.py", line 159, in main
    test(json.load(open('params.json')))
  File "train.py", line 74, in test
    x, xl, y, yl = next(flow)
  File "../threadedgenerator.py", line 75, in next
    raise StopIteration()
StopIteration
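The thread dies on random.sample(ind_data[choice_ind], batch_size) because random.sample draws without replacement, so it raises once the chosen bucket holds fewer than batch_size pairs; the later StopIteration in test() is just the downstream symptom of the dead generator. A hedged sketch of a guard (the actual fix inside data_utils.py may differ):

```python
import random

def safe_batch(bucket, batch_size):
    """Draw batch_size items from a bucket that may be smaller than batch_size.
    Falls back to sampling with replacement for small buckets instead of
    raising "Sample larger than population"."""
    if len(bucket) >= batch_size:
        return random.sample(bucket, batch_size)  # without replacement
    return [random.choice(bucket) for _ in range(batch_size)]  # with replacement
```

Alternatively, drop buckets smaller than batch_size when the buckets are built, so an undersized bucket is never chosen in the first place.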

Loss sometimes jumps to 200

My params are:

{
    "cell_type": "lstm",
    "depth": 2,
    "attention_type": "Luong",
    "bidirectional": true,
    "use_residual": true,
    "use_dropout": false,
    "time_major": true,
    "hidden_units": 1024,
    "optimizer": "adam",
    "learning_rate": 0.001
}
I am also using fastText pre-trained word vectors, and the batch size is 128, which I haven't changed. But the loss keeps fluctuating and sometimes jumps to a huge number (around 200); it usually starts at 10-20. Is it because this version of seq2seq adds a reinforcement-learning component? Please tell me how to fix it.
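Loss spikes of this kind are often exploding gradients, to which deep residual LSTMs are prone, rather than anything reinforcement-learning related. A common mitigation is clipping gradients by their global norm before the optimizer update; the arithmetic is sketched here in NumPy for illustration only (in TensorFlow one would use tf.clip_by_global_norm):

```python
import numpy as np

def clip_by_global_norm(grads, clip_norm=5.0):
    """Rescale a list of gradient arrays so their combined L2 norm is at
    most clip_norm. Mirrors the behaviour of tf.clip_by_global_norm;
    NumPy is used here only to show the arithmetic."""
    global_norm = float(np.sqrt(sum(np.sum(g ** 2) for g in grads)))
    if global_norm <= clip_norm:
        return grads, global_norm
    scale = clip_norm / global_norm
    return [g * scale for g in grads], global_norm
```

Whether clipping is already wired into this repo's optimizer setup would need to be checked in sequence_to_sequence.py; if not, it is a cheap experiment.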

How to train on a self-collected corpus

I have collected some corpora in txt format, laid out as (question a, answer b, question c, answer d, ..., one per line). I don't know how to train on this; could someone please advise?

Thread problem

I'd like to ask: after finishing the first epoch, when the second epoch starts, the multi-threading code raises "threads can only be started once". Has anyone run into this? How do I fix it?
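Python threading.Thread objects are single-use: calling start() a second time raises exactly that RuntimeError. The usual fix is to construct a fresh generator thread for every epoch instead of reusing one across epochs. A sketch:

```python
import threading

# Reusing one Thread across epochs fails:
t = threading.Thread(target=lambda: None)
t.start()
t.join()
try:
    t.start()
except RuntimeError as exc:
    print(exc)  # "threads can only be started once"

# Instead, build a fresh thread (here standing in for a fresh
# ThreadedGenerator) each epoch:
for epoch in range(2):
    worker = threading.Thread(target=lambda: None)
    worker.start()
    worker.join()
```

Applied to this repo, that would mean constructing the ThreadedGenerator inside the epoch loop rather than once before it.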

beam search generates the same sentence

With beam size set to 5, some of the 5 generated results are duplicates. The cause is that beam search does not stop at the end token: with the end token and what follows it included, the sentences differ, but after stripping the end token they are identical. How can the problem of beam search not stopping at the end token be solved?
The code calls the following API:
inference_decoder = BeamSearchDecoder(
    cell=self.decoder_cell,
    embedding=embed_and_input_proj,
    start_tokens=start_tokens,
    end_token=end_token,
    initial_state=self.decoder_initial_state,
    beam_width=self.beam_width,
    output_layer=self.decoder_output_projection,
)
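Until the decoding itself is changed, one workaround is to post-process the returned beams: truncate each beam at its first end token, then drop duplicates. A sketch over plain token-id lists (the end-token id is whatever your vocabulary uses):

```python
def dedupe_beams(beams, end_token):
    """Truncate each beam at its first end_token, then keep only the first
    occurrence of each distinct sequence. A post-processing workaround for
    beams that differ only in tokens emitted after the end token."""
    seen, out = set(), []
    for beam in beams:
        trimmed = beam[:beam.index(end_token)] if end_token in beam else beam
        key = tuple(trimmed)
        if key not in seen:
            seen.add(key)
            out.append(trimmed)
    return out
```

The cost is that fewer than beam_width distinct outputs may remain, so a larger beam_width may be needed to compensate.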

ValueError: Sample larger than population or is negative

Hello, I'm running the program on my own data and hit the following problem partway through training. How can I fix it? Thanks.

Traceback (most recent call last):
  File "/home/jiang.li/ENTER/envs/pytorch/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/jiang.li/ENTER/envs/pytorch/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "../threadedgenerator.py", line 43, in _run
    for value in self._iterator:
  File "../data_utils.py", line 197, in batch_flow_bucket
    data_batch = random.sample(ind_data[choice_ind], batch_size)
  File "/home/jiang.li/ENTER/envs/pytorch/lib/python3.6/random.py", line 320, in sample
    raise ValueError("Sample larger than population or is negative")
ValueError: Sample larger than population or is negative

epoch 7 loss=3.628152 lr=0.000777: 14%|█▍ | 4573/32280 [10:52<1:01:13, 7.54it/s]
Traceback (most recent call last):
  File "train.py", line 163, in <module>
    main()
  File "train.py", line 159, in main
    test(json.load(open('params.json')))
  File "train.py", line 74, in test
    x, xl, y, yl = next(flow)
  File "../threadedgenerator.py", line 75, in next
    raise StopIteration()
StopIteration

ner/train_crf_loss.py Error

The program runs on Ubuntu with Python 3.5.
In the test section at line 142 of ner/train_crf_loss.py, I hit the following error:

try load model from ./s2ss_crf.ckpt
0%| | 0/100 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train_crf_loss.py", line 179, in <module>
    main()
  File "train_crf_loss.py", line 175, in main
    test(True, 'lstm', 1, False, True, False, 64, 'tanh')
  File "train_crf_loss.py", line 142, in test
    if rr:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Looking into it, rr is a numpy array, so using "if rr:" directly seems to fail.
Should it be converted to a list, or should something like "if not np.isnan(rr).any()" be used instead?
I haven't fully understood the code yet, so I'm not sure whether my setup is wrong or it's a bug in the code.
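The diagnosis above is right: NumPy refuses to coerce a multi-element array to a single bool, so "if rr:" raises. Any explicit reduction removes the ambiguity; which one is correct depends on what line 142 actually means to test. A small demonstration:

```python
import numpy as np

rr = np.array([0.1, 0.2])

try:
    bool(rr)  # the same coercion that `if rr:` performs
except ValueError as exc:
    print(exc)  # the "truth value ... is ambiguous" error from the traceback

# Unambiguous alternatives:
has_elements = rr.size > 0          # "is rr non-empty?"
all_valid = not np.isnan(rr).any()  # "does rr contain no NaN?"
any_true = rr.any()                 # "is any element truthy?"
```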

The trained model only ever predicts two different outputs

params.json is as follows:
{
    "bidirectional": true,
    "use_residual": false,
    "use_dropout": false,
    "time_major": false,
    "cell_type": "lstm",
    "depth": 2,
    "attention_type": "Bahdanau",
    "hidden_units": 128,
    "optimizer": "adam",
    "learning_rate": 0.001,
    "embedding_size": 300
}

Results:

Input: ['', '好', '你']
x [[ 3 1625 739]]
xl [3]
Greedy_pred [[2253 2599 739 3668 3]]
Output: 我是你的

Input: ['', '哪', '在', '你']
x [[ 3 1291 1485 739]]
xl [4]
Greedy_pred [[2253 2599 739 3668 3]]
Output: 我是你的

Input: ['', '字', '名', '么', '什', '叫', '你']
x [[ 3 1734 1166 604 665 1142 739]]
xl [7]
Greedy_pred [[ 30 372 30 3]]
Output: =。=

Input: ['', '吗', '友', '朋', '个', '交', '能', '们', '我']
x [[ 3 1175 1122 2664 587 650 4306 691 2253]]
xl [9]
Greedy_pred [[ 30 372 30 3]]
Output: =。=

I've tried 100 epochs, 5 epochs, and 2 epochs; the result is the same every time. What's going on? Please advise.

Garbled output when testing after training

I trained on Windows; when testing after training, the sample sentences display as:
鐣 鍗 鍚 渚

When I encode them as GBK they show as [b'\xe7\x95', b'\xe5\x8d', b'\xe5\x90', b'\xe4\xbe']

Finally I tested in a Linux environment, which shows the same thing: 鐣 鍗 鍚 渚

May I ask what training and testing environments the author used? (Surely it isn't that one shouldn't train on Windows...)
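"鐣 鍗 鍚 渚" is the classic signature of UTF-8 bytes being decoded as GBK (cp936), the default codec on Chinese Windows; the model output itself is fine, but somewhere a file is read or printed without an explicit encoding. The round trip can be demonstrated directly:

```python
original = "你好"

# Decoding UTF-8 bytes with GBK produces garble of exactly this shape:
garbled = original.encode("utf-8").decode("gbk")
assert garbled != original

# The damage is reversible, confirming the underlying bytes are intact:
recovered = garbled.encode("gbk").decode("utf-8")
assert recovered == original

# The real fix: open corpus files with an explicit encoding,
# e.g. open(path, encoding="utf-8"), instead of the platform default.
```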

Can training work with a corpus of fewer than a million lines?

Hi everyone,

When I run train.py on a corpus I collected myself, it raises an error, and I'm not sure which part went wrong. My text doesn't reach the 4-million-plus lines of the corpus you provided; are there parameters that need to be changed?

Thanks

pretrained_embedding

Hello,

I read through the code carefully and noticed that in the chatbot's pretrained_embedding part, no pretrained embedding is actually loaded; it is just initialized to zeros. Sorry to trouble you.

How about the training time?

It was about 10 hours per epoch on my single-GPU workstation (GTX 1060). Is that normal, or is something slowing my training down? I am just using the dataset you provided, thanks.

Question about seq2seq NER performance

Hi,
I saw that you use a seq2seq model for NER. How good were your final results, compared with BiLSTM-CRF and similar models?
Thanks,

Error starting the model

Hello, thank you very much for open-sourcing your code.
I've run into a tricky problem I'd like to ask you about.
I trained a model on a server, where it runs fine, but after downloading it to my local machine, restoring the model fails. A model trained locally also starts fine locally. In principle, a saved model shouldn't depend on the machine, right?
Here is the error:

tensorflow.python.framework.errors_impl.DataLossError: Checksum does not match: stored 942326751 vs. calculated on the restored bytes 3112943709
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Training on other data: the number of steps per epoch drops too much

Hello,
I'm training on a different dataset with 1,363,683 lines in total; a sample of the data looks like this:

E
M 读大学
M 没有什么过不去的
E
M 我不是你爸
M 是!否则妈妈怎么老和我说爱他呢,我都吃醋了

During training, each epoch has 1085 steps, while with the dataset you provided (4,268,087 lines in total) each epoch reaches 26158 steps.
Since 26158/1085 > 4268087/1363683, the gap is disproportionately large. What causes this? Could it be that my dataset is missing the "/" markers?
