Could we add new words? about chinesener HOT 16 OPEN

hgjt8989 commented on August 20, 2024

Could we add new words?

from chinesener.

Comments (16)

buppt commented on August 20, 2024

> E.g. if a word (北大) is not recognized as an organisation, could we add this word so that the model learns it?

Of course. You can add 北/B_org 大/E_org to the training set.
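
As a rough illustration of the tagging format in the reply above, the sketch below turns an entity word into character-level tags. The helper name `tag_entity` is hypothetical, and the `M_` tag for middle characters is an assumption: the thread only shows the `B_` and `E_` tags.

```python
def tag_entity(word, label):
    """Tag each character of an entity word, e.g. 北大 -> 北/B_org 大/E_org."""
    if len(word) < 2:
        # The thread does not show how single-character entities are tagged.
        raise ValueError("need at least two characters")
    # Assumption: characters between the first and the last get an M_ tag.
    tags = ["B_" + label] + ["M_" + label] * (len(word) - 2) + ["E_" + label]
    return " ".join(ch + "/" + tag for ch, tag in zip(word, tags))


print(tag_entity("北大", "org"))  # 北/B_org 大/E_org
```

A full training sentence would also need tags for the surrounding non-entity characters, which this sketch leaves out.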

chenbaicheng commented on August 20, 2024

@buppt Thanks. One question: for the TensorFlow version, do you run python train.py first and then python train.py pretrained?

buppt commented on August 20, 2024

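> @buppt Thanks. One question: for the TensorFlow version, do you run python train.py first and then python train.py pretrained?

No, you don't need both. python train.py trains without pretrained word embeddings; python train.py pretrained trains with pretrained word embeddings.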

chenbaicheng commented on August 20, 2024

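OK, thanks O(∩_∩)O. One more question: how do I add data incrementally? Do the new sentences have to be added to the earlier training set each time, and then train run again?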

chenbaicheng commented on August 20, 2024

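@buppt Does this branch in train.py ever get executed?
elif len(sys.argv)==3:
I looked for a long time and never saw a case where three arguments are passed. Is it redundant code? Thanks.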

buppt commented on August 20, 2024

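> OK, thanks O(∩_∩)O. One more question: how do I add data incrementally? Do the new sentences have to be added to the earlier training set each time, and then train run again?

What do you mean? Do you want to add your own example sentences for some entities? You can either put them in the training set or continue training from the already trained model.
Isn't the three-argument case the one that takes file names? It's in the README.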
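
The argument handling discussed above can be sketched roughly as follows. This is not the repository's train.py: the function `choose_mode` and the mode names it returns are hypothetical, and only the three cases mentioned in this thread are covered.

```python
import sys


def choose_mode(argv):
    """Map a train.py-style argument list to a run mode (hypothetical names)."""
    if len(argv) == 1:
        # python train.py: train without pretrained word embeddings.
        return "train_without_pretrained"
    if len(argv) == 2 and argv[1] == "pretrained":
        # python train.py pretrained: train with pretrained word embeddings.
        return "train_with_pretrained"
    if len(argv) == 3:
        # Per the reply above, the two extra arguments are file names.
        return "process_files"
    return "unknown"


if __name__ == "__main__":
    print(choose_mode(sys.argv))
```

Note that `sys.argv` includes the script name, so `len(sys.argv)==3` corresponds to two arguments after train.py.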

chenbaicheng commented on August 20, 2024

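@buppt Right, I missed that. There is also a mode that batch-processes a file, and the steps there are fairly detailed. Thanks. I have now finished running train.py. If I want to add new corpus data and continue training from the current model, which command should I run? Thanks.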

badbabys commented on August 20, 2024

Could anyone share a model trained with the TensorFlow version?

chenbaicheng commented on August 20, 2024

Can't you train it yourself? With a GPU it takes about 3 hours.

bobkentt commented on August 20, 2024

Let me describe a problem I ran into. I ran:
cd data/renMinRiBao/
python data_renmin_word.py
then:
cd tensorflow/
python train.py pretrained
and got the following error:
train len: 24271
test len: 7585
word2id len 3917
Creating the data generator ...
Finished creating the data generator.
use pretrained embedding
begin to train...
Traceback (most recent call last):
  File "train.py", line 107, in <module>
    model = Model(config,embedding_pre,dropout_keep=0.5)
  File "/home/liyang22/github/ChineseNER/tensorflow/bilstm_crf.py", line 20, in __init__
    self._build_net()
  File "/home/liyang22/github/ChineseNER/tensorflow/bilstm_crf.py", line 56, in _build_net
    self.viterbi_sequence, viterbi_score = tf.contrib.crf.crf_decode(bilstm_out, self.transition_params,tf.tile(np.array([self.sen_len]),np.array([self.batch_size])))
  File "/home/liyang22/tensorflow/local/lib/python2.7/site-packages/tensorflow/contrib/crf/python/ops/crf.py", line 537, in crf_decode
    false_fn=_multi_seq_fn)
  File "/home/liyang22/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/layers/utils.py", line 206, in smart_cond
    pred, true_fn=true_fn, false_fn=false_fn, name=name)
  File "/home/liyang22/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/smart_cond.py", line 56, in smart_cond
    return false_fn()
  File "/home/liyang22/tensorflow/local/lib/python2.7/site-packages/tensorflow/contrib/crf/python/ops/crf.py", line 501, in _multi_seq_fn
    sequence_length_less_one = math_ops.maximum(0, sequence_length - 1)
  File "/home/liyang22/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_math_ops.py", line 4602, in maximum
    "Maximum", x=x, y=y, name=name)
  File "/home/liyang22/tensorflow/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 546, in _apply_op_helper
    inferred_from[input_arg.type_attr]))
TypeError: Input 'y' of 'Maximum' Op has type int64 that does not match type int32 of argument 'x'.

chenbaicheng commented on August 20, 2024

@bobkentt Check whether there is a problem with your corpus. Did you write it yourself?

bobkentt commented on August 20, 2024

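> @bobkentt Check whether there is a problem with your corpus. Did you write it yourself?

I just cloned the project as is and did not use my own corpus. Could it be a problem with my TensorFlow version? Which version are you on? I have TF environments installed on two virtual machines, versions 1.10.0 and 1.12.0, and neither works.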

bobkentt commented on August 20, 2024

Changing it to int64 in train.py does not work either. I also tried casting the data labels to int32.

chenbaicheng commented on August 20, 2024

Did you delete the previously trained model files before retraining? I am using tensorflow-gpu==1.10.0.

chenbaicheng commented on August 20, 2024

@bobkentt

bubblewu commented on August 20, 2024

@bobkentt Casting the types to int32 fixes it:
self.viterbi_sequence, viterbi_score = tf.contrib.crf.crf_decode(tf.cast(bilstm_out, dtype=tf.int32), tf.cast(self.transition_params, dtype=tf.int32), tf.cast(sequence_length, dtype=tf.int32))
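
The traceback above reports the mismatch at `math_ops.maximum(0, sequence_length - 1)`, so the int64 value appears to be the sequence-length argument. The following is a minimal NumPy sketch of building that argument as int32 before it is passed on. It is an assumption about the cause, not a tested patch for bilstm_crf.py: the helper `build_sequence_lengths` and the values 60 and 32 are hypothetical, and `np.tile` stands in for the `tf.tile` call in the original line.

```python
import numpy as np


def build_sequence_lengths(sen_len, batch_size):
    """Return one sentence length per batch row, explicitly typed as int32."""
    # np.array([sen_len]) takes the platform default integer type,
    # which is int64 on most 64-bit Linux systems.
    lengths = np.tile(np.array([sen_len]), batch_size)
    # Cast once so the lengths match the int32 constant 0 in the comparison.
    return lengths.astype(np.int32)


# Hypothetical values for illustration only.
lengths = build_sequence_lengths(sen_len=60, batch_size=32)
print(lengths.dtype, lengths.shape)  # int32 (32,)
```

If this reading is right, casting only the sequence lengths would be the narrower change. Casting `bilstm_out` and `self.transition_params` to int32 as well would also convert the floating-point scores to integers.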
