prohiryu / bert-chinese-ner
Chinese NER using the pretrained language model BERT
License: MIT License
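Hello, I read through your code carefully; thank you for providing such a useful reference. I have some questions. In the source, the do_predict branch is filtered out by an earlier check. After I enabled it, it fails at runtime, starting with reading the input: it currently reads text.txt, which does not seem meaningful. If I supply complete sentences of my own, the data processor's _read_data method no longer applies. I tried modifying that method, but the NER output is very strange. For example, with the input “今天北京的天气真好。你说呢?是啊,北京天气难得那么好啊。” ("The weather in Beijing is really nice today. Don't you think? Yes, it's rare for Beijing to have such good weather."), every predicted label is [CLS]. Any guidance would be much appreciated. Thank you!
What causes this problem?
Q1: How long does a full run take for you, and on what hardware?
Q2: Once training finishes, how do I use this NER model?
I have my own news corpus, but it is unlabeled. What tools do people use for BIO annotation?
The command I ran for training: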
python BERT_NER.py --data_dir=data/ --bert_config_file=checkpoint/bert_config.json --init_checkpoint=checkpoint/bert_model.ckpt --vocab_file=vocab.txt --output_dir=./output/result_dir/
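However, BERT_NER needs a train.tfrecord file. Does that file have to be generated first?
Why must at least one of train and eval be true? Is it not possible to run predict only?
Thanks.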
/Users/dr/.virtualenvs/Django面试/bin/python /Users/dr/PycharmProjects/bert-chinese-ner/tf_metrics.py
/Users/dr/.virtualenvs/Django面试/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/Users/dr/.virtualenvs/Django面试/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/Users/dr/.virtualenvs/Django面试/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/Users/dr/.virtualenvs/Django面试/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/Users/dr/.virtualenvs/Django面试/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/Users/dr/.virtualenvs/Django面试/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
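I have trained the entity recognition model, but I do not know how to call it. This involves both how to invoke the model and how to convert the text to be recognized into the format the model expects. Could you help explain? Many thanks!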
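One way to prepare raw text for such a model can be sketched as follows. This is a minimal sketch under assumptions, not the repository's code: it assumes character-level tokens wrapped in [CLS] and [SEP] and padded to max_seq_length, as the logged examples elsewhere in this thread suggest. The function name `sentence_to_features` and the toy vocabulary are hypothetical; a real run would take its ids from vocab.txt and use BERT's own tokenizer.

```python
def sentence_to_features(sentence, vocab, max_seq_length=128):
    """Turn one raw sentence into (tokens, input_ids, input_mask, segment_ids)."""
    # Character-level tokens; whitespace is dropped.
    chars = [ch for ch in sentence if not ch.isspace()]
    # Reserve two positions for [CLS] and [SEP].
    chars = chars[: max_seq_length - 2]
    tokens = ["[CLS]"] + chars + ["[SEP]"]

    unk_id = vocab["[UNK]"]
    input_ids = [vocab.get(tok, unk_id) for tok in tokens]
    input_mask = [1] * len(input_ids)

    # Pad ids and mask up to the fixed sequence length.
    pad_len = max_seq_length - len(input_ids)
    input_ids += [vocab["[PAD]"]] * pad_len
    input_mask += [0] * pad_len
    segment_ids = [0] * max_seq_length
    return tokens, input_ids, input_mask, segment_ids


# Toy vocabulary for illustration only; the ids for the two characters are made up.
toy_vocab = {"[PAD]": 0, "[UNK]": 100, "[CLS]": 101, "[SEP]": 102, "北": 7, "京": 8}
tokens, ids, mask, segs = sentence_to_features("北京好", toy_vocab, max_seq_length=8)
print(tokens)
print(ids)
print(mask)
```

The resulting lists have the same layout as the input_ids, input_mask and segment_ids lines that the training script logs, so they could be fed to a prediction input function once real vocabulary ids are used.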
Printed output: INFO:tensorflow:global_step/sec: 1.77403
Is this the loss, or the training speed?
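For reference, `global_step/sec` in TensorFlow's logs is the number of training steps completed per second, so it measures training speed; the loss is reported on separate `loss = ...` lines. The sketch below parses such a line and converts it to examples per second. The helper name `steps_per_sec` is hypothetical and the batch size of 32 is an assumption for illustration.

```python
import re


def steps_per_sec(log_line):
    """Extract the global_step/sec value from a log line, or return None."""
    match = re.search(r"global_step/sec:\s*([0-9.]+)", log_line)
    return float(match.group(1)) if match else None


line = "INFO:tensorflow:global_step/sec: 1.77403"
rate = steps_per_sec(line)
batch_size = 32  # assumed; use the value passed to the training script
examples_per_sec = rate * batch_size
print(examples_per_sec)
```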
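1. During data processing, sentences longer than max_seq_length are truncated, so at prediction time fewer labels are produced than there are input characters.
2. The test set should, in principle, have no labels, so is the data processing part missing something?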
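One possible workaround for the truncation issue, sketched here as an assumption and not as the repository's behavior, is to split long examples into consecutive pieces before feature conversion so that no character is dropped. The function name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(chars, labels, max_seq_length=128):
    """Split one long example into consecutive pieces that each fit the model."""
    if len(chars) != len(labels):
        raise ValueError("chars and labels must have the same length")
    limit = max_seq_length - 2  # leave room for [CLS] and [SEP]
    chunks = []
    for start in range(0, len(chars), limit):
        end = start + limit
        chunks.append((chars[start:end], labels[start:end]))
    return chunks


chars = list("abcdefghij")
labels = ["O"] * len(chars)
chunks = split_into_chunks(chars, labels, max_seq_length=6)
print([len(piece) for piece, _ in chunks])
```

A fixed-width split like this can cut through the middle of an entity; splitting at sentence punctuation first would reduce that risk.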
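I read your code carefully and have a question.
The model returns (total_loss, per_example_loss, logits, predicts). predicts holds the predicted values, while logits should be the matrix before softmax normalization.
In the evaluation stage, why is predictions = tf.argmax(logits, axis=-1, output_type=tf.int32) compared against label_id to compute the three metrics, instead of the predicts matrix that the model already produces?
This is quite puzzling. If it is a bug, why do the resulting metrics still match the actual behavior fairly well? If it is not a bug, why can an unnormalized matrix still yield the correct predictions?
Finally, thank you for providing the code and data.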
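On the normalization question: softmax is strictly increasing in each logit, so it preserves the ordering of the scores, and the argmax of the raw logits equals the argmax of the softmax probabilities. The pure-Python sketch below illustrates this with made-up numbers; it does not confirm how predicts is computed in the repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


logits = [1.2, -0.3, 4.5, 0.0]  # made-up scores for one token
probs = softmax(logits)
print(argmax(logits), argmax(probs))
```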
Could you provide label2id.pkl?
Hello, I get this error when running with my own dataset. Could you tell me what is wrong?
ValueError: Dimensions must be equal, but are 17 and 11 for 'loss/mul' (op: 'Mul') with input shapes: [?,128,17], [?,128,11]
AttributeError: module 'tensorflow.contrib.tpu' has no attribute 'InputPipelineConfig'
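I get this error when running the code. Can it only run on a TPU?
The last layer is a softmax, so each tag is predicted independently of the others.
I noticed this segment in the code: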
mask = tf.cast(input_mask,tf.float32)
loss = tf.contrib.seq2seq.sequence_loss(logits,labels,mask)
return (loss, logits, predict)
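Why was this commented out and replaced with a hand-written loss function? Is there a problem with using this masked sequence loss?
Why is the num_labels passed to create_model equal to len(label_list)+1? What is the purpose of the +1?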
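A likely reading of the +1, offered as an assumption and not confirmed by the author: label ids start at 1 so that 0 can serve as the padding id, which matches the label_ids lines logged elsewhere in this thread (real labels from 1 to 10, padding 0). The label list below is illustrative and may differ from the repository's.

```python
# Illustrative label set; the repository's actual list may differ.
label_list = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG",
              "B-LOC", "I-LOC", "X", "[CLS]", "[SEP]"]

# Ids start at 1, leaving 0 free for padding positions.
label_map = {label: i for i, label in enumerate(label_list, start=1)}
PAD_ID = 0

# The classifier must cover every id that can appear, including the padding id.
num_labels = len(label_list) + 1

tags = ["[CLS]", "B-PER", "I-PER", "O", "[SEP]"]
label_ids = [label_map[t] for t in tags] + [PAD_ID] * 3
print(label_ids, num_labels)
```

Under this scheme every id in label_ids is strictly less than num_labels, which is what a one-hot encoding of depth num_labels requires.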
2019-04-24 03:30:02.313101: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at save_restore_v2_ops.cc:137 : Unknown: output/result_dir/model.ckpt-11000_temp_14deb76b392641039027fcde01faa942/part-00000-of-00001.data-00000-of-00001.tempstate15913268613058786132; Input/output error
INFO:tensorflow:Error recorded from training_loop: output/result_dir/model.ckpt-11000_temp_14deb76b392641039027fcde01faa942/part-00000-of-00001.data-00000-of-00001.tempstate15913268613058786132; Input/output error
[[node save/SaveV2 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_estimator/python/estimator/estimator.py:1403) ]]
Caused by op 'save/SaveV2', defined at:
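Why does the official documentation say that, after loading the pretrained model, only a few epochs of training are needed, while your model does so well after 100 epochs?
I cloned the repository with its data and ran it with the original parameters, but strangely I could not reproduce the results. I am not sure what went wrong.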
INFO:tensorflow: eval_f = 0.6572645
INFO:tensorflow: eval_precision = 0.7749974
INFO:tensorflow: eval_recall = 0.57219213
INFO:tensorflow: global_step = 4749
INFO:tensorflow: loss = 60.91251
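I originally had TensorFlow 1.8.0 and found it incompatible with the BERT source code.
I then upgraded to 1.9.0, but it still fails. The error after upgrading to 1.9.0 is below; any solution would be greatly appreciated.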
2019-03-12 22:29:47.417690: E T:\src\github\tensorflow\tensorflow\core\common_runtime\executor.cc:696] Executor failed to create kernel. Not found: No registered '_CopyFromGpuToHost' OpKernel for CPU devices compatible with node swap_out_gradients/bert/encoder/layer_0/attention/self/key/MatMul_grad/MatMul_1_0 = _CopyFromGpuToHostT=DT_FLOAT, _class=["loc@gradients/bert/encoder/layer_0/attention/self/key/MatMul_grad/MatMul_1_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"
. Registered: device='GPU'
[[Node: swap_out_gradients/bert/encoder/layer_0/attention/self/key/MatMul_grad/MatMul_1_0 = _CopyFromGpuToHost[T=DT_FLOAT, _class=["loc@gradients/bert/encoder/layer_0/attention/self/key/MatMul_grad/MatMul_1_0"], _device="/job:localhost/replica:0/task:0/device:CPU:0"](bert/encoder/Reshape_1/_4857)]]
Traceback (most recent call last):
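I tried running on my own data. The test file contains 100 examples, but the final label_text output has more than 100. Why is that?
Question 1:
The paper says the model is pretrained and only needs fine-tuning on the dev set, so why does it first need to be trained on the train set?
Question 2:
This model is evaluated on the dev set, not on the test set, is that right?
I would sincerely appreciate your advice. Thanks.
Thank you for sharing this work; it is very well done!
Looking at the prediction part, I noticed that it also reads the test data, which already contains labels, and the labels are read in as well. How are they removed when the model makes predictions?
Also, how should the input part of the program be changed to predict on a string entered on the command line? Thank you very much!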
Traceback (most recent call last):
File "BERT_NER.py", line 621, in
tf.compat.v1.app.run()
File "D:\Users\A\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "C:\Users\A\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 300, in run
_run_main(main, args)
File "C:\Users\A\AppData\Roaming\Python\Python37\site-packages\absl\app.py", line 251, in _run_main
sys.exit(main(argv))
File "BERT_NER.py", line 518, in main
train_examples = processor.get_train_examples(FLAGS.data_dir)
File "BERT_NER.py", line 176, in get_train_examples
self._read_data(os.path.join(data_dir, "train.txt")), "train"
File "BERT_NER.py", line 153, in _read_data
for line in f:
UnicodeDecodeError: 'gbk' codec can't decode byte 0x93 in position 2: illegal multibyte sequence
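The traceback above shows Python decoding train.txt as gbk. On Windows, open() without an encoding argument falls back to the locale's default encoding, so a UTF-8 data file can fail this way. Passing encoding="utf-8" explicitly is the usual fix. The reader below is a minimal sketch: the function name `read_bio_file` and the "character label" line format with blank lines between sentences are assumptions, not the repository's _read_data.

```python
import os
import tempfile


def read_bio_file(path):
    """Read 'char label' lines; a blank line ends a sentence."""
    sentences, chars, labels = [], [], []
    # An explicit encoding avoids depending on the platform default (e.g. gbk).
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                if chars:
                    sentences.append((chars, labels))
                    chars, labels = [], []
                continue
            parts = line.split()
            chars.append(parts[0])
            labels.append(parts[-1])
    if chars:  # the file may not end with a blank line
        sentences.append((chars, labels))
    return sentences


# Small self-contained demonstration with a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "train.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("北 B-LOC\n京 I-LOC\n好 O\n\n你 O\n")
    demo = read_bio_file(demo_path)
print(demo)
```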
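What does the simplest predict.py look like?
Hello, I replaced the data in the data folder with my own. Why do I get an error like this, and how can I fix it? Thanks for sharing.
My own dataset (train and dev work fine)
Fails when running test: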
INFO:tensorflow:*** Example ***
INFO:tensorflow:guid: test-4
INFO:tensorflow:tokens: 岳 红 朋 友 否 认 此 事 , 警 方 已 成 立 专 案 组 调 查 前 晚 1 0 时 许 , 著 名 演 员 岳 红 被 一 男 子 以 商 谈 拍 片 之 事 约 至 世 纪 金 源 大 饭 店 , 随 即 被 持 刀 威 胁 勒 索 5 万 元 , 后 经 岳 红 报 警 并 逃 脱
INFO:tensorflow:input_ids: 101 2277 5273 3301 1351 1415 6371 3634 752 117 6356 3175 2347 2768 4989 683 3428 5299 6444 3389 1184 3241 122 121 3198 6387 117 5865 1399 4028 1447 2277 5273 6158 671 4511 2094 809 1555 6448 2864 4275 722 752 5276 5635 686 5279 7032 3975 1920 7649 2421 117 7390 1315 6158 2898 1143 2014 5516 1239 5164 126 674 1039 117 1400 5307 2277 5273 2845 6356 2400 6845 5564 102 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:input_mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:segment_ids: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
INFO:tensorflow:label_ids: 9 2 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 3 1 1 1 1 1 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Traceback (most recent call last):
File "BERT_NER.py", line 621, in
tf.app.run()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "BERT_NER.py", line 592, in main
predict_file,mode="test")
File "BERT_NER.py", line 303, in filed_based_convert_examples_to_features
feature = convert_single_example(ex_index, example, label_map, max_seq_length, tokenizer,mode)
File "BERT_NER.py", line 286, in convert_single_example
write_tokens(ntokens,mode)
File "BERT_NER.py", line 211, in write_tokens
wf.close()
OSError: [Errno 5] Input/output error
How can this be resolved?
Hello. I switched to my own labels and text. After the run finishes, how do I apply the model to recognize entities?
Line 213 in 5043d2f
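If you have it, could you send me a copy? Thanks.
Hello, when you trained on the People's Daily dataset, what parameter settings did you use: batch_size, epoch, max_seq_length? Did you use early_stop? You say the results are the dev results after 100 epochs of training; how were the other parameters set?
In the function model_fn: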
if init_checkpoint:
    (assignment_map, initialized_variable_names) = \
        modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
    tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
    if use_tpu:
        def tpu_scaffold():
            tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
            return tf.train.Scaffold()
        scaffold_fn = tpu_scaffold
    else:
        tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
Why is tf.train.init_from_checkpoint run twice?
Because I want to input a sentence and get the entities in it.
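Going from per-character tags to entities can be sketched as below, assuming the model outputs one BIO tag per character. The function name `extract_entities` is hypothetical and the tags in the example are written by hand, not produced by the model.

```python
def extract_entities(chars, tags):
    """Collect (text, type, start, end) spans from aligned BIO tags."""
    entities = []
    start, ent_type = None, None
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # A new entity begins; close any entity that is still open.
            if start is not None:
                entities.append(("".join(chars[start:i]), ent_type, start, i))
            start, ent_type = i, tag[2:]
        elif tag.startswith("I-") and start is not None and tag[2:] == ent_type:
            continue  # the open entity extends over this character
        else:
            # "O", special tags, or an inconsistent I- tag end the open entity.
            if start is not None:
                entities.append(("".join(chars[start:i]), ent_type, start, i))
            start, ent_type = None, None
    if start is not None:
        entities.append(("".join(chars[start:]), ent_type, start, len(chars)))
    return entities


chars = list("今天北京的天气真好")
tags = ["O", "O", "B-LOC", "I-LOC", "O", "O", "O", "O", "O"]  # hand-written example
print(extract_entities(chars, tags))
```

The end index is exclusive, so chars[start:end] recovers the entity text.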
Is the model's F1 computed after merging the individual characters into sentences?
I0717 06:42:58.134186 140155849557824 tpu_estimator.py:2286] global_step/sec: 2.81616
I0717 06:42:58.134613 140155849557824 tpu_estimator.py:2287] examples/sec: 90.1171
I0717 06:42:58.491854 140155849557824 tpu_estimator.py:2286] global_step/sec: 2.79538
I0717 06:42:58.492231 140155849557824 tpu_estimator.py:2287] examples/sec: 89.4523
I0717 06:42:58.493007 140155849557824 basic_session_run_hooks.py:606] Saving checkpoints for 1284 into ./output/result_dir/model.ckpt.
I0717 06:43:01.181317 140155849557824 estimator.py:368] Loss for final step: 522.48596.
I0717 06:43:01.188701 140155849557824 error_handling.py:101] training_loop marked as finished
WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0.
For more information, please see:
WARNING:tensorflow:Estimator's model_fn (<function model_fn_builder..model_fn at 0x7f74b46b78c8>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Using config: {'_model_dir': './output/result_dir/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': 1000, '_save_checkpoints_secs': None, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': None, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f74b22524a8>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1, '_tpu_config': TPUConfig(iterations_per_loop=1000, num_shards=8, num_cores_per_replica=None, per_host_input_for_training=3, tpu_job_name=None, initial_infeed_sleep_secs=None, input_partition_dims=None), '_cluster': None}
INFO:tensorflow:_TPUContext: eval_on_tpu True
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
Traceback (most recent call last):
File "BERT_NER.py", line 621, in
tf.app.run()
File "/home/lyy/anaconda3/envs/tensorflow/lib/python3.7/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "BERT_NER.py", line 544, in main
train_examples, label_list, FLAGS.max_seq_length, tokenizer, train_file)
File "BERT_NER.py", line 296, in filed_based_convert_examples_to_features
with open('./output/label2id.pkl','wb') as w:
FileNotFoundError: [Errno 2] No such file or directory: './output/label2id.pkl'
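The FileNotFoundError above is what open(..., 'wb') raises when the parent directory does not exist, since open() creates files but not directories. Creating ./output before running (or in code, as sketched below) is likely to resolve it. The helper name `save_label_map` is hypothetical and the example writes into a temporary directory so it leaves nothing behind.

```python
import os
import pickle
import tempfile


def save_label_map(label_map, path):
    """Pickle label_map to path, creating the parent directory if needed."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)  # no error if it already exists
    with open(path, "wb") as w:
        pickle.dump(label_map, w)


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "output", "label2id.pkl")
    save_label_map({"O": 1}, target)
    with open(target, "rb") as r:
        restored = pickle.load(r)
print(restored)
```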
Original stack trace for 'bert/encoder/layer_2/attention/self/MatMul': File "BERT_NER.py", line 621, in tf.app.run() File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\absl\app.py", line 300, in run _run_main(main, args) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\absl\app.py", line 251, in _run_main sys.exit(main(argv)) File "BERT_NER.py", line 554, in main estimator.train(input_fn=train_input_fn, max_steps=num_train_steps) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2871, in train saving_listeners=saving_listeners) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 367, in train loss = self._train_model(input_fn, hooks, saving_listeners) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1158, in _train_model return self._train_model_default(input_fn, hooks, saving_listeners) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1188, in _train_model_default features, labels, ModeKeys.TRAIN, self.config) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2709, in _call_model_fn config) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow_estimator\python\estimator\estimator.py", line 1146, in _call_model_fn model_fn_results = self._model_fn(features=features, **kwargs) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 2967, in _model_fn features, labels, is_export_mode=is_export_mode) File 
"D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 1549, in call_without_tpu return self._call_model_fn(features, labels, is_export_mode=is_export_mode) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow_estimator\python\estimator\tpu\tpu_estimator.py", line 1867, in _call_model_fn estimator_spec = self._model_fn(features=features, **kwargs) File "BERT_NER.py", line 411, in model_fn num_labels, use_one_hot_embeddings) File "BERT_NER.py", line 361, in create_model use_one_hot_embeddings=use_one_hot_embeddings File "C:\Users\leade\Desktop\bert-chinese-ner-v2\bert\modeling.py", line 216, in init do_return_all_layers=True) File "C:\Users\leade\Desktop\bert-chinese-ner-v2\bert\modeling.py", line 844, in transformer_model to_seq_length=seq_length) File "C:\Users\leade\Desktop\bert-chinese-ner-v2\bert\modeling.py", line 701, in attention_layer attention_scores = tf.matmul(query_layer, key_layer, transpose_b=True) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper return target(*args, **kwargs) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow\python\ops\math_ops.py", line 2609, in matmul return batch_mat_mul_fn(a, b, adj_x=adjoint_a, adj_y=adjoint_b, name=name) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 1806, in batch_mat_mul_v2 "BatchMatMulV2", x=x, y=y, adj_x=adj_x, adj_y=adj_y, name=name) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper op_def=op_def) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func return func(*args, **kwargs) File "D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op op_def=op_def) File 
"D:\ProgramFiles\Anaconda3\envs\roots\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in init self._traceback = tf_stack.extract_stack()