gaoq1 / ner-slot_filling

Entity extraction and intent recognition for Chinese natural language (Natural Language Understanding), with a choice of Bi-LSTM + CRF or IDCNN + CRF

Python 100.00%
nlu slot slot-filling ner nlp bi-lstm crf idcnn medicle emr

ner-slot_filling's Issues

Accuracy problem

What intent recognition accuracy can this model reach? When we run your code directly, the accuracy is very low.

Test error

Hello, I am using Python 3.6, and training produces the following output:

Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.699 seconds.
Prefix dict has been built succesfully.
开始训练模型!!!
13724it [00:00, 133884.01it/s]
Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)

Testing fails with the following error:

Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.692 seconds.
Prefix dict has been built succesfully.
2019-06-10 23:07:51.152866: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-06-10 23:07:52.790032: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.545
pciBusID: 0000:02:00.0
totalMemory: 10.73GiB freeMemory: 10.53GiB
2019-06-10 23:07:52.790099: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2019-06-10 23:07:53.229309: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-10 23:07:53.229363: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2019-06-10 23:07:53.229376: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2019-06-10 23:07:53.229666: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10168 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:02:00.0, compute capability: 7.5)
WARNING:tensorflow:From /data/proj/Captcha/ner-slot_filling/models/model.py:385: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.
See tf.nn.softmax_cross_entropy_with_logits_v2.
/home/jiang.li/.local/lib/python3.6/site-packages/tensorflow/python/ops/gradients_impl.py:112: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2019-06-10 23:07:55,057 - /data/proj/Captcha/ner-slot_filling/log/train.log - INFO - Created model with fresh parameters.
Loading pretrained embeddings from /Users/gaoquan/Documents/ml-learning/ner-learning/NER_medical_records/assets/cooked_corpus/vec.txt...
Traceback (most recent call last):
File "train_evaluate.py", line 251, in <module>
main(args)
File "train_evaluate.py", line 247, in main
evaluate_line()
File "train_evaluate.py", line 230, in evaluate_line
load_word2vec, config, id_to_char, logger)
File "/data/proj/Captcha/ner-slot_filling/utils/utils.py", line 158, in create_model
emb_weights = load_vec(config["emb_file"], id_to_char, config["char_dim"], emb_weights)
File "/data/proj/Captcha/ner-slot_filling/utils/data_utils.py", line 172, in load_word2vec
for i, line in enumerate(codecs.open(emb_path, 'r', 'utf-8')):
File "/home/jiang.li/ENTER/envs/pytorch/lib/python3.6/codecs.py", line 897, in open
file = builtins.open(filename, mode, buffering)
FileNotFoundError: [Errno 2] No such file or directory: '/Users/gaoquan/Documents/ml-learning/ner-learning/NER_medical_records/assets/cooked_corpus/vec.txt'

How should I fix this? Thank you.
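
One possible way to approach this, sketched under assumptions: the traceback shows that config["emb_file"] still holds a path from the author's machine (/Users/gaoquan/...), and the second traceback in this list shows the config being read with json.load. The helper below is hypothetical and not part of the repository. It rewrites the emb_file entry of a JSON config so that it points at an embeddings file that exists locally. Only the "emb_file" key is taken from the traceback; the function name and its arguments are placeholders.

```python
import json
import os


def fix_emb_path(config_path, local_emb_path):
    """Point config["emb_file"] at a local embeddings file if the stored path is missing.

    Hypothetical helper: only the "emb_file" key comes from the traceback.
    """
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)

    if os.path.exists(config.get("emb_file", "")):
        return config  # the stored path is valid, nothing to change

    if not os.path.exists(local_emb_path):
        raise FileNotFoundError(f"No embeddings file at {local_emb_path!r}")

    # Store an absolute path so the config works from any working directory.
    config["emb_file"] = os.path.abspath(local_emb_path)
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, ensure_ascii=False, indent=4)
    return config
```

Note that the log also reports "Created model with fresh parameters", which suggests no trained checkpoint was found, so fixing the path alone may not be enough to get meaningful predictions.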

data and performance problem

Thanks for sharing the code.
Questions:

  1. What is the training data?
  2. Did you compare the deep learning NER with the NER in jieba or HanLP?
  3. What is the key factor in determining the performance of NER models?

Thanks!

what do the slots mean?

slots = ['DIS', 'SYM', 'SGN', 'TES', 'DRU', 'SUR', 'PRE', 'PT', 'Dur', 'TP', 'REG', 'ORG', 'AT', 'PSB', 'DEG', 'FW', 'CL']

I am new to slot filling. What does each of these slots stand for?

Labels

['DIS', 'SYM', 'SGN', 'TES', 'DRU', 'SUR', 'PRE', 'PT', 'Dur', 'TP', 'REG', 'ORG', 'AT', 'PSB', 'DEG', 'FW', 'CL']
Hello, could you tell me what these labels mean?

sort

The statement from tensorflow.contrib.framework import sort fails: sort cannot be found.
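
A possible workaround, offered as a sketch rather than a confirmed fix: the tensorflow.contrib package does not exist in TensorFlow 2.x, and whether contrib.framework exposes sort depends on the installed 1.x release. The generic helper below (hypothetical, not part of the repository) tries several import locations in order and returns the first one that works. Whether tensorflow.sort is a drop-in replacement for the repository's use is an assumption to verify.

```python
import importlib


def import_first(candidates):
    """Return the first attribute importable from a list of (module, name) pairs."""
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError):
            continue  # try the next location
    raise ImportError(f"none of {candidates} could be imported")


# Assumed usage for this issue (requires TensorFlow to be installed):
# sort = import_first([
#     ("tensorflow.contrib.framework", "sort"),
#     ("tensorflow", "sort"),
# ])
```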

data

Could you share the data format?

Runtime error

Traceback (most recent call last):
File "E:/8、Chatbot机器人/nlu-master/sample_code/random_output.py", line 33, in <module>
dev_dct = json.load(open(sys.argv[1]), encoding='utf8')
IndexError: list index out of range
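
The IndexError is raised by sys.argv[1], which means the script was started without a command-line argument (the path of the JSON file to load). A minimal sketch of a more defensive version of that line follows; the function name load_dev_file is hypothetical. It also passes the encoding to open() instead of json.load(), because json.load() rejects the encoding keyword on Python 3.9 and later.

```python
import json
import sys


def load_dev_file(argv):
    """Load the JSON file named by the first command-line argument."""
    if len(argv) < 2:
        # Exit with a usage hint instead of an IndexError.
        raise SystemExit(f"usage: python {argv[0]} <path-to-json-file>")
    with open(argv[1], encoding="utf8") as f:
        return json.load(f)
```

In the script this would be called as dev_dct = load_dev_file(sys.argv), and the script itself would be run with the data file as its argument.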

Error during training

Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\dan\AppData\Local\Temp\jieba.cache
Loading model cost 0.647 seconds.
Prefix dict has been built successfully.
0it [00:00, ?it/s]开始训练模型!!!
13724it [00:00, 257175.79it/s]
3494it [00:00, 264304.62it/s]
1216it [00:00, 23299.77it/s]
576it [00:00, 115511.31it/s]
49it [00:00, 49108.94it/s]
155it [00:00, 155456.03it/s]
100%|██████████| 576/576 [00:00<00:00, 8812.50it/s]
100%|██████████| 155/155 [00:00<00:00, 7778.66it/s]
0%| | 0/49 [00:00<?, ?it/s]576 / 155 / 49 sentences in train / dev / test.
100%|██████████| 49/49 [00:00<00:00, 8189.39it/s]
Traceback (most recent call last):
File "E:/8、Chatbot机器人/ner-slot_filling-master/train_evaluate.py", line 248, in <module>
main(args)
File "E:/8、Chatbot机器人/ner-slot_filling-master/train_evaluate.py", line 242, in main
train()
File "E:/8、Chatbot机器人/ner-slot_filling-master/train_evaluate.py", line 143, in train
config = load_config(args.config_file)
File "E:\8、Chatbot机器人\ner-slot_filling-master\utils\utils.py", line 112, in load_config
return json.load(f)
File "K:\Anaconda\lib\json\__init__.py", line 299, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "K:\Anaconda\lib\json\__init__.py", line 354, in loads
return _default_decoder.decode(s)
File "K:\Anaconda\lib\json\decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "K:\Anaconda\lib\json\decoder.py", line 355, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 10 column 5 (char 180)

How can I solve this?
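
The message "Expecting property name enclosed in double quotes" means the parser expected a double-quoted key at the reported position (here line 10, column 5 of the config file). Common causes are a trailing comma after the last entry, a key in single quotes, or a comment in the file. Which of these applies here is not known from the traceback. The sketch below, with a hypothetical function name, shows the offending line so it can be inspected.

```python
import json


def describe_json_error(text):
    """Return None if text is valid JSON, else a message showing the offending line."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as err:
        # Split on "\n" only, which is how the json module counts lines.
        bad_line = text.split("\n")[err.lineno - 1]
        pointer = " " * (err.colno - 1) + "^"
        return f"{err.msg} (line {err.lineno}, column {err.colno})\n{bad_line}\n{pointer}"
```

To use it, read the config file passed to load_config as text, call describe_json_error on that text, and print the result.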

what are we supposed to do?

ub16c9@ub16c9-gpu:~/ub16_prj/ner-slot_filling$ python3.6 train_evaluate.py --clean True --train True --model_type bilstm
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.669 seconds.
Prefix dict has been built succesfully.
开始训练模型!!!
14253it [00:00, 291884.34it/s]
Python 3.6.8 (default, Dec 24 2018, 19:24:27)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)

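Running the training command, python train_evaluate.py --clean True --train True --model_type bilstm, drops straight into the Python interactive console.

Hello, I am using Python 3.6, and training produces the following output: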

Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.699 seconds.
Prefix dict has been built succesfully.
开始训练模型!!!
13724it [00:00, 133884.01it/s]
Python 3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)

Do you know why this happens?
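
A note on the output above: the "(InteractiveConsole)" banner is what Python's standard code.interact() prints when it starts, so one plausible cause (an assumption, not confirmed by the author) is a leftover debugging call in the training path. The sketch below, with a hypothetical function name, scans the .py files under a directory for such calls so they can be reviewed and removed.

```python
import os


def find_interact_calls(root, needles=("code.interact", "InteractiveConsole")):
    """Yield (path, line_number, line) for source lines that mention an interactive console."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                for number, line in enumerate(f, start=1):
                    if any(needle in line for needle in needles):
                        yield path, number, line.strip()
```

Running list(find_interact_calls(".")) from the repository root would list candidate lines. If none are found, the cause lies elsewhere.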
