baidu / ddparser

Baidu's open-source dependency parsing system

License: Apache License 2.0

Python 98.92% Shell 1.08%
chinese-dependency-parser chinese-nlp dependency-parser dependency-parsing python syntax-parser

ddparser's People

Contributors

zhangyimi

ddparser's Issues

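Question about OOM issues

When I train the model on my own data, some sentences are long and training easily runs out of memory (OOM). I had to add a limit in the code that keeps sentence length to at most 15 (on a GPU with 11 GB of memory) before training would run normally. The default batch size is 2048; when I change this number, the GPU memory actually used does not change, so the setting seems to have no effect. If I do not want to limit sentence length, which parameters or code should I change to fix the OOM?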

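How to use DDParser in a no-AVX environment

The server is an ESXi cluster, and because of the EVC feature the CPU is downgraded and does not support AVX instructions. I compiled a noavx build of paddlepaddle, but ddparser uses the core_avx module directly:
from paddle.fluid.core_avx import VarDesc
How can ddparser run together with a noavx build of paddlepaddle?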

Explicit structure representation

Why do many of the triples have None as the subject? For example: ((None, '用于', '描述客观世界中概念'), 'SVO')
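
Until the cause is clarified, one way to work with such output is to separate complete triples from those with a missing slot. The sketch below assumes the result is a list of ((subject, predicate, object), pattern) tuples shaped like the one quoted above. The helper name split_complete_triples is hypothetical and not part of ddparser, and the second triple is made up for illustration.

```python
from typing import List, Optional, Tuple

Triple = Tuple[Optional[str], Optional[str], Optional[str]]
Labeled = Tuple[Triple, str]


def split_complete_triples(results: List[Labeled]) -> Tuple[List[Labeled], List[Labeled]]:
    """Split results into (complete, partial), where partial has a None slot."""
    complete, partial = [], []
    for triple, pattern in results:
        if any(part is None for part in triple):
            partial.append((triple, pattern))
        else:
            complete.append((triple, pattern))
    return complete, partial


results = [
    ((None, '用于', '描述客观世界中概念'), 'SVO'),  # shape quoted in the report
    (('百度', '是', '公司'), 'SVO'),                # made-up complete triple
]
complete, partial = split_complete_triples(results)
```

Keeping the partial triples in a separate list, rather than discarding them, makes it easier to inspect which sentences produce a missing subject.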

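Hangs when calling the server API through fastapi for dependency parsing

I set up a simple HTTP server with fastapi. The client sends a sentence for parsing, and the server hangs when it reaches
r = ddp.parse(sentence)
Is this mode of use unsupported? Is there a workaround?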

ddparser 1.0.5 only supports the ernie-lstm model; any other value raises an error

For example,
ddp = DDParser(encoding_model='transformer')
raises:
File "/usr/local/lib/python3.8/site-packages/ddparser/parser/data_struct/utils.py", line 295, in download_model_from_url
    download_model_path = DOWNLOAD_MODEL_PATH_DICT[model]
KeyError: 'transformer'

The cause appears to be that DOWNLOAD_MODEL_PATH_DICT contains only one model:
DOWNLOAD_MODEL_PATH_DICT = {
    'ernie-lstm': "https://ddparser.bj.bcebos.com/DDParser-ernie-lstm-1.0.3.tar.gz",
}
and the download_model_from_url function does not check whether model is in DOWNLOAD_MODEL_PATH_DICT before indexing:
download_model_path = DOWNLOAD_MODEL_PATH_DICT[model]

Are the other models mentioned in the documentation still supported?
lstm, transformer, ernie-1.0, ernie-tiny
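
The missing membership check described above could look roughly like the following. This is a sketch, not the library's code: DOWNLOAD_MODEL_PATH_DICT is copied from the report, while resolve_model_url is a hypothetical helper name.

```python
DOWNLOAD_MODEL_PATH_DICT = {
    'ernie-lstm': "https://ddparser.bj.bcebos.com/DDParser-ernie-lstm-1.0.3.tar.gz",
}


def resolve_model_url(model: str) -> str:
    """Return the download URL for model, or fail with a readable message."""
    if model not in DOWNLOAD_MODEL_PATH_DICT:
        supported = ', '.join(sorted(DOWNLOAD_MODEL_PATH_DICT))
        raise ValueError(
            "no downloadable model named %r; available: %s" % (model, supported))
    return DOWNLOAD_MODEL_PATH_DICT[model]
```

With this check, an unsupported name such as 'transformer' would produce a ValueError that lists the available names instead of a bare KeyError.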

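Visualizing the dependency tree

Hello, ddparser requires a paddlepaddle version below 2.0, but paddlehub requires a paddlepaddle version above 2.0. If I want to render the parse results as a tree like the ones in the paper, is there another way to do it? Thanks!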
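
One dependency-free option is to render the parse as an indented text tree. The sketch below assumes the parser output is a dict with 'word', 'head', and 'deprel' lists, where heads are 1-based and 0 marks the root. The function name render_tree is hypothetical, and the sample values are an assumed parse of "百度是一家高科技公司".

```python
from typing import Dict, List


def render_tree(words: List[str], heads: List[int], deprels: List[str]) -> str:
    """Render a dependency parse as an indented text tree.

    heads holds 1-based indices into words; 0 marks the root.
    Assumes the heads form a tree (no cycles).
    """
    children: Dict[int, List[int]] = {}
    for idx, head in enumerate(heads, start=1):
        children.setdefault(head, []).append(idx)

    lines: List[str] = []

    def visit(node: int, depth: int) -> None:
        lines.append("%s%s (%s)" % ("  " * depth, words[node - 1], deprels[node - 1]))
        for child in children.get(node, []):
            visit(child, depth + 1)

    for root in children.get(0, []):
        visit(root, 0)
    return "\n".join(lines)


parse = {
    'word': ['百度', '是', '一家', '高科技', '公司'],
    'head': [2, 0, 5, 5, 2],
    'deprel': ['SBV', 'HED', 'ATT', 'ATT', 'VOB'],
}
tree = render_tree(parse['word'], parse['head'], parse['deprel'])
print(tree)
```

Each line shows a word with its relation to its head, and indentation reflects depth below the HED root.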

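Identical sentence patterns, two different results

[four screenshots, 2020-09-06]

Putting 水果 (fruit), 苹果 (apple), 梨子 (pear), and 凤梨 (pineapple) in the same position gives two different results.
The "fronted object" (宾语前置) analysis seems to fit grammatical convention better. How can I train the model myself?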

After use, it keeps printing "loading the fields."

[2021-04-22 20:55:08,953][root][INFO] - loading the fields.
[2021-04-22 20:55:16,132][root][INFO] - loading the fields.
[2021-04-22 20:55:24,069][root][INFO] - loading the fields.
[2021-04-22 20:55:32,051][root][INFO] - loading the fields.
[2021-04-22 20:55:41,988][root][INFO] - loading the fields.
[2021-04-22 20:55:59,328][root][INFO] - loading the fields.
[2021-04-22 20:56:09,152][root][INFO] - loading the fields.


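from typing import Dict, List

from ddparser import DDParser


def _add_Parser_seq(data: List[Dict], cfg) -> None:
    for d in data:
        """
        Run dependency parsing with Baidu's DDParser tool
        """
        # Note: the parser is constructed on every iteration, as in the report.
        ddp = DDParser()
        dictParse = ddp.parse(d['sentence'])
        d['dependency'] = dictParse[0]['head']

Debugging shows that this snippet is what prints the message. Where in the code is "loading the fields." emitted?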
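
If the message is logged each time a DDParser object is constructed (an assumption; the report does not confirm it), building the parser once outside the loop would avoid the repeated loading. The sketch below uses a stand-in FakeParser class so that it runs without ddparser installed; with the real library, a single DDParser() instance would be passed in instead. Both FakeParser and add_dependency_heads are hypothetical names.

```python
from typing import Dict, List


class FakeParser:
    """Stand-in for DDParser (hypothetical); counts how often it is built."""
    instances = 0

    def __init__(self) -> None:
        FakeParser.instances += 1  # a real parser would load its model here

    def parse(self, sentence: str) -> List[Dict]:
        # Dummy result exposing the same 'head' key used in the report.
        return [{'head': [0] * len(sentence)}]


def add_dependency_heads(data: List[Dict], parser) -> None:
    """Annotate each item in place, reusing one parser for all items."""
    for d in data:
        d['dependency'] = parser.parse(d['sentence'])[0]['head']


data = [{'sentence': 'ab'}, {'sentence': 'cde'}]
parser = FakeParser()  # constructed once, outside the loop
add_dependency_heads(data, parser)
```

Passing the parser in as an argument also keeps the expensive setup out of the per-sentence path, whatever its cost turns out to be.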

ddparser currently has a bug: it cannot support newer versions of ernie

ddparser\ernie\__init__.py contains a version check. The original source is:
paddle_version = [int(i) for i in paddle.__version__.split('.')]
if paddle_version[1] < 7:
    raise RuntimeError('paddle-ernie requires paddle 1.7+, got %s' % paddle.__version__)

With this logic, once paddle is updated to 2.0.0+, paddle_version[1] < 7 holds and the check fails, which is unreasonable. In addition, paddle_version = [int(i) for i in paddle.__version__.split('.')] can raise an error in int(i) when the version string contains letters. I therefore changed the source to:
paddle_version = [i for i in paddle.__version__.split('.')]
if 10 * int(paddle_version[0]) + int(paddle_version[1]) < 17:
    raise RuntimeError('paddle-ernie requires paddle 1.7+, got %s' %
                       paddle.__version__)

After saving, from ddparser import DDParser runs without error, and the problem is solved.
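
An alternative to the weighted sum is to compare (major, minor) tuples directly. The sketch below is not the project's code; parse_major_minor and check_min_version are hypothetical names, and the version string is passed in as a plain argument so the example runs without paddle installed.

```python
import re
from typing import Tuple


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '1.8.5' or '2.0.0-rc1'."""
    match = re.match(r'(\d+)\.(\d+)', version)
    if match is None:
        raise ValueError('unrecognized version string: %r' % version)
    return int(match.group(1)), int(match.group(2))


def check_min_version(version: str, minimum: Tuple[int, int] = (1, 7)) -> None:
    """Raise RuntimeError if version is older than minimum."""
    if parse_major_minor(version) < minimum:
        raise RuntimeError('paddle-ernie requires paddle %d.%d+, got %s'
                           % (minimum[0], minimum[1], version))
```

Tuple comparison orders versions lexicographically, so 2.0 passes and 1.6 fails, and it does not depend on the 10 * major + minor weighting, which would accept a hypothetical version 0.17.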

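ddparser does not support paddlepaddle 2.0

Installing ddparser with pip currently uninstalls paddlepaddle 2.0 from the environment and installs version 1.8.5. paddlepaddle 2.0.0 is now a stable release; please make ddparser support paddlepaddle 2.0.0.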

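Data format when training your own model

[image]
As shown in the screenshot, I got an error when running sh run_trash.sh. I printed puncts.shape and its value was (1,1,0).
[image]
What is the problem?
This is the line of my data that contains the punctuation mark:
[image]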

Poor results on complex sentences

**石油网消息(记者储宝 杨碧泓)在抗击新冠肺炎疫情的关键时刻,3月12日,集团公司党组书记、董事长、集团公司新冠肺炎疫情防控工作领导小组组长戴厚良与中东地区疫情防控领导小组通电话,了解当地疫情防控进展情况,关心慰问奋战在海外抗疫一线员工,嘱咐大家要坚持科学防控,细化落实防控措施,特别要注重加强自身防护,为疫情控制做出实实在在的贡献

1 **石油网 ATT 2 nz
2 消息 HED 0 n
3 ( MT 7 w
4 记者 ATT 5 n
5 储宝 SBV 6 PER
6 ATT 7 w
7 杨碧泓 COO 2 PER
8 ) MT 7 w
9 在 ATT 15 p
10 抗击 ATT 15 v
11 新冠 ATT 13 n
12 肺炎 ATT 13 n
13 疫情 VOB 10 n
14 的 MT 10 u
15 关键时刻 ADV 17 n
16 , MT 15 w
17 3月12日 IC 2 TIME
18 , MT 17 w
19 集团公司 ATT 20 n
20 党组书记 ATT 31 job
21 、 MT 20 w
22 董事长 COO 20 job
23 、 MT 22 w
24 集团公司 COO 20 n
25 新冠 ATT 29 n
26 肺炎 ATT 27 n
27 疫情 ATT 28 n
28 防控 ATT 29 vn
29 工作 ATT 30 n
30 领导小组 ATT 31 n
31 组长 SBV 32 n
32 戴厚良 IC 17 PER
33 与 MT 39 p
34 中东 ATT 35 LOC
35 地区 ATT 38 n
36 疫情 ATT 37 n
37 防控 ATT 38 vn
38 领导小组 ATT 39 n
39 通电话 COO 32 v
40 , MT 39 w
41 了解 COO 32 v
42 当地 ATT 46 s
43 疫情 ATT 44 n
44 防控 ATT 45 vn
45 进展 ATT 46 vn
46 情况 VOB 41 n
47 , MT 41 w
48 关心 SBV 57 v
49 慰问 VOB 48 v
50 奋战 COO 48 v
51 在 ADV 50 p
52 海外 ATT 55 s
53 抗疫 ATT 55 vn
54 一线 ATT 55 n
55 员工 POB 51 n
56 , MT 48 w
57 嘱咐 COO 41 v
58 大家 DBL 57 r
59 要 ADV 60 v
60 坚持 DBL 57 v
61 科学 ATT 62 ad
62 防控 VOB 60 v
63 , MT 60 w
64 细化 IC 71 v
65 落实 ATT 67 v
66 防控 ATT 67 vn
67 措施 VOB 64 n
68 , MT 64 w
69 特别 ADV 71 d
70 要 ADV 71 v
71 注重 IC 57 v
72 加强 VOB 71 v
73 自身 ATT 74 r
74 防护 VOB 72 vn
75 , MT 71 w
76 为 ADV 79 p
77 疫情 ATT 78 n
78 控制 POB 76 vn
79 做出 COO 71 v
80 实实在在 ATT 82 a
81 的 MT 80 u
82 贡献 VOB 79 n

In cases like this, HED and SBV are not extracted accurately.

bad case

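Sentence 1: the head word is a PER
[screenshot, 2020-08-12 7:16:38 PM]

Sentence 2: one SBV points to another SBV
[screenshot, 2020-08-12 7:29:51 PM]

Sentence 3: this also looks quite wrong
[screenshot, 2020-08-13 5:11:00 PM]

Sentence 4: after manual correction, the correct dependencies should apparently be as follows
[screenshot, 2020-08-12 7:29:27 PM]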

bad case @ 8.30

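Sentence 1: this looks the most off; the HED is "两队"
[screenshot, 2020-08-30 6:40:12 PM]

Sentence 2: "甲" and "乙" are neither in a coordinate relation nor jointly attached as attributive modifiers of "两队"
[screenshot, 2020-08-30 6:35:44 PM]

Sentence 3: the result is acceptable
[screenshot, 2020-08-30 6:43:04 PM]

Sentence 4: the result is also acceptable
[screenshot, 2020-08-30 6:45:38 PM]

A result like the following would seem closer to ideal:
[screenshot, 2020-08-30 6:52:42 PM]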

bad case

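Sentence 1: "数月后" should apparently modify "调查"
[screenshot, 2020-08-19 3:50:36 PM]

Sentence 2: after replacing "闻名" with "出名", "数月后" becomes the object of "调查"
[screenshot, 2020-08-19 3:51:49 PM]

Sentence 3: the situation is still similar to sentence 1
[screenshot, 2020-08-19 3:59:55 PM]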

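Can it analyze novels?

Can it analyze Chinese novels?
For example: identifying the protagonist, extracting character dialogue, sentiment analysis, scene recognition, and so on.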

ddparser v1.0.5 raises an error when segs contain a space

For example, either of the following calls:
ddp.parse_seg([['百度', ' ', '是', '一家', '高科技', '公司']])

ddp.parse("百度 是一家高科技公司")

Error:
list index out of range

v0.1.2 does not have this problem.
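
As a possible workaround on the caller's side, whitespace-only tokens can be dropped before the segments are handed to the parser. This sketch does not fix the underlying bug, and the helper name drop_blank_tokens is hypothetical. Note that removing tokens shifts the indices of later words relative to the original input.

```python
from typing import List


def drop_blank_tokens(segs: List[List[str]]) -> List[List[str]]:
    """Remove empty and whitespace-only tokens from each segmented sentence."""
    return [[tok for tok in sent if tok.strip()] for sent in segs]


segs = [['百度', ' ', '是', '一家', '高科技', '公司']]
cleaned = drop_blank_tokens(segs)
# cleaned could then be passed on, e.g. ddp.parse_seg(cleaned)
```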

ddparser import fails

from ddparser import DDParser
Traceback (most recent call last):
File "", line 1, in
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/ddparser/init.py", line 24, in
from .run import DDParser
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/ddparser/run.py", line 26, in
import LAC
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/LAC/init.py", line 23, in
from .lac import LAC
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/LAC/lac.py", line 28, in
import paddle.fluid as fluid
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/init.py", line 37, in
import paddle.complex
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/complex/init.py", line 15, in
from . import tensor
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/complex/tensor/init.py", line 15, in
from . import math
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/complex/tensor/math.py", line 15, in
from paddle.common_ops_import import *
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/common_ops_import.py", line 15, in
from paddle.fluid.layer_helper import LayerHelper
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/fluid/init.py", line 56, in
from . import contrib
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/fluid/contrib/init.py", line 27, in
from . import slim
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/init.py", line 15, in
from .core import *
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/core/init.py", line 15, in
from . import config
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/core/config.py", line 19, in
from ..prune import *
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/prune/init.py", line 17, in
from . import prune_strategy
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/prune/prune_strategy.py", line 20, in
import prettytable as pt
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/prettytable/init.py", line 48, in
version = importlib_metadata.version(name)
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/importlib_metadata/init.py", line 869, in version
return distribution(distribution_name).version
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/importlib_metadata/init.py", line 513, in version
return self.metadata['Version']
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/importlib_metadata/init.py", line 496, in metadata
self.read_text('METADATA')
File "/home/suixf/miniconda3/envs/nlp/lib/python3.6/site-packages/importlib_metadata/init.py", line 828, in read_text
return self._path.joinpath(filename).read_text(encoding='utf-8')
AttributeError: 'PosixPath' object has no attribute 'read_text'

requests.exceptions.SSLError: HTTPSConnectionPool(host='ddparser.bj.bcebos.com', port=443): Max retries exceeded with url: /DDParser-ernie-lstm-1.0.6.tar.gz (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:852)'),))

Hello, I ran into the error below when using the ddparser tool. How should I resolve it? Thanks.

C:\Users\dell\AppData\Local\conda\conda\envs\frw\python.exe C:/Users/dell/Desktop/data(1)/DecompRC-master/baidu_parser.py
ERROR:root:Failed to download model, please try again
ERROR:root:error: HTTPSConnectionPool(host='ddparser.bj.bcebos.com', port=443): Max retries exceeded with url: /DDParser-ernie-lstm-1.0.6.tar.gz (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:852)'),))
Traceback (most recent call last):
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\urllib3\connectionpool.py", line 696, in urlopen
self._prepare_proxy(conn)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\urllib3\connectionpool.py", line 964, in _prepare_proxy
conn.connect()
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\urllib3\connection.py", line 359, in connect
conn = self._connect_tls_proxy(hostname, conn)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\urllib3\connection.py", line 502, in connect_tls_proxy
ssl_context=ssl_context,
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\urllib3\util\ssl_.py", line 432, in ssl_wrap_socket
ssl_sock = ssl_wrap_socket_impl(sock, context, tls_in_tls)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\urllib3\util\ssl_.py", line 474, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\ssl.py", line 407, in wrap_socket
_context=self, _session=session)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\ssl.py", line 817, in init
self.do_handshake()
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\ssl.py", line 1077, in do_handshake
self._sslobj.do_handshake()
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\ssl.py", line 689, in do_handshake
self._sslobj.do_handshake()
ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:852)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\requests\adapters.py", line 449, in send
timeout=timeout
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\urllib3\connectionpool.py", line 756, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\urllib3\util\retry.py", line 573, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='ddparser.bj.bcebos.com', port=443): Max retries exceeded with url: /DDParser-ernie-lstm-1.0.6.tar.gz (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:852)'),))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:/Users/dell/Desktop/data(1)/DecompRC-master/baidu_parser.py", line 2, in
ddp=DDParser()
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\ddparser\run.py", line 304, in init
raise e
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\ddparser\run.py", line 300, in init
utils.download_model_from_url(model_files_path, encoding_model)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\ddparser\parser\data_struct\utils.py", line 305, in download_model_from_url
r = requests.get(download_model_path, stream=True)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\requests\api.py", line 76, in get
return request('get', url, params=params, **kwargs)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\requests\api.py", line 61, in request
return session.request(method=method, url=url, **kwargs)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\requests\sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\requests\sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "C:\Users\dell\AppData\Local\conda\conda\envs\frw\lib\site-packages\requests\adapters.py", line 514, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='ddparser.bj.bcebos.com', port=443): Max retries exceeded with url: /DDParser-ernie-lstm-1.0.6.tar.gz (Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:852)'),))

Process finished with exit code 1

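About importing a user dictionary

Hello, and thank you for the open-source tool.
The LAC dependency lets users add a custom dictionary. Could DDParser add a parameter or a corresponding method for this? Importing LAC separately just for that feels redundant.

I would also like to ask about the strengths and weaknesses of DDParser compared with HIT's LTP for dependency parsing.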

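Is there documentation explaining the POS tags?

Hello, my project needs POS tags, and ddparser does output tagging results, but I could not find which part of speech each tag stands for. Is there documentation I can read? (The paper does not explain this.) Many thanks!
P.S. References for the annotation scheme would be even better (for example, how words with multiple parts of speech are handled and how function words are tagged).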

AttributeError: 'PosixPath' object has no attribute 'read_text'

sudo pip install ddparser
After a successful install, the following error appears:

from ddparser import DDParser
Traceback (most recent call last):
File "", line 1, in
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/ddparser/init.py", line 24, in
from .run import DDParser
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/ddparser/run.py", line 26, in
import LAC
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/LAC/init.py", line 23, in
from .lac import LAC
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/LAC/lac.py", line 28, in
import paddle.fluid as fluid
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/paddle/init.py", line 31, in
import paddle.dataset
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/paddle/dataset/init.py", line 25, in
import paddle.dataset.sentiment
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/paddle/dataset/sentiment.py", line 30, in
import nltk
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/nltk/init.py", line 143, in
from nltk.chunk import *
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/nltk/chunk/init.py", line 157, in
from nltk.chunk.api import ChunkParserI
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/nltk/chunk/api.py", line 13, in
from nltk.parse import ParserI
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/nltk/parse/init.py", line 100, in
from nltk.parse.transitionparser import TransitionParser
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/nltk/parse/transitionparser.py", line 22, in
from sklearn.datasets import load_svmlight_file
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/sklearn/datasets/init.py", line 22, in
from .twenty_newsgroups import fetch_20newsgroups
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/sklearn/datasets/twenty_newsgroups.py", line 44, in
from ..feature_extraction.text import CountVectorizer
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/sklearn/feature_extraction/init.py", line 10, in
from . import text
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/sklearn/feature_extraction/text.py", line 28, in
from ..preprocessing import normalize
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/init.py", line 6, in
from ._function_transformer import FunctionTransformer
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/sklearn/preprocessing/_function_transformer.py", line 5, in
from ..utils.testing import assert_allclose_dense_sparse
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/sklearn/utils/testing.py", line 718, in
import pytest
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/pytest.py", line 6, in
from _pytest.assertion import register_assert_rewrite
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/_pytest/assertion/init.py", line 6, in
from _pytest.assertion import rewrite
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/_pytest/assertion/rewrite.py", line 20, in
from _pytest.assertion import util
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/_pytest/assertion/util.py", line 5, in
import _pytest._code
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/_pytest/_code/init.py", line 2, in
from .code import Code # noqa
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/_pytest/_code/code.py", line 11, in
import pluggy
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/pluggy/init.py", line 16, in
from .manager import PluginManager, PluginValidationError
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/pluggy/manager.py", line 6, in
import importlib_metadata
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/importlib_metadata/init.py", line 547, in
version = version(name)
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/importlib_metadata/init.py", line 509, in version
return distribution(distribution_name).version
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/importlib_metadata/init.py", line 260, in version
return self.metadata['Version']
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/importlib_metadata/init.py", line 248, in metadata
self.read_text('METADATA')
File "/Users/xxxxx/anaconda3/lib/python3.7/site-packages/importlib_metadata/init.py", line 469, in read_text
return self._path.joinpath(filename).read_text(encoding='utf-8')
AttributeError: 'PosixPath' object has no attribute 'read_text'
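
This error means that the Path class in use lacks read_text, a method the standard-library pathlib has provided since Python 3.5. One possible cause, which the reports here do not confirm, is an outdated third-party pathlib package shadowing the standard library. The diagnostic sketch below shows where pathlib is loaded from and whether read_text is available; the function name describe_pathlib is hypothetical.

```python
import pathlib
import sys


def describe_pathlib() -> dict:
    """Collect basic facts about the pathlib module that Python imported."""
    return {
        'python': sys.version.split()[0],
        'pathlib_file': getattr(pathlib, '__file__', None),
        'has_read_text': hasattr(pathlib.Path, 'read_text'),
    }


info = describe_pathlib()
print(info)
```

If pathlib_file points into site-packages rather than the standard library directory, the installed backport is a likely suspect.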

AttributeError: 'DDParser' object has no attribute 'lac'

My environment is python3.7, LAC 2.0.4, ddparser 0.1.1.

I tried:
from ddparser import DDParser
ddp = DDParser()
ddp.parse("百度是一家高科技公司")

The error is:

AttributeError Traceback (most recent call last)
in
4 from ddparser import DDParser
5 ddp = DDParser()
----> 6 ddp.parse("百度是一家高科技公司")

~/tfpy3/lib/python3.7/site-packages/ddparser/run.py in parse(self, inputs)
336 'head': [2, 0, 5, 5, 2], 'deprel': ['SBV', 'HED', 'ATT', 'ATT', 'VOB'], 'prob': [1.0, 1.0, 1.0, 1.0, 1.0]}]
337 """
--> 338 if not self.lac:
339 self.lac = LAC.LAC(mode='lac' if self.use_pos else "seg",
340 use_cuda=self.args.use_cuda)

AttributeError: 'DDParser' object has no attribute 'lac'

import ddparser raises an error

/Users/laiwenbo/anaconda3/envs/testddp/bin/python /Users/laiwenbo/it/docker/info_extract/parse_structure_ddparser.py
Traceback (most recent call last):
File "/Users/laiwenbo/it/docker/info_extract/parse_structure_ddparser.py", line 60, in
structure_info = parse_structure_from_text(text, grain='big')
File "/Users/laiwenbo/it/docker/info_extract/parse_structure_ddparser.py", line 19, in parse_structure_from_text
from ddparser import DDParser
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/ddparser/init.py", line 24, in
from .run import DDParser
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/ddparser/run.py", line 26, in
import LAC
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/LAC/init.py", line 23, in
from .lac import LAC
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/LAC/lac.py", line 28, in
import paddle.fluid as fluid
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/init.py", line 37, in
import paddle.complex
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/complex/init.py", line 15, in
from . import tensor
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/complex/tensor/init.py", line 15, in
from . import math
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/complex/tensor/math.py", line 15, in
from paddle.common_ops_import import *
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/common_ops_import.py", line 15, in
from paddle.fluid.layer_helper import LayerHelper
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/fluid/init.py", line 56, in
from . import contrib
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/fluid/contrib/init.py", line 27, in
from . import slim
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/init.py", line 15, in
from .core import *
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/core/init.py", line 15, in
from . import config
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/core/config.py", line 19, in
from ..prune import *
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/prune/init.py", line 17, in
from . import prune_strategy
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/paddle/fluid/contrib/slim/prune/prune_strategy.py", line 20, in
import prettytable as pt
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/prettytable/init.py", line 48, in
version = importlib_metadata.version(name)
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/importlib_metadata/init.py", line 861, in version
return distribution(distribution_name).version
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/importlib_metadata/init.py", line 523, in version
return self.metadata['Version']
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/importlib_metadata/init.py", line 506, in metadata
self.read_text('METADATA')
File "/Users/laiwenbo/anaconda3/envs/testddp/lib/python3.6/site-packages/importlib_metadata/init.py", line 820, in read_text
return self._path.joinpath(filename).read_text(encoding='utf-8')
AttributeError: 'PosixPath' object has no attribute 'read_text'

Process finished with exit code 1

I have checked, and the versions
ddparser 0.1.2
paddlepaddle 1.8.5
LAC 2.1.1
match the requirements. What is the problem? I have already tried all sorts of things.

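The parse of 把 (ba) sentences seems incorrect

[image]
The POB dependency 把 -> 书 is correct.
The POB arc 卖 -> 把 seems incorrect: it is not a preposition, so analyzing it as a preposition-object relation should be wrong.

Other 把 sentences have a similar problem.
[image]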

Debugging does not work

The debugger can only run once; after that it crashes outright and debugging is impossible.
