Comments (13)

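ansjsun commented on June 3, 2024
## ansj configuration
ansj:
 dic_path: "ansj/dic/user/" ## user dictionary location
 ambiguity_path: "ansj/dic/ambiguity.dic" ## ambiguity dictionary
 enable_name_recognition: true ## person-name recognition
 enable_num_recognition: true ## number recognition
 enable_quantifier_recognition: false ## quantifier recognition
 enabled_stop_filter: true ## whether to filter based on a dictionary
 stop_path: "ansj/dic/stopLibrary.dic" ## stop-word filter dictionary

This is covered in the documentation on the home page. Press Ctrl+F and search for: 分词文件配置 (tokenizer file configuration)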

from elasticsearch-analysis-ansj.

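fqhaier commented on June 3, 2024

I configured it as described, but after testing I still don't get the expected result. Is there anything else I should watch out for?
Here are my configuration and test results:
1. Configuration
################################## ANSJ PLUG CONFIG ################################
# default analyzer, indexing
index.analysis.analyzer.default.type: index_ansj
# default analyzer, search
index.analysis.analyzer.default_search.type: query_ansj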
ansj:
dic_path: "ansj/dic/user/"
ambiguity_path: "ansj/dic/ambiguity.dic"
enable_name_recognition: true
enable_num_recognition: true
enable_quantifier_recognition: false
enabled_stop_filter: true
stop_path: "ansj/dic/stopLibrary.dic"
2. Startup log
[2016-05-27 13:42:04,546][INFO ][DICLOG ] init user userLibrary ok path is : D:\IDEA_APL\elasticsearch-2.3.1\config\ansj\dic\user\userlib.dic
[2016-05-27 13:42:04,548][WARN ][DICLOG ] init ambiguity warning :D:\IDEA_APL\elasticsearch-2.3.1\config\ansj\dic\ambiguity.dic because : file not found or failed to read !
[2016-05-27 13:42:05,550][INFO ][DICLOG ] init user userLibrary ok path is : D:\IDEA_APL\elasticsearch-2.3.1\plugins\elasticsearch-analysis-ansj\default.dic
3. Test results
http://127.0.0.1:9200/_cat/test_index1/analyze?text=斌斌$强强$庆雨&analyzer=dic_ansj
斌 0 1 0 word
斌 1 2 1 word
$ 2 3 2 word
强强 3 5 3 word
$ 5 6 4 word
庆 6 7 5 word
雨 7 8 6 word

fqhaier commented on June 3, 2024

The contents of my custom dictionary are as follows:
D:\IDEA_APL\elasticsearch-2.3.1\config\ansj\dic\user\userlib.dic
斌斌 a 37557
强强 a 37557
庆雨 a 37557

ansjsun commented on June 3, 2024

Did you not change the configuration at all?

Change it to this:


1. Configuration
################################## ANSJ PLUG CONFIG ################################
# default analyzer, indexing
index.analysis.analyzer.default.type: index_ansj
# default analyzer, search
index.analysis.analyzer.default_search.type: query_ansj
ansj:
    dic_path: "D:/IDEA_APL/elasticsearch-2.3.1/config/ansj/dic/user/userlib.dic"

Remember the YAML indentation.

ansjsun commented on June 3, 2024

Right now it looks like the dictionary isn't even being found. Also, the fields of each dictionary entry must be separated by a tab (\t).
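
The tab requirement is easy to miss because tabs and spaces look alike in most editors. Below is a minimal sketch of a check for it, assuming each entry has three fields (word, tag, frequency) as in the sample userlib.dic posted in this thread. The function name `find_malformed_entries` and the three-field assumption are illustrative and not part of ansj.

```python
from typing import List, Tuple


def find_malformed_entries(text: str, expected_fields: int = 3) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs whose tab-separated field count is wrong.

    expected_fields=3 is an assumption based on the sample entries in this
    thread (word, tag, frequency); adjust it if your dictionary differs.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored rather than reported
        if len(line.split("\t")) != expected_fields:
            problems.append((number, line))
    return problems


# The second entry uses spaces instead of tabs, so it is reported.
sample = "斌斌\ta\t37557\n强强 a 37557\n庆雨\ta\t37557\n"
print(find_malformed_entries(sample))  # [(2, '强强 a 37557')]
```

To check a real file, read it with the encoding it was saved in and pass the text to the function.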

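fqhaier commented on June 3, 2024

Still not working.
1. I fixed the configuration indentation and switched to the full path:
ansj:
dic_path: "D:/IDEA_APL/elasticsearch-2.3.1/config/ansj/dic/user/userlib.dic"
2. The dictionary fields are already separated by a tab (\t), so that is not the problem. If I put these three entries into D:\IDEA_APL\elasticsearch-2.3.1\plugins\elasticsearch-analysis-ansj\default.dic, I get the expected result.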

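fqhaier commented on June 3, 2024

This problem is solved.
Cause: my custom dictionary file was saved with file type PC (Windows line endings) and ANSI encoding.
Fix: changing it to file type Unix and UTF-8 encoding made it work.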
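
The same conversion can be scripted instead of done in an editor. This is a minimal sketch, assuming that "ANSI" on the machine in question means the GBK code page (typical for a Chinese-language Windows install, but an assumption here). The helper name `to_utf8_unix` is hypothetical.

```python
def to_utf8_unix(raw: bytes, source_encoding: str = "gbk") -> bytes:
    """Re-encode dictionary bytes as UTF-8 with Unix (LF) line endings.

    source_encoding="gbk" is an assumption about what "ANSI" means on the
    machine that produced the file; pass the real code page if it differs.
    """
    text = raw.decode(source_encoding)
    # Normalize Windows (CRLF) and old Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


# Simulate a dictionary saved on Windows: GBK bytes with CRLF line endings.
windows_bytes = "斌斌\ta\t37557\r\n强强\ta\t37557\r\n".encode("gbk")
converted = to_utf8_unix(windows_bytes)
print(converted.decode("utf-8"))
```

For a file on disk, read it in binary mode, pass the bytes through the function, and write the result back in binary mode so Python does not translate the line endings again.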

fqhaier commented on June 3, 2024

Now there is a new problem: the stop-word dictionary has no effect.
1. Configuration:

ansj:
dic_path: "ansj/dic/user/" ## user dictionary location
ambiguity_path: "ansj/dic/ambiguity.dic" ## ambiguity dictionary
enabled_stop_filter: true ## whether to filter based on a dictionary
stop_path: "ansj/dic/stopLibrary.dic" ## stop-word filter dictionary
2. D:\IDEA_APL\elasticsearch-2.3.1_1\config\ansj\dic\stopLibrary.dic
File type Unix, file encoding UTF-8
"
.

,



3. Test results (the stop words were not filtered out)
http://127.0.0.1:9200/_cat/test_index1/analyze?text=斌斌"强强"庆雨&analyzer=dic_ansj
斌斌 0 2 0 word
" 2 3 1 word
强强 3 5 2 word
" 5 6 3 word
庆雨 6 8 4 word

fqhaier commented on June 3, 2024

The log shows that the stop-word dictionary loaded successfully:
[2016-05-27 17:15:16,785][INFO ][ansj-initializer ] ansj停止词典加载完毕!
[2016-05-27 17:15:17,722][INFO ][DICLOG ] init core library ok use time :918
[2016-05-27 17:15:18,504][INFO ][DICLOG ] init ngram ok use time :753
[2016-05-27 17:15:18,510][INFO ][ansj-initializer ] ansj分词器预热完毕,可以使用!

fqhaier commented on June 3, 2024

I looked through the elasticsearch-analysis-ansj-master source, and it seems AnsjElasticConfigurator.filter is never called.

ansjsun commented on June 3, 2024

Quite possibly, because it looks like that feature was removed. I'll confirm later.
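
If the plugin does not apply the stop dictionary itself, one possible stopgap is to drop stop words on the client after the tokens come back. The sketch below is not the plugin's code and does not restore AnsjElasticConfigurator.filter. It assumes the tokens are already available as plain strings and that the stop dictionary holds one entry per line, as in the stopLibrary.dic shown in this thread. Both helper names are hypothetical.

```python
from typing import Iterable, List, Set


def parse_stop_words(dictionary_text: str) -> Set[str]:
    """Build a stop-word set from dictionary text, one entry per line.

    Blank lines are skipped. Entries are not stripped, so a stop word
    such as a lone quote character is kept exactly as written.
    """
    return {line for line in dictionary_text.splitlines() if line != ""}


def drop_stop_words(tokens: Iterable[str], stop_words: Set[str]) -> List[str]:
    """Return the tokens that are not in the stop-word set, in order."""
    return [token for token in tokens if token not in stop_words]


# Stop words and tokens taken from the example earlier in this thread.
stop_words = parse_stop_words('"\n.\n\n,\n')
tokens = ["斌斌", '"', "强强", '"', "庆雨"]
filtered = drop_stop_words(tokens, stop_words)
print(filtered)  # ['斌斌', '强强', '庆雨']
```

This only cleans up tokens on the client side. Terms written to the index by the analyzer are unaffected, so it is a workaround for inspection or query building rather than a fix.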


kinhunt commented on June 3, 2024

+1

buaanie commented on June 3, 2024

I'm running into the same problem on version 2.3.1. How can it be solved?
