infinilabs / analysis-stconvert

🚲 STConvert is an analyzer that converts Chinese characters between Traditional and Simplified forms.

License: Apache License 2.0

Java 100.00%
Topics: analyzer, traditional, elasticsearch, convert-chinese-characters

analysis-stconvert's People

Contributors

amooncake, jlleitschuh, medcl, nomoa, zeeshanasghar


analysis-stconvert's Issues

elasticsearch-analysis-stconvert 1.8.3

I installed the plugin elasticsearch-analysis-stconvert 1.8.3, updated elasticsearch.yml, restarted Elasticsearch, and encountered the following issue:

[2016-07-27 14:27:02,163][WARN ][cluster.action.shard ] [Nitro] [test][1] received shard failed for target shard [[test][1], node[0tGrnF7ZQZyKV__YKdwq_Q], [P], v[3], s[INITIALIZING], a[id=I03NjNN7SCaezVLXd9PNpw], unassigned_info[[reason=CLUSTER_RECOVERED], at[2016-07-27T06:27:01.631Z]]], indexUUID [nb_j7plXT4q6vqH9OZnkTQ], message [failed to create index], failure [IndexCreationException[failed to create index]; nested: IllegalStateException[[index.version.created] is not present in the index settings for index with uuid: [null]]; ]
[test] IndexCreationException[failed to create index]; nested: IllegalStateException[[index.version.created] is not present in the index settings for index with uuid: [null]];
at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:360)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyNewIndices(IndicesClusterStateService.java:294)
at org.elasticsearch.indices.cluster.IndicesClusterStateService.clusterChanged(IndicesClusterStateService.java:163)
at org.elasticsearch.cluster.service.InternalClusterService.runTasksForExecutor(InternalClusterService.java:610)
at org.elasticsearch.cluster.service.InternalClusterService$UpdateTask.run(InternalClusterService.java:772)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:231)
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:194)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalStateException: [index.version.created] is not present in the index settings for index with uuid: [null]
at org.elasticsearch.Version.indexCreated(Version.java:580)
at org.elasticsearch.index.analysis.Analysis.parseAnalysisVersion(Analysis.java:99)
at org.elasticsearch.index.analysis.AbstractTokenizerFactory.<init>(AbstractTokenizerFactory.java:40)
at org.elasticsearch.index.analysis.STConvertTokenizerFactory.<init>(STConvertTokenizerFactory.java:38)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.elasticsearch.common.inject.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:50)
at org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:86)
at org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:104)
at org.elasticsearch.common.inject.FactoryProxy.get(FactoryProxy.java:54)
at org.elasticsearch.common.inject.InjectorImpl$4$1.call(InjectorImpl.java:823)
at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:886)
at org.elasticsearch.common.inject.InjectorImpl$4.get(InjectorImpl.java:818)
at org.elasticsearch.common.inject.assistedinject.FactoryProvider2.invoke(FactoryProvider2.java:236)
at com.sun.proxy.$Proxy15.create(Unknown Source)
at org.elasticsearch.index.analysis.AnalysisService.<init>(AnalysisService.java:95)
at org.elasticsearch.index.analysis.AnalysisService.<init>(AnalysisService.java:70)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.elasticsearch.common.inject.DefaultConstructionProxyFactory$1.newInstance(DefaultConstructionProxyFactory.java:50)
at org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:86)
at org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:104)
at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:47)
at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:886)
at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:43)
at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:59)
at org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:46)
at org.elasticsearch.common.inject.SingleParameterInjector.inject(SingleParameterInjector.java:42)
at org.elasticsearch.common.inject.SingleParameterInjector.getAll(SingleParameterInjector.java:66)
at org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:85)
at org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:104)
at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:47)
at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:886)
at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:43)
at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:59)
at org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:46)
at org.elasticsearch.common.inject.SingleParameterInjector.inject(SingleParameterInjector.java:42)
at org.elasticsearch.common.inject.SingleParameterInjector.getAll(SingleParameterInjector.java:66)
at org.elasticsearch.common.inject.ConstructorInjector.construct(ConstructorInjector.java:85)
at org.elasticsearch.common.inject.ConstructorBindingImpl$Factory.get(ConstructorBindingImpl.java:104)
at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter$1.call(ProviderToInternalFactoryAdapter.java:47)
at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:886)
at org.elasticsearch.common.inject.ProviderToInternalFactoryAdapter.get(ProviderToInternalFactoryAdapter.java:43)
at org.elasticsearch.common.inject.Scopes$1$1.get(Scopes.java:59)
at org.elasticsearch.common.inject.InternalFactoryToProviderAdapter.get(InternalFactoryToProviderAdapter.java:46)
at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:201)
at org.elasticsearch.common.inject.InjectorBuilder$1.call(InjectorBuilder.java:193)
at org.elasticsearch.common.inject.InjectorImpl.callInContext(InjectorImpl.java:879)
at org.elasticsearch.common.inject.InjectorBuilder.loadEagerSingletons(InjectorBuilder.java:193)
at org.elasticsearch.common.inject.InjectorBuilder.injectDynamically(InjectorBuilder.java:175)
at org.elasticsearch.common.inject.InjectorBuilder.build(InjectorBuilder.java:110)
at org.elasticsearch.common.inject.InjectorImpl.createChildInjector(InjectorImpl.java:157)
at org.elasticsearch.common.inject.ModulesBuilder.createChildInjector(ModulesBuilder.java:55)
at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:358)
... 9 more

stconvert char filter causing highlighted search errors

char filter definition:

"ts_char_filter" : {
    "type" : "stconvert",
    "delimiter" : "#",
    "keep_both" : false,
    "convert_type" : "t2s"
 }

Without the char filter, a plain highlighted match_all query returns the matched document; the query JSON is shown below:

{
    "_source": {"exclude": ["text"]},
    "query" : {
        "match_all" : {}
    },
    "highlight": {
        "encoder": "html",
        "pre_tags": ["<span style='color: red;'>"],
        "post_tags": ["</span>"],
        "fields": {
            "text": {
                "fragment_size": 120
            }
        }
    }
}

However, after adding the traditional-to-simplified char filter, an error is returned:

"failures": [
    {
        "shard": 3,
        "index": "my_index",
        "node": "59j1d4i9Qdm0RcVfIN2UoQ",
        "reason": {
            "type": "invalid_token_offsets_exception",
            "reason": "Token www.victorycity.com.hk exceeds length of provided text sized 90702"
        }
    }
]

raw text is attached below:
test.txt

elasticsearch log:

at java.lang.Thread.run(Thread.java:748) [?:1.8.0_141]
Caused by: org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token www.victorycity.com.hk  exceeds length of provided text sized 90702

Could you please look into it? Thanks.

6.8.0 normalizer error

{
	"settings": {
		"analysis": {
			"analyzer": {
				"my_analyzer": {
					"type": "custom",
					"tokenizer": "ik_max_word",
					"char_filter": [
						"tsconvert"
					],
					"filter": [
						"lowercase"
					]
				}
			},
			"char_filter": {
				"tsconvert": {
					"type": "stconvert",
					"convert_type": "t2s"
				}
			},
			"normalizer": {
				"my_normalizer": {
					"type": "custom",
					"char_filter": [
						"tsconvert"
					],
					"filter": [
						"lowercase"
						, "asciifolding"
					]
				}
			}
		}
	},
	"mappings": {
		"_doc": {
			"properties": {
				"foo": {
					"type": "keyword",
					"normalizer": "my_normalizer"
				},
				"name": {
					"type": "text",
					"analyzer": "my_analyzer"
				}
			}
		}
	}
}
{
	"error": {
		"root_cause": [
			{
				"type": "remote_transport_exception",
				"reason": "[node-1][172.19.0.1:9301][indices:admin/create]"
			}
		],
		"type": "illegal_argument_exception",
		"reason": "Custom normalizer [my_normalizer] may not use char filter [tsconvert]"
	},
	"status": 400
}

Please take a look.

Unknown char_filter type [stconvert] when trying to add to index settings

Using Elastic Cloud, ES version is 6.4.1.
elasticsearch-analysis-stconvert-7.0.0.zip was installed as an installable custom plugin, and then enabled in "Manage plugins and settings".

Then I ran the following (similar to the example):

PUT /index_name/_settings
{
    "analysis" : {
            "char_filter" : {
                "tsconvert" : {
                    "type" : "stconvert",
                    "convert_type" : "t2s"
                }
            }
        }
}

Got response:

{
  "status": 400,
  "error": {
    "root_cause": [
      {
        "reason": "Unknown char_filter type [stconvert] for [tsconvert]",
        "type": "illegal_argument_exception"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "Unknown char_filter type [stconvert] for [tsconvert]"
  }
}

Incorrect t2s conversion of “電子”

As the title says, the term “電子” converts to Simplified incorrectly. Apart from terms such as “電子鐘錶” and “電子錶”, which are already in the dictionary, other terms such as “電子報” convert incorrectly: the output is still “電子”.

{
  "tokenizer" : "keyword",
  "filter" : ["lowercase"],
  "char_filter" : ["tsconvert"],
  "text" : "電子"
}

Output:
{
    "tokens": [
        {
            "token": "電子",
            "start_offset": 0,
            "end_offset": 2,
            "type": "word",
            "position": 0
        }
    ]
}

How can a char filter get the name of the field being processed?

I am studying Elasticsearch string filtering based on the stconvert char filter, and I have found where the character processing happens. However, I need to process different fields of a single document differently. For example, for the index below:
(image)
I only want to process the field word, not foo. At the moment all I can get is the input string; I cannot tell which field the string being processed belongs to. Does anyone know a way to get it? @medcl

How to configure stconvert in django-dsl?

html_strip = analyzer(
    'ik',
    tokenizer="ik_smart",
    filter=["standard", "lowercase", "stop", "snowball"],
    # char_filter=["html_strip"]
    # char_filter={
    #     "tsconvert": {
    #         "convert_type": "t2s",
    #         "type": "stconvert"
    #     }
    # }
    char_filter=["stconvert"]
)

I want to configure convert_type: t2s, but I get this error:
elasticsearch.exceptions.TransportError: TransportError(500, 'settings_exception', 'Failed to load settings from [{"number_of_shards":3,"analysis":{"analyzer":{"ik":{"filter":["standard","lowercas e","stop","snowball"],"char_filter":[{"tsconvert":{"convert_type":"t2s","type":"stconvert"}}],"typ e":"custom","tokenizer":"ik_smart"}}},"number_of_replicas":1}]')
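The serialized settings in the error show the char filter definition nested as an object inside the analyzer's char_filter list. Elasticsearch instead expects the definition under analysis.char_filter and only its name inside the analyzer. A sketch of the settings shape the DSL call would need to produce (names taken from the snippet above; this is an assumption about the cause, not a verified fix):

```json
{
  "analysis": {
    "char_filter": {
      "tsconvert": {
        "type": "stconvert",
        "convert_type": "t2s"
      }
    },
    "analyzer": {
      "ik": {
        "type": "custom",
        "tokenizer": "ik_smart",
        "char_filter": ["tsconvert"],
        "filter": ["standard", "lowercase", "stop", "snowball"]
      }
    }
  }
}
```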

Implement a Simplified/Traditional conversion CharFilter

Provide a Simplified/Traditional conversion char filter that normalizes text to either Simplified or Traditional before tokenization. This simplifies the subsequent tokenization step and avoids incorrect splits by the tokenizer.
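The idea can be sketched with a toy character map. The mapping entries and class name here are purely illustrative: the real plugin loads full dictionaries from bundled files and also performs word-level (phrase) conversion, not just character-by-character replacement.

```java
import java.util.HashMap;
import java.util.Map;

public class PreTokenizeConvertSketch {
    // Tiny illustrative mapping; the real plugin ships full t2s/s2t dictionaries.
    private static final Map<Character, Character> T2S = new HashMap<>();
    static {
        T2S.put('國', '国');
        T2S.put('電', '电');
        T2S.put('視', '视');
    }

    // Normalize Traditional characters to Simplified before tokenization,
    // so the tokenizer only ever sees one script.
    public static String toSimplified(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            sb.append(T2S.getOrDefault(c, c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toSimplified("中國電視")); // -> 中国电视
    }
}
```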

Want the effect of stconvert plus ik

Modeled on the pinyin+ik analyzer medcl wrote earlier:

PUT http://localhost:9200/medcl/ -d'
{
    "index" : {
        "analysis" : {
            "analyzer" : {
                "custom_stconvert_analyzer" : {
                    "tokenizer" : "ik_smart",
                    "filter" : "stconvert"
                }
            }
        }
    }
}'

But it has no effect. Any ideas?
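A configuration worth trying (a sketch, not a verified fix) declares the conversion as a char filter and runs it before the ik tokenizer, the same shape used elsewhere on this page; note the original request references a filter named stconvert without declaring it anywhere in the settings:

```json
PUT http://localhost:9200/medcl/
{
    "index": {
        "analysis": {
            "char_filter": {
                "tsconvert": {
                    "type": "stconvert",
                    "convert_type": "t2s"
                }
            },
            "analyzer": {
                "custom_stconvert_analyzer": {
                    "type": "custom",
                    "char_filter": ["tsconvert"],
                    "tokenizer": "ik_smart"
                }
            }
        }
    }
}
```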

The Properties class causes lock contention and degrades concurrent indexing performance

When writing data to Elasticsearch concurrently, this plugin becomes a performance bottleneck. The hotspot is STConverter.java:104, in java.util.Hashtable.containsKey.
This easily causes lock contention: once write concurrency rises, threads pile up here.
In fact, after the dictionary is loaded at initialization via Properties (which extends Hashtable), it is never modified anywhere else, so the lock is unnecessary.
Don't use Hashtable (or Properties) for lookups; switch to HashMap instead.
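A minimal sketch of the suggested change: after the dictionary has been loaded once, copy the Properties into a plain HashMap so every subsequent lookup is unsynchronized (the class and method names here are hypothetical, not the plugin's actual code):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

public class DictSketch {
    // Proposed fix: copy the dictionary out of the synchronized
    // Properties/Hashtable once, right after loading, so that every later
    // containsKey/get runs on an unsynchronized HashMap.
    public static Map<String, String> toLockFreeMap(Properties props) {
        Map<String, String> map = new HashMap<>(props.size() * 2);
        for (String key : props.stringPropertyNames()) {
            map.put(key, props.getProperty(key));
        }
        // Wrap read-only: the dictionary is never mutated after initialization.
        return Collections.unmodifiableMap(map);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("電", "电"); // illustrative entry, not the real dictionary
        Map<String, String> dict = toLockFreeMap(props);
        System.out.println(dict.get("電")); // lock-free lookup
    }
}
```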

How to build for ElasticSearch 7.17.7?

There are no release binaries.

Setting the Elasticsearch version manually in pom.xml generates errors. Shouldn't the elasticsearch-analysis-stconvert version match the target Elasticsearch version?

[ERROR] /root/elasticsearch-analysis-stconvert/src/main/java/org/elasticsearch/index/analysis/STConvertAnalyzerProvider.java:[28,9] constructor AbstractIndexAnalyzerProvider in class org.elasticsearch.index.analysis.AbstractIndexAnalyzerProvider<T> cannot be applied to given types;

Error in t2s conversion of 恭弘

Using the tsconvert tokenizer or char_filter, "恭弘" gets converted to "叶 叶 恭弘:叶:叶".

I think the cause is line 4050 of t2s.properties, which is oddly formatted and is the only line in the whole file containing "=":

恭弘=叶 恭弘:叶

This causes downstream problems with SmartCN, and the character counts are wrong for every token after this entry on the same line.

I recompiled a copy of v1.8.5 without this line, and 恭弘 now passes through unchanged, which is definitely better, if not ideal.
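For reference, java.util.Properties treats the first unescaped "=" or ":" on a line as the key/value separator (another issue on this page notes the dictionary is loaded via Properties). If this line is parsed that way, it yields exactly the bogus mapping reported above. A standalone demonstration of the Properties behavior (class and method names are illustrative):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PropsParseDemo {
    // java.util.Properties splits a line at the first unescaped '=' or ':',
    // so "恭弘=叶 恭弘:叶" parses as key "恭弘" with value "叶 恭弘:叶".
    public static String parseValue(String line, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        System.out.println(parseValue("恭弘=叶 恭弘:叶", "恭弘"));
    }
}
```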

How to use stconvert in ik-analyzed indexes?

Here are my settings and mapping:

settings index: { number_of_shards: 1 } do
    mappings do
      indexes :seq_in_nb, type: :integer
      indexes :likes_count, type: :integer
      indexes :title, analyzer: :ik
      indexes :content_without_markup, analyzer: :ik
      indexes :shared, analyzer: :keyword, type: :boolean
      indexes :locked, analyzer: :keyword, type: :boolean
    end
  end

The content under this setting includes both Traditional and Simplified Chinese, so we want searches in either script to return the same results.

Under the current setting, my contents are split into two groups for each Chinese word (e.g. 中國 and 中国).

Here are two questions:

  1. Could I collapse words with the same meaning into a single term (e.g. so documents containing "中國" or "中国" map to the same index term)?
  2. I use the following query setting:
...
{ match_phrase: {content_without_markup: { analyzer: :t2s_convert, query: keyword, slop: 10} } }
...

I could use "國" to search for "国", but can't use "中國" to search for "**". Why this situation happen? It seems almost worked. But why phrase failed?

Support es 2.4.0

medcl, could you please release a build for es 2.4.0? Many thanks. orz

7.17.7 zip release

Hi,

The 7.17.7 release is missing the zip. Can you please build that?
Thank you very much.

Failed to find analyzer tsconvert/tsconvert_keep_both/stconvert_keep_both

stconvert

curl -XGET http://localhost:9200/index/_analyze\?text\=%e5%8c%97%e4%ba%ac%e5%9b%bd%e9%99%85%e7%94%b5%e8%a7%86%e5%8f%b0%2c%e5%8c%97%e4%ba%ac%e5%9c%8b%e9%9a%9b%e9%9b%bb%e8%a6%96%e8%87%ba\&analyzer\=stconvert

{"tokens":[{"token":"北京國際電視檯","start_offset":0,"end_offset":7,"type":"word","position":0},{"token":"北京國際電視臺"


tsconvert

curl -XGET http://localhost:9200/index/_analyze\?text\=%e5%8c%97%e4%ba%ac%e5%9b%bd%e9%99%85%e7%94%b5%e8%a7%86%e5%8f%b0%2c%e5%8c%97%e4%ba%ac%e5%9c%8b%e9%9a%9b%e9%9b%bb%e8%a6%96%e8%87%ba\&analyzer\=tsconvert

{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[XYXYHAu][127.0.0.1:9300][indices:admin/analyze[s]]"}],"type":"illegal_argument_exception","reason":"failed to find analyzer [tsconvert]"},"status":400}


tsconvert_keep_both

curl -XGET http://localhost:9200/index/_analyze\?text\=%e5%8c%97%e4%ba%ac%e5%9b%bd%e9%99%85%e7%94%b5%e8%a7%86%e5%8f%b0%2c%e5%8c%97%e4%ba%ac%e5%9c%8b%e9%9a%9b%e9%9b%bb%e8%a6%96%e8%87%ba\&analyzer\=tsconvert_keep_both

{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[XYXYHAu][127.0.0.1:9300][indices:admin/analyze[s]]"}],"type":"illegal_argument_exception","reason":"failed to find analyzer [tsconvert_keep_both]"},"status":400}


stconvert_keep_both

curl -XGET http://localhost:9200/index/_analyze\?text\=%e5%8c%97%e4%ba%ac%e5%9b%bd%e9%99%85%e7%94%b5%e8%a7%86%e5%8f%b0%2c%e5%8c%97%e4%ba%ac%e5%9c%8b%e9%9a%9b%e9%9b%bb%e8%a6%96%e8%87%ba\&analyzer\=stconvert_keep_both

{"error":{"root_cause":[{"type":"remote_transport_exception","reason":"[XYXYHAu][127.0.0.1:9300][indices:admin/analyze[s]]"}],"type":"illegal_argument_exception","reason":"failed to find analyzer [stconvert_keep_both]"},"status":400}


version info

  • elasticsearch version: 5.3.0
  • elasticsearch-analysis-stconvert: 5.3.0

Built 8.10.2, 8.10.3, 8.10.4 and 7.17.14 packages

Build notes for new elasticsearch-analysis-stconvert packages

The author of elasticsearch-analysis-stconvert presumably has no time to maintain this plugin since starting a company, and no package supporting 8.10.2 has been built. So I forked the code, pulled it locally, and changed the version number in pom.xml, e.g.:
<elasticsearch.version>8.10.2</elasticsearch.version>
I also upgraded nlp-lang-1.7.jar to nlp-lang-1.7.9.jar, since the pinyin data in the nlp-lang component has been updated.

After these changes, run the build command mvn clean compile package. The build succeeds and produces elasticsearch-analysis-stconvert-8.10.2.zip.
Example plugin install command:
./elasticsearch-plugin -v install file:///var/services/homes/lizongbo/esplugins/elasticsearch-analysis-stconvert-8.10.2.zip

For everyone's convenience, I have built packages for the following versions:

8.10.2, 8.10.3, 8.10.4, 7.17.14

https://github.com/lizongbo/elasticsearch-analysis-stconvert/releases/tag/v8.10.4
https://github.com/lizongbo/elasticsearch-analysis-stconvert/releases/tag/v8.10.3
https://github.com/lizongbo/elasticsearch-analysis-stconvert/releases/tag/v8.10.2
https://github.com/lizongbo/elasticsearch-analysis-stconvert/releases/tag/v7.17.14

If you need packages for other versions, ask here and I can build and upload them.

Could there be a whitelist or blacklist?

We use simplified-to-traditional conversion in queries, but found that some Simplified characters should not be converted. For example, one character is automatically converted into another even though both characters exist in Traditional Chinese, which causes problems.

Is there any solution?

ES 5.6.8 support

Can you please update stconvert to support ES 5.6.8 as you did for ik and pinyin?

Cannot find tsconvert_keep_both / stconvert_keep_both

tsconvert and stconvert can be found, but tsconvert_keep_both and stconvert_keep_both cannot:

curl -XGET http://localhost:9200/stconvert/_analyze?text=%e5%8c%97%e4%ba%ac%e5%9b%bd%e9%99%85%e7%94%b5%e8%a7%86%e5%8f%b0%2c%e5%8c%97%e4%ba%ac%e5%9c%8b%e9%9a%9b%e9%9b%bb%e8%a6%96%e8%87%ba\&tokenizer=tsconvert_keep_both\&pretty

{
  "error" : {
    "root_cause" : [
      {
        "type" : "remote_transport_exception",
        "reason" : "[HadoopTemp][192.168.1.208:9300][indices:admin/analyze[s]]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "failed to find tokenizer under [tsconvert_keep_both]"
  },
  "status" : 400
}

curl -XGET http://localhost:9200/stconvert/_analyze?text=%e5%8c%97%e4%ba%ac%e5%9b%bd%e9%99%85%e7%94%b5%e8%a7%86%e5%8f%b0%2c%e5%8c%97%e4%ba%ac%e5%9c%8b%e9%9a%9b%e9%9b%bb%e8%a6%96%e8%87%ba\&tokenizer=stconvert_keep_both\&pretty
{
  "error" : {
    "root_cause" : [
      {
        "type" : "remote_transport_exception",
        "reason" : "[HadoopTemp][192.168.1.208:9300][indices:admin/analyze[s]]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "failed to find tokenizer under [stconvert_keep_both]"
  },
  "status" : 400
}

Tokenization is not ideal after configuring the converter

After configuring the converter, tokenization is not ideal. For example, searching for 清華 returns no results, while 清華大學 does. What could be the reason? My settings are as follows:
{
  "analysis": {
    "char_filter": {
      "tsconvert": {
        "type": "stconvert",
        "convert_type": "t2s"
      }
    },
    "analyzer": {
      "my_analyzer": {
        "type": "custom",
        "char_filter": [
          "tsconvert"
        ],
        "tokenizer": "ik_smart",
        "filter": [
          "lowercase"
        ]
      }
    }
  }
}
Any suggestions?
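One way to narrow this down is to inspect what the analyzer actually emits via the _analyze API: if ik_smart keeps 清華大學 as a single token at index time, a query for the standalone term 清華 will not match it. A sketch of the request (index name assumed):

```json
GET /my_index/_analyze
{
  "analyzer": "my_analyzer",
  "text": "清華大學"
}
```

Comparing the token list for 清華大學 against the one for 清華 shows whether the miss comes from the conversion step or from ik_smart's segmentation.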

Mapping definition for [key_word] has unsupported parameters: [analyzer : tsconvert]

I get an error when creating an index. Here is my request:

curl -XPUT my_index
{
    "mappings": {
        "item": {
            "properties": {
                "key_word": {
                    "type": "keyword", 
                    "analyzer": "tsconvert"
                }, 
                "title": {
                    "type": "keyword",
                    "analyzer": "tsconvert"
                }
            }
        }
    }
}

Submitting it produces the error below, even though the plugin is installed.

{
    "error": {
        "root_cause": [
            {
                "type": "mapper_parsing_exception",
                "reason": "Mapping definition for [key_word] has unsupported parameters:  [analyzer : tsconvert]"
            }
        ],
        "type": "mapper_parsing_exception",
        "reason": "Failed to parse mapping [item]: Mapping definition for [key_word] has unsupported parameters:  [analyzer : tsconvert]",
        "caused_by": {
            "type": "mapper_parsing_exception",
            "reason": "Mapping definition for [key_word] has unsupported parameters:  [analyzer : tsconvert]"
        }
    },
    "status": 400
}
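The error occurs because keyword fields do not accept an analyzer parameter (they accept a normalizer instead); an analyzer applies only to text fields. A hedged sketch of a mapping that would be accepted, assuming a tsconvert analyzer is defined in the index settings:

```json
{
  "mappings": {
    "item": {
      "properties": {
        "key_word": {
          "type": "text",
          "analyzer": "tsconvert"
        },
        "title": {
          "type": "text",
          "analyzer": "tsconvert"
        }
      }
    }
  }
}
```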

Cannot use this char_filter in a normalizer

Version: 6.3.1

"st_normalizer": {
  "type": "custom",
  "char_filter": [
    "tsconvert"
  ]
}

STConvertCharFilterFactory.class should implement MultiTermAwareComponent; otherwise the check in the Elasticsearch code fails and the filter cannot be used in a normalizer:

if (charFilter instanceof MultiTermAwareComponent == false) {
    throw new IllegalArgumentException("Custom normalizer [" + name() + "] may not use char filter [" + charFilterName + "]");
}

The new version still has this problem.
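The check and the proposed fix can be sketched standalone. The real interface is org.elasticsearch.index.analysis.MultiTermAwareComponent (ES 6.x); both it and the factory are redeclared here in miniature only so the example compiles on its own:

```java
public class NormalizerCheckSketch {

    public interface CharFilterFactory {
        String name();
    }

    // Miniature stand-in for Elasticsearch's MultiTermAwareComponent.
    public interface MultiTermAwareComponent {
        Object getMultiTermComponent();
    }

    // The proposed change: have the char filter factory also implement
    // MultiTermAwareComponent so the normalizer's instanceof check passes.
    public static class STConvertCharFilterFactorySketch
            implements CharFilterFactory, MultiTermAwareComponent {
        @Override
        public String name() {
            return "tsconvert";
        }

        @Override
        public Object getMultiTermComponent() {
            // The conversion is not multi-term sensitive, so the factory can
            // return itself unchanged.
            return this;
        }
    }

    // Mirrors the check that throws
    // "Custom normalizer [...] may not use char filter [...]".
    public static boolean allowedInNormalizer(CharFilterFactory charFilter) {
        return charFilter instanceof MultiTermAwareComponent;
    }

    public static void main(String[] args) {
        System.out.println(allowedInNormalizer(new STConvertCharFilterFactorySketch())); // true
    }
}
```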
