Giter Site home page Giter Site logo

jolicode / emoji-search Goto Github PK

View Code? Open in Web Editor NEW
217.0 25.0 65.0 19.07 MB

:smile: Emoji synonyms to build your own emoji-capable search engine (elasticsearch, solr, OpenSearch)

Home Page: https://jolicode.com/blog/elasticsearch-icu-now-understands-emoji

License: MIT License

PHP 92.08% Makefile 7.92%
emoji elasticsearch plugin emoticons cldr analyzer elasticsearch-plugin hacktoberfest opensearch

emoji-search's Introduction

πŸ™‚ Emoji, flags & emoticons support for Elasticsearch

Add support for emoji and flags in any Lucene compatible search engine!

If you wish to search 🍩 to find donuts in your documents, you came to the right place. We offer synonym files ready for usage in Elasticsearch and OpenSearch analyzer.

Test all synonym files on a real Elasticsearch

Requirements to index emoji in Elasticsearch

There is no requirements for Elasticsearch >= 6.7.

Using older version of Elasticsearch? Open me! πŸ–±
Version Requirements
Elasticsearch >= 6.4 and < 6.7 You need to install the official ICU Plugin. See our blog post about this change.
Elasticsearch < 6.4 You need our custom ICU Tokenizer Plugin, see our blog post (2016).

Run the following test to verify that you get 4 EMOJI tokens:

GET _analyze
{
  "text": ["🍩 πŸ‡«πŸ‡· πŸ‘©β€πŸš’ πŸš£πŸΎβ€β™€"]
}

The Synonyms, flags and emoticons

What you need to search with emoji is a way to expand them to words that can match searches and documents, in your language. That's the goal of the synonym dictionaries.

We build Solr / Lucene compatible synonyms files in all languages supported by Unicode CLDR so you can set them up in an analyzer. It looks like this:

πŸ‘©β€πŸš’ => πŸ‘©β€πŸš’, firefighter, firetruck, woman
πŸ‘©β€βœˆ => πŸ‘©β€βœˆ, pilot, plane, woman
πŸ₯“ => πŸ₯“, bacon, meat, food
πŸ₯” => πŸ₯”, potato, vegetable, food
πŸ˜… => πŸ˜…, cold, face, open, smile, sweat
πŸ˜† => πŸ˜†, face, laugh, mouth, open, satisfied, smile
🚎 => 🚎, bus, tram, trolley
πŸ‡«πŸ‡· => πŸ‡«πŸ‡·, france
πŸ‡¬πŸ‡§ => πŸ‡¬πŸ‡§, united kingdom

For emoticons, use this mapping with a char_filter to replace emoticons by emoji.

Installation

Download the emoji and emoticon file you want from this repository and store them in PATH_TO_ES/config/analysis (or anywhere Elasticsearch can read).

config
β”œβ”€β”€ analysis
β”‚Β Β  β”œβ”€β”€ cldr-emoji-annotation-synonyms-en.txt
β”‚Β Β  └── emoticons.txt
β”œβ”€β”€ elasticsearch.yml
...

Use them like this (this is a complete english example with Elasticsearch >= 6.7):

PUT /tweets
{
  "settings": {
    "analysis": {
      "filter": {
        "english_emoji": {
          "type": "synonym",
          "synonyms_path": "analysis/cldr-emoji-annotation-synonyms-en.txt"
        },
        "emoji_variation_selector_filter": {
          "type": "pattern_replace",
          "pattern": "\\uFE0E|\\uFE0F",
          "replace": ""
        },
        "english_stop": {
          "type":       "stop",
          "stopwords":  "_english_"
        },
        "english_keywords": {
          "type":       "keyword_marker",
          "keywords":   ["example"]
        },
        "english_stemmer": {
          "type":       "stemmer",
          "language":   "english"
        },
        "english_possessive_stemmer": {
          "type":       "stemmer",
          "language":   "possessive_english"
        }
      },
      "analyzer": {
        "english_with_emoji": {
          "tokenizer": "standard",
          "filter": [
            "english_possessive_stemmer",
            "lowercase",
            "emoji_variation_selector_filter",
            "english_emoji",
            "english_stop",
            "english_keywords",
            "english_stemmer"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "analyzer": "english_with_emoji"
      }
    }
  }
}

You can now test the result with:

GET tweets/_analyze
{
  "field": "content",
  "text": "🍩 πŸ‡«πŸ‡· πŸ‘©β€πŸš’ πŸš£πŸΎβ€β™€"
}

How to contribute

Build from CLDR SVN

You will need:

  • php cli
  • php zip and curl extensions

Edit the tag in tools/build-released.php and run php tools/build-released.php.

Update emoticons

Run php tools/build-emoticon.php.

Licenses

Emoji data courtesy of CLDR. See unicode-license.txt for details. Some modifications are done on the data, see here. Emoticon data based on https://github.com/wooorm/emoticon/ (MIT).

This repository in distributed under MIT License. Feel free to use and contribute as you please!

emoji-search's People

Contributors

damienalexandre avatar harmenjanssen avatar lyrixx avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

emoji-search's Issues

Provide a proper "emoji_synonym_filter"

The plugin should provide a new token filter, "emoji_synonym_filter", which would simplify the configuration from this:

PUT /en-emoji
{
  "settings": {
    "analysis": {
      "filter": {
        "english_emoji": {
          "type": "synonym",
          "synonyms_path": "analysis/cldr-emoji-annotation-synonyms-en.txt" 
        }
      },
      "analyzer": {
        "english_with_emoji": {
          "tokenizer": "emoji_tokenizer",
          "filter": [
            "lowercase",
            "english_emoji"
          ]
        }
      }
    }
  }
}

To this:

{
  "settings": {
    "analysis": {
      "filter": {
        "english_emoji": {
          "type": "emoji_synonym_filter",
          "language": "en" 
        }
      },
      "analyzer": {
        "english_with_emoji": {
          "tokenizer": "emoji_tokenizer",
          "filter": [
            "lowercase",
            "english_emoji"
          ]
        }
      }
    }
  }
}

One of the main feature is that the synonym files could be shipped directly with the bundle!

An Emoji capable tokenizer

We need a specific tokenizer to handle emoji properly in Elasticsearch, here is the specification.

Emoji Tokenizer

Should be based on the standard tokenizer to keep the grammar logic and Unicode Text Segmentation implementation.

But add support for symbols / emoji as specified in tr51.

Test cases

Input Tokens
Get some donut's. Get some donut's
Get some 🍩's. Get some 🍩's
"🍩" 🍩

Add more tests

  • The tests should also check basic ICU_Tokenizer stuffs, maybe we can run the ICUTokenizer tests against this plugin to make sure we didn't break anything?
  • We should add tests for some edge cases like modifiers, composed emoji, variation selectors...

Mark the synonym token filter as updateable and provide a better example

Reading https://www.elastic.co/blog/boosting-the-power-of-elasticsearch-with-synonyms - we quickly see Emoji Search can benefit from the new POST /synonym_test/_reload_search_analyzers API.

Index-time synonyms have several disadvantages:

  • The index might get bigger, because all synonyms must be indexed.
  • Search scoring, which relies on term statistics, might suffer because synonyms are also counted, and the statistics for less common words become skewed.
  • Synonym rules can’t be changed for existing documents without reindexing.

...

Using synonyms in search-time analyzers on the other hand doesn’t have many of the above mentioned problems:

  • The index size is unaffected.
  • The term statistics in the corpus stay the same.
  • Changes in the synonym rules don’t require reindexing of documents.

And:

Starting with Elasticsearch 7.3, this reopening of indices in order to see changes in synonym files is no longer needed.

We must:

  • provide a new "search time" config
  • write a better documentation for updating synonyms in production

analysis-emoji 5.2.2 was designed for 5.0.0?

trying to install with my 5.2.2 version of ES:

Neptune-2:elasticsearch-5.2.2$ bin/elasticsearch-plugin install https://github.com/jolicode/emoji-search/releases/download/5.2.2/analysis-emoji-5.2.2.zip -v
-> Downloading https://github.com/jolicode/emoji-search/releases/download/5.2.2/analysis-emoji-5.2.2.zip
Retrieving zip from https://github.com/jolicode/emoji-search/releases/download/5.2.2/analysis-emoji-5.2.2.zip
[=================================================] 100%
Exception in thread "main" java.lang.IllegalArgumentException: Plugin [analysis-emoji] is incompatible with Elasticsearch [5.2.2]. Was designed for version [5.0.0]
	at org.elasticsearch.plugins.PluginInfo.readFromProperties(PluginInfo.java:108)
	at org.elasticsearch.plugins.InstallPluginCommand.verify(InstallPluginCommand.java:421)
	at org.elasticsearch.plugins.InstallPluginCommand.install(InstallPluginCommand.java:486)
	at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:212)
	at org.elasticsearch.plugins.InstallPluginCommand.execute(InstallPluginCommand.java:195)
	at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
	at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:69)
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:122)
	at org.elasticsearch.cli.Command.main(Command.java:88)
	at org.elasticsearch.plugins.PluginCli.main(PluginCli.java:47)

is there any work-around or change i can make to fix this?

Remove variation selectors

Variation selector break the synonyms.

Tried:

"cleaning_char_filter": {
          "type": "mapping",
          "mappings": [
            "\uFE0E=>",  
            "\uFE0F=>"
          ]
        }

But it does works.

To test with the modifiers and ZWJ too.

Automatic emoji file update uppon CLDR release

In the light of what Github Actions can do, we could have it create a new Pull Request everytime there is a new CLDR release.

The process is:

That would be awesome.

For information we are at version 38.1, CLDR is at 41!

cldr-emoji-annotation-synonyms-en.txt: Terms "completely eliminated by analyzer"

Hi @damienalexandre ... Thank you for continuing to collate and update the synonyms files ...

Related to #26, I've used your example mappings with one slight change (placing the synonyms file in /etc/elasticsearch/synonyms instead) as follows:

  1. Installed the analysis-icu plugin and restarted Elasticsearch
  2. Copied the synonyms file as sudo cp synonyms/cldr-emoji-annotation-synonyms-en.txt /etc/elasticsearch/synonyms/

Then ...

PUT /emoji-capable
{
  "settings": {
    "analysis": {
      "filter": {
        "english_emoji": {
          "type": "synonym",
          "synonyms_path": "synonyms/cldr-emoji-annotation-synonyms-en.txt" 
        }
      },
      "analyzer": {
        "english_with_emoji": {
          "tokenizer": "icu_tokenizer",
          "filter": [
            "english_emoji"
          ]
        }
      }
    }
  }
}

which gives me ...

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "failed to build synonyms"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "failed to build synonyms",
    "caused_by" : {
      "type" : "parse_exception",
      "reason" : "Invalid synonym rule at line 859",
      "caused_by" : {
        "type" : "illegal_argument_exception",
        "reason" : "term: * was completely eliminated by analyzer"
      }
    }
  },
  "status" : 400
}

Line 859 is the first instance of a synonym which has a non-alpha synonym, in this case:

✨ => ✨, *, sparkle, sparkles, star

Removing the * from the definition works but then the same issue recurs from line 1257 (βœ… => βœ…, βœ“, button, check, mark) onwards.

This is on Elasticsearch 7.8.0 on Ubuntu 20.04.

Is this a problem with the synonyms file or am I missing something very obvious?

Parsing signatures failed: Method not found while parsing signature: java.lang.Thread#stop(java.lang.Throwable)

I am trying to build with gradle in my local machine but I got this:

Task :forbiddenApisMain FAILED

FAILURE: Build failed with an exception.

  • What went wrong:
    Execution failed for task ':forbiddenApisMain'.

Parsing signatures failed: Method not found while parsing signature: java.lang.Thread#stop(java.lang.Throwable)
Configure project :
=======================================
Elasticsearch Build Hamster says Hello!
=======================================
Gradle Version : 4.10.2
OS Info : Linux 4.15.0-36-generic (amd64)
JDK Version : Oracle Corporation 11.0.1 [Java HotSpot(TM) 64-Bit Server VM 11.0.1+13-LTS]
JAVA_HOME : /usr/lib/jvm/java-11-oracle
Random Testing Seed : BFFF22ECAE0D02CD

The stacktrace prints the following:

  • Exception is:
    org.gradle.api.tasks.TaskExecutionException: Execution failed for task ':forbiddenApisMain'.
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeActions(ExecuteActionsTaskExecuter.java:110)
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.execute(ExecuteActionsTaskExecuter.java:77)
    at org.gradle.api.internal.tasks.execution.OutputDirectoryCreatingTaskExecuter.execute(OutputDirectoryCreatingTaskExecuter.java:51)
    at org.gradle.api.internal.tasks.execution.SkipUpToDateTaskExecuter.execute(SkipUpToDateTaskExecuter.java:59)
    at org.gradle.api.internal.tasks.execution.ResolveTaskOutputCachingStateExecuter.execute(ResolveTaskOutputCachingStateExecuter.java:54)
    at org.gradle.api.internal.tasks.execution.ValidatingTaskExecuter.execute(ValidatingTaskExecuter.java:59)
    at org.gradle.api.internal.tasks.execution.SkipEmptySourceFilesTaskExecuter.execute(SkipEmptySourceFilesTaskExecuter.java:101)
    at org.gradle.api.internal.tasks.execution.FinalizeInputFilePropertiesTaskExecuter.execute(FinalizeInputFilePropertiesTaskExecuter.java:44)
    at org.gradle.api.internal.tasks.execution.CleanupStaleOutputsExecuter.execute(CleanupStaleOutputsExecuter.java:91)
    at org.gradle.api.internal.tasks.execution.ResolveTaskArtifactStateTaskExecuter.execute(ResolveTaskArtifactStateTaskExecuter.java:62)
    at org.gradle.api.internal.tasks.execution.SkipTaskWithNoActionsExecuter.execute(SkipTaskWithNoActionsExecuter.java:59)
    at org.gradle.api.internal.tasks.execution.SkipOnlyIfTaskExecuter.execute(SkipOnlyIfTaskExecuter.java:54)
    at org.gradle.api.internal.tasks.execution.ExecuteAtMostOnceTaskExecuter.execute(ExecuteAtMostOnceTaskExecuter.java:43)
    at org.gradle.api.internal.tasks.execution.CatchExceptionTaskExecuter.execute(CatchExceptionTaskExecuter.java:34)
    at org.gradle.api.internal.tasks.execution.EventFiringTaskExecuter$1.run(EventFiringTaskExecuter.java:51)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:300)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:292)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:174)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.run(DefaultBuildOperationExecutor.java:90)
    at org.gradle.internal.operations.DelegatingBuildOperationExecutor.run(DelegatingBuildOperationExecutor.java:31)
    at org.gradle.api.internal.tasks.execution.EventFiringTaskExecuter.execute(EventFiringTaskExecuter.java:46)
    at org.gradle.execution.taskgraph.LocalTaskInfoExecutor.execute(LocalTaskInfoExecutor.java:42)
    at org.gradle.execution.taskgraph.DefaultTaskExecutionGraph$BuildOperationAwareWorkItemExecutor.execute(DefaultTaskExecutionGraph.java:277)
    at org.gradle.execution.taskgraph.DefaultTaskExecutionGraph$BuildOperationAwareWorkItemExecutor.execute(DefaultTaskExecutionGraph.java:262)
    at org.gradle.execution.taskgraph.DefaultTaskPlanExecutor$ExecutorWorker$1.execute(DefaultTaskPlanExecutor.java:135)
    at org.gradle.execution.taskgraph.DefaultTaskPlanExecutor$ExecutorWorker$1.execute(DefaultTaskPlanExecutor.java:130)
    at org.gradle.execution.taskgraph.DefaultTaskPlanExecutor$ExecutorWorker.execute(DefaultTaskPlanExecutor.java:200)
    at org.gradle.execution.taskgraph.DefaultTaskPlanExecutor$ExecutorWorker.executeWithWork(DefaultTaskPlanExecutor.java:191)
    at org.gradle.execution.taskgraph.DefaultTaskPlanExecutor$ExecutorWorker.run(DefaultTaskPlanExecutor.java:130)
    at org.gradle.execution.taskgraph.DefaultTaskPlanExecutor.process(DefaultTaskPlanExecutor.java:74)
    at org.gradle.execution.taskgraph.DefaultTaskExecutionGraph.execute(DefaultTaskExecutionGraph.java:143)
    at org.gradle.execution.SelectedTaskExecutionAction.execute(SelectedTaskExecutionAction.java:40)
    at org.gradle.execution.DefaultBuildExecuter.execute(DefaultBuildExecuter.java:40)
    at org.gradle.execution.DefaultBuildExecuter.access$000(DefaultBuildExecuter.java:24)
    at org.gradle.execution.DefaultBuildExecuter$1.proceed(DefaultBuildExecuter.java:46)
    at org.gradle.execution.DryRunBuildExecutionAction.execute(DryRunBuildExecutionAction.java:49)
    at org.gradle.execution.DefaultBuildExecuter.execute(DefaultBuildExecuter.java:40)
    at org.gradle.execution.DefaultBuildExecuter.execute(DefaultBuildExecuter.java:33)
    at org.gradle.initialization.DefaultGradleLauncher$ExecuteTasks.run(DefaultGradleLauncher.java:355)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:300)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:292)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:174)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.run(DefaultBuildOperationExecutor.java:90)
    at org.gradle.internal.operations.DelegatingBuildOperationExecutor.run(DelegatingBuildOperationExecutor.java:31)
    at org.gradle.initialization.DefaultGradleLauncher.runTasks(DefaultGradleLauncher.java:219)
    at org.gradle.initialization.DefaultGradleLauncher.doBuildStages(DefaultGradleLauncher.java:149)
    at org.gradle.initialization.DefaultGradleLauncher.executeTasks(DefaultGradleLauncher.java:124)
    at org.gradle.internal.invocation.GradleBuildController$1.call(GradleBuildController.java:77)
    at org.gradle.internal.invocation.GradleBuildController$1.call(GradleBuildController.java:74)
    at org.gradle.internal.work.DefaultWorkerLeaseService.withLocks(DefaultWorkerLeaseService.java:154)
    at org.gradle.internal.work.StopShieldingWorkerLeaseService.withLocks(StopShieldingWorkerLeaseService.java:38)
    at org.gradle.internal.invocation.GradleBuildController.doBuild(GradleBuildController.java:96)
    at org.gradle.internal.invocation.GradleBuildController.run(GradleBuildController.java:74)
    at org.gradle.tooling.internal.provider.ExecuteBuildActionRunner.run(ExecuteBuildActionRunner.java:28)
    at org.gradle.launcher.exec.ChainingBuildActionRunner.run(ChainingBuildActionRunner.java:35)
    at org.gradle.tooling.internal.provider.ValidatingBuildActionRunner.run(ValidatingBuildActionRunner.java:32)
    at org.gradle.launcher.exec.RunAsBuildOperationBuildActionRunner$3.run(RunAsBuildOperationBuildActionRunner.java:50)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:300)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:292)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:174)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.run(DefaultBuildOperationExecutor.java:90)
    at org.gradle.internal.operations.DelegatingBuildOperationExecutor.run(DelegatingBuildOperationExecutor.java:31)
    at org.gradle.launcher.exec.RunAsBuildOperationBuildActionRunner.run(RunAsBuildOperationBuildActionRunner.java:45)
    at org.gradle.tooling.internal.provider.SubscribableBuildActionRunner.run(SubscribableBuildActionRunner.java:51)
    at org.gradle.launcher.exec.InProcessBuildActionExecuter$1.transform(InProcessBuildActionExecuter.java:47)
    at org.gradle.launcher.exec.InProcessBuildActionExecuter$1.transform(InProcessBuildActionExecuter.java:44)
    at org.gradle.composite.internal.DefaultRootBuildState.run(DefaultRootBuildState.java:79)
    at org.gradle.launcher.exec.InProcessBuildActionExecuter.execute(InProcessBuildActionExecuter.java:44)
    at org.gradle.launcher.exec.InProcessBuildActionExecuter.execute(InProcessBuildActionExecuter.java:30)
    at org.gradle.launcher.exec.BuildTreeScopeBuildActionExecuter.execute(BuildTreeScopeBuildActionExecuter.java:39)
    at org.gradle.launcher.exec.BuildTreeScopeBuildActionExecuter.execute(BuildTreeScopeBuildActionExecuter.java:25)
    at org.gradle.tooling.internal.provider.ContinuousBuildActionExecuter.execute(ContinuousBuildActionExecuter.java:80)
    at org.gradle.tooling.internal.provider.ContinuousBuildActionExecuter.execute(ContinuousBuildActionExecuter.java:53)
    at org.gradle.tooling.internal.provider.ServicesSetupBuildActionExecuter.execute(ServicesSetupBuildActionExecuter.java:62)
    at org.gradle.tooling.internal.provider.ServicesSetupBuildActionExecuter.execute(ServicesSetupBuildActionExecuter.java:34)
    at org.gradle.tooling.internal.provider.GradleThreadBuildActionExecuter.execute(GradleThreadBuildActionExecuter.java:36)
    at org.gradle.tooling.internal.provider.GradleThreadBuildActionExecuter.execute(GradleThreadBuildActionExecuter.java:25)
    at org.gradle.tooling.internal.provider.ParallelismConfigurationBuildActionExecuter.execute(ParallelismConfigurationBuildActionExecuter.java:43)
    at org.gradle.tooling.internal.provider.ParallelismConfigurationBuildActionExecuter.execute(ParallelismConfigurationBuildActionExecuter.java:29)
    at org.gradle.tooling.internal.provider.StartParamsValidatingActionExecuter.execute(StartParamsValidatingActionExecuter.java:59)
    at org.gradle.tooling.internal.provider.StartParamsValidatingActionExecuter.execute(StartParamsValidatingActionExecuter.java:31)
    at org.gradle.tooling.internal.provider.SessionFailureReportingActionExecuter.execute(SessionFailureReportingActionExecuter.java:59)
    at org.gradle.tooling.internal.provider.SessionFailureReportingActionExecuter.execute(SessionFailureReportingActionExecuter.java:44)
    at org.gradle.tooling.internal.provider.SetupLoggingActionExecuter.execute(SetupLoggingActionExecuter.java:46)
    at org.gradle.tooling.internal.provider.SetupLoggingActionExecuter.execute(SetupLoggingActionExecuter.java:30)
    at org.gradle.launcher.daemon.server.exec.ExecuteBuild.doBuild(ExecuteBuild.java:67)
    at org.gradle.launcher.daemon.server.exec.BuildCommandOnly.execute(BuildCommandOnly.java:36)
    at org.gradle.launcher.daemon.server.api.DaemonCommandExecution.proceed(DaemonCommandExecution.java:122)
    at org.gradle.launcher.daemon.server.exec.WatchForDisconnection.execute(WatchForDisconnection.java:37)
    at org.gradle.launcher.daemon.server.api.DaemonCommandExecution.proceed(DaemonCommandExecution.java:122)
    at org.gradle.launcher.daemon.server.exec.ResetDeprecationLogger.execute(ResetDeprecationLogger.java:26)
    at org.gradle.launcher.daemon.server.api.DaemonCommandExecution.proceed(DaemonCommandExecution.java:122)
    at org.gradle.launcher.daemon.server.exec.RequestStopIfSingleUsedDaemon.execute(RequestStopIfSingleUsedDaemon.java:34)
    at org.gradle.launcher.daemon.server.api.DaemonCommandExecution.proceed(DaemonCommandExecution.java:122)
    at org.gradle.launcher.daemon.server.exec.ForwardClientInput$2.call(ForwardClientInput.java:74)
    at org.gradle.launcher.daemon.server.exec.ForwardClientInput$2.call(ForwardClientInput.java:72)
    at org.gradle.util.Swapper.swap(Swapper.java:38)
    at org.gradle.launcher.daemon.server.exec.ForwardClientInput.execute(ForwardClientInput.java:72)
    at org.gradle.launcher.daemon.server.api.DaemonCommandExecution.proceed(DaemonCommandExecution.java:122)
    at org.gradle.launcher.daemon.server.exec.LogAndCheckHealth.execute(LogAndCheckHealth.java:55)
    at org.gradle.launcher.daemon.server.api.DaemonCommandExecution.proceed(DaemonCommandExecution.java:122)
    at org.gradle.launcher.daemon.server.exec.LogToClient.doBuild(LogToClient.java:62)
    at org.gradle.launcher.daemon.server.exec.BuildCommandOnly.execute(BuildCommandOnly.java:36)
    at org.gradle.launcher.daemon.server.api.DaemonCommandExecution.proceed(DaemonCommandExecution.java:122)
    at org.gradle.launcher.daemon.server.exec.EstablishBuildEnvironment.doBuild(EstablishBuildEnvironment.java:81)
    at org.gradle.launcher.daemon.server.exec.BuildCommandOnly.execute(BuildCommandOnly.java:36)
    at org.gradle.launcher.daemon.server.api.DaemonCommandExecution.proceed(DaemonCommandExecution.java:122)
    at org.gradle.launcher.daemon.server.exec.StartBuildOrRespondWithBusy$1.run(StartBuildOrRespondWithBusy.java:50)
    at org.gradle.launcher.daemon.server.DaemonStateCoordinator$1.run(DaemonStateCoordinator.java:295)
    at org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
    at org.gradle.internal.concurrent.ManagedExecutorImpl$1.run(ManagedExecutorImpl.java:46)
    at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:55)
    Caused by: org.gradle.api.InvalidUserDataException: Parsing signatures failed: Method not found while parsing signature: java.lang.Thread#stop(java.lang.Throwable)
    at de.thetaphi.forbiddenapis.gradle.CheckForbiddenApis.checkForbidden(CheckForbiddenApis.java:596)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at org.gradle.internal.reflect.JavaMethod.invoke(JavaMethod.java:73)
    at org.gradle.api.internal.project.taskfactory.StandardTaskAction.doExecute(StandardTaskAction.java:46)
    at org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:39)
    at org.gradle.api.internal.project.taskfactory.StandardTaskAction.execute(StandardTaskAction.java:26)
    at org.gradle.api.internal.AbstractTask$TaskActionWrapper.execute(AbstractTask.java:801)
    at org.gradle.api.internal.AbstractTask$TaskActionWrapper.execute(AbstractTask.java:768)
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter$1.run(ExecuteActionsTaskExecuter.java:131)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:300)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor$RunnableBuildOperationWorker.execute(DefaultBuildOperationExecutor.java:292)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.execute(DefaultBuildOperationExecutor.java:174)
    at org.gradle.internal.operations.DefaultBuildOperationExecutor.run(DefaultBuildOperationExecutor.java:90)
    at org.gradle.internal.operations.DelegatingBuildOperationExecutor.run(DelegatingBuildOperationExecutor.java:31)
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeAction(ExecuteActionsTaskExecuter.java:120)
    at org.gradle.api.internal.tasks.execution.ExecuteActionsTaskExecuter.executeActions(ExecuteActionsTaskExecuter.java:99)
    ... 111 more
    Caused by: de.thetaphi.forbiddenapis.ParseException: Method not found while parsing signature: java.lang.Thread#stop(java.lang.Throwable)
    at de.thetaphi.forbiddenapis.Checker$UnresolvableReporting$1.parseFailed(Checker.java:100)
    at de.thetaphi.forbiddenapis.Checker.addSignature(Checker.java:419)
    at de.thetaphi.forbiddenapis.Checker.parseSignaturesFile(Checker.java:545)
    at de.thetaphi.forbiddenapis.Checker.parseSignaturesFile(Checker.java:516)
    at de.thetaphi.forbiddenapis.Checker.parseSignaturesFile(Checker.java:496)
    at de.thetaphi.forbiddenapis.Checker.parseSignaturesFile(Checker.java:501)
    at de.thetaphi.forbiddenapis.gradle.CheckForbiddenApis.checkForbidden(CheckForbiddenApis.java:583)
    ... 128 more

Plugin release for ElasticSearch 7.8

Hi @damienalexandre,

We're having the same issue you describe here, when using the dictionary file from this repository.

We've just upgraded from ElasticSearch 5.3 to 7.8 however, so we can't use your plugin to solve this issue (yet).
Is a 7.8 release on your roadmap by any chance?

Thanks in advance!

Investigate the plugin usefulness in the light of how ICUTokenizer works

Looks like we could just provide a new Rule File instead of tricking the ICU Tokenizer.

As this code show:

https://github.com/apache/lucene-solr/blob/23bff7dbc207083af2ccb1b308c121ac18c36508/lucene/analysis/icu/src/java/org/apache/lucene/analysis/icu/segmentation/ICUTokenizerFactory.java#L116-L125

The default config is used when there is no file for the current "script" (which was a fear I had about this possibility to change de Rbbi).

What the plugin could do then:

  • load the Rbbi, edit it, and load it for some "script" via the real option;
  • create the token filters for the synonyms directly

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.