Open Source REST API for named entity extraction, named entity linking, named entity disambiguation, recommendation & reconciliation of entities like persons, organizations and places for (semi)automatic semantic tagging & analysis of documents by a linked data knowledge graph such as a SKOS thesaurus, RDF ontology, database(s) or list(s) of names
Optional parameter or autodetection of the (con)text language, to limit entity extraction by language when using a very general multilingual thesaurus with many false friends.
I want to use Docker to run the search engine on Windows. I ran docker pull opensemanticsearch/solr but do not know how to do the "Call shell script build-deb" step to build the dependencies.
When I run docker run -p 8983:8983 opensemanticsearch/solr I can access the service through "http://localhost:8983/", but I expected a search interface and instead see the Solr Admin console.
How do I run semantic search using Docker instead of VirtualBox?
I have tried to understand how to use Solr's API calls but have not managed to, because I cannot find the collection name that Open Semantic Search uses in Solr.
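One way to find the name is to ask Solr itself: the CoreAdmin STATUS action reports every core on a standalone node. The following is a minimal sketch, assuming Solr is reachable at http://localhost:8983/solr as in the docker run command above; the helper names are made up for illustration, and in SolrCloud mode the Collections API (action=LIST) would be the place to look instead.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def core_status_url(solr_base="http://localhost:8983/solr"):
    """Build the CoreAdmin STATUS URL that reports every core on the node."""
    query = urlencode({"action": "STATUS", "wt": "json"})
    return solr_base.rstrip("/") + "/admin/cores?" + query


def core_names(status_response):
    """Extract the core names from a parsed CoreAdmin STATUS response."""
    return sorted(status_response.get("status", {}).keys())


def list_cores(solr_base="http://localhost:8983/solr"):
    """Fetch the status from a running Solr and return its core names."""
    with urlopen(core_status_url(solr_base)) as resp:
        return core_names(json.load(resp))
```

Calling list_cores() against the running container returns the names that can be used in URLs of the form http://localhost:8983/solr/<core>/select.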
With the new tagger handler (https://lucene.apache.org/solr/guide/7_4/the-tagger-handler.html) we can integrate entity extraction with stemming at ETL time. We no longer need to index the document temporarily or to manage an additional dictionary file that requires core reloads, since everything is in the Solr index and can be updated through the Solr API in near real time.
Move the plain text entity list importer out of Open Semantic Search Apps into a generic Open Semantic Entity Search API module and command line tool.
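The core of such an importer could look roughly like the sketch below. It is not the existing implementation: the function name, the id scheme and the field names preferred_label_s and type_ss are hypothetical and only illustrate turning one name per line into Solr-style documents.

```python
def entities_from_lines(lines, dictionary):
    """Turn an iterable of plain text lines (one name per line) into
    Solr-style documents, skipping blank lines and exact duplicates.

    The id scheme and the field names are hypothetical.
    """
    docs = []
    seen = set()
    for line in lines:
        label = line.strip()
        if not label or label in seen:
            continue
        seen.add(label)
        docs.append({
            "id": f"{dictionary}/{label}",   # hypothetical id scheme
            "preferred_label_s": label,      # hypothetical field name
            "type_ss": [dictionary],         # hypothetical field name
        })
    return docs
```

Because the function accepts any iterable of strings, a command line tool can pass an open file object directly, for example entities_from_lines(open("persons.txt", encoding="utf-8"), "persons"), where persons.txt is a placeholder file name.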
For scoring by named entity recognition class, use spaCy, which is integrated with Open Semantic ETL.
For scoring by "more like this" and for different scoring by fields like name or description, use the Solr/Elasticsearch index / API.
For other scoring methods requested in issues, evaluate existing Python libraries.
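As an illustration of scoring differently by field, Solr's edismax query parser accepts per-field boosts in its qf parameter. This is a sketch under assumptions: the core name entities and the boost values are invented, and name and description are simply the example fields mentioned above.

```python
from urllib.parse import urlencode


def boosted_query_url(query, core, boosts, solr_base="http://localhost:8983/solr"):
    """Build a /select URL that weights matches per field via edismax qf.

    boosts maps field name to weight, e.g. {"name": 5, "description": 1}.
    """
    qf = " ".join(f"{field}^{weight}" for field, weight in boosts.items())
    params = {
        "q": query,
        "defType": "edismax",  # query parser that understands qf boosts
        "qf": qf,
        "fl": "id,score",      # return the score so boosts can be compared
        "wt": "json",
    }
    return f"{solr_base.rstrip('/')}/{core}/select?{urlencode(params)}"
```

With {"name": 5, "description": 1} a match in the name field contributes five times the weight of a match in the description field; suitable values would have to be tuned against real data.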
For API standards and parameters, take inspiration from similar Open Source software:
Check whether the new TaggerRequestHandler (AKA SolrTextTagger) for tagging text in Solr 7.4 (https://lucene.apache.org/solr/guide/7_4/the-tagger-handler.html) can be used for dictionary extraction without having to temporarily add the text to the Solr core, as is done now, and with the Solr index supplying the labels to extract instead of plain text lists used as a filter.
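A quick way to try this is to POST the raw text to the tagger endpoint. The sketch below assumes that a /tag request handler of class solr.TaggerRequestHandler is configured with a default field parameter, as in the reference guide tutorial; the core name entities and the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_tag_request(text, core, solr_base="http://localhost:8983/solr", field_list="id"):
    """Build a POST request that sends plain text to the /tag handler."""
    params = {
        "overlaps": "NO_SUB",  # drop tags that lie fully inside another tag
        "tagsLimit": 5000,
        "fl": field_list,      # fields to return for the matched entities
        "wt": "json",
    }
    url = f"{solr_base.rstrip('/')}/{core}/tag?{urlencode(params)}"
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def parse_tags(response):
    """Normalize tags to dicts; with wt=json each tag may arrive as a
    flat [key, value, key, value, ...] list."""
    tags = []
    for tag in response.get("tags", []):
        if isinstance(tag, list):
            tag = dict(zip(tag[0::2], tag[1::2]))
        tags.append(tag)
    return tags


def tag_text(text, core, **kwargs):
    """Send the text to a running Solr and return the normalized tags."""
    with urlopen(build_tag_request(text, core, **kwargs)) as resp:
        return parse_tags(json.load(resp))
```

Each returned tag carries startOffset, endOffset and the ids of the matching dictionary entries, so the text itself never has to be added to the index.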
Import named entities from a database so we can normalize using additional database fields or links, instead of only exporting lists of names to plain text lists or converting tables to SKOS manually, as is done now.
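A database import along these lines might be sketched as follows, using sqlite3 from the standard library as a stand-in for any SQL database. The function name, the id scheme and the _s field suffixes are hypothetical; the point is only that extra columns travel with the entity instead of being lost in a plain list of names.

```python
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier; identifiers cannot be bound as parameters."""
    return '"' + name.replace('"', '""') + '"'


def entities_from_table(conn, table, id_col, label_col, extra_cols=()):
    """Read entities from a table and keep extra columns as fields
    that can later be used for normalization or linking."""
    cols = [id_col, label_col, *extra_cols]
    sql = "SELECT {} FROM {}".format(
        ", ".join(quote_ident(c) for c in cols), quote_ident(table)
    )
    docs = []
    for row in conn.execute(sql):
        doc = {
            "id": f"{table}/{row[0]}",    # hypothetical id scheme
            "preferred_label_s": row[1],  # hypothetical field name
        }
        for col, value in zip(extra_cols, row[2:]):
            if value is not None:
                doc[f"{col}_s"] = value   # hypothetical dynamic field suffix
        docs.append(doc)
    return docs
```

The resulting list of dicts can be serialized with json.dumps and posted to a Solr update handler.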
Automatically set up dictionaries that do not exist yet when entities are added to them, so we no longer have to set up new dictionaries through Open Semantic Search Apps and API users do not have to add dictionaries manually in the dictionary manager before use.
Build a machine learning model for scores/probabilities, trained by classification on documents that human editors have tagged with these entities in the Open Semantic Tagger UI or through other annotations (e.g. in Drupal).
Move the aggregation of all the different labels for the FST index field from the Python lib to the Solr schema, so entities can be added directly through Solr APIs such as the HTTP REST API, the CSV importer, or from SQL databases, without preprocessing by the Entity Manager Python library.
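In Solr this kind of aggregation is usually expressed as copyField rules, which can be added through the Schema API when the core uses a managed schema. The sketch below only builds the request; the core name entities and the field names preferred_label_s and all_labels_ss are hypothetical, not the project's actual schema.

```python
import json
from urllib.request import Request


def copy_field_request(core, source, dest, solr_base="http://localhost:8983/solr"):
    """Build a Schema API request that copies one label field into the
    aggregated field used for tagging."""
    payload = {"add-copy-field": {"source": source, "dest": [dest]}}
    return Request(
        f"{solr_base.rstrip('/')}/{core}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending one such request per label field with urllib.request.urlopen would let Solr fill the aggregated field at index time, whichever API is used to add the entity.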