linkedinattic / indextank-engine
Indexing engine for IndexTank
Home Page: http://indextank.com/
License: Apache License 2.0
First of all, thanks for sharing this amazing project.
I am having issues, though, with configuring Cassandra as the index storage and for recovery.
It seems to me that the interface for Cassandra storage has not been implemented.
Am I wrong?
Thanks in advance.
It appears that instantlinks is only part of indextank-service, and there is no endpoint for it on the engine the way there is for autocomplete.
Are there plans to make it part of the engine?
Also, there have been no updates here for a while. Is this project dead?
If I start the embedded API and index some documents, I'm able to query them. If I then stop and restart the embedded API and query for any document that was previously in the index, the embedded API throws the IndextankException below. Searching for a term that wasn't previously in the index returns a correct JSON result of zero matches.
I am using the default sample-engine-config and running on OS X.
Is there something I'm doing wrong here? Do I have to do something to trigger a reload of the previously indexed documents?
/var/www/indextank/indextank-engine$ java -cp target/indextank-engine-1.0.0-jar-with-dependencies.jar com.flaptor.indextank.api.Launcher
WARN [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [log4j.properties not found on classpath!] 2012-01-15 12:56:30,351
INFO [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'environment-prefix' set to TEST] 2012-01-15 12:56:30,359
INFO [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'facets' set to true] 2012-01-15 12:56:30,359
INFO [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'index-code' set to dbajo] 2012-01-15 12:56:30,359
INFO [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'conf-file' set to sample-engine-config] 2012-01-15 12:56:30,365
INFO [main] com.flaptor.indextank.suggest.NewPopularityIndex - [Loading popularity index terms from disk.] 2012-01-15 12:56:30,724
INFO [main] com.flaptor.indextank.suggest.NewPopularityIndex - [Terms loaded] 2012-01-15 12:56:30,725
INFO [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Index recovery configuration set to recover index from simpleDB] 2012-01-15 12:56:30,725
INFO [main] com.flaptor.indextank.index.storage.InMemoryStorage - [Starting a new(empty) InMemoryStorage.] 2012-01-15 12:56:30,726
INFO [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Using in-memory storage] 2012-01-15 12:56:30,727
INFO [main] org.eclipse.jetty.util.log - [jetty-7.x.y-SNAPSHOT] 2012-01-15 12:56:30,790
INFO [main] org.eclipse.jetty.util.log - [started o.e.j.s.ServletContextHandler{/,null}] 2012-01-15 12:56:30,821
INFO [main] org.eclipse.jetty.util.log - [Started [email protected]:20220 STARTING] 2012-01-15 12:56:30,849
IndextankException(message:null)
at com.flaptor.indextank.api.IndexEngineApi.search(IndexEngineApi.java:94)
at com.flaptor.indextank.api.resources.Search.run(Search.java:79)
at com.ghosthack.turismo.servlet.Servlet.service(Servlet.java:55)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:538)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:478)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:937)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:871)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
at org.eclipse.jetty.server.Server.handle(Server.java:346)
at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:589)
at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:601)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:214)
at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:411)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:535)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:40)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:529)
at java.lang.Thread.run(Thread.java:637)
Hi all,
I am a web application developer and I am trying out IndexTank as part of my application. However, I encountered an error even when starting with the example.
I followed the given commands:
$ curl -d '{"docid":"post1", "fields":{"text":"I love Fallout"}}' -v -X PUT http://localhost:20220/v1/indexes/idx/docs
$ curl -d '{"docid":"post2", "fields":{"text":"I love Planescape"}}' -v -X PUT http://localhost:20220/v1/indexes/idx/docs
$ curl http://localhost:20220/v1/indexes/idx/search?q=love
Then the search returned the following error,
java.lang.Error: Unresolved compilation:
at com.flaptor.util.CollectionsUtil.mergeIterables(CollectionsUtil.java:186)
Could it be a problem with Java type erasure, as Eclipse indicated when I imported the source code?
Please advise on what I should do to rectify the problem. Thank you very much in advance.
Hi all,
I don't know whether you are aware of this, but after I inserted documents into the index 'idx', I get the same results even if I specify a different index.
e.g.
$ curl http://localhost:20220/v1/indexes/nonidx/search?q=love
will give the same result as in
$ curl http://localhost:20220/v1/indexes/idx/search?q=love
Is this intended?
Thank you.
To avoid situations where an index gets overloaded with searches (sometimes a slow search can cause this), I implemented an option called "max_search_queue", which sets a limit on the number of threads waiting to acquire the search semaphore in TrafficLimitingSearcher. If this limit is reached, it rejects the search and throws InterruptedException instead of melting down. I've been running this in production and it seems to work well.
If this sounds good, I'll clean this up into a nice pull request, but here's the main code change:
The option name I chose in the indexengine_config is "max_search_queue".
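The code change itself is not reproduced in this thread. As a rough illustration only, the idea of capping the number of threads waiting on a search semaphore could be sketched as below. BoundedSearchGate and its parameters are hypothetical names, not the engine's TrafficLimitingSearcher, and the sketch throws the unchecked RejectedExecutionException rather than the InterruptedException mentioned above, purely to keep the example short.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch of a search gate with a bounded wait queue. */
final class BoundedSearchGate {
    private final Semaphore permits;                            // caps searches running at once
    private final int maxSearchQueue;                           // caps callers waiting for a permit
    private final AtomicInteger waiting = new AtomicInteger();  // callers currently waiting

    BoundedSearchGate(int maxConcurrentSearches, int maxSearchQueue) {
        this.permits = new Semaphore(maxConcurrentSearches);
        this.maxSearchQueue = maxSearchQueue;
    }

    <T> T run(Supplier<T> search) {
        // Fast path: if a permit is free, the caller never joins the queue.
        if (!permits.tryAcquire()) {
            // Slow path: join the queue only if it is not already full.
            if (waiting.incrementAndGet() > maxSearchQueue) {
                waiting.decrementAndGet();
                throw new RejectedExecutionException("search queue full");
            }
            try {
                permits.acquireUninterruptibly();
            } finally {
                waiting.decrementAndGet();
            }
        }
        try {
            return search.get();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) {
        // One running search allowed, no waiting callers allowed.
        BoundedSearchGate gate = new BoundedSearchGate(1, 0);
        System.out.println(gate.run(() -> "search finished"));
        try {
            // The nested call arrives while the only permit is held, so it is rejected.
            gate.run(() -> gate.run(() -> "never runs"));
        } catch (RejectedExecutionException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting early keeps a single slow search from piling up an unbounded number of blocked threads behind it; callers get an immediate error they can retry instead.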
Amazing work. Could you point me to any IndexTank performance documentation?
E.g., how many documents can a single IndexTank instance hold?
I am running the IndexTank service locally on my home computer, but when I try to populate the demoindex entry I get this:
{'Authorization': 'Basic OjlVUjBGeHRjbGZTUFAw', 'User-Agent': 'IndexTank-Python/1.0.7'}
/v1/indexes/idx/autocomplete?callback=mycallback&query=test
returns a JSON array, ['testing'], rather than JSONP wrapped in my callback function.
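For reference, the expected JSONP behaviour is simply the JSON payload wrapped in a call to the requested callback. A minimal sketch of that wrapping follows; the Jsonp class and its wrap method are hypothetical helpers for illustration, not code from the engine, and the callback-name check is an assumption about what a safe implementation would want.

```java
import java.util.regex.Pattern;

/** Hypothetical helper showing what a JSONP response body would look like. */
final class Jsonp {
    // Accept only dotted JavaScript-style identifiers, so a callback
    // parameter cannot inject arbitrary script into the response.
    private static final Pattern CALLBACK =
            Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*(\\.[A-Za-z_$][A-Za-z0-9_$]*)*");

    static String wrap(String callback, String json) {
        if (callback == null || callback.isEmpty()) {
            return json; // no callback requested: plain JSON
        }
        if (!CALLBACK.matcher(callback).matches()) {
            throw new IllegalArgumentException("invalid callback name");
        }
        return callback + "(" + json + ");";
    }

    public static void main(String[] args) {
        System.out.println(wrap("mycallback", "[\"testing\"]"));
    }
}
```

With callback=mycallback and the result above, the body would be mycallback(["testing"]); instead of the bare array.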
First step: I added a document {text: 'something', author: 'rua'}.
Second step: I added a document with a new field: {text: 'something', author: 'rua', type: '1'}.
Now, when I query http://localhost:20220/v1/indexes/candidates/search?q=author:rua&fetch=*, I get 503 Service Unavailable.
I think I could update the rest of the documents with the new field, but that is not the best way.
Hi there,
I'm a researcher studying software evolution. As part of my current research, I'm studying the implications of open-sourcing proprietary software, for instance, whether the project succeeds in attracting newcomers. However, I observed that some projects, like indextank-engine, deleted their software history during the transition to open source.
Knowing that software history is indispensable for developers (e.g., developers need to refer to history several times a day), I would like to ask indextank-engine developers the following four brief questions:
Thanks in advance for your collaboration,
Gustavo Pinto, PhD
http://www.gustavopinto.org
When I try to add a function with index.addFunction(1, "relevance");
I get indextank.apiclient.IndexDoesNotExistException
and Error 404 Not Found HTTP ERROR: 404
Problem accessing /v1/indexes/idx/functions/1.
Reason:Not Found
Hi all,
After I quick-started the indextank-engine and indexed some documents, what will happen to the documents if I shut down the engine? Will the documents, along with the index, be persisted in some files, or do I have to reindex each time I restart the engine?
Sorry for this series of questions but I can't find any user-guide on deploying my personal indextank. Thank you.
I thought I'd open this thread for anyone interested in indextank as a hosted service to indicate their interest.
As a current IndexTank customer, I'd definitely be interested in switching over to a new service hosting the IndexTank software.
Hi all,
Instead of a single jar containing all dependencies, how about using an Open Services Gateway Initiative (OSGi) framework such as Apache Felix or Equinox to manage the dependencies?
Regards
I'd love to see support for multiple category values for a document. This would be great for use cases like tags, multiple colors/sizes on products, etc. I found it surprising that this isn't supported (everything else is so great). It ends up being a show-stopper for me.
Creating this issue to start community discussion for adding support for tags. Currently categories don't allow more than one value per category per document. For example, if a doc has a category "material", with values like metal, plastic, wood, each document can only have one value for this category.
Santip mentioned on this HN thread some limitations of the current categories implementation, and the possibility of adding tags:
IndexTank categories are not designed for the tags use case, and will not work properly. It's intended for a relatively small amount of categories for which each document has a single value. The amount of different values of a category can be large but the amount of categories cannot. If you want to implement something like tags, then each tag should be a category because you'll want more than a single tag per document. We were in the process of designing a new feature to support this kind of use case, and maybe we'll start a branch to implement it and hopefully the community will collaborate.
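To make the "each tag is a category" workaround from the quote concrete, here is a minimal sketch that turns a document's tags into a one-value-per-category map. The TagCategories class, the "tag_" prefix, and the "true" value are all assumptions chosen for illustration; the caveat in the quote still applies, since the number of distinct categories is not meant to grow large.

```java
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: one category per tag, each with a single value. */
final class TagCategories {
    static Map<String, String> fromTags(Collection<String> tags) {
        // TreeMap keeps the category names in a stable, sorted order.
        Map<String, String> categories = new TreeMap<>();
        for (String tag : tags) {
            // Normalize so "Metal" and " metal " map to the same category.
            String name = "tag_" + tag.trim().toLowerCase(Locale.ROOT);
            categories.put(name, "true");
        }
        return categories;
    }

    public static void main(String[] args) {
        // A document tagged with two materials gets two categories.
        System.out.println(fromTags(List.of("metal", "wood")));
    }
}
```

The resulting map would then be sent as the document's categories, so a document tagged "metal" and "wood" carries tag_metal and tag_wood, each with a single value.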
I know that I can add a field "all=1" to every document and then fetch everything with indexes/idx/search?q=all:1
But I would prefer the "q" parameter to be entirely optional.
There doesn't appear to be a testsuite of any kind associated with this code. Is this, perhaps, something that got omitted from the open source release? If so, is there any chance of getting it released too?
If there never was a testsuite, it would be good to start one to allow efficient further development.