linkedinattic / indextank-engine

Indexing engine for IndexTank

Home Page: http://indextank.com/

License: Apache License 2.0

Java 99.37% Python 0.63%


indextank-engine's Issues

How to use Cassandra as index storage and index recovery.

First of all, thanks for sharing this amazing project.
I am having trouble configuring Cassandra as the index storage and recovery backend.
It seems to me that the interface for Cassandra storage has not been implemented.
Am I wrong?
Thanks in advance.

Make instantlinks part of the engine like autocomplete

It appears that instantlinks is only part of indextank-service, and that the engine has no endpoint for it the way it does for autocomplete.

Are there plans to make it part of the engine?

IndexTank.com is down

And there have been no updates here for a while. Is this project dead?

Embedded API raises exception during search after restart

If I start the embedded API and index some documents, I'm able to query them. If I then stop and restart the embedded API and query for any document that was previously in the index, it throws the IndextankException below. Searching for a term that wasn't previously in the index correctly returns a JSON result with zero matches.

I am using the default sample-engine-config and running on OS X.

Is there something I'm doing wrong here? Do I have to do something to trigger a reload of the previously indexed documents?

/var/www/indextank/indextank-engine$ java -cp target/indextank-engine-1.0.0-jar-with-dependencies.jar com.flaptor.indextank.api.Launcher 
WARN  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [log4j.properties not found on classpath!] 2012-01-15 12:56:30,351
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'environment-prefix' set to TEST] 2012-01-15 12:56:30,359
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'facets' set to true] 2012-01-15 12:56:30,359
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'index-code' set to dbajo] 2012-01-15 12:56:30,359
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'conf-file' set to sample-engine-config] 2012-01-15 12:56:30,365
INFO  [main] com.flaptor.indextank.suggest.NewPopularityIndex - [Loading popularity index terms from disk.] 2012-01-15 12:56:30,724
INFO  [main] com.flaptor.indextank.suggest.NewPopularityIndex - [Terms loaded] 2012-01-15 12:56:30,725
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Index recovery configuration set to recover index from simpleDB] 2012-01-15 12:56:30,725
INFO  [main] com.flaptor.indextank.index.storage.InMemoryStorage - [Starting a new(empty) InMemoryStorage.] 2012-01-15 12:56:30,726
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Using in-memory storage] 2012-01-15 12:56:30,727
INFO  [main] org.eclipse.jetty.util.log - [jetty-7.x.y-SNAPSHOT] 2012-01-15 12:56:30,790
INFO  [main] org.eclipse.jetty.util.log - [started o.e.j.s.ServletContextHandler{/,null}] 2012-01-15 12:56:30,821
INFO  [main] org.eclipse.jetty.util.log - [Started [email protected]:20220 STARTING] 2012-01-15 12:56:30,849
IndextankException(message:null)
    at com.flaptor.indextank.api.IndexEngineApi.search(IndexEngineApi.java:94)
    at com.flaptor.indextank.api.resources.Search.run(Search.java:79)
    at com.ghosthack.turismo.servlet.Servlet.service(Servlet.java:55)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:538)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:478)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:937)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:871)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
    at org.eclipse.jetty.server.Server.handle(Server.java:346)
    at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:589)
    at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:601)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:214)
    at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:411)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:535)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:40)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:529)
    at java.lang.Thread.run(Thread.java:637)

Problem accessing /v1/indexes/idx/search.

Hi all,

I am a web application developer and I am trying out IndexTank as part of my application. However, I encountered an error even when starting with the example.

I followed the given commands:

$ curl -d '{"docid":"post1", "fields":{"text":"I love Fallout"}}' -v -X PUT http://localhost:20220/v1/indexes/idx/docs

$ curl -d '{"docid":"post2", "fields":{"text":"I love Planescape"}}' -v -X PUT http://localhost:20220/v1/indexes/idx/docs

$ curl 'http://localhost:20220/v1/indexes/idx/search?q=love'

Then the search returned the following error,

java.lang.Error: Unresolved compilation:
at com.flaptor.util.CollectionsUtil.mergeIterables(CollectionsUtil.java:186)

Could it be a problem with Java type erasure, as Eclipse indicated when I imported the source code?

Please advise on what I should do to rectify the problem. Thank you very much in advance.

index bug?

Hi all,

I don't know whether you are aware of this, but after I insert documents into the index 'idx', a query returns the same results even if I specify a different index.

e.g.

$ curl http://localhost:20220/v1/indexes/nonidx/search?q=love

will give the same result as in

$ curl http://localhost:20220/v1/indexes/idx/search?q=love

Is this intended?

Thank you.

Implement maximum search queue length

To avoid situations where an index gets overloaded with searches (sometimes a slow search can cause this), I implemented an option called "max_search_queue", which sets a limit on the number of threads waiting to acquire the search semaphore in TrafficLimitingSearcher. If this limit is reached, it just rejects the search and throws InterruptedException instead of melting down. I've been running this in production and it seems to work well.

If this sounds good, I'll clean this up into a nice pull request, but here's the main code change:

clamprecht@1e766e2

The option name I chose in the indexengine_config is "max_search_queue".
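
For readers who have not opened the commit, the idea can be sketched roughly as follows. This is not the code from clamprecht@1e766e2 and not the actual TrafficLimitingSearcher: the class BoundedSearchGate and its parameters maxConcurrent and maxQueue are hypothetical names used only for illustration. The sketch assumes the search semaphore is a java.util.concurrent.Semaphore and relies on getQueueLength(), which is documented as an estimate, so the limit is approximate under contention.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical illustration of a "max_search_queue" style limit.
final class BoundedSearchGate {
    private final Semaphore permits; // limits concurrently running searches
    private final int maxQueue;      // limits threads waiting for a permit

    BoundedSearchGate(int maxConcurrent, int maxQueue) {
        this.permits = new Semaphore(maxConcurrent);
        this.maxQueue = maxQueue;
    }

    <T> T run(Callable<T> search) throws Exception {
        // Fast path: a permit is free, so no queueing is needed.
        if (!permits.tryAcquire()) {
            // Reject instead of waiting when too many threads already queue.
            // getQueueLength() is an estimate, so this check is best-effort.
            if (permits.getQueueLength() >= maxQueue) {
                throw new InterruptedException("search queue full");
            }
            permits.acquire();
        }
        try {
            return search.call();
        } finally {
            permits.release();
        }
    }

    public static void main(String[] args) throws Exception {
        BoundedSearchGate gate = new BoundedSearchGate(2, 10);
        System.out.println(gate.run(() -> "search result"));
    }
}
```

Rejecting early keeps a slow search from piling up an unbounded number of waiting threads; callers would need to treat the exception as "try again later".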

amazing work

Amazing work! Could you point me to the IndexTank performance documentation?
For example, how many documents can a single IndexTank instance hold?

the storefront fails to communicate with the api

I am running the indextank service locally on my home computer, but when I try to populate the demoindex entry I get this:

{'Authorization': 'Basic OjlVUjBGeHRjbGZTUFAw', 'User-Agent': 'IndexTank-Python/1.0.7'}

Is there any way to use fuzzy search (stemming)?

autocomplete does not support jsonp

/v1/indexes/idx/autocomplete?callback=mycallback&query=test

returns a plain JSON array, ['testing'], rather than JSONP wrapped in my callback function.
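
To make the expected behaviour concrete, here is a minimal sketch of how a response body could be wrapped when a callback parameter is present. It is not taken from the engine's source; the class name Jsonp and the method wrap are hypothetical, and the callback-name check is an assumption added so that arbitrary script cannot be injected through the parameter.

```java
// Hypothetical helper showing what a JSONP-aware endpoint could return.
final class Jsonp {
    // Wraps a JSON payload in the requested callback, or returns it
    // unchanged when no callback was supplied.
    static String wrap(String callback, String json) {
        if (callback == null || callback.isEmpty()) {
            return json;
        }
        // Accept only simple identifiers (optionally dotted).
        if (!callback.matches("[A-Za-z_$][A-Za-z0-9_$.]*")) {
            throw new IllegalArgumentException("invalid callback name");
        }
        return callback + "(" + json + ")";
    }

    public static void main(String[] args) {
        // prints mycallback(["testing"])
        System.out.println(wrap("mycallback", "[\"testing\"]"));
    }
}
```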

Support non-fixed schema for indexed document

First step: I added a document {text: 'something', author: 'rua'}.
Second step: I added a document with a new field: {text: 'something', author: 'rua', type: '1'}.
Now, when I query http://localhost:20220/v1/indexes/candidates/search?q=author:rua&fetch=*, I get 503 Service Unavailable.
I think I could update the rest of the documents with the new field, but that is not the best way.

Why the software history was not kept?

Hi there,

I'm a researcher studying software evolution. As part of my current research, I'm studying the implications of open-sourcing proprietary software, for instance, whether the project succeeds in attracting newcomers. However, I observed that some projects, like indextank-engine, deleted the software history during the transition to open source.

bbf3a53

Knowing that software history is indispensable for developers (e.g., developers need to refer to history several times a day), I would like to ask indextank-engine developers the following four brief questions:

Why did you decide not to keep the software history?
Did the core developers face any problems when trying to refer to the old history? If so, how did they solve them?
Did newcomers face any problems when trying to refer to the old history? If so, how did they solve them?
How has the lack of history affected the software's evolution? Has it placed any burden on understanding and evolving the software?

Thanks in advance for your collaboration,

Gustavo Pinto, PhD
http://www.gustavopinto.org

Not able to add function

When I try to add a function with

    index.addFunction(1, "relevance");

I get an indextank.apiclient.IndexDoesNotExistException and this error:

    Error 404 Not Found
    HTTP ERROR: 404
    Problem accessing /v1/indexes/idx/functions/1.
    Reason: Not Found

Where is the index persisted?

Hi all,

After I quick-start the indextank-engine and index some documents, what happens to the documents if I shut down the engine? Will the documents and the index be persisted to files, or do I have to reindex each time I restart the engine?

Sorry for this series of questions, but I can't find any user guide on deploying my own IndexTank. Thank you.

I am interested in a hosted IndexTank

I thought I'd open this thread for anyone interested in indextank as a hosted service to indicate their interest.

As a current IndexTank customer, I'd definitely be interested in switching over to a new service hosting the IndexTank software.

OSGi

Hi all,
Instead of a single jar containing all dependencies, how about using an OSGi (Open Services Gateway initiative) framework such as Apache Felix or Equinox to manage the dependencies?
Regards

Support for Multiple-Valued Categories

I'd love to see support for multiple category values for a document. This would be great for use cases like tags, multiple colors/sizes on products, etc. I found it surprising that this isn't supported (everything else is so great). It ends up being a show-stopper for me.

Support for tags

Creating this issue to start community discussion for adding support for tags. Currently categories don't allow more than one value per category per document. For example, if a doc has a category "material", with values like metal, plastic, wood, each document can only have one value for this category.

Santip mentioned on this HN thread some limitations of the current categories implementation, and the possibility of adding tags:

IndexTank categories are not designed for the tags use case, and will not work properly. It's intended for a relatively small amount of categories for which each document has a single value. The amount of different values of a category can be large but the amount of categories cannot. If you want to implement something like tags, then each tag should be a category because you'll want more than a single tag per document. We were in the process of designing a new feature to support this kind of use case, and maybe we'll start a branch to implement it and hopefully the community will collaborate.
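
The "each tag should be a category" workaround from the quote can be sketched as below. The naming scheme (a "tag_" prefix and the value "yes") and the class TagCategories are assumptions made for illustration; nothing here comes from the IndexTank client or engine.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical mapping from a document's tags to one category per tag.
final class TagCategories {
    static Map<String, String> fromTags(List<String> tags) {
        Map<String, String> categories = new LinkedHashMap<>();
        for (String tag : tags) {
            // One category per tag, each holding a single value, which is
            // the shape the current categories feature expects.
            categories.put("tag_" + tag.trim().toLowerCase(Locale.ROOT), "yes");
        }
        return categories;
    }

    public static void main(String[] args) {
        // prints {tag_metal=yes, tag_wood=yes}
        System.out.println(fromTags(List.of("metal", "wood")));
    }
}
```

As the quote itself warns, the number of categories cannot be large, so this only suits a small, fixed vocabulary of tags.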

Support fetch/search all documents without "q" parameter

I know that I can add a field "all=1" to every document and then fetch everything with indexes/idx/search?q=all:1.
But I would prefer the "q" parameter to be entirely optional.
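
For anyone relying on the workaround in the meantime, it amounts to something like the sketch below. The class MatchAllField and its methods are hypothetical; only the field name "all", the value "1", and the query all:1 come from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for the "all=1" marker-field workaround.
final class MatchAllField {
    static final String FIELD = "all";
    static final String VALUE = "1";

    // Returns a copy of the document fields with the marker field added,
    // leaving the caller's map untouched.
    static Map<String, String> withMarker(Map<String, String> fields) {
        Map<String, String> copy = new LinkedHashMap<>(fields);
        copy.put(FIELD, VALUE);
        return copy;
    }

    // The query string that matches every document carrying the marker.
    static String query() {
        return FIELD + ":" + VALUE;
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("text", "I love Fallout");
        System.out.println(withMarker(doc)); // fields to index
        System.out.println(query());         // value for the q parameter
    }
}
```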

No unittests

There doesn't appear to be a testsuite of any kind associated with this code. Is this, perhaps, something that got omitted from the open source release? If so, is there any chance of getting it released too?

If there never was a testsuite, it would be good to start one to allow efficient further development.
