
indextank-engine's People

Contributors

astral1, clamprecht, dbuthay, iladriano, santip


indextank-engine's Issues

How to use cassandra as index storage and index recovery.

First of all, thanks for sharing this amazing project.
I am having trouble, though, configuring Cassandra as the index storage and for index recovery.
It seems to me that the storage interface for Cassandra has not been implemented.
Am I wrong?
Thanks in advance.

Embedded API raises exception during search after restart

If I start the embedded API and index some documents, I'm able to query them. If I then stop and restart the embedded API, any query for a term that was previously in the index makes the embedded API throw the IndextankException below. Searching for a term that wasn't previously in the index correctly returns a JSON result with zero matches.

I am using the default sample-engine-config and running on OS X.

Is there something I'm doing wrong here? Do I have to do something to trigger a reload of the previously indexed documents?

/var/www/indextank/indextank-engine$ java -cp target/indextank-engine-1.0.0-jar-with-dependencies.jar com.flaptor.indextank.api.Launcher 
WARN  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [log4j.properties not found on classpath!] 2012-01-15 12:56:30,351
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'environment-prefix' set to TEST] 2012-01-15 12:56:30,359
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'facets' set to true] 2012-01-15 12:56:30,359
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'index-code' set to dbajo] 2012-01-15 12:56:30,359
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Command line option 'conf-file' set to sample-engine-config] 2012-01-15 12:56:30,365
INFO  [main] com.flaptor.indextank.suggest.NewPopularityIndex - [Loading popularity index terms from disk.] 2012-01-15 12:56:30,724
INFO  [main] com.flaptor.indextank.suggest.NewPopularityIndex - [Terms loaded] 2012-01-15 12:56:30,725
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Index recovery configuration set to recover index from simpleDB] 2012-01-15 12:56:30,725
INFO  [main] com.flaptor.indextank.index.storage.InMemoryStorage - [Starting a new(empty) InMemoryStorage.] 2012-01-15 12:56:30,726
INFO  [main] com.flaptor.indextank.api.EmbeddedIndexEngine - [Using in-memory storage] 2012-01-15 12:56:30,727
INFO  [main] org.eclipse.jetty.util.log - [jetty-7.x.y-SNAPSHOT] 2012-01-15 12:56:30,790
INFO  [main] org.eclipse.jetty.util.log - [started o.e.j.s.ServletContextHandler{/,null}] 2012-01-15 12:56:30,821
INFO  [main] org.eclipse.jetty.util.log - [Started [email protected]:20220 STARTING] 2012-01-15 12:56:30,849
IndextankException(message:null)
    at com.flaptor.indextank.api.IndexEngineApi.search(IndexEngineApi.java:94)
    at com.flaptor.indextank.api.resources.Search.run(Search.java:79)
    at com.ghosthack.turismo.servlet.Servlet.service(Servlet.java:55)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:538)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:478)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:937)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:406)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:871)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
    at org.eclipse.jetty.server.Server.handle(Server.java:346)
    at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:589)
    at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:1048)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:601)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:214)
    at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:411)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:535)
    at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:40)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:529)
    at java.lang.Thread.run(Thread.java:637)

Problem accessing /v1/indexes/idx/search.

Hi all,

I am a web application developer and I am trying out IndexTank as part of my application. However, I ran into an error even when starting with the example.

I followed the given commands:

$ curl -d '{"docid":"post1", "fields":{"text":"I love Fallout"}}' -v -X PUT http://localhost:20220/v1/indexes/idx/docs

$ curl -d '{"docid":"post2", "fields":{"text":"I love Planescape"}}' -v -X PUT http://localhost:20220/v1/indexes/idx/docs

$ curl 'http://localhost:20220/v1/indexes/idx/search?q=love'

Then the search returned the following error:

java.lang.Error: Unresolved compilation:
at com.flaptor.util.CollectionsUtil.mergeIterables(CollectionsUtil.java:186)

Could it be a problem with Java type erasure, as Eclipse indicated when I imported the source code?

Please advise on what I should do to rectify the problem. Thank you very much in advance.

Implement maximum search queue length

To avoid situations where an index gets overloaded with searches (sometimes a slow search can cause this), I implemented an option called "max_search_queue", which limits the number of threads waiting to acquire the search semaphore in TrafficLimitingSearcher. If this limit is reached, the searcher rejects the search and throws InterruptedException instead of melting down. I've been running this in production and it seems to work well.

If this sounds good, I'll clean this up into a nice pull request, but here's the main code change:

clamprecht@1e766e2

The option name I chose in the indexengine_config is "max_search_queue".
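
The idea described in this issue can be sketched roughly as follows. This is not the code from the linked commit: the class name BoundedSearchGate, its constructor parameters, and the run method are hypothetical, and only java.util.concurrent.Semaphore is a real API. The sketch assumes searches are guarded by a semaphore and that a search arriving when the waiting queue is already at its limit should be rejected immediately. It throws InterruptedException only because the issue says the change does so.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

/** Hypothetical sketch of a bounded search queue; not the real TrafficLimitingSearcher. */
final class BoundedSearchGate {
    private final Semaphore permits;
    private final int maxQueue; // plays the role of "max_search_queue"

    BoundedSearchGate(int maxConcurrent, int maxQueue) {
        this.permits = new Semaphore(maxConcurrent, true);
        this.maxQueue = maxQueue;
    }

    /** Runs the search, or rejects it if too many threads are already waiting. */
    <T> T run(Callable<T> search) throws Exception {
        if (!permits.tryAcquire()) {
            // getQueueLength() is only an estimate, so the limit is approximate.
            if (permits.getQueueLength() >= maxQueue) {
                throw new InterruptedException("search queue is full");
            }
            permits.acquire(); // wait in line for a free permit
        }
        try {
            return search.call();
        } finally {
            permits.release();
        }
    }
}
```

Because the queue length is checked before waiting, a burst of concurrent callers can briefly exceed the limit by a few threads. For load shedding that is usually acceptable, but it is a design choice worth stating.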

amazing work

Amazing work. Could you point me to any IndexTank performance documentation?
For example, how many documents can a single IndexTank instance hold?

the storefront fails to communicate with the api

I am running the IndexTank service locally on my home computer, but when I try to populate the demoindex entry I get this:

{'Authorization': 'Basic OjlVUjBGeHRjbGZTUFAw', 'User-Agent': 'IndexTank-Python/1.0.7'}

Why the software history was not kept?

Hi there,

I'm a researcher studying software evolution. As part of my current research, I'm studying the implications of open-sourcing proprietary software, for instance, whether the project succeeds in attracting newcomers. However, I observed that some projects, like indextank-engine, deleted the software history during the transition to open source.

bbf3a53

Knowing that software history is indispensable for developers (e.g., developers need to refer to history several times a day), I would like to ask indextank-engine developers the following four brief questions:

  1. Why did you decide to not keep the software history?
  2. Did the core developers face any kind of problems when trying to refer to the old history? If so, how did they solve these problems?
  3. Did the newcomers face any kind of problems when trying to refer to the old history? If so, how did they solve these problems?
  4. How has the lack of history affected software evolution? Has it placed any burden on understanding and evolving the software?

Thanks in advance for your collaboration,

Gustavo Pinto, PhD
http://www.gustavopinto.org

Not able to add function

When I try to add a function with index.addFunction(1, "relevance"); I get an indextank.apiclient.IndexDoesNotExistException and the following error:

    Error 404 Not Found
    HTTP ERROR: 404
    Problem accessing /v1/indexes/idx/functions/1.
    Reason: Not Found

Where is the index persisted?

Hi all,

After I start the indextank-engine using the quick start and index some documents, what happens to the documents if I shut down the engine? Will the documents and the index be persisted in some files, or do I have to reindex each time I restart the engine?

Sorry for this series of questions, but I can't find any user guide on deploying my own IndexTank. Thank you.

I am interested in a hosted IndexTank

I thought I'd open this thread for anyone interested in indextank as a hosted service to indicate their interest.

As a current IndexTank customer, I'd definitely be interested in switching over to a new service hosting the IndexTank software.

OSGi

Hi all,
Instead of a single jar containing all dependencies, how about using an OSGi (Open Services Gateway initiative) framework such as Apache Felix or Equinox to manage the dependencies?
Regards

Support for Multiple-Valued Categories

I'd love to see support for multiple category values for a document. This would be great for use cases like tags, multiple colors/sizes on products, etc. I found it surprising that this isn't supported (everything else is so great). It ends up being a show-stopper for me.

Support for tags

Creating this issue to start community discussion for adding support for tags. Currently categories don't allow more than one value per category per document. For example, if a doc has a category "material", with values like metal, plastic, wood, each document can only have one value for this category.

Santip mentioned on this HN thread some limitations of the current categories implementation, and the possibility of adding tags:

IndexTank categories are not designed for the tags use case, and will not work properly. It's intended for a relatively small amount of categories for which each document has a single value. The amount of different values of a category can be large but the amount of categories cannot. If you want to implement something like tags, then each tag should be a category because you'll want more than a single tag per document. We were in the process of designing a new feature to support this kind of use cases, and maybe we'll start a branch to implement it and hopefully the community will collaborate.
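
The workaround hinted at in the quote (one category per tag) could look something like the sketch below. Everything here is an assumption made for illustration: the TagCategories class, the "tag_" prefix, and the "true" value are hypothetical and are not part of IndexTank. The sketch only builds the category map for a document and does not call any client API. As the quote warns, the number of categories cannot be large, so this would only suit a small, fixed tag vocabulary.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: turns a document's tags into one category per tag. */
final class TagCategories {
    static Map<String, String> fromTags(List<String> tags) {
        Map<String, String> categories = new LinkedHashMap<>();
        for (String tag : tags) {
            // Each tag becomes its own category holding a single value,
            // which fits the one-value-per-category restriction.
            categories.put("tag_" + tag, "true");
        }
        return categories;
    }
}
```

A document tagged metal and wood would then carry the categories tag_metal and tag_wood, each with the value "true", and a facet or filter on tag_metal would select it.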

No unittests

There doesn't appear to be a test suite of any kind associated with this code. Is this, perhaps, something that got omitted from the open source release? If so, is there any chance of getting it released too?

If there never was a test suite, it would be good to start one to allow efficient further development.
