Which index codecs are supported? (rucene issue, closed)

zhihu avatar zhihu commented on June 6, 2024
Which index codecs are supported?

from rucene.

Comments (9)

sunxiaoguang avatar sunxiaoguang commented on June 6, 2024

We started Rucene three years ago, when Lucene 6.2 was the latest codec. After the long journey of migrating our ES-based search engine to Rucene, we no longer need to maintain compatibility with Lucene. We therefore did not upgrade the Lucene codec to a later version, and we made some binary-incompatible changes to implement features that are crucial to our scenarios, such as in-place update.

mooreniemi avatar mooreniemi commented on June 6, 2024

I understand. The downside is that given industry has moved on to Lucene 7 and 8, on-boarding existing indices onto Rucene would mean downgrading them first. This limits the utility.

Have you documented the incompatible changes you made and why?

I may end up writing a codec reader in Rust to serve my purpose and, if it would be useful, could contribute it back.

sunxiaoguang avatar sunxiaoguang commented on June 6, 2024

Unfortunately, due to limited resources, we did not put the incompatible changes into a new codec; we made them directly in the only codec we have. We will try to translate our internal documents about the changes and the rationale behind them, and will let you know when they are ready.

As for the idea of a codec for newer Lucene versions, that would be great. We really appreciate your kind offer; let us know whenever you need help.

mooreniemi avatar mooreniemi commented on June 6, 2024

I've been taking a look at this but without knowing what changes you made to the standard 6.x codec it's a bit tricky to translate. Even a rough summary here is helpful.

What I may do instead is just try going from scratch translating a more recent codec.

jtong11 avatar jtong11 commented on June 6, 2024

Hi, we are based on es-5.4 with lucene-core-6.4.18 and do not track any later version.

The notable codec changes are in these commits:
doc values update: b9a43cc, 9c6e614
EF encoder: a700992
SIMD usage: 6629d2f

mooreniemi avatar mooreniemi commented on June 6, 2024

Thanks!

Is there a way to open an IndexReader without writing anything? I have gotten as far as reading norms now, and 1. the code assumes all fields have norms (not true for ES _id) so I have to chop around this and 2. when I read the index I'm corrupting it even though I think I've commented out everywhere it tries to write...
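
To illustrate the first point, here is a minimal sketch of a norms lookup that tolerates fields without norms. `FieldMeta`, `NormsReader`, and `norm_or_default` are hypothetical names made up for this example and are not Rucene's API; the only idea shown is returning `Option` instead of assuming every field has norms.

```rust
use std::collections::HashMap;

/// Hypothetical per-field metadata (not a Rucene type).
#[derive(Debug, Clone)]
pub struct FieldMeta {
    pub name: String,
    /// False for fields indexed without norms, such as an ES `_id` field.
    pub has_norms: bool,
}

/// Hypothetical norms store: only fields indexed with norms have an entry.
pub struct NormsReader {
    per_field: HashMap<String, Vec<u8>>,
}

impl NormsReader {
    pub fn new(per_field: HashMap<String, Vec<u8>>) -> Self {
        Self { per_field }
    }

    /// Returns `None` instead of failing when the field omits norms.
    pub fn norms(&self, field: &FieldMeta) -> Option<&[u8]> {
        if !field.has_norms {
            return None;
        }
        self.per_field.get(&field.name).map(|v| v.as_slice())
    }
}

/// Norm byte for `doc`, falling back to `default` when the field has no
/// norms or the document is out of range.
pub fn norm_or_default(
    reader: &NormsReader,
    field: &FieldMeta,
    doc: usize,
    default: u8,
) -> u8 {
    reader
        .norms(field)
        .and_then(|n| n.get(doc).copied())
        .unwrap_or(default)
}
```

A caller that scores documents can then pass a neutral default for norm-less fields rather than patching each call site that reads norms.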

jtong11 avatar jtong11 commented on June 6, 2024

I'm afraid Rucene can now only open indices that were created by Rucene itself; it can no longer open ES indices, including those from 6.4.18. :(

mooreniemi avatar mooreniemi commented on June 6, 2024

I know; I was told that further up in the thread, but I am pursuing opening a more recent index. I have search and stored-field reading working. I may or may not try to get doc values working, as I'm not sure I need them for my use case.

But my main issue now is just that it seems somewhere Rucene writes/corrupts the index while reading it, and I can't figure out where. I've disabled creating backup segments and everywhere else I see it write. In my case I am not dealing with a live index so that stuff is not necessary. If you have any other pointers about that it's appreciated, otherwise feel free to close this.
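
One generic way to narrow down where a supposedly read-only code path writes is to fingerprint the index directory before and after the read and compare. The sketch below uses only the Rust standard library and does not touch Rucene; `snapshot_dir` and `changed_files` are hypothetical helper names, and the checksum is a plain FNV-1a hash chosen for brevity, not for collision resistance.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::Path;

/// Size and a simple FNV-1a checksum of one file's bytes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileFingerprint {
    pub len: u64,
    pub checksum: u64,
}

/// 64-bit FNV-1a hash over a byte slice.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Fingerprints every regular file directly inside `dir` (non-recursive).
/// Each file is read fully into memory, so this suits test-sized indices.
pub fn snapshot_dir(dir: &Path) -> io::Result<BTreeMap<String, FileFingerprint>> {
    let mut out = BTreeMap::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        if !entry.file_type()?.is_file() {
            continue;
        }
        let bytes = fs::read(entry.path())?;
        let name = entry.file_name().to_string_lossy().into_owned();
        out.insert(
            name,
            FileFingerprint { len: bytes.len() as u64, checksum: fnv1a(&bytes) },
        );
    }
    Ok(out)
}

/// Sorted names of files added, removed, or modified between two snapshots.
pub fn changed_files(
    before: &BTreeMap<String, FileFingerprint>,
    after: &BTreeMap<String, FileFingerprint>,
) -> Vec<String> {
    let mut changed = Vec::new();
    for (name, fp) in before {
        // Covers both modified files and files missing from `after`.
        if after.get(name) != Some(fp) {
            changed.push(name.clone());
        }
    }
    for name in after.keys() {
        if !before.contains_key(name) {
            changed.push(name.clone());
        }
    }
    changed.sort();
    changed
}
```

Taking a snapshot, running one stage of the read path, and calling `changed_files` shows which files that stage touched; repeating this stage by stage narrows the search for the offending write.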

If I actually get to the doc values stuff I will submit a PR.

mooreniemi avatar mooreniemi commented on June 6, 2024

I found the call to write. All set now. :)
