
Comments (12)

lemire commented on May 16, 2024

We have several implementations of Roaring over 64-bit integers in separate repositories...

https://bitbucket.org/samytto/lazyroaring64-bits

https://bitbucket.org/samytto/roaring64bits2levels

https://bitbucket.org/samytto/roaringtreemap

... as well as an evaluation...

https://bitbucket.org/samytto/datastructures64-bitints

from croaring.

lemire commented on May 16, 2024

@fsaintjacques

  1. It is "easy" to build a 64-bit data structure on top of a 32-bit one: put the Roaring bitmaps into a tree with 32-bit keys. Samy Chambi has done the work and will have a chapter in his thesis on this topic. There are a few interesting design issues. This work should be published in 2016. Because 64-bit support can be built on top of a 32-bit data structure, the 32-bit limit is not an issue. It is also possible to build a custom 64-bit data structure, but, again, the 16-bit containers can still prove useful as-is. So supporting 64-bit integers can be done separately without harm.
  2. Roaring is used in analytical systems like Druid or Apache Kylin that do scale up to very large data sets, while using only 32 bits. That's because 2^32 is a large number and it often makes sense to partition the problem well before you get to such counts. You probably want to partition for other reasons... such as parallelism.
  3. If you do have a practical use case where, for example, hash sets are not an obviously good choice, then I would be very interested.
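To make point 1 concrete, here is a minimal sketch of the "tree with 32-bit keys" idea. It is only an illustration under stated assumptions, not the design from the thesis or from any Roaring library: `TwoLevelSet64` and `Bitmap32` are hypothetical names, and `std::set<uint32_t>` stands in for a 32-bit Roaring bitmap so that the example compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>

// Stand-in for a 32-bit Roaring bitmap (assumption: a real implementation
// would plug a Roaring bitmap in here; std::set keeps the sketch self-contained).
using Bitmap32 = std::set<uint32_t>;

// Hypothetical two-level 64-bit set: an ordered tree keyed by the high
// 32 bits of each value, where each node stores the low 32 bits.
class TwoLevelSet64 {
public:
    void add(uint64_t value) {
        // operator[] creates the bucket on first use.
        buckets_[high(value)].insert(low(value));
    }

    bool contains(uint64_t value) const {
        auto it = buckets_.find(high(value));
        return it != buckets_.end() && it->second.count(low(value)) > 0;
    }

    uint64_t cardinality() const {
        uint64_t total = 0;
        for (const auto& entry : buckets_) total += entry.second.size();
        return total;
    }

    std::size_t bucket_count() const { return buckets_.size(); }

    // Smallest stored value. Because the tree is ordered by high bits,
    // the first bucket always holds the minimum. The set must not be empty.
    uint64_t minimum() const {
        assert(!buckets_.empty());
        const auto& first = *buckets_.begin();
        return (static_cast<uint64_t>(first.first) << 32) | *first.second.begin();
    }

private:
    static uint32_t high(uint64_t v) { return static_cast<uint32_t>(v >> 32); }
    static uint32_t low(uint64_t v) { return static_cast<uint32_t>(v); }  // keeps the low 32 bits

    std::map<uint32_t, Bitmap32> buckets_;
};
```

Values that share the same high 32 bits land in the same bucket, so a data set that fits in 32 bits uses exactly one bucket and behaves like the plain 32-bit structure.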

lemire commented on May 16, 2024

I'm going to leave this open.

madscientist commented on May 16, 2024

All the implementations above appear to be in Java; any news on a 64-bit version for C/C++? We have various IDs which are 64 bits wide (32 bits is a lot, yes, but when you're talking hundreds of transactions per second, services running for weeks or months, and IDs being stored durably as well, it goes by pretty quickly...). We'll look into a two-layer approach. You mentioned a year ago that Samy Chambi was working on doing this and might publish something; did that happen, and if so, is there someplace to get a copy?

Cheers!

lemire commented on May 16, 2024

@madscientist

> any news on a 64bit version for C/C++?

Not that I know of. I have marked the issue as "help wanted" just now.

> You mentioned a year ago that Samy Chambi was working on doing this and might publish something; did that happen and if so is there someplace to get a copy?

Yes. There is a paper in French with experimental results as well as links to (Java) implementations. http://r-libre.teluq.ca/930/1/Roaring64bits.pdf

lemire commented on May 16, 2024

We have a relevant PR: #75

lemire commented on May 16, 2024

With the merger of PR #75, this issue is resolved for the C++ layer.

lemire commented on May 16, 2024

I think that the issue remains open for C users.

K2 commented on May 16, 2024

@lemire I just wrote a C++/CLI wrapper that allows .NET callers (C#, F#, etc.) to use the 64-bit C++ interface. It can be amalgamated since I used inline #pragma defines for the CLI mode. It requires a small change to the build settings in Visual Studio that specifies mixed mode (only for shim.cpp). I can send you a pull request if you'd like. I'm not really great with CMake, but I can include a project settings file if that helps. The nice thing about mixed mode is that there is only one DLL instead of at least two with other thunk mechanisms.

lemire commented on May 16, 2024

I'd encourage you to issue a PR.

lemire commented on May 16, 2024

@K2 My main concern at this point is to make sure that your code can be used by others in a reasonable manner. It is fine to ignore CMake... but I am concerned about dropping undocumented mysterious files in the middle of a medium-size project. Who is going to find these files and know how to use them?

K2 commented on May 16, 2024

Sure, I can take a look at CMake if you'd like, or if you have other suggestions I'm open to them. Maybe another .NET project for 64-bit (RogueException/CRoaring.Net is very good for 32-bit).

Similar to those who started this thread (and others, see Tornhoof/RoaringBitmap#2) regarding 64-bit implementations, I was looking over the various implementations for a 64-bit compatible version before going out and making my own, when I noticed the C++ version and figured that it'd be nice if that were exported to .NET, since it's an existing and tested implementation. The CLI code I wrote is almost 1:1 with Roaring64Map in terms of interfaces (the exception is the move operator; I could do better at simulating that ;).

To me it seemed less error-prone to add 64-bit .NET support by extending an existing underlying native implementation, since that way I'd be less likely to inject any serious bugs. An added bonus is that as the Roaring64Map class performance improves, this interface automatically gets faster too.

I think that in a perfect world, if/when the underlying C interfaces are extended to 64-bit, RogueException's exemplary work would do well for any other .NET language. I'd tend to defer to @RogueException to extend his code on his timeline. Moving to 64-bit while maintaining good performance isn't quite as easy, since the *Many operators essentially turn into enumerated forms over the inputs, which slows everything down a lot (e.g. Roaring64Map::AddMany). If I were to rewrite those methods, it'd be a lot nicer to determine a *Many threshold above which to sort/bucket the inputs and use the underlying C *many calls to speed them up...
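For illustration, the sort/bucket idea could look roughly like the sketch below. This is not the Roaring64Map code: `add_many64`, `add_many32`, `Map64` and the `kBulkThreshold` value are all hypothetical, and `std::set<uint32_t>` stands in for a 32-bit Roaring bitmap, with `add_many32` playing the role of the underlying C bulk-insert call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

using Bitmap32 = std::set<uint32_t>;          // stand-in for a 32-bit Roaring bitmap
using Map64 = std::map<uint32_t, Bitmap32>;   // high 32 bits -> bitmap of low 32 bits

// Hypothetical bulk insert into one 32-bit bitmap; a stand-in for the
// underlying C "add many" call mentioned above.
void add_many32(Bitmap32& bitmap, const uint32_t* values, std::size_t count) {
    bitmap.insert(values, values + count);
}

// Hypothetical cutoff: below this size, sorting is assumed not to pay off.
constexpr std::size_t kBulkThreshold = 64;

// Takes the input by value because the bulk path sorts it.
void add_many64(Map64& map, std::vector<uint64_t> values) {
    if (values.size() < kBulkThreshold) {
        // Small input: insert one value at a time (the "enumerated" form).
        for (uint64_t v : values) {
            map[static_cast<uint32_t>(v >> 32)].insert(static_cast<uint32_t>(v));
        }
        return;
    }
    // Large input: sort so that values sharing the same high 32 bits are
    // adjacent, then issue one bulk call per run.
    std::sort(values.begin(), values.end());
    std::vector<uint32_t> lows;
    std::size_t i = 0;
    while (i < values.size()) {
        const uint32_t high = static_cast<uint32_t>(values[i] >> 32);
        lows.clear();
        while (i < values.size() && static_cast<uint32_t>(values[i] >> 32) == high) {
            lows.push_back(static_cast<uint32_t>(values[i]));
            ++i;
        }
        assert(!lows.empty());  // every run contains at least values[i]
        add_many32(map[high], lows.data(), lows.size());
    }
}
```

The bulk path does one tree lookup per distinct high key instead of one per value; where the threshold should sit would have to be measured.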

I can fork this off or invite @RogueException for suggestions or @ksajme or anybody else...
