streams & change notifications (cubdb issue, closed)

dch avatar dch commented on May 16, 2024
streams & change notifications

from cubdb.

Comments (2)

lucaong avatar lucaong commented on May 16, 2024

Hi @dch,
The select/3 function uses lazy streams internally, so if you do, for example, something like:

# Integer.is_even/1 is a macro, so the module must be required first
require Integer

CubDB.select(db, pipe: [
  map: fn {_key, value} -> value end,
  filter: fn x -> Integer.is_even(x) end,
  map: fn x -> x * 2 end,
  take: 3
])

The transformations above are executed as a lazy stream. In particular, because of the final take: 3, reading from disk stops as soon as 3 results have been produced, rather than scanning the whole database.
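
To make the laziness concrete, here is a minimal sketch using only Elixir's standard Stream module, with no CubDB involved. The `entries` stream and the `counter` agent are hypothetical stand-ins for entries read from disk; they are not part of CubDB's API. The counter records how many entries were actually pulled from the source.

```elixir
require Integer

# Hypothetical stand-in for on-disk entries: every element pulled from the
# source increments a counter, so we can see how much was actually read.
{:ok, counter} = Agent.start_link(fn -> 0 end)

entries =
  Stream.map(1..1_000, fn n ->
    Agent.update(counter, &(&1 + 1))
    {"key#{n}", n}
  end)

# Same pipeline as in the select/3 example, expressed with Stream functions.
result =
  entries
  |> Stream.map(fn {_key, value} -> value end)
  |> Stream.filter(fn x -> Integer.is_even(x) end)
  |> Stream.map(fn x -> x * 2 end)
  |> Enum.take(3)

reads = Agent.get(counter, & &1)

IO.inspect(result, label: "result")
IO.inspect(reads, label: "entries read")
```

The pipeline stops pulling from the source once three results are collected, so only a handful of the 1,000 entries are ever read.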

It would be nice to have an interface such as:

# Note: this is NOT the way CubDB actually works, just illustrating a point
CubDB.select() |> Stream.map(fn x -> ... end) |> Stream.filter(...) |> ...

Unfortunately, that would be tricky and error prone. Remember that CubDB allows concurrent reads and writes, so you can run a long select while some other process is writing. This works because the select runs against a zero-cost immutable snapshot: it "sees" the database frozen in the state it had when the select started. Eventually, when a compaction runs, it cleans up and removes the old un-compacted database file, but it can only do so once no read operation is still "seeing" the old file, or it would pull the file out from under the reader. For this reason, CubDB has to keep track internally of all readers and of which point in the database history each one is seeing.

The way the select/3 API is designed allows CubDB to perform this internal bookkeeping without user intervention. If the API were like the fake example above instead, the user would have to manually "check in" before reading and "check out" after finishing processing the stream. If a user forgot to "check out", compaction operations would be blocked indefinitely (or, alternatively, if compaction were allowed to run anyway, it could break slow readers).
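
As a rough illustration of why a callback-style API makes this bookkeeping automatic, here is a minimal sketch. The `ReaderTracker` module and all of its functions are hypothetical and are not CubDB's actual internals; the sketch only shows the general pattern of wrapping the read in a function so that "check out" always happens, even if the reader raises.

```elixir
defmodule ReaderTracker do
  # Hypothetical illustration only: this is NOT how CubDB is implemented.
  # It tracks the set of active readers in an Agent.

  def start_link do
    Agent.start_link(fn -> MapSet.new() end)
  end

  # Runs `fun` as a tracked read: "check in" before, "check out" after.
  # The `after` clause guarantees the check out even if `fun` raises.
  def with_snapshot(tracker, fun) do
    ref = make_ref()
    Agent.update(tracker, &MapSet.put(&1, ref))

    try do
      fun.()
    after
      Agent.update(tracker, &MapSet.delete(&1, ref))
    end
  end

  # A cleanup step (such as removing an old file) would only be safe
  # when this returns 0.
  def active_readers(tracker) do
    Agent.get(tracker, &MapSet.size/1)
  end
end

{:ok, tracker} = ReaderTracker.start_link()

during =
  ReaderTracker.with_snapshot(tracker, fn ->
    ReaderTracker.active_readers(tracker)
  end)

after_read = ReaderTracker.active_readers(tracker)

IO.inspect({during, after_read}, label: "readers during/after")
```

With a bare stream returned to the caller, there is no equivalent of the `after` clause: nothing tells the tracker when the caller has finished (or abandoned) the stream.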

In conclusion, performance-wise, select/3 already uses lazy streams under the hood, so it minimizes disk operations. A stream-based API would be nice to have, but it would cause the problems described above.

Regarding allowing processes to subscribe to changes, I am curious: what did you have in mind? It sounds like an interesting option, maybe better implemented as a library on top of CubDB.

lucaong avatar lucaong commented on May 16, 2024

Closing this for now as there is no response. Feel free to comment on it and I will reopen if necessary.
