Giter Site home page Giter Site logo

Comments (6)

roseduan avatar roseduan commented on June 14, 2024 2

OK, we can just make a simple mechanism, to scan the btree to delete expired keys.
And we can add a parameter to limit the execution time of the function, to avoid locking too long time.

Thanks for your recent work.

from rosedb.

Jeremy-Run avatar Jeremy-Run commented on June 14, 2024

@roseduan Do you think this feature is necessary?

from rosedb.

roseduan avatar roseduan commented on June 14, 2024

Yes it is necessary, but we need to rethink the specific strategy, may be we can see how Redis does.

from rosedb.

Jeremy-Run avatar Jeremy-Run commented on June 14, 2024

@roseduan Is this plan ok?

Points to consider:

  • Deleting the expired key requires a global write lock. If we read the record from the disk and then judge whether record is expired, it will affect the performance, so this operation should be based on pure memory.
  • For Bitcask, memory is very precious, so users should decide whether we can occupy more cache to perform expired key recovery based on actual scenarios.

Implementation plan:

  • Add a global switch to enable automatic deletion of expired keys.
  • If it is enabled, build an orderedMap in the cache, store the key and expiration time, sort according to the expiration time from small to large; then start a background goroutine, wait for the minimum time in the ordered dictionary to arrive before performing deletion (a random number will be made here Wait to prevent batch keys from expiring at the same time and occupying the write lock all the time). Merge(true) / Open() needs to rebuild this orderedMap in memory.
  • If it is not enabled, users can choose to call Merge(true) actively, because when Merge(true), the index will be rebuilt and expired keys will be filtered out.

from rosedb.

roseduan avatar roseduan commented on June 14, 2024

Why do we need an ordered map to store the key and expiration?

I think the easiest and most straightforward way to do this is to iterate all keys in BTree and delete all expired keys.

We can add a public function to callers(like DeleteExpiredKeys), they can call it as their needs.

And we should clarify that this method may take a long time to finish, we can do some other optimizations when someone acquires for it.

from rosedb.

Jeremy-Run avatar Jeremy-Run commented on June 14, 2024

My plan is to delete expired keys without stopping DB service as much as possible (because this is an optimization function, not a core function); If it is allowed to stop the service for a long time, it is a good idea to iterate all the keys of the BTree to delete the expired keys. Of course, executing Merge(true) or reopening the db can also filter the expired keys.

The reason for using an orderedMap for storage is mainly due to two considerations:
①The key in the map is unique, which is unified with our db.
②It is in order, so it's possible to know the next execution time of the background goroutine and avoid the time of invalid write lock.

Of course it takes more memory, and written data that have expiration time will be slower. As you said, we can make a simple mechanism (scan BTree) first, and then optimize it according to the needs if there are user needs later.

from rosedb.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.