Why did FB not add cache logic inside RocksDB and instead create a separate library, CacheLib? both a


FAQ RocksDB vs Cachelib about cachelib HOT 3 CLOSED

facebook commented on July 27, 2024

FAQ RocksDB vs Cachelib

from cachelib.

Comments (3)

1a1a11a commented on July 27, 2024

I am not a Facebook employee, but here is my view on why the KV store and the KV cache are two separate components:

design requirements: a) RocksDB is optimized for writes rather than reads, while CacheLib is optimized for reads. b) RocksDB and CacheLib have different feature sets, which allow different optimizations. For example, a cache does not have to provide a durability guarantee or support range requests, and dropping both improves performance (no WAL, no need to keep data in sorted order, etc.).
optimization goals: because they are used differently, a cache and a store have different optimization goals. A cache prioritizes efficiency (miss ratio), while a store prioritizes durability and write throughput (among many others).
traffic and usage difference: a) a cache is deployed in front of a store, so the cache sees more read traffic while the store sees more write traffic. b) a cache can be used to store data that can tolerate some degree of loss (for example, rate limiters).
deployment difference: a) because KV store deployments act as the source of truth and require strong consistency and a durability guarantee (data loss is not acceptable), they are usually replicated across multiple nodes with consensus algorithms like Raft. By comparison, data in a cache can be evicted, and a cache can be deployed either as an in-application library or as a distributed cache (like Memcached). b) a cache can be deployed as DRAM only, while a store usually won't be, due to the high cost of DRAM and the difficulty of achieving durability with DRAM.
capacity planning: in terms of working set, a key-value store must be provisioned with enough storage capacity to hold all the data, while a cache does not need to hold the whole working set. The capacity usage of a KV store therefore grows much faster than that of a KV cache, which means coupling the two into one library may complicate capacity planning and deployment.
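To make the design-requirement point concrete, here is a minimal Python sketch contrasting the two roles. It is only an illustration under my own assumptions: `ToyStore` and `ToyCache` are hypothetical names, and neither reflects how RocksDB or CacheLib is actually implemented. The store pays for a write-ahead log and sorted keys; the cache skips both and simply evicts.

```python
import bisect
import json
import os
from collections import OrderedDict


class ToyStore:
    """Hypothetical durable KV store: log every write, keep keys sorted."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.data = {}
        self.sorted_keys = []
        # Recover state by replaying the log, if one exists.
        if os.path.exists(wal_path):
            with open(wal_path) as f:
                for line in f:
                    key, value = json.loads(line)
                    self._apply(key, value)

    def _apply(self, key, value):
        if key not in self.data:
            bisect.insort(self.sorted_keys, key)
        self.data[key] = value

    def put(self, key, value):
        # Durability cost: append and fsync before acknowledging the write.
        with open(self.wal_path, "a") as f:
            f.write(json.dumps([key, value]) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._apply(key, value)

    def get(self, key):
        return self.data.get(key)

    def range(self, lo, hi):
        # Sorted keys make a scan over [lo, hi) straightforward.
        i = bisect.bisect_left(self.sorted_keys, lo)
        j = bisect.bisect_left(self.sorted_keys, hi)
        return [(k, self.data[k]) for k in self.sorted_keys[i:j]]


class ToyCache:
    """Hypothetical LRU cache: bounded capacity, no log, no key ordering."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used
```

Reopening `ToyStore` on the same log path brings back every write, whereas `ToyCache` forgets the oldest entry as soon as it is over capacity, which is acceptable only because something else holds the source of truth.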


gaowayne commented on July 27, 2024

@1a1a11a thank you so much man, this is very clear!


sathyaphoenix commented on July 27, 2024

@gaowayne Juncheng covered the main reasons why CacheLib and RocksDB are two separate libraries. One thing I would add is that the primary table stake for a database storage engine is the durability of data that has been written. This is not the case for caches, which by nature can evict data as more is added. Both libraries leverage this difference in expectations in their design to make the most of their other constraints: optimizing read performance for caching, and optimizing write performance and feature set for storage engines.
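As a rough illustration of that difference in expectations, the sketch below puts a small cache in front of a store on the read path. It is a toy under stated assumptions: `LookAsideCache` is a hypothetical name, a plain `dict` stands in for the durable store, and nothing here is CacheLib's or RocksDB's API. An evicted entry costs only an extra trip to the store, and the miss ratio is the number the cache is trying to minimize.

```python
from collections import OrderedDict


class LookAsideCache:
    """Hypothetical read path: serve hits from a bounded LRU, else ask the store."""

    def __init__(self, store, capacity):
        self.store = store        # source of truth; must never lose data
        self.capacity = capacity
        self.lru = OrderedDict()  # free to drop any entry at any time
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.lru:
            self.hits += 1
            self.lru.move_to_end(key)  # mark as most recently used
            return self.lru[key]
        self.misses += 1
        value = self.store[key]        # the store only sees the misses
        self.lru[key] = value
        if len(self.lru) > self.capacity:
            self.lru.popitem(last=False)  # evict least recently used
        return value

    def miss_ratio(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


store = {f"k{i}": i for i in range(100)}  # stand-in for the durable store
cache = LookAsideCache(store, capacity=10)
for key in ["k1", "k2", "k1", "k1", "k2"]:
    cache.get(key)
print(cache.miss_ratio())  # 2 misses out of 5 reads
```

Shrinking `capacity` raises the miss ratio but never changes the values returned, since every miss is answered by the store.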

