Comments (6)
Can you share a snippet of your code on initially creating the cache, and then shutdown, and then recover?
We have not seen the issue you described here. The logic throwing is this:
// Check if nextPoolid is restored properly or not. If not restored,
// throw error
if (!object.nextPoolId_ref().is_set()) {
  throw std::logic_error(
      "Memory Pool Manager can not be restored,"
      " nextPoolId is not set");
}
https://github.com/facebook/CacheLib/blob/main/cachelib/allocator/memory/MemoryPoolManager.cpp#L42
But if the shutdown completed successfully, we must have serialized "nextPoolId" and stored it in the thrift structure:
object.set_nextPoolId(nextPoolId_);
https://github.com/facebook/CacheLib/blob/main/cachelib/allocator/memory/MemoryPoolManager.cpp#L168
So it's quite odd that on recovery you're hitting an error that complains this thrift field is not set. Can you add some debug code before the throw to print out the value of object.nextPoolId_ref().value()?
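A minimal sketch of the kind of debug output being asked for here. The struct and helper below are hypothetical stand-ins, not CacheLib or fbthrift types. In the real code the two values would come from the is_set() and value() accessors quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the deserialized thrift object: one field
// plus a flag recording whether it was present in the serialized data.
struct RestoredState {
  bool nextPoolIdIsSet = false;
  int32_t nextPoolId = 0;
};

// Builds a message describing the field so it can be logged right before
// the logic_error is thrown. The value is only reported when the flag says
// the field is set.
std::string describeNextPoolId(const RestoredState& object) {
  if (!object.nextPoolIdIsSet) {
    return "nextPoolId: <not set>";
  }
  return "nextPoolId: " + std::to_string(object.nextPoolId);
}
```

Logging the result of describeNextPoolId(object) just before the throw shows whether the field arrived unset or whether the check itself is at fault.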
from cachelib.
I agree it is very odd after checking the code.
Here is the code.
Creating the cache and recovering:
static std::unique_ptr<Cache> CreatePersistentCache(const size_t cache_size,
                                                    const std::string& cache_name,
                                                    const std::string& cache_metadata_directory,
                                                    const int32_t buckets_power,
                                                    const int32_t locks_power) {
  try {
    bool is_metadata_exist = boost::filesystem::exists(METADATA_PATH);
    LOG(INFO) << folly::sformat("The metadata file exists before cache created?: {}.",
                                is_metadata_exist);
    if (is_metadata_exist) {
      LOG(INFO) << folly::sformat("The size of file is: {}.",
                                  boost::filesystem::file_size(METADATA_PATH));
    }
    // Cache config
    Cache::Config config;
    config.setCacheSize(cache_size)
        .setCacheName(cache_name)
        .setAccessConfig({buckets_power, locks_power})
        .enableCachePersistence(cache_metadata_directory)
        .validate();
    LOG(INFO) << folly::sformat("Cachelib cache_size in config is: {}", cache_size);
    LOG(INFO) << folly::sformat("Cachelib allocationClassSizeFactor in config is: {}",
                                config.allocationClassSizeFactor);
    LOG(INFO) << folly::sformat(
        "Cachelib buckets_power is: {}, locks_power is: {}", buckets_power, locks_power);
    // Create cache
    std::unique_ptr<Cache> cache;
    try {
      cache = std::make_unique<Cache>(Cache::SharedMemAttach, config);
      // Cache is now restored
    } catch (const std::exception& ex) {
      // Failed to attach the cache. Create a new one but make sure that
      // the old cache is destroyed before creating a new one.
      // This allows us to release any held resources (such as
      // open file descriptors and associated fcntl locks).
      cache.reset();
      LOG(ERROR) << "Failed to attach the old cache when restarting: " << ex.what();
      cache = std::make_unique<Cache>(Cache::SharedMemNew, config);
    }
    return cache;
  } catch (const std::exception& ex) {
    LOG(ERROR) << "Error in persistent cache creation: " << ex.what();
    throw;
  }
}
Shutdown:
void CachelibCacheHandler::ShutDown() {
  if (!FLAGS_enable_persistent_cachelib) {
    return;
  }
  bool is_metadata_exist = boost::filesystem::exists(METADATA_PATH);
  LOG(INFO) << folly::sformat("The metadata file exists before cache shutdown?: {}.",
                              is_metadata_exist);
  if (is_metadata_exist) {
    LOG(INFO) << folly::sformat("The size of file is: {}.",
                                boost::filesystem::file_size(METADATA_PATH));
  }
  auto status = cache_->shutDown();
  is_metadata_exist = boost::filesystem::exists(METADATA_PATH);
  LOG(INFO) << folly::sformat("The metadata file exists after cache shutdown?: {}.",
                              is_metadata_exist);
  if (is_metadata_exist) {
    LOG(INFO) << folly::sformat("The size of file is: {}.",
                                boost::filesystem::file_size(METADATA_PATH));
  }
  switch (status) {
    case Cache::ShutDownStatus::kSuccess:
      LOG(INFO) << folly::sformat(
          "Cache is successfully shut down and metadata is recorded in directory {}",
          FLAGS_cache_metadata_directory);
      break;
    case Cache::ShutDownStatus::kFailed:
      LOG(ERROR) << folly::sformat("Failed to persist the cache metadata in directory {}",
                                   FLAGS_cache_metadata_directory);
      break;
    default:
      LOG(ERROR) << folly::sformat("Cache metadata is partially recorded in directory {}",
                                   FLAGS_cache_metadata_directory);
  }
}
from cachelib.
@tangliisu, a few questions about the occurrence of the issue:
- How repeatable is this? Does it reproduce consistently?
- Are the binaries the same between the shutdown and the recovery? Were there any thrift or cachelib changes between the two?
from cachelib.
- It reproduces consistently.
- The binaries are the same, and there are no thrift changes.
I test the persistent cache by simply restarting the test cluster. The thrift, cachelib, and binaries are exactly the same.
One thing that might be the cause: we use an old version of fbthrift that does not support accessing thrift members through methods, so I changed the _ref() accessor calls in the cachelib codebase to direct thrift field accesses. For the code related to nextPoolId, I changed:
- if (!object.nextPoolId_ref().is_set()) to object.__isset.nextPoolId
- *object.nextPoolId_ref() to object.nextPoolId
- nextPoolId_(*object.nextPoolId_ref()) to nextPoolId_(object.nextPoolId)
Do you think this might be the root cause?
from cachelib.
Never mind, I figured it out. It was a stupid typo: I changed if (!object.nextPoolId_ref().is_set()) to if (object.__isset.nextPoolId), dropping the negation, so the check threw exactly when the field was set. Sorry for bothering you with this :(.
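To make the mistake concrete, here is a small sketch assuming an old-style generated struct with a public __isset member. PoolManagerObject and the restore functions are hypothetical names for illustration, not the actual CacheLib or fbthrift code.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for an old-style thrift-generated struct, where
// each field has a matching flag inside a public __isset member.
struct PoolManagerObject {
  struct IsSet {
    bool nextPoolId = false;
  };
  IsSet __isset;
  int32_t nextPoolId = 0;
};

// Buggy translation: the leading '!' was dropped, so a correctly restored
// object (flag == true) is rejected.
int32_t restoreBuggy(const PoolManagerObject& object) {
  if (object.__isset.nextPoolId) {
    throw std::logic_error("nextPoolId is not set");
  }
  return object.nextPoolId;
}

// Translation that keeps the negation from the original
// !object.nextPoolId_ref().is_set() check.
int32_t restoreFixed(const PoolManagerObject& object) {
  if (!object.__isset.nextPoolId) {
    throw std::logic_error("nextPoolId is not set");
  }
  return object.nextPoolId;
}

// Returns true if calling f(object) throws std::logic_error.
template <typename F>
bool throwsLogicError(F f, const PoolManagerObject& object) {
  try {
    f(object);
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

With an object whose flag is set, as a successful shutdown would leave it, restoreBuggy throws while restoreFixed returns the stored id. That matches the symptom in this thread: the error appeared on every recovery even though the field had been serialized.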
from cachelib.
Closing the ticket.
from cachelib.