Comments (9)
I ran some local tests and CacheLib works well. It only crashed when I moved to the test cluster.
from cachelib.
That's strange. The default in the config is 1.25 here (https://github.com/facebookincubator/CacheLib/blob/main/cachelib/allocator/CacheAllocatorConfig.h#L569).
Can you share the stack trace of the exception, and also log config.allocationClassSizeFactor before creating the cache through make_unique?
Does it always crash, and does the error also happen when you manually set the factor through setDefaultAllocSizes()?
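For reference, the logging and sanity check asked for here could look roughly like the sketch below. `SketchConfig`, `isPlausibleFactor` and `checkFactorBeforeCreate` are hypothetical names used so the snippet compiles without CacheLib, and the "greater than 1.0" bound is an assumption inferred from the "invalid factor" message, not copied from the library.

```cpp
#include <cmath>
#include <iostream>
#include <sstream>
#include <stdexcept>

// Hypothetical stand-in for the real config type; it only carries the field
// under discussion so the sketch builds without CacheLib.
struct SketchConfig {
  double allocationClassSizeFactor = 1.25;
};

// Assumption: a usable growth factor is finite and strictly greater than 1.0.
bool isPlausibleFactor(double factor) {
  return std::isfinite(factor) && factor > 1.0;
}

// Log the factor and fail early with a clearer message, at the point where
// the cache would otherwise be created through make_unique.
void checkFactorBeforeCreate(const SketchConfig& config) {
  std::cerr << "allocationClassSizeFactor = "
            << config.allocationClassSizeFactor << '\n';
  if (!isPlausibleFactor(config.allocationClassSizeFactor)) {
    std::ostringstream message;
    message << "implausible allocationClassSizeFactor "
            << config.allocationClassSizeFactor
            << " (possible memory corruption or mismatched build flags)";
    throw std::invalid_argument(message.str());
  }
}
```

Calling `checkFactorBeforeCreate(config)` right before constructing the cache would put the value in the log even when the later construction aborts.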
from cachelib.
Hi, I will do a private build tomorrow to check config.allocationClassSizeFactor and to see whether it still crashes when I set the factor manually. This is the stack trace of the exception:
E0818 21:25:01.105298 14 cachelib_cache_handler.cpp:54] invalid factor 6.93298464824273e-310
E0818 21:25:01.105343 14 ExceptionTracer.cpp:210] terminate() called, exception stack follows
E0818 21:25:01.105351 14 ExceptionTracer.cpp:212] Exception type: std::invalid_argument (14 frames)
@ 00007fa0a90be092 __cxa_throw
/opt/folly/folly/experimental/exception_tracer/ExceptionTracerLib.cpp:58
@ 0000564ab86e010c facebook::cachelib::MemoryAllocator::generateAllocSizes(double, unsigned int, unsigned int, bool) [clone .cold.442]
/opt/cachelib/cachelib/allocator/memory/MemoryAllocator.cpp:187
@ 0000564abf74dede facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::getAllocatorConfig(facebook::cachelib::CacheAllocatorConfig<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> > const&)
/opt/cachelib/cachelib/../cachelib/allocator/Util.h:150
-> /opt/cachelib/cachelib/allocator/CacheAllocator.cpp
@ 0000564abf79c75f facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait>::CacheAllocator(facebook::cachelib::CacheAllocatorConfig<facebook::cachelib::CacheAllocator<facebook::cachelib::LruCacheTrait> >)
/opt/cachelib/cachelib/../cachelib/allocator/CacheAllocator-inl.h:34
-> /opt/cachelib/cachelib/allocator/CacheAllocator.cpp
@ 0000564ab8de0021 cache_util::CreateCachelib(unsigned long, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
/usr/include/c++/8/bits/unique_ptr.h:835
-> /proc/self/cwd/common/cache_util/cachelib_cache_handler.cpp
@ 0000564ab8de4fbc cache_util::CachelibCacheHandler::CachelibCacheHandler(std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, cache_util::SegmentInfo, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, cache_util::SegmentInfo> > > const&)
/proc/self/cwd/common/cache_util/cachelib_cache_handler.cpp:64
@ 0000564ab8ba8197 scorpion::CreateCacheHandler(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<cache_util::CacheHandler>*)
/usr/include/c++/8/ext/new_allocator.h:136
-> /proc/self/cwd/scorpion_v2/utils.cpp
@ 0000564ab88ef10b CreateScorpionHandlerV2(std::unique_ptr<scorpion::ScorpionHandlerV2, std::default_delete<scorpion::ScorpionHandlerV2> >*, std::shared_ptr<scorpion::model_server::ModelServer>*)
/proc/self/cwd/scorpion_v2/scorpion_v2.cpp:153
@ 0000564ab86f18b7 main
/proc/self/cwd/scorpion_v2/scorpion_v2.cpp:238
@ 00007fa09b3cbbf6 __libc_start_main
@ 0000564ab88e7849 _start
E0818 21:25:01.129964 14 ExceptionTracer.cpp:214] exception stack complete
terminate called after throwing an instance of 'std::invalid_argument'
what(): invalid factor 6.93298464824273e-310
from cachelib.
Hi @sathyaphoenix, I tried a private build again. Surprisingly, CacheLib did not crash and allocationClassSizeFactor is as expected.
E0819 20:13:54.022516 14 cachelib_cache_handler.cpp:52] Cachelib allocationClassFSizeFactor in config is: 1.25
I think everything is good right now. Feel free to close the ticket (although I still don't know why allocationClassSizeFactor sometimes becomes ~0).
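A side note on the value itself: 6.93298464824273e-310 is a subnormal double rather than an exact 0, and subnormals of this size are what a small integer, such as a user-space pointer, looks like when its bytes are read as a double. The sketch below (helper names are hypothetical) copies the bits out for inspection. For this value they fall in the 0x00007f... range, like the shared-library addresses in the stack trace above. That hints the field was read from memory holding something else, but it does not prove it.

```cpp
#include <cstdint>
#include <cstring>
#include <iomanip>
#include <iostream>

// Return the raw 64-bit pattern of a double without violating aliasing rules.
std::uint64_t bitsOf(double value) {
  static_assert(sizeof(std::uint64_t) == sizeof(double),
                "this sketch assumes a 64-bit double");
  std::uint64_t bits = 0;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

// Print the pattern as 16 hex digits so it can be compared with addresses
// in a stack trace.
void printBits(double value) {
  std::cout << "0x" << std::hex << std::setw(16) << std::setfill('0')
            << bitsOf(value) << std::dec << '\n';
}
```

Running `printBits(6.93298464824273e-310)` next to `printBits(1.25)` makes the difference between a pointer-like pattern and a normal double visible.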
from cachelib.
Thanks for confirming. If you can, please run with ASAN enabled and see if it can provide more information. For now, I am closing this issue. Please reopen if this re-appears and needs investigation.
from cachelib.
Hi @sathyaphoenix, we finally found the root cause: we set -DFOLLY_SSE=0 to support the AVX512 compiler optimizer. However, CacheLib requires folly::dynamic and F14Map in NvmConfig, and F14Map requires at least FOLLY_SSE=2. I think CacheLib does not check for this case and just throws an exception with a confusing error message.
The error did not appear in the private build because we don't use any compiler optimizer in the private build pipeline. After setting FOLLY_SSE=2 in our master build pipeline, the error went away. Do you think we could add an additional check, or a comment in NvmConfig, to avoid this issue?
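A check of this kind could be sketched as a compile-time assertion. The snippet below is only an illustration and is not part of CacheLib or folly: `sseLevelSupportsF14` is a hypothetical helper, the fallback `#define` exists solely so the sketch compiles without folly, and the threshold of 2 is taken from the requirement described above.

```cpp
// In a real build FOLLY_SSE would come from folly's headers or from the
// compiler command line (-DFOLLY_SSE=...). This fallback is only here so
// the sketch compiles on its own.
#ifndef FOLLY_SSE
#define FOLLY_SSE 2
#endif

// Hypothetical helper: true when the configured SSE level meets the
// FOLLY_SSE >= 2 requirement mentioned in this thread.
constexpr bool sseLevelSupportsF14(int sseLevel) {
  return sseLevel >= 2;
}

// Fails the build with a readable message instead of crashing at runtime.
static_assert(sseLevelSupportsF14(FOLLY_SSE),
              "F14Map needs FOLLY_SSE >= 2; check the build flags for "
              "-DFOLLY_SSE=0");
```

Placed in a header that every translation unit includes, an assertion like this would turn the misconfiguration into a build error.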
from cachelib.
@tangliisu Can you share the confusing error message that you see, and also more details on how this causes the double value to be ~0? Also, please note that NvmConfig has moved away from using folly::dynamic in the main branch and now has a simple declarative API to configure it (https://cachelib.org/docs/Cache_Library_User_Guides/Configure_HybridCache). We do rely on F14Map though. Once you share the error message, we can look into an appropriate workaround.
from cachelib.
Thanks for the info. We pin CacheLib to an old version, so the old NvmConfig is still there.
I could not reproduce the allocationClassSizeFactor ~0 error message in recent builds. The stack trace from a recent bad build is:
F0902 20:10:39.052548 14 dynamic.cpp:137] Check failed: 0
*** Check failure stack trace: ***
@ 0x7f8d2b6739bd google::LogMessage::Fail()
@ 0x7f8d2b6758a8 google::LogMessage::SendToLog()
@ 0x7f8d2b673563 google::LogMessage::Flush()
@ 0x7f8d2b6762f9 google::LogMessageFatal::~LogMessageFatal()
@ 0x7f8d2b0ce716 folly::dynamic::operator=()
@ 0x562fb47e392e facebook::cachelib::NvmCache<>::Config::Config()
@ 0x562fb47ed0b3 facebook::cachelib::CacheAllocatorConfig<>::CacheAllocatorConfig()
@ 0x562fb483b122 facebook::cachelib::CacheAllocator<>::CacheAllocator()
@ 0x562fade7cb2a cache_util::CreateCachelib()
@ 0x562fade80c32 cache_util::CachelibCacheHandler::CachelibCacheHandler()
@ 0x562fadc40b48 scorpion::CreateCacheHandler()
@ 0x562fad98330c CreateScorpionHandlerV2()
@ 0x562fad7853b8 main
@ 0x7f8d1aa53bf7 (unknown)
@ 0x562fad97b61a _start
which makes sense. But I happened to get the confusing ~0 error before we figured out the FOLLY_SSE=0 issue:
E0818 21:25:01.129964 14 ExceptionTracer.cpp:214] exception stack complete
terminate called after throwing an instance of 'std::invalid_argument'
what(): invalid factor 6.93298464824273e-310
If CacheLib still relies on F14Map, I guess we need to have FOLLY_SSE=2.
BTW, we have integrated CacheLib into our system, and the performance is very impressive. We are still tuning CacheLib to see if we can further reduce CPU usage.
from cachelib.
Great to hear it is working out as expected. Let us know if you need any information for tuning.
It is strange, though, that not setting FOLLY_SSE=2 would cause an unrelated double to be broken. cc @agordon in case he has any insights to share.
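One possible explanation, not confirmed anywhere in this thread, is a layout mismatch: if a member of the config type changes size depending on a build macro, and the library and the application are compiled with different values of that macro, the two sides disagree on where later fields live. The sketch below uses hypothetical types (`SketchMap`, `SketchConfig`) with made-up layouts purely to show the effect; it does not reproduce folly's or CacheLib's real layouts.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical container whose layout depends on a build-time switch.
template <bool kVectorized>
struct SketchMap;

template <>
struct SketchMap<true> {
  void* chunks;
  std::uint64_t sizeAndShift;
};

template <>
struct SketchMap<false> {
  void* buckets;
  std::uint64_t bucketCount;
  void* firstNode;
  std::uint64_t size;
};

// Hypothetical config: the double sits after the switch-dependent member.
template <bool kVectorized>
struct SketchConfig {
  SketchMap<kVectorized> nvmOptions;
  double allocationClassSizeFactor;
};

constexpr std::size_t kOffsetVectorized =
    offsetof(SketchConfig<true>, allocationClassSizeFactor);
constexpr std::size_t kOffsetFallback =
    offsetof(SketchConfig<false>, allocationClassSizeFactor);

// Code built with one setting writes the factor at one offset, while code
// built with the other setting reads it from a different offset.
static_assert(kOffsetVectorized != kOffsetFallback,
              "the two builds disagree on where the factor lives");
```

In such a mismatch, the reader would see whatever bytes happen to sit at its own offset, which could be a pointer or other unrelated data.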
from cachelib.