bytedance / terarkdb

A RocksDB compatible KV storage engine with better performance

License: Apache License 2.0

CMake 0.48% Python 1.41% Shell 1.19% PHP 0.08% Perl 1.57% PowerShell 0.10% C++ 83.41% C 1.34% Makefile 0.81% Batchfile 0.01% Java 9.50% Dockerfile 0.01% JavaScript 0.02% Assembly 0.09%

terarkdb's Introduction

About TerarkDB

TerarkDB is a RocksDB replacement with optimized tail latency, throughput, and compression. In most cases you can migrate an existing RocksDB instance to TerarkDB without drawbacks.

NOTES

  • TerarkDB has only been tested, and is production-ready, on Linux.
  • Language bindings other than C/C++ are not fully tested yet.
  • Existing data can be migrated from RocksDB directly to TerarkDB, but cannot be migrated back to RocksDB.
  • TerarkDB was forked from RocksDB v5.18.3.

Performance Overview

  • Baseline: RocksDB v6.12
  • Server
    • Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz (2 Sockets, 32 cores 64 threads)
    • 376 GB DRAM
    • NVMe TLC SSD (3.5 TB)
  • Bench Tools & Workloads
    • use db_bench
    • 10 client threads, 20GB requests per thread
    • key = 24 bytes, value = 2000 bytes
    • heavy_write means 90% write operations
    • heavy_read means 90% read operations

1. Use TerarkDB

Prerequisite

If you enable TerarkZipTable support (-DWITH_TERARK_ZIP=ON), you should install libaio before compiling TerarkDB:

sudo apt-get install libaio-dev

If this is your first time using TerarkDB, we recommend building without TerarkZipTable by changing -DWITH_TERARK_ZIP to OFF in build.sh.

Method 1: Use CMake subdirectory (Recommended)

  1. Clone
cd {YOUR_PROJECT_DIR}
git submodule add https://github.com/bytedance/terarkdb.git

cd terarkdb && git submodule update --init --recursive
  2. Edit your top-level project's CMakeLists.txt
add_subdirectory(terarkdb)
target_link_libraries({YOUR_TARGET} terarkdb)
  3. Important Default Options
  • CMAKE_BUILD_TYPE: RelWithDebInfo
  • WITH_JEMALLOC: ON
    • Use jemalloc or not (if you use a different malloc library, change this to OFF)
  • WITH_TESTS: OFF
    • Build test cases
  • WITH_TOOLS: OFF
    • Build with TerarkDB tools (e.g. db_bench, ldb)
  • WITH_TERARK_ZIP: OFF
    • Build with TerarkZipTable
  • WITH_ZNS: OFF
    • Build with ZNS device support

Notes

  • TerarkDB is built with zstd, lz4, snappy, zlib, gtest, and boost by default; if your higher-level application also uses these libraries, you can remove them from it and rely on TerarkDB's bundled copies.

Method 2: Link as static library

  1. Clone & build
git clone https://github.com/bytedance/terarkdb.git

cd terarkdb && git submodule update --init --recursive

WITH_TESTS=OFF WITH_ZNS=OFF ./build.sh
  2. Linking

Directory:

  terarkdb/
        \___ output/
                \_____ include/
                \_____ lib/
                         \___ libterarkdb.a
                         \___ libzstd.a
                         \___ ...

We don't archive all static libraries into one yet, so you have to link all of them into your target:

-Wl,-Bstatic \
-lterarkdb -lbz2 -ljemalloc -llz4 -lsnappy -lz -lzstd \
-Wl,-Bdynamic -pthread -lgomp -lrt -ldl -laio

2. Usage

2.1. BlockBasedTable

#include <cassert>
#include "rocksdb/db.h"

rocksdb::DB* db;
rocksdb::Options options;

// Your options here
options.create_if_missing = true;
options.wal_bytes_per_sync = 32768;
options.bytes_per_sync = 32768;

// Open DB
auto status = rocksdb::DB::Open(options, "/tmp/testdb", &db);
assert(status.ok());

// Operations
std::string value;
auto s = db->Put(rocksdb::WriteOptions(), "key1", "value1");
s = db->Get(rocksdb::ReadOptions(), "key1", &value);
assert(s.ok());
assert("value1" == value);

s = db->Delete(rocksdb::WriteOptions(), "key1");
assert(s.ok());

Or manually set table format and table options:

#include <cassert>
#include "rocksdb/db.h"
#include "rocksdb/options.h"
#include "rocksdb/table.h"

rocksdb::DB* db;
rocksdb::Options options;

// Your db options here
options.create_if_missing = true;
options.wal_bytes_per_sync = 32768;
options.bytes_per_sync = 32768;

// Manually specify target table and table options
rocksdb::BlockBasedTableOptions table_options;
// 32 GB LRU block cache, sharded 2^8 ways, no strict capacity limit
table_options.block_cache =
    rocksdb::NewLRUCache(32ULL << 30, 8, false);
table_options.block_size = 8ULL << 10;  // 8 KB data blocks
options.table_factory = std::shared_ptr<rocksdb::TableFactory>(
    NewBlockBasedTableFactory(table_options));

// Open DB
auto status = rocksdb::DB::Open(options, "/tmp/testdb2", &db);
assert(status.ok());

// Operations
std::string value;
auto s = db->Put(rocksdb::WriteOptions(), "key1", "value1");
s = db->Get(rocksdb::ReadOptions(), "key1", &value);
assert(s.ok());
assert("value1" == value);

s = db->Delete(rocksdb::WriteOptions(), "key1");
assert(s.ok());
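The examples above omit range scanning and cleanup; a minimal follow-on sketch using the standard RocksDB iterator API (which TerarkDB inherits), reusing the `db` opened above:

#include <cstdio>

// Scan every key/value pair; key() and value() return Slice views that
// remain valid only until the next call to Next().
rocksdb::Iterator* it = db->NewIterator(rocksdb::ReadOptions());
for (it->SeekToFirst(); it->Valid(); it->Next()) {
  printf("%s -> %s\n", it->key().ToString().c_str(),
         it->value().ToString().c_str());
}
assert(it->status().ok());  // check for scan errors after the loop
delete it;                  // release iterators before closing the DB
delete db;                  // close the DB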

2.2. TerarkZipTable

#include <cassert>
#include "rocksdb/db.h"
#include "rocksdb/options.h"
#include "rocksdb/table.h"
#include "table/terark_zip_table.h"

rocksdb::DB* db;
rocksdb::Options options;

// Your db options here
options.create_if_missing = true;
options.wal_bytes_per_sync = 32768;
options.bytes_per_sync = 32768;

// TerarkZipTable needs a `fallback` table factory because you can choose
// at which LSM level TerarkZipTable starts to be used. For example, with
// tzt_options.terarkZipMinLevel = 2, TerarkDB uses your fallback table on
// levels 0 and 1.
std::shared_ptr<rocksdb::TableFactory> table_factory;
rocksdb::BlockBasedTableOptions blockbased_options;
blockbased_options.block_size = 8ULL << 10;
table_factory.reset(NewBlockBasedTableFactory(blockbased_options));

rocksdb::TerarkZipTableOptions tzt_options;
// TerarkZipTable requires a temp directory separate from the data directory; a slow device is acceptable
tzt_options.localTempDir = "/tmp";
tzt_options.indexNestLevel = 3;
tzt_options.sampleRatio = 0.01;
tzt_options.terarkZipMinLevel = 2; // Start using TerarkZipTable from level 2

table_factory.reset(
    rocksdb::NewTerarkZipTableFactory(tzt_options, table_factory));

options.table_factory = table_factory;

// Open DB
auto status = rocksdb::DB::Open(options, "/tmp/testdb2", &db);
assert(status.ok());

// Operations
std::string value;
auto s = db->Put(rocksdb::WriteOptions(), "key1", "value1");
s = db->Get(rocksdb::ReadOptions(), "key1", &value);
assert(s.ok());
assert("value1" == value);

s = db->Delete(rocksdb::WriteOptions(), "key1");
assert(s.ok());

3. Real-world Performance Improvement

TerarkDB has been deployed in many applications at ByteDance; in most cases it helps reduce latency spikes and improves throughput significantly.

(Charts omitted: disk write throughput and Get latency in microseconds.)

4. Contributing

  • TerarkDB uses GitHub issues and pull requests to manage features and bug fixes.
  • All PRs are welcome, including code formatting and refactoring.

5. License

  • Apache 2.0

6. Users

Please let us know if you are using TerarkDB by joining our Slack channel. Thanks!

  • ByteDance (core online services)

terarkdb's People

Contributors

adamretter, agiardullo, ajkr, al13n321, dalgaaf, dhruba, emayanke, fyrz, grooverdan, haoboxu, igorcanadi, islamabdelrahman, jimchenglin, joelmarcey, levichen94, lightmark, liukai, maysamyabandeh, mdcallag, miasantreble, mm304321141, riversand963, rockeet, royguo, rven1, sagar0, siying, squalfof, yhchiang, yuslepukhin


terarkdb's Issues

Fix compile issues on MacOS

Expected behavior

TerarkDB should build on macOS without problems

Actual behavior

It cannot be built due to missing libraries

Steps to reproduce the behavior

./build.sh

There is no valid output when testing compact with the db_bench tool

[BUG]

Expected behavior

Recently I have been testing terarkdb's performance with db_bench. When I execute the command "./db_bench --benchmarks=compact", the output is invalid. There should be meaningful output because the compaction condition has been met.

Actual behavior

But when I execute the command "./db_bench --benchmarks=compact --use_terark_table=false", the output looks correct.
[liufanglei@node24 output]$ ./db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
RocksDB: version 5.18
Date: Mon Jan 18 09:06:24 2021
CPU: 80 * Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
CPUCache: 28160 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Memtablerep: skip_list
Perf Level: 1
DB path: [/data4/liufl/rocksdb/_terarkdb/]
compact : 2878433.000 micros/op 0 ops/sec;

[liufanglei@node24 output]$ ./db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/ --use_terark_table=false
Initializing RocksDB Options from the specified file
Initializing RocksDB Options from command-line flags
RocksDB: version 5.18
Date: Mon Jan 18 09:07:12 2021
CPU: 80 * Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
CPUCache: 28160 KB
Keys: 16 bytes each
Values: 100 bytes each (50 bytes after compression)
Entries: 1000000
Prefix: 0 bytes
Keys per prefix: 0
RawSize: 110.6 MB (estimated)
FileSize: 62.9 MB (estimated)
Write rate: 0 bytes/second
Read rate: 0 ops/second
Compression: Snappy
Memtablerep: skip_list
Perf Level: 1
DB path: [/data4/liufl/rocksdb/_terarkdb/]
compact : 1445.000 micros/op 692 ops/sec;

Steps to reproduce the behavior

$$ git clone http:://xxx.terarkdb.git
$$ ./build.sh, with -DCMAKE_BUILD_TYPE=Release -DWITH_TERARK_ZIP=ON
$$ cd output
$$ ./db_bench --benchmarks=fillrandom --use_existing_db=0 --disable_auto_compactions=1 --sync=0 --db=/data4/liufl/terarkdb/db/ --wal_dir=/data4/liufl/terarkdb/wal/ --num=1000000000
$$ ./output/db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/
$$ ./output/db_bench --benchmarks=compact --use_existing_db=1 --db=/data4/liufl/rocksdb/_terarkdb/ --use_terark_table=false

Something else

When I tested the performance of readrandom, I found write bandwidth in the statistics.
It's strange because the script benchmark.sh has the setting "--disable_auto_compactions=1".
Maybe it's my misjudgment.

My operation steps are as follows:
$$ NUM_KEYS=100000000 NUM_THREADS=64 CACHE_SIZE=137438953472 DURATION=5400 ./benchmark.sh bulkload
$$ NUM_KEYS=100000000 NUM_THREADS=64 CACHE_SIZE=137438953472 DURATION=5400 ./benchmark.sh readrandom

Optimize repair function to support different tables

[Enhancement]

Problem

TerarkDB has some new features for better performance, but it doesn't support ldb's repair function. If errors happen to TerarkDB, it's difficult to recover data from a corrupted database.
The main work is to support terark_zip_table, the map table (also called lazy compaction), and the blob table.

Solution

@ustcwelcome

Release Version

v1.3.x

how does setMaxOpenFiles affect very very large database?

  1. Can I set setMaxOpenFiles(10)?
  2. I'm using terarkdb over NFS, and NFS has a limit of 1024 open files.
    a) What's the limitation if I set the setMaxOpenFiles size too low?
    b) What happens if the SST files etc. exceed the limit of 1024?

Is there a failsafe that limits the number of open files in the situation/scenario I mentioned, not limited to the number of SST files?
Thanks.
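For reference, setMaxOpenFiles corresponds to the standard RocksDB option max_open_files; a minimal sketch with illustrative values (not a recommendation):

rocksdb::Options options;
// Cap how many table-file handles the DB keeps open; evicted handles are
// re-opened on demand, which adds latency to reads that miss the cache.
options.max_open_files = 1000;  // e.g. stay below an NFS limit of 1024
// options.max_open_files = -1; // -1 means never close opened files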

supported platforms and compiler toolsets

Hi all, I wonder what the minimum supported compiler toolsets and platforms are for building terarkdb. I tried the environments below and failed to build without modifying build.sh.

Compiler: GCC 4.8.5, 4.9.0, 8.2.0
CentOS 7.2.x

But I built successfully once I disabled -DWITH_TERARK_ZIP and -DWITH_TOOLS. I can't find any reference about this in the wiki or the all-in-one doc. Can anyone help with this issue?

GitHub CI is broken

[Enhancement]

Problem

  • Currently, GitHub Actions is not running for new PRs. #84 (comment)
  • There is no make test target in the generated Makefile when using the commands in the workflow. Fixed, but unit tests will fail. #85
  • clang-format is not checked in CI scripts.

Solution

I'm looking into fixes for this repo.

Pack all static libraries into one single library for users' convenience

[Enhancement]:

Problem

If our users are not using CMake, they will have to build terarkdb manually, but currently we produce a set of static libraries (libzstd.a, libz.a, libsnappy.a, libterarkdb.a, ...), which is not convenient for our users.

Solution

After the build process, we should pack all these static libraries into one single libterarkdb.a using the ar command.

Optimize HandleWriteBufferFull

[Enhancement]

Problem

If min_write_buffer_number_to_merge is set greater than 1, a CF may hold immutable memtables that have not been flushed yet.

HandleWriteBufferFull is called when the write buffer is full, but unfortunately this function only consumes the active memtable. So it can happen that many immutable memtables exist and cannot be merged because of the min_write_buffer_number_to_merge setting.
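For context, a minimal sketch of the options involved (standard RocksDB options; the values are illustrative only):

rocksdb::Options options;
options.write_buffer_size = 64 << 20;          // 64 MB per memtable
options.max_write_buffer_number = 4;           // up to 4 memtables per CF
options.min_write_buffer_number_to_merge = 2;  // immutables wait to merge
// With min_write_buffer_number_to_merge > 1, a CF can accumulate immutable
// memtables that are not yet eligible for flush -- the situation described
// above, since HandleWriteBufferFull only consumes the active memtable.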

Solution

We could let HandleWriteBufferFull consume both mutable and immutable memtables and pick the most valuable (oldest & largest) CF to flush.

Unit test failed

[BUG]

Expected behavior

ctest # execute successfully

Actual behavior

97% tests passed, 4 tests failed out of 152

Total Test time (real) = 683.01 sec

The following tests FAILED:
          3 - column_family_test (Subprocess aborted)
         16 - db_compaction_test (Subprocess aborted)
         29 - db_properties_test (SEGFAULT)
         38 - db_wal_test (SEGFAULT)

Steps to reproduce the behavior

As core functionality isn't affected, we just leave the issue here in the hope that someone can help fix it.

`sst_dump` doesn't support TerarkZipTable

[BUG]

./db_bench \
        --benchmarks=fillrandom \
        --use_existing_db=0 \
        --statistics=0 \
        --stats_per_interval=1 \
        --stats_interval_seconds=60 \
        --max_background_flushes=6 \
        --max_background_compactions=15 \
        --enable_lazy_compaction=0 \
        --level0_file_num_compaction_trigger=4 \
        --sync=1 \
        --allow_concurrent_memtable_write=1 \
        --bytes_per_sync=32768 \
        --wal_bytes_per_sync=32768 \
        --delayed_write_rate=419430400 \
        --enable_write_thread_adaptive_yield=1 \
        --threads=1 \
        --num_levels=7 \
        --key_size=36 \
        --value_size=8192 \
        --level_compaction_dynamic_level_bytes=true \
        --mmap_read=false \
        --compression_type=zstd \
        --memtablerep=patricia_trie \
        --blob_size=1024 \
        --blob_gc_ratio=0.0625 \
        --write_buffer_size=268435456 \
        --max_write_buffer_number=10 \
        --target_file_size_base=134217728 \
        --target_blob_file_size=134217728 \
        --blob_file_defragment_size=33554432 \
        --max_dependence_blob_overlap=128 \
        --optimize_filters_for_hits=true \
        --optimize_range_deletion=true \
        --num=60000000 \
        --db=test_db_1 \
        --benchmark_write_rate_limit=20971520 \
        --use_terark_table=1
./sst_dump --file=000013.sst --show_properties
from [] to []
Process 000013.sst
000013.sst: Corruption: Bad table magic number: expected 9863518390377041911, found 1234605616436508552 in 000013.sst

It seems we should add support for dumping TerarkZipTable files to sst_dump.

Reduce write stall caused by switchWAL

In SwitchWAL, the relevant column families switch their memtables, which adds an immutable memtable, so the total number of immutables can reach the write-stall threshold.
This is especially likely if the relevant column family already has a flush job in the queue.

Solution

In this case, we can skip the switchWAL: we just need to wait for the flush job already in the queue to complete, and the WAL will be purged automatically.

Release build will shadow tests in build.sh

[BUG]

Expected behavior

CMake should pick up the -DWITH_TESTS=ON option when running WITH_TESTS=1 ./build.sh.

Actual behavior

CMake does not pick up the -DWITH_TESTS option when running WITH_TESTS=1 ./build.sh, because of -DCMAKE_BUILD_TYPE=Release.

$ WITH_TESTS=1 ./build.sh
build , with_tests = ON
...
[terarkdb] FORCE_RELEASE_BUILD = OFF, cmake_build_type = Release
[terarkdb] WITH_TOOLS = ON, WITH_ASAN = OFF, WITH_TESTS = OFF
[terarkdb] CMAKE_BUILD_TYPE=Release, BUILD_SUFFIX=r
...

Ref:

CMAKE_DEPENDENT_OPTION(WITH_TESTS "build with tests" ON "CMAKE_BUILD_TYPE STREQUAL Debug" OFF)

Steps to reproduce the behavior

git clone https://github.com/bytedance/terarkdb.git
cd terarkdb && git submodule update --init --recursive
WITH_TESTS=1 ./build.sh

adaptive lazy compaction

[Enhancement]

Problem

In a realistic workload, some key ranges are written heavily but read rarely.
For these rarely-read ranges we should compact lazily even though the writes are heavy, so that more compaction resources can be used to optimize heavily-read ranges.

Solution

Collect access-pattern statistics and feed each range's read frequency into the compaction picker.

How can I see Compaction Stats?

[BUG]

Expected behavior

I would like to see the Compaction Stats in test.log, so I added stats to the benchmarks parameter.

root@wdh-femu:/media/wdh/MyFile/terarkdb# ./output/db_bench --zbd_path=nvme1n1 --benchmarks="fillrandom,stats,sstables,levelstats" --use_existing_db=0 --histogram=1 --num=10000000 --value_size=1000 --compression_type=none --db=test_db_1>test.log
RocksDB:    version 5.18
Date:       Wed Nov 17 12:34:16 2021
CPU:        32 * Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
CPUCache:   20480 KB

Actual behavior

I printed the files in the db; there are a lot of files, which is correct.
But the stats in test.log are strange: there are too few files and the file sizes are too small.

root@wdh-femu:/media/wdh/MyFile/terarkdb# output/zenfs list --zbd=nvme1n1 --path=test_db_1
           0    Nov 17 2021 12:34:16            LOCK                            
     1883464    Nov 17 2021 12:35:33            LOG                             
    64123678    Nov 17 2021 12:34:30            000172.sst                      
    67584810    Nov 17 2021 12:34:30            000174.sst                      
    64125951    Nov 17 2021 12:34:30            000178.sst                      
    67584870    Nov 17 2021 12:34:30            000181.sst                      
    67584831    Nov 17 2021 12:34:30            000182.sst                      
    64157972    Nov 17 2021 12:34:30            000184.sst   
……
    67582979    Nov 17 2021 12:35:33            001287.sst                      
    40952609    Nov 17 2021 12:35:33            001288.log                      
    64147427    Nov 17 2021 12:35:32            001289.sst                      
    67584795    Nov 17 2021 12:35:33            001291.sst                      
    67584722    Nov 17 2021 12:35:33            001293.sst                      
        4698    Nov 17 2021 12:35:33            001294.sst                      
           0    Nov 17 2021 12:35:33            001296.log                      
    64157825    Nov 17 2021 12:35:33            001297.sst                      
        2440    Nov 17 2021 12:35:33            001299.sst                      
          16    Nov 17 2021 12:34:17            CURRENT                         
          37    Nov 17 2021 12:34:16            IDENTITY                        
      232228    Nov 17 2021 12:35:33            MANIFEST-000009                 
        5342    Nov 17 2021 12:34:16            OPTIONS-000005  
root@wdh-femu:/media/wdh/MyFile/terarkdb# output/zenfs df --zbd=nvme1n1
Free: 7766799 MB
Used: 16452 MB
Reclaimable: 19595 MB
Space amplification: 119%
** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
----------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.0      0.0     0.0      0.0       9.6      9.6       0.0   1.0      0.0    356.7        27       160    0.171       0      0
  L1      1/0    2.38 KB   2.8     12.7     9.6      3.1       9.4      6.3       0.0   1.0    251.8    186.3        52       213    0.242     13M   211K
  L2      1/1    4.59 KB   0.0      0.0     0.0      0.0      10.6     10.6       0.0 148920.4      0.0    181.6        60       218    0.274    7635   615K
  L3      1/1   15.14 KB   0.0      0.0     0.0      0.0       3.8      3.8       0.0 47731.2      0.0    169.9        23        83    0.279    7298   666K
 Sum      3/2   22.11 KB   0.0     12.7     9.6      3.1      33.4     30.2       0.0   3.5     80.2    211.1       162       674    0.240     13M  1493K
 Int      0/0    0.00 KB   0.0     12.0     9.1      2.9      32.3     29.5       0.0   3.6     77.5    209.4       158       654    0.242     12M  1487K
Uptime(secs): 77.0 total, 72.0 interval
Flush(GB): cumulative 9.558, interval 8.960
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 33.37 GB write, 443.60 MB/s write, 12.68 GB read, 168.62 MB/s read, 161.9 seconds
Interval compaction: 32.35 GB write, 459.95 MB/s write, 11.97 GB read, 170.20 MB/s read, 158.2 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count

Steps to reproduce the behavior

My TerarkDB branch is dev.1.4 with WITH_ZENFS=ON, using a ZNS SSD to store the data.

WAL optimization for better performance while enabling force sync per write

[Enhancement]:

Problem

  • In some cases, users of TerarkDB/RocksDB want to enable WAL sync on each write; user-side latency suffers if WAL sync is not fast enough.
  • If a single WriteBatch is slowed down for any reason (e.g. an I/O spike), subsequent requests have to wait until the previous WriteBatch finishes, which causes a latency spike from the user's perspective.
  • Subsequent requests also have to wait until the previous WriteBatch finishes writing the memtable.
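For context, the per-write WAL sync referenced above is the standard RocksDB write option (a minimal sketch):

rocksdb::WriteOptions write_options;
write_options.sync = true;  // fsync the WAL before acknowledging the write
auto s = db->Put(write_options, "key1", "value1");
// With sync = true every write pays a WAL fsync, so a single slow fsync
// stalls all writers queued behind it -- the latency spike described above.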

Solution

  • Use a pipelined & staged WAL to accelerate WAL sync

@JimChengLin Please add more detail to our all-in-one document

Release

  • v1.3.x

cmake build with ninja error

[BUG]

When I build with Ninja, the following problem occurs:

ninja: error: build.ninja:2657: bad $-escape (literal $ must be written as $$)

cmake options:

cmake .. -G Ninja -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_JEMALLOC=ON -DWITH_TESTS=OFF -DWITH_TOOLS=OFF -DWITH_TERARK_ZIP=OFF

But after I modified build.ninja manually by replacing $(nproc) with ${nproc}, the problem was solved.

Flink Support

[Enhancement]:

Problem

TerarkDB's Flink support is broken; it should be fixed and pushed back to Flink.

Solution

Is terarkdb in production use?

  1. Is terarkdb in production use? Is it advised to use it in production?
  2. When will terarkdb support golang bindings?

Optimize TTL GC strategy to better eliminate out-dated records

[Enhancement]

Problem

  • Early versions of RocksDB TTL cannot reclaim disk space effectively if no more data is being written (because there will be no further compactions)
  • A quick fix is to let RocksDB touch each SST at least once within a fixed duration. This strategy works well when the number of SST files is small, but in large-scale applications a single RocksDB instance can be about 2 TB, and this approach takes too much time to eliminate out-dated records.

Solution

Release Version

  • v1.3.x

ask for setting of db_bench

Hi, happy to be here. I saw that terarkdb has better performance than rocksdb in some cases, as the "Performance Overview" section of the README shows. Can you guys post the bench settings?

Thanks a lot.

Generated large LOG file

I'm getting the information below, which generates a large LOG file over time. How do I disable this LOG information?
Thanks in advance.
Is this using terarkzip?

2021/11/30-05:11:15.093997 7f0e307f7700        Options.write_buffer_size: 1024000
2021/11/30-05:11:15.093999 7f0e307f7700  Options.max_write_buffer_number: 5
2021/11/30-05:11:15.094002 7f0e307f7700              Options.compression: Snappy
2021/11/30-05:11:15.094004 7f0e307f7700   Options.bottommost_compression: Disabled
2021/11/30-05:11:15.094007 7f0e307f7700         Options.prefix_extractor: nullptr
2021/11/30-05:11:15.094009 7f0e307f7700   Options.memtable_insert_with_hint_prefix_extractor: nullptr
2021/11/30-05:11:15.094011 7f0e307f7700             Options.num_levels: 7
2021/11/30-05:11:15.094013 7f0e307f7700        Options.min_write_buffer_number_to_merge: 1
2021/11/30-05:11:15.094015 7f0e307f7700     Options.max_write_buffer_number_to_maintain: 0
2021/11/30-05:11:15.094018 7f0e307f7700            Options.bottommost_compression_opts.window_bits: -14
2021/11/30-05:11:15.094020 7f0e307f7700                  Options.bottommost_compression_opts.level: 32767
2021/11/30-05:11:15.094022 7f0e307f7700               Options.bottommost_compression_opts.strategy: 0
2021/11/30-05:11:15.094024 7f0e307f7700         Options.bottommost_compression_opts.max_dict_bytes: 0
2021/11/30-05:11:15.094026 7f0e307f7700   Options.bottommost_compression_opts.zstd_max_train_bytes: 0
2021/11/30-05:11:15.094028 7f0e307f7700                  Options.bottommost_compression_opts.enabled: false
2021/11/30-05:11:15.094031 7f0e307f7700            Options.compression_opts.window_bits: -14
2021/11/30-05:11:15.094033 7f0e307f7700                  Options.compression_opts.level: 32767
2021/11/30-05:11:15.094035 7f0e307f7700               Options.compression_opts.strategy: 0
2021/11/30-05:11:15.094037 7f0e307f7700         Options.compression_opts.max_dict_bytes: 0
2021/11/30-05:11:15.094039 7f0e307f7700   Options.compression_opts.zstd_max_train_bytes: 0
2021/11/30-05:11:15.094041 7f0e307f7700                Options.compression_opts.enabled: false
2021/11/30-05:11:15.094043 7f0e307f7700      Options.level0_file_num_compaction_trigger: 4
2021/11/30-05:11:15.094045 7f0e307f7700          Options.level0_slowdown_writes_trigger: 20
2021/11/30-05:11:15.094047 7f0e307f7700              Options.level0_stop_writes_trigger: 36
2021/11/30-05:11:15.094071 7f0e307f7700                   Options.target_file_size_base: 67108864
2021/11/30-05:11:15.094074 7f0e307f7700             Options.target_file_size_multiplier: 1
2021/11/30-05:11:15.094076 7f0e307f7700                Options.max_bytes_for_level_base: 268435456
2021/11/30-05:11:15.094079 7f0e307f7700    Options.level_compaction_dynamic_level_bytes: 0
2021/11/30-05:11:15.094081 7f0e307f7700          Options.max_bytes_for_level_multiplier: 10.000000
2021/11/30-05:11:15.094086 7f0e307f7700 Options.max_bytes_for_level_multiplier_addtl[0]: 1
2021/11/30-05:11:15.094088 7f0e307f7700 Options.max_bytes_for_level_multiplier_addtl[1]: 1
2021/11/30-05:11:15.094090 7f0e307f7700 Options.max_bytes_for_level_multiplier_addtl[2]: 1
2021/11/30-05:11:15.094093 7f0e307f7700 Options.max_bytes_for_level_multiplier_addtl[3]: 1
2021/11/30-05:11:15.094095 7f0e307f7700 Options.max_bytes_for_level_multiplier_addtl[4]: 1
2021/11/30-05:11:15.094097 7f0e307f7700 Options.max_bytes_for_level_multiplier_addtl[5]: 1
2021/11/30-05:11:15.094099 7f0e307f7700 Options.max_bytes_for_level_multiplier_addtl[6]: 1
2021/11/30-05:11:15.094101 7f0e307f7700       Options.max_sequential_skip_in_iterations: 8
2021/11/30-05:11:15.094103 7f0e307f7700                    Options.max_compaction_bytes: 1677721600
2021/11/30-05:11:15.094105 7f0e307f7700                        Options.arena_block_size: 131072
2021/11/30-05:11:15.094107 7f0e307f7700     Options.soft_pending_compaction_bytes_limit: 68719476736
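The LOG file cannot be disabled entirely through options, but standard RocksDB options (which TerarkDB, as a RocksDB 5.18 fork, should inherit -- an assumption, not verified here) can reduce its verbosity and cap its size; a hedged sketch:

rocksdb::Options options;
// Log warnings and above only, skipping INFO-level dumps like the one shown.
options.info_log_level = rocksdb::InfoLogLevel::WARN_LEVEL;
// Rotate the LOG file and keep a bounded number of old files.
options.max_log_file_size = 16 << 20;  // start a new LOG after 16 MB
options.keep_log_file_num = 5;         // retain at most 5 rotated files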

Installed terarkzip and it works, but I'm not sure whether terarkzip is actually working; I don't know how to compile the example.

How do you compile the code below? Sorry, I'm quite a newbie to C++. I can code, but I can't do even simple compilation, which sounds strange because C++ is not my first language.

(The issue quotes the full "2.2. TerarkZipTable" example from the README above.)

TerarkFS env support

[Enhancement]:

Problem

  • Filesystems (e.g. ext4) are built for general-purpose use and are not optimized for append-only LSM storage systems
  • An SPDK driver and multi-layer storage (e.g. NVMe SSD + Optane SSD + Optane Memory) should be used for better performance

Solution

@levisonchen Please add a short description of how we are going to implement this feature

Release

  • v1.4

TerarkZipTable causes compile error

I use open_column_families a lot, so I thought about using terarkzip as the default with the appended code below, BUT...
If anyone can help resolve this, I may be able to track down one bug I think is in terarkdb with column families. Please help.

I actually haven't written C++ in a long time; this seems to be a C++11 thing.

terarkdb/db/c.cc:729:35: error: cannot bind rvalue reference of type 'std::shared_ptr<terarkdb::TableFactory>&&' to lvalue of type 'std::shared_ptr<terarkdb::TableFactory>'
  729 |   db_options->rep.table_factory = table_factory;
      |                                   ^~~~~~~~~~~~~

How do I resolve the above? The code is as below...

rocksdb_t* rocksdb_open_column_families(
    const rocksdb_options_t* db_options, const char* name,
    int num_column_families, const char** column_family_names,
    const rocksdb_options_t** column_family_options,
    rocksdb_column_family_handle_t** column_family_handles, char** errptr) {
  std::vector<ColumnFamilyDescriptor> column_families;
  for (int i = 0; i < num_column_families; i++) {
    column_families.push_back(ColumnFamilyDescriptor(
        std::string(column_family_names[i]),
        ColumnFamilyOptions(column_family_options[i]->rep)));
  }

  DB* db;
  std::vector<ColumnFamilyHandle*> handles;

  BlockBasedTableOptions blockbased_options;
  blockbased_options.block_size = 8ULL << 10;

  std::shared_ptr<terarkdb::TableFactory> table_factory;
  table_factory.reset(NewBlockBasedTableFactory(blockbased_options));

  terarkdb::TerarkZipTableOptions tzt_options;
  tzt_options.localTempDir = "/tmp/t";
  tzt_options.indexNestLevel = 3;
  tzt_options.sampleRatio = 0.01;
  tzt_options.terarkZipMinLevel = 2; // Start using TerarkZipTable from level 2

  table_factory.reset(NewTerarkZipTableFactory(tzt_options, table_factory));

  db_options->rep.table_factory = table_factory;  // compile error here
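A plausible fix, judged only from the error text and not confirmed in this thread: the member's assignment apparently accepts only an rvalue shared_ptr, so hand it one with std::move. A self-contained illustration with hypothetical stand-in types:

#include <memory>
#include <utility>

// Hypothetical stand-in for the failing assignment: if the setter only
// takes an rvalue shared_ptr, passing a named (lvalue) pointer fails to
// compile exactly as reported; std::move turns it into an rvalue.
struct Holder {
  std::shared_ptr<int> p;
  void set(std::shared_ptr<int>&& q) { p = std::move(q); }
};

int main() {
  std::shared_ptr<int> sp = std::make_shared<int>(42);
  Holder h;
  // h.set(sp);          // error: cannot bind rvalue reference to lvalue
  h.set(std::move(sp));  // OK -- sp must not be used afterwards
}

Applied to the snippet above, that would read: db_options->rep.table_factory = std::move(table_factory);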



[BUG] CMake inconsistencies

[BUG]:

Expected behavior

  1. All dependency libraries are placed in output/lib folder
  2. Build commands honor parallelism settings
  3. Portable compiles are actually portable
  4. Commonly used build targets are production-ready

Actual behavior

  1. Snappy library is placed under output/lib64 folder
  2. Dependency libraries have inconsistent -j commands
  3. PORTABLE=1 does not cover terark-zip library
  4. RelWithDebInfo is actually -DDEBUG, ASAN is default ON

Steps to reproduce the behavior

  1. ./build.sh

BUILD_COMMAND cd build && cmake ../ -DCMAKE_INSTALL_PREFIX=${CMAKE_BINARY_DIR} -DCMAKE_BUILD_TYPE=Release -DSNAPPY_BUILD_TESTS=OFF -DHAVE_LIBLZO2=OFF && make -j 10

BUILD_COMMAND cd cmake-build && cmake ../ -DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE} && make -j $(nproc)

BUILD_COMMAND make "CXXFLAGS=-fPIC -O2" "CFLAGS=-fPIC -O2" -j 4

BUILD_COMMAND bash autogen.sh && "CFLAGS=-fPIC" "CXXFLAGS=-fPIC" "LDFLAGS=-fPIC" ./configure --prefix=${CMAKE_BINARY_DIR} --enable-prof && make -j 20

3. https://github.com/bytedance/terark-zip/blob/c55ddd7f8ee9b7f71f3ac0f681d44aaf73f1c7af/CMakeLists.txt#L19
  4. terarkdb/CMakeLists.txt, lines 17-18 at commit 668b7e5:

string(REGEX REPLACE "-DNDEBUG " "" CMAKE_CXX_FLAGS_RELWITHDEBINFO "${CMAKE_CXX_FLAGS_RELWITHDEBINFO} -DDEBUG" )
string(REGEX REPLACE "-DNDEBUG " "" CMAKE_C_FLAGS_RELWITHDEBINFO "${CMAKE_C_FLAGS_RELWITHDEBINFO} -DDEBUG" )

if (${CMAKE_SYSTEM_NAME} MATCHES "Darwin" OR CMAKE_BUILD_TYPE STREQUAL "Release" OR WITH_JEMALLOC)

Solution

It seems to deadlock when executing the script "benchmark.sh"

[BUG]

Expected behavior

The script executes normally and outputs performance results.

Actual behavior

It seems to have deadlocked after 8 hours of script execution.

Steps to reproduce the behavior

dev.1.3  e3afb15140b64ae88bdb416413a09e123cbb2f78
I built terarkdb following the wiki documents, then executed "NUM_KEYS=100000000 NUM_THREADS=64 CACHE_SIZE=137438953472 VALUE_SIZE=40960 ./benchmark.sh bulkload". It seems to have deadlocked after 10 hours of script execution.

Pstack info:
Thread 24 (Thread 0x7f414137f700 (LWP 209502)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62e90) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 23 (Thread 0x7f41405ff700 (LWP 209503)):
#0 CopyForward (len=50, op=0x7d32c7cf4344 "\uC0z5&2Xs]Ou$Q}'4WKhs(BMhC/_XDEl+^sCN6"jF^xB_EC#i$g.-11ShE!BnTT{nr2c5<TO=]3"!s|^vvfNssoBacxB_EC#i$g.-11ShE!BnTT{nr2c5<TO=]*3\"!s|^vvfNssoBacP$,!M{#!6Tw/$S"9qTp\F-in6B?otJ"XtHC+yN1Wh$<F}bw+nP$,!M{#"..., src=0x7d32c7cf4312 "\uC0z5&2Xs]Ou$Q}'4WKhs(BMhC/_XDEl+^sCN6"jF^ni;YV}\uC0z5&2Xs]Ou$Q}'4WKhs(BMhC/_XDEl+^sCN6"jF^xB_EC#i$g.-11ShE!BnTT{nr2c5<TO=]3"!s|^vvfNssoBacxB_EC#i$g.-11ShE!BnTT{nr2c5<TO=]*3\"!s|^vvfNssoBacP$,!M{#"...) at /data0/liufanglei/src/terarkdb/third-party/terark-zip/src/terark/zbs/dict_zip_blob_store.cpp:2585
#1 terark::DoUnzipSwitchPreserve<3> (pos=0x7d3308fd5893 "\370xB_EC#i$g.-11ShE!BnTT{nr2c5<TO=]\210
3"!s|^vvfNssoBac\206\062", end=0x7d3308fda6a5 "uѧ\216XorUeb[pd\"=->1P7.P~dUf-$T!F^|]r0[R1O>P_k>3skiYYu^XorUeb[poU6GMw11gu"2SuoR$OZV{"ht3Yl29ZfgW3W3Q#d--SWbHk
[wvoU6GMw11gu"2SuoR$OZV{"ht3Yl29ZfgW3W3Q#d--SWbHk
[wv9~>8a>0HYCqy6Fu/=w%kr}$aG}|4Gm`-u)>""..., recData=0x7f412bd63750, dic=, gOffsetBits=, reserveOutputMultiplier=) at /data0/liufanglei/src/terarkdb/third-party/terark-zip/src/terark/zbs/dict_zip_blob_store_unzip_func.hpp:204
#2 0x0000000000a49155 in read_record_append_tpl<false, 2, (terark::DictZipBlobStore::Options::EntropyAlgo)0, 0, terark::DictZipBlobStore::fspread_record_append_tpl(terark::BlobStore::pread_func_t, void, size_t, size_t, terark::valvec, terark::valvec) const [with bool ZipOffset = false; int CheckSumLevel = 2; terark::DictZipBlobStore::Options::EntropyAlgo Entropy = (terark::DictZipBlobStore::Options::EntropyAlgo)0; int EntropyInterLeave = 0]::<lambda(size_t, size_t)> > (readRaw=..., recData=0x7f412bd63750, recId=418, this=0x7f166cd7fd80) at /data0/liufanglei/src/terarkdb/third-party/terark-zip/src/terark/zbs/dict_zip_blob_store.cpp:2895
#3 terark::DictZipBlobStore::fspread_record_append_tpl<false, 2, (terark::DictZipBlobStore::Options::EntropyAlgo)0, 0> (this=0x7f166cd7fd80, fspread=, lambda=, baseOffset=, recID=418, recData=0x7f412bd63750, rdbuf=0x7f41405f9cf0) at /data0/liufanglei/src/terarkdb/third-party/terark-zip/src/terark/zbs/dict_zip_blob_store.cpp:2686
#4 0x0000000000a2cad0 in fspread_record_append (rdbuf=0x7f41405f9cf0, recData=0x7f412bd63750, recID=418, baseOffset=50376, lambda=0x7f062a7d0e78, fspread=0x7aff20 <rocksdb::FsPread(void*, size_t, size_t, terark::valvec)>, this=) at /data0/liufanglei/src/terarkdb/third-party/terark-zip/src/terark/zbs/blob_store.hpp:127
#5 terark::BlobStore::fspread_record_append (this=, fspread=0x7aff20 <rocksdb::FsPread(void
, size_t, size_t, terark::valvec)>, lambda=0x7f062a7d0e78, baseOffset=50376, recID=418, recData=0x7f412bd63750) at /data0/liufanglei/src/terarkdb/third-party/terark-zip/src/terark/zbs/blob_store.cpp:120
#6 0x00000000007bbd16 in rocksdb::TerarkZipTableIterator::UnzipIterRecord (this=0x7f412bd63660, hasRecord=) at /data0/liufanglei/src/terarkdb/table/terark_zip_table_reader.cc:545
#7 0x00000000007bc4c5 in rocksdb::TerarkZipTableIterator::SeekInternal (this=0x7f412bd63660, seek_key=..., seek_tag=0) at /data0/liufanglei/src/terarkdb/table/terark_zip_table_reader.cc:471
#8 0x00000000007bc621 in rocksdb::TerarkZipTableIterator::Seek (this=0x7f412bd63660, target=...) at /data0/liufanglei/src/terarkdb/table/terark_zip_table_reader.cc:355
#9 0x0000000000743afd in Seek (k=..., this=0x7d33085c5de0) at /data0/liufanglei/src/terarkdb/table/iterator_wrapper.h:72
#10 rocksdb::MergingIterator::Seek (this=0x7f41405fa600, target=...) at /data0/liufanglei/src/terarkdb/table/merging_iterator.cc:134
#11 0x00000000006284d0 in Seek (k=..., this=0x7f41405fa370) at /data0/liufanglei/src/terarkdb/table/iterator_wrapper.h:183
#12 rocksdb::(anonymous namespace)::AdjustRange (ic=ic@entry=0x7f4141a83528, iter=iter@entry=0x7f41405fa370, arena=0x7f41405fa5f0, largest_key=..., ranges=std::vector of length 41913, capacity 65536 = {...}) at /data0/liufanglei/src/terarkdb/db/map_builder.cc:633
#13 0x0000000000631c61 in rocksdb::MapBuilder::Build (this=this@entry=0x7f41405fb0e0, inputs=std::vector of length 2, capacity 2 = {...}, push_range=std::vector of length 1, capacity 1 = {...}, output_level=5, output_path_id=0, cfd=0x7f4141a83500, version=0x7f412e780000, edit=0x7f412b672308, output=0x7f41405fb020) at /data0/liufanglei/src/terarkdb/db/map_builder.cc:1464
#14 0x00000000007dd30b in rocksdb::CompactionJob::InstallCompactionResults (this=this@entry=0x7f41405fbc30, mutable_cf_options=...) at /data0/liufanglei/src/terarkdb/db/compaction_job.cc:2343
#15 0x00000000007deb4d in rocksdb::CompactionJob::Install (this=this@entry=0x7f41405fbc30, mutable_cf_options=...) at /data0/liufanglei/src/terarkdb/db/compaction_job.cc:1083
#16 0x00000000005aefec in rocksdb::DBImpl::BackgroundCompaction (this=this@entry=0x7f4141ae5400, made_progress=made_progress@entry=0x7f41405fc086, job_context=job_context@entry=0x7f41405fc0a0, log_buffer=log_buffer@entry=0x7f41405fc2c0, prepicked_compaction=prepicked_compaction@entry=0x7f3f89634e30) at /data0/liufanglei/src/terarkdb/db/db_impl_compaction_flush.cc:2819
#17 0x00000000005b4ad9 in rocksdb::DBImpl::BackgroundCallCompaction (this=this@entry=0x7f4141ae5400, prepicked_compaction=prepicked_compaction@entry=0x7f3f89634e30, bg_thread_pri=bg_thread_pri@entry=rocksdb::Env::LOW) at /data0/liufanglei/src/terarkdb/db/db_impl_compaction_flush.cc:2300
#18 0x00000000005b505f in rocksdb::DBImpl::BGWorkCompaction (arg=) at /data0/liufanglei/src/terarkdb/db/db_impl_compaction_flush.cc:2052
#19 0x00000000007897bd in operator() (this=0x7f41405fcc30) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/std_function.h:706
#20 rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=1) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:252
#21 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62ea0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#22 0x0000000000d0225f in execute_native_thread_routine ()
#23 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#24 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 22 (Thread 0x7f413fdfe700 (LWP 209504)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=2) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62ec0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 21 (Thread 0x7f413f1ff700 (LWP 209505)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=3) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62eb0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 20 (Thread 0x7f413e7fe700 (LWP 209506)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=4) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62ed0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 19 (Thread 0x7f413d1ff700 (LWP 209507)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=5) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62ee0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 18 (Thread 0x7f413c5fe700 (LWP 209508)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=6) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62ef0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 17 (Thread 0x7f413b9ff700 (LWP 209509)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=7) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f00) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 16 (Thread 0x7f413affe700 (LWP 209510)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=8) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f10) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 15 (Thread 0x7f413a3ff700 (LWP 209511)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=9) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f20) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 14 (Thread 0x7f4139bfe700 (LWP 209512)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=10) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f30) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 13 (Thread 0x7f4138fff700 (LWP 209513)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=11) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f40) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 12 (Thread 0x7f41383fe700 (LWP 209514)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=12) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f50) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 11 (Thread 0x7f41377ff700 (LWP 209515)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=13) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f60) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x7f4136bff700 (LWP 209516)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a460e0, thread_id=thread_id@entry=14) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f70) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x7f41361fe700 (LWP 209517)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a461c0, thread_id=thread_id@entry=0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f80) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x7f41355ff700 (LWP 209518)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a461c0, thread_id=thread_id@entry=1) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62f90) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7f41349fe700 (LWP 209519)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a461c0, thread_id=thread_id@entry=2) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62fb0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7f4133dff700 (LWP 209520)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a461c0, thread_id=thread_id@entry=3) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62fa0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7f41331ff700 (LWP 209521)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f414286eaec in std::condition_variable::wait(std::unique_lockstd::mutex&) () from /lib64/libstdc++.so.6
#2 0x0000000000789609 in rocksdb::ThreadPoolImpl::Impl::BGThread (this=this@entry=0x7f4141a461c0, thread_id=thread_id@entry=4) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:185
#3 0x0000000000789954 in rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper (arg=0x7f4141a62fc0) at /data0/liufanglei/src/terarkdb/util/threadpool_imp.cc:290
#4 0x0000000000d0225f in execute_native_thread_routine ()
#5 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7f41305ff700 (LWP 209522)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000527edd in rocksdb::port::CondVar::Wait (this=this@entry=0x7f4141b051d0) at /data0/liufanglei/src/terarkdb/port/port_posix.cc:91
#2 0x00000000006c5235 in rocksdb::InstrumentedCondVar::WaitInternal (this=this@entry=0x7f4141b051d0) at /data0/liufanglei/src/terarkdb/monitoring/instrumented_mutex.cc:116
#3 0x00000000006c5314 in rocksdb::InstrumentedCondVar::Wait (this=this@entry=0x7f4141b051d0) at /data0/liufanglei/src/terarkdb/monitoring/instrumented_mutex.cc:86
#4 0x000000000077a308 in rocksdb::DeleteScheduler::BackgroundEmptyTrash (this=0x7f4141b050e0) at /data0/liufanglei/src/terarkdb/util/delete_scheduler.cc:226
#5 0x0000000000d0225f in execute_native_thread_routine ()
#6 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#7 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7f4107d45700 (LWP 209554)):
#0 0x00007f4143304d12 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000527f42 in rocksdb::port::CondVar::TimedWait (this=this@entry=0x7f4141a682a8, abs_time_us=abs_time_us@entry=1610069578258914) at /data0/liufanglei/src/terarkdb/port/port_posix.cc:105
#2 0x000000000058e383 in wait (delay=<optimized out>, this=0x7f4141a68240) at /data0/liufanglei/src/terarkdb/util/repeatable_thread.h:93
#3 thread (this=0x7f4141a68240) at /data0/liufanglei/src/terarkdb/util/repeatable_thread.h:130
#4 operator() (__closure=<optimized out>) at /data0/liufanglei/src/terarkdb/util/repeatable_thread.h:34
#5 __invoke_impl<void, rocksdb::RepeatableThread::RepeatableThread(std::function<void()>, const string&, rocksdb::Env*, uint64_t, uint64_t)::<lambda()> > (__f=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/invoke.h:60
#6 __invoke<rocksdb::RepeatableThread::RepeatableThread(std::function<void()>, const string&, rocksdb::Env*, uint64_t, uint64_t)::<lambda()> > (__fn=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/invoke.h:95
#7 _M_invoke<0> (this=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/thread:234
#8 operator() (this=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/thread:243
#9 _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN7rocksdb16RepeatableThreadC4ESt8functionIFvvEERKSsPNS3_3EnvEmmEUlvE_EEEEE6_M_runEv (this=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/thread:186
#10 0x0000000000d0225f in execute_native_thread_routine ()
#11 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#12 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f410c7ff700 (LWP 209555)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000527edd in rocksdb::port::CondVar::Wait (this=this@entry=0x7f4141ae5930) at /data0/liufanglei/src/terarkdb/port/port_posix.cc:91
#2 0x00000000006c5235 in rocksdb::InstrumentedCondVar::WaitInternal (this=this@entry=0x7f4141ae5930) at /data0/liufanglei/src/terarkdb/monitoring/instrumented_mutex.cc:116
#3 0x00000000006c5314 in rocksdb::InstrumentedCondVar::Wait (this=this@entry=0x7f4141ae5930) at /data0/liufanglei/src/terarkdb/monitoring/instrumented_mutex.cc:86
#4 0x00000000005a6b71 in rocksdb::DBImpl::RunManualCompaction (this=this@entry=0x7f4141ae5400, cfd=cfd@entry=0x7f4141a83500, input_level=input_level@entry=0, output_level=output_level@entry=-2, output_path_id=<optimized out>, max_subcompactions=0, begin=0x0, end=0x0, files_being_compact=0x7f410c7fca50, exclusive=true, disallow_trivial_move=false) at /data0/liufanglei/src/terarkdb/db/db_impl_compaction_flush.cc:1462
#5 0x00000000005ab006 in rocksdb::DBImpl::CompactRange (this=0x7f4141ae5400, options=..., column_family=<optimized out>, begin=0x0, end=0x0) at /data0/liufanglei/src/terarkdb/db/db_impl_compaction_flush.cc:735
#6 0x0000000000466e3d in rocksdb::DB::CompactRange (this=0x7f4141ae5400, options=..., begin=0x0, end=0x0) at /data0/liufanglei/src/terarkdb/include/rocksdb/db.h:857
#7 0x000000000046cf9e in rocksdb::Benchmark::Compact (this=0x7ffed2238520, thread=<optimized out>) at /data0/liufanglei/src/terarkdb/tools/db_bench_tool.cc:5695
#8 0x0000000000473394 in rocksdb::Benchmark::ThreadBody (v=0x7ed85a728890) at /data0/liufanglei/src/terarkdb/tools/db_bench_tool.cc:2898
#9 0x00000000005283e2 in rocksdb::(anonymous namespace)::StartThreadWrapper (arg=0x7f4141a62e60) at /data0/liufanglei/src/terarkdb/env/env_posix.cc:1069
#10 0x00007f4143300dd5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f4141fd5ead in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f4143718980 (LWP 209500)):
#0 0x00007f4143304965 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x0000000000527edd in rocksdb::port::CondVar::Wait (this=this@entry=0x7ffed2237cc8) at /data0/liufanglei/src/terarkdb/port/port_posix.cc:91
#2 0x0000000000482eb8 in rocksdb::Benchmark::RunBenchmark (this=this@entry=0x7ffed2238520, n=n@entry=1, name=..., method=(void (rocksdb::Benchmark::*)(rocksdb::Benchmark * const, rocksdb::ThreadState *)) 0x46cf10 <rocksdb::Benchmark::Compact(rocksdb::ThreadState*)>) at /data0/liufanglei/src/terarkdb/tools/db_bench_tool.cc:2970
#3 0x0000000000489822 in rocksdb::Benchmark::Run (this=this@entry=0x7ffed2238520) at /data0/liufanglei/src/terarkdb/tools/db_bench_tool.cc:2838
#4 0x0000000000462361 in rocksdb::db_bench_tool (argc=<optimized out>, argv=<optimized out>) at /data0/liufanglei/src/terarkdb/tools/db_bench_tool.cc:5871
#5 0x00007f4141efa3d5 in __libc_start_main () from /lib64/libc.so.6
#6 0x0000000000437637 in _start ()

CMD info:
liufang+ 209500 100 2.1 2239499132 16795548 pts/1 Sl+ 00:42 521:15 ./db_bench --benchmarks=compact --use_existing_db=1 --disable_auto_compactions=1 --sync=0 --db=/data4/liufl/terarkdb/terarkdb_test/db_test --wal_dir=/data4/liufl/terarkdb/terarkdb_test/wal_test --num=100000000 --num_levels=6 --key_size=20 --value_size=40960 --block_size=8192 --cache_size=137438953472 --cache_numshardbits=6 --compression_max_dict_bytes=0 --compression_ratio=0.5 --compression_type=snappy --level_compaction_dynamic_level_bytes=true --bytes_per_sync=8388608 --cache_index_and_filter_blocks=0 --pin_l0_filter_and_index_blocks_in_cache=1 --benchmark_write_rate_limit=0 --hard_rate_limit=3 --rate_limit_delay_max_milliseconds=1000000 --write_buffer_size=134217728 --target_file_size_base=134217728 --max_bytes_for_level_base=1073741824 --verify_checksum=1 --delete_obsolete_files_period_micros=62914560 --max_bytes_for_level_multiplier=8 --statistics=0 --stats_per_interval=1 --stats_interval_seconds=60 --histogram=1 --memtablerep=skip_list --bloom_bits=10 --open_files=-1 --level0_file_num_compaction_trigger=4 --level0_slowdown_writes_trigger=12 --level0_stop_writes_trigger=20 --max_background_jobs=20 --max_write_buffer_number=8 --threads=1

Disk Info:
Filesystem      Size  Used Avail Use% Mounted on
/dev/nvme4n1    3.6T  2.1T  1.4T  61% /data4

Memory Info:
               total        used        free      shared  buff/cache   available
Mem:            754G         21G         14G         27M        718G        730G
Swap:           4.0G        169M        3.8G

Machine Info:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 80
On-line CPU(s) list: 0-79
Thread(s) per core: 2
Core(s) per socket: 20
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Stepping: 4
CPU MHz: 999.902
CPU max MHz: 3700.0000
CPU min MHz: 1000.0000
BogoMIPS: 4800.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 28160K
NUMA node0 CPU(s): 0-19,40-59
NUMA node1 CPU(s): 20-39,60-79

Solution

Patricia Trie memtable doesn't work well on multi-threaded workloads

[BUG]:

Expected behavior

db_bench readrandomwriterandom works well on multi-threaded workloads and gets better performance (ops and latency) than the skiplist memtable

Actual behavior

Running db_bench with multi-threaded (more than 2 threads) reads and writes, the background write hangs without any useful log output.

Steps to reproduce the behavior

db_bench --memtablerep=skip_list ....

Replace namespace name "rocksdb" with TERARKDB_NAMESPACE

[Enhancement]

Summary

When dynamically linking two binaries together, different builds of RocksDB from two sources might cause errors. To give users a tool to solve this problem, the RocksDB namespace has been changed to a flag that can be overridden at build time. The default "rocksdb" namespace is defined in terark_namespace.h.
See:
facebook/rocksdb#6433
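
A minimal sketch of how such a build-time namespace macro works, following the pattern of facebook/rocksdb#6433 (illustrative only; the actual contents of terark_namespace.h may differ):

// terark_namespace.h-style sketch: the namespace is a macro so the build
// system can override it, e.g. by passing -DTERARKDB_NAMESPACE=terarkdb
// on the compiler command line.
#ifndef TERARKDB_NAMESPACE
#define TERARKDB_NAMESPACE rocksdb  // default keeps source compatibility
#endif

namespace TERARKDB_NAMESPACE {
class DB;  // all library symbols live under the configurable namespace
}  // namespace TERARKDB_NAMESPACE

With this in place, two builds compiled with different namespace values can be linked into the same binary without symbol clashes.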

[BUG] Memtable test hangs

[BUG]:

Expected behavior

Test finishes in a reasonable time

Actual behavior

Test hangs with larger num_operations

Steps to reproduce the behavior

memtablerep_bench -benchmarks seqreadwrite -num_threads 2 -num_operations=1000

reorder write batch list

[Enhancement] reorder write batch

Problem

If a write batch leader has WAL enabled, it can currently group commits from all followers regardless of whether they disabled WAL, which can lead to data inconsistency in some cases.

(TODO needs a detailed explanation).

Solution

If we simply restricted a leader that enabled WAL to sequentially grouping only commits that also enabled WAL, the overall IOPS would be impacted significantly.

So we should reorder the write batch list so that a leader can take as many commits (with the same WAL config) as possible, as sketched below, maximizing the throughput of the system.
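
A minimal sketch of the grouping idea, assuming a pending-writer queue; PendingWrite, disable_wal, and ReorderByWalConfig are hypothetical names, not TerarkDB API:

#include <algorithm>
#include <vector>

// Illustrative pending commit; the real writer struct carries more state.
struct PendingWrite {
  bool disable_wal;  // per-commit WAL config
  // ... batch payload ...
};

// Group pending commits by WAL config so a leader can take a maximal run
// of followers with the same config. std::stable_partition keeps the
// original order within each group, preserving commit ordering.
void ReorderByWalConfig(std::vector<PendingWrite*>& queue) {
  std::stable_partition(queue.begin(), queue.end(),
                        [](const PendingWrite* w) { return !w->disable_wal; });
}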

[BUG] Cannot manually trigger compaction of blob SSTs at level -1 (KV separation)

[BUG]

This occurs when many small blob SSTs have been created at level -1 due to KV separation.

Expected behavior

Manual compaction, table option changes, and key deletion should trigger compaction of the blob SSTs involved at level -1.

Actual behavior

Only SSTs from level 0 and above are compacted; GC is not triggered even when the keys contained in the blob SSTs are deleted.
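
For reference, the kind of full-range manual compaction being attempted, using only the standard RocksDB-compatible API (FullRangeCompact is an illustrative helper name, not from the report):

#include "rocksdb/db.h"

// Assumes `db` is an open TerarkDB instance with KV separation enabled.
void FullRangeCompact(rocksdb::DB* db) {
  rocksdb::CompactRangeOptions cro;
  // Standard RocksDB-compatible option: force rewriting the bottom level.
  cro.bottommost_level_compaction = rocksdb::BottommostLevelCompaction::kForce;
  // nullptr begin/end = the entire key range. Per this report, the blob
  // SSTs at level -1 are still neither compacted nor garbage-collected.
  db->CompactRange(cro, nullptr, nullptr);
}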

Steps to reproduce the behavior

Some source code uses std::move on a const reference object, which mostly returns a copy of the object


[Enhancement]:

Problem

When called on a const object, std::move returns a copy of the object, which is likely not the developer's intent.
https://docs.microsoft.com/en-us/cpp/code-quality/c26478?view=msvc-160

: RangeStorage(std::move(_start), std::move(_limit), _include_start,
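
A minimal self-contained illustration of the pitfall; Holder is a hypothetical type, not from the TerarkDB codebase:

#include <string>
#include <utility>

// std::move on a const lvalue yields `const std::string&&`, which cannot
// bind to the move constructor (it takes `std::string&&`), so overload
// resolution falls back to the copy constructor: a silent copy, not a move.
struct Holder {
  std::string value;
  explicit Holder(const std::string& s)
      : value(std::move(s)) {}  // compiles, but copies
};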

Solution

Remove the std::move on const reference arguments (it produces a copy anyway), or change the parameters so they can actually be moved from (e.g. pass by value and move).

Add RecordBaseTable for better compression and CPU utilization on heavy-writing workloads

[Enhancement]:

Problem

  • TerarkZipTable costs too much CPU on heavy-writing workloads (e.g. 200MB/s WAL writes)
  • BlockBasedTable cannot enable compression on such workloads, or CPU becomes the bottleneck
  • The main problem is that compressed blocks cannot be reused in subsequent compactions

Solution

  • RecordBasedTable enables block reuse to lower CPU usage on heavy-writing workloads
  • Enables point access inside a block (record-based access)
  • ...
