
mem++


C++ memory allocator with smart features such as garbage collection and heap compaction.

🔢 Current version

Current library version: 0.3.5

Warning: the API is not stable!

🔬 Features

  • Garbage Collection
  • Fast memory allocations (using bump-allocator techniques)
  • Fast free algorithms
  • Advanced compacting algorithms that reduce memory fragmentation and improve cache locality: heuristic-layouting

⚠ Supported systems / limitations

  • The library is still in development. It is definitely not recommended for production use!
  • All Unix-like systems (where it is possible to use mmap)
  • g++ or clang++ compilers
  • Currently supports only single-threaded applications
  • References/Pointers to GcPtr's are always invalidated after GarbageCollect()

❓ Usage

  1. Install the latest build tools: apt install cmake g++ clang
  2. Clone just the library: git clone https://github.com/m4drat/memplusplus/
  3. Clone the library (with test support): git clone --recurse-submodules=./libraries/googletest https://github.com/m4drat/memplusplus/
  4. If you want to run benchmarks, take a look at this: link

How to use the library as a dependency (external project)

  1. Run cmake from the root directory (memplusplus):

    cmake \
        -S . \
        -B build \
        -DCMAKE_BUILD_TYPE=Release \
        -DMPP_BUILD_SHARED_LIBS=ON
  2. Compile and install:

    sudo cmake \
        --build build \
        --target install \
        -j 8
  3. After the library is installed, use find_package in your CMakeLists.txt to consume the library:

    cmake_minimum_required(VERSION 3.13)
    project(<YOUR_PROJECT_NAME>)
    
    find_package(mpp 0.3.5 REQUIRED)
    
    add_executable(<YOUR_PROJECT_NAME> <YOUR_SOURCES>)
    target_link_libraries(<YOUR_PROJECT_NAME> PRIVATE mpp::mpp)
  4. After that you will be able to include the library headers in your sources like this: #include <mpplib/mpp.hpp>

How to use the library internally

  1. Copy the whole libmemplusplus directory into your project's root directory.

  2. In your project's CMake file, add:

    add_subdirectory(libmemplusplus)
    target_link_libraries(<YOUR_PROJECT_NAME> PRIVATE mpp::mpp)
  3. After that you will be able to include the library headers in your sources like this: #include "mpplib/mpp.hpp"

📀 Build options

Global options:

  • MPP_ENABLE_COVERAGE - build with code coverage support
  • MPP_BUILD_FUZZER - build fuzzer project (will build the library with sanitizers)
  • MPP_BUILD_EXAMPLE - build example project
  • MPP_BUILD_TESTS - build tests
  • MPP_BUILD_DOCS - build documentation

Library options:

  • MPP_BUILD_SHARED_LIBS - build shared (ON) or static (OFF) libraries
  • MPP_FULL_DEBUG - build in full debug mode (adds extended security checks in debug build)
  • MPP_SECURE - build in secure mode with additional security features
  • MPP_PROFILE - enable profiling instrumentation
  • MPP_SANITIZERS - add sanitizers to the build
  • MPP_VALGRIND - add support for valgrind
  • MPP_COLOUR_DEBUG_OUTPUT - add colors to debug output
  • MPP_STATS - add statistics instrumentation
  • MPP_ENABLE_LOGGING - enable logging (even in release mode)

🔳 Environment variables

  • MPP_DUMP_OBJECTS_GRAPH=1 / MPP_DUMP_OBJECTS_GRAPH=2 - dump the objects graph to the file objects.dot while performing CollectGarbage() (only possible in debug mode)

  • MPP_SHOW_STATISTICS=1 - display statistics after program termination (the library must be built with MPP_STATS set to ON)

  • MPP_LOG_LEVEL - set log level (default: FATAL). Supported values: TRACE, DEBUG, INFO, WARNING, ERROR, FATAL, DISABLED

📚 Examples

  • Automatic memory management

    #include <mpplib/mpp.hpp>
    
    ...
    
    // create a smart pointer that automatically deallocates the object when needed
    mpp::SharedGcPtr<Object> object = mpp::MakeShared<Object>(<constructor params>);
    
    // create an array of Objects that is automatically deallocated when it goes out of scope
    mpp::SharedGcPtr<Object[]> objects = mpp::MakeSharedN<Object>(<array size>, <constructor params>);
    
    ...
    
    // collect all garbage + compact memory (should be called manually)
    mpp::CollectGarbage();
  • Manual memory management - deprecated

    #include <mpplib/mpp.hpp>
    
    ...
    
    // will call constructor automatically, like new
    Object* obj = mpp::Allocate<Object>(<constructor params>);
    // create raw pointer (behaves like malloc)
    void* ptr = mpp::Allocate(128);
    
    ...
    
    // will call destructor automatically, like delete
    mpp::Deallocate(obj);
    // only deallocates raw pointer (behaves like free)
    mpp::Deallocate(ptr);

💻 Debugging/profiling library

Memplusplus provides different debug-like features, such as data visualizers, profilers, and statistics collectors.

  • Backtrace functionality for MPP_ASSERT:

    Add these flags to your project's CMakeLists.txt, if you want to see a backtrace when MPP_ASSERT fails:

    # For GCC
    target_compile_options(${PROJECT_NAME} PRIVATE -g -O0)
    target_link_libraries(${PROJECT_NAME} PRIVATE mpp::mpp -export-dynamic)
    
    # For Clang
    target_compile_options(${PROJECT_NAME} PRIVATE -g -O0)
    target_link_libraries(${PROJECT_NAME} PRIVATE mpp::mpp -Wl,--export-dynamic)
  • Address Sanitizer support (ASAN):

    Enable MPP_SANITIZERS before building the library, then compile your project with the -fsanitize=address flag. The resulting ASan-instrumented build will help you debug memory-management issues.

    Example buggy code:

    void* p1 = mpp::Allocate(128);
    mpp::Deallocate(p1);
    *(uint32_t*)p1 = 0x13371337; // Use-after-free write

    Example ASAN report for this code:

    =================================================================
    ==26738==ERROR: AddressSanitizer: use-after-poison on address 0x19b297400010 at pc 0x557f3c1aaea3 bp 0x7ffcb838c440 sp 0x7ffcb838c438
    WRITE of size 4 at 0x19b297400010 thread T0
        #0 0x557f3c1aaea2 in main /home/madrat/memplusplus/build/../example_project/src/main.cpp:137:20
        #1 0x7f16c401a0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
        #2 0x557f3c0e7d2d in _start (/home/madrat/memplusplus/build/example_project/example_project-d+0x45d2d) (BuildId: cca1c1f77a22aeae021502831561465b63cc9f19)
    
    Address 0x19b297400010 is a wild pointer inside of access range of size 0x000000000004.
    SUMMARY: AddressSanitizer: use-after-poison /home/madrat/memplusplus/build/../example_project/src/main.cpp:137:20 in main
    
    ......
  • Valgrind support:

    Enable MPP_VALGRIND before building the library, then compile your project with the -g flag. The resulting valgrind-compatible build will help you track down memory leaks. You might also want to disable MPP_SANITIZERS/MPP_BUILD_FUZZER and build using g++.

    Example buggy code:

    void leak() {
        void* p1 = mpp::Allocate(128);
        // mpp::Deallocate(p1);
    }

    Example valgrind report for this code:

    ==21304== HEAP SUMMARY:
    ==21304==     in use at exit: 160 bytes in 1 blocks
    ==21304==   total heap usage: 24 allocs, 23 frees, 82,582 bytes allocated
    ==21304==
    ==21304== Searching for pointers to 1 not-freed blocks
    ==21304== Checked 150,048 bytes
    ==21304==
    ==21304== 160 bytes in 1 blocks are definitely lost in loss record 1 of 1
    ==21304==    at 0x151812: mpp::MemoryManager::Allocate(unsigned long) (memory_manager.cpp:181)
    ==21304==    by 0x152A33: mpp::Allocate(unsigned long) (memory_manager.cpp:352)
    ==21304==    by 0x150238: leak() (main.cpp:131)
    ==21304==    by 0x15024C: main (main.cpp:144)
    ==21304==
    ==21304== LEAK SUMMARY:
    ==21304==    definitely lost: 160 bytes in 1 blocks
    ==21304==    indirectly lost: 0 bytes in 0 blocks
    ==21304==      possibly lost: 0 bytes in 0 blocks
    ==21304==    still reachable: 0 bytes in 0 blocks
    ==21304==         suppressed: 0 bytes in 0 blocks
    ==21304==
    ==21304== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
    
  • Data visualizers:

    In Debug builds, you can dump a .dot representation of the ChunkTreap and the GcGraph (the latter only via a specific environment variable). You can then render these .dot files with dot from graphviz.

    ChunkTreap visualizer

    How to dump the ChunkTreap in code (don't forget to redirect the program output to the file treap.dot):

    // 1. Get the arena you want to dump the tree for
    std::unique_ptr<Arena>& tmpArena = mpp::GetArenaList()[0];
    
    // 2. Extract ChunkTreap from this arena, and dump it
    // (in this example everything was dumped to std::cout).
    tmpArena->GetFreedChunks().GenerateGraphvizLayout(std::cout);

    How to generate an .svg file with dot:

    dot -Tsvg treap.dot -o treap.svg

    After that you will get an .svg file with your freed-chunks treap:
    treap

    GcGraph visualizer

    To visualize the GcGraph you have to:

    1. Build the library in debug mode, and set MPP_DUMP_OBJECTS_GRAPH=1 (basic visualization) or MPP_DUMP_OBJECTS_GRAPH=2 (advanced visualization) before running the target app. On each GC cycle, the library will dump the objects graph to the file "objects_cycle<current cycle number>.dot".

    2. Then generate an .svg file using dot as follows:

      dot -Tsvg objects_cycle<N>.dot -o objects_cycle<N>.svg

    For example, for this code (it creates a linked list and tree):

    // Linked List node
    struct Node {
        uint32_t data;
        SharedGcPtr<Node> prev;
        SharedGcPtr<Node> next;
    
        Node(uint32_t t_data, SharedGcPtr<Node> t_p, SharedGcPtr<Node> t_n)
            : data{t_data}, prev{t_p}, next{t_n}
        {
        }
    };
    
    // Create Linked List
    SharedGcPtr<Node> n1 = MakeShared<Node>(1, nullptr, nullptr);
    SharedGcPtr<Node> n2 = MakeShared<Node>(2, nullptr, nullptr);
    SharedGcPtr<Node> n3 = MakeShared<Node>(3, nullptr, nullptr);
    SharedGcPtr<Node> n4 = MakeShared<Node>(4, nullptr, nullptr);
    
    n1->prev = nullptr;
    n1->next = n2;
    
    n2->prev = n1;
    n2->next = n3;
    
    n3->prev = n2;
    n3->next = n4;
    
    n4->prev = n3;
    n4->next = nullptr;
    
    // Tree node
    struct TreeNode {
        uint32_t data;
        SharedGcPtr<TreeNode> left;
        SharedGcPtr<TreeNode> right;
        SharedGcPtr<TreeNode> up;
    
        TreeNode(uint32_t t_data, SharedGcPtr<TreeNode> t_left, SharedGcPtr<TreeNode> t_right, SharedGcPtr<TreeNode> t_up)
            : data{t_data}, left{t_left}, right{t_right}, up{t_up}
        {
        }
    };
    
    // Create a random tree
    SharedGcPtr<TreeNode> root = MakeShared<TreeNode>(0, nullptr, nullptr, nullptr);
    SharedGcPtr<TreeNode> treeNode1 = MakeShared<TreeNode>(1, nullptr, nullptr, nullptr);
    SharedGcPtr<TreeNode> treeNode2 = MakeShared<TreeNode>(2, nullptr, nullptr, nullptr);
    SharedGcPtr<TreeNode> treeNode3 = MakeShared<TreeNode>(3, nullptr, nullptr, nullptr);
    SharedGcPtr<TreeNode> treeNode4 = MakeShared<TreeNode>(4, nullptr, nullptr, nullptr);
    SharedGcPtr<TreeNode> treeNode5 = MakeShared<TreeNode>(5, nullptr, nullptr, nullptr);
    
    root->up = nullptr;
    root->left = treeNode1;
    root->right = treeNode2;
    
    treeNode1->up = root;
    treeNode1->left = nullptr;
    treeNode1->right = nullptr;
    
    treeNode2->up = root;
    treeNode2->left = treeNode3;
    treeNode2->right = treeNode4;
    
    treeNode3->up = treeNode2;
    treeNode3->left = nullptr;
    treeNode3->right = nullptr;
    
    treeNode4->up = treeNode2;
    treeNode4->left = treeNode5;
    treeNode4->right = nullptr;
    
    treeNode5->up = treeNode4;
    treeNode5->left = nullptr;
    treeNode5->right = nullptr;
    
    CollectGarbage();

    Executing this code with MPP_DUMP_OBJECTS_GRAPH=1 will generate this graph: objects-simple

    Executing this code with MPP_DUMP_OBJECTS_GRAPH=2 will generate this graph: objects-advanced

  • On-demand objects graph visualizer

    To dump the objects graph at an arbitrary point in your program, add this code:

    mpp::g_memoryManager->GetGC().DumpCurrentObjectsGraph();

    You might want to specify the dump type (ADVANCED by default) and the filename to save the .dot graph to (default: "objects-graph.dot").

  • Profiler

    Using the build option MPP_PROFILE, you can build the library with the profiler enabled. To do so:

    1. Set the MPP_PROFILE before building the library
    2. Run the application until it exits
    3. In your current directory, find the file called mpplib-profiling.json
    4. Open this file using the chrome tracing tool. To do so, open chrome and navigate to the chrome://tracing page.

    Example output:

    profiler

  • Statistics collector

    This feature allows you to dump memory-allocator statistics. To build the library with this feature, add MPP_STATS while building the project. You can use this feature in two ways:

    At runtime, using the C++ API:

    Example code to dump statistics:

    utils::Statistics::GetInstance().DumpStats(std::cout, true, false, false) << std::endl;

    Example output:

    +============= STATISTICS DUMP START =============+
    MPP: MIN_CHUNK_SIZE     : 32 bytes
    MPP: CHUNK_HEADER_SIZE  : 16 bytes
    MPP: DEFAULT_ARENA_SIZE : 32.000 MB
    MPP: PAGE_SIZE          : 4.000 KB
    ~~~~~~~~~~~~~~~~ All active arenas ~~~~~~~~~~~~~~~~
    -------------- Arena: 0x1ffaf50 --------------
    MPP - Total arena size           : 32.000 MB
    MPP - Currently Allocated Space  : 960.000 Bytes
    MPP - Amount of free memory      : 31.999 MB (99.9971% of total arena size)
    MPP - Amount of allocated memory : 960.000 Bytes (0.0029% of total arena size)
    MPP - Allocated memory == 0      : false
    MPP - Memory used for metadata   : 160.000 Bytes (16.6667% of currently allocated space)
    MPP - TopChunk                   : [0x7f329dd803c0](96, 33553472|InUse:1|PrevInUse:1)
    MPP - Arena begin pointer        : 0x7f329dd80000
    MPP - Arena end pointer          : 0x7f329fd80000
    MPP - Freed chunks nodes (0)
    MPP - Chunks in use (10)
    
    ~~~~~~~~~~~~~~~~~~ General stats ~~~~~~~~~~~~~~~~~~
    -----------------  Arena: 1  -----------------
    MPP - Total amount of allocated memory inside arena : 960.000 Bytes
    MPP - Total amount of freed memory inside arena     : 0 bytes
    MPP - total freed == total allocated                : false
    MPP - Biggest allocation size                       : 96.000 Bytes
    MPP - Smallest allocation size                      : 96.000 Bytes
    MPP - Full size of arena                            : 32.000 MB
    MPP - Arena was allocated for big chunk             : false
    MPP - Arena was allocated by GC                     : false
    
    -----------------  Arena: 2  -----------------
    MPP - Total amount of allocated memory inside arena : 960.000 Bytes
    MPP - Total amount of freed memory inside arena     : 0 bytes
    MPP - total freed == total allocated                : false
    MPP - Biggest allocation size                       : "No allocations have been made yet"
    MPP - Smallest allocation size                      : "No allocations have been made yet"
    MPP - Full size of arena                            : 32.000 MB
    MPP - Arena was allocated for big chunk             : false
    MPP - Arena was allocated by GC                     : true
    
    ~~~~~~~~~~~~~ Garbage Collector stats ~~~~~~~~~~~~~
    -----------------  Cycle: 1  -----------------
    GC - Time wasted inside   GC::Collect() : 2.0000 ms
    GC - Memory cleaned after GC::Collect() : 0 bytes
    GC - Total size of active objects       : 960.000 Bytes
    
    +============== STATISTICS DUMP END ==============+
    

    At program termination, by setting the environment variable MPP_SHOW_STATISTICS=1. Important note: runtime stats cannot be dumped using this approach.

    Example output:

    +============= STATISTICS DUMP START =============+
    MPP: MIN_CHUNK_SIZE     : 32 bytes
    MPP: CHUNK_HEADER_SIZE  : 16 bytes
    MPP: DEFAULT_ARENA_SIZE : 32.000 MB
    MPP: PAGE_SIZE          : 4.000 KB
    ~~~~~~~~~~~~~~~~~~ General stats ~~~~~~~~~~~~~~~~~~
    -----------------  Arena: 1  -----------------
    MPP - Total amount of allocated memory inside arena : 960.000 Bytes
    MPP - Total amount of freed memory inside arena     : 0 bytes
    MPP - total freed == total allocated                : false
    MPP - Biggest allocation size                       : 96.000 Bytes
    MPP - Smallest allocation size                      : 96.000 Bytes
    MPP - Full size of arena                            : 32.000 MB
    MPP - Arena was allocated for big chunk             : false
    MPP - Arena was allocated by GC                     : false
    
    -----------------  Arena: 2  -----------------
    MPP - Total amount of allocated memory inside arena : 960.000 Bytes
    MPP - Total amount of freed memory inside arena     : 0 bytes
    MPP - total freed == total allocated                : false
    MPP - Biggest allocation size                       : "No allocations have been made yet"
    MPP - Smallest allocation size                      : "No allocations have been made yet"
    MPP - Full size of arena                            : 32.000 MB
    MPP - Arena was allocated for big chunk             : false
    MPP - Arena was allocated by GC                     : true
    
    ~~~~~~~~~~~~~ Garbage Collector stats ~~~~~~~~~~~~~
    -----------------  Cycle: 1  -----------------
    GC - Time wasted inside   GC::Collect() : 2 ms
    GC - Memory cleaned after GC::Collect() : 0 bytes
    GC - Total size of active objects       : 960.000 Bytes
    
    +============== STATISTICS DUMP END ==============+
    
  • Allocate/Deallocate hooks

    This feature allows you to install your hooks on Allocate and Deallocate. Example:

    // Lambda as an Allocate hook. Important, all hooks should be static!
    static std::function<void*(std::size_t)> allocHook = [&](std::size_t t_AllocSize) {
        mpp::SetAllocateHook(nullptr);
        void* ptr = mpp::Allocate(t_AllocSize);
        mpp::SetAllocateHook(allocHook);
        std::cout << "[mpp] Allocate(" << t_AllocSize << ") -> ";
        mpp::Chunk::DumpChunk(std::cout, mpp::Chunk::GetHeaderPtr(ptr)) << std::endl;
        return ptr;
    };
    // Set actual Allocate hook.
    mpp::SetAllocateHook(allocHook);
    
    // Lambda as a Deallocate hook. Important: all hooks should be static!
    static std::function<bool(void*)> deallocHook = [&](void* t_Addr) {
        mpp::SetDeallocateHook(nullptr);
        bool res = mpp::Deallocate(t_Addr);
        mpp::SetDeallocateHook(deallocHook);
        std::cout << "[mpp] Deallocate(" << t_Addr << ") -> " << res << std::endl;
        return res;
    };
    // Set actual Deallocate hook.
    mpp::SetDeallocateHook(deallocHook);

🔥 Heuristic layouting

Right now the algorithm consists of two parts:

1. Find connected components

Divide the graph of all objects into connected components; each component is a set of objects reachable from one another. By dividing the graph into components we can solve the layout problem for each component separately and, more importantly, do it in parallel. This step also yields an optimal layout in one sense: all objects in a component are guaranteed to end up in the same arena (a contiguous memory block).

2. Layout each component heuristically

The next step is to lay out the objects of each component so that they can be accessed in a cache-friendly way. This is achieved with a heuristic layout algorithm. Currently it is a proof of concept that works only for objects with a single pointer field (more precisely, for any data structure that can be represented as a singly linked list). The algorithm is based on the following assumption:

  • If we have a singly linked list, the next element should most likely be located right after the current one. If it is not (see the image below), it's a good idea to move it there; doing so reduces cache misses and improves performance.

Benchmark results for the algorithm are presented here: Benchmark results.

We can see that the algorithm improves memory access time by 4-8x (depending on the size of the list). Note, however, that the benchmark shows the corner case where the list is fully randomized in memory. In real programs the list may already be laid out in a cache-friendly way, in which case the algorithm will not improve performance.

Closing words: the algorithm is definitely not perfect, but it's a good start. It can be improved in many ways, for example by adding more heuristics, and it can serve as a base for a more complex layouting algorithm capable of laying out any data structure.

🚀 Performance comparisons

  1. Standalone benchmarks: memplusplus-benchmarks.
  2. Continuous benchmarking results: continuous-benchmarking

🧾 Documentation

  • Install doxygen, sphinx, breathe

  • Configure:

    cmake \
        -S . \
        -B build \
        -DCMAKE_BUILD_TYPE=Release \
        -DMPP_BUILD_FUZZER=OFF \
        -DMPP_BUILD_EXAMPLE=OFF \
        -DMPP_BUILD_TESTS=OFF \
        -DBUILD_SHARED_LIBS=OFF \
        -DMPP_BUILD_DOCS=ON
  • Build docs:

    cmake \
        --build build \
        -j 8
  • Now docs can be found in "./build/docs/doxygen/html" or "./build/docs/sphinx"

🧪 Tests

  • Configure tests:

    cmake \
        -S . \
        -B build \
        -DCMAKE_BUILD_TYPE=Release \
        -DBUILD_SHARED_LIBS=OFF \
        -DMPP_BUILD_TESTS=ON
  • Build tests:

    cmake \
        --build build \
        --target unit_tests \
        -j 8
  • Run tests:

    cd build && ctest

    Some tests are disabled because they require ASLR to be turned off (for example, GcGraphTest.GenerateGraphvizLayout). This is considered unreliable, but if you still want to run them, you can do so like this:

    cd build && setarch `uname -m` -R ./build/tests/unit_tests --gtest_also_run_disabled_tests

🚁 clang-format and clang-tidy

  • Install clang-tidy and clang-format

  • Configure:

    cmake \
        -S . \
        -B build \
        -DCMAKE_BUILD_TYPE=Release
  • Use clang-format:

    cmake \
        --build build \
        --target clang-format
  • Use clang-tidy:

    cmake \
        --build build \
        --target clang-tidy

memplusplus's People

Contributors

akoul02, lithium1337mwfgtkam, m4drat

memplusplus's Issues

General refactoring

  • Allocate should only allocate memory for an object. Create a separate method for object construction and another that allocates and constructs objects automatically.
  • Change the inheritance scheme (we should only have MemoryManager/MemoryAllocator, plus the GC as a member)
  • Inside GC, increase newly created arena size

Performance comparisons (mimalloc, malloc, jemalloc) + CPU/memory profiling

Specifically measure: use valgrind to measure cache misses

Implement performance comparison with these memory allocators:

  • gcpp (only speed gain after compacting)
  • mimalloc (only allocation/deallocation speed)
  • rpmalloc (only allocation/deallocation speed)
  • jemalloc (only allocation/deallocation speed)
  • hoard (only allocation/deallocation speed)
  • supermalloc (only allocation/deallocation speed)
  • ptmalloc3 (only allocation/deallocation speed)
  • ptmalloc2 (only allocation/deallocation speed) - use latest libc
  • mem++ (speed gain after compacting + allocation/deallocation speed)
  • Mesh (only allocation/deallocation speed)
  • tcmalloc (only allocation/deallocation speed)

Useful links:

  1. rpmalloc-benchmark
  2. how-can-i-profile-c-code-running-on-linux
  3. easy_profiler
  4. prof
  5. VISUAL BENCHMARKING in C++ (how to measure performance visually)
  6. Google benchmark
  7. "Performance Matters" by Emery Berger
  8. Coz: Finding Code that Counts with Causal Profiling
  9. P2329-move_at_scale
  10. CppCon 2015: Chandler Carruth "Tuning C++: Benchmarks, and CPUs, and Compilers! Oh My!"
  11. Intel v-tune
  12. github-gprof2dot
  13. How to benchmark correctly: llvm

Don't forget to use clang stabilizer!

Add PROFILING definition and option to build

#!/bin/bash

# build the program (no special flags are needed)
g++ -std=c++11 cpuload.cpp -o cpuload

# run the program with callgrind; generates a file callgrind.out.<pid>
valgrind --tool=callgrind ./cpuload

# open the generated callgrind.out.<pid> with kcachegrind
kcachegrind callgrind.out.*

#!/bin/bash

# build the program; for our demo program, we specify -DWITHGPERFTOOLS to enable the gperftools-specific #ifdefs
g++ -std=c++11 -DWITHGPERFTOOLS -lprofiler -g ../cpuload.cpp -o cpuload

# run the program; generates the profiling data file (profile.log in our example)
./cpuload

# convert profile.log to callgrind-compatible format
pprof --callgrind ./cpuload profile.log > profile.callgrind

# open profile.callgrind with kcachegrind
kcachegrind profile.callgrind

Update std::allocator interface

We already have a default std::allocator interface that uses the Allocate/Deallocate functions directly. But we are going to get rid of these functions in the future (due to some problems described in #30), so we have to refactor this interface so that it can be used without the Allocate/Deallocate methods.

Add more security features, e.g. chunk header xor-ing, better randomization, double-free checks, buffer overflow prevention, ...

  • heap canaries
  • randomized allocations? (probably it isn't possible)
  • headers encryption
  • better randomization of mmaped arenas (e.g. mimalloc: mi_unix_mmapx)
  • Top chunk checks
  • Checks on chunks merging
  • Replace std::terminate with std::abort?
  • On deallocate fill memory with zeroes
  • Check for invalid headers (for example: chunk header gets overwritten, so we need to check for invalid merging)
  • Check for invalid smart pointer initialization
  • Check for double-free
  • Check for invalid free

option(MPP_SECURE "Build with advanced security checks" OFF)

Refactor statistics system

  1. Refactor everything statistics-related (right now the system looks like a lot of dirty hacks)
  2. GC metrics: #67
  3. Use macros to define statistics-related things

Allocator general improvements

  • Add cache layer
  • Change allocation logic (firstly reuse all available space in free list, and only then allocate new arena)
  • Inside GC, increase newly created arena size

SharedGcPtr class

  1. Calling of ctor/dtor
  2. Update the internal state of the GC while adding/removing a SharedGcPtr
  3. Different SharedGcPtrs that point to the same object
  4. Initialization of smart pointers:
     1. auto ptr = SharedGcPtr<SharedGcPtr>((UserClass*)(void* ptr));
     2. auto ptr = SharedGcPtr<SharedGcPtr>(SharedGcPtrptr);
     3. auto ptr = SharedGcPtr<SharedGcPtr*>(SharedGcPtr* ptr);
  5. SharedGcPtr that points not to the beginning of a chunk
  6. Active objects iteration
  7. make_SharedGcPtr(Args...)


Tests for ChunkTreap

  1. Test removal of a specific chunk from the tree (with different addresses but identical sizes). (2 cases: the 1st chunk lies before the 2nd, the 1st chunk lies after the 2nd)
