
martinus / unordered_dense


A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion

License: MIT License

c-plus-plus cpp17 cpp hash hash-tables header-only-library no-dependencies stl-containers unordered-map unordered-set

unordered_dense's Introduction


🚀 ankerl::unordered_dense::{map, set}

A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion for C++17 and later.

The classes ankerl::unordered_dense::map and ankerl::unordered_dense::set are (almost) drop-in replacements for std::unordered_map and std::unordered_set. While they don't have as strong iterator/reference stability guarantees, they are typically much faster.

Additionally, there are ankerl::unordered_dense::segmented_map and ankerl::unordered_dense::segmented_set with lower peak memory usage and stable iterators/references on insert.
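
A minimal usage sketch (the key and value types are just for illustration; the include path matches the installed header):

#include <ankerl/unordered_dense.h>

#include <cstdint>
#include <iostream>
#include <string>

int main() {
    auto map = ankerl::unordered_dense::map<std::string, uint64_t>();
    map["hello"] = 1;
    map.try_emplace("world", 2);
    for (auto const& [key, value] : map) {
        std::cout << key << " -> " << value << '\n';
    }
}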

1. Overview

The chosen design has a few advantages over std::unordered_map:

  • Perfect iteration speed - Data is stored in a std::vector, all data is contiguous!
  • Very fast insertion & lookup speed, in the same ballpark as absl::flat_hash_map
  • Low memory usage
  • Full support for std::allocators, and polymorphic allocators. There are ankerl::unordered_dense::pmr typedefs available
  • Customizable storage type: with a template parameter you can e.g. switch from std::vector to boost::interprocess::vector or any other compatible random-access container.
  • Better debugging: the underlying data can be easily seen in any debugger that can show an std::vector.

There's no free lunch, so there are a few disadvantages:

  • Deletion speed is relatively slow. This needs two lookups: one for the element to delete, and one for the element that is moved onto the newly empty spot.
  • no const Key in std::pair<Key, Value>
  • Iterators and references are not stable on insert or erase.

2. Installation

The default installation location is /usr/local.

2.1. Installing using CMake

Clone the repository and run these commands in the cloned folder:

mkdir build && cd build
cmake ..
cmake --build . --target install

Consider setting an install prefix if you do not want to install unordered_dense system-wide, like so:

mkdir build && cd build
cmake -DCMAKE_INSTALL_PREFIX:PATH=${HOME}/unordered_dense_install ..
cmake --build . --target install

To make use of the installed library, add this to your project:

find_package(unordered_dense CONFIG REQUIRED)
target_link_libraries(your_project_name unordered_dense::unordered_dense)

3. Usage

3.1. Modules

ankerl::unordered_dense supports C++20 modules. Simply compile src/ankerl.unordered_dense.cpp and use the resulting module, e.g. like so:

clang++ -std=c++20 -I include --precompile -x c++-module src/ankerl.unordered_dense.cpp
clang++ -std=c++20 -c ankerl.unordered_dense.pcm

To use the module, e.g. in module_test.cpp, write

import ankerl.unordered_dense;

and compile with e.g.

clang++ -std=c++20 -fprebuilt-module-path=. ankerl.unordered_dense.o module_test.cpp -o main

A simple demo script can be found in test/modules.

3.2. Hash

ankerl::unordered_dense::hash is a fast and high quality hash, based on wyhash. The ankerl::unordered_dense map/set differentiates between hashes of high quality (good avalanching effect) and bad quality. Hashes with good quality contain a special marker:

using is_avalanching = void;

This is the case for the specializations bool, char, signed char, unsigned char, char8_t, char16_t, char32_t, wchar_t, short, unsigned short, int, unsigned int, long, long long, unsigned long, unsigned long long, T*, std::unique_ptr<T>, std::shared_ptr<T>, enum, std::basic_string<C>, and std::basic_string_view<C>.

Hashes that do not contain such a marker are assumed to be of bad quality and receive an additional mixing step inside the map/set implementation.

3.2.1. Simple Hash

Consider a simple custom key type:

struct id {
    uint64_t value{};

    auto operator==(id const& other) const -> bool {
        return value == other.value;
    }
};

The simplest implementation of a hash is this:

struct custom_hash_simple {
    auto operator()(id const& x) const noexcept -> uint64_t {
        return x.value;
    }
};

This can be used e.g. with

auto ids = ankerl::unordered_dense::set<id, custom_hash_simple>();

Since custom_hash_simple doesn't have a using is_avalanching = void; marker, it is considered to be of bad quality, and additional mixing of x.value is automatically performed inside the set.

3.2.2. High Quality Hash

Back to the id example, we can easily implement a higher quality hash:

struct custom_hash_avalanching {
    using is_avalanching = void;

    auto operator()(id const& x) const noexcept -> uint64_t {
        return ankerl::unordered_dense::detail::wyhash::hash(x.value);
    }
};

We know wyhash::hash is of high quality, so we can add using is_avalanching = void; which makes the map/set directly use the returned value.

3.2.3. Specialize ankerl::unordered_dense::hash

Instead of creating a new class you can also specialize ankerl::unordered_dense::hash:

template <>
struct ankerl::unordered_dense::hash<id> {
    using is_avalanching = void;

    [[nodiscard]] auto operator()(id const& x) const noexcept -> uint64_t {
        return detail::wyhash::hash(x.value);
    }
};
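
With that specialization in place the hasher no longer has to be named explicitly; a short sketch:

auto ids = ankerl::unordered_dense::set<id>();
ids.insert(id{42});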

3.2.4. Heterogeneous Overloads using is_transparent

This map/set supports heterogeneous overloads as described in P2363 Extending associative containers with the remaining heterogeneous overloads which is targeted for C++26. This has overloads for find, count, contains, equal_range (see P0919R3), erase (see P2077R2), and try_emplace, insert_or_assign, operator[], at, and insert & emplace for sets (see P2363R3).

For heterogeneous overloads to take effect, both hasher and key_equal need to define the member type is_transparent.

Here is an example implementation that's usable with any string type that is convertible to std::string_view (e.g. char const* and std::string):

struct string_hash {
    using is_transparent = void; // enable heterogeneous overloads
    using is_avalanching = void; // mark class as high quality avalanching hash

    [[nodiscard]] auto operator()(std::string_view str) const noexcept -> uint64_t {
        return ankerl::unordered_dense::hash<std::string_view>{}(str);
    }
};

To make use of this hash you'll need to specify it as the hasher type, together with a key_equal that also defines is_transparent, such as std::equal_to<>:

auto map = ankerl::unordered_dense::map<std::string, size_t, string_hash, std::equal_to<>>();
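
With both hasher and key_equal transparent, lookups accept anything convertible to std::string_view without constructing a temporary std::string; a brief sketch continuing with the map declared above:

map[std::string{"hello"}] = 1;
auto it = map.find("hello");                        // char const*, no temporary std::string
bool has = map.contains(std::string_view{"hello"}); // std::string_view works as well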

For more information see the examples in test/unit/transparent.cpp.

3.2.5. Automatic Fallback to std::hash

When an implementation for std::hash of a custom type is available, this is automatically used and assumed to be of bad quality (thus std::hash is used, but an additional mixing step is performed).

3.2.6. Hash the Whole Memory

When the type has a unique object representation (no padding, trivially copyable), one can just hash the object's memory. Consider a simple class

struct point {
    int x{};
    int y{};

    auto operator==(point const& other) const -> bool {
        return x == other.x && y == other.y;
    }
};

A fast and high quality hash can be easily provided like so:

struct custom_hash_unique_object_representation {
    using is_avalanching = void;

    [[nodiscard]] auto operator()(point const& f) const noexcept -> uint64_t {
        static_assert(std::has_unique_object_representations_v<point>);
        return ankerl::unordered_dense::detail::wyhash::hash(&f, sizeof(f));
    }
};
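
This hash can then be used like any other custom hasher; a short usage sketch:

auto points = ankerl::unordered_dense::set<point, custom_hash_unique_object_representation>();
points.insert(point{1, 2});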

3.3. Container API

In addition to the standard std::unordered_map API (see https://en.cppreference.com/w/cpp/container/unordered_map) we have additional API that is somewhat similar to the node API, but leverages the fact that we're using a random access container internally:

3.3.1. auto extract() && -> value_container_type

Extracts the internally used container. *this is emptied.

3.3.2. extract() single Elements

Similar to erase(), there is an extract() API. It behaves exactly like erase(), except that the return value is the moved element that is removed from the container:

  • auto extract(const_iterator it) -> value_type
  • auto extract(Key const& key) -> std::optional<value_type>
  • template <class K> auto extract(K&& key) -> std::optional<value_type>

Note that the extract(key) API returns an std::optional<value_type> that is empty when the key is not found.
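
A brief sketch of extracting a single element (the key and value types are illustrative):

auto map = ankerl::unordered_dense::map<int, std::string>();
map[1] = "one";

if (auto extracted = map.extract(1)) {
    // extracted is a std::optional<std::pair<int, std::string>>; the element
    // has been moved out, and the map no longer contains key 1.
    std::string value = std::move(extracted->second);
}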

3.3.3. [[nodiscard]] auto values() const noexcept -> value_container_type const&

Exposes the underlying values container.

3.3.4. auto replace(value_container_type&& container)

Discards the internally held container and replaces it with the one passed. Non-unique elements are removed, and the container will be partly reordered when non-unique elements are found.
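
A sketch that combines the whole-container extract() with replace(), e.g. to reorder the elements externally (the sort criterion is illustrative):

#include <algorithm> // std::sort

auto map = ankerl::unordered_dense::map<int, int>();
map[3] = 30;
map[1] = 10;

auto values = std::move(map).extract();   // take the underlying container; map is now empty
std::sort(values.begin(), values.end(),
          [](auto const& a, auto const& b) { return a.first < b.first; });
map.replace(std::move(values));           // keys are still unique, so nothing is dropped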

3.4. Custom Container Types

unordered_dense accepts a custom allocator, but you can also specify a custom container for that template argument. That way it is possible to replace the internally used std::vector with e.g. std::deque or any other container like boost::interprocess::vector. This supports fancy pointers (e.g. offset_ptr), so the container can be used with e.g. shared memory provided by boost::interprocess.
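
Since the same template argument accepts either an allocator or a container, switching the internal storage to std::deque could look like the following sketch (the parameter order is assumed to match the current headers):

#include <deque>
#include <functional>
#include <string>
#include <utility>

// The template argument that normally takes an allocator receives a container type here.
using deque_map = ankerl::unordered_dense::map<
    int, std::string,
    ankerl::unordered_dense::hash<int>,
    std::equal_to<int>,
    std::deque<std::pair<int, std::string>>>;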

3.5. Custom Bucket Types

The map/set supports two different bucket types. The default should be good for pretty much everyone.

3.5.1. ankerl::unordered_dense::bucket_type::standard

  • Up to 2^32 = 4.29 billion elements.
  • 8 bytes overhead per bucket.

3.5.2. ankerl::unordered_dense::bucket_type::big

  • Up to 2^63 = 9223372036854775808 elements.
  • 12 bytes overhead per bucket.
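
The bucket type is selected via the last template parameter; a sketch with the defaults for hasher, key_equal, and allocator spelled out explicitly (assuming the parameter order of the current headers):

using big_map = ankerl::unordered_dense::map<
    uint64_t, uint64_t,
    ankerl::unordered_dense::hash<uint64_t>,
    std::equal_to<uint64_t>,
    std::allocator<std::pair<uint64_t, uint64_t>>,
    ankerl::unordered_dense::bucket_type::big>;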

4. segmented_map and segmented_set

ankerl::unordered_dense provides a custom container implementation that has lower memory requirements than the default std::vector. Memory is not contiguous, but it can allocate segments without having to reallocate and move all the elements. In summary, this leads to

  • Much smoother memory usage; memory usage increases continuously.
  • No high peak memory usage.
  • Faster insertion, because elements never need to be moved to newly allocated blocks.
  • Slightly slower indexing compared to std::vector, because an additional indirection is needed.

Here is a comparison of allocated memory against absl::flat_hash_map and ankerl::unordered_dense::map when inserting 10 million entries (chart not reproduced here).

Abseil is fastest for this simple insertion test, taking a bit over 0.8 seconds. Its peak memory usage is about 430 MB. Note how the memory usage goes down after the last peak; when it drops to ~290 MB it has finished rehashing and could free the previously used memory block.

ankerl::unordered_dense::segmented_map doesn't have these peaks, and instead shows a smooth increase in memory usage. Note that there are still sudden drops & increases in memory, because the indexing data structure still needs to grow by a fixed factor. But because the data is held in a separate container, we are able to first free the old indexing structure and then allocate a new, bigger one; thus we do not have peaks.
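
Usage is identical to the non-segmented variants; a minimal sketch:

auto map = ankerl::unordered_dense::segmented_map<uint64_t, std::string>();
map[1] = "one";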

5. Design

The map/set has two data structures:

  • std::vector<value_type> which holds all data. map/set iterators are just std::vector<value_type>::iterator!
  • An indexing structure (bucket array), which is a flat array with 8-byte buckets.

5.1. Inserts

Whenever an element is added it is appended (emplace_back) to the vector. The key is hashed, and an entry (bucket) is added at the corresponding location in the bucket array. The bucket has this structure:

struct Bucket {
    uint32_t dist_and_fingerprint;
    uint32_t value_idx;
};

Each bucket stores 3 things:

  • The distance of that value from the original hashed location (3 most significant bytes in dist_and_fingerprint)
  • A fingerprint; 1 byte of the hash (least significant byte in dist_and_fingerprint)
  • An index where in the vector the actual data is stored.

This structure is especially designed for the collision resolution strategy robin-hood hashing with backward shift deletion.

5.2. Lookups

The key is hashed, and the bucket array is searched for an entry at that location with the same fingerprint. When one is found, the key in the data vector is compared, and when it is equal the value is returned.
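
In code, the probe loop looks roughly like the following simplified sketch. It is not the library's actual implementation; the helper signature and the power-of-two bucket count are assumptions made for brevity.

#include <cstdint>
#include <utility>
#include <vector>

struct Bucket {
    uint32_t dist_and_fingerprint; // distance in the upper 3 bytes, fingerprint in the lowest byte
    uint32_t value_idx;            // index into the value vector
};

// Returns the index of the value, or -1 when the key is not present (sketch only).
inline int find_index(std::vector<Bucket> const& buckets,
                      std::vector<std::pair<uint64_t, int>> const& values,
                      uint64_t key,
                      uint64_t hash) {
    auto const mask = buckets.size() - 1;                     // assumes a power-of-two bucket count
    auto bucket_idx = static_cast<size_t>(hash & mask);
    auto dist_and_fingerprint = (uint32_t{1} << 8U)           // start at distance 1 ...
                              | static_cast<uint32_t>(hash & 0xFFU); // ... with the hash's lowest byte
    while (true) {
        auto const& bucket = buckets[bucket_idx];
        if (bucket.dist_and_fingerprint == dist_and_fingerprint
            && values[bucket.value_idx].first == key) {
            return static_cast<int>(bucket.value_idx);        // fingerprint and key match
        }
        if (bucket.dist_and_fingerprint < dist_and_fingerprint) {
            return -1;                                        // empty bucket (0) or a closer-to-home element: key absent
        }
        dist_and_fingerprint += uint32_t{1} << 8U;            // one step further from the home bucket
        bucket_idx = (bucket_idx + 1) & mask;                 // wrap around
    }
}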

5.3. Removals

Since all data is stored in a vector, removals are a bit more complicated (a sketch of the swap-remove step follows the list):

  1. First, lookup the element to delete in the index array.
  2. When found, replace that element in the vector with the last element in the vector.
  3. Update two locations in the bucket array: first, remove the bucket for the removed element.
  4. Then, update the value_idx of the moved element. This requires another lookup.
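
A simplified sketch of the swap-remove step on the value vector (bucket updates are omitted; the function is illustrative, not part of the library):

#include <cstddef>
#include <utility>
#include <vector>

// Move the last element onto the hole left by the erased element, then shrink.
template <typename T>
void swap_remove(std::vector<T>& values, std::size_t value_idx) {
    if (value_idx != values.size() - 1) {
        values[value_idx] = std::move(values.back());
    }
    values.pop_back();
}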

6. Real World Usage

On 2023-09-10 I did a quick search on GitHub to see if this map is used in any popular open source projects. Here are some of the projects I found. Please send me a note if you want to be on that list!

  • PrusaSlicer - G-code generator for 3D printers (RepRap, Makerbot, Ultimaker etc.)
  • Kismet - A sniffer, WIDS, and wardriving tool for Wi-Fi, Bluetooth, Zigbee, RF, and more, which runs on Linux and macOS
  • Rspamd - Fast, free and open-source spam filtering system.
  • kallisto - Near-optimal RNA-Seq quantification
  • Slang - Slang is a shading language that makes it easier to build and maintain large shader codebases in a modular and extensible fashion.
  • CyberFSR2 - Drop-in DLSS replacement with FSR 2.0 for various games such as Cyberpunk 2077.
  • ossia score - A free, open-source, cross-platform intermedia sequencer for precise and flexible scripting of interactive scenarios.
  • HiveWE - A Warcraft III World Editor (WE) that focuses on speed and ease of use.
  • opentxs - The Open-Transactions project is a collaborative effort to develop a robust, commercial-grade, fully-featured, free-software toolkit implementing the OTX protocol as well as a full-strength financial cryptography library, API, GUI, command-line interface, and prototype notary server.
  • LuisaCompute - High-Performance Rendering Framework on Stream Architectures
  • Lethe - Lethe (pronounced /ˈliːθiː/) is open-source computational fluid dynamics (CFD) software which uses high-order continuous Galerkin formulations to solve the incompressible Navier–Stokes equations (among others).
  • PECOS - PECOS is a versatile and modular machine learning (ML) framework for fast learning and inference on problems with large output spaces, such as extreme multi-label ranking (XMR) and large-scale retrieval.
  • Operon - A modern C++ framework for symbolic regression that uses genetic programming to explore a hypothesis space of possible mathematical expressions in order to find the best-fitting model for a given regression target.
  • MashMap - A fast approximate aligner for long DNA sequences
  • minigpt4.cpp - Port of MiniGPT4 in C++ (4bit, 5bit, 6bit, 8bit, 16bit CPU inference with GGML)

unordered_dense's People

Contributors

chriselrod, jcelerier, justusranvier, martinus, phprus, sb362, vowstar


unordered_dense's Issues

API extensions supporting insert/find/erase/etc. when a hash is already available

First, thanks for the library! A key feature I've grown accustomed to (working with my own containers, as well as many specialized containers coming from game engine development) is the ability to invoke various API methods when a hash is already available. This sort of scenario is quite common, imagine you have a string identifier of some sort, and a number of structures keyed to the same identifier. Ideally, you could compute the hash once, and then reuse the result (or even cache it where appropriate) to avoid needing to re-hash the string on every operation.

When I implemented this in the past, usually it involved some light refactoring to split operations into a version accepting a hash, and a version that didn't (which would compute a hash and then forward the call to the hash-accepting version). The savings are, of course, dependent on your hash function, and the size of the input.

This API has a footgun that users need to be aware of, which is that the hash function used must match the hash function used by the container, for obvious reasons. For this reason, I recommend providing a helper function on the map/set structures to compute the hash for a given input. This takes any guesswork out of the usage, and reduces the risk of API misuse.

Cheers!
Jeremy

add constexpr to hash


ankerl::unordered_dense requires C++17 or higher (readme)

When not compiling with C++17, this happens:

unordered_dense.h:74:6: error: #error ankerl::unordered_dense requires C++17 or higher
 #    error ankerl::unordered_dense requires C++17 or higher

Maybe you want to indicate the requirement for C++17 in the readme? At least, I was a bit surprised about it.

[BUG] overloading operator== does not work for unordered_dense::detail::table

Describe the bug
(I am a new fish in C++. If there are some errors, please forgive me :)
When I tried to overload operator==() for detail::table as shown below, I found that it doesn't work for me; the custom function is not invoked.

template<typename _K, typename _V, typename _Hash, typename _Pred, typename _Allocator, typename _Bucket>auto operator==(ankerl::unordered_dense::detail::table<_K, _V, _Hash, _Pred, _Allocator, _Bucket> const& a, ankerl::unordered_dense::detail::table<_K, _V, _Hash, _Pred, _Allocator, _Bucket> const& b) -> bool {
    cout << "test" << endl;
    return true;
}

I tried to understand why, so I referred to std::map and std::unordered_map. I found that their operator== is defined as a template inside the class. Compared with that, the one for unordered_dense::detail::table is not a template function.

// unordered_map
template<typename _Key1, typename _Tp1, typename _Hash1, typename _Pred1, typename _Alloc1>
friend bool
operator==(const unordered_map<_Key1, _Tp1, _Hash1, _Pred1, _Alloc1>&,
           const unordered_map<_Key1, _Tp1, _Hash1, _Pred1, _Alloc1>&);

// unordered_dense::detail::table
friend auto operator==(table const& a, table const& b) -> bool;

My questions are:

  1. I wonder if it is not possible for me to overload the operator== for ankerl::unordered_dense::detail::table?
  2. If Q1 is yes, will it be updated to the template like std in the future?

To Reproduce

#include "ankerl/unordered_dense.h"
#include <iostream>
using namespace std;

template<typename _K, typename _V, typename _Hash, typename _Pred, typename _Allocator, typename _Bucket>auto operator==(ankerl::unordered_dense::detail::table<_K, _V, _Hash, _Pred, _Allocator, _Bucket> const& a, ankerl::unordered_dense::detail::table<_K, _V, _Hash, _Pred, _Allocator, _Bucket> const& b) -> bool {
    cout << "test" << endl;
    return true;
}

int main(){

    ankerl::unordered_dense::map<int, int*> a;
    ankerl::unordered_dense::map<int, int*> b;
   
    if (a == b){ cout << "true" << endl; } // in stdout, the "test" will not be output, so the default operator== was invoked, not the custom one.
    return 0;
}

Expected behavior

System:

  • OS: Linux
  • Compiler: g++
  • Version: c++11

New Benchmark + Docs

I'm eagerly following your new lib. I hope you will explain more soon how it improves on all your previous hash_map libs; I'm not clear about this yet.

Also I hope you will make a new benchmark round again like in 2019, this was awesome and gave so much insight in all the libs out there.

Thanks :)

[BUG] erase_if returns an incorrect value

Describe the bug
erase_if returns current_size - old_size which is a negative value that is cast to an unsigned size_t on return.

return map.size() - old_size;

To Reproduce
Steps to reproduce the behavior:

  1. Call erase_if

Expected behavior
erase_if should return old_size - current_size, not current_size - old_size

System (please complete the following information):

  • OS: Windows
  • Compiler: MSVC
  • Version 19.34.31937

Additional context
None.

Hash examples

Please add an example for using a custom hash. And an example for specializing a class for the hash used by unordered_dense.

Thank you for your work on this!

Better documentation

Installation, maybe benchmark results, ...

  • bucket types
  • additional extract() and values() API

Slightly speed up find

Don't special case when empty: initially point to a static empty bucket instead. Make sure we never accidentally free or write to that memory.

implement a version with stable iterators

robin_hood map has an option for stable iterators; that's currently not possible with unordered_dense.

One way to implement this might be to allow replacing the underlying container with some kind of slotmap.

Slot map design

  • a member for size

  • one index array (same capacity as data array)

  • one data array with stable references on insert (e.g. std::deque)

  • index is split into two parts: first part until size(), contains indices into the data with actual content. second part after size contains unused indices

  • data is not densely stored, there are empty unconstructed cells in it.

Insert: O(1) amortized.

Take the index after size, and construct the element at that location in data. Increase size by 1.
When capacity is reached, we know data is completely full. Push back a new index entry with value size and construct there.

Erase at location x: O(1)

  1. Lookup in index[x] and destroy that element.
  2. std::swap(index[x], index.back())
  3. decrease size by one. Now we have one more entry in our free section (past the size)

Iterate

Needs a custom iterator that goes over the index and dereferences from there.

Open questions

  • How to support both slotmap and std::vector?
  • I probably can't use deque because I have to destroy the elements. I could use std::optional<Data> but that is huge unnecessary overhead.

[BUG] Nits to better match std::unordered_set / _map

Context: My employer's code base is 35 years old. Use of C++ goes back nearly 20 years. We do make efforts to try to modernize it. As such, I have been trying to replace a home-grown robin hood unordered_set and _map with unordered_dense.

We have both a large code base and fairly extensive unit tests. Here are some issues I have encountered and changes I made to move the exercise forward:

  • Change multiple instances of 'std::pair<Key, T>' to 'std::pair<Key const, T>'.

  • Add missing overload
    auto end() const noexcept -> const_iterator { return m_values.end(); }

  • Add six missing reverse_iterator methods

Frozen hashes/sets

Is your feature request related to a problem? Please describe.
It is more a discussion than a real issue or feature request. I'm looking for quite an unusual thing - the ability to freeze a map or a set in some shared memory piece, so other processes can access this data structure from shared memory concurrently but with no ability to change the data. So I'm looking for any hints about the potential design in this direction. I'd also be fine if that does not come to the upstream as a rarely needed feature, but in my case it can significantly reduce memory footprint, especially in highly loaded systems. And since no modifications are possible, there is no need for locking or other consistency support - processes can just read data from the shared memory.

Describe the solution you'd like
I would like to have an interface that stores data in given shared memory, with the ability to restore the data afterwards. The simplest approach is to copy the underlying vector into shared memory, create a structure directly in shared memory with placement new, and set the storage pointer to the previously allocated/copied vector. This is quite a low-level approach, so I'm wondering if there is something better.

Describe alternatives you've considered
In my case, the alternative is to have a separate hash table for each process which is just a waste of memory.

P.S. please feel free to close if this is not relevant to the project.

debug visualizer

You could also mention in the "advantages section" the following practical one:
the content of the map is easily visible under any debugger that has a visualizer for std::vector, without any extra work 🙂

MSVC generated code is bad

std::hash is really slow on Windows, and generates a hell of a lot of imul. Don't fall back to it unless really necessary.

RFC: Allow more than 2^32 entries

Thank you for sharing this hash-map implementation. It's a joy to read through the code.

Is your feature request related to a problem? Please describe.
Currently, because Bucket.value_idx is a uint32_t, the map/set can store at most 2^32 entries. That is quite tight. Many applications will hit that limit. In Java, where collections have a similar limitation, I run into it regularly.

Describe the solution you'd like
The readme says that only 1 byte + 3 bits of Bucket.dist_and_fingerprint are payload. If the remaining 21 bits were used to extend Bucket.value_idx, many more entries could be stored (up to 2^(32+21) ~ 64 PB of uint64_t). I would suggest something like:

struct __attribute__((__packed__)) Bucket {
    uint8_t dist : 3;
    uint8_t fingerprint : 8;
    uint64_t value_idx : 53;
};

Describe alternatives you've considered
Instead of non-standard attribute packing, standard bit masks and shifts can be used.

reserve() function doesn't accelerate

Dear author,

Thanks so much for your library! It really helps!

Sorry for the inconvenience I might cause. I was expecting that calling reserve() to reserve space for the hash map would reduce collisions (the hypothesis being: given the approximate length, the hash allocation can be optimized) and thus reduce the time cost. But in an experiment with ankerl::unordered_dense::map<int, float>, it didn't show any sign of acceleration; the two versions with and without reserve() run at nearly the same speed. Could you please help explain the rationale behind this?

Best regards,
Bob

Compilation error when min/max macros are defined

Is your feature request related to a problem? Please describe.
Some headers (notably Windows.h) define the macros min and max, which causes compilation errors if you use functions like std::min() or std::numeric_limits<int>::max(). Trying to compile

#include <Windows.h>
#include "unordered_dense.h"

int main() {
  ankerl::unordered_dense::map<int, int> abc;
  return 0;
}

results in several compiler errors (screenshot not reproduced here).

Describe the solution you'd like
Should be able to compile even if min/max are defined as macros. You can get around this by #undefing min/max before including unordered_dense.h, but I think a much better solution is to wrap any calls to functions named min/max in parentheses. This is what they do in Abseil: https://github.com/abseil/abseil-cpp/blob/1ae9b71c474628d60eb251a3f62967fe64151bb2/absl/random/internal/pool_urbg.h#L96
For example:

[[nodiscard]] static constexpr auto calc_num_buckets(uint8_t shifts) -> size_t {
    return (std::min)(max_bucket_count(), size_t{1} << (64U - shifts));
}

will successfully compile even if min/max are defined.

[BUG] undefined behavior in pmr container destructor

I've been tracking down some UBSan warnings in my project, and they seem to originate from ankerl::unordered_dense::v2_0_2::detail::table::~table().

Every time that destructor runs for an ankerl::unordered_dense::pmr::map or ankerl::unordered_dense::pmr::set I get the following UBsan output:

/usr/lib/gcc/powerpc64le-unknown-linux-gnu/12/include/g++-v12/memory_resource:489:24: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/lib/gcc/powerpc64le-unknown-linux-gnu/12/include/g++-v12/memory_resource:186:22: note: nonnull attribute specified here

I don't have a minimal test case at the moment, but the ubsan errors appear to be triggered 100% of the time when an ankerl::unordered_dense::pmr container destructs, and the errors cease when those containers are replaced with their std::pmr equivalents or with non-pmr unordered_dense containers.

Better support for erase() with range-based iteration

This is a follow-up to the porting exercise described in issue #42.

An issue I have encountered is that unordered_dense does not handle erase within range-based iteration:

for (auto i : myUnorderedDense) {
    if (p(i)) {
        myUnorderedDense.erase(i);
    }
}

I do not have the standard's wording but cppreference.com states:

References and iterators to the erased elements are invalidated. Other references and iterators are not affected.

I acknowledge that such broad invariants are impossible to satisfy. Still, by exploiting the fact that these containers are unordered, the above code could be made to work. As far as I can see, there is neither requirement, nor expectation, that iteration will occur in forward order over the values vector.

So, if instead table::begin() returned m_values.rbegin() and table::end() returned m_values.rend() then range-based iteration would work as expected.

some optimization

Regarding this part of the readme:

"A fingerprint; 1 byte of the hash (lowest significant byte in dist_and_fingerprint)"

Why not use the main (32 - 3) bits as hash code in dist_and_fingerprint, which can reduce the collision rate (improving find hits)? I do this in my emhash8.

Implement segmented_vector as replacement for std::vector

Is your feature request related to a problem? Please describe.
vector resizes have high memory peaks, and lots of elements need to be moved around.

Describe the solution you'd like
Implement a "segmented vector": the data structure is similar to std::deque, but a bit better optimized for random access:

  • Index array could be just an std::vector
  • Each segment size is 2^x, so indexing is just data[idx >> x][idx & ((1 << x) - 1)], which is very fast.
  • Initially, until the size of a segment is reached, it could behave the same as std::vector. Then it's possible to set a relatively large segment size (e.g. 65k), and it would still be small when only a few elements are in there. Best of both worlds.

Describe alternatives you've considered
Always double the size of the segment, and use bitcount to find out which segment to use. This is more difficult and slower though.

Additional context
Would be nice to use something like that for e.g. bitcoin.

Support heterogeneous C++26 overloads

Would you be interested in updating unordered_dense to support heterogeneous overloads outlined in the paper: cplusplus/papers#1037 (which seems like it is a go for C++26)

It can be helpful when trying to do a find on an object that the map doesn't directly support but that can be looked up, while still requiring an insert of the real type if it doesn't exist.


Contrived example

Today:

ankerl::unordered_dense::map<std::string, int, StringHash, std::equal_to<>> map;
auto it = map.find(std::string_view{"hello world"});
if (it == map.end())
{
  auto [ new_it, success ] = map.emplace(std::string{"hello world"}, 42);
  it = new_it;
}
// do something with it

Future:

ankerl::unordered_dense::map<std::string, int, StringHash, std::equal_to<>> map;
auto [ it, success ] = map.try_emplace(std::string_view{"hello world"}, 42);
// do something with it

Crash when using move-only values

Hello,

I would first like to say thank you for your projects!

I have recently tried to switch from robin-hood hash to ankerl::unordered_dense and the transition was quite smooth with the exception of the move-only classes. So I have a move-only class defined here: https://github.com/rspamd/rspamd/blob/master/src/libserver/redis_pool.cxx#L92 which is pushed into ankerl::unordered_dense::map here: https://github.com/rspamd/rspamd/blob/master/src/libserver/redis_pool.cxx#L499
It worked perfectly with robin-hood hash map and it works fine with std::unordered_map but it causes use-after-free crashes with ankerl::unordered_dense::map:

    #0 0x10511a697 in std::__1::list<std::__1::unique_ptr<rspamd::redis_pool_connection, std::__1::default_delete<rspamd::redis_pool_connection> >, std::__1::allocator<std::__1::unique_ptr<rspamd::redis_pool_connection, std::__1::default_delete<rspamd::redis_pool_connection> > > >::erase(std::__1::__list_const_iterator<std::__1::unique_ptr<rspamd::redis_pool_connection, std::__1::default_delete<rspamd::redis_pool_connection> >, void*>) list:1807
    #1 0x10510f0d5 in rspamd::redis_pool_elt::release_connection(rspamd::redis_pool_connection const*) redis_pool.cxx:123

0x610000060060 is located 32 bytes inside of 184-byte region [0x610000060040,0x6100000600f8)
freed by thread T0 here:
    #4 0x10511aa74 in std::__1::__libcpp_deallocate(void*, unsigned long, unsigned long) new:340
    #5 0x105123c1c in std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >::deallocate(std::__1::pair<unsigned long long, rspamd::redis_pool_elt>*, unsigned long) memory:1816
    #6 0x1051230d4 in std::__1::allocator_traits<std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> > >::deallocate(std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >&, std::__1::pair<unsigned long long, rspamd::redis_pool_elt>*, unsigned long) memory:1554
    #7 0x10512f6b5 in std::__1::__split_buffer<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >&>::~__split_buffer() __split_buffer:343
    #8 0x10512d7d4 in std::__1::__split_buffer<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >&>::~__split_buffer() __split_buffer:340
    #9 0x10512b7fe in void std::__1::vector<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> > >::__emplace_back_slow_path<std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long long const&>, std::__1::tuple<rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&> >(std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long long const&>&&, std::__1::tuple<rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&>&&) vector:1672
    #10 0x105128e01 in std::__1::pair<unsigned long long, rspamd::redis_pool_elt>& std::__1::vector<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> > >::emplace_back<std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long long const&>, std::__1::tuple<rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&> >(std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long long const&>&&, std::__1::tuple<rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&>&&) vector:1694
    #11 0x1051283f5 in std::__1::pair<std::__1::__wrap_iter<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>*>, bool> ankerl::unordered_dense::detail::table<unsigned long long, rspamd::redis_pool_elt, ankerl::unordered_dense::hash<unsigned long long, void>, std::__1::equal_to<unsigned long long>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> > >::do_try_emplace<unsigned long long const&, rspamd::redis_pool*, char const*&, char const*&, char const*&, int&>(unsigned long long const&, rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&) unordered_dense.h:583

previously allocated by thread T0 here:
    #0 0x109f0efdd in wrap__Znwm (libclang_rt.asan_osx_dynamic.dylib:x86_64h+0x51fdd)
    #1 0x10511d40c in std::__1::__libcpp_allocate(unsigned long, unsigned long) new:253
    #2 0x10512e1c4 in std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >::allocate(unsigned long, void const*) memory:1813
    #3 0x10512e020 in std::__1::allocator_traits<std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> > >::allocate(std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >&, unsigned long) memory:1546
    #4 0x10512ddcf in std::__1::__split_buffer<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >&) __split_buffer:311
    #5 0x10512d5ec in std::__1::__split_buffer<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> >&) __split_buffer:310
    #6 0x10512b702 in void std::__1::vector<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> > >::__emplace_back_slow_path<std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long long const&>, std::__1::tuple<rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&> >(std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long long const&>&&, std::__1::tuple<rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&>&&) vector:1667
    #7 0x105128e01 in std::__1::pair<unsigned long long, rspamd::redis_pool_elt>& std::__1::vector<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> > >::emplace_back<std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long long const&>, std::__1::tuple<rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&> >(std::__1::piecewise_construct_t const&, std::__1::tuple<unsigned long long const&>&&, std::__1::tuple<rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&>&&) vector:1694
    #8 0x1051283f5 in std::__1::pair<std::__1::__wrap_iter<std::__1::pair<unsigned long long, rspamd::redis_pool_elt>*>, bool> ankerl::unordered_dense::detail::table<unsigned long long, rspamd::redis_pool_elt, ankerl::unordered_dense::hash<unsigned long long, void>, std::__1::equal_to<unsigned long long>, std::__1::allocator<std::__1::pair<unsigned long long, rspamd::redis_pool_elt> > >::do_try_emplace<unsigned long long const&, rspamd::redis_pool*, char const*&, char const*&, char const*&, int&>(unsigned long long const&, rspamd::redis_pool*&&, char const*&, char const*&, char const*&, int&) unordered_dense.h:583

So it seems that try_emplace behaves incorrectly with move-only objects. I can try to provide an independent reproducible case if you see no apparent reason for such behaviour. Ad-hoc replacement with either robin-hood or std::unordered_map immediately resolves the issue, just in case.

can't compile with T = std::deque<std::unique_ptr<T>>

Describe the bug
can't compile with T = std::deque<std::unique_ptr<uint32_t>>

To Reproduce

#include "https://raw.githubusercontent.com/martinus/unordered_dense/main/include/ankerl/unordered_dense.h"

#include <deque>
#include <memory>
#include <unordered_map>

using value_t = std::deque<std::unique_ptr<uint32_t>>;
ankerl::unordered_dense::map<uint32_t, value_t> test_map; //compiler error
std::unordered_map<uint32_t, value_t> test_map_std; //succ

Add support for `merge`?

Thanks for making this library!

I am hoping to use unordered_dense::map as a drop-in replacement for std::unordered_map in some performance-critical sections of my code. However, I currently rely on the merge method, which is not yet implemented. It also appears to be the only method missing in the standard API. Would be great if you could include it as well. Thanks in advance!

Question regarding collisions

How does this map handle collisions when the "distance" would exceed the max of 3 bytes? Resize? And what happens when there is poor hashing that causes a very high collision rate? Your previous map would blow up, so I'm wondering whether this implementation addresses that shortcoming.

[BUG] Segmentation fault after move assign of map with different allocator

A move assignment of map_A, which uses allocator A, to map_B, which uses allocator B, terminates the program by SIGSEGV.

Here is simple code to reproduce it on Compiler Explorer. (Exits with double free or corruption)
https://gcc.godbolt.org/z/hGh8v1xfM

This is probably because the move assignment (or move constructor) replaces m_buckets with the m_buckets of the "other".

m_values = std::move(other.m_values);
m_buckets = std::exchange(other.m_buckets, nullptr);

The default behavior of std::vector move assignment is not to propagate the allocator. So, if the allocator of m_values is different from the one that allocated other.m_buckets, it appears that subsequent allocate() and deallocate() calls for m_buckets will not work correctly.

[BUG] reserve does not call m_values.reserve(...)

Describe the bug
I'd expect a call to map / set :: reserve to preallocate the memory of the m_values vector.
Right now it does not happen - if I add a blanket m_values.reserve(capa) in there, I immediately get a few % improvements - for instance, for an input of ~100k strings (every filename in /usr/include), I get

insert: 5148us  (inserting the 100k elements)
find: 4740us (finding every individual element in a random order)

while if I add the call to m_values.reserve in map::reserve, I get:

insert: 4284us 
find: 4288us 

Support for rbegin / rend

Since the containers support random access given an adequate underlying container, it would be nice for them to provide the typedefs

 reverse_iterator
 const_reverse_iterator

and the families of functions rbegin() / rend() / crbegin() / crend().

Define mapped_type only for ankerl::unordered_dense::map?

Is your feature request related to a problem? Please describe.
I'm not sure what the standard says about the mapped_type definition or how it's used, but the zpp_bits serialization library uses it to detect the slight variation between the two containers. Intuitively, to me it makes sense that void as a value for a key-value structure would mean that you're really dealing with a set. Looking at cppreference, though, it appears that mapped_type might be specific to maps in the standard.

Currently, with the set and map being the same template, mapped_type gets defined for both, and in zpp_bits this results in that library attempting to create a reference to void in order to iterate over key-value pairs, which is a compiler error.

Describe the solution you'd like
If mapped_type is specific to maps in the standard, use detail::table as a backing implementation where mapped_type is not defined. Then wrap that container for set and map, where map has mapped_type defined.

About "no const Key"

I am aware of this design: "no const Key in std::pair<Key, Value>". I did not care until today.
For a map, this is fine; I mean, wanting to change the key is not common. But for a set, the situation is more bug-prone.

For example, this does not compile and this is good:

std::unordered_set<int> set({ 1, 2, 3 });
for (auto& x : set)
	x++;  // error: can't change x

While the following compiles fine, so there is a risk of introducing a subtle bug in a common for-loop scenario:

ankerl::unordered_dense::set<int> set({ 1, 2, 3 });
for (auto& x : set)
	x++;

Hence I am wondering: what is the reason for not being able to have a const key? Is there no way to change that, at least for a set?
