High-performance, memory-efficient C++ open addressing flat hash table
emhash is a family of high-performance, header-only hash table implementations designed for maximum performance and memory efficiency. Through innovative collision resolution strategies and cache optimization techniques, emhash demonstrates exceptional performance across various benchmarks.
- Core Features
- Performance Overview
- Quick Start
- API Reference
- Design Principles
- Implementation Comparison
- Build and Test
- Usage Notes
- Third-Party Benchmarks
- License
- High load factor support: Set load factor up to 0.999 via
EMH_HIGH_LOADmacro (available inhash_table[5-8].hpp) - No tombstones: Performance does not degrade even with frequent insert/erase operations
- Smart collision resolution: Three-way hybrid strategy combining linear probing, quadratic probing, and bidirectional search
- Cache-friendly design: Single array inline storage minimizes memory footprint and maximizes cache hit rate
- Compact layout: Significant memory savings when
sizeof(key) % 8 != sizeof(value) % 8- Example:
hash_map<uint64_t, uint32_t>saves 1/3 memory compared tohash_map<uint64_t, uint64_t>
- Example:
- Dynamic shrinking:
shrink_to_fit()releases unused memory
| Feature | Description |
|---|---|
insert_unique |
Direct insertion without lookup (performance boost) |
try_get |
Returns pointer to value, nullptr if key not found |
try_set |
Sets value only if key does not exist (emhash5/8) |
set_get |
Updates value and returns old value (emhash5/8) |
_erase |
Delete operation returning void (faster) |
| LRU Mode | Enable LRU cache optimization with EMH_LRU_SET |
- Operating Systems: Windows, Linux, macOS
- Processors: x86_64, ARM64 (Apple M1/M2, AMD, Intel)
- Compilers: GCC, Clang, MSVC (C++11/14/17/20)
Test Environment: AMD 5800H / Windows 10 / GCC 11.3
| Operation | emhash7 | emhash8 | emhash6 | emhash5 | phmap | absl | std::unordered_map |
|---|---|---|---|---|---|---|---|
| Insert (10M) | 66.0 | 95.6 | 63.6 | 64.7 | 59.5 | 53.2 | 295 |
| Find Hit | 16.9 | 18.3 | 15.1 | 16.6 | 36.4 | 22.6 | 33.0 |
| Find Miss | 18.1 | 18.1 | 17.0 | 18.8 | 19.7 | 11.7 | 52.0 |
| Erase | 20.4 | 46.3 | 24.0 | 22.3 | 56.2 | 41.5 | 293 |
| Iterate | 0.74 | 0.00 | 0.75 | 4.75 | 3.46 | 3.49 | 62.6 |
Note: emhash8's iteration time of 0.00 ms indicates extremely fast iteration due to its contiguous memory layout (actual time is < 0.005 ms).
Even at an extreme load factor of 0.999, emhash maintains outstanding performance:
emhash7::HashMap<int64_t, int> myhash(1 << 20, 0.999f);
// Insert 1M+ elements without rehash
// Stable insert/erase performance without degradation// Compile: g++ -O3 -march=native -I.. -I../thirdparty -std=c++17 -DEMH_HIGH_LOAD=123456 highload_bench.cpp
#include "hash_table7.hpp"
static void RunHighLoadFactor()
{
std::random_device rd;
const auto rand_key = rd();
WyRand srngi(rand_key), srnge(rand_key);
const auto max_lf = 0.999f; //<= 0.999f
const auto vsize = 1u << (20 + rand_key % 6); //must be power of 2
emhash7::HashMap<int64_t, int> myhash(vsize, max_lf);
auto nowus = getus();
for (size_t i = 0; i < size_t(vsize * max_lf); i++)
myhash.emplace(srngi(), i);
assert(myhash.bucket_count() == vsize); //no rehash
const auto insert_time = getus() - nowus; nowus = getus();
//erase & insert at a fixed load factor
for (size_t i = 0; i < vsize; i++) {
myhash.erase(srnge()); //erase an old key
myhash[srngi()] = 1; //insert a new key
}
const auto erase_time = getus() - nowus;
printf("vsize = %d, load factor = %.4f, insert/erase = %ld/%ld ms\n",
vsize, myhash.load_factor(), insert_time / 1000, erase_time / 1000);
assert(myhash.load_factor() >= max_lf - 0.001);
}Full benchmark code with multi-version comparison: bench/highload_bench.cpp
Test Environment: AMD 5800H / Windows 10 / GCC 11.3
| hashmap | Insert(ms) | Erase+Insert(ms) | LF |
|---|---|---|---|
| emhash7::HashMap | 44 | 118 | 99.9 |
| emhash6::HashMap | 38 | 202 | 99.9 |
| emhash5::HashMap | 42 | 90 | 99.9 |
| emhash8::HashMap | 49 | 110 | 99.9 |
| hashmap | Fhit(ms) | Fmiss(ms) | Iter(ms) | LF |
|---|---|---|---|---|
| emhash7::HashMap | 18 | 21 | 2 | 99.9 |
| emhash6::HashMap | 15 | 18 | 2 | 99.9 |
| emhash5::HashMap | 17 | 17 | 1 | 99.9 |
Other hash maps (absl, phmap, ska, tsl, robin_hood) cannot run at 0.999 LF — they either cap max_load_factor at ~0.875 or suffer catastrophic clustering.
emhash is header-only — just copy one .hpp file to your project:
# Option A: Copy directly (simplest, no build system needed)
cp hash_table7.hpp /your/project/
# Option B: Clone and include
git clone https://github.com/ktprime/emhash.git
# Then add -I/path/to/emhash to your compiler flags#include "hash_table7.hpp" // Or hash_table[5-8].hpp#include "hash_table7.hpp"
#include <iostream>
int main() {
// Create and reserve space
emhash7::HashMap<int, std::string> map;
map.reserve(100);
// Insert elements (multiple ways)
map[1] = "one";
map.emplace(2, "two");
map.insert_unique(3, "three"); // Best performance, key must be unique
// Find element
auto it = map.find(2);
if (it != map.end()) {
std::cout << "Found: " << it->second << "\n";
}
// try_get: returns pointer, more concise
if (auto* pval = map.try_get(3)) {
std::cout << "Value: " << *pval << "\n";
}
// Iterate
for (const auto& [key, value] : map) {
std::cout << key << " => " << value << "\n";
}
return 0;
}// Custom key type
struct Point {
int x, y;
bool operator==(const Point& other) const {
return x == other.x && y == other.y;
}
};
struct PointHash {
size_t operator()(const Point& p) const {
return std::hash<int>()(p.x) ^ (std::hash<int>()(p.y) << 1);
}
};
emhash7::HashMap<Point, std::string, PointHash> point_map;
// C++20 using Lambda
#if __cplusplus >= 202002L
auto hash = [](const Point& p) { return std::hash<int>()(p.x + p.y); };
auto eq = [](const Point& a, const Point& b) { return a.x == b.x && a.y == b.y; };
emhash7::HashMap<Point, int, decltype(hash), decltype(eq)> map(0, hash, eq);
#endifHashMap(); // Default constructor
HashMap(size_t bucket_count); // Specify initial bucket count
HashMap(size_t bucket_count, float max_load_factor); // Specify load factor
HashMap(const HashMap& other); // Copy constructor
HashMap(HashMap&& other) noexcept; // Move constructor
HashMap(std::initializer_list<value_type> il);// Initializer list| Method | Description |
|---|---|
size() / empty() |
Element count / is empty |
bucket_count() |
Current bucket count |
load_factor() / max_load_factor() |
Current/max load factor |
reserve(n) |
Reserve space for at least n elements |
shrink_to_fit() |
Shrink to fit current size |
| Method | Description |
|---|---|
operator[key] |
Insert or access element |
at(key) |
Access element, undefined behavior if not found |
find(key) |
Find element, returns iterator |
try_get(key) |
Find element, returns pointer to value (nullptr on failure) |
contains(key) |
Check if key exists |
count(key) |
Key occurrence count (0 or 1) |
| Method | Description |
|---|---|
emplace(key, val) |
In-place construct insert |
insert({key, val}) |
Insert key-value pair |
insert_unique(key, val) |
Direct insert (no existence check, best performance) |
try_set(key, val) |
Set only if key does not exist (emhash5/8) |
set_get(key, val) |
Set new value, return old value (emhash5/8) |
erase(key) / erase(it) |
Delete element |
_erase(key) |
Delete element, return void (faster) |
clear() |
Clear all elements |
for (auto it = map.begin(); it != map.end(); ++it) {
// it->first: key, it->second: value
}
// C++17 structured binding
for (const auto& [key, value] : map) { }| Version | Strategy | High LF Support |
|---|---|---|
| emhash5 | Three-way hybrid: linear probing → quadratic probing → bidirectional search | With EMH_HIGH_LOAD |
| emhash6 | Linked-bucket with separate bitmask for fast empty-bucket search | With EMH_HIGH_LOAD |
| emhash7 | Linked-bucket with separate bitmask, no tombstones | Native (0.80-0.999) |
| emhash8 | Separate index + dense pairs array, linked-bucket chains | With EMH_HIGH_LOAD |
In
hash_table5.hpp, the optimized three-way linear probing strategy is still 2-3x faster than traditional strategies even at load factors > 0.9.
emhash5/6/7 — Single inline array with embedded bucket linkage:
struct Entry {
Key key; // Key
size_t bucket; // Bucket chain linkage / state info
Value value; // Value
};
// _pairs[bucket] stores key, value, and chain pointer in one struct
// emhash6/7 additionally use a separate _bitmask for fast empty-bucket searchemhash8 — Split index + dense pairs (like std::vector):
struct Index { size_t next, slot; }; // Per-bucket chain linkage
Index* _index; // bucket → slot mapping
value_type* _pairs; // dense, compact key-value array (no metadata)
// _pairs is always packed: _pairs[0].._pairs[_num_filled-1]
// This enables extremely fast iteration (just scan _pairs sequentially)- Primary bucket is always assigned to
key_hash(key) % sizeand cannot be occupied - All operations start searching from the primary bucket
- Avoids circular displacement issues in cuckoo hashing
Enable backup hash function by defining EMH_SAFE_HASH macro to defend against hash attacks (performance cost ~ 10%).
emhash provides 4 different implementations, each with different focus:
| Implementation | Layout | Collision Strategy | Default LF | Best Scenario |
|---|---|---|---|---|
| emhash5 | Inline _pairs[] |
Three-way hybrid probing | 0.80 | Fast lookup/erase with integer keys, SBO support |
| emhash6 | Inline _pairs[] + _bitmask |
Linked-bucket | 0.80 | Lookup/erase with integer keys, fast empty scan |
| emhash7 | Inline _pairs[] + _bitmask |
Linked-bucket, no tombstones | 0.80 | High load factor (0.80-0.999), insert-intensive |
| emhash8 | Separate _index[] + dense _pairs[] |
Linked-bucket | 0.80 | Complex keys/values, extremely fast iteration |
- Complex/large keys/values (e.g.,
std::string, custom structs) → emhash8 - Insert-intensive workloads or need high load factor → emhash7
- Fast search and erase with integer keys → emhash5/6
- Small maps that should avoid heap allocation → emhash5 with
EMH_SMALL_SIZE
emhash also provides HashSet implementations:
#include "hash_set3.hpp" // Based on emhash7, namespace emhash7
#include "hash_set81.hpp" // Based on emhash8, namespace emhash8
emhash7::HashSet<int> set1;
emhash8::HashSet<int> set2;All HashMap versions support custom allocators (emhash5/6/7/8):
emhash7::HashMap<Key, Val, Hash, Eq, MyAllocator> mymap;# Clone repository
git clone https://github.com/ktprime/emhash.git
cd emhash
# Create build directory
mkdir build && cd build
# Configure (optional: disable benchmarks)
cmake .. -DWITH_BENCHMARKS=ON
# Build
cmake --build . --config Release# Run unit tests
./test/emhash_test
# Run benchmarks
./bench/ebench
./bench/mbench| Macro | Description |
|---|---|
EMH_HIGH_LOAD=<value> |
Enable high load factor support. Must be a positive integer (e.g. 123456), not 0. Enables empty-bucket chain (_ehead) for LF up to 0.999 |
EMH_WY_HASH=1 |
Use wyhash algorithm (faster for string keys) |
EMH_SAFE_HASH=1 |
Enable backup hash function (hash attack protection, ~10% cost) |
EMH_LRU_SET=1 |
Enable LRU cache mode |
EMH_STATIS=1 |
Enable collision statistics output |
EMH_FIBONACCI_HASH=1 |
Use Fibonacci hashing |
EMH_IDENTITY_HASH=1 |
Use identity hashing (integer key optimization) |
EMH_SMALL_SIZE=<N> |
Small Buffer Optimization for emhash5: inline storage for ≤N buckets, no heap alloc (emhash5 only, N≥2) |
EMH_PACK_TAIL=<1-100> |
Add extra tail buckets (N% of total) for faster probing near table end (emhash5/8) |
EMH_ITER_SAFE=1 |
Eager iterator initialization, safer if container modified during iteration (emhash6/7) |
EMH_ALIGN64=1 |
Use 64-bit aligned bitmask access for faster empty-bucket scan (emhash6) |
emhash is not thread-safe. Concurrent access from multiple threads requires external synchronization:
// ❌ Wrong: concurrent access without synchronization
// Thread 1: map[key] = val;
// Thread 2: map.find(key);
// ✅ Correct: use mutex for concurrent access
std::mutex mtx;
{
std::lock_guard<std::mutex> lock(mtx);
map[key] = val;
}
// ✅ Correct: concurrent read-only access is safe
// Multiple threads can call find()/contains()/count() simultaneously
// as long as no thread is modifying the mapemhash does not guarantee reference stability during insert/erase/rehash operations.
// ❌ Wrong: reference may become invalid after rehash
auto& ref = myhash[1];
myhash[2] = ref; // Danger! May trigger rehash
// ✅ Correct: copy value first
auto val = myhash[1];
myhash.reserve(10000); // Pre-allocate
myhash[2] = val;Deleting elements during iteration may cause some keys to be visited twice or skipped.
// ❌ Wrong: direct deletion without updating iterator
for (const auto& it : myhash) {
if (condition) myhash.erase(it.first);
}
// ✅ Correct: use iterator erase and update
for (auto it = myhash.begin(); it != myhash.end(); ) {
if (condition) {
it = myhash.erase(it); // Returns next valid iterator
} else {
++it;
}
}For very large value types (e.g., > 100 bytes), consider using pointer storage:
// High memory usage
emhash7::HashMap<Key, LargeValue> map1;
// Optimized memory usage
emhash7::HashMap<Key, std::unique_ptr<LargeValue>> map2;emhash has been validated by multiple well-known third-party benchmarks:
- Martin Ankerl's Hashmap Benchmark - Comprehensive performance testing
- Jackson Allan's C/C++ Hash Tables Benchmark - Multi-dimensional comparison
Download all files in the bench/tsl_bench/ directory and open chartsAll.html in a browser to view interactive performance curves.
This project is open-sourced under the MIT License.
Copyright (c) 2019-2026 Huang Yuanbing & bailuzhou AT 163.com
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
Thanks to the following projects and authors for inspiration and comparison: