Commit f30bfa9

Add frozen support to roaring64 (#688)

* Array-backed ART

* Array-backed r64

* ART serialization

* r64 frozen serialization

* Synthetic benchmarks for r64

* Address review comments

* Add random insert / remove benchmark

* Link free nodes together

This adds the index of the next free node into a newly freed node, or
`capacity` if there are no more free indices. This significantly speeds up
finding the next free index, which is important for add+remove workloads.

Benchmarks

Old:
------------------------------------------------------------------
Benchmark                        Time             CPU   Iterations
------------------------------------------------------------------
r64InsertRemoveRandom/0        127 ns          127 ns      5461079
r64InsertRemoveRandom/1      31633 ns        31604 ns        24028
r64InsertRemoveRandom/2      30782 ns        30769 ns        21859
r64InsertRemoveRandom/3      31985 ns        31969 ns        21558
r64InsertRemoveRandom/4        356 ns          356 ns      1962694
r64InsertRemoveRandom/5      28972 ns        28962 ns        21366
r64InsertRemoveRandom/6      30632 ns        30623 ns        22682
r64InsertRemoveRandom/7        448 ns          448 ns      1601550
r64InsertRemoveRandom/8      32506 ns        32495 ns        21591
r64InsertRemoveRandom/9        689 ns          689 ns      1002237
cppInsertRemoveRandom/0        131 ns          131 ns      5319673
cppInsertRemoveRandom/1      16106 ns        16104 ns        43632
cppInsertRemoveRandom/2       3881 ns         3881 ns       180087
cppInsertRemoveRandom/3       3582 ns         3582 ns       171298
cppInsertRemoveRandom/4        403 ns          402 ns      1666697
cppInsertRemoveRandom/5        993 ns          993 ns       706038
cppInsertRemoveRandom/6       4039 ns         4038 ns       172421
cppInsertRemoveRandom/7        469 ns          469 ns      1440197
cppInsertRemoveRandom/8       1454 ns         1454 ns       633551
cppInsertRemoveRandom/9        654 ns          654 ns      1091588
setInsertRemoveRandom/0       1944 ns         1943 ns       368926
setInsertRemoveRandom/1       1955 ns         1953 ns       404931
setInsertRemoveRandom/2       1911 ns         1910 ns       358466
setInsertRemoveRandom/3       1953 ns         1951 ns       362351
setInsertRemoveRandom/4       2104 ns         2102 ns       321387
setInsertRemoveRandom/5       1944 ns         1943 ns       354836
setInsertRemoveRandom/6       1835 ns         1835 ns       359099
setInsertRemoveRandom/7       1970 ns         1968 ns       372625
setInsertRemoveRandom/8       1894 ns         1892 ns       355456
setInsertRemoveRandom/9       1659 ns         1659 ns       355902

New:
------------------------------------------------------------------
Benchmark                        Time             CPU   Iterations
------------------------------------------------------------------
r64InsertRemoveRandom/0        128 ns          128 ns      5614266
r64InsertRemoveRandom/1        935 ns          935 ns       739679
r64InsertRemoveRandom/2        916 ns          916 ns       739944
r64InsertRemoveRandom/3        936 ns          936 ns       690708
r64InsertRemoveRandom/4        368 ns          368 ns      1957642
r64InsertRemoveRandom/5       1141 ns         1140 ns       592505
r64InsertRemoveRandom/6       1139 ns         1138 ns       657840
r64InsertRemoveRandom/7        481 ns          481 ns      1434967
r64InsertRemoveRandom/8       1447 ns         1446 ns       484463
r64InsertRemoveRandom/9        721 ns          721 ns      1017456
cppInsertRemoveRandom/0        134 ns          134 ns      5524804
cppInsertRemoveRandom/1      15616 ns        15608 ns        47666
cppInsertRemoveRandom/2       3855 ns         3854 ns       180265
cppInsertRemoveRandom/3       3809 ns         3808 ns       183595
cppInsertRemoveRandom/4        412 ns          412 ns      1695708
cppInsertRemoveRandom/5       1012 ns         1011 ns       713501
cppInsertRemoveRandom/6       3410 ns         3409 ns       199214
cppInsertRemoveRandom/7        474 ns          474 ns      1496740
cppInsertRemoveRandom/8       1421 ns         1420 ns       465868
cppInsertRemoveRandom/9        564 ns          564 ns      1148076
setInsertRemoveRandom/0       1956 ns         1956 ns       351283
setInsertRemoveRandom/1       1959 ns         1958 ns       355766
setInsertRemoveRandom/2       1886 ns         1885 ns       357406
setInsertRemoveRandom/3       1905 ns         1904 ns       355235
setInsertRemoveRandom/4       1945 ns         1944 ns       364599
setInsertRemoveRandom/5       1902 ns         1902 ns       350312
setInsertRemoveRandom/6       1907 ns         1906 ns       346962
setInsertRemoveRandom/7       1937 ns         1936 ns       356168
setInsertRemoveRandom/8       1881 ns         1880 ns       341472
setInsertRemoveRandom/9       1962 ns         1961 ns       350643

* Sort free lists in art_shrink_to_fit

This avoids a bug in the following scenario:

    art->leaves = [2,0,x]
    art->first_free[leaf_type] = 1

Where `2` and `0` are pointers to the next free index, and `x` is an occupied
leaf.

In this case, if `art_shrink_to_fit` was called, then we would have the
following result:

    art->leaves = [2,x,0]
    art->first_free[leaf_type] = 0

This is not fully shrunken, and therefore wrong. Sorting the free indices
fixes this scenario.

Before `art_shrink_to_fit`:

    art->leaves = [1,2,x]
    art->first_free[leaf_type] = 0

After `art_shrink_to_fit`:

    art->leaves = [x,2,3]
    art->first_free[leaf_type] = 1

* Minor cleanups to ART and r64 internals

* Replace size_t with uint64_t where applicable

Also replace malloc+memset with calloc.

* Use a generic pointer array for ART nodes

This, combined with a static array of node type sizes, allows us to
generically manipulate the nodes.

* Correct outdated comment

* Always try to shrink containers

* Replace size_t with uint64_t where applicable in r64

* Check if ART is shrunken when checking if r64 is shrunken

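
The "Link free nodes together" item in the commit message can be illustrated with a minimal, self-contained sketch of an index-based free list. This is not the code from the commit: `pool_t`, `slot_t`, and the `pool_*` functions are hypothetical names, and the growth policy is an assumption. It only shows the idea the message describes, where each freed slot stores the index of the next free slot and `capacity` marks the end of the list.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

// Hypothetical slot: while unused, `payload` holds the index of the next
// free slot; while in use, it holds user data.
typedef struct {
    uint64_t payload;
} slot_t;

// Hypothetical pool with a single node type.
typedef struct {
    slot_t *slots;
    uint64_t capacity;
    uint64_t first_free;  // equal to `capacity` when no slot is free
} pool_t;

static void pool_init(pool_t *p) {
    p->slots = NULL;
    p->capacity = 0;
    p->first_free = 0;
}

// Grows the array; only called when the free list is empty. The new slots
// are chained in ascending order and the last one points at the new capacity.
static int pool_grow(pool_t *p) {
    uint64_t new_cap = p->capacity == 0 ? 4 : p->capacity * 2;
    slot_t *s = (slot_t *)realloc(p->slots, new_cap * sizeof(slot_t));
    if (s == NULL) return 0;
    for (uint64_t i = p->capacity; i < new_cap; i++) {
        s[i].payload = i + 1;
    }
    p->slots = s;
    p->first_free = p->capacity;
    p->capacity = new_cap;
    return 1;
}

// Pops the head of the free list in O(1). Returns UINT64_MAX on
// allocation failure.
static uint64_t pool_alloc(pool_t *p) {
    if (p->first_free == p->capacity && !pool_grow(p)) return UINT64_MAX;
    uint64_t idx = p->first_free;
    p->first_free = p->slots[idx].payload;
    return idx;
}

// Pushes a slot onto the head of the free list in O(1).
static void pool_free(pool_t *p, uint64_t idx) {
    assert(idx < p->capacity);
    p->slots[idx].payload = p->first_free;
    p->first_free = idx;
}
```

With this layout, finding the next free index never requires scanning the array, which is consistent with the speedup the benchmarks above report for add+remove workloads. Note that freeing slots in arbitrary order leaves the list unsorted, which is the situation the "Sort free lists in art_shrink_to_fit" item addresses.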
1 parent 34b2271 commit f30bfa9

8 files changed: +2929 -1124 lines changed

include/roaring/art/art.h

Lines changed: 88 additions & 39 deletions
@@ -19,8 +19,8 @@
  * chunks _differ_. This means that if there are two entries with different
  * high 48 bits, then there is only one inner node containing the common key
  * prefix, and two leaves.
- * * Intrusive leaves: the leaf struct is included in user values. This removes
- *   a layer of indirection.
+ * * Mostly pointer-free: nodes are referred to by index rather than pointer,
+ *   so that the structure can be deserialized with a backing buffer.
  */

 // Fixed length of keys in the ART. All keys are assumed to be of this length.
@@ -33,25 +33,33 @@ namespace internal {
 #endif

 typedef uint8_t art_key_chunk_t;
-typedef struct art_node_s art_node_t;
+
+// Internal node reference type. Contains the node typecode in the low 8 bits,
+// and the index in the relevant node array in the high 48 bits. Has a value of
+// CROARING_ART_NULL_REF when pointing to a non-existent node.
+typedef uint64_t art_ref_t;
+
+typedef void art_node_t;

 /**
- * Wrapper to allow an empty tree.
+ * The ART is empty when root is a null ref.
+ *
+ * Each node type has its own dynamic array of node structs, indexed by
+ * art_ref_t. The arrays are expanded as needed, and shrink only when
+ * `shrink_to_fit` is called.
  */
 typedef struct art_s {
-    art_node_t *root;
+    art_ref_t root;
+
+    // Indexed by node typecode, thus 1 larger than they need to be for
+    // convenience. `first_free` indicates the index where the first free node
+    // lives, which may be equal to the capacity.
+    uint64_t first_free[6];
+    uint64_t capacities[6];
+    art_node_t *nodes[6];
 } art_t;

-/**
- * Values inserted into the tree have to be cast-able to art_val_t. This
- * improves performance by reducing indirection.
- *
- * NOTE: Value pointers must be unique! This is because each value struct
- * contains the key corresponding to the value.
- */
-typedef struct art_val_s {
-    art_key_chunk_t key[ART_KEY_BYTES];
-} art_val_t;
+typedef uint64_t art_val_t;

 /**
  * Compares two keys, returns their relative order:
@@ -63,14 +71,21 @@ int art_compare_keys(const art_key_chunk_t key1[],
                      const art_key_chunk_t key2[]);

 /**
- * Inserts the given key and value.
+ * Initializes the ART.
+ */
+void art_init_cleared(art_t *art);
+
+/**
+ * Inserts the given key and value. Returns a pointer to the value inserted,
+ * valid as long as the ART is not modified.
  */
-void art_insert(art_t *art, const art_key_chunk_t *key, art_val_t *val);
+art_val_t *art_insert(art_t *art, const art_key_chunk_t *key, art_val_t val);

 /**
- * Returns the value erased, NULL if not found.
+ * Returns true if a value was erased. Sets `*erased_val` to the value erased,
+ * if any.
  */
-art_val_t *art_erase(art_t *art, const art_key_chunk_t *key);
+bool art_erase(art_t *art, const art_key_chunk_t *key, art_val_t *erased_val);

 /**
  * Returns the value associated with the given key, NULL if not found.
@@ -83,42 +98,39 @@ art_val_t *art_find(const art_t *art, const art_key_chunk_t *key);
 bool art_is_empty(const art_t *art);

 /**
- * Frees the nodes of the ART except the values, which the user is expected to
- * free.
+ * Frees the contents of the ART. Should not be called when using
+ * `art_deserialize_frozen_safe`.
  */
 void art_free(art_t *art);

-/**
- * Returns the size in bytes of the ART. Includes size of pointers to values,
- * but not the values themselves.
- */
-size_t art_size_in_bytes(const art_t *art);
-
 /**
  * Prints the ART using printf, useful for debugging.
  */
 void art_printf(const art_t *art);

 /**
- * Callback for validating the value stored in a leaf.
+ * Callback for validating the value stored in a leaf. `context` is a
+ * user-provided value passed to the callback without modification.
  *
  * Should return true if the value is valid, false otherwise
  * If false is returned, `*reason` should be set to a static string describing
  * the reason for the failure.
  */
-typedef bool (*art_validate_cb_t)(const art_val_t *val, const char **reason);
+typedef bool (*art_validate_cb_t)(const art_val_t val, const char **reason,
+                                  void *context);

 /**
- * Validate the ART tree, ensuring it is internally consistent.
+ * Validate the ART tree, ensuring it is internally consistent. `context` is a
+ * user-provided value passed to the callback without modification.
  */
 bool art_internal_validate(const art_t *art, const char **reason,
-                           art_validate_cb_t validate_cb);
+                           art_validate_cb_t validate_cb, void *context);

 /**
  * ART-internal iterator bookkeeping. Users should treat this as an opaque type.
  */
 typedef struct art_iterator_frame_s {
-    art_node_t *node;
+    art_ref_t ref;
     uint8_t index_in_node;
 } art_iterator_frame_t;

@@ -130,6 +142,8 @@ typedef struct art_iterator_s {
     art_key_chunk_t key[ART_KEY_BYTES];
     art_val_t *value;

+    art_t *art;
+
     uint8_t depth;  // Key depth
     uint8_t frame;  // Node depth

@@ -143,19 +157,19 @@ typedef struct art_iterator_s {
  * depending on `first`. The iterator is not valid if there are no entries in
  * the ART.
  */
-art_iterator_t art_init_iterator(const art_t *art, bool first);
+art_iterator_t art_init_iterator(art_t *art, bool first);

 /**
  * Returns an initialized iterator positioned at a key equal to or greater than
  * the given key, if it exists.
  */
-art_iterator_t art_lower_bound(const art_t *art, const art_key_chunk_t *key);
+art_iterator_t art_lower_bound(art_t *art, const art_key_chunk_t *key);

 /**
  * Returns an initialized iterator positioned at a key greater than the given
  * key, if it exists.
  */
-art_iterator_t art_upper_bound(const art_t *art, const art_key_chunk_t *key);
+art_iterator_t art_upper_bound(art_t *art, const art_key_chunk_t *key);

 /**
  * The following iterator movement functions return true if a new entry was
@@ -174,14 +188,49 @@ bool art_iterator_lower_bound(art_iterator_t *iterator,
 /**
  * Insert the value and positions the iterator at the key.
  */
-void art_iterator_insert(art_t *art, art_iterator_t *iterator,
-                         const art_key_chunk_t *key, art_val_t *val);
+void art_iterator_insert(art_iterator_t *iterator, const art_key_chunk_t *key,
+                         art_val_t val);

 /**
  * Erase the value pointed at by the iterator. Moves the iterator to the next
- * leaf. Returns the value erased or NULL if nothing was erased.
+ * leaf.
+ * Returns true if a value was erased. Sets `*erased_val` to the value erased,
+ * if any.
+ */
+bool art_iterator_erase(art_iterator_t *iterator, art_val_t *erased_val);
+
+/**
+ * Shrinks the internal arrays in the ART to remove any unused elements. Returns
+ * the number of bytes freed.
+ */
+size_t art_shrink_to_fit(art_t *art);
+
+/**
+ * Returns true if the ART has no unused elements.
+ */
+bool art_is_shrunken(const art_t *art);
+
+/**
+ * Returns the serialized size in bytes.
+ * Requires `art_shrink_to_fit` to be called first.
+ */
+size_t art_size_in_bytes(const art_t *art);
+
+/**
+ * Serializes the ART and returns the number of bytes written. Returns 0 on
+ * error. Requires `art_shrink_to_fit` to be called first.
+ */
+size_t art_serialize(const art_t *art, char *buf);
+
+/**
+ * Deserializes the ART from a serialized buffer, reading up to `maxbytes`
+ * bytes. Returns 0 on error. Requires `buf` to be 8 byte aligned.
+ *
+ * An ART deserialized in this way should only be used in a readonly context.The
+ * underlying buffer must not be freed before the ART. `art_free` should not be
+ * called on the ART deserialized in this way.
  */
-art_val_t *art_iterator_erase(art_t *art, art_iterator_t *iterator);
+size_t art_frozen_view(const char *buf, size_t maxbytes, art_t *art);

 #ifdef __cplusplus
 }  // extern "C"
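
The `art_ref_t` comment in the art.h diff above says a reference packs the node typecode into the low 8 bits and the array index into the high 48 bits. A minimal sketch of one encoding that matches that description follows. The `sketch_*` names, the shift of 16, and a null value of 0 are assumptions made for illustration; the diff does not show the actual helpers or the value of `CROARING_ART_NULL_REF`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t art_ref_t;  // same typedef as in the diff

// Assumed for illustration: a null ref of 0 works if valid typecodes start
// at 1, which would fit the "1 larger than they need to be" arrays of size 6.
#define SKETCH_NULL_REF ((art_ref_t)0)
// Assumed: 64 - 48 = 16, so the index occupies the high 48 bits.
#define SKETCH_INDEX_SHIFT 16

// Packs an index and a typecode into one 64-bit reference.
static art_ref_t sketch_make_ref(uint64_t index, uint8_t typecode) {
    return (index << SKETCH_INDEX_SHIFT) | (art_ref_t)typecode;
}

// Extracts the typecode from the low 8 bits.
static uint8_t sketch_ref_typecode(art_ref_t ref) {
    return (uint8_t)(ref & 0xFF);
}

// Extracts the index from the high 48 bits.
static uint64_t sketch_ref_index(art_ref_t ref) {
    return ref >> SKETCH_INDEX_SHIFT;
}

// Resolves a reference to a node address, given one untyped array per
// typecode and a table of per-type node sizes (both hypothetical here).
static void *sketch_deref(void *const nodes[], const size_t node_sizes[],
                          art_ref_t ref) {
    if (ref == SKETCH_NULL_REF) return NULL;
    uint8_t typecode = sketch_ref_typecode(ref);
    return (char *)nodes[typecode] +
           sketch_ref_index(ref) * node_sizes[typecode];
}
```

Because a reference is an index rather than an address, the same values stay meaningful when the node arrays live in a caller-provided buffer, which is the property the "Mostly pointer-free" comment relies on.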

include/roaring/roaring64.h

Lines changed: 54 additions & 1 deletion
@@ -17,7 +17,7 @@ namespace api {
 #endif

 typedef struct roaring64_bitmap_s roaring64_bitmap_t;
-typedef struct roaring64_leaf_s roaring64_leaf_t;
+typedef uint64_t roaring64_leaf_t;
 typedef struct roaring64_iterator_s roaring64_iterator_t;

 /**
@@ -312,6 +312,12 @@ uint64_t roaring64_bitmap_maximum(const roaring64_bitmap_t *r);
  */
 bool roaring64_bitmap_run_optimize(roaring64_bitmap_t *r);

+/**
+ * Shrinks internal arrays to eliminate any unused capacity. Returns the number
+ * of bytes freed.
+ */
+size_t roaring64_bitmap_shrink_to_fit(roaring64_bitmap_t *r);
+
 /**
  * (For advanced users.)
  * Collect statistics about the bitmap
@@ -564,6 +570,53 @@ size_t roaring64_bitmap_portable_deserialize_size(const char *buf,
 roaring64_bitmap_t *roaring64_bitmap_portable_deserialize_safe(const char *buf,
                                                                size_t maxbytes);

+/**
+ * Returns the number of bytes required to serialize this bitmap in a "frozen"
+ * format. This is not compatible with any other serialization formats.
+ *
+ * `roaring64_bitmap_shrink_to_fit()` must be called before this method.
+ */
+size_t roaring64_bitmap_frozen_size_in_bytes(const roaring64_bitmap_t *r);
+
+/**
+ * Serializes the bitmap in a "frozen" format. The given buffer must be at least
+ * `roaring64_bitmap_frozen_size_in_bytes()` in size. Returns the number of
+ * bytes used for serialization.
+ *
+ * `roaring64_bitmap_shrink_to_fit()` must be called before this method.
+ *
+ * The frozen format is optimized for speed of (de)serialization, as well as
+ * allowing the user to create a bitmap based on a memory mapped file, which is
+ * possible because the format mimics the memory layout of the bitmap.
+ *
+ * Because the format mimics the memory layout of the bitmap, the format is not
+ * fixed across releases of Roaring Bitmaps, and may change in future releases.
+ *
+ * This function is endian-sensitive. If you have a big-endian system (e.g., a
+ * mainframe IBM s390x), the data format is going to be big-endian and not
+ * compatible with little-endian systems.
+ */
+size_t roaring64_bitmap_frozen_serialize(const roaring64_bitmap_t *r,
+                                         char *buf);
+
+/**
+ * Creates a readonly bitmap that is a view of the given buffer. The buffer
+ * must be created with `roaring64_bitmap_frozen_serialize()`, and must be
+ * aligned by 64 bytes.
+ *
+ * Returns NULL if deserialization fails.
+ *
+ * The returned bitmap must only be used in a readonly manner. The bitmap must
+ * be freed using `roaring64_bitmap_free()` as normal. The backing buffer must
+ * only be freed after the bitmap.
+ *
+ * This function is endian-sensitive. If you have a big-endian system (e.g., a
+ * mainframe IBM s390x), the data format is going to be big-endian and not
+ * compatible with little-endian systems.
+ */
+roaring64_bitmap_t *roaring64_bitmap_frozen_view(const char *buf,
+                                                 size_t maxbytes);
+
 /**
  * Iterate over the bitmap elements. The function `iterator` is called once for
  * all the values with `ptr` (can be NULL) as the second parameter of each call.
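
The header comment for `roaring64_bitmap_frozen_view()` above requires the buffer to be aligned by 64 bytes. The sketch below shows one way a caller might obtain and check such a buffer using C11 `aligned_alloc`. The helper names `alloc_frozen_buffer` and `is_frozen_aligned` and the macro `FROZEN_ALIGNMENT` are hypothetical and not part of the library; the code does not call CRoaring itself so that it stays self-contained.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

// Alignment stated in the header comment for roaring64_bitmap_frozen_view().
#define FROZEN_ALIGNMENT 64

// Hypothetical helper: returns a 64-byte-aligned buffer of at least `size`
// bytes, or NULL on failure. The size is rounded up because C11
// aligned_alloc expects a multiple of the alignment. Release with free().
static char *alloc_frozen_buffer(size_t size) {
    size_t rounded =
        (size + FROZEN_ALIGNMENT - 1) / FROZEN_ALIGNMENT * FROZEN_ALIGNMENT;
    if (rounded == 0) rounded = FROZEN_ALIGNMENT;
    return (char *)aligned_alloc(FROZEN_ALIGNMENT, rounded);
}

// Hypothetical helper: checks the alignment precondition before a buffer is
// handed to a frozen view.
static int is_frozen_aligned(const void *buf) {
    return ((uintptr_t)buf % FROZEN_ALIGNMENT) == 0;
}
```

Following the comments in the diff, a caller would presumably call `roaring64_bitmap_shrink_to_fit()`, size the buffer with `roaring64_bitmap_frozen_size_in_bytes()`, fill it with `roaring64_bitmap_frozen_serialize()`, and then pass it to `roaring64_bitmap_frozen_view()`, freeing the view with `roaring64_bitmap_free()` before freeing the buffer.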

microbenchmarks/CMakeLists.txt

Lines changed: 4 additions & 0 deletions
@@ -25,3 +25,7 @@ add_executable(bench bench.cpp)
 target_link_libraries(bench PRIVATE roaring)
 target_link_libraries(bench PRIVATE benchmark::benchmark)
 target_compile_definitions(bench PRIVATE BENCHMARK_DATA_DIR="${BENCHMARK_DATA_DIR}")
+
+add_executable(synthetic_bench synthetic_bench.cpp)
+target_link_libraries(synthetic_bench PRIVATE roaring)
+target_link_libraries(synthetic_bench PRIVATE benchmark::benchmark)
