Introduce bumpalo for arena-based allocation #124
cijiugechu wants to merge 8 commits into ridiculousfish:master
ridiculousfish left a comment
Thank you for this work - though it failed to compile even when running directly on commit a17ffa4910ce27a57c56022405837cc4c40476f2 (I've since tweaked the GitHub settings so you can run our pipelines and tests in CI).
Also I think introducing bumpalo is premature. bumpalo is by necessity an invasive change and programming against arenas is unfamiliar to a lot of Rust programmers.
So far, no work has been put into optimizing the parser and optimizer, and I think there's a lot of low-hanging fruit there. My suggested approach is to introduce the benchmarking suite you propose, use normal profiling techniques to optimize the parse and optimization phases, and reach for bumpalo and similar tools only if and when we've exhausted what ordinary optimizations can give us.
Thank you for the feedback and for taking the time to review my PR. I also appreciate you tweaking the GitHub settings so I can run the CI pipeline; that will be very helpful. My initial intention was to contain the use of |
Summary
This pull request introduces bumpalo as an arena allocator for the regex compilation process. By allocating intermediate data structures within a single memory arena, this change reduces the number of calls to the global allocator and improves compilation performance.
Motivation
The current regex compiler performs numerous small, individual allocations for nodes, character sets, and other temporary objects. Each allocation incurs overhead from the system's global allocator, which can become a performance bottleneck, especially when compiling complex regular expressions or a high volume of expressions in a tight loop.
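To make the allocation pattern concrete, here is a minimal sketch of the arena idea using only the standard library. It is not bumpalo's API and not this project's code: `Node`, `NodeId`, `Arena`, `build_example`, and `render` are hypothetical names invented for illustration. All nodes live in one `Vec` and refer to each other by index, so building a tree takes a single up-front reservation rather than one heap allocation per node, and the whole tree is freed at once when the arena is dropped.

```rust
/// Index of a node inside the arena (hypothetical; stands in for a pointer).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NodeId(usize);

/// Hypothetical regex AST node; children are indices, not `Box`es.
#[derive(Debug)]
enum Node {
    Char(char),
    Cat(NodeId, NodeId),
    Alt(NodeId, NodeId),
    Star(NodeId),
}

/// A tiny typed arena: every node is stored in one contiguous `Vec`.
struct Arena {
    nodes: Vec<Node>,
}

impl Arena {
    /// Reserve space up front so pushes below the capacity do not reallocate.
    fn with_capacity(cap: usize) -> Self {
        Arena { nodes: Vec::with_capacity(cap) }
    }

    /// Store a node and hand back its index.
    fn alloc(&mut self, node: Node) -> NodeId {
        let id = NodeId(self.nodes.len());
        self.nodes.push(node);
        id
    }

    fn get(&self, id: NodeId) -> &Node {
        &self.nodes[id.0]
    }

    fn len(&self) -> usize {
        self.nodes.len()
    }
}

/// Build the tree for `(a|b)*c` entirely inside the arena.
fn build_example(arena: &mut Arena) -> NodeId {
    let a = arena.alloc(Node::Char('a'));
    let b = arena.alloc(Node::Char('b'));
    let alt = arena.alloc(Node::Alt(a, b));
    let star = arena.alloc(Node::Star(alt));
    let c = arena.alloc(Node::Char('c'));
    arena.alloc(Node::Cat(star, c))
}

/// Walk the tree by index and write it back out as a pattern string.
fn render(arena: &Arena, id: NodeId, out: &mut String) {
    match arena.get(id) {
        Node::Char(c) => out.push(*c),
        Node::Cat(l, r) => {
            render(arena, *l, out);
            render(arena, *r, out);
        }
        Node::Alt(l, r) => {
            out.push('(');
            render(arena, *l, out);
            out.push('|');
            render(arena, *r, out);
            out.push(')');
        }
        Node::Star(inner) => {
            render(arena, *inner, out);
            out.push('*');
        }
    }
}

fn main() {
    let mut arena = Arena::with_capacity(16);
    let root = build_example(&mut arena);
    let mut pattern = String::new();
    render(&arena, root, &mut pattern);
    // Six nodes were created inside the arena's single backing buffer.
    println!("{} nodes: {}", arena.len(), pattern);
}
```

A bump allocator such as bumpalo generalizes this by handing out references to values of mixed types from shared chunks of memory, which is why adopting it tends to thread a lifetime parameter through the data structures it touches.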
Benchmark Results
To validate the performance improvement, I ran benchmarks using criterion, comparing the master branch with the changes in this PR. The benchmarks measure the time taken to compile various regular expressions.
Environment:
Results:
PR:
main: