Simplify partial vocab phase and reduce memory #2856

Open
RobinTF wants to merge 7 commits into ad-freiburg:master from RobinTF:simplify-partial-merge

@RobinTF RobinTF commented May 4, 2026

This PR simplifies the partial vocab phase: instead of computing sort keys, which consume a lot of memory, it compares the strings directly on the fly when sorting.
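The trade-off between the two sorting strategies can be sketched as follows. This is only an illustration, not the code from this PR: the function names are hypothetical, and an ASCII case-insensitive comparison stands in for whatever collation the real comparator applies.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for a collation-aware comparison: ASCII
// case-insensitive "less than" that allocates nothing.
inline bool lessIgnoreCase(std::string_view a, std::string_view b) {
  return std::lexicographical_compare(
      a.begin(), a.end(), b.begin(), b.end(),
      [](unsigned char x, unsigned char y) {
        return std::tolower(x) < std::tolower(y);
      });
}

// Variant A: precompute one sort key per word, then sort by key.
// Every word is held twice (key + original) while sorting.
inline std::vector<std::string> sortWithKeys(std::vector<std::string> words) {
  std::vector<std::pair<std::string, std::string>> keyed;
  keyed.reserve(words.size());
  for (auto& word : words) {
    std::string key = word;
    std::transform(key.begin(), key.end(), key.begin(), [](unsigned char c) {
      return static_cast<char>(std::tolower(c));
    });
    keyed.emplace_back(std::move(key), std::move(word));
  }
  std::stable_sort(
      keyed.begin(), keyed.end(),
      [](const auto& l, const auto& r) { return l.first < r.first; });
  std::vector<std::string> result;
  result.reserve(keyed.size());
  for (auto& entry : keyed) {
    result.push_back(std::move(entry.second));
  }
  return result;
}

// Variant B: no keys at all, the comparator inspects the strings
// each time two elements are compared.
inline std::vector<std::string> sortOnTheFly(std::vector<std::string> words) {
  std::stable_sort(words.begin(), words.end(),
                   [](const std::string& l, const std::string& r) {
                     return lessIgnoreCase(l, r);
                   });
  return words;
}
```

Both variants produce the same order for ASCII input. Variant B repeats the comparison work on every comparison but never materializes a second copy of the data.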
The helper type TripleComponentOrId (an alias of std::variant<PossiblyExternalizedIriOrLiteral, Id>) is replaced with PossiblyExternalizedTripleComponent. The wrapped TripleComponent can already store plain Ids, so the variant is unnecessary, and dropping it also helps reduce the memory footprint.
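A toy model of why the outer variant is redundant, assuming simplified stand-in types. All `Toy*` names below are hypothetical and much smaller than the real classes; the point is only that a component type which can already hold an Id does not need a second variant layer around it.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-in for an Id.
struct ToyId {
  std::uint64_t bits;
};

// Hypothetical stand-in for a triple component: either an IRI/literal
// string or already a plain Id.
using ToyTripleComponent = std::variant<std::string, ToyId>;

// Hypothetical wrapper that adds the "externalized" flag.
struct ToyPossiblyExternalized {
  ToyTripleComponent component;
  bool isExternal = false;
};

// Old shape: an extra variant layer whose second alternative duplicates
// what the wrapped component can already express.
using ToyComponentOrId = std::variant<ToyPossiblyExternalized, ToyId>;

// New shape: ask the wrapped component directly.
inline bool holdsId(const ToyPossiblyExternalized& value) {
  return std::holds_alternative<ToyId>(value.component);
}
```

On common standard library implementations the outer variant is larger than the wrapper alone, because it has to store its own discriminator in addition to the largest alternative.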
Finally, LocalVocabIndexAndSplitVal is renamed to PartialVocabIndexWithExternalFlag and shrunk to a single 64-bit integer, which is sufficient to store the partial vocab index together with an "external" bool. This reduces the memory footprint even further, since a separate bool would otherwise be padded with 7 extra bytes.
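A minimal sketch of this kind of packing, assuming the flag lives in the most significant bit and the index uses the remaining 63 bits. The class names and the choice of bit are assumptions for illustration; the actual layout in the PR may differ.

```cpp
#include <cassert>
#include <cstdint>

// Unpacked layout: the bool forces padding up to the alignment of the
// 64-bit index on typical 64-bit ABIs.
struct IndexAndFlagUnpacked {
  std::uint64_t index;
  bool isExternal;
};

// Packed layout (hypothetical): flag in the top bit, index in the low
// 63 bits of a single 64-bit integer.
class IndexWithFlagPacked {
 public:
  static constexpr std::uint64_t kFlagMask = std::uint64_t{1} << 63;

  // Indices that need the top bit are not representable; that bit is
  // masked off here.
  constexpr IndexWithFlagPacked(std::uint64_t index, bool external)
      : bits_{(index & ~kFlagMask) | (external ? kFlagMask : 0)} {}

  constexpr std::uint64_t index() const { return bits_ & ~kFlagMask; }
  constexpr bool isExternal() const { return (bits_ & kFlagMask) != 0; }

 private:
  std::uint64_t bits_;
};
```

The cost is that the index range is halved to 2^63 values, which is far more than a partial vocabulary can hold in practice.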

Small-scale testing showed a minimal (~1%) increase in performance with this change applied.

@RobinTF RobinTF changed the title from "Simplify partial vocab phase and reduce memory." to "Simplify partial vocab phase and reduce memory" on May 4, 2026
@sparql-conformance

Overview

| Number of Tests | Passed ✅ | Intended ✅ | Failed ❌ | Not tested |
|---|---|---|---|---|
| 548 | 447 | 73 | 28 | 0 |

Conformance check passed ✅

No test result changes.

Details: https://qlever.dev/sparql-conformance-ui?cur=98dab0a6b2a8a4678a110d5e2da3a7791931926d&prev=f54b0a34f8d91af3bbdb7dbc32f115d295e11c31

@codecov

codecov Bot commented May 4, 2026

Codecov Report

❌ Patch coverage is 98.00000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 92.37%. Comparing base (04f32a1) to head (98dab0a).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/index/StringSortComparator.h | 92.30% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2856      +/-   ##
==========================================
- Coverage   92.37%   92.37%   -0.01%     
==========================================
  Files         510      510              
  Lines       44109    44089      -20     
  Branches     5834     5832       -2     
==========================================
- Hits        40746    40727      -19     
- Misses       1702     1706       +4     
+ Partials     1661     1656       -5     


@sonarqubecloud

sonarqubecloud Bot commented May 4, 2026
