Sister permutation sharing to reduce redundant triple storage #2783

Open
joka921 wants to merge 3 commits into ad-freiburg:master from joka921:sister-permutation-sharing

joka921 (Member) commented Mar 19, 2026

Summary

  • Sister permutation sharing: Blocks of small relations can reference the corresponding block in the sister permutation (e.g., SPO↔SOP) instead of storing duplicate data. At read time, col1 and col2 of the shared block are swapped and the rows are re-sorted. This reduces redundant triple storage for permutation pairs that share the same col0.
  • The CrossPairPermutation enum value is retained in BlockSharingInfo for serialization format compatibility, but guarded with AD_CORRECTNESS_CHECK(false) so it is never used at read time.
  • Sister pairs wired: SPO↔SOP, OSP↔OPS.
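The read-side transformation described in the summary (swap col1/col2, then resort) can be illustrated with a minimal sketch. This is not the PR's `readSharedBlock` implementation: the `Row` type and the function name `deriveSisterBlock` are hypothetical, and rows are assumed to be plain triples of integer IDs that sort lexicographically.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical row type: one triple as (col0, col1, col2) of integer IDs.
using Row = std::array<std::uint64_t, 3>;

// Derive the rows of the sister block (e.g. SOP) from a block of the stored
// permutation (e.g. SPO): swap col1 and col2, then restore the sort order.
std::vector<Row> deriveSisterBlock(std::vector<Row> block) {
  // Assumption of this sketch: the stored block is sorted by (col0, col1, col2).
  assert(std::is_sorted(block.begin(), block.end()));
  for (Row& row : block) {
    std::swap(row[1], row[2]);
  }
  // std::array compares lexicographically, so this sorts by the new
  // (col0, col1, col2), where col1 and col2 have exchanged roles.
  std::sort(block.begin(), block.end());
  return block;
}
```

Because col0 is untouched by the swap, the same set of triples ends up in the derived block; only the order within each col0 group can change, which is why a resort is needed.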

Test plan

  • CompressedRelationsTest passes (22 tests)
  • IndexTest passes (17 tests)
  • Full CI suite

🤖 Generated with Claude Code

joka921 and others added 3 commits March 19, 2026 12:43
When building permutation pairs (SPO↔SOP, PSO↔POS, OSP↔OPS), blocks of
small relations in the second writer (writer2) now reference the
corresponding block in the first writer (writer1) instead of storing
their own compressed data. At read time, the shared block is read from
the sister permutation, columns are swapped, and the result is resorted.

This reduces index size by avoiding duplicate compressed storage for
blocks that contain the same triples in a different column order.
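To make the writer-side idea concrete, here is a simplified sketch of two writers where the second one records a reference instead of compressed bytes. All type and member names below (`SharingInfoSketch`, `BlockMetadataSketch`, `WriterSketch`, `addOwnBlock`, `addSharedBlock`) are hypothetical stand-ins; the actual `BlockSharingInfo` struct and `addSharedBlockMetadata` function in the PR may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-ins; not the actual QLever definitions.
struct SharingInfoSketch {
  std::size_t sisterBlockIndex;  // which block of the first writer holds the data
};

struct BlockMetadataSketch {
  std::uint64_t offsetInFile = 0;    // only meaningful for blocks with own data
  std::uint64_t compressedSize = 0;  // 0 for shared blocks
  std::optional<SharingInfoSketch> sharingInfo;  // set => block has no own data
};

struct WriterSketch {
  std::vector<BlockMetadataSketch> blocks;
  std::uint64_t bytesWritten = 0;

  // Store a block's compressed bytes and record where they live.
  std::size_t addOwnBlock(std::uint64_t compressedSize) {
    assert(compressedSize > 0);  // sketch assumption: own blocks are non-empty
    blocks.push_back(BlockMetadataSketch{bytesWritten, compressedSize, std::nullopt});
    bytesWritten += compressedSize;
    return blocks.size() - 1;
  }

  // Record only a reference to a block of the sister writer; no bytes are written.
  void addSharedBlock(std::size_t sisterBlockIndex) {
    blocks.push_back(BlockMetadataSketch{0, 0, SharingInfoSketch{sisterBlockIndex}});
  }
};
```

In this sketch, the space saving is visible in `bytesWritten`: the second writer's counter does not grow for a shared block, while its metadata still has one entry per block.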

Key changes:
- Add BlockSharingInfo struct and sharingInfo_ field to block metadata
- Writer2 creates shared block references via addSharedBlockMetadata
- Reader handles shared blocks via readSharedBlock (swap + resort)
- Permutation objects track sister/cross-pair links for reader access
- Buffer-to-block index mapping enables fix-up after block sorting
- PSO↔POS sister links are always set up (not just when all perms loaded)
- Cross-pair sharing infrastructure is in place but disabled (TODO)
- Index format version bumped to reflect metadata format change
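The "buffer-to-block index mapping" item can be read as follows: a reference is recorded while blocks are still in buffer order, and once the blocks have been sorted, the reference must be rewritten to the block's final position. The sketch below illustrates that general technique under assumptions; the names `BufferedBlock`, `bufferToBlockIndex`, and `fixUpReferences` are hypothetical, and sorting by a single `firstKey` is a simplification, not a description of the PR's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical block record: `firstKey` decides the final block order.
struct BufferedBlock {
  std::uint64_t firstKey;
};

// Return `mapping` such that mapping[bufferIndex] is the position of that
// block after sorting all buffered blocks by `firstKey`.
std::vector<std::size_t> bufferToBlockIndex(
    const std::vector<BufferedBlock>& buffered) {
  std::vector<std::size_t> order(buffered.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t a, std::size_t b) {
                     return buffered[a].firstKey < buffered[b].firstKey;
                   });
  // `order[pos]` is the buffer index that ends up at `pos`; invert it.
  std::vector<std::size_t> mapping(buffered.size());
  for (std::size_t pos = 0; pos < order.size(); ++pos) {
    mapping[order[pos]] = pos;
  }
  return mapping;
}

// Rewrite references recorded as buffer indices into final block indices.
void fixUpReferences(std::vector<std::size_t>& references,
                     const std::vector<std::size_t>& mapping) {
  for (std::size_t& ref : references) {
    assert(ref < mapping.size());
    ref = mapping[ref];
  }
}
```

Computing the inverse permutation once keeps the fix-up a single pass over the recorded references.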

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Johannes Kalmbach <johannes.kalmbach@gmail.com>
…ion sharing.

The CrossPairPermutation enum value is retained in BlockSharingInfo for
serialization compatibility, but guarded with AD_CORRECTNESS_CHECK(false)
so it is never used at read time.
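The pattern of keeping an enum value for format compatibility while rejecting it at read time might look roughly like the sketch below. The enum name, its other enumerators, and all numeric values are assumptions made for illustration; only the `CrossPairPermutation` name comes from the commit message. A plain `std::logic_error` stands in for the `AD_CORRECTNESS_CHECK(false)` guard, since that macro is specific to the QLever codebase.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical enum; enumerator names (except CrossPairPermutation) and all
// numeric values are assumptions of this sketch.
enum class SharingKindSketch : std::uint8_t {
  None = 0,
  SisterPermutation = 1,
  // Retained so that the values of the serialized format keep their meaning,
  // even though no reader path handles it.
  CrossPairPermutation = 2,
};

enum class ReadAction { ReadOwnData, ReadFromSister };

// Decide how a block is read, rejecting the disabled sharing kind.
ReadAction chooseReadAction(SharingKindSketch kind) {
  switch (kind) {
    case SharingKindSketch::None:
      return ReadAction::ReadOwnData;
    case SharingKindSketch::SisterPermutation:
      return ReadAction::ReadFromSister;
    case SharingKindSketch::CrossPairPermutation:
      // Stand-in for the AD_CORRECTNESS_CHECK(false) guard.
      throw std::logic_error("cross-pair sharing is disabled");
  }
  throw std::logic_error("unknown sharing kind");
}
```

Removing the enumerator instead would shift or free its numeric value, which is the kind of silent format change the retained value avoids.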

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sparql-conformance

Overview

| Number of Tests | Passed ✅ | Intended ✅ | Failed ❌ | Not tested |
|---|---|---|---|---|
| 548 | 400 | 73 | 75 | 0 |

Conformance check failed ❌

Test Status Changes 📊

| Number of Tests | Previous Status | Current Status |
|---|---|---|
| 54 | Passed | Failed |

Details: https://qlever.dev/sparql-conformance-ui?cur=de7e32ef5c7855e94825956c25bc10baf21e5d6f&prev=5112e605d624449eb565cbc331f260bdab7dea3b
