Skip to content

handles words>maxprefixlen#36

Open
aivillio wants to merge 12 commits into
acmpesuecc:devfrom
aivillio:pr_changes
Open

handles words>maxprefixlen#36
aivillio wants to merge 12 commits into
acmpesuecc:devfrom
aivillio:pr_changes

Conversation

@aivillio
Copy link
Copy Markdown
Contributor

@aivillio aivillio commented Mar 28, 2026

PR title OR Closes {Link to issue}

Context

Why are these changes being made?
The implementation would not handle words greater than maxprefixlen and panic

Description

How have the changes been made? i.e. steps taken to solve the above issue

  • Added fetchleaf so that if the word was longer than maxprefixlen you
    would get the leaf ,so that you could compare the key with the full prefix
  • enables checkprefix to check actual prefix instead of the truncated prefix
  • added num_children to nodes

Copilot AI review requested due to automatic review settings March 28, 2026 18:26
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the internal Adaptive Radix Tree (ART) implementation to avoid panics and incorrect prefix comparisons when a node’s compressed prefix length exceeds maxprefixlen (8), by fetching a leaf to compare against the full key prefix.

Changes:

  • Adds helper utilities (fetchleaf, updated checkprefix, addchild/grow helpers) to support long-prefix handling.
  • Introduces/updates core ART operations: insert and search, plus a tree debug printer.
  • Adds a small cmd/radFS/arttest executable to exercise/debug the ART implementation.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
internal/art/util.go Prefix checking helpers + child management + growth logic to support long prefixes
internal/art/insert.go Insert logic updated to store full prefixlen and split nodes using long-prefix comparisons
internal/art/search.go Search logic that uses checkprefix and full leaf-key comparison
internal/art/print_tree.go Debug tree printer (uses fetchleaf for long prefixes)
internal/art/node.go Node types, constants (maxprefixlen), and shared node metadata (prefixlen, num_children)
internal/art/node4.go Node4 constructor
internal/art/node16.go Node16 constructor
internal/art/node48.go Node48 constructor
internal/art/node256.go Node256 constructor
internal/art/leaf.go Leaf node + newleaf / isleaf
internal/art/art.go Public-ish wrapper (Tree) providing Insert/Search
cmd/radFS/arttest/main.go Manual executable for basic ART insert + print

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/art/util.go
Comment on lines +7 to +31
child1, pos1 := findchild(k, n)
if child1 != nil {
in.children[pos1] = child
return n
}

if n.innerNode.num_children == len(in.keys) {
n = grow(n)
in = n.innerNode

}

var i int
for i = in.num_children - 1; i >= 0 && in.keys[i] > k; i-- {
in.keys[i+1] = in.keys[i]
in.children[i+1] = in.children[i]

}

in.keys[i+1] = k
in.children[i+1] = child
in.num_children += 1

return n

Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addchild grows a node when num_children == len(in.keys), but this is incorrect for Node48 (len(keys)=256 while max children=48) and Node256 (keys is nil, so len(keys)=0). This can prevent growth and eventually cause out-of-bounds writes to children[i+1] or even call grow on a Node256 and get nil back. Please gate growth based on the node type’s actual capacity (e.g., Node4/16: len(keys), Node48: len(children)=48, Node256: len(children)=256) and implement insertion logic per node type instead of shifting arrays for Node48/256.

Suggested change
child1, pos1 := findchild(k, n)
if child1 != nil {
in.children[pos1] = child
return n
}
if n.innerNode.num_children == len(in.keys) {
n = grow(n)
in = n.innerNode
}
var i int
for i = in.num_children - 1; i >= 0 && in.keys[i] > k; i-- {
in.keys[i+1] = in.keys[i]
in.children[i+1] = in.children[i]
}
in.keys[i+1] = k
in.children[i+1] = child
in.num_children += 1
return n
// If a child for this key already exists, simply replace it.
child1, pos1 := findchild(k, n)
if child1 != nil {
in.children[pos1] = child
return n
}
switch in.nodeType {
case Node4, Node16:
// For Node4/Node16, capacity is len(keys); grow when full.
if in.num_children == len(in.keys) {
n = grow(n)
// Re-dispatch insertion on the grown node, which may be a different type.
return addchild(n, k, child)
}
// Insert key in sorted order by shifting keys/children to make room.
var i int
for i = in.num_children - 1; i >= 0 && in.keys[i] > k; i-- {
in.keys[i+1] = in.keys[i]
in.children[i+1] = in.children[i]
}
in.keys[i+1] = k
in.children[i+1] = child
in.num_children++
return n
case Node48:
// For Node48, capacity is the number of children slots (typically 48),
// while keys is a 256-entry index map.
if in.num_children == len(in.children) {
n = grow(n)
// After growing (to Node256), insert using the appropriate logic.
return addchild(n, k, child)
}
// Find first free child slot.
slot := -1
for i := 0; i < len(in.children); i++ {
if in.children[i] == nil {
slot = i
break
}
}
if slot == -1 {
// Should not happen if num_children is correctly maintained,
// but avoid out-of-bounds behavior.
return n
}
in.children[slot] = child
// Store slot+1 so that zero represents "no child".
in.keys[k] = byte(slot + 1)
in.num_children++
return n
case Node256:
// For Node256, capacity is len(children) (typically 256);
// keys is nil and not used.
if in.num_children == len(in.children) {
// Already at maximum capacity; cannot grow further safely.
// Avoid calling grow or writing out of bounds.
return n
}
if in.children[k] == nil {
in.num_children++
}
in.children[k] = child
return n
}
// Fallback: if an unknown node type is encountered, return unchanged.
return n

Copilot uses AI. Check for mistakes.
Comment thread internal/art/util.go
Comment on lines +7 to +31
child1, pos1 := findchild(k, n)
if child1 != nil {
in.children[pos1] = child
return n
}

if n.innerNode.num_children == len(in.keys) {
n = grow(n)
in = n.innerNode

}

var i int
for i = in.num_children - 1; i >= 0 && in.keys[i] > k; i-- {
in.keys[i+1] = in.keys[i]
in.children[i+1] = in.children[i]

}

in.keys[i+1] = k
in.children[i+1] = child
in.num_children += 1

return n

Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addchild shifts both in.keys and in.children as if they are parallel sorted arrays. That only holds for Node4/Node16; it breaks Node48 (indirection table) and Node256 (direct map) and can lead to out-of-bounds writes on children[i+1]. Consider implementing addchild separately per node type.

Suggested change
child1, pos1 := findchild(k, n)
if child1 != nil {
in.children[pos1] = child
return n
}
if n.innerNode.num_children == len(in.keys) {
n = grow(n)
in = n.innerNode
}
var i int
for i = in.num_children - 1; i >= 0 && in.keys[i] > k; i-- {
in.keys[i+1] = in.keys[i]
in.children[i+1] = in.children[i]
}
in.keys[i+1] = k
in.children[i+1] = child
in.num_children += 1
return n
// If a child with this key already exists, replace it.
child1, pos1 := findchild(k, n)
if child1 != nil {
in.children[pos1] = child
return n
}
// Grow the node if the children array is full.
if in.num_children == len(in.children) {
n = grow(n)
in = n.innerNode
}
switch in.nodeType {
case Node4, Node16:
// Maintain sorted order of keys and parallel children array.
var i int
for i = in.num_children - 1; i >= 0 && in.keys[i] > k; i-- {
in.keys[i+1] = in.keys[i]
in.children[i+1] = in.children[i]
}
in.keys[i+1] = k
in.children[i+1] = child
in.num_children++
case Node48:
// Node48 uses an indirection table: keys[k] = index+1 into children.
// Find a free slot in children.
pos := -1
for i := 0; i < len(in.children); i++ {
if in.children[i] == nil {
pos = i
break
}
}
if pos >= 0 {
in.children[pos] = child
in.keys[k] = byte(pos + 1)
in.num_children++
}
case Node256:
// Direct mapping from key byte to child index.
in.children[k] = child
in.num_children++
}
return n

Copilot uses AI. Check for mistakes.
Comment thread internal/art/search.go
}

// Get the next byte to branch on at current depth.
// Returns 0 (terminator) if key is exhausted.
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says keycheck returns 0 (terminator) if key is exhausted, but keycheck currently returns 1 when depth >= len(key). Please align the comment with the implementation, or (preferably) fix keycheck to use the intended terminator semantics.

Suggested change
// Returns 0 (terminator) if key is exhausted.
// Returns 1 (terminator) if key is exhausted.

Copilot uses AI. Check for mistakes.
Comment on lines +35 to +37
for i := 0; i < len(in.keys); i++ {
if in.children[i] != nil {
fmt.Printf("%s Edge('%c' | %d):\t", indent, in.keys[i], in.keys[i])
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This loop indexes in.children[i] while iterating up to len(in.keys). For Node48, len(keys)=256 but len(children)=48, so this will panic when i >= 48 (and it also doesn’t make sense for Node256 where keys is nil). Please iterate over the appropriate child representation per node type.

Suggested change
for i := 0; i < len(in.keys); i++ {
if in.children[i] != nil {
fmt.Printf("%s Edge('%c' | %d):\t", indent, in.keys[i], in.keys[i])
for i := 0; i < len(in.children); i++ {
if in.children[i] != nil {
var edgeKey byte
if in.keys != nil && i < len(in.keys) {
edgeKey = in.keys[i]
} else {
edgeKey = byte(i)
}
fmt.Printf("%s Edge('%c' | %d):\t", indent, edgeKey, edgeKey)

Copilot uses AI. Check for mistakes.
Comment thread internal/art/util.go
Comment on lines +106 to +115
index := 0
for i := 0; i < 16; i++ {
idx := n.innerNode.keys[i]
child := n.innerNode.children[i]

if child != nil {
n48.innerNode.keys[idx] = byte(index + 1) // the reason its index+1 is because we are making 0 a kind of "no children" case

n48.innerNode.children[index] = child
index++
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In grow (Node16 -> Node48), index tracks how many children are copied into the Node48, but n48.innerNode.num_children is never set. This leaves the grown node’s metadata inconsistent with its contents. Set n48.innerNode.num_children = index before returning.

Copilot uses AI. Check for mistakes.
Comment thread internal/art/insert.go
Comment on lines +13 to +16
i := depth
fmt.Println("depth:", depth)
fmt.Println("oldkey[depth:]:", string(oldkey[depth:]))
fmt.Println("key[depth:]:", string(key[depth:]))
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple fmt.Println debug prints in insert (e.g., printing depth/keys/prefix). This will spam stdout for every insert and makes the package unusable as a library component. Please remove these prints or gate them behind an explicit debug logger/flag.

Copilot uses AI. Check for mistakes.
Comment thread internal/art/util.go
in := n.innerNode
switch in.nodeType {
case Node4, Node16:
for i := 0; i < len(in.keys); i++ {
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Node4/Node16, findchild scans len(in.keys) instead of only the populated range. Unused key slots are zero-initialized, so lookups for k==0 can match an unused slot and return an incorrect position. Please iterate only up to in.num_children for these node types.

Suggested change
for i := 0; i < len(in.keys); i++ {
for i := 0; i < in.num_children; i++ {

Copilot uses AI. Check for mistakes.
Comment thread internal/art/util.go
Comment on lines +85 to +90
if depth >= len(key) {
return 1

} else {
return key[depth]
}
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keycheck uses 1 as an end-of-key sentinel (depth >= len(key) returns 1). With arbitrary []byte keys this is ambiguous (e.g., []byte{} and []byte{1} collide at depth 0). Please switch to an unambiguous terminator strategy (explicit length checks, or return an int with 256 as terminator, etc.).

Suggested change
if depth >= len(key) {
return 1
} else {
return key[depth]
}
// Return the byte at the given depth if it is within bounds.
// Callers should separately check depth against len(key) when they
// need to distinguish "end of key" from an actual byte value.
if depth < 0 || depth >= len(key) {
return 0
}
return key[depth]

Copilot uses AI. Check for mistakes.
Comment thread internal/art/search.go
Comment on lines +12 to +16
if isleaf(n) {
if string(n.leaf.key) == string(key) {
return n
}
return nil
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaf key comparison converts both []byte to string, which allocates and can be expensive for large keys. Prefer bytes.Equal(n.leaf.key, key) to avoid allocations and to clearly express byte-slice equality.

Copilot uses AI. Check for mistakes.
Comment thread internal/art/util.go
for i := 0; i < 4; i++ {
n16.innerNode.keys[i] = n.innerNode.keys[i]
n16.innerNode.children[i] = n.innerNode.children[i]
}
Copy link

Copilot AI Mar 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In grow (Node4 -> Node16), the new node copies keys/children but never copies num_children. After growth, the returned node will think it has 0 children, corrupting later inserts/searches. Please copy n.innerNode.num_children into the grown node.

Suggested change
}
}
n16.innerNode.num_children = n.innerNode.num_children

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants