-
Notifications
You must be signed in to change notification settings - Fork 475
Description
Summary
The variable-length edge (VLE) expansion algorithm used for patterns like [*1..3] exhibits O(n^k) time complexity because it does not perform cycle detection or deduplication during recursive expansion. On graphs with cycles (which are common in real-world data), this causes exponential path explosion.
Reproduction
Setup: 100K nodes with edges forming natural cycles.
-- VLE query: find all nodes reachable within 1-3 hops
SELECT * FROM cypher('benchmark', $$
MATCH (n:Node {bench_id: 42})-[*1..3]->(m)
RETURN count(m)
$$) AS (cnt agtype);On graphs with cycles, the expansion visits the same nodes exponentially many times because UNION ALL semantics are used internally (no deduplication).
Performance Data
Benchmark W06 (variable-length edge traversal):
| Approach | 10K scale | 100K scale |
|---|---|---|
| Cypher VLE (before SDK workaround) | ~23ms | ~23ms |
| Recursive CTE with UNION (dedup) | 0.08ms | 0.34ms |
The SDK workaround uses a recursive CTE with UNION (not UNION ALL) which provides implicit cycle detection through set deduplication, preventing exponential blowup:
WITH RECURSIVE vle AS (
SELECT e.end_id AS node_id, 1 AS depth
FROM benchmark."EDGE" e
WHERE e.start_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 42)
UNION -- implicit dedup = cycle detection
SELECT e.end_id, v.depth + 1
FROM vle v
JOIN benchmark."EDGE" e ON e.start_id = v.node_id
WHERE v.depth < 3
)
SELECT count(*) FROM vle WHERE depth >= 1;Suggested Fix
Add a visited-node set to the VLE expansion in the executor. When expanding [*min..max], track which (node_id, depth) pairs have already been explored and skip duplicates. This is equivalent to using UNION instead of UNION ALL in the recursive expansion.
Environment
- PostgreSQL 18.2
- Apache AGE 1.7.0 (built from source, branch
release/PG18/1.7.0) - 32-core, 64GB RAM, Linux 6.17