-
Notifications
You must be signed in to change notification settings - Fork 475
Description
Summary
The `shortestPath()` function in AGE appears to use the same VLE (variable-length edge) expansion internally, inheriting the O(n^k) cycle-free expansion issue from #2349. A shortest path query should use BFS (breadth-first search) which terminates as soon as the target is found, but the current implementation expands all paths up to the maximum depth before selecting the shortest.
Reproduction
SELECT * FROM cypher('benchmark', $$
MATCH p = shortestPath((a:Node {bench_id: 1})-[*..10]->(b:Node {bench_id: 50}))
RETURN length(p)
$$) AS (len agtype);Performance Data
Benchmark W07 (shortest path):
| Approach | 10K scale | 100K scale |
|---|---|---|
| Cypher shortestPath() | ~23ms | ~23ms |
| BFS CTE with UNION + early termination | 0.09ms | 0.33ms |
The SDK workaround implements BFS as a recursive CTE:
WITH RECURSIVE bfs AS (
SELECT e.end_id AS node_id, 1 AS depth
FROM benchmark."EDGE" e
WHERE e.start_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 1)
UNION
SELECT e.end_id, b.depth + 1
FROM bfs b
JOIN benchmark."EDGE" e ON e.start_id = b.node_id
WHERE b.depth < 10
AND NOT EXISTS (SELECT 1 FROM bfs WHERE node_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 50))
)
SELECT depth FROM bfs
WHERE node_id = (SELECT id FROM benchmark."Node" WHERE _bench_id = 50)
LIMIT 1;Suggested Fix
Implement `shortestPath()` using a proper BFS algorithm in the executor rather than delegating to VLE. BFS guarantees finding the shortest path and terminates at the first match, providing O(V+E) complexity instead of O(n^k).
This is related to #2349 (VLE cycle detection) — fixing VLE would also improve shortestPath, but a dedicated BFS implementation would be even more efficient for the shortest-path use case.
Environment
- PostgreSQL 18.2
- Apache AGE 1.7.0 (built from source, branch `release/PG18/1.7.0`)
- 32-core, 64GB RAM, Linux 6.17