cross_database: batched query state management is broken and fragile #3990

@Josipmrden

Problems

The batched query modules in mage/python/cross_database.py and mage/python/migrate.py use a module-level state_dict to track in-progress queries. This pattern has three issues:

1. Concurrent users cannot run the same query

The state_dict is keyed by a hash of (query, config, params). If two users run the same query with the same parameters concurrently, the second call hits:

RuntimeError: Migrate module with these parameters is already running.

This is a legitimate concurrent use case, not a duplicate call.
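
A minimal sketch of the collision, assuming the keying scheme described above. The names `_state_key` and `init_state` and the exact hashing are hypothetical stand-ins, not the module's real code:

```python
import hashlib
import json

# Hypothetical stand-in for the module-level dict described above.
state_dict = {}


def _state_key(query, config, params):
    # Assumed keying scheme: a hash of (query, config, params).
    payload = json.dumps([query, config, params], sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def init_state(query, config, params):
    key = _state_key(query, config, params)
    if key in state_dict:
        raise RuntimeError("Migrate module with these parameters is already running.")
    state_dict[key] = {"cursor": None}
    return key


# User A starts a batched query; user B issues the identical call before A finishes.
init_state("MATCH (n) RETURN n.id", {"host": "localhost"}, {})
try:
    init_state("MATCH (n) RETURN n.id", {"host": "localhost"}, {})
    collided = False
except RuntimeError:
    collided = True
```

Because the key carries no notion of who issued the call, the second user is indistinguishable from a duplicate invocation.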

2. Exceptions corrupt the state dict

If an exception occurs during _fetch_batch (e.g. _convert_bolt_record fails on an unsupported type like a vertex or relationship), the exception propagates but the state_dict entry is never cleaned up. All subsequent calls with the same parameters are permanently blocked with the "already running" error until the module is reloaded.
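
The failure mode can be sketched in a simplified form. Here `fetch_batch` and `convert_strict` are hypothetical and compress the init/fetch/cleanup steps into one function; the real functions are split differently:

```python
# Hypothetical stand-in for the module-level dict.
state_dict = {}


def fetch_batch(key, records, convert):
    # Registers state, converts records, and cleans up only on the success path.
    if key in state_dict:
        raise RuntimeError("Migrate module with these parameters is already running.")
    state_dict[key] = {"position": 0}
    rows = [convert(r) for r in records]  # an exception here skips the cleanup below
    state_dict.pop(key, None)
    return rows


def convert_strict(record):
    # Stand-in for a converter that rejects unsupported types.
    if not isinstance(record, (int, float, str)):
        raise TypeError(f"unsupported type: {type(record).__name__}")
    return record


errors = []
for attempt in range(2):
    try:
        fetch_batch("same-params", [1, object()], convert_strict)
    except (TypeError, RuntimeError) as exc:
        errors.append(type(exc).__name__)
```

The first attempt fails with the conversion error; the second never reaches the conversion because the stale entry from the first attempt is still present.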

3. No shared abstraction for batched state management

Every database backend (bolt, mysql, postgresql, oracle, etc.) re-implements the same init/fetch/cleanup state dict pattern independently. This leads to:

  • Duplicated boilerplate across every backend
  • Inconsistent error handling (some backends may handle cleanup, others don't)
  • Each new backend must re-discover and avoid the same pitfalls

Suggested improvements

  • Use a shared base class or helper that manages the state dict lifecycle (init, fetch, cleanup-on-error) for all backends
  • Key state by a unique execution ID rather than query hash, so concurrent identical queries don't collide
  • Ensure cleanup always runs on exception (try/finally in fetch paths)
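
One possible shape for such a helper, as a sketch only. `BatchStateRegistry` and its methods are hypothetical names, and how the execution ID is passed between batch calls depends on the procedure API and is not addressed here:

```python
import threading
import uuid
from contextlib import contextmanager


class BatchStateRegistry:
    """Hypothetical shared helper; not part of the existing module."""

    def __init__(self):
        self._states = {}
        self._lock = threading.Lock()

    def start(self, state):
        # Key by a fresh execution ID so identical queries never collide.
        execution_id = uuid.uuid4().hex
        with self._lock:
            self._states[execution_id] = state
        return execution_id

    def cleanup(self, execution_id):
        # Idempotent: safe to call from both the error path and normal completion.
        with self._lock:
            self._states.pop(execution_id, None)

    @contextmanager
    def fetching(self, execution_id):
        # Wrap every fetch; any exception removes the entry before re-raising.
        with self._lock:
            state = self._states[execution_id]
        try:
            yield state
        except BaseException:
            self.cleanup(execution_id)
            raise

    def __len__(self):
        with self._lock:
            return len(self._states)


registry = BatchStateRegistry()
first = registry.start({"rows": iter([1, 2])})
second = registry.start({"rows": iter([1, 2])})  # identical query, no collision

try:
    with registry.fetching(first) as state:
        raise TypeError("unsupported type")  # simulated conversion failure
except TypeError:
    pass
```

The sketch cleans up in an `except` clause rather than `finally` because the state has to survive between successful batches; a backend would call `cleanup` itself once the final batch has been returned.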

Affected code

  • mage/python/cross_database.py — all _*_init_state / _*_fetch_batch / _*_cleanup functions
