Spindle

A local, durable, observable job fabric for heterogeneous generative ML workloads.

Runs across a control node + one or more GPU nodes. Routes each job to a worker on whichever node hosts the matching ModelConfig. Persists state across restarts. Keeps enough history to replay / eval jobs against new model configs later.

What it does

Submit jobs of varied types: text.*, image.*, video.*, audio.*, cpu.*, external.*, eval.*.
Track lifecycle (queued → leased → running → succeeded | failed | canceled | dead_lettered), retry on transient failures, dead-letter on persistent ones.
Per-config Redis streams + an embedded per-node dispatcher that leases jobs atomically and IPC-dispatches to a local worker. v0 picks the first matching local worker; scoring (warm-model affinity, capacity, fairness) is deferred until contention shows up.
Persist artifacts (images, video, audio, JSON, text) with lineage back to the producing job and model config.
Replay historical jobs against alternate model configs for A/B comparison and promotion (Phase 7).

Data flow in one diagram

client  ─POST /jobs─►  api ─►  Mongo (queued)  +  Redis stream (per config_id)
                                                          │
                                                          ▼  (any node serving that config)
                                       runtime: dispatcher tick reads Redis
                                            → acquire_lease (Mongo CAS)
                                            → IPC `op:run` to a local worker's Unix socket
                                                          │
                                                          ▼
                                        worker.execute(job, ctx)
                                            → ArtifactStore.put(...)  → MinIO
                                            → POST /jobs/{id}/complete → api
                                                                       → Mongo (succeeded)

client polls GET /jobs/{id}   → terminal status + artifact URIs
client GET /artifacts/{id}/bytes  → streams from ArtifactStore (cross-node if needed)

The dispatcher is embedded in the runtime (same process as the supervisor); the API talks to core's state / queue / artifact backends directly, never through Redis or back through itself.

What it isn't

Not a distributed scheduler at scale. Designed for ~1k jobs/day across 2–3 nodes.
Not a workflow/DAG engine. Jobs are independent units. Workflow orchestration lives upstream (agents, scripts, future SDK).
Not opinionated about runtimes. Workers are thin shims around MLX, vLLM, ComfyUI, ffmpeg, HTTP APIs — Spindle just schedules and observes them.

Architecture

Three swap-point protocols, each with at least one production backend and an in-memory backend for tests:

Concern	Protocol	Backends
State (jobs / configs / events / artifact metadata)	`StateStore`	`mongo`, in-memory
Per-config job queue	`JobQueue`	`redis_streams`, in-memory
Artifact bytes	`ArtifactStore`	`s3` (MinIO / AWS / R2 / B2), in-memory

See ARCHITECTURE.md for components and data flow. ROADMAP.md lays out what gets built when. STATUS.md tracks current state.

Layout

core/         domain types, protocols, backend impls (Mongo / Redis / S3)   ✓
api/          FastAPI gateway                                                ✓
runtime/      per-node bundle: supervisor + embedded dispatcher              ✓
workers/      WorkerBase + concrete shims (audio_tts: openai/kokoro/f5)      ✓
cli/          `spindle` CLI                                                  (PLAN.md)
infra/        Docker compose, MinIO bootstrap
configs/      sample per-node runtime YAMLs
docs/         consumer-facing API contract

Each component directory has its own PLAN.md (design reference, locked decisions, acceptance criteria) plus the implementation under src/.

Quickstart: submit a real job

# install uv  (https://docs.astral.sh/uv/)
brew install uv                                  # macOS
# or: curl -LsSf https://astral.sh/uv/install.sh | sh

# bring up state + queue + artifacts
brew services start mongodb-community@7.0        # macOS
brew services start redis
cd infra/minio && ./bootstrap.sh && cd -         # MinIO via Docker

# install workspace + the extras you actually need on this node
# (kokoro / f5 pull torch — install them only on a GPU node)
uv sync --all-packages --extra audio_tts --extra dev

# start the API (control node)
uv run spindle-api &

# seed a ModelConfig
curl -X POST http://localhost:8080/configs -H 'Content-Type: application/json' \
  -d '{"id":"audio-tts-openai-v1","name":"OpenAI TTS","version":"v1",
       "job_types":["audio.tts"],"preferred_node":"control-node",
       "runtime_backend":"openai","model_ref":"tts-1-hd","is_active":true}'

# start the runtime (supervisor + dispatcher + N worker processes)
uv run spindle-workers run --config configs/runtime.control-node.yaml &

# submit a job
curl -X POST http://localhost:8080/jobs -H 'Content-Type: application/json' \
  -d '{"type":"audio.tts","config_id":"audio-tts-openai-v1",
       "input":{"text":"Hello world."}}'

# poll
curl http://localhost:8080/jobs/<id>

Running the test suite

# install deps for the tests you want
uv sync --all-packages --extra dev --extra audio_tts

# unit + smoke tests across components (~36 + 92 core tests)
uv run pytest core/tests api/tests runtime/tests workers/tests

uv run --with ruff ruff check .
uv run --with pyright pyright

--all-extras pulls every backend; on Python 3.14 kokoro's transitive spacy dep doesn't have wheels — pick extras per host instead.

Configure via env vars; copy .env.example to .env to override defaults.

License

Apache 2.0. See LICENSE.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Spindle

What it does

Data flow in one diagram

What it isn't

Architecture

Layout

Quickstart: submit a real job

Running the test suite

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 30 Commits
api		api
cli		cli
configs		configs
core		core
docs		docs
infra		infra
runtime		runtime
workers		workers
.env.example		.env.example
.gitignore		.gitignore
AGENTS.md		AGENTS.md
ARCHITECTURE.md		ARCHITECTURE.md
LICENSE		LICENSE
README.md		README.md
ROADMAP.md		ROADMAP.md
STATUS.md		STATUS.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

Spindle

What it does

Data flow in one diagram

What it isn't

Architecture

Layout

Quickstart: submit a real job

Running the test suite

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages