Ouroforge is a local-first, evidence-native prototype for game-authoring loops. It turns a declared goal into a local run, captures evidence from the runtime, records what happened, and proposes the next change without giving agents or browser surfaces trusted write authority.
The name is Ouroboros (the serpent that feeds on its own tail) + Forge. The loop is intentionally inspectable:
Seed → Run → Evidence → Evaluation → Journal → Mutation → (back to Seed)
Ouroforge is a pre-release local prototype, not a production editor, hosted service, public launch, commercial product, or engine replacement. Era AB is tracked as a bounded M166-M178 one-human plus agent-team local/web 2D public-alpha-candidate lane, with final completion gated on closure evidence. Era CC evidence shows a clean local checkout can run the documented verification commands and that selected Chrome-observed product journeys pass with known gaps routed instead of hidden.
Evidence: fresh checkout audit, Chrome product journey, BH-CB reality audit, CC-S5 closure audit.
Evidence-backed capabilities currently include:
- Clean checkout verification for formatting, a focused Rust allowlist test, and Era CC audit validators (fresh checkout audit).
- Chrome/Chromium/CDP observation of the authoring cockpit, authoring evidence page, asset/prefab lane, and dogfood runtime (Chrome journey).
- Editor UX checklist coverage for project open, scene hierarchy, inspector, transform edit, asset select, prefab instantiate, save/reload, play/stop, runtime error display, evidence panel, and undo/redo; this does not prove editor maturity for production (editor UX pass, editor gap ledger).
- A technical dogfood pass with movement, collision, enemy AI, progression, fail/restart, save/load, and scene transition evidence; combat, spawn waves, and a three-minute loop remain known gaps, and fun/feel remains pending human evidence (dogfood pass, human gate).
- Deterministic local scale/performance evidence with known limits, including a p99/memory repair route; this is not production performance certification (perf certification, scale limits, repair draft).
- BH-CB reality inventory across 219 rows with repair routing for gaps and no closure-by-metadata assumption (matrix, BH-CB audit).
Generated run, dashboard, screenshot, sandbox, and local tool artifacts are local state unless a future issue explicitly scopes a deterministic fixture.
Ouroforge does not currently claim:
- Godot, Unity, Unreal, or any other commercial engine parity/replacement.
- Editor or performance maturity for production, public launch readiness, or commercial readiness.
- A hosted service, liveops, accounts, store approval, real Ads/IAP/payment, or public release operations.
- Human playtest approval, fun/feel approval, legal approval, store approval, or commercial approval.
- Browser trusted writes, local command bridges, auto-apply, auto-merge, or reviewer bypass.
These non-claims are guarded by Era CC evidence-gate hardening and repair routing: evidence gate hardening, repair router dry run, repair drafts.
- Dogfood combat, spawn waves, and a three-minute loop are known gaps (dogfood pass).
- Fun/feel approval is pending human evidence and is not fabricated (human pending artifact).
- One scale scenario records a known limit and repair draft (perf certification, scale limit ledger, repair draft).
- BH-CB closed issues still have fresh-verification known gaps and repair routing rather than blanket completion claims (BH-CB audit, repair ledger).
These commands are validated by the Era CC fresh-checkout audit:
docs/evidence/era-cc/fresh-checkout-truth-audit.md
and docs/evidence/era-cc/fresh-checkout-truth-audit.json.
cargo fmt --check
cargo test -p ouroforge-core --test test_command_allowlist
node scripts/era-cc-reality-audit.cjs validate
node scripts/era-cc-reality-audit.cjs final-audit --validate-only

For the BE-M2 transitional contributor quickstart wrapper, see
scripts/be-m2-quickstart.sh and
docs/be-m2-contributor-quickstart-v1.md.
For Chrome-observed product evidence, inspect the committed packet rather than
assuming live production behavior:
docs/evidence/era-cc/openchrome-product-journey.json.
The browser-evidence quickstart scaffolding lives at
scripts/be-m2-quickstart.sh (run --print-plan
to preview) with the corresponding guide
docs/be-m2-contributor-quickstart-v1.md.
Ouroforge's loop is built around evidence over assertion:
- Seed — declare intent and acceptance criteria.
- Run — execute a local runtime/demo path and collect generated artifacts.
- Evidence — capture bounded runtime, browser, project, scenario, and probe outputs as inspectable files.
- Evaluation — produce a deterministic verdict from the evidence.
- Journal — summarize what actually happened with evidence references.
- Mutation proposal — record proposed next changes as reviewable data, not trusted source writes.
- Repeat — a later reviewed change can become the next seed/run cycle.
The Rust core and local filesystem own trusted state. Agents, browser workers, and Chrome DevTools Protocol observations are evidence inputs only.
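
The stage order above can be sketched as a small state machine. This is an illustration only: `Stage`, `next`, `produces_proposal`, and `one_cycle` are hypothetical names chosen for this example and are not taken from crates/ouroforge-core.

```rust
/// One stage of the inspectable loop (hypothetical type, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stage {
    Seed,
    Run,
    Evidence,
    Evaluation,
    Journal,
    Mutation,
}

impl Stage {
    /// The stage that follows this one. Mutation wraps back to Seed,
    /// which in the README only happens after a human-reviewed change.
    pub fn next(self) -> Stage {
        match self {
            Stage::Seed => Stage::Run,
            Stage::Run => Stage::Evidence,
            Stage::Evidence => Stage::Evaluation,
            Stage::Evaluation => Stage::Journal,
            Stage::Journal => Stage::Mutation,
            Stage::Mutation => Stage::Seed,
        }
    }

    /// Only the Mutation stage yields a proposal, and a proposal is
    /// reviewable data, never an applied source write.
    pub fn produces_proposal(self) -> bool {
        matches!(self, Stage::Mutation)
    }
}

/// Walk one full cycle from `start`, returning the six stages in visit order.
pub fn one_cycle(start: Stage) -> Vec<Stage> {
    let mut visited = Vec::with_capacity(6);
    let mut stage = start;
    for _ in 0..6 {
        visited.push(stage);
        stage = stage.next();
    }
    visited
}
```

Calling `one_cycle(Stage::Seed)` visits each stage once and ends on `Stage::Mutation`, whose `next()` returns to `Stage::Seed`. The sketch deliberately has no "apply" step, mirroring the rule that proposals stay data until reviewed.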
Ouroforge's active direction is the Rust engine core, not hosted demo pages or a
website-like editor shell. Runtime-visible 2D work should start from
crates/ouroforge-core/src/runtime_renderer.rs,
crates/ouroforge-core/src/runtime_tilemap.rs,
crates/ouroforge-core/src/runtime_animation.rs,
crates/ouroforge-core/src/runtime_frame_budget.rs, and
crates/ouroforge-core/src/godot_2d_adapter_ir.rs.
Historical browser demos and milestone example packs have been removed from the
tracked product surface. Any fixture needed by a Rust test should be promoted
into a crate-owned fixture directory or a dedicated tests/fixtures path as
part of the engine migration, not reintroduced under examples/.
Ouroforge's current safety boundary is conservative:
- Trusted authority: Rust CLI/core code and the local filesystem.
- Evidence only: agents, browser workers, and CDP observations can inform proposals but cannot apply them.
- Read-only browser surfaces: dashboard and cockpit pages render exported JSON and copyable commands; they do not write files, run commands, or accept source mutations.
- No command bridge: browser/UI surfaces do not invoke local commands or local server command bridges.
- No source apply authority: source-preview, sandbox, stale-target, rollback, and review artifacts are evidence/governance boundaries unless a later explicit issue authorizes trusted apply.
- Generated-state isolation: runs/, target/, dashboard exports, .omx/, .omc/, .openchrome/, .claude/, and sandbox outputs remain local ignored state.
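
The boundary listed above can be read as a small permission table. The sketch below encodes it under stated assumptions: `Actor`, `Action`, and `is_allowed` are hypothetical names invented for this example, not the project's actual authority model.

```rust
/// Who is attempting something (hypothetical categories from the list above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Actor {
    TrustedCore,    // Rust CLI/core code acting on the local filesystem
    Agent,          // evidence input only
    BrowserWorker,  // evidence input only
    CdpObservation, // evidence input only
    BrowserSurface, // read-only dashboard or cockpit page
}

/// What is being attempted (hypothetical action set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    ProvideEvidence,
    WriteFile,
    RunCommand,
    ApplySource,
}

/// Returns true only when the README's stated boundary permits the action.
pub fn is_allowed(actor: Actor, action: Action) -> bool {
    match (actor, action) {
        // No source apply authority exists yet for anyone.
        (_, Action::ApplySource) => false,
        // The trusted core owns file writes and command execution.
        (Actor::TrustedCore, _) => true,
        // Browser surfaces only render exported JSON.
        (Actor::BrowserSurface, _) => false,
        // Agents, browser workers, and CDP observations may supply evidence.
        (_, Action::ProvideEvidence) => true,
        // Everything else is denied by default.
        _ => false,
    }
}
```

The deny-by-default final arm is the design point: a new actor or action added later gets no authority until a rule explicitly grants it.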
Security and trust-boundary references:
- SECURITY.md
- docs/evidence-fidelity-trust-boundary-v1.md
- docs/public-alpha-security-trust-boundary-v1.md
- docs/public-alpha-disclosure-and-sandbox-limitations-v1.md
- docs/artifact-write-policy-v1.md
- docs/authority-cosmetic-boundary-v1.md
- docs/engine-runtime-studio-project-boundary-v1.md: BG-M1 Engine / Runtime / Studio / Project boundary SSOT.
- docs/runtime-artifact-boundary-v1.md: BG-M2 candidate runtime artifact, probe API, and mode boundary.
- docs/studio-authority-boundary-v1.md: BG-M3 Studio preview, trusted write, runtime bridge, and editor-only exclusion boundary.
- docs/game-project-layout-v1.md: BG-M4 game project layout, fixture/example split, dogfood boundary, and template promotion rules.
- docs/evidence-harness-boundary-v1.md: BG-M5 evidence harness ownership, browser packet attachment points, and probe-vs-game-state boundary.
- docs/export-runtime-boundary-v1.md: BG-M6 export inclusion/exclusion, dev-vs-export equivalence, and strip/manifest rules.
- docs/api-tier-boundary-v1.md: BG-M7 API tier taxonomy, current API tier inventory, and boundary coupling guard.
- docs/boundary-preflight-checklist-v1.md: BG-032/BG-M8 reusable AX–BF boundary preflight before implementation prompts, coding sessions, and issue retrofits.
- docs/ax-runtime-performance-boundary-retrofit-v1.md: BG-028 AX runtime performance boundary retrofit for candidate runtime vs fixture evidence claims.
- docs/az-studio-authority-boundary-retrofit-v1.md: BG-029 AZ Studio authority retrofit for preview/persist separation, trusted writes, and editor-only exclusion.
Ouroforge does not currently provide:
- hosted/cloud execution, accounts, authentication, authorization, or multi-tenant behavior;
- production readiness, support/security SLA, compatibility stability, or secure sandboxing for arbitrary untrusted content;
- native export, packaging, signing, publishing, deployment, or release automation;
- plugin runtime, marketplace, visual scripting, or third-party code-loading ecosystem;
- browser trusted writes, local command bridges, auto-apply, auto-merge, or reviewer bypass;
- source patch apply to the trusted maintainer worktree.
Public release still requires fresh evidence gates in
docs/public-readiness-audit.md,
docs/public-launch-checklist.md, and the
manual visibility-decision process. The launch-governance and communication-pack
docs are preparation artifacts, not a visibility toggle or publication event.
The roadmap and per-milestone completion records — with each milestone's
evidence chain and non-goals — live in docs/roadmap.md.
Cross-cutting boundaries are in
Non-goals and maturity boundaries; they are
not repeated per milestone here. Earlier completed milestones — including
Safe Source Mutation Apply, the GDD-to-Playable Prototype v1 prototype lane,
the Plugin / Extension System v1 lane, the Full Studio Editor lane, the
Godot-Plus Demo lane, and the Autonomous QA / Playtest Swarm v1 lane — keep
their full evidence chain and per-issue records in
docs/roadmap.md; only the current Era's snapshot is
summarised below.
Era AC note. Era AC (M179-M190) is recorded complete as a bounded deterministic performance-engine substrate: native + WASM digest equivalence, worker frame pipeline, presentation renderer batching, soak evidence, and trusted-WASM governance; #1 and #23 remain open.
Era AD note. Era AD (M191-M197) is recorded complete in docs/roadmap.md and docs/era-ad-unlock-ledger-finalized-v1.md as a foundational architecture milestone establishing the authoritative-core / non-authoritative-shell boundary, reconciliation protocol, x86↔ARM determinism hardening, and Unlock Ledger for Eras AE-AK; it does not authorize a Godot/Unity parity claim, production-release readiness, or commercial distribution — engineering path unlocked, NO parity claim yet.
Era AK note. Era AK (#3434, #3470-#3475) now has a bounded competitiveness rubric scaffold, gap matrix scaffold, contributor/extension boundary, roll-up draft, and wording guard. Final competitiveness wording remains blocked until AE-AJ evidence is merged or otherwise traceable; unsupported platform, export, hosted, ecosystem, support/SLA, and autonomous-release areas remain explicit.
Current state. Era H (Milestones 42–46) is recorded complete on merged
evidence in docs/roadmap.md: Multi-Agent Production Pipeline
v1 (M42), Autonomous Producer and Whole-Game Orchestration v1 (M43), Scaled Trust
Gradient / Release Provenance / Compliance v1 (M44), the Shipping and LiveOps
Layer-3 Re-evaluation Design Gate (M45), and the Era H closing autonomy
assessment (M46). The descriptive autonomy posture is in
docs/era-h-autonomy-assessment.md: agents
and local Rust contracts can carry proposal, evidence, orchestration, QA,
provenance, and release-candidate preparation work, but vision, taste/fun, legal
compliance acceptance, and release go/no-go remain human decisions.
Earlier foundations remain recorded. Era E established bounded local trust and
Layer-3 DEFER in docs/layer3-reevaluation-v1.md;
Era F/G added genre/function evidence and specialized production gates. The #1 and
#23 anchors are deliberately kept open as ongoing north-star tracks. The full
per-era completion history and evidence chains live in
docs/roadmap.md and the matching docs/*.md contracts.
Current frontier. Era J (Milestones 57-60) is complete on merged evidence as
a bounded human creative/release-judgment track over the existing deckbuilder
substrate: candidate generation and curation, human playtest/fun-feel capture,
narrative/theme proposal assistance, human-approved balance recommendations,
and release-readiness go/no-go evidence. The closing assessment is recorded in
docs/era-j-creative-leverage-assessment.md:
Ouroforge increases proposal/evidence output per human decision, but the
permanent human core remains fun, taste, tone/soul, curation, balance approval,
release go/no-go, and market judgment. This is not automated creativity, an
automated fun/quality verdict, release authority, production readiness, or a
Godot replacement/parity claim.
Next. Later work requires separate issue-scoped design gates. Shipping/native-store release actions, hosted/cloud, real-player telemetry, live balancing, update/patch pipelines, market demand, and distributed Layer-3 behavior remain DEFER absent a separate #1508 Layer-3 GO; Rust-first / local-first is preserved absent that GO.
- Contribution workflow and review expectations: CONTRIBUTING.md
- Security policy and vulnerability reporting: SECURITY.md
- License: LICENSE
Before opening a PR, run:
cargo fmt --check
cargo test
cargo clippy --all-targets --all-features -- -D warnings

Per-milestone evidence steps live in the matching docs/*.md files. Keep
generated/local runtime state untracked.
Use docs/README.md as the expanded documentation index. The
README keeps only the most common starting points so public-alpha readers do not
have to scan every milestone contract first.
| Reader question | Start here |
|---|---|
| How does the loop work in detail? | docs/architecture.md |
| What is complete and what is next? | docs/roadmap.md |
| What is the trust boundary? | docs/README.md#safetytrust-boundaries |
| What separates engine core, runtime, Studio, project, evidence, export, and API layers? | docs/engine-runtime-studio-project-boundary-v1.md |
| Is examples/game-runtime/ a fixture, candidate runtime, or product runtime? | docs/runtime-artifact-boundary-v1.md |
| How may Studio preview, request writes, and bridge to runtime? | docs/studio-authority-boundary-v1.md |
| Where are milestone references grouped? | docs/README.md |
| What wording is forbidden or risky? | docs/public-wording-guardrail-v1.md, docs/public-wording-audit-process-v1.md |
| Where is the final docs IA audit? | docs/docs-link-wording-audit-pa1.5.3.md |
- crates/ouroforge-core: trusted core models and evidence APIs for seeds, runs, ledgers, browser smoke, scenarios, evaluator, journal, mutation proposals, project/scene contracts, source-preview boundaries, and dashboard read models.
- crates/ouroforge-cli: CLI entrypoints for seed, run, evidence, journal, mutation, dashboard, scene, project, source-preview, and related commands.
- seeds/: MVP seed examples.
- docs/: architecture, roadmap, trust-boundary/evidence contracts, milestone notes, public-readiness audits, and governance handoff docs.
Do not commit generated or local runtime/tool state: runs/, target/,
dashboard-data/, sandbox/,
.claude/, .openchrome/, .omc/, .omx/. See
docs/artifact-write-policy-v1.md for the
trusted-write categories and generated-output/source-like collision policy.
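
As a rough illustration of the rule above, a pre-commit style check could classify a repository-relative path by its first component. This is a sketch, not the project's tooling: `GENERATED_ROOTS` and `is_generated_state` are hypothetical names, and the check assumes paths are relative to the repository root (absolute paths and nested directories such as a crate-local target/ are not handled).

```rust
use std::path::{Component, Path};

/// Top-level directories the README lists as generated or local tool state.
const GENERATED_ROOTS: [&str; 8] = [
    "runs",
    "target",
    "dashboard-data",
    "sandbox",
    ".claude",
    ".openchrome",
    ".omc",
    ".omx",
];

/// Returns true when the first real component of a repository-relative
/// path is one of the generated-state roots. A leading `./` is skipped.
pub fn is_generated_state(path: &Path) -> bool {
    for component in path.components() {
        match component {
            // Ignore a leading "./" and keep looking.
            Component::CurDir => continue,
            // Decide on the first named component only.
            Component::Normal(name) => {
                return name
                    .to_str()
                    .map_or(false, |n| GENERATED_ROOTS.contains(&n));
            }
            // Absolute roots, prefixes, and ".." are out of scope for this sketch.
            _ => return false,
        }
    }
    false
}
```

For example, `is_generated_state(Path::new("runs/latest/report.json"))` is true, while `docs/roadmap.md` and `seeds/runs/example` are not flagged because only the first path component is inspected.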
Era AE records a browser-verified performance gate, not a performance-achieved
claim. Runtime/performance closure now requires local Chrome/Chromium evidence
with URL, screenshot, console, runtime probe, machine-readable performance
report, digest/reconciliation report, pass/fail summary, and known gaps. The
canonical guard is node scripts/era-ae-browser-verified-gate.cjs; the live
browser run is node scripts/era-ae-browser-verified-gate.cjs --live.
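
The closure requirement above amounts to a completeness check over eight evidence items. The sketch below models that check in Rust under stated assumptions: `BrowserEvidencePacket`, its field names, `missing_evidence`, and `gate_passes` are all hypothetical, and this does not reproduce the logic or file format of scripts/era-ae-browser-verified-gate.cjs.

```rust
/// Hypothetical in-memory view of one browser-verified evidence packet.
/// Each field holds a value or artifact reference, or None if absent.
#[derive(Debug, Default, Clone)]
pub struct BrowserEvidencePacket {
    pub url: Option<String>,
    pub screenshot: Option<String>,
    pub console_log: Option<String>,
    pub runtime_probe: Option<String>,
    pub performance_report: Option<String>,
    pub digest_reconciliation_report: Option<String>,
    pub pass_fail_summary: Option<String>,
    /// Some(vec![]) means "no known gaps were recorded"; None means the
    /// known-gaps section itself is missing.
    pub known_gaps: Option<Vec<String>>,
}

/// Lists the required evidence items that the packet does not contain.
pub fn missing_evidence(packet: &BrowserEvidencePacket) -> Vec<&'static str> {
    let checks: [(&'static str, bool); 8] = [
        ("url", packet.url.is_some()),
        ("screenshot", packet.screenshot.is_some()),
        ("console", packet.console_log.is_some()),
        ("runtime probe", packet.runtime_probe.is_some()),
        ("performance report", packet.performance_report.is_some()),
        ("digest/reconciliation report", packet.digest_reconciliation_report.is_some()),
        ("pass/fail summary", packet.pass_fail_summary.is_some()),
        ("known gaps", packet.known_gaps.is_some()),
    ];
    let mut missing = Vec::new();
    for (name, present) in checks.iter() {
        if !*present {
            missing.push(*name);
        }
    }
    missing
}

/// The packet is complete only when nothing is missing. Completeness says
/// the evidence exists, not that any performance target was achieved.
pub fn gate_passes(packet: &BrowserEvidencePacket) -> bool {
    missing_evidence(packet).is_empty()
}
```

Treating `known_gaps` as required even when empty keeps the distinction between "no gaps found" and "gaps never recorded", which matches the document's preference for routing known gaps rather than hiding them.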