The voice-to-record first hop into CJADC2 — built for DDIL. Speak a field report. Get a doctrinally-correct, schema-validated structured record on every operator's screen — offline, in seconds, over a $30 mesh radio.
A proof-of-concept built for the 2026 NatSec Hackathon (Army xTech / Cerebral Valley · May 2026).
"DDIL — disconnected, degraded, intermittent, and low-bandwidth conditions — is no longer a contingency. It is the default environment for many operations."
Yet the Army's first hop, from a soldier's voice to a structured record, still works like this:
- A soldier in the dirt sees something time-critical — a casualty, a contact, a fuel state.
- They squeeze a doctrinally-shaped 9-line out of their head onto a radio in the middle of a fight.
- The receiver writes it down (and hopefully gets the urgency right).
- Someone re-keys it into GCSS-Army, the MEDEVAC desk, or the targeting tool (Maven Smart System, AIP, TITAN).
Each handoff loses fidelity. Each handoff costs minutes. And in DDIL conditions there often isn't a connection to do any of it. The GAO has flagged that even when CJADC2 connectivity exists, "overly restrictive data classification is a significant hindrance to sharing command and control data."
AI has compressed sensor-to-shooter from 20 minutes to 20 seconds. The kill chain is fast. The first hop — voice to typed record — is still glacial, and it's where everything currently breaks.
Beacon owns that first hop. Not a new C2 system — the missing edge in front of the ones the DoD has already paid for.
- Speak normally. A free-form report from a phone, headset, or laptop. No 9-line template, no menus.
- Get a doctrinally-correct structured record back, on-device. A small local LLM extracts a MEDEVAC 9-line (FM 4-02.2), LOGSTAT (FM 4-0), contact event (ADP 6-0), call-for-fire (FM 3-09), or convoy reroute (FM 3-90): typed and Pydantic-validated, doctrinally pin-cited, never free-text.
- Sync over a $30 radio. Records propagate across a Meshtastic LoRa mesh: no wifi, no cell, no SATCOM, no tower. Every peer Beacon node sees the same structured picture in seconds. (Same radio class that already integrates with ATAK, the "tactical operating system" with 500K+ users across the force.)
- Push upstream when uplink returns. Confirmed records flow into Palantir AIP / Maven Smart System, GCSS-Army, and any CJADC2 surface through their existing REST endpoints. Pending proposals never leak.
- Human-in-the-loop on life-or-death. MEDEVAC, fires, and reroutes are always proposed by the AI and confirmed by an operator — aligned with the DoD Responsible AI Strategy ("humans retain responsibility for final engagement decisions").
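The "typed and validated, never free-text" step in the list above can be illustrated with a small sketch. The repo validates with Pydantic; this version swaps in standard-library dataclasses so it runs with no dependencies. `MedevacDraft`, `Precedence`, `parse_draft`, and the field names are hypothetical and cover only a fragment of a 9-line; they are not Beacon's actual schema.

```python
import re
from dataclasses import dataclass
from enum import Enum

# Loose shape check only: zone number, latitude band, 100 km square letters,
# then an even number of digits. It does not prove the grid exists.
MGRS_RE = re.compile(r"^\d{1,2}[C-HJ-NP-X][A-HJ-NP-Z]{2}(\d{2}){1,5}$")


class Precedence(str, Enum):
    URGENT = "urgent"
    URGENT_SURGICAL = "urgent_surgical"
    PRIORITY = "priority"
    ROUTINE = "routine"
    CONVENIENCE = "convenience"


@dataclass(frozen=True)
class MedevacDraft:
    """Hypothetical fragment of a 9-line record (assumed field names)."""
    location_mgrs: str
    precedence: Precedence
    patient_count: int
    citation: str = "FM 4-02.2"

    def __post_init__(self):
        if not MGRS_RE.match(self.location_mgrs):
            raise ValueError(f"bad MGRS grid: {self.location_mgrs!r}")
        if self.patient_count < 1:
            raise ValueError("patient_count must be at least 1")
        # Coerce a raw string (e.g. from an LLM tool call) into the enum;
        # an unknown value raises ValueError.
        object.__setattr__(self, "precedence", Precedence(self.precedence))


def parse_draft(args: dict) -> MedevacDraft:
    """Validate tool-call arguments; anything off-schema raises ValueError."""
    try:
        return MedevacDraft(**args)
    except TypeError as exc:  # missing, unexpected, or wrongly typed field
        raise ValueError(str(exc)) from exc
```

A report such as "two wounded, urgent surgical" either becomes a fully typed record or is rejected outright; there is no partially filled middle state for a downstream system to misread.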
| | Voice radio | Cloud-dependent C2 | Maven / AIP / GCSS-Army | Beacon |
|---|---|---|---|---|
| Works in DDIL (default conditions) | ✅ | ❌ | partial | ✅ |
| Works without infrastructure | ✅ | ❌ | ❌ | ✅ |
| Outputs typed, doctrinally-correct records | ❌ | ✅ | ✅ | ✅ |
| Time from speech → record | minutes (re-keying) | minutes (re-keying) | n/a (consumes records) | seconds |
| Operator workload | high | medium | low | low |
| Aligned with DoD RAI human-in-the-loop policy | n/a | varies | varies | ✅ (enforced) |
Beacon doesn't replace C2 — it feeds C2. It doesn't replace human judgement — it proposes, and a human disposes.
| Existing program / surface | Beacon's relationship |
|---|---|
| MEDEVAC desk (FM 4-02.2 9-line) | Beacon emits the 9-line as a typed record; desk gets a clean, traceable request |
| GCSS-Army (LOGSTAT, sustainment ERP) | Beacon ingests Class III/V/etc. consumption from voice → SupplyStatus record |
| Maven Smart System / Palantir AIP | Beacon pushes confirmed records through REST; AIP/Maven owns the kill chain — Beacon owns the first hop |
| ATAK / TAK ecosystem (500K+ users) | Roadmap target: native ATAK plugin surface (today: web dashboard) |
| Mission Command (ADP 6-0) | Beacon contributes to the COP — peer nodes share situational state immediately |
| DoD RAI Strategy + NIST AI RMF | Confirm-required gate on life-or-death tools enforced at the trigger lane |
| CJADC2 | Beacon is the data-fabric on-ramp from the dirt — local-first, classification-friendly (records can be reviewed pre-emission) |
This is a hackathon proof-of-concept, not a fielded system. Treat it as a working reference architecture you can run on a laptop today, not as a deployable product.
What works (verified by the test suite — 397 tests collect, 4 e2e gated by BEACON_E2E=1)
- Agent pipeline: voice/text → tool call → schema-validated record → SQLite. 7 tools (3 confirm-required: `request_medevac`, `recommend_reroute`, `request_fires`); 9 RAG tools from GAIA's `RAGToolsMixin`.
- Three LLM backends: Lemonade (default), OpenAI (`BEACON_USE_OPENAI=1`), Claude (`BEACON_USE_CLAUDE=1`). Default model `Qwen3-4B-Instruct-2507-GGUF`.
- Voice transcription on the server: `POST /voice/transcribe` proxies to Lemonade's OpenAI-compatible Whisper endpoint. `502` on backend down, no in-process fallback.
- Mesh transport: in-process UDP adapter; envelope JSON with mesh-id namespacing, no-loopback, ACK tracking. `/mesh/{status,enable,disable,test/*}` endpoints.
- AIP push: confirmed-only trigger lane subscriber. `{connected:false}` when `AIP_BASE_URL` is unset: a real not-configured state, not a mock.
- InsightsEngine: agent-only (no heuristics), 1 s trailing debounce, JSON-array contract.
- Reasoning trace: 500-step in-memory ring buffer streamed at `/agent/reasoning/stream`. `OBSERVE`, `TOOL`, `PROPOSE`, `SURFACE`, `DECIDE`, `QUERY` step types; bookends every query lifecycle.
- Web dashboard (Vite + React) with EventFeed, Map, Insights panel, Burndown panel, Mesh panel, AIP panel, Reasoning panel, Doctrine corpus panel.
- Eval framework: `beacon-eval` CLI; Claude judge via either subscription auth or `ANTHROPIC_API_KEY`; 24 scenarios across 8 categories.
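The mesh transport item above names three envelope behaviors: mesh-id namespacing, no-loopback, and ACK tracking. The sketch below shows one way those rules can fit together. It is an illustration under assumptions, not the repo's `UDPMeshTransport`: the `MeshNode` class and the envelope keys (`mesh_id`, `sender`, `msg_id`, `kind`, `ack_of`, `payload`) are hypothetical, and the socket layer is left out so the logic can be exercised in memory.

```python
import json
import uuid


class MeshNode:
    """Hypothetical envelope handling for one node (no sockets involved)."""

    def __init__(self, mesh_id: str, node_id: str):
        self.mesh_id = mesh_id
        self.node_id = node_id
        self.unacked = {}  # msg_id -> envelope still waiting for an ACK

    def wrap(self, record: dict) -> bytes:
        """Build the JSON envelope for an outgoing record and track it."""
        env = {
            "mesh_id": self.mesh_id,
            "sender": self.node_id,
            "msg_id": uuid.uuid4().hex,
            "kind": "record",
            "payload": record,
        }
        self.unacked[env["msg_id"]] = env
        return json.dumps(env).encode()

    def handle(self, datagram: bytes):
        """Return (payload, ack_bytes); both are None when nothing is delivered."""
        try:
            env = json.loads(datagram)
        except ValueError:  # not JSON, or not valid UTF-8
            return None, None
        if not isinstance(env, dict) or env.get("mesh_id") != self.mesh_id:
            return None, None  # namespacing: ignore other meshes
        if env.get("sender") == self.node_id:
            return None, None  # no-loopback: ignore our own traffic
        if env.get("kind") == "ack":
            self.unacked.pop(env.get("ack_of"), None)  # ACK tracking
            return None, None
        ack = {
            "mesh_id": self.mesh_id,
            "sender": self.node_id,
            "kind": "ack",
            "ack_of": env.get("msg_id"),
        }
        return env.get("payload"), json.dumps(ack).encode()
```

In this sketch a sender could later retransmit whatever remains in `unacked`; whether Beacon retries, and on what schedule, is not stated in this README.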
What does not exist yet
- No ATAK plugin and no mobile companion app. Today's only operator surface is the web dashboard. ATAK plugin is the obvious next surface given the 500K-user ecosystem.
- No push-to-talk button in the dashboard UI. The browser voice helper and the `/voice/transcribe` endpoint exist; the on-screen button was removed. Headset clients can still POST WAV bytes.
- No real LoRa radio validation. The UDP mesh transport has been exercised locally and against the standalone `MeshService` server. It has not been wired through actual Meshtastic hardware in this branch.
- Hardware probes from Phase 0 were not run. Development is on a Mac M4 Pro (Apple Silicon). Spec-target latencies (5 s p50, 8 s p95) come from a Toughbook-class assumption; we measured ~5 s p50 on M4 Pro for the canonical 6-scenario corpus and have not confirmed Ryzen AI numbers.
- Eval corpus is hackathon-scale. 24 scenarios across `iran-showcase`, `mixed`, `rag`, `medevac`, `logstat`, `doctrinal`, `edge`, `showcase`. The agent-spec called for ~40 across more axes; treat the numbers as a sanity gate, not a release gate.
- AIP integration has never been pointed at a real Palantir tenant. Tests use `FakeHTTP`. The mapping (medevac → MedevacRequest, etc.) is correct; the live wiring is unproven.
- All data is synthetic. No real OPORDs, casualty data, unit positions, MGRS coordinates, or supply rates ship with this repo, and none should be added.
If your downstream decision depends on Beacon doing something not in the What works list, it doesn't do it yet.
git clone https://github.com/kovtcharov/beacon && cd beacon
python -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"
# Smoke check (397 tests collect; 4 e2e are skipped without BEACON_E2E=1)
pytest tests/
# Seed synthetic demo state (3 companies × 5 supply items × 6h history + a convoy event)
beacon seed

You need three processes for the full loop. Start them in three terminals.
# 1) Local LLM backend. Required unless you flip BEACON_USE_OPENAI=1 or BEACON_USE_CLAUDE=1.
lemonade-server serve
# 2) Beacon API. Defaults to :8888 (Lemonade owns :8001/:8002 when warm).
beacon serve
# 3) Dashboard. :5173 collides with Claudia in some dev environments — use --port 5174 if needed.
cd dashboard && npm install && npm run dev -- --port 5174

Open http://localhost:5174, type a field report into the dashboard textbox, hit Submit, and watch records propagate through the EventFeed, Insights panel, and Reasoning panel. Confirm any pending life-or-death proposals.
beacon ask --auto-confirm "Bravo Co at 38SMB12345678, two wounded, gunshot, urgent surgical"
beacon chat # interactive REPL with confirm prompts
beacon state # dump current SQLite state as JSON

`--auto-confirm` is for scripted runs only: it bypasses the operator-confirm gate (`confirm_mode="never"`). Don't use it in demos.
| Env | Behavior |
|---|---|
| (none) | Lemonade Server, default model Qwen3-4B-Instruct-2507-GGUF |
| `BEACON_USE_OPENAI=1 OPENAI_API_KEY=…` | OpenAI fallback |
| `BEACON_USE_CLAUDE=1 ANTHROPIC_API_KEY=…` | Claude fallback |
| `BEACON_SKIP_LEMONADE=1` | No LLM probing at startup. `/health` and `/state` work; `/agent/run` returns 502 |
| `BEACON_MODEL=Gemma-4-E4B-it-GGUF` | Override the default Lemonade model |
| `BEACON_AGENT_TIMEOUT_S=90` | Per-query wall-clock cap. Default 90 s. Set 0 to disable |
| `BEACON_MESH=1` / `0` | Auto-start the UDP mesh transport at lifespan startup / hard env lock |
| `AIP_BASE_URL=… AIP_TOKEN=…` | Real AIP push. Otherwise `/aip/entities` returns `{connected:false}` |
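As a reading aid for the table above, here is a minimal sketch of how these variables could be resolved into settings. It is not Beacon's startup code: `resolve_settings` and its return keys are hypothetical, the precedence of OpenAI over Claude when both flags are set is an assumption, and `BEACON_MESH` is omitted because the table does not spell out its lock semantics.

```python
import os

DEFAULT_MODEL = "Qwen3-4B-Instruct-2507-GGUF"
DEFAULT_TIMEOUT_S = 90.0


def resolve_settings(env=None) -> dict:
    """Map environment variables to settings (hypothetical helper)."""
    env = os.environ if env is None else env

    # Assumption: if both flags are set, OpenAI wins. The README does not say.
    if env.get("BEACON_USE_OPENAI") == "1":
        backend = "openai"
    elif env.get("BEACON_USE_CLAUDE") == "1":
        backend = "claude"
    else:
        backend = "lemonade"

    timeout = float(env.get("BEACON_AGENT_TIMEOUT_S", DEFAULT_TIMEOUT_S))

    return {
        "backend": backend,
        "model": env.get("BEACON_MODEL", DEFAULT_MODEL),
        # 0 disables the per-query wall-clock cap.
        "timeout_s": None if timeout == 0 else timeout,
        # BEACON_SKIP_LEMONADE=1 skips LLM probing at startup.
        "probe_llm": env.get("BEACON_SKIP_LEMONADE") != "1",
        # Without AIP_BASE_URL the push adapter reports "not connected".
        "aip_configured": bool(env.get("AIP_BASE_URL")),
    }
```

Passing a plain dict instead of reading `os.environ` directly keeps the mapping easy to test.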
request_medevac, recommend_reroute, and request_fires write to pending_actions and return status="pending_confirm". The trigger lane stays silent until an operator promotes the row:
curl -X POST http://localhost:8888/agent/confirm/<event_uuid> # promote → fire trigger lane
curl -X POST http://localhost:8888/agent/reject/<event_uuid> # drop, no emission

This is AGENTS.md § 2.2: non-negotiable, and the seam where DoD RAI human-in-the-loop policy lives in code. Tests and demos must use the dashboard or the explicit `--auto-confirm` opt-in; nothing else bypasses it.
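The confirm gate described here can be modeled in a few lines. The sketch below is an in-memory illustration under assumptions, not the code in `state.py`: `ConfirmGate` and its `emit` callback (standing in for the trigger lane) are hypothetical, and the `"emitted"` status for tools that need no confirmation is made up for the example. Only the three tool names, `pending_actions`, and `status="pending_confirm"` come from this README.

```python
import uuid

CONFIRM_REQUIRED = {"request_medevac", "recommend_reroute", "request_fires"}


class ConfirmGate:
    """Hold life-or-death proposals until an operator decides."""

    def __init__(self, emit):
        self.emit = emit            # stand-in for the trigger lane
        self.pending_actions = {}   # event_uuid -> (tool, payload)

    def submit(self, tool: str, payload: dict) -> dict:
        if tool not in CONFIRM_REQUIRED:
            self.emit(tool, payload)
            return {"status": "emitted"}  # hypothetical status value
        event_uuid = str(uuid.uuid4())
        self.pending_actions[event_uuid] = (tool, payload)
        # Nothing reaches the trigger lane yet.
        return {"status": "pending_confirm", "event_uuid": event_uuid}

    def confirm(self, event_uuid: str) -> None:
        """Operator approved: promote the row and fire the trigger lane."""
        tool, payload = self.pending_actions.pop(event_uuid)
        self.emit(tool, payload)

    def reject(self, event_uuid: str) -> None:
        """Operator declined: drop the row with no emission."""
        self.pending_actions.pop(event_uuid)
```

The property worth noticing is structural: the only call path from a confirm-required tool to `emit` runs through `confirm`, so a subscriber on the trigger lane cannot observe an unconfirmed proposal.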
# Full corpus, Claude Code subscription auth (no ANTHROPIC_API_KEY needed)
beacon-eval --output eval/runs/baseline-$(date +%s)/ --judge-backend claude-code
# Single category
beacon-eval --category mixed --output eval/runs/mixed-only/ --judge-backend claude-code
# Anthropic SDK (default; needs ANTHROPIC_API_KEY)
beacon-eval --output eval/runs/sdk/

The judge is always Claude regardless of which backend the agent under test uses.
src/beacon/
agent.py BeaconAgent — GAIA subclass + tool registration + Qwen-specific overrides
agent_runner.py Async queue, /agent/{enqueue,status,start,pause,resume,stop}, 90 s timeout
state.py SQLite + two-lane EventBus + reap_orphaned_processing
schemas.py Every Pydantic model (records, payloads, mesh envelopes, reasoning steps)
tools/ 7 factory-pattern tools — make_<name>(state)
mesh.py UDPMeshTransport (alias BeaconMeshAdapter); subscribes to sync lane
aip.py Palantir REST adapter; subscribes to trigger lane only
voice.py Lemonade Whisper wrapper; raises BeaconVoiceError on any failure
insights.py Agent-only insights, 1 s debounce
reasoning.py 500-step ring buffer for /agent/reasoning/stream
server.py FastAPI app + lifespan + all HTTP routes
cli.py `beacon serve · seed · ask · chat · state` (typer)
prompts/ v1..v8.py — full file per version, diff-friendly. CURRENT = v8
eval/ beacon-eval entry point (Claude judge)
dashboard/src/ Vite + React. App.tsx, EventFeed.tsx, Map.tsx, *Panel.tsx
docs/plans/ overview.md, agent-spec.md, implementation.md, glossary.md
docs/assets/ Diagrams + screenshots referenced from this README
eval/scenarios/ 24 YAML scenarios across 8 categories
tests/ One file per source module. 397 collect; 4 e2e gated by BEACON_E2E=1
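The 500-step ring buffer listed for `reasoning.py` maps naturally onto a bounded `collections.deque`. The following is a sketch of that idea only: `ReasoningTrace`, its `record` and `since` methods, and the `seq`/`detail` fields are hypothetical, and streaming over HTTP is left out. The six step types and the capacity of 500 come from this README.

```python
from collections import deque
from itertools import count

STEP_TYPES = {"OBSERVE", "TOOL", "PROPOSE", "SURFACE", "DECIDE", "QUERY"}


class ReasoningTrace:
    """Bounded in-memory trace; the oldest steps fall off when full."""

    def __init__(self, capacity: int = 500):
        self._steps = deque(maxlen=capacity)
        self._seq = count(1)  # monotonically increasing step number

    def record(self, step_type: str, detail: str) -> dict:
        if step_type not in STEP_TYPES:
            raise ValueError(f"unknown step type: {step_type!r}")
        step = {"seq": next(self._seq), "type": step_type, "detail": detail}
        self._steps.append(step)  # deque evicts the oldest entry itself
        return step

    def since(self, seq: int = 0) -> list:
        """Steps newer than `seq`, so a client can resume after a gap."""
        return [s for s in self._steps if s["seq"] > seq]
```

Because sequence numbers keep rising after eviction, a reconnecting client can tell that it missed steps instead of silently seeing a shorter history.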
All data is synthetic. No real OPORDs, casualty data, unit positions, MGRS coordinates, supply consumption rates, or military information of any kind. Operational details (callsigns, grids, OPFOR descriptions) are fictional and bear no resemblance to actual operations.
No AI-attribution. Per `AGENTS.md` § 3.5, this repo's commits, PRs, and code carry no `Co-Authored-By: Claude` trailers or AI-attribution footers. Humans are authors of record.
MIT — see LICENSE. Built on top of GAIA (Apache-2.0) and Meshtastic.
Named for the Beacons of Gondor — a chain of independent fire-watches that propagated an urgent alert from Minas Tirith to Edoras with no central infrastructure. Mesh radios instead of bonfires; MEDEVAC 9-lines instead of "Gondor calls for aid." And in case the resonance isn't obvious — we feed the palantíri the leaders look into.