From 93a9a331edfde007ef4e2f737c0cec4f916a0aa7 Mon Sep 17 00:00:00 2001
From: chris-colinsky
Date: Tue, 12 May 2026 19:51:51 -0700
Subject: [PATCH 1/2] docs: scrub em dashes, open spec link in a new tab
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Two small docs-site cleanups bundled in one PR.

Em dashes (—) are a common LLM tell; the README scrub during PR #36
replaced them with commas, colons, semicolons, or sentence splits
depending on context. Apply the same pass across every page under
docs/. Total: 142 em dashes across 13 files, scrubbed without changing
the prose's meaning. ``grep -c "—" docs/**/*.md`` returns zero across
the tree.

Pattern choice per occurrence: parentheticals (X — Y — Z) become
parens or commas; definitions (X — body) become ``X: body``;
two-clause sentences (X — Y) become sentence splits or semicolons;
appositives become commas. Where the em dash was load-bearing in the
prose rhythm, the sentence was rewritten rather than mechanically
substituted.

Also: the apex "specification" link on the docs landing page now
opens in a new tab. attr_list is already enabled in mkdocs.yml, so
``{target="_blank" rel="noopener"}`` after the link does the job
inline.

uv run mkdocs build --strict clean.
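
Reviewer note on the grep check: it only covers the whole tree when the
shell expands ``**`` recursively (zsh does by default; bash needs
``shopt -s globstar``). Without that, bash treats ``**`` like ``*`` and
top-level pages such as docs/index.md are skipped. A shell-independent
sketch of the same check follows. It is not part of this patch: the
function name ``count_em_dashes`` and the default ``docs`` root are
illustrative assumptions, and it counts characters, where ``grep -c``
counts matching lines.

```python
from pathlib import Path

# U+2014 EM DASH, written as an escape so this file stays scrub-clean.
EM_DASH = "\u2014"


def count_em_dashes(root: str = "docs") -> dict[str, int]:
    """Map every Markdown file under `root` to its em-dash count."""
    return {
        str(path): path.read_text(encoding="utf-8").count(EM_DASH)
        for path in sorted(Path(root).rglob("*.md"))  # recurses at any depth
    }
```

Calling ``count_em_dashes()`` from the repository root and checking that
``sum(counts.values())`` is 0 should confirm the scrub without depending
on shell globbing.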
--- docs/concepts/checkpointing.md | 22 +++++------ docs/concepts/composition.md | 32 ++++++++-------- docs/concepts/fan-out.md | 36 +++++++++--------- docs/concepts/graphs.md | 32 ++++++++-------- docs/concepts/index.md | 16 ++++---- docs/concepts/observability.md | 58 ++++++++++++++--------------- docs/concepts/state-and-reducers.md | 14 +++---- docs/contributing/index.md | 2 +- docs/getting-started/index.md | 12 +++--- docs/index.md | 8 ++-- docs/model-providers/authoring.md | 18 ++++----- docs/model-providers/index.md | 18 ++++----- docs/reference/index.md | 8 ++-- 13 files changed, 138 insertions(+), 138 deletions(-) diff --git a/docs/concepts/checkpointing.md b/docs/concepts/checkpointing.md index 47f9343..a59dfd6 100644 --- a/docs/concepts/checkpointing.md +++ b/docs/concepts/checkpointing.md @@ -2,7 +2,7 @@ Save state at every node boundary; resume a crashed run from the last saved point on a subsequent `invoke()`. Without a checkpointer, the -engine holds no state across invocations — a crash means start-from-entry. +engine holds no state across invocations; a crash means start-from-entry. ## Wiring a checkpointer @@ -27,7 +27,7 @@ graph = ( The engine writes a record at every `completed` event for outermost- graph nodes and subgraph-internal nodes. **Fan-out instance internal -events do NOT save** in the shipping version — atomic-restart is the +events do NOT save** in the shipping version. Atomic-restart is the fan-out contract. ## Saves are synchronous-by-contract @@ -40,7 +40,7 @@ because the save resolves before the next node runs. The corollary: slow backends throttle execution. Wrapping a high- latency persistence layer in a checkpointer makes the whole graph -run at its latency. Plan accordingly — async writes inside the +run at its latency. Plan accordingly: async writes inside the backend (e.g., `asyncio.to_thread` around a sync driver) are fine; fire-and-forget patterns that return before durability is established violate the contract. 
@@ -94,7 +94,7 @@ Field framing worth getting right: there. - **`correlation_id` ≠ `invocation_id`.** `invocation_id` identifies *this* graph run uniquely. `correlation_id` is a cross-system - identifier propagated via ContextVar — multiple invocations + identifier propagated via ContextVar; multiple invocations related by a higher-level request can share one `correlation_id` while each having its own `invocation_id`. See [Observability](observability.md) for how `correlation_id` @@ -119,15 +119,15 @@ class Checkpointer(Protocol): async def delete(self, invocation_id: str) -> None: ... ``` -- **`save`** — persist the record under `invocation_id`. Durable for +- **`save`**: persist the record under `invocation_id`. Durable for any backend that documents durability. Synchronous-by-contract per the section above. -- **`load`** — return the *most recent* record for `invocation_id`, +- **`load`**: return the *most recent* record for `invocation_id`, or `None`. Round-trip-stable with what `save` wrote. -- **`list`** — enumerate saved invocations, optionally filtered by +- **`list`**: enumerate saved invocations, optionally filtered by `CheckpointFilter` (currently a single `correlation_id` field; v1 ships intentionally narrow). -- **`delete`** — remove all records for `invocation_id`. No-op if the +- **`delete`**: remove all records for `invocation_id`. No-op if the invocation has no record (no error). Backends MUST be safe to share across concurrent invocations; the @@ -137,10 +137,10 @@ call. ## Two built-in backends -- **`InMemoryCheckpointer`** — backed by a dict in process memory. +- **`InMemoryCheckpointer`**: backed by a dict in process memory. Loses everything on process exit. Useful for tests and short-lived contexts that want the API surface without disk overhead. -- **`SQLiteCheckpointer`** — backed by a SQLite database file. +- **`SQLiteCheckpointer`**: backed by a SQLite database file. Survives process exit. Reasonable default for any non-trivial use. 
Custom backends just implement the four-method Protocol. Targets that @@ -155,7 +155,7 @@ adapter layer. cheap; checkpoints are pure overhead. - **Pipelines whose external side effects can't safely be re-played.** If node A sends an email, resuming from after A means the email - has already sent — fine if your downstream is idempotent, surprising + has already sent; fine if your downstream is idempotent, surprising if it isn't. Reason explicitly about replay semantics before turning on resume. diff --git a/docs/concepts/composition.md b/docs/concepts/composition.md index 9d3251c..b6ea35c 100644 --- a/docs/concepts/composition.md +++ b/docs/concepts/composition.md @@ -7,8 +7,8 @@ pipeline of reusable sub-pipelines: 2. **Subgraphs** encapsulate a sub-pipeline as a single node. 3. **Projections** translate state across the subgraph boundary. -None of these add new primitives — a conditional edge is still one -outgoing edge, a subgraph is still a single node — but they change +None of these add new primitives (a conditional edge is still one +outgoing edge, a subgraph is still a single node) but they change what a graph can express. ## Conditional edges @@ -52,7 +52,7 @@ it somewhere invisible," state-driven routing gives you: **Why sync?** Conditional edges are routing decisions, not units of work. If you want `async def`, the right move is to do the IO in the -producing node and write the decision to a state field — exactly what +producing node and write the decision to a state field, exactly what `classify` does. Keeping edges sync keeps the loop simple to read: node (async) → merge → edge (sync) → next. @@ -114,7 +114,7 @@ builder.add_subgraph_node("research", research_subgraph, projection=...) **Separate state schemas are load-bearing.** The subgraph has its own `State` subclass, distinct from the parent's. At compile time, the subgraph's reducer table and field validation are built against its -own schema. 
Parent fields can't leak in by accident — they aren't in +own schema. Parent fields can't leak in by accident; they aren't in scope on either side of the boundary. **The only way data crosses is through the projection.** @@ -153,14 +153,14 @@ If you don't pass a `projection=` argument, you get this. It behaves asymmetrically: - **`project_in`: parent state is ignored.** Returns - `subgraph_state_cls()` — a fresh instance from the subgraph's + `subgraph_state_cls()`, a fresh instance from the subgraph's defaults. If the subgraph has a required field, this constructor fails; the subgraph can't run without an explicit projection. - **`project_out`: field-name intersection.** Looks at the subgraph's final state, keeps fields whose names also exist on the parent, and returns them as a partial update. The parent's reducers then merge. -The asymmetry — "closed on the way in, open on the way back" — is by +The asymmetry, "closed on the way in, open on the way back," is by design. The author opts *in* to sharing data with the subgraph; the subgraph's observable outputs route back through the parent's reducers automatically. @@ -183,9 +183,9 @@ projection = ExplicitMapping[ParentState, SubgraphState]( builder.add_subgraph_node("analyze_a", subgraph, projection=projection) ``` -`inputs` and `outputs` are independent — pass either, both, or neither. +`inputs` and `outputs` are independent; pass either, both, or neither. -**Asymmetry — inputs additive, outputs replacement.** This mirrors the +**Asymmetry: inputs additive, outputs replacement.** This mirrors the default's asymmetry. - `inputs` is *additive over no-projection-in*. Subgraph fields named @@ -193,7 +193,7 @@ default's asymmetry. fields get their schema defaults. - `outputs` *replaces* field-name matching when present. Only pairs named in `outputs` are merged back. Unnamed subgraph fields are - discarded — no slip of extra fields by accident. + discarded, so no slip of extra fields by accident. 
**`None` vs `{}` for `outputs`:** @@ -206,7 +206,7 @@ default's asymmetry. **Compile-time validation.** `ExplicitMapping.validate` runs at parent-graph compile and raises `MappingReferencesUndeclaredField` if any mapping names a field that isn't on the relevant schema. -Refactor-safe — if you rename a parent field but forget the mapping, +Refactor-safe: if you rename a parent field but forget the mapping, construction fails, not runtime. **The case `ExplicitMapping` uniquely unlocks.** Same subgraph at @@ -235,13 +235,13 @@ builder.add_subgraph_node( The two sites address disjoint parent fields, so they cannot collide. Without explicit mapping, both calls would have to read from and write -to the same parent fields under name matching — making "run the same +to the same parent fields under name matching, making "run the same subgraph twice on different inputs" structurally impossible. ### Custom projection strategies -If you need behavior beyond name-mapping — synthesize values, project -conditionally, transform on the way through — write a class that +If you need behavior beyond name-mapping (synthesize values, project +conditionally, transform on the way through), write a class that matches the Protocol: ```python @@ -268,12 +268,12 @@ A few design points worth sitting with: - **Unknown fields from `project_out` raise.** Parent's `extra="forbid"` catches typos at the merge boundary. - **The `parent_state` argument of `project_out` is for context, not - for writing.** You can read it to decide what to project — "only - return the answer if the parent was in a research route" — but you + for writing.** You can read it to decide what to project ("only + return the answer if the parent was in a research route") but you can't mutate it. `ProjectionStrategy` is a `Protocol`, not a base class. A class fits the shape or it doesn't; the type checker verifies at use sites. 
If you have Java instincts ("where's the `implements` keyword?"), reach -for TypeScript or Go interface instincts instead — that's the same +for TypeScript or Go interface instincts instead; that's the same family. diff --git a/docs/concepts/fan-out.md b/docs/concepts/fan-out.md index bbb3dce..4d69526 100644 --- a/docs/concepts/fan-out.md +++ b/docs/concepts/fan-out.md @@ -6,7 +6,7 @@ a different input, results merged back deterministically. The "same subgraph at two-or-three call sites" pattern from [`ExplicitMapping`](composition.md#explicitmapping-declarative) handles cases where you know the parent fields up front. Fan-out -handles N call sites where N is determined at runtime — "for each +handles N call sites where N is determined at runtime: "for each item in `state.urls`, run the scraping subgraph; collect the results." @@ -15,7 +15,7 @@ results." A fan-out can dispatch instances driven by a list in state (`items_field` mode) or by a count resolved from state (`count` mode). -**`items_field` mode** — one instance per item in a parent list field: +**`items_field` mode**: one instance per item in a parent list field: ```python from openarmature.graph import FanOutConfig, FanOutNode @@ -24,7 +24,7 @@ scrape_all = FanOutNode( name="scrape_all", config=FanOutConfig( subgraph=scrape_subgraph, # CompiledGraph[ScrapeState] - items_field="urls", # parent list field — one instance per item + items_field="urls", # parent list field, one instance per item item_field="url", # subgraph field that receives each item collect_field="content", # subgraph field whose value is collected target_field="contents", # parent list field that receives the collection @@ -36,7 +36,7 @@ scrape_all = FanOutNode( builder.add_node("scrape_all", scrape_all) ``` -**`count` mode** — fixed-or-dynamic instance count, no list field: +**`count` mode**: fixed-or-dynamic instance count, no list field: ```python fan_out = FanOutNode( @@ -58,7 +58,7 @@ time. 
## Per-instance state, inputs and outputs -Each instance gets its own subgraph state — distinct from siblings, +Each instance gets its own subgraph state, distinct from siblings, distinct from the parent. By default the instance receives only: - the dispatched item in the field named by `item_field` (in @@ -67,25 +67,25 @@ distinct from the parent. By default the instance receives only: `inputs` is a `Mapping[subgraph_field, parent_field]`. The subgraph fields not named in `inputs` (and not `item_field`) take their -schema defaults — same closed-by-default-on-the-way-in posture as +schema defaults; same closed-by-default-on-the-way-in posture as the explicit-projection story for ordinary subgraphs. On exit, each instance's `collect_field` value becomes one element of the parent's `target_field` list, in instance-index order. To collect additional per-instance fields, declare -`extra_outputs: Mapping[parent_field, subgraph_field]` — each becomes +`extra_outputs: Mapping[parent_field, subgraph_field]`; each becomes its own parent list of the same length, instance-index-aligned. ## Error policy Two values: -- **`"fail_fast"`** (default) — the first instance failure cancels +- **`"fail_fast"`** (default): the first instance failure cancels the in-flight siblings (`asyncio.gather` semantics) and propagates as a `NodeException` wrapping the failing instance's cause, with `recoverable_state` set to the parent's pre-fan-out snapshot. Use this when one bad result invalidates the rest. -- **`"collect"`** — instance failures are captured; the fan-out runs +- **`"collect"`**: instance failures are captured; the fan-out runs to completion. Failed instances contribute nothing to `target_field`. If you declare `errors_field` on the config, each failed instance produces a record (`{"fan_out_index": str(idx), @@ -98,11 +98,11 @@ Choose by whether partial results are useful. 
After the fan-out completes, the parent receives a partial update containing: -- `target_field` — list of `collect_field` values, instance-index order. -- Each parent name in `extra_outputs` — list of values from the named +- `target_field`: list of `collect_field` values, instance-index order. +- Each parent name in `extra_outputs`: list of values from the named subgraph field, instance-index order. -- `count_field` (if configured) — the instance count. -- `errors_field` (if configured, `"collect"` policy only) — per-instance +- `count_field` (if configured): the instance count. +- `errors_field` (if configured, `"collect"` policy only): per-instance error records. - `on_empty="noop"` for an empty items_field → all the above with empty lists; `count_field` set to 0. @@ -112,9 +112,9 @@ containing: If `items_field` is set and the parent list is empty (or `count` resolves to 0): -- `on_empty="raise"` (default) — raises `FanOutEmpty` (a runtime +- `on_empty="raise"` (default): raises `FanOutEmpty` (a runtime error category). -- `on_empty="noop"` — emits an empty partial (no instances dispatched, +- `on_empty="noop"`: emits an empty partial (no instances dispatched, no errors). ## Observability per instance @@ -124,7 +124,7 @@ The fan-out node's own `started` / `completed` events carry a `item_count` / `concurrency` / `error_policy` / `parent_node_name`. Per-instance events have `fan_out_index = N` (0-based) and a -namespace whose final element is the fan-out node's name — instances +namespace whose final element is the fan-out node's name; instances do NOT contribute a separate synthetic namespace element. Backends disambiguate per-instance spans using `fan_out_index` alongside the namespace. @@ -133,7 +133,7 @@ namespace. A fan-out node's `completed` event triggers a save like any other outermost-graph or subgraph-internal node. 
**Per-instance internal -events do NOT save** in the shipping version — on resume, the +events do NOT save** in the shipping version; on resume, the fan-out re-runs end-to-end if it hadn't completed (atomic restart). A per-instance fan-out resume mode is planned but not yet shipped. @@ -147,7 +147,7 @@ The signal: N similar pieces of work, N depends on state at runtime (not at build time), the work is independent enough to run concurrently. If N is known at build time and small (≤3), `ExplicitMapping` at multiple subgraph sites is simpler. If the -work isn't independent — instance 2 needs instance 1's output — +work isn't independent (instance 2 needs instance 1's output), that's a linear pipeline, not fan-out. ## What fan-out is NOT diff --git a/docs/concepts/graphs.md b/docs/concepts/graphs.md index cad20e3..20813d0 100644 --- a/docs/concepts/graphs.md +++ b/docs/concepts/graphs.md @@ -33,14 +33,14 @@ Three things to notice: - **The return is a `Mapping`, not a `dict` literally.** You can return any dict shape that satisfies the type; field names are validated against the state schema at merge time (extra keys raise). -- **Empty dict is fine.** `return {}` means "I made no state changes" — +- **Empty dict is fine.** `return {}` means "I made no state changes"; state passes through, execution moves on per the outgoing edge. Good for logging or pure-observation nodes. -**Why async?** The canonical node does IO — LLM call, HTTP request, +**Why async?** The canonical node does IO: LLM call, HTTP request, tool invocation. An async signature lets the runtime overlap IO when you eventually add parallel branches or retries. For a purely CPU node, -async costs nothing — you just `return {...}` without an `await`. +async costs nothing; you just `return {...}` without an `await`. 
You register the node on a builder under a name: @@ -64,7 +64,7 @@ builder.add_edge("plan", "write") # after `plan` merges, run `write` builder.add_edge("write", END) # after `write` merges, halt ``` -`END` is a sentinel object — a distinct value, not the string `"END"`: +`END` is a sentinel object, a distinct value, not the string `"END"`: ```python from openarmature.graph import END @@ -107,23 +107,23 @@ graph = ( The methods you'll use: -- **`GraphBuilder(state_cls)`** — constructor. The state class +- **`GraphBuilder(state_cls)`**: constructor. The state class determines the reducer table at compile time. -- **`.add_node(name, fn)`** — register an async node function. -- **`.add_edge(source, target)`** — static edge. `target` is a node +- **`.add_node(name, fn)`**: register an async node function. +- **`.add_edge(source, target)`**: static edge. `target` is a node name or `END`. -- **`.add_conditional_edge(source, fn)`** — branching edge. `fn(state)` +- **`.add_conditional_edge(source, fn)`**: branching edge. `fn(state)` is sync and returns a node name or `END`. -- **`.add_subgraph_node(name, compiled, projection=None)`** — register +- **`.add_subgraph_node(name, compiled, projection=None)`**: register a compiled graph as a node inside this graph (see [Composition](composition.md)). -- **`.set_entry(name)`** — declare where execution begins. -- **`.compile()`** — validate and return `CompiledGraph`. +- **`.set_entry(name)`**: declare where execution begins. +- **`.compile()`**: validate and return `CompiledGraph`. **Why split builder and compiled?** Construction and execution are different problems. Construction is mutable and permissive (add things in whatever order reads well); execution wants something immutable and -validated. `compile()` is the one-way door between the two — and +validated. `compile()` is the one-way door between the two, and structural problems surface at the door so a bad graph can't reach runtime. @@ -141,7 +141,7 @@ runtime. 
| `UnreachableNode` | A declared node isn't reachable from the entry via any edge path | | `MappingReferencesUndeclaredField` | A subgraph projection mapping names a field absent from the schema | -Every failure here is a graph-shape problem — the kind of thing that +Every failure here is a graph-shape problem, the kind of thing that would otherwise crash mid-execution with a confusing traceback. Catching them at construction means you *cannot* invoke a malformed graph. @@ -170,12 +170,12 @@ The per-step loop: 4. Re-validate state against the schema. 5. Evaluate the outgoing edge against the *post-merge* state to pick the next node (or `END`). -6. Dispatch the `completed` observer event — populating `post_state` +6. Dispatch the `completed` observer event, populating `post_state` if the step succeeded, or `error` if any of steps 2–5 failed (including edge / routing errors, which attach to the preceding node's `completed` event rather than producing a new one). -The output is the final `State` instance — whatever state looks like +The output is the final `State` instance: whatever state looks like when an edge returns `END`. See [Observability](observability.md) for what observers do with the started/completed pair. @@ -192,7 +192,7 @@ engine raises one of these: | `RoutingError` | Conditional edge returned something that isn't a node name or `END` | yes | | `StateValidationError` | Merged state fails schema validation (typo'd field, bad type) | no | -`recoverable_state` is the state at the point of failure — +`recoverable_state` is the state at the point of failure: pre-failing-node for node/edge/routing errors, pre-merge for reducer errors. Useful for post-crash forensics. 
State validation errors don't carry recoverable_state because the merge that triggered the failure diff --git a/docs/concepts/index.md b/docs/concepts/index.md index 2ff098f..ac99588 100644 --- a/docs/concepts/index.md +++ b/docs/concepts/index.md @@ -3,20 +3,20 @@ Each page is a focused take on one idea. Read top-to-bottom for a tour of the framework, or jump to whichever concept you need. -- [State and reducers](state-and-reducers.md) — typed state, per-field +- [State and reducers](state-and-reducers.md): typed state, per-field merge policies, what makes nodes safe to write. -- [Graphs: nodes, edges, build, invoke](graphs.md) — the four moves you +- [Graphs: nodes, edges, build, invoke](graphs.md): the four moves you make to turn a state schema into a runnable pipeline. -- [Composition: conditional edges, subgraphs, projection](composition.md) - — routing decisions, encapsulated sub-pipelines, the parent ↔ subgraph +- [Composition: conditional edges, subgraphs, projection](composition.md): + routing decisions, encapsulated sub-pipelines, the parent ↔ subgraph data seam. -- [Fan-out](fan-out.md) — running the same subgraph many times in +- [Fan-out](fan-out.md): running the same subgraph many times in parallel, results merged back deterministically. -- [Observability](observability.md) — node-boundary hooks, OTel mapping, +- [Observability](observability.md): node-boundary hooks, OTel mapping, log correlation. -- [Checkpointing](checkpointing.md) — save state at each node boundary, +- [Checkpointing](checkpointing.md): save state at each node boundary, resume from a prior point. If you're brand-new, [Quickstart](../getting-started/index.md) is the -faster entry — under a minute to a running graph. Come back here when +faster entry; under a minute to a running graph. Come back here when you want to know *why* things are shaped the way they are. 
diff --git a/docs/concepts/observability.md b/docs/concepts/observability.md index 1ab3955..3b40c4f 100644 --- a/docs/concepts/observability.md +++ b/docs/concepts/observability.md @@ -2,11 +2,11 @@ Two complementary patterns: -- **The `trace` field pattern** — a typed list inside state that nodes +- **The `trace` field pattern**: a typed list inside state that nodes append to. State-shaped history, accessible from inside the graph, visible in the final state. Falls out of existing primitives. Covered in [State and reducers](state-and-reducers.md). -- **Observer hooks** — out-of-band events delivered to external code, +- **Observer hooks**: out-of-band events delivered to external code, with full pre/post state snapshots, error context, and visibility across subgraph boundaries. The control-side equivalent of the data-side `trace` field. This page. @@ -39,7 +39,7 @@ _: Observer = StructuredLogger() # structural conformance check ## Two registration modes -**Graph-attached** — fires on every invocation until removed: +**Graph-attached**: fires on every invocation until removed: ```python compiled = builder.compile() @@ -49,10 +49,10 @@ handle.remove() # idempotent ``` Changes to the registered set during a graph run don't take effect -until the next invocation — the in-flight observer set is fixed at +until the next invocation. The in-flight observer set is fixed at `invoke()` time. -**Invocation-scoped** — fires only for one specific run: +**Invocation-scoped**: fires only for one specific run: ```python final = await compiled.invoke(initial, observers=[request_logger]) @@ -82,7 +82,7 @@ class NodeEvent: A walk-through: -- **`phase`** — every node attempt produces a `started` / `completed` +- **`phase`**: every node attempt produces a `started` / `completed` *pair*. The pair shares `step` and `pre_state`. `started` fires before the node body runs; `completed` fires after the reducer merge succeeds *and* the outgoing edge has been evaluated. 
A @@ -98,44 +98,44 @@ A walk-through: `phases=KNOWN_PHASES`, exported from `openarmature.graph`, to subscribe to every phase including `checkpoint_saved`). -- **`node_name`** — the node's local name in its immediate containing +- **`node_name`**: the node's local name in its immediate containing graph. For nested subgraphs, the inner name, NOT a qualified path. -- **`namespace`** — the qualified path of containing-graph node names +- **`namespace`**: the qualified path of containing-graph node names + the current node's name, outermost-first. For a top-level node: `(node_name,)`. For a subgraph-internal node: - `(outer_subgraph_node_name, inner_name)`. A *tuple of strings* — + `(outer_subgraph_node_name, inner_name)`. A *tuple of strings*; the framework keeps it as a tuple at the API boundary rather than joining with a delimiter, so node names can contain any characters without parsing ambiguity. -- **`step`** — monotonic counter starting at 0, scoped to one +- **`step`**: monotonic counter starting at 0, scoped to one outermost invocation. Subgraph-internal nodes increment the same counter; subgraph events interleave with outer events. The `started`/`completed` pair for one attempt share the same step. -- **`pre_state`** / **`post_state`** — state the node received vs. +- **`pre_state`** / **`post_state`**: state the node received vs. state after the reducer merge. *Shape varies with namespace*: for a subgraph-internal node, both are subgraph-state instances, not the outer state. -- **`error`** — the wrapped runtime error on `completed` events that +- **`error`**: the wrapped runtime error on `completed` events that failed. `event.error.category` gives the canonical error category; `event.error.__cause__` gives the original exception. **Edge / - routing errors land here too** — see *Routing errors and the + routing errors land here too**; see *Routing errors and the completed event* below. 
-- **`parent_states`** — one snapshot per containing graph, outermost +- **`parent_states`**: one snapshot per containing graph, outermost first. Empty tuple for outermost-graph events. Invariant: `len(parent_states) == len(namespace) - 1`. -- **`attempt_index`** — 0-based retry attempt counter. `0` for nodes +- **`attempt_index`**: 0-based retry attempt counter. `0` for nodes not wrapped by retry middleware; `1+` for retries. -- **`fan_out_index`** — 0-based per-instance index for events inside +- **`fan_out_index`**: 0-based per-instance index for events inside a fan-out instance; `None` outside. -- **`fan_out_config`** — populated on `started` / `completed` events +- **`fan_out_config`**: populated on `started` / `completed` events for the *fan-out node itself*, carrying the resolved `item_count` / `concurrency` / `error_policy` / `parent_node_name`. `None` on every other event. @@ -150,10 +150,10 @@ When a conditional edge raises or returns an invalid target: 4. The edge fn raises (`EdgeException`) OR returns something that isn't a declared node name or `END` (`RoutingError`). 5. The engine populates that error into the preceding node's - `completed` event and dispatches it — sharing the + `completed` event and dispatches it, sharing the started/completed pair rather than synthesising a new event. -So edge / routing errors *do* land on a `NodeEvent` — on the +So edge / routing errors *do* land on a `NodeEvent`, on the *preceding* node's `completed` event, with `error` populated and `post_state` left `None`. Observers see the failure attributed to the right node without a synthetic event. @@ -161,7 +161,7 @@ right node without a synthetic event. ## Subgraph events bubble up A subgraph-attached observer sees its own internal node events -whenever the subgraph runs — directly OR as a subgraph inside a +whenever the subgraph runs, directly OR as a subgraph inside a parent. The parent's observers ALSO see those internal events. 
Delivery order for an event from a subgraph-internal node: @@ -171,7 +171,7 @@ outermost-graph-attached → ... → subgraph-attached → invocation-scoped ``` Within each level, registration order. The subgraph-as-node wrapper -itself does *not* generate its own event — it's transparent to +itself does *not* generate its own event; it's transparent to observers. ## Serial delivery @@ -196,7 +196,7 @@ naturally), or push events to an internal queue and return fast. The graph's execution loop dispatches events onto a per-invocation queue and **does not await** observer processing. Event dispatch is -constant-time from the graph's perspective — observers can't slow +constant-time from the graph's perspective; observers can't slow node execution down. This means `await compiled.invoke(...)` returns when the graph @@ -217,7 +217,7 @@ await compiled.drain() - Snapshot at call time. Events from invocations started concurrently with `drain()` may or may not be included. - Subgraph events are part of the parent. A parent drain covers every - subgraph event from any of its invocations — no need to drain each + subgraph event from any of its invocations; no need to drain each subgraph separately. If you forget `drain()` in a CLI, the symptom is an empty trace file @@ -240,10 +240,10 @@ side concern. Two identifiers travel with every invocation: -- **`invocation_id`** — unique per `invoke()` call. Identifies *this +- **`invocation_id`**: unique per `invoke()` call. Identifies *this run*. Surfaced on `CheckpointRecord.invocation_id`, observer span attributes, log records. -- **`correlation_id`** — a cross-system identifier propagated via +- **`correlation_id`**: a cross-system identifier propagated via `ContextVar`. 
Multiple invocations related by a higher-level request (e.g., a parent run that spawns a subgraph via direct `await sub.invoke(...)`, or a user-request that drives several @@ -280,14 +280,14 @@ correlation: ### TracerProvider isolation `OTelObserver` constructs a **private** `TracerProvider` from the -processor you supply — it never registers globally and never reads +processor you supply. It never registers globally and never reads `get_tracer_provider()`. This isolation is intentional. The motivation is concrete: many production stacks already register a global `TracerProvider` (Langfuse v3's OpenInference integration is the recurring example) for their own instrumentation. If openarmature piggybacked on the global provider, every span the engine emits would -also flow to those other backends — doubling exports, corrupting +also flow to those other backends, doubling exports, corrupting hierarchies, and tying openarmature's lifecycle to whichever unrelated library happened to register first. Isolation prevents that; the observer's spans only flow through the processor you handed @@ -296,7 +296,7 @@ it. ### Detached trace mode Some subgraphs or fan-outs are better as their own root trace than as -descendants of the parent's span tree — long-running asynchronous +descendants of the parent's span tree: long-running asynchronous work, retries that would balloon a parent span, or work that gets reported to a different backend. @@ -314,6 +314,6 @@ A detached subgraph or fan-out gets a fresh trace root (new `trace_id`); the `correlation_id` still propagates through, so join semantics survive even when trace boundaries don't. -The non-detached default is what you want most of the time — one +The non-detached default is what you want most of the time: one trace per outermost invocation, with subgraphs and fan-out instances as nested spans. 
diff --git a/docs/concepts/state-and-reducers.md b/docs/concepts/state-and-reducers.md index 95ad73b..3a98767 100644 --- a/docs/concepts/state-and-reducers.md +++ b/docs/concepts/state-and-reducers.md @@ -31,8 +31,8 @@ guarantees come baked in: `{"plann": "..."}` (typo) fails loudly with a `StateValidationError` instead of silently dropping the key. -Everything else Pydantic gives you — validators, computed fields, -custom types, `Field` metadata — still works. You don't need to set +Everything else Pydantic gives you (validators, computed fields, +custom types, `Field` metadata) still works. You don't need to set `model_config` yourself; subclassing `State` is enough. **Why frozen?** It rules out a whole class of bugs that make multi-step @@ -73,7 +73,7 @@ What you do have for "what happened": returns. - **Crash context.** The four non-validation runtime errors (`NodeException`, `EdgeException`, `ReducerError`, `RoutingError`) - carry a `recoverable_state` — the state at the point of failure. Good + carry a `recoverable_state`, the state at the point of failure. Good for forensics; not a walkable timeline. If you need a full timeline (debugging, eval, time-travel, @@ -100,7 +100,7 @@ field and calls `reducer(prior_value, partial_value)`. Fields without an annotated reducer fall back to `last_write_wins`. **The point of per-field reducers:** a node shouldn't know how its -output combines with prior state — that's a property of the field, not +output combines with prior state. That's a property of the field, not the node. `trace.append`, `meta.merge`, `score.last_write_wins`. The schema declares the policy once; nodes return their increment; the engine applies the merge consistently. If two nodes write the same @@ -124,7 +124,7 @@ You can write your own. A reducer is any named callable matching the ## How reducers execute -A reducer **always returns a new value** — never mutates `prior`. That +A reducer **always returns a new value**; never mutates `prior`. 
That matches the frozen-state contract: the prior list/dict may still be a snapshot somebody else holds. @@ -147,7 +147,7 @@ async def plan_node(s: GraphState) -> dict[str, list[str]]: return {"trace": ["plan"]} # add ["plan"] to trace ``` -NOT `{"trace": s.trace + ["plan"]}` — that's already what `append` +NOT `{"trace": s.trace + ["plan"]}`. That's already what `append` does. Returning the full list would concatenate twice and duplicate entries. @@ -160,7 +160,7 @@ class Bad(State): log: Annotated[list[str], append, merge] = Field(default_factory=list) ``` -But `GraphBuilder.compile()` fails with `ConflictingReducers("log")` — +But `GraphBuilder.compile()` fails with `ConflictingReducers("log")`; the graph never compiles, so you can't reach runtime with an ambiguous merge policy. The same compile pass picks the one declared reducer per field; with no declaration, the default is `last_write_wins`. diff --git a/docs/contributing/index.md b/docs/contributing/index.md index 098bbf6..55140e3 100644 --- a/docs/contributing/index.md +++ b/docs/contributing/index.md @@ -1,5 +1,5 @@ # Contributing Contributor-facing docs (development setup, repo conventions, etc.) -land here. Stub for now — populated as concrete contributor-facing +land here. Stub for now; populated as concrete contributor-facing material accrues. diff --git a/docs/getting-started/index.md b/docs/getting-started/index.md index 61d5f3f..8675dd1 100644 --- a/docs/getting-started/index.md +++ b/docs/getting-started/index.md @@ -1,6 +1,6 @@ # Quickstart -Build and run a two-node graph in under a minute. No LLM required — this is +Build and run a two-node graph in under a minute. No LLM required. This is the smallest possible openarmature program so you can see every part of the shape on one screen. @@ -54,7 +54,7 @@ assert final.log == ["hello", "world"] ## What just happened -- **`S`** is the state schema — a frozen Pydantic model. Nodes can't mutate +- **`S`** is the state schema, a frozen Pydantic model. 
Nodes can't mutate it; they return partial-update dicts and the engine merges them. - **`append`** is the reducer attached to `log`. When `hello` returns `{"log": ["hello"]}`, the engine *appends* to the existing list rather @@ -69,9 +69,9 @@ assert final.log == ["hello", "world"] ## Next -- [Concepts](../concepts/index.md) — deeper on state, reducers, +- [Concepts](../concepts/index.md): deeper on state, reducers, projections, fan-out, subgraphs, observability. -- [Examples](https://github.com/LunarCommand/openarmature-python/tree/main/examples) - — five runnable demos, each driving a local OpenAI-compatible LLM +- [Examples](https://github.com/LunarCommand/openarmature-python/tree/main/examples): + five runnable demos, each driving a local OpenAI-compatible LLM endpoint to do real work. -- [API reference](../reference/index.md) — auto-generated from docstrings. +- [API reference](../reference/index.md): auto-generated from docstrings. diff --git a/docs/index.md b/docs/index.md index 76db819..2312ff5 100644 --- a/docs/index.md +++ b/docs/index.md @@ -6,7 +6,7 @@ hide: # OpenArmature -A workflow framework for LLM pipelines and tool-calling agents — typed +A workflow framework for LLM pipelines and tool-calling agents. Typed state, structural graph checks, and observability that doesn't require buy-in from every node. @@ -22,7 +22,7 @@ buy-in from every node. --- State schemas are Pydantic models with `frozen=True` and - `extra="forbid"`. Nodes can't mutate state — they return partial + `extra="forbid"`. Nodes can't mutate state; they return partial updates and the engine merges via per-field reducers. - :material-graph:{ .lg .middle }   __Compile-time checks__ @@ -58,7 +58,7 @@ buy-in from every node. --- - The engine has no concept of LLMs or tools — those live at the node + The engine has no concept of LLMs or tools; those live at the node boundary. Use any provider, any model, any external system. @@ -66,6 +66,6 @@ buy-in from every node. 
--- Built around an open, language-agnostic -[specification](https://github.com/LunarCommand/openarmature-spec). +[specification](https://github.com/LunarCommand/openarmature-spec){target="_blank" rel="noopener"}. A TypeScript implementation is on the roadmap; behaviour stays identical across implementations via spec conformance fixtures. diff --git a/docs/model-providers/authoring.md b/docs/model-providers/authoring.md index 0c63a0f..0bb32f2 100644 --- a/docs/model-providers/authoring.md +++ b/docs/model-providers/authoring.md @@ -1,8 +1,8 @@ # Authoring a Provider -When you target a wire format that isn't OpenAI Chat Completions — -Anthropic Messages, Bedrock, an internal gateway, a hand-rolled -inference service — implement the `Provider` Protocol yourself. The +When you target a wire format that isn't OpenAI Chat Completions +(Anthropic Messages, Bedrock, an internal gateway, a hand-rolled +inference service), implement the `Provider` Protocol yourself. The shipped `OpenAIProvider` is ~465 lines because it handles every edge case; a minimum-viable Provider is closer to 60 lines. @@ -180,18 +180,18 @@ The skeleton omits things real Providers usually need. Reach for `openarmature.llm.OpenAIProvider` as a reference when you need any of: -- **Tool calls** — wire-mapping the `tool_calls` array on +- **Tool calls.** Wire-mapping the `tool_calls` array on `AssistantMessage` to the Provider's expected shape, parsing tool results back from `ToolMessage`s. -- **Observability spans** — opt-in `started`/`completed` events +- **Observability spans.** Opt-in `started`/`completed` events around the wire call so the OTel observer can build LLM spans. -- **Lenient response parsing** under `finish_reason="error"` — - degraded responses surface what they can; tool-call arguments that +- **Lenient response parsing** under `finish_reason="error"`. + Degraded responses surface what they can; tool-call arguments that fail to parse populate `arguments=None` instead of raising. 
-- **Catalog-aware `ready()`** — `GET /v1/models` plus checking +- **Catalog-aware `ready()`.** `GET /v1/models` plus checking whether the bound model is in the returned catalog (and, for local servers like LM Studio, whether it's actually loaded). -- **`Retry-After` parsing** — use `parse_retry_after` (re-exported +- **`Retry-After` parsing.** Use `parse_retry_after` (re-exported from `openarmature.llm`) to populate the `retry_after` field of `ProviderRateLimit` from the response header. diff --git a/docs/model-providers/index.md b/docs/model-providers/index.md index c3259d5..e2df9fb 100644 --- a/docs/model-providers/index.md +++ b/docs/model-providers/index.md @@ -1,14 +1,14 @@ # Model Providers A **Provider** is the seam between OpenArmature's graph engine and -any LLM backend — OpenAI's hosted API, an Anthropic Messages +any LLM backend (OpenAI's hosted API, an Anthropic Messages endpoint, a local vLLM / LM Studio / llama.cpp server, or an -internal gateway. The engine doesn't know about LLMs; nodes call +internal gateway). The engine doesn't know about LLMs; nodes call providers, providers do the wire work. ## What ships -- **`OpenAIProvider`** — implements the OpenAI Chat Completions wire +- **`OpenAIProvider`**: implements the OpenAI Chat Completions wire format (`POST /v1/chat/completions`). Talks to OpenAI itself plus the local servers that adopt the same format (vLLM, LM Studio, llama.cpp). One Provider class covers most real-world deployments. @@ -41,7 +41,7 @@ class Provider(Protocol): - **`ready()`** verifies the bound model is reachable. Pre-flight check, typically called once before invoking the graph. - **`complete()`** performs a single completion call and returns the - full `Response` — message, finish reason, token usage, raw wire + full `Response`: message, finish reason, token usage, raw wire payload. ### Behaviour guarantees @@ -55,7 +55,7 @@ class Provider(Protocol): Provider returns with `finish_reason="tool_calls"`. 
The caller executes the tool and makes a follow-on `complete()` with the result. The Provider does not re-enter itself. -- **No retry on transient errors.** That's middleware's job — wrap a +- **No retry on transient errors.** That's middleware's job; wrap a node in `RetryMiddleware` or similar. ## Errors @@ -64,7 +64,7 @@ Seven canonical error categories cover every failure mode: | Error | Trigger | | --------------------------- | --------------------------------------------- | -| `ProviderAuthentication` | 401 / 403 — bad key, expired token | +| `ProviderAuthentication` | 401 / 403 (bad key, expired token) | | `ProviderUnavailable` | 5xx, network failure, timeout | | `ProviderInvalidModel` | Bound model doesn't exist on the provider | | `ProviderModelNotLoaded` | Model known but not currently serving | @@ -73,7 +73,7 @@ Seven canonical error categories cover every failure mode: | `ProviderInvalidRequest` | Malformed request (per-message or list-level) | Three of these (`Unavailable`, `RateLimit`, `ModelNotLoaded`) are -exported in `TRANSIENT_CATEGORIES` — the canonical "safe to retry" +exported in `TRANSIENT_CATEGORIES`, the canonical "safe to retry" set used by the default retry-middleware classifier. ## A minimal example @@ -108,9 +108,9 @@ nodes call it inside their bodies. ## Where to next -- **[Authoring a Provider](authoring.md)** — how to implement the +- **[Authoring a Provider](authoring.md)**: how to implement the Protocol for a non-default wire format. Includes a ~60-line skeleton + contract checklist. -- **[API reference: `openarmature.llm`](../reference/llm.md)** — the +- **[API reference: `openarmature.llm`](../reference/llm.md)**: the full public surface (Message types, Response, Usage, RuntimeConfig, error classes). diff --git a/docs/reference/index.md b/docs/reference/index.md index 6910967..3dd000b 100644 --- a/docs/reference/index.md +++ b/docs/reference/index.md @@ -2,13 +2,13 @@ Auto-generated from docstrings. 
Pick a subpackage: -- [`openarmature.graph`](graph.md) — state, builder, edges, nodes, +- [`openarmature.graph`](graph.md): state, builder, edges, nodes, projections, fan-out, middleware, observer, reducers, errors. -- [`openarmature.llm`](llm.md) — Provider Protocol, message + tool +- [`openarmature.llm`](llm.md): Provider Protocol, message + tool types, the OpenAIProvider, shared error helpers. -- [`openarmature.checkpoint`](checkpoint.md) — Checkpointer Protocol, +- [`openarmature.checkpoint`](checkpoint.md): Checkpointer Protocol, CheckpointRecord, in-memory + SQLite backends. -- [`openarmature.observability`](observability.md) — correlation +- [`openarmature.observability`](observability.md): correlation primitives and the optional OTel mapping (`[otel]` extra). Public surface is what each subpackage's `__init__.py` re-exports. Symbols From ba1c5fe3b1fd2b85e3281addfa99f950e7550213 Mon Sep 17 00:00:00 2001 From: chris-colinsky Date: Tue, 12 May 2026 19:57:34 -0700 Subject: [PATCH 2/2] docs: drop empty Contributing page MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The Contributing page was a one-paragraph stub saying "populated as concrete contributor-facing material accrues" — a dead click for any reader who landed on it. Until there's a real contributor pipeline, having no page is more honest than having an empty one. Deletes docs/contributing/ and its nav entry in mkdocs.yml. AGENTS.md remains the agent-facing orientation; if a real contributor doc is wanted later, it can be lifted from AGENTS.md selectively. --- docs/contributing/index.md | 5 ----- mkdocs.yml | 1 - 2 files changed, 6 deletions(-) delete mode 100644 docs/contributing/index.md diff --git a/docs/contributing/index.md b/docs/contributing/index.md deleted file mode 100644 index 55140e3..0000000 --- a/docs/contributing/index.md +++ /dev/null @@ -1,5 +0,0 @@ -# Contributing - -Contributor-facing docs (development setup, repo conventions, etc.) 
-land here. Stub for now; populated as concrete contributor-facing -material accrues. diff --git a/mkdocs.yml b/mkdocs.yml index c661e6d..b7a00b2 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -105,7 +105,6 @@ nav: - openarmature.llm: reference/llm.md - openarmature.checkpoint: reference/checkpoint.md - openarmature.observability: reference/observability.md - - Contributing: contributing/index.md extra: # Hide the "Made with Material for MkDocs" footer.