Add [langfuse] extras and SDK adapter (PR 3.6) by chris-colinsky · Pull Request #82 · LunarCommand/openarmature-python

chris-colinsky · 2026-05-27T20:33:41Z

Summary

Validates the Langfuse observer against real langfuse>=4.6 and ships a bridge so production users get the same Protocol-shaped observer surface as InMemoryLangfuseClient. Fifth of 6 core PRs in the v0.10.0 batch.

[langfuse] extras pinned to langfuse>=4.6,<5. The v4 SDK is structurally different from v2 / v3: traces auto-create on first observation, span / generation collapse into start_observation(as_type=...), trace_id threads through TraceContext, and trace-level metadata is set via the propagate_attributes context manager. Per the "no existing-user constraint" directive, the adapter targets v4 only.

LangfuseSDKAdapter wraps the v4 client to satisfy the four-method LangfuseClient Protocol. Key translations:

UUID4 invocation_id → OTel-hex trace_id (32 chars, no dashes). v4 fails int(uuid, 16) parsing on the dashed form; OA's observer error-isolation pattern swallowed that as a warnings.warn, silently dropping traces. The conversion makes the bridge correct.
propagate_attributes(trace_name=, metadata=) runs on every observation (not just the first). Without this, v4's last-attribute-wins display logic let later observations clobber the trace's display name with whatever the final observation was called.
usage → usage_details: translates the LangfuseUsage record into v4's int-only dict.
Returned LangfuseSpan / LangfuseGeneration handles are wrapped in _SpanHandle, which exposes the .update() / .end() methods the observer calls.
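
A minimal sketch of the UUID4 → OTel-hex translation described above, using only the standard library. The function name to_otel_trace_id is hypothetical and the adapter's real helper may differ (for example, it may check the UUID version); the sketch only illustrates the dashed-to-hex conversion, idempotency on already-hex input, and passthrough of non-UUID strings.

```python
import uuid


def to_otel_trace_id(invocation_id: str) -> str:
    """Return a 32-char lowercase hex id for UUID-shaped input.

    Hypothetical helper: anything that does not parse as a UUID is
    returned unchanged (the "non-UUID passthrough" case).
    """
    try:
        # uuid.UUID accepts both the dashed form and the bare 32-hex form,
        # so already-converted ids come back unchanged.
        return uuid.UUID(invocation_id).hex
    except ValueError:
        return invocation_id


dashed = "123e4567-e89b-42d3-a456-426614174000"

# The dashed form is not a valid base-16 integer literal, which is the
# parsing failure the PR description refers to.
try:
    int(dashed, 16)
    dashed_parses_as_hex = True
except ValueError:
    dashed_parses_as_hex = False

hex_id = to_otel_trace_id(dashed)
```

Calling to_otel_trace_id a second time on hex_id returns the same string, so the conversion is safe to apply at every call site.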

Trace-info cache persists per trace_id (rather than popping on first observation). Memory is linear in unique trace_ids; a close_trace cleanup hook is deferred to a future PR.
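
To illustrate the cache behavior, here is a rough sketch of a per-trace_id store that is read without popping and supports an update_trace-style merge. The class and method names (TraceInfoCache, merge, lookup) are assumptions for illustration, not the adapter's actual internals.

```python
from typing import Any, Optional


class TraceInfoCache:
    """Hypothetical per-trace_id cache; entries persist until the process exits."""

    def __init__(self) -> None:
        self._by_trace_id: dict[str, dict[str, Any]] = {}

    def merge(
        self,
        trace_id: str,
        name: Optional[str] = None,
        metadata: Optional[dict[str, Any]] = None,
    ) -> None:
        # Create the entry on first use, then merge later updates into it.
        info = self._by_trace_id.setdefault(trace_id, {"name": None, "metadata": {}})
        if name is not None:
            info["name"] = name
        if metadata:
            info["metadata"].update(metadata)

    def lookup(self, trace_id: str) -> Optional[dict[str, Any]]:
        # Read without popping, so every observation under the trace
        # can re-apply the same name and metadata.
        info = self._by_trace_id.get(trace_id)
        if info is None:
            return None
        return {"name": info["name"], "metadata": dict(info["metadata"])}

    def __len__(self) -> int:
        # Grows by one per unique trace_id; nothing is evicted in this sketch.
        return len(self._by_trace_id)


cache = TraceInfoCache()
cache.merge("t1", name="entry", metadata={"correlation_id": "abc"})
cache.merge("t1", metadata={"spec_version": "1"})
first = cache.lookup("t1")
second = cache.lookup("t1")
```

Because lookup never removes the entry, first and second are equal, which is the property the every-observation propagation relies on. The lack of eviction is the linear memory growth noted above.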

Tests

Five new unit tests: Protocol satisfaction, observer construction, trace_info cache lifecycle, update_trace merge, UUID4 → OTel-hex conversion (with idempotency on already-hex + non-UUID passthrough).
One opt-in integration test against real Langfuse Cloud, gated by @pytest.mark.integration + LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY env vars. Calls auth_check() to fail loudly on bad credentials and client.shutdown() to drain the batch exporter synchronously. Accepts LANGFUSE_HOST or LANGFUSE_BASE_URL for the host (downstream convention).
New pytest marker config: addopts defaults to -m "not integration" so CI auto-skips integration tests; run with -m integration to include.
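
The env-var gating can be sketched as a small resolver that a test could call before deciding whether to skip. The function name langfuse_settings and the precedence of LANGFUSE_HOST over LANGFUSE_BASE_URL are assumptions; the PR only states that both variables are accepted.

```python
import os
from typing import Mapping, Optional


def langfuse_settings(env: Mapping[str, str] = os.environ) -> Optional[dict[str, str]]:
    """Return connection settings, or None when credentials are missing.

    Hypothetical helper: a test would skip when this returns None.
    """
    public_key = env.get("LANGFUSE_PUBLIC_KEY")
    secret_key = env.get("LANGFUSE_SECRET_KEY")
    if not public_key or not secret_key:
        return None

    settings = {"public_key": public_key, "secret_key": secret_key}

    # Assumed precedence: LANGFUSE_HOST first, then LANGFUSE_BASE_URL.
    host = env.get("LANGFUSE_HOST") or env.get("LANGFUSE_BASE_URL")
    if host:
        settings["host"] = host
    return settings


missing = langfuse_settings({})
with_base_url = langfuse_settings(
    {
        "LANGFUSE_PUBLIC_KEY": "pk-example",
        "LANGFUSE_SECRET_KEY": "sk-example",
        "LANGFUSE_BASE_URL": "https://example.invalid",
    }
)
```

Passing the environment as a mapping keeps the helper testable without touching real credentials.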

Docs

docs/concepts/observability.md and docs/examples/10-langfuse-observability.md flip the "no SDK version validated" disclosure to the validated v4 state and show the LangfuseSDKAdapter wire-up snippet.
examples/10-langfuse-observability/main.py docstring + inline comment updated with the production swap recipe.
AGENTS.md regenerated.

Validated end-to-end

Real run against Langfuse Cloud (US region) with langfuse==4.7.0: two-node graph produces one Trace with the entry-node-name as the display name, both nodes as Span observations under it, and the spec §8.4.1 trace metadata (correlation_id, entry_node, spec_version) populated. Earlier runs caught two real bugs that this PR fixed: the UUID format mismatch and the trace-name clobbering.

Test plan

CI green (lint, format, types, conformance, unit, smoke, agents-md drift)
5 new adapter unit tests pass; 1 integration test is correctly deselected by the default -m "not integration"
Optional: manual run of the integration test against Langfuse Cloud to confirm the trace lands with the expected shape

Validates the Langfuse observer against the real langfuse>=4.6 SDK and ships a bridge so production users get the same Protocol-shaped observer surface as the InMemoryLangfuseClient.

[langfuse] optional-dependency group pins langfuse>=4.6,<5. The v4 SDK is structurally different from v2 / v3: traces are auto-created when the first observation starts, span and generation collapse into start_observation with as_type, trace_id threads through TraceContext, and trace-level metadata is set via the propagate_attributes context manager. Per Chris's directive ("no existing-user constraint; do what's right for OA"), the adapter targets v4 only; earlier SDK versions are out of scope.

LangfuseSDKAdapter wraps the v4 client to satisfy the four-method LangfuseClient Protocol. Key translations:

- UUID4 invocation_id -> OTel-hex trace_id (32 chars, no dashes). v4 fails int(uuid, 16) parsing on the dashed form; OA's observer error-isolation pattern swallowed that as a warnings.warn, silently dropping traces.
- propagate_attributes(trace_name=, metadata=) runs on EVERY observation under each trace_id (not just the first). Without this, v4's last-attribute-wins display logic let later observations clobber the trace's display name with whatever the final observation was called.
- usage values translate from the Protocol's LangfuseUsage record to v4's usage_details dict (int values only).
- Returned LangfuseSpan / LangfuseGeneration handles wrap into _SpanHandle to expose the .update() / .end() the observer calls.

Trace-info cache persists per trace_id rather than popping on first observation. Memory is linear in unique trace_ids; a close_trace cleanup hook is deferred to a future PR.

Tests:

- Five unit tests covering Protocol satisfaction, observer construction, trace_info cache lifecycle, update_trace merge, UUID4 -> OTel-hex conversion (with idempotency on already-hex and non-UUID passthrough).
- One opt-in integration test against real Langfuse Cloud, gated by @pytest.mark.integration + LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY env vars. Calls auth_check() to fail loudly on bad credentials and client.shutdown() for synchronous batch-exporter drain. Accepts LANGFUSE_HOST or LANGFUSE_BASE_URL for the host.
- New pytest marker config: addopts defaults to -m "not integration" so CI auto-skips integration tests; run with -m integration to include.

Docs:

- docs/concepts/observability.md flips the "no SDK version validated" disclosure to the validated v4 state and shows the LangfuseSDKAdapter wire-up snippet.
- docs/examples/10-langfuse-observability.md: same.
- examples/10-langfuse-observability/main.py docstring + inline comment updated with the production swap recipe.
- AGENTS.md regenerated.

Validated end-to-end against Langfuse Cloud (US region) with langfuse 4.7.0: a two-node graph produces one Trace with entry-node-name set as the display name, both nodes as Span observations under it, and the spec §8.4.1 trace metadata (correlation_id, entry_node, spec_version) populated.

Fifth of 6 core PRs in the v0.10.0 batch (PR 3.6).

Copilot

Pull request overview

Adds an optional Langfuse v4 integration path by pinning a [langfuse] extra and introducing a LangfuseSDKAdapter that maps the v4 SDK surface (start_observation, propagate_attributes, OTel trace IDs) onto OpenArmature’s LangfuseClient Protocol, so LangfuseObserver can be used consistently in tests (in-memory) and production (real SDK).

Changes:

Add [langfuse] optional dependency (pinned >=4.6,<5) and lockfile updates.
Implement LangfuseSDKAdapter bridging Langfuse v4’s API to the 4-method LangfuseClient Protocol (including UUID→OTel-hex trace_id conversion).
Add unit tests + an opt-in integration test, register an integration pytest marker, and update docs/examples to describe the production wire-up.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| uv.lock | Locks new optional dependency set for `langfuse` and its transitive deps. |
| pyproject.toml | Adds `[langfuse]` extra and configures pytest `integration` marker + default skip. |
| src/openarmature/observability/langfuse/adapter.py | Implements `LangfuseSDKAdapter` + handle wrapper to satisfy `LangfuseClient`. |
| src/openarmature/observability/langfuse/__init__.py | Conditionally exports `LangfuseSDKAdapter` when the extra is installed. |
| tests/unit/test_observability_langfuse_adapter.py | Adds adapter unit tests and an opt-in Langfuse Cloud integration test. |
| examples/10-langfuse-observability/main.py | Updates example guidance to use `LangfuseSDKAdapter` for production. |
| docs/examples/10-langfuse-observability.md | Documents installing `[langfuse]` and wiring the adapter. |
| docs/concepts/observability.md | Updates Langfuse mapping docs to describe adapter-based production usage. |

Fixes three stale-doc residues from when the adapter consumed trace_info on the first observation only. The current behavior (propagate on every observation under each trace_id, so that v4's last-attribute-wins display logic cannot clobber the trace name) was prompted by the integration-test run, and the cache and propagation were refactored accordingly, but the module header comments and one example-doc paragraph still described the original "first observation only" path. Comment / doc text updated; no code change. Addresses Copilot PR review feedback on #82.

Copilot AI review requested due to automatic review settings May 27, 2026 20:33

Copilot started reviewing on behalf of chris-colinsky May 27, 2026 20:33

Copilot AI reviewed May 27, 2026

Comment thread src/openarmature/observability/langfuse/adapter.py Outdated

Comment thread src/openarmature/observability/langfuse/adapter.py Outdated

Comment thread docs/examples/10-langfuse-observability.md Outdated

chris-colinsky merged commit 2b4bc1b into main May 27, 2026
6 checks passed

chris-colinsky deleted the feature/langfuse-extras branch May 27, 2026 20:50

chris-colinsky merged 2 commits into main from feature/langfuse-extras
