
Add extension support to the pipeline engine#2113

Draft
utpilla wants to merge 10 commits into open-telemetry:main from utpilla:utpilla/Add-extension-support

@utpilla
Contributor

@utpilla utpilla commented Feb 25, 2026

Change Summary

Introduces first-class extension support into the dataflow pipeline engine. Extensions are non-pipeline components that provide cross-cutting capabilities (e.g., authentication, health checks, service discovery) to receivers and exporters without participating in the pdata flow.

Motivation

Pipeline components like receivers and exporters often need shared services — credential management, token refresh, header validation — that don't fit the receiver → processor → exporter data-flow model. Extensions provide a clean separation: an independent task produces service handles, and pipeline components consume them at startup via a type-safe registry.

What's included

Engine core

  • ExtensionWrapper — Unified wrapper supporting both Send and !Send extension implementations, analogous to ReceiverWrapper/ExporterWrapper.
  • local::Extension / shared::Extension traits — Lifecycle trait with a start() method that receives a control channel and effect handler.
  • ExtensionConfig — Runtime configuration (control channel capacity) for extensions. Extensions only receive control messages, no pdata channels.
  • ExtensionControlMsg — PData-free control message enum (Shutdown, TimerTick, CollectTelemetry).
  • ExtensionFactory — Factory struct (not generic over PData) registered via #[distributed_slice].
  • ExtensionHandles / ExtensionRegistryBuilder / ExtensionRegistry — Type-safe, Clone + Send registry. Extension factories register typed handles; pipeline components retrieve them by (extension_name, TypeId) at startup.
  • ServerAuthenticator / ClientAuthenticator traits — Pluggable auth contract for receivers (validate incoming requests) and exporters (attach outgoing credentials), with cloneable handle wrappers (ServerAuthenticatorHandle, ClientAuthenticatorHandle).
  • Pipeline lifecycle integration — Extensions are created before other nodes and started before the pipeline, ensuring handles are available when components call start(). They shut down after pipeline components.
  • Error variants — ExtensionHandleAlreadyRegistered, ExtensionHandleNotFound, UnknownExtension.
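
A minimal sketch of how a registry keyed by (extension_name, TypeId) could work, assuming handles are stored as type-erased `Arc<dyn Any + Send + Sync>` values. The names `RegistryBuilder`, `Registry`, `RegistryError`, and the error payloads are hypothetical stand-ins, not the PR's actual `ExtensionRegistryBuilder` / `ExtensionRegistry` API.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical errors named after the variants listed in the PR summary.
/// The `String` payload (the extension name) is an assumption.
#[derive(Debug, PartialEq)]
pub enum RegistryError {
    ExtensionHandleAlreadyRegistered(String),
    ExtensionHandleNotFound(String),
}

type HandleMap = HashMap<(String, TypeId), Arc<dyn Any + Send + Sync>>;

/// Hypothetical builder: extension factories register typed handles here.
#[derive(Default)]
pub struct RegistryBuilder {
    handles: HandleMap,
}

impl RegistryBuilder {
    /// Registers one handle of type `T` under `extension`.
    /// A second handle with the same (name, type) pair is rejected.
    pub fn register<T: Any + Send + Sync>(
        &mut self,
        extension: &str,
        handle: T,
    ) -> Result<(), RegistryError> {
        let key = (extension.to_owned(), TypeId::of::<T>());
        if self.handles.contains_key(&key) {
            return Err(RegistryError::ExtensionHandleAlreadyRegistered(
                extension.to_owned(),
            ));
        }
        let _ = self.handles.insert(key, Arc::new(handle));
        Ok(())
    }

    /// Freezes the builder into a read-only registry.
    pub fn build(self) -> Registry {
        Registry { handles: Arc::new(self.handles) }
    }
}

/// Hypothetical read-only registry. Cloning only bumps an `Arc` refcount,
/// and the type is `Send` because every stored handle is `Send + Sync`.
#[derive(Clone)]
pub struct Registry {
    handles: Arc<HandleMap>,
}

impl Registry {
    /// Looks up the handle of type `T` registered by `extension`.
    pub fn get<T: Any + Send + Sync + Clone>(
        &self,
        extension: &str,
    ) -> Result<T, RegistryError> {
        self.handles
            .get(&(extension.to_owned(), TypeId::of::<T>()))
            .and_then(|handle| handle.downcast_ref::<T>())
            .cloned()
            .ok_or_else(|| RegistryError::ExtensionHandleNotFound(extension.to_owned()))
    }
}
```

Because the key includes the `TypeId`, one extension can expose several handle types under the same name, and a lookup with the wrong type fails with a not-found error instead of returning a mismatched value.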

Engine macros (engine-macros)

  • #[pipeline_factory] macro now generates an EXTENSION_FACTORIES distributed slice and a get_<prefix>_extension_factory_map() helper.

Config (config)

  • NodeKind::Extension recognized in URN parsing (:extension suffix).
  • Extensions excluded from connectivity pruning (they have no data-flow edges).
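
A rough sketch of the two config changes above, assuming the node kind is simply the last colon-separated segment of the URN. The function names and the example URN are hypothetical; the actual parser in the config crate may be stricter.

```rust
/// Node kinds, with `Extension` added alongside the pipeline kinds.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NodeKind {
    Receiver,
    Processor,
    Exporter,
    Extension,
}

/// Hypothetical helper: derives the node kind from the URN's final segment.
pub fn node_kind_from_urn(urn: &str) -> Option<NodeKind> {
    match urn.rsplit(':').next()? {
        "receiver" => Some(NodeKind::Receiver),
        "processor" => Some(NodeKind::Processor),
        "exporter" => Some(NodeKind::Exporter),
        "extension" => Some(NodeKind::Extension),
        _ => None,
    }
}

/// Hypothetical pruning rule: a node without data-flow edges is pruned,
/// unless it is an extension, which never has such edges by design.
pub fn is_prunable(kind: NodeKind, has_data_flow_edges: bool) -> bool {
    kind != NodeKind::Extension && !has_data_flow_edges
}
```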

Pipeline components (otap, contrib-nodes, validation, benchmarks)

  • All receiver/exporter start() signatures now accept ExtensionRegistry as a third parameter.
  • Existing components pass _extension_registry (unused) — no behavioral changes.

What's NOT included

No concrete extension implementations are shipped yet (no entries in OTAP_EXTENSION_FACTORIES).

What issue does this PR close?

  • Closes #NNN

How are these changes tested?

Are there any user-facing changes?

Yes

@github-actions github-actions bot added the rust label (Pull requests that update Rust code) Feb 25, 2026
@codecov

codecov bot commented Feb 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 81.97%. Comparing base (3080230) to head (313db85).

❗ There is a different number of reports uploaded between BASE (3080230) and HEAD (313db85): HEAD has 5 fewer uploads than BASE.

| Flag | BASE (3080230) | HEAD (313db85) |
|------|----------------|----------------|
|      | 8              | 3              |
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #2113       +/-   ##
===========================================
- Coverage   87.27%   81.97%    -5.31%     
===========================================
  Files         553      181      -372     
  Lines      181329    51898   -129431     
===========================================
- Hits       158252    42542   -115710     
+ Misses      22543     8822    -13721     
  Partials      534      534               
| Components | Coverage Δ |
|------------|------------|
| otap-dataflow | ∅ <ø> (∅) |
| query_abstraction | 80.61% <ø> (ø) |
| query_engine | 90.30% <ø> (ø) |
| syslog_cef_receivers | ∅ <ø> (∅) |
| otel-arrow-go | 53.50% <ø> (ø) |
| quiver | ∅ <ø> (∅) |

@utpilla utpilla marked this pull request as ready for review February 26, 2026 21:51
@utpilla utpilla requested a review from a team as a code owner February 26, 2026 21:51
@lalitb
Member

lalitb commented Feb 27, 2026

Pipeline components like receivers and exporters often need shared services

Receivers and exporters both receive ExtensionRegistry, but processors do not. Is this intended, with processor support to be added later? There could be real-world use cases where processors need extension access (I believe the Go collector also supports this).

/// Provides a minimal set of capabilities — primarily node identity and logging.
/// Extensions that need periodic timers should use `tokio::time::interval` directly.
#[derive(Clone)]
pub struct EffectHandler {
Member

In local mode, extensions and pipeline nodes share a single LocalSet thread, so anything that blocks between .await points - sync I/O, heavy crypto, thread::sleep - will stall the whole pipeline silently. Probably worth documenting the non-blocking requirement on the Extension trait so implementors know upfront.

Also noticed EffectHandler doesn't have a spawn_blocking helper. Authors who need to run blocking work will either reach for tokio::task::spawn_blocking directly (works, but not discoverable) or block the thread without realising. Something like:

pub async fn spawn_blocking<F, R>(&self, f: F) -> R
where
    F: FnOnce() -> R + Send + 'static,
    R: Send + 'static,
{
    tokio::task::spawn_blocking(f)
        .await
        .expect("blocking task panicked")
}

would make the safe path obvious. One thing to note: !Send fields can't cross into the closure, so callers need to extract or clone the data they need before passing it in. That might be worth a doc note on the method.
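
To illustrate the extract-or-clone point, here is a small sketch that uses `std::thread::spawn` as a stand-in, since it imposes the same `Send + 'static` bounds on its closure as `tokio::task::spawn_blocking`. `LocalAuthState`, `expensive_digest`, and `digest_off_thread` are hypothetical names for illustration only.

```rust
use std::rc::Rc;
use std::thread;

/// Hypothetical extension state; the `Rc` field makes the struct `!Send`.
pub struct LocalAuthState {
    pub secret: Rc<String>,
}

/// Stand-in for blocking work such as heavy crypto or synchronous I/O.
fn expensive_digest(input: &str) -> usize {
    input.bytes().map(usize::from).sum()
}

/// Runs the blocking work on another thread and waits for the result.
pub fn digest_off_thread(state: &LocalAuthState) -> usize {
    // Copy the data out of the `Rc` first. Moving `state.secret` itself into
    // the closure would not compile, because `Rc<String>` is not `Send`.
    let secret: String = (*state.secret).clone();
    thread::spawn(move || expensive_digest(&secret))
        .join()
        .expect("blocking task panicked")
}
```

Unlike the async helper proposed above, `join()` blocks the calling thread, so this only demonstrates the ownership constraint, not the non-blocking behavior.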

Contributor Author

This issue applies equally to all node types (receivers, processors, exporters). They all share the same single-threaded runtime. None of them currently provide a spawn_blocking helper. I think documenting the non-blocking contract and potentially adding a spawn_blocking helper would be better as a follow-up that covers all node types uniformly, not just extensions.

@@ -0,0 +1,126 @@
// Copyright The OpenTelemetry Authors
Contributor

Do we need local/shared versions of extensions, both here and in the ExtensionWrapper? It seems like we already use Arc for cloning and Sync support anyway, and extensions are Send only. So maybe we can drop this separation for extensions?

Contributor Author

The extension's service handles are Arc-based and Send + Sync, but the extension implementation itself can hold !Send internal state. This mirrors the pattern used by receivers, processors, and exporters. They all have local/shared variants. Since the engine runs on current_thread + LocalSet, !Send is the natural default. Removing the local variant would force extension authors to add unnecessary Send boilerplate for state that never leaves the thread. I'd prefer to keep it for consistency and flexibility.
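
A simplified sketch of the local/shared split described here. `LocalExtension` and `SharedExtension` stand in for `local::Extension` and `shared::Extension`, and the synchronous `start()` returning a `String` is an assumption made to keep the example self-contained; the real traits take a control channel and an effect handler.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Stand-in for `local::Extension`: no `Send` bound on implementors.
pub trait LocalExtension {
    fn start(&mut self) -> String;
}

/// Stand-in for `shared::Extension`: implementors must be `Send`.
pub trait SharedExtension: Send {
    fn start(&mut self) -> String;
}

/// One wrapper type covering both flavors.
pub enum ExtensionWrapper {
    Local(Box<dyn LocalExtension>),
    Shared(Box<dyn SharedExtension>),
}

impl ExtensionWrapper {
    /// Dispatches to whichever flavor is wrapped.
    pub fn start(&mut self) -> String {
        match self {
            ExtensionWrapper::Local(ext) => ext.start(),
            ExtensionWrapper::Shared(ext) => ext.start(),
        }
    }
}

/// Hypothetical extension with thread-local state: `Rc<Cell<_>>` is `!Send`,
/// so this type can only implement the local trait.
pub struct CountingExtension {
    pub starts: Rc<Cell<u32>>,
}

impl LocalExtension for CountingExtension {
    fn start(&mut self) -> String {
        self.starts.set(self.starts.get() + 1);
        format!("local extension started {} time(s)", self.starts.get())
    }
}

/// Hypothetical stateless extension that satisfies the `Send` bound.
pub struct StatelessExtension;

impl SharedExtension for StatelessExtension {
    fn start(&mut self) -> String {
        "shared extension started".to_owned()
    }
}
```

With only the shared variant, `CountingExtension` would have to swap `Rc<Cell<_>>` for `Arc` plus an atomic or a mutex, even though its state never leaves the thread.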

///
/// Returns an [`AuthError`] if credentials are unavailable
/// (e.g., token not yet refreshed, provider unreachable).
fn get_request_metadata(&self)
Contributor

Does client here mean clients that use HTTP headers? I think something more agnostic that focuses on atomic functionality could be more widely useful, something like what I did in my PR: a BearerTokenProvider (or similar) that returns a bearer token. How the consumer uses it is none of our concern. This is also very beneficial if the consumer wants easy access to things like the bearer token's expiration date.
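
For illustration, a token-centric contract along these lines might look like the sketch below. Only the name `BearerTokenProvider` comes from the comment; `BearerToken`, `TokenError`, `StaticTokenProvider`, and all method names are assumptions and are not taken from either PR.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical token value that carries its own expiration time.
#[derive(Clone, Debug, PartialEq)]
pub struct BearerToken {
    pub value: String,
    pub expires_at: SystemTime,
}

impl BearerToken {
    /// True once `now` has reached the expiration time.
    pub fn is_expired_at(&self, now: SystemTime) -> bool {
        now >= self.expires_at
    }

    /// Time left before expiry, or `None` if the token has already expired.
    pub fn remaining(&self, now: SystemTime) -> Option<Duration> {
        self.expires_at.duration_since(now).ok()
    }

    /// One possible consumer-side use: build an HTTP header pair.
    pub fn authorization_header(&self) -> (String, String) {
        ("authorization".to_owned(), format!("Bearer {}", self.value))
    }
}

/// Hypothetical error for when no token can be produced.
#[derive(Debug, PartialEq)]
pub enum TokenError {
    Unavailable,
}

/// The provider only hands out tokens; how they are used is up to the caller.
pub trait BearerTokenProvider: Send + Sync {
    fn bearer_token(&self) -> Result<BearerToken, TokenError>;
}

/// Hypothetical provider that always returns the same token.
pub struct StaticTokenProvider {
    token: BearerToken,
}

impl StaticTokenProvider {
    pub fn new(value: &str, expires_at: SystemTime) -> Self {
        Self { token: BearerToken { value: value.to_owned(), expires_at } }
    }
}

impl BearerTokenProvider for StaticTokenProvider {
    fn bearer_token(&self) -> Result<BearerToken, TokenError> {
        Ok(self.token.clone())
    }
}
```

In this shape, header construction lives with the consumer, so a non-HTTP exporter could read the same token without going through header metadata.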

@lquerel
Contributor

lquerel commented Feb 27, 2026

Pipeline components like receivers and exporters often need shared services

Receivers and exporters both receive ExtensionRegistry, but processors do not. Is this intended, with processor support to be added later? There could be real-world use cases where processors need extension access (I believe the Go collector also supports this).

@lalitb @utpilla, I second this. In my view, all node types should be able to access extensions. However, before we can get there, we first need to introduce an init method (or extend the constructor function used by the factories) in our Receiver, Processor, and Exporter traits. That will also solve quite a few issues along the way and will allow us to pass the ExtensionRegistry, including to processors. I think this can be introduced in a separate PR.

@lquerel
Contributor

lquerel commented Feb 27, 2026

@utpilla First feedback, given that I haven't read the entire PR. I like the idea of reusing the #[distributed_slice] concept and factories for extensions. It's clean and extensible. However, I didn't see any integration with our configuration model. In the configuration for a specific pipeline, how can I specify that I want to instantiate a specific implementation and configure the extension that is compatible with the ServerAuthenticator or ClientAuthenticator trait?

Before continuing the review, I'd really like to see a concrete example of an extension configuration (in our YAML files) and how it hooks up to, for example, a receiver.

/// Implement this trait in an auth extension to provide client-side
/// authentication. The extension decides what headers to attach
/// (e.g., `Authorization: Bearer <token>`, custom API key headers).
pub trait ClientAuthenticator: Send {
Contributor

Can async trait methods be supported in this pattern?

@gouslu
Contributor

gouslu commented Feb 28, 2026

Please don't merge this PR without my approval. I have been working on an extension system as well and I have a different opinion on how I think extensions should be implemented. I think the core idea is very similar in many cases, but I would like us to work together on this feature @utpilla.

@gouslu
Contributor

gouslu commented Mar 1, 2026

@utpilla this is what I have put together as an alternative -> #2141

gouslu added a commit to gouslu/otel-arrow that referenced this pull request Mar 2, 2026
Based on utpilla's insight in open-telemetry#2113 that extensions never touch pipeline data.
@utpilla
Contributor Author

utpilla commented Mar 5, 2026

However, I didn't see any integration with our configuration model. In the configuration for a specific pipeline, how can I specify that I want to instantiate a specific implementation and configure the extension that is compatible with the ServerAuthenticator or ClientAuthenticator trait?

Before continuing the review, I'd really like to see a concrete example of an extension configuration (in our YAML files) and how it hooks up to, for example, a receiver.

@lquerel You could check this diff to get an idea of what a sample config could look like on both the receiver and exporter ends: utpilla#3


5 participants