feat: add distributed tracing (Jaeger) to local-setup#922

Open
crstrn13 wants to merge 3 commits into main from cristu/issues/907

Conversation


@crstrn13 crstrn13 commented Apr 6, 2026

Summary

Implements #907 - Adds a complete distributed tracing configuration to make local-setup, so that control plane (operator) and data plane (gateway) flows can be debugged locally.

Changes

1. Infrastructure Setup

Tools namespace (make/tools.mk):

  • Parameterized namespace with TOOLS_NAMESPACE variable (default: tools)
  • Helm install now uses --namespace $(TOOLS_NAMESPACE)
  • Single source of truth for tools deployment

Tracing variables (make/vars.mk):

INSTALL_TRACING ?= true                  # Enable by default
TOOLS_NAMESPACE ?= tools                 # Tools deployment namespace
JAEGER_COLLECTOR_ENDPOINT ?= rpc://jaeger-collector.$(TOOLS_NAMESPACE).svc.cluster.local:4317
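
For illustration, the way the default endpoint follows the tools namespace can be sketched in Python. The function name `collector_endpoint` and its parameters are hypothetical and not part of this PR; the sketch only mirrors the string that the `JAEGER_COLLECTOR_ENDPOINT` default above expands to.

```python
def collector_endpoint(namespace: str = "tools", scheme: str = "rpc", port: int = 4317) -> str:
    """Build the in-cluster DNS name of the Jaeger collector Service.

    Hypothetical helper: mirrors the Make default
    rpc://jaeger-collector.$(TOOLS_NAMESPACE).svc.cluster.local:4317
    """
    return f"{scheme}://jaeger-collector.{namespace}.svc.cluster.local:{port}"


# Default namespace, as with a plain `make local-setup`
print(collector_endpoint())
# Custom namespace, as with TOOLS_NAMESPACE=my-tools
print(collector_endpoint("my-tools"))
```

One consequence worth keeping in mind: any place that spells out `tools` literally instead of using the variable would not follow a custom `TOOLS_NAMESPACE`.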

2. Control Plane Tracing (kuadrant-operator)

New target (make/kuadrant.mk): configure-kuadrant-tracing-operator

  • Patches operator deployment with OTEL environment variables:
    • OTEL_EXPORTER_OTLP_ENDPOINT=rpc://jaeger-collector.tools.svc.cluster.local:4317
    • OTEL_EXPORTER_OTLP_INSECURE=true
    • LOG_LEVEL=debug
  • Waits for rollout to complete
  • Traces reconciliation loops, policy processing, webhook calls

3. Data Plane Tracing (gateway/envoy)

New target (make/kuadrant.mk): configure-kuadrant-tracing-cr

  • Patches Kuadrant CR with complete observability configuration:
spec:
  observability:
    enable: true
    dataPlane:
      defaultLevels:
        - debug: 'true'
      httpHeaderIdentifier: x-request-id
    tracing:
      defaultEndpoint: rpc://jaeger-collector.tools.svc.cluster.local:4317
      insecure: true
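
As a rough sketch, the CR fragment above corresponds to a JSON merge patch like the one built below. The helper name `kuadrant_tracing_patch` is hypothetical, and whether the Make target uses a merge patch is an assumption; only the field names and values are taken from the YAML above.

```python
import json


def kuadrant_tracing_patch(endpoint: str, insecure: bool = True) -> dict:
    """Return a merge-patch body matching the observability spec shown above.

    Hypothetical helper; the patch type used by the Make target is assumed.
    """
    return {
        "spec": {
            "observability": {
                "enable": True,
                "dataPlane": {
                    "defaultLevels": [{"debug": "true"}],
                    "httpHeaderIdentifier": "x-request-id",
                },
                "tracing": {"defaultEndpoint": endpoint, "insecure": insecure},
            }
        }
    }


patch = kuadrant_tracing_patch("rpc://jaeger-collector.tools.svc.cluster.local:4317")
# The serialized body could be passed to `kubectl patch ... --type=merge -p`
print(json.dumps(patch, indent=2))
```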

4. Istio Observability Configuration

New target (make/istio.mk): configure-istio-tracing

  • Patches Istio CR with:
    • JSON access logs to stdout with 17 detailed fields
    • OpenTelemetry extension provider pointing to Jaeger
    • Enable tracing in mesh config
  • Creates Telemetry resource with 100% sampling rate
  • Matches production template from helm-charts-olm ossm3
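
The two objects this target works with can be sketched as plain dictionaries. This is a minimal sketch, not the target's actual payload: the function name is hypothetical, the `accessLogFormat` string is left out for brevity, and the Telemetry `apiVersion` is an assumption that depends on the installed Istio version. The provider name `jaeger-otlp`, the resource name `default-telemetry`, and the 100% sampling value come from this PR.

```python
def istio_tracing_resources(tools_namespace: str = "tools"):
    """Return (Istio CR merge patch, Telemetry resource) as dictionaries."""
    provider = "jaeger-otlp"
    mesh_patch = {
        "spec": {
            "values": {
                "meshConfig": {
                    "accessLogFile": "/dev/stdout",
                    "accessLogEncoding": "JSON",
                    # accessLogFormat omitted here for brevity
                    "enableTracing": True,
                    "extensionProviders": [
                        {
                            "name": provider,
                            "opentelemetry": {
                                "port": 4317,
                                "service": f"jaeger-collector.{tools_namespace}.svc.cluster.local",
                            },
                        }
                    ],
                }
            }
        }
    }
    telemetry = {
        "apiVersion": "telemetry.istio.io/v1",  # assumption: may be v1alpha1 on older Istio
        "kind": "Telemetry",
        "metadata": {"name": "default-telemetry", "namespace": "istio-system"},
        "spec": {
            "tracing": [
                {"providers": [{"name": provider}], "randomSamplingPercentage": 100}
            ]
        },
    }
    return mesh_patch, telemetry
```

The point of building both from one function is that the Telemetry resource must reference the same provider name that the mesh config registers.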

Access log format includes:

  • Request/response metadata (method, path, protocol, response_code)
  • Performance metrics (duration, bytes_sent/received)
  • Tracing correlation (request_id, x_forwarded_for)
  • Routing information (upstream_host, upstream_cluster, route_name)
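
A small sketch of how such a JSON access log line might be consumed when debugging. The field names are the 17 keys from the access log format in this PR; the helper name and the sample values are made up for illustration.

```python
import json

# The 17 keys of the JSON access log format configured by this PR
ACCESS_LOG_FIELDS = (
    "start_time", "method", "path", "protocol", "response_code",
    "response_flags", "bytes_received", "bytes_sent", "duration",
    "upstream_service_time", "x_forwarded_for", "user_agent", "request_id",
    "authority", "upstream_host", "upstream_cluster", "route_name",
)


def parse_access_log(line: str):
    """Parse one log line and report which expected fields are absent."""
    record = json.loads(line)
    missing = [field for field in ACCESS_LOG_FIELDS if field not in record]
    return record, missing


# Hypothetical sample line; real values come from the gateway's stdout
sample = json.dumps({**{f: "-" for f in ACCESS_LOG_FIELDS},
                     "request_id": "hypothetical-request-id",
                     "response_code": "200"})
record, missing = parse_access_log(sample)
print(record["request_id"], record["response_code"], missing)
```

The `request_id` value is what ties a log line to the `x-request-id` header configured as `httpHeaderIdentifier` in the Kuadrant CR.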

5. Deployment Flow

Updated (make/local-setup.mk):

1. Gateway provider install (istio/envoygateway)
2. deploy-testsuite-tools (Jaeger → tools namespace)
3. configure-istio-tracing (if INSTALL_TRACING=true AND Istio)
4. deploy-kuadrant-operator
5. deploy-kuadrant-cr
6. configure-kuadrant-tracing-operator (if INSTALL_TRACING=true)
7. configure-kuadrant-tracing-cr (if INSTALL_TRACING=true)

Key improvement: Tracing configuration happens AFTER Jaeger is deployed (no broken references)
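
The conditional ordering above can be modelled in a few lines. This is only a sketch of the described flow, not the Makefile logic itself; the function name is hypothetical and the step names are the targets listed above.

```python
def local_setup_steps(install_tracing: bool = True, provider: str = "istio") -> list:
    """Return the ordered steps of the flow described above."""
    steps = [f"install gateway provider ({provider})", "deploy-testsuite-tools"]
    # Istio tracing is only configured when tracing is enabled AND the provider is Istio
    if install_tracing and provider == "istio":
        steps.append("configure-istio-tracing")
    steps += ["deploy-kuadrant-operator", "deploy-kuadrant-cr"]
    # Kuadrant tracing is configured last, once the operator and the CR exist
    if install_tracing:
        steps += ["configure-kuadrant-tracing-operator", "configure-kuadrant-tracing-cr"]
    return steps


for step in local_setup_steps(install_tracing=True, provider="istio"):
    print(step)
```

In every branch, the tools deployment (which includes Jaeger) precedes any step that references the collector endpoint.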

6. Configuration & Documentation

Settings template (config/settings.local.yaml.tpl):

tracing:
  backend: "jaeger"
  collector_url: "rpc://jaeger-collector.tools.svc.cluster.local:4317"
  query_url: "http://jaeger-query.tools.svc.cluster.local:80"

Documentation (CLAUDE.md):

  • New "Accessing Jaeger Tracing" section
  • How to access Jaeger UI via port-forward
  • How to run control plane and data plane tracing tests
  • How to disable tracing

Test Plan

Verify OTEL env vars on operator

kubectl get deployment kuadrant-operator-controller-manager \
  -n kuadrant-system \
  -o jsonpath='{.spec.template.spec.containers[0].env[?(@.name=="OTEL_EXPORTER_OTLP_ENDPOINT")].value}'
# Expected: rpc://jaeger-collector.tools.svc.cluster.local:4317
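
The same lookup can be done on the parsed output of `kubectl get deployment ... -o json`, which may be handy inside a test. A minimal sketch: the helper name `container_env` is hypothetical and the sample document below is hand-written, not real cluster output.

```python
def container_env(deployment: dict, name: str, container_index: int = 0):
    """Return the value of an env var on one container, or None if it is not set.

    Equivalent in spirit to the jsonpath filter above.
    """
    containers = deployment["spec"]["template"]["spec"]["containers"]
    for entry in containers[container_index].get("env", []):
        if entry.get("name") == name:
            return entry.get("value")
    return None


# Hand-written stand-in for the Deployment JSON
sample_deployment = {
    "spec": {"template": {"spec": {"containers": [{
        "name": "manager",
        "env": [
            {"name": "OTEL_EXPORTER_OTLP_ENDPOINT",
             "value": "rpc://jaeger-collector.tools.svc.cluster.local:4317"},
            {"name": "OTEL_EXPORTER_OTLP_INSECURE", "value": "true"},
        ],
    }]}}}
}
print(container_env(sample_deployment, "OTEL_EXPORTER_OTLP_ENDPOINT"))
```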

Verify Kuadrant CR tracing config

kubectl get kuadrant kuadrant-sample -n kuadrant-system -o yaml | grep -A 5 "tracing:"
# Expected:
#   tracing:
#     defaultEndpoint: rpc://jaeger-collector.tools.svc.cluster.local:4317
#     insecure: true

Verify Istio tracing config

kubectl get istio default -n istio-system -o yaml | grep -A 5 "extensionProviders"
kubectl get telemetry default-telemetry -n istio-system

Run tracing tests

# Control plane tracing tests (40 tests - currently skipped)
make testsuite/tests/singlecluster/tracing/control_plane/

# Data plane tracing tests (10 tests - currently skipped)
make testsuite/tests/singlecluster/tracing/data_plane_tracing/

# All observability tests
make observability

Access Jaeger UI

kubectl port-forward -n tools svc/jaeger-query 16686:80
# Open http://localhost:16686
# Run tests and verify traces appear for "kuadrant-operator" service

Expected Impact

Before:

  • ❌ 40 control plane tracing tests skipped (no OTEL env vars)
  • ❌ 10 data plane tracing tests skipped (no Kuadrant CR tracing config)
  • ❌ Developers cannot debug reconciliation or request flows

After:

  • ✅ ~50 tracing tests run and pass instead of being skipped
  • ✅ Control plane traces: operator reconciliation loops, policy changes
  • ✅ Data plane traces: HTTP requests through gateway/envoy
  • ✅ JSON access logs for request debugging
  • ✅ Local environment matches production observability setup

Configuration Options

Enable (default):

make local-setup  # Tracing enabled by default

Disable:

INSTALL_TRACING=false make local-setup

Custom tools namespace:

TOOLS_NAMESPACE=my-tools make local-setup

Technical Details

Endpoints:

  • Control plane (operator): rpc://jaeger-collector.tools.svc.cluster.local:4317 (OTLP)
  • Data plane (gateway): Same endpoint (OTLP)
  • Jaeger query UI: http://jaeger-query.tools.svc.cluster.local:80

Tracing components:

  • Jaeger collector: Port 4317 (OTLP gRPC)
  • Jaeger query: Port 80 (HTTP UI)
  • Both deployed via kuadrant-olm/tools-instances Helm chart
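
Because the scheme and port of the collector endpoint are easy to mix up, a quick sanity check can help. The sketch below relies on the conventional OTLP default ports (4317 for gRPC, 4318 for HTTP); the helper name is hypothetical, and the `http_scheme` flag only reports whether the URL uses http or https, without claiming what any particular exporter accepts.

```python
from urllib.parse import urlsplit

# Conventional OTLP default ports
OTLP_TRANSPORT_BY_PORT = {4317: "OTLP/gRPC", 4318: "OTLP/HTTP"}


def describe_endpoint(url: str) -> dict:
    """Split a collector endpoint into its parts and guess the transport from the port."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "transport": OTLP_TRANSPORT_BY_PORT.get(parts.port, "unknown"),
        "http_scheme": parts.scheme in ("http", "https"),
    }


print(describe_endpoint("rpc://jaeger-collector.tools.svc.cluster.local:4317"))
print(describe_endpoint("http://jaeger-collector.tools.svc.cluster.local:4318"))
```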

Sampling:

  • Istio: 100% (randomSamplingPercentage: 100)
  • Kuadrant: Configured via CR

Related

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Configurable distributed tracing via INSTALL_TRACING, with Make targets to configure Istio and Kuadrant tracing
  • Documentation

    • Added guide for local Jaeger deployment, configuration, port‑forward UI access and tracing test targets
  • Chores

    • Switched tracing endpoints to internal Kubernetes DNS, made tracing tools deployable to a configurable namespace, and added tracing-related Make variables and targets

Signed-off-by: Alexander Cristurean <acristur@redhat.com>

coderabbitai bot commented Apr 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2ea65469-837f-4e60-80d4-07faab86dacc

📥 Commits

Reviewing files that changed from the base of the PR and between 7cd2f44 and 0b3d881.

📒 Files selected for processing (3)
  • make/istio.mk
  • make/vars.mk
  • testsuite/config/__init__.py
✅ Files skipped from review due to trivial changes (1)
  • make/vars.mk
🚧 Files skipped from review as they are similar to previous changes (1)
  • make/istio.mk

Walkthrough

Adds Jaeger tracing documentation and wiring for local environments: new docs, Makefile variables and targets to configure Istio and Kuadrant for OTLP/Jaeger, updated local-setup flow to conditionally deploy and configure tracing, and testsuite/tools configuration changes to use a configurable namespace and collector endpoint.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Documentation | CLAUDE.md | New "Accessing Jaeger Tracing" section: how to deploy locally, collector endpoint usage, enabling/disabling via INSTALL_TRACING and CR spec.observability.tracing.defaultEndpoint, port-forward commands and test-suite make targets. |
| Config template | config/settings.local.yaml.tpl | Simplified commented tracing template: removed backend/collector_url comments and replaced example query_url with internal DNS http://jaeger-query.tools.svc.cluster.local:80. |
| Make variables | make/vars.mk | Added TOOLS_NAMESPACE ?= tools, INSTALL_TRACING ?= true, and JAEGER_COLLECTOR_ENDPOINT ?= http://jaeger-collector.$(TOOLS_NAMESPACE).svc.cluster.local:4318. |
| Make targets — Istio | make/istio.mk | Added configure-istio-tracing target that patches the Istio CR to enable tracing, set JSON access logs, add extensionProviders for a Jaeger OTLP provider, and applies a Telemetry resource with jaeger-otlp and 100% sampling. |
| Make targets — Kuadrant | make/kuadrant.mk | Added configure-kuadrant-tracing-operator to set OTEL env vars and debug logging on the operator Deployment, and configure-kuadrant-tracing-cr to patch the kuadrant-sample CR to enable observability and set data-plane tracing defaults and endpoint. |
| Local setup orchestration | make/local-setup.mk | local-setup now conditionally runs tracing-related steps when INSTALL_TRACING=true; deploys test tools earlier and runs Istio tracing only when GATEWAYAPI_PROVIDER=istio; later calls Kuadrant tracing config targets under INSTALL_TRACING. |
| Test tools deployment | make/tools.mk | deploy-testsuite-tools now deploys into $(TOOLS_NAMESPACE) (namespace creation and Helm install use the variable). |
| Testsuite defaults | testsuite/config/__init__.py | Changed default tracing.collector_url to fetch jaeger-collector using http on port 4318 (was rpc/4317). |

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Dev as Developer (make)
    participant Kube as Kubernetes API
    participant Istio as Istio control plane
    participant Kuadrant as Kuadrant operator / CR
    participant Jaeger as Jaeger collector/query

    Dev->>Kube: run `make local-setup` (INSTALL_TRACING=true)
    Note right of Dev: create tools namespace & install test tools
    Dev->>Kube: apply `configure-istio-tracing` (if GATEWAYAPI_PROVIDER=istio)
    Kube->>Istio: patch Istio CR (enable tracing, add extensionProvider)
    Dev->>Kube: apply Telemetry resource -> Istio
    Dev->>Kube: run `configure-kuadrant-tracing-operator`
    Kube->>Kuadrant: patch operator Deployment (OTEL env vars)
    Dev->>Kube: run `configure-kuadrant-tracing-cr`
    Kube->>Kuadrant: patch kuadrant-sample CR (observability.tracing.defaultEndpoint)
    Kuadrant->>Jaeger: emit OTLP traces to `JAEGER_COLLECTOR_ENDPOINT`
    Dev->>Jaeger: access Jaeger UI via `kubectl port-forward` (per docs)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Poem

🐰 I hop through spans and tiny threads,

OTLP carrots line my beds,
Jaeger's lights and traces gleam,
Makefiles stitch the tracing dream,
Debug carrots, hops of steam 🥕✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat: add distributed tracing (Jaeger) to local-setup' clearly summarises the main change—adding Jaeger tracing to local development setup—and follows the conventional commit format. |
| Description check | ✅ Passed | The PR description is comprehensive and well-structured with clear sections covering motivation, infrastructure changes, control/data plane tracing, Istio configuration, deployment flow, and test plan. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@crstrn13 crstrn13 self-assigned this Apr 6, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
make/istio.mk (1)

35-36: Consider extracting the complex JSON patch for maintainability.

The single-line kubectl patch command at line 36 is very long and difficult to read or modify. Consider using a heredoc or external JSON file to improve maintainability.

Suggested refactor using heredoc
 	@echo "Configuring Istio for tracing with Jaeger..."
 	@# Patch Istio CR to add tracing extension provider and JSON access logs
-	@kubectl patch istio default -n istio-system --type=merge -p '{"spec":{"values":{"meshConfig":{"accessLogFile":"/dev/stdout","accessLogEncoding":"JSON","accessLogFormat":"{\"start_time\":\"%START_TIME%\",\"method\":\"%REQ(:METHOD)%\",\"path\":\"%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%\",\"protocol\":\"%PROTOCOL%\",\"response_code\":\"%RESPONSE_CODE%\",\"response_flags\":\"%RESPONSE_FLAGS%\",\"bytes_received\":\"%BYTES_RECEIVED%\",\"bytes_sent\":\"%BYTES_SENT%\",\"duration\":\"%DURATION%\",\"upstream_service_time\":\"%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%\",\"x_forwarded_for\":\"%REQ(X-FORWARDED-FOR)%\",\"user_agent\":\"%REQ(USER-AGENT)%\",\"request_id\":\"%REQ(X-REQUEST-ID)%\",\"authority\":\"%REQ(:AUTHORITY)%\",\"upstream_host\":\"%UPSTREAM_HOST%\",\"upstream_cluster\":\"%UPSTREAM_CLUSTER%\",\"route_name\":\"%ROUTE_NAME%\"}","enableTracing":true,"defaultConfig":{"tracing":{}},"extensionProviders":[{"name":"jaeger-otlp","opentelemetry":{"port":4317,"service":"jaeger-collector.$(TOOLS_NAMESPACE).svc.cluster.local"}}]}}}}'
+	@kubectl patch istio default -n istio-system --type=merge -p "$$( \
+		cat <<-EOF
+		{
+		  "spec": {
+		    "values": {
+		      "meshConfig": {
+		        "accessLogFile": "/dev/stdout",
+		        "accessLogEncoding": "JSON",
+		        "enableTracing": true,
+		        "defaultConfig": {"tracing": {}},
+		        "extensionProviders": [{
+		          "name": "jaeger-otlp",
+		          "opentelemetry": {
+		            "port": 4317,
+		            "service": "jaeger-collector.$(TOOLS_NAMESPACE).svc.cluster.local"
+		          }
+		        }]
+		      }
+		    }
+		  }
+		}
+		EOF
+	)"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@make/istio.mk` around lines 35 - 36, The long single-line kubectl patch
command (the line invoking "kubectl patch istio default -n istio-system
--type=merge -p") embeds a large JSON payload (including keys like
accessLogFile, accessLogEncoding, accessLogFormat, enableTracing,
defaultConfig.tracing, and extensionProviders) which is hard to read and
maintain; extract that JSON into a separate, well-formatted artifact (either a
checked-in JSON/YAML file or a heredoc block) and update the kubectl invocation
to read the payload from that source (e.g., pass the file contents or heredoc
output into the -p argument), ensuring the accessLogFormat and
extensionProviders JSON remains valid and quoting is preserved.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@CLAUDE.md`:
- Around line 78-81: The docs and implementation disagree on Jaeger OTLP port:
update either the documentation text or the default JAEGER_COLLECTOR_ENDPOINT so
they match; specifically, change the documentation line that claims traces are
sent to port 4318 to reflect port 4317, or change the default value of
JAEGER_COLLECTOR_ENDPOINT (in make/vars.mk) to use 4318 if you intend to use
OTLP/HTTP. Locate the symbol JAEGER_COLLECTOR_ENDPOINT and the doc text
"Operator sends traces to jaeger-collector.tools.svc.cluster.local:4318" and
make them consistent (pick 4317 for OTLP/gRPC or 4318 for OTLP/HTTP) and update
any related README/CLAUDE.md references accordingly.

In `@make/vars.mk`:
- Around line 42-44: The JAEGER_COLLECTOR_ENDPOINT default uses an unsupported
"rpc://" scheme; update the JAEGER_COLLECTOR_ENDPOINT variable to use an http or
https scheme (e.g., change JAEGER_COLLECTOR_ENDPOINT ?= rpc://... to
JAEGER_COLLECTOR_ENDPOINT ?=
http://jaeger-collector.$(TOOLS_NAMESPACE).svc.cluster.local:4317) so the
OpenTelemetry OTLP exporter accepts the endpoint; ensure any related logic that
reads JAEGER_COLLECTOR_ENDPOINT continues to work with the http/https URL.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 0131b018-c2a7-4ee2-8208-4a58fbca094a

📥 Commits

Reviewing files that changed from the base of the PR and between d6450a3 and dd8fb42.

📒 Files selected for processing (7)
  • CLAUDE.md
  • config/settings.local.yaml.tpl
  • make/istio.mk
  • make/kuadrant.mk
  • make/local-setup.mk
  • make/tools.mk
  • make/vars.mk

Comment thread CLAUDE.md
Comment thread make/vars.mk Outdated
Signed-off-by: Alexander Cristurean <acristur@redhat.com>
@crstrn13 crstrn13 added this to Kuadrant Apr 7, 2026
Signed-off-by: Alexander Cristurean <acristur@redhat.com>
@crstrn13 crstrn13 moved this to In Review in Kuadrant Apr 7, 2026
@crstrn13 crstrn13 moved this from In Review to Ready For Review in Kuadrant Apr 7, 2026
@crstrn13 crstrn13 requested a review from a team April 16, 2026 13:52