
LLM Tracking

Most customers don't need this. Inside botanu.event(...), the OTel auto-instrumentors for OpenAI, Anthropic, Vertex AI, and LangChain already produce GenAI semantic-convention spans with gen_ai.* attributes and run-context stamping. Reach for track_llm_call only when the library you're calling isn't auto-instrumented (custom inference endpoint, self-hosted model server, proprietary SDK) or when you need to set content for eval manually.

track_llm_call

from botanu.tracking.llm import track_llm_call

with track_llm_call(provider="openai", model="gpt-4") as tracker:
    response = await openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello"}]
    )
    tracker.set_tokens(
        input_tokens=response.usage.prompt_tokens,
        output_tokens=response.usage.completion_tokens,
    )
    tracker.set_request_id(response.id)

What Gets Recorded

| Attribute | Example | Description |
| --- | --- | --- |
| `gen_ai.operation.name` | `chat` | Type of operation |
| `gen_ai.provider.name` | `openai` | Normalized provider name |
| `gen_ai.request.model` | `gpt-4` | Requested model |
| `gen_ai.response.model` | `gpt-4-0613` | Actual model used |
| `gen_ai.usage.input_tokens` | `150` | Input/prompt tokens |
| `gen_ai.usage.output_tokens` | `200` | Output/completion tokens |
| `gen_ai.response.id` | `chatcmpl-...` | Provider request ID |

LLMTracker Methods

set_tokens()

Record token usage from the response:

tracker.set_tokens(
    input_tokens=150,
    output_tokens=200,
    cached_tokens=50,        # For providers with caching
    cache_read_tokens=50,    # Anthropic-style cache read
    cache_write_tokens=100,  # Anthropic-style cache write
)

set_request_id()

Record provider and client request IDs for billing reconciliation:

tracker.set_request_id(
    provider_request_id=response.id,      # From provider response
    client_request_id="my-client-123",    # Your tracking ID
)

set_response_model()

When the response uses a different model than requested:

tracker.set_response_model("gpt-4-0613")

set_input_content() / set_output_content()

Capture the prompt text and response text for downstream evaluation.

tracker.set_input_content(prompt_text)
tracker.set_output_content(response_text)

Both methods are gated by BotanuConfig.content_capture_rate:

  • Default rate is 0.0 — both calls no-op. Nothing is written to the span.
  • Set the rate to roughly 0.10 to 0.20 in production (or 1.0 in a sandbox) to start capturing. The gate is a simple random.random() < rate check, so the decision is made per call.
  • Text is truncated at max_chars (default 4096) before being stamped.
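In plain Python, the gate amounts to something like this (an illustrative sketch of the behavior described above, not the SDK's internal code; `rate` and `max_chars` stand in for the config fields):

```python
import random

def maybe_capture(text: str, rate: float = 0.0, max_chars: int = 4096):
    """Return the text to stamp on the span, or None to no-op.

    Mirrors the documented behavior: a per-call random gate
    followed by truncation at max_chars.
    """
    if random.random() >= rate:  # rate=0.0 never fires, rate=1.0 always does
        return None
    return text[:max_chars]  # truncate before stamping
```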

When capture fires, the SDK writes:

| Attribute | Source |
| --- | --- |
| `botanu.eval.input_content` | `set_input_content(text)` |
| `botanu.eval.output_content` | `set_output_content(text)` |

PII is scrubbed in-process by default before the attribute is written — regex patterns for email, phone, SSN, credit card, IPs, JWTs, and common API keys. Optional Presidio NER adds name/address/medical-term detection (install with pip install botanu[pii-nlp]). Collector regex + evaluator Presidio remain downstream as belt-and-suspenders. See Content Capture for the full pipeline, opt-out knobs, and the event-level auto-capture path that botanu.event(...) provides.
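As a rough illustration of the in-process regex pass (a sketch with a deliberately tiny pattern set; the SDK's real patterns also cover SSNs, credit cards, JWTs, and API keys as noted above):

```python
import re

# Illustrative subset only; the SDK ships a much broader pattern set.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ipv4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}

def scrub(text: str) -> str:
    """Replace each PII match with a typed placeholder like <email>."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text
```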

set_request_params()

Record request parameters for analysis:

tracker.set_request_params(
    temperature=0.7,
    top_p=0.9,
    max_tokens=1000,
    stop_sequences=["END"],
    frequency_penalty=0.5,
    presence_penalty=0.3,
)

set_streaming()

Mark as a streaming request:

tracker.set_streaming(True)

set_cache_hit()

Mark as a cache hit (for semantic caching):

tracker.set_cache_hit(True)

set_attempt()

Track retry attempts:

tracker.set_attempt(2)  # Second attempt

set_finish_reason()

Record the stop reason:

tracker.set_finish_reason("stop")  # or "length", "content_filter", etc.

set_error()

Record errors (automatically called on exceptions):

try:
    response = await client.chat(...)
except openai.RateLimitError as e:
    tracker.set_error(e)
    raise

add_metadata()

Add custom attributes:

tracker.add_metadata(
    prompt_version="v2.1",
    experiment_id="exp-123",
)

Operation Types

Use ModelOperation constants for the operation parameter:

from botanu.tracking.llm import track_llm_call, ModelOperation

# Chat completion
with track_llm_call(provider="openai", model="gpt-4", operation=ModelOperation.CHAT):
    ...

# Embeddings
with track_llm_call(provider="openai", model="text-embedding-3-small", operation=ModelOperation.EMBEDDINGS):
    ...

# Text completion (legacy)
with track_llm_call(provider="openai", model="davinci", operation=ModelOperation.TEXT_COMPLETION):
    ...

Available operations:

| Constant | Value | Use Case |
| --- | --- | --- |
| `CHAT` | `chat` | Chat completions (default) |
| `TEXT_COMPLETION` | `text_completion` | Legacy completions |
| `EMBEDDINGS` | `embeddings` | Embedding generation |
| `GENERATE_CONTENT` | `generate_content` | Generic content generation |
| `EXECUTE_TOOL` | `execute_tool` | Tool/function execution |
| `CREATE_AGENT` | `create_agent` | Agent creation |
| `INVOKE_AGENT` | `invoke_agent` | Agent invocation |
| `RERANK` | `rerank` | Reranking |
| `IMAGE_GENERATION` | `image_generation` | Image generation |
| `SPEECH_TO_TEXT` | `speech_to_text` | Transcription |
| `TEXT_TO_SPEECH` | `text_to_speech` | Speech synthesis |

Provider Normalization

Provider names are automatically normalized:

| Input | Normalized |
| --- | --- |
| `openai`, `OpenAI` | `openai` |
| `azure_openai`, `azure-openai` | `azure.openai` |
| `anthropic`, `claude` | `anthropic` |
| `bedrock`, `aws_bedrock` | `aws.bedrock` |
| `vertex`, `vertexai`, `gemini` | `gcp.vertex_ai` |
| `cohere` | `cohere` |
| `mistral`, `mistralai` | `mistral` |
| `together`, `togetherai` | `together` |
| `groq` | `groq` |
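The table above boils down to a case-insensitive alias lookup. A minimal sketch (illustrative only, not the SDK's internal table, which may cover additional aliases):

```python
# Alias table transcribed from the normalization table above.
_PROVIDER_ALIASES = {
    "openai": "openai",
    "azure_openai": "azure.openai",
    "azure-openai": "azure.openai",
    "anthropic": "anthropic",
    "claude": "anthropic",
    "bedrock": "aws.bedrock",
    "aws_bedrock": "aws.bedrock",
    "vertex": "gcp.vertex_ai",
    "vertexai": "gcp.vertex_ai",
    "gemini": "gcp.vertex_ai",
    "cohere": "cohere",
    "mistral": "mistral",
    "mistralai": "mistral",
    "together": "together",
    "togetherai": "together",
    "groq": "groq",
}

def normalize_provider(name: str) -> str:
    """Lower-case the input and map known aliases; pass unknowns through."""
    key = name.strip().lower()
    return _PROVIDER_ALIASES.get(key, key)
```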

Tool/Function Tracking

Track tool calls triggered by LLMs:

from botanu.tracking.llm import track_tool_call

with track_tool_call(tool_name="search_database", tool_call_id="call_abc123") as tool:
    results = await do_work(query)
    tool.set_result(
        success=True,
        items_returned=len(results),
        bytes_processed=1024,
    )

ToolTracker Methods

# Set execution result
tool.set_result(
    success=True,
    items_returned=10,
    bytes_processed=2048,
)

# Set tool call ID from LLM response
tool.set_tool_call_id("call_abc123")

# Record error
tool.set_error(exception)

# Add custom metadata
tool.add_metadata(query_type="semantic")

Standalone Helpers

For cases where you can't use context managers:

set_llm_attributes()

from botanu.tracking.llm import set_llm_attributes

set_llm_attributes(
    provider="openai",
    model="gpt-4",
    operation="chat",
    input_tokens=150,
    output_tokens=200,
    streaming=True,
    provider_request_id="chatcmpl-...",
)

set_token_usage()

from botanu.tracking.llm import set_token_usage

set_token_usage(
    input_tokens=150,
    output_tokens=200,
    cached_tokens=50,
)

Decorator for Auto-Instrumentation

For wrapping existing client methods:

from botanu.tracking.llm import llm_instrumented

class MyOpenAIClient:
    @llm_instrumented(provider="openai", tokens_from_response=True)
    def chat(self, model: str, messages: list):
        return openai.chat.completions.create(model=model, messages=messages)
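The decorator pattern itself is straightforward. A self-contained sketch of its general shape (illustrative only; the real `llm_instrumented` stamps span attributes rather than appending to a list, and with `tokens_from_response=True` it also extracts usage from the return value):

```python
import functools

RECORDED_CALLS = []  # stand-in for span attribute stamping

def instrumented(provider: str):
    """Wrap a client method, record call metadata, and re-raise errors
    so callers still see the original exception."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            call = {"provider": provider, "model": kwargs.get("model")}
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                call["error"] = type(exc).__name__
                raise
            finally:
                RECORDED_CALLS.append(call)  # always record, success or not
        return wrapper
    return decorator

@instrumented(provider="openai")
def fake_chat(model: str, messages: list):
    return {"choices": ["ok"]}
```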

Metrics

The SDK automatically records these metrics:

| Metric | Type | Description |
| --- | --- | --- |
| `gen_ai.client.token.usage` | Histogram | Token counts by type |
| `gen_ai.client.operation.duration` | Histogram | Operation duration in seconds |
| `botanu.gen_ai.attempts` | Counter | Request attempts (including retries) |

Example: multi-provider fallback

from anthropic import AsyncAnthropic, RateLimitError
from openai import AsyncOpenAI

import botanu
from botanu.tracking.llm import track_llm_call

anthropic = AsyncAnthropic()
openai = AsyncOpenAI()


@botanu.event(
    workflow="process-with-fallback",
    event_id=lambda data: data["id"],
    customer_id=lambda data: data["customer_id"],
)
async def process_with_fallback(data):
    try:
        with track_llm_call(provider="anthropic", model="claude-3-opus") as tracker:
            tracker.set_attempt(1)
            response = await anthropic.messages.create(
                model="claude-3-opus-20240229",
                max_tokens=1024,
                messages=[{"role": "user", "content": data["prompt"]}],
            )
            tracker.set_tokens(
                input_tokens=response.usage.input_tokens,
                output_tokens=response.usage.output_tokens,
            )
            botanu.emit_outcome(value_type="items_processed", value_amount=1)
            return response.content[0].text

    except RateLimitError:
        with track_llm_call(provider="openai", model="gpt-4") as tracker:
            tracker.set_attempt(2)
            response = await openai.chat.completions.create(
                model="gpt-4",
                messages=[{"role": "user", "content": data["prompt"]}],
            )
            tracker.set_tokens(
                input_tokens=response.usage.prompt_tokens,
                output_tokens=response.usage.completion_tokens,
            )
            botanu.emit_outcome(value_type="items_processed", value_amount=1)
            return response.choices[0].message.content

See Also