fix(logging): show cache hits in Stats log and fix duplicate metadata restore #1666

Merged
Pouyanpi merged 1 commit into develop from fix/logging-cache-hits-in-stats on Feb 26, 2026

@Pouyanpi
Collaborator

Description

  • Add a cache_hits counter to LLMStats so the processing summary shows how many calls were served from cache, e.g. "Stats: 5 total calls (3 from cache), ..."
  • Rename the "LLM Stats:" label to "Stats:" in generate_async to align with generate_events_async and process_events_async, which already use "Stats:"
  • Fix a copy-paste bug in get_from_cache_and_restore_stats where restore_llm_metadata_from_cache was called three times instead of once
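
The conditional "(N from cache)" suffix described above can be illustrated with a small stand-alone sketch. StatsSketch and its counter layout are hypothetical stand-ins, not the actual LLMStats implementation; only the counter name cache_hits and the output wording come from this PR, and the two-decimal time format is an assumption.

```python
class StatsSketch:
    """Hypothetical stand-in for LLMStats (not the library's class)."""

    def __init__(self):
        # Assumed counter layout; the real class tracks more fields.
        self.counters = {"total_calls": 0, "cache_hits": 0, "total_time": 0.0}

    def inc(self, name, value=1):
        # Increment a named counter, creating it if needed.
        self.counters[name] = self.counters.get(name, 0) + value

    def __str__(self):
        calls = f"{self.counters['total_calls']} total calls"
        # Only mention the cache when at least one call was served from it.
        if self.counters["cache_hits"] > 0:
            calls += f" ({self.counters['cache_hits']} from cache)"
        return f"{calls}, {self.counters['total_time']:.2f} total time"


stats = StatsSketch()
stats.inc("total_calls", 4)
cold = str(stats)   # no cache hits yet, so no suffix
stats.inc("cache_hits", 2)
warm = str(stats)   # suffix appears once cache_hits > 0
```

Keeping the suffix conditional means the summary line is unchanged for runs that never touch the cache.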

Test plan

Unit tests:

Save the following script (cache_stats_display.py) and run it against the nemoguards_cache config:

import asyncio
import logging

from nemoguardrails import LLMRails, RailsConfig

# The Stats line is emitted at INFO level.
logging.basicConfig(level=logging.INFO)

config = RailsConfig.from_path("./examples/configs/nemoguards_cache")
rails = LLMRails(config=config, verbose=True)

# Per-model caches registered with the runtime (empty dict if none are registered).
caches = rails.runtime.registered_action_params.get("model_caches", {})
messages = [{"role": "user", "content": "What is the company policy on paid time off?"}]


async def main():
    # First request: nothing is cached yet.
    print("\n========== REQUEST 1 (cold cache) ==========\n")
    await rails.generate_async(messages=messages)
    for name, cache in caches.items():
        print(f"  {name}: {cache.get_stats()}")

    # Second, identical request: cacheable calls should now be served from cache.
    print("\n========== REQUEST 2 (warm cache) ==========\n")
    await rails.generate_async(messages=messages)
    for name, cache in caches.items():
        print(f"  {name}: {cache.get_stats()}")


asyncio.run(main())

Then filter for the Stats line:

poetry run python cache_stats_display.py 2>&1 | grep "Stats:"

Expected output:

14:13:42.597 | Total processing took 6.35 seconds. Stats: 4 total calls, 6.08 total time, 2734 total tokens, 2333 total prompt tokens, 401 total completion tokens, [0.51, 0.29, 4.85, 0.43] as latencies
14:13:48.738 | Total processing took 6.14 seconds. Stats: 4 total calls (2 from cache), 6.07 total time, 2898 total tokens, 2415 total prompt tokens, 483 total completion tokens, [0.0, 0.0, 5.49, 0.59] as latencies
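
To check the expected output programmatically rather than by eye, the call counts can be pulled out of a Stats line with a regular expression. This is a sketch that assumes the line format shown above; parse_calls and STATS_RE are hypothetical helper names, not part of nemoguardrails.

```python
import re

# Matches "Stats: 4 total calls" with an optional " (2 from cache)" suffix.
STATS_RE = re.compile(r"Stats: (\d+) total calls(?: \((\d+) from cache\))?")


def parse_calls(line):
    """Return (total_calls, cache_hits) for a Stats log line, or None."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    total = int(match.group(1))
    # The suffix is omitted when there were no cache hits.
    cached = int(match.group(2)) if match.group(2) else 0
    return total, cached


cold_line = "Total processing took 6.35 seconds. Stats: 4 total calls, 6.08 total time"
warm_line = "Total processing took 6.14 seconds. Stats: 4 total calls (2 from cache), 6.07 total time"
cold_counts = parse_calls(cold_line)
warm_counts = parse_calls(warm_line)
```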

Pouyanpi requested a review from tgasser-nv on February 25, 2026, 13:21
Pouyanpi self-assigned this on Feb 25, 2026
Pouyanpi added the "bug" (Something isn't working) label on Feb 25, 2026
Pouyanpi added this to the v0.21 milestone on Feb 25, 2026
@greptile-apps
Contributor

greptile-apps bot commented Feb 25, 2026

Greptile Summary

This PR makes three focused improvements to LLM cache observability and correctness:

  • Adds cache_hits counter to LLMStats: When LLM calls are served from cache, the stats summary now reports how many were cache hits (e.g., 4 total calls (2 from cache)), giving better visibility into cache effectiveness.
  • Fixes copy-paste bug: restore_llm_metadata_from_cache was called 3 times instead of once in get_from_cache_and_restore_stats; the two duplicate calls are removed. Because the function is idempotent, the bug caused no data corruption, only unnecessary repeated work.
  • Aligns log label: Renames LLM Stats: to Stats: in generate_async to match the label already used in generate_events_async and process_events_async.

All changes are well-scoped, with corresponding test updates covering the new cache_hits assertion.

Confidence Score: 5/5

  • This PR is safe to merge — it adds a counter, fixes a copy-paste bug, and aligns a log label with no behavioral risk.
  • All changes are minimal, well-understood, and low-risk. The cache_hits counter is additive and doesn't affect existing stats. The duplicate metadata restore removal is a clear bug fix for idempotent code. The label rename matches existing conventions. Tests cover the new behavior.
  • No files require special attention.

Important Files Changed

Filename | Overview
--- | ---
nemoguardrails/logging/stats.py | Adds cache_hits counter to _get_empty_stats() and updates __str__ to conditionally display cache hit count alongside total calls. Clean and correct implementation.
nemoguardrails/llm/cache/utils.py | Adds llm_stats.inc("cache_hits") to restore_llm_stats_from_cache and removes duplicate restore_llm_metadata_from_cache calls (copy-paste bug fix). Both changes are correct.
nemoguardrails/rails/llm/llmrails.py | Renames LLM Stats: to Stats: in generate_async log message, aligning with the label already used in generate_events_async and process_events_async.
tests/test_cache_utils.py | Adds cache_hits assertions to three existing test methods to verify the new counter is properly incremented on cache restore operations.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[get_from_cache_and_restore_stats] --> B{Cache hit?}
    B -- No --> C[Return None]
    B -- Yes --> D[Extract result, stats, metadata]
    D --> E{cached_stats exists?}
    E -- Yes --> F[restore_llm_stats_from_cache]
    F --> F1["inc(total_calls)"]
    F --> F2["inc(cache_hits) ✨ NEW"]
    F --> F3["inc(total_time, ...)"]
    F --> F4["inc(total_tokens, ...)"]
    E -- No --> G{cached_metadata exists?}
    F1 & F2 & F3 & F4 --> G
    G -- Yes --> H["restore_llm_metadata_from_cache (called once — fixed)"]
    G -- No --> I[Append to processing_log]
    H --> I
    I --> J[Return result]

Last reviewed commit: 2fd77cd
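
The control flow in the flowchart above can be approximated with plain dictionaries. This is a rough sketch, not the code in nemoguardrails/llm/cache/utils.py: the function names, dictionary keys, and log entry shape below are hypothetical, and the time and latency counters are left out for brevity.

```python
def restore_stats(stats, cached_stats):
    # A cache hit still counts as a call, and is also counted as a hit.
    stats["total_calls"] += 1
    stats["cache_hits"] += 1
    stats["total_tokens"] += cached_stats.get("total_tokens", 0)


def restore_metadata(info, cached_metadata):
    # Plain overwrite, so repeated calls leave the same state (idempotent).
    info["metadata"] = dict(cached_metadata)


def get_from_cache_and_restore(cache, key, stats, info, processing_log):
    entry = cache.get(key)
    if entry is None:
        return None  # cache miss: the caller falls through to a real call
    if entry.get("llm_stats"):
        restore_stats(stats, entry["llm_stats"])
    if entry.get("llm_metadata"):
        restore_metadata(info, entry["llm_metadata"])  # called exactly once
    processing_log.append({"type": "llm_call_info", "data": info})
    return entry["result"]


cache = {"k": {"result": "ok", "llm_stats": {"total_tokens": 10},
               "llm_metadata": {"model": "m"}}}
stats = {"total_calls": 0, "cache_hits": 0, "total_tokens": 0}
info, log = {}, []
hit = get_from_cache_and_restore(cache, "k", stats, info, log)
miss = get_from_cache_and_restore(cache, "absent", stats, info, log)
```

Because the metadata restore in this sketch is a plain overwrite, calling it three times would leave the same state as calling it once, which mirrors why the duplicated calls were wasteful rather than harmful.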

Contributor

greptile-apps bot left a comment

4 files reviewed, no comments

Collaborator

tgasser-nv left a comment

Looks great. The new logging is far clearer about which responses are returned from cache.

Pouyanpi merged commit eeebc93 into develop on Feb 26, 2026
15 checks passed
Pouyanpi deleted the fix/logging-cache-hits-in-stats branch on February 26, 2026, 07:26
@codecov

codecov bot commented Feb 26, 2026

Codecov Report

❌ Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
--- | --- | ---
nemoguardrails/logging/stats.py | 80.00% | 1 Missing ⚠️
