fix(logging): show cache hits in Stats log and fix duplicate metadata restore by Pouyanpi · Pull Request #1666 · NVIDIA-NeMo/Guardrails

Pouyanpi · 2026-02-25T13:21:44Z

Description

Add a cache_hits counter to LLMStats so the processing summary shows how many calls were served from cache, e.g. Stats: 5 total calls (3 from cache), ...
Rename the LLM Stats: label to Stats: in generate_async to align with generate_events_async and process_events_async, which already use Stats:.
Fix a copy-paste bug in get_from_cache_and_restore_stats where restore_llm_metadata_from_cache was called three times instead of once.
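
The first change can be pictured with a minimal sketch. SketchStats below is a simplified, hypothetical stand-in and not the actual LLMStats class; only the cache_hits counter name, the inc call, and the "(N from cache)" wording come from this PR. Everything else is assumed for illustration.

```python
class SketchStats:
    """Simplified stand-in for LLMStats (hypothetical, for illustration only)."""

    def __init__(self):
        # cache_hits starts at zero alongside the existing counters.
        self._stats = {"total_calls": 0, "cache_hits": 0, "total_time": 0.0}

    def inc(self, name, value=1):
        # Increment a named counter by the given amount.
        self._stats[name] = self._stats.get(name, 0) + value

    def __str__(self):
        calls = f"{self._stats['total_calls']} total calls"
        # Mention the cache only when at least one call was served from it.
        if self._stats["cache_hits"] > 0:
            calls += f" ({self._stats['cache_hits']} from cache)"
        return f"{calls}, {round(self._stats['total_time'], 2)} total time"


stats = SketchStats()
for _ in range(5):
    stats.inc("total_calls")
for _ in range(3):
    stats.inc("cache_hits")

summary = str(stats)
print("Stats:", summary)
```

With no cache hits the parenthetical is omitted, so logs from runs without caching look the same as before.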

Test plan

Unit tests:

Save the following script (cache_stats_display.py) and run it against the nemoguards_cache config:

import asyncio
import logging

from nemoguardrails import LLMRails, RailsConfig

logging.basicConfig(level=logging.INFO)

config = RailsConfig.from_path("./examples/configs/nemoguards_cache")
rails = LLMRails(config=config, verbose=True)

caches = rails.runtime.registered_action_params.get("model_caches", {})
messages = [{"role": "user", "content": "What is the company policy on paid time off?"}]


async def main():
    print("\n========== REQUEST 1 (cold cache) ==========\n")
    await rails.generate_async(messages=messages)
    for name, cache in caches.items():
        print(f"  {name}: {cache.get_stats()}")

    print("\n========== REQUEST 2 (warm cache) ==========\n")
    await rails.generate_async(messages=messages)
    for name, cache in caches.items():
        print(f"  {name}: {cache.get_stats()}")


asyncio.run(main())

Then filter for the Stats line:

poetry run python cache_stats_display.py 2>&1 | grep "Stats:"

Expected output:

14:13:42.597 | Total processing took 6.35 seconds. Stats: 4 total calls, 6.08 total time, 2734 total tokens, 2333 total prompt tokens, 401 total completion tokens, [0.51, 0.29, 4.85, 0.43] as latencies
14:13:48.738 | Total processing took 6.14 seconds. Stats: 4 total calls (2 from cache), 6.07 total time, 2898 total tokens, 2415 total prompt tokens, 483 total completion tokens, [0.0, 0.0, 5.49, 0.59] as latencies
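
To check the expected output programmatically rather than by eye, the call counts can be pulled out of each Stats line. This is a small sketch, not part of the PR: parse_call_counts and the regular expression are hypothetical helpers, and the sample lines are shortened copies of the output above.

```python
import re

# Matches "Stats: 4 total calls" with an optional "(2 from cache)" suffix.
STATS_RE = re.compile(r"Stats: (\d+) total calls(?: \((\d+) from cache\))?")


def parse_call_counts(line):
    """Return (total_calls, cache_hits), or None if the line has no Stats summary."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    total = int(match.group(1))
    # The cache suffix is absent when no call was served from cache.
    hits = int(match.group(2)) if match.group(2) else 0
    return total, hits


cold = "Total processing took 6.35 seconds. Stats: 4 total calls, 6.08 total time"
warm = "Total processing took 6.14 seconds. Stats: 4 total calls (2 from cache), 6.07 total time"

print(parse_call_counts(cold))  # (4, 0)
print(parse_call_counts(warm))  # (4, 2)
```

The two 0.0 entries in the second run's latency list are consistent with the two calls reported as served from cache.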


greptile-apps · 2026-02-25T13:23:51Z

Greptile Summary

This PR makes three focused improvements to LLM cache observability and correctness:

Adds cache_hits counter to LLMStats: When LLM calls are served from cache, the stats summary now reports how many were cache hits (e.g., 4 total calls (2 from cache)), giving better visibility into cache effectiveness.
Fixes copy-paste bug: restore_llm_metadata_from_cache was called three times instead of once in get_from_cache_and_restore_stats; the two duplicate calls are removed. Because the function is idempotent, the bug caused no data corruption, only unnecessary repeated work.
Aligns log label: Renames LLM Stats: to Stats: in generate_async to match the label already used in generate_events_async and process_events_async.
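
The idempotency point can be illustrated with a small sketch. It assumes, without confirmation from the source, that the metadata restore overwrites state by assignment while the stats restore adds to counters. restore_metadata and restore_stats are hypothetical names, not the library's functions.

```python
def restore_metadata(context, cached_metadata):
    # Assignment-style restore (assumed): repeating it leaves the same final state.
    context["metadata"] = dict(cached_metadata)


def restore_stats(counters, cached_stats):
    # Increment-style restore (assumed): repeating it would double-count.
    for key, value in cached_stats.items():
        counters[key] = counters.get(key, 0) + value


context = {}
cached_metadata = {"model_name": "example-model"}

restore_metadata(context, cached_metadata)
once = dict(context)
restore_metadata(context, cached_metadata)
restore_metadata(context, cached_metadata)
thrice = dict(context)
print(once == thrice)  # True: three calls end in the same state as one

counters = {}
restore_stats(counters, {"total_tokens": 10})
restore_stats(counters, {"total_tokens": 10})
print(counters)  # {'total_tokens': 20}: a duplicated increment is visible
```

Under these assumptions, a triplicated metadata restore wastes work but leaves the state unchanged, whereas the same mistake on the stats path would have inflated the counters.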

All changes are well-scoped, with corresponding test updates covering the new cache_hits assertion.

Confidence Score: 5/5

This PR is safe to merge: it adds a counter, fixes a copy-paste bug, and aligns a log label with no behavioral risk.
All changes are minimal, well-understood, and low-risk. The cache_hits counter is additive and doesn't affect existing stats. Removing the duplicate metadata restore calls is a clear bug fix and is safe because the function is idempotent. The label rename matches existing conventions. Tests cover the new behavior.
No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| nemoguardrails/logging/stats.py | Adds `cache_hits` counter to `_get_empty_stats()` and updates `__str__` to conditionally display cache hit count alongside total calls. Clean and correct implementation. |
| nemoguardrails/llm/cache/utils.py | Adds `llm_stats.inc("cache_hits")` to `restore_llm_stats_from_cache` and removes duplicate `restore_llm_metadata_from_cache` calls (copy-paste bug fix). Both changes are correct. |
| nemoguardrails/rails/llm/llmrails.py | Renames `LLM Stats:` to `Stats:` in `generate_async` log message, aligning with the label already used in `generate_events_async` and `process_events_async`. |
| tests/test_cache_utils.py | Adds `cache_hits` assertions to three existing test methods to verify the new counter is properly incremented on cache restore operations. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[get_from_cache_and_restore_stats] --> B{Cache hit?}
    B -- No --> C[Return None]
    B -- Yes --> D[Extract result, stats, metadata]
    D --> E{cached_stats exists?}
    E -- Yes --> F[restore_llm_stats_from_cache]
    F --> F1["inc(total_calls)"]
    F --> F2["inc(cache_hits) ✨ NEW"]
    F --> F3["inc(total_time, ...)"]
    F --> F4["inc(total_tokens, ...)"]
    E -- No --> G{cached_metadata exists?}
    F1 & F2 & F3 & F4 --> G
    G -- Yes --> H["restore_llm_metadata_from_cache (called once — fixed)"]
    G -- No --> I[Append to processing_log]
    H --> I
    I --> J[Return result]
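
The same control flow can be sketched in Python. This follows the flowchart only; it is not the code in nemoguardrails/llm/cache/utils.py. The function name, the dict-based cache, and the entry keys ("result", "llm_stats", "llm_metadata") are assumptions made for the example. The flowchart's total_time increment is left out because the value it adds is not shown.

```python
from collections import Counter


def get_from_cache_and_restore_stats_sketch(cache, key, stats, metadata, processing_log):
    """Hypothetical sketch of the flowchart; names and entry layout are assumed."""
    entry = cache.get(key)
    if entry is None:
        return None  # cache miss: nothing to restore

    cached_stats = entry.get("llm_stats")
    if cached_stats:
        stats["total_calls"] += 1
        stats["cache_hits"] += 1  # the counter added by this PR
        stats["total_tokens"] += cached_stats.get("total_tokens", 0)

    cached_metadata = entry.get("llm_metadata")
    if cached_metadata:
        metadata.update(cached_metadata)  # restored exactly once

    processing_log.append({"type": "cache_hit", "key": key})
    return entry["result"]


demo_cache = {
    "prompt-1": {
        "result": "cached answer",
        "llm_stats": {"total_tokens": 42},
        "llm_metadata": {"model_name": "example-model"},
    }
}
demo_stats, demo_metadata, demo_log = Counter(), {}, []

print(get_from_cache_and_restore_stats_sketch(
    demo_cache, "prompt-1", demo_stats, demo_metadata, demo_log))
print(dict(demo_stats))
```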

Last reviewed commit: 2fd77cd

greptile-apps

4 files reviewed, no comments


tgasser-nv

Looks great, the new logging is far clearer about which responses are returned from cache

codecov · 2026-02-26T07:27:44Z

Codecov Report

❌ Patch coverage is 85.71429% with 1 line in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| nemoguardrails/logging/stats.py | 80.00% | 1 Missing ⚠️ |
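
The percentages above imply how many changed lines were measured. The report does not state those totals, so the figures below are inferred from the numbers it does give. total_lines is a hypothetical helper for this arithmetic.

```python
def total_lines(patch_percent, missing):
    """Infer the number of changed lines from a coverage percentage and a miss count."""
    uncovered_fraction = 1 - patch_percent / 100
    return round(missing / uncovered_fraction)


overall = total_lines(85.71429, 1)  # whole patch
stats_py = total_lines(80.00, 1)    # nemoguardrails/logging/stats.py

print(overall, stats_py)  # 7 5
# Cross-check: 6 of 7 lines covered reproduces the reported percentage.
print(round(6 / 7 * 100, 5))  # 85.71429
```

On this reading, the patch has 7 measured lines with 6 covered, and the single uncovered line is one of 5 measured lines in stats.py.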


fix(logging): show cache hits in Stats log and fix duplicate metadata restore

2fd77cd

Pouyanpi requested a review from tgasser-nv February 25, 2026 13:21

Pouyanpi self-assigned this Feb 25, 2026

Pouyanpi added the bug Something isn't working label Feb 25, 2026

Pouyanpi added this to the v0.21 milestone Feb 25, 2026

greptile-apps bot reviewed Feb 25, 2026


tgasser-nv approved these changes Feb 26, 2026


Pouyanpi merged commit eeebc93 into develop Feb 26, 2026
15 checks passed

Pouyanpi deleted the fix/logging-cache-hits-in-stats branch February 26, 2026 07:26

tgasser-nv mentioned this pull request Feb 27, 2026

chore(iorails): Increase work queue concurrency and depth #1674

Merged

4 tasks
