fix(logging): show cache hits in Stats log and fix duplicate metadata restore#1666
Merged
fix(logging): show cache hits in Stats log and fix duplicate metadata restore#1666
Conversation
Contributor
Greptile SummaryThis PR makes three focused improvements to LLM cache observability and correctness:
All changes are well-scoped, with corresponding test updates covering the new
|
| Filename | Overview |
|---|---|
| nemoguardrails/logging/stats.py | Adds cache_hits counter to _get_empty_stats() and updates __str__ to conditionally display cache hit count alongside total calls. Clean and correct implementation. |
| nemoguardrails/llm/cache/utils.py | Adds llm_stats.inc("cache_hits") to restore_llm_stats_from_cache and removes duplicate restore_llm_metadata_from_cache calls (copy-paste bug fix). Both changes are correct. |
| nemoguardrails/rails/llm/llmrails.py | Renames LLM Stats: to Stats: in generate_async log message, aligning with the label already used in generate_events_async and process_events_async. |
| tests/test_cache_utils.py | Adds cache_hits assertions to three existing test methods to verify the new counter is properly incremented on cache restore operations. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[get_from_cache_and_restore_stats] --> B{Cache hit?}
B -- No --> C[Return None]
B -- Yes --> D[Extract result, stats, metadata]
D --> E{cached_stats exists?}
E -- Yes --> F[restore_llm_stats_from_cache]
F --> F1["inc(total_calls)"]
F --> F2["inc(cache_hits) ✨ NEW"]
F --> F3["inc(total_time, ...)"]
F --> F4["inc(total_tokens, ...)"]
E -- No --> G{cached_metadata exists?}
F1 & F2 & F3 & F4 --> G
G -- Yes --> H["restore_llm_metadata_from_cache (called once — fixed)"]
G -- No --> I[Append to processing_log]
H --> I
I --> J[Return result]
Last reviewed commit: 2fd77cd
tgasser-nv
approved these changes
Feb 26, 2026
Collaborator
tgasser-nv
left a comment
There was a problem hiding this comment.
Looks great, the new logging is far clearer about which responses are returned from cache
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
cache_hitscounter toLLMStatsso the processing summary shows how many calls were served from cache, e.g.Stats: 5 total calls (3 from cache), ...LLM Stats:label toStats:ingenerate_asyncto align withgenerate_events_asyncandprocess_events_asyncwhich already useStats:get_from_cache_and_restore_statswhererestore_llm_metadata_from_cachewas called 3 times instead of onceTest plan
Unit tests:
Save the following script (cache_stats_display.py) and run it against the
nemoguards_cacheconfig:Then filter for the Stats line:
Expected output: