feat: add Redis L2 cache with daily sync #66

Draft
jor2 wants to merge 37 commits into main from feat/redis-l2-cache

Conversation

@jor2 (Member) commented Jan 27, 2026

Summary

Adds optional Redis as a persistent L2 cache layer to reduce API calls and improve response times.

Architecture:

Request → L1 Memory (5min) → L2 Redis (48h) → API
  • L1 hit: instant return
  • L1 miss, L2 hit: populate L1, return
  • Both miss: fetch from API, populate both
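The lookup path above can be sketched as follows. This is a minimal, dependency-free sketch of the tiered flow only; the real `AsyncTieredCache` in `tim_mcp/utils/tiered_cache.py` backs L2 with Redis via aiocache, whereas here L2 is an in-process dict so the example runs without a Redis server:

```python
import asyncio
import time


class AsyncTieredCache:
    """Sketch of the L1 (memory) -> L2 -> API flow described above."""

    def __init__(self, l1_ttl=300, l2_ttl=48 * 3600):
        self._l1 = {}  # key -> (value, expires_at)
        self._l2 = {}  # stand-in for Redis: key -> (value, expires_at)
        self.l1_ttl = l1_ttl
        self.l2_ttl = l2_ttl

    def _get(self, store, key):
        entry = store.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]
        return None

    async def get(self, key, fetch):
        # L1 hit: instant return
        value = self._get(self._l1, key)
        if value is not None:
            return value
        # L1 miss, L2 hit: populate L1, return
        value = self._get(self._l2, key)
        if value is not None:
            self._l1[key] = (value, time.monotonic() + self.l1_ttl)
            return value
        # Both miss: fetch from API, populate both tiers
        value = await fetch()
        now = time.monotonic()
        self._l1[key] = (value, now + self.l1_ttl)
        self._l2[key] = (value, now + self.l2_ttl)
        return value


async def demo():
    calls = 0

    async def fetch():
        nonlocal calls
        calls += 1
        return "payload"

    cache = AsyncTieredCache()
    await cache.get("k", fetch)  # both miss -> one API call
    await cache.get("k", fetch)  # L1 hit -> no API call
    return calls


print(asyncio.run(demo()))  # -> 1
```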

Changes

  • redis_cache.py: Async Redis backend using aiocache with fresh/stale TTL support
  • tiered_cache.py: Minimal AsyncTieredCache wrapper for L1+L2 flow
  • sync_cache.py: Daily sync script with parallel fetching (asyncio.gather)
  • cache-sync.yml: GitHub Actions workflow (3 AM UTC daily)
  • server.py: Properly wires AsyncTieredCache when Redis is enabled
  • Config options: TIM_REDIS_ENABLED, TIM_REDIS_URL, TIM_L1_CACHE_TTL

Configuration

| Variable | Default | Description |
|---|---|---|
| `TIM_REDIS_ENABLED` | `false` | Enable Redis L2 cache |
| `TIM_REDIS_URL` | `redis://localhost:6379` | Connection URL (supports `rediss://` for IBM Cloud) |
| `TIM_L1_CACHE_TTL` | `300` | L1 memory cache TTL (5 min) |
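One way these variables might be consumed (a hypothetical helper for illustration — `load_redis_config` is not the actual parsing code in tim_mcp, but the variable names and defaults match the table above):

```python
import os


def load_redis_config(env=os.environ):
    """Read the TIM_* Redis settings with the documented defaults."""
    return {
        "redis_enabled": env.get("TIM_REDIS_ENABLED", "false").lower() == "true",
        "redis_url": env.get("TIM_REDIS_URL", "redis://localhost:6379"),
        "l1_cache_ttl": int(env.get("TIM_L1_CACHE_TTL", "300")),
    }


cfg = load_redis_config({"TIM_REDIS_ENABLED": "true"})
print(cfg["redis_enabled"])  # -> True
```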

Dependencies

  • Replaced redis with aiocache[redis] for simpler connection pooling and serialization

Test plan

  • Run tests: uv run pytest
  • Test with local Redis: docker run -d -p 6379:6379 redis:7-alpine
  • Verify sync script: python scripts/sync_cache.py

Jordan-Williams2 and others added 29 commits January 13, 2026 00:16
Implements in-memory caching with TTL/LRU eviction and two-tier rate
limiting (global + per-IP) to prevent upstream API throttling and
service abuse.

Core Implementation:
- InMemoryCache class using cachetools.TTLCache
  - LRU eviction when maxsize exceeded
  - TTL-based expiration (default: 3600s)
  - Stale cache fallback for graceful degradation
  - Thread-safe with threading.RLock

- RateLimiter class with sliding window algorithm
  - Global: 30 req/min across all clients
  - Per-IP: 10 req/min per client (HTTP mode only)
  - Returns 429 only if rate limited AND no cache available
  - Serves stale cache when rate limited
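The sliding-window behavior described above can be sketched like this (names are illustrative, and a later commit in this PR replaces the custom implementation with the `limits` library; the atomic `try_acquire` mirrors the method added further down):

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Sketch of a sliding-window rate limiter."""

    def __init__(self, max_requests=30, window=60.0):
        self.max_requests = max_requests
        self.window = window
        self._timestamps = deque()

    def try_acquire(self, now=None):
        """Atomically check and record a request; False means rate limited."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have fallen out of the window
        while self._timestamps and self._timestamps[0] <= now - self.window:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False  # caller serves stale cache, or 429 if none
        self._timestamps.append(now)
        return True
```

On a `False` result the caller first checks the stale cache and only returns 429 when no cached value exists, matching the behavior listed above.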

Client Integration:
- Applied @with_rate_limit decorator to all API methods
  - GitHubClient: 6 methods
  - TerraformClient: 5 methods
- Updated cache initialization to use InMemoryCache

Server Integration:
- Global rate limiter initialized on startup
- Rate limiters injected into both clients
- PerIPRateLimitMiddleware for HTTP mode
  - Extracts client IP from headers (X-Forwarded-For, X-Real-IP)
  - Adds rate limit headers to responses
  - Bypasses /health endpoint
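The header-based IP extraction might look like the following sketch (function name is illustrative; trusting `X-Forwarded-For` / `X-Real-IP` is only safe behind a proxy you control, which is the trust assumption documented later in this PR):

```python
def client_ip(headers, fallback="unknown"):
    """Resolve the client IP from proxy headers, per the middleware above."""
    forwarded = headers.get("x-forwarded-for")
    if forwarded:
        # First entry is the original client when set by a trusted proxy
        return forwarded.split(",")[0].strip()
    return headers.get("x-real-ip", fallback)
```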

Configuration:
- Added TIM_CACHE_MAXSIZE (default: 1000)
- Added TIM_GLOBAL_RATE_LIMIT (default: 30)
- Added TIM_PER_IP_RATE_LIMIT (default: 10)
- Added TIM_RATE_LIMIT_WINDOW (default: 60)
- BREAKING: Removed TIM_CACHE_DIR (file-based cache deprecated)

Testing:
- 25 unit tests (10 for cache, 15 for rate limiter)
- All tests passing
- Validated stale cache fallback
- Verified rate limiting behavior

Resolves #59
Related to epic https://github.ibm.com/GoldenEye/issues/issues/17013
- Add new environment variables to README (TIM_CACHE_MAXSIZE, TIM_GLOBAL_RATE_LIMIT, TIM_PER_IP_RATE_LIMIT, TIM_RATE_LIMIT_WINDOW)
- Document removal of TIM_CACHE_DIR (breaking change)
- Update Code Engine deployment guide with rate limiting configuration
- Update deployment command examples with new env vars
- Revise Next Steps section to reflect implemented features
- Change mcp.app.add_middleware to mcp.add_middleware
- FastMCP has add_middleware method directly on the object
- Fixes AttributeError during HTTP mode initialization
Replace custom caching and rate limiting code with established libraries
to reduce maintenance burden and improve reliability.

Changes:
- Use 'limits' library (same as slowapi) for rate limiting
- Simplify InMemoryCache to thin wrapper around cachetools
- Add slowapi and limits dependencies with pinned versions
- Reduce code from 599 to 221 lines (63% reduction)
- Maintain all functionality including stale cache fallback
- All 25 unit tests passing

Dependencies added:
- limits==5.6.0 (battle-tested rate limiting)
- slowapi==0.1.9 (FastAPI rate limiting framework)
- cachetools==5.5.0 (updated to exact version)

Benefits:
- Less custom code to maintain
- Battle-tested algorithms and implementations
- Improved reliability with well-known libraries
- No breaking changes to existing API
Reorganize environment variables into two clear sections to reduce
friction for new users:

- Basic Configuration: Only GITHUB_TOKEN and TIM_LOG_LEVEL
- Advanced Configuration: Production/hosting settings

This makes it clear that simple stdio users only need to worry about
the GitHub token, while production deployments can tune caching and
rate limiting separately.
Fix middleware registration to use FastMCP's http_app() method
which returns the underlying Starlette app. This allows proper
registration of Starlette middleware.

The add_middleware() method on FastMCP itself expects a different
middleware format, so we need to get the Starlette app first and
add our BaseHTTPMiddleware to that.
Add detailed docstring explaining why this module exists and why we
can't use an existing library:

- cachetools lacks stale cache support
- No actively maintained alternatives for general-purpose caching
- expirecache (unmaintained since 2015)
- requests-cache (HTTP-specific, not general-purpose)

Clarifies that 90% of functionality comes from cachetools and only
the stale cache feature (10%) is custom code, making this a minimal
wrapper justified by lack of alternatives.
Replace bullet-point justification with concise paragraph explaining
what this module provides beyond cachetools: stale cache support for
serving expired entries during rate limiting.
Add concise paragraph explaining what this module provides beyond
the limits library: decorator integration with stale cache fallback
for graceful degradation when rate limits are exceeded.
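The decorator-plus-stale-fallback integration described in that docstring could be sketched as below. All names and signatures here are illustrative, not the actual code in the module; the point is the order of operations — acquire, fall back to stale on denial, cache on success:

```python
import functools


def with_rate_limit(limiter, cache, key_fn):
    """Sketch: rate-limit a call, serving stale cache when denied."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            key = key_fn(*args, **kwargs)
            if not limiter.try_acquire():
                # Graceful degradation: prefer an expired entry over failing
                stale = cache.get(key, allow_stale=True)
                if stale is not None:
                    return stale
                raise RuntimeError("rate limited and no cached value")
            result = fn(*args, **kwargs)
            cache.set(key, result)
            return result

        return wrapper

    return decorator
```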
Change requires-python from >=3.11 to ==3.12.* to match PR #58
(IBM Code Engine deployment) which uses UBI8 Python 3.12.

Also update ruff target-version to py312 for consistency.

All 25 unit tests continue to pass with Python 3.12.
Update lock file to reflect requires-python = ==3.12.* constraint.

Changes:
- Remove async-timeout (not needed in Python 3.12+)
- Remove backports-tarfile (not needed in Python 3.12+)
- Remove Python 3.11/3.13/3.14 wheel URLs
- Simplify version markers
- Remove slowapi from dependencies (only limits library is used)
- Add floor of 1 to Retry-After header to prevent negative values
- Update rate_limiter docstring to remove slowapi reference
Replace unbounded dict with TTLCache for stale cache to prevent
unbounded memory growth. Stale cache uses 24x longer TTL and 2x
larger maxsize than primary cache for graceful degradation.
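The fresh/stale split with the 24x TTL multiplier could be sketched as follows (a dependency-free stand-in: plain dicts with expiry replace `cachetools.TTLCache`, and the 2x size bound is omitted for brevity):

```python
import time


class StaleAwareCache:
    """Sketch of the fresh + stale two-tier design described above."""

    STALE_TTL_MULTIPLIER = 24  # stale entries live 24x longer

    def __init__(self, ttl=3600):
        self.ttl = ttl
        self._fresh = {}  # key -> (value, expires_at)
        self._stale = {}  # key -> (value, expires_at), bounded in the real code

    def set(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self._fresh[key] = (value, now + self.ttl)
        self._stale[key] = (value, now + self.ttl * self.STALE_TTL_MULTIPLIER)

    def get(self, key, allow_stale=False, now=None):
        now = time.monotonic() if now is None else now
        store = self._stale if allow_stale else self._fresh
        entry = store.get(key)
        if entry and entry[1] > now:
            return entry[0]
        return None
```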
- Fix critical bug: middleware now calls record_request() to actually
  enforce per-IP rate limiting
- Fix test assertion: stale_size key matches implementation
- Add 12 unit tests for PerIPRateLimitMiddleware covering:
  - Rate limit enforcement and bypass paths
  - IP extraction from X-Forwarded-For and X-Real-IP headers
  - Rate limit headers in responses
  - Per-IP isolation
- Document IP header trust assumption for production deployments
- Add atomic try_acquire() method to RateLimiter to prevent race conditions
- Add missing @Retry decorator to get_repository_tree()
- Centralize cache key generation with helper functions to avoid duplication
- Use underscore for unused lambda parameters to fix linter warnings
- Reduce test sleep from 11s to 1.1s for faster test execution
- Fix decorator order: @with_rate_limit now wraps @Retry to prevent
  retrying rate-limited requests
- Remove duplicate cache key generation by having the decorator handle
  both fresh cache lookup and caching results
- Add @Retry decorator to resolve_version for consistency
- Refactor from global rate limiter to dependency injection via shared
  context module, eliminating module-level state
- Add configurable stale cache TTL and size multipliers
- Add /stats endpoint in HTTP mode for observability
- Add logging for stale cache fallback exceptions
- Update tests to use freezegun and fix test assertions
- Replace dual-cache design with single TTLCache + timestamp tracking
- Extract common decorator and cache key logic into clients/base.py
- Reduce GitHubClient from 822 to 294 lines (64% reduction)
- Reduce TerraformClient from 583 to 208 lines (64% reduction)
- Remove stale cache multiplier config options (use sensible defaults)
- Update tests to match simplified cache stats structure
- Include kwarg keys in cache key generation to prevent collisions
- Fix timestamp memory leak by cleaning up orphaned entries on cache miss
- Make rate_limit_key configurable in with_rate_limit decorator
- Extract shared check_rate_limit_response helper to reduce client duplication
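The kwarg-aware key generation from the first bullet above can be sketched like this (the helper name is illustrative, not the one in `clients/base.py`); including the kwarg names prevents `f(a=1)` and `f(b=1)` from colliding on the same key:

```python
def make_cache_key(prefix, *args, **kwargs):
    """Build a cache key that includes kwarg names, not just values."""
    parts = [prefix, *map(str, args)]
    # Sort kwargs so keyword order does not change the key
    parts += [f"{k}={kwargs[k]}" for k in sorted(kwargs)]
    return ":".join(parts)
```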
- Use Starlette Middleware class with http_app() middleware parameter
- Run uvicorn directly instead of mcp.run() to use configured app
- Fixes middleware not being applied to MCP endpoints
- Track hits, last_accessed per cache key
- Add hit_rate to /stats summary
- Add /stats/cache endpoint with per-key details (?top=N parameter)
- Fix line length violations in client and cache modules
- Remove unused BaseAPIClient class from base.py
- Remove unnecessary try/except blocks in cache.py
- Add documentation for stale_ttl_multiplier magic number
- Use datetime.UTC alias instead of timezone.utc
Add optional Redis as a persistent L2 cache layer with a daily sync job
to pre-warm the cache.

Architecture:
- L1 (memory): 5 min TTL for fast local reads
- L2 (Redis): 48h TTL for persistence across restarts

New files:
- tim_mcp/utils/redis_cache.py: Async Redis cache backend
- tim_mcp/utils/tiered_cache.py: L1+L2 cache wrapper
- scripts/sync_cache.py: Daily sync script
- .github/workflows/cache-sync.yml: 3 AM UTC daily workflow

Configuration:
- TIM_REDIS_ENABLED: Enable Redis (default: false)
- TIM_REDIS_URL: Connection URL (supports rediss:// for TLS)
- TIM_L1_CACHE_TTL: Memory cache TTL when Redis enabled
@jor2 jor2 requested a review from vburckhardt as a code owner January 27, 2026 12:08
@jor2 jor2 self-assigned this Jan 27, 2026
Jordan-Williams2 added 6 commits January 27, 2026 23:45
- Replace single TTLCache + timestamp tracking with two TTLCache instances
- Remove unused stats methods from cache, rate limiter, and middleware
- Remove X-RateLimit-* headers from middleware responses
- Rename cache_ttl to cache_fresh_ttl, add cache_evict_ttl config
- Remove /stats endpoints from HTTP mode
- Update tests to match simplified API
- Collapse advanced config in README
- Update hypothesis to 6.151.0
Integrate simplified caching/rate limiting from PR #60:
- Use fresh_ttl/evict_ttl cache model
- Remove stats endpoints (per review feedback)
- Keep Redis L2 lifecycle management
- Switch to aiocache for Redis backend (simpler, battle-tested)
- Wire up AsyncTieredCache properly so L2 actually works
- Remove dead code: unused TieredCache class, get_stats, get_sync/set_sync
- Remove unused redis_cache from context module
- Parallelize sync script with asyncio.gather (5x faster)
- Add rate limiting throttle to sync script
@jor2 jor2 marked this pull request as draft January 28, 2026 10:50
Base automatically changed from feat/caching-rate-limiting to main February 5, 2026 18:13