
Conversation

@kzndotsh
Contributor

@kzndotsh kzndotsh commented Jan 28, 2026

Pull Request

Description

Provide a clear summary of your changes and reference any related issues. Include the motivation behind these changes and list any new dependencies if applicable.

If your PR is related to an issue, please include the issue number below:

Related Issue: Closes #1164

Type of Change:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Test improvements

Guidelines

  • My code follows the style guidelines of this project (formatted with Ruff)

  • I have performed a self-review of my own code

  • I have commented my code, particularly in hard-to-understand areas

  • I have made corresponding changes to the documentation if needed

  • My changes generate no new warnings

  • I have tested this change

  • Any dependent changes have been merged and published in downstream modules

  • I have added all appropriate labels to this PR

  • I have followed all of these guidelines.

How Has This Been Tested? (if applicable)

Please describe how you tested your code, e.g., what commands you ran, with which arguments, and any relevant configuration (if applicable).

Screenshots (if applicable)

Please add screenshots to help explain your changes.

Additional Information

Please add any other information that is important to this PR.

Summary by Sourcery

Introduce a unified cache layer with optional Valkey (Redis-compatible) backend and wire it into permissions, guild config, prefixes, and health checks for shared, multi-guild-safe caching.

New Features:

  • Add a cache package providing TTL-based in-memory caching, Valkey-backed async cache backends, and a CacheService for Valkey connection management (a rough protocol sketch follows this summary).
  • Integrate an optional Valkey-backed cache into permission controllers, the permission system, guild config, jail status, and prefix management to share cached data across restarts and processes.
  • Extend configuration, orchestration, and Docker Compose to support Valkey deployment and environment configuration, plus cache-aware health checks.

Enhancements:

  • Refactor existing caching code to use new async cache managers and backends, including adapting controllers and services to invalidate and fetch from the shared cache layer.
  • Update documentation and developer guides to describe the new cache architecture, Valkey configuration, multi-guild keying, and best practices for async cache usage.
  • Adjust test suites and performance benchmarks to exercise the new cache interfaces and async behavior.

Documentation:

  • Expand caching best practices and environment reference docs to cover the new cache package, Valkey configuration, and multi-guild-safe key design.
  • Update project documentation (AGENTS and config generator docs) to mention Valkey as an optional cache dependency and how to run it via Docker.

Tests:

  • Add unit tests for cache backends, cache service, cache managers with backends, and cache-related health checks, and update existing tests to work with async cache APIs and Valkey integration.
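
A rough sketch of the AsyncCacheBackend protocol the summary describes, with method names taken from the reviewer's guide class diagram below; this is illustrative, not the exact source:

```python
from typing import Any, Protocol


class AsyncCacheBackend(Protocol):
    """Async key-value cache with optional per-entry TTL."""

    async def get(self, key: str) -> Any | None:
        """Return the cached value, or None on a miss."""
        ...

    async def set(self, key: str, value: Any, ttl_sec: float | None = None) -> None:
        """Store a value, optionally overriding the default TTL."""
        ...

    async def delete(self, key: str) -> None:
        """Remove a key if present."""
        ...

    async def exists(self, key: str) -> bool:
        """Report whether a key is currently cached."""
        ...
```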

- Introduced a comprehensive caching layer that includes both in-memory and Valkey (Redis-compatible) backends.
- Added CacheService for managing Valkey connections, along with AsyncCacheBackend, InMemoryBackend, and TTLCache for efficient data storage and retrieval.
- Implemented GuildConfigCacheManager and JailStatusCache for managing guild configurations and jail statuses, respectively, with support for asynchronous operations.
- Enhanced cache management with TTL functionality to automatically expire entries, improving performance and resource utilization.
… documentation

- Revised caching best practices to include support for Valkey backend, multi-guild safety, and detailed architecture of the caching system.
- Enhanced descriptions of cache managers and their async operations, emphasizing the importance of proper cache invalidation and configuration.
- Updated environment variable documentation to include Valkey configuration options, ensuring clarity on required settings for optimal caching performance.
- Updated .env.example to include new Valkey environment variables for host, port, database, and password.
- Enhanced config.json.example by adding BOT_INTENTS to support additional bot functionalities, improving configuration clarity for users.
- Implemented an optional health check for Valkey (cache) in the database health command.
- Updated health check logic to report the status of the Valkey connection when configured.
- Enhanced documentation and comments to reflect the inclusion of Valkey in health checks.
- Added CacheSetupService to manage Valkey cache connections during bot initialization.
- Integrated CacheSetupService into BotSetupOrchestrator for seamless setup.
- Updated PermissionSetupService to utilize cache backend for database coordination.
- Enhanced logging for cache connection status and backend wiring.
- Added CacheService to the Tux bot for managing Valkey cache connections.
- Implemented cache connection closure logic in the bot's shutdown process, enhancing resource management.
- Improved logging for cache connection status during shutdown to aid in debugging.
…ission management

- Replaced in-memory cache with a backend cache for command permission fallbacks, improving scalability and performance.
- Updated cache invalidation and retrieval logic to utilize the new cache backend, ensuring consistent behavior across command permissions.
- Enhanced logging for cache operations to aid in debugging and monitoring.
- Added support for retrieving and storing guild prefixes using a cache backend, improving performance and scalability.
- Updated prefix retrieval and setting methods to utilize the cache, ensuring consistent behavior across guilds.
- Enhanced cache invalidation logic to remove prefixes from the cache when necessary, maintaining data integrity.
- Enhanced DatabaseCoordinator to support an optional cache backend for permission controllers, improving performance and scalability.
- Updated GuildConfigController and Permission controllers to utilize the cache backend for caching and invalidating configurations and permissions.
- Refactored cache logic to ensure consistent behavior across different controllers, maintaining data integrity and improving response times.
- Modified JailStatusCache calls in the jail and unjail modules to use async methods for cache invalidation, ensuring proper handling of asynchronous operations.
- Cleaned up import statements to consistently use the new cache import path across modules.
- Updated validation checks in utility functions for improved clarity and type safety.
…upport

- Changed `get` and `invalidate` methods in `CommunicationService` to be asynchronous, ensuring proper handling of cache operations.
- Updated import statements to reflect the correct cache manager path, maintaining consistency across the module.
- Updated settings.py to include new Valkey configuration parameters: VALKEY_HOST, VALKEY_PORT, VALKEY_DB, VALKEY_PASSWORD, and VALKEY_URL.
- Modified documentation to reflect the use of Valkey in environment variable settings.
- Implemented a computed property for constructing the Valkey URL from its components, enhancing flexibility in configuration (a rough sketch follows at the end of this list).
- Deleted the TTLCache class and its associated methods from the cache module, streamlining the caching logic.
- Updated the shared module to reflect the removal of TTLCache, ensuring the `__all__` variable is now an empty list.
- This change simplifies the caching mechanism and prepares for future enhancements in cache management.
- Introduced unit tests for various cache components, including InMemoryBackend, ValkeyBackend, and CacheService, ensuring robust coverage of caching operations.
- Implemented tests for cache health checks and shared cache management, validating the behavior of cache interactions and configurations.
- Enhanced performance tests for GuildConfigCacheManager and JailStatusCache, ensuring efficient operation under load.
- Established a structured testing framework for cache-related functionalities, improving maintainability and reliability of the caching system.
- Added Valkey as an optional cache backend in the Additional tools section.
- Updated directory structure to reflect the new cache layer.
- Included instructions for starting Valkey and configuring the environment variable for shared cache usage.
- Clarified the caching mechanism, detailing the fallback to in-memory cache when Valkey is not set.
- Included Valkey version 6.1.1 in pyproject.toml and updated uv.lock to reflect the new package.
- This addition supports the integration of Valkey for caching functionalities in the project.
- Introduced a new service for Valkey in the Docker Compose file, enabling caching functionalities.
- Configured health checks, logging options, and volume management for persistent data storage.
- Updated the volumes section to include a dedicated volume for Valkey data.
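
A computed property like the one described above might look roughly like this. This is a sketch assuming pydantic-settings; field names come from the commit notes and the review snippet further down, and the host handling is simplified:

```python
from pydantic import computed_field
from pydantic_settings import BaseSettings


class Config(BaseSettings):
    VALKEY_HOST: str = "localhost"
    VALKEY_PORT: int = 6379
    VALKEY_DB: int = 0
    VALKEY_PASSWORD: str = ""
    VALKEY_URL: str = ""  # explicit override wins if set

    @computed_field
    @property
    def valkey_url(self) -> str:
        """Build a valkey:// URL from components unless VALKEY_URL overrides it."""
        if self.VALKEY_URL:
            return self.VALKEY_URL
        auth = f":{self.VALKEY_PASSWORD}@" if self.VALKEY_PASSWORD else ""
        return f"valkey://{auth}{self.VALKEY_HOST}:{self.VALKEY_PORT}/{self.VALKEY_DB}"
```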
@kzndotsh kzndotsh linked an issue Jan 28, 2026 that may be closed by this pull request
@sourcery-ai
Contributor

sourcery-ai bot commented Jan 28, 2026

Reviewer's Guide

Introduces a new cache layer with optional Valkey (Redis-compatible) backend and wires it through bot setup, permission controllers, prefix manager, guild config/jail status managers, and health checks, replacing the old tux.shared.cache usage and updating docs/tests to match.

Sequence diagram for get_command_permission with Valkey backend

sequenceDiagram
    actor User
    participant Cog as CommandCog
    participant PermSystem as PermissionSystem
    participant Backend as AsyncCacheBackend
    participant DBCoord as DatabaseCoordinator
    participant CmdCtrl as PermissionCommandController
    participant DB as Database

    User->>Cog: invoke command
    Cog->>PermSystem: get_command_permission(guild_id, command_name)
    PermSystem->>Backend: get(PERM_FALLBACK_KEY_PREFIX + guild_id + command_name)
    alt cache hit
        Backend-->>PermSystem: cached_raw
        PermSystem->>PermSystem: unwrap_optional_perm(cached_raw)
        PermSystem-->>Cog: PermissionCommand | None
        Cog-->>User: authorize or deny
    else cache miss
        PermSystem->>PermSystem: compute command_names_to_check
        PermSystem->>DBCoord: command_permissions.get_command_permission(guild_id, command_name) or batch
        DBCoord->>CmdCtrl: get_command_permission(...)
        CmdCtrl->>DB: SELECT PermissionCommand WHERE guild_id, command_name
        DB-->>CmdCtrl: PermissionCommand | None
        CmdCtrl-->>DBCoord: result
        DBCoord-->>PermSystem: result
        PermSystem->>Backend: set(PERM_FALLBACK_KEY_PREFIX + guild_id + command_name, wrap_optional_perm(result), ttl_sec=PERM_FALLBACK_TTL)
        PermSystem-->>Cog: result
        Cog-->>User: authorize or deny
    end
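The wrap/unwrap helpers in the diagram exist because a legitimate None result ("no permission override") must be cached distinguishably from a cache miss. A minimal sketch of that idea, with names following the diagram; the dict payload is an assumption, as the real code likely serializes a model:

```python
from typing import Any

# Sentinel payload standing in for a cached "None" result.
_NONE_SENTINEL: dict[str, Any] = {"__none__": True}


def wrap_optional_perm(perm: dict[str, Any] | None) -> dict[str, Any]:
    """Make a possibly-None permission record safe to store in the cache."""
    return _NONE_SENTINEL if perm is None else perm


def unwrap_optional_perm(cached_raw: dict[str, Any]) -> dict[str, Any] | None:
    """Reverse wrap_optional_perm: map the sentinel back to None."""
    return None if cached_raw.get("__none__") else cached_raw
```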

Sequence diagram for JailStatusCache.get_or_fetch with shared backend

sequenceDiagram
    actor Moderator
    participant Cog as JailCog
    participant JailCache as JailStatusCache
    participant Backend as AsyncCacheBackend
    participant Service as ModerationService
    participant DB as Database

    Moderator->>Cog: check is_jailed(guild_id, user_id)
    Cog->>JailCache: get_or_fetch(guild_id, user_id, fetch_func)
    JailCache->>Backend: get(jail_status:guild_id:user_id)
    alt cached status exists
        Backend-->>JailCache: bool
        JailCache-->>Cog: bool
        Cog-->>Moderator: result
    else cache miss
        Backend-->>JailCache: None
        JailCache->>JailCache: acquire per-user lock
        JailCache->>Backend: get(jail_status:guild_id:user_id)
        alt filled while waiting
            Backend-->>JailCache: bool
            JailCache-->>Cog: bool
            Cog-->>Moderator: result
        else still missing
            Backend-->>JailCache: None
            JailCache->>Service: fetch_func()  (get_latest_jail_or_unjail_case)
            Service->>DB: query latest case
            DB-->>Service: Case | None
            Service-->>JailCache: is_jailed: bool
            JailCache->>Backend: set(jail_status:guild_id:user_id, is_jailed, ttl=JAIL_STATUS_TTL_SEC)
            JailCache-->>Cog: is_jailed
            Cog-->>Moderator: result
        end
    end
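A minimal sketch of the double-checked locking ("stampede protection") pattern this diagram shows. The backend is assumed to expose async get/set as above; the real manager keeps one lock per (guild, user) pair, and the TTL constant here is illustrative:

```python
import asyncio
from collections.abc import Awaitable, Callable

JAIL_STATUS_TTL_SEC = 60.0  # assumed value


async def get_or_fetch(
    backend,
    key: str,
    lock: asyncio.Lock,
    fetch_func: Callable[[], Awaitable[bool]],
) -> bool:
    cached = await backend.get(key)
    if cached is not None:
        return cached  # fast path: no lock needed
    async with lock:
        # Re-check: another task may have filled the cache while we waited.
        cached = await backend.get(key)
        if cached is not None:
            return cached
        value = await fetch_func()  # single DB fetch per key
        await backend.set(key, value, ttl_sec=JAIL_STATUS_TTL_SEC)
        return value
```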

Class diagram for cache layer and permission-related integration

classDiagram
    class TTLCache {
        +float _ttl
        +int~optional~ _max_size
        +dict cache
        +__init__(ttl: float, max_size: int | None)
        +get(key: Any) Any | None
        +set(key: Any, value: Any) None
        +invalidate(key: Any | None) None
        +get_or_fetch(key: Any, fetch_fn: Callable) Any
        +size() int
        +clear() None
    }

    class AsyncCacheBackend {
        <<interface>>
        +get(key: str) Any | None
        +set(key: str, value: Any, ttl_sec: float | None) None
        +delete(key: str) None
        +exists(key: str) bool
    }

    class InMemoryBackend {
        -TTLCache _cache
        -float _default_ttl
        +__init__(default_ttl: float, max_size: int | None)
        +get(key: str) Any | None
        +set(key: str, value: Any, ttl_sec: float | None) None
        +delete(key: str) None
        +exists(key: str) bool
    }

    class ValkeyBackend {
        -Any _client
        -str _prefix
        +__init__(client: Any)
        +get(key: str) Any | None
        +set(key: str, value: Any, ttl_sec: float | None) None
        +delete(key: str) None
        +exists(key: str) bool
        -_key(key: str) str
    }

    class CacheService {
        -Valkey _client
        +__init__()
        +connect(url: str | None, **kwargs: Any) None
        +is_connected() bool
        +get_client() Valkey | None
        +ping() bool
        +close() None
    }

    class GuildConfigCacheManager {
        -static GuildConfigCacheManager _instance
        -AsyncCacheBackend _backend
        -TTLCache _cache
        -dict~int, asyncio.Lock~ _locks
        +__new__() GuildConfigCacheManager
        +set_backend(backend: AsyncCacheBackend) None
        +get(guild_id: int) dict~str, int | None~ | None
        +set(guild_id: int, audit_log_id: int | None, mod_log_id: int | None, jail_role_id: int | None, jail_channel_id: int | None) None
        +async_set(guild_id: int, audit_log_id: int | None, mod_log_id: int | None, jail_role_id: int | None, jail_channel_id: int | None) None
        +invalidate(guild_id: int) None
        +clear_all() None
        -_get_lock(guild_id: int) asyncio.Lock
        -_cache_key(guild_id: int) str
    }

    class JailStatusCache {
        -static JailStatusCache _instance
        -AsyncCacheBackend _backend
        -TTLCache _cache
        -dict~tuple~int,int~, asyncio.Lock~ _locks
        -asyncio.Lock _locks_lock
        +__new__() JailStatusCache
        +set_backend(backend: AsyncCacheBackend) None
        +get(guild_id: int, user_id: int) bool | None
        +set(guild_id: int, user_id: int, is_jailed: bool) None
        +get_or_fetch(guild_id: int, user_id: int, fetch_func: Callable) bool
        +async_set(guild_id: int, user_id: int, is_jailed: bool) None
        +invalidate(guild_id: int, user_id: int) None
        +invalidate_guild(guild_id: int) None
        +clear_all() None
        -_cache_key(guild_id: int, user_id: int) str
        -_get_lock_key(guild_id: int, user_id: int) tuple~int,int~
        -_get_lock(guild_id: int, user_id: int) asyncio.Lock
    }

    class TuxBot {
        +DatabaseService db_service
        +CacheService cache_service
        +DatabaseCoordinator _db_coordinator
    }

    class DatabaseCoordinator {
        -DatabaseService db
        -AsyncCacheBackend _cache_backend
        +__init__(db: DatabaseService | None, cache_backend: AsyncCacheBackend | None)
        +permission_ranks PermissionRankController
        +permission_assignments PermissionAssignmentController
        +command_permissions PermissionCommandController
    }

    class PermissionRankController {
        -AsyncCacheBackend _backend
        -TTLCache _ranks_cache
        -TTLCache _guild_ranks_cache
        +__init__(db: DatabaseService | None, cache_backend: AsyncCacheBackend | None)
        +create_permission_rank(...)
        +get_permission_ranks_by_guild(guild_id: int)
        +update_permission_rank(...)
        +bulk_create_permission_ranks(...)
        +delete_permission_rank(...)
    }

    class PermissionAssignmentController {
        -AsyncCacheBackend _backend
        -TTLCache _assignments_cache
        -TTLCache _user_rank_cache
        +__init__(db: DatabaseService | None, cache_backend: AsyncCacheBackend | None)
        +assign_permission_rank(...)
        +get_assignments_by_guild(guild_id: int)
        +remove_role_assignment(...)
        +get_user_permission_rank(guild_id: int, user_id: int, user_roles: list~int~) int
    }

    class PermissionCommandController {
        -AsyncCacheBackend _backend
        -TTLCache _command_permissions_cache
        +__init__(db: DatabaseService | None, cache_backend: AsyncCacheBackend | None)
        +set_command_permission(...)
        +get_command_permission(guild_id: int, command_name: str) PermissionCommand | None
    }

    class PermissionSystem {
        -TTLCache _command_permission_cache
        -AsyncCacheBackend _cache_backend
        +__init__(bot: TuxBot, db: DatabaseCoordinator)
        +set_command_permission(...)
        +get_command_permission(guild_id: int, command_name: str) PermissionCommand | None
        +batch_get_command_permissions(guild_id: int, command_names: list~str~) dict~str, PermissionCommand | None~
    }

    class PrefixManager {
        -dict~int,str~ _prefix_cache
        -bool _cache_loaded
        +get_prefix(guild_id: int | None) str
        +set_prefix(guild_id: int, prefix: str) None
        +load_all_prefixes() None
        +invalidate_cache(guild_id: int | None) None
    }

    AsyncCacheBackend <|.. InMemoryBackend
    AsyncCacheBackend <|.. ValkeyBackend

    InMemoryBackend o-- TTLCache
    GuildConfigCacheManager o-- TTLCache
    JailStatusCache o-- TTLCache

    CacheService --> ValkeyBackend
    CacheService --> Valkey

    TuxBot o-- CacheService
    TuxBot o-- DatabaseCoordinator

    DatabaseCoordinator o-- PermissionRankController
    DatabaseCoordinator o-- PermissionAssignmentController
    DatabaseCoordinator o-- PermissionCommandController

    PermissionRankController ..> AsyncCacheBackend
    PermissionAssignmentController ..> AsyncCacheBackend
    PermissionCommandController ..> AsyncCacheBackend

    PermissionSystem ..> AsyncCacheBackend
    PrefixManager ..> AsyncCacheBackend

    GuildConfigCacheManager ..> AsyncCacheBackend
    JailStatusCache ..> AsyncCacheBackend

File-Level Changes

Change Details Files
Introduce unified cache package with TTLCache, async backends (in-memory and Valkey), cache service, and shared managers for guild config and jail status, then wire it into bot setup and permission system.
  • Add src/tux/cache package with TTL-based in-memory cache, AsyncCacheBackend protocol, InMemoryBackend, ValkeyBackend with JSON serialization and key prefixing, CacheService for Valkey connection lifecycle, and managers for guild config and jail status including async APIs and stampede protection.
  • Expose cache primitives and managers via tux.cache init, including AsyncCacheBackendProtocol alias, and update imports across the codebase to use tux.cache instead of tux.shared.cache.
  • Add CacheSetupService to bot setup orchestrator to initialize Valkey when configured, wire the chosen backend (Valkey or shared InMemoryBackend) into GuildConfigCacheManager and JailStatusCache, and store CacheService on the bot for later use and graceful shutdown.
  • Update DatabaseCoordinator and permission controllers (PermissionRankController, PermissionAssignmentController, PermissionCommandController) to accept an optional AsyncCacheBackend and use it for get/set/delete of permission ranks, assignments, user ranks, and command permissions, while preserving existing TTLCache behavior as a fallback.
  • Extend PermissionSystem to use the cache backend for its command-permission fallback cache, sharing the same wrapping/unwrapping helpers used by PermissionCommandController.
src/tux/cache/__init__.py
src/tux/cache/ttl.py
src/tux/cache/backend.py
src/tux/cache/service.py
src/tux/cache/managers.py
src/tux/core/setup/cache_setup.py
src/tux/core/setup/orchestrator.py
src/tux/core/bot.py
src/tux/database/controllers/__init__.py
src/tux/database/controllers/permissions.py
src/tux/core/permission_system.py
Integrate caching backend into higher-level features (prefix manager, guild config controller, jail system, moderation communication service) and update their APIs to be async where needed.
  • Modify PrefixManager to read/write prefixes through the cache backend (Valkey or in-memory) using prefix:{guild_id} keys, update load_all_prefixes to backfill the backend, and make invalidate_cache async for backend invalidation.
  • Switch GuildConfigCacheManager usages in guild_config controller and communication_service to the new async API (await get/set/invalidate), ensuring cache invalidation happens after config changes and log-channel lookups.
  • Update JailStatusCache usages in jail system, moderation module, and tests to use the async API (await set/invalidate/clear_all), and add cache invalidation in tests before assertions that rely on fresh DB state.
  • Adjust communication_service.invalidate_guild_config_cache to be async and delegate to GuildConfigCacheManager.invalidate.
  • Make minor type-safety tweaks (e.g., del_afk runtime isinstance guard, chunks_list typing in info views) to satisfy new tooling and keep tests passing.
src/tux/core/prefix_manager.py
src/tux/database/controllers/guild_config.py
src/tux/services/moderation/communication_service.py
src/tux/modules/moderation/jail.py
src/tux/modules/moderation/__init__.py
tests/modules/test_jail_system.py
src/tux/modules/utility/__init__.py
src/tux/modules/info/views.py
Extend configuration, Docker compose, health checks, and docs to support optional Valkey deployment and document new caching behavior.
  • Add Valkey configuration fields (VALKEY_HOST/PORT/DB/PASSWORD/URL) and computed valkey_url to Config, ensure env-schema generator treats VALKEY_* as .env keys, and update env reference docs accordingly.
  • Add tux-valkey service and volume to compose.yaml for local Valkey deployment, and mention Valkey as an optional cache dependency in AGENTS.md and performance guidelines.
  • Extend scripts/db/health to run an optional Valkey health check (_check_cache) alongside DB health, returning skipped/healthy/unhealthy and failing the command when cache is unhealthy but configured.
  • Rewrite/expand caching best-practices docs to reflect the new cache package, backends, managers, multi-guild keying, async APIs, and Valkey-specific behavior, including updated examples for invalidation and tests.
  • Declare valkey Python dependency in pyproject.toml and remove old tux.shared.cache export/implementation in favor of the new cache package, keeping `__all__` empty in tux.shared.__init__.
  • Update performance tests for GuildConfigCacheManager and JailStatusCache to use the async clear_all/set/get APIs within asyncio.run wrappers.
src/tux/shared/config/settings.py
scripts/config/generate.py
docs/content/reference/env.md
docs/content/developer/best-practices/caching.md
compose.yaml
AGENTS.md
scripts/db/health.py
pyproject.toml
src/tux/shared/__init__.py
src/tux/shared/cache.py
tests/performance/test_cache_performance.py
Add targeted unit tests for the new cache layer (backends, service, shared managers, and health integration) to validate behavior and edge cases.
  • Add tests for InMemoryBackend and ValkeyBackend (key prefixing, JSON serialization, TTL behavior, delete/exists) plus get_cache_backend selection and fallback reuse.
  • Add tests for CacheService connect/ping/close behavior using mocks, including handling of missing URL, connection errors, and client ping failures.
  • Add tests for GuildConfigCacheManager and JailStatusCache when a backend is set, verifying get/set/invalidate and get_or_fetch semantics.
  • Add tests for the new Valkey health check helper _check_cache to ensure correct status/messaging for skipped, healthy, and unhealthy scenarios, including connection and ping failures.
tests/cache/__init__.py
tests/cache/test_backend.py
tests/cache/test_service.py
tests/cache/test_shared_cache_backend.py
tests/cache/test_db_health_cache.py

Assessment against linked issues

  • #1164: Introduce a Valkey-based cache layer (backends, service, configuration) and integrate it with bot startup and health checks, while retaining an in-memory fallback.
  • #1164: Refactor existing caching consumers (permission controllers/system, guild config, jail status, prefixes, moderation communication) to use the new async cache backend/manager APIs instead of tux.shared.cache, preserving correct invalidation and TTL behavior.
  • #1164: Update infrastructure, configuration, documentation, and tests to support and describe the new Valkey caching behavior (docker-compose service, env/config docs, caching docs, AGENTS overview, and cache-related tests).

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@github-actions

This comment was marked as off-topic.

@coderabbitai

coderabbitai bot commented Jan 28, 2026

Warning

Rate limit exceeded

@kzndotsh has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 4 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between e70655f and 3f76704.

📒 Files selected for processing (3)
  • docs/content/developer/best-practices/caching.md
  • src/tux/modules/moderation/jail.py
  • src/tux/services/moderation/moderation_coordinator.py
📝 Walkthrough


Adds an async-first cache subsystem with in-memory TTL and optional Valkey backend, CacheService lifecycle, async cache managers wired into startup, DB and permission layers, Valkey env/config/docs/compose entries, health-check integration, comprehensive tests, and removal of the legacy shared TTL cache.

Changes

  • Cache package (src/tux/cache/__init__.py, src/tux/cache/backend.py, src/tux/cache/service.py, src/tux/cache/ttl.py, src/tux/cache/managers.py): Introduce async cache layer: AsyncCacheBackend protocol, TTLCache, InMemoryBackend, ValkeyBackend, CacheService (connect/ping/close), and async managers (GuildConfigCacheManager, JailStatusCache) with per-key locking and backend integration.
  • Startup & wiring (src/tux/core/setup/cache_setup.py, src/tux/core/setup/orchestrator.py, src/tux/core/bot.py): Add CacheSetupService to the orchestrator; create/assign bot.cache_service when Valkey is configured; wire the backend into managers; ensure cache_service is closed on shutdown.
  • Database & permission controllers (src/tux/database/controllers/__init__.py, src/tux/database/controllers/permissions.py, src/tux/database/controllers/guild_config.py): DatabaseCoordinator accepts an optional cache_backend and propagates it; permission and guild-config controllers become backend-aware, using backend keys/TTLs, (un)wrap helpers, and async cache calls.
  • Core systems (src/tux/core/permission_system.py, src/tux/core/prefix_manager.py, src/tux/core/setup/permission_setup.py): Switch PermissionSystem and PrefixManager to backend-backed async caching via get_cache_backend(bot); convert some methods to async and pass cache_backend into the DB coordinator during setup.
  • Modules & services (src/tux/modules/moderation/..., src/tux/services/moderation/communication_service.py): Update imports to tux.cache, await cache calls, and convert CommunicationService.invalidate_guild_config_cache to async.
  • Removed legacy shared cache (src/tux/shared/cache.py, src/tux/shared/__init__.py): Delete the previous TTLCache, GuildConfigCacheManager, JailStatusCache and their exports; consumers migrated to the new src/tux/cache package.
  • Configuration & env (.env.example, scripts/config/generate.py, src/tux/shared/config/settings.py, config/config.json.example, docs/content/reference/env.md): Add VALKEY_* env vars and Config fields, a computed valkey_url, and update the generator and docs to include Valkey configuration and defaults.
  • Deployment & health (compose.yaml, scripts/db/health.py): Add tux-valkey service and tux_valkey_data volume to compose; add an optional async cache health check when VALKEY_URL is configured and reachable; integrate the check into the health flow.
  • Documentation (docs/content/developer/best-practices/caching.md, AGENTS.md, various docs/content/selfhost/*): Large docs rewrite describing the new cache architecture, backends, Valkey notes, multi-guild safety, async APIs, compose/profile guidance, and operational instructions.
  • Tests (tests/cache/*, tests/core/*, tests/database/*, tests/shared/*, ...): Add tests for backends, CacheService, managers, TTLCache, CacheSetupService, and the health check; update many tests to the async cache API and new import paths.
  • Miscellaneous (pyproject.toml, src/tux/modules/info/views.py, src/tux/modules/utility/__init__.py): Add valkey>=6.1.1 dependency; minor typing/annotation tweaks and small refactors.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

priority: high

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Title check ✅: The title 'feat: valkey' clearly and concisely summarizes the main change: addition of Valkey cache support. It directly relates to the extensive changeset implementing a cache layer with Valkey backend integration.
  • Description check ✅: The PR description is related to the changeset, providing motivation (closes #1164), listing changes, and explaining the implementation of a Valkey cache layer with integration across multiple systems.
  • Linked Issues check ✅: The PR implements comprehensive Valkey cache integration: a new cache package with TTL/backend support, Valkey-backed async caches, optional Valkey configuration, health checks, and integration into permissions, guild config, jail status, and prefix management, fully addressing the linked issue #1164.
  • Out of Scope Changes check ✅: Minor scope creep detected: a test method rename (test_basic_queries → test_insert_and_commit_persists_guilds_with_expected_ids), a type annotation removal in info views, and removal of permission system test cache clearing are tangentially related but not core to the Valkey implementation.
  • Docstring Coverage ✅: Docstring coverage is 93.44%, which is sufficient; the required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist

This comment was marked as resolved.

@github-actions
Contributor

github-actions bot commented Jan 28, 2026

📚 Documentation Preview

  • Production: https://tux.atl.dev
  • Preview: https://8ab9639d-tux-docs.allthingslinux.workers.dev (version 8ab9639d-8a1f-4f2d-8cd2-ebaf8b55f435; tux@fc6734ce4e917800233387f9d0a7f908c84eaa84 on 1179/merge by kzndotsh, run 374)


@amazon-q-developer amazon-q-developer bot left a comment


Summary

This PR adds comprehensive Valkey (Redis-compatible) caching support to Tux, providing an optional shared cache layer that persists across restarts. The implementation includes:

  • Cache Service: Connection management with retry logic and graceful fallback
  • Backend Abstraction: Protocol-based design supporting both in-memory and Valkey backends
  • Cache Managers: Singleton managers for guild config and jail status with async APIs
  • Integration: Seamless integration into existing permission system and prefix management

Critical Issues Found

Security Vulnerability: The Valkey URL construction exposes passwords in connection strings, creating a credential exposure risk through logs and error messages.

Logic Error: InMemoryBackend ignores the ttl_sec parameter, causing inconsistent behavior between backends.

Recommendations

  1. Fix the security vulnerability by avoiding password inclusion in URLs or implementing secure URL handling
  2. Implement proper TTL handling in InMemoryBackend to maintain consistency with ValkeyBackend
  3. Add Valkey password validation similar to existing PostgreSQL validation
  4. Address the incomplete guild invalidation behavior when using Valkey backend

The overall architecture is well-designed with proper fallback mechanisms and thread safety considerations. Once the critical issues are resolved, this will be a valuable addition to the caching infrastructure.


You can now have the agent implement changes and create commits directly on your pull request's source branch. Simply comment with /q followed by your request in natural language to ask the agent to make changes.


⚠️ This PR contains more than 30 files. Amazon Q is better at reviewing smaller PRs, and may miss issues in larger changesets.

Comment on lines 352 to 353
auth = f":{self.VALKEY_PASSWORD}@" if self.VALKEY_PASSWORD else ""
return f"valkey://{auth}{host}:{self.VALKEY_PORT}/{self.VALKEY_DB}"


🛑 Security Vulnerability: The Valkey URL construction exposes passwords in connection strings, which can appear in logs, error messages, and debugging output. This creates a credential exposure risk.

        auth = f":{self.VALKEY_PASSWORD}@" if self.VALKEY_PASSWORD else ""
        return f""
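
If the URL must be logged at all, one mitigation for the exposure described above is to log only a redacted form. A minimal sketch; the helper name is hypothetical and not part of the PR, and the safer option remains never embedding the password in the URL:

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url_credentials(url: str) -> str:
    """Return the URL with any password replaced by '***' for safe logging."""
    parts = urlsplit(url)
    if parts.password is None:
        return url
    netloc = parts.netloc.replace(f":{parts.password}@", ":***@", 1)
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# redact_url_credentials("valkey://:s3cret@localhost:6379/0")
# -> "valkey://:***@localhost:6379/0"
```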

Comment on lines 79 to 86
    async def set(
        self,
        key: str,
        value: Any,
        ttl_sec: float | None = None,
    ) -> None:
        """Set a value with optional TTL."""
        self._cache.set(key, value)


🛑 Logic Error: The ttl_sec parameter is accepted but completely ignored. The underlying TTLCache uses its default TTL regardless of the provided value, causing inconsistent behavior between InMemoryBackend and ValkeyBackend.

Suggested change

Before:

    async def set(
        self,
        key: str,
        value: Any,
        ttl_sec: float | None = None,
    ) -> None:
        """Set a value with optional TTL."""
        self._cache.set(key, value)

After:

    async def set(
        self,
        key: str,
        value: Any,
        ttl_sec: float | None = None,
    ) -> None:
        """Set a value with optional TTL."""
        if ttl_sec is not None:
            # Store the value with an explicit per-entry expiry so this
            # entry honors its own TTL instead of the cache default.
            import time  # better placed at module level

            expire_time = time.monotonic() + ttl_sec
            self._cache._cache[key] = (value, expire_time)
        else:
            self._cache.set(key, value)

Comment on lines +241 to +254
    VALKEY_PASSWORD: Annotated[
        str,
        Field(
            default="",
            description="Valkey password",
        ),
    ] = ""
    VALKEY_URL: Annotated[
        str,
        Field(
            default="",
            description="Valkey URL override (overrides host/port/db/password)",
        ),
    ] = ""


Consider adding password validation for Valkey similar to the existing PostgreSQL password validation in the validate_environment() function. Weak cache passwords can compromise cached sensitive data and allow unauthorized access to the cache layer.

Comment on lines 315 to 320
    async def invalidate_guild(self, guild_id: int) -> None:
        """Invalidate all jail status entries for a guild (in-memory only)."""
        self._cache.clear()
        logger.debug(
            f"Cleared all jail status cache entries (guild {guild_id} invalidation)",
        )


The invalidate_guild method only clears the in-memory cache but doesn't handle backend keys when Valkey is used. This creates inconsistent behavior where backend entries remain cached while in-memory entries are cleared.
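
A hedged sketch of guild-scoped backend invalidation, assuming the valkey Python client mirrors redis-py's async scan_iter/delete API and the jail_status:{guild_id}:{user_id} key scheme from the reviewer's guide:

```python
async def invalidate_guild_in_backend(client, guild_id: int) -> None:
    """Delete all jail-status keys for one guild from Valkey.

    `client` is assumed to be an async valkey client; if the backend
    prefixes keys, include that prefix in the pattern as well.
    """
    pattern = f"jail_status:{guild_id}:*"
    async for key in client.scan_iter(match=pattern):
        await client.delete(key)
```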

@sentry

sentry bot commented Jan 28, 2026

❌ 5 Tests Failed:

Tests completed: 664 · Failed: 5 · Passed: 659 · Skipped: 36
View the top 2 failed test(s) by shortest run time
tests/modules/test_jail_system.py::TestRejailOnRejoin::test_on_member_join_does_not_rejail_when_latest_case_is_unjail
Stack Traces | 0.169s run time
tests/modules/test_jail_system.py:466: in jail_ready_coord
    await coord.guild_config.update_config(
.../database/controllers/guild_config.py:88: in update_config
    await GuildConfigCacheManager().invalidate(guild_id)
.../tux/cache/managers.py:211: in invalidate
    await self._backend.delete(key)
E   TypeError: object MagicMock can't be used in 'await' expression
tests/modules/test_jail_system.py::TestRejailOnRejoin::test_on_member_join_reapplies_jail_role_when_user_was_jailed
Stack Traces | 0.176s run time
tests/modules/test_jail_system.py:466: in jail_ready_coord
    await coord.guild_config.update_config(
.../database/controllers/guild_config.py:88: in update_config
    await GuildConfigCacheManager().invalidate(guild_id)
.../tux/cache/managers.py:211: in invalidate
    await self._backend.delete(key)
E   TypeError: object MagicMock can't be used in 'await' expression
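
Both failures above stem from awaiting methods on a plain MagicMock. A common remedy is to install an AsyncMock backend so awaited calls return awaitables; a sketch, with the fixture wiring and import path assumed:

```python
from unittest.mock import AsyncMock

from tux.cache import GuildConfigCacheManager  # import path assumed

mock_backend = AsyncMock()
mock_backend.get.return_value = None  # behave like a cache miss
GuildConfigCacheManager().set_backend(mock_backend)  # delete/set are now awaitable
```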
View the full list of 3 ❄️ flaky test(s)
tests/modules/test_moderation_service_integration.py::TestModerationCoordinatorIntegration::test_complete_workflow_with_mod_logging_success

Flake rate in main: 68.18% (Passed 7 times, Failed 15 times)

Stack Traces | 0.176s run time
tests/modules/test_moderation_service_integration.py:569: in test_complete_workflow_with_mod_logging_success
    assert case.mod_log_message_id == mod_message.id
E   AssertionError: assert None == 999888777
E    +  where None = <Case id=1 guild=123456789 num=1 type=CaseType.BAN user=555666777>.mod_log_message_id
E    +  and   999888777 = <MagicMock id='140643236788752'>.id
tests/modules/test_moderation_service_integration.py::TestModerationCoordinatorIntegration::test_mod_log_case_update_failure

Flake rate in main: 68.18% (Passed 7 times, Failed 15 times)

Stack Traces | 0.175s run time
.../hostedtoolcache/Python/3.13.11.../x64/lib/python3.13/unittest/mock.py:958: in assert_called_once
    raise AssertionError(msg)
E   AssertionError: Expected 'send' to have been called once. Called 0 times.

During handling of the above exception, another exception occurred:
tests/modules/test_moderation_service_integration.py:852: in test_mod_log_case_update_failure
    mod_channel.send.assert_called_once()
E   AssertionError: Expected 'send' to have been called once. Called 0 times.
tests/modules/test_moderation_service_integration.py::TestModerationCoordinatorIntegration::test_mod_log_send_failure_permissions

Flake rate in main: 68.18% (Passed 7 times, Failed 15 times)

Stack Traces | 0.175s run time
.../hostedtoolcache/Python/3.13.11.../x64/lib/python3.13/unittest/mock.py:958: in assert_called_once
    raise AssertionError(msg)
E   AssertionError: Expected 'send' to have been called once. Called 0 times.

During handling of the above exception, another exception occurred:
tests/modules/test_moderation_service_integration.py:782: in test_mod_log_send_failure_permissions
    mod_channel.send.assert_called_once()
E   AssertionError: Expected 'send' to have been called once. Called 0 times.

To view more test analytics, go to the [Prevent Tests Dashboard](https://All Things Linux.sentry.io/prevent/tests/?preventPeriod=30d&integratedOrgName=allthingslinux&repository=tux&branch=1164-valkey)

Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey - I've found 5 issues, and left some high level feedback:

  • Several previously synchronous cache APIs are now async (e.g. PrefixManager.invalidate_cache, CommunicationService.invalidate_guild_config_cache, GuildConfigCacheManager.invalidate/set), so it’s worth double‑checking all call sites to ensure they now await these methods to avoid un-awaited coroutine warnings and logic silently not running.
  • In JailStatusCache.invalidate_guild the current implementation clears the entire in-memory cache and ignores the guild_id argument (and does not touch Valkey keys), which may be surprising to callers; consider either scoping invalidation to the given guild or renaming/documenting this as a global clear to avoid misuse.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- Several previously synchronous cache APIs are now async (e.g. `PrefixManager.invalidate_cache`, `CommunicationService.invalidate_guild_config_cache`, `GuildConfigCacheManager.invalidate`/`set`), so it’s worth double‑checking all call sites to ensure they now `await` these methods to avoid un-awaited coroutine warnings and logic silently not running.
- In `JailStatusCache.invalidate_guild` the current implementation clears the entire in-memory cache and ignores the `guild_id` argument (and does not touch Valkey keys), which may be surprising to callers; consider either scoping invalidation to the given guild or renaming/documenting this as a global clear to avoid misuse.

## Individual Comments

### Comment 1
<location> `src/tux/cache/backend.py:32-41` </location>
<code_context>
+        """Get a value by key."""
+        return self._cache.get(key)
+
+    async def set(
+        self,
+        key: str,
+        value: Any,
+        ttl_sec: float | None = None,
+    ) -> None:
+        """Set a value with optional TTL."""
+        self._cache.set(key, value)
+
+    async def delete(self, key: str) -> None:
</code_context>

<issue_to_address>
**issue (bug_risk):** InMemoryBackend ignores per-call ttl_sec, which diverges from the Valkey backend semantics.

`InMemoryBackend.set` ignores `ttl_sec` and always relies on the `TTLCache` default TTL. Callers that pass different TTLs (e.g. `PERM_RANKS_TTL`, `PERM_USER_RANK_TTL`) will see those respected only with Valkey, not with the in-memory backend, causing divergent caching behavior.

To align behavior, either:
- Use separate `InMemoryBackend` instances per logical cache with appropriate `default_ttl`, or
- Extend `TTLCache` to support per-entry TTL and apply `ttl_sec`.

If fixed in-memory TTL is intentional, consider updating the docstring and removing `ttl_sec` from this backend’s interface to avoid misleading callers.
</issue_to_address>

### Comment 2
<location> `src/tux/core/prefix_manager.py:267-270` </location>
<code_context>
                 self._cache_loaded = True

-    def invalidate_cache(self, guild_id: int | None = None) -> None:
+    async def invalidate_cache(self, guild_id: int | None = None) -> None:
         """
         Invalidate prefix cache for a specific guild or all guilds.
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Changing invalidate_cache to async and only deleting from backend for a single guild may leave inconsistencies when clearing all guilds.

In the `guild_id is None` branch, only the in-memory cache is cleared; any existing Valkey entries remain and will be treated as fresh on subsequent `get_prefix` calls, repopulating `_prefix_cache` from the backend. If this path is meant to perform a full reset, consider also clearing/invalidation on the backend, or explicitly document that this performs an in-memory–only reset.

Also, since `invalidate_cache` is now `async`, verify that all existing call sites `await` it so that invalidation still actually runs.

Suggested implementation:

```python
    async def invalidate_cache(self, guild_id: int | None = None) -> None:
        """
        Invalidate prefix cache for a specific guild or all guilds.

        When ``guild_id`` is ``None``, this performs an in-memory–only reset:
        the in-memory prefix cache is cleared, but any entries in the backing
        store (e.g. Valkey) are left untouched and will be treated as fresh on
        subsequent :meth:`get_prefix` calls.

        When a specific ``guild_id`` is provided, both in-memory and backend
        state for that guild should be invalidated.

        Examples
        --------
        >>> await manager.invalidate_cache(123456789)  # Specific guild
        >>> await manager.invalidate_cache()           # All guilds (in-memory only)
        """
        # In-memory only reset: clear all cached prefixes and mark cache as unloaded
        if guild_id is None:
            self._prefix_cache.clear()
            self._cache_loaded = False
            return

        # Per-guild reset: remove from in-memory cache and backend
        self._prefix_cache.pop(guild_id, None)
        self._cache_loaded = False

        # Backend invalidation for a single guild should mirror how prefixes
        # are stored in the backing store (e.g. Valkey). This call is async,
        # so invalidate_cache must be awaited by callers.
        if hasattr(self, "_valkey") and self._valkey is not None:
            # Adjust key construction to match the rest of the class.
            key = f"prefix:{guild_id}"
            await self._valkey.delete(key)

```

1. If this class already has a helper for building prefix keys (e.g. `_prefix_key(guild_id)` or similar), replace the inline `key = f"prefix:{guild_id}"` with that helper to keep key construction consistent.
2. If the backing store attribute is named something other than `_valkey` (for example `self._backend`, `self._cache`, etc.), update the `hasattr(self, "_valkey")` block to use the correct attribute and delete API.
3. Search for all call sites of `invalidate_cache(` across the codebase (e.g. `git grep "invalidate_cache("`) and ensure each call is now awaited, for example:
   - Replace `manager.invalidate_cache(guild_id)` with `await manager.invalidate_cache(guild_id)` inside async functions.
   - Propagate `async`/`await` up the call chain where necessary so that invalidation is actually executed.
4. If you decide that clearing *all* guild prefixes should also clear the backend (not just in-memory), extend the `guild_id is None` branch to iterate over and delete all prefix keys in the backing store using your existing key pattern and backend API (e.g. a scan-and-delete on `prefix:*`).
</issue_to_address>

### Comment 3
<location> `tests/cache/__init__.py:1` </location>
<code_context>
+"""Tests for the cache layer (backends, service, shared cache with backend)."""
</code_context>

<issue_to_address>
**suggestion (testing):** TTLCache itself is not covered by tests, so expiry and eviction semantics are only indirectly exercised.

Given `TTLCache` is now shared by `InMemoryBackend` and the cache managers, please add focused unit tests for it (e.g., `tests/cache/test_ttl_cache.py`). In particular, cover: (1) expiration after the configured TTL, (2) `max_size` eviction order, (3) `get_or_fetch` correctness (no stale values, writes occur as expected), and (4) `clear`/`invalidate` behavior, so these semantics aren’t only tested indirectly via other components.
</issue_to_address>

### Comment 4
<location> `src/tux/database/controllers/permissions.py:85` </location>
<code_context>
+            and jail_channel_id, or None if not cached.
+        """
+        key = self._cache_key(guild_id)
+        if self._backend is not None:
+            value = await self._backend.get(key)
+            return (
</code_context>

<issue_to_address>
**issue (complexity):** Consider introducing a shared cache adapter that encapsulates backend vs local-cache behavior so the permission controllers focus on domain logic instead of repeated branching and serialization code.

Consider extracting the backend-vs-local branching into a small internal cache adapter so controllers stay focused on rank/assignment logic. Most methods only differ in key prefixing, serialization, and invalidate/get/set calls. A helper like the following would remove the repeated `if self._backend` blocks and keep functionality intact:

```python
class PermissionCacheAdapter:
    def __init__(self, backend, local_cache, key_prefix: str, ttl: float):
        self._backend = backend
        self._local = local_cache
        self._prefix = key_prefix
        self._ttl = ttl

    async def get_models(self, key: str, model_type: type[BaseModel]) -> list[BaseModel] | None:
        full_key = f"{self._prefix}{key}"
        if self._backend:
            raw = await self._backend.get(full_key)
            return [model_type.model_validate(d) for d in raw] if raw else None
        return self._local.get(key)

    async def set_models(self, key: str, models: list[BaseModel]) -> None:
        full_key = f"{self._prefix}{key}"
        payload = [m.model_dump() for m in models]
        if self._backend:
            await self._backend.set(full_key, payload, ttl_sec=self._ttl)
        else:
            self._local.set(key, models)

    async def invalidate(self, key: str) -> None:
        full_key = f"{self._prefix}{key}"
        if self._backend:
            await self._backend.delete(full_key)
        else:
            self._local.invalidate(key)
```

Then controllers call `await self._rank_cache.invalidate(f"permission_ranks:{guild_id}")` or `await self._assignments_cache.get_models(...)` instead of repeating the branching and serialization each time. This keeps command/rank logic readable while preserving the new backend functionality.
</issue_to_address>

### Comment 5
<location> `src/tux/cache/managers.py:90` </location>
<code_context>
+        """Get a value by key. Return None if missing or expired."""
+        ...
+
+    async def set(
+        self,
+        key: str,
</code_context>

<issue_to_address>
**issue (complexity):** Consider refactoring the two `set`/`async_set` methods into a single async `set` with optional locking so the merge and backend logic is defined only once.

Unify the duplicate `set` / `async_set` implementations so the merge logic lives in one async method with optional locking; this trims API surface and removes partial divergences (`_MISSING`, lock handling) without changing behavior. For example:

```python
async def set(
    self,
    guild_id: int,
    audit_log_id: int | None = _MISSING,
    mod_log_id: int | None = _MISSING,
    jail_role_id: int | None = _MISSING,
    jail_channel_id: int | None = _MISSING,
    *, use_lock: bool = False,
) -> None:
    lock = self._get_lock(guild_id) if use_lock else asyncnullcontext()
    async with lock:
        key = self._cache_key(guild_id)
        existing = (await self._backend.get(key) if self._backend else self._cache.get(key)) or {}
        updated = {**existing}
        if audit_log_id is not _MISSING:
            updated["audit_log_id"] = audit_log_id
        # …repeat for other fields…
        if self._backend:
            await self._backend.set(key, updated, ttl_sec=GUILD_CONFIG_TTL_SEC)
        else:
            self._cache.set(key, updated)
```

`asyncnullcontext` can be a tiny helper returning an async context manager that does nothing. Callers who need concurrency safety pass `use_lock=True`; others keep the simpler path, and there’s no repeated merge logic.
</issue_to_address>
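
On Python 3.10+ (CI runs 3.13 per the Sentry traceback), contextlib.nullcontext already supports `async with`, so the `asyncnullcontext` helper mentioned above can simply be the standard library one; a sketch:

```python
import asyncio
import contextlib


async def write_once(lock: asyncio.Lock, use_lock: bool) -> None:
    # nullcontext() implements the async context manager protocol
    # since Python 3.10, so no custom no-op helper is needed.
    ctx = lock if use_lock else contextlib.nullcontext()
    async with ctx:
        ...  # the single merge + backend write lives here
```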

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment on lines 32 to 41
    async def set(
        self,
        key: str,
        value: Any,
        ttl_sec: float | None = None,
    ) -> None:
        """Set a value with optional TTL."""
        ...

    async def delete(self, key: str) -> None:
Contributor


issue (bug_risk): InMemoryBackend ignores per-call ttl_sec, which diverges from the Valkey backend semantics.

InMemoryBackend.set ignores ttl_sec and always relies on the TTLCache default TTL. Callers that pass different TTLs (e.g. PERM_RANKS_TTL, PERM_USER_RANK_TTL) will see those respected only with Valkey, not with the in-memory backend, causing divergent caching behavior.

To align behavior, either:

  • Use separate InMemoryBackend instances per logical cache with appropriate default_ttl, or
  • Extend TTLCache to support per-entry TTL and apply ttl_sec.

If fixed in-memory TTL is intentional, consider updating the docstring and removing ttl_sec from this backend’s interface to avoid misleading callers.
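
One way to implement the second option (per-entry TTL in TTLCache) is sketched below. The internals are assumptions: the class diagram only shows a `cache` dict, and the `(value, expiry)` tuple layout and insertion-order eviction are illustrative, not the exact source:

```python
import time
from typing import Any


class TTLCache:
    """Minimal TTL cache supporting a per-entry TTL override."""

    def __init__(self, ttl: float, max_size: int | None = None) -> None:
        self._ttl = ttl
        self._max_size = max_size
        self.cache: dict[Any, tuple[Any, float]] = {}

    def set(self, key: Any, value: Any, ttl_sec: float | None = None) -> None:
        # Per-entry TTL falls back to the cache-wide default.
        expires_at = time.monotonic() + (ttl_sec if ttl_sec is not None else self._ttl)
        if self._max_size is not None and len(self.cache) >= self._max_size:
            self.cache.pop(next(iter(self.cache)), None)  # evict oldest insert
        self.cache[key] = (value, expires_at)

    def get(self, key: Any) -> Any | None:
        entry = self.cache.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.cache[key]
            return None
        return value
```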

Comment on lines +267 to 270
    async def invalidate_cache(self, guild_id: int | None = None) -> None:
        """
        Invalidate prefix cache for a specific guild or all guilds.

Contributor


suggestion (bug_risk): Changing invalidate_cache to async and only deleting from backend for a single guild may leave inconsistencies when clearing all guilds.

In the guild_id is None branch, only the in-memory cache is cleared; any existing Valkey entries remain and will be treated as fresh on subsequent get_prefix calls, repopulating _prefix_cache from the backend. If this path is meant to perform a full reset, consider also clearing/invalidation on the backend, or explicitly document that this performs an in-memory–only reset.

Also, since invalidate_cache is now async, verify that all existing call sites await it so that invalidation still actually runs.

Suggested implementation:

    async def invalidate_cache(self, guild_id: int | None = None) -> None:
        """
        Invalidate prefix cache for a specific guild or all guilds.

        When ``guild_id`` is ``None``, this performs an in-memory–only reset:
        the in-memory prefix cache is cleared, but any entries in the backing
        store (e.g. Valkey) are left untouched and will be treated as fresh on
        subsequent :meth:`get_prefix` calls.

        When a specific ``guild_id`` is provided, both in-memory and backend
        state for that guild should be invalidated.

        Examples
        --------
        >>> await manager.invalidate_cache(123456789)  # Specific guild
        >>> await manager.invalidate_cache()           # All guilds (in-memory only)
        """
        # In-memory only reset: clear all cached prefixes and mark cache as unloaded
        if guild_id is None:
            self._prefix_cache.clear()
            self._cache_loaded = False
            return

        # Per-guild reset: remove from in-memory cache and backend
        self._prefix_cache.pop(guild_id, None)
        self._cache_loaded = False

        # Backend invalidation for a single guild should mirror how prefixes
        # are stored in the backing store (e.g. Valkey). This call is async,
        # so invalidate_cache must be awaited by callers.
        if hasattr(self, "_valkey") and self._valkey is not None:
            # Adjust key construction to match the rest of the class.
            key = f"prefix:{guild_id}"
            await self._valkey.delete(key)
  1. If this class already has a helper for building prefix keys (e.g. _prefix_key(guild_id) or similar), replace the inline key = f"prefix:{guild_id}" with that helper to keep key construction consistent.
  2. If the backing store attribute is named something other than _valkey (for example self._backend, self._cache, etc.), update the hasattr(self, "_valkey") block to use the correct attribute and delete API.
  3. Search for all call sites of invalidate_cache( across the codebase (e.g. git grep "invalidate_cache(") and ensure each call is now awaited, for example:
    • Replace manager.invalidate_cache(guild_id) with await manager.invalidate_cache(guild_id) inside async functions.
    • Propagate async/await up the call chain where necessary so that invalidation is actually executed.
  4. If you decide that clearing all guild prefixes should also clear the backend (not just in-memory), extend the guild_id is None branch to iterate over and delete all prefix keys in the backing store using your existing key pattern and backend API (e.g. a scan-and-delete on prefix:*).

@@ -0,0 +1 @@
"""Tests for the cache layer (backends, service, shared cache with backend)."""

suggestion (testing): TTLCache itself is not covered by tests, so expiry and eviction semantics are only indirectly exercised.

Given TTLCache is now shared by InMemoryBackend and the cache managers, please add focused unit tests for it (e.g., tests/cache/test_ttl_cache.py). In particular, cover: (1) expiration after the configured TTL, (2) max_size eviction order, (3) get_or_fetch correctness (no stale values, writes occur as expected), and (4) clear/invalidate behavior, so these semantics aren’t only tested indirectly via other components.
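
As a starting point, the expiry case might be covered like the sketch below. This assumes TTLCache is importable from tux.cache.ttl, that the constructor takes a default_ttl keyword, and that the cache reads time.monotonic() internally; a real test would target whichever clock the implementation actually uses:

```python
import time

import pytest

from tux.cache.ttl import TTLCache  # assumed import path


def test_ttl_cache_expires_after_ttl(monkeypatch: pytest.MonkeyPatch) -> None:
    cache = TTLCache(default_ttl=10.0)  # constructor kwarg is an assumption
    cache.set("k", "v")
    assert cache.get("k") == "v"

    # Advance the clock past the TTL instead of sleeping in the test.
    real_monotonic = time.monotonic
    monkeypatch.setattr(time, "monotonic", lambda: real_monotonic() + 11.0)
    assert cache.get("k") is None
```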

self._guild_ranks_cache.invalidate(f"permission_ranks:{guild_id}")
if result.id:
self._ranks_cache.invalidate(f"permission_rank:{result.id}")
if self._backend is not None:

issue (complexity): Consider introducing a shared cache adapter that encapsulates backend vs local-cache behavior so the permission controllers focus on domain logic instead of repeated branching and serialization code.

Consider extracting the backend-vs-local branching into a small internal cache adapter so controllers stay focused on rank/assignment logic. Most methods only differ in key prefixing, serialization, and invalidate/get/set calls. A helper like the following would remove the repeated if self._backend blocks and keep functionality intact:

from pydantic import BaseModel

class PermissionCacheAdapter:
    def __init__(self, backend, local_cache, key_prefix: str, ttl: float):
        self._backend = backend
        self._local = local_cache
        self._prefix = key_prefix
        self._ttl = ttl

    async def get_models(self, key: str, model_type: type[BaseModel]) -> list[BaseModel] | None:
        full_key = f"{self._prefix}{key}"
        if self._backend:
            raw = await self._backend.get(full_key)
            # Backend stores plain dicts; rehydrate them into models on the way out.
            return [model_type.model_validate(d) for d in raw] if raw else None
        return self._local.get(key)

    async def set_models(self, key: str, models: list[BaseModel]) -> None:
        full_key = f"{self._prefix}{key}"
        payload = [m.model_dump() for m in models]
        if self._backend:
            await self._backend.set(full_key, payload, ttl_sec=self._ttl)
        else:
            self._local.set(key, models)

    async def invalidate(self, key: str) -> None:
        full_key = f"{self._prefix}{key}"
        if self._backend:
            await self._backend.delete(full_key)
        else:
            self._local.invalidate(key)

Then controllers call await self._rank_cache.invalidate(f"permission_ranks:{guild_id}") or await self._assignments_cache.get_models(...) instead of repeating the branching and serialization each time. This keeps command/rank logic readable while preserving the new backend functionality.

)
return self._cache.get(key)

async def set(

issue (complexity): Consider refactoring the two set/async_set methods into a single async set with optional locking so the merge and backend logic is defined only once.

Unify the duplicate set / async_set implementations so the merge logic lives in one async method with optional locking; this trims API surface and removes partial divergences (_MISSING, lock handling) without changing behavior. For example:

async def set(
    self,
    guild_id: int,
    audit_log_id: int | None = _MISSING,
    mod_log_id: int | None = _MISSING,
    jail_role_id: int | None = _MISSING,
    jail_channel_id: int | None = _MISSING,
    *, use_lock: bool = False,
) -> None:
    lock = self._get_lock(guild_id) if use_lock else asyncnullcontext()
    async with lock:
        key = self._cache_key(guild_id)
        existing = (await self._backend.get(key) if self._backend else self._cache.get(key)) or {}
        updated = {**existing}
        if audit_log_id is not _MISSING:
            updated["audit_log_id"] = audit_log_id
        # …repeat for other fields…
        if self._backend:
            await self._backend.set(key, updated, ttl_sec=GUILD_CONFIG_TTL_SEC)
        else:
            self._cache.set(key, updated)

asyncnullcontext can be a tiny helper returning an async context manager that does nothing. Callers who need concurrency safety pass use_lock=True; others keep the simpler path, and there’s no repeated merge logic.
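
Such a helper is only a few lines. A sketch; note that on Python 3.10+, contextlib.nullcontext already supports `async with`, so it can stand in directly:

```python
from contextlib import asynccontextmanager


@asynccontextmanager
async def asyncnullcontext():
    """No-op async context manager; a stand-in where no lock is needed."""
    yield
```

Either way, the lock-free path pays no locking cost while both paths share the single merge implementation.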

@kzndotsh kzndotsh self-assigned this Jan 28, 2026
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This is a substantial and well-executed feature. The introduction of a unified cache layer with an optional Valkey backend is a great improvement for scalability and performance. The new tux.cache package is well-structured with clear separation of concerns between the service, backends, and managers. The refactoring of existing code to use the new async cache managers is consistent and thorough. The documentation updates are excellent and will be very helpful for developers. I've found a couple of areas for improvement, one of which is a high-severity bug in the cache invalidation logic. Overall, great work on this complex feature.

Comment on lines 315 to 320
async def invalidate_guild(self, guild_id: int) -> None:
"""Invalidate all jail status entries for a guild (in-memory only)."""
self._cache.clear()
logger.debug(
f"Cleared all jail status cache entries (guild {guild_id} invalidation)",
)


high

The invalidate_guild method is intended to clear cache entries for a specific guild, but its implementation self._cache.clear() clears the entire in-memory cache for all guilds. This can lead to performance degradation and potential stale data issues if other guilds' caches are cleared unintentionally. The implementation should be updated to only remove entries for the specified guild_id. Note that the suggested fix accesses a private attribute of TTLCache; a more robust solution might involve adding a prefix-based invalidation method to TTLCache itself.

Suggested change
async def invalidate_guild(self, guild_id: int) -> None:
"""Invalidate all jail status entries for a guild (in-memory only)."""
self._cache.clear()
logger.debug(
f"Cleared all jail status cache entries (guild {guild_id} invalidation)",
)
async def invalidate_guild(self, guild_id: int) -> None:
"""Invalidate all jail status entries for a guild (in-memory only)."""
prefix = f"jail_status:{guild_id}:"
keys_to_remove = [key for key in self._cache._cache if str(key).startswith(prefix)]
for key in keys_to_remove:
self._cache.invalidate(key)
logger.debug(
f"Invalidated {len(keys_to_remove)} jail status cache entries for guild {guild_id}",
)

Comment on lines 79 to 86
async def set(
self,
key: str,
value: Any,
ttl_sec: float | None = None,
) -> None:
"""Set a value with optional TTL."""
self._cache.set(key, value)


medium

The InMemoryBackend.set method signature includes ttl_sec, but the implementation ignores it, always using the default TTL set during initialization. This differs from ValkeyBackend, which respects ttl_sec. This inconsistency can lead to subtle bugs when switching between backends, as cache expiration behavior will change unexpectedly. To improve consistency, consider modifying TTLCache to support per-key TTLs, and then use ttl_sec here if provided. Alternatively, log a warning if ttl_sec is provided but ignored.

Comment on lines 42 to 46
if await cache_service.ping():
logger.debug("Valkey health check: healthy")
return {"status": "healthy", "message": "Valkey ping OK"}
logger.debug("Valkey health check: ping failed")
return {"status": "unhealthy", "message": "Valkey ping failed"} # noqa: TRY300


medium

The if/else structure can be made more explicit, which also allows removing the noqa: TRY300 suppression. Using an else block clarifies the logic for when the ping fails and improves readability by adhering to the style the linter suggests.

Suggested change
if await cache_service.ping():
logger.debug("Valkey health check: healthy")
return {"status": "healthy", "message": "Valkey ping OK"}
logger.debug("Valkey health check: ping failed")
return {"status": "unhealthy", "message": "Valkey ping failed"} # noqa: TRY300
if await cache_service.ping():
logger.debug("Valkey health check: healthy")
return {"status": "healthy", "message": "Valkey ping OK"}
else:
logger.debug("Valkey health check: ping failed")
return {"status": "unhealthy", "message": "Valkey ping failed"}

@cubic-dev-ai cubic-dev-ai bot left a comment

10 issues found across 40 files

Confidence score: 2/5

  • invalidate_guild in src/tux/cache/managers.py clears the entire cache instead of only the target guild, which is a concrete data-loss/behavior regression risk.
  • src/tux/cache/backend.py ignores ttl_sec, creating inconsistent TTL behavior between backends and potential stale data issues.
  • Multiple medium-to-high severity issues affect cache correctness and behavior, so merge risk is elevated even though some are straightforward to fix.
  • Pay close attention to src/tux/cache/managers.py, src/tux/cache/backend.py, src/tux/cache/ttl.py - cache invalidation and TTL semantics are inconsistent.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/tux/core/bot.py">

<violation number="1" location="src/tux/core/bot.py:473">
P3: Missing `span.set_data("connections.cache_error_type", type(e).__name__)` to be consistent with the database and HTTP error handlers in this method.</violation>
</file>

<file name="src/tux/shared/config/settings.py">

<violation number="1" location="src/tux/shared/config/settings.py:352">
P2: Password should be URL-encoded to handle special characters. If `VALKEY_PASSWORD` contains characters like `@`, `:`, or `/`, the URL will be malformed and connection will fail.</violation>
</file>
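
For reference, percent-encoding the password before interpolating it avoids the malformed-URL case. A minimal sketch with hypothetical variable names, not the project's actual settings attributes:

```python
from urllib.parse import quote


def build_valkey_url(host: str, port: int, db: int, password: str | None) -> str:
    """Build a valkey:// URL, percent-encoding the password if present."""
    if password:
        # safe="" also encodes '@', ':' and '/', which would otherwise
        # break the authority section of the URL.
        return f"valkey://:{quote(password, safe='')}@{host}:{port}/{db}"
    return f"valkey://{host}:{port}/{db}"
```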

<file name="src/tux/cache/service.py">

<violation number="1" location="src/tux/cache/service.py:86">
P2: Misleading log: `Valkey.from_url()` doesn't establish a connection - it creates a client with a lazy connection pool. The actual TCP connection happens on first command. Consider verifying connectivity with `await self._client.ping()` before logging success, or change the message to "Valkey client initialized".</violation>
</file>

<file name="src/tux/cache/backend.py">

<violation number="1" location="src/tux/cache/backend.py:86">
P1: The `ttl_sec` parameter is accepted but silently ignored. The underlying `TTLCache.set()` doesn't support per-key TTL, causing inconsistent behavior between `ValkeyBackend` (which honors TTL) and `InMemoryBackend` (which doesn't).

Consider either:
1. Modifying `TTLCache` to support per-key TTL, or
2. Creating a new cache entry with custom expiration time, or
3. Documenting that per-key TTL is not supported for in-memory backend</violation>
</file>

<file name="src/tux/core/permission_system.py">

<violation number="1" location="src/tux/core/permission_system.py:529">
P2: Sequential `await` in loop defeats batch efficiency. Use `asyncio.gather()` to fetch all cache entries concurrently, especially important with network caches like Valkey.</violation>
</file>
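
A concurrent variant can be sketched as below. The fallback key format comes from elsewhere in this thread; the function name and backend protocol are placeholders:

```python
import asyncio
from typing import Any


async def fetch_cached_permissions(
    backend: Any,
    guild_id: int,
    command_names: list[str],
) -> dict[str, Any]:
    """Fetch all cache entries in one concurrent batch instead of a per-key loop."""
    keys = [
        f"perm:command_permission_fallback:{guild_id}:{name}"
        for name in command_names
    ]
    # Still one backend round trip per key, but all requests are in flight at once.
    raws = await asyncio.gather(*(backend.get(key) for key in keys))
    return dict(zip(command_names, raws, strict=True))
```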

<file name="src/tux/cache/managers.py">

<violation number="1" location="src/tux/cache/managers.py:187">
P1: Bug: `invalidate_guild` clears ALL cache entries instead of only entries for the specified guild. The method should filter entries by guild_id prefix, not call `self._cache.clear()`. This will cause data loss for all cached jail statuses when trying to invalidate a single guild.</violation>

<violation number="2" location="src/tux/cache/managers.py:296">
P2: Misleading method behavior: `async_set` has 'set if not exists' semantics but the name and docstring suggest it should update the value. If an entry already exists, the method silently returns without updating, which could cause stale jail status data. Either rename to `set_if_not_exists` or remove the early-return logic to match the expected 'set' behavior.</violation>
</file>

<file name="src/tux/cache/ttl.py">

<violation number="1" location="src/tux/cache/ttl.py:94">
P2: Unnecessary eviction when updating existing key at max capacity. The eviction check should skip if the key already exists in the cache, since updating won't increase the size.</violation>

<violation number="2" location="src/tux/cache/ttl.py:143">
P2: `get_or_fetch` cannot cache `None` values. Since `get()` returns `None` for cache misses, there's no way to distinguish between "key not found" and "cached value is `None`". If `fetch_fn()` returns `None`, subsequent calls will re-fetch instead of using the cached value. Consider using a sentinel object or returning a wrapper/tuple from `get()`.</violation>
</file>
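
The sentinel approach from the second violation can be illustrated as follows. _CACHED_NONE mirrors the name later commits in this thread adopt; the surrounding class is a simplified stand-in for TTLCache, with TTL and eviction omitted:

```python
from collections.abc import Callable
from typing import Any

_CACHED_NONE = object()  # private marker: "None was cached on purpose"


class SentinelCache:
    def __init__(self) -> None:
        self._cache: dict[str, Any] = {}

    def set(self, key: str, value: Any) -> None:
        self._cache[key] = _CACHED_NONE if value is None else value

    def get_or_fetch(self, key: str, fetch_fn: Callable[[], Any]) -> Any:
        raw = self._cache.get(key)
        if raw is _CACHED_NONE:
            return None  # cached None: do not refetch
        if raw is not None:
            return raw  # ordinary cache hit
        value = fetch_fn()  # true miss: fetch and store (None becomes the marker)
        self.set(key, value)
        return value
```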

<file name="compose.yaml">

<violation number="1" location="compose.yaml:55">
P2: `tux-valkey` is added without a profile, so it will start by default and contradicts the file’s documented behavior that only `tux-postgres` starts without profiles. If Valkey is meant to be optional, add a dedicated profile (e.g., `valkey`) so it only runs when explicitly requested.</violation>
</file>


- Modified the payload serialization in the `async_set` method of the `ValkeyBackend` class to handle non-string values more robustly by using `default=str` in `json.dumps()`. This change ensures that all types of values can be serialized correctly, improving the reliability of cache storage.
…ck support

- Added a no-op async context manager `_null_lock` for scenarios where locking is not required.
- Introduced a new method `_set_impl` to handle the merging and writing of guild configurations, improving code organization.
- Updated the `set` method to optionally use locking for concurrent safety, allowing for safer updates when multiple coroutines may modify the same guild configuration.
- Revised `async_set` to leverage the new `set` method with locking enabled.
- Improved documentation for the `set` method to clarify the use of the `use_lock` parameter.
- Introduced a sentinel object `_CACHED_NONE` to differentiate between actual None values and cache misses in the `get_or_fetch` method.
- Updated the eviction logic to only remove the oldest entry when adding a new key, preventing unnecessary evictions during key updates.
- Enhanced the `set` method to store `_CACHED_NONE` when the value is None, ensuring consistent cache behavior.
…h concurrent operations

- Updated the cache retrieval process to fetch all entries concurrently, improving efficiency for Valkey/network caches.
- Enhanced the cache storage mechanism to write all uncached command entries concurrently, ensuring better performance during updates.
- Maintained existing functionality while improving the overall responsiveness of the permission system.
…prefixes

- Updated the `invalidate_cache` method to clear both in-memory and backend cache for all guilds when `guild_id` is None, improving cache management.
- Revised documentation to reflect the new behavior of cache invalidation, ensuring clarity on the impact of the method's parameters.
- Improved logging for better traceability during cache invalidation operations.
… in JailStatusCache

- Renamed the test method to clarify that `async_set` now overwrites existing values when the backend has a value, rather than skipping the write.
- Updated the test documentation to accurately describe the new behavior.
- Adjusted assertions to ensure the correct value is being set in the backend during the test.
- Introduced a new test method `test_get_or_fetch_caches_none_and_does_not_refetch` to verify that the `get_or_fetch` method correctly caches None values and prevents refetching on subsequent calls.
- Enhanced test coverage for the TTLCache to ensure consistent behavior when handling None values.
…tter resource management

- Changed the `prefix_manager` fixture to yield a `PrefixManager` instance instead of returning it, ensuring that the CONFIG patch remains active for the duration of the test.
- This modification enhances resource management and test isolation by properly handling setup and teardown of the fixture.
…tion

- Updated the `_safe_env_for_config` fixture to ensure the `POSTGRES_PASSWORD` is safe and that `TUX_VERSION` is unset, promoting a more deterministic test environment.
- This change improves the reliability of tests by preventing potential configuration inconsistencies.
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@docs/content/developer/best-practices/caching.md`:
- Around line 35-40: Update the docs so the permission-related cache key and
pre-warm method names match the code: replace examples that show
`perm:command_permission:{guild_id}:{command_name}` with the actual cache key
`perm:command_permission_fallback:{guild_id}:{command_name}` and rename any doc
references to the pre-warm functions to `prewarm_cache_for_guild` and
`prewarm_cache_for_all_guilds` (these symbols are used in the codebase) so
examples and mentions at lines ~35-40 and ~270-280 align with the
implementation.

In `@src/tux/cache/ttl.py`:
- Around lines 49-168: The get() method can return the internal sentinel
_CACHED_NONE (set by get_or_fetch()), which violates its contract. Update get()
(and any logic reading self._cache in ttl.py) to translate the sentinel to None
before returning: after retrieving `value, expire_time = self._cache[key]`,
handle expiry as before, but if value is _CACHED_NONE, return None. Keep all
other behavior (eviction, logging, expiration removal) unchanged so
get_or_fetch() can continue to store _CACHED_NONE silently while callers of
get() always see None.

In `@src/tux/core/permission_system.py`:
- Around line 524-531: The current logic treats any non-None backend payload as
a cache hit even if unwrap_optional_perm(raw) returns None for malformed/legacy
data; change the logic to call unwrap_optional_perm(raw) first, check its return
value, and only treat it as a cache hit/return when the unwrapped result is not
None; if unwrap_optional_perm(raw) returns None, log/trace that the cache entry
is malformed and proceed as a cache miss so real permissions are resolved. Apply
the same fix to the similar block that handles non-fallback keys (the block
around the code path referenced for lines 625-635).
🧹 Nitpick comments (5)
tests/cache/test_shared_cache_backend.py (1)

17-27: Reset singleton cache managers between tests to avoid state leakage.

Since both managers are singletons, backend/cache state can bleed into other tests. Add fixture teardown to clear state after each test run.

♻️ Suggested fixture teardown
 @pytest.fixture
 def manager(self, backend: InMemoryBackend) -> GuildConfigCacheManager:
     """Singleton with backend set."""
     manager = GuildConfigCacheManager()
     manager.set_backend(backend)
-    return manager
+    yield manager
+    manager._backend = None
+    manager._cache.clear()
+    manager._locks.clear()

 @pytest.fixture
 def cache(self, backend: InMemoryBackend) -> JailStatusCache:
     """Singleton with backend set."""
     jail_cache = JailStatusCache()
     jail_cache.set_backend(backend)
-    return jail_cache
+    yield jail_cache
+    jail_cache._backend = None
+    jail_cache._cache.clear()
+    jail_cache._locks.clear()

Also applies to: 123-134

tests/core/test_prefix_manager.py (2)

34-48: Test doesn't verify the backend was actually called.

The test asserts the return value and cache state but doesn't verify that mock_backend.get was invoked with the expected key. This would confirm the backend integration path was exercised.

✨ Suggested improvement
         result = await prefix_manager.get_prefix(TEST_GUILD_ID)
     assert result == "?"
     assert prefix_manager._prefix_cache[TEST_GUILD_ID] == "?"
+    mock_backend.get.assert_called_once_with(f"prefix:{TEST_GUILD_ID}")

51-58: Consider asserting no backend access on cache hit.

The test correctly verifies the in-memory cache returns the value, but doesn't confirm that the backend is never consulted when the sync cache has a hit. Adding a patch that would fail if called would strengthen the test.

✨ Suggested improvement
 @pytest.mark.asyncio
 async def test_get_prefix_returns_from_sync_cache_on_hit(
     prefix_manager: PrefixManager,
 ) -> None:
     """get_prefix returns from _prefix_cache when key already in cache (no backend call)."""
     prefix_manager._prefix_cache[TEST_GUILD_ID] = "."
-    result = await prefix_manager.get_prefix(TEST_GUILD_ID)
+    with patch(
+        "tux.core.prefix_manager.get_cache_backend",
+        side_effect=AssertionError("Backend should not be called on cache hit"),
+    ):
+        result = await prefix_manager.get_prefix(TEST_GUILD_ID)
     assert result == "."
src/tux/core/prefix_manager.py (2)

243-250: Sequential backend writes may slow startup with many guilds.

The loop awaits each backend.set call individually. With 1000 configs (the limit), this could add noticeable latency on startup. Consider using asyncio.gather for concurrent writes when the backend supports it.

♻️ Suggested improvement for concurrent backend writes
                 backend = get_cache_backend(self.bot)
+                write_tasks = []
                 for config in all_configs:
                     self._prefix_cache[config.id] = config.prefix
-                    await backend.set(
-                        f"prefix:{config.id}",
-                        config.prefix,
-                        ttl_sec=None,
+                    write_tasks.append(
+                        backend.set(
+                            f"prefix:{config.id}",
+                            config.prefix,
+                            ttl_sec=None,
+                        )
                     )
+                if write_tasks:
+                    await asyncio.gather(*write_tasks)

289-290: Sequential backend deletes in full invalidation may be slow.

Similar to load_all_prefixes, iterating with sequential awaits could be optimized with asyncio.gather for better performance when invalidating many guilds.

♻️ Suggested improvement for concurrent backend deletes
         backend = get_cache_backend(self.bot)
         if guild_id is None:
-            for gid in list(self._prefix_cache.keys()):
-                await backend.delete(f"prefix:{gid}")
+            keys_to_delete = list(self._prefix_cache.keys())
+            if keys_to_delete:
+                await asyncio.gather(
+                    *(backend.delete(f"prefix:{gid}") for gid in keys_to_delete)
+                )
             self._prefix_cache.clear()
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 07bd082 and 519988b.

📒 Files selected for processing (10)
  • docs/content/developer/best-practices/caching.md
  • src/tux/cache/backend.py
  • src/tux/cache/managers.py
  • src/tux/cache/ttl.py
  • src/tux/core/permission_system.py
  • src/tux/core/prefix_manager.py
  • tests/cache/test_shared_cache_backend.py
  • tests/cache/test_ttl.py
  • tests/core/test_prefix_manager.py
  • tests/shared/test_config_settings.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/cache/test_ttl.py
  • tests/shared/test_config_settings.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/rules.mdc)

**/*.py: Follow Python code style guide and best practices as defined in core/style-guide.mdc
Follow error handling patterns as defined in error-handling/patterns.mdc
Follow loguru logging patterns as defined in error-handling/logging.mdc
Follow Sentry integration patterns as defined in error-handling/sentry.mdc

**/*.py: Use strict type hints with union syntax (Type | None not Optional[Type]) in Python
Use NumPy docstring format in Python
Prefer absolute imports; relative imports are allowed within the same module in Python
Group imports in order: stdlib → third-party → local, with blank lines separating groups in Python
Use 88 character line length limit in Python
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants in Python
Always add imports to the top of the file unless absolutely necessary in Python
Use async/await for all I/O operations in Python
Log errors and events with context information in Python
Keep individual source files under 1600 lines
Include complete type hints for all function parameters and return values
Include NumPy-format docstrings for all public APIs

Files:

  • src/tux/cache/backend.py
  • tests/core/test_prefix_manager.py
  • src/tux/core/prefix_manager.py
  • src/tux/core/permission_system.py
  • tests/cache/test_shared_cache_backend.py
  • src/tux/cache/managers.py
  • src/tux/cache/ttl.py
src/tux/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/tux/**/*.py: Use custom exceptions for business logic errors in Python
Handle Discord rate limits gracefully in API calls

Files:

  • src/tux/cache/backend.py
  • src/tux/core/prefix_manager.py
  • src/tux/core/permission_system.py
  • src/tux/cache/managers.py
  • src/tux/cache/ttl.py
**/*.md

📄 CodeRabbit inference engine (.cursor/rules/rules.mdc)

**/*.md: Follow master documentation rules as defined in docs/docs.mdc
Follow practical documentation examples and templates as defined in docs/patterns.mdc
Follow documentation organization structure as defined in docs/structure.mdc
Follow documentation writing standards as defined in docs/style.mdc

Files:

  • docs/content/developer/best-practices/caching.md
**/tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/rules.mdc)

**/tests/**/*.py: Follow fixture usage patterns as defined in testing/fixtures.mdc
Follow test marker conventions as defined in testing/markers.mdc
Follow async testing patterns as defined in testing/async.mdc
Follow test quality philosophy, behavior-driven testing, and ACT pattern as defined in testing/test-quality.mdc

Files:

  • tests/core/test_prefix_manager.py
  • tests/cache/test_shared_cache_backend.py
🧠 Learnings (4)
📚 Learning: 2026-01-25T18:11:17.210Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-25T18:11:17.210Z
Learning: Applies to src/tux/shared/cache.py : Use TTL cache system for frequently accessed data (guild config, jail status, permissions)

Applied to files:

  • src/tux/cache/backend.py
  • docs/content/developer/best-practices/caching.md
  • src/tux/core/permission_system.py
  • tests/cache/test_shared_cache_backend.py
  • src/tux/cache/managers.py
  • src/tux/cache/ttl.py
📚 Learning: 2026-01-25T18:11:17.210Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-25T18:11:17.210Z
Learning: Applies to src/tux/core/bot.py : Pre-warm cache on bot startup for frequently accessed data

Applied to files:

  • docs/content/developer/best-practices/caching.md
  • src/tux/core/prefix_manager.py
  • src/tux/core/permission_system.py
📚 Learning: 2026-01-25T18:10:54.909Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: .cursor/rules/rules.mdc:0-0
Timestamp: 2026-01-25T18:10:54.909Z
Learning: Applies to **/tests/**/*.py : Follow fixture usage patterns as defined in testing/fixtures.mdc

Applied to files:

  • tests/core/test_prefix_manager.py
📚 Learning: 2026-01-25T18:11:17.210Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-25T18:11:17.210Z
Learning: Applies to src/tux/database/controllers/**/*.py : Use batch operations for permission checks and database queries for optimization

Applied to files:

  • src/tux/core/permission_system.py
🧬 Code graph analysis (6)
src/tux/cache/backend.py (3)
src/tux/cache/ttl.py (3)
  • TTLCache (17-191)
  • get (49-83)
  • invalidate (111-127)
src/tux/cache/managers.py (4)
  • get (77-98)
  • get (271-290)
  • invalidate (200-214)
  • invalidate (330-339)
src/tux/cache/service.py (2)
  • is_connected (92-101)
  • get_client (103-112)
tests/core/test_prefix_manager.py (3)
src/tux/core/prefix_manager.py (2)
  • PrefixManager (34-313)
  • set_prefix (109-138)
src/tux/shared/config/settings.py (1)
  • is_prefix_override_enabled (396-408)
tests/database/test_permissions_controller_cache_backend.py (1)
  • mock_backend (19-25)
src/tux/core/prefix_manager.py (1)
src/tux/cache/backend.py (1)
  • get_cache_backend (165-193)
src/tux/core/permission_system.py (2)
src/tux/cache/backend.py (1)
  • get_cache_backend (165-193)
src/tux/database/controllers/permissions.py (2)
  • unwrap_optional_perm (582-590)
  • wrap_optional_perm (577-579)
tests/cache/test_shared_cache_backend.py (2)
src/tux/cache/managers.py (3)
  • GuildConfigCacheManager (24-219)
  • JailStatusCache (222-359)
  • invalidate_guild (341-354)
src/tux/cache/backend.py (1)
  • InMemoryBackend (46-94)
src/tux/cache/ttl.py (2)
src/tux/cache/managers.py (5)
  • get (77-98)
  • get (271-290)
  • invalidate (200-214)
  • invalidate (330-339)
  • get_or_fetch (300-318)
src/tux/cache/backend.py (3)
  • get (28-29)
  • get (71-73)
  • get (122-135)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Seer Code Review
  • GitHub Check: Run All Tests (3.13.11)
  • GitHub Check: cubic · AI code reviewer
🔇 Additional comments (11)
src/tux/cache/backend.py (1)

165-193: Shared fallback backend caching looks solid.

Reusing a single in-memory backend across consumers avoids fragmentation and keeps cache behavior consistent when Valkey is down.

src/tux/cache/managers.py (1)

341-353: Guild-specific invalidation behavior is on point.

The predicate-based invalidation now correctly targets only the intended guild’s keys.

tests/core/test_prefix_manager.py (4)

1-12: LGTM! Module setup follows conventions.

Imports are properly organized (stdlib → third-party → local), the unit marker is correctly applied, and the module docstring describes the test purpose.


17-31: LGTM! Fixture now correctly yields to keep CONFIG patch active.

The prefix_manager fixture properly uses yield to maintain the CONFIG patch for the full test duration, and the return type annotation Generator[PrefixManager] is correct.


61-86: LGTM! Comprehensive test for set_prefix backend write.

Test properly verifies in-memory cache update and backend write with correct key format and TTL. The _persist_prefix is correctly mocked to isolate the test from database operations.


89-95: LGTM! Default prefix test covers None guild_id edge case.

Test correctly verifies that get_prefix(None) returns the mocked default prefix "!".

src/tux/core/prefix_manager.py (5)

24-24: LGTM! Import added for backend cache integration.

The import is correctly placed in the local imports group.


95-106: LGTM! Backend lookup integrated into get_prefix flow.

The implementation correctly:

  1. Checks in-memory cache first (fast path)
  2. Falls back to backend lookup
  3. Updates in-memory cache on backend hit
  4. Falls through to DB load on miss

The type check isinstance(backend_val, str) is a good defensive measure.


131-133: LGTM! Backend write in set_prefix.

Using ttl_sec=None for long-lived prefix data is appropriate.


172-173: LGTM! Backend synchronization on prefix load.

Correctly persists the loaded prefix to backend for cross-process consistency.


267-297: LGTM! Async invalidation now clears both in-memory and backend state.

The implementation addresses the previous review concern by iterating through cached keys and deleting them from the backend when guild_id is None. The docstring accurately reflects the behavior.


- Renamed cache key from `perm:command_permission` to `perm:command_permission_fallback` for better clarity on its purpose.
- Updated method calls in the documentation from `pre_warm_guild_cache` to `prewarm_cache_for_guild` and from `pre_warm_all_caches` to `prewarm_cache_for_all_guilds` to reflect naming consistency and improve readability.
…fetching

- Updated the `get_or_fetch` method to check for `_CACHED_NONE` in the raw cache, returning None without refetching if the stored value is `_CACHED_NONE`.
- Enhanced the `get` method to return None directly when the value is `_CACHED_NONE`, improving cache behavior and consistency.
- Enhanced the cache retrieval logic to handle malformed cache entries more gracefully by checking for unwrapped values before returning them.
- Updated logging to provide clearer messages when encountering malformed cache entries, improving traceability during cache operations.
- Ensured that uncached commands are properly tracked when cache entries are invalid.
…with asyncio

- Modified the prefix cache writing process to gather all set operations concurrently, improving performance during cache updates.
- Enhanced the cache invalidation logic to delete multiple keys concurrently when `guild_id` is None, streamlining cache management.
- Improved overall efficiency and responsiveness of the prefix management system.
…resource management

- Modified the `manager` and `cache` fixtures to yield instances of `GuildConfigCacheManager` and `JailStatusCache`, respectively, ensuring proper teardown and state clearance after tests.
- This change enhances resource management and prevents state leakage between tests, promoting better isolation and reliability in test outcomes.
…cache behavior

- Added assertions to verify that the backend is called correctly when retrieving prefixes from the cache.
- Updated the test for cache hits to ensure that the backend is not called, improving the accuracy of cache behavior validation.
- These changes enhance the reliability of tests related to prefix management and caching logic.
@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 6 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/tux/core/permission_system.py">

<violation number="1" location="src/tux/core/permission_system.py:528">
P1: Same negative caching bug as in `get_command_permission`. When a permission was cached as not existing (wrapped as `{"_v": None}`), `unwrap_optional_perm` correctly returns `None`. However, this code treats it as a cache miss and re-queries the database. Should set `cached_results[command_name] = unwrapped` (which is `None`) instead of appending to `uncached_commands`.</violation>
</file>
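
In code, the fix amounts to treating any non-None raw payload as a hit, even when the unwrapped value is None. A rough sketch; unwrap_optional_perm is reduced to its wrapped-dict behavior here, and the surrounding loop structure is assumed:

```python
from typing import Any


def unwrap_optional_perm(raw: dict[str, Any]) -> Any | None:
    # {"_v": None} means the permission was cached as not existing.
    return raw.get("_v")


async def resolve_from_cache(
    backend: Any,
    keys_by_command: dict[str, str],
    cached_results: dict[str, Any],
    uncached_commands: list[str],
) -> None:
    for command_name, key in keys_by_command.items():
        raw = await backend.get(key)
        if raw is not None:
            # Cache hit, including negative entries: the unwrapped value may be None.
            cached_results[command_name] = unwrap_optional_perm(raw)
        else:
            # True miss: only now fall back to the database.
            uncached_commands.append(command_name)
```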


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/tux/core/prefix_manager.py (1)

170-179: Don’t let backend write failures change the returned prefix.
If the backend write fails, _load_guild_prefix currently falls into the broad except and returns the default prefix even when the DB returned a valid prefix. Cache errors should be best-effort and not affect correctness.

🛠️ Suggested fix
             prefix = guild_config.prefix
             self._prefix_cache[guild_id] = prefix
-            backend = get_cache_backend(self.bot)
-            await backend.set(f"prefix:{guild_id}", prefix, ttl_sec=None)
+            backend = get_cache_backend(self.bot)
+            try:
+                await backend.set(f"prefix:{guild_id}", prefix, ttl_sec=None)
+            except Exception as e:
+                logger.warning(
+                    "Failed to write prefix to backend for guild "
+                    f"{guild_id}: {type(e).__name__}",
+                )
🤖 Fix all issues with AI agents
In `@src/tux/core/permission_system.py`:
- Around line 524-535: The cache-hit logic is incorrectly treating a cached None
as malformed because it checks "if unwrapped is not None"; change the logic to
treat any non-None raw cache payload as a hit and accept
unwrap_optional_perm(raw) even when it returns None. Concretely, in the blocks
using backend_key/ PERM_FALLBACK_KEY_PREFIX and calling
self._cache_backend.get(...) (the sections around the current cache hit checks
and the similar block at lines ~637–647), replace the "if unwrapped is not None"
check with a check that raw is not None to detect a cache hit, then call
unwrap_optional_perm(raw) and return its value (including None); only log/handle
a malformed entry if unwrap_optional_perm indicates a real parse failure (or
raises). Ensure both occurrences use the same pattern so cached None is treated
as a valid hit.
🧹 Nitpick comments (4)
docs/content/developer/best-practices/caching.md (4)

146-148: Consider consolidating or emphasizing the clear_all/invalidate_guild limitation.

The limitation that clear_all() and invalidate_guild() only affect in-memory state (not Valkey keys) is mentioned here, again at lines 172-173, and in troubleshooting at line 335. While it's good to document this, the scattered mentions and behavioral inconsistency (same method name, different scope depending on backend) could confuse users who expect these methods to clear all cached state regardless of backend.

Consider adding a more prominent note or consolidating the warnings, perhaps in a dedicated "Backend-specific behavior" callout box early in the document.


230-241: Clarify that instantiation returns the singleton instance.

The document refers to "singleton cache managers" and shows examples creating instances with cache_manager = GuildConfigCacheManager(). While this pattern works correctly if the constructor returns the singleton instance (likely via metaclass or __new__), it might be clearer to explicitly mention that calling GuildConfigCacheManager() multiple times returns the same shared instance, especially for developers unfamiliar with this pattern.

📝 Optional clarification

Add a brief note after line 232:

 ### 3. Use singleton cache managers
 
 For shared data like guild config and jail status, use the singleton cache managers. Do not replace or bypass them with ad-hoc caches for the same data:
+
+**Note**: `GuildConfigCacheManager()` and `JailStatusCache()` return the same shared instance on each call.
 
 ```python

286-286: Clarify TTLCache concurrency statement.

The parenthetical "(no await between check and use for a key)" is unclear. It's not obvious whether this is:

  • Explaining that the API doesn't require await (sync calls)
  • Making a statement about atomicity guarantees
  • Providing guidance on usage patterns

Consider rephrasing for clarity, for example:

-- **TTLCache**: Synchronous API; safe for concurrent use from multiple async tasks (no await between check and use for a key).
+- **TTLCache**: Synchronous API that can be called from async tasks without `await`; thread-safe for concurrent access.

310-313: Consider adding operational guidance for Valkey disconnect scenarios.

The documentation clearly states that there's no automatic reconnection if Valkey is lost mid-run, and that cache operations will raise errors until process restart. Given the potential impact on production availability, consider adding:

  • Recommended monitoring/alerting for Valkey health
  • Circuit breaker or graceful degradation patterns
  • Operational runbook references for handling Valkey failures
  • Whether the application can continue functioning with cache errors or requires immediate restart

This would help operators prepare for and respond to Valkey failures in production.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 519988b and b97cbd2.

📒 Files selected for processing (6)
  • docs/content/developer/best-practices/caching.md
  • src/tux/cache/ttl.py
  • src/tux/core/permission_system.py
  • src/tux/core/prefix_manager.py
  • tests/cache/test_shared_cache_backend.py
  • tests/core/test_prefix_manager.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/tux/cache/ttl.py
  • tests/cache/test_shared_cache_backend.py
🧰 Additional context used
📓 Path-based instructions (4)
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/rules.mdc)

**/*.py: Follow Python code style guide and best practices as defined in core/style-guide.mdc
Follow error handling patterns as defined in error-handling/patterns.mdc
Follow loguru logging patterns as defined in error-handling/logging.mdc
Follow Sentry integration patterns as defined in error-handling/sentry.mdc

**/*.py: Use strict type hints with union syntax (Type | None not Optional[Type]) in Python
Use NumPy docstring format in Python
Prefer absolute imports; relative imports are allowed within the same module in Python
Group imports in order: stdlib → third-party → local, with blank lines separating groups in Python
Use 88 character line length limit in Python
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants in Python
Always add imports to the top of the file unless absolutely necessary in Python
Use async/await for all I/O operations in Python
Log errors and events with context information in Python
Keep individual source files under 1600 lines
Include complete type hints for all function parameters and return values
Include NumPy-format docstrings for all public APIs

Files:

  • src/tux/core/permission_system.py
  • src/tux/core/prefix_manager.py
  • tests/core/test_prefix_manager.py
src/tux/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/tux/**/*.py: Use custom exceptions for business logic errors in Python
Handle Discord rate limits gracefully in API calls

Files:

  • src/tux/core/permission_system.py
  • src/tux/core/prefix_manager.py
**/*.md

📄 CodeRabbit inference engine (.cursor/rules/rules.mdc)

**/*.md: Follow master documentation rules as defined in docs/docs.mdc
Follow practical documentation examples and templates as defined in docs/patterns.mdc
Follow documentation organization structure as defined in docs/structure.mdc
Follow documentation writing standards as defined in docs/style.mdc

Files:

  • docs/content/developer/best-practices/caching.md
**/tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/rules.mdc)

**/tests/**/*.py: Follow fixture usage patterns as defined in testing/fixtures.mdc
Follow test marker conventions as defined in testing/markers.mdc
Follow async testing patterns as defined in testing/async.mdc
Follow test quality philosophy, behavior-driven testing, and ACT pattern as defined in testing/test-quality.mdc

Files:

  • tests/core/test_prefix_manager.py
🧠 Learnings (3)
📚 Learning: 2026-01-25T18:11:17.210Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-25T18:11:17.210Z
Learning: Applies to src/tux/shared/cache.py : Use TTL cache system for frequently accessed data (guild config, jail status, permissions)

Applied to files:

  • src/tux/core/permission_system.py
  • docs/content/developer/best-practices/caching.md
📚 Learning: 2026-01-25T18:11:17.210Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-25T18:11:17.210Z
Learning: Applies to src/tux/database/controllers/**/*.py : Use batch operations for permission checks and database queries for optimization

Applied to files:

  • src/tux/core/permission_system.py
📚 Learning: 2026-01-25T18:11:17.210Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-25T18:11:17.210Z
Learning: Applies to src/tux/core/bot.py : Pre-warm cache on bot startup for frequently accessed data

Applied to files:

  • src/tux/core/permission_system.py
  • src/tux/core/prefix_manager.py
🧬 Code graph analysis (2)
src/tux/core/prefix_manager.py (2)
tests/cache/test_shared_cache_backend.py (1)
  • cache (133-140)
src/tux/cache/backend.py (1)
  • get_cache_backend (165-193)
tests/core/test_prefix_manager.py (2)
src/tux/core/prefix_manager.py (2)
  • PrefixManager (34-321)
  • set_prefix (109-138)
src/tux/shared/config/settings.py (1)
  • is_prefix_override_enabled (396-408)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Seer Code Review
  • GitHub Check: Run All Tests (3.13.11)
  • GitHub Check: cubic · AI code reviewer
🔇 Additional comments (7)
src/tux/core/permission_system.py (2)

22-38: Cache backend wiring looks solid.

Also applies to: 134-135


480-488: Good cache invalidation for command + parent keys.

src/tux/core/prefix_manager.py (4)

95-105: Backend read-through + in-memory update looks good.


131-134: Write-through cache update is clear.


243-255: Concurrent backend writes on load are a nice touch.


272-305: All-guild invalidation now clears backend keys too—nice.

tests/core/test_prefix_manager.py (1)

1-100: Solid coverage of backend-hit, cache-hit, write-through, and default cases.


…ions

- Simplified the cache retrieval logic by directly unwrapping values and removing unnecessary checks for malformed entries.
- Improved logging to provide clearer messages for cache hits, enhancing traceability during cache operations.
- Ensured that uncached commands are accurately tracked, improving overall cache management efficiency.
…l guidance

- Added notes on the behavior of `clear_all` and `invalidate_guild` methods when using Valkey, emphasizing their in-memory limitations.
- Included operational guidance for using Valkey in production, covering monitoring, graceful degradation, and failure response strategies.
- Enhanced clarity on singleton cache manager usage and concurrency safety in the documentation.
…etup

- Introduced Valkey as an optional cache backend for persistent state across restarts, with detailed configuration instructions.
- Updated documentation to specify that bot owner ID and sysadmins should be set in `config/config.json` instead of `.env`.
- Enhanced Docker installation guide to include Valkey setup and usage instructions for shared caching.
…larity

- Changed logger formatting in `GuildConfigCacheManager` and `JailStatusCache` to use curly braces for string interpolation, enhancing readability and consistency in log outputs.
- This update ensures that backend type information is logged more clearly, aiding in debugging and monitoring cache behavior.
- Updated the logger message in `CacheSetupService` to use curly braces for string interpolation, ensuring consistency with previous changes in cache managers.
- This adjustment enhances the clarity of logged backend type information, aiding in debugging and monitoring efforts.
- Updated the command in the Docker Compose configuration for the Valkey service to set the log level to debug, enhancing the verbosity of logs for better troubleshooting and monitoring.
…cker setups

- Added guidance on using `VALKEY_HOST=localhost` for local bot runs and `VALKEY_HOST=tux-valkey` for Docker Compose environments.
- Included troubleshooting tips for resolving "Name or service not known" errors when running the bot locally.
@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="compose.yaml">

<violation number="1" location="compose.yaml:64">
P2: Avoid running Valkey with debug logging in Compose; it can expose cached data in logs and create excessive log volume. Use warning (or another non-debug level) unless explicitly troubleshooting.</violation>
</file>


ports:
- "127.0.0.1:6379:6379"
volumes:
- tux_valkey_data:/data
command: valkey-server --save 60 1 --loglevel debug
@cubic-dev-ai cubic-dev-ai bot Jan 29, 2026

P2: Avoid running Valkey with debug logging in Compose; it can expose cached data in logs and create excessive log volume. Use warning (or another non-debug level) unless explicitly troubleshooting.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At compose.yaml, line 64:

<comment>Avoid running Valkey with debug logging in Compose; it can expose cached data in logs and create excessive log volume. Use warning (or another non-debug level) unless explicitly troubleshooting.</comment>

<file context>
@@ -61,7 +61,7 @@ services:
     volumes:
       - tux_valkey_data:/data
-    command: valkey-server --save 60 1 --loglevel warning
+    command: valkey-server --save 60 1 --loglevel debug
     logging:
       driver: json-file
</file context>
Suggested change
command: valkey-server --save 60 1 --loglevel debug
command: valkey-server --save 60 1 --loglevel warning

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/tux/core/setup/cache_setup.py`:
- Around lines 50-64: The CacheService created in setup (CacheService()) may
remain open if connect() or ping() raises, because await service.close() is only
called when ping() returns False. Wrap the connect/ping logic in a
try/except/finally (or add a finally) so that await service.close() always runs
when the service was created but not assigned to self.bot.cache_service. Only
set self.bot.cache_service = service when ping() succeeds, and close the service
in the finally block if self.bot.cache_service is still None; update the block
around CacheService, connect(), ping(), and close() accordingly.
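
The suggested shape, sketched under the assumption that CacheService exposes connect/ping/close as described in this PR and that cache_service starts as None on the bot:

```python
from typing import Any

from tux.cache.service import CacheService  # assumed import path


async def setup_cache(bot: Any) -> None:
    """Connect to Valkey; close the service unless the bot took ownership of it."""
    service = CacheService()
    try:
        await service.connect()
        if await service.ping():
            bot.cache_service = service  # success: the bot now owns the service
        else:
            bot.cache_service = None
    except Exception:
        bot.cache_service = None  # never raise out of setup; fall back to in-memory
    finally:
        if getattr(bot, "cache_service", None) is not service:
            # connect()/ping() failed or raised: don't leak the connection.
            await service.close()
```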
🧹 Nitpick comments (4)
docs/content/selfhost/config/environment.md (1)

57-68: Consider adding brief context to each configuration option.

The examples show different hostnames (localhost vs tux-valkey) without immediate context, which might confuse users who skim the documentation. While the tip on lines 77-82 explains this well, adding brief inline context would improve clarity.

📝 Suggested clarification
 Use either **VALKEY_URL** or **individual variables**:

 ```env
-# Option A: URL (overrides VALKEY_HOST/PORT/DB/PASSWORD)
+# Option A: URL (overrides VALKEY_HOST/PORT/DB/PASSWORD; use localhost for local runs)
 VALKEY_URL=valkey://localhost:6379/0

-# Option B: Individual (e.g. for Docker with tux-valkey)
+# Option B: Individual (use tux-valkey when bot runs in Docker Compose)
 VALKEY_HOST=tux-valkey
 VALKEY_PORT=6379
 VALKEY_DB=0
 VALKEY_PASSWORD=

src/tux/core/setup/cache_setup.py (1)

37-45: Use NumPy-style docstring for `setup`.

The current docstring is descriptive but not NumPy-formatted. Consider converting it to the required format for public APIs.

✍️ Suggested docstring shape
 async def setup(self) -> None:
-        """
-        Connect to Valkey when configured; otherwise leave cache_service as None.
-
-        If VALKEY_URL (or VALKEY_HOST) is empty, bot.cache_service is set to None
-        and the bot continues without Valkey (in-memory caches only). If the URL
-        is set but connection or ping fails, log and set bot.cache_service = None
-        so the bot still starts. Never raises.
-        """
+        """
+        Set up the cache backend for the bot.
+
+        Returns
+        -------
+        None
+            Initializes `cache_service` and never raises.
+
+        Notes
+        -----
+        If VALKEY_URL (or VALKEY_HOST) is empty, the bot uses in-memory caches.
+        If connection or ping fails, the bot logs a warning and continues with
+        in-memory caches.
+        """

As per coding guidelines: "Include NumPy-format docstrings for all public APIs".

src/tux/cache/managers.py (2)

52-59: Add NumPy docstrings for public cache APIs.

Several public methods use brief docstrings rather than NumPy format (e.g., set_backend, async_set, clear_all, get_or_fetch, invalidate_guild). Align these with the required style.

✍️ Example update (apply similarly to others)
 def set_backend(self, backend: AsyncCacheBackend) -> None:
-        """Set the cache backend (e.g. from get_cache_backend(bot))."""
+        """
+        Set the cache backend.
+
+        Parameters
+        ----------
+        backend : AsyncCacheBackend
+            Backend instance to use for cache operations.
+        """

As per coding guidelines: "Include NumPy-format docstrings for all public APIs".

Also applies to: 182-190, 216-218, 300-306, 320-322, 341-347, 356-358


214-214: Use loguru {} placeholders instead of f-strings.

This keeps logging consistent with loguru formatting and avoids eager string interpolation.

🔧 Example fixes
```diff
-        logger.debug(f"Invalidated guild config cache for guild {guild_id}")
+        logger.debug("Invalidated guild config cache for guild {}", guild_id)

-        logger.debug(
-            f"Invalidated jail status cache for guild {guild_id}, user {user_id}",
-        )
+        logger.debug(
+            "Invalidated jail status cache for guild {}, user {}",
+            guild_id,
+            user_id,
+        )

-        logger.debug(
-            f"Invalidated {removed} jail status cache entries for guild {guild_id}",
-        )
+        logger.debug(
+            "Invalidated {} jail status cache entries for guild {}",
+            removed,
+            guild_id,
+        )
```

As per coding guidelines: "Follow loguru logging patterns as defined in error-handling/logging.mdc".

Also applies to: 337-339, 352-354
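One nuance worth noting: `{}` placeholders defer only the formatting step; argument expressions are still evaluated at call time. When an argument is itself expensive to compute, loguru's `opt(lazy=True)` accepts callables that run only if the message is actually emitted (a sketch; `compute_cache_stats` is a hypothetical helper):

```python
from loguru import logger

# The lambda is invoked only when a sink actually accepts DEBUG messages.
logger.opt(lazy=True).debug("Cache stats: {stats}", stats=lambda: compute_cache_stats())
```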

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a8680e and e70655f.

📒 Files selected for processing (8)
  • compose.yaml
  • docs/content/selfhost/config/bot-token.md
  • docs/content/selfhost/config/environment.md
  • docs/content/selfhost/config/index.md
  • docs/content/selfhost/index.md
  • docs/content/selfhost/install/docker.md
  • src/tux/cache/managers.py
  • src/tux/core/setup/cache_setup.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • compose.yaml
🧰 Additional context used
📓 Path-based instructions (3)
**/*.md

📄 CodeRabbit inference engine (.cursor/rules/rules.mdc)

**/*.md: Follow master documentation rules as defined in docs/docs.mdc
Follow practical documentation examples and templates as defined in docs/patterns.mdc
Follow documentation organization structure as defined in docs/structure.mdc
Follow documentation writing standards as defined in docs/style.mdc

Files:

  • docs/content/selfhost/install/docker.md
  • docs/content/selfhost/config/environment.md
  • docs/content/selfhost/config/index.md
  • docs/content/selfhost/index.md
  • docs/content/selfhost/config/bot-token.md
**/*.py

📄 CodeRabbit inference engine (.cursor/rules/rules.mdc)

**/*.py: Follow Python code style guide and best practices as defined in core/style-guide.mdc
Follow error handling patterns as defined in error-handling/patterns.mdc
Follow loguru logging patterns as defined in error-handling/logging.mdc
Follow Sentry integration patterns as defined in error-handling/sentry.mdc

**/*.py: Use strict type hints with union syntax (Type | None not Optional[Type]) in Python
Use NumPy docstring format in Python
Prefer absolute imports; relative imports are allowed within the same module in Python
Group imports in order: stdlib → third-party → local, with blank lines separating groups in Python
Use 88 character line length limit in Python
Use snake_case for functions and variables, PascalCase for classes, UPPER_CASE for constants in Python
Always add imports to the top of the file unless absolutely necessary in Python
Use async/await for all I/O operations in Python
Log errors and events with context information in Python
Keep individual source files under 1600 lines
Include complete type hints for all function parameters and return values
Include NumPy-format docstrings for all public APIs

Files:

  • src/tux/core/setup/cache_setup.py
  • src/tux/cache/managers.py
src/tux/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

src/tux/**/*.py: Use custom exceptions for business logic errors in Python
Handle Discord rate limits gracefully in API calls

Files:

  • src/tux/core/setup/cache_setup.py
  • src/tux/cache/managers.py
🧠 Learnings (3)
📚 Learning: 2026-01-25T18:11:17.210Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-25T18:11:17.210Z
Learning: Applies to src/tux/core/bot.py : Pre-warm cache on bot startup for frequently accessed data

Applied to files:

  • src/tux/core/setup/cache_setup.py
📚 Learning: 2026-01-25T18:11:17.210Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-25T18:11:17.210Z
Learning: Store all secrets in `.env` files and environment variables, never in code

Applied to files:

  • docs/content/selfhost/config/bot-token.md
📚 Learning: 2026-01-25T18:11:17.210Z
Learnt from: CR
Repo: allthingslinux/tux PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-25T18:11:17.210Z
Learning: Applies to src/tux/shared/cache.py : Use TTL cache system for frequently accessed data (guild config, jail status, permissions)

Applied to files:

  • src/tux/cache/managers.py
🧬 Code graph analysis (1)
src/tux/cache/managers.py (3)
tests/cache/test_shared_cache_backend.py (1)
  • cache (133-140)
src/tux/cache/backend.py (7)
  • AsyncCacheBackend (25-43)
  • get (28-29)
  • get (71-73)
  • get (122-135)
  • delete (39-40)
  • delete (88-90)
  • delete (151-154)
src/tux/cache/ttl.py (6)
  • TTLCache (17-196)
  • get (49-85)
  • invalidate (113-129)
  • clear (194-196)
  • get_or_fetch (142-173)
  • invalidate_keys_matching (131-140)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Seer Code Review
  • GitHub Check: Run All Tests (3.13.11)
  • GitHub Check: cubic · AI code reviewer
  • GitHub Check: Validate
🔇 Additional comments (12)
docs/content/selfhost/config/bot-token.md (1)

48-56: LGTM! Configuration guidance is clear and well-structured.

The updated instructions properly separate bot token (secret, in .env) from bot owner configuration (non-secret, in config.json). The reference to the Self-Host Configuration documentation helps users find the complete JSON structure.

docs/content/selfhost/config/environment.md (1)

84-97: LGTM! Clear guidance on bot owner configuration.

The JSON example and references to configuration tools provide users with actionable guidance. The emphasis on using config.json instead of .env for non-secret configuration is appropriate.

docs/content/selfhost/install/docker.md (3)

129-132: LGTM! Service descriptions are clear and consistent.

The service list properly documents the profile requirements and provides clear descriptions of each service's purpose.


172-194: LGTM! Comprehensive Valkey setup instructions.

The section provides clear step-by-step guidance with proper profile usage and environment configuration. The instructions correctly reference the Docker Compose service name (tux-valkey) for the VALKEY_HOST variable.


212-220: LGTM! Configuration examples are clear and well-documented.

The environment variable examples properly reference the configuration documentation and include helpful inline comments about profile usage.

docs/content/selfhost/config/index.md (2)

51-95: LGTM! Clear configuration guidance with proper separation of concerns.

The documentation correctly separates secret configuration (.env) from non-secret configuration (config.json). The JSON examples are valid and consistent with other documentation files.


165-178: Configuration examples are clear and complete.

The Valkey configuration section provides clear guidance with appropriate examples for both Docker and local setups. The fragment identifier #optional-valkey-cache correctly maps to the heading "### Optional: Valkey (cache)" in environment.md.

docs/content/selfhost/index.md (1)

47-53: The fragment identifiers in the links are correctly formatted and will resolve properly. Both #optional-valkey-cache and #valkey-optional-cache match how GitHub Markdown generates anchors from the target headings.

src/tux/core/setup/cache_setup.py (1)

66-69: Backend wiring looks solid.

Wiring the selected backend into both managers after setup is clean and consistent.
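For reference, the wiring amounts to something like the following sketch (`get_cache_backend` comes from this PR; the manager instance names are assumptions):

```python
# Hand the selected backend (Valkey-backed or in-memory) to both managers.
backend = get_cache_backend(self.bot)
guild_config_cache.set_backend(backend)
jail_status_cache.set_backend(backend)
```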

src/tux/cache/managers.py (3)

35-65: Per-guild lock initialization is thread-safe.

The _locks_lock guard and async _get_lock eliminate races when creating per-guild locks.
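A minimal sketch of the pattern, assuming attribute names like `_locks` and `_locks_lock` as referenced above:

```python
import asyncio


class GuildCacheSketch:
    """Illustrative shape of the per-guild lock guard described above."""

    def __init__(self) -> None:
        self._locks: dict[int, asyncio.Lock] = {}
        # Serializes lock creation so two coroutines racing on the same
        # guild_id can never end up holding two different locks.
        self._locks_lock = asyncio.Lock()

    async def _get_lock(self, guild_id: int) -> asyncio.Lock:
        async with self._locks_lock:
            if guild_id not in self._locks:
                self._locks[guild_id] = asyncio.Lock()
            return self._locks[guild_id]
```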


77-181: Guild config merge + partial update flow looks consistent.

The read/merge/write path plus optional locking is clear and should behave predictably across both backends.
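Continuing the sketch above, the read/merge/write path under the per-guild lock looks roughly like this (the key format and the backend's `set` signature are assumptions):

```python
    async def update_partial(self, guild_id: int, partial: dict[str, object]) -> None:
        # Hold the guild's lock across read-merge-write so concurrent
        # partial updates cannot clobber each other.
        lock = await self._get_lock(guild_id)
        async with lock:
            key = f"guild_config:{guild_id}"  # hypothetical key format
            current = await self._backend.get(key) or {}
            merged = {**current, **partial}  # incoming fields win on conflict
            await self._backend.set(key, merged)
```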


281-289: JailStatusCache.get treats "false"/"0" as True.

For string values, bool("false") is True, so false-like strings are misinterpreted as jailed.
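A quick interpreter check of the pitfall:

```python
>>> bool("false")  # any non-empty string is truthy
True
>>> bool("0")
True
>>> bool("")
False
```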

🐛 Suggested fix
```diff
         if self._backend is not None:
             value = await self._backend.get(key)
             if value is None:
                 return None
             if isinstance(value, bool):
                 return value
             if isinstance(value, str):
-                return value in ("1", "true", "True")
+                normalized = value.strip().lower()
+                if normalized in ("1", "true", "yes"):
+                    return True
+                if normalized in ("0", "false", "no"):
+                    return False
+                return bool(normalized)
             return bool(value)
```

Likely an incorrect or invalid review comment.


Comment on lines +50 to +64
```python
service = CacheService()
try:
    await service.connect()
    if await service.ping():
        self.bot.cache_service = service
        logger.success("Cache (Valkey) setup completed")
    else:
        logger.warning("Valkey ping failed; using in-memory caches only")
        await service.close()
        self.bot.cache_service = None
except Exception as e:
    logger.warning(
        f"Valkey connection failed ({e}); using in-memory caches only",
    )
    self.bot.cache_service = None
```

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

```shell
# First, let's read the cache_setup.py file to see the actual code
cat -n src/tux/core/setup/cache_setup.py | head -80
```

Repository: allthingslinux/tux

Length of output: 2934

🏁 Script executed:

```shell
# Find the loguru logging patterns guide
fd -t f logging.mdc
```

Repository: allthingslinux/tux

Length of output: 44

🏁 Script executed:

```shell
# Search for logging-related documentation
find . -name "*.mdc" -o -name "*.md" | grep -E "(logging|error)" | head -20
```

Repository: allthingslinux/tux

Length of output: 454

🏁 Script executed:

```shell
# Let's also check the error-handling directory structure
ls -la error-handling/ 2>/dev/null || echo "Directory not found, searching..."
find . -type d -name "error-handling"
```

Repository: allthingslinux/tux

Length of output: 160

🏁 Script executed:

```shell
cat -n ./.cursor/rules/error-handling/logging.mdc
```

Repository: allthingslinux/tux

Length of output: 9607

🏁 Script executed:

```shell
# Search for loguru logging with exceptions in the codebase
rg -A 2 "logger\.(warning|error|exception)" src/tux/ --type py | grep -A 2 "except" | head -40
```

Repository: allthingslinux/tux

Length of output: 2150

🏁 Script executed:

```shell
# Check specifically for loguru usage patterns with variables
rg "logger\.(warning|error|info)" src/tux/ --type py -B 1 -A 1 | grep -E "(f\"|logger\.|Exception)" | head -30
```

Repository: allthingslinux/tux

Length of output: 2768

🏁 Script executed:

```shell
# Search for logger.warning and logger.error calls to see pattern usage
rg "logger\.(warning|error)\(" src/tux/ --type py -A 1 | grep -E "logger\.(warning|error)" | head -20
```

Repository: allthingslinux/tux

Length of output: 2003

🏁 Script executed:

```shell
# Find CacheService definition to understand resource management
fd -t f "cache" src/tux/ --extension py | head -10
```

Repository: allthingslinux/tux

Length of output: 95

🏁 Script executed:

```shell
# Find and read CacheService definition
rg "class CacheService" src/tux/ --type py -A 20
```

Repository: allthingslinux/tux

Length of output: 1334

🏁 Script executed:

```shell
# Also check the cache module imports
find src/tux/cache -name "*.py" 2>/dev/null | head -10
```

Repository: allthingslinux/tux

Length of output: 184

🏁 Script executed:

```shell
# Read the full CacheService class
cat -n src/tux/cache/service.py | head -150
```

Repository: allthingslinux/tux

Length of output: 5539


Ensure CacheService is closed on failures.

If connect() or ping() raises an exception, the service remains open and resources are not released. The await service.close() is only called when ping() returns False, not in the exception handler. Use a finally block to guarantee cleanup:

🛠️ Suggested fix
```diff
         else:
             service = CacheService()
+            self.bot.cache_service = None
             try:
                 await service.connect()
                 if await service.ping():
                     self.bot.cache_service = service
                     logger.success("Cache (Valkey) setup completed")
-                else:
-                    logger.warning("Valkey ping failed; using in-memory caches only")
-                    await service.close()
-                    self.bot.cache_service = None
-            except Exception as e:
-                logger.warning(
-                    f"Valkey connection failed ({e}); using in-memory caches only",
-                )
-                self.bot.cache_service = None
+                    return
+                logger.warning("Valkey ping failed; using in-memory caches only")
+            except Exception as exc:
+                logger.warning(
+                    "Valkey connection failed ({}); using in-memory caches only",
+                    exc,
+                )
+            finally:
+                if self.bot.cache_service is None:
+                    await service.close()
```
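Applied, the block reads as a whole (same logic as the diff above, shown flattened for readability):

```python
service = CacheService()
self.bot.cache_service = None
try:
    await service.connect()
    if await service.ping():
        self.bot.cache_service = service
        logger.success("Cache (Valkey) setup completed")
        return
    logger.warning("Valkey ping failed; using in-memory caches only")
except Exception as exc:
    logger.warning(
        "Valkey connection failed ({}); using in-memory caches only",
        exc,
    )
finally:
    # Close the connection unless ownership passed to the bot.
    if self.bot.cache_service is None:
        await service.close()
```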
🤖 Prompt for AI Agents
In `@src/tux/core/setup/cache_setup.py` around lines 50 - 64, The CacheService
created in setup (CacheService()) may remain open if connect() or ping() raises,
because await service.close() is only called when ping() returns False; wrap the
connect/ping logic in a try/except/finally (or add a finally) to always call
await service.close() when the service was created and not assigned to
self.bot.cache_service, ensure you only set self.bot.cache_service = service
when ping() succeeds (in the success branch) and close the service in the
finally if self.bot.cache_service is still None; update the block around
CacheService, connect(), ping(), close() accordingly.

- Added a detailed section on Cache TTL values, including definitions, usage, and behavior differences between Valkey and in-memory backends.
- Documented specific TTL constants for various data types and provided rationale for chosen values, enhancing understanding of caching strategies.
- Clarified how TTLs are managed in different backend scenarios, improving operational guidance for developers.
- Consolidated independent database queries into a single call for jail configuration, reducing the number of round-trips to the database.
- Improved efficiency by parallelizing the check for whether a member is jailed, enhancing overall performance in the moderation module.
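The commit text includes no code, but the parallelization it describes is typically an `asyncio.gather` over the independent awaitables (a sketch; the query helper names are hypothetical):

```python
import asyncio


async def load_jail_context(guild_id: int, user_id: int):
    # Run the two independent lookups concurrently instead of sequentially,
    # cutting one database round-trip from the critical path.
    config, is_jailed = await asyncio.gather(
        fetch_jail_config(guild_id),             # hypothetical query helper
        check_member_jailed(guild_id, user_id),  # hypothetical query helper
    )
    return config, is_jailed
```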
…tion coordinator

- Enhanced the moderation coordinator to handle case creation and post-action DM sending in parallel, improving responsiveness and efficiency.
- Added error handling for both case creation and DM sending, ensuring that failures in one do not block the other.
- Streamlined the logic for sending post-action DMs, making it conditional based on action type and silent mode.
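A sketch of that shape with hypothetical coordinator helpers; `return_exceptions=True` is what keeps a failure in one task from blocking the other:

```python
import asyncio

from loguru import logger


async def run_action(coordinator, ctx, action) -> None:
    """Sketch: create the case and send the post-action DM concurrently."""
    case_result, dm_result = await asyncio.gather(
        coordinator.create_case(ctx, action),          # hypothetical helper
        coordinator.send_post_action_dm(ctx, action),  # hypothetical helper
        return_exceptions=True,  # a failure in one must not block the other
    )
    if isinstance(case_result, Exception):
        logger.error("Case creation failed: {}", case_result)
    if isinstance(dm_result, Exception):
        logger.warning("Post-action DM failed: {}", dm_result)
```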