Add redaction module with pattern-based env var redaction #4

@MarcusJellinghaus

Summary

Add a new redaction.py module to consolidate all redaction logic. Introduce redact_env_vars for substring-based key matching to redact sensitive environment variables. Currently, _redact_for_logging uses exact field-name matching — this doesn't catch env vars like GITHUB_TOKEN when the sensitive field is "token".

Depends on: #2 (log_utils must land in mcp-coder-utils first) ✅ Done

Referenced by: MarcusJellinghaus/mcp_coder#741 (iCoder /info slash command)

Motivation

  1. mcp_coder needs env var redaction in multiple places: log_llm_request() logs env vars without redaction, the planned /info command (#741) will display env vars, and verify may be extended to dump env values.
  2. mcp-coder-utils log_utils.py itself should offer redaction utilities so consumers don't leak secrets in log output.

Implementation Sketch

1. New module redaction.py

Move existing redaction code from log_utils.py into redaction.py and rename to public API:

  • REDACTED_VALUE
  • RedactableDict
  • _redact_for_logging() → redact_for_logging() (renamed to public)

Add new:

from collections.abc import Mapping

SENSITIVE_KEY_PATTERNS: frozenset[str] = frozenset({
    "token", "secret", "password", "credential", "api_key", "access_key",
})

def redact_env_vars(
    env: Mapping[str, str],
    extra_patterns: frozenset[str] | None = None,
) -> dict[str, str]:
    """Redact env var values whose keys contain sensitive substrings (case-insensitive)."""

2. Update log_utils.py

Import redact_for_logging, RedactableDict from redaction.py for internal use. No re-exports — log_utils.__all__ stays unchanged. log_function_call calls redact_for_logging (new name) internally.

3. Public API

redaction.py.__all__ exports: redact_for_logging, redact_env_vars, SENSITIVE_KEY_PATTERNS, REDACTED_VALUE, RedactableDict.

4. Tests

Move existing redaction tests from tests/test_log_utils_redaction.py to tests/test_redaction.py, update imports to mcp_coder_utils.redaction. Add new tests for redact_env_vars and SENSITIVE_KEY_PATTERNS.

Constraints & Rationale

  • No bare "key" in deny list — substring "key" false-positives on KEYBOARD_LAYOUT, HKEY_*, REGISTRY_KEY. Use "api_key" and "access_key" instead.
  • extra_patterns parameter — callers can add project-specific patterns without modifying the shared constant.
  • Separate module — keeps log_utils.py focused on logging; redaction.py owns all redaction logic.
  • Move, not copy: redact_for_logging, REDACTED_VALUE, and RedactableDict move to redaction.py; log_utils.py imports from it. This avoids duplication.
  • log_utils.__all__ never exported redaction symbols — removing them from log_utils.py is not a public API break. Only internal and test imports need updating.
  • Existing tests import from log_utils: test_log_utils_redaction.py imports _redact_for_logging from log_utils. These imports must be updated to the new module and function name.
  • Two matching strategies coexist: redact_for_logging uses exact key match (for structured data with known field names); redact_env_vars uses case-insensitive substring match (for env var keys where the full key name varies). These serve different use cases and are not interchangeable.
  • No downstream imports exist — verified that none of the 4 consumer repos (p_mcp_coder, p_workspace, p_config, p_tools) import redaction symbols from mcp_coder_utils.log_utils. Clean break is safe.
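
The difference between the two matching strategies, and the reason for excluding a bare "key" pattern, can be illustrated with a small sketch. The helpers exact_match and substring_match are hypothetical names used only for this illustration; they are not part of the proposed API and do not reproduce the actual redact_for_logging implementation.

```python
def exact_match(key: str, fields: set[str]) -> bool:
    """Exact field-name match, in the style described for redact_for_logging."""
    return key in fields


def substring_match(key: str, patterns: set[str]) -> bool:
    """Case-insensitive substring match, in the style proposed for redact_env_vars."""
    lowered = key.lower()
    return any(pattern in lowered for pattern in patterns)


# The deny list chosen in this issue, and a variant that adds a bare "key".
chosen = {"token", "secret", "password", "credential", "api_key", "access_key"}
with_bare_key = chosen | {"key"}

# Exact matching misses GITHUB_TOKEN; substring matching catches it.
missed = exact_match("GITHUB_TOKEN", {"token"})
caught = substring_match("GITHUB_TOKEN", chosen)

# A bare "key" pattern flags harmless variables; the chosen list does not.
false_positive = substring_match("KEYBOARD_LAYOUT", with_bare_key)
no_false_positive = substring_match("KEYBOARD_LAYOUT", chosen)

print(missed, caught, false_positive, no_false_positive)
# prints: False True True False
```

The trade-off is that a variable such as SSH_KEY matches none of the chosen patterns either, so callers who need it covered would pass an additional pattern through extra_patterns.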

Decisions

#  Topic                Decision
1  Location             New redaction.py module (not in log_utils.py)
2  Consolidation        Move REDACTED_VALUE, RedactableDict, _redact_for_logging from log_utils.py to redaction.py
3  Deny list            {"token", "secret", "password", "credential", "api_key", "access_key"} (no bare "key")
4  Customizability      Optional extra_patterns: frozenset[str] | None parameter
5  Usage pattern        Standalone public function; callers apply before logging (e.g. log_llm_request)
6  Consumer rule        ≥2 consumers: mcp-coder-utils (internal) + mcp_coder (multiple use sites)
7  Rename to public     _redact_for_logging → redact_for_logging (drop underscore, public API)
8  Full __all__ export  redact_for_logging, redact_env_vars, SENSITIVE_KEY_PATTERNS, REDACTED_VALUE, RedactableDict
9  No re-exports        log_utils.py does not re-export redaction symbols; clean break confirmed safe

Metadata

Labels

status-10:pr-created: Pull request created, awaiting approval/merge
