
[Bug] sanitize_logs() is never called - user-pasted secrets in build logs go raw into the LLM prompt #265

@sharma-sugurthi

Description


Jenkins and plugins versions report

N/A; this is a Python backend bug in chatbot-core/api/prompts/prompt_builder.py and chatbot-core/api/services/chat_service.py, not a Jenkins plugin runtime issue.

What Operating System are you using (both controller, and any agents involved in the problem)?

Linux (Ubuntu 22.04)

Reproduction steps

1. Start a chat session and paste a Jenkins build log containing secrets, e.g.: "Build failed. password=MyS3cretPass api_key=AKIAIOSFODNN7EXAMPLE"
2. The log text enters the pipeline via the log_context parameter.
3. In build_prompt() (prompt_builder.py, line 49), the raw log_context is injected directly into the LLM prompt without any sanitization.
4. The unsanitized text (including passwords, API keys, and tokens) is sent to the LLM.
5. sanitize_logs() in api/tools/sanitizer.py exists and handles exactly this case, redacting passwords, AWS keys, Bearer tokens, GitHub tokens, private keys, and Docker login credentials, but it is never imported or called anywhere in the production code.
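The vulnerable flow in the steps above can be sketched as follows; the function body and prompt wording here are illustrative, not the actual prompt_builder.py code:

```python
def build_prompt(question: str, log_context: str) -> str:
    # No sanitization step: whatever the user pasted is interpolated
    # verbatim, so any secrets in the log reach the LLM prompt.
    return (
        "You are a Jenkins build assistant.\n"
        f"Build log:\n{log_context}\n"
        f"Question: {question}\n"
    )

prompt = build_prompt(
    "Why did the build fail?",
    "Build failed. password=MyS3cretPass api_key=AKIAIOSFODNN7EXAMPLE",
)
# The raw secret values appear verbatim in the prompt text.
assert "MyS3cretPass" in prompt
```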

Expected Results

build_prompt() should call sanitize_logs(log_context) before injecting user-provided log data into the prompt. The sanitizer module already exists with comprehensive regex patterns for common secret types; it just needs to be wired into the pipeline.

Actual Results

sanitize_logs() is dead code, imported only in test_sanitizer.py. Raw user-pasted logs containing secrets go directly into the LLM prompt unsanitized.

Anything else?

The sanitizer module (api/tools/sanitizer.py) already handles:

  • password=/passwd=/pwd=/secret=/access_token=/api_key=/client_secret= patterns
  • AWS Access Key IDs (AKI...)
  • Bearer tokens
  • GitHub tokens (ghp_...)
  • Private key blocks (-----BEGIN ... PRIVATE KEY-----)
  • Docker login -p flags
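A sketch of what that pattern set might look like; these regexes are illustrative reconstructions of the categories listed above, not the actual contents of api/tools/sanitizer.py:

```python
import re

# Each entry pairs a compiled pattern with its replacement string.
# Hypothetical reconstruction; the real sanitizer may differ in detail.
PATTERNS = [
    # key=value style credentials (keep the key name, redact the value)
    (re.compile(r"(?i)\b(password|passwd|pwd|secret|access_token|api_key|client_secret)=\S+"),
     r"\1=[REDACTED]"),
    # AWS Access Key IDs
    (re.compile(r"\bAKIA[0-9A-Z]{16}\b"), "[REDACTED]"),
    # Bearer tokens
    (re.compile(r"(?i)\bBearer\s+[A-Za-z0-9._\-]+"), "Bearer [REDACTED]"),
    # GitHub personal access tokens
    (re.compile(r"\bghp_[A-Za-z0-9]{36}\b"), "[REDACTED]"),
    # Private key blocks
    (re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
                re.S), "[REDACTED PRIVATE KEY]"),
    # docker login -p flags
    (re.compile(r"(docker\s+login\b[^\n]*?)-p\s+\S+"), r"\1-p [REDACTED]"),
]

def sanitize_logs(text: str) -> str:
    # Apply every redaction pattern in turn.
    for pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text
```

For example, sanitize_logs("password=MyS3cretPass") returns "password=[REDACTED]".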

The fix is a two-line change in prompt_builder.py:

  1. Import: from api.tools.sanitizer import sanitize_logs
  2. Sanitize: log_context = sanitize_logs(log_context) before injecting into prompt
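A self-contained sketch of that wiring; the inline sanitize_logs stand-in and the prompt template are illustrative only, since in production the function would be imported from api.tools.sanitizer:

```python
import re

# In production this would be step 1 of the fix:
#     from api.tools.sanitizer import sanitize_logs
# A one-pattern stand-in keeps this sketch runnable on its own.
def sanitize_logs(text: str) -> str:
    return re.sub(r"(?i)\b(password|api_key)=\S+", r"\1=[REDACTED]", text)

def build_prompt(question: str, log_context: str) -> str:
    # Step 2 of the fix: redact secrets before the user-pasted log
    # text is interpolated into the LLM prompt.
    log_context = sanitize_logs(log_context)
    return (
        "You are a Jenkins build assistant.\n"
        f"Build log:\n{log_context}\n"
        f"Question: {question}\n"
    )

prompt = build_prompt(
    "Why did the build fail?",
    "Build failed. password=MyS3cretPass api_key=AKIAIOSFODNN7EXAMPLE",
)
# The secret values no longer reach the prompt.
assert "MyS3cretPass" not in prompt
```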

Note: Flamki's merged PR #198 addressed redacting chat payloads from the Python service LOGS (logger output).

This issue is different: it's about sanitizing user-provided build logs before they enter the LLM PROMPT.

Are you interested in contributing a fix?

Yes, and I'm also happy to guide another contributor through this if anyone wants.
