Privacy and performance audit #322

@ghost

Description

Context: Scanning repository content and GitHub entities raises privacy and scale concerns. The runner must perform safely at org scale and avoid leaking sensitive data or exhausting API/rate limits.

Specific Requirements:

  • Perform an audit and implement the following constraints:
      1) scanning limits (max files, max bytes per file);
      2) rate-limit-aware GitHub calls with exponential backoff and retry;
      3) configurable allowlists/denylists to avoid scanning sensitive paths (e.g., configs/, secrets/);
      4) a redaction policy for notifications and logs, so that matched sensitive tokens are not posted to external webhooks unless explicitly allowed.
  • Produce an audit report (docs/REPO-RUNNER-AUDIT.md) listing expected resource usage, worst-case API calls per repo, and suggested default limits to include in the runner config. Recommend a safe default config for the GitHub Action template.
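
The scanning limits and path allowlist/denylist requirements above could be sketched roughly as follows. This is only an illustration: the `ScanLimits` shape, the function names, and the default values are assumptions and are not taken from the runner; the actual defaults should come out of the audit.

```typescript
// Hypothetical config shape; field names and default values are illustrative only.
interface ScanLimits {
  maxFiles: number;        // stop selecting once this many files are chosen
  maxBytesPerFile: number; // skip files larger than this
  denyPrefixes: string[];  // path prefixes that are never scanned
  allowPrefixes: string[]; // if non-empty, only these prefixes are scanned
}

// Placeholder defaults (assumed, not measured).
const DEFAULT_LIMITS: ScanLimits = {
  maxFiles: 500,
  maxBytesPerFile: 256 * 1024,
  denyPrefixes: ["configs/", "secrets/"],
  allowPrefixes: [],
};

interface RepoFile {
  path: string;
  size: number; // bytes
}

// The denylist always wins; an empty allowlist means "allow everything else".
function isPathAllowed(path: string, limits: ScanLimits): boolean {
  if (limits.denyPrefixes.some((prefix) => path.startsWith(prefix))) return false;
  if (limits.allowPrefixes.length === 0) return true;
  return limits.allowPrefixes.some((prefix) => path.startsWith(prefix));
}

// Applies the size limit and path filters, then caps the number of files.
function selectFilesToScan(
  files: RepoFile[],
  limits: ScanLimits = DEFAULT_LIMITS,
): RepoFile[] {
  const selected: RepoFile[] = [];
  for (const file of files) {
    if (selected.length >= limits.maxFiles) break;
    if (file.size > limits.maxBytesPerFile) continue;
    if (!isPathAllowed(file.path, limits)) continue;
    selected.push(file);
  }
  return selected;
}
```

Letting the denylist take precedence over the allowlist means a misconfigured allowlist cannot accidentally re-enable a sensitive path.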

Implementation Details:

  • Add util/rateLimiter.ts, or integrate with the existing retry helpers in the style of src/util/openai.ts. Add logging hooks that redact matches in action payloads unless an explicit allow flag is set.
  • Tests: add performance and redaction unit tests that simulate many files and matches.
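
A minimal sketch of what a retry helper with exponential backoff might look like. It does not reproduce the existing helpers in the repository. `withRetry`, `RetryOptions`, and the `{ status, retryAfterSeconds }` error shape are all hypothetical names, and the default delays are placeholders. The sketch assumes rate-limited calls fail with status 429, or with 403 plus a retry hint, and treats 5xx responses as transient.

```typescript
// Hypothetical error shape: a failed call carries an HTTP status and,
// optionally, a server-provided retry hint in seconds.
interface HttpErrorLike {
  status: number;
  retryAfterSeconds?: number;
}

interface RetryOptions {
  maxRetries: number;
  baseDelayMs: number;
  maxDelayMs: number;
  sleep: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// The delay doubles on each attempt (0-based) and is capped at maxDelayMs.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

function asHttpError(err: unknown): HttpErrorLike | undefined {
  if (typeof err === "object" && err !== null &&
      typeof (err as { status?: unknown }).status === "number") {
    return err as HttpErrorLike;
  }
  return undefined;
}

// Assumption: 429 and 5xx are retryable; 403 only when a retry hint is present.
function isRetryable(err: HttpErrorLike): boolean {
  if (err.status === 429 || err.status >= 500) return true;
  return err.status === 403 && err.retryAfterSeconds !== undefined;
}

async function withRetry<T>(
  call: () => Promise<T>,
  options: Partial<RetryOptions> = {},
): Promise<T> {
  const opts: RetryOptions = {
    maxRetries: 5,
    baseDelayMs: 1000,
    maxDelayMs: 60000,
    sleep: defaultSleep,
    ...options,
  };
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      const httpErr = asHttpError(err);
      if (!httpErr || !isRetryable(httpErr) || attempt >= opts.maxRetries) throw err;
      // Wait at least as long as the server asked, if it gave a hint.
      const hintedMs = (httpErr.retryAfterSeconds ?? 0) * 1000;
      const delayMs = Math.max(hintedMs, backoffDelayMs(attempt, opts.baseDelayMs, opts.maxDelayMs));
      await opts.sleep(delayMs);
    }
  }
}
```

Injecting `sleep` keeps the rate-limit tests fast: a test can record the requested delays instead of waiting for them. Random jitter is omitted here for brevity and would likely be worth adding when many repositories are scanned in parallel.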

Expected Behavior:

  • The runner operates within GitHub API limits in default mode, respects redaction settings, and provides guidance in the audit doc. Success criteria: audit published, redaction implemented, and rate-limit handling covered by tests.
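
To make the redaction behavior concrete, here is one possible shape for the hook, offered as a sketch under stated assumptions. The `Match` and `RedactionPolicy` types, the `allowRawMatches` flag name, and the `[REDACTED]` placeholder are hypothetical; the issue only requires that matched tokens stay out of external payloads unless an explicit allow flag is set.

```typescript
// Hypothetical types; the runner's real match and payload shapes may differ.
interface Match {
  file: string;
  line: number;
  text: string; // the matched (possibly sensitive) token
}

interface RedactionPolicy {
  allowRawMatches: boolean; // the explicit opt-in; false by default
}

const DEFAULT_POLICY: RedactionPolicy = { allowRawMatches: false };
const REDACTED = "[REDACTED]";

// Returns a copy with the matched text replaced unless raw output is allowed.
function redactMatch(match: Match, policy: RedactionPolicy = DEFAULT_POLICY): Match {
  if (policy.allowRawMatches) return match;
  return { ...match, text: REDACTED };
}

interface WebhookPayload {
  matchCount: number;
  matches: Match[];
}

// Builds the object that would be logged or posted to a webhook.
function buildWebhookPayload(
  matches: Match[],
  policy: RedactionPolicy = DEFAULT_POLICY,
): WebhookPayload {
  return {
    matchCount: matches.length,
    matches: matches.map((m) => redactMatch(m, policy)),
  };
}
```

File path and line number are kept so the notification stays actionable; whether those fields also count as sensitive is a question for the audit.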

Details

  • Priority: high
  • Category: investigation
  • Source: Review

Suggestions

  • Produce an audit report with safe defaults and API usage estimates
  • Implement scanning limits, redaction policy, and rate-limit-aware GitHub helpers

This issue was automatically created from a review session.
