fix: include max_file_size in S3 cache key to prevent stale digests #570
Open
Louiszk wants to merge 1 commit into coderamp-labs:main from
Conversation
This PR fixes a caching bug where S3 cache keys did not account for
`max_file_size` (see #568). Previously, if a repository was requested with the default 50KB limit, and then requested again with a 100MB limit, the server would return the cached 50KB version, because the cache key only relied on include/exclude patterns and the commit hash.
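The fix folds the size limit into the hashed key material. A minimal sketch of the idea, assuming a SHA-256-based key; the signature and key layout here are illustrative, not the repo's actual code:

```python
import hashlib

def generate_s3_file_path(include_patterns: str, exclude_patterns: str,
                          commit: str, max_file_size: int) -> str:
    """Build the S3 object key for a cached digest (illustrative sketch).

    max_file_size is now part of the hashed string, so digests produced
    under different size limits land on different cache entries.
    """
    key_material = f"{include_patterns}:{exclude_patterns}:{commit}:{max_file_size}"
    key_hash = hashlib.sha256(key_material.encode()).hexdigest()
    return f"ingest/{commit}/{key_hash}.txt"
```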
Changes:

- Updated `generate_s3_file_path` in `src/server/s3_utils.py` to accept `max_file_size` and append it to the hashing string (see the sketch above).
- Passed `query.max_file_size` into both calls to `generate_s3_file_path` inside `src/server/query_processor.py`, as shown below.
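The call-site change is then a straight pass-through. Sketch only; the `query` attribute names other than `max_file_size` are hypothetical:

```python
# src/server/query_processor.py (sketch; other arguments elided)
s3_file_path = generate_s3_file_path(
    include_patterns=query.include_patterns,  # hypothetical attribute name
    exclude_patterns=query.exclude_patterns,  # hypothetical attribute name
    commit=query.commit,                      # hypothetical attribute name
    max_file_size=query.max_file_size,        # the new argument from this PR
)
```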
While looking into this, I realized this strict hashing approach might cause unnecessary cache misses. For example, if a user requests a 500KB limit and then a 2MB limit on a repo whose largest file is only 100KB, the current fix will treat them as different keys and trigger a re-clone. Ideally, the cache would stay the same in this instance. In the future, it might be worth adding `largest_file_encountered` to the `S3Metadata` JSON and updating the lookup logic to allow compatible cache hits. For now, adding the size to the hash key is a fast and reliable way to prevent the UI from serving incorrectly limited files.
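For illustration, a rough sketch of what that compatible-hit check could look like once `largest_file_encountered` is recorded; the function and field names are hypothetical:

```python
def is_compatible_cache_hit(cached_limit: int, requested_limit: int,
                            largest_file_encountered: int) -> bool:
    """Decide whether a digest cached under one size limit can serve a
    request made with a different limit (hypothetical helper)."""
    # Identical limits trivially produce identical digests.
    if cached_limit == requested_limit:
        return True
    # If every file in the repo is smaller than both limits, neither run
    # excluded anything, so the two digests are byte-identical as well.
    return largest_file_encountered <= min(cached_limit, requested_limit)
```

Under this check, the 500KB-then-2MB example above becomes a cache hit, since 100KB is below both limits.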