Skip to content

S3: delete directories in batches#12910

Open
stsewd wants to merge 7 commits intomainfrom
delete-s3-directories-in-batches
Open

S3: delete directories in batches#12910
stsewd wants to merge 7 commits intomainfrom
delete-s3-directories-in-batches

Conversation

@stsewd
Copy link
Copy Markdown
Member

@stsewd stsewd commented Apr 1, 2026

We were iterating over each file when deleting a directory. boto3 has support for batch operations, it does one call for every 1K files, this get us faster deletions and fewer API calls.

@stsewd
Copy link
Copy Markdown
Member Author

stsewd commented Apr 1, 2026

@copilot write test for the S3BuildMediaStorage class for its delete_directory method, you can mock the call to .delete().

Copy link
Copy Markdown
Contributor

Copilot AI commented Apr 1, 2026

@copilot write test for the S3BuildMediaStorage class for its delete_directory method, you can mock the call to .delete().

Added tests for S3BuildMediaStorage.delete_directory in readthedocs/storage/tests/test_s3_storage.py (commit 411d03e). The tests mock the bucket via _bucket (the internal attribute used by the bucket property) and cover:

  • delete_directory calls bucket.objects.filter(Prefix=path).delete() with the correct path
  • A trailing slash is appended when the path doesn't have one
  • SuspiciousFileOperation is raised for root path "/" and empty path ""

@stsewd stsewd marked this pull request as ready for review April 2, 2026 17:14
@stsewd stsewd requested a review from a team as a code owner April 2, 2026 17:14
@stsewd stsewd requested a review from humitos April 2, 2026 17:14
@stsewd stsewd moved this to Needs review in 📍Roadmap Apr 2, 2026
@stsewd stsewd requested a review from Copilot April 7, 2026 19:35
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improve S3 directory deletion performance by leveraging boto3’s batch delete operations over a prefix, reducing per-object API calls.

Changes:

  • Replace recursive listdir + per-file deletion with bucket.objects.filter(Prefix=...).delete().
  • Normalize delete paths using clean_name + _normalize_name, and enforce trailing slashes.
  • Add unit tests covering prefix deletion, trailing-slash behavior, and suspicious root/empty path inputs.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
readthedocs/storage/s3_storage.py Switches delete_directory to a single prefix-based batch delete and adds path normalization.
readthedocs/storage/tests/test_s3_storage.py Adds tests for S3 delete_directory behavior using a mocked bucket.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an obvious win. I'm pretty surprised django-storages doesn't expose this directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Needs review

Development

Successfully merging this pull request may close these issues.

4 participants