feat: support download file #1424

Merged: earayu merged 1 commit into main from feature/support_download on Jan 13, 2026


@earayu (Collaborator) commented Jan 13, 2026

Note

Introduces a secure, streaming download API for original document files.

  • Adds GET /collections/{collection_id}/documents/{document_id}/download with access control, status checks (blocks EXPIRED/DELETED), and proper headers (Content-Type, Content-Disposition, Content-Length) via StreamingResponse (document_service.download_document, views/collections.py)
  • Updates OpenAPI specs (aperag/api/openapi.yaml, paths/collections.yaml, web/src/api/openapi.merged.yaml) and generates web SDK DocumentsApi (web/src/api/apis/documents-api.ts, exported in web/src/api/api.ts)
  • Adds e2e tests for success, nonexistent/deleted, content-type, and unauthorized scenarios (tests/e2e_test/test_document_download.py)
  • Updates design docs to reflect new route and lifecycle/cleanup behavior (docs/design/document_export_design_zh.md)
  • Regenerates view models timestamp (aperag/schema/view_models.py)
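The status check and chunked streaming described in the first bullet can be illustrated with a minimal, framework-free sketch. The helper names (`check_downloadable`, `iter_chunks`), the chunk size, and the exception type are assumptions made for illustration; the PR's actual `document_service.download_document` is not shown here and may differ.

```python
import io
from typing import BinaryIO, Iterator

# Statuses that the PR summary says must block a download.
BLOCKED_STATUSES = {"EXPIRED", "DELETED"}


def check_downloadable(status: str) -> None:
    """Hypothetical guard: reject documents whose status blocks download.

    The real endpoint presumably maps this to an HTTP error; the exact
    status code is not stated in the PR summary.
    """
    if status in BLOCKED_STATUSES:
        raise PermissionError(f"document with status {status} cannot be downloaded")


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the stream in fixed-size chunks so the file is never fully in memory."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


if __name__ == "__main__":
    check_downloadable("ACTIVE")  # "ACTIVE" is an arbitrary non-blocked value
    for part in iter_chunks(io.BytesIO(b"example file body"), chunk_size=8):
        print(part)
```

A generator like `iter_chunks` is the kind of iterable a streaming response object consumes, which is why the download can be served without buffering the whole file.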

Written by Cursor Bugbot for commit 1a9e14e. This will update automatically on new commits.

@earayu merged commit 6c53621 into main on Jan 13, 2026
8 of 9 checks passed
@earayu deleted the feature/support_download branch on January 13, 2026, 12:58
@apecloud-bot added the size/XL label (denotes a PR that changes 500-999 lines) on Jan 13, 2026

@cursor (bot) left a comment


This PR is being reviewed by Cursor Bugbot



except Exception as e:
    logger.error(f"Failed to download document {document_id} from path {object_path}: {e}", exc_info=True)
    raise HTTPException(status_code=500, detail="Failed to download document from storage")

HTTPException caught and converted to 500 error

Medium Severity

The except Exception as e block at line 1074 catches all exceptions including the HTTPException(status_code=404) raised at line 1051 when the file is not found in the object store. This causes the 404 error to be swallowed and re-raised as a generic 500 error, returning incorrect status codes to API clients. The exception handler needs to re-raise HTTPException instances instead of converting them.
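One possible shape for the fix is to add an `except HTTPException: raise` clause ahead of the generic handler. The sketch below assumes a stand-in `HTTPException` class (so it runs without FastAPI installed) and a hypothetical `fetch_from_store` helper backed by a plain dict; neither is taken from the PR's code.

```python
import logging

logger = logging.getLogger(__name__)


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException so the sketch runs without FastAPI."""

    def __init__(self, status_code: int, detail: str = "") -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def fetch_from_store(store: dict, object_path: str) -> bytes:
    """Hypothetical storage lookup: 404 if missing, RuntimeError on backend failure."""
    if object_path not in store:
        raise HTTPException(status_code=404, detail="Document file not found")
    data = store[object_path]
    if data is None:  # simulate an unexpected storage failure
        raise RuntimeError("storage backend failure")
    return data


def download(store: dict, document_id: str, object_path: str) -> bytes:
    try:
        return fetch_from_store(store, object_path)
    except HTTPException:
        # Deliberate HTTP errors (such as the 404 above) pass through unchanged.
        raise
    except Exception as e:
        # Only unexpected failures are logged and converted to a 500.
        logger.error(
            f"Failed to download document {document_id} from path {object_path}: {e}",
            exc_info=True,
        )
        raise HTTPException(
            status_code=500, detail="Failed to download document from storage"
        ) from e
```

Because `except` clauses are tried in order, the narrower `HTTPException` clause has to come before `except Exception`; otherwise the generic handler still wins.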


# Set headers for file download
headers = {
    "Content-Type": content_type,
    "Content-Disposition": f'attachment; filename="{document.name}"',

Unescaped filename in Content-Disposition header breaks downloads

Medium Severity

The Content-Disposition header directly interpolates document.name without escaping special characters. If a filename contains double quotes (e.g., report "final".pdf), the header becomes malformed (filename="report "final".pdf"), causing browsers to misinterpret or fail the download. Per RFC 6266, special characters in the quoted-string parameter need to be escaped, or the filename* parameter with URL encoding should be used for non-ASCII filenames.
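A minimal sketch of one way to build a safer header value, using only the standard library. The function name `content_disposition` is hypothetical, and the choice to emit both an escaped ASCII `filename` fallback and a percent-encoded `filename*` is an assumption about how the fix could look, not the PR's actual change.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment Content-Disposition value (hypothetical helper)."""
    # Drop CR/LF so the name cannot break out of the header line.
    cleaned = filename.replace("\r", "").replace("\n", "")

    # ASCII fallback: non-ASCII characters become "?", then backslashes and
    # double quotes are backslash-escaped for the quoted-string form.
    fallback = cleaned.encode("ascii", "replace").decode("ascii")
    fallback = fallback.replace("\\", "\\\\").replace('"', '\\"')

    # Extended form: UTF-8 bytes, percent-encoded, for clients that support it.
    encoded = quote(cleaned, safe="")

    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


if __name__ == "__main__":
    print(content_disposition('report "final".pdf'))
    print(content_disposition("报告.pdf"))
```

Clients that understand `filename*` generally prefer it over `filename`, so the plain parameter only needs to be a readable approximation for older clients.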
