feat: support download file by earayu · Pull Request #1424 · apecloud/ApeRAG

earayu · 2026-01-13T12:58:48Z

Note

Introduces a secure, streaming download API for original document files.

Adds GET /collections/{collection_id}/documents/{document_id}/download with access control, status checks (blocks EXPIRED/DELETED), and proper headers (Content-Type, Content-Disposition, Content-Length) via StreamingResponse (document_service.download_document, views/collections.py)
Updates OpenAPI specs (aperag/api/openapi.yaml, paths/collections.yaml, web/src/api/openapi.merged.yaml) and generates web SDK DocumentsApi (web/src/api/apis/documents-api.ts, exported in web/src/api/api.ts)
Adds e2e tests for success, nonexistent/deleted, content-type, and unauthorized scenarios (tests/e2e_test/test_document_download.py)
Updates design docs to reflect new route and lifecycle/cleanup behavior (docs/design/document_export_design_zh.md)
Regenerates view models timestamp (aperag/schema/view_models.py)
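
As a rough illustration of the status check and streaming behavior summarized above, the following is a minimal sketch and not the PR's actual code. The names `BLOCKED_STATUSES`, `check_downloadable`, and `iter_chunks` are hypothetical, `PermissionError` stands in for whatever HTTP error the service raises, and the 64 KiB chunk size is an arbitrary assumption.

```python
import io
from typing import BinaryIO, Iterator

# Statuses the summary says must not be downloadable.
BLOCKED_STATUSES = {"EXPIRED", "DELETED"}


def check_downloadable(status: str) -> None:
    """Raise if the document is in a status that blocks downloads."""
    if status in BLOCKED_STATUSES:
        # Stand-in for an HTTP error response in the real service.
        raise PermissionError(f"document status {status} does not allow download")


def iter_chunks(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the file in fixed-size chunks instead of reading it all at once."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


if __name__ == "__main__":
    check_downloadable("COMPLETE")  # any status outside BLOCKED_STATUSES passes
    for part in iter_chunks(io.BytesIO(b"abcdefg"), chunk_size=3):
        print(part)
```

A generator of this shape is the kind of iterable a `StreamingResponse` can consume, which keeps memory use bounded by the chunk size rather than the file size.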

Written by Cursor Bugbot for commit 1a9e14e. This will update automatically on new commits.

cursor · 2026-01-13T13:04:22Z

aperag/service/document_service.py

+
+            except Exception as e:
+                logger.error(f"Failed to download document {document_id} from path {object_path}: {e}", exc_info=True)
+                raise HTTPException(status_code=500, detail="Failed to download document from storage")


HTTPException caught and converted to 500 error

Medium Severity

The except Exception as e block at line 1074 catches all exceptions including the HTTPException(status_code=404) raised at line 1051 when the file is not found in the object store. This causes the 404 error to be swallowed and re-raised as a generic 500 error, returning incorrect status codes to API clients. The exception handler needs to re-raise HTTPException instances instead of converting them.

Additional Locations (1)

aperag/service/document_service.py#L1049-L1051
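
One way to address this finding is to re-raise `HTTPException` before the generic handler runs. The sketch below is an assumption about the shape of such a fix, not the repository's code: `fetch_object` and the dict-based store are hypothetical, and the small `HTTPException` class is a stand-in for `fastapi.HTTPException` so the example runs without FastAPI installed.

```python
import logging

logger = logging.getLogger(__name__)


class HTTPException(Exception):
    """Stand-in for fastapi.HTTPException (assumption, for a runnable sketch)."""

    def __init__(self, status_code: int, detail: str = "") -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def fetch_object(store: dict, object_path: str) -> bytes:
    """Hypothetical helper: read a file from an in-memory 'object store'."""
    try:
        data = store.get(object_path)
        if data is None:
            # Deliberate, client-facing error: the file is missing.
            raise HTTPException(status_code=404, detail="Document file not found")
        return bytes(data)
    except HTTPException:
        # Pass intentional HTTP errors through with their original status code.
        raise
    except Exception as e:
        # Only unexpected failures are converted to a generic 500.
        logger.error("Failed to download from path %s: %s", object_path, e, exc_info=True)
        raise HTTPException(
            status_code=500, detail="Failed to download document from storage"
        ) from e
```

With the dedicated `except HTTPException: raise` clause placed first, a missing file surfaces as 404 while storage or conversion failures still map to 500.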

cursor · 2026-01-13T13:04:22Z

aperag/service/document_service.py

+                # Set headers for file download
+                headers = {
+                    "Content-Type": content_type,
+                    "Content-Disposition": f'attachment; filename="{document.name}"',


Unescaped filename in Content-Disposition header breaks downloads

Medium Severity

The Content-Disposition header directly interpolates document.name without escaping special characters. If a filename contains double quotes (e.g., report "final".pdf), the header becomes malformed (filename="report "final".pdf"), causing browsers to misinterpret or fail the download. Per RFC 6266, special characters in the quoted-string parameter need to be escaped, or the filename* parameter with URL encoding should be used for non-ASCII filenames.
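
A possible fix is to emit both an escaped ASCII `filename` and a percent-encoded `filename*`. The helper below is a minimal sketch under that assumption; `content_disposition` is a hypothetical name, not a function from the PR, and dropping non-printable characters is an extra precaution added here against header injection.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment Content-Disposition value for an arbitrary filename."""
    # Drop control characters (such as CR/LF) that could break the header.
    cleaned = "".join(ch for ch in filename if ch.isprintable())
    # ASCII fallback for the quoted-string form: replace non-ASCII characters
    # with "?" and escape backslashes and double quotes.
    fallback = cleaned.encode("ascii", "replace").decode("ascii")
    fallback = fallback.replace("\\", "\\\\").replace('"', '\\"')
    # Extended form: UTF-8 bytes, percent-encoded.
    encoded = quote(cleaned, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"


if __name__ == "__main__":
    print(content_disposition('report "final".pdf'))
    # attachment; filename="report \"final\".pdf"; filename*=UTF-8''report%20%22final%22.pdf
```

Clients that understand `filename*` can use the extended parameter, while others fall back to the escaped ASCII `filename`.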

feat: support download file

1a9e14e

earayu merged commit 6c53621 into main Jan 13, 2026
8 of 9 checks passed

earayu deleted the feature/support_download branch January 13, 2026 12:58

apecloud-bot added the size/XL label (denotes a PR that changes 500-999 lines) Jan 13, 2026

cursor bot reviewed Jan 13, 2026

earayu merged 1 commit into main from feature/support_download
