Conversation
There was a problem hiding this comment.
This PR is being reviewed by Cursor Bugbot
Details
You are on the Bugbot Free tier. On this plan, Bugbot will review limited PRs each billing cycle.
To receive Bugbot reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.
|
|
||
| except Exception as e: | ||
| logger.error(f"Failed to download document {document_id} from path {object_path}: {e}", exc_info=True) | ||
| raise HTTPException(status_code=500, detail="Failed to download document from storage") |
There was a problem hiding this comment.
HTTPException caught and converted to 500 error
Medium Severity
The except Exception as e block at line 1074 catches all exceptions including the HTTPException(status_code=404) raised at line 1051 when the file is not found in the object store. This causes the 404 error to be swallowed and re-raised as a generic 500 error, returning incorrect status codes to API clients. The exception handler needs to re-raise HTTPException instances instead of converting them.
Additional Locations (1)
| # Set headers for file download | ||
| headers = { | ||
| "Content-Type": content_type, | ||
| "Content-Disposition": f'attachment; filename="{document.name}"', |
There was a problem hiding this comment.
Unescaped filename in Content-Disposition header breaks downloads
Medium Severity
The Content-Disposition header directly interpolates document.name without escaping special characters. If a filename contains double quotes (e.g., report "final".pdf), the header becomes malformed (filename="report "final".pdf"), causing browsers to misinterpret or fail the download. Per RFC 6266, special characters in the quoted-string parameter need to be escaped, or the filename* parameter with URL encoding should be used for non-ASCII filenames.
Note
Introduces a secure, streaming download API for original document files.
GET /collections/{collection_id}/documents/{document_id}/downloadwith access control, status checks (blocksEXPIRED/DELETED), and proper headers (Content-Type,Content-Disposition,Content-Length) viaStreamingResponse(document_service.download_document,views/collections.py)aperag/api/openapi.yaml,paths/collections.yaml,web/src/api/openapi.merged.yaml) and generates web SDKDocumentsApi(web/src/api/apis/documents-api.ts, exported inweb/src/api/api.ts)tests/e2e_test/test_document_download.py)docs/design/document_export_design_zh.md)aperag/schema/view_models.py)Written by Cursor Bugbot for commit 1a9e14e. This will update automatically on new commits. Configure here.