feat: Add 'Download All' ZIP archive button to job downloads page#7015
feat: Add 'Download All' ZIP archive button to job downloads page#7015okurz wants to merge 8 commits intoos-autoinst:masterfrom
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #7015 +/- ##
========================================
Coverage 99.84% 99.84%
========================================
Files 416 419 +3
Lines 43341 43674 +333
========================================
+ Hits 43273 43606 +333
Misses 68 68 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Martchus
left a comment
There was a problem hiding this comment.
I'm stopping the review with the inline comments I have already created because this is not really ready for review. This could be used as a starting point, though. I would nevertheless be careful as the error handling and locking seems quite insufficient.
lib/OpenQA/Archive.pm
Outdated
| } | ||
| } | ||
| cleanup_cache(); | ||
| my $status = $zip->writeToFileNamed($archive_path->to_string); |
There was a problem hiding this comment.
Looks like images are missing.
There was a problem hiding this comment.
Do you mean screenshots? I wonder if they are even included in openqa-cli archive
There was a problem hiding this comment.
So I compared operations of openqa-cli archive and the new download button and I don't see what should be missing. Please see attached an example download archive of the test fixtures job 99938
I see everything relevant included, that is all testresults for every step with all screenshots and thumbnails, autoinst-log.txt, video.ogv, vars.json and ulogs with y2logs.tar.bz2, serial0.txt and report.html
| $zip->addTree($disk_file, $zip_path . '/'); | ||
| } | ||
| else { | ||
| $zip->addFile($disk_file, $zip_path); |
There was a problem hiding this comment.
Is this relying on in-memory buffers? That would be terrible considering we deal with GiB-sized files here. So it would make sense to test with big files and check that the memory usage doesn't explode.
That doesn't sound very nice. I considered it ready for review as I implemented everything I considered necessary including UI integration and multiple levels of tests. That you find some good and valid points during review is appreciated. But what would have been the better alternative from your point of view than me creating the PR? |
2dec6a0 to
6c13ec8
Compare
| % my $assets = $job->jobs_assets; | ||
| % if (@$resultfiles || @$ulogs || $assets->count) { | ||
| <div class="mb-3"> | ||
| %= link_to url_for('test_archive', testid => $job->id) => (class => 'btn btn-primary', title => 'Download all test results and assets as a ZIP archive') => begin |
There was a problem hiding this comment.
Is the download link accessible to everyone? Then every person and every crawler can "click" that link and cause a large file to be created on the webui host. Clicking multiple links can create a lot of minion jobs and use a lot of disk space.
For the usual downloads served by nginx that is not a problem. The files are already there, and nginz handles everything.
There was a problem hiding this comment.
So what do you suggest, only allow for authenticated users?
lib/OpenQA/Archive.pm
Outdated
| my $lock_path = $cache_dir->child("$archive_name.lock"); | ||
| open(my $lock_fh, '>', $lock_path->to_string) | ||
| or die "Could not open lock file $lock_path: $!"; # uncoverable statement | ||
| flock($lock_fh, LOCK_EX) or die "Could not lock $lock_path: $!"; # uncoverable statement |
There was a problem hiding this comment.
| flock($lock_fh, LOCK_EX) or die "Could not lock $lock_path: $!"; # uncoverable statement | |
| flock $lock_fh, LOCK_EX or die "Could not lock $lock_path: $!"; # uncoverable statement |
also I think this is not uncoverable
There was a problem hiding this comment.
Sure , I will see what codecov says about that :)
012d000 to
6e08e24
Compare
|
Updated with all minor comments by @perlpunk . Pushed an update to check complete coverage analysis by codecov. Still the "authorization" issue and more scaling and performance tests are pending |
6e08e24 to
74cd320
Compare
|
Regarding missing OBS checks: I can't get them redelivered, but the OBS project is there, so we can look at it manually before merging: https://build.opensuse.org/package/show/devel:openQA:GitHub:os-autoinst:openQA:PR-7015/openQA The initial PR creation is too old to redeliver the webhook. |
|
Of course it would be pretty cool to use something way more efficient like https://github.com/evanmiller/mod_zip that also doesn't involve any additional complexity to lock and clean temporary files. |
I actually suggested that to @okurz in Slack, but it doesn't seem to be packaged in openSUSE, and a webserver independent version is required as well. |
I don't think this feature is required at all. It is nice to have at best. And as such we can probably live with it not being available to all users. I think this module could be easily packaged. Other NGINX modules have already been packaged¹. The only annoyance is that these modules always need to be recompiled when NGINX is updated. However, OBS will take care of that for us. ¹ So one just had to take e.g. https://build.opensuse.org/package/show/openSUSE:Factory/nginx-module-brotli and replace the Git repository and maybe a few parameters and dependencies. |
5e1faef to
1dc6775
Compare
1dc6775 to
1d920ec
Compare
246733e to
a803cca
Compare
Fixes os-autoinst#7010. Motivation: Provide a user-friendly way to download all logs, results, and assets at once directly from the UI, matching the functionality of 'openqa-cli archive'. Design choices: - Use ZIP format for broad cross-platform support (industry standard). - Implement 'OpenQA::Archive' utility to keep archival logic separated from controllers and models. - Implement on-demand cleanup in the archive cache directory ('/var/lib/openqa/cache/archives') to maintain storage space without requiring a periodic background task. - Reuse cached archives if they already exist to improve performance for subsequent requests. - Integrate into the 'Downloads' tab for better discoverability. User Benefit: Users can quickly and easily download all artifacts associated with a job in a single ZIP file, saving time and effort compared to downloading them individually or using the CLI.
Motivation: Serving large static files directly through Mojolicious blocks worker processes, reducing overall application throughput and responsiveness. Design choices: - Utilize the existing Apache/Nginx reverse proxy configuration to serve files from '/var/lib/openqa/cache/archives'. - Add an Alias/Location mapping for the '/archives' path in the web server configuration. - Implement a redirect from the WebAPI route to the static file URL. User Benefit: Improved responsiveness of the openQA Web UI, as large ZIP archive downloads are handled efficiently by the web server (Nginx/Apache) without blocking application workers.
Motivation: Creating and downloading an empty ZIP archive is confusing and provides no value to the user. Design choices: - Check if the job has any result files, uploaded logs, or assets before rendering the 'Download All' button. - Utilize the existing template variables (`$resultfiles`, `$ulogs`, `$assets`) to make this determination efficiently. User Benefit: A cleaner UI that does not offer a download option when there is nothing to download, preventing confusion and saving clicks.
Motivation: Generating large ZIP archives is a CPU and I/O intensive operation. Doing this synchronously in the WebAPI controller blocks the worker and can lead to timeouts for the user. Design choices: - Move the core archival logic into a new Minion task: `OpenQA::Task::Job::CreateZipArchive`. - Enqueue the task when the user requests an archive that does not yet exist. - Introduce an intermediate "Preparing Archive" polling page that checks the status of the generation and redirects once complete. - Provide a fallback to synchronous generation if Minion is unavailable (e.g., in certain test or development environments). User Benefit: Prevents browser timeouts and Web UI unresponsiveness when generating large archives. Users are provided with clear feedback that the archive is being prepared in the background.
Motivation: Maintain code cleanliness and accurate test coverage metrics by removing pragmas that are no longer accurate or necessary. Design choices: - Remove `# uncoverable statement` comments from `die` calls related to file locking and renaming, as these paths should ideally be covered or acknowledged as standard error handling. User Benefit: Improved code quality and more reliable test coverage reporting for developers.
Motivation: The server-side archival process consumes significant CPU, I/O, and disk space. If the route is publicly accessible, it becomes a vector for Denial of Service (DoS) attacks, either malicious or accidental (e.g., from web crawlers). Design choices: - Move the `/tests/:testid/archive` route under the `session#ensure_user` bridge in `lib/OpenQA/WebAPI.pm`. - This ensures only logged-in users can trigger the resource-intensive archive generation process. User Benefit: Increased stability and security of the openQA instance by preventing unauthorized resource exhaustion.
Motivation: Prevent search engine crawlers from following the download link, which could otherwise lead to unnecessary Minion task generation and disk space consumption. Design choices: - Add the `rel="nofollow"` attribute to the anchor tag in the downloads template. - This acts as an additional layer of defense alongside the authentication requirement and `robots.txt`. User Benefit: Protects server resources from being consumed by automated crawlers, ensuring the system remains responsive for human users.
Motivation: Even with authentication, multiple users (or scripts) requesting archives simultaneously could saturate the server's I/O and CPU, impacting other critical openQA operations. Additionally, old, unstarted requests should not linger indefinitely. Design choices: - Use a Minion guard (`create_zip_archive_task`) to limit the global concurrency of archive generation to a maximum of 2. - Set a low priority (`-10`) for the task so it does not delay critical operations like scheduling or result processing. - Set a Time-To-Live (`expire => 3600`) of 1 hour to automatically discard stale requests that haven't been picked up by a worker. - Keep the task in the `default` queue to ensure it is processed by standard workers without requiring custom configuration. - Consolidate archive redirect logic in the File controller to remove code duplication. User Benefit: Ensures the overall openQA system remains stable and responsive even when heavy archival tasks are requested. Prevents a backlog of obsolete background tasks.
a803cca to
00837b8
Compare
Fixes #7010.
Motivation:
Provide a user-friendly way to download all logs, results, and assets
at once directly from the UI, matching the functionality of
'openqa-cli archive'.
Architecture choices:
from controllers and models.
('/var/lib/openqa/cache/archives') to maintain storage space without
requiring a periodic background task.
for subsequent requests.