
feat: Add 'Download All' ZIP archive button to job downloads page#7015

Open
okurz wants to merge 8 commits into os-autoinst:master from okurz:feature/gh7010_download_all

Conversation

@okurz (Member) commented Feb 19, 2026

Fixes #7010.

Motivation:
Provide a user-friendly way to download all logs, results, and assets
at once directly from the UI, matching the functionality of
'openqa-cli archive'.

Architecture choices:

  • Use ZIP format for broad cross-platform support (industry standard).
  • Implement 'OpenQA::Archive' utility to keep archival logic separated
    from controllers and models.
  • Implement on-demand cleanup in the archive cache directory
    ('/var/lib/openqa/cache/archives') to maintain storage space without
    requiring a periodic background task.
  • Reuse cached archives if they already exist to improve performance
    for subsequent requests.
  • Integrate into the 'Downloads' tab for better discoverability.
[Screenshot: the new 'Download All' ZIP button in the openQA web UI]
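The cache-reuse and on-demand cleanup described above could be sketched roughly as follows. This is only an illustration of the described design, not the actual `OpenQA::Archive` code; helper names like `archive_for_job` are assumptions, while `cleanup_cache` and `writeToFileNamed` appear verbatim later in this thread.

```perl
# Sketch of the cache-reuse flow (helper names partly assumed)
use Archive::Zip qw(:ERROR_CODES);
use Mojo::File qw(path);

my $cache_dir = path('/var/lib/openqa/cache/archives')->make_path;

sub archive_for_job {
    my ($job) = @_;
    my $archive_path = $cache_dir->child('job_' . $job->id . '.zip');
    # reuse a previously generated archive to speed up subsequent requests
    return $archive_path if -e $archive_path;
    my $zip = Archive::Zip->new;
    # ... add result files, ulogs and assets via addFile/addTree here ...
    cleanup_cache();    # on-demand cleanup instead of a periodic background task
    $zip->writeToFileNamed($archive_path->to_string) == AZ_OK
      or die "Unable to write $archive_path";
    return $archive_path;
}
```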

@codecov (bot) commented Feb 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.84%. Comparing base (2c3afdd) to head (1dc6775).
⚠️ Report is 23 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #7015    +/-   ##
========================================
  Coverage   99.84%   99.84%            
========================================
  Files         416      419     +3     
  Lines       43341    43674   +333     
========================================
+ Hits        43273    43606   +333     
  Misses         68       68            

☔ View full report in Codecov by Sentry.

@Martchus (Contributor) left a comment


I'm stopping the review with the inline comments I have already created because this is not really ready for review. It could be used as a starting point, though. I would nevertheless be careful, as the error handling and locking seem quite insufficient.

}
}
cleanup_cache();
my $status = $zip->writeToFileNamed($archive_path->to_string);
Contributor:

Looks like images are missing.

Member Author:

Do you mean screenshots? I wonder if they are even included in 'openqa-cli archive'.

Member Author:

So I compared the output of 'openqa-cli archive' and the new download button, and I don't see what should be missing. Please find attached an example archive of the test-fixtures job 99938:

job_99938.zip

I see everything relevant included, that is: all test results for every step with all screenshots and thumbnails, autoinst-log.txt, video.ogv, vars.json, and ulogs with y2logs.tar.bz2, serial0.txt and report.html.

$zip->addTree($disk_file, $zip_path . '/');
}
else {
$zip->addFile($disk_file, $zip_path);
Contributor:

Is this relying on in-memory buffers? That would be terrible considering we deal with GiB-sized files here. So it would make sense to test with big files and check that the memory usage doesn't explode.

@okurz (Member Author) commented Feb 19, 2026

because this is not really ready for review.

That doesn't sound very nice. I considered it ready for review, as I had implemented everything I considered necessary, including UI integration and multiple levels of tests. That you found some good and valid points during review is appreciated. But what, from your point of view, would have been a better alternative to me creating the PR?

@okurz force-pushed the feature/gh7010_download_all branch 3 times, most recently from 2dec6a0 to 6c13ec8 on February 20, 2026 at 22:42
% my $assets = $job->jobs_assets;
% if (@$resultfiles || @$ulogs || $assets->count) {
<div class="mb-3">
%= link_to url_for('test_archive', testid => $job->id) => (class => 'btn btn-primary', title => 'Download all test results and assets as a ZIP archive') => begin
Contributor:

Is the download link accessible to everyone? Then every person and every crawler can "click" that link and cause a large file to be created on the web UI host. Clicking multiple links can create a lot of Minion jobs and use a lot of disk space.
For the usual downloads served by nginx that is not a problem: the files are already there, and nginx handles everything.

Member Author:

So what do you suggest, only allowing it for authenticated users?

my $lock_path = $cache_dir->child("$archive_name.lock");
open(my $lock_fh, '>', $lock_path->to_string)
or die "Could not open lock file $lock_path: $!"; # uncoverable statement
flock($lock_fh, LOCK_EX) or die "Could not lock $lock_path: $!"; # uncoverable statement
Contributor:

Suggested change
flock($lock_fh, LOCK_EX) or die "Could not lock $lock_path: $!"; # uncoverable statement
flock $lock_fh, LOCK_EX or die "Could not lock $lock_path: $!"; # uncoverable statement

Also, I think this is not uncoverable.

Member Author:

Sure, I will see what codecov says about that :)

@okurz force-pushed the feature/gh7010_download_all branch 2 times, most recently from 012d000 to 6e08e24 on March 5, 2026 at 10:38
@okurz (Member Author) commented Mar 5, 2026

Updated with all minor comments by @perlpunk addressed. Pushed an update to check the complete coverage analysis by codecov. Still pending: the "authorization" issue and more scaling and performance tests.

@okurz force-pushed the feature/gh7010_download_all branch from 6e08e24 to 74cd320 on March 5, 2026 at 11:31
@perlpunk (Contributor) commented Mar 5, 2026

Regarding missing OBS checks: I can't get them redelivered, but the OBS project is there, so we can look at it manually before merging: https://build.opensuse.org/package/show/devel:openQA:GitHub:os-autoinst:openQA:PR-7015/openQA

The initial PR creation is too old to redeliver the webhook.

@Martchus (Contributor):

Of course it would be pretty cool to use something way more efficient like https://github.com/evanmiller/mod_zip that also doesn't involve any additional complexity to lock and clean temporary files.

@perlpunk (Contributor):

Of course it would be pretty cool to use something way more efficient like https://github.com/evanmiller/mod_zip that also doesn't involve any additional complexity to lock and clean temporary files.

I actually suggested that to @okurz in Slack, but it doesn't seem to be packaged in openSUSE, and a webserver-independent version is required as well.

@Martchus (Contributor) commented Mar 10, 2026

… and a webserver independent version is required as well.

I don't think this feature is required at all. It is nice to have at best. And as such we can probably live with it not being available to all users.

I think this module could be easily packaged. Other NGINX modules have already been packaged¹. The only annoyance is that these modules always need to be recompiled when NGINX is updated. However, OBS will take care of that for us.


¹ One would just have to take e.g. https://build.opensuse.org/package/show/openSUSE:Factory/nginx-module-brotli and replace the Git repository and maybe a few parameters and dependencies.

@okurz force-pushed the feature/gh7010_download_all branch 2 times, most recently from 5e1faef to 1dc6775 on March 23, 2026 at 14:02
@okurz force-pushed the feature/gh7010_download_all branch from 1dc6775 to 1d920ec on March 24, 2026 at 22:31
@os-autoinst os-autoinst deleted a comment from mergify bot Mar 27, 2026
@os-autoinst os-autoinst deleted a comment from mergify bot Mar 27, 2026
@okurz force-pushed the feature/gh7010_download_all branch 2 times, most recently from 246733e to a803cca on April 1, 2026 at 11:52
okurz added 4 commits April 13, 2026 13:16
Fixes os-autoinst#7010.

Motivation:
Provide a user-friendly way to download all logs, results, and assets
at once directly from the UI, matching the functionality of
'openqa-cli archive'.

Design choices:
- Use ZIP format for broad cross-platform support (industry standard).
- Implement 'OpenQA::Archive' utility to keep archival logic separated
  from controllers and models.
- Implement on-demand cleanup in the archive cache directory
  ('/var/lib/openqa/cache/archives') to maintain storage space without
  requiring a periodic background task.
- Reuse cached archives if they already exist to improve performance
  for subsequent requests.
- Integrate into the 'Downloads' tab for better discoverability.

User Benefit:
Users can quickly and easily download all artifacts associated with a job
in a single ZIP file, saving time and effort compared to downloading them
individually or using the CLI.

Motivation:
Serving large static files directly through Mojolicious blocks worker processes,
reducing overall application throughput and responsiveness.

Design choices:
- Utilize the existing Apache/Nginx reverse proxy configuration to serve
  files from '/var/lib/openqa/cache/archives'.
- Add an Alias/Location mapping for the '/archives' path in the web
  server configuration.
- Implement a redirect from the WebAPI route to the static file URL.

User Benefit:
Improved responsiveness of the openQA Web UI, as large ZIP archive downloads
are handled efficiently by the web server (Nginx/Apache) without blocking
application workers.
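The Alias/Location mapping described in this commit message might look like the following in the NGINX vhost configuration. This is a sketch based on the commit description only; the exact location name and any access-control directives in the real template are not shown in this thread.

```nginx
# Serve pre-generated ZIP archives directly from disk so downloads do not
# occupy Mojolicious worker processes; the WebAPI route redirects here.
location /archives/ {
    alias /var/lib/openqa/cache/archives/;
}
```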

Motivation:
Creating and downloading an empty ZIP archive is confusing and provides
no value to the user.

Design choices:
- Check if the job has any result files, uploaded logs, or assets before
  rendering the 'Download All' button.
- Utilize the existing template variables (`$resultfiles`, `$ulogs`,
  `$assets`) to make this determination efficiently.

User Benefit:
A cleaner UI that does not offer a download option when there is nothing
to download, preventing confusion and saving clicks.

Motivation:
Generating large ZIP archives is a CPU and I/O intensive operation. Doing
this synchronously in the WebAPI controller blocks the worker and can lead
to timeouts for the user.

Design choices:
- Move the core archival logic into a new Minion task:
  `OpenQA::Task::Job::CreateZipArchive`.
- Enqueue the task when the user requests an archive that does not yet exist.
- Introduce an intermediate "Preparing Archive" polling page that checks
  the status of the generation and redirects once complete.
- Provide a fallback to synchronous generation if Minion is unavailable (e.g.,
  in certain test or development environments).

User Benefit:
Prevents browser timeouts and Web UI unresponsiveness when generating large
archives. Users are provided with clear feedback that the archive is being
prepared in the background.
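The enqueue-with-fallback flow described above could be sketched as follows. The controller, template, and task names here are assumptions derived from the commit message (the task class is `OpenQA::Task::Job::CreateZipArchive`; the rest is illustrative), not verbatim code from the PR.

```perl
# Sketch of the enqueue-with-fallback flow (names partly assumed)
sub archive {
    my ($self) = @_;
    my $job_id = $self->param('testid');
    if (my $minion = eval { $self->app->minion }) {
        $minion->enqueue(create_zip_archive => [$job_id]);
        # intermediate "Preparing Archive" page polls and redirects when done
        return $self->render('test/preparing_archive', job_id => $job_id);
    }
    # fallback: generate synchronously when Minion is unavailable
    OpenQA::Archive::create_archive($job_id);
    $self->redirect_to("/archives/job_$job_id.zip");
}
```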
okurz added 4 commits April 13, 2026 13:16
Motivation:
Maintain code cleanliness and accurate test coverage metrics by removing
pragmas that are no longer accurate or necessary.

Design choices:
- Remove `# uncoverable statement` comments from `die` calls related to
  file locking and renaming, as these paths should ideally be covered or
  acknowledged as standard error handling.

User Benefit:
Improved code quality and more reliable test coverage reporting for developers.

Motivation:
The server-side archival process consumes significant CPU, I/O, and disk
space. If the route is publicly accessible, it becomes a vector for Denial
of Service (DoS) attacks, either malicious or accidental (e.g., from web
crawlers).

Design choices:
- Move the `/tests/:testid/archive` route under the `session#ensure_user`
  bridge in `lib/OpenQA/WebAPI.pm`.
- This ensures only logged-in users can trigger the resource-intensive
  archive generation process.

User Benefit:
Increased stability and security of the openQA instance by preventing
unauthorized resource exhaustion.
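In Mojolicious, moving a route under an authentication bridge as described above amounts to registering it on the `under` route that runs `session#ensure_user`. A minimal sketch, assuming the surrounding route setup in lib/OpenQA/WebAPI.pm follows the usual pattern (only the bridge action name is confirmed by the commit message):

```perl
# Sketch: only logged-in users reach the archive action; the bridge's
# ensure_user returns false (and renders a login prompt) otherwise
my $auth = $r->under('/')->to('session#ensure_user');
$auth->get('/tests/<testid:num>/archive')->name('test_archive')->to('test#archive');
```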

Motivation:
Prevent search engine crawlers from following the download link, which
could otherwise lead to unnecessary Minion task generation and disk space
consumption.

Design choices:
- Add the `rel="nofollow"` attribute to the anchor tag in the downloads
  template.
- This acts as an additional layer of defense alongside the authentication
  requirement and `robots.txt`.

User Benefit:
Protects server resources from being consumed by automated crawlers, ensuring
the system remains responsive for human users.
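Applied to the template snippet quoted earlier in this thread, the `rel="nofollow"` addition is a one-line change; a sketch:

```perl
%= link_to url_for('test_archive', testid => $job->id) => (rel => 'nofollow', class => 'btn btn-primary', title => 'Download all test results and assets as a ZIP archive') => begin
```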

Motivation:
Even with authentication, multiple users (or scripts) requesting archives
simultaneously could saturate the server's I/O and CPU, impacting other
critical openQA operations. Additionally, old, unstarted requests should
not linger indefinitely.

Design choices:
- Use a Minion guard (`create_zip_archive_task`) to limit the global
  concurrency of archive generation to a maximum of 2.
- Set a low priority (`-10`) for the task so it does not delay critical
  operations like scheduling or result processing.
- Set a Time-To-Live (`expire => 3600`) of 1 hour to automatically discard
  stale requests that haven't been picked up by a worker.
- Keep the task in the `default` queue to ensure it is processed by standard
  workers without requiring custom configuration.
- Consolidate archive redirect logic in the File controller to remove
  code duplication.

User Benefit:
Ensures the overall openQA system remains stable and responsive even when
heavy archival tasks are requested. Prevents a backlog of obsolete background
tasks.
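The guard and enqueue parameters described above map directly onto Minion's API; the guard name `create_zip_archive_task` and the priority/expire values come from the commit message, while the retry delay and guard duration below are illustrative assumptions:

```perl
# Sketch of the task body with a global concurrency guard
sub _create_zip_archive {
    my ($minion_job, $job_id) = @_;
    # allow at most 2 archive-generation jobs to run at once, instance-wide
    my $guard = $minion_job->minion->guard('create_zip_archive_task', 300, {limit => 2});
    return $minion_job->retry({delay => 30}) unless $guard;
    # ... generate the archive ...
}

# low priority so scheduling and result processing are not delayed; requests
# not picked up by a worker within an hour are discarded automatically
$minion->enqueue(create_zip_archive => [$job_id] => {priority => -10, expire => 3600});
```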
@okurz force-pushed the feature/gh7010_download_all branch from a803cca to 00837b8 on April 13, 2026 at 11:22


Development

Successfully merging this pull request may close these issues.

Provide an easy way to download all logs or per group

3 participants