PART 2: perf: handle huge job groups gracefully - After #7058 #7026

Draft
okurz wants to merge 2 commits into os-autoinst:master from okurz:feature/019_poo196913_handle_huge_job_groups_gracefully

Conversation

Member

@okurz okurz commented Feb 21, 2026

  • Optimize compute_build_results by using database-level aggregation
    instead of fetching and iterating all job objects.
  • Move job deduplication (per scenario) to the database level.
  • Implement a safety limit of 5,000 jobs per build to prevent timeouts.
  • Optimize comment/review tracking by only checking failed jobs.
  • Add t/61-job_group_aggregation.t to verify aggregation and limit enforcement.
  • Extract category mapping into _get_job_result_category helper.
  • Reuse helper in both count_job and _count_job_aggregated.
  • Ensure consistent result categorization across legacy and optimized paths.
  • Add job_group_overview_max_jobs to misc_limits in openqa.ini.
  • Pass this limit from web and API controllers to compute_build_results.
  • De-duplicate common job data in t/61-job_group_aggregation.t.
  • Add controller and API tests for limit enforcement.

Related progress issue: https://progress.opensuse.org/issues/196913
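
The aggregated counting and the job limit described in the list above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `result_category` and `summarize_build` are hypothetical names, the plain strings stand in for `OpenQA::Jobs::Constants`, the `unfinished` category and the `[state, result, count]` row shape are assumptions about what a `GROUP BY state, result` query would return.

```perl
use strict;
use warnings;

# Simplified stand-in for the _get_job_result_category helper.
sub result_category {
    my ($state, $result) = @_;
    if ($state eq 'done') {
        return 'passed' if $result eq 'passed';
        return 'softfailed' if $result eq 'softfailed';
        return 'failed';    # simplification: every other finished job counts as failed
    }
    return 'skipped' if $state eq 'cancelled';
    return 'unfinished';    # assumption: all remaining states are still pending
}

# Sum pre-aggregated rows instead of iterating over individual job objects.
# Each row is [state, result, count], as a GROUP BY query might return it.
sub summarize_build {
    my ($rows, $max_jobs) = @_;
    my %summary = (passed => 0, softfailed => 0, failed => 0, skipped => 0, unfinished => 0, total => 0);
    for my $row (@$rows) {
        my ($state, $result, $count) = @$row;
        $summary{result_category($state, $result)} += $count;
        $summary{total} += $count;
    }
    # Safety limit: refuse to report on builds that are too large.
    die "build has $summary{total} jobs, which exceeds the limit of $max_jobs\n"
      if $summary{total} > $max_jobs;
    return \%summary;
}

my $summary = summarize_build([['done', 'passed', 3], ['done', 'failed', 1], ['running', 'none', 2]], 5000);
printf "%s: %d\n", $_, $summary->{$_} for sort keys %$summary;
```

The point of the sketch is that the amount of Perl-side work depends on the number of distinct state/result combinations rather than on the number of jobs.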

After:

@okurz okurz force-pushed the feature/019_poo196913_handle_huge_job_groups_gracefully branch from b35dfa2 to afa425c Compare February 21, 2026 17:39

codecov bot commented Feb 21, 2026

Codecov Report

❌ Patch coverage is 98.74477% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.86%. Comparing base (594c7e5) to head (de7b0c6).
⚠️ Report is 25 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/OpenQA/WebAPI/Controller/API/V1/JobGroup.pm | 60.00% | 2 Missing ⚠️ |
| lib/OpenQA/BuildResults.pm | 98.91% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #7026      +/-   ##
==========================================
- Coverage   99.87%   99.86%   -0.01%     
==========================================
  Files         418      420       +2     
  Lines       44000    44190     +190     
==========================================
+ Hits        43945    44132     +187     
- Misses         55       58       +3     

@okurz okurz force-pushed the feature/019_poo196913_handle_huge_job_groups_gracefully branch 2 times, most recently from 49f42c8 to 3345729 Compare February 21, 2026 19:58
Contributor

@Martchus Martchus left a comment

Looks generally good. Have you tested this with production data and compared before and after? I think we should do that for bigger changes like this one (as I did recently for a similar optimization of the test results overview page). That way we could also confirm whether it helps with the problem from the motivating ticket (https://progress.opensuse.org/issues/196709).

$job_result->{total} = 0;
}

sub _get_job_result_category ($state, $result) {
Contributor

I'm wondering whether we can reuse some of the mapping functions we already have to avoid repetitiveness here. Of course this is already better than how it was before your change.

@okurz okurz force-pushed the feature/019_poo196913_handle_huge_job_groups_gracefully branch from 3345729 to 38ddab6 Compare February 27, 2026 12:05
Contributor

@Martchus Martchus left a comment

I only have style nitpicks, but there are enough of them that at least some should be addressed.

Comment on lines +238 to +240
my @jobs_data;
push @jobs_data, {%common, id => 600000 + $_, TEST => "test_$_"} for (1 .. $num_jobs);
$schema->resultset('Jobs')->populate(\@jobs_data);
Contributor

Use map or avoid the intermediate array by calling create in a loop.
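
A minimal sketch of the `map` variant, runnable without a database. The values in `%common` and `$num_jobs` are placeholders; in the test file the resulting list could presumably be passed straight to `populate` as an array reference, which is left out here so that the example stays self-contained.

```perl
use strict;
use warnings;

my %common = (group_id => 1001, state => 'done', result => 'passed');    # placeholder values
my $num_jobs = 3;

# Build the row list in one expression instead of pushing onto a temporary array.
# The unary plus marks the inner braces as a hash constructor rather than a block.
my @jobs_data = map { +{%common, id => 600000 + $_, TEST => "test_$_"} } 1 .. $num_jobs;

print "$_->{id} $_->{TEST}\n" for @jobs_data;
```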

$t->get_ok("/group_overview/$group_id_ctrl" => form => {distri => 'distri', version => $version, build => $build})
->status_is(400)->content_like(qr/exceeds the limit of 5/);
};
subtest 'API Controller limit enforcement' => sub {
Contributor

This subtest is almost like the previous ones and they are all very verbose. It would make sense to avoid this kind of duplication.

Comment on lines +14 to +20
use Test::Mojo;
my $test_case = OpenQA::Test::Case->new;
my $schema = $test_case->init_data(fixtures_glob => '01-jobs.pl 03-users.pl');
my $t = Test::Mojo->new('OpenQA::WebAPI');
my $group_id = 1001;
my $group = $schema->resultset('JobGroups')->find($group_id);
subtest 'Aggregation with deduplication' => sub {
Contributor

Please use blank lines to separate different sections.

Comment on lines +38 to +45
if ($state eq OpenQA::Jobs::Constants::DONE) {
    my $meta = OpenQA::Jobs::Constants::meta_result($result);
    return 'passed' if $meta eq OpenQA::Jobs::Constants::PASSED;
    return 'softfailed' if $meta eq OpenQA::Jobs::Constants::SOFTFAILED;
    return 'skipped' if $meta eq OpenQA::Jobs::Constants::ABORTED;
    return 'failed' if $meta eq OpenQA::Jobs::Constants::FAILED || $meta eq OpenQA::Jobs::Constants::NOT_COMPLETE;
}
return 'skipped' if $state eq OpenQA::Jobs::Constants::CANCELLED;
Contributor

It would look less noisy if the constants were imported.
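
To illustrate the difference, here is a small self-contained sketch with imported constants. `My::Constants` is a hypothetical stand-in so that the example runs on its own; in the PR the import would come from `OpenQA::Jobs::Constants`, assuming that module exports the names in question. The `category` function is trimmed to two results and the final `unfinished` fallback is an assumption, since the quoted snippet stops before that point.

```perl
use strict;
use warnings;

# Stand-in for OpenQA::Jobs::Constants so that the sketch runs on its own.
package My::Constants;
use Exporter 'import';
use constant {DONE => 'done', CANCELLED => 'cancelled', PASSED => 'passed', FAILED => 'failed'};
our @EXPORT_OK;
BEGIN { @EXPORT_OK = qw(DONE CANCELLED PASSED FAILED) }

package main;

# With a real module file this would simply be "use My::Constants qw(...)".
BEGIN { My::Constants->import(qw(DONE CANCELLED PASSED FAILED)) }

# The comparisons now read without the package prefix on every constant.
sub category {
    my ($state, $result) = @_;
    if ($state eq DONE) {
        return 'passed' if $result eq PASSED;
        return 'failed' if $result eq FAILED;
    }
    return 'skipped' if $state eq CANCELLED;
    return 'unfinished';    # assumption: not part of the quoted snippet
}

print category('done', 'passed'), "\n";
```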

Contributor

@d3flex d3flex left a comment

It seems to me that this will have an impact on performance. There are multiple $jobs_resultset->search calls, which can stress the system when the result sets are large. Is there a better way to avoid this? Or am I missing something?

Comment on lines +285 to +286
if ($jr{children}) {
for my $child_id (keys %{$jr{children}}) {
Contributor

Suggested change
- if ($jr{children}) {
-     for my $child_id (keys %{$jr{children}}) {
+ if (my $children = $jr{children}) {
+     for my $child_id (keys %$children) {

@okurz okurz marked this pull request as draft March 2, 2026 14:59
@okurz okurz changed the title from "perf: handle huge job groups gracefully" to "PART 2: perf: handle huge job groups gracefully - After #7058" Mar 3, 2026
Contributor

mergify bot commented Mar 6, 2026

This pull request is now in conflicts. Could you fix it? 🙏

@okurz okurz force-pushed the feature/019_poo196913_handle_huge_job_groups_gracefully branch 3 times, most recently from 61ecbbb to 1cee06d Compare March 23, 2026 14:00
Contributor

mergify bot commented Mar 23, 2026

This pull request is now in conflicts. Could you fix it? 🙏

@okurz okurz force-pushed the feature/019_poo196913_handle_huge_job_groups_gracefully branch from 1cee06d to 24914b1 Compare March 24, 2026 22:29
@os-autoinst os-autoinst deleted a comment from mergify bot Mar 31, 2026
@okurz okurz force-pushed the feature/019_poo196913_handle_huge_job_groups_gracefully branch 3 times, most recently from 8754ad6 to 116ad1f Compare March 31, 2026 21:36
okurz added 2 commits April 1, 2026 13:08
- Optimize compute_build_results by using database-level aggregation
  instead of fetching and iterating all job objects.
- Move job deduplication (per scenario) to the database level.
- Implement a safety limit of 5,000 jobs per build to prevent timeouts.
- Optimize comment/review tracking by only checking failed jobs.
- Add t/61-job_group_aggregation.t to verify aggregation and limit enforcement.
- Extract category mapping into _get_job_result_category helper.
- Reuse helper in both count_job and _count_job_aggregated.
- Ensure consistent result categorization across legacy and optimized paths.
- Add job_group_overview_max_jobs to misc_limits in openqa.ini.
- Pass this limit from web and API controllers to compute_build_results.
- De-duplicate common job data in t/61-job_group_aggregation.t.
- Add controller and API tests for limit enforcement.

Related progress issue: https://progress.opensuse.org/issues/196913
- Introduce OpenQA::Error::LimitExceeded typed exception for robust
  error handling.
- Decompose compute_build_results into smaller, testable helper
  functions.
- Consolidate database queries to reduce O(N) round-trips.
- Centralize job limit configuration using app config and internal
  defaults.
- Implement graceful degradation for oversized builds instead of hard
  failure.
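
The typed exception and the graceful degradation named in the second commit message could look roughly like this. It is a sketch under stated assumptions, not the PR's implementation: `My::Error::LimitExceeded` is a hypothetical stand-in for `OpenQA::Error::LimitExceeded`, and `compute_results`, `results_or_placeholder` and the `oversized` key are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the OpenQA::Error::LimitExceeded class.
package My::Error::LimitExceeded;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub throw { my ($class, %args) = @_; die $class->new(%args) }
sub limit { $_[0]{limit} }
sub count { $_[0]{count} }

package main;
use Scalar::Util 'blessed';

# Throws a typed exception instead of a plain string when the limit is exceeded.
sub compute_results {
    my ($job_count, $limit) = @_;
    My::Error::LimitExceeded->throw(count => $job_count, limit => $limit) if $job_count > $limit;
    return {total => $job_count};
}

# Degrades gracefully for oversized builds and rethrows every other error.
sub results_or_placeholder {
    my ($job_count, $limit) = @_;
    my $results = eval { compute_results($job_count, $limit) };
    return $results if $results;
    my $error = $@;
    if (blessed($error) && $error->isa('My::Error::LimitExceeded')) {
        return {total => $error->count, oversized => 1, limit => $error->limit};
    }
    die $error;
}

my $oversized = results_or_placeholder(6, 5);
print "oversized build with $oversized->{total} jobs (limit $oversized->{limit})\n" if $oversized->{oversized};
```

Checking the class of the error rather than matching on its message keeps the caller independent of the exact wording, which is presumably what "robust error handling" refers to.
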
@okurz okurz force-pushed the feature/019_poo196913_handle_huge_job_groups_gracefully branch from 116ad1f to de7b0c6 Compare April 1, 2026 11:50
3 participants