Use native ARM runners for Docker multi-arch builds#1020
Conversation
Migrates Docker multi-arch builds from QEMU emulation to native ARM runners
for significantly faster build times (3-5x improvement expected).
Changes:
- Split each build job into parallel architecture-specific jobs:
- build-{flavor}-amd64 on ubuntu-latest
- build-{flavor}-arm64 on ubuntu-24.04-arm
- create-manifest-{flavor} to combine into multi-arch manifest
- Architecture chains run independently (amd64 and arm64 in parallel)
- Each arch build depends on its parent arch-specific image, not manifest
- Added OCI annotations (source, description, licenses) to manifests
- Architecture-specific cache tags prevent cross-arch cache pollution
- PRs now push to GHCR (enables full multi-arch validation)
- Only deploy-to-quay is skipped for PRs
- Simplified test jobs to always pull from registry
- Added ARM64 test jobs (only run on PRs with docker changes)
Job count: 14 -> 30 (but faster due to native execution + parallelization)
Closes #1019
- test-phylo: Add 'as ncbi' alias to import in test_ncbi.py - test-assemble: Add missing expected.ebov.refine1.new.fasta from viral-assemble - test-classify: Fix corrupted test-reads.revert.bam (was 0 bytes, now 177 bytes) Co-Authored-By: Claude <noreply@anthropic.com>
For PRs, github.ref_name is '1020/merge' which contains '/' - invalid in Docker tags. Added image-tag-prefix output in get-version job that sanitizes the ref: - PRs: pr-<number> (e.g., pr-1020) - Branches: slashes replaced with dashes (e.g., feature/foo -> feature-foo) - Tags: unchanged (e.g., v1.25.0)
Welcome to Codecov 🎉Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests. Thanks for integrating Codecov - We've got you covered ☂️ |
When building derivative images (core, assemble, classify, phylo, mega), add the parent image's cache as an additional fallback. This helps when building a new branch where the derivative image cache doesn't exist yet but the parent's cache does. Cache hierarchy: - core: falls back to baseimage cache - assemble/classify/phylo/mega: fall back to core cache
- Add ARM skip decorator to TestMvicuna (mvicuna is x86-only) - Add ARM skip decorator to TestDepleteHuman (bmtagger is x86-only) - Add ARM skip decorator to TestRefineAssemble (novoalign is x86-only) - Fix krona patch path: 'krona.Krona' -> 'viral_ngs.metagenomics.krona.Krona' (patch where used, not where defined)
- Skip NovoalignTool and MvicunaTool in test_tool_install on ARM - Skip test_novoalign method in TestAlignAndFix on ARM - Skip test_downsample_with_dedup_after (uses mvicuna internally) on ARM
GHCR public images can be pulled without authentication, so the docker/login-action steps in test jobs are unnecessary and can cause transient failures due to network timeouts. Removed from: test-core, test-core-arm64, test-assemble, test-assemble-arm64, test-classify, test-classify-arm64, test-phylo, test-phylo-arm64
- Document 3-job pattern for each flavor (amd64, arm64, manifest) - Add build job flow diagram showing parallel architecture chains - Document ARM64 test jobs that run on PRs with docker changes - Add ARM Test Skipping section with code examples - Document registry strategy (GHCR for CI, Quay for production) - Remove outdated references to QEMU emulation
…efore - Add retry logic (5 attempts with 10s wait) to all test jobs for docker pull to handle transient GHCR connectivity timeouts - Add ARM skip decorator to test_downsample_with_dedup_before which also uses mvicuna via rmdup_mvicuna_bam (same as test_downsample_with_dedup_after)
The test was using mocker fixture from pytest-mock which is not installed. Convert to use unittest.mock.patch like other tests in the same file. Also fix patch path to 'viral_ngs.metagenomics.kaiju.Kaiju.classify'.
There was a problem hiding this comment.
Pull request overview
This PR migrates Docker multi-arch builds from QEMU emulation to native ARM runners, significantly improving ARM64 build performance (3-5x faster). The infrastructure now uses separate parallel jobs for AMD64 and ARM64 builds that are later combined into multi-arch manifests.
Changes:
- Build infrastructure split into parallel native builds per architecture with manifest creation
- ARM64 test jobs added for PRs with Docker file changes
- ARM skip decorators added for tests using x86-only bioconda packages (novoalign, mvicuna, bmtagger)
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/docker.yml | Restructured to use native ARM runners with parallel architecture-specific builds and manifest creation jobs |
| tests/unit/core/test_tools.py | Added platform detection and skip logic for x86-only tools |
| tests/unit/core/test_read_utils.py | Added ARM skip decorators for mvicuna-dependent tests |
| tests/unit/classify/test_metagenomics.py | Fixed test to call renamed main_krona function and updated mock paths |
| tests/unit/classify/test_integration_taxon_filter.py | Added ARM skip decorator for bmtagger-dependent test class |
| tests/unit/assemble/test_assembly_integration.py | Added ARM skip decorator for novoalign-dependent test class |
| tests/unit/assemble/test_assembly.py | Added ARM skip decorator for novoalign-dependent test class |
| tests/unit/phylo/test_ncbi.py | Changed import to use alias for consistency |
| tests/input/TestRefineAssembly/expected.ebov.refine1.new.fasta | Added expected test output file |
| AGENTS.md | Updated documentation to reflect new native ARM build architecture and registry strategy |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
The kaiju functionality was removed from viral_ngs.metagenomics but this test was left behind. The test fails with: AttributeError: module 'viral_ngs.metagenomics' has no attribute 'kaiju' There is no parser_kaiju, no Kaiju class, and no kaiju module anywhere in the codebase - this test was testing dead code.
Instead of fail-fast mode, run all tests to see the full picture of any failures rather than stopping at the first one.
- Add missing `cmd as util_cmd` import to core/file.py (fixes NameError in test_bmtagger_db_build) - Rename local `samtools` variable to `samtools_tool` in 5 test methods to avoid shadowing the imported module (fixes UnboundLocalError)
- Use SKIP_X86_ONLY_REASON constant in test_read_utils.py skip decorators - Update test_krakenuniq to use context manager pattern for consistency
- Remove self-imports (modules importing themselves): - file.py: removed `file as util_file`, use direct function calls - misc.py: removed `misc as util_misc` (unused) - samtools.py: removed `samtools` self-import - picard.py: removed `picard` self-import, fix SamToFastqTool reference - Remove duplicate imports: - gatk.py, novoalign.py: removed duplicate samtools/picard imports - fastqc.py, splitcode.py: removed duplicate samtools import - bbmap.py: removed duplicate samtools/picard imports - Fix circular import: - file.py: use lazy import for cmd module in set_tmp_dir()
Tests in test_tools_kb_python.py and test_tools_kma.py were using the mocker fixture from pytest-mock which is not installed. Convert them to use unittest.mock.patch context managers for consistency with other tests in the codebase. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The kb.py and kma.py modules import 'file' not 'util_file', so the patch paths need to match.
- Remove anndata from pyproject.toml (only needed in classify image, already in docker/requirements/classify.txt) - Add minimum version constraints to pandas, pyyaml, scipy, seaborn in both pyproject.toml and docker/requirements/core.txt - Add minimum version constraints to awscli, google-cloud-storage in docker/requirements/baseimage.txt This prevents pip from unnecessarily downloading and upgrading packages that are already installed via conda during Docker builds. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Docker Build Performance Analysis (Post-Merge)Analyzed GHA run 21675133318 to verify the build optimizations from this PR are working correctly. Key Results✅ pip optimization is working - The ✅ Layer caching is active - Build logs show cache manifests being imported successfully from both branch-specific and main caches. ✅ Native ARM runners provide major speedup - Comparing to the old QEMU-based workflow:
Build Time Breakdown (core-amd64)
SummaryTotal workflow time is now ~12 minutes (including all architectures and tests), down from 25+ minutes with the QEMU-based approach. The cache export taking ~100s is expected behavior—BuildKit compresses and uploads layers to enable future cache hits. The build is well-optimized and making efficient use of layer caching. |
Summary
Migrates Docker multi-arch builds from QEMU emulation to native ARM runners for 3-5x faster ARM build times.
Build Infrastructure Changes
Test Infrastructure Changes
ARM Skip Decorators Added
Tests using x86-only bioconda packages now properly skip on ARM64:
TestMvicuna- uses mvicuna (x86-only)TestDepleteHuman- uses bmtagger (x86-only)TestRefineAssemble- uses novoalign (x86-only)TestRefineAssembly- uses novoalign (x86-only)test_tool_install- skips NovoalignTool and MvicunaTool on ARMtest_novoalignmethod - uses novoalign (x86-only)test_downsample_with_dedup_after- uses mvicuna via dedup (x86-only)Dependency Version Pinning
Added minimum version constraints to prevent pip from unnecessarily upgrading conda-installed packages during Docker builds:
pyproject.toml: Removedanndata(only needed in classify image), added version constraints topandas>=2.0.0,pyyaml>=6.0,scipy>=1.10.0,seaborn>=0.12.0docker/requirements/core.txt: Added version constraints topandas>=2.0.0,pyyaml>=6.0,scipy>=1.10.0,seaborn>=0.12.0docker/requirements/baseimage.txt: Added version constraints toawscli>=1.32.0,google-cloud-storage>=2.14.0Bug Fixes
test_krona_import_taxonomyto callmain_krona()(renamed in 76de9e3)Job structure change:
Expected benefits:
Closes #1019
Test plan
🤖 Generated with Claude Code