Add allele chunking and pan-HLA prediction support #341
Draft
jonasscheid wants to merge 2 commits into nf-core:dev from
Conversation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jonasscheid (Collaborator, Author) commented on Mar 6, 2026
```python
if not alleles_str:
    continue
tool_default = MaxNumberOfAlleles[tool.upper()].value
max_alleles = global_max if global_max > 0 else tool_default
```
Suggested change:

```diff
- max_alleles = global_max if global_max > 0 else tool_default
+ max_alleles = global_max if global_max > 0 else MaxNumberOfAlleles[tool.upper()].value
```
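To make the discussion concrete, here is a minimal Python sketch of the chunking behavior described in the PR summary (per-tool caps from a `MaxNumberOfAlleles`-style enum, a `global_max` override, and `tool`/`tool_chunkN` JSON keys). The enum values and the `chunk_alleles` helper are illustrative, not the pipeline's actual code:

```python
from enum import Enum


# Illustrative per-tool caps; the real values live in the
# pipeline's MaxNumberOfAlleles enum.
class MaxNumberOfAlleles(Enum):
    MHCFLURRY = 6
    NETMHCPAN = 50


def chunk_alleles(tool, alleles_str, global_max=0):
    """Split a semicolon-separated allele string into per-chunk JSON keys.

    A single chunk keeps the plain "tool" key; multiple chunks get
    "tool_chunkN" keys, mirroring the per-chunk JSON output the PR adds.
    """
    alleles = [a for a in alleles_str.split(";") if a]
    if not alleles:
        return {}
    tool_default = MaxNumberOfAlleles[tool.upper()].value
    max_alleles = global_max if global_max > 0 else tool_default
    chunks = [alleles[i:i + max_alleles]
              for i in range(0, len(alleles), max_alleles)]
    if len(chunks) == 1:
        return {tool: ";".join(chunks[0])}
    return {f"{tool}_chunk{n}": ";".join(c) for n, c in enumerate(chunks)}
```

With this shape, `--max_alleles_per_chunk 1` would force one allele per chunk regardless of the tool default, which matches the test plan below.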
jonasscheid (Collaborator, Author) commented on Mar 6, 2026
```python
for tool, alleles_str in tools_allele_input.items():
    if not alleles_str:
        continue
    tool_default = MaxNumberOfAlleles[tool.upper()].value
```
Suggested change:

```diff
- tool_default = MaxNumberOfAlleles[tool.upper()].value
```
jonasscheid (Collaborator, Author) commented on Mar 6, 2026
```groovy
stub:
def prefix = task.ext.prefix ?: "${meta.id}"
"""
echo '{"mhcflurry":"","mhcnuggets":"","mhcnuggetsii":"","netmhcpan":"","netmhciipan":""}' > ${prefix}_allele_input.json
```
Is this still needed?
jonasscheid (Collaborator, Author) commented on Mar 6, 2026
```groovy
.map { meta, file -> [meta.findAll { k, _v -> k != 'alleles_supported' }, file] } // drop alleles_supported from meta
.groupTuple()
.join( ch_peptides_to_predict )
.map { meta, file ->
```
Same here, this does not look very robust.
```groovy
netmhciipan: (file.name.contains('netmhciipan_input') && allele_input_dict['netmhciipan'])
    return [meta + [alleles_supported: allele_input_dict['netmhciipan']], file]
// Find the JSON key matching this file (supports both "tool" and "tool_chunkN" keys)
def key = allele_input_dict.keySet().find { k -> file.name.contains("${k}_input") }
```
This looks very error-prone; is there a more robust design we can use here?
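One way to tighten the substring matching the reviewer flags: anchor the `<key>_input` token on delimiters and try longer keys first, so a sample prefix that happens to contain a tool name cannot shadow the intended key. A hedged Python sketch of that idea; `key_for_file` is a hypothetical helper (the real channel logic is Groovy):

```python
import re


def key_for_file(filename, allele_input):
    """Resolve which JSON key an input file belongs to.

    Instead of bare substring containment, require the exact
    "<key>_input" token preceded by start-of-name or an underscore.
    Longer keys are tried first so "netmhcpan" cannot shadow a
    "netmhcpan_chunk2" key.
    """
    for key in sorted(allele_input, key=len, reverse=True):
        if re.search(rf"(^|_){re.escape(key)}_input\b", filename):
            return key
    return None
```

A stricter alternative would be to stop encoding the key in the file name at all and carry it explicitly in the channel's `meta` map, so no string matching is needed at merge time.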
…atching Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Closes #340
- Allele chunking in `prepare_prediction_input` using the existing `MaxNumberOfAlleles` enum (previously marked `# TODO: Implement`)
- `all` in the samplesheet alleles column to trigger pan-HLA mode (predict against all supported alleles per tool)
- `--max_alleles_per_chunk` parameter to override per-tool defaults
- `merge_predictions` updated for correct NetMHC allele-index resolution across chunks

Changes

- `prepare_prediction_input.py`: allele chunking logic, `all` sentinel handling, per-chunk file + JSON output
- `mhc_binding_prediction/main.nf`: chunk-aware branch routing, unique per-chunk file_id for predictor output, per-file alleles carried to merge
- `merge_predictions.py`: uses per-file allele lists instead of global `meta.alleles` (also fixes a latent bug where unsupported alleles could cause wrong index mapping)
- `merge_predictions/main.nf`: added `val(alleles_per_file)` input
- `max_alleles_per_chunk` param in `nextflow.config`, `nextflow_schema.json`, `modules.config`

Usage
Test plan
- `tests/peptides.nf.test` (backward compat, default params = no chunking)
- `tests/mhcflurry.nf.test` (backward compat)
- `--max_alleles_per_chunk 1` to force chunking with mhcnuggets
- `alleles=all` in samplesheet
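The `merge_predictions` change above (per-file allele lists instead of the global `meta.alleles`) can be sketched as follows. NetMHC-style output refers to alleles by a 1-based index within one run; with chunking, that index is only meaningful against the allele list of *that chunk's* input file. Function and column names here are illustrative, not the pipeline's actual API:

```python
def resolve_alleles(pred_rows, file_alleles):
    """Map 1-based per-run allele indices back to allele names.

    `file_alleles` is the allele list of this chunk's input file,
    carried per file (val(alleles_per_file)) rather than taken from
    the sample-level meta.alleles, which would mis-map indices once
    a sample's alleles are split across chunks or some are unsupported.
    """
    resolved = []
    for row in pred_rows:
        idx = row["allele_index"]  # 1-based within this chunk's run
        resolved.append({**row, "allele": file_alleles[idx - 1]})
    return resolved
```

This is also where the latent bug mentioned in the summary would bite: indexing into a list that still contains unsupported alleles shifts every subsequent index by one.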