Reintegrate STARFUSION_BUILD module #709

Merged
delfiterradas merged 19 commits into nf-core:dev from delfiterradas:dev
Jul 23, 2025

delfiterradas (Contributor) commented Jul 14, 2025

Closes #694

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/rnafusion branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

nf-core-bot (Member)

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.2.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.

github-actions bot commented Jul 14, 2025

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 0e82fba

  • ✅ 223 tests passed
  • ❔ 2 tests were ignored
  • ❗ 4 tests had warnings

❗ Test warnings:

  • pipeline_todos - TODO string in ro-crate-metadata.json: the "description" field embeds the README template text, which still contains unresolved "TODO nf-core:" placeholders (pipeline summary, workflow figure, list of default steps, minimal usage example, contributor list, citation and bibliography).
  • pipeline_todos - TODO string in nextflow.config: Update the field with the details of the contributors to your pipeline. New with Nextflow version 24.10.0
  • schema_lint - Input mimetype is missing or empty
  • local_component_structure - fusioninspector_workflow.nf in subworkflows/local should be moved to a SUBWORKFLOW_NAME/main.nf structure

❔ Tests ignored:

  • files_unchanged - File ignored due to lint config: .github/CONTRIBUTING.md
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md

✅ Tests passed:

Run details

  • nf-core/tools version 3.2.1
  • Run at 2025-07-18 18:04:29

delfiterradas self-assigned this Jul 16, 2025
"dfam_h3i": {
"type": "string",
"format": "uri",
"pattern": "^https?://.*",
Contributor

A question: people might already have these files downloaded, or they might host them somewhere other than https (such as s3), so I would remove this pattern for all the dfam files. However, you can add a pattern for the required suffix (for example `.h3i` or `.h3f`).
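
A minimal sketch of the suggestion above, written in Python for illustration. It compares the `^https?://.*` pattern from the schema with a suffix-only pattern. The `^\S+\.h3i$` pattern and the example locations are assumptions made for this sketch, not the pattern that ended up in nextflow_schema.json.

```python
import re

# Pattern quoted in the schema snippet under review: only http(s) URLs pass.
URL_ONLY = re.compile(r"^https?://.*")

# Hypothetical suffix-only pattern: any location, as long as it ends in .h3i.
SUFFIX_ONLY = re.compile(r"^\S+\.h3i$")

# Made-up example values for the dfam_h3i parameter.
candidates = [
    "https://example.org/dfam/homo_sapiens_dfam.hmm.h3i",
    "s3://my-bucket/dfam/homo_sapiens_dfam.hmm.h3i",
    "/data/references/dfam/homo_sapiens_dfam.hmm.h3i",
    "/data/references/dfam/homo_sapiens_dfam.hmm.h3f",
]


def accepted(pattern, values):
    """Return the values that match the given compiled pattern."""
    return [value for value in values if pattern.search(value)]


url_only_ok = accepted(URL_ONLY, candidates)
suffix_only_ok = accepted(SUFFIX_ONLY, candidates)
```

With these inputs the URL-only pattern keeps just the https entry, while the suffix-only pattern keeps the https, s3, and local `.h3i` entries and rejects the `.h3f` file.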

Contributor Author

I agree, I'll change that :)

dfam_h3i = Channel.fromPath(params.dfam_h3i, checkIfExists: true)
dfam_h3m = Channel.fromPath(params.dfam_h3m, checkIfExists: true)
dfam_h3p = Channel.fromPath(params.dfam_h3p, checkIfExists: true)
} else {
Contributor

I would change this to an `else if` on params.dfam_version and params.species, followed by a final `else` that fails if all of those are missing.

Collaborator

@delfiterradas is this addressed because dfam_h** are dependent on dfam_version and species?

Contributor Author

Yes, by default the dfam_h** params now depend on dfam_version and species. However, the user can instead pass the full path to each of the dfam_h** params, in which case dfam_version is not used.
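
The precedence discussed in this thread can be sketched as follows, in Python for illustration only. The URL template, the `suffix_for` and `resolve_dfam` helpers, and the rule that the custom branch needs all five paths are assumptions; the pipeline's actual Nextflow code and URLs are not shown in this conversation.

```python
# The five Dfam path parameters named in this thread.
DFAM_KEYS = ("dfam_hmm", "dfam_h3f", "dfam_h3i", "dfam_h3m", "dfam_h3p")

# Hypothetical URL layout, for illustration only.
URL_TEMPLATE = "https://example.org/dfam/{version}/{species}_dfam.hmm{suffix}"


def suffix_for(key):
    """Map a parameter name to a file suffix: dfam_hmm -> '', dfam_h3i -> '.h3i'."""
    ext = key.split("_", 1)[1]
    return "" if ext == "hmm" else "." + ext


def resolve_dfam(params):
    """Pick Dfam locations: custom paths first, then version + species, else fail."""
    custom = {key: params.get(key) for key in DFAM_KEYS}
    if all(custom.values()):
        # Assumption: the custom branch requires every one of the five paths.
        return custom
    if params.get("dfam_version") and params.get("species"):
        # Build each location from the version and species.
        return {
            key: URL_TEMPLATE.format(
                version=params["dfam_version"],
                species=params["species"],
                suffix=suffix_for(key),
            )
            for key in DFAM_KEYS
        }
    # Final else: nothing usable was provided.
    raise ValueError(
        "Provide either all dfam_* paths or both dfam_version and species."
    )
```

Calling `resolve_dfam` with only `dfam_version` and `species` builds all five locations, calling it with all five paths returns them unchanged even if `dfam_version` is also set, and calling it with neither raises an error.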

}

if ((params.dfam_hmm || params.dfam_h3p || params.dfam_h3m || params.dfam_h3i || params.dfam_h3f) && (params.dfam_version)) {
log.warn("Both custom dfam_urls and dfam_version were specified. \n If you want to use custom dfam URLs make sure to provide the full paths for each of the dfam params as the dfam_version will not be overwritten. \n Otherwise, use only the `--dfam_version` and `--species` params and the dfam URLs will be automatically filled.")
Contributor

This warning should say which of the two options takes precedence. In your if/else, if any of these parameters are provided (params.dfam_hmm, params.dfam_h3p, params.dfam_h3m, params.dfam_h3i, params.dfam_h3f), the "building" of the URL via dfam_version won't be triggered. The message should make that clear.
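
One way to word such a message, sketched in Python rather than in the pipeline's Nextflow code. The `dfam_conflict_warning` function name and the message text are hypothetical. The only behaviour taken from this thread is that custom dfam paths win over `--dfam_version`.

```python
# The five custom Dfam path parameters checked by the warning.
DFAM_PATH_KEYS = ("dfam_hmm", "dfam_h3p", "dfam_h3m", "dfam_h3i", "dfam_h3f")


def dfam_conflict_warning(params):
    """Return a warning when custom paths and dfam_version are both set, else None."""
    given = [key for key in DFAM_PATH_KEYS if params.get(key)]
    if given and params.get("dfam_version"):
        flags = ", ".join("--" + key for key in given)
        return (
            "Both custom dfam paths (" + flags + ") and --dfam_version were "
            "specified. The custom paths take precedence and --dfam_version "
            "is ignored."
        )
    return None


message = dfam_conflict_warning(
    {"dfam_h3i": "/refs/dfam.hmm.h3i", "dfam_version": "3.8"}
)
```

Naming the flags that triggered the conflict tells the user exactly which values are in effect.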

delfiterradas marked this pull request as ready for review July 17, 2025 20:47
nextflow.config Outdated
starindex_ref = "${params.genomes_base}/star"
fusionreport_ref = "${params.genomes_base}/fusion_report_db"
ctatsplicing_cancer_introns = "https://data.broadinstitute.org/Trinity/CTAT_RESOURCE_LIB/CANCER_SPLICING_LIB_SUPPLEMENT/cancer_introns.GRCh38.Jun232020.tsv.gz"
dfam_hmm = null
Collaborator

For consistency, I would put all the defaults in nextflow.config rather than nextflow_schema.json, since it is the primary source.

Contributor

The defaults should be in both (otherwise linting will complain). The schema defaults are mainly used for documentation, while the config defaults are the ones actually applied at run time.
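
A rough sketch of the kind of consistency check this implies, in Python for illustration. It is not the nf-core linter. The flat `properties` layout, the `default_mismatches` helper, and all example values are assumptions made for this sketch.

```python
def default_mismatches(config_defaults, schema):
    """Return the parameter names whose config default differs from the schema default."""
    properties = schema.get("properties", {})
    mismatches = []
    for name, value in config_defaults.items():
        # A missing schema default is treated as None, matching a null config default.
        schema_default = properties.get(name, {}).get("default")
        if schema_default != value:
            mismatches.append(name)
    return sorted(mismatches)


# Hypothetical defaults as they might be read from a config file.
config_defaults = {
    "starindex_ref": "refs/star",
    "dfam_hmm": None,
    "species": "homo_sapiens",
}

# Hypothetical, simplified schema with a flat "properties" object.
schema = {
    "properties": {
        "starindex_ref": {"type": "string", "default": "refs/star"},
        "dfam_hmm": {"type": "string"},
        "species": {"type": "string", "default": "human"},
    }
}

out_of_sync = default_mismatches(config_defaults, schema)
```

Here only `species` is reported, because its two defaults disagree; `dfam_hmm` passes since a null config default and an absent schema default are treated as equal.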

apeltzer (Member) left a comment

From my end, this looks good and reasonable 👍🏻

rannick (Collaborator) left a comment

Super good job!

delfiterradas merged commit ab2124b into nf-core:dev Jul 23, 2025
13 checks passed
atrigila mentioned this pull request Sep 16, 2025
10 tasks
6 participants