alevin-fry quant slightly inconsistent results #153

@nschcolnicov

This isn't really an issue, but I'd like to understand why alevin-fry quant always produces different results, even when it is given the same input.
I see differences in these files:

  • featureDump.txt
  • gene_eqclass.txt.gz
  • geqc_counts.mtx
  • quants_mat.mtx
  • quants_mat_rows.txt

The rows are always in a different order, and the values in each row differ slightly.

For example, I compared the featureDump.txt files from two executions that used the same inputs. I sorted both files, ran diff on them, and got this output:

[Image: diff output for the two sorted featureDump.txt files]

As you can see, after sorting, the values are almost the same, but not exactly the same.

I'm asking because I need to write unit tests around this tool, so I need a way to make the results fully reproducible. Is there a way to achieve this?
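
If exact reproducibility turns out not to be possible, a fallback for the tests would be to compare outputs while ignoring row order and allowing a small numeric tolerance. Below is a minimal Python sketch of that idea for a row-oriented text file such as featureDump.txt. It rests on assumptions that are not confirmed anywhere in this issue: that the file is tab-separated, that the first field of each row uniquely identifies the row, and that the tolerances shown are acceptable. The function names (`load_rows`, `fields_match`, `rows_equivalent`) are hypothetical helpers, not part of alevin-fry.

```python
import math


def load_rows(path, sep="\t"):
    """Read a delimited text file into {first_field: [remaining fields]}.

    Assumption: the first field is a unique row key (e.g. a barcode).
    """
    rows = {}
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split(sep)
            if fields == [""]:
                continue  # skip blank lines
            rows[fields[0]] = fields[1:]
    return rows


def fields_match(a, b, rel_tol, abs_tol):
    """Compare numerically when both fields parse as floats, else as text."""
    try:
        return math.isclose(float(a), float(b), rel_tol=rel_tol, abs_tol=abs_tol)
    except ValueError:
        return a == b


def rows_equivalent(path_a, path_b, rel_tol=1e-6, abs_tol=1e-9):
    """True if both files hold the same keys and near-equal values, in any order.

    The default tolerances are placeholders, not values recommended by the tool.
    """
    rows_a, rows_b = load_rows(path_a), load_rows(path_b)
    if rows_a.keys() != rows_b.keys():
        return False
    for key, values_a in rows_a.items():
        values_b = rows_b[key]
        if len(values_a) != len(values_b):
            return False
        if not all(
            fields_match(x, y, rel_tol, abs_tol)
            for x, y in zip(values_a, values_b)
        ):
            return False
    return True
```

A test could then assert `rows_equivalent("run1/featureDump.txt", "run2/featureDump.txt")`, where the two paths are placeholders for the outputs of two runs. The .mtx files would need a different loader, and this only works around the differences rather than explaining them.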

This is the command that I'm using to run the tool:

alevin-fry quant \
-i ./af_permit_list_P10_T_collate -m gencode.v42.transcripts_tg2.txt \
--resolution cr-like-em -o ./alevin_fry_quant_P10_T \
--dump-eqclasses \
--use-mtx -t 16
