Improvement of xmax determination in plot_spectra_cn on high-depth data #163

@Yoonsung1203

Hi,

We ran Merqury's hapmer.sh on child HiFi data sequenced at 150X coverage and parental WGS data sequenced at 30X coverage each.

At line 168 of plot_spectra_cn.R, the step that removes the lowest-multiplicity bins of the child-only k-mer multiplicity histogram uses a hard-coded cutoff of 3, which causes xmax to be set too far forward. As a result, the plot appears as follows.

Image

The hard-coded value appears to filter too little for high-coverage datasets. When we changed it to 5, xmax was determined properly, as shown below.

Image
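
To make the effect concrete, here is a small Python sketch (not the R code in plot_spectra_cn.R). It assumes, purely for illustration, that the plotting range is derived from the tallest histogram bin that survives the low-multiplicity filter. The function name `peak_multiplicity` and the synthetic histogram are hypothetical; the numbers are made up to mimic a slowly decaying error tail next to a true peak near 150X.

```python
# Illustrative sketch only; this is NOT the code in plot_spectra_cn.R.

def peak_multiplicity(hist, min_multiplicity):
    """Return the multiplicity of the tallest bin, ignoring every bin
    whose multiplicity is <= min_multiplicity (assumed strict filter)."""
    kept = {m: c for m, c in hist.items() if m > min_multiplicity}
    return max(kept, key=kept.get)

# Synthetic histogram (made-up numbers): an error tail that decays as
# 1/m^2 plus a triangular "true" peak centred at multiplicity 150.
hist = {}
for m in range(1, 301):
    error_tail = 1_000_000 // (m * m)
    true_peak = max(0, 50_000 - 1_000 * abs(m - 150))
    hist[m] = error_tail + true_peak

# With a cutoff of 3 the error tail at multiplicity 4 still wins;
# with a cutoff of 5 the true peak is found.
print(peak_multiplicity(hist, 3))
print(peak_multiplicity(hist, 5))
```

In this toy example the bin at multiplicity 4 is still taller than the true peak, so a cutoff of 3 picks the error tail, while a cutoff of 5 skips far enough into the tail for the peak at 150 to be selected.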

As an improvement, how about having the filtering step use the count values recorded in cutoffs.txt instead of a fixed constant?
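
A rough Python sketch of this idea follows. It is only an illustration under stated assumptions: that cutoffs.txt holds one whitespace-separated `name count` pair per line, and that taking the largest recorded count (never going below the current constant of 3) is a sensible rule. Neither the file layout nor the rule is confirmed here, and the helper names `read_cutoffs` and `choose_min_multiplicity` as well as the sample labels are hypothetical.

```python
# Illustrative sketch only; file layout and selection rule are assumptions.

def read_cutoffs(text):
    """Parse lines of '<name> <count>' (assumed layout) into a dict."""
    cutoffs = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            cutoffs[fields[0]] = int(fields[1])
    return cutoffs

def choose_min_multiplicity(cutoffs, default=3):
    """Pick the largest recorded cutoff, but never less than the
    fixed default that is used today."""
    return max([default, *cutoffs.values()])

# Hypothetical file contents; the labels and values are made up.
sample = "hapA\t7\nhapB\t6\n"
cutoffs = read_cutoffs(sample)
print(choose_min_multiplicity(cutoffs))   # uses the recorded counts
print(choose_min_multiplicity({}))        # falls back to the default
```

Falling back to the existing constant when no cutoffs are available would keep the behaviour unchanged for runs that do not produce cutoffs.txt.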

Labels: bug (Something isn't working)
