Skip to content

VG giraffe extremely slow for some samples with repeated Watchdog warnings #4771

@Daishoulu

Description

@Daishoulu

Dear VG Team,

I built the graph and mapping using the following workflow:

=====>code<=====
vg version
#vg version v1.70.0 "Zebedassi"
vg autoindex --workflow giraffe
--prefix Graph
--tmp-dir TMPDIR
--threads 16
--ref-fasta $fasta
--vcf $VCF
vg snarls --include-trivial Graph.giraffe.gbz -t 32 > Graph.snarls
vg giraffe -Z Graph.giraffe.gbz -m Graph.shortread.withzip.min -d Graph.dist -f $Fastq1 -f $Fastq2 -t $Threads --rescue-algorithm none -b fast -N $SampleName -p > $OutDir/${SampleName}.gam
vg pack -x Graph.giraffe.gbz -g $OutDir/${SampleName}.gam -o $OutDir/${SampleName}.pack -t $Threads
vg call Graph.giraffe.gbz -r Graph.snarls -k $OutDir/${SampleName}.pack -s $SampleName -a -t $Threads --bias-mode --het-bias 2,4 > $OutDir/$SampleName.vcf
=====>code<=====

As shown in the vg logs below, some samples complete vg giraffe, vg pack, and vg call smoothly in a short time, but for other samples there are always warnings at the vg giraffe stage, which makes it impossible to finish mapping a single sample even after three days. I’ve checked reports of this issue from other users, and as the logs show, even when I use --rescue-algorithm none I still can’t avoid the warnings. I would like to know what the most fundamental solution to this problem is.

=====>Sample1<=====
[vg giraffe] Guessing that vgAutoindex2/Graph.shortread.zipcodes is Short Read Zipcodes
[vg giraffe] Guessing that vgAutoindex2/Graph.xg is XG
[vg giraffe] Preparing Indexes
[vg giraffe] Loading Minimizer Index
[vg giraffe] Loading Zipcodes
[vg giraffe] Loading GBZ
[vg giraffe] Loading Distance Index v3
[vg giraffe] Paging in Distance Index v3
[vg giraffe] Initializing MinimizerMapper
[vg giraffe] Loading and initialization: 67.5071 seconds
[vg giraffe] Of which Distance Index v3 paging: 2.186 seconds
[vg giraffe] Mapping reads to "-" (GAM)
--watchdog-timeout 10
--batch-size 512
--match 1
--mismatch 4
--gap-open 6
--gap-extend 1
--full-l-bonus 5
--max-multimaps 1
--hit-cap 10
--hard-hit-cap 500
--score-fraction 0.5
--max-min 500
--min-coverage-flank 250
--num-bp-per-min 1000
--downsample-window-length 18446744073709551615
--downsample-window-count 0
--distance-limit 200
--max-extensions 400
--max-alignments 8
--cluster-score 50
--pad-cluster-score 0
--cluster-coverage 0.2
--max-extension-mismatches 4
--extension-score 1
--extension-set 20
--rescue-attempts 15
--max-fragment-length 2000
--paired-distance-limit 2
--rescue-subgraph-size 4
--rescue-seed-limit 100
--mapq-score-window 0
--mapq-score-scale 1
--zipcode-tree-score-threshold 50
--pad-zipcode-tree-score-threshold 20
--zipcode-tree-coverage-threshold 0.3
--zipcode-tree-scale 2
--min-to-fragment 4
--max-to-fragment 10
--max-direct-chain 0
--gapless-extension-limit 0
--fragment-max-graph-lookback-bases 300
--fragment-max-graph-lookback-bases-per-base 0.03
--fragment-max-read-lookback-bases 18446744073709551615
--fragment-max-read-lookback-bases-per-base 1
--max-fragments 18446744073709551615
--fragment-max-indel-bases 2000
--fragment-max-indel-bases-per-base 0.2
--fragment-gap-scale 1
--fragment-points-per-possible-match 0
--fragment-score-fraction 0.1
--fragment-max-min-score 1.79769e+308
--fragment-min-score 60
--fragment-set-score-threshold 0
--min-chaining-problems 1
--max-chaining-problems 2147483647
--max-graph-lookback-bases 3000
--max-graph-lookback-bases-per-base 0.3
--max-read-lookback-bases 18446744073709551615
--max-read-lookback-bases-per-base 1
--max-indel-bases 2000
--max-indel-bases-per-base 0.2
--item-bonus 0
--item-scale 1
--gap-scale 1
--rec-penalty-chain 0
--rec-penalty-fragment 0
--points-per-possible-match 0
--chain-score-threshold 100
--min-chains 4
--min-chain-score-per-base 0.01
--max-min-chain-score 200
--max-skipped-bases 0
--max-chains-per-tree 1
--max-chain-connection 100
--max-tail-length 100
--max-dp-cells 18446744073709551615
--max-tail-gap 18446744073709551615
--max-middle-gap 18446744073709551615
--max-tail-dp-length 30000
--max-middle-dp-length 2147483647
--wfa-max-mismatches 2
--wfa-max-mismatches-per-base 0.1
--wfa-max-max-mismatches 20
--wfa-distance 10
--wfa-distance-per-base 0.1
--wfa-max-distance 200
--softclip-penalty 0
--min-unique-node-fraction 0
--rescue-algorithm dozeu
[vg giraffe] Using fragment length estimate: 315.716 +/- 105.642
warning[vg::giraffe]: Refusing to perform too-large rescue alignment of 125 bp against 14414 bp ordered subgraph for read SRR6650035.235043 which would use more than 1572864 cells and might exhaust Dozeu's allocator; suppressing further warnings.
[vg giraffe] Mapped 156386470 reads across 16 threads in 1530.97 seconds with 1.95959 additional single-threaded seconds.
[vg giraffe] Mapping speed: 6383.76 reads per second per thread
[vg giraffe] Used 23633.6 CPU-seconds (including output).
[vg giraffe] Achieved 6617.12 reads per CPU-second (including output)
[vg giraffe] Used 185339210608661 CPU instructions (not including output).
[vg giraffe] Mapping slowness: 1.18514 M instructions per read at 7842.19 M mapping instructions per inclusive CPU-second
[vg giraffe] Memory footprint: 43.1987 GB
[vg call] You can restrict the search to GBZ haplotypes, often to the benefict of speed and accuracy, with the -z option
=====>Sample1<=====

=====>Sample2<=====
[vg giraffe] Guessing that vgAutoindex2/Graph.shortread.zipcodes is Short Read Zipcodes
[vg giraffe] Guessing that vgAutoindex2/Graph.xg is XG
[vg giraffe] Preparing Indexes
[vg giraffe] Loading Minimizer Index
[vg giraffe] Loading Zipcodes
[vg giraffe] Loading GBZ
[vg giraffe] Loading Distance Index v3
[vg giraffe] Paging in Distance Index v3
[vg giraffe] Initializing MinimizerMapper
[vg giraffe] Loading and initialization: 59.8921 seconds
[vg giraffe] Of which Distance Index v3 paging: 1.82132 seconds
[vg giraffe] Mapping reads to "-" (GAM)
--watchdog-timeout 10
--batch-size 512
--match 1
--mismatch 4
--gap-open 6
--gap-extend 1
--full-l-bonus 5
--max-multimaps 1
--hit-cap 10
--hard-hit-cap 500
--score-fraction 0.5
--max-min 500
--min-coverage-flank 250
--num-bp-per-min 1000
--downsample-window-length 18446744073709551615
--downsample-window-count 0
--distance-limit 200
--max-extensions 400
--max-alignments 8
--cluster-score 50
--pad-cluster-score 0
--cluster-coverage 0.2
--max-extension-mismatches 4
--extension-score 1
--extension-set 20
--rescue-attempts 0
--max-fragment-length 2000
--paired-distance-limit 2
--rescue-subgraph-size 4
--rescue-seed-limit 100
--mapq-score-window 0
--mapq-score-scale 1
--zipcode-tree-score-threshold 50
--pad-zipcode-tree-score-threshold 20
--zipcode-tree-coverage-threshold 0.3
--zipcode-tree-scale 2
--min-to-fragment 4
--max-to-fragment 10
--max-direct-chain 0
--gapless-extension-limit 0
--fragment-max-graph-lookback-bases 300
--fragment-max-graph-lookback-bases-per-base 0.03
--fragment-max-read-lookback-bases 18446744073709551615
--fragment-max-read-lookback-bases-per-base 1
--max-fragments 18446744073709551615
--fragment-max-indel-bases 2000
--fragment-max-indel-bases-per-base 0.2
--fragment-gap-scale 1
--fragment-points-per-possible-match 0
--fragment-score-fraction 0.1
--fragment-max-min-score 1.79769e+308
--fragment-min-score 60
--fragment-set-score-threshold 0
--min-chaining-problems 1
--max-chaining-problems 2147483647
--max-graph-lookback-bases 3000
--max-graph-lookback-bases-per-base 0.3
--max-read-lookback-bases 18446744073709551615
--max-read-lookback-bases-per-base 1
--max-indel-bases 2000
--max-indel-bases-per-base 0.2
--item-bonus 0
--item-scale 1
--gap-scale 1
--rec-penalty-chain 0
--rec-penalty-fragment 0
--points-per-possible-match 0
--chain-score-threshold 100
--min-chains 4
--min-chain-score-per-base 0.01
--max-min-chain-score 200
--max-skipped-bases 0
--max-chains-per-tree 1
--max-chain-connection 100
--max-tail-length 100
--max-dp-cells 18446744073709551615
--max-tail-gap 18446744073709551615
--max-middle-gap 18446744073709551615
--max-tail-dp-length 30000
--max-middle-dp-length 2147483647
--wfa-max-mismatches 2
--wfa-max-mismatches-per-base 0.1
--wfa-max-max-mismatches 20
--wfa-distance 10
--wfa-distance-per-base 0.1
--wfa-max-distance 200
--softclip-penalty 0
--min-unique-node-fraction 0
--rescue-algorithm none
[vg giraffe] Using fragment length estimate: 287.569 +/- 54.724
warning[vg::Watchdog]: Thread 1 has been checked in for 10 seconds processing: E100078082L1C001R00200775564/1, E100078082L1C001R00200775564/2
warning[vg::Watchdog]: Thread 2 has been checked in for 10 seconds processing: E100078082L1C001R00200828212/1, E100078082L1C001R00200828212/2
warning[vg::Watchdog]: Thread 10 has been checked in for 10 seconds processing: E100078082L1C001R00200475849/1, E100078082L1C001R00200475849/2
warning[vg::Watchdog]: Thread 10 finally checked out after 10 seconds and 30576 kb memory growth processing: E100078082L1C001R00200475849/1, E100078082L1C001R00200475849/2
warning[vg::Watchdog]: Thread 4 has been checked in for 10 seconds processing: E100078082L1C001R00200738282/1, E100078082L1C001R00200738282/2
warning[vg::Watchdog]: Thread 4 finally checked out after 12 seconds and 0 kb memory growth processing: E100078082L1C001R00200738282/1, E100078082L1C001R00200738282/2
warning[vg::Watchdog]: Thread 5 has been checked in for 10 seconds processing: E100078082L1C001R00200886132/1, E100078082L1C001R00200886132/2
warning[vg::Watchdog]: Thread 5 finally checked out after 10 seconds and 0 kb memory growth processing: E100078082L1C001R00200886132/1, E100078082L1C001R00200886132/2
warning[vg::Watchdog]: Thread 1 finally checked out after 17 seconds and 30576 kb memory growth processing: E100078082L1C001R00200775564/1, E100078082L1C001R00200775564/2
warning[vg::Watchdog]: Thread 2 finally checked out after 17 seconds and 30576 kb memory growth processing: E100078082L1C001R00200828212/1, E100078082L1C001R00200828212/2
warning[vg::Watchdog]: Thread 14 has been checked in for 10 seconds processing: E100078082L1C001R00200868257/1, E100078082L1C001R00200868257/2
warning[vg::Watchdog]: Thread 14 finally checked out after 11 seconds and 0 kb memory growth processing: E100078082L1C001R00200868257/1, E100078082L1C001R00200868257/2
warning[vg::Watchdog]: Thread 3 has been checked in for 10 seconds processing: E100078082L1C001R00200924242/1, E100078082L1C001R00200924242/2
warning[vg::Watchdog]: Thread 3 finally checked out after 12 seconds and 0 kb memory growth processing: E100078082L1C001R00200924242/1, E100078082L1C001R00200924242/2
warning[vg::Watchdog]: Thread 4 has been checked in for 10 seconds processing: E100078082L1C001R00200739170/1, E100078082L1C001R00200739170/2
warning[vg::Watchdog]: Thread 7 has been checked in for 10 seconds processing: E100078082L1C001R00200812118/1, E100078082L1C001R00200812118/2
warning[vg::Watchdog]: Thread 4 finally checked out after 15 seconds and 0 kb memory growth processing: E100078082L1C001R00200739170/1, E100078082L1C001R00200739170/2
warning[vg::Watchdog]: Thread 7 finally checked out after 11 seconds and 0 kb memory growth processing: E100078082L1C001R00200812118/1, E100078082L1C001R00200812118/2
warning[vg::Watchdog]: Thread 8 has been checked in for 10 seconds processing: E100078082L1C001R01103229428/1, E100078082L1C001R01103229428/2
warning[vg::Watchdog]: Thread 9 has been checked in for 10 seconds processing: E100078082L1C001R01103503684/1, E100078082L1C001R01103503684/2
warning[vg::Watchdog]: Thread 11 has been checked in for 10 seconds processing: E100078082L1C001R01103306733/1, E100078082L1C001R01103306733/2
warning[vg::Watchdog]: Thread 13 has been checked in for 10 seconds processing: E100078082L1C001R01103269285/1, E100078082L1C001R01103269285/2
warning[vg::Watchdog]: Thread 15 has been checked in for 10 seconds processing: E100078082L1C001R01103484689/1, E100078082L1C001R01103484689/2
warning[vg::Watchdog]: Thread 11 finally checked out after 11 seconds and 0 kb memory growth processing: E100078082L1C001R01103306733/1, E100078082L1C001R01103306733/2
warning[vg::Watchdog]: Thread 3 has been checked in for 10 seconds processing: E100078082L1C001R01103346222/1, E100078082L1C001R01103346222/2
warning[vg::Watchdog]: Thread 9 finally checked out after 16 seconds and 0 kb memory growth processing: E100078082L1C001R01103503684/1, E100078082L1C001R01103503684/2
warning[vg::Watchdog]: Thread 3 finally checked out after 14 seconds and 0 kb memory growth processing: E100078082L1C001R01103346222/1, E100078082L1C001R01103346222/2
warning[vg::Watchdog]: Thread 2 has been checked in for 10 seconds processing: E100078082L1C001R01103381222/1, E100078082L1C001R01103381222/2
warning[vg::Watchdog]: Thread 4 has been checked in for 10 seconds processing: E100078082L1C001R01103288429/1, E100078082L1C001R01103288429/2
warning[vg::Watchdog]: Thread 15 finally checked out after 27 seconds and 0 kb memory growth processing: E100078082L1C001R01103484689/1, E100078082L1C001R01103484689/2
warning[vg::Watchdog]: Thread 2 finally checked out after 14 seconds and 0 kb memory growth processing: E100078082L1C001R01103381222/1, E100078082L1C001R01103381222/2
warning[vg::Watchdog]: Thread 5 has been checked in for 10 seconds processing: E100078082L1C001R01103428091/1, E100078082L1C001R01103428091/2
...
=====>Sample2<=====

Best,
Shoulu Dai

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions