Open
Description
We need to handle adapter sequences.
Observed behaviour:
- amplicon N is ok, amplicon N+1 is dropped.
- reads at the end of amplicon N have adapters.
- a few bp of adapter are included in the consensus output by cylon, so the consensus is amplicon + (a little adapter sequence)
- the reads all map fine to the consensus, adapters included, because the consensus now contains some adapter sequence
- self-QC does not mask the adapters
Desired behaviour: the adapter sequence is masked. I think self-QC could do this. An alternative would be to remove adapters from the reads earlier in the pipeline, so that later stages never see them.
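To illustrate the second option (removing adapters from the reads up front), here is a minimal sketch of 3' adapter trimming. It is not how the pipeline currently works. The function name `trim_3prime_adapter`, the `min_overlap` parameter, and the adapter string are all assumptions made for the example; the real adapter sequence depends on the library prep, and in practice a dedicated trimming tool would probably be a better fit.

```python
def trim_3prime_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter, or a partial adapter prefix, from the end of a read.

    Hypothetical helper for illustration only: exact matching, no mismatches,
    no quality handling.
    """
    # Case 1: the full adapter occurs in the read. Drop it and everything after.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]

    # Case 2: the read ends part-way into the adapter. Find the longest read
    # suffix that equals an adapter prefix of at least min_overlap bases.
    max_len = min(len(read), len(adapter) - 1)
    for k in range(max_len, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]

    # No adapter found: return the read unchanged.
    return read


# Placeholder adapter sequence, assumed for this example only.
ADAPTER = "AGATCGGAAGAGC"

print(trim_3prime_adapter("ACGTACGTAGATC", ADAPTER))        # partial adapter at the end
print(trim_3prime_adapter("ACGT" + ADAPTER + "TT", ADAPTER))  # full adapter inside the read
print(trim_3prime_adapter("ACGTACGT", ADAPTER))             # no adapter
```

The `min_overlap` threshold is a trade-off: a low value removes short adapter fragments like the few bp seen here, but also trims genuine bases that happen to match the start of the adapter.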
Example is the same as in #99. Please also see #100 for more detail on the amplicon in question.
The adapter sequence is included in the consensus at the end of amplicon SARS-CoV-2_76. All the reads there end either with the primer, or with the primer plus a few bp of adapter. None of them should contribute to Clean.Tot.cons at the positions of the primer or past the end of the primer.
First columns of all_stats.tsv, covering the end of the primer, the start of the adapter, and the start of the dropped amplicon:
23055 A 23030 A 335 0 2 3 2 342 335 7
23056 A 23031 A 322 0 2 0 0 324 322 2
23057 C 23032 C 1 309 0 4 2 316 309 7
23058 C 23033 A 170 1 0 0 21 192 170 22
23058 - 23034 G 2 2 181 0 1 186 181 5
23059 C 23035 C 1 171 0 4 1 177 171 6
23060 A 23036 A 169 2 0 0 0 171 169 2
23061 C 23037 A 164 0 0 0 0 164 164 0
23062 T 23038 T 0 0 2 155 0 157 155 2
23063 A 23039 A 141 2 0 0 0 143 141 2
23064 A 23040 N 0 0 0 0 0 0 0 0
23065 T 23041 N 0 0 0 0 0 0 0 0
23066 G 23042 N 0 0 0 0 0 0 0 0
23067 G 23043 N 0 0 0 0 0 0 0 0
23068 T 23044 N 0 0 0 0 0 0 0 0
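As a rough way to spot the adapter-derived positions in rows like these, the sketch below flags consensus positions whose base differs from the reference base. It assumes the first four columns are reference position, reference base, consensus position, and consensus base; that reading is inferred from the rows above and is not confirmed by the file header. The function name `divergent_positions` is hypothetical.

```python
# Rows copied from the all_stats.tsv excerpt above (first four columns matter here).
ROWS = """\
23055 A 23030 A 335 0 2 3 2 342 335 7
23056 A 23031 A 322 0 2 0 0 324 322 2
23057 C 23032 C 1 309 0 4 2 316 309 7
23058 C 23033 A 170 1 0 0 21 192 170 22
23058 - 23034 G 2 2 181 0 1 186 181 5
23059 C 23035 C 1 171 0 4 1 177 171 6
23060 A 23036 A 169 2 0 0 0 171 169 2
23061 C 23037 A 164 0 0 0 0 164 164 0
23062 T 23038 T 0 0 2 155 0 157 155 2
23063 A 23039 A 141 2 0 0 0 143 141 2
23064 A 23040 N 0 0 0 0 0 0 0 0
"""


def divergent_positions(text: str) -> list[int]:
    """Return consensus positions where the consensus base differs from the reference.

    Assumed column layout: ref_pos, ref_base, cons_pos, cons_base, then counts.
    Positions already called N in the consensus are skipped.
    """
    flagged = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        ref_base, cons_pos, cons_base = fields[1], int(fields[2]), fields[3]
        if cons_base != "N" and cons_base != ref_base:
            flagged.append(cons_pos)
    return flagged


print(divergent_positions(ROWS))  # → [23033, 23034, 23037]
```

A mismatch with the reference is not proof of adapter sequence on its own, since real variants look the same. It is only a quick filter for narrowing down where to look near an amplicon end.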