Description of the bug
The hg38 reference files provided on the Van Loo lab GitHub DO have 'chr' prefixes, so the following instructions are incorrect:
How to generate ASCAT resources for exome or targeted sequencing
Fetch the GC content correction and replication timing (RT) correction files from the [Dropbox links provided by the ASCAT developers](https://github.com/VanLoo-lab/ascat/tree/master/ReferenceFiles/WGS) and intersect the SNP coordinates with the exome target coordinates. If the target file has ‘chr’ prefixes, make a copy with these removed first. Extract the GC and RT information for only the on target SNPs and zip the results.
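The code that accompanies these instructions fails for multiple reasons:
- It removes the 'chr' prefix from the target file, although the hg38 reference files keep it.
- The downloaded files are named ${t}_G1000_WES_hg38.zip but then unzip to xx_G100_hg38.txt, so the file names used in the script do not match.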
sed -e 's/chr//' targets_with_chr.bed > targets.bed
for t in GC RT
do
unzip ${t}_G1000_hg38.zip
cut -f 1-3 ${t}_G1000_hg38.txt > ascat_${t}_snps_hg38.txt
tail -n +2 ascat_${t}_snps_hg38.txt | awk '{ print $2 "\t" $3-1 "\t" $3 "\t" $1 }' > ascat_${t}_snps_hg38.bed
bedtools intersect -a ascat_${t}_snps_hg38.bed -b targets.bed | awk '{ print $1 "_" $3 }' > ascat_${t}_snps_on_target_hg38.txt
head -n 1 ${t}_G1000_hg38.txt > ${t}_G1000_on_target_hg38.txt
grep -f ascat_${t}_snps_on_target_hg38.txt ${t}_G1000_hg38.txt >> ${t}_G1000_on_target_hg38.txt
zip ${t}_G1000_on_target_hg38.zip ${t}_G1000_on_target_hg38.txt
rm ${t}_G1000_hg38.zip
done
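As a possible starting point for a fix, here is a minimal sketch of two helpers. Both `chrom_style` and `keep_on_target` are hypothetical names, not part of ASCAT or the pipeline. The sketch assumes the GC/RT tables are tab-separated, have one header line, and carry a SNP ID of the form chromosome_position in the first column, as the script above implies. `chrom_style` reports whether a file uses 'chr' prefixes, so the prefix is only stripped when the two inputs actually disagree. `keep_on_target` replaces the `grep -f` step with an exact match on the ID column, because `grep -f` matches substrings and an ID such as chr1_100 would also select chr1_1000.

```shell
# Sketch only: hypothetical helpers, assuming tab-separated input files.

# chrom_style FILE COL SKIP
# Prints "chr" or "nochr" depending on column COL of the first non-empty
# line after SKIP header lines.
chrom_style() {
    awk -F'\t' -v col="$2" -v skip="$3" '
        NR > skip && NF { print (($col ~ /^chr/) ? "chr" : "nochr"); exit }
    ' "$1"
}

# keep_on_target IDS TABLE
# Prints the header of TABLE plus every row whose first column is exactly
# one of the SNP IDs listed (one per line) in IDS.
keep_on_target() {
    awk -F'\t' '
        FILENAME == ARGV[1] { keep[$1] = 1; next }  # first file: load the IDs
        FNR == 1 || ($1 in keep)                    # second file: header + exact matches
    ' "$1" "$2"
}
```

In the loop above, the `head`/`grep -f` pair could then become `keep_on_target ascat_${t}_snps_on_target_hg38.txt ${t}_G1000_hg38.txt > ${t}_G1000_on_target_hg38.txt`, and the `sed` step would only run when `chrom_style targets_with_chr.bed 1 0` and `chrom_style ${t}_G1000_hg38.txt 2 1` print different values.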
Command used and terminal output
Relevant files
No response
System information
No response