removed max_target_seqs option in blastp process #12

Open
husensofteng wants to merge 1 commit into lehtiolab:master from husensofteng:patch-3

Conversation

@husensofteng
Member

Remove the `-max_target_seqs 1` parameter from the `blastp` process. The option can make the algorithm stop early, so it may fail to find the best matches.

See the references [here](https://gist.github.com/sujaikumar/504b3b7024eaf3a04ef5) and [here](https://doi.org/10.1093/bioinformatics/bty833).

https://github.com/lehtiolab/proteogenomics-analysis-workflow/blob/3472dcbaf8b3d20d41a7113519dff691c0a4b0b7/main.nf#L698

blastp -db $blastdb -query $novelfasta -outfmt '6 qseqid sseqid pident qlen slen qstart qend sstart send mismatch positive gapopen gaps qseq sseq evalue bitscore' -num_threads 4 -max_target_seqs 1 -evalue 1000 -out blastp_out.txt
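Without `-max_target_seqs 1`, blastp may write more than one hit per query to `blastp_out.txt`. Whether the downstream steps of the workflow already handle that is not stated here. If a single hit per query is needed, a post-filter along these lines could select the row with the highest bitscore. This is only a sketch: the function name `best_hit_per_query` and the output file name are hypothetical, and it assumes the 17-column tabular layout from the command above (qseqid in column 1, bitscore in column 17) and a `sort` that supports `-g` (GNU or BSD).

```shell
# Hypothetical helper: keep only the highest-bitscore hit for each query.
# Assumes tab-separated input in the 17-column -outfmt 6 layout used above,
# with qseqid in column 1 and bitscore in column 17.
best_hit_per_query() {
    # Sort by query ID, then by bitscore in descending general-numeric order;
    # awk then keeps the first row it sees for each query ID.
    sort -t "$(printf '\t')" -k1,1 -k17,17gr | awk -F '\t' '!seen[$1]++'
}

# Usage (blastp_top_hits.txt is a placeholder name):
# best_hit_per_query < blastp_out.txt > blastp_top_hits.txt
```

Filtering after the search keeps the ranking decision outside BLAST, so it does not depend on how `-max_target_seqs` is applied internally.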
@husensofteng
Member Author

Actually, it turns out that blastp can still report canonical sequences as not mapping.
I have tested the following parameters, which seem to find the matches properly:
-evalue 10000 -comp_based_stats 0 -ungapped
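For illustration, the existing command with these parameters applied might look like the sketch below. It assumes that `-evalue 10000` replaces the current `-evalue 1000` and that `-max_target_seqs 1` stays removed; neither is confirmed in this thread. The wrapper function `run_blastp` and the `BLASTP` override variable are hypothetical and only exist so the command line can be inspected without running a search.

```shell
# Hypothetical wrapper around the blastp call from main.nf.
# $1 = BLAST database, $2 = FASTA file with the novel peptides.
# BLASTP can be overridden (e.g. BLASTP=echo) to print the arguments
# instead of running a search.
run_blastp() {
    blastdb=$1
    novelfasta=$2
    "${BLASTP:-blastp}" \
        -db "$blastdb" \
        -query "$novelfasta" \
        -outfmt '6 qseqid sseqid pident qlen slen qstart qend sstart send mismatch positive gapopen gaps qseq sseq evalue bitscore' \
        -num_threads 4 \
        -evalue 10000 -comp_based_stats 0 -ungapped \
        -out blastp_out.txt
}

# Dry run (database and file names are placeholders):
# BLASTP=echo run_blastp my_blastdb novel.fasta
```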
