What is BLAST?
Overview
BLAST stands for Basic Local Alignment Search Tool. The primary purpose of BLAST is to compare protein and gene sequences against available public databases.
It is a set of sequence comparison algorithms used to search databases for an
Types of BLAST
Type | Query | Database | Comparison |
blastn | DNA | DNA | DNA level |
blastx | DNA | Protein | Protein level |
tblastx | DNA | DNA | Protein level |
blastp | Protein | Protein | Protein level |
tblastn | Protein | DNA | Protein level |
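The table can also be read as a lookup from the kind of query and database you have to the programs that accept them. The sketch below encodes the rows as a Python dictionary. The helper name `candidate_programs` is hypothetical and is not part of any BLAST distribution.

```python
# Molecule types for each BLAST program, copied from the table above.
BLAST_PROGRAMS = {
    "blastn":  {"query": "DNA",     "database": "DNA",     "comparison": "DNA"},
    "blastx":  {"query": "DNA",     "database": "Protein", "comparison": "Protein"},
    "tblastx": {"query": "DNA",     "database": "DNA",     "comparison": "Protein"},
    "blastp":  {"query": "Protein", "database": "Protein", "comparison": "Protein"},
    "tblastn": {"query": "Protein", "database": "DNA",     "comparison": "Protein"},
}


def candidate_programs(query_type, database_type):
    """Return the programs (sorted by name) that accept this query/database pairing."""
    return sorted(
        name
        for name, spec in BLAST_PROGRAMS.items()
        if spec["query"] == query_type and spec["database"] == database_type
    )


print(candidate_programs("DNA", "DNA"))      # DNA query against a DNA database
print(candidate_programs("Protein", "DNA"))  # protein query against a DNA database
```

A DNA query against a DNA database matches two rows, because the comparison can be made either at the DNA level or at the protein level.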
BLAST input and output
BLAST input and output are explained below.
Input
BLAST takes input in the form of a
Output
BLAST output is available in multiple formats. These include HTML, plain text, and XML.
The BLAST algorithm
The BLAST algorithm has two basic components: word matching and extended hits.
- A list of high-scoring words of a fixed length is obtained for the query.
- The obtained word list is compared with the database, and exact matches are identified.
- For every word match, an extended alignment is done in both directions to find alignments with scores greater than the score threshold.
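The steps above can be sketched as a toy seed-and-extend search in Python. This is a simplified illustration and not the real BLAST implementation: it uses every word of the query instead of a filtered list of high-scoring words, it scores +1 for a match and -1 for a mismatch, it never opens gaps, and it stops extending once the running score falls more than `x_drop` below the best score seen. All function names, the scoring values, `x_drop`, and `threshold` are assumptions chosen for the example.

```python
def find_seeds(query, subject, word_size):
    """Return (query_pos, subject_pos) pairs where a query word occurs exactly in the subject."""
    index = {}
    for i in range(len(query) - word_size + 1):
        index.setdefault(query[i:i + word_size], []).append(i)
    seeds = []
    for j in range(len(subject) - word_size + 1):
        for i in index.get(subject[j:j + word_size], []):
            seeds.append((i, j))
    return seeds


def _extend_one_way(query, subject, i, j, step, match, mismatch, x_drop):
    """Walk along the diagonal in one direction; return (best_gain, best_length)."""
    score = best = 0
    length = best_len = 0
    while 0 <= i < len(query) and 0 <= j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > x_drop:
            break  # the score has dropped too far below the best seen so far
        i += step
        j += step
    return best, best_len


def extend_seed(query, subject, i, j, word_size, match=1, mismatch=-1, x_drop=2):
    """Extend an exact word match in both directions without gaps."""
    right_gain, right_len = _extend_one_way(
        query, subject, i + word_size, j + word_size, 1, match, mismatch, x_drop)
    left_gain, left_len = _extend_one_way(
        query, subject, i - 1, j - 1, -1, match, mismatch, x_drop)
    score = word_size * match + left_gain + right_gain
    length = left_len + word_size + right_len
    return score, i - left_len, j - left_len, length


def toy_blast(query, subject, word_size=3, threshold=5):
    """Return (score, query_start, subject_start, length) for extensions scoring above threshold."""
    hits = set()  # different seeds on one diagonal often extend to the same alignment
    for i, j in find_seeds(query, subject, word_size):
        hit = extend_seed(query, subject, i, j, word_size)
        if hit[0] > threshold:
            hits.add(hit)
    return sorted(hits, reverse=True)


print(toy_blast("ACGTACGT", "TTACGTACGTTT"))
```

Indexing the query words once and then scanning the subject keeps the seeding step roughly linear in the sequence lengths, which is the intuition behind the speed advantage over full dynamic programming.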
Two statistical parameters are used to judge the significance of the similarities found:
E-value
- E-value is described as the number of hits "expected" to be seen by chance while searching a database.
- It decreases exponentially as the score of the match increases.
- The smaller the E-value, the more significant the match.
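The exponential relationship can be illustrated with the Karlin-Altschul estimate, E = K · m · n · e^(−λS), where m is the query length, n is the database length, and S is the raw alignment score. The sketch below is a minimal Python version. The function name is hypothetical, and the values of `lam` and `k` are illustrative only, since the real parameters depend on the scoring matrix and gap penalties in use.

```python
import math


def expected_hits(raw_score, query_len, db_len, lam, k):
    """Karlin-Altschul estimate: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Illustrative parameters (assumptions, not taken from a specific BLAST run).
LAM, K = 0.318, 0.134

for score in (40, 60, 80):
    e = expected_hits(score, query_len=250, db_len=1_000_000, lam=LAM, k=K)
    print(f"raw score {score}: E = {e:.3g}")
```

Each increase of 20 in the raw score multiplies E by the same factor, e^(−20λ), which is what "decreases exponentially" means here.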
Bit score
- A bit score measures the sequence similarity independent of query sequence length and database size.
- It is obtained by normalizing the raw pairwise alignment score.
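As a rough sketch of that normalization, the standard conversion is S' = (λS − ln K) / ln 2, after which the E-value can be written as E = m · n · 2^(−S'). The Python below assumes this Karlin-Altschul form. The function names are hypothetical, and the `lam` and `k` values are illustrative placeholders and not parameters from an actual search.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value_from_bits(bits, query_len, db_len):
    """E-value from a bit score: E = m * n * 2 ** (-S')."""
    return query_len * db_len * 2.0 ** (-bits)


# Illustrative parameters (assumptions, not taken from a specific BLAST run).
LAM, K = 0.318, 0.134

bits = bit_score(60, LAM, K)
print(f"bit score: {bits:.1f}")
print(f"E-value:   {e_value_from_bits(bits, query_len=250, db_len=1_000_000):.3g}")
```

Note that `bit_score` takes no length arguments: the query and database sizes enter only when the bit score is turned into an E-value.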
The BLAST algorithm is at least fifty times faster than computing alignments by dynamic programming.
BLAST function
A few of the significant uses of BLAST include identifying species by finding homologous sequences, mapping DNA by finding identical DNA and splicing patterns, locating domains, and building phylogenetic trees based on DNA or protein similarities.