# What is BLAST?

Ayesha Kanwal

### Overview

BLAST stands for Basic Local Alignment Search Tool. The primary purpose of BLAST is to compare protein and gene sequences against available public databases.

BLAST is a set of sequence comparison algorithms that search databases for optimal local alignments (alignments between matching regions of two sequences rather than across their full lengths) to a query. It breaks the query and the database sequences into short fragments, called words, and looks for matches between them. A statistically significant alignment is likely to contain a high-scoring pair of aligned words.

Types of BLAST

| Type | Query | Database | Comparison |
|---|---|---|---|
| blastn | DNA | DNA | DNA level |
| blastx | DNA | Protein | Protein level |
| tblastx | DNA | DNA | Protein level |
| blastp | Protein | Protein | Protein level |
| tblastn | Protein | DNA | Protein level |
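The types listed above can be written as a small lookup. The following is a minimal sketch, not part of BLAST itself: the dictionary and the `choose_program` helper are hypothetical names introduced here for illustration.

```python
# Hypothetical lookup built from the types of BLAST listed above.
# Keys are (query type, database type, comparison level).
BLAST_PROGRAMS = {
    ("DNA", "DNA", "DNA"): "blastn",
    ("DNA", "Protein", "Protein"): "blastx",
    ("DNA", "DNA", "Protein"): "tblastx",
    ("Protein", "Protein", "Protein"): "blastp",
    ("Protein", "DNA", "Protein"): "tblastn",
}


def choose_program(query, database, comparison):
    """Return the BLAST program name for a query/database/comparison combination."""
    try:
        return BLAST_PROGRAMS[(query, database, comparison)]
    except KeyError:
        raise ValueError(
            f"No BLAST type for query={query}, database={database}, "
            f"comparison={comparison}"
        )


print(choose_program("DNA", "Protein", "Protein"))
```

The comparison level is part of the key because a DNA query against a DNA database can be compared either at the DNA level (blastn) or at the protein level (tblastx).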

### BLAST input and output

BLAST input and output are explained below.

#### Input

BLAST accepts queries in FASTA format (a text-based format for representing nucleotide or protein sequences), as bare sequences, or as sequence identifiers. The sequences being compared do not need to be identical: a word of three amino acids (the molecules that combine to form proteins) or 11 bases (the building blocks of DNA) must match to seed a hit against the query.
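To make the input formats concrete, here is a minimal sketch of reading FASTA text or a bare sequence in Python. The `parse_fasta` function is a hypothetical helper written for this example and is not part of BLAST. It assumes that a header line starts with `>` and that the identifier is the first whitespace-separated token of that line.

```python
def parse_fasta(text):
    """Return a list of (identifier, sequence) pairs.

    A bare sequence (no '>' header line) gets the identifier None.
    """
    records = []
    identifier, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if it had any sequence.
            if chunks:
                records.append((identifier, "".join(chunks)))
            fields = line[1:].split()
            identifier = fields[0] if fields else ""
            chunks = []
        else:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line.upper())
    if chunks:
        records.append((identifier, "".join(chunks)))
    return records


example = ">seq1 example protein\nMKTAYIAK\nQRQISFVK\n>seq2\nACGT\n"
for name, seq in parse_fasta(example):
    print(name, seq)
```

In this sketch, a header that is not followed by any sequence lines is silently dropped.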

#### Output

BLAST output is available in multiple formats. These include HTML, plain text, and XML.

### The BLAST algorithm

The BLAST algorithm has two basic components: word matching and extended hits.

1. A list of high-scoring words of length $w$ is obtained for the query.
2. The obtained word list is compared with the database, and exact matches are identified.
3. For every word match, an extended alignment is done in both directions to find alignments with scores greater than the score threshold $S$.

Figure: BLAST algorithm
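The three steps can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the actual BLAST implementation: it uses exact word matches only, a flat match/mismatch score in place of a substitution matrix, and ungapped extension that stops once the running score falls more than `x_drop` below the best score seen. All function names and default parameter values are hypothetical.

```python
from collections import defaultdict


def index_words(seq, w):
    """Map every word of length w in seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - w + 1):
        index[seq[i:i + w]].append(i)
    return index


def extend(query, subject, qi, si, step, match, mismatch, x_drop):
    """Extend without gaps in one direction; return (best score, best length)."""
    score = best = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += match if query[qi] == subject[si] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > x_drop:
            break  # the score dropped too far below the best seen so far
        qi += step
        si += step
    return best, best_len


def seed_and_extend(query, subject, w=4, S=6, match=1, mismatch=-1, x_drop=2):
    """Return sorted (score, query start, subject start, length) hits with score >= S."""
    index = index_words(subject, w)                      # words of the database sequence
    hits = set()
    for qi in range(len(query) - w + 1):                 # step 1: words of the query
        for si in index.get(query[qi:qi + w], []):       # step 2: exact word matches
            # Step 3: extend the seed to the right and to the left.
            right, rlen = extend(query, subject, qi + w, si + w, 1,
                                 match, mismatch, x_drop)
            left, llen = extend(query, subject, qi - 1, si - 1, -1,
                                match, mismatch, x_drop)
            score = w * match + left + right
            if score >= S:
                hits.add((score, qi - llen, si - llen, w + llen + rlen))
    return sorted(hits, reverse=True)


print(seed_and_extend("ACGTACGT", "TTACGTACGTTT"))
```

Several seeds on the same diagonal extend to the same alignment, which is why the hits are collected in a set.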

Two statistical parameters are used to judge the significance of the similarities found:

#### E-value

• E-value is described as the number of hits "expected" to be seen by chance while searching a database.
• It decreases exponentially as the score of the match increases.
• The smaller the E-value, the more significant the match.
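The exponential relationship can be illustrated with the Karlin-Altschul formula $E = K m n e^{-\lambda S}$, where $m$ is the query length, $n$ is the database length, and $S$ is the raw alignment score. The sketch below assumes this formula; the default values of `lam` and `K` are illustrative placeholders, since the real values depend on the scoring system in use.

```python
import math


def expected_hits(raw_score, query_len, db_len, lam=0.318, K=0.134):
    """E-value sketch: E = K * m * n * exp(-lambda * S).

    lam and K are placeholder values chosen for illustration only.
    """
    return K * query_len * db_len * math.exp(-lam * raw_score)


# Higher scores give exponentially smaller E-values.
for score in (30, 40, 50):
    print(score, expected_hits(score, query_len=100, db_len=1_000_000))
```

Because $E$ is proportional to $m \cdot n$, the same raw score yields a larger E-value when the search is run against a larger database.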

#### Bit score

• A bit score measures the sequence similarity independent of query sequence length and database size.
• It is obtained by normalizing the raw pairwise alignment score.
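As a sketch of this normalization, the bit score is commonly written as $S' = (\lambda S - \ln K) / \ln 2$, where $S$ is the raw score and $\lambda$ and $K$ are the statistical parameters of the scoring system. The number of chance hits for a query of length $m$ and a database of length $n$ then depends only on $S'$: $E = m n 2^{-S'}$. The function names and the parameter values passed below are assumptions made for illustration.

```python
import math


def bit_score(raw_score, lam, K):
    """Normalize a raw alignment score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def e_value_from_bits(bits, query_len, db_len):
    """Number of chance hits expected for a bit score: E = m * n * 2^(-S')."""
    return query_len * db_len * 2.0 ** (-bits)


# Illustrative values only; lam and K depend on the scoring system.
bits = bit_score(50, lam=0.318, K=0.134)
print(round(bits, 1), e_value_from_bits(bits, query_len=100, db_len=1_000_000))
```

Since $\lambda$ and $K$ are folded into the bit score, bit scores from searches that used different scoring systems can be compared directly.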

The BLAST algorithm is at least fifty times faster than computing alignments with dynamic programming.

### BLAST function

A few of the significant uses of BLAST include identifying species by finding homologous sequences, mapping DNA by finding identical DNA and splicing patterns, locating domains, and creating phylogenetic trees based on DNA or protein similarities.
