What is BLAST?

Overview

BLAST stands for Basic Local Alignment Search Tool. Its primary purpose is to compare protein and nucleotide sequences against publicly available databases.

It is a set of sequence comparison algorithms used to search databases for optimal local alignments to a query. It breaks the query and the database sequences into short fragments (words) and looks for matches between them. A statistically significant alignment is likely to contain a high-scoring pair of aligned words.

Types of BLAST

Type      Query     Database   Comparison
blastn    DNA       DNA        DNA level
blastx    DNA       Protein    Protein level
tblastx   DNA       DNA        Protein level
blastp    Protein   Protein    Protein level
tblastn   Protein   DNA        Protein level
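
The table above can be read as a lookup from query type, database type, and comparison level to a program name. The snippet below is a minimal sketch of that lookup; the names `PROGRAMS` and `choose_program` are hypothetical helpers invented for illustration and are not part of any BLAST distribution.

```python
# Lookup built directly from the table: (query, database, comparison level) -> program.
PROGRAMS = {
    ("dna", "dna", "dna"): "blastn",
    ("dna", "protein", "protein"): "blastx",
    ("dna", "dna", "protein"): "tblastx",
    ("protein", "protein", "protein"): "blastp",
    ("protein", "dna", "protein"): "tblastn",
}


def choose_program(query, database, level):
    """Return the BLAST program name for the given sequence types (hypothetical helper)."""
    key = (query.lower(), database.lower(), level.lower())
    if key not in PROGRAMS:
        raise ValueError(f"no BLAST program listed for {key}")
    return PROGRAMS[key]


print(choose_program("DNA", "Protein", "protein"))
```

The comparison level is part of the key because blastn and tblastx both take a DNA query and a DNA database and differ only in whether the sequences are compared directly or after translation.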

BLAST input and output

BLAST input and output are explained below.

Input

BLAST accepts input as sequences in FASTA format (a text-based format for representing nucleotide or protein sequences), bare sequences, or sequence identifiers. A database sequence does not have to be identical to the query to be reported. It only needs a short seed match, three amino acids (the molecules that combine to form proteins) or 11 bases (the molecules that combine to form DNA), to start an alignment with the query.
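
To make the FASTA and bare-sequence inputs concrete, here is a minimal sketch of a reader that handles both. The function name `parse_fasta` and the sample records are made up for illustration, and a bare sequence is assumed to be returned with an empty identifier.

```python
def parse_fasta(text):
    """Return a list of (identifier, sequence) tuples from FASTA or bare-sequence text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            fields = line[1:].split()
            header = fields[0] if fields else ""
            chunks = []
        else:
            if header is None:
                header = ""  # bare sequence: no identifier line
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up sample input, not real database entries.
sample = ">seq1 hypothetical example\nATGCGT\nACGT\n>seq2\nMKV\n"
print(parse_fasta(sample))
```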

Output

BLAST output is available in multiple formats. These include HTML, plain text, and XML.

The BLAST algorithm

The BLAST algorithm has two basic components, word matching and hit extension, carried out in three steps:

  1. A list of high-scoring words of length w is obtained for the query.
  2. The obtained word list is compared with the database, and exact matches are identified.
  3. For every word match, the alignment is extended in both directions to find alignments with scores greater than the score threshold S.
[Figure: BLAST algorithm]
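
The three steps above can be sketched in a few lines of Python. This is a simplified illustration, not the actual BLAST implementation: it assumes exact word matches only, a +1/-1 match/mismatch score, gap-free extension, and a stopping rule that gives up once the running score falls more than `x_drop` below the best score seen. All function names, parameters, and default values here are assumptions chosen for the example.

```python
def find_seeds(query, subject, w):
    """Steps 1 and 2: index the query words of length w, then yield exact matches."""
    index = {}
    for i in range(len(query) - w + 1):
        index.setdefault(query[i:i + w], []).append(i)
    for j in range(len(subject) - w + 1):
        for i in index.get(subject[j:j + w], []):
            yield i, j


def extend(query, subject, i, j, w, match=1, mismatch=-1, x_drop=3):
    """Step 3: extend a seed without gaps in both directions.

    Returns (best_score, query_start, query_end) with query_end exclusive.
    """
    score = best = w * match
    # Extend to the right of the seed.
    qi, sj = i + w, j + w
    best_end = qi
    while qi < len(query) and sj < len(subject) and best - score <= x_drop:
        score += match if query[qi] == subject[sj] else mismatch
        qi += 1
        sj += 1
        if score > best:
            best, best_end = score, qi
    # Extend to the left, starting from the best right-hand score.
    score = best
    qi, sj = i - 1, j - 1
    best_start = i
    while qi >= 0 and sj >= 0 and best - score <= x_drop:
        score += match if query[qi] == subject[sj] else mismatch
        if score > best:
            best, best_start = score, qi
        qi -= 1
        sj -= 1
    return best, best_start, best_end


def search(query, subject, w=3, threshold=5):
    """Return alignments scoring above the threshold as (score, q_start, q_end, offset)."""
    hits = set()
    for i, j in find_seeds(query, subject, w):
        score, start, end = extend(query, subject, i, j, w)
        if score > threshold:
            hits.add((score, start, end, j - i))  # j - i is the subject offset (diagonal)
    return sorted(hits, reverse=True)


print(search("ACGTACGT", "TTACGTACGTTT"))  # [(8, 0, 8, 2)]
```

Several seeds on the same diagonal extend to the same alignment, so the results are collected in a set to report each alignment once.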

Two statistical parameters are used to assess the similarity between the query and a match:

E-value

    • The E-value is the number of hits expected to be seen by chance when searching a database.
    • It decreases exponentially as the score of the match increases.
    • The smaller the E-value, the more significant the match.

Bit score

    • The bit score measures sequence similarity independently of query sequence length and database size.
    • It is obtained by normalizing the raw pairwise alignment score, so bit scores from different searches can be compared.
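
The relationship between the two parameters can be written down with the standard Karlin-Altschul formulas: the bit score is S' = (λS - ln K) / ln 2, and the E-value is E = m · n · 2^(-S'), where S is the raw alignment score, m the query length, and n the database length. The sketch below assumes these formulas; the values of λ and K depend on the scoring system, and the numbers used here are illustrative assumptions rather than values taken from this article.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score S into a bit score S'."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits with at least this bit score."""
    return query_len * db_len * 2.0 ** (-bits)


# Illustrative parameters (assumptions, not fitted to any particular search).
LAMBDA, K = 0.267, 0.041
for raw in (40, 80, 120):
    bits = bit_score(raw, LAMBDA, K)
    print(raw, round(bits, 1), e_value(bits, query_len=250, db_len=1_000_000_000))
```

Each extra bit halves the E-value, which is why the E-value falls off exponentially as the score rises, while doubling the database size doubles it.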

Because it extends only promising word matches instead of aligning every position, the BLAST algorithm is at least fifty times faster than computing alignments by dynamic programming.

BLAST function

Significant uses of BLAST include identifying species by finding homologous sequences, mapping DNA by locating identical DNA and splicing patterns, locating domains, and building phylogenetic trees based on DNA or protein similarities.


Copyright ©2024 Educative, Inc. All rights reserved