Counting Words

Explore how to count frequent nucleotide sequences, or k-mers, in DNA to uncover regions vital for biological processes such as protein binding during DNA replication. Learn how a sliding window, expressed in pseudocode, counts overlapping occurrences of a sequence.

Identifying frequent words

Operating under the assumption that DNA is a language of its own, let’s borrow Legrand’s method and see if we can find any surprisingly frequent “words” within the ori of Vibrio cholerae. We have added reason to look for frequent words in the ori: for various biological processes, certain nucleotide strings appear surprisingly often in small regions of the genome. This happens because certain proteins can only bind to DNA if a specific string of nucleotides is present, and the more occurrences of the string there are, the more likely it is that binding will successfully occur. (It’s also less likely that a mutation will disrupt ...
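To make the idea of a "frequent word" concrete, here is a minimal Python sketch of counting k-mers with a sliding window. It is an illustration under stated assumptions rather than the lesson's own pseudocode: the function names `count_kmers` and `most_frequent_kmers` are hypothetical, and the sample string is a short made-up sequence, not the actual ori of Vibrio cholerae.

```python
from collections import Counter


def count_kmers(text: str, k: int) -> Counter:
    """Count every k-mer in text, including overlapping occurrences."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    counts = Counter()
    # Slide a window of width k one position at a time across the text.
    # If k is longer than the text, the range is empty and nothing is counted.
    for start in range(len(text) - k + 1):
        counts[text[start:start + k]] += 1
    return counts


def most_frequent_kmers(text: str, k: int) -> list[str]:
    """Return all k-mers that share the highest count, sorted alphabetically."""
    counts = count_kmers(text, k)
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(kmer for kmer, n in counts.items() if n == highest)


# Illustrative input only (an assumption, not data from the lesson).
sample = "ACAACTATGCATACTATCGGGAACTATCCT"
print(most_frequent_kmers(sample, 5))  # prints ['ACTAT']
```

Because the window advances by one position, overlapping occurrences are counted separately: `count_kmers("AAAA", 2)` records "AA" three times. The sketch stores all counts in a dictionary, which takes a single pass over the text; other approaches are possible, and this one is chosen only for brevity.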