Identifying frequent words

Operating under the assumption that DNA is a language of its own, let’s borrow Legrand’s method and see if we can find any surprisingly frequent “words” within the ori of Vibrio cholerae. We have added reason to look for frequent words in the ori: in various biological processes, certain nucleotide strings appear surprisingly often in small regions of the genome. This happens because certain proteins can bind to DNA only if a specific string of nucleotides is present, and the more occurrences of that string there are, the more likely it is that binding will succeed. (It’s also less likely that a mutation will disrupt ...
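As a minimal sketch of what looking for frequent “words” could mean in practice, the Python snippet below counts every substring of a fixed length k (a “k-mer”) in a string and returns the ones that occur most often, counting overlapping occurrences. The function name `frequent_words`, the choice k = 4, and the toy input string are assumptions made for illustration only; the toy string is not the Vibrio cholerae ori, and this is not presented as the method the text goes on to develop.

```python
from collections import Counter


def frequent_words(text: str, k: int) -> list[str]:
    """Return the k-mers that occur most often in text, in sorted order.

    Overlapping occurrences are counted. The name and signature are
    hypothetical, chosen for this sketch.
    """
    if k <= 0 or k > len(text):
        return []
    # Slide a window of width k across the text and tally each substring.
    counts = Counter(text[i:i + k] for i in range(len(text) - k + 1))
    top = max(counts.values())
    # Several k-mers may tie for the highest count, so return all of them.
    return sorted(kmer for kmer, n in counts.items() if n == top)


# Toy string for illustration only; it is NOT the Vibrio cholerae ori.
toy = "ACGTTGCATGTCGCATGATGCATGAGAGCT"
print(frequent_words(toy, 4))
```

Using a `Counter` keeps the sketch to a single pass over the string. For a real ori one would presumably try several values of k and then judge whether the top counts are high enough to be surprising.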
