# Detour: Probabilities of Patterns in a String

Explore how probabilities of patterns in a string are calculated, how changing the pattern changes the probability, and what the overlapping words paradox is.

We mentioned that the probability that some 9-mer appears 3 or more times in a random DNA string of length 500 is approximately 1/1300. We assure you that this calculation doesn’t appear out of thin air.
To see where it comes from, we can generate a **random string** modeling a DNA strand by choosing the nucleotide at each position independently, with each of the four nucleotides having probability *1/4*. The construction of random strings can be generalized to an arbitrary alphabet with *A* symbols, where each symbol is chosen with probability 1/*A*.
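
As a minimal sketch of this construction, the following Python snippet draws each symbol uniformly and independently from an alphabet. The function name `random_string` and its parameters are hypothetical names chosen for illustration; they are not part of the original text.

```python
import random


def random_string(length, alphabet="ACGT", rng=None):
    """Build a random string by picking each symbol uniformly from `alphabet`.

    With an alphabet of A symbols, every symbol has probability 1/A at every
    position, independently of the other positions.
    """
    rng = rng or random.Random()  # pass a seeded Random for reproducible output
    return "".join(rng.choice(alphabet) for _ in range(length))


# A random DNA string of length 500 (A = 4).
print(random_string(500))

# A random binary string of length 4 (A = 2).
print(random_string(4, alphabet="01"))
```

Passing a seeded generator, for example `random.Random(0)`, makes the output repeatable, which is convenient when estimating probabilities by simulation.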

**Exercise Break:** What is the probability that two randomly generated strings of length *n* in an *A*-letter alphabet are identical?

Now, there’s a simple question: what’s the probability that a specific *k*-mer *Pattern* will appear (at least once) as a substring of a random string of length *N*?
For example, say that we want to find the probability that “01” appears in a random **binary string** (*A* = 2) of length 4. Here are all possible such strings:
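
The 16 binary strings of length 4 can be enumerated directly. The sketch below lists them and counts how many contain a given pattern. The helper name `strings_containing` is a hypothetical choice for illustration, and brute-force enumeration is only practical for small *N*, since there are *A*^*N* strings in total.

```python
from itertools import product


def strings_containing(pattern, length, alphabet="01"):
    """Return (strings that contain `pattern`, list of all strings of `length`)."""
    all_strings = ["".join(symbols) for symbols in product(alphabet, repeat=length)]
    hits = [s for s in all_strings if pattern in s]
    return hits, all_strings


hits, all_strings = strings_containing("01", 4)

# Print every string, marking the ones that contain "01".
for s in all_strings:
    print(s, "<- contains 01" if s in hits else "")

print(len(hits), "of", len(all_strings))  # 11 of 16
```

Under these assumptions, “01” appears in 11 of the 16 strings, so the probability is 11/16. The five strings that avoid “01” are exactly those in which every 1 comes before every 0: 0000, 1000, 1100, 1110, and 1111. Calling the same helper with a different pattern of the same length, such as “11”, gives 8 of 16, which shows that the probability depends on the pattern itself and not only on its length.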
