Frequency Analysis

Learn how the frequency of a ciphertext letter can be used to guess the plaintext letter it represents.

The nature of plaintexts

A good cryptanalyst needs many skills, including the ability to think laterally. To ‘break’ a cryptosystem, a cryptanalyst should use every available piece of information. We’re about to see that cryptosystems such as the Caesar cipher and the simple substitution cipher have a significant weakness that can be exploited. Intriguingly, this weakness arises from the typical nature of plaintexts.

The job of a cryptographer would arguably be much simpler if cryptosystems were only used to protect plaintexts consisting of randomly generated data. But, typically, they are not. In many situations, a plaintext is a meaningful string of letters that represents words, sentences, perhaps even an entire book, expressed in a language such as English. In any language, certain letters and combinations of letters occur far more often than others, so text written in that language is highly structured. The table below shows approximate letter frequencies for the English language.
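
To make the idea of letter frequencies concrete, here is a minimal Python sketch that counts how often each letter appears in a piece of text. The function name `letter_frequencies` and the sample sentence are assumptions chosen for illustration; they are not part of the course material. The sketch also assumes that only the 26 unaccented English letters matter and that case is ignored.

```python
from collections import Counter
import string


def letter_frequencies(text: str) -> dict[str, float]:
    """Return each letter's share of the text as a percentage, most common first."""
    # Keep only the letters a-z, ignoring case, digits, spaces and punctuation.
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    # most_common() orders the letters from most to least frequent.
    return {letter: 100 * n / total for letter, n in counts.most_common()}


# Hypothetical sample text; a longer text gives more representative figures.
sample = "In any language, there are certain letters that occur far more often than others."
freqs = letter_frequencies(sample)

# Print the five most frequent letters in the sample.
for letter, pct in list(freqs.items())[:5]:
    print(f"{letter}: {pct:.1f}%")
```

A single sentence is far too short to reproduce the frequencies of English as a whole, but running the same function over a long text, such as a full chapter of a book, should give figures much closer to the typical values.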

