The Holy Grail of Cryptography
The First World War saw a series of victories for cryptanalysts, culminating in the decipherment of the Zimmermann telegram. Ever since the cracking of the Vigenère cipher in the nineteenth century, codebreakers had maintained the upper hand over the codemakers. Then, toward the end of the war, when cryptographers were in a state of utter despair, scientists in America made an astounding breakthrough. They discovered that the Vigenère cipher could be used as the basis for a new, more formidable form of encryption. In fact, this new cipher could offer perfect security.
The fundamental weakness of the Vigenère cipher is its cyclical nature. If the keyword is five letters long, then every fifth letter of the plaintext is encrypted according to the same cipher alphabet. If the cryptanalyst can identify the length of the keyword, the ciphertext can be treated as a series of five monoalphabetic ciphers, and each one can be broken by frequency analysis. However, consider what happens as the keyword gets longer.
Imagine a plaintext of 1,000 letters encrypted according to the Vigenère cipher, and imagine that we are trying to cryptanalyze the resulting ciphertext. If the keyword used to encipher the plaintext were only 5 letters long, the final stage of cryptanalysis would require applying frequency analysis to 5 sets of 200 letters, which is easy. But if the keyword had been 20 letters long, the final stage would be a frequency analysis of 20 sets of 50 letters, which is considerably harder. And if the keyword had been 1,000 letters long, you would be faced with frequency analysis of 1,000 sets of 1 letter each, which is completely impossible. In other words, if the keyword (or keyphrase) is as long as the message, then the cryptanalytic technique developed by Babbage and Kasiski will not work.
Using a key as long as the message is all well and good, but this requires the cryptographer to create a lengthy key. If the message is hundreds of letters long, the key also needs to be hundreds of letters long. Rather than inventing a long key from scratch, it might be tempting to base it on, say, the lyrics of a song. Alternatively, the cryptographer could pick up a book on birdwatching and base the key on a series of randomly chosen bird names. However, such shortcut keys are fundamentally flawed.
In the following example, I have enciphered a piece of ciphertext using the Vigenère cipher, using a keyphrase that is as long as the message. All the cryptanalytic techniques that I have previously described will fail. None the less, the message can be deciphered.
This new system of cryptanalysis begins with the assumption that the ciphertext contains some common words, such as the. Next, we randomly place the at various points in the plaintext, as shown below, and deduce what sort of keyletters would be required to turn the into the appropriate ciphertext. For example, if we pretend that the is the first word of the plaintext, then what would this imply for the first three letters of the key? The first letter of the key would encrypt t into V. To work out the first letter of the key, we take a Vigenère square, look down the column headed by t until we reach V, and find that the letter that begins that row is C. This process is repeated with h and e, which would be encrypted as H and R respectively, and eventually we have candidates for the first three letters of the key, CAN. All of this comes from the assumption that the is the first word of the plaintext. We place the in a few other positions, and, once again, deduce the corresponding keyletters. (You can check the relationship between each plaintext letter and ciphertext letter by referring to the Vigenère square in
Table 9
.)
We have tested three the’s against three arbitrary fragments of the ciphertext, and generated three guesses as to the elements of certain parts of the key. How can we tell whether any of the the’s are in the right position? We suspect that the key consists of sensible words, and we can use this to our advantage. If a the is in a wrong position, it will probably result in a random selection of keyletters. However, if it is in a correct position, the keyletters should make some sense. For example, the first the yields the keyletters CAN, which is encouraging because this is a perfectly reasonable English syllable. It is possible that this the is in the correct position. The second the yields BSJ, which is a very peculiar combination of consonants, suggesting that the second the is probably a mistake. The third the yields YPT, an unusual syllable but one which is worth further investigation. If YPT really were part of the key, it would be within a larger word, the only possibilities being APOCALYPTIC, CRYPT and EGYPT, and derivatives of these words. How can we find out if one of these words is part of the key? We can test each hypothesis by inserting the three candidate words in the key, above the appropriate section of the ciphertext, and working out the corresponding plaintext:
If the candidate word is not part of the key, it will probably result in a random piece of plaintext, but if it is part of the key the resulting plaintext should make some sense. With APOCALYPTIC as part of the key the resulting plaintext is gibberish of the highest quality. With CRYPT, the resulting plaintext is cithe, which is not an inconceivable piece of plaintext. However, if EGYPT were part of the key it would generate atthe, a more promising combination of letters, probably representing the words at the.
For the time being let us assume that the most likely possibility is that EGYPT is part of the key. Perhaps the key is a list of countries. This would suggest that CAN, the piece of the key that corresponds to the first the, is the start of CANADA. We can test this hypothesis by working out more of the plaintext, based on the assumption that CANADA, as well as EGYPT, is part of the key:
Our assumption seems to be making sense. CANADA implies that the plaintext begins with themee which perhaps is the start of the meeting. Now that we have deduced some more letters of the plaintext, ting, we can deduce the corresponding part of the key, which turns out to be BRAZ. Surely this is the beginning of BRAZIL. Using the combination of CANADABRAZILEGYPT as the bulk of the key, we get the following decipherment: the meeting is at the ????.
In order to find the final word of the plaintext, the location of the meeting, the best strategy would be to complete the key by testing one by one the names of all possible countries, and deducing the resulting plaintext. The only sensible plaintext is derived if the final piece of the key is CUBA:
Table 9
Vigenère square.
So, a key that is as long as the message is not sufficient to guarantee security. The insecurity in the example above arises because the key was constructed from meaningful words. We began by randomly inserting the throughout the plaintext, and working out the corresponding keyletters. We could tell when we had put a the in the correct place, because the keyletters looked as if they might be part of meaningful words. Thereafter, we used these snippets in the key to deduce whole words in the key. In turn this gave us more snippets in the message, which we could expand into whole words, and so on. This entire process of toing and froing between the message and the key was only possible because the key had an inherent structure and consisted of recognizable words. However, in 1918 cryptographers began experimenting with keys that were devoid of structure. The result was an unbreakable cipher.
As the Great War drew to a close, Major Joseph Mauborgne, head of cryptographic research for the U.S. Army, introduced the concept of a random key-one that consisted not of a recognizable series of words, but rather a random series of letters. He advocated employing these random keys as part of a Vigenère cipher to give an unprecedented level of security. The first stage of Mauborgne’s system was to compile a thick pad consisting of hundreds of sheets of paper, each sheet bearing a unique key in the form of lines of randomly sequenced letters. There would be two copies of the pad, one for the sender and one for the receiver. To encrypt a message, the sender would apply the Vigenère cipher using the first sheet of the pad as the key.
Figure 30
shows three sheets from such a pad (in reality each sheet would contain hundreds of letters), followed by a message encrypted using the random key on the first sheet. The receiver can easily decipher the ciphertext by using the identical key and reversing the Vigenère cipher. Once that message has been successfully sent, received and deciphered, both the sender and the receiver destroy the sheet that acted as the key, so that it is never used again. When the next message is encrypted, the next random key in the pad is employed, which is also subsequently destroyed, and so on. Because each key is used once, and only once, this system is known as a
onetime pad cipher
.