Cryptography/Introduction to Simple Substitution Cypher

From Wikiversity
How To Create A Simple Substitution Cypher

Simple substitution cyphers are, as their name suggests, simple. One is made by replacing each letter of the English alphabet (or any other alphabet, for that matter) with another letter, chosen at random so that no two plaintext letters share the same cypher letter.

Typically the best way to visualize this is to write out the entire plaintext alphabet and then place its cyphered version below it:


The message is then written out using the cypher alphabet in a, you guessed it, simple substitution.

So the message:

"Welcome to cryptography" becomes
"Phlaxuh ex aoiyexwokypi"
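The substitution step can be sketched in a few lines of Python. This is only an illustration: the function names `make_key`, `encrypt`, and `invert` are made up for this example, and the key is a freshly shuffled alphabet, not the cypher alphabet used for the message above.

```python
import random
import string

def make_key(seed=None):
    """Pair every plaintext letter with a unique cypher letter (a shuffled alphabet)."""
    rng = random.Random(seed)  # seed is only here to make the example repeatable
    shuffled = list(string.ascii_lowercase)
    rng.shuffle(shuffled)
    return dict(zip(string.ascii_lowercase, shuffled))

def encrypt(message, key):
    """Substitute each letter using key; keep case, spaces, and punctuation."""
    out = []
    for ch in message:
        if ch.lower() in key:
            sub = key[ch.lower()]
            out.append(sub.upper() if ch.isupper() else sub)
        else:
            out.append(ch)
    return "".join(out)

def invert(key):
    """Swap the two rows of the alphabet table so the same routine decrypts."""
    return {cypher: plain for plain, cypher in key.items()}

key = make_key(seed=1)
secret = encrypt("Welcome to cryptography", key)
print(secret)
print(encrypt(secret, invert(key)))  # prints the original message again
```

Because the key is a one-to-one pairing of letters, decrypting is just encrypting with the table turned upside down.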

As you can see, a letter may occasionally represent itself, as J, L, N, and S do in this cypher alphabet.

One way to disguise the words so that the message is harder to decode is to break the phrase up into blocks of a preset length, so that word length is not a clue. As you will find, when this is not done, word length can be the greatest key to solving the cypher (words like I and A are easy to decode). Typically the blocks are 5 characters long, and dummy or null characters (Q, X, Y, and Z are often used) are added at the end so that the message fills every block of 5.

The original message:

"Welcome to cryptography...." becomes
"Phlax uhexa oiyex wokyp iwxyz"

Then we capitalize every letter, so that capitalization gives no clue about where a sentence or a name begins.
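The blocking and capitalizing steps might look like this in Python. The helper name `to_blocks` and the choice of Q, X, Y, Z as the padding letters are assumptions made for this sketch; the example above pads with w, x, y, z instead.

```python
def to_blocks(cyphertext, size=5, nulls="QXYZ"):
    """Drop spaces and punctuation, capitalize, pad with nulls, and group into blocks."""
    letters = [ch.upper() for ch in cyphertext if ch.isalpha()]
    i = 0
    while len(letters) % size != 0:
        letters.append(nulls[i % len(nulls)])  # cycle through the null letters
        i += 1
    text = "".join(letters)
    return " ".join(text[j:j + size] for j in range(0, len(text), size))

print(to_blocks("Phlaxuh ex aoiyexwokypi"))
# prints: PHLAX UHEXA OIYEX WOKYP IQXYZ
```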


How To Solve a Simple Substitution Cypher

There are a few steps you can follow to solve simple substitution cyphers.

1. Look for patterns

In this cypher, every instance of a particular cypher letter stands for the same plaintext letter throughout the message, so if you see the letter "A", it always represents the same letter. Similarly, the combination "GFD" always represents the same three letters. Knowing this can help you figure out what "A" and "GFD" represent.
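One way to put this to work is to compare the repeat pattern of a cypher word with the patterns of ordinary words. The sketch below is only an illustration: `word_pattern` is a made-up helper, and the four-word list is a hypothetical stand-in for a real dictionary.

```python
def word_pattern(word):
    """Number each letter by the order it first appears: 'DEAD' -> (0, 1, 2, 0)."""
    seen = {}
    return tuple(seen.setdefault(ch, len(seen)) for ch in word.upper())

# Hypothetical mini word list; a real solver would use a full dictionary.
WORDS = ["DEAD", "THAT", "THIS", "ELSE"]

target = word_pattern("LZDL")
matches = [w for w in WORDS if word_pattern(w) == target]
print(matches)  # prints: ['DEAD', 'THAT', 'ELSE']
```

A cypher word and its plaintext always share the same pattern, so any word with a different pattern can be ruled out at once.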

2. Check the sentence structure

If the letters are not all capitalized, or if the cypher is not broken into blocks of a preset length, you can use its structure to decrypt it. The obvious example of this is the letters "A" and "I": they are the only one-letter words in the English language. As such, when you see a single letter you can guess that it is either A or I. Additionally, the word "a" is typically lower-case, whereas "I" is always upper-case, which can make decryption even easier.

3. Use letter frequency analysis

Certain letters, letter combinations, and words appear more often than others. Just as contestants in the final round of "Wheel of Fortune" are given the common letters RSTLNE, you can use this to help identify which letters are which. In order, the most commonly used letters are "ETAOINSHRDLU", so if your cypher contains a lot of "Q"s, Q may represent the letter E. If possible, use a letter's position in the word to help you decide which letter it is. The same idea applies to larger patterns: the most common double letters are LL, EE, and SS; some common two-letter words are IT, IN, and IS; and some common three-letter words that often repeat within a message are "the" and "and".
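Counting letters by hand gets tedious, so here is a minimal Python sketch of the idea. The names `letter_frequencies` and `first_guess` are invented for this example, and the first guess it produces is only a starting point, since short messages rarely follow the "ETAOINSHRDLU" order exactly.

```python
from collections import Counter

ENGLISH_ORDER = "ETAOINSHRDLU"  # most common English letters, most frequent first

def letter_frequencies(cyphertext):
    """Count how often each letter appears, ignoring case, spaces, and punctuation."""
    return Counter(ch for ch in cyphertext.upper() if ch.isalpha())

def first_guess(cyphertext):
    """Pair the most frequent cypher letters with the most frequent English letters."""
    ranked = [letter for letter, _ in letter_frequencies(cyphertext).most_common()]
    return dict(zip(ranked, ENGLISH_ORDER))

counts = letter_frequencies("Qdyzf Jnozy rf Atz Ngdazui Lzdl")
print(counts.most_common(2))  # prints: [('Z', 5), ('D', 3)]
```

Here Z is by far the most frequent letter, which makes it a good candidate for E.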

4. Educated Guesswork

As you begin to decrypt your cypher, you may be able to guess what the message is saying. For example, if you were faced with "I LI?? ?PPL? PI?", you may be able to guess that the message is "I like Apple Pie". If the message ends with --???? ???? or ?? ???? ????, it could be the author's name. Try to think of who would have written the message. If it is a quote, like the cryptoquotes in many newspapers, think about which people are very often quoted: Benjamin Franklin, Winston Churchill, and William Shakespeare come to mind.
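Guessing can also be given a little mechanical help: a partly solved word can be checked against a list of candidate words. In this sketch `fits` and `candidates` are made-up helpers and the word list is a small hypothetical sample, so treat it as an illustration, not a complete solver.

```python
import re

def fits(partial, word):
    """True if word matches partial, where each '?' stands for one unknown letter."""
    regex = "".join("[A-Z]" if ch == "?" else re.escape(ch) for ch in partial.upper())
    return re.fullmatch(regex, word.upper()) is not None

# Hypothetical mini word list; a real solver would use a full dictionary.
WORDS = ["APPLE", "AMPLE", "APPLY", "PIE", "PIG", "LIKE", "LIFE", "LIST"]

def candidates(partial, words=WORDS):
    """Return every word in the list that could fill in the partial word."""
    return [w for w in words if fits(partial, w)]

print(candidates("?PPL?"))  # prints: ['APPLE', 'APPLY']
print(candidates("LI??"))   # prints: ['LIKE', 'LIFE', 'LIST']
```

The list only narrows things down; choosing "like" over "life" still takes the educated guess described above.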

Some codes for you to wrap your mind around

(answers at the end of the page)


2. Qdyzf Jnozy rf Atz Ngdazui Lzdl


Answers

1. "If you would know the value of money, go and try to borrow some" -Benjamin Franklin

2. Casey Jones by the Grateful Dead