1. Implementation
1.1 Introduction
The Vigenère cipher is a method of encrypting alphabetic text by using a series of different Caesar ciphers based on the letters of a keyword.
1.2 Lab exercise
1.2.1 Exercise
Implement a program that asks the user for a text, and that displays it. Modify your program so that all letters are converted to lower-case, and all non-letter characters are removed.
1.2.2 Main idea
Getting the input text and keyword. We don’t know the length of input text and keyword, so we just use malloc and realloc function to allocate address space for input text and keyword. If they are capital letter, change them to the lower-case. After that, removing the non-letter characters.
1.2.3 Source code
Input text1
2
3
4
5
6
7
8
9
10
11
12
13
14
15/* Getting the input text and covert it using covert function. */
/* to get a viginer cipher, you should input the plaintext(the longer, the better) */
origintext = (char*) malloc(1000);
printf("please input origin text \n");
for(i = 0; (*(origintext + i) = getchar()) != '\n'; i++)
{
if((i+1) % 1000 == 0)
{
origintext = (char*)realloc(origintext, strlen(origintext) + 1000);
}
}
origintext[i] = '\0';
plaintext = (char*)malloc(strlen(origintext));
memset(plaintext, '\0', strlen(plaintext));
convert(origintext, plaintext);
Covert the text1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23/* convert function is used to covert the original text to plaintext, delete *
* no-letter char and covert captical letter to lowercase */
void convert(char *origintext, char *plaintext)
{
int i, j;
i = 0;
j = 0;
memset(plaintext, '\0', strlen(plaintext));
while(origintext[i] != '\0')
{
if(origintext[i] <= 'Z' && origintext[i] >= 'A')
{
plaintext[j] = origintext[i] + 32;
j++;
}
if(origintext[i] <= 'z' && origintext[i] >= 'a')
{
plaintext[j] = origintext[i];
j++;
}
i++;
}
}
Input the key word1
2
3
4
5
6
7
8
9
10 /*input the key word*/
printf("please input the key\n");
for(i = 0,key = (char*)malloc(100); (*(key + i) = getchar()) != '\n'; i++)
{
if((i + 1) % 100 == 0)
{
key=(char*)realloc(key,strlen(key)+100);
}
}
*(key+i)='\0';
1.2.4 Result
The result was shown in figure 1.1. The plain text was about the introduction of viginere, which came from wikipedia-viginere. The keyword was set to isima.
2. Cryptanalysis: key length estimation
2.1 Introduction
When the key length is known, a Vigenère cipher can be broken using frequency analysis. The primary weakness of the Vigenère cipher is the repeating nature of its key. If a cryptanalyst correctly guesses the key’s length, then the cipher text can be treated as interwoven Caesar ciphers, which individually are easily broken. The Kasiski and Friedman tests can help determine the key length.
2.2 Babbage and Kasiki method
Friedrich Kasiski was the first to publish a successful general attack on the Vigenère cipher. Charles Babbage was goaded into breaking the Vigenère cipher when John Hall Brock Thwaites submitted a “new” cipher to the Journal of the Society of the Arts. When Babbage showed that Thwaites’ cipher was essentially just another recreation of the Vigenère cipher, Thwaites challenged Babbage to break his cipher encoded twice, with keys of different length.
2.3 Lab work
To calculate the length of key work. I removed the letters whose repetitions were less than 2 times, and then get the distance between two same strings. After that I get divisors of each distance and calculate the weight of each divisor. With highest weight, we get the most probable probable length of key word.
2.4 source code
1 | /* get_key_size function is used to calculate the keysize of the key */ |
2.5 Result
Figure 2.1 shows the same strings that were found. Figure 2.2 shows the weight of each probable key length. We can easily find the the number 5 has highest weight 226, and the key word that we input was “isima”, so we found the right length of key word, which was shown in figure 2.3.
2.6 Problem
If the length of key word was the number like 10 whose divisors were 1, 2, 5, 10. then the number 5 will have larger or at least equal weight as number 10. It’s a problem. I am sorry but I don’t know how to solve it.
3.Cryptanalysis: probable word method
3.1 Introduction
For most used English words, each character has a different frequency, which was shown in figure 3.1.
3.2 Lab work
To find the probable key word, we can detach the cipher into “key_length” part, for example, after calculation, the probable key length was 5, then we just detach the cipher into 5 parts, for the letters in each part, they have the same key character.
After that, we choose 1 character from alphabet and try to decrypt each part of cipher text. Then, we compare frequency of the decryption text with the table of the most used English words, which was shown in figure 3.1. In this case, we just calculate the sample variance of the frequencies, the smaller sample variance, the better. Repeat the decryption of each characters in alphabet, we will get a minimum sample variance, then we get the most probable key character.
Repeat “key length” time of the experiment, we will get the most probable key word.
3.3 source code
Analyze the text and get frequencies of each character
For a given text, we can get the frequencies of each character and sample variance using anal_freq function. We can also get the probability that two randomly chosen letters are the same, which was named Ke. One of the result was shown in figure 3.2.
1 | /* anal_freq function is used to analyze the text, |
Find most probable key character
Repeat the decryption of each characters in alphabet, we will get a minimum sample variance, then we get the most probable key character. The result was shown in figure 3.3.
1 | /*the find_each_key_letter function is used to find a most probable key letter for a given ciphertext*/ |
From the result in figure 3.3, we can also see that no matter what character was set as key character, Ke, the probability that two randomly chosen letters are the same, is always 0.065136. So we can’t get the most probable key character from Ke.
After “key length” times repetition, we can get the most probable key word, which was shown in figure 3.4.
3.4 Decrypt the cipher
For a given key word, we will get the matched plain text using decrypt function, which was shown below. The result was shown in figure 3.5.1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29/*decrpty the viginer ciphertext*/
void decrypt(char *ciphertext, char *key, char* plaintext)
{
int i ;
int j ;
int offset ;
i = 0 ;
j = 0 ;
while(ciphertext[i] != '\n' && ciphertext[i] != '\0')
{
offset = key[j] -(int)('a');
if((ciphertext[i] - (int)'a' - offset) < 0)
{
plaintext[i] = ciphertext[i] - offset + 26;
}
else
{
plaintext[i] = ciphertext[i] - offset; //it seems that we can't modify plaintext
}
i++;
j++;
if(key[j] == '\n' || key[j] == '\0')
{
j = 0;
}
}
plaintext[i] = '\0';
}
After decryption, we finally break viginere and get the plaintext.
4. Reference
[1]. http://en.wikipedia.org/wiki/Letter_frequency
[2]. http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher
[3]. http://sancy.univ-bpclermont.fr/~guitton/enseignements/admin.html
Note: All the source code was put in my github