When the Cipher Equals The Message: Message Concealing in RSA
Imagine you created a cipher, and sometimes when you ciphered, the cipher is actually the original message. This could happen, for example, when you do a shifted alphabet and move the letters by 26 places, and end up with the mapping of “a” to “A”, “b” to “B”, and so on.
Live Demos
I love doing live demos in cryptography, and where it is possible to demonstrate key principles with parameters that the audience picks. Two of my favourites for this are the Diffie-Hellman method and the RSA public key encryption method — and where I know I can always get the right result no matter what the audience picks. Most of the time, these work very well, and where I can reveal the correct answer. But, sometimes, I end up with the same cipher as the message — and which is, obviously, not very secure! So, on my travels to London yesterday, I managed to get Python running on my iPad, and was able to determine the values I should avoid and for the probability of my finding a non-concealed value for a given set of prime numbers with the RSA method.
The RSA method
We start by generating two prime numbers (p,q) and then calculate the modulus (N):
It is this modulus (N) that provides the core of the security of RSA, and it must be difficult to determine the prime numbers for a given modulus value. Our prime numbers must thus be of a size that makes it difficult to factorize, and they must be randomly generated. Next, we calculate ϕ(N), which is:
We then pick e so that it does not share a factor with ϕ(N):
and where gcd is the greatest common denominator. In practice e typically has a value of 65,537 (and which is a prime number, so is safe). The encryption key is now (e,N). Bob and then send this to Alice for her to encrypt a message using this public key. We determine the decryption value (d) with:
For this, we use an inverse mod function to determine d. In the code below, we use:
d=pow(e,-1,PHI)
To encrypt a message (M), we use:
and to decrypt:
Message concealing
With RSA, there will be some message values that map to the same ciphertext, and where the message will be unconcealed. For example, if we have an integer message of 13, then the cipher message might be also 13 after we apply the public exponent (e) and modulus value (n). In the following, we can reveal the values that will be unconcealed with small prime numbers [here]:
import math
import sys
p=13
q=7
e=3
if (len(sys.argv)>1):
p=int(sys.argv[1])
if (len(sys.argv)>2):
q=int(sys.argv[2])
n=p*q
PHI=(p-1)*(q-1)
while (math.gcd(e,PHI)!=1):
e=e+2
d=pow(e,-1,PHI)
print(f"e={e}, d={d}, n={n}, p={p}, q={q}, PHI={PHI}")
for m in range(0,n):
c=pow(m,e,n)
if (c==m): print(f"{m} {c}")
unconceal=(1+math.gcd(e-1,p-1))*(1+math.gcd(e-1,q-1))
print(f"Unconcealed = {unconceal}")
print(f"Percentage = {round(unconceal/n*100,1)}%")
With primes of 11 and 31, we get [here]:
e=7, d=43, n=341, p=11, q=31, PHI=300
e=7, d=43, n=341, p=11, q=31, PHI=300
0 0
1 1
32 32
56 56
67 67
87 87
88 88
98 98
99 99
154 154
155 155
186 186
187 187
242 242
243 243
253 253
254 254
274 274
285 285
309 309
340 340
Unconcealed = 21
Percentage = 6.2%
And where there are 21 messages that will create the same cipher text as the plaintext value. This includes M=340 and M=154. We can actually compute the number of unconcealed values as:
In real life, we will have an e of 65,537 and a 1,024-bit prime number. If we try for a fairly small prime number of 2^{255}-19, we only get 25 unconcealed values:
>>> import math
>>> e=65537
>>> p=2**255-19
>>> q=2**255-19
>>> unconceal=(1+math.gcd(e-1,p-1))*(1+math.gcd(e-1,q-1))
>>> print (unconceal)
25
You can try some examples here: