Digital Media Concepts/RSA (cryptosystem)

What is RSA

It is one of the first asymmetric cryptography algorithms which uses a pair of keys to encrypt and decrypt data. This algorithm is now widely used to transmit sensitive data over insecure network like the Internet.^[1]

History^[1]

Before mid-1970s, people used symmetric cryptography algorithms to encrypt data. With these algorithms, data encrypted with a particular cryptographic key could only be decrypted with the same key. Anyone without the key cannot decrypt the data. Therefore people could safely send sensitive information through insecure communication channel. However, people could not find a way to safely exchange their keys between them. Asymmetric cryptography algorithms were invented to solve this problem, and RSA is one of them. In April 1977, RSA was invented by three people from Massachusetts Institute of Technology, including two computer scientists, Ron Rivest, Adi Shamir, and a mathematician Leonard Adleman, and was later publicized in August of the same year.

RSA is now in public domain, and can be freely implemented and used by anyone.^[2]

Steps^[3]

Generate the keys^[4]

As an asymmetric cryptography algorithm, RSA cryptosystem involves two keys, the public key and the private key. Data encrypted by one key can only be decrypted with the other.

Randomly choose two prime numbers, $p$ and $q$ .
Multiply them together:
$n=p\cdot q$
Get the least common multiple of $p-1$ and $q-1$ :
$\phi =lcm(p-1,q-1)$
Randomly choose a positive integer $e$ which is less than $\phi$ and coprime to $\phi$ .
Calculate modular multiplicative inverse $d$ of $e$ modulo $\phi$ . Which means, find a positive integer $d$ which is less than $\phi$ that satisfies:
$e\cdot d\equiv 1{\pmod {\phi }}$
Now the key pair is generated. $(d,n)$ is the private key, and $(e,n)$ is the public key

Sample Code

Here's a sample implementation of generating RSA keys written in Python 3.6

Code

import secrets

def gcd(x: int, y: int) -> tuple:
    ''' Extended Euclidean Algorithm

    return: a, b, gcd
    ax + by = gcd
    '''
    def f(x: int, y: int) -> tuple:
        l = []
        while y != 0:
            q, r = divmod(x, y)
            l.append(q)
            x, y = y, r
        x, y, g = (1, 0, x) if x >= 0 else (-1, 0, -x)
        while l:
            x, y = y, x - l.pop() * y
        return x, y, g
        # below is a recursive approach. easier to understand
        # but it may exceeds python recursion limit with large numbers
        # if y == 0:
        #     return (1, 0, x) if x >= 0 else (-1, 0, -x)
        # quot, rem = divmod(x, y)         # rem + quot * y = x . . . . . (1)
        # _a, _b, _gcd = f(y, rem)         # _a * y + _b * rem = _gcd . . (2)
        # return _b, _a - quot * _b, _gcd  # <- plug rem from (1) into (2)
    assert isinstance(x, int) and isinstance(y, int), 'Integers expected'
    return f(x, y)

def miller_rabin(n: int, k: int=100):
    ''' Miller-Rabin primality test

    https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test :

    Input #1: n > 3, an odd integer to be tested for primality
    Input #2: k, the number of rounds of testing to perform
    Output: “composite” if n is found to be composite,
            “probably prime” otherwise

    write n as 2^r·d + 1 with d odd (by factoring out powers of 2 from n − 1)
    WitnessLoop: repeat k times:
       pick a random integer a in the range [2, n − 2]
       x ← a^d mod n
       if x = 1 or x = n − 1 then
          continue WitnessLoop
       repeat r − 1 times:
          x ← x^2 mod n
          if x = n − 1 then
             continue WitnessLoop
       return “composite”
    return “probably prime”
    '''
    assert isinstance(n, int) and isinstance(k, int) and k > 0
    if n < 4:
        return n > 1
    if n & 1 == 0:   # even numbers
        return False
    r = 0
    n_1 = n - 1
    d = n_1
    while d & 1 == 0:
        d >>= 1
        r += 1
    for i in range(k):
        a = secrets.randbelow(n - 3) + 2
        x = pow(a, d, n)
        if x == 1 or x == n_1:
            continue
        for j in range(r):
            x = pow(x, 2, n)
            if x == n_1:
                break
        else:
            return False
    return True

def get_prime(bits: int) -> int:
    ''' Get a pseudo-prime with given bits

    This function randomly generates a number with given bits, and test its
    primality using the miller-rabin test.
    '''
    assert isinstance(bits, int) and bits > 1
    low = 1 << bits - 1
    while True:
        r = secrets.randbits(bits - 1) + low
        if miller_rabin(r):
            return r

def generate_rsa_keys(bits: int):
    ''' Generate RSA key pair

    It returns e, d, n, which makes
    (m^e mod n)^d mod n == m
    '''
    def lcm(x: int, y: int) -> int:
        a, b, g = gcd(x, y)
        return x * y // g
    assert isinstance(bits, int) and bits > 5
    p = q = 0
    while p == q:
        p = get_prime(bits // 2)
        q = get_prime(bits // 2)
    n = p * q
    l = lcm(p - 1, q - 1)
    _g = 0
    while _g != 1:
        e = secrets.randbelow(l - 3) + 3  # [3, l)
        d, _not_used, _g = gcd(e, l)
    if d < 0: # Got negative d
        d += l
    return e, d, n

Send the public key to others

Let's say Alice and Bob want to have a secure communication. They may exchange their public keys without encryption. After that, a sender should always encrypt the data with the receiver's public key before sending it.

For example, Bob's data is encrypted with Bob's public key, and only people who know Alice's private key can decrypt the data. So Alice is the only person that meets this requirement. The same works with Bob.

Encryption and Decryption^[4]

Let's say the original data is an positive integer $m$ which should be less than $n$ . Encrypted data, positive integer $c$ , can be generated with a public key:
$c=m^{e}{\bmod {n}}$
The encrypted data $c$ can be decrypted with the corresponding private key:
$m=c^{d}{\bmod {n}}$

Proof of correctness^[4]

Proof

Because the encrypted data $c$ is calculated as:

$c=m^{e}{\bmod {n}}$

When decrypting the encrypted data $c$ , the result is:

$r=c^{d}{\bmod {n}}=(m^{e}{\bmod {n}})^{d}{\bmod {n}}=m^{e\cdot d}{\bmod {n}}$

If $m^{e\cdot d}{\bmod {n}}$ equals to the original unencrypted data $m$ , then this algorithm is correct.

To prove $m^{e\cdot d}{\bmod {n}}=m$ , Chinese remainder theorem is needed.

It says when $a$ and $b$ are coprime ( $gcd(a,b)=1$ ), if both of these statements are true:

$x\equiv y{\pmod {a}}$

$x\equiv y{\pmod {b}}$

Then this is also true:

$x\equiv y{\pmod {a\cdot b}}$

As $p$ and $q$ are both prime numbers, they are obviously coprime. So if both of these statements could be proved to be true:

$m^{e\cdot d}\equiv m{\pmod {p}}$

$m^{e\cdot d}\equiv m{\pmod {q}}$

Then this is also true:

$m^{e\cdot d}\equiv m{\pmod {n}}$

Then the correctness of RSA algorithm could be proved:

$r=m^{e\cdot d}{\bmod {n}}=m{\bmod {n}}=m$

Because $p$ and $q$ are identical, only one of $m^{e\cdot d}\equiv m{\pmod {p}}$ and $m^{e\cdot d}\equiv m{\pmod {q}}$ need to be proved. The same works with the other one.

Prove the correctness of $m^{e\cdot d}\equiv m{\pmod {p}}$ :

To accomplish this, Fermat's little theorem is necessary.

It says, if $y$ is a prime, and $x$ is not a multiple of $y$ ( $y\nmid x$ ), then this statement is true:

$x^{y-1}\equiv 1{\pmod {y}}$

As the relationship between $m$ and $p$ is unknown, the problem need to be divided into two cases.

Case 1: $m$ and $p$ are not coprime, which means $m$ is a multiple of $p$ ( $p\mid m$ ). Obviously,

$m^{e\cdot d}\equiv 0\equiv m{\pmod {p}}$

Case 2: $m$ and $p$ are coprime. In this case, Fermat's little theorem could be used. So $m^{p-1}\equiv 1{\pmod {p}}$

Because:

$e\cdot d\equiv 1{\pmod {\phi }}$

As $\phi$ is a multiple of both $p-1$ and $q-1$ , both of these statements are true:

$e\cdot d\equiv 1{\pmod {p-1}}$

$e\cdot d\equiv 1{\pmod {q-1}}$

So $e\cdot d$ can be expressed as:

$e\cdot d=1+k\cdot (p-1)$

Therefore:

$m^{e\cdot d}\equiv m^{1+k\cdot (p-1)}\equiv m\cdot (m^{p-1})^{k}\equiv m\cdot 1^{k}\equiv m{\pmod {p}}$

Q.E.D.

Safety^[5]

The public key, which is $e$ and $n$ , can be known to everyone. As far as $d$ is kept private, no one except the private key owner is able to decrypt the data encrypted by the public key. However, number $n$ has two prime factors $p$ and $q$ . If one can find $p$ and $q$ by factoring $n$ , then this person can also find out $\phi =lcm(p-1,q-1)$ , and eventually $d$ . So the problem of how safe RSA is, is equivalent to how hard it is to factor $n$ .

It is proved that, currently with traditional (non-quantum) computer, factoring a big number is an NP problem. It may or may not be an NP-complete problem. RSA remains safe when integer factorization can't be solved in polynomial time.

External Links

The original paper published by its inventors could be found here.

References

↑ ^1.0 ^1.1 Sheposh, Richard. 2017. “RSA (Cryptosystem).” Salem Press Encyclopedia of Science.
↑ "RSA Security Releases RSA Encryption Algorithm into Public Domain - RSA, The Security Division of EMC". web.archive.org. 2007-11-20. Retrieved 2019-09-30.
↑ "What is RSA algorithm (Rivest-Shamir-Adleman)? - Definition from WhatIs.com". SearchSecurity. Retrieved 2019-09-30.
↑ ^4.0 ^4.1 ^4.2 Rivest, R. L.; Shamir, A.; Adleman, L. (1978-2). "A Method for Obtaining Digital Signatures and Public-key Cryptosystems". Commun. ACM 21 (2): 120–126. doi:10.1145/359340.359342. ISSN 0001-0782. http://doi.acm.org/10.1145/359340.359342.
↑ "cryptography - How safe and secure is RSA?". Stack Overflow. Retrieved 2019-10-01.

[:0-1] 1.0 ^1.1 Sheposh, Richard. 2017. “RSA (Cryptosystem).” Salem Press Encyclopedia of Science.

[2] "RSA Security Releases RSA Encryption Algorithm into Public Domain - RSA, The Security Division of EMC". web.archive.org. 2007-11-20. Retrieved 2019-09-30.

[3] "What is RSA algorithm (Rivest-Shamir-Adleman)? - Definition from WhatIs.com". SearchSecurity. Retrieved 2019-09-30.

[:1-4] 4.0 ^4.1 ^4.2 Rivest, R. L.; Shamir, A.; Adleman, L. (1978-2). "A Method for Obtaining Digital Signatures and Public-key Cryptosystems". Commun. ACM 21 (2): 120–126. doi:10.1145/359340.359342. ISSN 0001-0782. http://doi.acm.org/10.1145/359340.359342.

[5] "cryptography - How safe and secure is RSA?". Stack Overflow. Retrieved 2019-10-01.

[1]

[2]

[3]

[4]

[5]

What is RSA

History[1]

Steps[3]

Generate the keys[4]