Assignment 6
Public Key Cryptography
CSE 13S – Fall 2021
First DESIGN.pdf draft due: November 11th at 11:59 pm PST
Assignment due: November 21st at 11:59 pm PST
1 Introduction
Doo Jdxo lv glylghg lqwr wkuhh sduwv, rqh ri zklfk wkh Ehojdh lqkdelw, wkh
Dtxlwdql dqrwkhu, wkrvh zkr lq wkhlu rzq odqjxdjh duh fdoohg Fhowv, lq rxu Jdxov,
wkh wklug.
—Julius Cæsar
Cryptography, once restricted to government, spies, and the military is now pervasive in our lives. Most
web sites that you visit are protected using SSL. Your SSH connections are protected in the same way.
How is this accomplished? Through a mixture of public key and symmetric key cryptography. The
earliest known practical public-key cryptography algorithm is RSA, after its inventors Ronald Rivest, Adi
Shamir, and Leonard Adleman (Figure 2), who published it in 1978. About five years earlier, on 20 November,
1973, Clifford Cocks (Figure 1), working for GCHQ (the British equivalent of the NSA), invented a very
similar algorithm. His classified memorandum “A note on ‘non-secret’ encryption” was to remain secret
for 24 years. In fact, when you read the Cocks memorandum, you will see that the idea of public key encryption
was proposed by J. H. Ellis three years earlier in 1970. Unknown in the public literature, the idea
was independently proposed by Ralph Merkle for public key distribution, which inspired asymmetric
cryptography by Whitfield Diffie and Martin Hellman, and finally leading to RSA.
Public-key cryptography, or asymmetric cryptography, is a cryptographic system that uses pairs of
keys: public keys (known to others) and private keys (known only by the owner). The generation of such
key pairs depends on cryptographic algorithms that are based on mathematical objects called one-way
functions. Security requires keeping the private key private; the public key can be distributed widely.
Any person can encrypt a message using the intended receiver’s public key, but that encrypted message
can only be decrypted with the receiver’s private key. This allows a server to create a cryptographic
key for suitable symmetric-key cryptography and then use a client’s openly shared public key to encrypt
the newly generated symmetric key. The server can then send this encrypted symmetric key over an
insecure channel to the client; only the client can decrypt it using its private key. With the client and
server both having the same symmetric key, they can safely use symmetric key encryption to communicate.
This scheme has the advantage of not having to pre-share symmetric keys while gaining the higher
speed of symmetric-key cryptography.
Symmetric-key algorithms use the same cryptographic keys for the encryption of plaintext and the
decryption of ciphertext. The keys may be identical, or there may be a simple transformation between
© 2021 Darrell Long 1
the two keys. The keys represent a shared secret between two or more parties. The requirement that
both parties have access to the secret key is one of the main disadvantages of symmetric-key encryption
compared to public-key encryption.
Let’s briefly look at the Cocks algorithm before moving on to the more popular RSA algorithm. We
have two principals: Alice (A), who is the receiver, and Bonnie (B), who is the sender.
(a) Alice:
i. Chooses two primes p and q such that p - (q−1) and q - (p−1). That is, p does not divide
q−1 and q does not divide p−1.
ii. Transmits the computed product n = pq to the sender, which we write as A
n−→ B.
(b) Bonnie:
i. Has a message consisting of numbers c1,c2,...,cr where 0 < ci < n.
ii. Sends these encoded as di where di = c
n
i
(mod n). When written as part of a protocol,
B
c1,...,cr −−−−→ A.
(c) Alice:
i. Computes using Euclid’s Algorithm p
0
such that p × p
0 ≡ 1 (mod q − 1), and q
0
such that
q×q
0 ≡ 1 (mod p−1).
ii. Decodes ci = d
p
0
i
(mod q) = d
q
0
i
(mod p).
Figure 1: Clifford Cocks
As with the RSA algorithm, as you will see, the strength of the algorithm relies on the assumed dif-
ficulty of factoring large composite integers. We say assumed difficulty since there is, like P
?= NP , no
proof of this widely held assumption. A proof that P 6= NP would be welcome, but unsurprising, while
a proof that P = NP would probably have theoreticians jumping out of windows.
The paper published by Rivest, Shamir and Adleman in 1978,
Ronald L. Rivest, Adi Shamir, and Leonard Adleman. “A method for obtaining digital signatures
and public-key cryptosystems,” Communications of the ACM 21.2 (1978): 120–126.
is one of the most important papers ever published. It enabled the modern Internet and changed the
world. You would do well to take the time to read it.
© 2021 Darrell Long 2
Figure 2: Adi Shamir, Ronald Rivest, and Leonard Adelman
2 RSA Algorithm
The magic words are Squeamish Ossifrage.
—Ronald L. Rivest
The security of RSA relies on the practical difficulty of factoring the product of two large prime numbers,
known as the factoring problem. Breaking RSA encryption is known as the RSA problem. Whether it is
as difficult as the factoring problem is an open question. There are no published methods to defeat the
system if a large enough key is used, but RSA would be vulnerable to attack by a quantum computer.
RSA involves a public key and a private key. Everyone can know the public key, and it is used for encrypting
messages. The intention is that messages encrypted with the public key can only be decrypted
by using the private key. The integers n and e represent the public key. The private key is represented by
the integer d.
The public key consists of the modulus n and the public exponent e. The private key consists of the
private exponent d, which must be kept secret; p, q, and ϕ(n) must also be kept secret since they are
used to calculate d. In fact, p and q (ϕ(n) is calculated from them) can be discarded after d has been
computed.
We proceed by choosing two large random primes p and q, these numbers must be kept secret. We
then publish the number n = pq. You might wonder why we can do this, and the reason is that it is
believed to be hard to factor large composite integers into their constituent primes. The fundamental
theorem of arithmetic tells us that every integer has a unique prime factorization.
Choose a random integer 2 < e < n 3 gcd(e,ϕ(n)) = 1, where ϕ(n) = (p−1)(q−1). gcd(a,b) indicates
the greatest common divisor of a and b, which is commonly computed using Euclid’s algorithm,
© 2021 Darrell Long 3
which is defined recursively as:
gcd(a,b) =
a if a = b
gcd(a,b−a) if a < b
gcd(a−b,b) if a > b
though we can calculate it much more rapidly using division. A good choice for e is 2
16 +1 = 65537. You
will understand why—to some extent—after you finish §6.3. Here’s a hint: How many 1 bits are in that
number?
The ϕ function is called the totient of n, and that denotes the number of positive integers up to a
given integer n that are relatively prime to n. Note that for any prime p, ϕ(p) = p−1, and so ϕ(n) =
ϕ(pq) = (p−1)(q−1). We can share e with impunity. We say that our public key is the pair 〈e,n〉.
We now calculate a unique secret integer d ∈ {0,...,n − 1} such that d × e ≡ 1 (mod ϕ(n)). How
do we find this d? It turns out that we have known how to do it for more than 2300 years—we use an
algorithm attributed to Euclid. How is it that we can easily calculate d while our adversary cannot? We
know a secret that he does not: we know ϕ(n) while he only knows n. We call d our private key.
We now define two functions: D(m) = me
(mod n) and E(c) = c
d
(mod n). We will show in §3
that ∀m ∈ {0,...,n−1} that D(E(m)) = E(D(m)) = m. Since D and E are mutual inverses and this will
enable us to perform not only encryption but also digital signatures.
3 Mathematics of RSA
If P = NP , then all of modern cryptography collapses. On this happy
thought. . .
Michael O. Rabin, November 1998
The mathematics of RSA are based on arithmetic in a group of integers modulo n, denoted Z/n. This
is the set {0,...,n − 1} and all sets that are isomorphic to it. For example, {n,...,2n − 1} (mod n) =
{0,...,n−1} (mod n), and there are an infinite number of such sets. Since they are all the same we will
only concern ourselves with the one with the smallest numbers. What do we mean when we say that
are the same? We mean that if x ≡ k (mod n) then an+x ≡ k (mod n),∀a ≥ 0,a ∈ Z. In other words,
additional integer products of n do not matter.
The Euler-Fermat theorem says that if a ∈ N and n ∈ N are coprime, that is gcd(a,n) = 1, then
a
ϕ(n) ≡ 1 (mod n).
This will allow us, for example, to take a message M and have Mϕ(n) ≡ 1 (mod n).
What is ϕ(n)? It is the Euler totient function, and gives the number of positive integers than that or
equal to n that are relatively prime to n. For any prime number p,
ϕ(p) = p−1.
For the RSA algorithm, we choose two large primes p and q, and we make n = pq. What does this
mean for us with respect to ϕ(n)?
ϕ(n) = ϕ(p)×ϕ(q)
= (p−1)(q−1)
= n− (p+q)+1.
© 2021 Darrell Long 4
We choose an encryption key e such that is it relatively prime to ϕ(n), that is, gcd(e,ϕ(n)) = 1. We
can then deduce our decryption key d such that e × d ≡ 1 (mod ϕ(n)). Our encryption algorithm is
simply E(M) =Me
(mod n) = C, and our decryption algorithm is D(C) = c
d
(mod n) =M.
We have the additional property of being mutual inverses: E(D(x)) = D(E(x)) = x, and that will give
us the ability to provide digital signatures. Observe that,
D(E(M)) ≡ E(M)
d ≡Med (mod n)
E(D(M)) ≡ D(M)
e ≡Mde (mod n).
Continuing in this manner, observe that,
Med ≡Mkϕ(n)+1
(mod n)
for some multiplier k ≥ 1. This is true because, prior to the modular reduction, ed must be one greater
than some multiple of ϕ(n); applying the modulus is what makes ed ≡ 1 (mod ϕ(n)). We can rewrite
this as
Med ≡ (Mϕ(n)
)
k ×M (mod n).
Here we apply Euler’s theorem, which states that if a andnare coprime integers, then a
ϕ(n) ≡ 1 (mod n).
So, assuming that M is coprime with n, we can simplify the above equation to
Med ≡ (1)
k ×M (mod n)
≡ (1)×M (mod n)
≡M (mod n)
which shows that the encryption and decryption functions are mutual inverses.
4 Your Task
“Personally,” he said, “my great ambition is to count all this,”—he waved vaguely
at the treasure around him—“and possibly sort it into piles.”
John C. Gardner, Grendel
You will be creating three programs for this assignment:
1. A key generator: keygen
2. An encryptor: encrypt
3. A decryptor: decrypt
The keygen program will be in charge of key generation, producing RSA public and private key pairs.
The encrypt program will encrypt files using a public key, and the decrypt program will decrypt the
encrypted files using the corresponding private key.
You will need to implement two libraries and a random state module that will be used in each of
your programs. One of the libraries will be hold functions relating to the mathematics behind RSA, and
the other library itself will contain implementations of routines for RSA. You also need to learn to use a
library: the GNU multiple precision arithmetic library.
© 2021 Darrell Long 5
5 GNU Multiple Precision Arithmetic
One reason you should not use web applications to do your computing is that
you lose control. It’s just as bad as using a proprietary program. Do your own
computing on your own computer with your copy of a freedom-respecting
program. If you use a proprietary program or somebody else’s web server, you’re
defenceless.
—Richard Stallman
As you should know by now, C, unlike languages like Python, does not natively support arbitrary precision
integers. The security of RSA, however, relies on large integers. So, we elect to use the GNU multiple
precision arithmetic library, usually referred to as GMP. You can find the manual and documentation for
the library here: https://gmplib.org/manual.
Take some time to look through the manual, taking note of which functions may be useful.
You will need to install both gmp and pkg-config. The latter is a utility used to assist in finding and
linking libraries, instead of having the program hard-code where to find specific headers and libraries
during program compilation. To install these packages on Ubuntu 20.04, run the following:
$ sudo apt install pkg - config libgmp -dev
Get started on this as soon as possible. Make sure to attend section for assistance on using pkg-config
in a Makefile to direct the compilation process for your programs.
You may notice that GMP already provides number theoretic functions, several of which could be
used in RSA. You may not use any GMP-implemented number theoretic functions. You must implement
those functions yourself.
The following two sections (§6 and §7) will present the functions that you have to implement, but
they both will require the use of random, arbitrary-precision integers.
GMP requires us to explicitly initialize a random state variable and pass it to any of the random integer
functions in GMP. Not only that, we also need to call a dedicated function to clean up any memory
used by the initialized random state. To remedy this, you will implement a small random state module,
which contains a single extern declaration to a global random state variable called state, and two
functions: one to initialize the state, and one to clear it. The interface for the module will be given in
randstate.h and the implementation must go in randstate.c.
void randstate_init(uint64_t seed)
Initializes the global random state named state with a Mersenne Twister algorithm, using seed as the
random seed. This entails a call to gmp_randinit_mt() and a call to gmp_randseed_ui().
void randstate_clear(void)
Clears and frees all memory used by the initialized global random state named state. This should just
be a single call to gmp_randclear().
© 2021 Darrell Long 6
6 Number Theoretic Functions
No one has yet discovered any warlike purpose to be served by the theory of
numbers or relativity, and it seems unlikely that anyone will do so for many
years.
—G. H. Hardy
Number Theory is the branch of mathematics that studies the nature and properties of numbers. Though
many have made important contributions to the field, including Gauß in his Disquisitiones Arithmeticae
(which he completed when he was 21 years old), the most important for public-key cryptography are
Fermat and Euler (Figure 3).
You will first need to implement the functions that drive the mathematics behind RSA before you
can tackle your RSA library. The interface for these functions will be given in numtheory.h and should
be defined in corresponding C file. Read each of the subsections carefully to understand, on some level,
the theory behind each of the number theoretic functions. Pseudocode is provided to assist you.
Figure 3: Leonhard Euler (1707–1783) and Pierre de Fermat (1607–1665)
6.1 Modular Exponentiation
As shown in §2, we must compute a
n where both a,n ∈ N for RSA. We could simply multiply: