Last month I wrote a short post on some research that showed some RSA public keys are insecure. A couple days ago I noticed that the excellent Programming Praxis has a challenge based on that research and subsequent reporting in the press. This is excellent because it exposes how easy it was to gather the information and make the calculations.
There were two teams which did essentially the same thing: they used
nmap to scan every IPv4 address on the Internet to discover which ones had ports 22 (
ssh) or 448 (
ssl) open and then used a custom piece of software to start the negotiation process and recover the SSL certificate and/or SSH host key of the site. According to this blog post by Nadia Heninger (one of the researchers) her team collected 10 million SSH host keys and 5.8 million SSL certificates. They discovered two problems:
- Some keys were repeated meaning that compromising one site with a repeated key meant compromising all sites with that key.
- Some keys shared one of the two primes that make up the public key.
It's the second problem that concerns us here.
For those not familiar with the RSA algorithm, a public key consists of an exponent and a modulus N where N is the product of two large, random primes. The public key can be used to encrypt a message but in order to decrypt it, you need to know the factors of the modulus. What makes RSA secure is that it's very hard to factor large numbers (RSA moduli are typically 2048 or 4096 bits long). However, if two moduli share a factor then it is easy to discover that factor by computing the greatest common denominator (gcd) of them. The gcd is one of the factors and the other factor of each key is trivially recovered by dividing by the gcd.
To show how simple this is, here is a solution to the problem presented by Programming Praxis
(defparameter rsa-keys '(708894553 704488409 705674273 707478271 710167019 708093251 702013379 704030867 708691187 708374743 712748719 713581951 704387447 707015741 704308279 710872423 707947453 704996429 706323757 705031051 714623803)) (defun check-for-common-factors (keys) (unless (null keys) (let ((key (car keys))) (dolist (k (cdr keys)) (let ((g (gcd k key))) (when (> g 1) (format t "keys ~a and ~a have common factor ~a~%" key k g) (format t "~3t~a = (~a) (~a)~%" key g (/ key g)) (format t "~3t~a = (~a) (~a)~%" k g (/ k g))))) (check-for-common-factors (cdr keys)))))
When we run that we get
CL-USER> (time (check-for-common-factors rsa-keys)) keys 704488409 and 706323757 have common factor 12401 704488409 = (12401) (56809) 706323757 = (12401) (56957) keys 705674273 and 707015741 have common factor 12421 705674273 = (12421) (56813) 707015741 = (12421) (56921) keys 707478271 and 708374743 have common factor 12451 707478271 = (12451) (56821) 708374743 = (12451) (56893) keys 708093251 and 708691187 have common factor 12457 708093251 = (12457) (56843) 708691187 = (12457) (56891) keys 704030867 and 704996429 have common factor 12379 704030867 = (12379) (56873) 704996429 = (12379) (56951) keys 704387447 and 705031051 have common factor 12377 704387447 = (12377) (56911) 705031051 = (12377) (56963) (CHECK-FOR-COMMON-FACTORS RSA-KEYS) took 353 microseconds (0.000353 seconds) to run with 4 available CPU cores. During that period, 350 microseconds (0.000350 seconds) were spent in user mode 8 microseconds (0.000008 seconds) were spent in system mode 13,920 bytes of memory allocated.
In a sense, this is a toy problem because
- The “rsa-keys” are too small to be an actual key—they could easily be factored directly.
- There are only 21 keys in the sample, not 15 million as in the actual data.
The first of these isn't really a problem because the Lisp bignum package can easily handle numbers the size of real keys. The second issue is a problem because
check-for-common-factors is a O(n2) algorithm. Fortunately, it's possible to use a linear algorithm (it's described in the Programming Praxis post) but it doesn't add anything to our understanding, it just makes things run faster (Heninger estimates that using the algorithm I showed would take about 24 years; using the (almost) linear method processed the 5.8 million SSL certificates in 18 hours).
Be sure to read Heninger's post. It's short and gives an excellent overview of the research and what it all means.