The invaluable Troy Hunt has run another analysis of recent dumps of password data from Anonymous, LulzSec, and others. This time he looks at how people select their passwords. As with his previous analyses, the results are depressing.
His idea was to look at the passwords to determine how they were selected. For example, 14 per cent were either names (Stephen, Mary, Jessica, etc.) or derived from names in an obvious way (nehpets, Mary!, Jessica1, etc.). Using the plain name or adding a number accounted for 97 per cent of passwords in this category.
Other categories include dictionary words, names of places, all-numeric passwords, and some other rare patterns. Thirty-one per cent of the passwords exhibited no pattern. There’s lots of interesting data in Hunt’s post (he really drills down), so if you have any interest in this sort of thing, you should head on over there and read it.
The obvious takeaway from Hunt’s post is that the overwhelming majority of people use terrible passwords even when they’re trying to be clever. There’s a secondary lesson here, though: Hunt has identified a series of patterns that account for 69 per cent of the 300,000 passwords he examined. If he can do this, so can the people trying to break passwords, and you can be sure that those patterns will be coded into their password-breaking software if they aren’t already there.
That brings us back to my post about password advice from XKCD. The “bad” password in the cartoon, Tr0ub4dor&3, seems like it should be secure. It’s got 11 characters taken from upper- and lowercase letters, numerals, and symbols. That’s an alphabet of 94 characters, so if we assume the characters were chosen randomly, that gives us just over 72 bits of entropy (11 × log2 94 ≈ 72.1). More than enough, as a practical matter, to be unbreakable. But of course the characters weren’t chosen randomly. They were chosen according to a pattern, a well-known pattern that password crackers are programmed to check for. Therefore the true entropy is much less than 72 bits; it’s more like 28 bits by Munroe’s estimation.
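The 72-bit figure is easy to check. Here is a minimal Python sketch, assuming every character is drawn uniformly and independently from the 94 printable characters (the function name is mine, purely for illustration):

```python
import math

def random_entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy, in bits, of `length` symbols drawn uniformly and
    independently from an alphabet of `alphabet_size` symbols."""
    return length * math.log2(alphabet_size)

# 11 characters from a 94-character alphabet, as in the paragraph above.
bits = random_entropy_bits(94, 11)
print(f"{bits:.1f} bits")
```

The number is only meaningful if the uniform-random assumption holds. It says nothing about a password built from a word plus predictable substitutions, which is exactly the point of the cartoon.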
The process recommended by Munroe also seems like an obvious pattern. After all, it uses dictionary words from a relatively short list (about 2000, given Munroe’s assignment of 11 bits of entropy to each one) and there are only 4 of them. “What if,” people say, “the method became popular? Wouldn’t the bad guys just program their password crackers to check for that pattern?” The crucial difference is that the choice of the words is random (or has to be for the system to work), so even if everyone used the method, knowing it doesn’t help the cracker: the entropy is still 44 bits. Think of it this way: each word is a symbol taken from an alphabet of 2048 symbols and the password consists of 4 symbols. That’s 2048^4 = 2^44 = 17,592,186,044,416 possibilities for the password. Since the 4 symbols were chosen randomly, there’s no way for the cracker to reduce the search space.
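To make the “words as symbols” view concrete, here is a rough Python sketch. The word list is a hypothetical placeholder (2048 made-up tokens), not Munroe’s list or any real one, and the function names are mine. The one thing that matters is that the words are picked by a cryptographically strong random source rather than by the user.

```python
import math
import secrets

# Hypothetical stand-in for a real 2048-word list; substitute actual words.
WORDS = [f"word{i:04d}" for i in range(2048)]

def random_passphrase(words, count=4):
    """Pick `count` words uniformly at random using the OS's secure RNG."""
    return " ".join(secrets.choice(words) for _ in range(count))

def passphrase_bits(list_size, count):
    """Entropy in bits when each word is chosen uniformly and independently."""
    return count * math.log2(list_size)

search_space = len(WORDS) ** 4  # number of possible 4-word passphrases

print(random_passphrase(WORDS))
print(passphrase_bits(len(WORDS), 4), "bits;", f"{search_space:,}", "possibilities")
```

Because the attacker is assumed to know both the list and the method, the 44 bits already account for the “what if it became popular” objection.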
As explained by Agile Bits’ Jeff in these two posts, you really need a bit more than Munroe’s scheme for a truly secure password [1]. He recommends the Diceware method, which uses, say, 5 words, chosen by the roll of dice, from a list of 7776. That gives about 64 bits of entropy (5 × log2 7776 ≈ 64.6). Jeff recommends adding some private “word” that is gibberish but meaningful to you (something like m3crA,M,&M = my three children are Allison, Mary, and Michael). That way you can add enough entropy that it would take 500 million years at one million guesses a second to crack the password. Again, see Jeff’s two posts for the details.
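For a feel for the numbers, here is a back-of-the-envelope Python sketch for the five Diceware words on their own, before any private “word” is added. The assumptions are mine, not Jeff’s: the attacker knows the list and the method, makes one million guesses a second, and on average finds the password after searching half the space.

```python
import math

LIST_SIZE = 7776             # 6**5: five dice rolls select one word
WORD_COUNT = 5
GUESSES_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 3600

search_space = LIST_SIZE ** WORD_COUNT
bits = WORD_COUNT * math.log2(LIST_SIZE)

# Assumption: on average the attacker searches half the space before a hit.
average_years = (search_space / 2) / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"{bits:.1f} bits, about {average_years:,.0f} years on average")
```

Under those assumptions the five words alone come to roughly 64.6 bits and a few hundred thousand years, which is why extra entropy from a private “word” is needed to reach figures in the hundreds of millions of years.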
It’s very counterintuitive that a few short dictionary words chosen from a small list provide more security than 11 characters of complicated gibberish, but it’s nonetheless true. The mathematics that shows this is fairly accessible but, of course, most people run screaming from the room when you mention mathematics. To those people, we can say only, “Stop using your intuition to argue otherwise because your intuition stinks.”
Footnotes:
1 “Secure password” is always a relative term. All we can do is estimate the amount of work it would take to crack the password by brute force—trying every possible password—and hope that the system we are applying the password to is at least as secure as the password given to it.