Unix-ninja has an excellent analysis of a large database of passwords and other information with over 18.2 million records. The file is unique because the site used home-grown crypto to encrypt the passwords, and that crypto was easily reversed. That means this is a complete set of passwords, not just the ones easy enough to be recovered. For example, if someone were using a password manager that generated long random passwords and had a password of kj#AXP39kjl#&!VV<>xzpln;:}NsdT, that password would almost certainly not be recovered by conventional password cracking. But because the weak crypto made it possible to decrypt the entire file, all passwords, including the strong ones, are available for analysis.
Although unix-ninja doesn't say where the data came from, it appears to be from something like a dating site because it contains fairly comprehensive information (things like body type) about the users. That allows him to study how, for example, the password security of people with athletic body builds compares with other cohorts.
A lot of his results are depressingly predictable. Passwords like password are favorites, as always. One interesting and, at least to me, unexpected result is that while 2.9% of the passwords had at least one symbol, only 0.6% had an upper case letter. There are other interesting patterns too. It's definitely worth a read if you care about password security. Of course, as usual, the best advice is to use a password manager and generate long random passwords with characters drawn from lower and upper case letters, symbols, and numbers.
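If you're curious what "long random passwords" drawn from all four character classes looks like in practice, here's a minimal Python sketch. It is not how any particular password manager works. The function name generate_password, the default length of 30, and the choice of ASCII punctuation as the symbol set are all my own assumptions for illustration.

```python
import secrets
import string

# Assumed character pool: lower case, upper case, digits, ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 30) -> str:
    """Return a random password containing at least one of each class."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit every class")
    while True:
        # secrets.choice draws from the OS's cryptographic random source.
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until all four classes are present (rarely needed at length 30).
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate


if __name__ == "__main__":
    print(generate_password())
```

The sketch uses the secrets module rather than random because random is not intended for security-sensitive values.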
Finally, if you're interested in the details, a lightly sanitized list of the passwords and a 27-page technical appendix of the results (many not in the post) are available for download.
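If you do grab the password list, checking figures like the symbol and upper case percentages yourself is straightforward. The following is a rough sketch, not unix-ninja's method. It assumes the list is a text file with one password per line, and it counts a "symbol" as any ASCII punctuation character, which may not match the definition used in the original analysis. The function names are hypothetical.

```python
import string


def load_passwords(path):
    """Read one password per line (assumed file layout)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle]


def class_percentages(passwords):
    """Return the percentage of passwords with an upper case letter or a symbol."""
    total = len(passwords)
    if total == 0:
        return {"upper": 0.0, "symbol": 0.0}
    upper = sum(1 for p in passwords if any(c.isupper() for c in p))
    # Assumption: "symbol" means ASCII punctuation only.
    symbol = sum(
        1 for p in passwords if any(c in string.punctuation for c in p)
    )
    return {
        "upper": 100.0 * upper / total,
        "symbol": 100.0 * symbol / total,
    }


if __name__ == "__main__":
    sample = ["password", "123456", "Summer2015", "p@ssw0rd"]
    print(class_percentages(sample))
```

On the four made-up sample passwords, one has an upper case letter and one has a symbol, so both percentages come out to 25.0.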