CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset includes sanitized password frequency lists collected from Yahoo in May 2011.
For details of the original collection experiment, please see:
Bonneau, Joseph. "The science of guessing: analyzing an anonymized corpus of 70 million passwords." IEEE Symposium on Security & Privacy, 2012. http://www.jbonneau.com/doc/B12-IEEESP-analyzing_70M_anonymized_passwords.pdf
This data has been modified to preserve differential privacy. For details of this modification, please see:
Jeremiah Blocki, Anupam Datta and Joseph Bonneau. "Differentially Private Password Frequency Lists." Network and Distributed System Security Symposium (NDSS), 2016. http://www.jbonneau.com/doc/BDB16-NDSS-pw_list_differential_privacy.pdf
Each of the 51 .txt files represents one subset of all users' passwords observed during the experiment period. "yahoo-all.txt" includes all users; every other file represents a strict subset of that group.
Each file is a series of lines of the format:
FREQUENCY #OBSERVATIONS ...
with FREQUENCY in descending order. For example, the file:
3 1
2 1
1 3
would represent the frequency list (3, 2, 1, 1, 1): one password observed 3 times, one observed twice, and three separate passwords observed once each.
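As a minimal sketch of how such a file might be read, the Python snippet below parses "FREQUENCY #OBSERVATIONS" lines and expands them into the full frequency list. The function names `parse_frequency_lines` and `expand_frequency_list` are hypothetical and not part of the dataset, and the sketch assumes each non-empty line holds exactly two whitespace-separated integers.

```python
from typing import Iterable, List, Tuple


def parse_frequency_lines(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Parse 'FREQUENCY #OBSERVATIONS' lines into (frequency, count) pairs.

    Assumption: each non-empty line holds exactly two integers
    separated by whitespace.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        freq_str, count_str = line.split()
        pairs.append((int(freq_str), int(count_str)))
    return pairs


def expand_frequency_list(pairs: List[Tuple[int, int]]) -> List[int]:
    """Repeat each frequency once per password observed that many times."""
    expanded = []
    for freq, count in pairs:
        expanded.extend([freq] * count)
    return expanded


# The example file from the description, given here as in-memory lines.
# For a real file, pass an open file object instead, e.g. open("yahoo-all.txt").
example_lines = ["3 1", "2 1", "1 3"]
pairs = parse_frequency_lines(example_lines)
frequency_list = expand_frequency_list(pairs)

distinct_passwords = sum(count for _, count in pairs)
total_observations = sum(freq * count for freq, count in pairs)

print(frequency_list)      # [3, 2, 1, 1, 1]
print(distinct_passwords)  # 5
print(total_observations)  # 8
```

For the larger files, expanding the full list may be unnecessary; the (frequency, count) pairs alone are enough to compute totals. Because the data has been modified to preserve differential privacy, such totals should be read as properties of the published lists rather than exact counts from the original collection.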