Capability Overview
Cyber Resilience
-
Product / Service
Penetration Testing Services
Some of the most popular hashcat rule sets were created by taking a large pool of hashcat rules (copied from existing sets or randomly generated) with a popular dictionary list and tallying how many passwords each rule was responsible for recovering when applied to a collection of password hashes from public breach data. The individual rules that lead to the greatest number of recovered passwords are the “winners” and make it into a new rule list.
For example, the famous OneRuleToRuleThemAll rule set was created by measuring the performance of hundreds of thousands of different rules against the Lifeboat data dump with the iconic rockyou.txt dictionary. The methodology has been published in detail. It seems sound, and the rule list became quite popular and successful. However, a couple of observations make it clear that we can still do better.
With these observations in mind, we wanted to use a similar methodology to that of OneRuleToRuleThemAll to make new rule sets that are specially honed for cracking passwords that are known to comply with common password requirements. This work should accomplish three key goals.
The following sections describe the data, methodology, and results of this effort.
This section describes the data and defines some of the terms used in the methodology section.
super_dict.txt is a custom dictionary list that aims to improve on rockyou.txt without overinflating the size. It is composed of a deduplicated combination of the following lists.
The final size of super_dict.txt is 18,705,085 lines, which is only 30.4% bigger than rockyou.txt.
super_rules.txt is the candidate pool of existing rules from which the new rule sets were created. It is a deduplicated combination of the following rule sets, all of which are freely available on the internet.
The total rule count for super_rules.txt is 478,642 (by comparison, OneRuleToRuleThemAll has about 50,000).
One of the ways in which we aim to improve on OneRuleToRuleThemAll is by testing the performance of each rule against a much larger body of password hashes. Whereas the creators of OneRuleToRuleThemAll used the Lifeboat data dump as the hashes, we used a larger collection of over 60 million password hashes from multiple sources, including HaveIBeenPwned, Crackstation, and Hashmob.
The collection of public password data we used can be split into two categories: known plaintexts, and unknown hashes. The password data that was already in plaintext form was used to create rehashed subsets of the total collection based on conformity to the different password requirements we wanted to curate rule sets for. Specifically, the following hash lists were created:
“Complex” means containing at least 3 unique character classes out of lower-case letters, upper-case letters, numbers, and special characters (i.e., everything else). For simplicity, these are named, in order (with line counts in parentheses):
We will refer to the above hash lists collectively as “Per-Policy Hash Lists”.
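The complexity definition above can be sketched in a few lines of Python. This is an illustrative sketch, not the script used in the project: the function names are hypothetical, "special" is taken to mean anything that is not an ASCII letter or digit, and it assumes the lists are nested (a 12-character password also belongs in the 8- and 10-character lists), which the line counts suggest but the text does not state outright.

```python
import string

MIN_LENGTHS = (8, 10, 12, 14)  # minimum lengths named in the hash lists


def character_classes(password: str) -> int:
    """Count how many of the four character classes appear in the password."""
    has_lower = any(c in string.ascii_lowercase for c in password)
    has_upper = any(c in string.ascii_uppercase for c in password)
    has_digit = any(c in string.digits for c in password)
    # Assumption: "special" means everything that is not an ASCII letter or digit.
    has_special = any(c not in string.ascii_letters + string.digits for c in password)
    return sum([has_lower, has_upper, has_digit, has_special])


def is_complex(password: str) -> bool:
    """'Complex' means at least three distinct character classes."""
    return character_classes(password) >= 3


def matching_lists(password: str) -> list[str]:
    """Return every per-policy list name this plaintext would qualify for."""
    names = []
    for minimum in MIN_LENGTHS:
        if len(password) >= minimum:
            names.append(f"{minimum}-simple")
            if is_complex(password):
                names.append(f"{minimum}-complex")
    return names


print(matching_lists("Password1"))
print(matching_lists("password"))
```

Each plaintext that qualifies for a list would then be rehashed and written to that list's hash file.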
The password data collected in the form of hashes, on the other hand, were kept as they are. These hash lists include (with line counts in parentheses):
These will be referred to simply as Lifeboat, LinkedIn, and NVIDIA, respectively.
We ran a series of hashcat sessions where we varied the dictionary, rules, hash list, and the loopback flag. The goal was to collect data that could measure the following:
The following table summarizes the experiment variables. Every combination of these variables was run, for a total of 88 hashcat sessions. The set of results from all hashcat sessions using rockyou.txt, OneRuleToRuleThemAll, and no loopback flag is the benchmark against which performance gains are measured.
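Variable | Values |
---|---|
Dictionary | rockyou.txt, super_dict.txt |
Rules | OneRuleToRuleThemAll, super_rules.txt |
--loopback | enabled, disabled |
Hash List | the eight Per-Policy Hash Lists, Lifeboat, LinkedIn, NVIDIA |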
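As a sanity check on the session count, the experiment grid can be enumerated with `itertools.product`. The variable values below are inferred from the surrounding text rather than taken from a published list: two dictionaries, two rule files, loopback on or off, and eleven hash lists. Treat them as an assumption.

```python
from itertools import product

# Values inferred from the article text (an assumption, not an official list).
dictionaries = ["rockyou.txt", "super_dict.txt"]
rule_files = ["OneRuleToRuleThemAll", "super_rules.txt"]
loopback_options = [False, True]
hash_lists = [
    f"{length}-{kind}"
    for length in (8, 10, 12, 14)
    for kind in ("simple", "complex")
] + ["Lifeboat", "LinkedIn", "NVIDIA"]

# Every combination of the four variables is one hashcat session.
sessions = list(product(dictionaries, rule_files, loopback_options, hash_lists))
print(len(sessions))

# The benchmark is the subset run with rockyou.txt, OneRuleToRuleThemAll,
# and no loopback flag: one session per hash list.
baseline = [s for s in sessions if s[:3] == ("rockyou.txt", "OneRuleToRuleThemAll", False)]
print(len(baseline))
```

Under these assumptions the grid has 2 × 2 × 2 × 11 = 88 sessions, which matches the total stated above.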
In order to maintain consistency, collect relevant data, and keep the time requirements feasible, the following hashcat options were used:
Option | Parameter | Comment |
---|---|---|
-r | <path to rules> | The file containing the rules used to mangle the dictionary list. |
-m | <mode> | 0, 100, and 1000 for MD5, SHA1, and NTLM |
--status | | Print periodic status updates, which are also written to a file with tee. |
-w3 | | Increase the workload profile to dedicate a greater portion of the computer’s resources to hashcat. |
--loopback | | Appends recovered passwords to the working dictionary so they can be used to crack other passwords. |
--debug-mode | 1 | Every time a new password is recovered, hashcat will log the rule that was used to crack it to the file specified with --debug-file. |
--debug-file | <path to logfile> | See --debug-mode |
-o | <path to output file> | Saves recovered passwords to a file as <hash>:<cleartext> |
--potfile-disable | | Prevents hashcat from using the potfile. This is critical to making sure the results from each hashcat session are not polluted by the results from a previous session. |
-O | | Optimize kernel. Limits hashcat to passwords under 32 characters in exchange for significant speed gains. |
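To show how these options fit together, here is a minimal sketch that assembles one hashcat command line in Python. The function name, the `run_name` parameter, and the output file naming are hypothetical; only the hashcat flags themselves come from the table above. The command is printed rather than executed.

```python
import shlex

# Hash modes listed in the options table.
HASH_MODES = {"MD5": 0, "SHA1": 100, "NTLM": 1000}


def build_command(hash_file, hash_type, dictionary, rules, run_name, loopback=False):
    """Assemble the argument list for one hashcat session (hypothetical helper)."""
    cmd = [
        "hashcat",
        "-m", str(HASH_MODES[hash_type]),
        "-r", rules,
        "--status",
        "-w", "3",
        "--debug-mode=1",                   # log the rule behind each recovered password
        f"--debug-file={run_name}.debug",   # hypothetical file naming
        "-o", f"{run_name}.out",            # hypothetical file naming
        "--potfile-disable",
        "-O",
    ]
    if loopback:
        cmd.append("--loopback")
    # Positional arguments: the hash list first, then the dictionary.
    cmd += [hash_file, dictionary]
    return cmd


command = build_command(
    "8-complex.hashes", "NTLM", "super_dict.txt", "super_rules.txt", "8-complex-run"
)
print(shlex.join(command))
```

Keeping the debug and output files separate per session makes it straightforward to attribute each recovered password to the rule that produced it.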
Using the data from the debug files and the output files, we computed the top 50,000 rules that led to the most recovered passwords for each level of password requirements. These groups of 50,000 rules became the new rule sets. We chose 50,000 as the size for the new rule sets because it is a little smaller than OneRuleToRuleThemAll. However, we also created separate rule sets for the top 10,000, 1,000, and 64 rules in each category. The scripts used to perform these calculations are included in our GitHub repo for reference. The new rules are named by the password requirements they are tailored for, along with the number of rules they contain. For example, 12-complex-50k.rule contains the top 50,000 rules for cracking passwords that are at least 12 characters long and contain at least three character classes.
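The tallying step can be illustrated with a short sketch. It assumes the debug file contains one rule per line, one line for each recovered password, as `--debug-mode=1` produces. The function name is hypothetical, and the project's actual scripts (in the GitHub repo) may differ, for example in how they combine the debug and output files per policy.

```python
from collections import Counter


def top_rules(debug_lines, n):
    """Return the n rules responsible for the most recovered passwords."""
    counts = Counter(
        line.rstrip("\r\n")          # keep trailing spaces: '$ ' is a valid rule
        for line in debug_lines
        if line.strip()              # skip blank lines
    )
    return [rule for rule, _ in counts.most_common(n)]


# Tiny in-memory stand-in for a debug file.
sample = ["c\n", "$1\n", "c\n", "$2 $0 $1 $2\n", "c\n", "$1\n"]
print(top_rules(sample, 2))
```

In practice the input would be the lines of the debug file for one hash list, and `n` would be 50,000, 10,000, 1,000, or 64.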
The resulting rule sets were then validated in another round of hashcat sessions against each hash list using both rockyou.txt and super_dict.txt. The loopback flag was omitted for the validation runs, and we only ran the validation sessions for the 50k rule sets since they are the ones designed to compete with OneRuleToRuleThemAll.
The new rule sets yielded modest increases in the number of passwords cracked in the test data, but we also tried them against two sets of password hashes from live environments as part of ongoing penetration tests. The gains in recovered passwords in the live environments were surprisingly high.
This section also includes a table showing the increased rates of recovery from using the loopback flag and substituting super_dict.txt for rockyou.txt.
Tables A and B compare the performance of the new rule lists against OneRuleToRuleThemAll. Each row in Table A uses the new rule list that corresponds to the target hash list in the leftmost column for that row. For the first row, that would be 8-simple-50k.rule, and so on. In Table B, each entry is using OneRuleToRuleThemAll. An interesting observation here is that the performance gains of the new rule sets are more pronounced when the password requirements include multiple character classes ("complex").
Hash List | Number of Hashes | Total Guesses | Recovered Hashes | Percent Recovered (%) | Guessing Efficiency |
---|---|---|---|---|---|
8-simple | 61779922 | 9.33982E+11 | 37981488 | 61.47869206 | 4.06662E-05 |
8-complex | 9966491 | 9.33982E+11 | 5434359 | 54.52630219 | 5.81848E-06 |
10-simple | 31595857 | 9.33982E+11 | 14321202 | 45.3262021 | 1.53335E-05 |
10-complex | 4604400 | 9.33982E+11 | 2076155 | 45.09067414 | 2.22291E-06 |
12-simple | 16137745 | 9.33982E+11 | 4255194 | 26.36795909 | 4.55597E-06 |
12-complex | 1831483 | 9.33982E+11 | 634389 | 34.63799555 | 6.7923E-07 |
14-simple | 10664614 | 9.33982E+11 | 1363405 | 12.78438207 | 1.45978E-06 |
14-complex | 739210 | 3.92339E+11 | 189210 | 25.59624464 | 4.82262E-07 |
Table A: Policy-Based Rule List
Hash List | Number of Hashes | Total Guesses | Recovered Hashes | Percent Recovered (%) | Guessing Efficiency |
---|---|---|---|---|---|
8-simple | 61779922 | 9.72608E+11 | 37634232 | 60.91660653 | 3.87E-05 |
8-complex | 9966491 | 9.72608E+11 | 4941433 | 49.58046919 | 5.08E-06 |
10-simple | 31595857 | 9.72608E+11 | 13653850 | 43.21405177 | 1.40E-05 |
10-complex | 4604400 | 9.72608E+11 | 1842763 | 40.02178351 | 1.89E-06 |
12-simple | 16137745 | 9.72608E+11 | 3952870 | 24.49456228 | 4.06E-06 |
12-complex | 1831483 | 9.72608E+11 | 560374 | 30.596735 | 5.76E-07 |
14-simple | 10664614 | 9.72608E+11 | 1246590 | 11.68903066 | 1.28E-06 |
14-complex | 739210 | 9.72608E+11 | 173845 | 23.51767427 | 1.79E-07 |
Table B: OneRuleToRuleThemAll
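The tables do not define Guessing Efficiency explicitly, but the published values are consistent with recovered hashes divided by total guesses. The following sketch, which works under that assumption, reproduces the 8-complex rows from Tables A and B and computes the relative gain in recovered hashes.

```python
def guessing_efficiency(recovered: int, total_guesses: float) -> float:
    """Recovered hashes per guess (assumed definition, inferred from the tables)."""
    return recovered / total_guesses


# 8-complex row, Table A (policy-based rules) and Table B (OneRuleToRuleThemAll).
new_eff = guessing_efficiency(5434359, 9.33982e11)
old_eff = guessing_efficiency(4941433, 9.72608e11)

# Relative increase in recovered hashes from switching rule sets.
relative_gain = 5434359 / 4941433 - 1

print(f"new efficiency: {new_eff:.5e}")
print(f"old efficiency: {old_eff:.5e}")
print(f"relative gain in recovered hashes: {relative_gain:.1%}")
```

For 8-complex, the new rule set recovers roughly 10% more hashes while making slightly fewer guesses.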
In addition to validating these new rule sets against the test data, we also tried them on two live engagements. For one of these engagements, we were able to retrieve the password history for the whole domain. Both domains used password policies that required a minimum length of 8 characters and had the complexity flag set to true. Table C shows how the new 8-complex-50k rules compared to OneRuleToRuleThemAll when using super_dict.txt as the dictionary.
Hash List | Number of Hashes | Total Guesses* | Total Guesses** | Recovered Hashes* | Recovered Hashes** |
---|---|---|---|---|---|
Domain 1 | 2253 | 9.33982E+11 | 9.72608E+11 | 52 | 37 |
Domain 2 | 910 | 9.33982E+11 | 9.72608E+11 | 159 | 132 |
Domain 2 with History | 10865 | 9.33982E+11 | 9.72608E+11 | 593 | 2296 |
Table C: New 8-complex-50k rules compared to OneRuleToRuleThemAll when using super_dict.txt
*8-complex-50k
**OneRuleToRuleThemAll
The new rule sets post a substantial gain over OneRuleToRuleThemAll relative to the smaller overall number of passwords cracked in the live environments. On the other hand, the password history for the second domain resulted in dramatically better numbers for OneRuleToRuleThemAll—almost four times as many cracks. This most likely indicates that a large portion of the password history on this domain is older than the current password policy, which highlights an important observation. OneRuleToRuleThemAll appears to be better generalized against passwords that are not limited by length and complexity requirements. Another possible implication is that the process of honing a rule list to target a specific set of password requirements might prevent it from generalizing well against passwords with unknown properties.
For the secondary objectives of testing the performance gains of super_dict.txt over rockyou.txt and of using the loopback flag, refer to Tables D and E. Table D shows the total percent of hashes cracked from the Lifeboat, LinkedIn, and NVIDIA lists using super_rules.txt.
Hash Set | rockyou.txt | super_dict.txt |
---|---|---|
Lifeboat | 72.42% | 73.82% |
LinkedIn | 63.98% | 65.71% |
NVIDIA | 5.80% | 6.26% |
Table D: rockyou.txt vs super_dict.txt
Table E shows the increase in recovered passwords when using the loopback flag. These statistics are based on using super_dict.txt and super_rules.txt.
Hash Set | No Loopback | Loopback |
---|---|---|
Lifeboat | 73.82% | 79.05% |
LinkedIn | 65.71% | 67.41% |
NVIDIA | 6.26% | 7.16% |
Table E: Non-Loopback vs Loopback
The performance gains from the loopback option and the super_dict.txt dictionary are modest, but not negligible. For hash types that are fast enough to be using rockyou.txt, the 30.4% increase in search space from using super_dict.txt might be worthwhile. We’d recommend hashcat users enable the loopback flag as well, because the increase to the search space is relatively small.
An accidental advantage of this project is that the rules themselves afford insight into the patterns that people gravitate towards when they must create passwords with minimum length and complexity requirements. The rule sets with the top 64 performers make an interesting case study into what patterns rise to the top and how they change depending on the password requirements.
The most obvious pattern that shows up at all password requirement levels is the use of 4-digit years, typically appended to the end of the password. In most cases, the four-digit year is appended directly to the end of the password with rules like “$2 $0 $1 $2”. Appending the year followed by an exclamation mark ($2 $0 $1 $2 $!) or preceded by an “@” symbol ($@ $2 $0 $1 $2) also appears to be a common pattern, but more frequently the year is combined with a permutation of the base word, such as truncation or capitalization. The following table shows some examples of how specific hashcat rules mangle their input.
Base Word | Rule | Output |
---|---|---|
password | $@ $2 $0 $1 $2 | password@2012 |
password | ] | passwor |
password | c | Password |
password | sa@ | p@ssword |
password | ^r ^e ^p ^u ^S | Superpassword |
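To make the rule syntax concrete, here is a toy interpreter for just the five rule functions used in the table: `$X` (append), `^X` (prepend), `]` (delete last character), `c` (capitalize), and `sXY` (replace X with Y). It is a minimal sketch for illustration only. hashcat's real rule engine supports many more functions, and `apply_rule` is a hypothetical name.

```python
def apply_rule(rule: str, word: str) -> str:
    """Apply a small subset of hashcat rule functions to a word."""
    i = 0
    while i < len(rule):
        op = rule[i]
        if op == " ":                      # separator between rule functions
            i += 1
        elif op == "$":                    # $X: append character X
            word += rule[i + 1]
            i += 2
        elif op == "^":                    # ^X: prepend character X
            word = rule[i + 1] + word
            i += 2
        elif op == "]":                    # ]: delete the last character
            word = word[:-1]
            i += 1
        elif op == "c":                    # c: capitalize first letter, lowercase the rest
            word = word[:1].upper() + word[1:].lower()
            i += 1
        elif op == "s":                    # sXY: replace every X with Y
            word = word.replace(rule[i + 1], rule[i + 2])
            i += 3
        else:
            raise ValueError(f"unsupported rule function: {op!r}")
    return word


for rule in ["$@ $2 $0 $1 $2", "]", "c", "sa@", "^r ^e ^p ^u ^S"]:
    print(f"{rule!r:20} -> {apply_rule(rule, 'password')}")
```

Note that prepend rules are applied one character at a time, which is why "Super" is written in reverse order in the rule.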
By far the most common years that appeared in our hashcat rules were for the current year when the data breaches occurred. As a result, our top-64 rule sets ended up littered with years from 2002 to 2012. Since the underlying cause of those rules being in the top 64 is easy to understand, and since leaving them as-is would make the rule sets ineffective in the present day, we took the liberty of replacing these years by hand with contemporary ones in the 2022-2024 range. As a rule of thumb, 2010 and 2011 were mapped to 2022 and 2023, since those were the most common. Occurrences of other years were mapped to 2024. In some cases, there were too many different old dates in the top-64 sets to keep this mapping strategy perfectly, so some of the rules were brought up to date plus given one of the common additional mutations such as the “@” or the “!” symbols. While these substitutions were unavoidably the product of guesswork, the guiding principle was to represent the most current three years in all the common forms that the old years showed up—without duplicates. A more effective attack on 4-digit year patterns in passwords is an opportunity for future improvement, and a copy of the original top-64 lists is included in our GitHub repo for reference.
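The rule-of-thumb mapping described above can be sketched as a small text substitution. We made the actual replacements by hand, so this is only an approximation of that process: the function name and regular expression are hypothetical, it only recognizes years appended with space-separated `$` functions, and it does not handle the de-duplication step.

```python
import re

# Rule of thumb from the text: 2010 -> 2022, 2011 -> 2023, other old years -> 2024.
YEAR_MAP = {"2010": "2022", "2011": "2023"}
DEFAULT_YEAR = "2024"

# Four consecutive append functions with digit arguments, e.g. "$2 $0 $1 $0".
APPENDED_YEAR = re.compile(r"\$(\d) \$(\d) \$(\d) \$(\d)")


def modernize(rule: str) -> str:
    """Replace an appended 2002-2012 year in a rule with a contemporary one."""
    def swap(match):
        year = "".join(match.groups())
        if not 2002 <= int(year) <= 2012:   # leave other digit runs untouched
            return match.group(0)
        new_year = YEAR_MAP.get(year, DEFAULT_YEAR)
        return " ".join("$" + digit for digit in new_year)

    return APPENDED_YEAR.sub(swap, rule)


for rule in ["$2 $0 $1 $0", "c $2 $0 $1 $1 $!", "$2 $0 $0 $7", "$1 $2 $3 $4"]:
    print(f"{rule!r:22} -> {modernize(rule)!r}")
```

Because several old years collapse onto 2024, a pass like this produces duplicate rules, which is why some rules were given an additional mutation such as “@” or “!” instead.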
Another noticeable pattern in the top-64 rules is how the dominant patterns change as the length and complexity requirements increase. When the password requirement is a simple 8-character minimum, most of the rules rotate or truncate a few characters and/or add some simple numbers to the end. The 4-digit dates become very prevalent once the 12-character minimum is reached, and the rule lists for 14-character minimums have several rules that add longer numbers or prefix the password with phrases like "ilove" or "mynameis". One of the more surprising features of the 12- and 14-character complex requirements is the presence of rules like "c $@ $g $m $a $i $l $. $c $o $m", which strongly indicate that many people simply use their email address as their password!
Above all, the top-64 results are important to this study because the very fact that the lists change so much as you alter the password requirements provides strong evidence that refining hashcat rule selection based on password requirements is worthwhile. People do not choose passwords the same way when confronted with an 8-character minimum as they do with a 14-character minimum and three mandatory character classes. We can study these behaviors and we can target them. The gains from this first attempt at targeting password requirements are modest, but the results demonstrate that there is real potential here to advance the art of password cracking in modern environments.
Although this project was successful at a basic level in producing hashcat rule sets that outperform OneRuleToRuleThemAll when the target passwords adhere to a known set of length and complexity requirements, there are some obvious weaknesses that limit its overall impact. These caveats are worth discussing not only because they place due contextual limits on the rule sets we created, but also because they illuminate opportunities to advance this kind of password cracking even further.
Each of the following items highlights a limitation of this project’s results and a corresponding opportunity for further work.
Minimum length and character class requirements are the de facto solution for forcing people to make more secure passwords. While such passwords are undeniably better than the ones created when there are no requirements, the optimal password requirements are still a subject of debate. Until password cracking (as it is practiced in the field) starts leaving behind its biases towards simple passwords from decade-old data breaches, we are going to have a hard time properly stress-testing the current theories on password requirements. The rule sets we have created here are a first step towards adapting our password cracking tools to contemporary password environments. There is still a long way to go.
The new rule sets created during this project, as well as several scripts supporting the workflow used to create them, are available open source in our GitHub repo.
About Cyber Solutions:
Cyber security services are offered by Stroz Friedberg Inc., its subsidiaries and affiliates. Stroz Friedberg is part of Aon’s Cyber Solutions which offers holistic cyber risk management, unsurpassed investigative skills, and proprietary technologies to help clients uncover and quantify cyber risks, protect critical assets, and recover from cyber incidents.
General Disclaimer
This material has been prepared for informational purposes only and should not be relied on for any other purpose. You should consult with your own professional advisors or IT specialists before implementing any recommendation, following any of the steps or guidance provided herein. Although we endeavor to provide accurate and timely information and use sources that we consider reliable, there can be no guarantee that such information is accurate as of the date it is received or that it will continue to be accurate in the future.
Terms of Use
The contents herein may not be reproduced, reused, reprinted or redistributed without the expressed written consent of Aon, unless otherwise authorized by Aon. To use information contained herein, please write to our team.