gosimplellc / nbvcxz

Password strength estimator

License: MIT License

Java 100.00%

nbvcxz's Introduction

Nbvcxz - Password strength estimator

nbvcxz is a Java library (and standalone console program) which is heavily inspired by the work in zxcvbn.

Password strength estimation is a bit of an art and a science. Strength estimation is accomplished by running a password through different algorithms looking for matches in any part of the password on: word lists (with fuzzy matching), common dates, common years, spatial patterns, repeating characters, repeating sets of characters, and alphabetic sequences.

Each of these represents a way an attacker may try to crack a password. To be vigilant, we must adapt to new methods in password cracking and implement new methods to identify passwords susceptible to each new method.

Maven Central
Compile
Differentiating Features
Compatibility
A Rant On Arbitrary Password Policies
How to use
- Standalone
- Library
Bugs and Feedback
License
Requires Java
Application using this library

Maven Central

<dependency>
    <groupId>me.gosimple</groupId>
    <artifactId>nbvcxz</artifactId>
    <version>1.5.1</version>
</dependency>

Compile

Debian based

apt-get install git
apt-get install openjdk-8-jdk
apt-get install maven
git clone https://github.com/GoSimpleLLC/nbvcxz.git
cd nbvcxz
mvn package

The project will be built, and the jar file will be placed in the target sub-directory.

Differentiating Features

Internationalization support for all text output by the library (for feedback, console output, etc).
- Currently supported languages
  - English (default)
  - Afrikaans (af)
  - Dutch (nl)
  - Finnish (fi)
  - French (fr)
  - German (de)
  - Hungarian (hu)
  - Italian (it)
  - Portuguese (pt)
  - Russian (ru)
  - Spanish (es)
  - Swedish (sv)
  - Telugu (te)
  - Ukrainian (uk)
  - Chinese (zh)
Better match generation algorithm which will find the absolute lowest entropy combination of the matches.
Support for ranked and un-ranked dictionaries.
Dictionary matching has the ability to use Levenshtein Distance (LD) calculations to match passwords which are non-exact matches to a dictionary entry.
- LD calculations happen on full passwords only, and have a threshold of 1/4th the length of the password.
Dictionaries can be customized, and custom dictionaries can be added very easily.
- Exclusion dictionaries can also be built and tailored per-user to prevent obvious issues like using their own email or name as their password
Default dictionaries have excluded single character words due to many false positives
Additional PasswordMatchers and Matches can be implemented and configured to run without re-compiling.
Easy to configure how this library works through the ConfigurationBuilder.
- You can set minimum entropy scores, locale, year patterns, custom leet tables, custom adjacency graphs, custom dictionaries, and custom password matchers.
Support for generating passwords and passphrases.
- Available in the console application as well as the library.
- One use case is for generating a "forgot password" temporary pass

Compatibility

Strict compatibility between nbvcxz and zxcvbn has not been a goal of this project. The additional features in nbvcxz which have improved accuracy are the main causes for differences with zxcvbn. There are some ways to configure nbvcxz for better compatibility though, so we will go over those configuration parameters here.

Disable the Levenshtein Distance (LD) calculation. This feature was very helpful in my analysis for identifying passwords which were only slightly different from dictionary words but were not caught by the original implementation. It is sure to cause nbvcxz to produce different results than zxcvbn for a large number of passwords.
- Use ConfigurationBuilder setDistanceCalc(Boolean distanceCalc)
Make sure both implementations are using the same dictionaries. There are additional leaked passwords in the nbvcxz dictionary compared to zxcvbn. There are also additional dictionaries included in nbvcxz that are not in zxcvbn and vice versa; these are simply different choices about which lists were important to include by default. With nbvcxz you can easily change which dictionaries are used, so it's easy to make the different implementations use the same dictionaries.
- Use ConfigurationBuilder setDictionaries(List dictionaries)
Disable separator match types. This is a new match type for which zxcvbn has no equivalent. It helps with detecting passphrases and scoring them accurately, but if we are going for compatibility we need to disable it.
- Use ConfigurationBuilder setPasswordMatchers(List passwordMatchers)
The algorithm to find the best matches differs between nbvcxz and zxcvbn, which is likely to produce slightly different results in cases where zxcvbn is unable to find the best combination of matches. I noted quite a few instances where the entropy was obviously "wrong" for the combination of matches because the original algorithm got stuck in a local minimum, and these brought about the change to the algorithm used by nbvcxz. This is no longer an issue with nbvcxz, but it will inherently produce different results for some passwords compared to the original algorithm used by zxcvbn. In the majority of cases both algorithms are able to figure out the lowest entropy combination of matches for the password, so I don't see this being too big of an issue.

A Rant On Arbitrary Password Policies

Let's think up an example scenario which I expect some of you have run into way too often. We are a company, NewStartup!, and we are creating the next big web application. We want to ensure our users don't choose an easily guessable password, so we implement an arbitrary policy which says a password must have: an eight character minimum and contain upper case, lower case, numbers, and special characters.

Now let's see how that policy applies to two passwords which are at opposite ends of the spectrum.

Password #1: Passw0rd! - This password was chosen to get around an arbitrary policy

Password #2: 5fa83b7e1r39xfa8hmiz0 - This was randomly generated using lowercase alphanumeric

Password #1 meets all of the rules in the policy and passes with flying colors. Password #2 contains neither upper case nor special characters, and thus the policy fails this password.

Was password #1 actually more secure than password #2 by any metric? That would be a hard argument to make.

In fact, password #1 is likely to be cracked quite quickly. "password" is one of the top entries in every password list an attacker is likely to try in a rule-based dictionary attack. If the attacker knows that our policy requires an eight character minimum, upper case, lower case, numbers, and special characters, they will use rules like toggle case, l33t substitution, and suffix/prefix special characters to augment their dictionary list for the attack.

It's quite likely password #1 would fall to an attacker even in a rate limited online attack.

Password #2, while not allowed by our policy, is only susceptible to a brute force attack (if a secure hashing algorithm is used).
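To make the comparison concrete, here is a minimal sketch of the arbitrary policy described above, written in plain Java. The class and method names are made up for illustration and are not part of nbvcxz; the check simply encodes the four composition rules plus the length minimum.

```java
final class ArbitraryPolicyDemo {
    // Hypothetical helper: the composition policy from the example above
    // (8+ characters with upper case, lower case, a digit, and a special character).
    static boolean meetsPolicy(String password) {
        return password.length() >= 8
                && password.chars().anyMatch(Character::isUpperCase)
                && password.chars().anyMatch(Character::isLowerCase)
                && password.chars().anyMatch(Character::isDigit)
                && password.chars().anyMatch(c -> !Character.isLetterOrDigit(c));
    }

    public static void main(String[] args) {
        System.out.println(meetsPolicy("Passw0rd!"));             // true
        System.out.println(meetsPolicy("5fa83b7e1r39xfa8hmiz0")); // false
    }
}
```

The policy accepts the easily guessed password and rejects the randomly generated one, which is exactly the failure mode this section complains about.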

How to use

nbvcxz can be used as a stand-alone console program, or imported as a library.

Standalone

To use as a stand-alone program, just compile, and run it by calling: java -jar nbvcxz-1.5.1.jar

Library

nbvcxz can also be used as a library for password validation in Java back-ends. Below is a full example of the pieces you'd need to implement within your own application.

Configure and create object

All defaults

// With all defaults...
Nbvcxz nbvcxz = new Nbvcxz();

Localization

Here we're creating a custom configuration which localizes all text to French

// Create our configuration object and set the locale
Configuration configuration = new ConfigurationBuilder()
        .setLocale(Locale.forLanguageTag("fr"))
        .createConfiguration();
        
// Create our Nbvcxz object with the configuration we built
Nbvcxz nbvcxz = new Nbvcxz(configuration);

Custom configuration

Here we're creating a custom configuration with a custom exclusion dictionary and minimum entropy

// Build an exclusion dictionary on a per-user basis using a hypothetical "User" object that contains this info
List<Dictionary> dictionaryList = ConfigurationBuilder.getDefaultDictionaries();
dictionaryList.add(new DictionaryBuilder()
        .setDictionaryName("exclude")
        .setExclusion(true)
        .addWord(user.getFirstName(), 0)
        .addWord(user.getLastName(), 0)
        .addWord(user.getEmail(), 0)
        .createDictionary());

// Create our configuration object and set our custom minimum
// entropy, and custom dictionary list
Configuration configuration = new ConfigurationBuilder()
        .setMinimumEntropy(40d)
        .setDictionaries(dictionaryList)
        .createConfiguration();
        
// Create our Nbvcxz object with the configuration we built
Nbvcxz nbvcxz = new Nbvcxz(configuration);

Estimate password strength

Simple

// Estimate password 
Result result = nbvcxz.estimate(password);

return result.isMinimumEntropyMet();

Feedback

This part will need to be integrated into your specific front end, and really depends on your needs. Here are some of the possibilities:

// Get formatted values for time to crack based on the values we 
// input in our configuration (we used default values in this example)
String timeToCrackOff = TimeEstimate.getTimeToCrackFormatted(result, "OFFLINE_BCRYPT_12");
String timeToCrackOn = TimeEstimate.getTimeToCrackFormatted(result, "ONLINE_THROTTLED");

// Check if the password met the minimum set within the configuration
if(result.isMinimumEntropyMet())
{
    // Start building success message
    StringBuilder successMessage = new StringBuilder();
    successMessage.append("Password has met the minimum strength requirements.");
    successMessage.append("<br>Time to crack - online: ").append(timeToCrackOn);
    successMessage.append("<br>Time to crack - offline: ").append(timeToCrackOff);    
    
    // Example "success message" that would be displayed to the user
    // This is obviously just a contrived example and would have to
    // be tailored to each front-end
    setSuccessMessage(successMessage.toString());
    return true;
}
else
{
    // Get the feedback for the result
    // This contains hints for the user on how to improve their password
    // It is localized based on locale set in configuration
    Feedback feedback = result.getFeedback();
    
    // Start building error message
    StringBuilder errorMessage = new StringBuilder();
    errorMessage.append("Password does not meet the minimum strength requirements.");
    errorMessage.append("<br>Time to crack - online: ").append(timeToCrackOn);
    errorMessage.append("<br>Time to crack - offline: ").append(timeToCrackOff);
    
    if(feedback != null)
    {
        if (feedback.getWarning() != null)
            errorMessage.append("<br>Warning: ").append(feedback.getWarning());
        for (String suggestion : feedback.getSuggestion())
        {
            errorMessage.append("<br>Suggestion: ").append(suggestion);
        }
    }
    // Example "error message" that would be displayed to the user
    // This is obviously just a contrived example and would have to
    // be tailored to each front-end
    setErrorMessage(errorMessage.toString());
    return false;
}

Generate passphrase/password

We have a passphrase/password generator as part of nbvcxz which is very easy to use.

Passphrase

// Generate a passphrase from the standard (eff_large) dictionary with 5 words with a "-" between the words
String pass1 = Generator.generatePassphrase("-", 5);

// Generate a passphrase from a custom dictionary with 5 words with a "-" between the words
String pass2 = Generator.generatePassphrase(new Dictionary(...), "-", 5);

Password

// Generate a random password with alphanumeric characters that is 15 characters long
String pass = Generator.generateRandomPassword(Generator.CharacterTypes.ALPHANUMERIC, 15);

Bugs and Feedback

For bugs, questions and discussions please use the GitHub Issues.

License

MIT License

http://www.opensource.org/licenses/mit-license.php

Requires Java

Java 1.7+

Application using this library

Blacksmith TPM - Formerly GoSimple TPM
Pazzword - Intelligent Password Evaluator
KeePassDX - Open source password manager for Android

If anyone else is using the library in their application, I'd love to hear about it and put a link up here.

nbvcxz's Issues

Add support for HIBP password API

I would like to add a matcher/match type that represents searching the HIBP api for a match.

I'd also like to see about supporting "local" search of the password hashes for those who don't have an API key and want to ensure everything is kept within their control.
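As a rough illustration of what such a matcher would need before any lookup (remote or local), here is a sketch of the hashing step only. It relies on the fact that HIBP's Pwned Passwords data is keyed by the upper-case SHA-1 hash of the password, and that its range lookup takes the first five hex characters of that hash. The class and method names are hypothetical and nothing here is part of nbvcxz.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HibpHashSketch {
    // Upper-case hex SHA-1 of the password, the form used in the Pwned Passwords data.
    static String sha1Hex(String password) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(password.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02X", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // First five hex characters: the only part a range lookup would need to send.
    static String rangePrefix(String password) {
        return sha1Hex(password).substring(0, 5);
    }

    // Remaining 35 characters: compared locally against the returned (or local) hash list.
    static String rangeSuffix(String password) {
        return sha1Hex(password).substring(5);
    }

    public static void main(String[] args) {
        System.out.println(rangePrefix("password")); // 5BAA6
    }
}
```

A "local" search as described above could use the same full hash as the key into a locally stored, sorted hash file, so the password never leaves the machine.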

Multiple simultaneous connections cause heap dump

Not sure if this is an issue or if we haven't implemented something correctly. We're trying to use the library for password validation, and we've found that 200+ simultaneous connections fill the JVM heap and crash our servers. Basically we have an API endpoint that the user calls and passes their password. We then call nbvcxz.estimate(password).

L33t words not matched as expected

L33tified words seem to get only partial dictionary matches.

Result

The password Ch1ck3n1970 currently yields:

BruteForceMatch: C
BruteForceMatch: h
BruteForceMatch: 1
BruteForceMatch: c
DictionaryMatch: ken (male_names)
YearMatch: 1970

For comparison, this is Chicken1970:

DictionaryMatch: chicken (passwords)
YearMatch: 1970

Expected

For Ch1ck3n1970 I would expect the same matches as Chicken1970:

DictionaryMatch: chicken (passwords, Leet Substitutions true)
YearMatch: 1970
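The expected behaviour can be sketched as a normalization step that runs before the dictionary lookup. This is only an illustration under simplifying assumptions: the substitution table below is a small hypothetical one with a single replacement per character, whereas a real leet table maps some characters to several candidates (for example, 1 could be i or l). It is not nbvcxz's actual matcher.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class LeetNormalizeSketch {
    // Hypothetical, deliberately tiny substitution table (one candidate per character).
    private static final Map<Character, Character> LEET = new HashMap<>();
    static {
        LEET.put('0', 'o');
        LEET.put('1', 'i');
        LEET.put('3', 'e');
        LEET.put('4', 'a');
        LEET.put('5', 's');
        LEET.put('7', 't');
        LEET.put('@', 'a');
        LEET.put('$', 's');
    }

    // Lower-cases the token and undoes the substitutions so it can be looked up in a dictionary.
    static String normalize(String token) {
        StringBuilder out = new StringBuilder(token.length());
        for (char c : token.toLowerCase(Locale.ROOT).toCharArray()) {
            Character replacement = LEET.get(c);
            out.append(replacement != null ? replacement : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("Ch1ck3n")); // chicken
    }
}
```

With the word part normalized to "chicken", the remaining "1970" would be left for the year matcher, giving the two matches listed under "Expected".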

Wrong entropy computing

I use 1.5.0 version of the library.

I use Nbvcxz for calculating entropy, and I configure it in the following way:

private static final double MIN_ENTROPY = 27d;
private static final Nbvcxz nbvcxz;

static {
    List<Dictionary> dictionaryList = ConfigurationBuilder.getDefaultDictionaries();

    Configuration configuration =
        new ConfigurationBuilder()
            .setLocale(Locale.forLanguageTag("ru"))
            .setMinimumEntropy(MIN_ENTROPY)
            .setDictionaries(dictionaryList)
            .createConfiguration();

    nbvcxz = new Nbvcxz(configuration);
}

Then I try to get the entropy of a password:

double currentEntropy = nbvcxz.estimate(password).getEntropy();

So, when I check the password A2013n2000 I get 22.11254640984653. And when I check the password A2013n200 (the same but without the last 0) I get 36.54152973076582.

I think this is an error. Why does the more complex password have lower entropy?

Dictionary word not always recognized

In this example I have configured nbvcxz with an additional single-entry dictionary to test how a common password occurrence (surname or given name plus a few numbers) is handled:

// Just a random string for testing. Please assume for the sake of the example that
// 'Gohklhiepu' is the user's surname.
Map<String, Integer> excludeMap = new HashMap<>();
excludeMap.put("Gohklhiepu", 0);

// Just like the README.
List<Dictionary> dictionaryList = ConfigurationBuilder.getDefaultDictionaries();
dictionaryList.add(new Dictionary("exclude", excludeMap, true));

Configuration configuration = new ConfigurationBuilder()
        .setDictionaries(dictionaryList)
        .setMinimumEntropy(30d)
        .createConfiguration();

Nbvcxz nbvcxz = new Nbvcxz(configuration);


// Test A.

Result result = nbvcxz.estimate("Gohklhiepu");

System.out.println(result.getEntropy());
// 0.0

for (Match match : result.getMatches()) {
    System.out.println(match.getDetails());
    // DictionaryMatch
}


// Test B.

result = nbvcxz.estimate("Gohklhiepu3425");

System.out.println(result.getEntropy());
// 60.29210956096036

for (Match match : result.getMatches()) {
    System.out.println(match.getDetails());
    // A series of BruteForceMatch
}

As expected, using the fictional dictionary word as password gives 0.0 entropy.

Surprisingly, word + 3425 (which is a rather weak password if we assume that word is the user's surname and the number his postal code or house number) results in a rather high entropy of 60.3. It looks like the word is not recognized as a dictionary word — all matches are BruteForceMatch.

Could this be a bug?

Please make a new release

Please make a new release, we need the fix for Issue #47.

Dictionary distance

/u/finlay_mcwalter on reddit said:

From the linked explanation:
...looking for matches in any part of the password on: word lists, ...
I wonder if, rather than (or perhaps as well as) looking for matches in any part, it would be more productive to compute the Levenshtein distance between your various "bad passwords" and the candidate password?
With that, a password of paXssXwoXrd, which has only a Levenshtein distance to password of 3, would be flagged as weak, whereas it has none of the bad characteristics you're currently looking for.
Some of the daft ways people "harden" their passwords, like l33t and cASEsWappING, will produce pretty low Levenshtein statistics without the need for a dedicated matcher.
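For reference, the distance mentioned in the quote can be computed with the standard dynamic-programming algorithm. The sketch below is a generic textbook implementation, not the code nbvcxz uses; the class name is made up for illustration.

```java
final class LevenshteinSketch {
    // Classic two-row dynamic programming: minimum number of single-character
    // insertions, deletions, and substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(distance("paXssXwoXrd", "password")); // 3
    }
}
```

A matcher built on this would compare the candidate password against each dictionary entry and flag it when the distance falls under some chosen threshold.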

Add finger-shift technique to dictionary matcher

Implement the finger-shift technique as described here: https://medium.com/@josh.cummings/but-what-if-my-passwords-were-musical-316c89e26a9e

Update default cracking speed

It's been 8 months since the default cracking speed has been updated, and the new generation of Nvidia graphics cards have come out since. Update the cracking speed to what you can now get for ~$20k.

Fix bruteforce output

Combine the output of consecutive bruteforce matches prior to returning the result. This is ugly as can be right now with each letter going into its own bruteforce match.

Note - this will change the scoring for these, which isn't quite right currently, as we don't properly guess the cardinality of the total bruteforce section of the password.

Example:

----------------------------------------------------------
Commands: estimate password (e); generate password (g); quit (q)
Please enter your command:
e
Please enter the password to estimate:
4@8({[</369&#!1/|
----------------------------------------------------------
Time to calculate: 9 ms
Password: 4@8({[</369&#!1/|
Entropy: 75.41990388226716
Your password meets the minimum strength requirement.
Time to crack: ONLINE_THROTTLED: infinite (>100000 centuries)
Time to crack: ONLINE_UNTHROTTLED: infinite (>100000 centuries)
Time to crack: OFFLINE_BCRYPT_14: infinite (>100000 centuries)
Time to crack: OFFLINE_BCRYPT_12: infinite (>100000 centuries)
Time to crack: OFFLINE_BCRYPT_10: infinite (>100000 centuries)
Time to crack: OFFLINE_BCRYPT_8: infinite (>100000 centuries)
Time to crack: OFFLINE_BCRYPT_5: infinite (>100000 centuries)
Time to crack: OFFLINE_SHA512: 318 centuries
Time to crack: OFFLINE_SHA1: 39 centuries
Time to crack: OFFLINE_MD5: 13 centuries
-----------------------------------
Match Type: BruteForceMatch
Entropy: 3.3219280948873626
Token: 4
Start Index: 0
End Index: 0
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: @
Start Index: 1
End Index: 1
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 3.3219280948873626
Token: 8
Start Index: 2
End Index: 2
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: (
Start Index: 3
End Index: 3
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: {
Start Index: 4
End Index: 4
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: [
Start Index: 5
End Index: 5
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: <
Start Index: 6
End Index: 6
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: /
Start Index: 7
End Index: 7
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 3.3219280948873626
Token: 3
Start Index: 8
End Index: 8
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 3.3219280948873626
Token: 6
Start Index: 9
End Index: 9
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 3.3219280948873626
Token: 9
Start Index: 10
End Index: 10
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: &
Start Index: 11
End Index: 11
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: #
Start Index: 12
End Index: 12
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: !
Start Index: 13
End Index: 13
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 3.3219280948873626
Token: 1
Start Index: 14
End Index: 14
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: /
Start Index: 15
End Index: 15
Length: 1
-----------------------------------
Match Type: BruteForceMatch
Entropy: 5.044394119358453
Token: |
Start Index: 16
End Index: 16
Length: 1
----------------------------------------------------------
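A minimal sketch of the merge, assuming the per-character cardinalities that the output above implies (log2(10) ≈ 3.32 for digits, log2(33) ≈ 5.04 for symbols). The value 26 for letters is an assumption, since no letters appear in the example, and the class is hypothetical rather than nbvcxz code. The sketch keeps the current per-character sum; it does not attempt the improved cardinality guess mentioned in the note above.

```java
final class BruteForceMergeSketch {
    // Per-character entropy in bits, using cardinalities inferred from the console output above.
    static double charEntropy(char c) {
        int cardinality;
        if (Character.isDigit(c)) {
            cardinality = 10;
        } else if (Character.isLetter(c)) {
            cardinality = 26; // assumption: not shown in the example output
        } else {
            cardinality = 33;
        }
        return Math.log(cardinality) / Math.log(2);
    }

    // Entropy of one merged match covering the whole run of brute-force characters.
    static double mergedEntropy(String token) {
        double total = 0.0;
        for (char c : token.toCharArray()) {
            total += charEntropy(c);
        }
        return total;
    }

    public static void main(String[] args) {
        String token = "4@8({[</369&#!1/|";
        System.out.println("Match Type: BruteForceMatch");
        System.out.println("Entropy: " + mergedEntropy(token));
        System.out.println("Token: " + token);
        System.out.println("Start Index: 0");
        System.out.println("End Index: " + (token.length() - 1));
        System.out.println("Length: " + token.length());
    }
}
```

For the example password this yields a single match whose entropy is about 75.42 bits, the same total as the seventeen separate matches.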

Very long passwords passed in for estimation can cause slowness

It was noted in this discussion that if an extremely long password was passed in, it could cause performance issues.
7ep/demo@c5671bc#commitcomment-32260369

Implement a max length configuration which will have a sane default (like 72 characters to match bcrypt's max length for example), and truncate any characters after that. Return feedback that the max length was hit if that was the case.
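A sketch of the proposed behaviour, with hypothetical names (this is not an existing nbvcxz API): truncate the input before estimation and remember whether truncation happened so that feedback can mention it.

```java
final class MaxLengthSketch {
    // Proposed default from the issue text: bcrypt's 72-character limit.
    static final int DEFAULT_MAX_LENGTH = 72;

    // Returns the part of the password that would actually be estimated.
    static String truncate(String password, int maxLength) {
        return password.length() > maxLength ? password.substring(0, maxLength) : password;
    }

    // True when feedback should tell the user that the max length was hit.
    static boolean wasTruncated(String password, int maxLength) {
        return password.length() > maxLength;
    }

    public static void main(String[] args) {
        String longPassword = new String(new char[100]).replace('\0', 'a');
        System.out.println(truncate(longPassword, DEFAULT_MAX_LENGTH).length()); // 72
        System.out.println(wasTruncated(longPassword, DEFAULT_MAX_LENGTH));      // true
    }
}
```

Note that String.length() counts UTF-16 code units while bcrypt's limit is in bytes, so a real implementation would have to decide which unit the limit applies to.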

The new match finding algorithm is slow

The new match finding algorithm dominates the running time of the entire process for some types of passwords.

Add additional helper to calculate the minimum entropy

I've been thinking about current computing trends. The current pattern we have for specifying a minimum entropy is based on:

setMinimumEntropy(Double minimumEntropy) - manual decision by human
setMinimumEntropy(BigDecimal secondsToCrack, String guessType) - based on a fixed size cracking rig specified with setCrackingHardwareCost(final Long crackingHardwareCost)

If instead, we allowed the user to configure an average cost to crack password based on hash algorithm selected and calculated entropy of password within a fixed time period, it would allow our users to easily decide on a minimum entropy based on threat type and potential resources of adversary.
E.g., setMinimumEntropy(BigDecimal averageSpend, String guessType)
The averageSpend could be some cloud-based best $/hash per-algorithm and would account for the virtually unlimited parallelism you can throw at cracking problems with modern cloud hardware and a cartel or nation-state sized budget.

The threat model I am trying to address is: "my login database was leaked and is now public with user information and password hashes; however, Nbvcxz was used to ensure passwords met some minimum entropy, so all passwords stored in this database are invulnerable to bruteforce attacks and have no other obvious weakness that we can detect with targeted attacks (diceware, for example)."

This should allow users of this library to configure it to their organization's threat level in a way that is more comprehensible to someone in a corporate security department who has to decide how to configure software that uses Nbvcxz and surfaces this configuration to the user. It's the only real way to quantify how fast someone can break a password when they can rent (or botnet) a virtually unlimited supply of compute power and put it to the task of cracking hashes.
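One possible way to turn an average spend into a minimum entropy is sketched below. It rests on two assumptions that are not stated in the proposal: that a password with entropy H takes 2^H guesses to exhaust, and that an attacker succeeds on average after searching half of that space. The method name, parameters, and the cost-per-guess input are hypothetical; a real implementation would derive the cost per guess from the selected hash algorithm.

```java
final class SpendBasedEntropySketch {
    // Smallest entropy (in bits) for which the AVERAGE cost to crack one password
    // is at least averageSpend, given a fixed cost per guess in the same currency.
    // Average guesses needed = 2^(H - 1), so H = log2(averageSpend / costPerGuess) + 1.
    static double minimumEntropy(double averageSpend, double costPerGuess) {
        if (averageSpend <= 0 || costPerGuess <= 0) {
            throw new IllegalArgumentException("spend and cost per guess must be positive");
        }
        double affordableGuesses = averageSpend / costPerGuess;
        return Math.log(affordableGuesses) / Math.log(2) + 1.0;
    }

    public static void main(String[] args) {
        // Purely illustrative numbers, not measured prices.
        System.out.println(minimumEntropy(1024.0, 1.0));
    }
}
```

Because the result depends only on the ratio of spend to cost per guess, parallelism does not change it: renting more hardware shortens the time to crack but not the total bill.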

Some links:
https://web.archive.org/web/20230106023836/https://security.stackexchange.com/questions/117392/password-cracking-time-vs-cost
https://web.archive.org/web/20230106023847/https://blog.1password.com/cracking-challenge-update/

Need more translations

I'd love to have nbvcxz translated into as many languages as possible. Currently, it's got an English and French translation. I sadly am not multilingual, so I can't really do anything here. I am hoping some users who are multilingual can take a little bit of time to translate the not terribly large amount of text into whatever languages they are able to.

The two files which need to be translated to another language can be found here: https://github.com/GoSimpleLLC/nbvcxz/blob/master/src/main/resources/feedback.properties
and here: https://github.com/GoSimpleLLC/nbvcxz/blob/master/src/main/resources/main.properties

Edit: We have had some translations completed already, so I'll list them here.
French
Russian
Ukrainian
Afrikaans
Hungarian
Spanish
Portuguese

If you know the language, feel free to review each: https://github.com/GoSimpleLLC/nbvcxz/tree/master/src/main/resources

zxcvbn compatibility

We are currently building password rating into two separate clients, one Java, one C++ and would like to use native password rating libraries.

For Java we are already using nbvcxz, for C++ we want to use zxcvbn.

Ideally both clients would identify the same passwords as weak. However, reading https://github.com/GoSimpleLLC/nbvcxz#differentiating-features, I'm assuming this never was a goal of nbvcxz (which I understand).
What would it take to give nbvcxz a "compatibility" mode that would make it produce the same (or at least almost the same) results as zxcvbn?

As a best-effort measure: are there some configurations that would get nbvcxz results closer aligned to the zxcvbn results?

acsploit

I noticed that nbvcxz was mentioned as being targeted by: https://github.com/twosixlabs/acsploit

Ensure the generated passwords don't cause issues.

Can't turn off brute force matcher

Hey there,

first: nice lib, thanks for your work!

My Problem:
I would like to use your library with only my own password matchers.
The problem is the BruteForceMatches or the isRandom-check. Is there a way to turn it off?

English wordlist too short / not original zxcvbn list?

Hey there,
thanks for rewriting this awesome library for java.

Is there a reason why english.txt "only" contains 18,091 lines while the zxcvbn english_wikipedia.txt, which obviously partly contains the same entries, has 100,000 lines?

Otherwise I could open a pull request to fix this.

Greetings!

Add score enum indicating password strength

First, thank you for providing that awesome library!

Second, I need, just like zxcvbn provides, an enum indicating whether the password is worst, bad, weak, good, or strong. According to zxcvbn, it's possible to do that based on guesses:

result.score      # Integer from 0-4 (useful for implementing a strength bar)

  0 # too guessable: risky password. (guesses < 10^3)

  1 # very guessable: protection from throttled online attacks. (guesses < 10^6)

  2 # somewhat guessable: protection from unthrottled online attacks. (guesses < 10^8)

  3 # safely unguessable: moderate protection from offline slow-hash scenario. (guesses < 10^10)

  4 # very unguessable: strong protection from offline slow-hash scenario. (guesses >= 10^10)

This would be accessible calling getScore() in me.gosimple.nbvcxz.scoring#Result
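A sketch of how such an enum could be derived from an entropy value, using the zxcvbn thresholds quoted above. It assumes that the number of guesses is 2 raised to the entropy in bits; the enum and method names are hypothetical and do not describe the actual getScore() implementation. Working in log10 avoids overflow for large entropies.

```java
final class ScoreSketch {
    // Names taken from the request above; ordinal() matches the 0-4 zxcvbn score.
    enum Score { WORST, BAD, WEAK, GOOD, STRONG }

    static Score fromEntropy(double entropyBits) {
        // log10(guesses) = log10(2^entropy) = entropy * log10(2)
        double log10Guesses = entropyBits * Math.log10(2);
        if (log10Guesses < 3) return Score.WORST;   // guesses < 10^3
        if (log10Guesses < 6) return Score.BAD;     // guesses < 10^6
        if (log10Guesses < 8) return Score.WEAK;    // guesses < 10^8
        if (log10Guesses < 10) return Score.GOOD;   // guesses < 10^10
        return Score.STRONG;                        // guesses >= 10^10
    }

    public static void main(String[] args) {
        System.out.println(fromEntropy(40.0)); // STRONG
    }
}
```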

ConfigurationBuilder.getDefaultXYZ returns internal instances

Because ConfigurationBuilder.getDefaultXYZ returns the internal instances (and because of the way the documentation suggests to add stuff to the configuration builder), one can easily add for example matchers multiple times.

public int estimate(String password) {
    // Returns the shared internal list, so every call adds another matcher to it
    List<PasswordMatcher> matchers = ConfigurationBuilder.getDefaultPasswordMatchers();
    matchers.add(new PasswordMatcher() {
        @Override
        public List<Match> match(Configuration configuration, String password) {
            System.out.println("Called with: " + password);
            return new ArrayList<>();
        }
    });
    Configuration config = new ConfigurationBuilder()
        .setPasswordMatchers(matchers)
        .createConfiguration();
    Nbvcxz tester = new Nbvcxz(config);
    Result result = tester.estimate(password);
    return result.getBasicScore(); // Yes I've seen that you dislike this, but just to illustrate! :)
}

Running the above once will print "Called with ...". Running it a second time will print "Called with ..." twice, a third time thrice, a fourth time four times, and so on.

One can get around this either by creating the configuration once (and not every time a test is made) or by wrapping the return value of ConfigurationBuilder.getDefaultPasswordMatchers() in a new ArrayList.

This is not a big bug but given the documentation I spent a good half hour scratching my head as to why I was getting multiple calls to my "PasswordMatcher".

Best Regards,
Per-Erik
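The effect and the copy-based workaround can be reproduced without the library. In the sketch below, getDefaultMatchers() is a hypothetical stand-in for any getter that hands out its internal list; strings stand in for matcher objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DefensiveCopySketch {
    // Stand-in for a builder's internal default list.
    private static final List<String> DEFAULT_MATCHERS =
            new ArrayList<>(Arrays.asList("dictionary", "year"));

    // Returns the internal instance itself, like the getter described in the issue.
    static List<String> getDefaultMatchers() {
        return DEFAULT_MATCHERS;
    }

    // Adds one custom matcher and reports how many matchers the caller ends up with.
    static int matcherCountAfterAdding(boolean copyFirst) {
        List<String> matchers = copyFirst
                ? new ArrayList<>(getDefaultMatchers()) // private copy: defaults stay untouched
                : getDefaultMatchers();                 // shared list: additions accumulate
        matchers.add("custom");
        return matchers.size();
    }

    public static void main(String[] args) {
        System.out.println(matcherCountAfterAdding(true));  // 3
        System.out.println(matcherCountAfterAdding(true));  // 3
        System.out.println(matcherCountAfterAdding(false)); // 3
        System.out.println(matcherCountAfterAdding(false)); // 4
    }
}
```

Copying on the caller's side works today; having the getter itself return a fresh list would remove the trap for everyone.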

Substituted characters not fully implemented

It seems that some substituted characters e.g. @ for a and 5 for s are implemented, but some are not.
I have tested 2u and uu for w e.g.
P@55w0rd provides a basic score of 0
P@55uu0rd provides a basic score of 4

I found a list of substituted characters at:

https://en.wikipedia.org/wiki/Munged_password

Passwords shorter than 10 characters are considered to be weak (NIST SP800-132)

I quoted this from OWASP, see https://www.owasp.org/index.php/Authentication_Cheat_Sheet#Password_Length

From what you wrote:

a password must have: an eight character minimum and contain upper case, lower case, numbers, and special characters

Is it possible to change the minimum length? Or make it configurable?

Program fails if maintainer is hit by a bus

The latest release serves primarily to bump up the default hashing speed. If the maintainer is hit by a bus (or, less likely, loses interest in the project), it'd be good to have an option that automatically bumps this variable up based on Moore's law.

It'll be less accurate than a human-approved value, but more future proof.
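A sketch of what such an automatic bump could look like. The doubling period is a parameter because the issue does not specify one; the two-year value in the usage example is only the common reading of Moore's law, and all names and numbers here are hypothetical.

```java
final class CrackingSpeedProjection {
    // Scales a human-approved baseline speed by 2^(elapsed years / doubling period).
    static double projectedGuessesPerSecond(double baseGuessesPerSecond,
                                            double yearsSinceBaseline,
                                            double doublingPeriodYears) {
        if (doublingPeriodYears <= 0) {
            throw new IllegalArgumentException("doubling period must be positive");
        }
        return baseGuessesPerSecond * Math.pow(2.0, yearsSinceBaseline / doublingPeriodYears);
    }

    public static void main(String[] args) {
        // Illustrative baseline of 1000 guesses/s, four years old, doubling every two years.
        System.out.println(projectedGuessesPerSecond(1000.0, 4.0, 2.0)); // 4000.0
    }
}
```

The years-since-baseline input could be computed from the library's release date, so the estimate keeps rising even if no new release is made.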

certain bad passwords make it through the filter

Try this password:

Arvest#1
Expected result: insufficient entropy
Actual result: sufficient

Then try this one:

Arvest#2

Expected result: insufficient entropy
Actual result: is correct - insufficient entropy

You are welcome to use https://github.com/7ep/demo to see this for yourself, in the unit tests at RegistrationUtilsTests at testShouldHaveInsufficientEntropyInPassword

Add ability to run in fixed-time

We should have the ability to run the estimate method within a fixed timeframe.

If the algorithms within fail to calculate within the specified timeframe, we should still return a Result that contains a Match that specifies that this was timed out and the password will be rejected for that reason.

This should allow safer use within backend systems regardless of algorithmic complexity vulnerabilities existing with certain inputs.
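One way to sketch the idea with standard Java concurrency utilities is to run the estimation on a worker thread and give up after a deadline. The estimator is passed in as a Callable, and the sentinel value stands in for the "timed out" Match described above; none of these names are nbvcxz APIs.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimedEstimateSketch {
    // Hypothetical sentinel: the caller treats this as "reject, estimation timed out".
    static final double TIMED_OUT = -1.0;

    static double estimateWithTimeout(Callable<Double> estimator, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<Double> future = executor.submit(estimator);
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return TIMED_OUT;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return TIMED_OUT;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Estimator failed", e.getCause());
        } finally {
            executor.shutdownNow(); // interrupts the worker if it is still running
        }
    }

    public static void main(String[] args) {
        System.out.println(estimateWithTimeout(() -> 42.0, 1000)); // 42.0
    }
}
```

Note that shutdownNow() only interrupts the worker; a CPU-bound estimation loop would also need to check the interrupt flag to actually stop consuming CPU after the deadline.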

"secret secret secret" has basic score 4 (very unguessable) with estimated 130 billion guesses

That seems completely wrong to me.

At the same time "password password password" gets a score of 1 with estimated 900 thousand guesses.
What is going on here?

Isn't there a check for repeated words at all?

Can I customize the library to handle this better?

Matches returned not always the most optimal combination

I have found some cases where there is a better way the matches could have been combined to give a lower score.

This happens while building the match list: the match thought to have the lowest average entropy is added first, but choosing it over a shorter match with a slightly higher average prevents the next match in line, which has significantly lower entropy, from being added.
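One standard way to avoid this kind of greedy mistake is dynamic programming over password positions. The sketch below is not the library's algorithm; `Match`, `lowestTotalEntropy`, and the fixed per-character brute-force cost are simplifications assumed for illustration.

```java
import java.util.Arrays;
import java.util.List;

class BestMatchCombination {
    /** Hypothetical simplified match covering password[start..end], both inclusive. */
    static final class Match {
        final int start;
        final int end;
        final double entropy;
        Match(int start, int end, double entropy) {
            this.start = start;
            this.end = end;
            this.entropy = entropy;
        }
    }

    /** Returns the lowest total entropy of any non-overlapping cover of the password. */
    static double lowestTotalEntropy(int length, List<Match> matches, double bruteForcePerChar) {
        // best[i] = lowest total entropy that covers the first i characters
        double[] best = new double[length + 1];
        for (int i = 1; i <= length; i++) {
            // Fallback: cover character i-1 with a single brute-force guess.
            best[i] = best[i - 1] + bruteForcePerChar;
            for (Match m : matches) {
                if (m.end == i - 1) {
                    best[i] = Math.min(best[i], best[m.start] + m.entropy);
                }
            }
        }
        return best[length];
    }

    public static void main(String[] args) {
        List<Match> matches = Arrays.asList(
                new Match(0, 5, 6.0),  // lowest average entropy at position 0, but blocks the third match
                new Match(0, 3, 4.8),  // slightly higher average, shorter
                new Match(4, 9, 2.0)); // very low entropy, overlaps the first match
        System.out.println(lowestTotalEntropy(10, matches, 5.0));
    }
}
```

In this example, picking the first match greedily forces characters 6 to 9 into brute force (6.0 + 4 × 5.0 = 26.0), while the table finds the second and third matches together (4.8 + 2.0).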

Add a pass phrase generator

Allow the library to be used to generate passphrases from any dictionary (user supplied as well).

Simple password entropy check CPU time approaches heat death of the universe

Running it against a relatively simple, albeit long, password such as aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa seems to exceed what little CPU power we have here in our data centre, effectively making our system very secure by being very dead.

Is there anything that we could do (configuration-wise) except slicing it up or taking a substring?

Example repro via standalone jar:

>java -jar nbvcxz-1.3.1.jar
Commands: estimate password (e); generate password (g); quit (q)
Please enter your command:
e
Please enter the password to estimate:
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

The aaa example is just for the repro, in reality our password field (was) unbounded.

Too high estimates when finding words in dictionaries

The way that DictionaryMatcher#match works isn't consistent, or at least one could make an argument that it isn't. It gives too high entropy to some passwords that are just "worse" l33t-substituted versions of some other password. For example

The password "passw0rd" will be matched directly to a word found in the dictionary "passwords". In that dictionary "passw0rd" is at rank 411 so it is given an entropy of ~8.68.

The password "p4s5w0rd" will not be matched directly to a word found in the dictionary "passwords". Instead the match-method will try to use l33t-substitutions and one of those substitutions is the word "password". That word is then looked up in the dictionaries and found at rank 2 in the "passwords" dictionary. Thus, the password is given an entropy of ~3.32.

The "easy" fix for this might be to not short circuit the match-method with "continue" in the for-loops, but I'm not sure how that would affect other parts of the system and it will increase the running time of estimating a password.

Note that I'm not saying that "p4s5w0rd" is more secure than "passw0rd" and should get a higher score. What I'm saying is that its estimated entropy shouldn't be lower, since both strings are essentially a l33t-substitution on the word "password" - it's just that one of the strings is a common password while the other one isn't.

In general, if two strings can both be transformed into the same string and that transformed string has a lower entropy than either of the non-transformed strings, then both the non-transformed strings should get the entropy value of the transformed string.
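The rule can be sketched as "look up every form of the password and keep the best (lowest) rank". The substitution table, the tiny dictionary, and the method names below are assumptions for illustration; the real DictionaryMatcher is more involved, and any extra entropy it adds for substitutions is ignored here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

class TransformAwareLookup {
    // Hypothetical l33t table: substituted character -> original letter.
    static final Map<Character, Character> UNLEET = new HashMap<>();
    static {
        UNLEET.put('4', 'a');
        UNLEET.put('5', 's');
        UNLEET.put('0', 'o');
        UNLEET.put('3', 'e');
        UNLEET.put('1', 'l');
    }

    /** Reverses every known substitution; other characters pass through unchanged. */
    static String unleet(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            char mapped = UNLEET.getOrDefault(c, c);
            sb.append(mapped);
        }
        return sb.toString();
    }

    /** Checks both the direct and the un-l33ted form and returns the lowest rank found. */
    static OptionalInt bestRank(String password, Map<String, Integer> rankedDictionary) {
        String lower = password.toLowerCase(Locale.ROOT);
        OptionalInt best = OptionalInt.empty();
        for (String candidate : new String[] { lower, unleet(lower) }) {
            Integer rank = rankedDictionary.get(candidate);
            if (rank != null && (!best.isPresent() || rank < best.getAsInt())) {
                best = OptionalInt.of(rank);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Integer> passwords = new HashMap<>();
        passwords.put("password", 2);
        passwords.put("passw0rd", 411);
        System.out.println(bestRank("passw0rd", passwords)); // direct hit at 411, but 2 after un-l33ting
        System.out.println(bestRank("p4s5w0rd", passwords)); // no direct hit, 2 after un-l33ting
    }
}
```

Because the direct hit no longer short-circuits the lookup, both passwords end up with the same rank, at the cost of an extra lookup per candidate.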

missing instructions how to compile

How to compile is not obvious if you don't know Java. Please add it to the README. By the way, on Debian:

sudo apt-get install maven
cd nbvcxz
mvn package

High deviation for a certain password

Regarding this password: Submit new password

on the web demo I get a score of 4 and a guesses count of 212200000000 (guesses_log10: 11.32675)
with this library I get a score of 2 and a guesses count of 48909168

Setup:

Same small additional user dictionary
PasswordMatcher removed from the matchers list
Version: 1.5.0

I saw in the README that there are differences between the two versions, but this one seems to be very different. Is there a way for me to achieve results that are as close as possible? I'm using both libraries for validation, and I end up in a bad situation: the frontend tells the user that the password strength is good, but after submitting, the server says that it's not.

Is this library thread-safe?

I've built a little command-line passphrase generator, and added your library in to calculate entropy for the generated passwords. I've noticed when I try to run multiple estimations in parallel with the same object, I get one of many kinds of errors:

ConcurrentModificationException
java.lang.IllegalStateException: There was an unexpected error and all of the matches put together do not equal the original password.
Entropy of 0.0

Can you chime in on whether this object is thread-safe?

Thanks!

Provide more feedback

@jdhoek said:

With a password like apples1970 two matches are found: a DictionaryMatch (apples is a common password) and a YearMatch for 1970.

The feedback returned does not include any mention of the year in its suggestions; probably because the DictionaryMatch ranks higher. From an end-user perspective I think it would be helpful if suggestions like feedback.year.suggestions.avoidYears are returned as well. That way in a UI with live feedback the suggestions for improvement can disappear one by one as the user improves their password.

StackOverflowError when generating estimate

I don't know what password caused this, but observed the following crash:

java.lang.StackOverflowError
	at java.base/java.util.TreeMap.getEntry(TreeMap.java:350)
	at java.base/java.util.TreeMap.get(TreeMap.java:279)
	at me.gosimple.nbvcxz.matching.DictionaryMatcher.replaceAtIndex(DictionaryMatcher.java:68)
	at me.gosimple.nbvcxz.matching.DictionaryMatcher.replaceAtIndex(DictionaryMatcher.java:82)
	at me.gosimple.nbvcxz.matching.DictionaryMatcher.replaceAtIndex(DictionaryMatcher.java:82)
	at me.gosimple.nbvcxz.matching.DictionaryMatcher.replaceAtIndex(DictionaryMatcher.java:82)
	<~700 more identical lines truncated>

This is a condensed version of the configuration that was used:

List<Dictionary> dictionaries = new ArrayList<>(ConfigurationBuilder.getDefaultDictionaries());
DictionaryBuilder userDictionary = new DictionaryBuilder()
        .setDictionaryName("user_details")
        .setExclusion(true);

var relatedWords = List.of("<user email and potentially other user fields>");

if (relatedWords != null) {
    for (String word : relatedWords) {
        userDictionary.addWord(word, 0);
    }
}

dictionaries.add(userDictionary.createDictionary());
dictionaries.add(new DictionaryBuilder()
    .setDictionaryName("example")
    .setExclusion(true)
    .addWord("example", 0)
    .addWord("example.com", 0)
    .createDictionary());

return new ConfigurationBuilder()
        .setDictionaries(dictionaries)
        .createConfiguration();

Using Nbvcxz v1.5.0.

Too high score for special characters

With the String "+=&/()!" (without quotes) nbvcxz returns a score of 4/4, entropy of around 35 and 7 brute force matches with an entropy of around 5 each. Which seems kinda overrated.

The online demo on the other hand returns a single brute force match with only a score of 2/4. Which seems more appropriate for a password of only 7 chars. I'm sure even some precomputed rainbow tables go up to 8 normal chars (letters, digits and regular special characters)

"very common password" feedback for a strong password

Hi, I tried to estimate password "discovercowboycow"

Nbvcxz nbvcxz = new Nbvcxz();
final Result result = nbvcxz.estimate("discovercowboycow");

The result has a basic score of 3 (out of 4), which is considered a strong password. However, the feedback (result.getFeedback().getWarning()) returns the message "This is a very common password.", which is just inappropriate here.

I think passwords with a score of 3 or 4 don't need a warning message (just like in the original zxcvbn).

Release new version containing German translations

I saw that German translations were brought in by @sgwerder with 7b954fd.

The released version 1.4.0 on Maven Central does not yet contain these. It would be great if you'd make a new release.

In the meantime, if anyone stumbles upon this, pulling via http://jitpack.io works:

<repositories>
  <repository>
    <id>jitpack.io</id>
    <url>https://jitpack.io</url>
  </repository>
</repositories>
<dependencies>
  <dependency>
    <groupId>com.github.GoSimpleLLC</groupId>
    <artifactId>nbvcxz</artifactId>
    <version>7b954fde10b0eb224e43c155e7ef2b822f4ce721</version> <!-- this is the same code as 1.4.0, but with german translations -->
  </dependency>
</dependencies>

Thanks for the great work!

Define a stable automatic module name

nbvcxz does not currently use modules. When we use it in a JDK9 modular application, Java turns the JAR into a so-called automatic module, whose name is derived from the file name. However, the default name does not follow the recommended module naming conventions (reverse-dns style, module name derived from the main exported package).

It is possible to specify a stable automatic module name through an Automatic-Module-Name manifest entry, while still targeting JDK8:

Automatic-Module-Name: me.gosimple.nbvcxz

Selecting a stable module name is very important, because Java does not allow two modules to own the same package. So either provide the module name through the manifest file or, even better, provide a complete module declaration.