Giter Site home page Giter Site logo

gustf / js-levenshtein Goto Github PK

View Code? Open in Web Editor NEW
460.0 5.0 38.0 64 KB

The most efficient JS implementation calculating the Levenshtein distance, i.e. the difference between two strings.

License: MIT License

JavaScript 100.00%
levenshtein edit-distance

js-levenshtein's People

Contributors

georgyberdyshev avatar gustf avatar mvasilkov avatar tech6hutch avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

js-levenshtein's Issues

Optimize for long strings

I was thinking if var vector = new Array(2 * la) would help instead of var vector = [];? In theory arrays are created with predefined size and they are reallocated if that size is exceeded (with some new limit estimate). I don't know how the browsers optimize it today, but surely the length constructor exists for a reason.

Improve Code Readability

I'm going to try my hand at this if I can wrap my head around the code. As it stands, it's fairly hard to understand what on earth is going on.

How to use it to search for a substring(or nearly it) in a large blob of text?

For example in the following text:

Easier to reason about useMemo can make your code easier to reason about by making it clear which values are being calculated and when.
Follow-up questions
What are some other ways to improve the performance of React applications?

I want to locate and split by sting like : Follow-up questions, Follow-up questions:, Follow-up question, Followup questions , Follow-up questions=, follow-up questions:= .
I know searching by regex is an option by I am looking to explore levenshtein to keep room for an unexpected letter.

TypeScript typings

This project doesn't have a typings file, which makes it hard to use in a TypeScript project. I didn't find any under @types, either.

Early stop when maximum distance reached [feature request]

Hello,

Thank you for this implementation. It is indeed very fast!

I am trying to optimize the algorithm for the specific (but rather common) case where we look for only close items. Stopping early when we know that the distance will certainly be over a given maximum distance can make the algorithm much faster.

I get already a good speed improvement (x2) by stopping early at the very beginning here and/or here with something like:

if (typeof max === 'number' && lb - la >= max) return max;

But when I try to stop early in the for loops below, by keeping the minimum value of vector and stopping if min + lb - la > max, it works but gets slower ;(

Do you have any idea of how to implement this feature in an optimized way?
The best would be of course to support this option directly in your code ;) but any advice is welcome!

Thank you very much for your help.

Cheers

precomputing against a list of strings

Perhaps not with this library specifically, but would you happen to know if it's possible to in essence "pre-compute" part of the levenshtein distance if we know what one argument will be? eg. I have a fixed set of strings, and the other input is always unknown.

That could nudge the performance even further I presume.

Question

hey I got a question, does it count different cases? like "something" and "SOMEthing" won't be 100%?

Increase compatibility

This library works fast, but its compatibility is very limited. Using a few lines like in old fast-levenshtein allows the algorithm to be used with AMD, requireJS, Web workers and commonJs.

Case Insensitive Distance

Have you considered offering a case-insensitive version of the levenshtein distance algorithm?

Would be useful for folks performing fuzzy searches.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.