gustf / js-levenshtein

460.0 5.0 38.0 64 KB

The most efficient JS implementation calculating the Levenshtein distance, i.e. the difference between two strings.

License: MIT License

JavaScript 100.00%

levenshtein edit-distance

js-levenshtein's Issues

Optimize for long strings

I was wondering whether var vector = new Array(2 * la) would help instead of var vector = []. In theory, arrays created with a predefined size only need to be reallocated once that size is exceeded (with some new size estimate). I don't know how browsers optimize this today, but surely the length constructor exists for a reason.
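One way to test this idea is to run the same distance loop against buffers created in different ways and time each variant. The sketch below is not the library's code: it is a plain single-row Levenshtein loop, and the names distanceWithBuffer and makeBuffer are hypothetical, introduced only for illustration. Whether preallocation actually helps depends on the engine, so treat it as a benchmark harness rather than a proven optimization.

```javascript
// Hypothetical helper: a textbook single-row Levenshtein loop whose working
// buffer is created by the caller-supplied makeBuffer(size) function.
function distanceWithBuffer(a, b, makeBuffer) {
  // Keep the shorter string in `a` so the buffer stays as small as possible.
  if (a.length > b.length) {
    const tmp = a;
    a = b;
    b = tmp;
  }
  const la = a.length;
  const lb = b.length;
  if (la === 0) return lb;

  const row = makeBuffer(la + 1);
  for (let i = 0; i <= la; i++) row[i] = i;

  for (let j = 1; j <= lb; j++) {
    let diagonal = row[0]; // value of the previous row at column i - 1
    row[0] = j;
    const bj = b.charCodeAt(j - 1);
    for (let i = 1; i <= la; i++) {
      const above = row[i]; // value of the previous row at column i
      const cost = a.charCodeAt(i - 1) === bj ? 0 : 1;
      row[i] = Math.min(above + 1, row[i - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[la];
}

// Three buffer strategies to compare in a benchmark.
const growing = () => [];                      // grows as indices are written
const presized = (n) => new Array(n);          // length constructor
const typed = (n) => new Uint32Array(n);       // fixed-size typed array
```

Calling distanceWithBuffer(a, b, growing), then presized, then typed on long inputs inside a timing loop should show whether the allocation strategy matters on a given engine.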

TypeScript support

Any thoughts on TypeScript support?

Improve Code Readability

I'm going to try my hand at this if I can wrap my head around the code. As it stands, it's fairly hard to understand what on earth is going on.

How to use it to search for a substring (or nearly it) in a large blob of text?

For example in the following text:

Easier to reason about useMemo can make your code easier to reason about by making it clear which values are being calculated and when.
Follow-up questions
What are some other ways to improve the performance of React applications?

I want to locate and split by strings like: Follow-up questions, Follow-up questions:, Follow-up question, Followup questions , Follow-up questions=, follow-up questions:= .
I know searching by regex is an option, but I am looking to explore Levenshtein to keep room for an unexpected letter.
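A plain string-to-string distance compares whole strings, so it does not locate a match inside a larger text by itself. A common variation is approximate substring matching, where the match is allowed to start at any position for free. The sketch below is not part of js-levenshtein; fuzzyFind is a hypothetical name, and the lowercasing step assumes that toLowerCase() does not change the string length (true for ASCII text like the example).

```javascript
// Hypothetical helper: finds the substring of `text` closest to `pattern`
// (case-insensitive) and returns { start, end, distance }, or null when no
// substring is within `maxDist` edits.
function fuzzyFind(text, pattern, maxDist) {
  const t = text.toLowerCase();
  const p = pattern.toLowerCase();
  const m = p.length;

  // dist[i]: best cost of matching the first i pattern chars against a
  // substring ending at the current text position.
  // start[i]: where that substring begins in the text.
  let dist = new Array(m + 1);
  let start = new Array(m + 1).fill(0);
  for (let i = 0; i <= m; i++) dist[i] = i;

  let best = null;
  for (let j = 0; j < t.length; j++) {
    const newDist = new Array(m + 1);
    const newStart = new Array(m + 1);
    newDist[0] = 0;        // a match may begin anywhere at no cost
    newStart[0] = j + 1;
    for (let i = 1; i <= m; i++) {
      const cost = p.charCodeAt(i - 1) === t.charCodeAt(j) ? 0 : 1;
      let d = dist[i - 1] + cost;   // substitute or match
      let s = start[i - 1];
      if (dist[i] + 1 < d) {        // extra character in the text
        d = dist[i] + 1;
        s = start[i];
      }
      if (newDist[i - 1] + 1 < d) { // pattern character missing from the text
        d = newDist[i - 1] + 1;
        s = newStart[i - 1];
      }
      newDist[i] = d;
      newStart[i] = s;
    }
    dist = newDist;
    start = newStart;
    if (dist[m] <= maxDist && (best === null || dist[m] < best.distance)) {
      best = { start: start[m], end: j + 1, distance: dist[m] };
    }
  }
  return best;
}
```

With a hit in hand, text.slice(0, hit.start) and text.slice(hit.end) give the two halves to split on. The cost is proportional to text length times pattern length, so for very large texts a cheaper prefilter may be worth adding.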

TypeScript typings

This project doesn't have a typings file, which makes it hard to use in a TypeScript project. I didn't find any under @types, either.

Early stop when maximum distance reached [feature request]

Hello,

Thank you for this implementation. It is indeed very fast!

I am trying to optimize the algorithm for the specific (but rather common) case where we look for only close items. Stopping early when we know that the distance will certainly be over a given maximum distance can make the algorithm much faster.

I already get a good speed improvement (about 2x) by stopping early at the very beginning, here and/or here, with something like:

if (typeof max === 'number' && lb - la >= max) return max;

But when I try to stop early in the for loops below, by keeping the minimum value of vector and stopping if min + lb - la > max, it works but gets slower ;(

Do you have any idea of how to implement this feature in an optimized way?
The best would be of course to support this option directly in your code ;) but any advice is welcome!

Thank you very much for your help.

Cheers
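One standard way to approach the request above is to compute only the diagonal band of cells that can still end up within the limit, and to stop as soon as a whole row exceeds it. The sketch below is a two-row textbook version, not the library's unrolled loop, so it illustrates the idea rather than matching js-levenshtein's speed. The name boundedDistance and the convention of returning max + 1 for "too far" are assumptions made for this example.

```javascript
// Hypothetical helper: returns the exact distance when it is <= max,
// otherwise returns max + 1 as a "too far" marker.
function boundedDistance(a, b, max) {
  if (a.length > b.length) {
    [a, b] = [b, a];
  }
  const la = a.length;
  const lb = b.length;
  if (lb - la > max) return max + 1; // length gap alone exceeds the limit
  if (la === 0) return lb;

  const INF = max + 1; // every value above max is clamped to this
  let prev = new Array(la + 1);
  let cur = new Array(la + 1);
  for (let i = 0; i <= la; i++) prev[i] = i <= max ? i : INF;

  for (let j = 1; j <= lb; j++) {
    // Only cells with |i - j| <= max can hold a value <= max.
    const lo = Math.max(1, j - max);
    const hi = Math.min(la, j + max);
    cur[lo - 1] = lo === 1 && j <= max ? j : INF;
    let rowMin = cur[lo - 1];
    const bj = b.charCodeAt(j - 1);
    for (let i = lo; i <= hi; i++) {
      const cost = a.charCodeAt(i - 1) === bj ? 0 : 1;
      const d = Math.min(prev[i - 1] + cost, cur[i - 1] + 1, prev[i] + 1, INF);
      cur[i] = d;
      if (d < rowMin) rowMin = d;
    }
    if (hi < la) cur[hi + 1] = INF; // guard cell read by the next row
    if (rowMin > max) return max + 1; // no path can recover from here
    const swap = prev;
    prev = cur;
    cur = swap;
  }
  return prev[la];
}
```

For small limits the band touches roughly (2 * max + 1) cells per row instead of the full row, which is where the saving comes from; the row-minimum check is then a cheap extra exit rather than the main optimization.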

precomputing against a list of strings

Perhaps not with this library specifically, but would you happen to know if it's possible to, in essence, "pre-compute" part of the Levenshtein distance if we know what one argument will be? E.g. I have a fixed set of strings, and the other input is always unknown.

That could nudge the performance even further I presume.
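One known technique for a fixed word list is to store the words in a trie, so that the distance rows for shared prefixes are computed once per query instead of once per word. The sketch below is independent of js-levenshtein; buildTrie and closest are hypothetical names, and whether this beats a simple loop over a fast pairwise function depends on how much prefix sharing the list has.

```javascript
// Hypothetical helper: builds a trie once from the fixed list of strings.
function buildTrie(words) {
  const root = { children: new Map(), word: null };
  for (const w of words) {
    let node = root;
    for (let i = 0; i < w.length; i++) {
      const ch = w[i];
      if (!node.children.has(ch)) {
        node.children.set(ch, { children: new Map(), word: null });
      }
      node = node.children.get(ch);
    }
    node.word = w; // marks the end of a stored word
  }
  return root;
}

// Hypothetical helper: returns { word, distance } for the stored word
// closest to `query`, or { word: null, distance: Infinity } for an empty trie.
function closest(root, query) {
  const n = query.length;
  let best = { word: null, distance: Infinity };
  if (root.word !== null) best = { word: root.word, distance: n };

  const firstRow = Array.from({ length: n + 1 }, (_, i) => i);

  function visit(node, ch, prevRow) {
    // One distance row per trie node, shared by every word below it.
    const row = new Array(n + 1);
    row[0] = prevRow[0] + 1;
    let rowMin = row[0];
    for (let i = 1; i <= n; i++) {
      const cost = query[i - 1] === ch ? 0 : 1;
      row[i] = Math.min(row[i - 1] + 1, prevRow[i] + 1, prevRow[i - 1] + cost);
      if (row[i] < rowMin) rowMin = row[i];
    }
    if (node.word !== null && row[n] < best.distance) {
      best = { word: node.word, distance: row[n] };
    }
    // Descend only if some word below could still beat the current best.
    if (rowMin < best.distance) {
      for (const [c, child] of node.children) visit(child, c, row);
    }
  }

  for (const [c, child] of root.children) visit(child, c, firstRow);
  return best;
}
```

The trie is built once up front, and each unknown input is then passed to closest(trie, input).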

Question

Increase compatibility

Case Insensitive Distance

Where can I plug a custom substitution cost function?
