Comments (6)
Hi Anton,
The Soundex match does work for most phonetically similar names. You should see names like Smith / Smythe or John / Jon match fine. Unfortunately, I don't see Soundex supporting the examples you gave, and I am not sure whether there is a better library that does. If you do find one, there is an option to override the Tokenizer function and make use of it.
Another option is to make use of a PreProcessing dictionary for names. There is a mapping file (name-dictonary) within the library, which is mainly concerned with removing prefixes, postfixes and salutations from a name.
This can be repurposed to provide the mapping for such nicknames.
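To illustrate why Smith / Smythe match while nickname pairs do not, here is a minimal sketch of classic American Soundex. It is a hand-written illustration, not the implementation the library uses; the class name `SoundexSketch` and the nickname pairs (Bob / Robert, Bill / William) are assumptions chosen for the example.

```java
class SoundexSketch {
    // Soundex digit for each letter a..z; '0' marks vowels, h, w and y (not coded).
    private static final String CODES = "01230120022455012623010202";

    // Simplified American Soundex: first letter plus up to three digits, zero padded.
    static String soundex(String name) {
        StringBuilder out = new StringBuilder();
        char lastCode = '0';
        for (char raw : name.toCharArray()) {
            char c = Character.toLowerCase(raw);
            if (c < 'a' || c > 'z') {
                continue; // ignore anything that is not a basic Latin letter
            }
            char code = CODES.charAt(c - 'a');
            if (out.length() == 0) {
                out.append(Character.toUpperCase(c)); // keep the first letter as-is
                lastCode = code;
            } else if (c == 'h' || c == 'w') {
                // h and w do not separate two letters that share a code
            } else if (code == '0') {
                lastCode = '0'; // a vowel separates repeated codes
            } else {
                if (code != lastCode) {
                    out.append(code);
                }
                lastCode = code;
            }
            if (out.length() == 4) {
                break;
            }
        }
        while (out.length() < 4) {
            out.append('0');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(soundex("Smith") + " " + soundex("Smythe"));  // S530 S530
        System.out.println(soundex("John") + " " + soundex("Jon"));      // J500 J500
        System.out.println(soundex("Bob") + " " + soundex("Robert"));    // B100 R163
        System.out.println(soundex("Bill") + " " + soundex("William"));  // B400 W450
    }
}
```

Because the code always starts with the first letter of the name, nicknames that begin with a different letter than the full name cannot produce the same Soundex key.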
Here is a test example to provide this mapping externally.
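Independently of that test example, the nickname-mapping idea can be sketched in plain Java as a pre-processing function that rewrites each token of a name to a canonical form before matching. The class name `NicknameSketch`, the `NORMALIZE` function and the sample nickname table are hypothetical; wiring the function into the library's pre-processing hook is left out, since the exact call is not shown in this thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class NicknameSketch {
    // Hypothetical nickname table; a real one would be loaded from a mapping file.
    static final Map<String, String> NICKNAMES = new HashMap<>();
    static {
        NICKNAMES.put("bob", "robert");
        NICKNAMES.put("bill", "william");
        NICKNAMES.put("liz", "elizabeth");
    }

    // Lower-cases the name, splits it on whitespace and replaces known nicknames.
    static final Function<String, String> NORMALIZE = name ->
            Arrays.stream(name.trim().toLowerCase().split("\\s+"))
                  .map(token -> NICKNAMES.getOrDefault(token, token))
                  .collect(Collectors.joining(" "));

    public static void main(String[] args) {
        System.out.println(NORMALIZE.apply("Bob Smith"));    // robert smith
        System.out.println(NORMALIZE.apply("Robert Smith")); // robert smith
    }
}
```

After this step both spellings reduce to the same string, so any downstream tokenizer or phonetic encoder sees identical input.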
Beyond this, in my experience, if you have other attributes of a person, such as address, phone numbers or emails, a discrepancy in the first name should not significantly impact the overall similarity score.
Hope that helps
Thanks,
Manish
from fuzzy-matcher.
Hi Anton,
Having the library support Java 1.7 will be a little tricky. Most of the code uses the functional paradigm introduced in Java 1.8, along with stream processing, which allows parallel processing of large datasets.
I do know of a few implementations where this library is being used with .NET. I believe they used a JNI bridge to interact with the interfaces from fuzzy-matcher.
If you have a JVM environment to run the jar, you only need to create proxies for the few classes that this library exposes publicly, such as MatchService, which is the entry point ... and most of the classes in domain, which you will need to send and receive the results.
These are simple Java classes without any Java 1.8 features in them.
Thanks,
Manish
Closing the issue. Feel free to open a new one if there are still questions.