Comments (5)
Hi Gabe,
The intention of the library is to be domain agnostic, with the fuzzy match driven entirely by "ElementType".
Currently there are quite a few domain-specific types defined, like "Name", "Address", "Phone Number", "Email", etc.
There are also some generic types like "Text", "Number" and "Date".
The idea is to expand these types as we get more requirements from the open source community, and so make the library useful in multiple domains.
That said, even without an enhancement, the library supports overriding its default behaviors, which already makes it useful outside those domains.
For example, the Element API allows you to override most of its matching capability:
https://github.com/intuit/fuzzy-matcher#element-configuration
Out of these configurations, the "PreProcessingFunction" and "TokenizerFunction" give you the ability to inject user-defined code at run time (by means of Java Functions), and provide additional flexibility to match most types of data.
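As a rough illustration of the kind of user-defined code meant here, the sketch below writes a pre-processing step and a tokenizer as plain `java.util.function.Function` values. It deliberately does not call the library: the class name, the `PRE_PROCESS`, `TOKENIZE` and `tokensOf` names, and the `String`-based signatures are all assumptions made for this example. The function signatures the library actually expects are described in the README section linked above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Standalone sketch; none of these names come from fuzzy-matcher itself.
public class CustomFunctionsSketch {

    // Hypothetical pre-processing step: lower-case the value and collapse
    // every run of non-alphanumeric characters into a single space.
    static final Function<String, String> PRE_PROCESS =
            value -> value.toLowerCase(Locale.ROOT)
                          .replaceAll("[^a-z0-9]+", " ")
                          .trim();

    // Hypothetical tokenizer: split the cleaned value on whitespace.
    static final Function<String, Stream<String>> TOKENIZE =
            value -> Arrays.stream(value.split("\\s+"))
                           .filter(token -> !token.isEmpty());

    // Chain the two steps the way a matcher might: clean first, then tokenize.
    static List<String> tokensOf(String raw) {
        return PRE_PROCESS.andThen(TOKENIZE).apply(raw).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(tokensOf("  SKU-12 / Blue ")); // prints [sku, 12, blue]
    }
}
```

Because both steps are ordinary functions, a non-person domain (product codes in this example) can presumably be handled by swapping in different lambdas rather than waiting for a new built-in type.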
If there are specific use cases you run into, feel free to send some details and example data sets, and we can look at including it in our next release.
Hope this helps.
- Manish
from fuzzy-matcher.
Hello Manish,
Thanks for your explanation! I'm perfectly happy with the current functionality. The ask was around making the domain (parameters/value types/etc.) exchangeable.
If I want to create a model which is not person related, I'd have to "live" with these types, ignore them, and add new ones which in turn are of no interest to anyone matching person-related properties.
Perhaps an enhancement idea/request: have the domain implemented as a pluggable (interfaces?) feature.
cheers,
-gabe
Hi Gabe,
I like the idea of having pluggable interfaces for various domains that can enable easy matching. I will take that into consideration in the next iteration of our release.
In the meantime, I wanted to assure you that having multiple ElementTypes present has little to no impact in terms of memory or CPU usage, even if they are not used.
The ElementTypes are simple, easy-to-use enums, each of which is made up of a different combination of Pre-Processing Function, Tokenizer Function and Match Type.
This just makes it easy for the end user to implement matching without delving too deeply into the library.
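To make the "enum as a bundle of functions" idea concrete, here is a minimal sketch of an enum whose constants each combine a pre-processing function, a tokenizer and a match kind. This is not the library's source: `SketchType`, `MatchKind`, their constants and the `tokens` helper are hypothetical names chosen for illustration, and the real ElementType definitions may differ.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Standalone sketch; names are illustrative and not taken from fuzzy-matcher.
public class ElementTypeSketch {

    // Hypothetical stand-in for the library's match types.
    enum MatchKind { EQUALITY, NEAREST_NEIGHBOR }

    // Each constant is just a fixed combination of three small pieces.
    enum SketchType {
        TEXT(v -> v.toLowerCase(Locale.ROOT).trim(),
             v -> Arrays.stream(v.split("\\s+")),
             MatchKind.EQUALITY),
        NUMBER(v -> v.replaceAll("[^0-9.]", ""),
               v -> Stream.of(v),
               MatchKind.NEAREST_NEIGHBOR);

        final Function<String, String> preProcess;
        final Function<String, Stream<String>> tokenizer;
        final MatchKind matchKind;

        SketchType(Function<String, String> preProcess,
                   Function<String, Stream<String>> tokenizer,
                   MatchKind matchKind) {
            this.preProcess = preProcess;
            this.tokenizer = tokenizer;
            this.matchKind = matchKind;
        }

        // Clean the raw value, then tokenize the cleaned result.
        List<String> tokens(String raw) {
            return tokenizer.apply(preProcess.apply(raw)).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        System.out.println(SketchType.TEXT.tokens("  Hello   World "));  // prints [hello, world]
        System.out.println(SketchType.NUMBER.tokens("$1,250.00"));       // prints [1250.00]
    }
}
```

In a design like this, an unused constant only holds references to a few small function objects, which is consistent with the point above that unused types cost next to nothing.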
There was an issue posted earlier which in fact suggested removing ElementTypes altogether, out of a similar concern about the library being domain specific. Personally I like your suggestion better.
I will take both of these POVs into account as the library evolves.
We are interested in knowing which domains this library has been applied to, to help inform our direction. We have seen quite a bit of usage in the Person and Transaction matching domains. Feel free to comment on which domains you see it being useful in.
Thanks again for the suggestions and for helping move this project in the right direction.
- manish
I believe the sky is the limit here.
Dating sites are an excellent example of a variable number of properties/attributes to be matched.
Job searching: finding a good candidate based on skills matched with a particular job description.
@manishobhatia, after some time I've finally been able to give your library a try! In hindsight my question was completely irrelevant. I allowed myself to be misled by the addresses example. As you said, the library is completely domain agnostic, even though the naming of the ElementType enumerations suggests otherwise.
With that out of the way, what I believe could be beneficial is the addition of an "in" match: does a value exist in a set of values, initially with an exact match? The workaround currently is to duplicate the documents for all permutations of the list or lists of values.
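To illustrate the difference, here is a small sketch of both sides: an exact "in" check against a set of values, and the current workaround of expanding one multi-valued document into a flat document per combination. It is not library code; `matchesIn`, `expand`, and the field names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Standalone sketch; none of these names exist in fuzzy-matcher.
public class InMatchSketch {

    // Proposed semantics: true when the value is exactly one of the candidates.
    static boolean matchesIn(String value, Collection<String> candidates) {
        return new HashSet<>(candidates).contains(value);
    }

    // Current workaround: build one flat document per combination of list values.
    static List<Map<String, String>> expand(Map<String, List<String>> multiValued) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(new LinkedHashMap<>());
        for (Map.Entry<String, List<String>> field : multiValued.entrySet()) {
            List<Map<String, String>> next = new ArrayList<>();
            for (Map<String, String> partial : docs) {
                for (String value : field.getValue()) {
                    Map<String, String> copy = new LinkedHashMap<>(partial);
                    copy.put(field.getKey(), value);
                    next.add(copy);
                }
            }
            docs = next;
        }
        return docs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> candidate = new LinkedHashMap<>();
        candidate.put("skill", Arrays.asList("java", "sql", "go"));
        candidate.put("location", Arrays.asList("nyc", "sf"));

        System.out.println(matchesIn("sql", candidate.get("skill"))); // prints true
        System.out.println(expand(candidate).size());                 // prints 6
    }
}
```

The number of expanded documents is the product of the list sizes, so the workaround grows quickly as more multi-valued fields are added, whereas the "in" check stays a single lookup per field.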
cheers,
-gabe