
Comments (2)

manishobhatia commented on August 16, 2024

Hi Abhi,
In these examples, one of the two name tokens matches, so you should get a 50% match.

For example, suppose these names are part of a list of names:

List<String> sourceString = Arrays.asList("A Mathur", "ABhishek Mathur", "Donald Trump", "D Trump");

We just need to feed the library a Document with an Element of type NAME for each name.

AtomicInteger idCount = new AtomicInteger();

List<Document> sourceDoc = sourceString.stream().map(name -> {
    return new Document.Builder(idCount.incrementAndGet() + "")
            .addElement(new Element.Builder().setType(NAME).setValue(name).createElement())
            .setThreshold(0.4)
            .createDocument();
}).collect(Collectors.toList());

Map<String, List<Match<Document>>> result = matchService.applyMatchByDocIdOld(sourceDoc);

Note that each Document needs a key; you can supply your own unique key for each one.
We also need to lower the Document threshold a little, since by default two documents count as a match only when their score is greater than 0.5.

You should be able to see the match results by printing them to the console:

result.entrySet().forEach(entry -> {
    entry.getValue().forEach(match -> {
        System.out.println("Data: " + match.getData() + " Matched With: " + match.getMatchedWith() + " Score: " + match.getScore().getResult());
    });
});

Result

Data: {[{'A Mathur'}]} Matched With: {[{'ABhishek Mathur'}]} Score: 0.5
Data: {[{'ABhishek Mathur'}]} Matched With: {[{'A Mathur'}]} Score: 0.5
Data: {[{'Donald Trump'}]} Matched With: {[{'D Trump'}]} Score: 0.5
Data: {[{'D Trump'}]} Matched With: {[{'Donald Trump'}]} Score: 0.5
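
One way to read the 0.5 scores above: each name has two tokens and only one of them is shared. The sketch below illustrates that arithmetic only. It is not fuzzy-matcher's actual scoring code; the class `TokenOverlapSketch` and its methods are hypothetical names, and comparing tokens by exact, case-insensitive equality is an assumption made for the illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration; not part of the fuzzy-matcher library.
class TokenOverlapSketch {

    // Splits a name into lower-cased, whitespace-separated tokens.
    static Set<String> tokens(String name) {
        return new HashSet<>(Arrays.asList(
                name.trim().toLowerCase(Locale.ROOT).split("\\s+")));
    }

    // Shared tokens divided by the size of the larger token set (assumed formula).
    static double score(String left, String right) {
        Set<String> a = tokens(left);
        Set<String> b = tokens(right);
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return (double) common.size() / Math.max(a.size(), b.size());
    }

    public static void main(String[] args) {
        System.out.println(score("A Mathur", "ABhishek Mathur")); // prints 0.5
        System.out.println(score("Donald Trump", "D Trump"));     // prints 0.5
    }
}
```

Under this simplified model only "Mathur" (or "Trump") is shared, giving 1 / 2 = 0.5, which is not greater than the default threshold of 0.5 and so needs the lower threshold of 0.4 to be reported.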

from fuzzy-matcher.

amathur2k commented on August 16, 2024

Thanks, Manish, for the detailed response. However, I fear that reducing the threshold to under 0. will start matching Miachel to Mitchell, and D Trump to J Trump. I am looking at the Rosette APIs to see how they do this, though their code is not open source.
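
The D Trump / J Trump concern can be sketched with a simplified token-overlap score. This is an illustration under stated assumptions, not the library's real scoring: `ThresholdSketch`, `score`, and `isMatch` are hypothetical names, tokens are compared by exact, case-insensitive equality, and a pair counts as a match only when its score is strictly greater than the threshold.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration; not part of the fuzzy-matcher library.
class ThresholdSketch {

    // Shared tokens divided by the size of the larger token set (assumed formula).
    static double score(String left, String right) {
        Set<String> a = new HashSet<>(Arrays.asList(
                left.trim().toLowerCase(Locale.ROOT).split("\\s+")));
        Set<String> b = new HashSet<>(Arrays.asList(
                right.trim().toLowerCase(Locale.ROOT).split("\\s+")));
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return (double) common.size() / Math.max(a.size(), b.size());
    }

    // A pair is reported only when its score is strictly greater than the threshold.
    static boolean isMatch(String left, String right, double threshold) {
        return score(left, right) > threshold;
    }

    public static void main(String[] args) {
        System.out.println(isMatch("D Trump", "J Trump", 0.5)); // prints false
        System.out.println(isMatch("D Trump", "J Trump", 0.4)); // prints true
    }
}
```

In this model "D Trump" and "J Trump" share one of two tokens, exactly like "D Trump" and "Donald Trump", so a threshold alone cannot tell the two cases apart; separating them would need some extra rule for how initials are compared.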

