
question on accuracy about frog (CLOSED)

languagemachines commented on May 28, 2024
question on accuracy


Comments (8)

kosloot commented on May 28, 2024

Hopefully @antalvdb can shine a light on this?


antalvdb commented on May 28, 2024

This paper from 2007 gives the basic performance estimates for POS tagging, morphological analysis, and dependency parsing; the latter was computed with a predecessor of the CoNLL dependency parsing evaluator. The paper does not report scores for the lemmatizer (but see this paper), the shallow parser / XP chunker (current score on test data: 91.3 precision, 92.5 recall, 91.9 F-score), or the named entity recognizer (current score on test data: overall F-score 82.1; persons 81.7, locations 90.9, organizations 75.1). The latter scores have not been published yet.


jwijffels commented on May 28, 2024

Thank you for the links to the papers and for the scores. That already provides some information.

I'm trying to compare Frog to the model accuracies reported for the udpipe models (https://ufal.mff.cuni.cz/udpipe/users-manual#universal_dependencies_20_models_performance), which were built on either the UD_Dutch or the UD_Dutch-LassySmall corpus (see http://universaldependencies.org/treebanks/nl-comparison.html for details on these two corpora). Since the numbers reported in the papers are most likely driven by the corpus used, I wonder whether accuracy metrics (precision/recall/F-score/UAS/LAS) are available for a Frog model that was also trained on these corpora from Universal Dependencies. Or is it wishful thinking that someone would have done this already?


antalvdb commented on May 28, 2024

Unfortunately we do not have the time to do these kinds of comparative evaluations ourselves. We always welcome anyone willing to put in the time for such exercises, and we are happy to assist where possible.

Frog's parser is described in more detail here. The memory-based parser emulates the Alpino parser. Its inference procedure (constraint satisfaction inference) is fast, but produces parses that are less accurate than Alpino's. We trade accuracy for (predictable, relatively high) speed.


jwijffels commented on May 28, 2024

Thank you for the input. I completely understand that you don't have time for this; it's not a small task.
I'm asking because I recently wrote an R wrapper around UDPipe (https://github.com/bnosac/udpipe) and I'm now investigating how good UDPipe is compared to other similar parsers, e.g. the Alpino parser or Frog for Dutch, OpenNLP, or the Python pattern NLP package.

I recently made a comparison between UDPipe and spaCy (https://github.com/jwijffels/udpipe-spacy-comparison), but I would like to add Frog as well, along with the Python pattern library, Alpino, and OpenNLP. Do you know whether such research has already been done, so that I can perhaps take a shortcut in this analysis?


proycon commented on May 28, 2024

As far as I know, it has not been done, and it would be a very welcome study.


jwijffels commented on May 28, 2024

Last week I got in touch with Gertjan van Noord & Gosse Bouma, as they have a paper evaluating Alpino against Parsey/Parseysaurus on the UD_Lassy-Small treebank (http://aclweb.org/anthology/W17-0403). I received the Alpino output in CoNLL-U format, which allowed a comparison to UDPipe. There were some nitty-gritty details in the evaluation, but it already gave an indication of the accuracy.

All it takes to make a comparison is the annotation output for some text whose gold annotation is known, both in CoNLL-U format; the evaluation script used by the CoNLL 2017 shared task (https://github.com/ufal/conll2017/blob/master/evaluation_script/conll17_ud_eval.py) can then be applied. The tricky part is getting the annotation result into CoNLL-U format :)
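For what it's worth, here is a minimal sketch of what that last step could look like once both files exist. The file names gold.conllu and frog_output.conllu are placeholders, and the call assumes the script's usual command line (gold file first, system file second, -v for the full per-metric table); check python conll17_ud_eval.py --help before relying on it.

```python
# Minimal sketch: run the CoNLL 2017 shared task evaluation script on a
# gold treebank and a system output, both already in CoNLL-U format.
import subprocess

GOLD = "gold.conllu"           # placeholder: gold-standard annotation (e.g. a UD Dutch test set)
SYSTEM = "frog_output.conllu"  # placeholder: Frog output converted to CoNLL-U

# Assumed CLI of conll17_ud_eval.py: gold file first, then system file;
# -v prints precision/recall/F1 for all metrics (Tokens, UPOS, UAS, LAS, ...).
result = subprocess.run(
    ["python", "conll17_ud_eval.py", "-v", GOLD, SYSTEM],
    capture_output=True, text=True, check=True,
)
print(result.stdout)
```

The evaluation itself is the easy part; as noted above, the real work is producing the system output in CoNLL-U format in the first place.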


kosloot commented on May 28, 2024

As this is not a real issue, I'm closing it.


