
Comments (8)

kosloot commented on May 28, 2024

Some timestamps have been added already.

from frog.

proycon commented on May 28, 2024

with --nostdout: frog-:Frogging in total took: 475 seconds, 806 milliseconds and 28 microseconds

(might not mean much, load varies)


proycon commented on May 28, 2024

With tokenisation:

frog-:Frogging morr001cryp01_01.notok.folia.xml
frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass!
frog-:tokenisation took:  3 seconds, 359 milliseconds and 505 microseconds
frog-:CGN tagging took:   15 seconds, 951 milliseconds and 475 microseconds
frog-:NER took:           59 seconds, 497 milliseconds and 132 microseconds
frog-:Mblem took:         25 seconds, 428 milliseconds and 531 microseconds
frog-:Frogging in total took: 495 seconds, 298 milliseconds and 53 microseconds


proycon commented on May 28, 2024

Reducing threads to 4 instead of 40 (no tokenisation, no stdout):

frog-:Frogging morr001cryp01_01.tok.folia.xml
frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass!
frog-:tokenisation took:  0 seconds, 49 milliseconds and 709 microseconds
frog-:CGN tagging took:   11 seconds, 411 milliseconds and 928 microseconds
frog-:NER took:           50 seconds, 811 milliseconds and 141 microseconds
frog-:Mblem took:         3 seconds, 60 milliseconds and 229 microseconds
frog-:Frogging in total took: 67 seconds, 930 milliseconds and 822 microseconds


proycon commented on May 28, 2024

Single-threaded:

frog-:Frogging morr001cryp01_01.tok.folia.xml
frog-tok-:ucto: --filter=NO is automatically set. inputclass equals outputclass!
frog-:tokenisation took:  0 seconds, 17 milliseconds and 784 microseconds
frog-:CGN tagging took:   9 seconds, 568 milliseconds and 987 microseconds
frog-:NER took:           44 seconds, 400 milliseconds and 315 microseconds
frog-:Mblem took:         1 seconds, 873 milliseconds and 655 microseconds
frog-:Frogging in total took: 56 seconds, 17 milliseconds and 539 microseconds
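
The timing lines are easier to compare once converted to plain seconds. Below is a minimal Python sketch that parses the "seconds, milliseconds and microseconds" format and reports how much of the total is not attributed to any listed module. The helper names `to_seconds` and `unaccounted` are made up for illustration and are not part of Frog; the sample input is the single-threaded log copied from above.

```python
import re

# Matches the "<N> seconds, <N> milliseconds and <N> microseconds" format
# seen in the log lines quoted in this thread.
DURATION = re.compile(
    r"(\d+) seconds, (\d+) milliseconds and (\d+) microseconds"
)


def to_seconds(line: str) -> float:
    """Convert one timing line into fractional seconds."""
    match = DURATION.search(line)
    if match is None:
        raise ValueError(f"no duration found in: {line!r}")
    secs, millis, micros = (int(group) for group in match.groups())
    return secs + millis / 1_000 + micros / 1_000_000


def unaccounted(log: str) -> float:
    """Return the total time minus the sum of the per-module times."""
    total = 0.0
    modules = 0.0
    for line in log.splitlines():
        if DURATION.search(line) is None:
            continue  # skip lines that carry no timing
        if "in total took" in line:
            total = to_seconds(line)
        else:
            modules += to_seconds(line)
    return total - modules


single_threaded = """\
frog-:tokenisation took:  0 seconds, 17 milliseconds and 784 microseconds
frog-:CGN tagging took:   9 seconds, 568 milliseconds and 987 microseconds
frog-:NER took:           44 seconds, 400 milliseconds and 315 microseconds
frog-:Mblem took:         1 seconds, 873 milliseconds and 655 microseconds
frog-:Frogging in total took: 56 seconds, 17 milliseconds and 539 microseconds
"""

gap = unaccounted(single_threaded)
print(f"{gap:.3f} s not attributed to a listed module")
```

Applied to the 40-thread log with tokenisation, the same calculation leaves roughly 391 of the 495 seconds unattributed to the four listed modules, which is where the runs differ most. Note that the logs may not list every module, so the gap is an upper bound on pure overhead.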


proycon commented on May 28, 2024

Another minor feature request for debugging:

  • frog-:Initialization done. (could it report time for this as well?)


proycon commented on May 28, 2024

So I think the conclusion is to parallelise document processing when --testdir is used, rather than parallelising the modules, and to set a smaller number of threads?


kosloot commented on May 28, 2024

I did a quick examination of the possibility of handling several files in parallel in Frog, but this is quite difficult to accomplish.
Every thread would need its own copy of all the needed modules, such as the CGN tagger, the NER tagger, MBLEM etc.
This is not an easy task. MBT provides a kind of 'cloning' mechanism, but MBLEM and MBMA don't yet
(although they are based on Timbl, which HAS a clone() function).
Implementing this is therefore not easy, and only fruitful when there are 'enough' files to process in parallel.
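
To illustrate the "own copy per thread" requirement, here is a minimal Python sketch of the general pattern, not Frog's actual C++ code. `StubModule`, its `clone()` method, and the helper names are hypothetical stand-ins invented for this example; they only mimic a module that carries mutable state and therefore cannot be shared safely between threads.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StubModule:
    """Hypothetical stand-in for a module such as a tagger (not a Frog class)."""

    def __init__(self, name):
        self.name = name
        self.calls = 0  # mutable state that should not be shared across threads

    def clone(self):
        # A real module would copy its loaded model here.
        return StubModule(self.name)

    def process(self, text):
        self.calls += 1
        return f"{self.name}({text})"


# Loaded once; only ever used as templates for cloning.
PROTOTYPES = [StubModule("tagger"), StubModule("lemmatiser")]
_local = threading.local()


def modules_for_this_thread():
    """Clone the prototypes once per thread and reuse the clones afterwards."""
    if not hasattr(_local, "modules"):
        _local.modules = [module.clone() for module in PROTOTYPES]
    return _local.modules


def handle(text):
    """Run one document through this thread's private module copies."""
    for module in modules_for_this_thread():
        text = module.process(text)
    return text


with ThreadPoolExecutor(max_workers=2) as pool:
    outputs = list(pool.map(handle, ["doc1", "doc2", "doc3"]))

print(outputs)
```

The cost is visible in the sketch: every worker thread pays for a full set of clones, which is why this only pays off when there are enough files to keep the workers busy.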

A more direct way might be to create a wrapper that uses the Frog server and feeds it several files in parallel. My guess is that such a wrapper is workable and faster to accomplish.
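
A rough Python sketch of what such a wrapper could look like is given below, under stated assumptions. It only shows the fan-out part: `process_in_parallel` and `fake_send` are hypothetical names, and `fake_send` is a placeholder for whatever code actually sends one file to a running Frog server and returns its output. How well the server copes with concurrent connections is not tested here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Dict, Iterable


def process_in_parallel(
    files: Iterable[Path],
    send_to_server: Callable[[Path], str],
    max_workers: int = 4,
) -> Dict[Path, str]:
    """Call send_to_server on every file, at most max_workers at a time."""
    results: Dict[Path, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(send_to_server, path): path for path in files}
        for future in as_completed(futures):
            # result() re-raises any exception raised while handling the file.
            results[futures[future]] = future.result()
    return results


def fake_send(path: Path) -> str:
    """Placeholder (assumption): a real version would contact the Frog server."""
    return f"processed {path.name}"


demo = process_in_parallel(
    [Path("a.folia.xml"), Path("b.folia.xml")], fake_send, max_workers=2
)
print(demo)
```

The default of four workers is only a nod to the 4-thread run earlier in this thread; the right number would have to be measured.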

For now, I don't consider this a bug, but a design decision.

