Comments (12)

guillaumekln commented on June 14, 2024

Is the source program reading from stdin and writing to stdout normally?

Yes. However, it will try to read --batch_size lines unless the special control character EOF is received.

So for your application, you certainly want to set --batch_size 1.
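For a Node wrapper, the argument list for this setup could be built as in the sketch below. The flags (--model, --beam_size, --batch_size and the trailing '-') are the ones used in this thread; the helper name buildTranslateArgs and its option names are hypothetical. Values are converted to strings so the list looks the same as what a shell would pass.

```javascript
// Hypothetical helper: builds the argument list for cli/translate.
// Only flags that appear in this thread are used.
function buildTranslateArgs({ model, beamSize, batchSize }) {
  return [
    '--model', model,
    '--beam_size', String(beamSize),
    // With a batch size of 1 the tool does not wait for more lines
    // before producing output (see the comment above).
    '--batch_size', String(batchSize),
    '-', // trailing '-' as passed in this thread's invocations
  ];
}
```

The result can be passed as the second argument of child_process.spawn.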

from ctranslate.

guillaumekln commented on June 14, 2024

+ "\r\n"

This seems to be the issue by the way.
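One way to guard against this on the sending side is to normalize the terminator before writing. This is only a sketch, assuming the translator expects one sentence per line ending in a bare '\n'; the helper name toInputLine is hypothetical.

```javascript
// Hypothetical helper: strip any trailing '\r' / '\n' characters and
// terminate the text with exactly one '\n'.
function toInputLine(text) {
  return text.replace(/[\r\n]+$/, '') + '\n';
}
```

With this, both "some text" and "some text\r\n" are sent as "some text\n".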

loretoparisi commented on June 14, 2024

@guillaumekln thank you so much!!!! It works perfectly now!

[loretoparisi@:mbploreto opennmt]$ node translate.js 
[ '--model',
  '/root/onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7',
  '--beam_size',
  5,
  '--batch_size',
  '1',
  '-' ]
----data <unk> der <unk> Fuchs über die faulen <unk>

<unk> der <unk> Fuchs über die faulen <unk>
SOURCE (en) "The quick brown fox jumps over the lazy dog" 
DEST (de) "<unk> der <unk> Fuchs über die faulen <unk>\n"
exec:translate end.
exec:translate exit.
task:translate pid:15115 terminated due to receipt of signal:SIGINT
[loretoparisi@:mbploreto opennmt]$ 

loretoparisi commented on June 14, 2024

@guillaumekln sorry, I just noticed that when using --batch_size=1 I get a slightly different translation:

source (en): "The quick brown fox jumps over the lazy dog"
dest (de) (from bash, params: --beam_size 5): Der <unk> Fuchs springt über den faulen Hund
dest (from node script, params: --beam_size 5 --batch_size 1): <unk> der <unk> Fuchs über die faulen <unk>

guillaumekln commented on June 14, 2024

I think there is something else. Can you reproduce it when directly invoking cli/translate on the command line?

loretoparisi commented on June 14, 2024

No, I can't reproduce it on the command line, even with different parameters:

[loretoparisi@:mbploreto build]$ echo "The quick brown fox jumps over the lazy dog" | ./cli/translate --model /root/onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7 --beam_size 5 -
Der <unk> Fuchs springt über den faulen Hund
[loretoparisi@:mbploreto build]$ echo "The quick brown fox jumps over the lazy dog" | ./cli/translate --model /root/onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7
Der <unk> Fuchs springt über den faulen Hund
[loretoparisi@:mbploreto build]$ echo "The quick brown fox jumps over the lazy dog" | ./cli/translate --model /root/onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7 --batch_size 1 --beam_size 5
Der <unk> Fuchs springt über den faulen Hund

I always get the same output: Der <unk> Fuchs springt über den faulen Hund.

Programmatically in node I'm passing:

[ '--model',
  '/root/onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7',
  '--beam_size',
  5,
  '--batch_size',
  1,
  '-' ]

and the input text "The quick brown fox jumps over the lazy dog" + "\r\n".

guillaumekln commented on June 14, 2024

The command line is the reference, so if you are getting a different output, something is going on in your application.

loretoparisi commented on June 14, 2024

@guillaumekln Yes confirmed!!!

[loretoparisi@:mbploreto opennmt]$ node translate.js 
[ '--model',
  '/root/onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7',
  '--beam_size',
  5,
  '--batch_size',
  1,
  '-' ]
Der <unk> Fuchs springt über den faulen Hund
SOURCE (en) "The quick brown fox jumps over the lazy dog" 
DEST (de) "Der <unk> Fuchs springt über den faulen Hund\n"
exec:translate end.
exec:translate exit.
task:translate pid:54209 terminated due to receipt of signal:SIGINT
[loretoparisi@:mbploreto opennmt]$

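My write function now looks like this:

/**
 * Send one line of input to the child process.
 * The line is terminated with '\n' rather than '\r\n'.
 */
this.send = function(data) {
    // On the writable side the encoding is set with setDefaultEncoding();
    // setEncoding() is the Readable-side method.
    this.child.stdin.setDefaultEncoding('utf-8');
    this.child.stdin.write(data + '\n');
}; // send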

I also realized that the same thing was happening with text summarization, which now works too:

task:translate pid:54209 terminated due to receipt of signal:SIGINT
[loretoparisi@:mbploreto opennmt]$ node textsum.js 
[ '--model',
  '/root/textsum_epoch7_14.69_release.t7',
  '--beam_size',
  10,
  '--batch_size',
  1,
  '-' ]
night never just my bed smell
SOURCE (en) "Last night you were in my room And now my bed sheets smell like you Every day discovering something brand new" 
DEST (-) "night never just my bed smell\n"
exec:translate end.
exec:translate exit.
task:translate pid:54229 terminated due to receipt of signal:SIGINT
[loretoparisi@:mbploreto opennmt]$ 

Thank you.

loretoparisi commented on June 14, 2024

@guillaumekln Sorry for all these questions! I prefer to write here, since it's related to the command line and is more of a performance question than an issue. I have noticed that when iterating over several lines to translate, performance decreases as the number of lines grows.

Of course I'm still using --batch_size=1, so my question is: is the model loaded on every call in this iteration?

I suspect so because it ends up with a memory leak warning: (node:61283) Warning: Possible EventEmitter memory leak detected. 11 unpipe listeners added. Use emitter.setMaxListeners() to increase limit. I think this is due to an OOM issue.

Considering that the number of lines to translate changes every time and I need to keep the translation line by line (executing within a node process), how should I handle that?

An example: I'm doing a similar translation task with Facebook Fairseq. In that case, the command line tool loads the model once, then I just send data to the child process stdin and the model runs the beam search, so there is no OOM.

Thank you.
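A note on the warning quoted above: Node prints it when more than 10 listeners (the default limit) are registered for one event on the same emitter. Each call to source.pipe(dest) registers an 'unpipe' listener on dest, so 11 'unpipe' listeners would be consistent with something being piped into the same stream once per translated line. That reading of the wrapper is an assumption, since its code is not shown here. The sketch below only demonstrates the listener build-up, using PassThrough streams as stand-ins; countUnpipeListeners is a hypothetical name.

```javascript
const { PassThrough } = require('stream');

// Pipes `count` separate sources into one destination without ever
// unpiping, then reports how many 'unpipe' listeners the destination holds.
function countUnpipeListeners(count) {
  const dest = new PassThrough();
  for (let i = 0; i < count; i++) {
    new PassThrough().pipe(dest); // each pipe() adds one 'unpipe' listener
  }
  return dest.listenerCount('unpipe');
}

console.log(countUnpipeListeners(3));
```

If that is what happens in the wrapper, piping once per process (or writing to stdin directly) avoids the build-up without raising the limit.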

guillaumekln commented on June 14, 2024

Is the model loaded on every call in this iteration?

No. It will only be loaded when cli/translate is started and unloaded when the process dies.

You should be able to achieve the same approach as you described for fairseq. Keep stdin open and write line by line.
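A rough sketch of that approach in Node follows: start the process once, keep stdin open, and match each output line to the oldest pending request. It assumes the child prints exactly one output line per input line, in order, which is what the --batch_size 1 runs in this thread suggest. The name createLineWorker and its return shape are hypothetical.

```javascript
const { spawn } = require('child_process');
const readline = require('readline');

// Hypothetical helper: spawns `command` once and exchanges lines with it.
function createLineWorker(command, args) {
  const child = spawn(command, args, { stdio: ['pipe', 'pipe', 'inherit'] });
  const pending = []; // FIFO of { resolve, reject }, one entry per line sent

  // A single 'line' listener for the whole session, instead of attaching
  // a new listener or pipe for every request.
  const rl = readline.createInterface({ input: child.stdout });
  rl.on('line', (line) => {
    const next = pending.shift();
    if (next) next.resolve(line);
  });

  // If the process cannot be started, fail every waiting request.
  child.on('error', (err) => {
    while (pending.length) pending.shift().reject(err);
  });

  return {
    // Sends one line and resolves with the next line the child prints.
    send(text) {
      return new Promise((resolve, reject) => {
        pending.push({ resolve, reject });
        child.stdin.write(text.replace(/[\r\n]+$/, '') + '\n');
      });
    },
    // Ending stdin signals EOF so the child can finish and exit.
    close() {
      child.stdin.end();
    },
  };
}
```

With the command from this thread, it would be created once as createLineWorker('./cli/translate', [...]) using the same arguments shown above, and send() would be called once per line for as many lines as needed before close().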

loretoparisi commented on June 14, 2024

@guillaumekln thanks I will try that way!

loretoparisi commented on June 14, 2024

Thank you, it works as expected!!!

[loretoparisi@:mbploreto opennmt]$ node translate.js 
Module:OpenNMT.en-de of OpenNMT loaded.
[ '--model',
  '/root/onmt_baseline_wmt15-all.en-de_epoch13_7.19_release.t7',
  '--beam_size',
  5,
  '--batch_size',
  1,
  '-' ]
<unk>
OpenNMT.load
OpenNMT.translate: translating [0] Ayy, I remember syrup sandwiches and crime allowances
OpenNMT.translate: translating [1] Finesse a nigga with some counterfeits
OpenNMT.translate: translating [2] Parmesan where my accountant lives
<unk> Ich erinnere mich an <unk> und <unk>
OpenNMT.translate: translated [0]
<unk> Ich erinnere mich an <unk> und <unk>

<unk> mit einigen Fälschungen
OpenNMT.translate: translated [1]
<unk> mit einigen Fälschungen

<unk> , wo mein Buchhalter lebt .
OpenNMT.translate: translated [2]
<unk> , wo mein Buchhalter lebt .

OpenNMT.translate: translated:3 
 [ { line: 0,
    source: 'Ayy, I remember syrup sandwiches and crime allowances',
    target: '<unk> Ich erinnere mich an <unk> und <unk>\n' },
  { line: 1,
    source: 'Finesse a nigga with some counterfeits',
    target: '<unk> mit einigen Fälschungen\n' },
  { line: 2,
    source: 'Parmesan where my accountant lives',
    target: '<unk> , wo mein Buchhalter lebt .\n' } ]
OpenNMT.unload
exec:translate end.
exec:translate exit.
task:translate pid:71271 terminated due to receipt of signal:SIGINT
