Comments (13)

bijoythomas commented on March 29, 2024

So I figured out what is happening. The same issue exists in the latest version of Node too. The way I am sending data is something like this:

for (let i = 0; i < some_large_number; i++) {
  populate_data_in_array(arr);
  if (arr.length === 500) {
    res.write(JSON.stringify(arr));
    res.flush();
    arr = [];
  }
}

// Finally, if there is anything left in the array, send it across too

A look at the zlib.js source (https://github.com/nodejs/node/blob/master/lib/zlib.js, line 448) shows that the flush() call is not synchronous. Instead, it sets up a 'drain' listener to execute later. But with my for loop monopolizing the thread, that listener never gets a chance to run. In fact, the loop keeps adding 'drain' listeners, which explains the warning I was getting:
(node) warning: possible EventEmitter memory leak detected. 11 drain listeners added
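
For reference, here is roughly what that flush() implementation does. This is a paraphrased sketch based on the description above, not the actual Node.js source:

// Paraphrased sketch of zlib's flush() (not the actual source): when the
// stream still has buffered writes, flush() defers itself behind a
// one-shot 'drain' listener instead of flushing synchronously.
Zlib.prototype.flush = function (kind, callback) {
  if (this._writableState.needDrain) {
    // A tight synchronous loop never yields to the event loop, so this
    // listener never fires -- and every flush() call in the loop adds
    // another one, which is what triggers the EventEmitter warning.
    this.once('drain', () => this.flush(kind, callback));
    return;
  }
  // Otherwise, push a zero-length write through with the given flush mode.
  this._flushFlag = kind;
  this.write(Buffer.alloc(0), '', callback);
};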

I changed my code to give that listener a chance to execute, by making use of the callback that zlib's flush() takes and trampolining between data generation and sending. It now looks something like this:

let iterCount = 0;
let arr = [];

function populateArray() {
  if (iterCount < maxCount) {
    arr.push(generate_some_data());
  }
  send();
}

function send() {
  if (iterCount === maxCount) {
    if (arr.length > 0) {
      res.write(JSON.stringify(arr));
    }
    res.end();
    return;
  }

  if (arr.length === 500) {
    res.write(JSON.stringify(arr));
    res.flush(function () { // <-- flush callback: resume populating only after the flush completes
      arr = [];
      iterCount++;
      populateArray();
    });
  } else {
    iterCount++;
    populateArray();
  }
}
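
With this shape there is at most one pending flush, and therefore at most one 'drain' listener, at any time; the event loop gets a chance to run that listener before the next batch of data is generated.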

To get this to work, I had to change the flush implementation in the compression middleware to take a callback:

res.flush = function flush(cb) {
  if (stream) {
    stream.flush(opts.flush, cb);
  }
};

Here, opts is the variable that captures the options passed into the compression function. The zlib flush implementation also takes a flush mode, but I'm not aware of a practical use case for using multiple flush modes within a single compression run. If there is one, the function could take an additional flush argument, though the user would have to make sure the right mode is passed in.
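
If someone did need that, a variant accepting an optional explicit mode might look like this (hypothetical; the overload is my own suggestion, not something in the module):

// Hypothetical variant: accept an optional explicit flush mode, falling
// back to the mode captured from the middleware options.
res.flush = function flush(mode, cb) {
  if (typeof mode === 'function') { // called as res.flush(cb)
    cb = mode;
    mode = opts.flush;
  }
  if (stream) {
    stream.flush(mode, cb);
  }
};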

dougwilson commented on March 29, 2024

This is because of how gzip compression works: to get small sizes, gzip needs to accumulate the payload so it can do substring searches and replacements.
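
As a quick standalone illustration of the trade-off (a sketch; the payload and counts are made up):

// Compare gzipping a payload in one shot vs. flushing after every chunk.
const zlib = require('zlib');

const chunk = 'some repetitive payload line\n';

// One shot: gzip sees all of the repetition at once.
console.log('one shot:', zlib.gzipSync(chunk.repeat(100)).length, 'bytes');

// Flushed after every chunk: each flush forces zlib to emit a complete
// block immediately, costing framing bytes and cutting matches short.
const gzip = zlib.createGzip();
const parts = [];
gzip.on('data', function (d) { parts.push(d); });
gzip.on('end', function () {
  console.log('flushed:', Buffer.concat(parts).length, 'bytes');
});

let remaining = 100;
(function writeNext() {
  gzip.write(chunk);
  gzip.flush(zlib.constants.Z_SYNC_FLUSH, function () {
    if (--remaining > 0) return writeNext();
    gzip.end();
  });
})();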

If you want to stream, you can use the res.flush() command between writes, but the compression will be much less efficient.

An example can be found at the bottom of the readme.
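
For reference, a sketch along the lines of that readme example (the route and timings here are made up):

const express = require('express');
const compression = require('compression');

const app = express();
app.use(compression());

app.get('/events', function (req, res) {
  res.setHeader('Content-Type', 'text/plain');
  // Write a chunk every second, flushing the gzip stream each time so the
  // partial output actually reaches the client instead of being buffered.
  const timer = setInterval(function () {
    res.write('tick ' + Date.now() + '\n');
    res.flush();
  }, 1000);
  req.on('close', function () {
    clearInterval(timer);
  });
});

app.listen(3000);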

dougwilson commented on March 29, 2024

Basically, you are just describing to me exactly how gzip compression and the Node.js core zlib stream work, not an actual bug. Let me know if my answer above helps at all. If you think there is a legitimate bug, let me know the specifics, and perhaps even send a PR :)

bijoythomas commented on March 29, 2024

Thanks. I added the call to flush the response after each write, but it is still behaving the same: everything seems to come through only when res.end() is called. I am calling res.write() [and now res.flush()] in batches. After about 10 calls, I see this:

(node) warning: possible EventEmitter memory leak detected. 11 drain listeners added. Use emitter.setMaxListeners() to increase limit

Trace
    at Gzip.addListener (events.js:179:15)
    at Gzip.Readable.on (_stream_readable.js:671:33)
    at Gzip.once (events.js:204:8)
    at Gzip.Zlib.flush (zlib.js:451:10)
    at ServerResponse.flush (/Users/bijoythomas/Zulu/SchedulingMVP/app_node/node_modules/compression/index.js:70:16)

dougwilson commented on March 29, 2024

Hi! The warning you are getting is a known and fixed Node.js core bug; if you upgrade Node.js to the latest, it should be fixed (or go back to 0.10, which I know was before that regression).

As for your issue, I'm not able to replicate it, and none of the tests running in Travis CI can, either.

If you can provide a PR for the fix, or at least full code and instructions on how I can reproduce it, I can probably find some time in the next week to debug it. A lot of the time, people have a proxy between them and their app, which can cause issues, so be sure you are running your server on localhost to narrow it down to this module. Another common issue is an app.use of another module that buffers your response, so try removing all other middleware to narrow it down.
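
To illustrate, a minimal isolation setup might look like this (a hypothetical sketch: only this middleware, no proxy, served on localhost):

const express = require('express');
const compression = require('compression');

const app = express();
app.use(compression()); // the only middleware in play

app.get('/stream', function (req, res) {
  let i = 0;
  const timer = setInterval(function () {
    res.write('chunk ' + i++ + '\n');
    res.flush(); // should push each chunk onto the wire immediately
    if (i === 10) {
      clearInterval(timer);
      res.end();
    }
  }, 200);
});

app.listen(3000, 'localhost');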

bijoythomas commented on March 29, 2024

Thanks for the suggestions. I am running on localhost with no proxies in between. So far, I can confirm that the difference comes down to one line: app.use(compression()).

When it is commented out, I can see the data streaming, and the network tab in Chrome dev tools shows me the streaming information (sizes and timings).

When it is enabled, I see a long pause (Pending) and then finally the data coming in.

I will try to narrow it down to an example.

dougwilson commented on March 29, 2024

Hi @bijoythomas, you cannot use the Chrome network tab to make measurements for gzipped responses, as far as I know; we get lots of issues in this repo from Chrome reporting misleading things. I thought you said you were using Wireshark to verify this? Did you use Wireshark to verify your most recent update, or just Chrome?

bijoythomas commented on March 29, 2024

Both. Wireshark and Chrome.

dougwilson commented on March 29, 2024

Cool. Yeah, I'm not able to reproduce it for whatever reason; all the tests in Travis CI, on every Node.js version they run, get streaming compressed responses. So I'm not sure how I can proceed without your help in getting me a reproducible case, or just a PR to address the issue.

dougwilson commented on March 29, 2024

It's possible there is some kind of Express bug (I see you are using a version from June 2, 2014, well over a year old) or even some kind of bug in the Node.js you are using (from February). Have you tried upgrading either of those two? I've only been testing with the latest versions from those lines (Express 4.13.3 and Node.js 0.12.7) to rule out any bugs that may be in them.

dougwilson commented on March 29, 2024

In addition, I know you said it was the addition of app.use(compression()) that caused it, but there have been several issues where this module, used together with another module, triggered a bug in the other one. So if you haven't already, you should still try removing all other middleware, to rule out the issue being a combination of this module and another. I don't have your application, so if an interaction like that is going on, it could very well be another reason why I'm not able to reproduce the issue myself.

Without being able to see the issue, I'm not able to debug it and determine the cause or a fix :(

bijoythomas commented on March 29, 2024

Sure. I will try to narrow down the middleware and see if things change.

I did a little bit of debugging by adding console.log statements at several points in the compression/index.js file in my node_modules directory.

I can see that in the res.flush() function we do have a 'gzip' stream, and that the function is getting called, but nothing gets onto the wire.

I also added a log statement in the res.end function there, and only once that gets called does the actual res.write (on the underlying HTTP response object) happen. I had also added logging in the stream.on('data') handler.

I know it's hard to understand without seeing where I put the logs, but essentially what I'm seeing is a bunch of flush calls with no data on the wire, followed by the res.end() call, and AFTER that I see the 'data' events from the stream, with the underlying res.write() calls happening, returning true/false, and the stream draining/resuming. But like I said, all this activity seems to happen only after the res.end() call. It's like the flush has no effect.

dougwilson commented on March 29, 2024

Gotcha. From what you are describing, it actually sounds like a Node.js bug (perhaps just in 0.12.0).

Essentially, all we do is this: whenever the Node.js core zlib stream emits a 'data' event, we call res.write with that chunk. Every time you call res.write, we call stream.write on that core stream, and when you call res.flush, we just call stream.flush on that stream as well. If that doesn't cause the stream to emit a 'data' event, then we don't have anything to call res.write with, of course.
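
In other words, the middleware's role is roughly this (a simplified sketch of the flow just described, not the actual compression source):

// Simplified sketch of the middleware's data flow (not the actual source).
const _write = res.write.bind(res); // the real, underlying response write

res.write = function (chunk) {
  return stream.write(chunk);       // app data goes into the zlib stream
};

res.flush = function () {
  stream.flush();                   // ask zlib to emit whatever it has buffered
};

stream.on('data', function (chunk) {
  _write(chunk);                    // whatever zlib emits goes onto the wire
});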
