Comments (37)
Weird. I just added a test that verifies that we do indeed set the Vary
header on a 304. Perhaps there is something more going on beyond the report? Is is possible you can provide code to reproduce?
from compression.
Actually, I see that typically you would not send a Content-Type
header with a 304 response, thus by default we would filter out the request (since we only vary responses with content types we support compression on). I need to read some more on this...
from compression.
From reading the paragraph:
The server generating a 304 response MUST generate any of the
following header fields that would have been sent in a 200 (OK)
response to the same request: [...] Vary.
It seems like it may actually be impossible for this module to comply with that, because without the Content-Type
response header sent, how can we know if a 200 response would have thus included a Vary
header or not?
from compression.
Right, yeah...the issue is the typical lack of Content-Type
and the problem with complying is that we can't know what the 200 would've done. I'm not sure what the solution is either (or if there even is a decent generic one); I'll think on it more too. For now, I just wanted to put the issue out there.
from compression.
For now, I just wanted to put the issue out there.
Indeed, yes! I never even thought about it :) I'll also be thinking on it, haha, which is why I re-opened it when I realized why it would be occurring (with the default lack of Content-Type
).
from compression.
Hey @ethanresnick , just passing by this issue to see if you have any additional thoughts?
from compression.
I haven't had much time to think about it, but here's one (perhaps naive) idea that comes to mind: have a map that persists between requests and that stores, for each 200 response, whether it was compressible; if so, the module would add Accept-Encoding
to Vary
to subsequent requests for the same resource even if those requests 304.
One tricky bit here would be figuring out how to make the map's key to identify a request, but that seems doable. Off the top of my head, it seems like we'd want it to be a composite of the request URI and the value of any headers that the user put in Vary
. I don't think including the http method in the key should be necessary, since the imaginary request that the 304 response's headers are based on is always the GET
.
The advantage of this approach is that its totally transparent to users of the module. The downside is that it would require some extra memory.
I think this downside could be mitigated though by also allowing the user to provide a function that, if provided, would act as a replrcement for adding to/looking things up in the map. The function would be called on 304s and would receive the same information about the request that would've gone into the map's cache key, and would be responsible for returning true or false (indicating whether the 200 response was cacheable). These functions could be quite simple too--always returning true in the most basic case (e.g. for a server that only serves Json), or switch
ing on a file extension pulled from the request uri, or similar.
Hopefully, this could kind of get the best of both worlds: a totally transparent solution for people who don't mind a bit of extra memory and a space efficient solution for those who need it.
Again, just one idea. I'm curious to hear others too.
from compression.
That sounds like introducing a lot of complication to this module, though, and it also doesn't work out-of-the-box with horizontal scale, which is not something I want to publish (just like we, as npm community, don't publish modules that do fs.*Sync
at runtime by default, we don't publish things that prevent horizontal scaling by default). It also has a lot of issues with even being consistent:
- It only reliably works where the cache has existed for all time. Once the cache is cleared (server restart with that default implementation), then requests for a specific route will always 304 without
Vary
until finally a request comes in that resulted in a 200 for us to capture the information. - It falls apart as soon as someone doesn't always serve the same
Content-Type
for everyone from the same URL or someone has a more complexfilter
argument. Default things in the library need to always work reliably, no matter what the other options to the same library are.
from compression.
You're definitely right about the horizontal scaling and that the design assumes the cache has been on forever. I was thinking a bit about these issues when I made the proposal, but I just assumed it would be better for the module to sometimes do the right thing than to never do so. That's probably not right.
One solution then would be to make the user-provided function I talked about above mandatory. It would be called on 304 and asked to return whether the content from the corresponding 200 would've been compressible.
from compression.
One solution then would be to make the user-provided function I talked about above mandatory. It would be called on 304 and asked to return whether the content from the corresponding 200 would've been compressible.
Does the filter
function not fit this? We call it on 304 currently, and they can say yay or nay.
from compression.
Does the filter function not fit this?
It does! I'd forgotten about that function entirely, since I've always just left the default.
I think we'd want to tweak it's description in the docs to explain this case (since the user's not really "considering for compression" a 304 response), and maybe expand the default filter function to make some decent guesses when Content-Type
isn't set (e.g. by checking if the request URI ends in a compressible extension). But that seems like a workable approach!
EDIT That said, I can also see a UX argument for making this a separate function, since a lot more people will probably need to customize this than will need to customize the default filter function
from compression.
At this point, we may as well simply always set Vary.
from compression.
Maybe just make that the default (i.e. the default function just returns true
)? That way, people who care about maximizing cache hits can use the shorter Vary on non-compressible resources?
from compression.
Maybe just make that the default (i.e. the default function just returns true)?
You mean just make filter
always return true by default?
from compression.
Yup!
from compression.
(On 304s, that is; the existing logic would remain for other responses)
from compression.
Yup! (On 304s, that is; the existing logic would remain for other responses)
Well, that won't work, because it's still a violation of the RFC, which states:
The server generating a 304 response MUST generate any of the following header fields that would have been sent in a 200 (OK) response to the same request: Cache-Control, Content-Location, Date, ETag, Expires, and Vary.
Making the change you suggest is still just as much in violation that the module is in right now. All you're doing is change it so 304s will always include a Vary header and 200s will sometimes includes a Vary header.
from compression.
Neither of my proposals was intended to obviate the need for some people to customize to achieve RFC compliance. Under the current setup, though, no one gets RFC compliance out of the box; with a default, at least some people do (i.e. those whose servers fit the default's logic)—and the others are no worse off than they are now.
So my proposals were about trying to find a decent default with the above in mind. I proposed the “always return true” default because I thought you were saying that the more sophisticated default (of checking URI endings etc) was too complicated, and obviously we want to balance minimizing the number of people that need to customize with not making the base library overly complicated.
If you don't want anyone to have to customize to achieve compliance, I think the only way to get it is to make the default the filter function work the same way regardless of whether the response is a 304…but that would mean not relying on the Content-Type header, which would likely mean that far fewer compressible requests would get compressed by default. I don’t think that’s a worthwhile tradeoff.
Maybe there’s some entirely different option here, but I’m not seeing it...
from compression.
Ok, well here's the problem here: I need you to provide a specific problem to solve here. Your original post was that we are not RFC compliant, which is true. Thus if that is the issue, then the solution must bring us into RFC compliance, which is a MUST and thus we cannot sometimes be right, we MUST always be right, which with the current code means always setting Vary above the filter. I will be committing and releasing that change today unless you can provide a different problem statement.
from compression.
Hey Doug. I apologize if my messages here have been a bit confused—I've been trying to attend to this thread while simultaneously doing a million other things.
You're right that always setting Vary: Accept-Encoding
, regardless of whether the request ends up being compressible (i.e. passing the filter), does bring us into RFC compliance on the point I quoted at the start of the thread.
What I was trying to say here though is that just always setting Vary
comes with its own tradeoff, namely: resources that are not in fact compressible become less cacheable, because caches have to use the (often-poorly-normalized) Accept-Encoding
header as part of their cache keys. So, if we can avoid that consequence, that should be part of the problem statement too.
Maybe we can rewrite onHeaders
like so:
onHeaders(res, function(){
// on loading the module, the user can optionally provide a function that
// determines for a 304 response whether the corresponding 200 would've varied.
// It lives in the variable `wouldCompress` (as opposed to `shouldCompress`)
var toVary =
// always vary if the user didn't provide `wouldCompress`,
// since we can't be smarter.
typeof wouldCompress !== "function" ||
// othewise determine whether to vary on 304s using wouldCompress,
// and using filter (its non-304 counterpart) for other responses.
(res.status() === 304 ? wouldCompress(req, res) : filter(req, res));
if(toVary) {
vary(res, 'Accept-Encoding');
}
// determine if request is filtered
if (!filter(req, res)) {
nocompress('filtered')
return
}
// do compress...
})
from compression.
All of that wouldCompress
logic the user can literally do with filter
, so we are not going to add another function for no reason. There are only three courses of action I'm going to really entertain from here, so this thread can end:
- We close the issue and leave the behavior as it is now, where the user can provide a
filter
that works how they like and we set theVary
based on theirfilter
function. - We move the
Vary
header setting above the filter. The user now can only control theVary
getting set based on where the middleware is located or using a middleware wrapping module. - You define the exact reason we need to make the change for. If it is RFC compliance, then number 2 above the the absolute only solution. If it is just to interop with some third-part caching software, please let me know so we can ensure that the solution works for that software.
I don't mean to be sort, but this issue is just going in circles. If you even want to speak over the phone or Skype or something, you can email me. I'll be releasing number 2 tonight unless I hear a very compelling argument why we should not be RFC compliant.
from compression.
Basically, can you explain how, exactly, the code in #44 (comment) allows you to in any way do something different than simply defining your own filter
function? Your code, by default, will act just like it does now. Otherwise, it'll call a different function for 304s, yet you can easily make that code patch branch in your own filter
function, so it does absolutely nothing different than filter
, only makes it more complicated. TL;DR your code suggestion in #44 (comment) is just asking for option number 1: do nothing and close this issue.
app.use(compresion({
filter: function (req, res) {
if (res.statusCode !== 304) return /* your filter function or compression.filter(req, res) */
else /* your wouldCompress function */
}
}))
from compression.
@dougwilson My code does number 2 by default. The user doesn't pass in would compress, so Vary always gets set.
from compression.
Gotcha. So number 2 it is!
from compression.
@dougwilson Right, of your three options, I'm definitely for the second. The reason I introduced a new wouldCompress
function is because, if you do option two without it, then the user has to use filter
to remove certain Vary
headers that they don't want added (but that were put on previously when option 2 was applied). And then suddenly filter isn't just a simple predicate, but is this ugly side effect thing
from compression.
Ok, but like I said: we are not adding a second weird function. They don't have to use filer
to remove from Vary
, they can re-arrange the middleware to be in a more sane location or use on-header
in their code to remove from Vary
.
from compression.
Ok, then option 2 it is.
from compression.
If you like, though, I can add an option that you can set to true
or false
for even adding to Vary
at all and then people can set it to false
and use their own logic if they like. How does that sound?
from compression.
Yeah, I'd be in favor of adding that option too. Would simplify things for people who really want to manage vary.
from compression.
So basically we get:
onHeaders(res, function(){
if(options.setVary) {
vary(res, 'Accept-Encoding');
}
// determine if request is filtered
if (!filter(req, res)) {
nocompress('filtered')
return
}
// do compress...
})
from compression.
I apologize again for all the going in circles. I'm gonna close this now, and then head out. Enjoy your 4th of July :)
from compression.
Ok. For posterity, here is why it was located after the filter
: senchalabs/connect#848 people say that before it was breaking Akami. I guess we'll just go back to breaking Akami again :)
from compression.
Then it seems like we've gotten to a pretty good place with following the rfc by default but adding an option :) [I'm not using Akamai, and wasn't arguing based on that issue, but it's good to know about.]
from compression.
Cool. Yes, happy 4th! I'll think on this issue a bit, because I'm not sure about breaking Akami by default all of a sudden (we can, but it may have to be a major version in a few months down the road instead of this weekend).
from compression.
Gotcha. If possible we should first check whether Akamai's already fixed its behavior. Because they may well have---it's been over a year since that other issue, and, more importantly, the creation of RFC 7232 may have brought this problem to their attention (the prior RFC 2068 actually did not have this requirement; it had a slightly weaker one instead). Now, I'm not sure how to go about checking this given the private kbs, but maybe someone you know could help?
from compression.
Right, I don't know how to check, either. I may be able to find someone with Akamai access I can ask, but it could be a few weeks if that's OK.
from compression.
Though I doubt Akamai works any different today than a year ago, seeing as Drupal just made a commit to fix this very thing only about 15 days ago: http://cgit.drupalcode.org/cdn/commit/?id=a3408dc
from compression.
Related Issues (20)
- Setting Vary header although caching is disabled HOT 1
- "drain" event listener leak when using res.once("drain"); can't use res.removeListener("drain") HOT 3
- compresssion doesn't work ,the vue.txt is 2m HOT 2
- Content-Type: application/json; charset=utf-8 No effect HOT 2
- Question: Why this middleware HOT 2
- Corrupted compressed .js-files for Mac OS / Safari -clients HOT 11
- Is compression working when node server is running on a container? HOT 2
- middleware fails when the request has more than 1 values for accept-encoding header HOT 2
- Is compression result cached? HOT 1
- change Transfer-encoding HOT 1
- Why does the data size increase after compression HOT 1
- Force size to be a minimum... HOT 2
- Chunked encoding is broken after using this middleware HOT 1
- Using a current debug version HOT 1
- Deflate backwards HOT 7
- Compression instrumentation (before/after compression hooks) HOT 2
- Angular Not Compressing? HOT 2
- compression not working json payload HOT 6
- Crash when compressing characters like ū HOT 1
- Express returns a non-compliant HTTP/206 response when gzip is enabled HOT 9
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from compression.