
Comments (4)

sporkmonger commented on July 21, 2024

Hmm, yeah, that '+' probably should not be decoded.

However, "There is no way to know if normalized uri will have characters escaped or not" is not accurate. Addressable normalizes to the set of characters that the URI spec explicitly allows in decoded form for each component. This tends to be dramatically more permissive of decoded characters than most people are used to, but it's more precisely correct.

The '+' character is a weird edge case where the correct answer is often unclear. The URI spec indicates it should be decoded, but the HTML specification would make decoding it ambiguous as to whether it should be treated as a '+' or a space. I continue to curse the people who put that paragraph into the HTML spec. One of the worst-thought-out and worst-specified bits of text ever in an accepted spec.

So as an accommodation to the HTML spec and widely used conventions, I opt to break the URI spec in the handling of the '+' character within query strings only.
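The ambiguity can be seen with Ruby's standard library alone. This is only a sketch using `URI.decode_www_form_component` from the stdlib (not Addressable), and the sample strings are made up for illustration. It shows why decoding "%2B" to '+' inside a query string changes what a form-style parser sees:

```ruby
require "uri"

encoded_plus = "a%2Bb" # a literal plus sign, percent-encoded
bare_plus    = "a+b"   # what the query would look like if %2B were decoded

# A form-style (HTML convention) decoder treats a bare '+' as a space.
from_encoded = URI.decode_www_form_component(encoded_plus) # "a+b"
from_bare    = URI.decode_www_form_component(bare_plus)    # "a b"

puts from_encoded
puts from_bare
```

Both inputs would be equivalent under a strict reading of the URI spec, yet a consumer following the HTML convention ends up with different values.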

But what all this should tell you is that if you're looking for specific characters to be encoded and specific characters to be unencoded, normalize may not be the method you want. The encoding methods in Addressable can take an optional character class string which directly controls what gets encoded.
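As a rough illustration of what "a character class controls what gets encoded" means, here is a stdlib-only sketch. The helper `encode_with_class` and the `UNRESERVED_CLASS` constant are hypothetical names invented for this example; they are not Addressable's API, and Addressable's own encoding methods may behave differently in detail.

```ruby
# Hypothetical helper: percent-encode every character that is NOT in the
# given character class (a string usable inside a regex bracket expression).
def encode_with_class(str, allowed_class)
  str.gsub(/[^#{allowed_class}]/) do |ch|
    ch.bytes.map { |b| format("%%%02X", b) }.join
  end
end

# Assumed class for illustration: the RFC 3986 unreserved characters.
UNRESERVED_CLASS = 'A-Za-z0-9\-._~'

strict  = encode_with_class("a+b c", UNRESERVED_CLASS)       # "a%2Bb%20c"
lenient = encode_with_class("a+b c", UNRESERVED_CLASS + "+") # "a+b%20c"

puts strict
puts lenient
```

Adding '+' to the class leaves it untouched, while leaving it out forces it to "%2B". That is the kind of explicit control normalize does not give you.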

from addressable.

bblimke commented on July 21, 2024

Thanks for the detailed explanation. I understand the problem with '+' characters now. Indeed, the HTML spec is a pain.

Does it make sense to change the character class used in normalized_query so that %2B is not decoded? If it doesn't, I will try to encode the query with a custom character class in webmock instead of using normalized_query.


sporkmonger commented on July 21, 2024

I started trying to get this resolved, but it turns out to be a lot harder to do correctly than I expected.

The trick with the '+' character is that it needs to be handled differently from all other characters. Normally you would unencode, apply Unicode NFKC normalization, then selectively re-encode. The '+' character, by contrast, should simply have nothing happen to it at all. This turns out to be hard to do, because the normalize method is currently implemented sort of like this:

encode(unicode_normalize(unencode(input)))

The trick is that if you tell the unencode method not to unencode "%2B", now the encode method is going to want to encode the "%" character as "%25".
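The double-encoding problem can be reproduced with a toy version of that pipeline. Everything below is a hypothetical stand-in written with the Ruby stdlib (`toy_unencode`, `toy_encode`, and `toy_normalize` are invented names, and the set of characters left unencoded is an assumption); it is not Addressable's actual implementation.

```ruby
# Toy percent-decoder; sequences listed in `keep` are left as-is.
def toy_unencode(str, keep: [])
  str.gsub(/%[0-9A-Fa-f]{2}/) do |seq|
    keep.include?(seq.upcase) ? seq : seq[1, 2].hex.chr
  end
end

# Toy encoder: percent-encode anything outside an assumed "safe in query" set.
def toy_encode(str)
  str.gsub(/[^A-Za-z0-9._~=&+-]/) do |ch|
    ch.bytes.map { |b| format("%%%02X", b) }.join
  end
end

# encode(unicode_normalize(unencode(input))), as in the comment above.
def toy_normalize(input, keep: [])
  toy_encode(toy_unencode(input, keep: keep).unicode_normalize(:nfkc))
end

naive   = toy_normalize("a%2Bb")                # "a+b"     (%2B was decoded)
guarded = toy_normalize("a%2Bb", keep: ["%2B"]) # "a%252Bb" ('%' re-encoded)

puts naive
puts guarded
```

Skipping "%2B" in the decode step is not enough on its own: the encode step also has to know that this particular '%' starts an escape it must leave alone.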


sporkmonger commented on July 21, 2024

Closed by #99.

