
Comments (4)

pilif commented on June 24, 2024

You can't unconditionally UTF-8 encode: if the input data is already in UTF-8 (likely), this would lead to a double encode, which again breaks characters.
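To make the double encode concrete, here is a minimal sketch. It assumes the mbstring extension is loaded and uses `mb_convert_encoding()` from ISO-8859-1 as a stand-in for whatever unconditional encode step is applied (the older `utf8_encode()` behaves the same way but is deprecated as of PHP 8.2).

```php
<?php
// "é" already encoded as UTF-8 is the two bytes 0xC3 0xA9.
$utf8 = "\xC3\xA9";

// Encoding again treats each byte as a separate ISO-8859-1 character,
// so 0xC3 becomes "Ã" and 0xA9 becomes "©".
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";   // c3a9
echo bin2hex($double), "\n"; // c383c2a9
```

The second string renders as "Ã©" instead of "é", which is the typical symptom of a double encode.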

You can of course detect whether the input is already in UTF-8 (mb_detect_encoding()), but it's not the quickest of operations.
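A guarded conversion could look roughly like the sketch below. Note that it swaps in `mb_check_encoding()` (a validity check) rather than `mb_detect_encoding()`, and that `to_utf8_once()` is a hypothetical helper name. It also assumes that anything that is not valid UTF-8 is ISO-8859-1, which will not hold for every input.

```php
<?php
// Hypothetical helper: converts to UTF-8 only if the bytes are not already valid UTF-8.
// Assumption: non-UTF-8 input is ISO-8859-1.
function to_utf8_once(string $s): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($s, 'UTF-8', 'ISO-8859-1');
}

echo bin2hex(to_utf8_once("\xC3\xA9")), "\n"; // c3a9 (unchanged)
echo bin2hex(to_utf8_once("\xE9")), "\n";     // c3a9 (converted from ISO-8859-1)
```

This is still a heuristic: an ISO-8859-1 string whose bytes happen to form valid UTF-8 passes the check and is left as-is.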

I would recommend handling this on the client side for performance reasons, especially because more and more projects are using UTF-8 internally.

from php-resque.

roynasser commented on June 24, 2024

@pilif Hey! That is what I ended up doing... in my enqueuer I encode the strings, and I always decode them on my job workers.

Anyways, I thought it was useful to post here as it took me a little while to figure out what was going on with the empty payloads...
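For anyone landing here with the same symptom, this is a sketch of one way an empty payload can arise. It assumes the job payload is serialized with `json_encode()` and that one of the arguments contains bytes that are not valid UTF-8. The array shape is only illustrative.

```php
<?php
// "café" in ISO-8859-1: the trailing 0xE9 byte is not valid UTF-8.
$latin1 = "caf\xE9";

// json_encode() requires valid UTF-8 and fails on this input.
$encoded = json_encode(['args' => [$latin1]]);

var_dump($encoded);                              // bool(false)
var_dump(json_last_error() === JSON_ERROR_UTF8); // bool(true)
```

If the `false` return value is not checked, it ends up being stored as an empty string.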


danhunsaker commented on June 24, 2024

Even mb_detect_encoding() isn't enough, sometimes. The trouble is that detecting character encodings is always a bit of a hack, relying on a lot of guesswork. This is because the encoding of a string isn't stored anywhere, and most encodings are extremely similar to one another. A string of UTF-8 encoded text that doesn't happen to use any characters outside the range shared with ASCII, for example (anything with a byte value between [IIRC] 32 and 126, decimal; most English text qualifies here), could be detected as either ASCII or UTF-8, or perhaps even a host of other encodings that use the same range of bytes in the same way.
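A small sketch of that ambiguity, assuming the mbstring extension is available. The first part shows that plain English text is valid in several encodings at once. The second shows a single byte that is valid in two encodings but means a different character in each.

```php
<?php
// Pure ASCII-range text is valid under every one of these encodings.
$text = "Hello, world";
$valid = [];
foreach (['ASCII', 'UTF-8', 'ISO-8859-1', 'ISO-8859-15'] as $enc) {
    $valid[$enc] = mb_check_encoding($text, $enc);
}
var_dump($valid); // every entry is true

// The byte 0xA4 is the currency sign in ISO-8859-1 but the euro sign in ISO-8859-15.
$byte = "\xA4";
$asLatin1 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');
$asLatin9 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-15');

echo bin2hex($asLatin1), "\n"; // c2a4   (U+00A4, currency sign)
echo bin2hex($asLatin9), "\n"; // e282ac (U+20AC, euro sign)
```

Nothing in the bytes themselves says which reading is correct, so a detector can only guess.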

The only piece of code in a position to know for sure is the piece that actually accepts the input. In most PHP use cases, this is the browser. Most of the rest depend on the encoding of whatever other application (MySQL, Redis, web API, etc) the data is coming from. Some of these will tell you the encoding they've used. Some of them won't. And sadly, sometimes the ones that do mention an encoding are simply wrong.

Ultimately, the only sane location to handle encoding conversions is in the Resque client, not the library itself.
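As a rough sketch of what handling it in the client might look like: the code that enqueues the job knows (or is told) its source encoding and normalizes string arguments to UTF-8 before handing them over. `normalize_args()` is a hypothetical helper, and the queue name, job class, and ISO-8859-1 source encoding are placeholders. Array keys are left untouched in this version.

```php
<?php
// Hypothetical helper: recursively converts string values from a known
// source encoding to UTF-8. Non-string scalars pass through unchanged.
function normalize_args($value, string $sourceEncoding)
{
    if (is_array($value)) {
        $out = [];
        foreach ($value as $key => $item) {
            $out[$key] = normalize_args($item, $sourceEncoding);
        }
        return $out;
    }
    if (is_string($value)) {
        return mb_convert_encoding($value, 'UTF-8', $sourceEncoding);
    }
    return $value;
}

// The caller states the encoding instead of having the library guess it.
$args = normalize_args(['name' => "Jos\xE9", 'id' => 7], 'ISO-8859-1');

// Placeholder queue and job class names:
// Resque::enqueue('default', 'My_Job', $args);
```

Because the encoding is passed in explicitly, no detection step is needed, and input that is already UTF-8 simply skips the call.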


danhunsaker commented on June 24, 2024

Is this still an issue, or can it be closed?

