Comments (6)

GregMage commented on June 16, 2024

I think it's OK now

from xoopscore25.

geekwright commented on June 16, 2024

I took a closer look, and cleared some cobwebs out of a dark corner of my brain. 😄

I think the new utf8_encode() on the PHP side is just masking the real issue. The problem seems to start with the escape() call in formdhtmltextarea.js.

escape("é") results in "%E9", which is just the raw code point, percent-encoded.
escape("跠") results in "%u8DE0", which is a nonstandard encoding of the raw Unicode code point.

There is no single function to reverse this in PHP. Changing that "%uXXXX" into something PHP can use is what that preg_match_all code is about.
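To make the conversion concrete, here is a minimal sketch, written in JavaScript for illustration, of the kind of work a hand-rolled decoder has to do. It is not the project's preg_match_all code; the function name decodeLegacyEscape is hypothetical, and the sketch assumes the input only contains sequences that escape() itself produces.

```javascript
// Hypothetical helper: reverses the output of the legacy escape() function.
// Handles both forms: %uXXXX (16-bit code unit) and %XX (code point below 0x100).
function decodeLegacyEscape(input) {
  return input.replace(
    /%u([0-9a-fA-F]{4})|%([0-9a-fA-F]{2})/g,
    // Exactly one of the two capture groups is set for each match.
    (match, fourHex, twoHex) => String.fromCharCode(parseInt(fourHex || twoHex, 16))
  );
}

console.log(decodeLegacyEscape("%E9") === "\u00E9");    // "é"
console.log(decodeLegacyEscape("%u8DE0") === "\u8DE0"); // "跠"
```

Note that "%E9" here means the code point U+00E9, not a UTF-8 byte, so a plain percent-decoder on the server would produce a byte that is not valid UTF-8 by itself.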

A further consideration is that escape() is deprecated. It survives only in Annex B of the ECMAScript specification, which covers legacy web-browser features. Implementations still include it, but it is considered obsolete.

In contrast, let's look at what happens with encodeURIComponent()

encodeURIComponent("é") results in "%C3%A9" which is the correct UTF-8 encoding for the code point, percent encoded for transmission.

encodeURIComponent("跠") results in "%E8%B7%A0" which is again correct UTF-8 ready for transmission.
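The difference is easy to check side by side. This is a small sketch that runs in a browser console or in Node.js; the helper name compareEncodings is made up for the example, and the characters are written as \u escapes so the result does not depend on how the script file itself is encoded.

```javascript
// Hypothetical helper: shows both encodings of the same text.
function compareEncodings(text) {
  return {
    legacy: escape(text),           // code point based: %XX or %uXXXX
    utf8: encodeURIComponent(text), // UTF-8 bytes, each percent-encoded
  };
}

const eAcute = compareEncodings("\u00E9"); // "é"
console.log(eAcute.legacy, eAcute.utf8);   // %E9 %C3%A9

const cjk = compareEncodings("\u8DE0");    // "跠"
console.log(cjk.legacy, cjk.utf8);         // %u8DE0 %E8%B7%A0
```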

Since encodeURIComponent (and its reverse, decodeURIComponent) are based on the RFC 3986 standard, we can reverse both of these with the compatible PHP functions rawurldecode and rawurlencode.

If we use encodeURIComponent() in formdhtmltextarea.js, and then decode it with rawurldecode() in formdhtmltextarea_preview.php, I think everything should pass through end to end, no matter what the page encoding is set to. With what we have now, the high-ASCII range (0x80-0xFF) gets handled out of bounds and differently on both ends.
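A rough sketch of that round trip, kept in JavaScript so it can be run in one place. The function names are hypothetical, and decodeURIComponent() is only standing in for PHP's rawurldecode() on the server side; the real preview script would call the PHP function instead.

```javascript
// Client side: what formdhtmltextarea.js would send (hypothetical wrapper).
function encodeForPreview(text) {
  return encodeURIComponent(text);
}

// Server side stand-in: plays the role of rawurldecode() in PHP.
function decodeOnServer(encoded) {
  return decodeURIComponent(encoded);
}

// Mixed sample: Latin-1 range, CJK, and some xoopscode-style markup.
const original = "caf\u00E9 \u8DE0 [b]bold[/b]";
const wire = encodeForPreview(original);

console.log(/^[\x00-\x7F]*$/.test(wire));       // true: only ASCII goes over the wire
console.log(decodeOnServer(wire) === original); // true: text survives unchanged
```

One difference to keep in mind: decodeURIComponent() throws a URIError on malformed input, so the stand-in is stricter than a plain percent-decoder.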

Also, looking closer, we probably should decode everything before we process the xoopscode with the text sanitizer?

geekwright commented on June 16, 2024

I just want to add a short note on what brought all of this to my attention in the first place.

The protector filter I was testing checks language characteristics. (The idea is that if you are a new user posting in Korean on a French site, you are probably a spammer.) The funny thing was, the AJAX preview sailed right through, even though anything that submitted the actual form got rejected. The custom hybrid encoding didn't trigger any warnings in the preview, even though it showed up as Korean text.

GregMage commented on June 16, 2024

You're right, my code was not good. I did as you suggested and it works.

"Also, looking closer, we probably should decode everything before we process the xoopscode with the text sanitizer?"

Apparently, this is not necessary. I think it's OK now! #214

Testing the language is a very good idea.

geekwright commented on June 16, 2024

There are a lot of subtle interactions in this issue - thanks for being patient!

geekwright commented on June 16, 2024

Works well!
