is_utf8 helper (enc, closed)

krlmlr avatar krlmlr commented on August 18, 2024
is_utf8 helper

from enc.

Comments (9)

krlmlr avatar krlmlr commented on August 18, 2024

This will need C code to distinguish between "unknown" encoding and truly "all ASCII", basically an enhanced do_encoding() that also checks IS_ASCII(). Eventually this could be added upstream.

hadley avatar hadley commented on August 18, 2024

What's the difference between unknown and ascii? I thought unknown implied ascii.

krlmlr avatar krlmlr commented on August 18, 2024

Need to double-check.

krlmlr avatar krlmlr commented on August 18, 2024

There's a dedicated ASCII bit that is set if the code of all characters is 127 or less.
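
For illustration, the condition behind that bit can be written as a plain byte scan. This is only a standalone sketch, not R's internal code: the name `all_ascii` is made up here, and it assumes a NUL-terminated C string.

```c
/* Hypothetical helper, not part of R's API.
   Returns 1 if every byte in the NUL-terminated string is 127 or less,
   0 otherwise. An empty string counts as ASCII. */
static int all_ascii(const char *s) {
    const unsigned char *p = (const unsigned char *)s;
    for (; *p != '\0'; ++p) {
        if (*p > 127) return 0;  /* high bit set: not ASCII */
    }
    return 1;
}
```

Casting to `unsigned char` matters because plain `char` may be signed, in which case bytes above 127 would compare as negative.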

hadley avatar hadley commented on August 18, 2024

But that never gets used in Encoding(): https://github.com/wch/r-source/blob/2f2e4711ad7089f97f22c6b1ae25ba582d2e99a6/src/main/util.c#L1110

hadley avatar hadley commented on August 18, 2024

Also, none of the internal string representation stuff is in the exported API, so I think that means doing checks in R with Encoding() will be the easiest way forward.

krlmlr avatar krlmlr commented on August 18, 2024

ASCII implies unknown, but not the other way round. It will be difficult to detect pure ASCII strings using Encoding() only.
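
A rough sketch of why the mark alone is not enough, assuming the four values that Encoding() reports ("unknown", "latin1", "UTF-8", "bytes"). The enum `enc_mark` and both function names are hypothetical stand-ins, and the sketch deliberately ignores the native locale, so an "unknown" string with non-ASCII bytes is treated as not known to be UTF-8.

```c
/* Hypothetical stand-in for the mark that Encoding() reports. */
typedef enum { MARK_UNKNOWN, MARK_LATIN1, MARK_UTF8, MARK_BYTES } enc_mark;

/* Returns 1 if every byte of the NUL-terminated string is 127 or less. */
static int bytes_all_ascii(const char *s) {
    const unsigned char *p = (const unsigned char *)s;
    for (; *p != '\0'; ++p) {
        if (*p > 127) return 0;
    }
    return 1;
}

/* Returns 1 if the string can safely be treated as UTF-8, 0 otherwise.
   "unknown" covers both pure ASCII and native-encoded text, so the
   bytes have to be inspected to tell the two apart. */
static int is_utf8_compatible(enc_mark mark, const char *s) {
    if (mark == MARK_UTF8) return 1;
    if (mark == MARK_UNKNOWN) return bytes_all_ascii(s);
    return 0;  /* latin1 or bytes */
}
```

With only the mark available, the `MARK_UNKNOWN` branch would have to guess; the byte scan is what separates "all ASCII" from "unknown, possibly native non-ASCII".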

hadley avatar hadley commented on August 18, 2024

Yes, but it'll be accurate >95% of the time, I'd imagine

github-actions avatar github-actions commented on August 18, 2024

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.
