
HTML convert (node-iconv issue, 5 comments, closed)

bnoordhuis commented on August 16, 2024
HTML convert

from node-iconv.

Comments (5)

bnoordhuis commented on August 16, 2024

I don't understand what you mean. node-iconv converts text from one character set to another; it doesn't do anything with HTML.


tluyben commented on August 16, 2024

If you read Google ES with this:

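    var request = require('request');

    request({ uri:'http://www.google.es' }, function (error, response, body) {
      // Either a transport error or a non-200 status counts as a failure.
      if (error || response.statusCode !== 200) {
        console.log('Error when contacting google.es')
      }
    });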

the response is ISO-8859-1 encoded, as you can see in the HTML that Google returns.

When I print the response directly with

    sys.puts(body);

it outputs normal HTML but with the wrong characters. I want to convert from one character set to another, so I run

    iconv = new Iconv('ISO-8859-1', 'UTF-8');
    body = iconv.convert(body);

When I then print the result with

    sys.puts(body);

all the characters are encoded, for instance:

          <textarea id="csi" style="display: none;"></textarea><div       id="mngb"><div id="gb"><script>window.gbar&&gbar.eli&&gbar.eli()<

etc....

And the characters are not correct either after the convert.

Edit: I'll try to put together something simple to reproduce it...


tluyben commented on August 16, 2024

    request({ uri:'http://www.google.es' }, function (error, response, body) {
      // Either a transport error or a non-200 status counts as a failure.
      if (error || response.statusCode !== 200) {
        console.log('Error when contacting google.es')
      }
      var iconv = new Iconv('ISO-8859-1', 'UTF-8');
      body = iconv.convert(body);
    });

I'll try to reproduce the HTML-encoded characters as well, but the above does not fix the characters anyway. I believe the conversion should be ISO-8859-1 to UTF-8, yet the output still contains things like:

              Google.es tambi�n en

Any idea what could be wrong?

Edit: php -r 'echo iconv("ISO-8859-1", "UTF-8", file_get_contents("http://www.google.es"));'

Does what is expected.
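
A small sketch of one likely cause of the symptom above, using only Node's built-in Buffer and no network access. It assumes (this is not confirmed in the thread at this point) that `body` had already been decoded as a UTF-8 string before it reached iconv; the sample bytes and variable names are illustrative.

```javascript
// Bytes of "también" as an ISO-8859-1 server would send them
// (é is the single byte 0xE9).
const latin1Bytes = Buffer.from([0x74, 0x61, 0x6d, 0x62, 0x69, 0xe9, 0x6e]);

// Assumption: the body was decoded as UTF-8 somewhere along the way.
// 0xE9 is not valid UTF-8 here, so it becomes U+FFFD, the replacement character.
const decodedAsUtf8 = latin1Bytes.toString('utf8');

// The original byte is now gone: re-encoding yields EF BF BD instead of E9,
// so a later charset conversion has nothing left to recover the "é" from.
const reencoded = Buffer.from(decodedAsUtf8, 'utf8');

console.log(JSON.stringify(decodedAsUtf8));
console.log(reencoded.includes(0xe9));
```

If that assumption holds, the PHP one-liner behaves differently because `file_get_contents` hands the raw bytes to `iconv` unchanged.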


tluyben commented on August 16, 2024

Ok, after a lot of searching and trying, I found that I have to use

    _res = new Buffer(_res, 'binary');
    var iconv = new Iconv(enc, 'UTF-8');
    _res = iconv.convert(_res).toString('utf8');

That was incredibly unclear, but ok, it works ;)
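
For anyone wondering why this works, here is a rough sketch with Node's built-in Buffer only. It assumes `_res` was a 'binary' string, meaning each character holds one raw byte, so turning it back into a Buffer restores the exact bytes the server sent. The helper `latin1ToUtf8` is a hypothetical stand-in for the ISO-8859-1 case, not part of node-iconv, and `Buffer.from` is used here where the snippet above uses `new Buffer`.

```javascript
// Raw ISO-8859-1 bytes for "también" (é is 0xE9).
const rawBytes = Buffer.from([0x74, 0x61, 0x6d, 0x62, 0x69, 0xe9, 0x6e]);

// A 'binary' string maps each byte to one character, so nothing is lost...
const binaryString = rawBytes.toString('binary');

// ...and converting it back gives the same bytes, which a converter can work on.
const restored = Buffer.from(binaryString, 'binary');

// Hypothetical helper: ISO-8859-1 bytes in, UTF-8 bytes out.
// 'latin1' decodes each byte to the code point with the same value.
function latin1ToUtf8(bytes) {
  return Buffer.from(bytes.toString('latin1'), 'utf8');
}

const utf8Bytes = latin1ToUtf8(restored);
console.log(utf8Bytes.toString('utf8'));
```

For character sets other than ISO-8859-1, a converter such as Iconv is still needed; the key point is that it receives a Buffer of the original bytes.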


bnoordhuis commented on August 16, 2024

Glad you got it solved.

