Comments (11)

gggeek commented on September 5, 2024

Most likely the xmlrpc library is not being told correctly which character encoding your text uses.

I assume that you are in charge of the server, and that the string comes from a database or some other internal processing.
If the string is encoded as UTF-8, then you need to set
$xmlrpc_internalencoding='UTF8';
as documented here: http://gggeek.github.io/phpxmlrpc/doc-2/ch08s02.html#id937475

from phpxmlrpc.

gggeek commented on September 5, 2024

PS: the online debugger has a 'Show debug info' option which can be used to see all details of the request sent and the response received. This is useful when troubleshooting character encoding problems.

gggeek commented on September 5, 2024

PPS: if the error you mention is different from what I assume, e.g. the data with the wrong encoding comes from the client, or you are in charge of the client and not the server, do not hesitate to come back with more info.

fatica commented on September 5, 2024

Thanks for the reply. We had actually set this value as a troubleshooting attempt but it did not affect the output.

$xmlrpc_internalencoding='UTF8';

Execution of method examples.echo on server http://phpxmlrpc.sourceforge.net/server.php ...

Debug info:

---SENDING---
POST /server.php HTTP/1.0
User-Agent: XML-RPC for PHP 3.0.0
Host: phpxmlrpc.sourceforge.net
Accept-Charset: UTF-8,ISO-8859-1,US-ASCII
Content-Type: text/xml
Content-Length: 163

examples.echo Windmühle
---END---
---GOT---
HTTP/1.0 200 OK
Cache-Control: max-age=172800
Date: Thu, 09 Apr 2015 17:34:12 GMT
Via: 1.1 varnish
Age: 0
Server: Apache/2.2.15 (CentOS)
Vary: Accept-Charset, Accept-Encoding, User-Agent
Content-Length: 1119
Content-Type: text/xml
Expires: Sat, 11 Apr 2015 17:34:12 GMT
X-Varnish: 1385841160

faultCode 109 faultString XML error: Invalid character at line 4, column 40
---END---
HEADER: cache-control: max-age=172800
HEADER: date: Thu, 09 Apr 2015 17:34:12 GMT
HEADER: via: 1.1 varnish
HEADER: age: 0
HEADER: server: Apache/2.2.15 (CentOS)
HEADER: vary: Accept-Charset, Accept-Encoding, User-Agent
HEADER: content-length: 1119
HEADER: content-type: text/xml
HEADER: expires: Sat, 11 Apr 2015 17:34:12 GMT
HEADER: x-varnish: 1385841160
---SERVER DEBUG INFO (DECODED) ---
+++GOT+++
examples.echo Windmühle
+++END+++
HEADER: Via: gggeek.altervista.org (PHP)
HEADER: Accept-Charset: UTF-8,ISO-8859-1,US-ASCII
HEADER: Host: phpxmlrpc.sourceforge.net
HEADER: User-Agent: gggeek.altervista.org/199.27.128.200 (compatible; MSIE 7.0;)
HEADER: Content-Length: 163
HEADER: Content-Type: text/xml
HEADER: X-Forwarded-For: 199.27.128.200, 176.9.36.157
HEADER: X-Remote-Addr: 176.9.36.157
HEADER: X-Varnish: 1385841160

---END---
---PARSED---
xmlrpcval::__set_state(array(
'me' =>
array (
'struct' =>
array (
'faultCode' =>
xmlrpcval::__set_state(array(
'me' =>
array (
'int' => 109,
),
'mytype' => 1,
'_php_class' => NULL,
)),
'faultString' =>
xmlrpcval::__set_state(array(
'me' =>
array (
'string' => 'XML error: Invalid character at line 4, column 40',
),
'mytype' => 1,
'_php_class' => NULL,
)),
),
),
'mytype' => 3,
'_php_class' => NULL,
))
---END---
XMLRPC call FAILED!

Fault code: [109] Reason: 'XML error: Invalid character at line 4, column 40'

09/Apr/2015:19:34:12

gggeek commented on September 5, 2024

mmm, it seems like the debugger still has some problems properly coping with character sets (in this case: latin-1 characters, not utf-8 ones).

I just pushed a fix for this to the php53 branch (current master).

I can backport it to the 3.0 branch, but there are a couple of things I want to get fixed properly, so it will take a little time. If you want, in the meantime I can update the public debugger for you to test.

About your specific use case: apart from the debugger not working, what exact setup and symptoms do you have?

  • are you in control of both server side and client side?
  • what character sets are used at all steps of the chain?

A known limitation of the library is that, server-side, if latin-1 characters are received and the charset declaration is in the HTTP header but not in the XML prologue, parsing will fail.
This could either be better documented or properly fixed, but I fear that a fix might slow down the code a bit...

fatica commented on September 5, 2024

Hi,

I'm pretty sure we fixed the problem on our end by adding a layer just
before the request is posted to the library as follows:

$data = $this->file_get_contents_utf8('php://input');
...
$xmlrpc->service($data);
...

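    // Added to prevent incorrect encoding: normalize the raw request body to UTF-8
    function file_get_contents_utf8($fn) {
        $content = file_get_contents($fn);
        // Strict detection; fall back to ISO-8859-1 if detection fails
        $encoding = mb_detect_encoding($content, 'UTF-8, ISO-8859-1', true);
        return mb_convert_encoding($content, 'UTF-8', $encoding ?: 'ISO-8859-1');
    }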

The original issue arose from our client, and they only provided to us the
XML request body, which starts like this:

MetaLocator.upsertLocations****_**_**10headerid,name,published,type,address,address2,city,postalcode,phone,fax,link,email,country,language,tld,tag1,icon,lat,lng,externalidrow1"8977","

NCC Guttermann GmbH",1,"B2B PC","Wolbecker Windmühle
55","","Münster","48167","02506/9320-0","02506/9320-20","
https://www.nccms.de","","DE","de

They were likely not setting the Content-type header correctly and/or the
XML prologue when posting to the service as you point out, so the service
had to detect the encoding, which it did as ISO-8859-1. However, having
only the body of the request (without headers) makes it difficult to
reproduce what the client was actually doing.
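The detection step can be illustrated with a small sketch. It is a variant of the workaround, not the code we deployed: `to_utf8` is a hypothetical helper, it assumes the mbstring extension is loaded, and it assumes that anything which is not valid UTF-8 is ISO-8859-1.

```php
<?php
// Hypothetical helper for illustration only.
// Assumption: input that is not valid UTF-8 is ISO-8859-1 (latin-1).
function to_utf8($bytes)
{
    // Valid UTF-8 passes through untouched.
    if (mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }
    // Otherwise treat the bytes as latin-1 and convert them.
    return mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "Windm\xFChle";      // "Windmühle" with ü as the single byte 0xFC
$utf8   = "Windm\xC3\xBChle";  // the same word with ü as the UTF-8 pair 0xC3 0xBC

var_dump(to_utf8($latin1) === $utf8); // bool(true)
var_dump(to_utf8($utf8) === $utf8);   // bool(true)
```

The byte 0xFC followed by an ASCII letter is not a valid UTF-8 sequence, which is why the latin-1 form is rejected as UTF-8 and converted.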

It appears that both our customer and the debugger have the same issue: not
setting the XML prologue and the Content-type header to include the
charset as in Content-Type: text/xml; charset=utf-8 when posting to the
XMLRPC back end.
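For reference, here is a minimal sketch of a request that declares its charset in both places. `build_echo_request` is a hypothetical helper written for this comment, not part of phpxmlrpc, and it only builds the headers and body without sending anything. It assumes the text passed in is already encoded in the given charset.

```php
<?php
// Hypothetical helper for illustration; not part of phpxmlrpc.
// Assumption: $text is already encoded in $charset.
function build_echo_request($text, $charset = 'UTF-8')
{
    // Escape XML special characters without touching the encoding.
    $escaped = htmlspecialchars($text, ENT_QUOTES | ENT_XML1, $charset);

    // Charset declared in the XML prologue...
    $body = '<?xml version="1.0" encoding="' . $charset . '"?>' . "\n"
          . '<methodCall><methodName>examples.echo</methodName>'
          . '<params><param><value><string>' . $escaped . '</string></value></param></params>'
          . '</methodCall>';

    // ...and in the Content-Type header.
    $headers = array(
        'Content-Type: text/xml; charset=' . $charset,
        'Content-Length: ' . strlen($body),
    );

    return array($headers, $body);
}

list($headers, $body) = build_echo_request("Windm\xC3\xBChle");
echo implode("\n", $headers), "\n\n", $body, "\n";
```

With both declarations present, the receiving side does not have to guess the encoding.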

By the way, thank you so much for the prompt attention. We were very
surprised to hear back so quickly and succinctly.

Michael Fatica
Principal
Fatica Consulting L.L.C.
303-325-5912
www.fatica.net

gggeek commented on September 5, 2024

You are welcome.

Indeed, it seems that the debugger was doing exactly the same as your client. Not sure if we can call it good luck or bad luck :-)

It definitely is a weak spot in the library, even though in today's utf8-oriented world it is probably a rare occurrence.

For the nitty-gritty details: since version 5, the PHP library used for XML parsing is stupid enough to

  • choke on latin-1 characters unless there is a charset declaration in the XML prologue
  • not let the coder tell it that the XML is actually latin-1 instead of UTF-8

(it was actually better in the dark ages of PHP 4, go figure...). That's why the lib cannot cope with latin-1 characters even when the charset is properly declared in the HTTP headers (but not in the XML prologue).

I will work on a "proper" fix for the long term, doing either something similar to what you have done, or injecting the charset declaration in the xml prologue if it is only found in the http headers.
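To make the second option concrete, here is a rough sketch of what injecting the declaration could look like. It is not the code that ended up in the library: both function names are hypothetical, and it only handles a declaration that sits at the very start of the payload.

```php
<?php
// Hypothetical helpers for illustration; not taken from phpxmlrpc.

// Return the charset parameter of a Content-Type header value, or null if absent.
function charset_from_content_type($contentType)
{
    if (preg_match('/;\s*charset\s*=\s*"?([A-Za-z0-9._-]+)"?/i', $contentType, $m)) {
        return strtoupper($m[1]);
    }
    return null;
}

// Copy the charset from the HTTP header into the XML prologue when the
// prologue does not declare an encoding itself.
function inject_prologue_charset($xml, $contentType)
{
    $charset = charset_from_content_type($contentType);
    if ($charset === null) {
        return $xml; // nothing declared in the header, leave the payload alone
    }

    if (preg_match('/^<\?xml\s[^>]*\?>/', $xml, $m)) {
        if (preg_match('/\sencoding\s*=/', $m[0])) {
            return $xml; // the prologue already declares an encoding
        }
        // Declaration present but without encoding: add it right after version.
        $patched = preg_replace(
            '/(\sversion\s*=\s*(["\'])[^"\']*\2)/',
            '$1 encoding="' . $charset . '"',
            $m[0],
            1
        );
        return $patched . substr($xml, strlen($m[0]));
    }

    // No XML declaration at all: prepend one.
    return '<?xml version="1.0" encoding="' . $charset . '"?>' . "\n" . $xml;
}

echo inject_prologue_charset('<?xml version="1.0"?><methodCall/>', 'text/xml; charset=iso-8859-1'), "\n";
// <?xml version="1.0" encoding="ISO-8859-1"?><methodCall/>
```

A declaration that already names an encoding is left untouched here, on the assumption that the sender knows best; a real fix would have to decide what to do when header and prologue disagree.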

gggeek commented on September 5, 2024

PS: I renamed the issue with a title that better fits the results of the investigation.

gggeek commented on September 5, 2024

Update: a fix has been developed and applied, so far only to the php53 branch, with tests.

Unfortunately it seems that the new code fails on HHVM because it hits an incompatibility in HHVM itself: facebook/hhvm#4837

I will backport the fix to the 3.0 branch and document this as a known bug.

gggeek commented on September 5, 2024

Fixed in release 3.0.1

gggeek commented on September 5, 2024

Note: you will most likely have to remove your added workaround if upgrading to 3.0.1 and later
