Comments (7)

sabas commented on May 28, 2024

Hmm, that's strange!
Does it also happen when you var_dump the array?
What's the charset in the UNB segment?
Could you send me an example via email?

from edifact.

ArvidEnbom commented on May 28, 2024

Yes, it does also happen if I var_dump the $parser->get() array.

The EDI file I was testing with was encoded as UTF-8. However, this is literally the first time I've worked with this file format, so I have no idea what you mean by "UNB".
This EDI file gave me the error "There's a not printable character on line..." seven times (once for each line with a Swedish character, I'm assuming).

I have two example EDI files, however, and the second one uses Windows-1252 encoding instead of UTF-8. I tried that one as well, and it no longer gave me the "not printable character" error, so at first I thought it worked, but then I scrolled down and found that it hadn't.

I'm not sure if I am allowed to send you one of the EDI files, so we'll have to hold off on that for now.

sabas commented on May 28, 2024

Try this: load your file into a variable $text with file_get_contents, then:

$p = new Parser();
$p->setStripRegex("/[\x01-\x1F\x7F-\x9F]/"); // or something less restrictive
$p->loadString($text);

ArvidEnbom commented on May 28, 2024

I'm already sort of doing that. Here's my entire debug file at the moment:

<!DOCTYPE html>
<html lang="sv">
	<head>
		<meta charset="UTF8">
	</head>
	<body>
		<pre><?php

			require( __DIR__ . "/EDI/Parser.php" );

			$edi = file_get_contents( "in.edi" );

			$parser = new \EDI\Parser($edi);

			var_dump( $parser->get() );
		?></pre>
	</body>
</html>

I'll try your above code too, one moment

EDIT: Tried your code above, no change. I also tried changing the pattern a bunch, including these:

$parser->setStripRegex("[\x01-\x1]");
$parser->setStripRegex("[\x7F-\x9F]");
$parser->setStripRegex("//");
$parser->setStripRegex("/MATCH NOTHING PLEASE/");

None of those worked.

sabas commented on May 28, 2024

If you var_dump $edi, do you see the correct chars?

If you generate an anonymized version I can look at it, because with other diacritics I see it works as expected...

ArvidEnbom commented on May 28, 2024

Yes, I do see the correct chars then.
I asked whether I was allowed to share a file, so I'm sending it to you via email.

ArvidEnbom commented on May 28, 2024

To anyone else reading this: after some emails back and forth, we found the solution.

Use utf8_decode on the file contents:

$file = utf8_decode(file_get_contents($path));
$parser = new EDI\Parser($file);
// ... and so on

Then use utf8_encode($var) every time you read a value from the resulting array.

This will fix it.
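
Note for newer PHP versions: utf8_decode and utf8_encode convert between UTF-8 and ISO-8859-1, and both are deprecated as of PHP 8.2. Below is a minimal sketch of the same round trip using mb_convert_encoding, assuming the mbstring extension is available. The helper names toLatin1 and toUtf8Recursive are hypothetical and not part of the EDI library, and the $parsed array is only a stand-in for what $parser->get() might return.

```php
<?php
// Hypothetical helpers, not part of the EDI library. Requires ext-mbstring.

// Convert UTF-8 file contents to ISO-8859-1 before passing them to the parser
// (the same conversion utf8_decode performs).
function toLatin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

// Convert every string value in a nested array back to UTF-8
// (the same conversion utf8_encode performs, applied recursively).
function toUtf8Recursive(array $segments): array
{
    array_walk_recursive($segments, function (&$value): void {
        if (is_string($value)) {
            $value = mb_convert_encoding($value, 'UTF-8', 'ISO-8859-1');
        }
    });
    return $segments;
}

// Stand-in for the parser output; in real use this would come from
// something like: (new \EDI\Parser(toLatin1($contents)))->get()
$parsed = [['NAD', 'BY', toLatin1('Malmö')]];
$utf8 = toUtf8Recursive($parsed);
```

Converting the whole array once avoids having to remember the conversion at every place a value is read.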
