Comments (4)
The problem is that predefined entities are turned into Atom(c), where c is the character represented by the entity. For example, &lt; turns into an Atom('<'). This happens in MarkupParser:
    case '&' => // EntityRef or CharRef
      nextch(); ch match {
        case '#' => // CharacterRef
          nextch()
          val theChar = handle.text(tmppos, xCharRef(() => ch, () => nextch()))
          xToken(';')
          ts &+ theChar
The &+ operator is defined in NodeBuffer as follows:
    def &+(o: Any): NodeBuffer = {
      o match {
        case null | _: Unit | Text("") => // ignore
        case it: Iterator[_] => it foreach &+
        case n: Node => super.+=(n)
        case ns: Iterable[_] => this &+ ns.iterator
        case ns: Array[_] => this &+ ns.iterator
        case d => super.+=(new Atom(d))
      }
      ...
    }
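To make the fallthrough case concrete, here is a minimal, self-contained sketch. The types below (Node, Atom, Text, Buf) are simplified stand-ins written for this illustration, not the real scala-xml classes; only the shape of the match follows the NodeBuffer excerpt above.

```scala
import scala.collection.mutable.ArrayBuffer

// Simplified stand-ins for illustration only; NOT the real scala.xml classes.
object NodeBufferSketch {
  sealed trait Node
  class Atom(val data: Any) extends Node
  final class Text(val text: String) extends Atom(text)

  // Hypothetical buffer mirroring the shape of NodeBuffer.&+ quoted above.
  final class Buf {
    val nodes: ArrayBuffer[Node] = ArrayBuffer.empty[Node]

    def &+(o: Any): Buf = {
      o match {
        case null | ()                 => // ignore
        case t: Text if t.text.isEmpty => // ignore empty text
        case it: Iterator[_]           => it.foreach(x => this &+ x)
        case n: Node                   => nodes += n
        case ns: Iterable[_]           => this &+ ns.iterator
        case ns: Array[_]              => this &+ ns.iterator
        case d                         => nodes += new Atom(d) // a bare Char lands here
      }
      this
    }
  }
}
```

In this sketch, appending a bare Char such as '<' stores a plain Atom, while appending an explicit Text stores it unchanged. That difference is what matters when the buffer is serialized later.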
That sounds innocuous, but there is no good way to save such a thing. Calling XML.write or PrettyPrinter.formatNodes ultimately goes through Utilities.sequenceToXML and this branch:
    else if (children forall isAtomAndNotText) { // add space
      val it = children.iterator
      val f = it.next()
      serialize(f, pscope, sb, stripComments, decodeEntities, preserveWhitespace, minimizeTags)
      while (it.hasNext) {
        val x = it.next()
        sb.append(' ')
        serialize(x, pscope, sb, stripComments, decodeEntities, preserveWhitespace, minimizeTags)
      }
    }
When saving an element that contains only non-text atoms, they are separated by spaces.
The simplest remedy would be to change
    ts &+ theChar
to
    ts &+ Text(theChar.toString)
in MarkupParser.
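The effect of wrapping the character in a Text can be illustrated with a small model of the serialization branch quoted earlier. This is a sketch under stated assumptions: Node, Atom, Text and sequenceToXML here are simplified stand-ins defined for this example, and escaping, namespaces and the other serialize parameters are left out.

```scala
// Simplified model for illustration only; NOT the real scala.xml classes.
// Escaping, namespaces and the other serialize parameters are left out.
object AtomSpacingSketch {
  sealed trait Node { def render: String }
  class Atom(val data: Any) extends Node { def render: String = data.toString }
  final class Text(val text: String) extends Atom(text)

  def isAtomAndNotText(n: Node): Boolean =
    n.isInstanceOf[Atom] && !n.isInstanceOf[Text]

  // Mirrors only the quoted branch: non-text atoms are joined with spaces.
  def sequenceToXML(children: Seq[Node]): String =
    if (children.isEmpty) ""
    else if (children.forall(isAtomAndNotText)) children.map(_.render).mkString(" ")
    else children.map(_.render).mkString

  def main(args: Array[String]): Unit = {
    val asAtoms = Seq[Node](new Atom('a'), new Atom('b'))
    val asText  = Seq[Node](new Text("a"), new Text("b"))
    println(sequenceToXML(asAtoms)) // a b
    println(sequenceToXML(asText))  // ab
  }
}
```

In the model, two adjacent non-text atoms come out with a space between them, while the same characters wrapped in Text come out unchanged. Whether the one-line change is enough in the real parser is a separate question.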
from scala-xml.
Good analysis of the problem. Not sure if that change to MarkupParser will fix it though.
The parser will need some work to be smart enough to preserve whitespace between entities. Obviously, whitespace shouldn't be preserved between other types of XML "nodes" when it's not desired. However, that feature seems to be the source of the problem. An Atom is considered a Node, but entities are indeed a special case. They should be treated like text, so any adjacent whitespace should be treated as text.
The MarkupParser doesn't have a method to consume whitespace characters into a Text object. The appendText method would be useful here. Unfortunately, the preserveWS setting affects the behavior of appendText.
I've taken a first pass at writing unit tests should anyone want to take this one on.
Hi @ashawley, I have tried to fix this issue and also added the tests you provided. It would be great if you could take a look at it. Thanks!