scala-xml's Introduction

This is Scala 2! Welcome!

This is the home of the Scala 2 standard library, compiler, and language spec.

For Scala 3, visit scala/scala3.

How to contribute

Issues and bug reports for Scala 2 are located in scala/bug. That tracker is also where new contributors may find issues to work on: good first issues, help wanted.

For coordinating broader efforts, we also use the scala/scala-dev tracker.

To contribute here, please open a pull request from your fork of this repository.

Be aware that we can't accept additions to the standard library, only modifications to existing code. Binary compatibility forbids adding new public classes or public methods. Additions are made to scala-library-next instead.

We require that you sign the Scala CLA before we can merge any of your work, to protect Scala's future as open source software.

The general workflow is as follows.

  1. Find/file an issue in scala/bug (or submit a well-documented PR right away!).
  2. Fork the scala/scala repo.
  3. Push your changes to a branch in your forked repo. For coding guidelines, see the "Coding guidelines" section below.
  4. Submit a pull request to scala/scala from your forked repo.

For more information on building and developing the core of Scala, read the rest of this README, especially for setting up your machine!

Get in touch!

In order to get in touch with other Scala contributors, join the #scala-contributors channel on the Scala Discord chat, or post on contributors.scala-lang.org (Discourse).

If you need some help with your PR at any time, please feel free to @-mention anyone from the list below, and we will do our best to help you out:

username       talk to me about...
@lrytz         back end, optimizer, named & default arguments, reporters
@retronym      2.12.x branch, compiler performance, weird compiler bugs, lambdas
@SethTisue     getting started, build, CI, community build, Jenkins, docs, library, REPL
@dwijnand      pattern matcher, MiMa, partest
@som-snytt     warnings/lints/errors, REPL, compiler options, compiler internals, partest
@Ichoran       collections library, performance
@viktorklang   concurrency, futures
@sjrd          interactions with Scala.js
@NthPortal     library, concurrency, scala.math, LazyList, Using, warnings
@bishabosha    TASTy reader
@joroKr21      higher-kinded types, implicits, variance

P.S.: If you have some spare time to help out around here, we would be delighted to add your name to this list!

Branches

Target the oldest branch you would like your changes to end up in. We periodically merge forward from older release branches (e.g., 2.12.x) to new ones (e.g. 2.13.x).

If your change is difficult to merge forward, you may be asked to also submit a separate PR targeting the newer branch.

If your change is version-specific and shouldn't be merged forward, put [nomerge] in the PR name.

If your change is a backport from a newer branch and thus doesn't need to be merged forward, put [backport] in the PR name.

Choosing a branch

Most changes should target 2.13.x. We are increasingly reluctant to target 2.12.x unless there is a special reason (e.g. if an especially bad bug is found, or if there is commercial sponsorship).

The 2.11.x branch is now inactive and no further 2.11.x releases are planned (unless unusual, unforeseeable circumstances arise). You should not target 2.11.x without asking maintainers first.

Repository structure

Most importantly:

scala/
+--build.sbt                 The main sbt build definition
+--project/                  The rest of the sbt build
+--src/                      All sources
   +---/library              Scala Standard Library
   +---/reflect              Scala Reflection
   +---/compiler             Scala Compiler
+--test/                     The Scala test suite
   +---/files                Partest tests
   +---/junit                JUnit tests
   +---/scalacheck           ScalaCheck tests
+--spec/                     The Scala language specification

but also:

scala/
   +---/library-aux          Scala Auxiliary Library, for bootstrapping and documentation purposes
   +---/interactive          Scala Interactive Compiler, for clients such as an IDE (aka Presentation Compiler)
   +---/intellij             IntelliJ project templates
   +---/manual               Scala's runner scripts "man" (manual) pages
   +---/partest              Scala's internal parallel testing framework
   +---/partest-javaagent    Partest's helper java agent
   +---/repl                 Scala REPL core
   +---/repl-frontend        Scala REPL frontend
   +---/scaladoc             Scala's documentation tool
   +---/scalap               Scala's class file decompiler
   +---/testkit              Scala's unit-testing kit
+--admin/                    Scripts for the CI jobs and releasing
+--doc/                      Additional licenses and copyrights
+--scripts/                  Scripts for the CI jobs and releasing
+--tools/                    Scripts useful for local development
+--build/                    Build products
+--dist/                     Build products
+--target/                   Build products

Get ready to contribute

Requirements

You need the following tools:

  • Java SDK. The baseline version is 8 for both 2.12.x and 2.13.x. It is almost always fine to use a later SDK such as 11 or 15 for local development. CI will verify against the baseline version.
  • sbt

MacOS and Linux work. Windows may work if you use Cygwin. Community help with keeping the build working on Windows and documenting any needed setup is appreciated.

Tools we use

We are grateful for the following OSS licenses:

Build setup

Basics

During ordinary development, a new Scala build is built by the previously released version, known as the "reference compiler" or, slangily, as "STARR" (stable reference release). Building with STARR is sufficient for most kinds of changes.

However, a full build of Scala is bootstrapped. Bootstrapping has two steps: first, build with STARR; then, build again using the freshly built compiler, leaving STARR behind. This guarantees that every Scala version can build itself.

If you change the code generation part of the Scala compiler, your changes will only show up in the bytecode of the library and compiler after a bootstrap. Our CI does a bootstrapped build.

Bootstrapping locally: To perform a bootstrap, run restarrFull within an sbt session. This will build and publish the Scala distribution to your local artifact repository and then switch sbt to use that version as its new scalaVersion. You may then revert back with reload. Note restarrFull will also write the STARR version to buildcharacter.properties so you can switch back to it with restarr without republishing. This will switch the sbt session to use the build-restarr and target-restarr directories instead of build and target, which avoids wiping out classfiles and incremental metadata. IntelliJ will continue to be configured to compile and run tests using the starr version in versions.properties.

For history on how the current scheme was arrived at, see https://groups.google.com/d/topic/scala-internals/gp5JsM1E0Fo/discussion.

Building with fatal warnings: To make warnings in the project fatal (i.e. turn them into errors), run set Global / fatalWarnings := true in sbt (replace Global with the name of a module—such as reflect—to only make warnings fatal for that module). To disable fatal warnings again, either reload sbt, or run set Global / fatalWarnings := false (again, replace Global with the name of a module if you only enabled fatal warnings for that module). CI always has fatal warnings enabled.

Using the sbt build

Once you've started an sbt session you can run one of the core commands:

  • compile compiles all sub-projects (library, reflect, compiler, scaladoc, etc)
  • scala / scalac run the REPL / compiler directly from sbt (accept options / arguments)
  • enableOptimizer reloads the build with the Scala optimizer enabled. Our releases are built this way. Enable this when working on compiler performance improvements. When the optimizer is enabled the build will be slower and incremental builds can be incorrect.
  • setupPublishCore runs enableOptimizer and configures a version number based on the current Git SHA. Often used as part of bootstrapping: sbt setupPublishCore publishLocal && sbt -Dstarr.version=<VERSION> testAll
  • dist/mkBin generates runner scripts (scala, scalac, etc) in build/quick/bin
  • dist/mkPack creates a build in the Scala distribution format in build/pack
  • junit/test runs the JUnit tests; junit/testOnly *Foo runs a subset
  • scalacheck/test runs scalacheck tests, use testOnly to run a subset
  • partest runs partest tests (accepts options, try partest --help)
  • publishLocal publishes a distribution locally (can be used as scalaVersion in other sbt projects)
    • Optionally set baseVersionSuffix := "bin-abcd123-SNAPSHOT" where abcd123 is the git hash of the revision being published. You can also use something custom like "bin-mypatch". This changes the version number from 2.13.2-SNAPSHOT to something more stable (2.13.2-bin-abcd123-SNAPSHOT).
    • Note that the -bin string marks the version binary compatible. Using it in sbt will cause the scalaBinaryVersion to be 2.13. If the version is not binary compatible, we recommend using -pre, e.g., 2.14.0-pre-abcd123-SNAPSHOT.
    • Optionally set ThisBuild / Compile / packageDoc / publishArtifact := false to skip generating / publishing API docs (speeds up the process).

If a command results in an error message like a module is not authorized to depend on itself, it may be that a global sbt plugin is causing a cyclical dependency. Try disabling global sbt plugins (perhaps by temporarily commenting them out in ~/.sbt/1.0/plugins/plugins.sbt).

Sandbox

We recommend keeping local test files in the sandbox directory which is listed in the .gitignore of the Scala repo.

Incremental compilation

Note that sbt's incremental compilation is often too coarse for the Scala compiler codebase and re-compiles too many files, resulting in long build times (check sbt#1104 for progress on that front). In the meantime you can:

  • Use IntelliJ IDEA for incremental compiles (see IDE Setup below) - its incremental compiler is a bit less conservative, but usually correct.

IDE setup

We suggest using IntelliJ IDEA (see src/intellij/README.md).

Metals may also work, but we don't yet have instructions or sample configuration for that. A pull request in this area would be exceedingly welcome. In the meantime, we are collecting guidance at scala/scala-dev#668.

In order to use IntelliJ's incremental compiler:

  • run dist/mkBin in sbt to get a build and the runner scripts in build/quick/bin
  • run "Build" - "Make Project" in IntelliJ

Now you can edit and build in IntelliJ and use the scripts (compiler, REPL) to directly test your changes. You can also run the scala, scalac and partest commands in sbt. Enable "Ant mode" (explained above) to prevent sbt's incremental compiler from re-compiling (too many) files before each partest invocation.

Coding guidelines

Our guidelines for contributing are explained in CONTRIBUTING.md. It contains useful information on our coding standards, testing, documentation, how we use git and GitHub and how to get your code reviewed.

You may also want to check out the following resources:

Scala CI

Once you submit a PR your commits will be automatically tested by the Scala CI.

Our CI setup is always evolving. See scala/scala-dev#751 for more details on how things currently work and how we expect they might change.

If you see a spurious failure on Jenkins, you can post /rebuild as a PR comment. The scabot README lists all available commands.

If you'd like to test your patch before having everything polished for review, you can have Travis CI build your branch (make sure you have a fork and have Travis CI enabled for branch builds on it first, and then push your branch). Also feel free to submit a draft PR. In case your draft branch contains a large number of commits (that you didn't clean up / squash yet for review), consider adding [ci: last-only] to the PR title. That way only the last commit will be tested, saving some energy and CI-resources. Note that inactive draft PRs will be closed eventually, which does not mean the change is being rejected.

CI performs a compiler bootstrap. The first task, validatePublishCore, publishes a build of your commit to the temporary repository https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots. Note that this build is not yet bootstrapped, its bytecode is built using the current STARR. The version number is 2.13.2-bin-abcd123-SNAPSHOT where abcd123 is the commit hash. For binary incompatible builds, the version number is 2.14.0-pre-abcd123-SNAPSHOT.

You can use Scala builds in the validation repository locally by adding a resolver and specifying the corresponding scalaVersion:

$ sbt
> set resolvers += "pr" at "https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots/"
> set scalaVersion := "2.13.2-bin-abcd123-SNAPSHOT"
> console
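
The same configuration can be kept in a project's build.sbt instead of being typed into the sbt shell. A minimal sketch, assuming a binary-compatible PR build and using the placeholder commit hash abcd123 from the text above:

```scala
// build.sbt (sketch): abcd123 is a placeholder for the PR commit hash
resolvers += "pr" at "https://scala-ci.typesafe.com/artifactory/scala-pr-validation-snapshots/"

scalaVersion := "2.13.2-bin-abcd123-SNAPSHOT"
```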

"Nightly" builds

The Scala CI builds nightly download releases and publishes them to https://scala-ci.typesafe.com/artifactory/scala-integration/ .

Using a nightly build in sbt is explained in this Stack Overflow answer.

Although we casually refer to these as "nightly" builds, they aren't actually built nightly, but "mergely". That is to say, a build is published for every merged PR.

Scala CI internals

The Scala CI runs as a Jenkins instance on scala-ci.typesafe.com, configured by a chef cookbook at scala/scala-jenkins-infra.

The build bot that watches PRs, triggers testing builds and applies the "reviewed" label after an LGTM comment is in the scala/scabot repo.

Community build

The Scala community build is an important method for testing Scala releases. A community build can be launched for any Scala commit, even before the commit's PR has been merged. That commit is then used to build a large number of open-source projects from source and run their test suites.

To request a community build run on your PR, just ask in a comment on the PR and a Scala team member (probably @SethTisue) will take care of it. (details)

Community builds run on the Scala Jenkins instance. The jobs are named ..-integrate-community-build. See the scala/community-builds repo.

scala-xml's People

Contributors

acruise, adriaanm, ashawley, biswanaths, catap, dependabot[bot], dragos, dubinsky, edgecaseberg, eed3si9n, gkossakowski, isomarcte, joescii, khernyo, lolgab, lrytz, nthportal, odersky, olivierblanvillain, paulp, philippus, retronym, rogach, scala-steward, sethtisue, smarter, soc, som-snytt, xeno-by, xuwei-k

scala-xml's Issues

Trouble building on Windows

Hello,

I run sbt in the root directory of scala-xml, but it's unable to find scala.xml.Properties. I'm unable to find it also. Given the name, I think it would be in the pull. Is something missing from source control? Or am I missing a dependency?

Edit: Using Windows 8.1, Scala 2.11.7, and sbt 0.13.8.

Here's the output of the compile:

[info] Compiling 84 Scala sources to C:\dev\scala-xml\target\scala-2.12.0-M2\classes...
[info] 'compiler-interface' not yet compiled for Scala 2.12.0-M2. Compiling...
[info] Compilation completed in 11.349 s
[info] Running scala.xml.Properties
error java.lang.ClassNotFoundException: scala.xml.Properties
java.lang.ClassNotFoundException: scala.xml.Properties
[trace] Stack trace suppressed: run last compile:run for the full output.
java.lang.RuntimeException: Nonzero exit code: 1
at scala.sys.package$.error(package.scala:27)
[trace] Stack trace suppressed: run last compile:run for the full output.
error Nonzero exit code: 1
[error] Total time: 70 s, completed Jul 28, 2015 6:32:38 PM

Here's what I get from last compile:run

[info] Running scala.xml.Properties
[debug] Waiting for threads to exit or System.exit to be called.
[debug] Classpath:
[debug] C:\dev\scala-xml\target\scala-2.12.0-M2\classes
[debug] C:\Users\Nikhil.ivy2\cache\org.scala-lang\scala-library\jars\scala-library-2.12.0-M2.jar
[debug] Waiting for thread run-main-0 to terminate.
error java.lang.ClassNotFoundException: scala.xml.Properties
java.lang.ClassNotFoundException: scala.xml.Properties
at sbt.classpath.ClasspathFilter.loadClass(ClassLoaders.scala:63)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at sbt.Run.getMainMethod(Run.scala:72)
at sbt.Run.run0(Run.scala:60)
at sbt.Run.sbt$Run$$execute$1(Run.scala:51)
at sbt.Run$$anonfun$run$1.apply$mcV$sp(Run.scala:55)
at sbt.Run$$anonfun$run$1.apply(Run.scala:55)
at sbt.Run$$anonfun$run$1.apply(Run.scala:55)
at sbt.Logger$$anon$4.apply(Logger.scala:85)
at sbt.TrapExit$App.run(TrapExit.scala:248)
at java.lang.Thread.run(Thread.java:745)
[debug] Thread run-main-0 exited.
[debug] Interrupting remaining threads (should be all daemons).
[debug] Sandboxed run complete..
java.lang.RuntimeException: Nonzero exit code: 1
at scala.sys.package$.error(package.scala:27)
at sbt.BuildCommon$$anonfun$toError$1.apply(Defaults.scala:1943)
at sbt.BuildCommon$$anonfun$toError$1.apply(Defaults.scala:1943)
at scala.Option.foreach(Option.scala:236)
at sbt.BuildCommon$class.toError(Defaults.scala:1943)
at sbt.Defaults$.toError(Defaults.scala:38)
at sbt.Defaults$$anonfun$runTask$1$$anonfun$apply$36$$anonfun$apply$37.apply(Defaults.scala:719)
at sbt.Defaults$$anonfun$runTask$1$$anonfun$apply$36$$anonfun$apply$37.apply(Defaults.scala:717)
at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
at sbt.std.Transform$$anon$4.work(System.scala:63)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:226)
at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:226)
at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
at sbt.Execute.work(Execute.scala:235)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:226)
at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:226)
at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
at sbt.CompletionService$$anon$2.call(CompletionService.scala:28)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
error Nonzero exit code: 1

problems parsing a string with query arguments

Hey, I am doing this in my code: println((XML.loadFile(file2) \ "cat:entries").length)
and in my file2 I have the xml:
<cat:entries>
<cat:entry URI="http://www.nitrc.org/ir/data/experiments/NITRC_IR_E10452/scans/T2/resources/bval/files?format=zip&projectIncludedInPath=true&subjectIncludedInPath=true" format="ZIP"/>
</cat:entries>

This produces this error:
Exception in thread "main" org.xml.sax.SAXParseException; lineNumber: 8; columnNumber: 136; The reference to entity "projectIncludedInPath" must end with the ';' delimiter.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1437)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:891)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanAttribute(XMLDocumentFragmentScannerImpl.java:1548)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1320)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2787)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:327)
at scala.xml.factory.XMLLoader$class.loadXML(XMLLoader.scala:41)
at scala.xml.XML$.loadXML(XML.scala:60)
at scala.xml.factory.XMLLoader$class.loadFile(XMLLoader.scala:48)
at scala.xml.XML$.loadFile(XML.scala:60)
at Curator$.delayedEndpoint$Curator$1(Curator.scala:19)
at Curator$delayedInit$body.apply(Curator.scala:11)
at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at Curator$.main(Curator.scala:11)
at Curator.main(Curator.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:144)
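
The parser message is consistent with the XML rule that a literal & inside an attribute value starts an entity reference, so it has to be written as &amp;. The sketch below illustrates this with the JDK's built-in parser only; the shortened URI and the names AmpersandInAttribute, parses, raw and escaped are hypothetical, and it is not a statement about how scala-xml should behave.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.DocumentBuilderFactory
import scala.util.Try

object AmpersandInAttribute {
  // True if the JDK's default XML parser accepts the document.
  def parses(xml: String): Boolean = Try {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    builder.parse(new ByteArrayInputStream(xml.getBytes("UTF-8")))
  }.isSuccess

  // Hypothetical, shortened version of the URI from the report.
  val raw: String =
    """<entry URI="http://example.org/files?format=zip&projectIncludedInPath=true"/>"""

  // Writing each literal '&' as '&amp;' makes the attribute value well-formed.
  val escaped: String = raw.replace("&", "&amp;")

  def main(args: Array[String]): Unit = {
    println(parses(raw))     // false
    println(parses(escaped)) // true
  }
}
```

The parser may also print a "[Fatal Error]" line to stderr for the first document; that comes from its default error handler.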

NodeSeq attribute search does not work as expected

When using the back-slash operator on a NodeSeq, the behaviour does not match how XPath works. Getting the attribute value works when there's only a single element in the node sequence:

> val x = <x><a b='1'/></x>
> x \ "a" \ "@b"
res2: scala.xml.NodeSeq = 1

but fails when there are more:

> val x = <x><a b='1'/><a b='2'/></x>
> x \ "a" \ "@b"
res3: scala.xml.NodeSeq = NodeSeq()

expected:

res3: Seq(1, 2)
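
Until the behaviour changes, one possible workaround is to look up the attribute on each element separately, since the single-element case works. A minimal sketch, assuming scala-xml is on the classpath; the object name AttributePerElement is hypothetical:

```scala
import scala.xml._

object AttributePerElement {
  val x: Elem = <x><a b='1'/><a b='2'/></x>

  // Ask each <a> element for its own @b, so every lookup sees a single node.
  val values: List[String] = (x \ "a").toList.map(a => (a \ "@b").text)

  def main(args: Array[String]): Unit =
    println(values) // List(1, 2)
}
```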

Non exhaustive match warning while testing.

The following warning is generated while testing.

[warn] /home/b1swanath/code/scala-xml/src/test/scala/scala/xml/ReuseNodesTest.scala:86: match may not be exhaustive.
[warn] It would fail on the following inputs: (List(_), Nil), (Nil, List(_))
[warn]     (original.toList,transformed.toList) match {
[warn]     ^
[warn] one warning found

To reproduce, run "sbt clean" followed by "sbt test".
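
A common way to make a match on a pair of lists exhaustive is to add a catch-all case for the inputs the compiler lists. The code of ReuseNodesTest is not shown here, so the sketch below uses a hypothetical stand-in function, sameLength, purely to illustrate the pattern:

```scala
object ExhaustiveTupleMatch {
  // Hypothetical stand-in for the comparison in the test:
  // do both sequences have the same number of elements?
  def sameLength(original: Seq[Int], transformed: Seq[Int]): Boolean =
    (original.toList, transformed.toList) match {
      case (Nil, Nil)         => true
      case (_ :: xs, _ :: ys) => sameLength(xs, ys)
      case _                  => false // one side ran out first, e.g. (List(_), Nil)
    }
}
```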

Surprising filter on node failure

With the following code:

https://gist.github.com/fancellu/03bb694c920898613795


So as you can see, with hard coded XML, the filter works fine. With a seemingly equivalent structure, it doesn't find anything.

Probably due to different hashCodes, even though elements render the same

(root \ "number").hashCode //> Int = 1598709610
(root2 \ "number").hashCode //> Int = 1439920665

(root \ "number").head.text == (root2 \ "number").head.text //> Boolean = true

NodeBuffer does print as ArrayBuffer

new scala.xml.NodeBuffer() &+ scala.xml.Text("Text 1")

should print as:

NodeBuffer(Text 1)

but prints as

ArrayBuffer(Text 1).

I think you just need to add

override def stringPrefix: String = "NodeBuffer"

PrettyPrinter strips newlines from text in nodes, even pcdata

Migrated from https://issues.scala-lang.org/browse/SI-4303.

There is substantial discussion there that is not reproduced here.

Original description:

What steps will reproduce the problem

scala> <foo>{"hi\nthere"}</foo>
res6: scala.xml.Elem =
<foo>hi
there</foo>

scala> new PrettyPrinter(9999,2).format(<foo>{"hi\nthere"}</foo>)
res7: String = <foo>hi there</foo>

scala> new PrettyPrinter(9999,2).format(<foo>{PCData("hi\nthere")}</foo>)
res8: String = <foo><![CDATA[hi there]]></foo>

When trying to use XMLEventReader with a UTF-8 encoded XML document that starts with a byte order mark (EF BB BF), an empty iterator is returned, and an error message is printed to stderr

When trying to use XMLEventReader with a UTF-8 encoded XML document that starts with a byte order mark (EF BB BF), an empty iterator is returned, and an error message is printed to stderr.
test.scala:
import scala.io.Source
import scala.xml.pull.XMLEventReader

val t = new XMLEventReader(Source.fromFile("hasBOM.xml"))

println(t)
Output of scala test.scala:
file:/tmp/hasBOM.xml:1:1: < expected^
empty iterator

The attached file hasBOM.xml has CR+LF newlines, but LF or CR newlines do not change the behaviour.

https://drive.google.com/file/d/0BzQoj9XC6BxUUWZfdzgwb3J4OWc/view?usp=sharing
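
A possible workaround is to drop the byte order mark before the content reaches the reader. This is a sketch only, using just the standard library: BomStripping, stripUtf8Bom and sourceWithoutBom are hypothetical names, and whether XMLEventReader then parses the attached file has not been verified here.

```scala
import java.nio.charset.StandardCharsets
import scala.io.Source

object BomStripping {
  // The three bytes of the UTF-8 byte order mark: EF BB BF.
  private val Utf8Bom: Array[Byte] = Array(0xEF, 0xBB, 0xBF).map(_.toByte)

  // Hypothetical helper: drop a leading UTF-8 byte order mark, if present.
  def stripUtf8Bom(bytes: Array[Byte]): Array[Byte] =
    if (bytes.startsWith(Utf8Bom)) bytes.drop(Utf8Bom.length) else bytes

  // Build a Source from BOM-free content; it could then be handed to the reader.
  def sourceWithoutBom(bytes: Array[Byte]): Source =
    Source.fromString(new String(stripUtf8Bom(bytes), StandardCharsets.UTF_8))
}
```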

sync issue for MaxQueueSize

MaxQueueSize is populated in one thread and then used in another without synchronization. Occasionally (when the code runs hundreds of times), this causes the Parser thread to read MaxQueueSize as 0, which in turn fails with an exception:

Exception in thread "XMLEventReader" java.lang.IllegalArgumentException
at java.util.concurrent.LinkedBlockingQueue.<init>(LinkedBlockingQueue.java:261)
at scala.xml.pull.ProducerConsumerIterator$class.scala$xml$pull$ProducerConsumerIterator$$queue(XMLEventReader.scala:133)
at scala.xml.pull.XMLEventReader.scala$xml$pull$ProducerConsumerIterator$$queue$lzycompute(XMLEventReader.scala:27)
at scala.xml.pull.XMLEventReader.scala$xml$pull$ProducerConsumerIterator$$queue(XMLEventReader.scala:27)
at scala.xml.pull.ProducerConsumerIterator$$anonfun$produce$1.apply$mcV$sp(XMLEventReader.scala:144)
at scala.xml.pull.ProducerConsumerIterator$$anonfun$produce$1.apply(XMLEventReader.scala:144)
at scala.xml.pull.ProducerConsumerIterator$$anonfun$produce$1.apply(XMLEventReader.scala:144)
at scala.xml.pull.ProducerConsumerIterator$class.interruptibly(XMLEventReader.scala:125)
at scala.xml.pull.XMLEventReader.interruptibly(XMLEventReader.scala:27)
at scala.xml.pull.ProducerConsumerIterator$class.produce(XMLEventReader.scala:144)
at scala.xml.pull.XMLEventReader.produce(XMLEventReader.scala:27)
at scala.xml.pull.XMLEventReader$Parser$$anonfun$setEvent$1.apply(XMLEventReader.scala:68)
at scala.xml.pull.XMLEventReader$Parser$$anonfun$setEvent$1.apply(XMLEventReader.scala:68)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
at scala.xml.pull.XMLEventReader$Parser.setEvent(XMLEventReader.scala:68)
at scala.xml.pull.XMLEventReader$Parser.run(XMLEventReader.scala:95)
at java.lang.Thread.run(Thread.java:745)
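
The top frame of the trace matches the documented behaviour of java.util.concurrent.LinkedBlockingQueue, whose constructor rejects any capacity that is not greater than zero. The sketch below shows only that part; it does not reproduce the race itself, and the names ZeroCapacityQueue and canCreate are hypothetical.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.util.Try

object ZeroCapacityQueue {
  // LinkedBlockingQueue rejects any capacity that is not strictly positive.
  def canCreate(capacity: Int): Boolean =
    Try(new LinkedBlockingQueue[String](capacity)).isSuccess

  def main(args: Array[String]): Unit = {
    println(canCreate(0))    // false: IllegalArgumentException, as in the trace
    println(canCreate(1000)) // true
  }
}
```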

LibraryDep for scala-xml is wrong

"org.scala-lang.modules" %% "scala-xml" % "1.0.2" does not exist. It should be "org.scala-lang.modules" %% "scala-xml_2.11" % "1.0.2"

scala.xml.XML.write with minimizeTags = xml.MinimizeMode.Never still abbreviates tags

[Originally reported by @ptwithy at https://issues.scala-lang.org/browse/SI-8834]

Looking at the source here:
https://github.com/scala/scala/blob/v2.10.4/src/library/scala/xml/Utility.scala#L227
I can see that the problem is that minimizeTags is not being passed in recursive calls to sequenceToXML. It should be a 1-line fix...
*** 224,230 ****
} else {
// children, so use long form: <xyz ...>...
sb.append('>')
! sequenceToXML(el.child, el.scope, sb, stripComments)
sb.append("</")
el.nameToString(sb)
sb.append('>')
--- 224,230 ----
} else {
// children, so use long form: <xyz ...>...
sb.append('>')
! sequenceToXML(el.child, el.scope, sb, stripComments, minimizeTags = minimizeTags)
sb.append("</")
el.nameToString(sb)
sb.append('>')

XMLEventReader does not handle HTML entities correctly

(Created a new issue from a comment #72 (comment) by @Mark-L6n)

Furthermore, it does not handle other HTML entities well. There are well over 1,000 HTML entities (see list), and their values are simply tossed out with EvEntityRef. I am processing Wikipedia dumps and will encounter a wide range of them.
Why are only 4/5 entities processed, when any entity can occur in a text field? There shouldn't be a security concern, as a motivation for using entities is security.

Also, why are entities treated as an event at all? It'd be nice to have the option to disable this functionality so one could simply get all the text in a EvText() event.
Hopefully, there can be a way to either:

  1. enable EvEntityRef to process all HTML entities or
  2. disable EvEntityRef events from occurring and breaking up EvText() events.

Example problem:

object testEntityErr {
  import scala.io.Source
  import scala.xml.pull._

  val testStr = "<text> &amp; &quot; &lt; &gt; </text>" +
    "<notext> &nbsp; &apos; &copy; &reg; &euro; &dollar; &cent; &pound; &yen; </notext>"
  val xml = new XMLEventReader(Source.fromString(testStr))
  for (event <- xml) {
    event match {
      case EvEntityRef(e) => println(e)
      case EvComment(_) => println(event)
      case _ => // ignore
    }
  }
}

Output:

import scala.io.Source
import scala.xml.pull._

testStr: String = <text> &amp; &quot; &lt; &gt; </text><notext> &nbsp; &apos; &copy; &reg; &euro; &dollar; &cent; &pound; &yen; </notext>

xml: scala.xml.pull.XMLEventReader = non-empty iterator
amp
quot
lt
gt
EvComment( unknown entity nbsp; )
EvComment( unknown entity apos; )
EvComment( unknown entity copy; )
EvComment( unknown entity reg; )
EvComment( unknown entity euro; )
EvComment( unknown entity dollar; )
EvComment( unknown entity cent; )
EvComment( unknown entity pound; )
EvComment( unknown entity yen; )
res0: Unit = ()
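
In the meantime, one conceivable workaround is to replace HTML entities in the raw text before it is handed to the reader. The sketch below uses only the standard library; HtmlEntityPreprocessor, htmlEntities and resolve are hypothetical names, and the table is deliberately tiny rather than a full HTML entity list.

```scala
import scala.util.matching.Regex

object HtmlEntityPreprocessor {
  // Hypothetical, deliberately tiny table; a real one would cover the full list.
  val htmlEntities: Map[String, String] = Map(
    "nbsp" -> "\u00A0", "copy" -> "\u00A9", "reg" -> "\u00AE",
    "cent" -> "\u00A2", "pound" -> "\u00A3", "yen" -> "\u00A5", "euro" -> "\u20AC"
  )

  private val EntityRef: Regex = "&([A-Za-z][A-Za-z0-9]*);".r

  // Replace known HTML entities with their characters; leave everything else
  // (including the XML-predefined &amp; &lt; &gt; &quot; &apos;) untouched.
  def resolve(text: String): String =
    EntityRef.replaceAllIn(text, m =>
      Regex.quoteReplacement(htmlEntities.getOrElse(m.group(1), m.matched)))
}
```

Leaving the XML-predefined entities alone keeps the result well-formed, so the parser can still handle them itself.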

Add OSGi headers

scala-xml, as part of the scala library, used to be an OSGi bundle. It should continue to be one after modularization, especially since scalac (through scaladoc) depends on it.

See Scala OSGi regression thread.

scala.xml.Utility.trim() doesn't properly handle adjacent Text nodes

This issue migrated from https://issues.scala-lang.org/browse/SI-3062.

Please note: There is discussion there which is not reproduced here clarifying the needed algorithmic change.

Reproduced here is just the original description of the issue.

If Text("My name is ") is followed by Text("Harry"), the space following the word "is" will be incorrectly trimmed. Adjacent Text nodes need to be combined before whitespace is removed.

scala> import scala.xml._
import scala.xml._

scala> <div>{Text("My name is ")}{Text("Harry")}</div>
res0: scala.xml.Elem = <div>My name is Harry</div>

scala> Utility.trim(res0)
res1: scala.xml.Node = <div>My name isHarry</div>

This is important when modifying XML and then trimming it. For example we might start with <div>My name is <user:name/></div> and then replace the <user:name/> Elem with "Harry" thus leading to the adjacent Text nodes.
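The idea of combining adjacent Text nodes before trimming can be sketched on a toy tree. This is not the scala.xml implementation: `Node`, `Text`, `Elem`, `mergeAdjacent`, and `trim` below are simplified stand-ins defined only for this example.

```scala
object TrimSketch {
  // Simplified stand-ins for scala.xml.Node, Text and Elem (assumption).
  sealed trait Node
  final case class Text(data: String) extends Node
  final case class Elem(label: String, children: List[Node]) extends Node

  // Combine each run of adjacent Text nodes into a single Text node.
  def mergeAdjacent(nodes: List[Node]): List[Node] =
    nodes.foldRight(List.empty[Node]) {
      case (Text(a), Text(b) :: rest) => Text(a + b) :: rest
      case (n, acc)                   => n :: acc
    }

  // Merge first, then trim each Text node and drop the ones that become empty.
  def trim(node: Node): Node = node match {
    case Elem(label, children) =>
      val trimmed: List[Node] = mergeAdjacent(children).flatMap {
        case Text(s) =>
          val t = s.trim
          if (t.isEmpty) Nil else List(Text(t))
        case e => List(trim(e))
      }
      Elem(label, trimmed)
    case other => other
  }
}
```

Because the merge happens before trimming, the space after "is" is interior to the combined text and survives.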

BasicTransformer has exponential complexity

Please see
https://issues.scala-lang.org/browse/SI-3689
https://issues.scala-lang.org/browse/SI-4528

It seems BasicTransformer has exponential complexity on the nesting level of the XML being processed. This is due to this method:

def transform(ns: Seq[Node]): Seq[Node] = {
  val (xs1, xs2) = ns span (n => unchanged(n, transform(n)))

  if (xs2.isEmpty) ns
  else xs1 ++ transform(xs2.head) ++ transform(xs2.tail)
}

Each modified node is transformed twice: once in the span, and again in the if/else. So, for <a><b><c/></b></a>, with c being modified, node a gets transformed twice, node b gets transformed four times (twice for each time a is transformed), and node c gets transformed eight times.
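A minimal, self-contained sketch of the call-count blowup, using a toy tree rather than scala.xml. The names `Node`, `Counter`, `naive`, `naiveSeq`, and `once` are made up for illustration. The naive version mirrors the span-then-transform shape of the method quoted above; the single-pass version is one possible alternative, not necessarily the fix the library should adopt.

```scala
object TransformCount {
  final case class Node(label: String, children: List[Node] = Nil)

  final class Counter { var calls = 0 }

  // The "transformation": rename label "c" to "C"; reuse the node if nothing changed.
  private def rewrite(n: Node, kids: List[Node]): Node = {
    val label = if (n.label == "c") "C" else n.label
    if (label == n.label && kids == n.children) n else Node(label, kids)
  }

  def naive(n: Node, c: Counter): Node = {
    c.calls += 1
    rewrite(n, naiveSeq(n.children, c))
  }

  // Same shape as the quoted method: the first changed node is transformed
  // once inside span and then a second time to build the result.
  def naiveSeq(ns: List[Node], c: Counter): List[Node] = {
    val (xs1, xs2) = ns.span(n => naive(n, c) == n)
    if (xs2.isEmpty) ns
    else xs1 ++ (naive(xs2.head, c) :: naiveSeq(xs2.tail, c))
  }

  // Single pass: every node is transformed exactly once.
  def once(n: Node, c: Counter): Node = {
    c.calls += 1
    rewrite(n, n.children.map(once(_, c)))
  }
}
```

For the three-level tree a > b > c, the naive version makes 2 + 4 + 8 = 14 calls, while the single-pass version makes 3.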

PrettyPrinter removes newlines

Migrated from https://issues.scala-lang.org/browse/SI-4543.

There is discussion there that is not reproduced here.

Original description is:

What steps will reproduce the problem?

scala> <foo>{PCData("hello\nworld")}</foo>
res51: scala.xml.Elem =
<foo><![CDATA[hello
world]]></foo>

scala> (new PrettyPrinter(Int.MaxValue, 2)).format(res51)
res52: String = <foo><![CDATA[hello world]]></foo>

What is the expected behavior?

Should print "hello\nworld"

scala> (new PrettyPrinter(Int.MaxValue, 2)).format(res51)
res52: String = <foo><![CDATA[hello\nworld]]></foo>

What do you see instead?
prints "hello world"

What versions of the following are you using?
Scala: 2.8.1

More information for potential Maintainers ?

Hey,

Can anybody please update the front page with an e-mail address that prospective maintainers of this project can reach? (I really doubt anybody is looking at issues or pull requests.) It would also be good to provide some more information about what maintaining this project entails.

Thanks.

Concurrent modification exception under XML.load

Dispatch user reported this bug: https://github.com/n8han/Databinder-Dispatch/issues/33

Dispatch calls the load method of the XML singleton object from potentially many threads. The stack trace suggests that the static method SAXParserFactory.newInstance, called via XML.load -> XMLLoader.loadXML, is not thread-safe. The factory should be instantiated once and the instance retained for all calls to newSAXParser; otherwise, if the factory is rebuilt on each call to XML.load, that call needs to be synchronized.

caught javax.xml.parsers.FactoryConfigurationError: Provider org.apache.xerces.jaxp.SAXParserFactoryImpl could not be instantiated: java.util.ConcurrentModificationException
javax.xml.parsers.FactoryConfigurationError: Provider org.apache.xerces.jaxp.SAXParserFactoryImpl could not be instantiated: java.util.ConcurrentModificationException
at javax.xml.parsers.SAXParserFactory.newInstance(SAXParserFactory.java:134)
at scala.xml.factory.XMLLoader$class.parser(XMLLoader.scala:28)
at scala.xml.XML$.parser(XML.scala:40)
at scala.xml.factory.XMLLoader$class.load(XMLLoader.scala:53)
at scala.xml.XML$.load(XML.scala:40)
at dispatch.HandlerVerbs$$anonfun$$less$greater$1.apply(handlers.scala:88)
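The "instantiate once and retain" suggestion could look roughly like the sketch below. It uses only the JDK's javax.xml.parsers API; the object name `SharedParserFactory` and the method `newParser` are hypothetical and are not part of scala-xml. Synchronizing on the factory is a conservative assumption, since the factory is not documented as safe for concurrent use.

```scala
import javax.xml.parsers.{SAXParser, SAXParserFactory}

object SharedParserFactory {
  // Created once, on first use. Initialization of a lazy val is thread-safe in Scala,
  // so SAXParserFactory.newInstance() runs at most once.
  private lazy val factory: SAXParserFactory = SAXParserFactory.newInstance()

  // Guard newSAXParser with a lock, as a precaution against concurrent callers.
  def newParser(): SAXParser = factory.synchronized {
    factory.newSAXParser()
  }
}
```

Each caller still gets its own SAXParser instance, so parsers are never shared between threads.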

XMLEventReader causing deadlock when stop() is called

I have an XMLEventReader that I use to parse a few elements from the beginning of big files. Then I call stop() to terminate its background thread (because it uses a lot of resources if I leave it running).

The problem seems to be that there is an internal LinkedBlockingQueue with 1000 elements capacity. This queue must be full by the time I call stop(). Since I don't consume any more elements from the thread that calls stop() and the blocking list is already full, my code deadlocks:

Stack Trace
main 1
sun.misc.Unsafe.park line: not available [native method]
java.util.concurrent.locks.LockSupport.park line: 175
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await line: 2039
java.util.concurrent.LinkedBlockingQueue.put line: 350
scala.xml.pull.ProducerConsumerIterator$$anonfun$produce$1.apply$mcV$sp line: 144
scala.xml.pull.ProducerConsumerIterator$$anonfun$produce$1.apply line: 144
scala.xml.pull.ProducerConsumerIterator$$anonfun$produce$1.apply line: 144
scala.xml.pull.ProducerConsumerIterator$class.interruptibly line: 125
scala.xml.pull.XMLEventReader.interruptibly line: 27
scala.xml.pull.ProducerConsumerIterator$class.produce line: 144
scala.xml.pull.XMLEventReader.produce line: 27
scala.xml.pull.XMLEventReader.stop line: 56
pmc.IdFromNXml.apply line: 67

It looks like a bug to me. Also I don't understand the need for a separate thread. Since we're required to call stop() to stop processing, why the background thread? The class is instantiated via

new XMLEventReader(source)

and we need to close the source anyway, so it could all be done within the same thread.
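The blocking behaviour described above can be reproduced with a bounded LinkedBlockingQueue on its own. In the sketch below, `StopSketch`, `EndOfStream`, and `stop` are hypothetical names; it assumes the producer has already been interrupted, so nothing refills the queue between the two calls. The point is only that clearing the queue and using the non-blocking `offer` avoids waiting on a consumer that will never come.

```scala
import java.util.concurrent.LinkedBlockingQueue

object StopSketch {
  // Marker telling a consumer that no more elements will arrive (hypothetical).
  val EndOfStream = "<end-of-stream>"

  // put() on a full queue blocks until some consumer takes an element.
  // Discarding the pending elements first makes room, and offer() never blocks.
  // Returns true if the marker was enqueued.
  def stop(queue: LinkedBlockingQueue[String]): Boolean = {
    queue.clear()
    queue.offer(EndOfStream)
  }
}
```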

XMLEventReader does not handle &apos; properly

(This issue migrated from https://issues.scala-lang.org/browse/SI-7796)

Of the five required predefined entities in XML, XMLEventReader does not handle &apos;, returning an EvComment of " unknown entity apos; " instead of an EvEntityRef.

This test program:

import scala.xml.pull._
import scala.io.Source
object reader {
  val src = Source.fromString("<test>&amp;&lt;&gt;&apos;&quot;</test>")
  val er = new XMLEventReader(src)
  def main(args: Array[String]) {
    while (er.hasNext)
      Console.println(er.next)
  }
}

outputs:

EvElemStart(null,test,,)
EvEntityRef(amp)
EvEntityRef(lt)
EvEntityRef(gt)
EvComment( unknown entity apos; )
EvEntityRef(quot)
EvElemEnd(null,test)

Also, apos does not appear in XhtmlEntities.scala (may be unrelated).

Since these five entities are predefined, I would argue that the parser should auto-replace them with their equivalents so the user doesn't have to.

Sbt is not able to build with java 1.7

On a system with Java 1.7, sbt cannot work out which Scala version to use. The cause is the following lines from build.sbt: there is no branch for Java 1.7, so the build fails with the error "don't know what Scala versions to build on $java". Please see https://github.com/scala/scala-xml/blob/master/build.sbt#L11

crossScalaVersions         := {
  val java = System.getProperty("java.version")
  if (java.startsWith("1.6."))
    Seq("2.11.7", "2.12.0-M1")
  else if (java.startsWith("1.8."))
    Seq("2.12.0-M2")
  else
    sys.error(s"don't know what Scala versions to build on $java")
}
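One possible shape for the missing branch, written as a plain function so it can be tried outside sbt. The `1.7.` case and the versions it returns are an assumption made for illustration (Scala 2.11 is known to run on Java 7); the maintainers may well prefer a different list. `CrossVersions` and `forJava` are hypothetical names.

```scala
object CrossVersions {
  // Same decision logic as the build.sbt excerpt, plus an assumed Java 1.7 branch.
  def forJava(java: String): Seq[String] =
    if (java.startsWith("1.6."))
      Seq("2.11.7", "2.12.0-M1")
    else if (java.startsWith("1.7."))
      Seq("2.11.7") // assumption: build only the 2.11 line on Java 7
    else if (java.startsWith("1.8."))
      Seq("2.12.0-M2")
    else
      sys.error(s"don't know what Scala versions to build on $java")
}
```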

More secure parsing

@jroper says to add the following to XMLLoader.parser:

See http://blog.csnc.ch/2012/08/secure-xml-parser-configuration/

try {
  f.setFeature("http://xml.org/sax/features/external-general-entities", false)
  f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true)
} catch {
  case e: ParserConfigurationException => // warn that the SAXParserFactory supplied by the JDK doesn't support this feature, and that the application may therefore be vulnerable to external entity attacks, encourage to define your own parser instead
  case e: SAXNotRecognizedException => // as above
  case e: SAXNotSupportedException => // as above
}
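A self-contained version of the same idea, using only JDK classes, might look like the sketch below. `SecureParsing`, `hardenedFactory`, and `parses` are hypothetical names, and printing a warning to stderr is just one way to surface an unsupported feature. It relies on SAXNotRecognizedException and SAXNotSupportedException both being subclasses of SAXException.

```scala
import java.io.StringReader
import javax.xml.parsers.{ParserConfigurationException, SAXParserFactory}
import org.xml.sax.{InputSource, SAXException}
import org.xml.sax.helpers.DefaultHandler

object SecureParsing {
  // Build a factory with external general entities off and DOCTYPE declarations rejected.
  def hardenedFactory(): SAXParserFactory = {
    val f = SAXParserFactory.newInstance()
    try {
      f.setFeature("http://xml.org/sax/features/external-general-entities", false)
      f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true)
    } catch {
      case e @ (_: ParserConfigurationException | _: SAXException) =>
        // The parser does not support the feature; the application may be
        // vulnerable to external entity attacks.
        System.err.println(s"XML parser lacks secure-processing features: ${e.getMessage}")
    }
    f
  }

  // Returns true if the document parses, false if the parser rejects it.
  def parses(xml: String): Boolean =
    try {
      hardenedFactory().newSAXParser()
        .parse(new InputSource(new StringReader(xml)), new DefaultHandler)
      true
    } catch {
      case _: SAXException => false
    }
}
```

With the parser bundled in the JDK, a document that carries a DOCTYPE declaration should be rejected, while a plain document still parses.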

XML parser reverses attribute order

This issue is copied across from https://issues.scala-lang.org/browse/SI-6341

The original person who raised the issue attached a patch to add a failing test case.

When parsing XML and then writing it out again, the order of attributes is reversed.

This prevents certain use cases where one wants to modify an XML file and have a minimal amount of changes. E.g. adding a new element to a file causes all existing attributes to get reversed.

The cause is here:

https://github.com/scala/scala/blob/81e3121eb09010375783d13e8c4c853686a34ffd/src/library/scala/xml/parsing/MarkupParser.scala#L302

aMap: MetaData is built by prepending new attributes to the linked list, causing the reversal.
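The effect of prepending can be shown with an ordinary immutable List standing in for the MetaData linked list. `AttributeOrder`, `prepended`, and `inDocumentOrder` are hypothetical names, and reversing at the end is just one way to restore the order, not necessarily the right fix for MarkupParser.

```scala
object AttributeOrder {
  type Attr = (String, String)

  // Prepending each attribute as it is parsed yields reverse document order.
  def prepended(parsed: List[Attr]): List[Attr] =
    parsed.foldLeft(List.empty[Attr])((acc, attr) => attr :: acc)

  // Reversing once after the last attribute restores document order.
  def inDocumentOrder(parsed: List[Attr]): List[Attr] =
    prepended(parsed).reverse
}
```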

The attached patch adds a failing test case. (It failed when I wrote it originally, now I can't test it because the build fails).

commit 272ed40345c9f8f4af878f15299155cecfd8db2e
Author: Robin Stocker [email protected]
Date: Sat Sep 8 19:11:50 2012 +0200

Add failing test for scala.xml attribute order problem

diff --git a/test/files/run/xml-attribute-order-parse.check b/test/files/run/xml-attribute-order-parse.check
new file mode 100644
index 0000000..6637a4f
--- /dev/null
+++ b/test/files/run/xml-attribute-order-parse.check
@@ -0,0 +1 @@
+
diff --git a/test/files/run/xml-attribute-order-parse.scala b/test/files/run/xml-attribute-order-parse.scala
new file mode 100644
index 0000000..3be386c
--- /dev/null
+++ b/test/files/run/xml-attribute-order-parse.scala
@@ -0,0 +1,9 @@
+import xml.XML
+
+object Test {
+  def main(args: Array[String]): Unit = {
+    val input = """"""
+    val doc = XML.loadString(input)
+    println(doc)
+  }
+}

locally-built classes should not break ivy resolution in scala-tool config

when resolving scala-compiler for, say, the doc task, no scala-xml jar is put on the classpath when the compiler's scala-xml dependency has the same version as the one we're building now.

This can be patched by appending the locally built jar to ScalaInstance's classpath. Ideally, we'd keep this project out of the resolution process entirely so we end up with the correct scala-xml dependency for scala-compiler.

This only really comes up during bootstrapping, where we work around it by setting the local version to a dummy, run the doc task, then reset the version: https://github.com/scala/jenkins-scripts/blob/master/job/scala-release-2.11.x#L114

I tried to implement that but failed (failed attempt below).

// to avoid problem with circular dependency when bootstrapping the modules
compilers in (Compile, doc) := {
  val instance = scalaInstance.value
  val cp = (packageBin in Compile).value +: instance.allJars
  println(s"Classpath for scalaInstance =\n\t * ${cp mkString "\n\t * "}")
  val launcher = appConfiguration.value.provider.scalaProvider.launcher
  val newInstance = ScalaInstance(instance.libraryJar, instance.compilerJar, launcher, cp:_*)
  Compiler.compilers(newInstance, classpathOptions.value, javaHome.value)(appConfiguration.value, streams.value.log)
}

This doesn't work (even with the crazy hacks to force an update)

val scalaInstanceFromUpdate = TaskKey[ScalaInstance]("scala-instance-from-update", "Defines the Scala instance to use for compilation, running, and testing.")

scalaInstanceFromUpdate := Defaults.scalaInstanceFromUpdate.value

scalaInstance := {
 val s = state.value
 val extracted = Project.extract(s)
 val newState = extracted.append(List(version := "1.0.0-BLA", libraryDependencies += "junit" % "junit" % "4.10" % "scala-tool"), s)
 val newState2 = extracted.runTask(Keys.update, newState)._1
 val inst = extracted.runTask(scalaInstanceFromUpdate, newState2)._2
 println(inst.allJars.toList)
 inst
}

Can a scala-js version of this be published?

I am not sure if this is feasible, but a scala-js version of this library would help me a lot, since I have a ton of existing code which I could start using on the client-side immediately.

Thanks,
~ hrj

We are missing wiki for this repository

Hey All,

We don't have wiki entry for this repo.

Please do suggest an outline and if possible patches to build the wiki. If it existed anywhere else on the nets, let us know, so that we can replicate over here.

Thanks.

Scaladoc links to Scala standard library not being generated

from a recent Travis run:

[warn] /home/travis/build/scala/scala-xml/src/main/scala/scala/xml/pull/package.scala:4: Could not find any member to link for "scala.collection.Iterator".
[warn] /**
[warn] ^
[warn] /home/travis/build/scala/scala-xml/src/main/scala/scala/xml/pull/XMLEventReader.scala:19: Could not find any member to link for "scala.collection.Iterator".
[warn] /**
[warn] ^
[warn] /home/travis/build/scala/scala-xml/src/main/scala/scala/xml/parsing/FactoryAdapter.scala:165: Could not find any member to link for "org.xml.sax.SAXException".
[warn]   /**
[warn]   ^
[warn] /home/travis/build/scala/scala-xml/src/main/scala/scala/xml/include/XIncludeException.scala:13: Could not find any member to link for "java.lang.Throwable#getMessage".
[warn] /**
[warn] ^
[warn] /home/travis/build/scala/scala-xml/src/main/scala/scala/xml/include/sax/EncodingHeuristics.scala:39: Could not find any member to link for "IOException".
[warn]   /**
[warn]   ^
[warn] /home/travis/build/scala/scala-xml/src/main/scala/scala/xml/Elem.scala:44: Could not find any member to link for "scala.sys.process.ProcessBuilder".
[warn]   /**
[warn]   ^
[warn] 6 warnings found

I think this just needs a little configuration, probably a line or two in the build

CDATA test broken

Since scala/scala#4306:

> set every scalaVersion := "2.11.7-SNAPSHOT"
test
[info] Defining */*:scalaVersion, *:scalaVersion
[info] The new values will be used by */*:crossScalaVersions, */*:scalaBinaryVersion and 14 others.
[info]  Run `last` for details.
[info] Reapplying settings...
[info] Set current project to scala-xml (in build file:/Users/adriaan/git/scala-xml/)
> test
[warn] Credentials file /Users/adriaan/.ivy2/.credentials does not exist
[info] Updating {file:/Users/adriaan/git/scala-xml/}scala-xml...
[info] Resolving org.scala-lang#scala-compiler;2.11.7-SNAPSHOT ...
[warn] circular dependency found: org.scala-lang.modules#scala-xml_2.11;1.0.4-SNAPSHOT->org.scala-lang#scala-compiler;2.11.7-SNAPSHOT->org.scala-lang.modules#scala-xml_2.11;1.0.3
[info] Resolving jline#jline;2.12.1 ...
[info] downloading https://oss.sonatype.org/content/repositories/snapshots/org/scala-lang/scala-library/2.11.7-SNAPSHOT/scala-library-2.11.7-20150413.015800-17.jar ...
[info]  [SUCCESSFUL ] org.scala-lang#scala-library;2.11.7-SNAPSHOT!scala-library.jar (3021ms)
[info] downloading https://oss.sonatype.org/content/repositories/snapshots/org/scala-lang/scala-compiler/2.11.7-SNAPSHOT/scala-compiler-2.11.7-20150413.015800-17.jar ...
[info]  [SUCCESSFUL ] org.scala-lang#scala-compiler;2.11.7-SNAPSHOT!scala-compiler.jar (7544ms)
[info] downloading https://oss.sonatype.org/content/repositories/snapshots/org/scala-lang/scala-reflect/2.11.7-SNAPSHOT/scala-reflect-2.11.7-20150413.015800-17.jar ...
[info]  [SUCCESSFUL ] org.scala-lang#scala-reflect;2.11.7-SNAPSHOT!scala-reflect.jar (2763ms)
[info] Done updating.
[info] Compiling 84 Scala sources to /Users/adriaan/git/scala-xml/target/scala-2.11/classes...
[info] 'compiler-interface' not yet compiled for Scala 2.11.7-20150410-102911-6e7b32649b. Compiling...
[info]   Compilation completed in 13.95 s
[warn] there were two deprecation warnings; re-run with -deprecation for details
[warn] one warning found
[info] Compiling 15 Scala sources to /Users/adriaan/git/scala-xml/target/scala-2.11/test-classes...
[info] scala-xml: found 0 potential binary incompatibilities
[warn] there was one deprecation warning; re-run with -deprecation for details
[warn] there were 42 feature warnings; re-run with -feature for details
[warn] two warnings found
:9:19: '/' expected instead of ''                  ^
:9:19: name expected, but char '' cannot start a name                  ^
Testing scala-xml version 1.0.4-SNAPSHOT.
Testing scala-xml version 1.0.4-SNAPSHOT.
Testing scala-xml version 1.0.4-SNAPSHOT.
Testing scala-xml version 1.0.4-SNAPSHOT.
[error] Test scala.xml.XMLTest.escape failed: expected:<[
[error]  &quot;Come, come again, whoever you are, come!
[error] Heathen, fire worshipper or idolatrous, come!
[error] Come even if you broke your penitence a hundred times,
[error] Ours is the portal of hope, come as you are.&quot;
[error]                               Mevlana Celaleddin Rumi]> but was:<[<![CDATA[
[error]  "Come, come again, whoever you are, come!
[error] Heathen, fire worshipper or idolatrous, come!
[error] Come even if you broke your penitence a hundred times,
[error] Ours is the portal of hope, come as you are."
[error]                               Mevlana Celaleddin Rumi]]>]>
[error] Failed: Total 88, Failed 1, Errors 0, Passed 87
[error] Failed tests:
[error]     scala.xml.XMLTest
[error] (test:test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 65 s, completed Apr 13, 2015 2:20:44 PM

tests are done against scala-xml dependency and not scala-xml local code !!!

SBT_OPTS="-verbose"
sbt test

and you'll see, for example, that the XMLEventReader class in use comes from .ivy2/cache/org.scala-lang.modules/scala-xml_2.11.0-M8/bundles/scala-xml_2.11.0-M8-1.0.0-RC7.jar and not from the local one I'm testing, ./target/scala-2.11.0-M8/classes/scala/xml/pull/XMLEventReader.class, which was freshly recompiled with the fixes I'm trying to test.

All tests are concerned.

[Loaded scala.xml.pull.XMLEventReaderTest from file:/home/work/experiments/scala/scala-xml-dacr/target/scala-2.11.0-M8/test-classes/]
[Loaded scala.xml.pull.XMLEventReader from file:/home/work/.ivy2/cache/org.scala-lang.modules/scala-xml_2.11.0-M8/bundles/scala-xml_2.11.0-M8-1.0.0-RC7.jar]
[Loaded scala.xml.pull.XMLEventReader$POISON$ from file:/home/work/.ivy2/cache/org.scala-lang.modules/scala-xml_2.11.0-M8/bundles/scala-xml_2.11.0-M8-1.0.0-RC7.jar]
[Loaded scala.xml.pull.XMLEventReader$Parser from file:/home/work/.ivy2/cache/org.scala-lang.modules/scala-xml_2.11.0-M8/bundles/scala-xml_2.11.0-M8-1.0.0-RC7.jar]
[Loaded scala.xml.pull.XMLEventReader$Parser$$anonfun$run$1 from file:/home/work/.ivy2/cache/org.scala-lang.modules/scala-xml_2.11.0-M8/bundles/scala-xml_2.11.0-M8-1.0.0-RC7.jar]
[Loaded scala.xml.pull.XMLEventReader$Parser$$anonfun$setEvent$1 from file:/home/work/.ivy2/cache/org.scala-lang.modules/scala-xml_2.11.0-M8/bundles/scala-xml_2.11.0-M8-1.0.0-RC7.jar]

roll version 1.0.5 so that Scala 2.12.0-M3 gets the fix for #51

rolling a release for such a small reason may seem a little puzzling to users, but due to the circular dependency between scala and scala-xml, this will sometimes need to happen, and hey, with so much automation in place, rolling the release is easy

community: open issues here for any significant old issue from JIRA

today I closed all open XML tickets in the Scala JIRA: 9047, 8834, 7796, 7726, 7395, 7311, 7282, 6800, 6741, 6341, 5775, 5133, 5132, 5131, 5101, 4865, 4836, 4543, 4529, 4528, 4520, 4303, 4296, 4286, 4267, 4050, 3881, 3689, 3573, 3527, 3336, 3334, 3286, 3246, 3062, 2725, 1787, 1654

many of these ought to become issues here in this repo on GitHub. interested community members are invited to open new issues for them here, with links in both directions

EDIT: there is a list below of still-unmigrated issues

XMLEventReader causes OutOfMemoryError with invalid XML declaration

I initially noticed this bug because a third-party REST API returns malformed XML on some occasions, which causes our server to run out of memory.

After a lot of head-scratching I found that an unclosed attribute string in the XML declaration causes this systematically. Here's the simplest example of reproducing the bug that I've found:

import scala.xml.pull.XMLEventReader
import scala.io.Source

object XMLEventReader_OutOfMemory {
    def main(args: Array[String]): Unit = {
        val src = Source.fromString("<?xml x=\"")
        new XMLEventReader(src)
    }
}

Running the code causes an OutOfMemoryError:

$ scalac XMLEventReader_OutOfMemory.scala && scala XMLEventReader_OutOfMemory
Exception in thread "XMLEventReader" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:622)
    at java.lang.StringBuilder.append(StringBuilder.java:202)
    at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:267)
    at scala.xml.parsing.MarkupParserCommon.xAttributeValue(MarkupParserCommon.scala:70)
    at scala.xml.pull.XMLEventReader$Parser.xAttributeValue(XMLEventReader.scala:60)
    at scala.xml.parsing.MarkupParserCommon.xAttributeValue(MarkupParserCommon.scala:78)
    at scala.xml.pull.XMLEventReader$Parser.xAttributeValue(XMLEventReader.scala:60)
    at scala.xml.parsing.MarkupParser.xAttributes(MarkupParser.scala:318)
    at scala.xml.pull.XMLEventReader$Parser.xAttributes(XMLEventReader.scala:60)
    at scala.xml.parsing.MarkupParser.xmlProcInstr(MarkupParser.scala:143)
    at scala.xml.pull.XMLEventReader$Parser.xmlProcInstr(XMLEventReader.scala:60)
    at scala.xml.parsing.MarkupParser.prologOrTextDecl(MarkupParser.scala:159)
    at scala.xml.parsing.MarkupParser.prolog(MarkupParser.scala:209)
    at scala.xml.pull.XMLEventReader$Parser.prolog(XMLEventReader.scala:60)
    at scala.xml.parsing.MarkupParser.document(MarkupParser.scala:239)
    at scala.xml.pull.XMLEventReader$Parser.document(XMLEventReader.scala:60)
    at scala.xml.pull.XMLEventReader$Parser.scala$xml$pull$XMLEventReader$Parser$$$anonfun$2(XMLEventReader.scala:96)
    at scala.xml.pull.XMLEventReader$Parser$$Lambda$93/1925707867.apply(Unknown Source)
    at scala.xml.pull.ProducerConsumerIterator.interruptibly(XMLEventReader.scala:125)
    at scala.xml.pull.XMLEventReader.interruptibly(XMLEventReader.scala:27)
    at scala.xml.pull.XMLEventReader$Parser.run(XMLEventReader.scala:96)
    at java.lang.Thread.run(Thread.java:745)

Cannot stop XMLEventReader thread with Thread.sleep present (or v large file)

https://gist.github.com/fancellu/780d7c3dab1a8309f561

I'm trying to pull in some XML events, and stop after a few.

Blocks on ProducerConsumerIterator.produce

If I take out the Thread.sleep, it works fine.

Any idea why? Is this a known "feature" ?

Also, on a big (4GB) xml file, er.stop doesn't even stop it, seems to run to end of file

This isn't really what I want out of streaming.

Any help would be appreciated.

Thanks.

XML tag starting with a colon

A partial fix was added to #93 for handling XML names starting with a colon.

@som-snytt raises the issue that colons may not be fully corrected.

Here is an empty element:

scala> val x = <:/>
<console>:10: error: not found: value <:/>
       val x = <:/>
               ^

Here is a name with a letter, but starting with a colon:

scala> val a = <:a/>
<console>:1: error: illegal start of simple expression
val a = <:a/>
        ^

Trying with a string doesn't work much better

scala> val a = "<:a/>"
a: String = <:a/>

scala> val x = scala.xml.XML.loadString(a)
java.lang.IllegalArgumentException: prefix of zero length, use null instead
  at scala.xml.Elem.<init>(Elem.scala:102)
  at scala.xml.Elem$.apply(Elem.scala:34)
  at scala.xml.parsing.NoBindingFactoryAdapter.createNode(NoBindingFactoryAdapter.scala:30)
  at scala.xml.parsing.NoBindingFactoryAdapter.createNode(NoBindingFactoryAdapter.scala:19)
  at scala.xml.parsing.FactoryAdapter.endElement(FactoryAdapter.scala:182)
  at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.endElement(AbstractSAXParser.java:609)
  at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:183)
  at com.sun.org.apache.xerces.internal.impl.dtd.XMLDTDValidator.emptyElement(XMLDTDValidator.java:766)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1343)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$ContentDriver.scanRootElementHook(XMLDocumentScannerImpl.java:1292)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:3138)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:880)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
  at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
  at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
  at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
  at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
  at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
  at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:649)
  at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl.parse(SAXParserImpl.java:333)
  at scala.xml.factory.XMLLoader$class.loadXML(XMLLoader.scala:41)
  at scala.xml.XML$.loadXML(XML.scala:60)
  at scala.xml.factory.XMLLoader$class.loadString(XMLLoader.scala:60)
  at scala.xml.XML$.loadString(XML.scala:60)
  ... 43 elided

NamespaceBinding.buildString improper handling of TopScope

Starting with Scala 2.11, NamespaceBinding.buildString uses the private method doBuildString to generate the namespace part. The problem is that this method is not overridden in the TopScope subclass. This produces unnecessary artifacts in the form of an empty xmlns="" when constructing XML:

  def pretty(xml: Node) = new scala.xml.PrettyPrinter(200, 2).format(xml)
  val xx = <soapenv:Envelope xmlns:soapenv="aaa">
    <soapenv:Header xmlns:wsa="ssss">
    </soapenv:Header>
    <soapenv:Body><body xmlns:soapenv="aaa"></body></soapenv:Body>
  </soapenv:Envelope>
  println(pretty(xx))

Results in:

<soapenv:Envelope xmlns:soapenv="aaa" xmlns="">
  <soapenv:Header xmlns:wsa="ssss"> </soapenv:Header>
  <soapenv:Body>
    <body xmlns:soapenv="aaa"></body>
  </soapenv:Body>
</soapenv:Envelope>

In this example the problem can be fixed by passing pscope = TopScope to PrettyPrinter, but that does not help if the XML is built from multiple parts.

def pretty(xml: Node) = new scala.xml.PrettyPrinter(200, 2).format(xml, TopScope)
val body = <body xmlns:soapenv="aaa"></body>
val xx = <soapenv:Envelope xmlns:soapenv="aaa">
    <soapenv:Header xmlns:wsa="ssss">
    </soapenv:Header>
    <soapenv:Body>{body}</soapenv:Body>
    <soapenv:Body><body xmlns:soapenv="aaa"></body></soapenv:Body>
  </soapenv:Envelope>
println(pretty(xx))

Results in:

<soapenv:Envelope xmlns:soapenv="aaa">
  <soapenv:Header xmlns:wsa="ssss"> </soapenv:Header>
  <soapenv:Body>
    <body xmlns:soapenv="aaa" xmlns=""></body>
  </soapenv:Body>
  <soapenv:Body>
    <body xmlns:soapenv="aaa"></body>
  </soapenv:Body>
</soapenv:Envelope>

Suggested solution:

override def buildString(sb: StringBuilder, stop: NamespaceBinding) {
  NamespaceBinding.doBuildString(shadowRedefined(stop), sb, stop)
}


object NamespaceBinding{
  def doBuildString(b: NamespaceBinding, sb: StringBuilder, stop: NamespaceBinding) {
    if ((b == null) || (b eq stop)) return // check the binding passed in, not the companion object itself

    sb append " xmlns%s=\"%s\"".format(
      (if (b.prefix != null) ":" + b.prefix else ""),
      (if (b.uri != null) b.uri else "")
    )
    b.parent match {
      case binding: NamespaceBinding => doBuildString(binding, sb, stop)
      case TopScope =>
    }
  }
}

Alternatively, change the visibility of doBuildString and override it in TopScope, as is already done for the buildString methods.

XML ConstructingParser too aggressive trimming whitespace around character references

This issue migrated from https://issues.scala-lang.org/browse/SI-3527.

Original description of the issue:

when preserveWS = false. example:

Welcome to Scala version 2.8.0.RC3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_20).

scala> xml.parsing.ConstructingParser.fromSource(io.Source.fromString("<x>a &amp; b</x>"), preserveWS = false).document.text
res0: String = a&b

note that spaces on either side of the escaped ampersand have been lost.

I can't see why you would ever want this behavior. I'm not an XML lawyer, but section 4.4 of the XML 1.0 spec suggests that the character encoded by a character reference should be "retrieved and processed, in place of the reference itself, as though it were part of the document at the location the reference was recognized"

perhaps vaguely related: #73

Allow attributes beginning with '*'

scala> val x = <div ngIf="selectedHero"></div>
compiles, but
scala> val x = <div *ngIf="selectedHero"></div>
throws

<console>:1: error: in XML literal: '>' expected instead of '*'
val x = <div *ngIf="selectedHero"></div>
             ^

Angular 2 recommends preceding built-in directives with '*'.

OOM on malformed input

Welcome to Scala version 2.11.2 (OpenJDK 64-Bit Server VM, Java 1.7.0_65).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import scala.xml.pull._
import scala.xml.pull._

scala> import scala.io.Source
import scala.io.Source

scala> val r = new XMLEventReader(Source.fromString("<broken attribute='is truncated") )
[error] (XMLEventReader) java.lang.OutOfMemoryError: Java heap space

ch returns 0.asInstanceOf[Char] every time it is called, while MarkupParserCommon.xAttributeValue loops until ch returns endCh, happily appending all the zeros to its internal string buffer.
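The missing end-of-input check can be illustrated with a small stand-alone loop. This is not the MarkupParserCommon code: `AttributeValueSketch` and `readUntil` are hypothetical, and a plain Iterator[Char] stands in for the parser's input. The loop stops when the input runs out, instead of waiting forever for the closing quote.

```scala
object AttributeValueSketch {
  // Read characters up to (but not including) endCh.
  // Returns Left with a message if the input ends before endCh is seen.
  def readUntil(input: Iterator[Char], endCh: Char): Either[String, String] = {
    val sb = new StringBuilder
    var closed = false
    while (!closed && input.hasNext) {
      val ch = input.next()
      if (ch == endCh) closed = true else sb.append(ch)
    }
    if (closed) Right(sb.toString)
    else Left("unexpected end of input in attribute value")
  }
}
```

Whether a real parser should report this as an error event or throw is a separate design question; the sketch only shows that the loop needs a second exit condition.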
