net-a-porter / scala-uri
Simple scala library for building and parsing URIs
License: Other
If you pass a query parameter value with a + symbol it does not encode this to %2B.
I would assume this is because + is a legal 'punct' character (http://docs.oracle.com/javase/6/docs/api/java/net/URI.html) and can be used to encode spaces. You'll likely get the same behavior with other characters in those groups as well.
Since these are query parameter values, is it fair to assume all such reserved characters should be encoded?
My exact issue can be worked around by manually encoding the parameters myself, but this is a little unclean.
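The manual workaround mentioned above could look roughly like the following. This is a minimal sketch using only java.net.URLEncoder from the JDK, not scala-uri's own encoder; the object and method names are made up for illustration.

```scala
import java.net.URLEncoder

object PlusEncodingSketch {
  // Form-encodes a single query parameter value with the JDK encoder:
  // a literal '+' becomes %2B, while a space becomes '+'.
  def encodeValue(value: String): String =
    URLEncoder.encode(value, "UTF-8")
}
```

With this, PlusEncodingSketch.encodeValue("1+1 = 2") yields 1%2B1+%3D+2, so a literal plus can no longer be mistaken for an encoded space.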
Add support to allow filtering of query params with a function argument
uri.filterParams(_._1 == "myParam"); //Removes all query params except ones called myParam
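A rough sketch of the requested behaviour, assuming query parameters are held as a sequence of (name, value) pairs. The standalone filterParams helper below is hypothetical and only mirrors the method name proposed above.

```scala
object FilterParamsSketch {
  type Param = (String, String)

  // Keeps only the parameters for which the predicate returns true,
  // preserving their original order.
  def filterParams(params: Seq[Param])(keep: Param => Boolean): Seq[Param] =
    params.filter(keep)
}
```

For example, FilterParamsSketch.filterParams(Seq("myParam" -> "1", "other" -> "2"))(_._1 == "myParam") keeps only the myParam pair.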
The following test should pass:
"Parsing URIs" should "parse percent escapes" in {
  val source = com.github.theon.uri.Uri(
    Some("http"),
    Some("xn--ls8h.example.net"),
    None,
    List("", "path with spaces"),
    com.github.theon.uri.Querystring(Map("a b" → List("c d")))
  )
  val parsed = parseUri(source.toString)
  parsed should equal(source)
}
Trendy new D3 charts in the new scala meter: http://axel22.github.io/scalameter//2013/06/14/release_0_4_M2.html
If you have this string
coldplay.com?singer=chris%26will
and you parse it and then re-encode it, you get
coldplay.com?singer=chris%2526will
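The double encoding can be reproduced with the JDK alone. The sketch below uses java.net.URLEncoder and URLDecoder rather than scala-uri, and the helper names are invented for illustration: encoding an already-escaped value escapes the % sign again, whereas decoding first makes the round trip stable.

```scala
import java.net.{URLDecoder, URLEncoder}

object DoubleEncodingSketch {
  private val Utf8 = "UTF-8"

  // Encoding text that is already percent-encoded escapes the '%' itself.
  def encodeAgain(alreadyEncoded: String): String =
    URLEncoder.encode(alreadyEncoded, Utf8)

  // Decoding on the way in and encoding on the way out restores the input.
  def roundTrip(alreadyEncoded: String): String =
    URLEncoder.encode(URLDecoder.decode(alreadyEncoded, Utf8), Utf8)
}
```

Here encodeAgain("chris%26will") gives chris%2526will, while roundTrip("chris%26will") gives chris%26will.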
The reserved characters in the path are different from the reserved characters in the query string. Add some unit tests to check the situations outlined here are handled correctly:
Hello,
I am currently working with Scala 2.10.3 and SBT 0.13.1.
My project uses most of the Spray 1.3.1 dependencies.
When I add scala-uri to my project, I get an error in SBT: Conflicting cross-Version in suffixes: com.chuusai: shapeless
How can I solve this problem?
Thanks.
Always reproducible; just open a REPL and type:
val uri = "http://www.google.com" / "test" / "json" ? ("p" -> "test")
uri.toStringRaw // outputs: http://www.google.com/test//json?p=test
This only happens when using the DSL with more than one path component AND a query string.
A URI defined with the DSL can lead to surprising results:
scala> "http://host" / "path" / "to" / "resource" ? ("a" -> "1" ) & ("b" -> "2")
res1: com.netaporter.uri.Uri = http://host/path/to/%2Fresource%3Fa%3D1?b=2
As I understand it, this is because of Scala operator precedence (from lowest to highest): & then / then ?
and because the & operator is not defined for Tuple2, even with the DSL.
From there, two options:
scala> ("http://host" / "path" / "to" / "resource") ? ("a" -> "1" ) & ("b" -> "2")
res2: com.netaporter.uri.Uri = http://host/path/to/resource?a=1&b=2
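Scala derives the precedence of an infix operator from its first character: ? falls into the highest group ("all other special characters"), / sits just below it, and & is lower still. The toy Expr class below is not part of scala-uri; it is a hypothetical stand-in that only records how the compiler groups the operands.

```scala
object PrecedenceSketch {
  // Toy type whose operators just record the grouping chosen by the compiler.
  final case class Expr(text: String) {
    def /(other: Expr): Expr = Expr(s"($text / ${other.text})")
    def ?(other: Expr): Expr = Expr(s"($text ? ${other.text})")
    def &(other: Expr): Expr = Expr(s"($text & ${other.text})")
  }

  private val host  = Expr("host")
  private val path  = Expr("path")
  private val a     = Expr("a=1")
  private val b     = Expr("b=2")

  // '?' binds first, so the last path part swallows the query parameter.
  val ungrouped: String = (host / path ? a & b).text

  // Parenthesising the path restores the intended reading.
  val grouped: String = ((host / path) ? a & b).text
}
```

ungrouped evaluates to ((host / (path ? a=1)) & b=2) and grouped to (((host / path) ? a=1) & b=2), which matches the two REPL results above.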
Query parameters are currently Tuples (String,String). It would be nice if these were a case class, to get nicer field names: name and value vs _1 and _2.
Still allow Tuples to be passed into methods such as addParam and addParams.
Hello,
I got this exception.
At first I thought my URI was bad, so I tested with val url = "http://www.google.com" as the URI.
I'm using the latest git version because I need #58.
play.api.Application$$anon$1: Execution exception[[RuntimeException: java.lang.NoSuchMethodError: org.parboiled2.ParserInput$.apply(Ljava/lang/String;)Lorg/parboiled2/ParserInput$StringBasedParser;]]
at play.api.Application$class.handleError(Application.scala:296) ~[play_2.11-2.3.0-RC2.jar:2.3.0-RC2]
at play.api.DefaultApplication.handleError(Application.scala:402) [play_2.11-2.3.0-RC2.jar:2.3.0-RC2]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$14$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:205) [play_2.11-2.3.0-RC2.jar:2.3.0-RC2]
at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$14$$anonfun$apply$1.applyOrElse(PlayDefaultUpstreamHandler.scala:202) [play_2.11-2.3.0-RC2.jar:2.3.0-RC2]
at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) [scala-library-2.11.1.jar:na]
Caused by: java.lang.RuntimeException: java.lang.NoSuchMethodError: org.parboiled2.ParserInput$.apply(Ljava/lang/String;)Lorg/parboiled2/ParserInput$StringBasedParser;
at play.api.mvc.ActionBuilder$$anon$1.apply(Action.scala:523) ~[play_2.11-2.3.0-RC2.jar:2.3.0-RC2]
at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130) ~[play_2.11-2.3.0-RC2.jar:2.3.0-RC2]
at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130) ~[play_2.11-2.3.0-RC2.jar:2.3.0-RC2]
at play.utils.Threads$.withContextClassLoader(Threads.scala:21) ~[play_2.11-2.3.0-RC2.jar:2.3.0-RC2]
at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4.apply(Action.scala:129) ~[play_2.11-2.3.0-RC2.jar:2.3.0-RC2]
Caused by: java.lang.NoSuchMethodError: org.parboiled2.ParserInput$.apply(Ljava/lang/String;)Lorg/parboiled2/ParserInput$StringBasedParser;
at com.netaporter.uri.parsing.UriParser$.parse(UriParser.scala:167) ~[scala-uri_2.10-0.4.1.jar:0.4.1]
at com.netaporter.uri.Uri$.parse(Uri.scala:271) ~[scala-uri_2.10-0.4.1.jar:0.4.1]
at com.netaporter.uri.dsl.package$.stringToUri(package.scala:18) ~[scala-uri_2.10-0.4.1.jar:0.4.1]
at com.netaporter.uri.dsl.package$.stringToUriDsl(package.scala:19) ~[scala-uri_2.10-0.4.1.jar:0.4.1]
This doesn't work, but should with the default PercentEncoder.
def blah: String = {
  "blah" ? ("blah" -> "blah")
}
Instead you receive the error:
could not find implicit value for parameter e: com.github.theon.uri.UriEncoder
Current workaround is:
def blah: String = {
  implicit val e = PercentEncoder
  "blah" ? ("blah" -> "blah")
}
Converting a Uri to a String drops the port number.
val uri: Uri = "http://localhost:3620/example"
println(uri)
println(uri.toString)
outputs
Uri(Some(http),Some(localhost),Some(3620),List(, example),Querystring(Map()))
http://localhost/example
So parsing the String into a Uri extracts the port correctly, which means that building the String for output must be dropping it.
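For comparison, here is a minimal sketch of a renderer that keeps the port. SimpleUri is a hypothetical, simplified model made up for this example and is not the library's Uri class.

```scala
object RenderPortSketch {
  // Hypothetical, simplified URI model with an optional port.
  final case class SimpleUri(
      scheme: Option[String],
      host: Option[String],
      port: Option[Int],
      path: String
  ) {
    // Emits ":port" only when a port is present.
    def render: String = {
      val schemePart = scheme.map(_ + "://").getOrElse("")
      val hostPart   = host.getOrElse("")
      val portPart   = port.map(p => ":" + p).getOrElse("")
      schemePart + hostPart + portPart + path
    }
  }
}
```

Rendering SimpleUri(Some("http"), Some("localhost"), Some(3620), "/example") gives http://localhost:3620/example.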
When I add this to my Play 2.2.1 app I get the following error on build:
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: UNRESOLVED DEPENDENCIES ::
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: com.typesafe.sbt#sbt-pgp;0.8.1: not found
[warn] :: com.github.scct#scct_2.10;0.3-SNAPSHOT: not found
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn]
[warn] Note: Some unresolved dependencies have extra attributes. Check that these dependencies exist with the requested attributes.
[warn] com.typesafe.sbt:sbt-pgp:0.8.1 (sbtVersion=0.13, scalaVersion=2.10)
[warn]
[trace] Stack trace suppressed: run last *:update for the full output.
[error] (*:update) sbt.ResolveException: unresolved dependency: com.typesafe.sbt#sbt-pgp;0.8.1: not found
[error] unresolved dependency: com.github.scct#scct_2.10;0.3-SNAPSHOT: not found
I'm guessing this will probably end up being an issue with the coveralls plugin, but this is where I noticed it.
If you execute the publishM2 sbt task, the generated Maven POM file contains the dependency:
<dependency>
<groupId>com.sqality.scct</groupId>
<artifactId>scct_2.10</artifactId>
<version>0.2.2</version>
</dependency>
I'm pretty sure the code coverage library shouldn't end up being in the compile (default) scope.
The parsing unit tests seem to fail randomly against 2.9.2 (note: parsing never fails on 2.10). I believe it to be caused by this bug in 2.9.2 with parser combinators: https://issues.scala-lang.org/browse/SI-4929
Example failure: https://travis-ci.org/theon/scala-uri/jobs/4393715
According to RFC 3986, parseUri("abc").protocol should return None. It is a relative reference, a path-noscheme.
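The JDK's java.net.URI behaves this way and can serve as a reference point. The sketch below wraps its nullable getScheme in an Option; the helper name is invented for illustration.

```scala
import java.net.URI

object RelativeReferenceSketch {
  // java.net.URI returns null for a missing scheme; Option turns that into None.
  def scheme(input: String): Option[String] =
    Option(new URI(input).getScheme)
}
```

RelativeReferenceSketch.scheme("abc") is None, while scheme("http://example.com") is Some("http").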
UriParser.scala is currently implemented with core Scala parser combinators. I'd like to replace this with a parboiled implementation as:
When parsing a JDBC URL:
Uri.parse("jdbc:mysql://localhost:3306/some_db")
it is parsed as a relative URL. This is because in UrlParser, the _scheme rule expects the scheme part to be alphanumeric.
Ideally, you could relax the parser a bit to allow for colons, thus supporting JDBC URLs as well.
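Until the parser is relaxed, one possible workaround is to lean on java.net.URI, which treats jdbc:... as an opaque URI whose scheme-specific part can be parsed a second time. This is only a sketch; JdbcUrl and parse are hypothetical names, not scala-uri API.

```scala
import java.net.URI

object JdbcUrlSketch {
  // Hypothetical result type for illustration.
  final case class JdbcUrl(subprotocol: String, host: String, port: Int, database: String)

  // Parses "jdbc:<subprotocol>://host:port/db" in two passes:
  // the outer URI is opaque, the scheme-specific part is hierarchical.
  def parse(url: String): Option[JdbcUrl] = {
    val outer = new URI(url)
    for {
      _    <- Option(outer.getScheme).filter(_ == "jdbc")
      inner = new URI(outer.getSchemeSpecificPart)
      sub  <- Option(inner.getScheme)
      host <- Option(inner.getHost)
      path <- Option(inner.getPath)
    } yield JdbcUrl(sub, host, inner.getPort, path.stripPrefix("/"))
  }
}
```

For the URL above this yields Some(JdbcUrl("mysql", "localhost", 3306, "some_db")); inputs without the jdbc scheme give None.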
When parsing the uri fragment, it seems the fragment part is not correctly parsed:
In particular, this line is probably missing ~> extractFragment
Maybe this is intended behavior, but I suspect not. The following code throws an error because %yum is not a proper escape sequence. It might be better if it just ignored improper escape sequences, or did something else. At the very least, the parts of the query or path that are properly encoded should be available. You might try using java.net.URLDecoder.
import com.github.theon.uri._
import com.github.theon.uri.Uri._
val url = """http://lesswrong.com/index.php?query=abc%yum&john=hello"""
url.query.params
Gives the error:
java.lang.NumberFormatException: For input string: "yu"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Integer.parseInt(Integer.java:481)
at com.github.theon.uri.PercentDecoder$$anonfun$1.apply(PercentDecoder.scala:20)
at com.github.theon.uri.PercentDecoder$$anonfun$1.apply(PercentDecoder.scala:19)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:38)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:233)
at scala.collection.mutable.ArrayOps.map(ArrayOps.scala:38)
at com.github.theon.uri.PercentDecoder$.decodeString(PercentDecoder.scala:19)
at com.github.theon.uri.PercentDecoder$$anonfun$decode$2$$anonfun$apply$1.apply(PercentDecoder.scala:13)
at com.github.theon.uri.PercentDecoder$$anonfun$decode$2$$anonfun$apply$1.apply(PercentDecoder.scala:13)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
at scala.collection.immutable.List.foreach(List.scala:76)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:233)
at scala.collection.immutable.List.map(List.scala:76)
at com.github.theon.uri.PercentDecoder$$anonfun$decode$2.apply(PercentDecoder.scala:13)
at com.github.theon.uri.PercentDecoder$$anonfun$decode$2.apply(PercentDecoder.scala:13)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:233)
at scala.collection.immutable.Map$Map2.foreach(Map.scala:140)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:233)
at scala.collection.immutable.Map$Map2.map(Map.scala:123)
at com.github.theon.uri.PercentDecoder$.decode(PercentDecoder.scala:13)
at com.github.theon.uri.UriParser$.parse(UriParser.scala:47)
at com.github.theon.uri.Uri$.parseUri(Uri.scala:247)
at com.github.theon.uri.Uri$.stringToUri(Uri.scala:242)
Version 0.3.4
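One way to ignore improper escape sequences is to decode only a % that is followed by two hex digits and pass everything else through untouched. The decoder below is a sketch of that idea, not the library's PercentDecoder; the object name is made up.

```scala
import java.io.ByteArrayOutputStream
import java.nio.charset.StandardCharsets.UTF_8

object LenientDecoderSketch {
  // Returns 0-15 for an ASCII hex digit, -1 otherwise.
  private def hexValue(b: Byte): Int = Character.digit(b.toChar, 16)

  // Decodes valid %XX escapes and copies malformed ones through unchanged.
  def decode(s: String): String = {
    val in  = s.getBytes(UTF_8)
    val out = new ByteArrayOutputStream(in.length)
    var i = 0
    while (i < in.length) {
      val validEscape =
        in(i) == '%'.toByte && i + 2 < in.length &&
          hexValue(in(i + 1)) >= 0 && hexValue(in(i + 2)) >= 0
      if (validEscape) {
        out.write(hexValue(in(i + 1)) * 16 + hexValue(in(i + 2)))
        i += 3
      } else {
        out.write(in(i).toInt)
        i += 1
      }
    }
    // Bytes are collected first so multi-byte UTF-8 sequences decode correctly.
    new String(out.toByteArray, UTF_8)
  }
}
```

With this approach decode("abc%yum") returns abc%yum unchanged, while decode("john%20hello") returns john hello.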
This is a valid URI, but scala-uri fails to parse it:
http://localhost:8080/ping?oi=TscV16GUGtlU&ppc=&bpc=
I run into this because a partner of ours replaces macro values such as ${MY_VALUE} with an empty string if the value is not available.
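A query string parser that tolerates empty values can be sketched with plain string splitting. The code below is an illustration under that assumption and is unrelated to the library's actual grammar; parseQuery is a hypothetical helper.

```scala
object EmptyValueQuerySketch {
  // Splits "k=v&k2=&k3" into pairs; a missing or empty value becomes "".
  def parseQuery(query: String): Seq[(String, String)] =
    query.split("&").toList.map { pair =>
      // A limit of 2 keeps the trailing empty string in "ppc=".
      pair.split("=", 2) match {
        case Array(key, value) => key -> value
        case _                 => pair -> ""
      }
    }
}
```

Applied to oi=TscV16GUGtlU&ppc=&bpc= this returns the three pairs (oi, TscV16GUGtlU), (ppc, "") and (bpc, "").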
Work on this when parboiled2 supports scala 2.11
Hello,
This URL works with java.net.URI, but there is a parsing error with net-a-porter scala-uri:
http://localhost:9002/iefjiefjief-efefeffe-fefefee/toto?access_token=ijifjijef-fekieifj-fefoejfoef&gquery=filter(time_before_closing%3C=45)
On the frontpage it says under https://github.com/theon/scala-uri#url-percent-decoding:
By Default, scala-uri will URL percent decode paths and query string parameters during parsing
Followed by an example. The output of the given example doesn't match what I'm seeing when using the library. Taking the exact same example:
val uri: Uri = "http://example.com/i-have-%25been%25-percent-encoded"
println(uri.toString())
I obtain this result instead:
http://example.com/i-have-%2525been%2525-percent-encoded
In other words, the URL, although already encoded, was re-encoded. I assume this is a bug?
This is with 0.3.5 and Scala 2.9.2. I also tested with 0.4-SNAPSHOT and it seems to have the same behavior, although it's hard to say since toString seems to work differently now and no longer hands out the plain URL?
I get some weird error about module not found: org.scala-sbt#sbt;0.13.0-RC2 in IntelliJ when using 4.0.
Might just be my setup but thought I'd flag it.
Add support for url fragments: http://en.wikipedia.org/wiki/Fragment_identifier
Calling uri.toString() (with parentheses) renders:
Uri(None,None,None,List(, blah),Querystring(Map(blah -> List(blah))))
Whereas calling uri.toString (without parentheses) renders correctly:
/blah?blah=blah
Although the { and } characters are not illegal in URIs according to the spec, the Java URI class will throw an error if they are encountered.
java.net.URI.create("{}")
Cause: java.net.URISyntaxException: Illegal character in path at index 0: {}
There is no harm in encoding them and it's likely users of this code will also interact with Java URIs, therefore I think they should be added to the encoded character list.
Thanks,
Chris
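The behaviour described above can be checked with the JDK alone. In the sketch below, encodeBraces is a hypothetical helper showing what adding the two characters to the encoded list would achieve; URI.create wraps the URISyntaxException in an IllegalArgumentException, which Try captures.

```scala
import java.net.URI
import scala.util.Try

object BraceEncodingSketch {
  // Percent-encodes only the two brace characters.
  def encodeBraces(s: String): String =
    s.replace("{", "%7B").replace("}", "%7D")

  // True when java.net.URI accepts the input.
  def parses(s: String): Boolean =
    Try(URI.create(s)).isSuccess
}
```

parses("{}") is false, whereas parses(encodeBraces("{}")) is true, and URI.create("%7B%7D").getPath decodes back to {}.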
Based on http://tools.ietf.org/html/rfc3986#section-3.3
//pchar = unres | pct-enc | subdelim | : | @
//unres = ALPHA | DIGIT | - | . | _ | ~
//subdelim = ! | $ | & | ' | ( | ) | * | + | , | ; | =
"URI path pchars" should "not be encoded by default" in {
  val uri: Uri = "http://example.com/-._~!$&'()*+,;=:@/test"
  uri.toString should equal("http://example.com/-._~!$&'()*+,;=:@/test")
}
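A path segment encoder that follows the pchar grammar quoted above might look like this. It is a sketch written against RFC 3986 section 3.3, not the library's encoder, and the names are invented.

```scala
import java.nio.charset.StandardCharsets.UTF_8

object PcharEncodingSketch {
  private val unreserved: Set[Char] =
    (('a' to 'z') ++ ('A' to 'Z') ++ ('0' to '9')).toSet ++ "-._~".toSet
  private val subDelims: Set[Char] = "!$&'()*+,;=".toSet
  // pchar without pct-encoded, which is handled by the escaping branch below.
  private val pchar: Set[Char] = unreserved ++ subDelims ++ ":@".toSet

  // Leaves pchars as they are and percent-encodes every other UTF-8 byte.
  def encodeSegment(segment: String): String =
    segment.getBytes(UTF_8).map { b =>
      val unsigned = b & 0xff
      val c = unsigned.toChar
      if (pchar(c)) c.toString else f"%%$unsigned%02X"
    }.mkString
}
```

encodeSegment("-._~!$&'()*+,;=:@") returns its input unchanged, while encodeSegment("a b") returns a%20b.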
When parsing a URI such as http://localhost:9000/?foo=test&&bar=test, scala-uri throws a parse error. While the URI is technically invalid, I think this is a bit strict, and the parser should be able to normalize these kinds of errors. Lots of websites make this mistake, so as a parser we should be able to handle it.
I'm going to look into the source and see if there is a way I can fix it, but I'm not sure which approach you would like me to take.
Let me know what you think.
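One possible approach is to normalize the query string before parsing by dropping empty segments. The helper below is a sketch of that pre-processing step and is not taken from the library.

```scala
object DuplicateSeparatorSketch {
  // Collapses repeated, leading and trailing '&' separators.
  def normalizeQuery(query: String): String =
    query.split("&").filter(_.nonEmpty).mkString("&")
}
```

normalizeQuery("foo=test&&bar=test") returns foo=test&bar=test.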
Struck-through items have been completed:
Seq[String] with underlying Vector[String] rather than List[String]
Vector[(String,String)] rather than a Map[String,String] so that query string param ordering can be maintained. Maybe have a lazy val Map[String,String] for quick lookups?
Uri into a separate namespace
Currently you can only define one encoding strategy for the entire URL. However, as pointed out here, someone might want spaces as %20 in the path, but as + in the query string.
Allow different encoding to be specified for the path and query string.
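A minimal sketch of separate encoders for the path and the query string, built on java.net.URLEncoder. Note that URLEncoder is stricter than RFC 3986 for paths (it also escapes characters such as ~ and !), so this only illustrates the space handling; all names are hypothetical.

```scala
import java.net.URLEncoder

object SplitEncodingSketch {
  // Query parts use form encoding, where a space becomes '+'.
  def encodeQueryPart(s: String): String =
    URLEncoder.encode(s, "UTF-8")

  // Path parts turn that '+' into %20; a literal '+' was already escaped as %2B.
  def encodePathPart(s: String): String =
    URLEncoder.encode(s, "UTF-8").replace("+", "%20")

  def render(pathParts: Seq[String], params: Seq[(String, String)]): String = {
    val path = pathParts.map(encodePathPart).mkString("/", "/", "")
    val query = params
      .map { case (k, v) => encodeQueryPart(k) + "=" + encodeQueryPart(v) }
      .mkString("&")
    if (query.isEmpty) path else path + "?" + query
  }
}
```

render(Seq("my docs"), Seq("q" -> "a b")) produces /my%20docs?q=a+b.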
http://www.w3.org/DesignIssues/MatrixURIs.html
Seq[(String,String)]. (Query string params will also be changing to a Seq[(String,String)] per #29.)
Not an issue, but how are you getting code coverage values? Are you using JaCoCo or Cobertura or something else?
It's easy right now, but it might be nice to parse out the domain structure of the host.
"en.wikipedia.org" -> Seq("org", "wikipedia", "en") or Seq("en", "wikipedia", "org")
I'm sure this is possible now, but ensure it is easy, add it to the documentation, and add a unit test.
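Both orderings are a one-liner on a plain host string. The helper names below are made up for illustration and are not part of the library.

```scala
object HostPartsSketch {
  // "en.wikipedia.org" -> List("en", "wikipedia", "org")
  def parts(host: String): List[String] =
    host.split('.').toList

  // "en.wikipedia.org" -> List("org", "wikipedia", "en")
  def partsFromTld(host: String): List[String] =
    parts(host).reverse
}
```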
Have a configuration option to enable Matrix Parameters. This will remove any parsing overhead for those who don't need it.
For example, Uri.parse("http://test.net/##") will yield an org.parboiled2.ParseError. This is mildly confusing, as (first) it's not immediately clear what the exception is related to, but also because it leaves one wondering whether there's a bug in scala-uri or whether an illegitimate URI has genuinely been rejected.
Perhaps a custom exception class extending java.net.URISyntaxException would be the way to go?
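The suggestion could be sketched as follows. UriParseException and parseWith are hypothetical names, and the parser argument stands in for whatever underlying parser is used; only java.net.URISyntaxException is a real JDK class here.

```scala
import java.net.URISyntaxException
import scala.util.{Failure, Success, Try}

object UriParseExceptionSketch {
  // Hypothetical exception type that keeps the offending input and the original cause.
  final class UriParseException(input: String, reason: String, cause: Throwable)
      extends URISyntaxException(input, reason) {
    initCause(cause)
  }

  // Runs any parser function and rethrows its failures as UriParseException.
  def parseWith[A](input: String)(parser: String => A): A =
    Try(parser(input)) match {
      case Success(parsed) => parsed
      case Failure(error)  => throw new UriParseException(input, "Invalid URI", error)
    }
}
```

Callers can then catch URISyntaxException and read getInput and getCause, without needing to know which parser library sits underneath.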
You may or may not want to do this:
"Query parameters" should "use application/x-www-form-urlencoded serialization by default" in {
  val uri = "http://example.com/" ?
    ("safe" -> "*-.0-9A-Za-z_") &
    ("bad" -> "\u0021\u0029\u002B\u002C\u002F\u003A\u0040\u005B\u0060\u007B") &
    ("control" -> "\u0019\u007F") &
    ("space" -> " ")
  uri.toString should equal("http://example.com/?safe=*-.0-9A-Za-z_&bad=%21%29%2B%2C%2F%3A%40%5B%60%7B&control=%19%7F&space=+")
}
This is following http://url.spec.whatwg.org/#urlencoded-serializing since http://tools.ietf.org/html/rfc3986#section-3.4 doesn't really comment on how to serialize the key-value pairs. (It just states that pchars, /, and ? are all valid chars for a URI's query.)
I think it's reasonable to have the space encode as %20 instead of a + by default, since you could always mix in the spaceAsPlus encoder, so I separated that out. I also separated the control chars since they don't currently show up in the final output.
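For reference, the JDK's java.net.URLEncoder already implements application/x-www-form-urlencoded serialization and produces the query string expected by the test above. The serialize helper is a sketch for comparison only, not a proposal for the library's implementation.

```scala
import java.net.URLEncoder

object FormUrlEncodedSketch {
  private def enc(s: String): String = URLEncoder.encode(s, "UTF-8")

  // Serializes key-value pairs in order, form-encoding both sides.
  def serialize(params: Seq[(String, String)]): String =
    params.map { case (k, v) => enc(k) + "=" + enc(v) }.mkString("&")
}
```

The safe characters * - . _ and alphanumerics pass through untouched, the control characters become %19 and %7F, and the space becomes +.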
When I add scala-uri to my build.sbt I get this:
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: UNRESOLVED DEPENDENCIES ::
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] :: org.scala-sbt#sbt;0.13.0-RC2: not found
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
I added this dep to my build.sbt:
"com.netaporter" % "scala-uri" % "0.4.0",
I guess we have to bump the sbt version?
I get this error when I upgrade to 0.4.2. How do I fix it?
[error] Modules were resolved with conflicting cross-version suffixes in {file:/xxxxxxx:
[error] org.scalamacros:quasiquotes _2.10, _2.10.3
[trace] Stack trace suppressed: run last projectx/*:update for the full output.
error Conflicting cross-version suffixes in: org.scalamacros:quasiquotes
The percent encoding examples listed in the readme are confusing:
import com.github.theon.uri.Uri._
val uri: Uri = "http://example.com/i-have-%25been%25-percent-encoded"
uri.toString //This is: http://example.com/i-have-%25been%25-percent-encoded
uri.toStringRaw //This is: http://example.com/i-have-%been%-percent-encoded
However, when I run this example, I get:
scala> val uri: Uri = "http://example.com/i-have-%25been%25-percent-encoded"
uri: com.github.theon.uri.Uri = http://example.com/i-have-%2525been%2525-percent-encoded
scala>
scala> uri.toString //This is: http://example.com/i-have-%25been%25-percent-encoded
res12: String = http://example.com/i-have-%2525been%2525-percent-encoded
scala>
scala> uri.toStringRaw //This is: http://example.com/i-have-%been%-percent-encoded
res13: String = http://example.com/i-have-%25been%25-percent-encoded
Also, I think this is the opposite of the behavior I would typically want: if I am going from a string to a Uri, I will typically want to decode rather than encode.
"Percent decoding" should "decode 2-byte groups" in {
  val uri = Uri.parse("http://example.com/%C2%A2?cents_sign=%C2%A2")
  uri.toStringRaw should equal("http://example.com/¢?cents_sign=¢")
}
"Percent decoding" should "decode 3-byte groups" in {
  val uri = Uri.parse("http://example.com/%E2%82%AC?euro_sign=%E2%82%AC")
  uri.toStringRaw should equal("http://example.com/€?euro_sign=€")
}
"Percent decoding" should "decode 4-byte groups" in {
  val uri = Uri.parse("http://example.com/%F0%9F%82%A0?ace_of_spades=%F0%9F%82%A1")
  uri.toStringRaw should equal("http://example.com/\uD83C\uDCA0?ace_of_spades=\uD83C\uDCA1")
}
My actual test code has the real playing card and ace of spades characters in it, but GH doesn't allow posting an issue with unicode characters above 0xffff, so I subbed in \uD83C\uDCA0 and \uD83C\uDCA1.
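The expected values in these tests can be cross-checked against java.net.URLDecoder, which gathers consecutive escapes into one byte sequence before decoding it as UTF-8. This is only a reference sketch; note that URLDecoder also turns + into a space, which a path decoder would not want.

```scala
import java.net.URLDecoder

object MultiByteDecodingSketch {
  // Decodes runs of %XX escapes as a single UTF-8 byte sequence.
  def decode(s: String): String = URLDecoder.decode(s, "UTF-8")

  // Counts Unicode code points rather than UTF-16 chars.
  def codePoints(s: String): Int = s.codePointCount(0, s.length)
}
```

decode("%F0%9F%82%A1") yields the surrogate pair \uD83C\uDCA1, which is two chars but a single code point.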