Giter Site home page Giter Site logo

Comments (11)

dakrone avatar dakrone commented on August 19, 2024

I looked into this, I think it's Jackson actually consuming the rest of the stream.

I checked that the StringReader had been consumed by calling (.read string-reader) and verifying it was still -1 after parsing, even with trailing junk.

I also double-checked with another project that uses Jackson, and it ignores trailing junk also.

Is the trailing data causing problems for you?

from cheshire.

cespare avatar cespare commented on August 19, 2024

I looked into this, I think it's Jackson actually consuming the rest of the stream.

I checked that the StringReader had been consumed by calling (.read string-reader) and verifying it was still -1 after parsing, even with trailing junk.

Indeed. This solution was a guess, and it wasn't the right approach. I looked into it more closely and I'm pretty sure the answer is that you need to check that (.nextToken jp) is nil at the end of parse/parse. Here's a REPL session to show what I mean.

user=> (def jp (.createJsonParser (JsonFactory.) (StringReader. "{\"foo\": 123}")))
#'user/jp
user=> (.nextToken jp)
#<JsonToken START_OBJECT>
user=> (.nextToken jp)
#<JsonToken FIELD_NAME>
user=> (.nextToken jp)
#<JsonToken VALUE_NUMBER_INT>
user=> (.nextToken jp)
#<JsonToken END_OBJECT>
user=> (.nextToken jp)
nil

user=> (def jp (.createJsonParser (JsonFactory.) (StringReader. "{\"foo\": 123}null")))
#'user/jp
user=> (.nextToken jp)
#<JsonToken START_OBJECT>
user=> (.nextToken jp)
#<JsonToken FIELD_NAME>
user=> (.nextToken jp)
#<JsonToken VALUE_NUMBER_INT>
user=> (.nextToken jp)
#<JsonToken END_OBJECT>
user=> (.nextToken jp)
#<JsonToken VALUE_NULL>

user=> (def jp (.createJsonParser (JsonFactory.) (StringReader. "{\"foo\": 123}junk")))
#'user/jp
user=> (.nextToken jp)
#<JsonToken START_OBJECT>
user=> (.nextToken jp)
#<JsonToken FIELD_NAME>
user=> (.nextToken jp)
#<JsonToken VALUE_NUMBER_INT>
user=> (.nextToken jp)
#<JsonToken END_OBJECT>
user=> (.nextToken jp)

JsonParseException Unrecognized token 'junk': was expecting ('true', 'false' or 'null')
 at [Source: java.io.StringReader@58a606e3; line: 1, column: 33]  com.fasterxml.jackson.core.JsonParser._constructError (JsonParser.java:1524)

I also double-checked with another project that uses Jackson, and it ignores trailing junk also.

Yeah, as I mentioned I saw that data.json and clj-json have this same behavior as well. It's easy for this to happen when building a batch parsing function from a streaming parser.

Is the trailing data causing problems for you?

Yes, we parse a very large amount of JSON data and we see all manner of corruption.

from cheshire.

cespare avatar cespare commented on August 19, 2024

Any updates?

from cheshire.

dakrone avatar dakrone commented on August 19, 2024

@cespare I could add a wrapper around the regular parsing functions that would verify the nil on .nextToken after parsing has completed, then it could optionally be used (to not incur the overhead for everyone, just people who need it)

Does that sound like it would work for you?

from cheshire.

cespare avatar cespare commented on August 19, 2024

Sure, that would let us fix this problem.

This should not be a wrapper, it should be the default behavior. It's misleading that parse-string only looks for a valid JSON prefix in the given string. The extra cost is very small compared with doing JSON parsing in the first place, and well worth avoiding the confusion.

If the current behavior is somehow useful, then perhaps the function should be named parse-json-prefix. More likely to be useful would be an API for reading successive JSON values from a reader (the current parse-string cannot do that because the StringReader is just thrown away after the first value is read).

from cheshire.

philc avatar philc commented on August 19, 2024

Any update on this? Would love to see a fix.

from cheshire.

Satshabad avatar Satshabad commented on August 19, 2024

Also would love to see a fix

from cheshire.

mkwatson avatar mkwatson commented on August 19, 2024

Me too! This was very surprising!

from cheshire.

overthink avatar overthink commented on August 19, 2024

+1 Just bit me also.

from cheshire.

dakrone avatar dakrone commented on August 19, 2024

Closed by #122

from cheshire.

cespare avatar cespare commented on August 19, 2024

I still think it's misleading that the default parse-string function accepts a string that's invalid JSON, but at least there's a workaround for people after they're bitten by this :|

from cheshire.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.