Comments (6)
I would love this feature. :)
If it doesn't cause a performance hit, it should definitely be the default; if it does, adding a new lazy parser is also a viable option.
Added you to the repo; now you're on the hook, Zach :)
from cheshire.
It's worth noting that this should be possible (though harder) for maps as well.
Example use case: by default couchdb returns data structured like so:
{"some":"resultset","meta":"data","rows":[
{...},
{...},
{...},
{...}
]}
And presumably requiring that maps be parsed eagerly means there's no straightforward way to consume these kinds of documents lazily.
I know there are a few (maybe only half-baked) implementations of a lazy map floating around.
An implementation of this can be found at 2a3b2bc, in the branch lazy-top-level-arrays. I'd appreciate a code review.
Interestingly, this appears to be faster than the existing approach:
cheshire.core> (def (generate-string (range 128)))
; Evaluation aborted.
cheshire.core> (def s (generate-string (range 128)))
#'cheshire.core/s
cheshire.core> (use 'criterium.core)
nil
cheshire.core> (quick-bench (dorun (parse-string s)))
WARNING: Final GC required 30.225372725730608 % of runtime
Evaluation count : 50562 in 6 samples of 8427 calls.
Execution time mean : 12.141253 µs
Execution time std-deviation : 108.685624 ns
Execution time lower quantile : 12.046142 µs ( 2.5%)
Execution time upper quantile : 12.319445 µs (97.5%)
Overhead used : 2.629471 ns
Found 1 outliers in 6 samples (16.6667 %)
low-severe 1 (16.6667 %)
Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
nil
;; disable lazy parsing here
cheshire.core> (quick-bench (dorun (parse-string s)))
WARNING: Final GC required 31.019576852051898 % of runtime
Evaluation count : 39966 in 6 samples of 6661 calls.
Execution time mean : 15.046137 µs
Execution time std-deviation : 300.837064 ns
Execution time lower quantile : 14.811363 µs ( 2.5%)
Execution time upper quantile : 15.488795 µs (97.5%)
Overhead used : 2.629471 ns
nil
Apparently chunked-seqs are more efficient to construct than transient vectors. I could see an argument for using chunked seqs everywhere, except for a few downsides:
- they're not indexed
- they're not counted
- they must be sized ahead of time, so small arrays will waste memory
I think the current approach (lazy-seqs at the top level, vectors everywhere else) is a pretty decent compromise, but I think that's worth discussing in more detail.
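To make the trade-offs listed above concrete, here is a minimal sketch in plain Clojure. It is not cheshire's parser code: the names `v`, `s`, and `cs` are made up for illustration, and the capacity of 32 is an arbitrary choice.

```clojure
;; Illustration only: plain clojure.core, not cheshire's parser code.
(def v [1 2 3])                   ; eager vector
(def s (lazy-seq (list 1 2 3)))   ; lazy seq

;; Vectors know their size and support lookup by index.
(counted? v)                        ; true
(instance? clojure.lang.Indexed v)  ; true

;; Lazy seqs do neither: count and nth have to walk the seq.
(counted? s)                        ; false
(instance? clojure.lang.Indexed s)  ; false

;; A chunk buffer is created with a fixed capacity up front (32 here),
;; so the backing array has 32 slots even though only one is used.
(def cs
  (let [buf (chunk-buffer 32)]
    (chunk-append buf :only-element)
    (chunk-cons (chunk buf) nil)))

(chunked-seq? cs) ; true
(count cs)        ; 1
```

The last form shows the memory concern: a one-element chunk still carries the full preallocated array.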
Anyway, let me know what you think.
This looks great. I added a long-running benchmark just for top-level array parsing.
Here are the benchmarks:
pre: http://p.sa2s.us/1369196973422f5b1db5f.txt
post: http://p.sa2s.us/13691969896500361ea03.txt
The code looked pretty good to me. I did remove an unused variable that was being let-bound by replacing it with a do, but other than that it looked good. I also noticed your =/identical? change and replaced = in a few other places in the parsing code. Good catch.
I agree that lazy-seqs at the top level and eager parsing everywhere else sounds like a good idea, and I imagine that's the most common use case.
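As a rough sketch of what lazy top-level parsing allows, the snippet below reads only the first few rows of a top-level array. It assumes a cheshire version that includes this change is on the classpath; `sample` and `first-ids` are hypothetical names, and the small string stands in for a much larger document.

```clojure
;; Sketch only: assumes cheshire with lazy top-level arrays is on the classpath.
(require '[cheshire.core :as json]
         '[clojure.java.io :as io])

;; Made-up input standing in for a large top-level JSON array.
(def sample "[{\"id\":1},{\"id\":2},{\"id\":3}]")

(defn first-ids
  "Returns the :id of the first n rows, realized before the reader closes."
  [n]
  (with-open [rdr (io/reader (java.io.StringReader. sample))]
    ;; Passing true as the second argument keywordizes the keys.
    (->> (json/parse-stream rdr true)
         (map :id)
         (take n)
         doall)))

(first-ids 2)
```

The `doall` matters here: the lazy seq has to be consumed inside `with-open`, because the parser can no longer read once the reader is closed.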
Let me know what you think, and thanks again for the help and contribution!
Oh, whoops, should have caught the let thing. Your changes all seem good, feel free to merge it in whenever you like.
Sweet, released 5.2.0 with this, thanks again!