Giter Site home page Giter Site logo

Comments (25)

yogwoggf avatar yogwoggf commented on July 30, 2024 6

If you want a fast implementation: https://github.com/Vurv78/qjson.lua
Runs thousands of times faster than this repo

Sorry, but that claim in that form simply isn't true (or at best highly misleading). On which inputs? Since this is a complexity issue, any linear-time implementation can be arbitrarily faster.

Also, your parser doesn't seem spec-compliant to me. I see no provisions for unicode escape sequences. It also seems to have the same complexity issue as this one: It doesn't properly use a rope for encoding.

Your isarray relies on undocumented behavior (pairs degrading to ipairs).

Formatting strings with %q does not always produce valid JSON.

TL;DR: Your JSON parser is broken and has a time complexity issue as well. Your performance claims in their current form can't be taken seriously.

Besides, my PR fixes the complexity issue.

There are benchmarks. It is better.

from json.lua.

Vurv78 avatar Vurv78 commented on July 30, 2024 4

If you want a fast implementation: https://github.com/Vurv78/qjson.lua

Runs thousands of times faster than this repo

LuaJIT 2.1.0-beta3
11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz

Running benchmarks...
| Name (Decode)   | Min        | Max        | Avg        | Avg / Best  |
| vurv78/qjson    | 0.008763   | 0.010837   | 0.00939495 | x1.72715    |
| rxi/json        | 0.00475    | 0.007055   | 0.00543957 | x1          |
| actboy168/json  | 0.010547   | 0.013259   | 0.0112183  | x2.06235    |
| luadist/dkjson  | 0.011222   | 0.014534   | 0.0126976  | x2.3343     |

| Name (Encode)   | Min        | Max        | Avg        | Avg / Best  |
| vurv78/qjson    | 0.001677   | 0.002637   | 0.00189174 | x1          |
| rxi/json        | 0.010513   | 0.011322   | 0.010924   | x5.77459    |
| actboy168/json  | 0.009892   | 0.012293   | 0.0104864  | x5.54327    |
| luadist/dkjson  | 0.014829   | 0.01985    | 0.0157059  | x8.30237    |
Lua 5.3
11th Gen Intel(R) Core(TM) i7-11800H @ 2.30GHz

Running benchmarks...
| Name (Decode)   | Min        | Max        | Avg        | Avg / Best  |
| luadist/dkjson  | 0.028568   | 0.033705   | 0.0306484  | x1.94652    |
| rxi/json        | 0.045178   | 0.053548   | 0.0480421  | x3.05121    |
| vurv78/qjson    | 0.015006   | 0.018043   | 0.0157452  | x1          |
| actboy168/json  | 0.019061   | 0.023373   | 0.0200551  | x1.27372    |

| Name (Encode)   | Min        | Max        | Avg        | Avg / Best  |
| luadist/dkjson  | 0.021639   | 0.024754   | 0.0226422  | x4.10556    |
| rxi/json        | 0.015463   | 0.019618   | 0.0166444  | x3.01802    |
| vurv78/qjson    | 0.005148   | 0.006336   | 0.00551502 | x1          |
| actboy168/json  | 0.016263   | 0.018331   | 0.0170535  | x3.09218    |

from json.lua.

appgurueu avatar appgurueu commented on July 30, 2024 3

If you want a fast implementation: https://github.com/Vurv78/qjson.lua

Runs thousands of times faster than this repo

Sorry, but that claim in that form simply isn't true (or at best highly misleading). On which inputs? Since this is a complexity issue, any linear-time implementation can be arbitrarily faster.

Also, your parser doesn't seem spec-compliant to me. I see no provisions for unicode escape sequences.
It also seems to have the same complexity issue as this one: It doesn't properly use a rope for encoding.

Your isarray relies on undocumented behavior (pairs degrading to ipairs).

Formatting strings with %q does not always produce valid JSON.

TL;DR: Your JSON parser is broken and has a time complexity issue as well. Your performance claims in their current form can't be taken seriously.

Besides, my PR fixes the complexity issue.

from json.lua.

yogwoggf avatar yogwoggf commented on July 30, 2024 3

There are benchmarks. It is better.

I don't see how you are replying to any of my points.

I did. All of your points are invalidated by the benchmarks. Theory is theory but it comes down to the actual performance.

from json.lua.

yogwoggf avatar yogwoggf commented on July 30, 2024 3

It's probably because you have a horribly slow CPU.

from json.lua.

appgurueu avatar appgurueu commented on July 30, 2024 1

Just to be sure, I ran the benchmarks of rxi/json.lua against qjson.lua. Here are the results:

[decode]
Lua version   : Lua 5.4
CPU name      : AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx
../json.lua   : 0.0773s [x2.21] (min: 0.0753s, max 0.095s)
../qjson.lua  : 0.035s [x1] (min: 0.0334s, max 0.0455s)
dkjson.lua    : 0.0785s [x2.24] (min: 0.0725s, max 0.1s)
jfjson.lua    : 0.104s [x2.98] (min: 0.0992s, max 0.134s)

[encode]
Lua version   : Lua 5.4
CPU name      : AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx
../json.lua   : 0.0287s [x1.63] (min: 0.0269s, max 0.0353s)
../qjson.lua  : 0.0176s [x1] (min: 0.016s, max 0.0232s)
dkjson.lua    : 0.0385s [x2.19] (min: 0.0357s, max 0.0485s)
jfjson.lua    : 0.0507s [x2.88] (min: 0.0498s, max 0.054s)
json4lua.lua  : 0.0476s [x2.7] (min: 0.0465s, max 0.0604s)

and here's LuaJIT

[decode]
Lua version   : LuaJIT 2.1.0-beta3
CPU name      : AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx
../json.lua   : 0.0134s [x1] (min: 0.0123s, max 0.0147s)
../qjson.lua  : 0.0216s [x1.61] (min: 0.0211s, max 0.0227s)
dkjson.lua    : 0.0286s [x2.13] (min: 0.0276s, max 0.0296s)
jfjson.lua    : 0.0191s [x1.43] (min: 0.0185s, max 0.0209s)

[encode]
Lua version   : LuaJIT 2.1.0-beta3
CPU name      : AMD Ryzen 5 3500U with Radeon Vega Mobile Gfx
../json.lua   : 0.0153s [x2.92] (min: 0.0148s, max 0.0161s)
../qjson.lua  : 0.00524s [x1] (min: 0.0047s, max 0.0056s)
dkjson.lua    : 0.0169s [x3.23] (min: 0.0163s, max 0.0182s)
jfjson.lua    : 0.017s [x3.23] (min: 0.0166s, max 0.0181s)
json4lua.lua  : 0.0229s [x4.37] (min: 0.0224s, max 0.0239s)

(oops, looks like qjson.lua isn't even always the fastest, despite omitting features and implementing JSON slightly incorrectly)

I don't know what you did wrong when benchmarking; qjson.lua is faster (about 2x), but definitely not a thousand or tens of thousands times for the given benchmarks.

Perhaps do some benchmarks before citing them?

from json.lua.

IsFaser avatar IsFaser commented on July 30, 2024

(Translation)
Using "concat" is a good idea. I think it can be improved. "table.insert" can be modified to "[#tab +1] = value". The following is the test data

local tab = {}
for i=1, 10000000 do
-- table.insert(tab, i)
-- time : 8.61 7.33 7.21 s

-- tab[#tab +1] = i
-- time : 5.54 5.25 5.46 s

-- tab[table.size(tab) +1] = i
-- time : so long time
end

from json.lua.

DnsIs avatar DnsIs commented on July 30, 2024

I completely agree with you.
Did the same test, same results.
Super.

".." - should be avoided for large amounts of data.

from json.lua.

appgurueu avatar appgurueu commented on July 30, 2024

String concatenation using .. should never be done in loops as it incurs quadratic time complexity; each .. has to copy the entire string in linear time.

Using find + gsub could be more efficient than a rope (a rope takes 8 bytes per character, ugh), but seems to be hard to implement as the Unicode surrogate pair handling as it is currently implemented doesn't mesh well with it.

table.insert vs res[#res + 1] doesn't make much of a difference on LuaJIT, but on PUC Lua, the latter is indeed noticeably faster.

from json.lua.

appgurueu avatar appgurueu commented on July 30, 2024

There are benchmarks. It is better.

I don't see how you are replying to any of my points.

from json.lua.

appgurueu avatar appgurueu commented on July 30, 2024

I did. All of your points are invalidated by the benchmarks. Theory is theory but it comes down to the actual performance.

Oh yeah? So something being fast somehow magically makes it correct? Look, I have the fastest JSON decoder:

function read_json(str) return 1 end

it's easy to optimize e.g. string decoding using gsub if you don't have to care about unicode escape sequences

"The" benchmarks of rxi/json.lua are completely insufficient; they can't claim to be representative. For the theoretical complexity issue I'm outlining, a benchmark can be trivially construed; in fact I've done it for an issue on this repo.

from json.lua.

rollerozxa avatar rollerozxa commented on July 30, 2024

Really...?

[decode]
Lua version   : LuaJIT 2.1.0-beta3
CPU name      : AMD Ryzen 5 5600G with Radeon Graphics
../json.lua   : 0.00712s [x1] (min: 0.00673s, max 0.0075s)
../qjson.lua  : 0.0138s [x1.94] (min: 0.0118s, max 0.0146s)
dkjson.lua    : 0.0179s [x2.52] (min: 0.0177s, max 0.0182s)
jfjson.lua    : 0.0123s [x1.73] (min: 0.0118s, max 0.013s)

[encode]
Lua version   : LuaJIT 2.1.0-beta3
CPU name      : AMD Ryzen 5 5600G with Radeon Graphics
../json.lua   : 0.00677s [x2.66] (min: 0.00641s, max 0.00701s)
../qjson.lua  : 0.00255s [x1] (min: 0.00227s, max 0.00274s)
dkjson.lua    : 0.00876s [x3.44] (min: 0.00849s, max 0.00912s)
jfjson.lua    : 0.0101s [x3.95] (min: 0.00993s, max 0.0102s)
json4lua.lua  : 0.0168s [x6.59] (min: 0.0164s, max 0.0173s)

from json.lua.

yogwoggf avatar yogwoggf commented on July 30, 2024

Really...?

[decode]
Lua version   : LuaJIT 2.1.0-beta3
CPU name      : AMD Ryzen 5 5600G with Radeon Graphics
../json.lua   : 0.00712s [x1] (min: 0.00673s, max 0.0075s)
../qjson.lua  : 0.0138s [x1.94] (min: 0.0118s, max 0.0146s)
dkjson.lua    : 0.0179s [x2.52] (min: 0.0177s, max 0.0182s)
jfjson.lua    : 0.0123s [x1.73] (min: 0.0118s, max 0.013s)

[encode]
Lua version   : LuaJIT 2.1.0-beta3
CPU name      : AMD Ryzen 5 5600G with Radeon Graphics
../json.lua   : 0.00677s [x2.66] (min: 0.00641s, max 0.00701s)
../qjson.lua  : 0.00255s [x1] (min: 0.00227s, max 0.00274s)
dkjson.lua    : 0.00876s [x3.44] (min: 0.00849s, max 0.00912s)
jfjson.lua    : 0.0101s [x3.95] (min: 0.00993s, max 0.0102s)
json4lua.lua  : 0.0168s [x6.59] (min: 0.0164s, max 0.0173s)

Have you tried turning your computer on and off again? Sometimes that effects my benchmarks.

from json.lua.

Derpius avatar Derpius commented on July 30, 2024

Would probably help to have a shared benchmark script, so that people on both sides of the argument can verify that the tests are being conducted correctly (or at the very least consistently...)

from json.lua.

rollerozxa avatar rollerozxa commented on July 30, 2024

Would probably help to have a shared benchmark script, so that people on both sides of the argument can verify that the tests are being conducted correctly (or at the very least consistently...)

The benchmark script is available in this repository. Just add qjson.lua to the table of JSON implementations to benchmark.

from json.lua.

Derpius avatar Derpius commented on July 30, 2024

The benchmark script is available in this repository

It has some odd contents such as this:

  if name == "jfjson.lua" then
    local _encode, _decode = json.encode, json.decode
    json.encode = function(...) return _encode(json, ...) end
    json.decode = function(...) return _decode(json, ...) end
  end

Which will affect LuaJIT (have not tested by how much) as it will not JIT compile variadic functions

from json.lua.

appgurueu avatar appgurueu commented on July 30, 2024

Which will affect LuaJIT (have not tested by how much)

How would that affect qjson.lua or json.lua? It's clearly in an if concerning only jfjson.lua (though it might explain why jfjson.lua is so slow in the LuaJIT benchmarks).

as it will not JIT compile variadic functions

[citation needed]

A much more likely explanation is that qjson.lua is hitting a LuaJIT NYI here since it relies heavily on patterns in string.find.

from json.lua.

Vurv78 avatar Vurv78 commented on July 30, 2024

Addressing my incorrect claim that it is "thousands of times faster":

This was very wrong and a dumb mistake on my end. I've largely rewritten the code to be even further optimized and taken advantage of the rope optimization (pushed that a bit further too).

It's now faster in every case besides decoding on LuaJIT since it relies on patterns, which stitch.

I've fixed my comments to reflect this, thank you for bringing it to my attention.

as it will not JIT compile variadic functions

[citation needed]

See the link I posted above

from json.lua.

appgurueu avatar appgurueu commented on July 30, 2024

This was very wrong and a dumb mistake on my end. I've largely rewritten the code to be even further optimized and taken advantage of the rope optimization (pushed that a bit further too).

Good. What remains to be addressed:

  • Your fix for isarray is insufficient. It deals with proper sequences (no holes) properly. It also deals with "pure" dicts (only string keys) or tables without integer keys in general properly (those will have length 0). However it doesn't deal with pathological cases such as arrays with holes properly (or perhaps you want to generate arrays for these? beware though: these can be arbitrarily sparse, which is a major performance concern). Currently it's pretty bad: Depending on what #t yields, arrays with holes may be either rejected or accepted - Schrödinger's Array! Arrays with holes could be fixed by just checking that len == 0 at the end. That would still not be sufficient though; arrays might have holes and out of range keys making up for them (consider e.g. {1, nil, 3, [42] = true}). What you really need to check is that all keys are integers (type(k) == "number" and math.floor(k) == k) within range k >= 1 and k <= #t; then you can compute the density as counted keys divided by maximum integer key, and decide based on that whether you really want to serialize to array.
  • You still use %q for encoding; this will fail e.g. for "\1", which will encode to just that: "\1", which is invalid JSON. This is unfortunately not trivial to fix since Lua allows arbitrary bytestrings whereas JSON expects strings of Unicode code points (e.g. there are unicode escapes, no byte escapes like Lua's escapes). You will also run into issues of UTF-8 vs. UTF-16.
  • Maybe I skimmed over it, but I still see no handling of Unicode escape sequences.

from json.lua.

Derpius avatar Derpius commented on July 30, 2024

as it will not JIT compile variadic functions

[citation needed]

@appgurueu https://web.archive.org/web/20220708225331/http://wiki.luajit.org/NYI

Took some time to find where the hell this was (the wiki has been deleted). Looks like only vararg results are unsupported based on that table, but I think in the context of a benchmark any potential for overhead should be removed though.

How would that affect qjson.lua or json.lua?

It doesn't, I'm just saying that generally the benchmarking script in this repo is not giving each project being benchmarked an equal footing (regardless of whether they're being directly compared here or not).

from json.lua.

Derpius avatar Derpius commented on July 30, 2024

On an unrelated note, regarding the isarray discussion would the following approach not make more sense (from a performance perspective at least):

local function isArray(value)
    return getmetatable(value) == Array
end

...

local myArray = someJsonLib.Array({1, 2, 3, 4})
print(isArray(myArray))

If detecting whether a table is an array or not is a severe bottleneck, isn't the above approach the best way to tackle this? (seems especially useful for exceptionally large arrays)

from json.lua.

Vurv78 avatar Vurv78 commented on July 30, 2024

On an unrelated note, regarding the isarray discussion would the following approach not make more sense (from a performance perspective at least):

local function isArray(value)
    return getmetatable(value) == Array
end

...

local myArray = someJsonLib.Array({1, 2, 3, 4})
print(isArray(myArray))

If detecting whether a table is an array or not is a severe bottleneck, isn't the above approach the best way to tackle this? (seems especially useful for exceptionally large arrays)

From a user's perspective, probably nobody would want to have to adjust all their input data, but from performance, would probably be much faster. Although I think getmetatable might be pretty slow on puc-lua

A faster way could be a lookup table which holds references to all of the array tables

local function encode(t, arrays)
    if arrays[t] then
    else
    end
end

local data = { 1, 2, {} }
encode(data, { [data] = true, [data[3]] = true })

But still, this would require action on the user's part.

from json.lua.

Derpius avatar Derpius commented on July 30, 2024

From a user's perspective, probably nobody would want to have to adjust all their input data

I imagine that a user encoding the data is probably in control of how that data is generated (or can write a transformer easily). Ignoring that though, I'm not sure a user of a pure-lua JSON library really cares about splitting hairs over performance in the first place...

A faster way could be a lookup table which holds references to all of the array tables

Not sure how that's a more ergonomic API than wrapping the arrays at source, or with a transformer. That solution still requires knowing where every single array in the dataset is, and provides a duplicate source of truth for what is or isn't an array.

from json.lua.

Vurv78 avatar Vurv78 commented on July 30, 2024

Your fix for isarray is insufficient. It deals with proper sequences (no holes) properly.

This is intended. I don't think tables with gaps should be treated as arrays, or at least, they should cut off.

Checking how other JSON libraries handle encoding { 1, 2, [4] = 55 }

Name Runtime Handling
Garry's Mod util.TableToJSON C {"1":1.0,"2":2.0,"4":55.0}
rapidjson C [1, 2]
lua-cjson C [1,2,null,55] (only handles small gaps, throws error for excessively sparse array)
actboy168/json Lua [1,2,null,55]
luadist/dkjson Lua [1,2,null,55]
grafi-tt/lunajson Lua [1,2]
rxi/json Lua Throws error
vurv78/qjson Lua {"1":1,"2":2,"4":55}

There's no real clear pick here. In the future maybe I'd like a configuration function to be exported which lets you tweak things like this to allow or disallow sparse arrays, but for now I prefer current behavior, or swapping to rapidjson/lunajson's method.

from json.lua.

appgurueu avatar appgurueu commented on July 30, 2024

I think throwing an error is the best course of action. A table with holes simply can't be represented properly with accepting the possibility of sparse tables being exponentially wasteful.

from json.lua.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.