tgockel / json-voorhees

A killer modern C++ library for interacting with JSON.

Home Page: http://tgockel.github.io/json-voorhees/

License: Apache License 2.0

Languages: C++ 94.43%, Shell 2.55%, CMake 2.40%, Dockerfile 0.62%

Topics: c-plus-plus, json, json-serialization

Contributors: jimon, jjberry314, mvshyvk, sth, tgockel, venediktov, vexingcodes


json-voorhees's Issues

Invalid `\uxxxx` in a parsed string throws `range_error`

If a json string is parsed and contains a string literal with a \u that is not followed by hex digits, a range_error is thrown by from_hex_digit, but the surrounding code doesn't expect that.

The parse_string() function, which calls the string decoding, only handles decode_error, not range_error. Probably this function should handle range_error in the same way, or string_decode() should already convert it into a decode_error when parsing the \u escape.

The exception can for example be triggered by this:

"\"\\uzzzz\""_json
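The second suggestion above (translate the error where the \u escape is decoded) could look roughly like the following. This is a standalone sketch, not the library's code: decode_error, from_hex_digit and decode_u_escape are stand-ins defined here only for illustration, and the real types may carry more context.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Stand-in for the library's decode_error (assumption: the real one differs).
struct decode_error : std::runtime_error
{
    decode_error(std::size_t offset, const std::string& message)
        : std::runtime_error(message), offset(offset)
    { }
    std::size_t offset;
};

// Stand-in for from_hex_digit: throws std::range_error on a non-hex character.
inline std::uint16_t from_hex_digit(char c)
{
    if (c >= '0' && c <= '9') return static_cast<std::uint16_t>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<std::uint16_t>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<std::uint16_t>(c - 'A' + 10);
    throw std::range_error("not a hex digit");
}

// Decodes the four hex digits that follow "\u", starting at src[pos].
// Every failure surfaces as decode_error, so callers need a single catch.
inline std::uint16_t decode_u_escape(const std::string& src, std::size_t pos)
{
    if (pos > src.size() || src.size() - pos < 4)
        throw decode_error(pos, "truncated \\u escape");
    try
    {
        std::uint16_t out = 0;
        for (std::size_t i = 0; i < 4; ++i)
            out = static_cast<std::uint16_t>((out << 4) | from_hex_digit(src[pos + i]));
        return out;
    }
    catch (const std::range_error&)
    {
        throw decode_error(pos, "invalid hex digit in \\u escape");
    }
}
```

With this shape, a caller that only handles decode_error also covers input such as "zzzz".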

Assertion when printing data containing invalid UTF-8

I assume the problem is non-utf8 content, but I haven't debugged it in detail. Creating the value works, but not converting it back into a string:

jsonv::value v = "\"\xe4\""_json;
std::cout << v << std::endl;

I'm not completely sure if the json literal is supposed to parse without errors, but it does.
Converting the value back to a string then fails with an assertion:

#0  0x00007ffff71a0e37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffff71a2528 in __GI_abort () at abort.c:89
#2  0x00007ffff7199ce6 in __assert_fail_base (fmt=0x7ffff72e9c08 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x7ffff7b93984 "idx + length <= source_size", file=file@entry=0x7ffff7b93908 "/home/sth/src/extern/json-voorhees/src/jsonv/char_convert.cpp", 
    line=line@entry=217, 
    function=function@entry=0x7ffff7b93d80 <jsonv::detail::string_encode(std::ostream&, jsonv::detail::string_view)::__PRETTY_FUNCTION__> "std::ostream& jsonv::detail::string_encode(std::ostream&, jsonv::detail::string_view)") at assert.c:92
#3  0x00007ffff7199d92 in __GI___assert_fail (assertion=0x7ffff7b93984 "idx + length <= source_size", 
    file=0x7ffff7b93908 "/home/sth/src/extern/json-voorhees/src/jsonv/char_convert.cpp", line=217, 
    function=0x7ffff7b93d80 <jsonv::detail::string_encode(std::ostream&, jsonv::detail::string_view)::__PRETTY_FUNCTION__> "std::ostream& jsonv::detail::string_encode(std::ostream&, jsonv::detail::string_view)") at assert.c:101
#4  0x00007ffff7b2cae4 in jsonv::detail::string_encode (stream=..., source=...) at /home/sth/src/extern/json-voorhees/src/jsonv/char_convert.cpp:217
#5  0x00007ffff7b2c327 in jsonv::stream_escaped_string (stream=..., str=...) at /home/sth/src/extern/json-voorhees/src/jsonv/detail.cpp:102
#6  0x00007ffff7b7a4f2 in jsonv::ostream_encoder::write_string (this=0x7fffffffdbb0, value=...) at /home/sth/src/extern/json-voorhees/src/jsonv/encode.cpp:154
#7  0x00007ffff7b7a1de in jsonv::encoder::encode (this=0x7fffffffdbb0, source=...) at /home/sth/src/extern/json-voorhees/src/jsonv/encode.cpp:77
#8  0x00007ffff7b890bb in jsonv::operator<< (stream=..., val=...) at /home/sth/src/extern/json-voorhees/src/jsonv/value.cpp:523
#9  0x0000000000400ca5 in main () at /home/sth/src/extern/json-voorhees/test.cpp:9

Parsing invalid UTF-8 inputs should throw

With default settings, this test should work:

PARSE_TEST(invalid_utf8_input)
{
    ensure_throws(jsonv::parse_error, jsonv::parse("\"\xe4\""));
}

Unfortunately, jsonv::parse erroneously succeeds.
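For reference, a check of this kind can be written independently of the library. The sketch below is one possible validator, not what jsonv does; is_valid_utf8 is a name made up for this example. It rejects truncated sequences, stray continuation bytes, overlong forms, surrogates and code points above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Returns true if `s` is well-formed UTF-8.
inline bool is_valid_utf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size())
    {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;   // number of continuation bytes expected
        unsigned long cp;    // code point being assembled
        if (c < 0x80)                { extra = 0; cp = c; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; }
        else return false;   // stray continuation byte or invalid lead byte

        if (s.size() - i - 1 < extra) return false;   // truncated sequence
        for (std::size_t k = 1; k <= extra; ++k)
        {
            unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;     // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }
        static const unsigned long min_cp[] = { 0x0, 0x80, 0x800, 0x10000 };
        if (cp < min_cp[extra]) return false;            // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // beyond Unicode range
        i += extra + 1;
    }
    return true;
}
```

The input from the test above, a lone 0xE4 byte between quotes, fails the truncation check because 0xE4 announces a three-byte sequence.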

Create a looser version of as_X access functions

Some people working with JSON are used to the JavaScript type conversion rules (aka: crazy time). We should appease these people with to_X functions which will attempt to convert the value to what they want.

Fix use of implicit casting in object implementation

Currently, the code does not compile with USE_BOOST_STRING_REF=1. boost::string_ref's cast operator to a std::basic_string<char, ...> is marked as explicit, while jsonv::detail::string_ref provides an implicit conversion operator. N3442 wishes the C++ Standard Library to provide an explicit conversion operator and the current implementation of std::experimental::string_view also uses an explicit conversion. I need to fix my implementation of string_ref and need to think of a way to make the object_impl use of std::map work when C++14 is enabled.

Parser ignores stray slashes

A single slash is interpreted as the potential start of a comment and is ignored even if a full comment couldn't be matched. This causes "/1"_json or R"({"a": ////////"b"})"_json to be parsed without errors, while they aren't valid json.

When a slash is encountered, match_pattern() is called to parse the comment, which will return match_result::unmatched. This result is then ignored in tokenizer::next() and the token will be treated as valid. The single slash will then be treated as a full, valid comment.

The unmatched result should result in a parse error instead.
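A rough illustration of the intended behavior, assuming only /* ... */ block comments matter. The enum below borrows the names match_result::complete and match_result::unmatched but is a local stand-in, and match_block_comment and skip_comment are hypothetical helpers, not the tokenizer's real functions.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Local stand-in; the library's enum may have more states.
enum class match_result { complete, unmatched };

// Tries to match a block comment at the start of `input`.
// On success, `length` is the size of the whole comment.
inline match_result match_block_comment(const std::string& input, std::size_t& length)
{
    length = 0;
    if (input.size() < 2 || input[0] != '/' || input[1] != '*')
        return match_result::unmatched;
    std::size_t end = input.find("*/", 2);
    if (end == std::string::npos)
        return match_result::unmatched;
    length = end + 2;
    return match_result::complete;
}

// Hypothetical caller: returns how many characters to skip, and refuses to
// treat an unmatched slash as a comment.
inline std::size_t skip_comment(const std::string& input)
{
    std::size_t length = 0;
    if (match_block_comment(input, length) != match_result::complete)
        throw std::runtime_error("stray '/' is not the start of a comment");
    return length;
}
```

The point is only that the caller inspects the result instead of assuming a match.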

decode_error should be a parse_error like any other

If you encounter a decode_error in a string, it shoots out of the parsing system and into your grill. It should end up in the list and be subject to parse_options::failure_mode and parse_options::max_failures.

Create an installer

I need to write a make install task. Not difficult, but definitely a TODO before v0.3.

String encoding should allow UTF-8 output

All std::string -> JSON string conversion goes through the jsonv::detail::string_encode function, which performs numeric encoding for characters which do not fit into the ASCII encoding. This is not entirely necessary, since a JSON document can validly contain UTF-8 sequences of characters. The library should allow replacement of this encoding function if the user knows the decoding side can handle UTF-8 encoded JSON.
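One possible shape for a replaceable encoder is sketched below. Nothing here is the library's API: string_encoder, encode_utf8_passthrough and encode_with are hypothetical names. The pass-through encoder escapes only what JSON requires (quote, backslash, control characters below 0x20) and copies bytes at or above 0x80 unchanged, on the assumption that the input is already valid UTF-8.

```cpp
#include <cstdio>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical hook type: anything that writes the escaped form of a string.
using string_encoder = std::function<void (std::ostream&, const std::string&)>;

// Escapes only what JSON requires; UTF-8 bytes pass through untouched.
inline void encode_utf8_passthrough(std::ostream& out, const std::string& src)
{
    for (char ch : src)
    {
        unsigned char c = static_cast<unsigned char>(ch);
        switch (c)
        {
        case '"':  out << "\\\""; break;
        case '\\': out << "\\\\"; break;
        case '\b': out << "\\b";  break;
        case '\f': out << "\\f";  break;
        case '\n': out << "\\n";  break;
        case '\r': out << "\\r";  break;
        case '\t': out << "\\t";  break;
        default:
            if (c < 0x20)
            {
                // Remaining control characters need the numeric form.
                char buf[7];
                std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                out << buf;
            }
            else
            {
                out << ch;
            }
        }
    }
}

// Runs the chosen encoder and returns the result as a string.
inline std::string encode_with(const string_encoder& encoder, const std::string& src)
{
    std::ostringstream out;
    encoder(out, src);
    return out.str();
}
```

A numeric-escaping encoder with the same signature could remain the default, with the pass-through variant selected by users who know their consumers accept UTF-8.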

Generic visitor system for a JSON tree

To the tune of: std::function<void (const jsonv::path& toHere, const jsonv::value& item)>.

I don't want to deal with mutating a jsonv::value while being visited, so that's its own can of worms.

Add a "destructive remove" from an object

If I am performing a destructive operation on some kind::object and wish to move the std::string keys from that object and make them my own, there is no way to do this. I can move the values associated with the key, but not the key itself.

This is a shortcoming in C++ in general, so it's probably not terribly urgent to do this. However, if there is a good proposal for having this for std::map and whatnot, JSON Voorhees should follow that pattern.
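For what it is worth, C++17 added node handles to the associative containers, and std::map::extract allows moving a key out of a map. The sketch below uses a plain std::map<std::string, std::string> rather than the library's object type, and take_entry is a name invented for this example.

```cpp
#include <map>
#include <string>
#include <utility>

// Removes `key` from `m` and returns the key and value, moving both out of
// the extracted node instead of copying them. Requires C++17.
inline std::pair<std::string, std::string>
take_entry(std::map<std::string, std::string>& m, const std::string& key)
{
    auto handle = m.extract(key);   // unlinks the node; empty handle if absent
    if (handle.empty())
        return {};
    return { std::move(handle.key()), std::move(handle.mapped()) };
}
```

The node handle exposes a non-const reference to the key, which is what makes the move possible.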

std::wstring views of object keys and string values

The Unicode Windows API is 100% UTF-16 based. Despite the fact that UTF-16 is a complete trainwreck of an encoding system, people still need to use it if they're working with the Windows API. Some degree of support for char16_ts (std::wstring, std::wistream, std::wostream) would probably be appreciated by those folks that have to deal with the Windows API.

I would prefer it if value could continue to store things as std::string internally and convert on demand when requested, but I'm not sure about the implications for the keys of objects (there is no convenient wrapper there).

MSVC Support

This library should work on Windows...eventually. I'm mostly going to put this off until MSVC implements C++11.

Investigate other parsers

The hand-rolled parser is fast, but it is a major pain to maintain. Any LALR parser generator can parse JSON; I just need to find one that works well.

Reporting for benchmark suite

Using cmake -DBENCHMARK=1 and running ./json-benchmark is a good start, but there should be some more formal reporting on this...graphs and whatnot.

Lazy deserialization of strings

It takes quite a while to parse objects and arrays containing large strings and most of this time is spent copying (the underlying strcpy takes ~12x longer than the discovery of what needs to be copied for ASCII strings). If a user does not end up needing the value (a common use case), then we have wasted time performing the copy.

What is the inverse of traverse?

Use Case
In web development, upon request, it is common to take a collection of paths and values and reconstruct an entire JSON tree.
In web development, upon response, it is common to take a value and flatten it into a collection of paths and values.

This is performed in PHP and in jQuery, which adds to their success. In the case of jQuery this was a breaking change, but they did it anyway because it was so important. Many [C++] JSON libraries omit this capability.

Your library does have traverse, which could be used to turn a JSON object into a map<jsonv::path, jsonv::value>, though a convenience function for this common use case would be appreciated. So the latter use case is mostly taken care of, except for the leading "." common to your paths.

Now what about the former use case? What is the inverse of the traverse function?
value has the following
value& at_path (const path &p)
but it doesn't seem to have
void set_value_at_path (const path &p, ?value?)
"value & path (const path &p)" comes close but doesn't take an additional ?value? parameter.
For convenience, it would be great if there were a batch void paths|set_values_at_paths(std::map<path, value> values). It would be useful both on an existing non-primitive value and as a static function that creates a new value.

In both use cases, strongly typed [de]serialization to and from map<path, value> would also be welcome for convenience.

At a minimum, though, a set_value_at_path function is needed; all the others could be derived from it. If this can already be done, please document how, given the importance of the web and its contribution to the popularity of JSON.
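To make the request concrete, here is a toy version of the operation on a stand-in tree type. None of this is jsonv API: node, path, set_value_at_path and unflatten are defined only for this sketch, leaves are plain strings, and array indexes are ignored.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a JSON value: a leaf string plus named children.
struct node
{
    std::string                                  leaf;
    std::map<std::string, std::unique_ptr<node>> children;
};

// Toy stand-in for a path: one key per level.
using path = std::vector<std::string>;

// Walks `p` from `root`, creating intermediate objects as needed, then
// assigns the leaf at the end of the path.
inline void set_value_at_path(node& root, const path& p, const std::string& value)
{
    node* cur = &root;
    for (const std::string& key : p)
    {
        std::unique_ptr<node>& child = cur->children[key];
        if (!child)
            child.reset(new node());
        cur = child.get();
    }
    cur->leaf = value;
}

// The batch form: rebuilds a whole tree from a flat collection.
inline node unflatten(const std::map<path, std::string>& values)
{
    node root;
    for (const auto& entry : values)
        set_value_at_path(root, entry.first, entry.second);
    return root;
}
```

As the text says, the batch function is derived entirely from the single-path setter.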

missing include for std::isfinite

Hello,

while compiling your top-notch library on OS X, two files use std::isfinite and it causes a compile error because "cmath" is not included.

jsonv/encode.cpp
jsonv/util.cpp

Please keep up the good work on this jewel!
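For anyone hitting the same error, std::isfinite is declared in the standard header cmath, so a translation unit that uses it should include that header directly rather than rely on another header pulling it in. A minimal illustration follows; is_encodable_number is a made-up helper, not a function from the two files above.

```cpp
#include <cmath>   // declares std::isfinite

// JSON has no literal for NaN or infinity, so only finite values can be
// written as numbers.
inline bool is_encodable_number(double d)
{
    return std::isfinite(d);
}
```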

Support for polymorphic allocators

This is currently an experimental "library fundamentals" library, but when it becomes more concrete, it would make sense to use it in value.

32-bit compilation

When compiling the library on a 32-bit target, it generates this static assert.

../../thirdparty/json-voorhees/src/jsonv/value.cpp:356:5: error: static_assert
      failed "!!"
    static_assert(sizeof _data == sizeof _data.object, "!!");
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I am not sure if this library supports 32-bit architecture. It was compiled successfully on 64-bit in the same platform (OS X).

An invalid escape sequence causes segfault

This unit test will cause a segmentation fault, as jsonv::detail::string_decode will dereference NULL:

TEST(string_decode_invalid_escape_sequence)
{
    string_decode_static("\\y");
}

This should throw a jsonv::decode_error instead.
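The general pattern is to check the lookup result before dereferencing it. The sketch below is standalone and hypothetical: decode_error is a local stand-in for jsonv::decode_error, and find_simple_escape and decode_simple_escape are invented names, not the internals of string_decode.

```cpp
#include <stdexcept>
#include <string>

// Local stand-in so the sketch compiles on its own.
struct decode_error : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Returns the decoded character for a one-letter JSON escape, or nullptr
// when the letter is not a valid escape.
inline const char* find_simple_escape(char c)
{
    switch (c)
    {
    case '"':  return "\"";
    case '\\': return "\\";
    case '/':  return "/";
    case 'b':  return "\b";
    case 'f':  return "\f";
    case 'n':  return "\n";
    case 'r':  return "\r";
    case 't':  return "\t";
    default:   return nullptr;
    }
}

// Decodes the letter that follows a backslash; an unknown letter becomes an
// exception instead of a null dereference.
inline char decode_simple_escape(char c)
{
    const char* found = find_simple_escape(c);
    if (!found)
        throw decode_error(std::string("invalid escape sequence \\") + c);
    return *found;
}
```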

A real benchmark suite

JSON Voorhees should have a real suite of benchmarks to pit it against other C++ JSON libraries with real-world JSON.

Generalized access to low-level encode/decode API

Consider the JSON object which contains a key "x" with the numeric value 2^64+1:

{ "x": 18446744073709551617 }

How might I access this in C++?

// Let's assume the call to parse doesn't throw...it will in the current code
jsonv::value v = jsonv::parse("18446744073709551617");
int64_t     i = v.as_integer(); // can't represent the value
double      d = v.as_decimal(); // loss of precision
std::string s = v.as_string(); // throws kind_error

How does one get access to the value in this perfectly valid JSON? The parser can verify the JSON string 18446744073709551617 represents a valid numeric value, but there is no way to store it or get it back to the user!
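The same limits show up with the standard conversion functions alone, without the library. The helper below, fits_in_unsigned_long_long, is a name made up for this illustration, and the comments assume a platform where unsigned long long is 64 bits wide.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// True when the whole decimal digit string converts to unsigned long long.
// With a 64-bit unsigned long long, 18446744073709551615 is the largest
// value that fits, so 18446744073709551617 is rejected.
inline bool fits_in_unsigned_long_long(const std::string& digits)
{
    try
    {
        std::size_t used = 0;
        (void)std::stoull(digits, &used);
        return used == digits.size();
    }
    catch (const std::out_of_range&)     { return false; }
    catch (const std::invalid_argument&) { return false; }
}
```

Going through double avoids the range error but silently changes the value: the nearest double to 2^64+1 is 2^64 itself, so std::stod("18446744073709551617") compares equal to 18446744073709551616.0.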

The converse of this also does not work. If I have an int128_t, there is no way to emit 18446744073709551617 without resorting to stringification (which is incorrect).

One solution would be to choose a C++ arbitrary-precision arithmetic library and go with that. The biggest drawback to this approach is: Which one? Boost.Multiprecision, TTMath, GNU Multi-Precision Library, MAPM, InfInt, MathX... Using any of these libraries would force a dependency on any library user, even if they never intend to deal with non-representable integers.

I believe the correct solution is to allow direct access to the shared_buffer which ultimately backs the parsing. This will allow users to deal with the values in their own way.

jsonv::value v = jsonv::parse("18446744073709551617");
jsonv::shared_buffer buffer = v.get_encoded_buffer();
// buffer contains 18446744073709551617 (no null termination)
my_int x = my_int::parse(buffer.cbegin(), buffer.size());

For setting:

jsonv::shared_buffer buffer(to_string(x));
auto v = jsonv::value::from_buffer(jsonv::kind::integer, buffer);

There are some strange caveats to this, all of which are still up in the air...

  • When a user calls as_integer on a non-representable (but still integer) value, what should happen? If you throw, there is a major drawback that x.get_kind() == kind::integer ? x.as_integer() : 0 can throw.
    • For something like as_decimal, the answer is clear -- by asking for a double, you're assuming a degree of precision loss.
  • Should jsonv::value::from_buffer validate that the input buffer is a valid encoding for the supplied kind?
  • When asking for get_encoded_buffer, what should happen if it does not exist?
  • Can you ask for get_encoded_buffer on the aggregate types array and object?

Decode numeric encodings into arbitrarily encoded std::string

Parsing outputs a UTF-8 encoded std::string from JSON numeric encodings (\uNNNN). This might not be what the user desires.

This is only on my radar because the workaround for this is so absurdly inconvenient for platforms that do not support output of UTF-8 encoded strings...the only workaround I can think of forces the user to convert at every string access à la convert_utf8_to_utf1(val.as_string()). That said, if every platform you'd use JsonVoorhees on supports UTF-8, this isn't worth dealing with. I'm going to wait until somebody actually cares about this to address it.

At a more general level: It might be completely pointless to support not-UTF-8 when the resultant string representation is the sequence of single bytes std::string. In Windows, where UCS-2 seems to be the norm for presentation, I should move to having strings be backed by std::wstring before addressing this sort of thing.

String "\\\" blah" will not include the blah portion

The match_string function can incorrectly escape early if a " is preceded by a \ which was preceded by a \. In other words, the string "\\\" and keep going" will not include the and keep going bit.

TEST(token_attempt_match_string_double_reverse_solidus_before_escaped_quote)
{
    static const char tokens[] = R"("\\\" and keep going")";
    token_kind kind;
    std::size_t length;
    match_result result = static_attempt_match(tokens, kind, length);
    ensure(result == match_result::complete);
    ensure_eq(token_kind::string, kind);
    ensure_eq(sstrlen(tokens), length);
}
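The usual way to avoid this class of bug is to consume a backslash together with the character after it, instead of looking backwards from a quote. A standalone sketch of that idea follows; match_string_length is a hypothetical helper, not the library's match_string.

```cpp
#include <cstddef>
#include <string>

// Given input that starts with a double quote, returns the length of the
// string token including both quotes, or 0 if the input does not start with
// a quote or the string is unterminated.
inline std::size_t match_string_length(const std::string& input)
{
    if (input.empty() || input[0] != '"')
        return 0;
    for (std::size_t i = 1; i < input.size(); ++i)
    {
        if (input[i] == '\\')
            ++i;              // skip the escaped character, whatever it is
        else if (input[i] == '"')
            return i + 1;     // unescaped quote closes the string
    }
    return 0;
}
```

Because each backslash always swallows exactly one following character, the sequence \\ is consumed as a pair and the \" after it is still seen as an escaped quote.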

Segmentation fault if extracting jsonv::value that is not as serialization_builder expects

Hi,

just starting to give JSON Voorhees a try...
by trying an example based on the following link.

The following causes a "Segmentation Fault" (64 bit Debian GNU/Linux)

#include <iostream>
#include <string>

#include <jsonv/value.hpp>
#include <jsonv/serialization_builder.hpp>
#include <jsonv/parse.hpp>

struct foo
{
  int         a;
  int         b;
  std::string c;
};
struct bar
{
  foo         x;
  foo         y;
  std::string z;
  std::string w;
};

std::ostream& operator<<(std::ostream& os, const foo &f)
{
  os << " a: " << f.a << '\n'
     << " b: " << f.b << '\n'
     << " c: " << f.c << std::endl;
  return os;
}

std::ostream& operator<<(std::ostream& os, const bar &b)
{
  os << "x:\n" << b.x << '\n'
     << "y:\n" << b.y << '\n'
     << "z: " << b.z << '\n'
     << "w: " << b.w << std::endl;
  return os;
}


int main()
{

  jsonv::formats local_formats =
  jsonv::formats_builder()
  .type<foo>()
     .member("a", &foo::a)
     .member("b", &foo::b)
     .default_value(10)
     .member("c", &foo::c)
  .type<bar>()
     .member("x", &bar::x)
     .member("y", &bar::y)
     .member("z", &bar::z)
     .since(jsonv::version(2, 0))
     .member("w", &bar::w)
     .until(jsonv::version(5, 0))
  ;

//      "aaaaa" !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  std::string json_string = R"({
 "x": { "aaaaa": 50, "b": 20, "c": "Blah" }, 
 "y": { "a": 10,          "c": "No B?" },
 "z": "Only serialized in 2.0+",
 "w": "Only serialized before 5.0"
}
)";


  jsonv::formats format = jsonv::formats::compose({ jsonv::formats::defaults(), local_formats });

  jsonv::value val = jsonv::parse(json_string);
  bar x = jsonv::extract<bar>(val, format);
  std::cout << x << std::endl;

  return 0;
}

I'd expect some kind of exception or error handling here!

The fix is

//      "a" !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
  std::string json_string = R"({
 "x": { "a": 50, "b": 20, "c": "Blah" }, 
 "y": { "a": 10,          "c": "No B?" },
 "z": "Only serialized in 2.0+",
 "w": "Only serialized before 5.0"
}
)";

Does this library have exceptions for parse errors?
Can I print a nice message to the user explaining:

  • line number where error occurred
  • expected property ("a") and actual incorrectly supplied property ("aaaaa")

Thanks

Use boost::variant

Most JSON libraries re-invent boost::variant<> to store polymorphic values.

Why?
