boostorg / json

A C++11 library for parsing and serializing JSON to and from a DOM container in memory.

Home Page: https://boost.org/libs/json

License: Boost Software License 1.0


json's Introduction


Boost.JSON

Overview

Boost.JSON is a portable C++ library which provides containers and algorithms that implement JavaScript Object Notation, or simply "JSON", a lightweight data-interchange format. This format is easy for humans to read and write, and easy for machines to parse and generate. It is based on a subset of the JavaScript Programming Language (Standard ECMA-262), and is currently standardised in RFC 8259. JSON is a text format that is language-independent but uses conventions that are familiar to programmers of the C-family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language.

This library focuses on a common and popular use-case: parsing and serializing to and from a container called value which holds JSON types. Any value which you build can be serialized and then deserialized, guaranteeing that the result will be equal to the original value. Whatever JSON output you produce with this library will be readable by most common JSON implementations in any language.
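
A minimal round-trip sketch (assuming the parse and serialize free functions described in the documentation; earlier snapshots of the library spell the latter to_string):

#include <boost/json.hpp>
#include <cassert>
#include <string>

int main()
{
    // Parse a JSON text into the DOM container...
    boost::json::value jv = boost::json::parse( R"({"pi": 3.141, "happy": true})" );

    // ...serialize it back to text...
    std::string s = boost::json::serialize( jv );

    // ...and parsing that text again yields an equal value.
    assert( boost::json::parse( s ) == jv );
}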

The value container is designed to be well suited as a vocabulary type appropriate for use in public interfaces and libraries, allowing them to be composed. The library restricts the representable data types to the ranges which are almost universally accepted by most JSON implementations, especially JavaScript. The parser and serializer are both highly performant, meeting or exceeding the benchmark performance of the best comparable libraries. Allocators are very well supported. Code which uses these types will be easy to understand, flexible, and performant.

Boost.JSON offers these features:

  • Fast compilation
  • Requires only C++11
  • Fast streaming parser and serializer
  • Constant-time key lookup for objects
  • Options to allow non-standard JSON
  • Easy and safe modern API with allocator support (see the sketch below)
  • Optional header-only, without linking to a library
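
A short sketch of the key-lookup and allocator bullets above, assuming the monotonic_resource and object interfaces described in the documentation:

#include <boost/json.hpp>
#include <iostream>

int main()
{
    // Every allocation made by obj goes through this arena-style resource.
    boost::json::monotonic_resource mr;
    boost::json::object obj( &mr );

    obj[ "name" ] = "Boost.JSON";
    obj[ "year" ] = 2020;

    // Key lookup does not degrade with object size.
    std::cout << obj.at( "name" ) << "\n";
}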

Visit https://boost.org/libs/json for complete documentation.

Requirements

  • Requires only C++11
  • Link to a built static or dynamic Boost library, or use header-only (see below)
  • Supports -fno-exceptions, detected automatically

The library relies heavily on these well known C++ types in its interfaces (henceforth termed standard types):

  • string_view
  • memory_resource, polymorphic_allocator
  • error_category, error_code, error_condition, system_error

Header-Only

To use as header-only; that is, to eliminate the requirement to link a program to a static or dynamic Boost.JSON library, simply place the following line in exactly one new or existing source file in your project.

#include <boost/json/src.hpp>

MSVC users must also define the macro BOOST_JSON_NO_LIB to disable auto-linking.

Embedded

Boost.JSON works great on embedded devices. The library uses local stack buffers to increase the performance of some operations. On Intel platforms these buffers are large (4KB), while on non-Intel platforms they are small (256 bytes). To adjust the size of the stack buffers for embedded applications define this macro when building the library or including the function definitions:

#define BOOST_JSON_STACK_BUFFER_SIZE 1024
#include <boost/json/src.hpp>

Endianness

Boost.JSON uses Boost.Endian in order to support both little endian and big endian platforms.

Supported Compilers

Boost.JSON has been tested with the following compilers:

  • clang: 3.5, 3.6, 3.7, 3.8, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
  • gcc: 4.8, 4.9, 5, 6, 7, 8, 9, 10, 11, 12
  • msvc: 14.0, 14.1, 14.2, 14.3

Supported JSON Text

The library expects input text to be encoded using UTF-8, which is a requirement put on all JSON exchanged between systems by the RFC. Similarly, the text generated by the library is valid UTF-8.

The RFC does not allow byte order marks (BOM) to appear in JSON text, so the library considers BOM syntax errors.

The library supports several popular JSON extensions. These have to be explicitly enabled.
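
For example, a sketch of enabling two of them through parse_options (assuming the option names used in the current documentation):

#include <boost/json.hpp>

boost::json::value parse_relaxed( boost::json::string_view text )
{
    boost::json::parse_options opt;
    opt.allow_comments = true;        // permit // and /* */ comments
    opt.allow_trailing_commas = true; // permit [1, 2, 3,]

    // The options are passed after the (optional) storage pointer.
    return boost::json::parse( text, boost::json::storage_ptr(), opt );
}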

Visual Studio Solution

cmake -G "Visual Studio 16 2019" -A Win32 -B bin -DCMAKE_TOOLCHAIN_FILE=cmake/toolchains/msvc.cmake
cmake -G "Visual Studio 16 2019" -A x64 -B bin64 -DCMAKE_TOOLCHAIN_FILE=cmake/toolchains/msvc.cmake

Quality Assurance

The development infrastructure for the library includes these per-commit analyses:

  • Coverage reports
  • Benchmark performance comparisons
  • Compilation and tests on Drone.io, Azure Pipelines, Appveyor
  • Fuzzing using clang-llvm and machine learning

License

Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at https://www.boost.org/LICENSE_1_0.txt)

json's People

Contributors

aerostun, akrzemi1, alandefreitas, ayounes-synaptics, balusch, dimarusyy, djarek, doganulus, eelis, eldiener, evanlenz, grisumbras, gummif, julien-blanc-tgcm, kamrann, koalayt, liang3zy22, m8mble, madmongo1, maximilianriemensberger, maxkellermann, mborland, mloskot, nigels-com, ofenloch, pauldreik, pdimov, sdarwin, sdkrystian, vinniefalco


json's Issues

operator= wrong signatures

There should be 2 overloads: operator=(string const&) and operator=(string&&). The same is probably needed for the constructors, and other places where string appears as a parameter type. And this may be needed for array and object overloads.
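
Roughly, the requested pair would look like the following declarations (illustrative only, not the library's actual signatures):

class value
{
public:
    value& operator=( string const& s );  // copy-assign from a string
    value& operator=( string&& s );       // move-assign, reusing s's buffer where possible
    // ...with matching pairs for the constructors, and for array and object
};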

Compile as standalone static library

This patch adds a new CMake option that enables compiling boost::json as a standalone static library without any dependencies, and also builds the examples:

mkdir b && cd b
cmake .. -DBOOST_JSON_STANDALONE=1
make

diff --git a/CMakeLists.txt b/CMakeLists.txt
index d0e383e..834e9ce 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -48,7 +48,13 @@ else()
     target_compile_definitions(boost_json PUBLIC BOOST_JSON_STATIC_LINK=1)
 endif()
 
-if(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
+option(BOOST_JSON_STANDALONE "Build boost::json as a static standalone library" FALSE)
+
+if(BOOST_JSON_STANDALONE)
+    target_compile_features(boost_json PUBLIC cxx_std_17)
+    target_compile_definitions(boost_json PUBLIC BOOST_JSON_STANDALONE)
+    add_subdirectory(example)
+elseif(CMAKE_SOURCE_DIR STREQUAL CMAKE_CURRENT_SOURCE_DIR)
     if(${CMAKE_VERSION} VERSION_LESS 3.16)
         message(FATAL_ERROR "Boost.JSON development requires CMake 3.16 or newer.")
     endif()

take the last duplicate key

"...most popular implementations (including the ECMAScript specification which is implemented in modern browsers) follow the rule of taking only the last key-value pair..."

Parsing invalid JSON results in segmentation fault.

I have been trying to parse user-generated JSON, which obviously could be faulty. In this case I tried parsing a JSON string which is missing a separating comma between two values.

Example:

#include <cstdlib>   // EXIT_SUCCESS, EXIT_FAILURE
#include <iostream>  // std::cout
#include <string>
#include <boost/json.hpp>

int main(int argc, char *argv[]) {

    std::string test = "{"
                       "\"username\": \"user\"" // <- Missing separating ,
                       "\"password\": \"password\""
                       "}";
    boost::json::error_code ec;
    boost::json::parser parser;
    parser.start();
    parser.write(test.c_str(), test.size(), ec);
    parser.finish();

    if (ec) {
        std::cout << ec.message() << std::endl;
        return EXIT_FAILURE;
    }

    auto const jv = parser.release();

    if (ec) {
        std::cout << ec.message() << std::endl;
        return EXIT_FAILURE;
    } else {
        std::cout << jv << std::endl;
        return EXIT_SUCCESS;
    }
}

I tried both parse and parser, both resulting in segmentation fault. I checked the docs about how I should validate the json, but all I could find was either using error_codes or exceptions.

Am I missing a function to check untrusted json for errors or is there a bug in the library?

Assertion in boost::json::detail::pow10(int)

This was already reported in https://github.com/vinniefalco/json/issues/13#issuecomment-560517119, but here is a minimized version and I think it is good to have it in a separate issue.

It is the string 0.00....... with a lot of zeros following. Here it is, base64 encoded:

MC4wMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAw
MDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAw
MDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAw
MDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAw
MDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAw
MDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDA=

Running it through the fuzzer gives the following output:

paul@tonfisk:~/code/delaktig/boost.json/fuzzing$ ./fuzzer old_crashes/crash_01.json 
INFO: Seed: 2483239189
INFO: Loaded 1 modules   (1092 inline 8-bit counters): 1092 [0x7d21a0, 0x7d25e4), 
INFO: Loaded 1 PC tables (1092 PCs): 1092 [0x5a1598,0x5a59d8), 
./fuzzer: Running 1 inputs 1 time(s) each.
Running: old_crashes/crash_01.json
fuzzer: ../include/boost/json/detail/impl/number.ipp:96: double boost::json::detail::pow10(int): Assertion `exp >= 0 && exp < 618' failed.

Segfault when running bench

Build Host: Fedora 31
Target: Fedora 31 (Linux)
Build System: CMake
Toolchains: gcc-9 c++11, clang-9 c++14
Branch: develop

Synopsis
Misuse of RapidJSON public interface results in internal assertion in RapidJSON library

Steps to Reproduce

$ cmake -DCMAKE_BUILD_TYPE=Debug ...
build the project as normal
$ bench <source_root>/bench/data/canada.json

Detail

The problem can be isolated to the benchmark enabled by
vi.emplace_back(new rapidjson_crt_impl); in function main of bench.cpp

The parse member function of the benchmark contains the following lines:

            CrtAllocator alloc;
            GenericDocument<
                UTF8<>, CrtAllocator> d(&alloc);
            d.Clear();  <<<--- PROBLEM IS HERE

In RapidJSON, the implementation of GenericDocument<>::Clear() executes an assertion that the internal value representing the document is an Array:

    void Clear() {
        RAPIDJSON_ASSERT(IsArray()); 
        GenericValue* e = GetElementsPointer();
        for (GenericValue* v = e; v != e + data_.a.size; ++v)
            v->~GenericValue();
        data_.a.size = 0;
    }

In the above use pattern, the document is not an Array, as it has not been initialised at all. It is actually null.

The assertion fails, resulting in either a SEGFAULT or a debugger break.

Solution

Remove the line d.Clear();

Side Effects of Solution

None. The document is re-created each time around the loop in any case.

Review all error codes

Go through every line that assigns an error and make sure it is fine-grained. Group the errors logically and assign conditions.

Handle overflow in number parsing

In some areas of number parsing, e.g. ++dig_ or ++sig_, we need to check for overflow/wraparound and handle it. This will also need tests.
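
A rough sketch of the kind of guard this implies (illustrative only; sig_/dig_ are the parser's internal counters, here modeled as a plain integer):

#include <cstdint>

// Fold one more decimal digit into a 64-bit significand, refusing to
// wrap around instead of silently overflowing.
bool accumulate_digit( std::uint64_t& sig, unsigned digit )
{
    if( sig > ( UINT64_MAX - digit ) / 10 )
        return false; // would overflow: switch to the double path or set an error
    sig = sig * 10 + digit;
    return true;
}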

Accessing parsed numbers

Currently a JSON number may be parsed as either kind::int64, kind::uint64 or kind::double_ depending on the actual value, e.g. 1 is parsed as kind::int64, 18446744073709551615 is of kind::uint64, and 1.0 is of kind::double_.

This makes it convoluted to parse into a specific type if a sub-range may be parsed as another type. For example, to obtain a uint64_t I have to write:

#include <cstdint>
#include <stdexcept>
#include <boost/json.hpp>

namespace json = boost::json;

uint64_t get_as_uint64(const char* s) {
  const auto value = json::parse(s);
  if (const auto* n = value.if_int64(); n && *n >= 0) { // [ 0 .. 2^63 - 1 ]
    return *n;
  } else if (const auto* n = value.if_uint64()) { // [ 2^63 .. 2^64 - 1 ]
    return *n;
  } else {
    throw std::out_of_range{"not an uint64_t"};
  }
}

whereas I would expect to just write:

json::parse(s).as_uint64(); // should work for [0 .. 2^64 - 1] and throws on anything else

My intuition is that there should be a json::number type that can hold any parsed numbers and may be converted to the user-requested type upon access as long as there is no loss of precision:

const json::number n = json::parse("42").as_number();
assert(n.as_int64() == 42); // only perform expensive check if the number fits into the data type without loss of precision here
assert(n.as_uint64() == 42);
assert(n.as_double() == 42.0);
// value::as_x() would just be a shorthand for value.as_number().as_x();

assert(json::parse("42.0").as_number().as_int64() == 42); // ???, debatable

Thoughts?

CMAKE_CXX_STANDARD in cmake/toolchains/common.cmake is ignored

cmake -B _build.gcc -DBOOST_JSON_STANDALONE=ON -DCMAKE_TOOLCHAIN_FILE=cmake/toolchains/gcc.cmake
cmake -B _build.gcc -DCMAKE_CXX_STANDARD=14 -DBOOST_JSON_STANDALONE=ON -DCMAKE_TOOLCHAIN_FILE=cmake/toolchains/gcc.cmake

Something seems fishy in how set(CMAKE_CXX_STANDARD 11 CACHE STRING "") is handled: src/src.cpp is compiled with -std=c++17, but test/limits.cpp with -std=c++11 (or with -std=c++14 for the second command).

The fish has been fished:

https://github.com/CPPAlliance/json/blob/d92902c7cf259f4103a0b798050f50ae5b8aa289/CMakeLists.txt#L63

Then common.cmake perhaps needs something like this:

if(BOOST_JSON_STANDALONE)
    set(CMAKE_CXX_STANDARD 17 CACHE STRING "")
else()
    set(CMAKE_CXX_STANDARD 11 CACHE STRING "")
endif()

Add `callbacks` class

We can move the implementation of the callbacks in parser to a separate class and make the callbacks public, so that they can be re-used, allowing third-party SAX-style parsers to produce json::value objects easily.
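
A rough sketch of the shape this could take (all names here are assumptions, not the library's current API):

#include <boost/json.hpp>
#include <cstdint>

// A public handler type that turns SAX-style parse events into a
// json::value, so any conforming parser could reuse it.
class value_builder
{
    // ...implementation detail: a stack of partially built values...
public:
    bool on_object_begin( boost::json::error_code& ec );
    bool on_object_end( boost::json::error_code& ec );
    bool on_array_begin( boost::json::error_code& ec );
    bool on_array_end( boost::json::error_code& ec );
    bool on_key( boost::json::string_view s, boost::json::error_code& ec );
    bool on_string( boost::json::string_view s, boost::json::error_code& ec );
    bool on_int64( std::int64_t i, boost::json::error_code& ec );
    bool on_double( double d, boost::json::error_code& ec );
    bool on_bool( bool b, boost::json::error_code& ec );
    bool on_null( boost::json::error_code& ec );

    boost::json::value release(); // take ownership of the finished value
};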

parser class bypasses storage class on finish() and write()

When a parser object is provided storage during the call to start(), subsequent calls to finish() will bypass the storage object and call operator new directly. It's not clear if this is intended or an oversight. Digging into the code, it looks like the parser object holds a raw_stack object, which contains its own storage_ptr. Unfortunately (or intentionally), the parser class doesn't have a constructor that can take a storage_ptr, so raw_stack gets default-initialized to a default_impl-backed storage class in all cases, and the storage class is bypassed entirely for the lifetime of the parser.

  boost::json::storage_ptr storage =
      boost::json::make_storage<boost::json::pool>(1024 * 1024);
  boost::json::parser p;
  p.start(storage);
  const std::array<char, 2> input = {'{', '}'};
  p.finish(input.data(), input.size());  // This calls operator new through boost::json::detail::default_impl::allocate

From the storage docs, this seems like unintended behavior, and one would expect the provided storage class to be used for memory during the parse operation.

Placing this code:

 rs_ = detail::raw_stack(sp);

at the line linked below causes start() to reconstruct the raw_stack object with the new storage and produces the behavior I would expect, but it's not clear that's the intent, and it might be inefficient to reconstruct the stack object if the same storage is reused repeatedly.
https://github.com/vinniefalco/json/blob/c08721906d67c17eb8bda679550ca7710856d917/include/boost/json/impl/parser.ipp#L165

Tidy up error codes

Trim the error codes used by the parser and remove the unused error codes from the enum.

Parsing Failures in Corner Cases

I've benchmarked this JSON library with the Native JSON Benchmark Suite. This revealed a few interesting "errors" in corner cases. I'm not certain whether the expected results are mandated by JSON, but wanted to share the findings nonetheless.

Doubles

* `[-0.0]`
  * expect: `-0 (0x0168000000000000000)`
  * actual: `0 (0x0160)`

* `[2.22507e-308]`
  * expect: `2.2250699999999998e-308 (0x016FFFFE2E8159D0)`
  * actual: `2.2250700000295652e-308 (0x016FFFFE2E824391)`

* `[-2.22507e-308]`
  * expect: `-2.2250699999999998e-308 (0x016800FFFFE2E8159D0)`
  * actual: `-2.2250700000295652e-308 (0x016800FFFFE2E824391)`

* `[4.9406564584124654e-324]`
  * expect: `4.9406564584124654e-324 (0x0161)`
  * actual: `0 (0x0160)`

* `[2.2250738585072009e-308]`
  * expect: `2.2250738585072009e-308 (0x016FFFFFFFFFFFFF)`
  * actual: `0 (0x0160)`

* `[2.2250738585072014e-308]`
  * expect: `2.2250738585072014e-308 (0x01610000000000000)`
  * actual: `0 (0x0160)`

* `[1e-10000]`
  * expect: `0 (0x0160)`
  * actual: `0 (0x0160)`

* `[0.9868011474609375]`
  * expect: `0.9868011474609375 (0x0163FEF93E000000000)`
  * actual: `0.98680114746093761 (0x0163FEF93E000000001)`

* `[2.2250738585072011e-308]`
  * expect: `2.2250738585072009e-308 (0x016FFFFFFFFFFFFF)`
  * actual: `0 (0x0160)`

* `[1e-214748363]`
  * expect: `0 (0x0160)`
  * actual: `0 (0x0160)`

* `[1e-214748364]`
  * expect: `0 (0x0160)`
  * actual: `0 (0x0160)`

* `[0.017976931348623157e+310]`
  * expect: `1.7976931348623157e+308 (0x0167FEFFFFFFFFFFFFF)`
  * actual: `inf (0x0167FF0000000000000)`

* `[2.2250738585072012e-308]`
  * expect: `2.2250738585072014e-308 (0x01610000000000000)`
  * actual: `0 (0x0160)`

* `[2.22507385850720113605740979670913197593481954635164564e-308]`
  * expect: `2.2250738585072009e-308 (0x016FFFFFFFFFFFFF)`
  * actual: `0 (0x0160)`

* `[2.22507385850720113605740979670913197593481954635164565e-308]`
  * expect: `2.2250738585072014e-308 (0x01610000000000000)`
  * actual: `0 (0x0160)`

* `[0.999999999999999944488848768742172978818416595458984374]`
  * expect: `0.99999999999999989 (0x0163FEFFFFFFFFFFFFF)`
  * actual: `1 (0x0163FF0000000000000)`

* `[1.00000000000000011102230246251565404236316680908203125]`
  * expect: `1 (0x0163FF0000000000000)`
  * actual: `1.0000000000000002 (0x0163FF0000000000001)`

* `[1.00000000000000011102230246251565404236316680908203124]`
  * expect: `1 (0x0163FF0000000000000)`
  * actual: `1.0000000000000002 (0x0163FF0000000000001)`

* `[7205759403792793199999e-5]`
  * expect: `72057594037927928 (0x016436FFFFFFFFFFFFF)`
  * actual: `72057594037927936 (0x0164370000000000000)`

* `[10141204801825834086073718800384]`
  * expect: `1.0141204801825834e+31 (0x016465FFFFFFFFFFFFF)`
  * actual: `1.0141204801825835e+31 (0x0164660000000000000)`

* `[1014120480182583464902367222169599999e-5]`
  * expect: `1.0141204801825834e+31 (0x016465FFFFFFFFFFFFF)`
  * actual: `1.0141204801825835e+31 (0x0164660000000000000)`

* `[5708990770823839207320493820740630171355185152]`
  * expect: `5.7089907708238395e+45 (0x0164970000000000000)`
  * actual: `5.7089907708238389e+45 (0x016496FFFFFFFFFFFFF)`

* `[5708990770823839207320493820740630171355185152001e-3]`
  * expect: `5.7089907708238395e+45 (0x0164970000000000000)`
  * actual: `5.7089907708238389e+45 (0x016496FFFFFFFFFFFFF)`

* `[2.225073858507201136057409796709131975934819546351645648023426109724822222021076945516529523908135087914149158913039621106870086438694594645527657207407820621743379988141063267329253552286881372149012981122451451889849057222307285255133155755015914397476397983411801999323962548289017107081850690630666655994938275772572015763062690663332647565300009245888316433037779791869612049497390377829704905051080609940730262937128958950003583799967207254304360284078895771796150945516748243471030702609144621572289880258182545180325707018860872113128079512233426288368622321503775666622503982534335974568884423900265498198385487948292206894721689831099698365846814022854243330660339850886445804001034933970427567186443383770486037861622771738545623065874679014086723327636718751234567890123456789012345678901e-308]`
  * expect: `2.2250738585072014e-308 (0x01610000000000000)`
  * actual: `0 (0x0160)`

Strings

* `["Hello\u0000World"]`
  * expect: `"Hello\0World"` (length: 11)
  * actual: `"Hello"` (length: 5)

* `["\uD834\uDD1E"]`
  * expect: `"𝄞"` (length: 4)
  * actual: `"턞"` (length: 3)

For reference: the tests have been performed similarly to the following snippet:

#define BOOST_JSON_HEADER_ONLY 1
#include <boost/json.hpp>

// [...]

   namespace JSON = boost::json;
   bool ParseDouble(const char* json, double* d) const {
      try {
         const auto root = JSON::parse(JSON::string_view{json});
         const auto& element = root.get_array()[0];
         *d = [&]() -> double
         {
            switch (element.kind())
            {
               case JSON::kind::double_:
                  return element.get_double();
               case JSON::kind::uint64:
                  return element.get_uint64();
               case JSON::kind::int64:
                  return element.get_int64();
               default:
                  throw false;
            }
         }();
         return true;
      }
      catch (...) {
         return false;
      }
   }

   bool ParseString(const char* json, std::string& s) const {
      try {
         const auto root = JSON::parse(JSON::string_view{json});
         const auto& element = root.get_array()[0];
         s = element.get_string().c_str();
         return true;
      }
      catch (...) {
         return false;
      }
   }

In case I made a testing mistake, I'd be happy to repeat the exercise.

simplify parser

We don't need all the extra overloads of finish(); we can just make write() work like finish(), and the user can deal with the error if they want to continue processing the extra data.

Trim down string

We can combine separate overloads in string into fewer functions accepting a string_view parameter. To check whether a pointer lies within a range, use std::less (see the sketch below).
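
A small sketch of that range check (comparing unrelated raw pointers with < is undefined behavior, while std::less is guaranteed to give a total order):

#include <functional>

// Returns true if p points into the half-open range [first, last).
bool in_range( char const* p, char const* first, char const* last )
{
    std::less<char const*> lt;
    return !lt( p, first ) && lt( p, last );
}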

Hash function needs a salt

This needs a good hash function for strings, and we should probably guard against algorithmic complexity attacks by allowing for a salt.
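
One possible shape for this, sketched as a salted FNV-1a hash (illustrative only, not the library's implementation):

#include <cstddef>
#include <cstdint>
#include <string_view>

std::size_t hash_string( std::string_view s, std::uint64_t salt )
{
    // 64-bit FNV-1a seeded with a per-process salt, so that which keys
    // collide is not predictable by someone crafting hostile input.
    std::uint64_t h = 14695981039346656037ull ^ salt;
    for( unsigned char c : s )
    {
        h ^= c;
        h *= 1099511628211ull;
    }
    return static_cast<std::size_t>( h );
}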

Test standalone with libstdc++

The Travis configuration for standalone clang-9 does not work because the installed libstdc++ is not new enough to contain <memory_resource>.

string::substr uncertainty about storage

Right now string::substr returns a string using the default storage. We might need to simply remove this overload. If users want a copy they can use subview in a constructor call, possibly explicitly specifying a storage pointer.
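
For example, instead of a substr() member, callers could build the copy themselves (a sketch assuming string's subview accessor and its string_view constructor):

#include <boost/json.hpp>
#include <cstddef>
#include <utility>

// Copy count characters starting at pos into a new string that uses an
// explicitly chosen memory resource.
boost::json::string copy_substring(
    boost::json::string const& s,
    std::size_t pos, std::size_t count,
    boost::json::storage_ptr sp )
{
    return boost::json::string( s.subview( pos, count ), std::move( sp ) );
}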

Update bintray credentials (was: add fuzzing)

This is a ticket to keep track of adding fuzz testing. I promised to help out, so here I am!
I wanted to highlight the plan so everyone is happy with the direction.

The plan is to

  • add a fuzz target (and build support, if necessary)
  • add helper scripts to get an initial seed corpus
  • add a GitHub action which runs the fuzzing for a short time (30 seconds or so) to make sure easy-to-find bugs are detected already at the pull request stage

Building

I had problems building the library - I suspect others wanting to try it out may also run into problems. Is there documentation somewhere that I missed? I expected the usual git clone, recursive submodule update, and cmake to work out of the box, but there seems to be a dependency on Boost.Beast.

Fuzz target

There is already a fuzzer in #13. @msimonsson, are you OK with me incorporating your fuzzer code from that ticket under the Boost license? I assume this is fine, but I am not a lawyer (well, perhaps a C++ language lawyer wannabe :-).

Github Action

I have used this for the simdjson project recently and it worked fine. I am not sure if it is possible to browse through it without being a member, but here are some links:

For efficiency, it is good if the corpus can be stored somewhere between the runs; otherwise it has to bootstrap each time, which is inefficient. I use bintray for simdjson - @vinniefalco, where would you be OK with storing the corpus? Do you perhaps already have a bintray account? In the meanwhile, I will use my own.

Implementation

I develop this in my clone for now.
See the Readme over here: https://github.com/pauldreik/json/tree/paul/fuzz/fuzzing

Dynamic cast causes code to fail to compile with -fno-rtti on gcc

https://github.com/CPPAlliance/json/blob/ad1a3378b6caac61da44bec403336a02d3fd381f/include/boost/json/detail/default_resource.hpp#L47

The above line prevents compilation if -fno-rtti is enabled in gcc (version 9, but I suspect that doesn't matter)

In member function ‘virtual bool boost::json::detail::default_resource::do_is_equal(const boost::container::pmr::memory_resource&) const’:
...boost/json/detail/default_resource.hpp:48:41: error: ‘dynamic_cast’ not permitted with ‘-fno-rtti’

This appears to be a regression on this commit:
a47b0f3

Backing up to one commit previous allows the same code to build and link just fine.
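
One possible RTTI-free formulation (an assumption about the intended semantics: the default resource behaves as a singleton, so identity comparison should suffice):

// Inside boost::json::detail::default_resource: compare by address
// instead of using dynamic_cast.
bool do_is_equal( memory_resource const& mr ) const noexcept override
{
    return this == &mr;
}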

Optional bench and build time cloning of nlohmann and rapidjson

Hello,
Thanks for this library, I find it very interesting, especially for deeply embedded platforms (without an OS).
I feel that having nlohmann and rapidjson imposed as git submodules just for the purpose of benchmarking is too much. A full clone of nlohmann in particular is quite big...
What about moving that to build-time retrieval using CMake's ExternalProject_Add?

include(ExternalProject)
ExternalProject_Add(bench/lib/nlohmann
    GIT_REPOSITORY https://github.com/nlohmann/json.git
    GIT_TAG 99d7518d21cbbfe91d341a5431438bf7559c6974
)

Also compilation of bench should be optional.

Thanks!

Extra array level when serializing member value

I'm observing weird output when testing with the following condensed minimal program:

#define BOOST_JSON_HEADER_ONLY 1
#include <iostream>
#include <boost/json.hpp>

class Wrapper
{
public:
   Wrapper(boost::json::value v) : wrapped{v} {}

   boost::json::value wrapped;
};

int main() {
   const auto input = boost::json::string_view{"[]"};
   const auto value = boost::json::parse(input);
   const auto wrapper = Wrapper{value};
   const auto output = boost::json::to_string(wrapper.wrapped);
   std::cout << input << " -> " << boost::json::to_string(wrapper.wrapped) << std::endl;
}

I'd expect output [] -> [] but I'm seeing [] -> [[]]. If I skip using the Wrapper, I get the desired output. Am I violating some requirement of the value type?
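
A likely cause (an assumption based on value's constructor set): the braced initializer wrapped{v} selects value's std::initializer_list constructor, which produces an array containing v. Parentheses copy-construct instead:

// Hypothetical one-character fix in Wrapper's constructor:
Wrapper(boost::json::value v) : wrapped(v) {}    // copies v:            [] -> []
// versus the original
// Wrapper(boost::json::value v) : wrapped{v} {} // array containing v:  [] -> [[]]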

Update README.md

The README.md should include most or all of the contents of 01_overview.qbk

Heap-buffer-overflow when fuzzing with libFuzzer

Fuzzing the validate() function in example/validate.cpp results in heap-buffer-overflow:

==75832==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000005901 at pc 0x00000031fa46 bp 0x7fffffff8f50 sp 0x7fffffff8f48
READ of size 1 at 0x603000005901 thread T0
    #0 0x31fa45 in boost::json::basic_parser::write_some(char const*, unsigned long, std::__1::error_code&)
        boost/json/impl/basic_parser.ipp:456:16
    #1 0x30f73c in boost::json::basic_parser::write(char const*, unsigned long, std::__1::error_code&)
        boost/json/impl/basic_parser.ipp:894:9
    #2 0x309cf2 in boost::json::basic_parser::finish(char const*, unsigned long, std::__1::error_code&)
        boost/json/impl/basic_parser.ipp:924:5
    #3 0x30937e in (anonymous namespace)::validate(std::__1::basic_string_view<char, std::__1::char_traits<char> >)
        bjson.cc:73:11

Input:

\"~QQ36644632   {n
Base64: In5RUTM2NjQ0NjMyICAge24=

Please let me know if you need more details.

Crash on input

Parser chokes on this (after base64 decode):

WyL//34zOVx1ZDg0ZFx1ZGM4M2RcdWQ4M2RcdWRlM2M4dWRlMTlcdWQ4M2RcdWRlMzlkZWUzOVx1
ZDg0ZFx1ZGM4M2RcdWQ4M2RcdWRlMzlcXHVkY2M4M1x1ZDg5ZFx1ZGUzOVx1ZDgzZFx1ZGUzOWRb
IGZhbHNlLDMzMzMzMzMzMzMzMzMzMzMzNDMzMzMzMTY1MzczNzMwLDMzMzMzMzMzMzMzMzMzMzMz
MzM3ODAsMzMzMzMzMzMzMzM0MzMzMzMxNjUzNzM3MzAsMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzMz
MzM3ODAsMzMzMzMzMzMzMzMzMzQzMzMzMzE2NTM3MzczMCwzMzMzMzMzMzMzMzMzMzMzMzMzNzgw
LDMzMzMzMzM4MzU1MzMwNzQ3NDYwLDMzMTY2NTAwMDAzMzMzMzMwNzQ3MzMzMzMzMzc3OSwzMzMz
MzMzMzMzMzMzMzMzNDMzMzMzMzMwNzQ3NDYwLDMzMzMzMzMzMzMzMzMzMzMzMzMzNzgwLDMzMzMz
MzMzMzMzMzMzMzA4ODM1NTMzMDc0Mzc4MCwzMzMzMzMzMzMzMzMzMzMwODgzNTUzMzA3NDc0NjAs
MzMzMzMzMzMxNjY1MDAwMDMzMzMzNDc0NjAsMzMzMzMzMzMzMzMzMzMzMzMzMzc4MCwzMzMzMzMz
MzMzMzM3MzMzMzE2NjUwMDAwMzMzMzMzMDc0NzMzMzMzMzM3NzksMzMzMzMzMzMzMzMzMzMzMzQz
MzMzMzMwNzQ3NDYwLDMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzc4MCwzMzMzMzMzMzMzNzgw
LDMzMzMzMzMzMzMzMzA4ODM1NTMzMDc0NzQ2MCwzMzE2NjUwMDAwMzMzMzMzMDc0NzMzMzMzMzM3
NzksMzMzMzMzMzMzMzMzMzMzMzQzMzMzMzMwNzQ3NDYwLDMzMzMzMzMzMzMzMzMzMzMzMzMzNzgw
LDMzMzMzMzMzMzMzMzMzMzA4ODM1NTMzMDc0Mzc4MCwzMzMzMzMzMzMzMzMzMzMzMzMwODgzNTUz
MzA3NDM3ODAsMzMzMzMzMzMzMzMzMzMzMDg4MzU1MzMwNzQ3NDYwLDMzMzMzMzMzMzMzMDczMzM3
NDc0NjAsMzMzMzMzMzMzMzMzMzMzMzMzNzgwLDMzMzMzMzMzMzMzMzA4ODM1NTMzMDc0NzQ2MCwz
MzE2NjUwMDAwMzMzMzMzMDc0NzMzMzMzMzM3NzksMzMzMzMzMzMzMzMzMzMzMzQzMzMzMzMzMDc0
NzQ2MCwzMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzMzM3ODAsMzMzMzMzMzMzMzMzMzMzMDg4
MzU1MzMwNzQzNzgwLDMzMzMzMzMzMzMzMzMzMzA4ODM1NTMzMDc0NzQ2MCwzMzMzMzMzMzMzMzMz
MzMzMzM0MjQ3LDMzMzMzMzMzMzMzMzMzMzQzMzMzMzMzMzMzMzMzMzM3MzMzMzQzMzMzMzMzMDc0
NzQ2MCwzMzMzMzMzMzMzMzMzMzMzMzMzNzgwLDMzMzMzMzMzMzMzMzA4ODM1NTMzMDc0NzQ2MCwz
MzE2NjUwMDAwMzMzMzMzMDc0NzMzMzMzMzM3NzksMzMzMzMzMzMzMzMzMzMzMzQzMzMzMzMwNzQ3
NDYwLDMzMzMzMzMzMzMzMzMzMzMzMzMzNzgwLDMzMzMzMzMzMzMzMzMzMzA4ODM1NTMzMDc0Mzc4
MCwzMzMzMzMzMzMzMzMzMzMwODgzNTUzMzA3NDc0NjAsMzMzMzMzMzMzLDMzMzMzMzMzMzMzMzMz
MzMzMzM3ODAsMzMzMzMzMzMzMzc4MCwzMzMzMzMzMzMzMzMwODgzNTUzMzA3NDc0NjAsMzMxNjY1
MDAwMDMzMzMzMzA3NDczMzMzMzMzNzc5LDMzMzMzMzMzMzM3ODAsMzMzMzMzMzgzNTUzMzA3NDc0
NjAsMzMxNjY1MDAwMDMzMzMzMzA3NDczMzMzMzMzNzc5LDMzMzMzMzMzMzMzMzMzMzM0MzMzMzMz
MzA3NDc0NjAsMzMzMzMzMzMzMzMzMzMzMzMzMzM3ODAsMzMzMzMzMzMzMzMzMzMzMDg4MzU1MzMw
NzQzNzgwLDMzMzMzMzMzMzMzMzMzMzA4ODM1NTMzMDc0NzQ2MCwzMzMzMzMzMzE2NjUwMDAwMzMz
MzM0NzQ2MCwzMzMzMzMzMzMzMzMzMzMzMzMzNzgwLDMzMzMzMzMzMzMzMzM0MzMzMzMxNjUzNzM3
MzAsMzMzMzMzMzMzMzMzMzMzMzMzMzc4MCwzMzMzMzMzODM1NTMzMDc0NzQ2MCwzMzE2NjUwMDAw
MzMzMzMzMDc0NzMzMzMzMzM3NzksMzMzMzMzMzMzMzMzMzMzMzQzMzMzMzMzMDc0NzQ2MCwzMzMz
MzMzMzMzMzMzMzMzMzMzMzc4MCwzMzMzMzMzMzMzMzMzMzMwODgzNTUzMzA3NDM3ODAsMzMzMzMz
MzMzMzMzMzMzMDg4MzU1MzMwNzQ3NDYwLDMzMzMzMzMzMTY2NTAwMDAzMzMzMzQ3NDYwLDMzMzMz
MzMzMzMzMzMzMzMzMzM3ODAsMzMzMzMzMzMzMzMzNzMzMzM0MzMzMzMzMzA3NDc0NjAsMzMzMzMz
MzMzMzMzMzMzMzMzMzc4MCwzMzMzMzMzMzMzMzMwODgzNTUzMzA3NDc0NjAsMzMxNjY1MDAwMDMz
MzMzMzA3NDczMzMzMzMzNzc5LDMzMzMzMzMzMzMzMzMzMzM0MzMzMzNcdWQ4N2RcdWRlZGV1ZGM4
ZGUzOVx1ZDg0ZFx1ZGM4M2RcdWQ4OGRcdWRlMzlcdWQ4OWRcdWRlMjM5MzMzZWUzOVxk

string initializer_list tests

The tests for initializer_list in string have to be uncommented once we figure out how to make them work with varying small-buffer sizes.

Just one more customisation point and we're good.

Boost.JSON currently has two customisation points for the serialisation of UDTs to JSON values:

  • provide a member function T::to_json(boost::json::value&) const
  • specialise the template boost::json::to_value_traits<> in the namespace boost::json.

This issue lays out an argument why, in the author's view, there is a need for one more.

It is not possible to automatically provide a conversion for enums with the use of a macro.

It is common practice to use one of the recent, very useful header-only mini-libraries to provide missing utility to C++ enum types. One example of this is wise_enum (https://github.com/quicknir/wise_enum), although there are a few others.

By declaring an enum with the WISE_ENUM_XXX macros, users gain proper C++ enums which have full compile-time reflection. For example, wise_enum::to_string(E) yields a string_view and wise_enum::from_string<T>(str) yields an optional<E>.

This has great utility in logging/serialisation etc.

I go further in my programs and provide an ADL to_string overload for all object types, essentially providing:

template< class E, std::enable_if_t< wise_enum::is_wise_enum_v<E> >* = nullptr >
string_view
to_string(
    E const& e)
{
    return wise_enum::to_string(e);
}

as part of the macro which declares the enum. Other such mini-functions are emitted by this macro in order to provide operator<< and operator>>, which means, for example, that the enum is now parsable by boost::program_options as an option name rather than an integer.

The provision of an ADL version simplifies the writing of generic code, for example:

    template<class Impl>
    struct packet_enable_json
    {
        void to_json(boost::json::value& jv) const
        {
            auto& self = static_cast<Impl const&>(*this);
            auto& object = jv.emplace_object();
            object.reserve(2);
            object.emplace("type", to_string(self.id()));   // <<== HERE
            object.emplace("data", boost::json::to_value(Impl::as_nvps(self), object.storage()));
        }
    };

In the above real example, any Impl whose .id() method yields a type for which the ADL to_string function is available will be correctly serialised to JSON.

This works when the objects are always represented as JSON strings. But in the above example I have a problem if the result type of Impl::as_nvps(self) yields an enum.

Currently Boost.JSON will serialise the underlying integer representation (in the case of an old-style enum) or fail to compile (with an enum class).

The solution of course is to provide a specialisation of boost::json::to_value_traits, but (prior to fully compliant C++20 compilers) I have the problem that my utility macro will need to:

  1. end the current namespace,
  2. open the boost::json namespace,
  3. print the specialisation in terms of wise_enum::to_string and then,
  4. re-establish the original namespace in which the enum is declared.

The final step is not possible without further decorating the macro with namespace names, which is ugly.

I offer three solutions:

  1. Allow the overloading of boost::json::to_value as a customisation point. (this doesn't actually help)

  2. Add one additional step to the deduction of behaviour of boost::json::to_value:

The library provides a set of customization points to enable conversions between objects of type T and a JSON value by:

  • A specialization of to_value_traits for T containing the public static member function void to_value_traits::assign( value&, T const& ),

  • A specialization of value_cast_traits for T containing the public static member function T value_cast_traits::construct( value const& ),

  • A free function declared in the namespace of T compatible with the following signature: void to_json(T const&, boost::json::value& jv , boost::json::storage_ptr = {})

  • A public non-static member function void T::to_json( value& ) const,

  • A public constructor T::T( value const& ),

  3. Add an additional Enable template parameter to to_value_traits so that user programs can partially specialise the trait for all types which match a predicate.

For example:

template< class Enum >
struct to_value_traits < Enum, std::enable_if_t< wise_enum::is_wise_enum_v<Enum> > >
{
  static void assign( value& jv, Enum const& e )
  {
    jv = wise_enum::to_string(e);
  }
};

Opinions about ADL are as varied and as strong as opinions about exceptions, almost-always-auto, and the use of goto.

I am in favour of the provision of ADL overloads, as they are simple and unobtrusive.

However, any of the above solutions would be acceptable to me as they would provide the necessary functionality.

@vinniefalco
@pdimov
@sdkrystian
