Convert JSON files to Apache Arrow. You may also be interested in csv2arrow, json2parquet, or csv2parquet.
You can get the latest releases from https://github.com/domoritz/json2arrow/releases/. Alternatively, install with Cargo:
cargo install json2arrow
USAGE:
    json2arrow [OPTIONS] <JSON> [ARROW]

ARGS:
    <JSON>     Input JSON file
    <ARROW>    Output file, stdout if not present

OPTIONS:
    -h, --help
            Print help information

    -m, --max-read-records <MAX_READ_RECORDS>
            The number of records to infer the schema from. All rows if not present. Setting
            max-read-records to zero will stop schema inference and all columns will be string typed

    -n, --dry
            Only print the schema

    -p, --print-schema
            Print the schema to stderr

    -s, --schema-file <SCHEMA_FILE>
            File with Arrow schema in JSON format

    -V, --version
            Print version information

The --schema-file option uses the same file format as --dry and --print-schema.
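If you drive json2arrow from a script, the options above can be assembled programmatically. The following is only a minimal sketch: `build_json2arrow_command` and the file names `data.json`, `data.arrow`, and `schema.json` are hypothetical and not part of json2arrow; the flags themselves are the ones listed in the usage text.

```python
def build_json2arrow_command(json_path, arrow_path=None, *, schema_file=None,
                             max_read_records=None, print_schema=False, dry=False):
    """Build an argv list for json2arrow (hypothetical helper, not part of the tool)."""
    cmd = ["json2arrow"]
    if max_read_records is not None:
        # Number of records used for schema inference.
        cmd += ["--max-read-records", str(max_read_records)]
    if schema_file is not None:
        # File with an Arrow schema in JSON format.
        cmd += ["--schema-file", schema_file]
    if print_schema:
        cmd.append("--print-schema")
    if dry:
        cmd.append("--dry")
    cmd.append(json_path)
    if arrow_path is not None:
        # Without an output path, json2arrow writes to stdout.
        cmd.append(arrow_path)
    return cmd


if __name__ == "__main__":
    # Only print the schema for the input file.
    print(build_json2arrow_command("data.json", dry=True))
    # Convert using a previously saved schema file.
    print(build_json2arrow_command("data.json", "data.arrow", schema_file="schema.json"))
```

The returned list can be passed to `subprocess.run` as-is, assuming json2arrow is installed and on your PATH.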
Since we use the Arrow JSON loader, we are limited to what it supports. Right now, that means line-delimited JSON files, with one JSON object per line:
{ "a": 42, "b": true }
{ "a": 12, "b": false }
{ "a": 7, "b": true }
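If your data is stored as a single JSON array rather than one object per line, it has to be rewritten into the line-delimited form first. The following is a minimal sketch of one way to do that with the Python standard library; `json_array_to_lines` is a hypothetical helper and not part of json2arrow, and it assumes the whole document fits in memory.

```python
import json


def json_array_to_lines(text):
    """Convert a JSON document holding a top-level array into line-delimited JSON."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # json.dumps without an indent never emits raw newlines,
    # so every record ends up on exactly one line.
    return "".join(json.dumps(record) + "\n" for record in records)


if __name__ == "__main__":
    source = '[{"a": 42, "b": true}, {"a": 12, "b": false}, {"a": 7, "b": true}]'
    print(json_array_to_lines(source), end="")
```

The output can be saved to a file and passed to json2arrow as the <JSON> argument.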
To lint and format the code, run
cargo clippy && cargo fmt