neurojson / jdata


JData: a language-independent data annotation for portable storage and interchange

Home Page: https://neurojson.org

serialization data-structures json ubjson specification scientific-data interchange-format

jdata's Issues

np.stack error on complex numpy array

To encode a complex array, the np.stack method is used, but the call fails with the following error:

File "C:\XXX\Lib\site-packages\jdata\jdata.py", line 136, in encode
newobj["_ArrayData_"] = np.stack(d.ravel().real, d.ravel().imag)
TypeError: only integer scalar arrays can be converted to a scalar index

The imaginary part is interpreted as the "axis" parameter of the np.stack function. Wrapping the two arrays in a tuple (one extra pair of parentheses) should fix the bug:
newobj["_ArrayData_"] = np.stack((d.ravel().real, d.ravel().imag))
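As a minimal sketch of the proposed fix, the snippet below stacks the real and imaginary parts of a small complex array by passing them as a single tuple. The array values are made up for illustration and are not taken from the reporter's data.

```python
import numpy as np

# Hypothetical complex array standing in for the user's data.
d = np.array([[1 + 2j, 3 - 4j], [5 + 0j, 0 - 6j]])

# Pass both parts as ONE sequence. A second positional argument
# would be taken as the `axis` parameter instead.
stacked = np.stack((d.ravel().real, d.ravel().imag))

# Row 0 holds the real parts, row 1 the imaginary parts.
print(stacked.shape)
print(stacked)
```

The result has one row per component, so the original array can be rebuilt from `stacked[0] + 1j * stacked[1]` followed by a reshape.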

Engineering format on UNIX-timestamp

In Matlab, if I run:
```
timestamp = datestr(now,'yyyy-mm-dd HH:MM:SS'); % Matlab time: whole days since year 0
t1 = datetime(timestamp);
t1.TimeZone = 'Europe/Oslo';
timestamp = posixtime(t1);
jsonencode(timestamp)
```
The result is '1.55895412E+9'.
It would be very nice to avoid the 'E+9' at the end.
Is there any way to JSON-encode the value as 1558954120?
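For comparison only, here is a small Python sketch of the general idea: a whole-second timestamp serializes as a plain integer once it is converted to an integer type before encoding. This uses Python's standard `json` module, not MATLAB's `jsonencode`. Casting to an integer type before encoding may be the analogous step in MATLAB, but that is an untested assumption.

```python
import json

# POSIX time held as a float, similar to a MATLAB double.
timestamp = 1558954120.0

as_float = json.dumps(timestamp)     # keeps a fractional part in the text
as_int = json.dumps(int(timestamp))  # plain integer digits only

print(as_float)
print(as_int)
```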

Spec for a compact format for object persistence

I would like to contribute a spec for optimized serialization of any kind of struct, class, or object.

It's similar to the metadata node you already proposed.

This may seem off-topic, but I think it matches nicely with the compact forms of BJData and UBJSON: people usually come to a binary format because they do not want to waste space repeating the same strings over and over, and that is exactly what happens when plain UBJ_OBJECT containers are used for this purpose.

In contrast to opaque indices or ndarrays, this spec keeps the data much more readable and helps guarantee correct interpretation in the future, because you cannot lose track of which field is which or where it is stored. This is especially useful when data types and fields change often. I'm guessing you already know that.

In two cases I propose new UBJSON tags to allow nesting inside those types' values, but the same could be done with reserved strings, as you proposed, without changing or depending on UBJSON.

I think this spec would be especially useful when combined with reflection features that could allow automatic serialization and deserialization.

Inheritance could also be supported, either with simple Single Table Inheritance or by more sophisticated means.

I've turned a similar UBJSON object serialization scheme into a draft of this spec idea. Let me know if you think it belongs somewhere in your project.

The idea is a UBJSON container that holds both metadata and the object data, stored in a compact way.

The metadata part is another container whose values specify the fields of every object type that will be stored later in the file.

Example of metadata:

[["Time", ["year", "month", "day", "hour", "minute"]],
 ["Place", ["longitude", "latitude"]],
 ["Appointment", ["start", "end", "place"]],
]

Each instance is then represented in the data portion as an array of the appropriate type: UBJ_MIXED, some other type, or even an ndarray if the data is scalar.

The type of each data instance is identified either by a preceding integer "tag", in the case of a heterogeneous array, or, in the case of a homogeneous representation, by its location in the file.

If the array index or integer tag matches the index of a metadata type entry, that type entry describes the fields of the data entry.

Simple example of heterogeneous data:

[[0, [2020, 05, 13, 2, 1]],
 [1, [34.234, 21.342342]],
 [0, [2001, 05, 01, 16, 30]],
 [0, [1970, 01, 01, 01, 01]],
 [1, [74.234, -5.342342]],
]
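To make the tag lookup concrete, here is a minimal Python sketch of how a reader might decode the tagged records above against the metadata list. The function name `decode_record` and the use of plain Python lists in place of UBJSON containers are assumptions made for illustration. They are not part of the proposal.

```python
# Metadata and data mirror the examples above, written as Python lists.
METADATA = [
    ["Time", ["year", "month", "day", "hour", "minute"]],
    ["Place", ["longitude", "latitude"]],
    ["Appointment", ["start", "end", "place"]],
]

DATA = [
    [0, [2020, 5, 13, 2, 1]],
    [1, [34.234, 21.342342]],
    [0, [2001, 5, 1, 16, 30]],
]


def decode_record(record, metadata):
    """Map one [tag, values] record to (type name, {field: value})."""
    tag, values = record
    type_name, fields = metadata[tag]  # the tag is an index into the metadata
    if len(values) != len(fields):
        raise ValueError(f"{type_name} expects {len(fields)} values, got {len(values)}")
    return type_name, dict(zip(fields, values))


decoded = [decode_record(record, METADATA) for record in DATA]
for type_name, fields in decoded:
    print(type_name, fields)
```

Field names appear once in the metadata, so each record carries only an integer tag and its values.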

In another example using heterogeneous data, object nesting is achieved with a new UBJSON tag, "t". This tag is followed by the integer tag or index of the class or type, then by its contents.

UBJSON numeric type tags are omitted for readability:

An "Appointment" entry would look like this:

[[]
[t] [2]
	[t] [0] [2020, 05, 13, 2, 1]
	[t] [0] [2020, 05, 13, 2, 31]
	[t] [1] [20.555555, 30.777777]
(...)
[]]

Example of linear homogeneous data in UBJ_MIXED arrays:

[
  [
    [2020, 05, 13, 2, 1],
    [2020, 05, 13, 2, 31],
    [2001, 05, 01, 16, 30],
    [1970, 01, 01, 01, 01],
  ],
  [
    [34.234, 21.342342],
    [74.234, -5.342342],
    [20.555555, 30.777777],
  ]
]
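In the homogeneous layout, the position of each group in the outer array plays the role of the tag. The sketch below pairs each group with the metadata entry at the same index. The helper name `decode_groups` is hypothetical, and plain Python lists stand in for UBJ_MIXED arrays.

```python
METADATA = [
    ["Time", ["year", "month", "day", "hour", "minute"]],
    ["Place", ["longitude", "latitude"]],
]

# One group per type, in the same order as the metadata entries.
GROUPS = [
    [[2020, 5, 13, 2, 1], [2020, 5, 13, 2, 31]],
    [[34.234, 21.342342], [74.234, -5.342342]],
]


def decode_groups(groups, metadata):
    """Pair group i with metadata entry i and label every instance's fields."""
    decoded = {}
    for (type_name, fields), instances in zip(metadata, groups):
        decoded[type_name] = [dict(zip(fields, values)) for values in instances]
    return decoded


by_type = decode_groups(GROUPS, METADATA)
print(by_type["Time"][1])
print(by_type["Place"][0])
```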

A similar nested representation can be achieved with index-based references and another new UBJSON tag, "R" (for reference).

This representation is useful when there are many repetitions.

Each R tag is followed by a type tag or index, then by the index of the instance being referenced.

Of course, this representation is much less stream-friendly.

Again, other UBJSON types are omitted for simplicity:

[
  [
    [2020, 05, 13, 2, 1],
    [2020, 05, 13, 2, 31],
    [2001, 05, 01, 16, 30],
    [1970, 01, 01, 01, 01],
  ],
  [
    [34.234, 21.342342],
    [74.234, -5.342342],
    [20.555555, 30.777777],
  ],
  [
    [[R] [0] [0]
     [R] [0] [1]
     [R] [1] [2]
    ]
  ]
]

This example serializes the same Appointment instance data presented before.
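The reference lookup can be sketched in Python as follows. Each `[R] [type] [instance]` triple is modeled here as a tuple `("R", type_index, instance_index)`. That encoding, and the helper names `resolve` and `build`, are assumptions for illustration only. The sketch does not guard against cyclic references.

```python
METADATA = [
    ["Time", ["year", "month", "day", "hour", "minute"]],
    ["Place", ["longitude", "latitude"]],
    ["Appointment", ["start", "end", "place"]],
]

GROUPS = [
    [[2020, 5, 13, 2, 1], [2020, 5, 13, 2, 31], [2001, 5, 1, 16, 30], [1970, 1, 1, 1, 1]],
    [[34.234, 21.342342], [74.234, -5.342342], [20.555555, 30.777777]],
    [[("R", 0, 0), ("R", 0, 1), ("R", 1, 2)]],
]


def resolve(value, groups, metadata):
    """Follow an ("R", type, instance) reference; return other values unchanged."""
    if isinstance(value, tuple) and len(value) == 3 and value[0] == "R":
        _, type_index, instance_index = value
        return build(type_index, groups[type_index][instance_index], groups, metadata)
    return value


def build(type_index, values, groups, metadata):
    """Label one instance's values with the field names of its type."""
    _, fields = metadata[type_index]
    return {field: resolve(value, groups, metadata) for field, value in zip(fields, values)}


appointment = build(2, GROUPS[2][0], GROUPS, METADATA)
print(appointment)
```

Because every reference needs the referenced group to be available, a reader has to load those groups before it can resolve an Appointment, which is why this layout is less stream-friendly.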

That's it. Thanks for reading this long issue.

Add Markdown-formatted specification draft

Need to switch from the wiki-based format to Markdown to take advantage of GitHub's Markdown support.

A few sections also need to be expanded, for example, fangq@5f4cf09#diff-176fd73490b65bd49f1ec8871c068114R227

Support JSONPath for `_DataLink_` URL format, already supported in JSONLab

JSONPath is a simple format for referencing the inner elements of JSON-encoded data, and it is already supported in JSONLab (fangq/jsonlab@05edb7a).

The `_DataLink_` definition should be updated to support JSONPath-flavored data anchor definitions.
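To illustrate what a JSONPath-style anchor resolves to, here is a small Python sketch that handles only a tiny subset of JSONPath: `$` for the root, `.name` for object keys, and `[n]` for array indices. The function `resolve_path` and the sample document are hypothetical. The sketch does not show how the path would be attached to a `_DataLink_` URL, and it is not how JSONLab implements the feature.

```python
import re

# One path step: either ".key" or "[index]".
_STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def resolve_path(document, path):
    """Walk a parsed JSON document along a simple '$.a.b[0]' style path."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    pos = 1
    while pos < len(path):
        match = _STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at offset {pos}: {path!r}")
        key, index = match.groups()
        current = current[key] if key is not None else current[int(index)]
        pos = match.end()
    return current


# Hypothetical document, used only to exercise the resolver.
doc = {"subject": {"scans": [{"name": "t1"}, {"name": "t2"}]}}
print(resolve_path(doc, "$.subject.scans[1].name"))
```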
