Comments (8)

FrancescAlted commented on August 16, 2024

Well, str() still keeps track of the endianness, although in a strange form. Look at this:

In []: d3 = np.dtype("f4")

In []: str(d3)
Out[]: 'float32'

In []: d3 = np.dtype(">f4")

In []: str(d3)
Out[]: '>f4'

So it only 'loses' the endianness indicator when it is the same as the current platform's (yeah, funny). Anyway, here is a quick portable solution (I think):

In []: d = np.dtype('f4')

In []: st = np.dtype("f4,i8")

In []: mydtype = lambda t: t.descr if t.descr[0][0] else t.descr[0][1]

In []: mydtype(d)
Out[]: '<f4'

In []: mydtype(st)
Out[]: [('f0', '<f4'), ('f1', '<i8')]

Hope this helps.

from zarr-python.

shoyer commented on August 16, 2024

Indeed, I also did not realize ast.literal_eval() was a thing :). But given that you're using JSON there is something to be said for making the dtype machine readable. I would probably opt for either nested lists [['f0', '<f4'], ['f1', '<i8']] or dict of lists {'names': ['f0', 'f1'], 'formats': ['<f4', '<i8']}. Both are obviously based on NumPy but would make writing an interface in another language more straightforward.
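For illustration, here is a minimal sketch of how both suggested JSON shapes could be turned back into a NumPy dtype. The variable names are made up for the example, and it assumes a flat structured dtype with no nested fields or subarray shapes.

```python
import json

import numpy as np

# The two JSON shapes suggested above, as they might appear in a metadata file.
nested = json.loads('[["f0", "<f4"], ["f1", "<i8"]]')
as_dict = json.loads('{"names": ["f0", "f1"], "formats": ["<f4", "<i8"]}')

# JSON has no tuple type, so each [name, format] pair is converted to a
# tuple before being handed to np.dtype.
dt_from_lists = np.dtype([tuple(field) for field in nested])

# The dict-of-lists form can be passed to np.dtype as-is.
dt_from_dict = np.dtype(as_dict)

print(dt_from_lists)
print(dt_from_lists == dt_from_dict)
```

Either shape needs only plain JSON lists, strings and objects, so a reader in another language does not have to parse Python literals.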


alimanfoo commented on August 16, 2024

Good point. Very happy to consider switching to JSON. See also comments in #5.


alimanfoo commented on August 16, 2024

I'm looking into this but hitting an issue regarding serialization of the array's numpy dtype. I need to save the numpy dtype as one of the fields in the array metadata. However, I'm struggling to find a way to convert a dtype to/from a string that supports both simple and structured dtypes and preserves endianness.

I looked at bcolz and that uses str(), effectively:

dt = ...  # some numpy dtype
s = str(dt)  # convert dtype to string
dt = np.dtype(s)  # convert string to dtype

However, this fails on a structured dtype. It also doesn't preserve endianness for simple dtypes in some cases.

Some other folks use the .str attribute on the dtype, which does preserve endianness but collapses structured dtypes down to '|V...', thereby losing the internal dtype structure.
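A small sketch of both limitations, for anyone who wants to reproduce them. The variable names are illustrative, and the native byte order is looked up with sys.byteorder so the checks do not assume a little-endian machine.

```python
import sys

import numpy as np

st = np.dtype([("f0", "<f4"), ("f1", "<i8")])

# str() of a structured dtype is a Python-literal list, which np.dtype()
# does not accept when it is passed back in as a string.
try:
    np.dtype(str(st))
    str_round_trips = True
except (TypeError, ValueError):
    str_round_trips = False

# For simple dtypes, str() drops the byte-order character when it matches
# the platform's native order, and keeps it otherwise.
native = "<" if sys.byteorder == "little" else ">"
swapped = ">" if native == "<" else "<"
native_str = str(np.dtype(native + "f4"))
swapped_str = str(np.dtype(swapped + "f4"))

# .str always keeps the byte order of a simple dtype, but reports a
# structured dtype as an opaque void type of the same item size.
native_dot_str = np.dtype(native + "f4").str
struct_dot_str = st.str

print(str_round_trips, native_str, swapped_str, native_dot_str, struct_dot_str)
```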

There is also the .descr attribute, which has all the dtype information but does slightly weird things with simple dtypes, e.g.:

In [74]: d = np.dtype('f4')

In [75]: d.descr
Out[75]: [('', '<f4')]

In [76]: np.dtype(d.descr)
Out[76]: dtype([('f0', '<f4')])

Any help appreciated, cc @shoyer @mrocklin @FrancescAlted.


alimanfoo commented on August 16, 2024

Thanks @FrancescAlted, very helpful. I have a working solution in #14 which does something similar but feels hacky. The essence of it is these two functions:

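import ast

import numpy as np


def encode_dtype(d):
    if d.fields is None:
        # simple dtype: .str keeps the byte order, e.g. '<f4'
        return d.str
    else:
        # structured dtype: str() gives a Python-literal description
        return str(d)


def decode_dtype(s):
    try:
        return np.dtype(s)
    except ValueError:
        # not a plain dtype string, so parse the structured literal
        return np.dtype(ast.literal_eval(s))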

These functions allow a numpy dtype to be encoded as a single string value for writing to the JSON file, and that string to be decoded back to a dtype object.

Any comments very welcome.


FrancescAlted commented on August 16, 2024

Whatever works for you ;) I did not know about ast.literal_eval(). Interesting.


alimanfoo commented on August 16, 2024

Thanks @shoyer, I've changed the implementation to use nested lists for structured dtypes. I'll merge tomorrow if no further comments, then work on #5.
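For readers following along, a nested-list encoding could look roughly like the sketch below. This is not the code from #14; the function bodies are an assumption written for illustration, and they only handle simple dtypes and flat, packed structured dtypes (no nested fields, subarray shapes or alignment padding).

```python
import json

import numpy as np


def encode_dtype(d):
    """Return a JSON-serializable description of a dtype (illustrative)."""
    if d.fields is None:
        # Simple dtype: .str keeps the byte order, e.g. '<f4'.
        return d.str
    # Structured dtype: one [name, format] pair per field.
    return [[name, fmt] for name, fmt in d.descr]


def decode_dtype(obj):
    """Rebuild a dtype from the value produced by encode_dtype."""
    if isinstance(obj, str):
        return np.dtype(obj)
    # JSON lists come back as Python lists; np.dtype wants tuples.
    return np.dtype([(name, fmt) for name, fmt in obj])


simple = np.dtype(">f4")
structured = np.dtype([("a", "<f4"), ("b", ">i8")])

# Round-trip both dtypes through an actual JSON string.
simple_back = decode_dtype(json.loads(json.dumps(encode_dtype(simple))))
structured_back = decode_dtype(json.loads(json.dumps(encode_dtype(structured))))

print(simple_back == simple, structured_back == structured)
```

Because the encoded value is either a JSON string or a JSON list, the decoder can tell the two cases apart by type, with no need for ast.literal_eval().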


alimanfoo commented on August 16, 2024

Closed via #14.

