astropenguin / xarray-dataclasses

:zap: xarray data creation made easy by dataclass

Home Page: https://pypi.org/project/xarray-dataclasses

License: MIT License

Python 100.00%
python xarray python-package xarray-extension dataarray dataset typing dataclass

xarray-dataclasses's Issues

Release v0.1.1

  • Update version numbers written in:
    • pyproject.toml
    • xarray_dataclasses/__init__.py
    • tests/test_metadata.py

Release v0.1.2

  • Update version numbers written in:
    • pyproject.toml
    • xarray_dataclasses/__init__.py
    • tests/test_metadata.py

Add factory fields for custom DataArray/Dataset creation

Add support for special fields (__dataarray_factory__, __dataset_factory__) that customize DataArray or Dataset creation.

class CustomDataset(xr.Dataset):
    __slots__ = ()


@dataclass
class Custom(DatasetMixin):  # Dataset mix-in, to match __dataset_factory__ and asdataset()
    data: Data[tuple["x", "y"], float]
    __dataset_factory__ = CustomDataset


ds = asdataset(Custom(...)) # statically typed as CustomDataset
type(ds) # -> CustomDataset
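
The lookup behind such a factory field can be sketched without xarray. This is a minimal illustration under stated assumptions, not the library's implementation: CustomContainer is a hypothetical stand-in for a custom xr.Dataset subclass, and build() is a hypothetical converter that falls back to dict when no factory is declared.

```python
from dataclasses import dataclass


class CustomContainer(dict):
    """Hypothetical stand-in for a custom xr.Dataset subclass."""


@dataclass
class Custom:
    data: float = 0.0
    # No annotation, so the dataclass machinery does not treat it as a field.
    __dataset_factory__ = CustomContainer


def build(obj):
    """Hypothetical converter: use the declared factory, else a plain dict."""
    factory = getattr(type(obj), "__dataset_factory__", dict)
    return factory(data=obj.data)


ds = build(Custom(1.0))
print(type(ds).__name__)
```

Because the factory is a plain class attribute, it does not appear in the generated __init__ and can be overridden per subclass.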

Do not use variables with ClassVar or InitVar types

Update field.infer_field_kind() so as not to assign values with ClassVar or InitVar types to xarray's attrs.

from dataclasses import InitVar
from typing import ClassVar
from xarray_dataclasses import Coord, Data, dataarrayclass


@dataarrayclass
class Image:
    data: Data[('x', 'y'), float]
    x: Coord['x', int] = 0
    y: Coord['y', int] = 0

    spam: str = "spam"  # -> a member of attrs
    ham: ClassVar[str] = "ham"  # -> not a member of attrs
    egg: InitVar[str] = "egg"  # -> not a member of attrs
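
The standard library already separates these pseudo-fields from regular ones: dataclasses.fields() omits both ClassVar and InitVar entries. The sketch below uses only the standard library; the Image class is a simplified stand-in, and it does not show how infer_field_kind() is actually implemented.

```python
from dataclasses import InitVar, dataclass, fields
from typing import ClassVar


@dataclass
class Image:
    spam: str = "spam"  # regular field
    ham: ClassVar[str] = "ham"  # class variable, not a field
    egg: InitVar[str] = "egg"  # init-only variable, not a field

    def __post_init__(self, egg: str) -> None:
        # InitVar values are passed here and are not stored as fields.
        pass


# fields() returns only regular fields, so ClassVar and InitVar are skipped.
field_names = [f.name for f in fields(Image)]
print(field_names)
```

Only names returned by fields() would then be candidates for xarray's attrs.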

Add bases module

Add bases module which provides DataArrayClass, a base class for dataclasses.

from xarray_dataclasses import DataArray, DataArrayClass


class Image(DataArrayClass):
    data: DataArray[("x", "y"), float]
    x: DataArray["x", int] = 0
    y: DataArray["y", int] = 0

Add typing module

Add typing module which provides a type for xarray.DataArray with fixed dims and dtype.

DataArray[("x", "y"), "f8"]

# DataArray[('x', 'y'), float64]

The type can be instantiated.

DataArray[("x", "y"), "f8"]([[0, 1], [2, 3]])

# <xarray.DataArray (x: 2, y: 2)>
# array([[0., 1.],
#        [2., 3.]])
# Dimensions without coordinates: x, y

Attributes on data array in dataset. (Feature)

Currently there seems to be no way to specify attributes for a DataArray in a Dataset. Indeed, even if they are passed to new() in a DataArray, they are discarded.

It would be nice to reuse DataArray specs to be able to specify these:

from xarray_dataclasses import dataarrayclass, datasetclass, Data, Attr, DataArray

@dataarrayclass
class FooSpec:
    data: Data['x', float]
    meta: Attr[int]

@datasetclass
class BarSpec:
    array: DataArray[FooSpec]

Does that look useful? Here DataArray is from xarray_dataclasses. If that name is not good because it conflicts with xarray, perhaps ArraySpec?

Add dataset module

Similar to dataarrayclass, introduce a class decorator, datasetclass.
Here is example code that expresses the dataset from xarray's docs.

from xarray_dataclasses.dataset import datasetclass
from xarray_dataclasses.typing import Coord, Data


@datasetclass
class Weather:
    # data variables
    temperature: Data[('x', 'y', 'time'), float]
    precipitation: Data[('x', 'y', 'time'), float]

    # dimensions
    x: Coord['x', int] = 0
    y: Coord['y', int] = 0
    time: Coord['time', 'datetime64[ns]'] = '2021-01-01'

    # coordinates
    lon: Coord[('x', 'y'), float] = 0.0
    lat: Coord[('x', 'y'), float] = 0.0
    reference_time: Coord[(), 'datetime64[ns]'] = '2021-01-01'

Add Coord and Data types

Add Coord and Data types (subclasses of typing.DataArray) to explicitly distinguish between data (data variables) and coordinates. This must be done before adding @datasetclass (#24).

from xarray_dataclasses import Coord, Data, dataarrayclass


@dataarrayclass
class Image:
    data: Data[('x', 'y'), float]
    x: Coord['x', int] = 0
    y: Coord['y', int] = 0

Fix wrong comment

Fix wrong comment at xarray_dataclasses/typing.py:L38.

  • Before fix: # for Python 3.7 and 3.9
  • After fix: # for Python 3.7 and 3.8

Update dev environment

  • Update dev Python packages
  • Update dev JavaScript packages
  • Fix code that causes type check errors

Update mix-in classes

Update mix-in classes (AsDataArray, AsDataset) so that the annotations of .new() are dynamically updated.

Fix type hints

  • Update Python code to be compatible with the strict mode of Pyright
  • Pin the version of Pyright

Update dataclass type

Use typing.ParamSpec in the dataclass type hint so that .new() can be statically typed.

Update typing module

Update xarray_dataclasses.typing.DataArray so that it accepts the same parameters as xarray.DataArray.

Release v0.3.1

Release package that closes the following issues/PRs.

  • #40 loosen Xarray dependency requirements

Update instance check of DataArray type

Update __instancecheck__ of typing.DataArray so that the following tests pass.

import numpy as np
from dataclasses import field
from xarray_dataclasses import DataArray


assert isinstance(0, DataArray['x', int])
assert isinstance(field(default=0), DataArray['x', int])
assert isinstance([0, 1, 2], DataArray['x', int])
assert isinstance(np.array([0, 1, 2]), DataArray['x', int])
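
A customized isinstance() check of this kind is usually implemented on a metaclass. The sketch below is an assumption-laden illustration, not the library's code: ArrayLikeMeta and ArrayLike are hypothetical names, dims and dtype are ignored, and NumPy arrays are left out to keep it dependency-free.

```python
from dataclasses import Field, field
from numbers import Number


class ArrayLikeMeta(type):
    """Hypothetical metaclass: isinstance() accepts array-like inputs."""

    def __instancecheck__(cls, obj: object) -> bool:
        # A dataclass field is judged by its default value.
        if isinstance(obj, Field):
            obj = obj.default
        # Sequences are accepted if every item is accepted (recursively).
        if isinstance(obj, (list, tuple)):
            return all(isinstance(item, cls) for item in obj)
        # Otherwise only numeric scalars are accepted.
        return isinstance(obj, Number)


class ArrayLike(metaclass=ArrayLikeMeta):
    """Hypothetical stand-in for a parametrized DataArray type."""


assert isinstance(0, ArrayLike)
assert isinstance(field(default=0), ArrayLike)
assert isinstance([0, 1, 2], ArrayLike)
assert not isinstance("spam", ArrayLike)
```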

Fix parsing dataclass under __future__.annotations enabled

The current (v0.4.0) dataclass parser (xarray_dataclasses.parse) does not work when postponed evaluation of annotations (PEP 563) is enabled via from __future__ import annotations. The parser reads the type hints of dataclass fields from cls.__dataclass_fields__, and under PEP 563 each field.type is a string that carries no runtime type information.

This issue resolves the problem by using typing_extensions.get_type_hints(cls, include_extras=True), which evaluates the string annotations into actual type hints.
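
The difference can be reproduced with the standard library alone. In this sketch a string annotation stands in for what PEP 563 does to every annotation, the Point class is hypothetical, and typing.get_type_hints (Python 3.9+) is used in place of the typing_extensions function to avoid a dependency.

```python
from dataclasses import dataclass, fields
from typing import Annotated, get_type_hints


@dataclass
class Point:
    # A string annotation mimics what PEP 563 does to every annotation.
    x: "Annotated[int, 'coord']" = 0


# field.type keeps the raw string, so runtime metadata is unavailable.
raw_type = fields(Point)[0].type
print(type(raw_type).__name__)

# get_type_hints() evaluates the string; include_extras keeps Annotated.
hints = get_type_hints(Point, include_extras=True)
print(hints["x"].__metadata__)
```

Without include_extras=True, get_type_hints() would strip the Annotated wrapper and return plain int, losing the metadata.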

Swap the order of dims and dtype in Coord[...] and Data[...]

Swap the order of dims and dtype so that the types read Coord[dims, dtype] and Data[dims, dtype].
This matches the order in xarray's printed format, which is (dims, dtype, value).

<xarray.DataArray (x: 3, y: 3)>
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])
Coordinates:
  * x        (x) int64 0 0 0
  * y        (y) int64 0 0 0

Release v0.2.0

  • Update docstrings of functions to be more detailed
  • Update package version (0.1.2 → 0.2.0)
  • Update README
