lesommer / oocgcm

oocgcm is a python library for the analysis of large gridded geophysical datasets.

Home Page: http://oocgcm.rtfd.io

License: Apache License 2.0

Python 100.00%
dask geoscientific ocean ocean-models python python-packages xarray

oocgcm's People

Contributors

alaindomissy, apatlpo, lesommer


oocgcm's Issues

Shall I rename projection coordinates?

I am not sure whether I am using the CF standard names 'projection_x_coordinate' and 'projection_y_coordinate' appropriately.

The current version of oocgcm assumes that projection coordinates refer to lat/lon and have no unit. Is this correct?

Problem when slicing a NEMO 2d grid object defined with dask chunks.

A slice of a grid object defined with chunks appears to be dysfunctional.

e.g.

from oocgcm.oceanmodels.nemo import grids
_grd = grids.nemo_2d_grid(..., chunks=xr_chunks)
grd = _grd[500:1500, 500:1500]

This works:

gssh = grd.horizontal_gradient(buoyancy)

but this raises an exception:

ghb = grd.norm_of_vectorfield(grd.horizontal_gradient(buoyancy))

Inheritance of DataArray

I have created a Window object that inherits from DataArray. I had some issues with recursive calls of the constructor inside the print and plot methods. I managed to fix the print method by looking at the DataArray code, but I had to override the plot method for the moment. Has anyone tried to create an object that inherits from DataArray?
I think this is an important issue that needs to be addressed for future development. I am not sure the structure of xarray objects makes inheritance easy.
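For what it's worth, xarray's documented alternative to subclassing is a registered accessor: a namespace attached to every DataArray via composition, so the DataArray constructor is never re-entered. A minimal sketch, where the accessor name "win" and the boxmean method are made up for illustration:

```python
import numpy as np
import xarray as xr

# Sketch of the accessor pattern (composition instead of inheritance).
# The accessor name "win" and its method are hypothetical.
@xr.register_dataarray_accessor("win")
class WindowAccessor:
    def __init__(self, da):
        self._da = da  # plain attribute; no recursive constructor calls

    def boxmean(self, n=3):
        # centered running mean along the last dimension of the array
        dim = self._da.dims[-1]
        return self._da.rolling({dim: n}, center=True).mean()

da = xr.DataArray(np.arange(5.0), dims="x")
smoothed = da.win.boxmean()
```

Because the accessor only holds a reference to the wrapped DataArray, print and plot keep working unchanged on the underlying object.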

Subclassing DataArray to create a GriddedArray structure that refers to the grid?

In the long term, it might be more pythonic for methods related to e.g. a scalar field (its gradient, smoothing, etc.) to be attributes of the DataArray itself.

grd = oocgcm.oceanmodels.nemo.grids.nemo_2d_grid(...)
ssh = oocgcm.oceanmodels.nemo.io.GriddedScalarArray('filessh.nc', grid=grd, grid_location='t')
ssh[0].gradient().divergence().plot()  # plots the laplacian of ssh
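A throwaway sketch of what such a fluent interface could look like, using plain numpy and hypothetical class names (no grid metrics, just np.gradient on an evenly spaced grid):

```python
import numpy as np

class GriddedScalar:
    """Hypothetical wrapper: data plus a reference to its grid."""
    def __init__(self, data, grid=None):
        self.data = np.asarray(data, dtype=float)
        self.grid = grid

    def gradient(self):
        g = np.gradient(self.data)       # one component per axis
        if self.data.ndim == 1:
            g = [g]
        return GriddedVector(g, grid=self.grid)

class GriddedVector:
    """Hypothetical vector counterpart; component i varies along axis i."""
    def __init__(self, components, grid=None):
        self.components = list(components)
        self.grid = grid

    def divergence(self):
        div = sum(np.gradient(c, axis=i) for i, c in enumerate(self.components))
        return GriddedScalar(div, grid=self.grid)

# gradient().divergence() chains into a discrete Laplacian:
y, x = np.mgrid[0:6, 0:6]
f = GriddedScalar(x**2 + y**2)           # analytic Laplacian is exactly 4
lap = f.gradient().divergence()
```

In the interior of the domain the centered differences are exact for this quadratic field, so lap.data equals 4.0 away from the edges.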

Suggestion: correct coordinates after curl/norm

Hello,

Here is my example: I am reading a u,v field, respectively on the U-grid and V-grid, but also with their respective vertical coordinates, depthu and depthv. I want to compute the curl, which will be centered on the f-grid. Note that the issue is similar for the vector norm (centered on the T-grid).

vec_xr = gridsc.VectorField2d(vi_xr, vj_xr,
                              x_component_grid_location='u',
                              y_component_grid_location='v')

#EE mode
vcurl_xr = grd.vertical_component_of_curl(vec_xr)
print(vi_xr)
print(vj_xr)
print(vec_xr)
print(vcurl_xr)

Here is the output:

<xarray.DataArray (t: 5, depthu: 19, y: 3454, x: 5422)>
dask.array<elemwis..., shape=(5, 19, 3454, 5422), dtype=float64, chunksize=(1, 1, 1727, 2711)>
Coordinates:
    nav_lat        (y, x) float32 26.5648 26.5648 26.5648 26.5648 26.5648 ...
    nav_lon        (y, x) float32 -81.4512 -81.4346 -81.4179 -81.4012 ...
  * depthu         (depthu) float32 0.480455 1.55879 2.79421 4.18731 5.73867 ...
  * x              (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * y              (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * t              (t) datetime64[ns] 2008-01-07T12:00:00 ...
    time_centered  (t) datetime64[ns] 2008-01-07T12:00:00 ...
Attributes:
    grid_location: u

<xarray.DataArray (t: 5, depthv: 19, y: 3454, x: 5422)>
dask.array<elemwis..., shape=(5, 19, 3454, 5422), dtype=float64, chunksize=(1, 1, 1727, 2711)>
Coordinates:
    nav_lat        (y, x) float32 26.5648 26.5648 26.5648 26.5648 26.5648 ...
    nav_lon        (y, x) float32 -81.4512 -81.4346 -81.4179 -81.4012 ...
  * depthv         (depthv) float32 0.480455 1.55879 2.79421 4.18731 5.73867 ...
  * x              (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * y              (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * t              (t) datetime64[ns] 2008-01-07T12:00:00 ...
    time_centered  (t) datetime64[ns] 2008-01-07T12:00:00 ...
Attributes:
    grid_location: v

VectorField2d(x_component=<xarray.DataArray (t: 5, depthu: 19, y: 3454, x: 5422)>
dask.array<elemwis..., shape=(5, 19, 3454, 5422), dtype=float64, chunksize=(1, 1, 1727, 2711)>
Coordinates:
    nav_lat        (y, x) float32 26.5648 26.5648 26.5648 26.5648 26.5648 ...
    nav_lon        (y, x) float32 -81.4512 -81.4346 -81.4179 -81.4012 ...
  * depthu         (depthu) float32 0.480455 1.55879 2.79421 4.18731 5.73867 ...
  * x              (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * y              (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * t              (t) datetime64[ns] 2008-01-07T12:00:00 ...
    time_centered  (t) datetime64[ns] 2008-01-07T12:00:00 ...
Attributes:
    grid_location: u, y_component=<xarray.DataArray (t: 5, depthv: 19, y: 3454, x: 5422)>
dask.array<elemwis..., shape=(5, 19, 3454, 5422), dtype=float64, chunksize=(1, 1, 1727, 2711)>
Coordinates:
    nav_lat        (y, x) float32 26.5648 26.5648 26.5648 26.5648 26.5648 ...
    nav_lon        (y, x) float32 -81.4512 -81.4346 -81.4179 -81.4012 ...
  * depthv         (depthv) float32 0.480455 1.55879 2.79421 4.18731 5.73867 ...
  * x              (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * y              (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * t              (t) datetime64[ns] 2008-01-07T12:00:00 ...
    time_centered  (t) datetime64[ns] 2008-01-07T12:00:00 ...
Attributes:
    grid_location: v, x_component_grid_location='u', y_component_grid_location='v')

<xarray.DataArray 'vertical component of the curl' (t: 5, depthv: 19, y: 3454, x: 5422, depthu: 19)>
dask.array<elemwis..., shape=(5, 19, 3454, 5422, 19), dtype=float64, chunksize=(1, 1, 1727, 2711, 1)>
Coordinates:
  * depthv         (depthv) float32 0.480455 1.55879 2.79421 4.18731 5.73867 ...
  * x              (x) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * y              (y) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 ...
  * t              (t) datetime64[ns] 2008-01-07T12:00:00 ...
    time_centered  (t) datetime64[ns] 2008-01-07T12:00:00 ...
  * depthu         (depthu) float32 0.480455 1.55879 2.79421 4.18731 5.73867 ...
Attributes:
    grid_location: f

The original coordinates of the u,v components are broadcast through the operation (here, the curl), and the result is a variable with four dimension coordinates.

This can be very costly in memory and in write operations, because it creates a 4-D variable which is N times bigger than the expected 3-D variable, where N is the vertical size.
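The broadcasting can be reproduced without oocgcm: any binary operation between two DataArrays carrying differently named vertical dimensions produces the outer product of those dimensions. A minimal sketch, together with the obvious workaround of renaming one dimension before combining (which assumes depthu and depthv are numerically identical, as they are in NEMO):

```python
import numpy as np
import xarray as xr

# u and v carry differently named vertical dimensions, as in the report above
u = xr.DataArray(np.ones((3, 4)), dims=("depthu", "x"))
v = xr.DataArray(np.ones((3, 4)), dims=("depthv", "x"))

bad = u * v                                # broadcast keeps BOTH depth dims
good = u * v.rename({"depthv": "depthu"})  # workaround: unify the depth dims
```

The "bad" result has an extra vertical dimension (the N-fold blow-up described above); after the rename, the product stays 2-D.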

Best regards,
Simon

Slicing/isosurface methods for 3D grids.

Eventually, 3D grids should come with methods for projecting 3D fields onto 2D isosurfaces (e.g. temperature on an isopycnal).

This issue gathers information and ideas on how to proceed.

Cannot manage memory properly with norm_of_vectorfield

Hi all,
My problem is the following: I want to compute simple statistics using the norm of a vector (u/v fields).
When using norm_of_vectorfield, memory usage is higher than when writing the norm manually (as if the grids were collocated). The fatal outcome is that norm_of_vectorfield may overfill memory, whereas memory stays well bounded with a manual xarray expression of the norm.

Here is a simple script showing (by watching the monitor) that memory usage gets higher with norm_of_vectorfield.

import sys, os
import xarray as xr
from xarray import ufuncs as uf
from contextlib import contextmanager
import time
from oocgcm.oceanmodels.nemo import grids
from oocgcm.core import grids as gridsc

#-----------------------------------------------------------------------------------------------------------
@contextmanager
def timeit_context(name):
    startTime = time.time()
    yield
    elapsedTime = time.time() - startTime
    print('{} takes {} s'.format(name, int(elapsedTime)))

#-----------------------------------------------------------------------------------------------------------

chunks = (1727, 2711)
xr_chunks = {'x': chunks[-1], 'y': chunks[-2]}
xr_chunks_u = {'x': chunks[-1], 'y': chunks[-2], 'time_counter':1, 'depthu': 1}
xr_chunks_v = {'x': chunks[-1], 'y': chunks[-2], 'time_counter':1, 'depthv': 1}

# - Parameter
natl60_path = '/home7/pharos/othr/NATL60/'
coordfile = natl60_path + 'NATL60-I/NATL60_coordinates_v4.nc'
maskfile = natl60_path + 'NATL60-I/NATL60_v4.1_cdf_byte_mask.nc'

grd = grids.nemo_2d_grid(nemo_coordinate_file=coordfile, nemo_byte_mask_file=maskfile, chunks=xr_chunks)


#30 times, 2 fields, 300 vertical levels
filenameu = natl60_path+'NATL60-MJM155-S/5d/2008/NATL60-MJM155_y2008m0*d*.5d_gridU.nc'
filenamev = natl60_path+'NATL60-MJM155-S/5d/2008/NATL60-MJM155_y2008m0*d*.5d_gridV.nc'


# open u and v components over 10 levels
vi_xr = xr.open_mfdataset(filenameu,chunks=xr_chunks_u)['vozocrtx'].isel(depthu=slice(0,10))
vj_xr = xr.open_mfdataset(filenamev,chunks=xr_chunks_v)['vomecrty'].isel(depthv=slice(0,10))

# vector object
vec_xr = gridsc.VectorField2d(vi_xr, 
                              vj_xr,
                              x_component_grid_location='u',
                              y_component_grid_location='v') 

#1 Manual vector norm
with timeit_context('     #1'):
    print('# euclidean norm')
    vnorm_xr = uf.sqrt(uf.square(vec_xr[0]) + uf.square(vec_xr[1]))
    print(vnorm_xr.mean(dim='time_counter').isel(depthv=9, depthu=9).values)

#2 oocgcm vector norm
with timeit_context('     #2'):
    print('# euclidean norm 2')
    vnorm2_xr = grd.norm_of_vectorfield(vec_xr)
    print(vnorm2_xr.mean(dim='time_counter').isel(depthv=9, depthu=9).values)


print('end')

Thank you very much for any advice,
Simon

Add imports of sub-packages to top-level __init__.py

Without import statements for the subpackages, it's harder to play around with / experiment with oocgcm as a new user, since you can't access any of the subpackages:

In [1]: import oocgcm

In [2]: dir(oocgcm)
Out[2]:
['__all__',
 '__builtins__',
 '__cached__',
 '__doc__',
 '__file__',
 '__loader__',
 '__name__',
 '__package__',
 '__path__',
 '__spec__',
 '__version__',
 'version']

In [3]: oocgcm.core
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-4aa17fd422c5> in <module>()
----> 1 oocgcm.core

AttributeError: module 'oocgcm' has no attribute 'core'

In [4]: import oocgcm.core

In [5]: oocgcm.core
Out[5]: <module 'oocgcm.core' from '/Users/spencer/miniconda3/lib/python3.5/site-packages/oocgcm-0.1.0-py3.5.egg/oocgcm/core/__init__.py'>

So simply adding e.g. from . import core etc. to the top-level __init__.py would help. I'm happy to do this as a PR... are there any subpackages that you don't want in the public API? Otherwise I can just add them all.
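The effect of the proposed one-liner can be demonstrated with a throwaway package built at runtime (the package name "pkg" is made up):

```python
import importlib
import os
import sys
import tempfile

# Build a minimal package with a "core" subpackage, whose top-level
# __init__.py re-exports the subpackage with "from . import core".
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "pkg", "core"))
with open(os.path.join(root, "pkg", "__init__.py"), "w") as f:
    f.write("from . import core\n")   # the proposed one-line fix
open(os.path.join(root, "pkg", "core", "__init__.py"), "w").close()

sys.path.insert(0, root)
pkg = importlib.import_module("pkg")
print(hasattr(pkg, "core"))   # without the re-export this would be False
```

With the re-export in place, `import pkg` alone makes `pkg.core` available, which is exactly the behavior the issue asks for in oocgcm.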

A decorator for testing chunks/grid_location.

For equation-of-state related quantities, we need to test the compatibility of the chunks and grid_location of xarray DataArrays before applying a function that works on numpy arrays. This can be done with a decorator.
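A minimal sketch of such a decorator in plain Python (all names are hypothetical; a stand-in object with an attrs dict plays the role of a DataArray, and only the grid_location check is shown, not the chunk check):

```python
from functools import wraps

def check_grid_location(expected):
    """Hypothetical decorator: verify that every positional argument
    carries attrs['grid_location'] == expected before calling func."""
    def decorator(func):
        @wraps(func)
        def wrapper(*arrays, **kwargs):
            for a in arrays:
                loc = getattr(a, "attrs", {}).get("grid_location")
                if loc != expected:
                    raise ValueError(
                        f"expected grid_location {expected!r}, got {loc!r}")
            return func(*arrays, **kwargs)
        return wrapper
    return decorator

class FakeArray:                       # stand-in for an xarray DataArray
    def __init__(self, grid_location):
        self.attrs = {"grid_location": grid_location}

@check_grid_location("t")              # equation-of-state quantities live at t
def density(arr):
    return "ok"
```

Calling density(FakeArray("t")) succeeds, while density(FakeArray("u")) raises ValueError before the numpy function ever runs; a chunk-compatibility check could be stacked the same way.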

Data objects (scalars, vectors, ...)

Data objects (scalars, vectors, ...) should probably be DataArrays that are augmented with grid information (as is done with grid_location), and they should live in an independent part of the code.

What sort of data structure for vector fields?

The components of a vector field (horizontal velocity, gradient of a scalar field) are not defined on the same grid.
My current understanding of xarray Datasets is that their variables are defined on the same grid. We can therefore think of providing a new (minimal) data structure for vector fields, or actually two of them (2D and 3D vector fields).
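Judging by the VectorField2d repr shown in the curl issue above, a namedtuple is the minimal structure that fits; a sketch consistent with that repr:

```python
from collections import namedtuple

# Minimal sketch consistent with the VectorField2d repr seen above:
# two components plus their respective grid locations.
VectorField2d = namedtuple(
    "VectorField2d",
    ["x_component", "y_component",
     "x_component_grid_location", "y_component_grid_location"],
)

vec = VectorField2d(x_component=[1.0], y_component=[2.0],
                    x_component_grid_location="u",
                    y_component_grid_location="v")
```

A namedtuple keeps the components formally separate (no accidental broadcasting between grids) while still supporting tuple-style access like vec[0].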

Problem while reading a self-made netcdf file with io.return_xarray_mfdataset

Hello,

My problem is as follows: I want to read a self-made netcdf file with io.return_xarray_mfdataset.
The netcdf header gives:

group: netcdf4 {
  dimensions:
    y = 3454 ;
    x = 5422 ;
    t = 1 ;
  variables:
    float nav_lat(y, x) ;
      nav_lat:axis = "Y" ;
      nav_lat:standard_name = "latitude" ;
      nav_lat:long_name = "Latitude" ;
      nav_lat:units = "degrees_north" ;
      nav_lat:nav_model = "grid_T" ;
    float nav_lon(y, x) ;
      nav_lon:axis = "X" ;
      nav_lon:standard_name = "longitude" ;
      nav_lon:long_name = "Longitude" ;
      nav_lon:units = "degrees_east" ;
      nav_lon:nav_model = "grid_T" ;
    double time_centered(t) ;
      time_centered:standard_name = "time" ;
      time_centered:long_name = "Time axis" ;
      time_centered:title = "Time" ;
      time_centered:time_origin = "1958-01-01 00:00:00" ;
      time_centered:bounds = "time_centered_bounds" ;
      time_centered:units = "seconds since 1958-01-01" ;
      time_centered:calendar = "gregorian" ;
    double t(t) ;
      t:axis = "T" ;
      t:standard_name = "time" ;
      t:long_name = "Time axis" ;
      t:title = "Time" ;
      t:time_origin = "1958-01-01 00:00:00" ;
      t:bounds = "time_counter_bounds" ;
      t:units = "seconds since 1958-01-01" ;
      t:calendar = "gregorian" ;
    float sossheig(t, y, x) ;
      sossheig:_FillValue = 0.f ;
      sossheig:long_name = "sea surface height" ;
      sossheig:units = "m" ;
      sossheig:online_operation = "average" ;
      sossheig:interval_operation = "40s" ;
      sossheig:interval_write = "5d" ;
      sossheig:coordinates = "nav_lon nav_lat time_centered" ;
} // group netcdf4

My code is:

from oocgcm.core import io
chunks = (1727, 2711)
xr_chunks_tmean = {'y': chunks[-2], 'x': chunks[-1], 't': 1}
vmean_xrt = io.return_xarray_mfdataset(filemean, chunks=xr_chunks_tmean)[vdict[vkey]['vname']][:]

I get the error output:

ValueError: some chunks keys are not dimensions on this object: ['y', 'x', 't']

Problem with boundary values in .core.grids._horizontal_gradient

numpy.gradient computes spatial derivatives with a second-order scheme, except near boundaries where the derivatives are evaluated with a first-order scheme. oocgcm.core.grids._horizontal_gradient wraps numpy.gradient for xarray.DataArray with dask.

In order for the result not to depend on the chunk size, we need to set a depth and use dask.array.map_overlap.

But it is not clear how to specify that the calculation should be first order near the boundary when a chunk is close to the boundary: numpy.gradient sees the ghost cells and therefore applies the second-order derivative...

The current implementation explicitly sets a np.nan boundary condition to avoid errors.

This could be a question for the dask mailing list.
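Both behaviors are easy to see with numpy alone: the first-order edges, and the chunk dependence that map_overlap is meant to remove (pure numpy, no dask, so the overlap is simulated by slicing):

```python
import numpy as np

f = np.arange(6.0) ** 2          # f(x) = x**2, exact derivative 2x
g = np.gradient(f)               # second order in the interior,
                                 # first order (one-sided) at the edges

# interior point: centered difference is exact for a parabola -> g[2] == 4.0
# left edge: one-sided difference gives g[0] == 1.0 (exact value is 0)

# naive per-chunk gradient disagrees with the full gradient at the seam:
left = np.gradient(f[:4])        # "chunk" without ghost cells: left[3] == 5.0
# whereas g[3] == 6.0, so the result depends on the chunking

# with one ghost cell of overlap and trimming, the interior stencil returns:
left_olap = np.gradient(f[:5])[:4]   # left_olap[3] == 6.0 == g[3]
```

This is exactly the dilemma described above: the overlap restores the second-order interior scheme at chunk seams, but at a true domain boundary any ghost cells supplied by map_overlap would make numpy apply the second-order formula where the first-order one is wanted.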

oocgcm does not trivially apply to MITgcm outputs

I have worked on adding the MITgcm grid; it requires a few hacks to make it work with the current API:

  • I do not have a mask with the MITgcm files, so I used 'cell_vertical_fraction_at_t_location' instead. It seems to work for the moment, even if it is not the best solution.
  • I skipped the chunk checking.
  • I have relaxed the condition on the required arrays and added a function to generate the cell sizes if needed.

I have successfully computed some gradients, but the Laplacian does not work. I think there is some work to do to test whether oocgcm applies well to different models.

I have pushed a branch to keep working on it. @rabernat, you might be interested in it.
