

This project forked from intake/intake-esm


An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.

Home Page: https://intake-esm.readthedocs.io

License: Apache License 2.0



Intake-esm

Motivation

Computer simulations of the Earth’s climate and weather generate huge amounts of data. These data are often persisted on HPC systems or in the cloud across multiple data assets in a variety of formats (netCDF, Zarr, etc.). Finding, investigating, and loading these data assets into compute-ready data containers costs time and effort. Before loading and analyzing a specific data set, the data user needs to know which data sets are available and which attributes describe each of them.

Finding, investigating, and loading these assets into data array containers such as xarray can be a daunting task because of the large number of files a user may be interested in. Intake-esm aims to address these issues by providing the necessary functionality for searching, discovering, accessing, and loading data.

Overview

intake-esm is a data cataloging utility built on top of intake, pandas, and xarray, and it's pretty awesome!

  • Opening an ESM collection definition file: An ESM (Earth System Model) collection file is a JSON file that conforms to the ESM Collection Specification. When provided a link/path to an ESM collection file, intake-esm establishes a link to a database (CSV file) that contains data asset locations and associated metadata (e.g., which experiment and model they come from). The collection JSON file can be stored on a local filesystem or hosted on a remote server.

    In [1]: import intake
    
    In [2]: col_url = "https://raw.githubusercontent.com/NCAR/intake-esm-datastore/master/catalogs/pangeo-cmip6.json"
    
    In [3]: col = intake.open_esm_datastore(col_url)
    
    In [4]: col
    Out[4]: <pangeo-cmip6 catalog with 4287 dataset(s) from 282905 asset(s)>
  • Search and Discovery: intake-esm provides functionality to execute queries against the catalog:

    In [5]: col_subset = col.search(
       ...:     experiment_id=["historical", "ssp585"],
       ...:     table_id="Oyr",
       ...:     variable_id="o2",
       ...:     grid_label="gn",
       ...: )
    
    In [6]: col_subset
    Out[6]: <pangeo-cmip6 catalog with 18 dataset(s) from 138 asset(s)>
  • Access: when the user is satisfied with the results of their query, they can ask intake-esm to load data assets (netCDF/HDF files and/or Zarr stores) into xarray datasets:

    In [7]: dset_dict = col_subset.to_dataset_dict(zarr_kwargs={"consolidated": True})
    
    --> The keys in the returned dictionary of datasets are constructed as follows:
            'activity_id.institution_id.source_id.experiment_id.table_id.grid_label'
    |███████████████████████████████████████████████████████████████| 100.00% [18/18 00:10<00:00]
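
To make the search and key-construction steps above concrete, here is a minimal sketch that uses only the Python standard library. It is not intake-esm's implementation and it loads no data. It only illustrates, under stated assumptions, how a query can filter rows of a CSV catalog and how the matching assets can be grouped under keys joined with ".", following the key pattern printed above. The catalog rows, the `store/...` paths, and the helper names `load_catalog`, `search`, and `group_assets` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature catalog: one row per data asset. The column names
# mirror those used in the examples above; the paths are made up.
CATALOG_CSV = """\
activity_id,institution_id,source_id,experiment_id,table_id,variable_id,grid_label,path
CMIP,NCAR,CESM2,historical,Oyr,o2,gn,store/a.zarr
CMIP,NCAR,CESM2,historical,Oyr,no3,gn,store/b.zarr
ScenarioMIP,NCAR,CESM2,ssp585,Oyr,o2,gn,store/c.zarr
CMIP,NCAR,CESM2,historical,Omon,o2,gn,store/d.zarr
CMIP,NCAR,CESM2,historical,Oyr,o2,gr,store/e.zarr
"""

# Columns joined with "." to form dataset keys, as in the message printed above.
KEY_COLUMNS = [
    "activity_id", "institution_id", "source_id",
    "experiment_id", "table_id", "grid_label",
]


def load_catalog(text):
    """Parse the CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def search(rows, **query):
    """Keep rows where each queried column equals the value, or one of the values."""
    def matches(row):
        for column, wanted in query.items():
            allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if row[column] not in allowed:
                return False
        return True

    return [row for row in rows if matches(row)]


def group_assets(rows, key_columns=KEY_COLUMNS):
    """Group asset paths under keys built by joining the key columns with '.'."""
    groups = defaultdict(list)
    for row in rows:
        key = ".".join(row[column] for column in key_columns)
        groups[key].append(row["path"])
    return dict(groups)


rows = load_catalog(CATALOG_CSV)
subset = search(
    rows,
    experiment_id=["historical", "ssp585"],
    table_id="Oyr",
    variable_id="o2",
    grid_label="gn",
)
datasets = group_assets(subset)
for key, paths in sorted(datasets.items()):
    print(key, paths)
```

In this sketch a list-valued query argument means "any of these values", while separate arguments must all match. That is how the `experiment_id=["historical", "ssp585"]` query above is read here; the actual matching rules are described in the intake-esm documentation.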

See the documentation at https://intake-esm.readthedocs.io for more information.

Installation

Intake-esm can be installed from PyPI with pip:

    python -m pip install intake-esm

It is also available from conda-forge for conda installations:

    conda install -c conda-forge intake-esm
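
After installing, one quick way to confirm that the package is visible to the current Python environment is to look up its import name, `intake_esm`. The snippet below is a small sketch that uses only the standard library; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


# Prints True when intake-esm is installed in the active environment.
print("intake_esm installed:", is_installed("intake_esm"))
```

If this prints `False`, the install most likely went into a different environment than the one running the interpreter.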

