nsidc-data-tutorials's Introduction

NSIDC-Data-Tutorials

Binder

Test Notebooks

Summary

This combined repository includes tutorials and code resources provided by the NASA National Snow and Ice Data Center Distributed Active Archive Center (NSIDC DAAC). The tutorials are Python-based Jupyter notebooks that offer guidance on working with various data products, including how to access, subset, transform, and visualize data. Each tutorial can be accessed by navigating to the /notebooks folder of this repository. Please see the README file in each individual tutorial folder for more information on that tutorial and its learning objectives. Please note that all branches other than Main should be considered in development and are not supported.

Tutorials

Snow Depth and Snow Cover Data Exploration

Originally demonstrated through the NASA Earthdata Webinar "Let It Snow! Accessing and Analyzing Snow Data at the NSIDC DAAC" on May 6, 2020, this tutorial provides guidance on how to discover, access, and couple snow data across varying geospatial scales from NASA's SnowEx, Airborne Snow Observatory, and Moderate Resolution Imaging Spectroradiometer (MODIS) missions. The tutorial highlights the ability to search and access data by a defined region, and combine and compare snow data across different data formats and scales using a Python-based Jupyter Notebook.

Getting the most out of NSIDC DAAC data: Discovering, Accessing, and Harmonizing Arctic Remote Sensing Data

Originally presented during the 2019 AGU Fall Meeting, this tutorial demonstrates the NSIDC DAAC's data discovery, access, and subsetting services, along with basic open source resources used to harmonize and analyze data across multiple products. The tutorial is provided as a series of Python-based Jupyter Notebooks, focusing on sea ice height and ice surface temperature data from NASA’s ICESat-2 and MODIS missions, respectively, to characterize Arctic sea ice.

Harmonized data for pre-IceBridge, ICESat and IceBridge data sets. These Jupyter notebooks are interactive documents that teach students and researchers interested in cryospheric sciences how to access and work with airborne altimetry and related data sets from NASA's IceBridge mission, and satellite altimetry data from the ICESat and ICESat-2 missions, using the NSIDC IceFlow API.

Global land ice velocities. The Inter-mission Time Series of Land Ice Velocity and Elevation (ITS_LIVE) project facilitates ice sheet, ice shelf and glacier research by providing a globally comprehensive and temporally dense multi-sensor record of land ice velocity and elevation with low latency. Scene-pair velocities were generated from satellite optical and radar imagery.

The notebooks in this project demonstrate how to search for and access ITS_LIVE velocity pairs and provide a simple example of how to build a data cube.
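As a loose illustration of the data-cube idea, such a cube can be opened lazily with xarray; this is a minimal sketch, the store path below is a placeholder rather than a verified ITS_LIVE cube location, and it assumes s3fs is installed:

import xarray as xr

# Placeholder path standing in for a real ITS_LIVE Zarr datacube on S3.
cube_url = "s3://its-live-data/datacubes/example_cube.zarr"

# Open the cube lazily; because it is a Zarr store, no full download is needed.
dc = xr.open_zarr(cube_url, storage_options={"anon": True})
print(dc.data_vars)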

Accessing and working with ICESat-2 Data in the Cloud

Originally presented to the UWG (User Working Group) in May 2022, this tutorial demonstrates how to search for ICESat-2 data hosted in the Earthdata Cloud and how to directly access it from an Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instance using the earthaccess package.
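As a rough sketch of that workflow (the short name, bounding box, and dates below are illustrative and not the tutorial's exact values):

import earthaccess

# Authenticate with Earthdata Login (interactively or via a .netrc file).
auth = earthaccess.login()

# Illustrative query: ICESat-2 ATL06 granules over a small box and time range.
results = earthaccess.search_data(
    short_name="ATL06",
    bounding_box=(-50.0, 68.0, -48.0, 70.0),
    temporal=("2021-01-01", "2021-01-31"),
)

# Inside the AWS us-west-2 region this returns file-like objects opened directly
# from the Earthdata Cloud; outside the region, earthaccess.download() can be used instead.
files = earthaccess.open(results)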

Download, crop, resample, and plot multiple GeoTIFFs

This tutorial guides you through programmatically accessing and downloading GeoTIFF files from the NSIDC DAAC to your local computer. We then crop and resample one GeoTIFF based on the extent and pixel size of another GeoTIFF, and plot one on top of the other.

We will use two data sets from the NASA MEaSUREs (Making Earth System Data Records for Use in Research Environments) program as examples.
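The crop-and-resample step can be sketched with rioxarray as follows; the file names are placeholders, and the tutorial's actual data sets and plotting code may differ:

import rioxarray
import matplotlib.pyplot as plt

# Placeholder file names standing in for the two downloaded GeoTIFFs.
fine = rioxarray.open_rasterio("grid_a.tif", masked=True).squeeze()
coarse = rioxarray.open_rasterio("grid_b.tif", masked=True).squeeze()

# Reproject, crop, and resample the first grid onto the extent and pixel size of the second.
fine_matched = fine.rio.reproject_match(coarse)

# Plot one on top of the other.
fig, ax = plt.subplots()
coarse.plot(ax=ax, cmap="Greys_r")
fine_matched.plot(ax=ax, alpha=0.6)
plt.show()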

Usage with Binder

The Binder button above allows you to explore and run the notebook in a shared cloud computing environment without the need to install dependencies on your local machine. Note that this option will not directly download data to your computer; instead the data will be downloaded to the cloud environment.

Usage with Docker

On macOS or Linux

  1. Install Docker. Use the left-hand navigation to select the appropriate installation for your operating system.

  2. Download the NSIDC-Data-Tutorials repository from Github.

  3. Unzip the file, and open a terminal window in the NSIDC-Data-Tutorials folder's location.

  4. From the terminal window, launch the docker container using the following command, replacing [path/notebook_folder] with your path and notebook folder name:

docker run --name tutorials -p 8888:8888 -v [path/notebook_folder]:/home/jovyan/work nsidc/tutorials

Example:

docker run --name tutorials -p 8888:8888 -v /Users/name/Desktop/NSIDC-Data-Tutorials:/home/jovyan/work nsidc/tutorials

Or, with docker-compose:

docker-compose up

If you want to mount a directory with write permissions, you need to grant the container the same permissions as those on the directory to be mounted and tell it that it has "root" access (within the container). This is important if you want to persist your work or download data to a local directory and not just the Docker container. Run the example command below for this option:

docker run --name tutorials -e NB_UID=$(id -u) --user root -p 8888:8888 -v  /Users/name/Desktop/NSIDC-Data-Tutorials:/home/jovyan/work nsidc/tutorials

The initialization will take some time and will require 2.6 GB of space. Once the startup is complete you will see a line of output similar to this:

To access the notebook, open this file in a browser:
        file:///home/jovyan/.local/share/jupyter/runtime/nbserver-6-open.html
    Or copy and paste one of these URLs:
        http://4dc97ddd7a0d:8888/?token=f002a50e25b6f623aa775312737ba8a23ffccfd4458faa6f
     or http://127.0.0.1:8888/?token=f002a50e25b6f623aa775312737ba8a23ffccfd4458faa6f

If you started your container with the -d/--detach option, check docker logs tutorials for this output.

  5. Open a web browser and paste in one of the URLs from the terminal output.

  6. You will be brought to a Jupyter Notebook interface running through the Docker container. The left side of the interface displays your local directory structure. Navigate to the work folder, which contains the mounted NSIDC-Data-Tutorials repository. You can now interact with the notebooks to explore and access data.

On Windows

  1. Install Docker.

  2. Download the NSIDC-Data-Tutorials repository from Github.

  3. Unzip the file, and open a terminal window (use Command Prompt or PowerShell, not PowerShell ISE) in the NSIDC-Data-Tutorials folder's location.

  4. From the terminal window, launch the docker container using the following command, replacing [path\notebook_folder] with your path and notebook folder name:

docker run --name tutorials -p 8888:8888 -v [path\notebook_folder]:/home/jovyan/work nsidc/tutorials

Example:

docker run --name tutorials -p 8888:8888 -v C:\notebook_folder:/home/jovyan/work nsidc/tutorials

Or, with docker-compose:

docker-compose up

If you want to mount a directory with write permissions, you need to grant the container the same permissions as those on the directory to be mounted and tell it that it has "root" access (within the container):

docker run --name tutorials --user root -p 8888:8888 -v C:\notebook_folder:/home/jovyan/work nsidc/tutorials

The initialization will take some time and will require 2.6 GB of space. Once the startup is complete you will see a line of output similar to this:

To access the notebook, open this file in a browser:
        file:///home/jovyan/.local/share/jupyter/runtime/nbserver-6-open.html
    Or copy and paste one of these URLs:
        http://(6a8bfa6a8518 or 127.0.0.1):8888/?token=2d72e03269b59636d9e31937fcb324f5bdfd0c645a6eba3f

If you started your container with the -d/--detach option, check docker logs tutorials for this output.

  5. Follow the instructions and copy one of the URLs into a web browser, then hit return. The address should look something like this:

http://127.0.0.1:8888/?token=2d72e03269b59636d9e31937fcb324f5bdfd0c645a6eba3f

  6. You will now see the NSIDC-Data-Tutorials repository within the Jupyter Notebook interface. Navigate to /work to open the notebooks.

  7. You can now interact with the notebooks to explore and access data.

Usage with Mamba/Conda

Note: If you already have conda or mamba installed, you can skip the first step.

  1. Install mambaforge (Python 3.9+) for your platform from the mamba documentation.

  2. Download the NSIDC-Data-Tutorials repository from Github by clicking the green 'Code' button located at the top right of the repository page and clicking 'Download Zip'. Unzip the file, and open a command line or terminal window in the NSIDC-Data-Tutorials folder's location.

  3. From a command line or terminal window, install the required environment with the following commands:

Linux

mamba create -n nsidc-tutorials --file binder/conda-linux-64.lock

OSX

mamba create -n nsidc-tutorials --file binder/conda-osx-64.lock

Windows

mamba create -n nsidc-tutorials --file binder/conda-win-64.lock

You should now see that the dependencies were installed and the environment is ready to be used.

Activate the environment with

conda activate nsidc-tutorials

Launch the notebook locally with the following command:

jupyter lab

This should open a browser window with the JupyterLab IDE, showing your current working directory in the left-hand navigation. Navigate to the tutorial folder of your choice and click on the associated *.ipynb files to get started.

Tutorial Environments

Although the nsidc-tutorials environment should run all the notebooks in this repository, we also include tutorial-specific environments that contain only the dependencies for each tutorial. If you don't want to "pollute" your conda environments and are only going to work with one of the tutorials, we recommend using these instead of the nsidc-tutorials environment. The steps to install them are exactly the same, but the environment files are inside the environment folder of each tutorial, e.g. for ITS_LIVE:

cd notebooks/itslive 
mamba create -n nsidc-itslive --file environment/conda-linux-64.lock
conda activate nsidc-itslive
jupyter lab

This creates a pinned environment that should be fully reproducible across platforms.

NOTE: Sometimes conda environments change (break) even with pinned-down dependencies. If you run into a dependency issue with the tutorials, please open an issue and we'll try to fix it as soon as possible.

Credit

This software is developed by the National Snow and Ice Data Center with funding from multiple sources.

License

This repository is licensed under the MIT license.

nsidc-data-tutorials's People

Contributors

andypbarrett, asteiker, betolink, jessicas11, jroebuck932, lisakaser, mfisher87, mikala-nsidc, nicholas-kotlinski


nsidc-data-tutorials's Issues

AttributeError from import of iceflow ui: NASAGIBS.BlueMarble no longer listed in xyzservices

What Happened

The 0_introduction.ipynb notebook raises an AttributeError in the first cell when trying to import iceflow.ui. This appears to result from a call to ipyleaflet.basemaps.NASAGIBS.BlueMarble.

Investigation

Looking at https://xyzservices.readthedocs.io/en/stable/introduction.html under NASAGIBS services, it appears that there is no longer an entry for BlueMarble. BlueMarble is also not shown in https://xyzservices.readthedocs.io/en/stable/gallery.html.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File [~/mambaforge/envs/nsidc-iceflow/lib/python3.9/site-packages/xyzservices/lib.py:44](http://localhost:8889/home/apbarret/mambaforge/envs/nsidc-iceflow/lib/python3.9/site-packages/xyzservices/lib.py#line=43), in Bunch.__getattr__(self, key)
     43 try:
---> 44     return self.__getitem__(key)
     45 except KeyError as err:

KeyError: 'BlueMarble'

The above exception was the direct cause of the following exception:

AttributeError                            Traceback (most recent call last)
Cell In[1], line 2
      1 # Importing IceFlow client library
----> 2 from iceflow.ui import IceFlowUI
      3 from iceflow.client import IceflowClient
      5 import earthaccess

File [~/src/NSIDC-Data-Tutorials/notebooks/iceflow/iceflow/ui.py:9](http://localhost:8889/lab/tree/iceflow/ui.py#line=8)
      6 from IPython.display import display, HTML
      7 from ipyleaflet import (Map, SearchControl, AwesomeIcon, GeoJSON,
      8                         Marker, DrawControl, LayersControl)
----> 9 from .layers import custom_layers, flight_layers, widget_projections
     10 from .client import IceflowClient
     13 class IceFlowUI:

File [~/src/NSIDC-Data-Tutorials/notebooks/iceflow/iceflow/layers.py:106](http://localhost:8889/lab/tree/iceflow/layers.py#line=105)
     64 north_3413 = {
     65     'name': 'EPSG:3413',
     66     'custom': True,
   (...)
     81     ]
     82 }
     84 south_3031 = {
     85     'name': 'EPSG:3031',
     86     'custom': True,
   (...)
    101     ]
    102 }
    104 widget_projections = {
    105     'global': {
--> 106         'base_map': basemaps.NASAGIBS.BlueMarble,
    107         'projection': projections.EPSG3857,
    108         'center': (30, -30),
    109         'zoom': 2,
    110         'max_zoom': 8
    111     },
    112     'north': {
    113         'base_map': basemaps.NASAGIBS.BlueMarble3413,
    114         'projection': north_3413,
    115         'center': (80, -50),
    116         'zoom': 1,
    117         'max_zoom': 4
    118     },
    119     'south': {
    120         'base_map': basemaps.NASAGIBS.BlueMarble3031,
    121         'projection': south_3031,
    122         'center': (-90, 0),
    123         'zoom': 1,
    124         'max_zoom': 4
    125     }
    126 }

File [~/mambaforge/envs/nsidc-iceflow/lib/python3.9/site-packages/xyzservices/lib.py:46](http://localhost:8889/home/apbarret/mambaforge/envs/nsidc-iceflow/lib/python3.9/site-packages/xyzservices/lib.py#line=45), in Bunch.__getattr__(self, key)
     44     return self.__getitem__(key)
     45 except KeyError as err:
---> 46     raise AttributeError(key) from err

AttributeError: BlueMarble
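A quick way to confirm which NASAGIBS layers the installed xyzservices actually exposes (a diagnostic sketch, not part of the notebook):

from xyzservices import providers

# If 'BlueMarble' is missing from this list, the import of iceflow.ui will fail as shown above.
print(sorted(providers.NASAGIBS.keys()))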

Define a common `read_h5` function for _h5py + pandas_ and _dask array_

import h5py
import numpy as np

def read_h5(fname, vnames=[]):
    """Read a list of vars [v1, v2, ..] -> 2D array."""
    with h5py.File(fname, 'r') as f:
        return np.column_stack([f[v][()] for v in vnames])

could be used for the pandas and dask array cells. Maybe this could be added to icepyx or offered as part of a separate tool set.
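For example, a minimal sketch of feeding both cells from the same helper (the file name and variable paths are illustrative only):

import pandas as pd
import dask.array as da

# Hypothetical ATL06-style variable paths; adjust to the actual product layout.
vnames = ['gt1l/land_ice_segments/latitude',
          'gt1l/land_ice_segments/longitude',
          'gt1l/land_ice_segments/h_li']

data = read_h5('ATL06_example.h5', vnames)                  # 2D array, one column per variable
df = pd.DataFrame(data, columns=['lat', 'lon', 'h_li'])     # pandas cell
darr = da.from_array(data, chunks=(100_000, len(vnames)))   # dask array cell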

make tools easily installable as modules

I'm trying to use IceFlow in another workflow, but it's non-trivial to install in its current configuration. You can't install the module using a full path because of the dashes in the repo name (NSIDC-Data-Tutorials), and if you add the repo to your path and try to import, there are all sorts of relative path/module errors. @betolink are there any plans to set this up? Currently I'm modifying iceflow/__init__.py to import each module, and then debugging the relative path calls for each module.

Break down environments and put them in separate git branches.

Right now we have a single environment for all the tutorials, and it works, but it's not best practice. We should create a branch for each tutorial; for example, for SnowEx there would be a git branch called binder-snowex containing only the binder directory and the dependencies needed for that specific tutorial, and likewise for the others. Another advantage is that commits to main that don't touch the environment won't trigger a new Binder build.

error running Customize and Access Data.ipynb

in the notebooks/ICESat-2_MODIS_Arctic_Sea_Ice/ folder

Running in a Binder instance (from the GitHub link on the repo). I logged in with my Earthdata account (via code in the notebook), so not being logged in is not the problem.

The last code cell is:

fn.request_data(param_dict,session)
fn.clean_folder()

This error is returned. Note that I tried this yesterday too and got the same error, so it doesn't seem to be a temporary 404 problem.

Request HTTP response:  201

Order request URL:  https://n5eil02u.ecs.nsidc.org/egi/request?short_name=MOD29&version=6&bounding_box=140%2C72%2C153%2C80&temporal=2019-03-23T00%3A00%3A00Z%2C2019-03-23T23%3A59%3A59Z&page_size=2000&email=eli.holmes%40noaa.gov&bbox=140%2C72%2C153%2C80&time=2019-03-23T00%3A00%3A00%2C2019-03-23T23%3A59%3A59&coverage=%2Fgt1l%2Fsea_ice_segments%2Fdelta_time%2C%2Fgt1l%2Fsea_ice_segments%2Flatitude%2C%2Fgt1l%2Fsea_ice_segments%2Flongitude%2C%2Fgt1l%2Fsea_ice_segments%2Fheights%2Fheight_segment_confidence%2C%2Fgt1l%2Fsea_ice_segments%2Fheights%2Fheight_segment_height%2C%2Fgt1l%2Fsea_ice_segments%2Fheights%2Fheight_segment_quality%2C%2Fgt1l%2Fsea_ice_segments%2Fheights%2Fheight_segment_surface_error_est%2C%2Fgt1l%2Fsea_ice_segments%2Fheights%2Fheight_segment_length_seg%2C%2Fgt2l%2Fsea_ice_segments%2Fdelta_time%2C%2Fgt2l%2Fsea_ice_segments%2Flatitude%2C%2Fgt2l%2Fsea_ice_segments%2Flongitude%2C%2Fgt2l%2Fsea_ice_segments%2Fheights%2Fheight_segment_confidence%2C%2Fgt2l%2Fsea_ice_segments%2Fheights%2Fheight_segment_height%2C%2Fgt2l%2Fsea_ice_segments%2Fheights%2Fheight_segment_quality%2C%2Fgt2l%2Fsea_ice_segments%2Fheights%2Fheight_segment_surface_error_est%2C%2Fgt2l%2Fsea_ice_segments%2Fheights%2Fheight_segment_length_seg%2C%2Fgt3l%2Fsea_ice_segments%2Fdelta_time%2C%2Fgt3l%2Fsea_ice_segments%2Flatitude%2C%2Fgt3l%2Fsea_ice_segments%2Flongitude%2C%2Fgt3l%2Fsea_ice_segments%2Fheights%2Fheight_segment_confidence%2C%2Fgt3l%2Fsea_ice_segments%2Fheights%2Fheight_segment_height%2C%2Fgt3l%2Fsea_ice_segments%2Fheights%2Fheight_segment_quality%2C%2Fgt3l%2Fsea_ice_segments%2Fheights%2Fheight_segment_surface_error_est%2C%2Fgt3l%2Fsea_ice_segments%2Fheights%2Fheight_segment_length_seg&request_mode=async

order ID:  5000002704796
status URL:  https://n5eil02u.ecs.nsidc.org/egi/request/5000002704796
HTTP response from order response URL:  201

Initial request status is  pending

Status is not complete. Trying again.
Retry request status is:  pending
Status is not complete. Trying again.
Retry request status is:  pending
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  processing
Status is not complete. Trying again.
Retry request status is:  complete
Zip download URL:  https://n5eil02u.ecs.nsidc.org/esir/5000002704796.zip
Beginning download of zipped output...
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/tmp/ipykernel_150/2671428711.py in <module>
----> 1 fn.request_data(param_dict,session)
      2 fn.clean_folder()

~/notebooks/ICESat-2_MODIS_Arctic_Sea_Ice/tutorial_helper_functions.py in request_data(param_dict, session)
    235             zip_response = session.get(downloadURL)
    236             # Raise bad request: Loop will stop for bad response code.
--> 237             zip_response.raise_for_status()
    238             with zipfile.ZipFile(io.BytesIO(zip_response.content)) as z:
    239                 z.extractall(path)

/srv/conda/envs/notebook/lib/python3.9/site-packages/requests/models.py in raise_for_status(self)
    951 
    952         if http_error_msg:
--> 953             raise HTTPError(http_error_msg, response=self)
    954 
    955     def close(self):

HTTPError: 404 Client Error: Not Found for url: https://n5eil02u.ecs.nsidc.org/esir/5000002704796.zip

Matplotlib warning get_cmap warning in notebooks/SMAP/03_smap_quality_flags.ipynb

I received a matplotlib warning for each image in the notebooks/SMAP/03_smap_quality_flags.ipynb notebook:

/tmp/ipykernel_478/3335087365.py:3: MatplotlibDeprecationWarning: The get_cmap function was deprecated in Matplotlib 3.7 and will be removed two minor releases later. Use ``matplotlib.colormaps[name]`` or ``matplotlib.colormaps.get_cmap(obj)`` instead.
  cax = ax.imshow((surf_flag_L3_P>>i)&1, cmap=plt.cm.get_cmap('bone', 2))
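A minimal sketch of the likely fix, assuming the notebook only needs a two-level 'bone' colormap (matplotlib.colormaps[...] with resampled() is the non-deprecated path in Matplotlib 3.7+):

import matplotlib
import matplotlib.pyplot as plt
import numpy as np

flags = np.random.randint(0, 2, size=(406, 964))  # stand-in for (surf_flag_L3_P >> i) & 1

fig, ax = plt.subplots()
# matplotlib.colormaps['bone'].resampled(2) replaces the deprecated plt.cm.get_cmap('bone', 2)
cax = ax.imshow(flags, cmap=matplotlib.colormaps['bone'].resampled(2))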

New issue: CRYO-206

Related to CRYO-187

Suggest using xarray and cartopy for SMAP tutorial 2

The SMAP tutorial 2.0 read_and_plot_smap_data uses h5py and numpy. The whole notebook could be simplified and streamlined by using xarray.

If we stick with h5py, a lot of the existing code could also be streamlined and made more transparent.

For example, code cell 3 involves a lot of code to get a list of groups and dataset paths, which can be simplified to the following.

with h5py.File(smap_files[0], 'r') as root:
    list_of_names = []
    root.visit(list_of_names.append)
list_of_names
['Metadata',
 'Metadata/AcquisitionInformation',
 'Metadata/AcquisitionInformation/platform',
 'Metadata/AcquisitionInformation/platformDocument',
 'Metadata/AcquisitionInformation/radar',
 'Metadata/AcquisitionInformation/radarDocument',
 'Metadata/AcquisitionInformation/radiometer',
 'Metadata/AcquisitionInformation/radiometerDocument',
 'Metadata/DataQuality',
 'Metadata/DataQuality/CompletenessOmission',
 'Metadata/DataQuality/DomainConsistency',
 'Metadata/DatasetIdentification',
 'Metadata/Extent',
 'Metadata/GridSpatialRepresentation',
 'Metadata/GridSpatialRepresentation/Column',
 'Metadata/GridSpatialRepresentation/GridDefinition',
 'Metadata/GridSpatialRepresentation/GridDefinitionDocument',

Code cell 5, which gets soil_moisture for the AM pass, could be rewritten to use the path to the dataset:

with h5py.File(smap_files[0], 'r') as root:
    soil_moisture = root['Soil_Moisture_Retrieval_Data_AM/soil_moisture'][:]
soil_moisture
array([[-9999., -9999., -9999., ..., -9999., -9999., -9999.],
       [-9999., -9999., -9999., ..., -9999., -9999., -9999.],
       [-9999., -9999., -9999., ..., -9999., -9999., -9999.],
       ...,
       [-9999., -9999., -9999., ..., -9999., -9999., -9999.],
       [-9999., -9999., -9999., ..., -9999., -9999., -9999.],
       [-9999., -9999., -9999., ..., -9999., -9999., -9999.]],
      dtype=float32)

But as I note, this is much, much simpler with xarray.
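For instance, a hedged sketch of the xarray approach, assuming the h5netcdf backend is installed and the AM group name matches the file's layout:

import xarray as xr

# Open just the AM retrieval group; phony_dims is needed because the group's
# datasets do not carry named dimensions.
ds = xr.open_dataset(
    smap_files[0],
    group='Soil_Moisture_Retrieval_Data_AM',
    engine='h5netcdf',
    phony_dims='sort',
)
soil_moisture = ds['soil_moisture']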

Introduce Valkyrie before missions

I think it is better to introduce Valkyrie and the problems it solves before giving an overview of the missions. For example...

Why Valkyrie

In 2003, NASA launched the Ice, Cloud and Land Elevation Satellite (ICESat) mission. Over the following six years, ICESat collected valuable data about ice thickness in the polar regions. Unfortunately, the ICESat mission ended before a follow-on mission could be launched. To fill the gap, an airborne campaign called Operation IceBridge was started. Between 2009 and 2019, Operation IceBridge flew numerous campaigns over the Greenland and Antarctic ice sheets, as well as over sea ice in the Arctic and Southern Oceans. The last campaign was [fill in date here]. In September 2018, ICESat-2 was launched to continue NASA's collection of ice, cloud and land elevation data.

The wealth of data from these three missions, as well as from earlier missions, presents an opportunity to measure the evolution of ice thickness over several decades. However, combining data from these missions is a challenge. Data from the Airborne Topographic Mapper (ATM) flown during IceBridge campaigns is stored in N different formats. ICESat and ICESat-2 data are also in different file formats. Data needs to be harmonized (put into similar formats) before comparisons can be made. A further complication is that the coordinate reference systems used to locate measurements have changed. The Earth's surface is not static and changes shape. To account for these changes, the terrestrial reference frames that relate latitude and longitude to points on the Earth are updated on a regular basis. Since the launch of ICESat, the International Terrestrial Reference Frame has been updated three times. The geolocation accuracy of the instruments means that a point measured at the beginning of the record is not the same point as that measured at the end of the record, even though the latitude and longitude are the same. These changes in geolocation need to be reconciled if meaningful comparisons of measurements are to be made.

Valkyrie solves this problem...

This needs some work

Brief overview of ICESat

Brief Overview of Operation IceBridge

Brief Overview of ICESat-2

Integrate data-access-notebook

In order to streamline maintenance/sustainment of all of our NSIDC DAAC notebooks related to data access and customization in a single repo, we ought to migrate https://github.com/nsidc/NSIDC-Data-Access-Notebook to this repo.

As part of this integration, we should also update to using earthaccess where possible, and consider breaking out separate notebooks: one covering generic search capabilities, followed by a separate notebook focused on the on-prem subsetter API.

New access patterns for NOAA@NSIDC data

I've been exploring requests and BeautifulSoup to get a list of files over HTTPS. I have code to recursively list files in a directory. I'm in two minds about whether this should be a tutorial or a how-to. The code "walks" the server directory tree and returns a generator containing the URLs for each file. Recursion and generators are hard for many to get their heads around (they are for me, at least), but it fills a need.

Ideally, we would have a STAC catalog for these datasets so that we do not need to have these kinds of access patterns. This might be for my next playtime.

import time
from http import HTTPStatus

import requests
from requests.exceptions import HTTPError

from bs4 import BeautifulSoup


retry_codes = [
    HTTPStatus.TOO_MANY_REQUESTS,
    HTTPStatus.INTERNAL_SERVER_ERROR,
    HTTPStatus.BAD_GATEWAY,
    HTTPStatus.SERVICE_UNAVAILABLE,
    HTTPStatus.GATEWAY_TIMEOUT,
]


def get_page(url: str, 
             retries: int = 3) -> requests.Response:
    """Gets resonse from requests

    Parameters
    ----------
    url : url to resource
    retries : number of retries before failing

    Returns
    -------
    requests.Response object
    """
    for n in range(retries):
        try:
            response = requests.get(url)
            response.raise_for_status()

            return response

        except HTTPError as exc:
            code = exc.response.status_code
        
            if code in retry_codes:
                # retry after n seconds
                time.sleep(n)
                continue

            raise    


def get_filelist(url: str, 
                 ext: str = ".nc"):
    """Returns a generator containing files in directory tree
    below url.

    Parameters
    ----------
    url : url to resource
    ext : file extension of files to search for

    Returns
    -------
    Generator containing list files
    """
    
    def is_subdirectory(href):
        return (href.endswith("/") and 
                href not in url and
                not href.startswith("."))

    def is_file(href, ext):
        return href.endswith(ext)
        
    response = get_page(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    for a in soup.find_all('a', href=True):
        if is_subdirectory(a["href"]):
            yield from get_filelist(url+a["href"])
        if is_file(a["href"], ext):
            yield url + a["href"]
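A short usage sketch (the directory URL below is a placeholder, not a verified NOAA@NSIDC endpoint):

# Placeholder directory URL; substitute a real NOAA@NSIDC HTTPS data directory.
base_url = "https://example.nsidc.org/NOAA/some-dataset/"

for file_url in get_filelist(base_url, ext=".nc"):
    print(file_url)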

Use pathlib.Path throughout SMAP tutorials

SMAP tutorials use os for path and file operations.

See notebooks/SMAP/01_download_smap_data_rendered.ipynb

pathlib.Path is a better option. This is a small change but provides a more pythonic approach.
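For example, a minimal sketch of the kind of change intended (paths and file patterns are illustrative):

from pathlib import Path

data_dir = Path.home() / 'Downloads' / 'SMAP'       # instead of os.path.join(...)
data_dir.mkdir(parents=True, exist_ok=True)         # instead of os.makedirs(...)
smap_files = sorted(data_dir.glob('SMAP_L3_*.h5'))  # instead of os.listdir() plus filtering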

Error running `from iceflow.ui import IceFlowUI`

Error seems to be with the loading of the BlueMarble base layer.
This code generates the error:

from iceflow.ui import IceFlowUI

run from the Binder instance of the notebooks.

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/srv/conda/envs/notebook/lib/python3.9/site-packages/xyzservices/lib.py in __getattr__(self, key)
     41         try:
---> 42             return self.__getitem__(key)
     43         except KeyError:

KeyError: 'BlueMarble'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_96/3719722181.py in <module>
----> 1 from iceflow.ui import IceFlowUI

~/notebooks/iceflow/iceflow/ui.py in <module>
      7 from ipyleaflet import (Map, SearchControl, AwesomeIcon, GeoJSON,
      8                         Marker, DrawControl, LayersControl)
----> 9 from .layers import custom_layers, flight_layers, widget_projections
     10 from .client import IceflowClient
     11 

~/notebooks/iceflow/iceflow/layers.py in <module>
    101 widget_projections = {
    102     'global': {
--> 103         'base_map': basemaps.NASAGIBS.BlueMarble,
    104         'projection': projections.EPSG3857,
    105         'center': (30, -30),

/srv/conda/envs/notebook/lib/python3.9/site-packages/xyzservices/lib.py in __getattr__(self, key)
     42             return self.__getitem__(key)
     43         except KeyError:
---> 44             raise AttributeError(key)
     45 
     46     def __dir__(self):

AttributeError: BlueMarble

h5coro error reading ATL06 file

When running the https://github.com/nsidc/NSIDC-Data-Tutorials/blob/cryo-184/notebooks/ICESat-2_Cloud_Access/ATL10-h5coro.ipynb notebook for the whole "Antarctic" region, h5coro gives the following error.

H5Coro encountered error reading gt1r/freeboard_segment/latitude: invalid heap signature: 0x0
H5Coro encountered error reading gt1r/freeboard_segment/longitude: invalid heap signature: 0x0
H5Coro encountered error reading gt1r/freeboard_segment/delta_time: invalid heap signature: 0x0
H5Coro encountered error reading gt1r/freeboard_segment/seg_dist_x: invalid heap signature: 0x0
H5Coro encountered error reading gt1r/freeboard_segment/heights/height_segment_length_seg: invalid heap signature: 0x0
H5Coro encountered error reading gt1r/freeboard_segment/beam_fb_height: invalid heap signature: 0x0
H5Coro encountered error reading gt1r/freeboard_segment/heights/height_segment_type: invalid heap signature: 0x0

This causes a TypeError to be returned instead of a geopandas GeoDataFrame. The concatenation step in read_atl10 then fails.

File ~/NSIDC-Data-Tutorials/notebooks/ICESat-2_Cloud_Access/h5cloud/read_atl10.py:132, in read_atl10(files, bounding_box, executors, environment, credentials)
    129     return df
    131 dfs = pqdm(files, read_h5coro, n_jobs=executors)
--> 132 combined = pd.concat(dfs)
    134 return combined

I think we need a try, except block so that a None or some other value is returned.

We also need to then filter out the Nones so that pd.concat works.
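A rough sketch of the suggested guard, assuming read_h5coro is the per-file reader that pqdm calls inside read_atl10 (the names are taken from the traceback above; the exact signature is an assumption):

from pqdm.threads import pqdm
import pandas as pd

def safe_read_h5coro(file):
    """Return the per-granule GeoDataFrame, or None if h5coro fails on this granule."""
    try:
        return read_h5coro(file)   # the existing per-file reader
    except Exception as err:       # assumes the h5coro heap errors surface as exceptions here
        print(f"Skipping {file}: {err}")
        return None

dfs = pqdm(files, safe_read_h5coro, n_jobs=executors)
dfs = [df for df in dfs if df is not None]   # drop failed granules so pd.concat works
combined = pd.concat(dfs)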

Include a `valkyrie.download()` type cell

We should include a description of how to download the data. I don't think h5py can read a remote file.

Would we have to use requests?

Another thought: could we leverage icepyx for this?
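If requests turns out to be the way to go, a minimal streaming-download sketch might look like this (the URL and filename are placeholders, and Earthdata authentication is left out):

import requests

# Placeholder URL; a real request would need an authenticated Earthdata session.
url = "https://example.nsidc.org/path/to/granule.h5"

with requests.get(url, stream=True, timeout=60) as r:
    r.raise_for_status()
    with open("granule.h5", "wb") as f:
        for chunk in r.iter_content(chunk_size=1024 * 1024):
            f.write(chunk)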
