nasa / emit-data-resources

License: Apache License 2.0

emit-data-resources's Introduction

EMIT-Data-Resources

Welcome to the EMIT-Data-Resources repository. This repository provides guides, short how-tos, and tutorials to help users access and work with data from the Earth Surface Mineral Dust Source Investigation (EMIT) mission. In the interest of open science, this repository has been made public, but it is still under active development. All notebooks and scripts should be functional; however, changes or additions may be made. Consult the CHANGE_LOG.md for the most recent changes to the repository. Contributions from all parties are welcome.


EMIT Background

The EMIT Project delivers space-based measurements of surface mineralogy of the Earth’s arid dust source regions. These measurements are used to initialize the compositional makeup of dust sources in Earth System Models (ESMs). The dust cycle, which describes the generation, lofting, transport, and deposition of mineral dust, plays an important role in ESMs. Dust composition is presently the largest source of uncertainty in quantifying the magnitude of aerosol direct radiative forcing. By understanding the composition of mineral dust sources, EMIT aims to constrain the sign and magnitude of dust-related radiative forcing at regional and global scales. During its one-year mission on the International Space Station (ISS), EMIT will make measurements over the sunlit Earth’s dust source regions that fall within ±52° latitude. EMIT will schedule up to five visits (three on average) of each arid target region, and only acquisitions not dominated by cloud cover will be downlinked. EMIT-based maps of the relative abundance of source minerals will advance the understanding of the current and future impacts of mineral dust in the Earth system.

EMIT Data Products are distributed by the LP DAAC. Learn more about EMIT data products from the EMIT Product Pages, and search for and download EMIT data products using NASA EarthData Search.
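For quick programmatic access, the earthaccess Python library (used in the how-to notebooks below) can also find and download granules. A minimal sketch, where the bounding box and date range are illustrative assumptions:

```python
import earthaccess

# Authenticate with Earthdata Login (e.g., credentials in ~/.netrc)
earthaccess.login()

# Search the EMIT L2A Reflectance collection over an example area and period
results = earthaccess.search_data(
    short_name="EMITL2ARFL",
    bounding_box=(-118.6, 35.4, -118.2, 35.8),  # (west, south, east, north), illustrative
    temporal=("2023-01-01", "2023-06-30"),
)

# Download the matching granules to a local folder
files = earthaccess.download(results, "data/")
```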


Prerequisites/Setup Instructions

This repository requires that users set up a compatible Python environment and download the EMIT granules used. See the setup_instructions.md file in the ./setup/ folder.

Repository Contents

Below are the resources available for EMIT Data.

| Name | Type | Summary |
|---|---|---|
| Getting EMIT Data using EarthData Search | Markdown Guide | A thorough walkthrough for using EarthData Search to find and download EMIT data |
| Exploring EMIT L2A Reflectance | Jupyter Notebook | Explore EMIT L2A Reflectance data using interactive plots |
| Visualizing Methane Plume Timeseries | Jupyter Notebook | Find EMIT L2B CH4 Plume data and build a time series of CH4 plume complexes |
| Generating_Methane_Spectral_Fingerprint | Jupyter Notebook | Extract radiance spectra and build an in-plume/out-of-plume ratio to compare with the CH4 absorption coefficient |
| How to find and access EMIT data | Jupyter Notebook | Use the earthaccess Python library to find and download or stream EMIT data |
| How to Convert to ENVI Format | Jupyter Notebook | Convert from the downloaded netCDF4 (.nc) format to ENVI format |
| How to Orthorectify | Jupyter Notebook | Use the geometry lookup table (GLT) included with the EMIT netCDF4 file to project onto a geospatial grid (EPSG:4326) |
| How to Extract Point Data | Jupyter Notebook | Extract spectra using lat/lon coordinates from a .csv and build a dataframe/.csv output |
| How to Extract Area Data | Jupyter Notebook | Extract an area defined by a .geojson or shapefile |
| How to use EMIT Quality Data | Jupyter Notebook | Build a mask using an EMIT L2A Mask file and apply it to an L2A Reflectance file |
| How to use Direct S3 Access with EMIT | Jupyter Notebook | Use S3 from inside AWS us-west-2 to access EMIT data |
| How to find EMIT Data using NASA's CMR API | Jupyter Notebook | Use NASA's CMR API to programmatically find EMIT data |

Helpful Links


Contact Info

Email: [email protected]
Voice: +1-866-573-3222
Organization: Land Processes Distributed Active Archive Center (LP DAAC)¹
Website: https://lpdaac.usgs.gov/
Date last modified: 03-13-2024

¹Work performed under USGS contract G15PD00467 for NASA contract NNG14HH33I.

emit-data-resources's People

Contributors

alexgleith, amfriesz, briannalind, ebolch, kdchadwick, mjami00, pgbrodrick, tkantz


emit-data-resources's Issues

Need to flip `glt_x`?

Here are a few granules for which the default processing did not work as expected for us. We ended up having to use:

```python
glt_x = ds.reflectance.shape[1] - loc.glt_x.data
```

Here are some examples:

EMIT_L2A_RFL_001_20230530T092122_2315006_042
EMIT_L2A_RFL_001_20230530T092146_2315006_044

Is there another metadata field that we need to parse and check against to determine whether this flip is needed?

chunked reading of files

Currently, reading files with emit_xarray from emit_tools.py loads them into an np.ndarray-backed xr.Dataset. An option to read into a chunked, dask.array-backed xr.Dataset would help prevent out-of-memory errors when reading on machines with limited memory (loading failed on an 8 GB SMCE machine) and could potentially speed up downstream operations using dask.

Adding chunks='auto' to

```python
ds = xr.open_dataset(filepath, engine=engine)
```

works when ortho=False but not when ortho=True.
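For reference, a minimal sketch of what a chunked open might look like for the non-ortho path (the engine and chunk sizes are illustrative assumptions, and dask must be installed):

```python
import xarray as xr

# Open the reflectance group lazily with dask chunks instead of an eager
# in-memory array; chunk sizes here are illustrative, not tuned.
ds = xr.open_dataset(
    "EMIT_L2A_RFL_001_20230530T092122_2315006_042.nc",
    engine="h5netcdf",
    chunks={"downtrack": 256, "crosstrack": 256},
)
print(ds.reflectance.chunks)  # non-empty chunks confirm the data are dask-backed
```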

Permission error of accessing S3 data

I tried to run How_to_Direct_S3_Access.ipynb and got an error in the cell that opens the s3 url:

```python
# Open s3 url
fp = fs_s3.open(s3_url, mode='rb')
# Open dataset with xarray
ds = xr.open_dataset(fp)  # Note this only opens the root group (reflectance)
ds
```

```
---------------------------------------------------------------------------
ClientError                               Traceback (most recent call last)
File ~/miniconda3/envs/emit_tutorials/lib/python3.9/site-packages/s3fs/core.py:113, in _error_wrapper(func, args, kwargs, retries)
    112 try:
--> 113     return await func(*args, **kwargs)
    114 except S3_RETRYABLE_ERRORS as e:

File ~/miniconda3/envs/emit_tutorials/lib/python3.9/site-packages/aiobotocore/client.py:383, in AioBaseClient._make_api_call(self, operation_name, api_params)
    382     error_class = self.exceptions.from_code(error_code)
--> 383     raise error_class(parsed_response, operation_name)
    384 else:

ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

The above exception was the direct cause of the following exception:

PermissionError                           Traceback (most recent call last)
How_to_Direct_S3_Access.ipynb Cell 16 line 2
      1 # Open s3 url
----> 2 fp = fs_s3.open(s3_url, mode='rb')
      3 # Open dataset with xarray
      4 ds = xr.open_dataset(fp) #Note this only opens the root group (reflectance)

File ~/miniconda3/envs/emit_tutorials/lib/python3.9/site-packages/fsspec/spec.py:1309, in AbstractFileSystem.open(self, path, mode, block_size, cache_options, compression, **kwargs)
   1307 else:
...
    138         err = e
    139 err = translate_boto_error(err)
--> 140 raise err

PermissionError: Forbidden
```

s3 access EMIT data

Trying to run the s3 access notebook:
https://github.com/nasa/EMIT-Data-Resources/blob/main/python/how-tos/How_to_Direct_S3_Access.ipynb
but I am running into an access error:

```
ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

The above exception was the direct cause of the following exception:

PermissionError                           Traceback (most recent call last)
Cell In[9], line 2
      1 # Open s3 url
----> 2 fp = fs_s3.open(s3_url, mode='rb')
      3 # Open dataset with xarray
      4 ds = xr.open_dataset(fp) #Note this only opens the root group (reflectance)
```

with a final error:

```
PermissionError: Forbidden
```

I assume this is a setting issue on the COS bucket where access is defined.
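A common cause of this 403 in both of these reports: direct S3 access only works from compute running in AWS region us-west-2, and the temporary credentials expire after roughly an hour. A hedged sketch of refreshing them (assuming Earthdata Login credentials in ~/.netrc):

```python
import requests
import s3fs

# Fetch fresh temporary S3 credentials from the LP DAAC endpoint; a 403
# often means the previous credentials have expired.
creds = requests.get(
    "https://data.lpdaac.earthdatacloud.nasa.gov/s3credentials"
).json()

# Rebuild the filesystem object with the new credentials
fs_s3 = s3fs.S3FileSystem(
    key=creds["accessKeyId"],
    secret=creds["secretAccessKey"],
    token=creds["sessionToken"],
)
```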

geotransform and spatial_ref for orthorectification, not original data

According to the notebook, the geotransform and spatial_ref attrs are used for orthorectification. Note that GT[2] and GT[4] are zero in the attrs. Does that mean we can only use geotransform and spatial_ref for orthorectified data?

Actually, I have tried to use pyresample's AreaDefinition to define the area. The derived lon/lat values differ from those in the location group of the NC file. I suppose that's because these attrs from the NetCDF file only apply to orthorectified data.

If I'm wrong, please feel free to correct me. I'm curious how to use geotransform and spatial_ref for the original data, if that should work.
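For what it's worth, with GT[2] and GT[4] equal to zero the geotransform is a plain affine mapping on the orthorectified (GLT) grid, not on the raw downtrack/crosstrack pixels, which is consistent with the observation above. A minimal sketch (assuming ds is the opened EMIT dataset; the grid shape variables are illustrative):

```python
import numpy as np

# GDAL-style geotransform (GT0..GT5); with GT2 == GT4 == 0 the mapping is
#   lon = GT0 + col * GT1,  lat = GT3 + row * GT5
# for the upper-left corner of each orthorectified pixel (add 0.5 to
# row/col for pixel centers).
gt = ds.attrs["geotransform"]
rows, cols = np.mgrid[0:ortho_ny, 0:ortho_nx]  # orthorectified grid shape, illustrative
lon = gt[0] + cols * gt[1]
lat = gt[3] + rows * gt[5]
```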

How to filter cloud_cover for EMIT data using pystac?

```python
from pystac_client import Client

url = 'https://cmr.earthdata.nasa.gov/stac/LPCLOUD/?page=2'
collections = ['EMITL1BRAD.v001']
bbox = [-99.65, 18.85, -98.5, 19.95]
date_range = "2022-05/2023-08"

catalog = Client.open(url)

params = {
    'collections': collections,
    'bbox': bbox,
    'datetime': date_range,
    'limit': 100,
}

filt = {
    "op": "lte",
    "args": [{"property": "eo:cloud_cover"}, 40]
}

# params['filter'] = filt  # uncommenting this triggers the error below

search = catalog.search(**params)

print('Matching STAC Items:', search.matched())
```

It works well without the cloud_cover filter. However, I get this error if I try to add the filter:

```
    215     raise APIError(str(err))
    216 if resp.status_code != 200:
--> 217     raise APIError.from_response(resp)
    218 try:
    219     return resp.content.decode("utf-8")

APIError: {"message":"If the problem persists please contact [email protected]","errors":["An unexpected error occurred. We have been alerted and are working to resolve the problem.","Unsupported parameter filter"]}
```

Is this not supported yet? Otherwise, I have to access the cloud_cover properties from the returned items myself.
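If the CMR STAC endpoint rejects the filter parameter, one client-side workaround (a sketch, not an official recommendation) is to filter the returned items on eo:cloud_cover after the search:

```python
# Keep only items at or below 40% cloud cover; items lacking the property
# are treated as fully cloudy and dropped.
items = [
    item for item in search.items()
    if item.properties.get("eo:cloud_cover", 100) <= 40
]
print(len(items), "items with <= 40% cloud cover")
```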

Problem with reading NASA PACE data

Thank you for the excellent notebook examples in the repo. @bingqing-liu and I are building a Python package called HyperCoast for visualizing NASA hyperspectral data interactively with a few lines of code. Here is a notebook example for visualizing EMIT data interactively. However, we ran into issues with reading PACE data. The latitude and longitude coordinates are two-dimensional, with dimensions (latitude, longitude). We need to turn them into one dimension, like the EMIT dataset shown in the example below. The goal is to turn the xarray dataset into a rasterio dataset so that we can plot it on the interactive map.

Any advice? @alexgleith @betolink @BriannaLind

PACE

Code snippet

```python
import xarray as xr

# download from https://github.com/opengeos/datasets/releases/download/netcdf/PACE_OCI.20240423T184658.L2.OC_AOP.V1_0_0.NRT.nc
filepath = 'data/PACE_OCI.20240423T184658.L2.OC_AOP.V1_0_0.NRT.nc'
ds = xr.open_dataset(filepath, group="geophysical_data")
ds = ds.swap_dims({'number_of_lines': 'latitude', 'pixels_per_line': 'longitude'})
wvl = xr.open_dataset(filepath, group="sensor_band_parameters")
loc = xr.open_dataset(filepath, group="navigation_data")
lat = loc.latitude
lon = loc.longitude
wavelengths = wvl.wavelength_3d
Rrs = ds.Rrs

dataset = xr.Dataset(
    {
        "Rrs": (("latitude", "longitude", "wavelengths"), Rrs.data)
    },
    coords={
        "latitude": (("latitude", "longitude"), lat.data),
        "longitude": (("latitude", "longitude"), lon.data),
        "wavelengths": ("wavelengths", wavelengths.data)
    }
)
dataset
```

Note the two dimensions (latitude, longitude) on the latitude and longitude coordinates.
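Since PACE L2 granules are swath data, the 2-D latitude/longitude arrays cannot simply be collapsed into 1-D coordinates; the swath has to be resampled onto a regular grid first. One possible sketch using pyresample (the grid size and search radius are illustrative assumptions):

```python
import numpy as np
from pyresample import geometry, kd_tree

# Define the source swath from the 2-D navigation arrays
swath = geometry.SwathDefinition(lons=lon.data, lats=lat.data)

# Build a regular lat/lon target grid covering the swath extent
target_lons = np.linspace(np.nanmin(lon.data), np.nanmax(lon.data), 500)
target_lats = np.linspace(np.nanmax(lat.data), np.nanmin(lat.data), 500)
lon2d, lat2d = np.meshgrid(target_lons, target_lats)
grid = geometry.GridDefinition(lons=lon2d, lats=lat2d)

# Nearest-neighbour resample of one band; on the regular grid, latitude and
# longitude become genuinely 1-D, so the result can be wrapped for rasterio
band0 = kd_tree.resample_nearest(
    swath, Rrs.data[:, :, 0], grid,
    radius_of_influence=5000, fill_value=np.nan,
)
```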

EMIT

Code snippet

```python
import hypercoast

# download from https://github.com/opengeos/datasets/releases/download/netcdf/EMIT_L2A_RFL_001_20240404T161230_2409511_009.nc
filepath = "data/EMIT_L2A_RFL_001_20240404T161230_2409511_009.nc"
dataset = hypercoast.read_emit(filepath)
dataset
```


EMIT demo

Holoviews box

The box around the Holoviews plot in the tutorial is too small vertically (on the y-axis) and should be expanded.

EMIT for the Future - NASA Space Apps Challenge project

Hi! My name is Maurizio Naletto. During the NASA Space Apps Challenge in Cagliari, we developed a small project that uses EMIT to raise people's awareness of the dynamics that contribute to global warming. We think that the EMIT system, the ISS, and the use of open-source web platforms are very important for helping people become more aware of climate change.

Our idea was to create a website, associated with a repository, that serves as a guide for young students, professors, and researchers. We created a working outline for younger students and professors' teaching labs to turn students into investigators researching greenhouse gas emissions using VISIONS: The EMIT Open Data Portal. For researchers, we created a small notebook using the official Jupyter-based repo as a basis, porting it to Google Colab and adding a small OpenCV-based machine vision algorithm to extract data directly from the rasterized image.

We would like to share our work with you in case it might be useful, because we studied your work on GitHub before participating. You can find the EMIT "copilot" support website at https://www.emit-vision-iss-copilot.com/ and the GitHub repository at https://github.com/MaurizioNaletto-code/EMIT-VISIONS-ISS-COPILOT. If you would like to contact us, you can do so by email: [email protected]. Thank you!

Anaconda gets hung up on solving environment

I have a local copy of the repository and am attempting to create the compatible environment using:

```
conda env create -f setup/emit_tutorials.yml
```

Conda has been stuck on "solving environment" for a solid two hours now, with no indication of when or if it will finish. Is there any way to avoid Conda hanging at this stage of the process?
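One workaround that often helps with slow conda solves (a general suggestion, not repo guidance) is to switch to the libmamba solver, or use mamba as a drop-in replacement:

```
# Install and enable the faster libmamba solver (conda >= 22.11)
conda install -n base conda-libmamba-solver
conda config --set solver libmamba
conda env create -f setup/emit_tutorials.yml

# Or use mamba directly
mamba env create -f setup/emit_tutorials.yml
```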

KeyError with 'features' in Visualizing Methane Time Series Tutorial

Thank you for these jupyter notebook tutorials! I downloaded the code and I'm working locally in VS Code.

I'm getting a KeyError for 'features' when calling the fetch_ch4_metadata function on plm_gdf, even though it appears that 'features' is being indexed correctly from the JSON object:

```python
def fetch_ch4_metadata(row):
    response = requests.get(get_asset_url(row, 'CH4PLMMETA'))
    return response.json()['features'][0]['properties']

fetch_ch4_metadata(plm_gdf.iloc[0])
```
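For what it's worth, a first debugging step might be to inspect the payload before indexing (a sketch using the same requests call and the notebook's get_asset_url helper), since an error response will not contain 'features':

```python
# Surface HTTP errors explicitly, then check the JSON keys
response = requests.get(get_asset_url(plm_gdf.iloc[0], 'CH4PLMMETA'))
response.raise_for_status()
payload = response.json()
print(payload.keys())  # confirm whether 'features' is actually present
```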

Thanks in advance for any help you can provide.
Coleman Vollrath

Images with a different crosstrack dimension length

There are a number of EMIT granules with a crosstrack dimension of 1306, whereas all documentation seems to suggest that the crosstrack dimension should be fixed at 1242. Should these granules be handled differently?

```python
['EMIT_L2A_RFL_001_20220810T101138_2222207_001',
 'EMIT_L2A_RFL_001_20220810T065438_2222205_046',
 'EMIT_L2A_RFL_001_20230224T055637_2305504_032',
 'EMIT_L2A_RFL_001_20231021T074842_2329405_010',
 'EMIT_L2A_RFL_001_20220810T065838_2222205_053',
 'EMIT_L2A_RFL_001_20220910T140554_2225309_011',
 'EMIT_L2A_RFL_001_20231016T070407_2328905_014',
 'EMIT_L2A_RFL_001_20230316T202635_2307513_004',
 'EMIT_L2A_RFL_001_20230429T081524_2311905_003',
 'EMIT_L2A_RFL_001_20220811T042658_2222303_011',
 'EMIT_L2A_RFL_001_20230216T120159_2304708_004',
 'EMIT_L2A_RFL_001_20230114T184508_2301412_015',
 'EMIT_L2A_RFL_001_20230601T185504_2315213_013',
 'EMIT_L2A_RFL_001_20231030T161937_2330311_010',
 'EMIT_L2A_RFL_001_20230204T053642_2303504_008',
 'EMIT_L2A_RFL_001_20230222T102534_2305307_018']
```
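A small defensive check could flag such granules before processing; a sketch, assuming the dimension is named crosstrack as in the EMIT netCDF layout:

```python
import xarray as xr

# Warn when the crosstrack length differs from the documented 1242
ds = xr.open_dataset(filepath)
if ds.sizes.get("crosstrack") != 1242:
    print(f"{filepath}: unexpected crosstrack length {ds.sizes.get('crosstrack')}")
```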

Package emit_tools.py separately for pip/conda install

Hi EMIT folks! I'm starting to work with some EMIT data and am LOVING these tutorials! Great work! My only request is for emit_tools.py to be packaged for standalone installation, so that its functions can be used by other projects without having to download the entire tutorial repo.
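Until it is packaged, a stopgap (a sketch; the module path below assumes the repository's current layout) is to put the folder containing emit_tools.py on sys.path:

```python
import sys

# Point Python at the folder holding emit_tools.py inside a cloned repo
sys.path.append("EMIT-Data-Resources/python/modules")

from emit_tools import emit_xarray  # the reader function mentioned in the issues above
```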
