
tudelftgeodesy / sarxarray

Xarray extension for Synthetic Aperture Radar (SAR) data

Home Page: https://tudelftgeodesy.github.io/sarxarray/

License: Apache License 2.0

Python 100.00%
earth-observation insar radar sar interferometry

sarxarray's Introduction

SarXarray


SARXarray is an open-source Xarray extension for Synthetic Aperture Radar (SAR) data. It is especially designed to work with large volumes of complex data, e.g. Single Look Complex (SLC) data, as well as derived products such as interferogram stacks.

Installation

SARXarray can be installed from PyPI:

pip install sarxarray

or from the source:

git clone git@github.com:TUDelftGeodesy/sarxarray.git
cd sarxarray
pip install .

Note that Python version >=3.10 is required for SARXarray.

Documentation

For more information on usage and examples of SARXarray, please refer to the documentation.

sarxarray's People

Contributors

cpranav93, rogerkuou, sarahalidoost


sarxarray's Issues

Hierarchy within Zarr directory

According to Freek, it is preferred to separate the two kinds of Zarr attributes in the storage:

  • point attributes: data variables with only (point,) dimension
  • time attributes: data variables with (point, time) dimension

Check if MultiIndex is working for Zarr storage

In point_selection, we replaced the MultiIndex coordinates for the points dimension with an actual index. See this part

The motivation was:

  1. MultiIndex cannot be exported to Zarr
  2. MultiIndex has an issue with the xarray.sel

We can check whether this has actually improved in the current version of Xarray.
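A quick way to re-check point 1 is a tiny round-trip test (illustrative, not sarxarray code; requires a zarr-enabled environment):

```python
import numpy as np
import xarray as xr

# Build a tiny dataset with a MultiIndex on a "points" dimension and try
# writing it to Zarr; dataset contents are made up for the check.
ds = xr.Dataset(
    {"amplitude": ("points", np.arange(4.0))},
    coords={
        "azimuth": ("points", [0, 0, 1, 1]),
        "range": ("points", [0, 1, 0, 1]),
    },
).set_index(points=["azimuth", "range"])  # creates a pandas MultiIndex

try:
    ds.to_zarr("multiindex_check.zarr", mode="w")
    print("MultiIndex export to Zarr works in this xarray version")
except Exception as err:  # historically a NotImplementedError
    print(f"MultiIndex export still unsupported: {err}")
```

If the `to_zarr` call succeeds, the workaround in point_selection could be reconsidered.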

Software release

  • Documentation
  • Tests
    • GitHub automated testing
  • Testing actions
  • Citations
  • Tutorials
  • Publish to PyPI
  • Checklist: OpenSSF Best Practices
    • Add check list
    • Contributor information
    • Release number mechanism

Update demo Notebook

  1. Make one stand-alone notebook with the figshare data. The existing example can be reused; the first part alone suffices.
  2. Add a README for the example directory explaining the setup.
  3. Make another TU Delft dedicated version for large-scale data.

Coherence computation function

Create a coherence computation function.

The implementation of the coherence estimate can be found in:

  • the thesis of Ramon, page 96;
  • the (old) Doris software manual: doris.tudelft.nl (the coherence equation is difficult to find there; see also the screenshot). It can therefore be implemented via the complex multiplications M·S*, M·M*, and S·S*.
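The estimator implied by those references, gamma = |<M·S*>| / sqrt(<M·M*> <S·S*>), can be sketched in plain numpy (the function name and averaging convention are assumptions, not the sarxarray API):

```python
import numpy as np

def coherence(master, slave, axis=0):
    """Coherence estimate gamma = |<M S*>| / sqrt(<M M*> <S S*>).

    Plain-numpy sketch: averages the cross product and the two powers
    along `axis` (e.g. time or a spatial window), then normalizes.
    """
    num = np.abs(np.mean(master * np.conj(slave), axis=axis))
    den = np.sqrt(
        np.mean(np.abs(master) ** 2, axis=axis)
        * np.mean(np.abs(slave) ** 2, axis=axis)
    )
    return num / den
```

By construction the result lies in [0, 1]: identical signals (up to a constant phase) give 1, independent noise gives values near 0.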

Example implementation of the coherence computation can be found at:

Bug: Point selection method selects points in the water

The function slcstack.point_selection(threshold, method="amplitude_dispersion", chunks=5000) also selects points in the water.
This is strange, since these points have a very high amplitude dispersion.

See the image in the appendix for the selected points; the red points are the ones selected in the water.

This problem is caused by points with NaNs in the SLC stack. When the point selection method is applied, these points are also selected, because the algorithm treats the NaN values as very small values.
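A possible fix (names illustrative, not the sarxarray implementation) is to mask out pixels that contain NaN anywhere in the stack, so they can never pass the selection threshold:

```python
import numpy as np

def amplitude_dispersion(amplitude, axis=0):
    """mean/std amplitude dispersion (sarxarray's convention) with a NaN
    guard: pixels with any NaN along `axis` get 0, so a
    "dispersion > threshold" point selection can never pick them."""
    valid = ~np.isnan(amplitude).any(axis=axis)
    disp = np.nanmean(amplitude, axis=axis) / np.nanstd(amplitude, axis=axis)
    return np.where(valid, disp, 0.0)
```
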

Test sarxarray at a larger scale

The current notebook example runs with only three SLCs. We need to test it on a full SLC stack. Spider would be a good platform to start with.

  • Set up JupyterDaskOnSLURM on Spider
  • Use full stack under /project/caroline/Share/stitch/nl_veenweiden/nl_veenweiden_s1_asc_t088
  • Start a dask cluster to execute the example notebook

Deprecation warning when getting dimension sizes

mrm = stack_subset.slcstack.mrm()
/SPIDER_TYKKY_6rp6QPX/miniconda/envs/env1/lib/python3.11/site-packages/sarxarray/stack.py:34: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
  t_order = list(self._obj.dims.keys()).index("time")  # Time dimension order
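The fix the warning suggests is mechanical: `Dataset.sizes` keeps the name-to-length mapping that `Dataset.dims` is dropping. A minimal reproduction and replacement (assuming a stack with a `time` dimension):

```python
import numpy as np
import xarray as xr

# Toy dataset standing in for the SLC stack in stack.py.
ds = xr.Dataset({"slc": (("azimuth", "range", "time"), np.zeros((2, 3, 4)))})

# Deprecated pattern from stack.py line 34:
#   t_order = list(self._obj.dims.keys()).index("time")
# Forward-compatible replacement using the mapping that stays a mapping:
t_order = list(ds.sizes.keys()).index("time")
```
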

Improve the demo notebook

  • Typo: Zaar -> Zarr
  • Installation instruction: using JupyterLab
  • point_selection dimensions: points first, time second
  • Duplicated importing of matplotlib

Endianness of data

Hey,

thank you for this nice open-source project. I was wondering whether you have thought about adding the possibility to specify the endianness of the complex SAR data to load.

Thanks,
Nils
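For reference, numpy already encodes byte order in the dtype, so one possible (hypothetical) interface would simply accept a dtype string with an explicit `<`/`>` prefix:

```python
import numpy as np

# numpy dtypes carry byte order: "<" = little-endian, ">" = big-endian.
little = np.dtype("<c8")  # little-endian complex64
big = np.dtype(">c8")     # big-endian complex64

raw = np.array([1 + 2j], dtype=little).tobytes()

right = np.frombuffer(raw, dtype=little)  # correct interpretation
wrong = np.frombuffer(raw, dtype=big)     # same bytes, scrambled values
```

Passing such a dtype through to the binary reader would make endianness a user choice rather than a hard-coded assumption.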

Finish software release

  • GH action PyPI publish on release
  • Update change log
  • Add PyPI badge to README
  • Add doc link to the GH page

Usage of float16

The usage of float16 remains to be discussed. At this stage, half precision is not supported by every numpy function; see e.g. this numpy issue.

This means we can still store data in float16. However, we need to discuss whether it is worth the memory savings: float16 reduces memory consumption, but quite a few numpy operations would require type conversion, which may make sarxarray inefficient.

There are two alternatives to deal with this issue:

  1. Use float16 only for storage, and float32 within sarxarray.
  2. Use float16 for both storage and in-memory computation, converting where needed.
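Alternative 1 amounts to a single cast at load time; a minimal sketch of the trade-off:

```python
import numpy as np

# Sketch of alternative 1: float16 as the storage dtype, float32 in memory.
stored = np.arange(8, dtype=np.float16)   # compact stored representation
working = stored.astype(np.float32)       # promote once, before computing

assert stored.nbytes == working.nbytes // 2  # half the storage footprint
mean = working.mean()                     # all numpy ops supported here
```

Alternative 2 would instead scatter `astype` calls through every operation that lacks float16 support, which is the efficiency concern raised above.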

Add Multi-look function

A prototype multilook function can be found in two archived notebooks. We need to add it as a delayed function in sarxarray.stack.

In Zoe's example, she uses view_as_windows from skimage as the multilook window. But maybe rolling from dask would be more parallelizable?

Example SAR stack can be downloaded here.
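Besides `rolling`, xarray's `coarsen` maps naturally onto multilooking (non-overlapping window averaging) and works lazily on dask-backed arrays; a minimal sketch with an assumed 2x2 window:

```python
import numpy as np
import xarray as xr

# Toy single-band image standing in for an SLC amplitude array.
da_slc = xr.DataArray(np.arange(16.0).reshape(4, 4), dims=("azimuth", "range"))

# Non-overlapping 2x2 window average == a 2x2 multilook.
multilooked = da_slc.coarsen(azimuth=2, range=2).mean()
```

With a dask-backed DataArray the same call stays lazy, which fits the "delayed functions in sarxarray.stack" requirement.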

Redesign the default chunk size in `from_binary`

In _io.py, when loading data with from_binary, a default 50-row chunking is applied. This chunk size should be chosen more reasonably, preferably based on the pixel size of the SLC, which means the chunking needs to be reimplemented in 2-D. Besides, in the current implementation the chunk-reading function is concatenated in a loop in _mmap_dask_array; the efficiency could be improved by applying this as a vector operation.

One can work on this on Spider with this notebook. The following cell should become faster:

stack = sarxarray.from_binary(list_slcs, shape, dtype=dtype)
stack

To do:

  • Add 2D chunking argument
  • Vectorization of chunk concatenation
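One assumed approach covering both to-do items: memory-map the binary file once and hand it to dask with 2-D chunks, instead of concatenating per-row-block readers in a Python loop (file name and sizes below are illustrative):

```python
import numpy as np
import dask.array as da

# Create a small demo file standing in for a binary SLC.
shape = (8, 8)
np.arange(64, dtype=np.complex64).reshape(shape).tofile("slc_demo.raw")

# Memory-map once; dask slices the memmap lazily, one 2-D chunk at a time.
mm = np.memmap("slc_demo.raw", dtype=np.complex64, mode="r", shape=shape)
arr = da.from_array(mm, chunks=(4, 4))  # 2-D chunks over azimuth/range
```

`chunks=(4, 4)` is where a pixel-size-based default would plug in.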

Partially read SLC with given polygon

Add a feature to the point_selection function for reading a part of an SLC within a given polygon.

Input: 1) an SLC stack in radar coordinates; 2) geo-reference coordinates per SLC pixel; 3) a georeferenced shapefile with a (multi-)polygon.
Output: an STM with the selected pixels.

Reading binary data with separated REAL and IMAG part

Some SAR data providers deliver binary complex data in two files, separating the real and imaginary parts.

For example check this path:
ls /project/caroline/Share/users/caroline-fvanleijen/projects/saocom/saocom_test/20230415/merged.data

files:

i_HH_mst_28Mar2023.hdr  i_HH_slv1_13Apr2023.hdr  q_HH_mst_28Mar2023.hdr  q_HH_slv1_13Apr2023.hdr
i_HH_mst_28Mar2023.img  i_HH_slv1_13Apr2023.img  q_HH_mst_28Mar2023.img  q_HH_slv1_13Apr2023.img

Files starting with i contain the real data, and q the imaginary data; the .img files are binary data, the .hdr files are metadata.

Currently the from_binary function assumes the complex data is saved in one binary file. We should consider expanding it for this case.
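A minimal sketch of the extension (file names below are stand-ins for the i_*/q_* pairs above): read both binaries with the same shape and dtype, then combine them into one complex array:

```python
import numpy as np

shape = (4, 4)
# Stand-in files for the i_* (real) and q_* (imaginary) binaries.
np.full(shape, 1.0, dtype=np.float32).tofile("i_demo.img")
np.full(shape, 2.0, dtype=np.float32).tofile("q_demo.img")

real = np.fromfile("i_demo.img", dtype=np.float32).reshape(shape)
imag = np.fromfile("q_demo.img", dtype=np.float32).reshape(shape)
slc = (real + 1j * imag).astype(np.complex64)  # one complex SLC array
```

In from_binary this would become a second code path taking a pair of paths instead of one.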

Chunk by nearby points in `point_selection`

At present, the chunking strategy in the point_selection function is based on the index of each point. This may result in points far away from each other ending up in the same chunk, which causes poor performance in the later point processing.

We need a strategy to sort points by spatial proximity and cluster a fixed number of points into one chunk.
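One candidate strategy (an assumption, not an existing sarxarray feature) is to order points along a Morton/Z-order curve before chunking, so that fixed-size chunks of the sorted array are spatially compact:

```python
import numpy as np

def spatial_order(azimuth, range_, bits=16):
    """Return an ordering of points along a Morton (Z-order) curve.

    Interleaves the bits of the integer (azimuth, range) coordinates;
    sorting by the resulting code keeps spatially close points together.
    `bits` should cover the image size; 16 handles up to 65536 pixels.
    """
    az = np.asarray(azimuth, dtype=np.int64)
    rg = np.asarray(range_, dtype=np.int64)
    code = np.zeros(az.shape, dtype=np.int64)
    for b in range(bits):
        code |= ((az >> b) & 1) << (2 * b + 1)
        code |= ((rg >> b) & 1) << (2 * b)
    return np.argsort(code)
```

Slicing the reordered points into equal-size blocks then yields chunks of spatially nearby points.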

Fix warning raised when using the `point_selection` function

See the warning in the demo notebook.

/home/ouku/miniconda3/envs/sarxarray/lib/python3.10/site-packages/dask/array/reductions.py:758: RuntimeWarning: overflow encountered in square
  ns * inner_term**order, axis=axis, **kwargs
/home/ouku/miniconda3/envs/sarxarray/lib/python3.10/site-packages/numpy/core/fromnumeric.py:86: RuntimeWarning: overflow encountered in reduce
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
/home/ouku/miniconda3/envs/sarxarray/lib/python3.10/site-packages/dask/core.py:119: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

This warning concerns the _amp_disp function in stack.py, specifically the following line:

amplitude_dispersion = amplitude.mean(axis=t_order) / amplitude.std(axis=t_order)
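A likely culprit (an assumption worth checking) is that the amplitude arrives as float16, whose maximum is 65504, so the squaring inside `std` overflows. Promoting to float32 before the moment computations avoids it:

```python
import numpy as np

amplitude = np.array([3000.0, 3100.0, 2900.0], dtype=np.float16)

# float16 tops out at 65504, so e.g. 3000**2 overflows to inf:
#   np.float16(3000) ** 2  ->  inf  (RuntimeWarning: overflow)

safe = amplitude.astype(np.float32)   # promote before computing moments
amp_disp = safe.mean() / safe.std()   # the _amp_disp formula, warning-free
```
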
