tsamadosicesat-2app's Introduction

Thoughts on ICESat-2 data for cloud detection training

Why this project?

It was difficult to pick one of the three projects offered, as all three were very interesting, but I would like to apply for the cloud detection project for the following reasons:

This is an unsupervised learning project, and when choosing any project that relies on machine or deep learning, it is imperative to first determine the quality and quantity of the dataset we wish to work with. A brief look at the data products 1 that the ATLAS instrument on ICESat-2 outputs, specifically the profiles of atmospheric backscatter relevant to this project, ATL04 2 (normalised backscatter profiles) and ATL09 3 (calibrated backscatter profiles), showed that quantity would not be an issue: a combined 23 TB of data is available to use or train on (subject to test/train splits). In addition, the data is well organised, known issues are well documented, and the low-level HDF5 files are also available.

I am also drawn to this project by its many potential applications. The most apparent is an additional method for cloud prediction (supporting predictions from optical satellite imagery). Cloud cover often disturbs analyses (e.g. when using other data products like ATL06), and checking whether a disruption is caused by cloud cover seems to be standard procedure *, so there may be room for automating this process (another application of this project).

Finally, atmospheric physics is something that we as undergraduate physics students did not get an opportunity to study, so applying prior knowledge to an as yet unfamiliar domain is a challenge that I look forward to!

The research problem and my skill set

Although the project summary was a little light on detail, here is my understanding of the project:

The Advanced Topographic Laser Altimeter System (ATLAS) provides altimetry data from ICESat-2. The low-level binary data are transferred from ATLAS several times a day to a data centre in Alaska and are then converted to more useful formats for analysis. The datasets relevant to this project are ATL09 and ATL04, which contain signals picked up by the atmospheric channels of ICESat-2 (atmospheric backscatter). From these datasets it is possible to determine whether or not a cloud was present using an estimate of the “apparent surface reflectivity”. There are two flags present in the dataset, cloud_flag_ASR and cloud_flag_atm; the former is preferred (as mentioned in the docs) in daylight conditions and the latter at night. There is also a confidence level associated with the flags, ranging from 0 to 5 (from clear with high/medium/low confidence to cloud with low/medium/high confidence). It is possible to use this as training data for a deep neural network (CNN or LSTM) that detects cloud/no cloud. It may also be possible to go one step further and identify whether the cloud is thin or thick.
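As a rough sketch of how the flags described above might be turned into binary training labels, the snippet below picks the daylight or night flag for each profile and maps the 0-5 confidence value to cloud/no cloud. It assumes the ordering given above (0-2 clear, 3-5 cloud). The `Profile` container, the `is_daytime` field and the function names are hypothetical, introduced only for illustration; they are not part of icepyx or the ATL09 product, and the lowercase `cloud_flag_asr` attribute stands in for cloud_flag_ASR.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Assumed confidence scale, following the description in the text:
# 0, 1, 2 = clear with high / medium / low confidence
# 3, 4, 5 = cloud with low / medium / high confidence
CLOUD_THRESHOLD = 3


@dataclass
class Profile:
    """Flags for one backscatter profile (hypothetical container)."""
    cloud_flag_asr: int  # stands in for cloud_flag_ASR, preferred in daylight
    cloud_flag_atm: int  # preferred at night
    is_daytime: bool     # assumed to be known for each profile


def pick_flag(profile: Profile) -> int:
    """Return the flag preferred for the profile's lighting conditions."""
    if profile.is_daytime:
        return profile.cloud_flag_asr
    return profile.cloud_flag_atm


def to_label(flag: int, drop_low_confidence: bool = False) -> Optional[int]:
    """Map a 0-5 flag to 1 (cloud) or 0 (clear).

    With drop_low_confidence=True the two low-confidence values (2 and 3)
    return None so that they can be left out of the training set.
    """
    if not 0 <= flag <= 5:
        raise ValueError(f"flag outside the assumed 0-5 range: {flag}")
    if drop_low_confidence and flag in (2, 3):
        return None
    return 1 if flag >= CLOUD_THRESHOLD else 0


def build_labels(
    profiles: List[Profile], drop_low_confidence: bool = False
) -> List[Tuple[int, int]]:
    """Return (profile index, label) pairs, skipping dropped profiles."""
    labels = []
    for index, profile in enumerate(profiles):
        label = to_label(pick_flag(profile), drop_low_confidence)
        if label is not None:
            labels.append((index, label))
    return labels
```

Keeping the profile index next to each label means the labels can still be matched to the backscatter profiles after low-confidence cases are dropped. Whether dropping them actually helps training is an open question that would need testing on real data.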

The following Python libraries will be useful in the project:

  • SciPy ecosystem (NumPy, SciPy, pandas, etc.)
  • scikit-learn (sklearn) 4
  • icepyx 5
  • Keras (with TensorFlow backend) or PyTorch 6 (using CUDA; I found the setup for this to be difficult)

With the exception of icepyx (which is quite specific to this project and which I hope to learn to use), I have prior experience with all the packages listed above (from a prior machine learning module, physics modules and the personal mini projects mentioned in my academic CV), which will be useful. I also believe that UCL Myriad 7 could be worth looking into, especially for its TensorFlow capabilities (subject to access).

References

1: https://nsidc.org/sites/nsidc.org/files/images/icesat-2-prod-map-rev.png

2: https://nsidc.org/data/ATL04

3: https://nsidc.org/data/ATL09/versions/3

4: https://www.scipy.org/docs.html

5: https://icepyx.readthedocs.io/en/latest/

6: https://pytorch.org/docs/stable/index.html

7: https://www.rc.ucl.ac.uk/docs/Clusters/Myriad/#tensorflow

I: I found this a very useful introduction to the datasets and the functions of ICESat-2: https://www.youtube.com/watch?v=0guml7ihfdA

II: Excellent demonstration by the data lead at NSIDC on how to obtain and use the data: https://www.youtube.com/watch?v=6KZOPqyp-bY
