eo4ai's Introduction

---WORK IN PROGRESS---

EO4AI

Earth Observation preprocessing tools for AI and machine learning applications

This project provides easy-to-use tools for preprocessing datasets for image segmentation tasks in Earth Observation (EO). We hope to lower the barrier to entry for data scientists in EO by reducing the time spent reformatting datasets. EO datasets are frequently characterised by very large images, high bit depths, non-standard label formats, pixel values stored as Digital Numbers, varied naming conventions, and other dataset-specific peculiarities that slow down the development of AI applications.

This package aims to provide users with a pre-prepared dataset that is immediately ready for AI / Deep Learning applications. The processed datasets are all:

  • Normalised to reflectance values
  • Resampled to the same resolution
  • Split into smaller images for quicker read times
  • Transformed into one-hot encoded masks
  • Organised into a simple directory tree structure
  • Documented with useful metadata and the command needed for replication
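
Three of the steps listed above (reflectance normalisation, splitting into smaller images, and one-hot encoding) can be sketched with NumPy as follows. This is a minimal illustration, not the package's own implementation: the function names are hypothetical, the rescaling follows the usual linear Landsat-style conversion with coefficients that are assumed to be read from the scene metadata, and edge pixels that do not fill a whole tile are simply dropped.

```python
import numpy as np


def dn_to_reflectance(dn, mult, add, sun_elevation_deg):
    """Rescale Digital Numbers to top-of-atmosphere reflectance.

    mult, add and sun_elevation_deg are assumed to come from the scene
    metadata; they are not hard-coded here because they vary by product.
    """
    reflectance = mult * dn.astype(np.float32) + add
    # Correct for the sun angle so that scenes are comparable.
    return reflectance / np.sin(np.deg2rad(sun_elevation_deg))


def split_into_tiles(array, tile_size):
    """Cut an (H, W, C) array into non-overlapping square tiles.

    Rows and columns left over at the bottom and right edges are
    discarded in this sketch.
    """
    height, width = array.shape[:2]
    tiles = []
    for row in range(0, height - tile_size + 1, tile_size):
        for col in range(0, width - tile_size + 1, tile_size):
            tiles.append(array[row:row + tile_size, col:col + tile_size])
    return tiles


def one_hot_encode(label_map, class_values):
    """Turn an (H, W) label map into an (H, W, n_classes) boolean mask.

    class_values lists the raw label value used for each class, in the
    order the output channels should appear.
    """
    return np.stack([label_map == value for value in class_values], axis=-1)


if __name__ == "__main__":
    # Illustrative inputs only; real coefficients come from the metadata.
    scene = np.full((5, 4, 1), 10000, dtype=np.uint16)
    reflectance = dn_to_reflectance(scene, mult=2e-5, add=-0.1,
                                    sun_elevation_deg=90.0)
    tiles = split_into_tiles(reflectance, tile_size=2)
    labels = np.array([[0, 128], [255, 0]], dtype=np.uint8)
    mask = one_hot_encode(labels, class_values=[0, 128, 255])
    print(len(tiles), tiles[0].shape, mask.shape)
```

Resampling bands to a common resolution is left out of the sketch because it depends on the raster library used to read the scenes.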

Cloud Masking datasets

Landsat 8: Biome (USGS, 2016)

96 manually annotated Landsat 8 scenes (~8k-by-8k pixels) from 8 different terrain types (biomes). Data provided at 30 m resolution for all bands.

Landsat 8: SPARCS (USGS, 2016)

80 manually annotated, cropped Landsat 8 scenes (1k-by-1k pixels). Data provided at 30 m resolution, but without the sharper panchromatic band.

Landsat 7: Irish (USGS, 2016)

206 manually annotated Landsat 7 scenes from a diverse range of latitudes. Data provided at the nominal Landsat 7 resolution of 30 m.

Sentinel-2: ALCD (Baetens et al., 2018)

38 Sentinel-2 scenes annotated through an "active learning" system. Data provided in native band resolutions (10 m to 60 m). The dataset contains only the masks, not the parent scenes. We therefore include a download tool to retrieve the relevant scenes from the Copernicus Open Access Hub, for which a username and password are needed.

Sentinel-2: IRIS (Francis et al., 2020)

513 subscenes from Sentinel-2. Each image and mask pair is 1022 pixels across.

Sentinel-2: KappaZeta (Domnich et al., 2021)

4403 subscenes from 155 Sentinel-2 products. Each image and mask pair is 512 pixels across at 10 m/pixel resolution.

Credits and Contributions

Please use these tools freely in your work. Give this repository an acknowledgement and always credit and cite the datasets' creators, who have put a huge amount of work into these labelled datasets!

If you have a dataset that you think would be a good fit, or would like to contribute to the repository, please post an issue, send a PR, or just get in touch!
