🗺️ Major TOM: Expandable Datasets for Earth Observation

A standard for curating large-scale (Terabyte-scale) EO datasets.

This repo currently provides some basic functionality and examples for interacting with Major TOM datasets. This will grow and change as more datasets are created.

📊 Available Datasets

| Dataset | Modality | Number of Patches | Sensing Type | Comments |
|---|---|---|---|---|
| Core-S2L2A | Sentinel-2 Level 2A | 2,245,886 | Multi-Spectral | General-Purpose Global (about 23 TB) |
| Core-S2L1C | Sentinel-2 Level 1C | 2,245,886 | Multi-Spectral | General-Purpose Global (about 23 TB) |

🔭 Demo

You can view samples from the MajorTOM-Core dataset instantly in your browser here: https://huggingface.co/spaces/Major-TOM/MajorTOM-Core-Viewer

📌 Open Access Manuscript

This project has been outlined in https://arxiv.org/abs/2402.12095/.

Abstract

Deep learning models are increasingly data-hungry, requiring significant resources to collect and compile the datasets needed to train them, with Earth Observation (EO) models being no exception. However, the landscape of datasets in EO is relatively atomised, with interoperability made difficult by diverse formats and data structures. If ever larger datasets are to be built, and duplication of effort minimised, then a shared framework that allows users to combine and access multiple datasets is needed. Here, Major TOM (Terrestrial Observation Metaset) is proposed as this extensible framework. Primarily, it consists of a geographical indexing system based on a set of grid points and a metadata structure that allows multiple datasets with different sources to be merged. Besides the specification of Major TOM as a framework, this work also presents a large, open-access dataset, MajorTOM-Core, which covers the vast majority of the Earth's land surface. This dataset provides the community with both an immediately useful resource, as well as acting as a template for future additions to the Major TOM ecosystem.
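To make the grid-point indexing idea from the abstract more concrete, here is a minimal sketch of how a point on the Earth could be mapped to a grid cell, and how records from different sources could then be merged on that cell. It is not the Major TOM grid specification: the 10 km spacing, the cosine-scaled column count, and the names `GridCell`, `make_row_counts`, `locate` and `merge_by_cell` are all assumptions made for illustration.

```python
from __future__ import annotations

import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


@dataclass(frozen=True)
class GridCell:
    """Hypothetical cell identifier (not the official Major TOM naming)."""
    row: int  # latitude band, counted northwards from the south pole
    col: int  # position within the band, counted eastwards from -180 degrees


def make_row_counts(spacing_km: float) -> list[int]:
    """Return the number of columns in each latitude band.

    Bands are roughly `spacing_km` tall, and each band gets as many columns
    as fit around the Earth at its central latitude, so that cells stay
    approximately `spacing_km` wide everywhere.
    """
    band_deg = math.degrees(spacing_km / EARTH_RADIUS_KM)
    n_rows = math.ceil(180.0 / band_deg)
    counts = []
    for r in range(n_rows):
        lat_centre = -90.0 + (r + 0.5) * 180.0 / n_rows
        circumference = 2 * math.pi * EARTH_RADIUS_KM * math.cos(math.radians(lat_centre))
        counts.append(max(1, round(circumference / spacing_km)))
    return counts


def locate(lat: float, lon: float, counts: list[int]) -> GridCell:
    """Map a latitude/longitude pair (in degrees) to the cell that contains it."""
    n_rows = len(counts)
    row = min(int((lat + 90.0) / 180.0 * n_rows), n_rows - 1)
    col = min(int((lon + 180.0) / 360.0 * counts[row]), counts[row] - 1)
    return GridCell(row, col)


def merge_by_cell(*datasets: tuple[str, list[dict]]) -> dict[GridCell, dict[str, dict]]:
    """Group records from several named sources under their shared grid cell."""
    merged: dict[GridCell, dict[str, dict]] = {}
    for name, records in datasets:
        for rec in records:
            merged.setdefault(rec["cell"], {})[name] = rec
    return merged


if __name__ == "__main__":
    counts = make_row_counts(10.0)  # assumed 10 km spacing
    cell = locate(55.95, -3.19, counts)
    # Two made-up records from different sources that fall in the same cell.
    optical = [{"cell": cell, "product": "optical-example"}]
    radar = [{"cell": locate(55.951, -3.191, counts), "product": "radar-example"}]
    print(merge_by_cell(("optical", optical), ("radar", radar))[cell])
```

Keying every record on a shared cell identifier is what would let independently produced datasets be combined with a simple lookup rather than a spatial join.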

If you find this useful in your research, please cite it as:

@inproceedings{Major_TOM,
  title={Major TOM: Expandable Datasets for Earth Observation}, 
  author={Alistair Francis and Mikolaj Czerkawski},
  year={2024},
  eprint={2402.12095},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Powered by Φ-lab, European Space Agency (ESA) 🛰️


FAQ

Is Major TOM just another EO dataset?

Almost. Major TOM is not itself a dataset but a project that aims to standardise future EO datasets. MajorTOM-Core, a nearly global dataset of Sentinel-2 data, is released as an example of what such a dataset can look like.

Scroll up to the 📊 Available Datasets section of this file to see the list of current datasets.

Who is going to contribute to upcoming Major TOM datasets?

Anyone can contribute. The original authors of the Major TOM paper are already working on a few other datasets that will join the Major TOM initiative.

Can I join the Major TOM organisation on Hugging Face?

Anyone can join the organisation with read access. To gain contributor rights, contact one of the admins to verify who you are and how you would like to contribute. Any dataset that follows the Major TOM standard should be eligible.
