vsainteuf / pytorch-psetae

PyTorch implementation of the model presented in "Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention"

License: MIT License

Python 100.00%
pytorch satellite-image time-series-classification deep-learning computer-vision earth-observation remote-sensing agriculture self-attention transformer-architecture

pytorch-psetae's Introduction

Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention (CVPR 2020, Oral)

PyTorch implementation of the model presented in "Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention", published at CVPR 2020.

Paper abstract:

Satellite image time series, bolstered by their growing availability, are at the forefront of an extensive effort towards automated Earth monitoring by international institutions. In particular, large-scale control of agricultural parcels is an issue of major political and economic importance. In this regard, hybrid convolutional-recurrent neural architectures have shown promising results for the automated classification of satellite image time series. We propose an alternative approach in which the convolutional layers are advantageously replaced with encoders operating on unordered sets of pixels to exploit the typically coarse resolution of publicly available satellite images. We also propose to extract temporal features using a bespoke neural architecture based on self-attention instead of recurrent networks. We demonstrate experimentally that our method not only outperforms previous state-of-the-art approaches in terms of precision, but also significantly decreases processing time and memory requirements. Lastly, we release a large open-access annotated dataset as a benchmark for future work on satellite image time series.

[UPDATES]

  • 17.08.2021 Check out our new approach for panoptic segmentation of satellite image time series, as well as our new benchmark dataset for semantic and panoptic segmentation of satellite image time series.
  • 17.07.2020 Check out our lightweight version of the TAE: a channel grouping strategy brings better performance with 10 times fewer parameters.
  • 30.03.2020 Dataset preparation script available in the 'preprocessing' folder + variation of the PixelSetData class that loads all samples to RAM at init.
  • 12.03.2020 Bug fix in the TAE script (see pull request comments): if you were using a previous version, re-download the pre-trained weights.

Requirements

  • Pytorch + torchnet
  • numpy + pandas + sklearn

The code has been tested in the following environment:

Ubuntu 18.04.1 LTS, python 3.6.6, pytorch 1.1.0, CUDA 10.0

Downloads

Datasets

A toy version of the Pixel-set dataset can be directly downloaded here, to get an idea of the dataset structure.

The complete Pixel-set and Pixel-patch datasets are accessible on Zenodo at the following links:

Pre-trained weights

We also provide the pre-trained weights for inference.

Code

Code structure

  • The PyTorch implementations of the PSE, TAE and PSE+TAE architectures are located in the models folder.
  • The folder learning contains some additional utilities that are used for training.
  • The repository also contains two high-level scripts train.py and inference.py that should make it easier to get started.

Code Usage

Reproduce

Run the train.py script to reproduce the results of the PSE+TAE architecture presented in the paper. You will just need to specify the path to the Pixel-Set dataset (link above) with the --dataset_folder argument, as in the example below.
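
For example (a minimal invocation sketch; the dataset path is hypothetical and should be replaced with the location of your local copy of the Pixel-Set dataset):

$ python train.py --dataset_folder /path/to/S2-2017-T31TFM-PixelSet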

Experiment

The default settings of the train.py script are those used to produce the results in the paper. Yet, some options are already implemented to play around with the model's hyperparameters and other training settings. These options are accessible through an argparse menu (see directly inside the script).
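
Since these options are exposed through argparse, the full list can be printed from the command line (this relies on argparse's standard --help flag, not on a project-specific option):

$ python train.py --help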

Re-use

  • You can use the pre-trained weights in the inference.py script to produce predictions on our dataset or your own, provided that it is formatted as per the indications below. You will need to pass the path to the unzipped folder containing the weights with the --weight_dir argument. (do not uncompress the model.pth.tar files as the script takes care of this.)

  • The two components of our model (the PSE and the TAE) are implemented as stand-alone pytorch nn.Modules (in pse.py and tae.py) and can be used for other applications. While the PSE needs to be used in combination with the PixelSetData class, the TAE can be applied to any sequential data (with input tensors of shape batch_size x sequence_length x embedding_size).

Data format

In order to use the PixelSetData dataset class with data other than that provided in the link above, the data folder should be structured as follows:

Data structure

Samples

Each dataset sample consists of the observations of a single parcel. The observations are aggregated in a single array of shape TxCxS, with T the number of temporal observations, C the number of channels, and S the number of pixels in the parcel (different for each data sample). Each of these arrays should be stored separately in a numpy file: unique_id_of_the_sample.npy

All the individual .npy files are stored in the same sub-directory DATA.
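
For illustration, here is a minimal sketch of how such a sample file could be produced; the array values and the parcel id are placeholders, only the T x C x S layout and the one-file-per-parcel convention come from the description above:

import os
import numpy as np

os.makedirs("Dataset_folder/DATA", exist_ok=True)

T, C, S = 24, 10, 57  # temporal observations, channels, pixels in this parcel (S varies per parcel)
sample = np.random.rand(T, C, S).astype(np.float32)  # placeholder for the real reflectance values

parcel_id = "12345"  # hypothetical unique id of the sample
np.save(os.path.join("Dataset_folder", "DATA", f"{parcel_id}.npy"), sample)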

Normalisation values

The normalisation values should be computed beforehand and stored as a tuple of arrays (means, stds) in a pickle file in the main folder. The PixelSetData dataset class can adapt to different normalisation strategies depending on the shape of the arrays (see the sketch after this list):

  • Channel-wise normalisation for each date → the arrays have shape (TxC)
  • Channel-wise normalisation → the arrays have shape (C,)
  • Global normalisation → in that case, each of the two arrays is a single value.
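
Below is a minimal sketch of how the channel-wise per-date statistics could be computed and pickled, assuming all samples share the same T and C; the folder and file names follow the structure described further down, everything else is illustrative:

import os
import pickle
import numpy as np

data_dir = os.path.join("Dataset_folder", "DATA")

# Accumulate per-date, per-channel sums over the pixel dimension of every sample.
sums, sq_sums, count = None, None, 0
for fname in os.listdir(data_dir):
    arr = np.load(os.path.join(data_dir, fname))  # shape (T, C, S)
    if sums is None:
        sums = np.zeros(arr.shape[:2])
        sq_sums = np.zeros(arr.shape[:2])
    sums += arr.sum(axis=-1)
    sq_sums += (arr ** 2).sum(axis=-1)
    count += arr.shape[-1]

means = sums / count                          # shape (T, C)
stds = np.sqrt(sq_sums / count - means ** 2)  # shape (T, C)

with open(os.path.join("Dataset_folder", "normalisation_values.pkl"), "wb") as out:
    pickle.dump((means, stds), out)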

Labels

The labels should be stored in the META/labels.json file. This file has a nested dictionary-like structure and can contain multiple nomenclatures:

labels.json = {
  "Name_of_nomenclature1": {
    "unique_id_0": label_0,
    ...,
    "unique_id_N": label_N,
    }, 
  "Name_of_nomenclature2": {
    "unique_id_0": label_0,
    ...,
    "unique_id_N": label_N,
    }
}

Dates and pre-computed features

The dates of the observations, if they are going to be used for the positional encoding, should be stored in YYYYMMDD format in the META/dates.json file:

dates.json = {
    0: date_0,
    ...,
    T: date_T,
}

If some pre-computed static parcel features are to be used between the two MLPs of the PSE, they should be stored in another json file META/name_of_features.json:

name_of_features.json = {
    "unique_id_0": features_0,
    ...,
    "unique_id_N": features_N,
}
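
A minimal sketch of how the META files could be written is shown below; the labels, dates and feature values are placeholders, and whether dates should be stored as integers or strings in YYYYMMDD format is best checked against the toy dataset:

import json
import os

os.makedirs("Dataset_folder/META", exist_ok=True)

labels = {"nomenclature_1": {"12345": 3, "12346": 7}}            # hypothetical nomenclature and parcel ids
dates = {str(i): d for i, d in enumerate([20170102, 20170112])}  # one YYYYMMDD entry per observation
geomfeat = {"12345": [0.52, 1.3, 0.8, 0.95], "12346": [0.31, 2.1, 1.2, 0.88]}  # pre-computed parcel features

for name, content in [("labels.json", labels), ("dates.json", dates), ("geomfeat.json", geomfeat)]:
    with open(os.path.join("Dataset_folder", "META", name), "w") as f:
        json.dump(content, f, indent=2)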

Folder structure

The dataset folder should thus have the following structure:

Dataset_folder
│ normalisation_values.pkl
└─DATA
│    │ sample0.npy
│    │ . . .
│    │ sampleN.npy
└─META
     │ labels.json
     │ dates.json
     │ geomfeat.json

Credits

Reference

If you use part of this code, please include a citation to the following paper:

@article{garnot2019psetae,
  title={Satellite Image Time Series Classification with Pixel-Set Encoders and Temporal Self-Attention},
  author={Sainte Fare Garnot, Vivien  and Landrieu, Loic and Giordano, Sebastien and Chehata, Nesrine},
  journal={CVPR},
  year={2020}
}


pytorch-psetae's Issues

geomfeat in paper

Hi VSainteuf!
I really like your paper. I'm trying to reproduce LTAE using PSE on my own data from irrigated and unirrigated parcels in the US. I wonder what you used to populate 'geomfeat.json'? I've extracted latitude, longitude, and elevation; did you use something like this?
Thanks!

Linear interpolation of cloudy pixels

Hello!
Really interesting paper, and a very cool approach. I was wondering about the paragraph in Section 4.1: "The values of cloudy pixels are linearly interpolated from the first previous and next available pixel using Orfeo Toolbox", which I assume means that cloud masks are required for the satellite images and that any cloudy pixels are replaced with the nearest (in time) land surface pixel.

Is there any specific reason why you do not include cloudy pixel values? Can the network not handle them? In Section 2, under "Attention-Based Approach", you mention that "Transformer yields classification performance that is on par with RNN-based models and present the same robustness to cloud-obstructed observations", so it would seem that it should not be a problem to include them, or am I missing something?

Thanks!

dataset

Great job! How can I get the dataset? I couldn't find your email address.

Duplicating pixels, DataLoader, pooling functions

Observations

  • Because the DataLoader expects samples of the same shape (N, T, C, S), you duplicate pixels so that all samples have the same S.
  • You save a mask of original/duplicated pixels for the aggregation layer, used by custom pooling functions.
  • Drawback: forward passes are applied to duplicated pixels, but the results are thrown away during aggregation, which wastes resources and makes the pipeline more complex.

Question
Did you consider any other workaround to handle a variable S between samples?
Did you consider any solution that doesn't require building a mask (the torch_geometric DataLoader, for instance)?

Question regarding the normalization values

Hi there,

I wonder how you calculated the channel-wise normalization for each date. Mean and std over the whole training dataset? To be more specific, let's say the first element of the tuple is a TxC array of mean values (array_1). Is the first element of array_1 (array_1[0][0]) the mean value over all the pixels of the whole training dataset for the first channel and first date?

I am a little bit confused.

About dataset.

Hi, excellent work. I emailed you, but there was no reply. Can you share the dataset with me?

Is dates.json mandatory?

@VSainteuf, the following statement seems to mention that the dates.json file is not mandatory:
The dates of the observations, if they are going to be used for the positional encoding, should be stored in YYYYMMDD format in the META/dates.json file
I am using my own dataset, which is collected according to the growing seasons of the crop instead of a fixed duration. Hence, I am not using dates.json as it does not seem to be useful in this case. But it looks like this file is mandatory. I even tried putting the dates.json file loader in a conditional statement, but that did not work either.
Can you please confirm whether this file is necessary or not? And if yes, can you point me to the reference that states its usage?

How do I build my own dataset?

I recently read your paper and I wanted to know how to go about making my own dataset, which I found to be rather difficult.

Question

Hi, VSainteuf, I'm really interested in your project. But I don't understand the reference data 'rpg_2017_T31TFM.geojson' in the ./preprocessing/dataset_preparation.py file. Could you briefly introduce the data content or give a reference? I really hope to get your reply. Thank you very much. 😊

How to train on the dataset which is sequential?

Hello! I have a dataset which is sequential, with varying lengths, for example 109x8x316 or 90x8x517 for a given object. I have tried setting 'order' for positions and 137 for --lms, but I am getting the following error: RuntimeError: stack expects each tensor to be equal size, but got [105, 8, 64] at entry 0 and [109, 8, 64] at entry 1. Can you guide me on this, please?

Timestamp

Greetings Mr @VSainteuf
I have a question about the inference if you may.
Is it an obligation to use the same T in training and inference? I'm dealing with data that can have crop rotation, so if the class of the land changes we will not notice: in the case of T=24, there will be one new picture of class "1" while the 23 other pictures still show the old class "0", so the new picture is basically meaningless. I know it's a little bit weird, but can I use only the last picture at inference time (T=1), or repeat this latter image 24 times (so it is as if there were 24 different images)?
-Another question, if you may :) Based on your experience, do you think it is possible to use this algorithm for urban applications, say to determine whether there is a building in a parcel or whether it is empty?

GeoTiff and GeoJson

Hello!
I have a question about the nature of the data if you may.
In my case, I have as input data "GeoTiff" and the labels are GeoJson.
So my question is: can I use this type of data to train the model, or should I strictly use .npy and JSON data?
And if I can, is there anything to change in the code?
Thank you!

What are the crop types of the 19 classes? And of the 44 classes?

Hi ,
I will use your dataset to pretrain a model for a crop classification task, but I don't know the specific crop categories in your dataset. It doesn't seem to be mentioned in the paper. Could you please tell me the crop categories for the 19 classes and the 44 classes?

Importance of dates.json and dataset with several seasons

Hello, @VSainteuf

I have two questions regarding the dataset

  1. I am going to train the model on a dataset that contains the same fields over different seasons (year X, year X+1). So, is it okay if I name the samples "id_year.npy" to differentiate between years?
  2. What is the reason behind the differences in dates in dates.json? As I understand it, the delta between two consecutive dates varies; does that affect the model somehow? Also, if I create the dataset as mentioned in the first question, how should I set the year in dates.json? Can it be any arbitrary year, since only the days matter when computing differences?

Thank you for your response in advance

Order of channels

Hi --

Are you able to give some information about the order of channels in your pre-processed data? I see that you're using 10 of the Sentinel-2 bands, but could you give their names and the order they appear in, e.g. in the S2-2017-T31TFM-PixelSet-TOY dataset?

Thanks!

Label set in paper

In the paper, the results are reported for a 20-class classification task. However, it looks like the number of classes in the dataset is either 17 or 35:

>>> z = json.load(open('labels.json'))
>>> len(set(z['label_19class'].values()))
17
>>> len(set(z['label_44class'].values()))
35

Are you able to give more details on the setup of the experiments in the paper?

Thanks!

Edit: Also -- there's a -1 class in label_19class -- what does that mean?

Question about THEIA tiles pre-processing

Hello @VSainteuf,

Thanks a lot for this great repo!

Reading the preprocessing code you pushed here, I realised you read all 10 bands from one single geotiff.

Downloading the data from THEIA myself, I see that the 10 bands are given in separate geotiffs, with different resolutions (10m & 20m).

My questions are:

  • Am I missing something from THEIA that could provide me a stacked-band geotiff directly? (For now I am using the theia.cnes.fr/atdistrib/resto2/collections/SENTINEL2 service to download the tiles.)
  • Do you stack the geotiffs together yourself using rasterio & GDAL? If so, is the 20m resolution interpolated to 10m, or is the resulting picture down-sampled to 20m?
  • Are you using THEIA cloud masks to rule out clouds in the pixel samples? I saw in another issue that you were interpolating cloudy pixels beforehand, but that the model can handle clouds :)

Thanks a lot!

Best,

JB

Inference issue

Greetings Mr @VSainteuf
I have been trying to run inference using the pre-trained weights that you've provided, on the toy version of the Pixel-set dataset.
It worked successfully, as we can see below:

$ python inference.py --dataset_folder data --weight_dir /home/Desktop/pytorch-psetae/weights/CVPREPO_FINAL/
{'T': 1000,
'batch_size': 128,
'd_k': 32,
'dataset_folder': 'data',
'device': 'cuda',
'dropout': 0.2,
'fold': 'all',
'geomfeat': 1,
'input_dim': 10,
'lms': None,
'mlp1': [10, 32, 64],
'mlp2': [132, 128],
'mlp3': [512, 128, 128],
'mlp4': [128, 64, 32, 20],
'n_head': 4,
'npixel': 64,
'num_classes': 20,
'num_workers': 8,
'output_dir': './output',
'pooling': 'mean_std',
'positions': 'bespoke',
'weight_dir': '/home/Desktop/pytorch-psetae/weights/CVPREPO_FINAL/'}
Preparation . . .
Loading pre-trained models . . .
Successfully loaded 5 model instances
Inference . . .
100%|███████████████████████████████████████████████████████████████████████████████████| 4/4 [00:01<00:00, 2.52it/s]
Results stored in directory ./output

But I have 3 questions, if you may:
First, the output file is named "Predictions_id_ytrue_y_pred.npy" and its shape is (499, 3) (you can find it attached). So what does that mean? What information does it provide?
Second, in order to run the inference, you have to provide the META data containing labels.json, dates.json, etc., right?
So, in case I want to run inference on a Sentinel-2 image of my own, from another tile of the globe, and know its class, am I obliged to provide its META data? I thought that we should only provide the satellite image and it would do the rest. Am I wrong?

Finally, I was wondering how to prepare the "pkl" (mean_std) file?

Thank you!

file.zip

Dataset.py script + geomfeat.json

Hello Mr @VSainteuf
I'm preparing my own dataset; I would like to add a new crop type by training the model, but I'm having difficulties with the scripts used in this case.
I have the Sentinel-2 time series images ready, including their corresponding .npy DATA (TxCxN) and the META (labels.json, dates.json and sizes.json). However, it seems that I also need the geomfeat.json file for the training part. Is there any way to generate it, i.e. to compute its 4 parameters, especially the cover ratio?
Another thing: can you please explain exactly what the "dataset.py" script does? It takes into consideration the .npy DATA and its META (including labels.json and dates.json), but what exactly does it generate as output?
Thank you!

Generation of normalisation_values.pkl

Hi, I am aiming to use this model with my own dataset. I have used the dataset_preparation.py file to do so, and it created the META and DATA folders correctly. However, I am not able to find a way to generate the normalisation_values.pkl file necessary to train the model with my data. Is there any way to create it?

Tensor reshape

Hi,
First of all, thanks for your impressive work!

I was wondering why you reshape your 4D inputs (NxTxCxS) into 3D tensors (N*TxCxS).

Question 1
My guess is that you wanted to use BatchNorm1d layers (which only support 2D and 3D tensors).
Is there any deeper reason?

Question 2
Would it be possible to use BatchNorm2d (which supports 4D tensors) instead? My guess is that it would not lead to the same normalization process (batch normalization performed per timestamp, instead of over the entire batch of N*T samples), but it seems OK to me. What are your thoughts about it?
