dynaslum / satelliteimaging
The software for WP1: SatelliteImaging
License: Apache License 2.0
When a tile is not square (does it have 2 mixed classes?), it is labeled as 'Mixed' even though it belongs to one clear class.
Dataset1 (px417m250)
Train 22 classifiers
Record performance in Excel sheet
Import to MATLAB as table and save in MAT file
Generate performance graphs & save them as fig and pdf
Dataset2 (px333m200)
Train 22 classifiers
Record performance in Excel sheet
Import to MATLAB as table and save in MAT file
Generate performance graphs & save them as fig and pdf
Dataset3 (px250m150)
Train 22 classifiers
Record performance in Excel sheet
Import to MATLAB as table and save in MAT file
Generate performance graphs & save them as fig and pdf
Dataset4 (px167m100)
Train 22 classifiers
Record performance in Excel sheet
Import to MATLAB as table and save in MAT file
Generate performance graphs & save them as fig and pdf
Ignoring Dataset5 (px83m50)
Dataset6 (px100m80)
Train a few classifiers
Record performance in Excel sheet
Import to MATLAB as table and save in MAT file
Generate performance graphs & save them as fig and pdf
The classificationLearner app expects tables.
Create datastore from the dataset
Split the datastore into training and testing subsets
Use the Upright flag in BagOfFeatures class as a parameter to create BoVW.
Recompute the SURF features with the Upright flag set to 0.
Follow the Image Category Classifier example, but also use our own scripts.
Use some default choices (SURF, 80% of the strongest features, multi-class SVM), but also our own parameters:
vocabulary sizes of 10, 20 and 50
SURF locations 'Detector' and 'Grid' (default settings: GridStep is [8 8] and BlockWidth is [32 64 96 128] (> 100??))
Apply on data-set 6, 100px = 80m.
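The bag-of-visual-words step above is done with MATLAB's bagOfFeatures; as a language-neutral illustration, the same encoding can be sketched in Python with k-means in place of MATLAB's built-in vocabulary clustering. The function names and the random toy "SURF" descriptors below are illustrative, not repository code:

```python
import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(descriptors, vocab_size):
    """Cluster local descriptors (e.g. SURF) into a visual vocabulary."""
    km = KMeans(n_clusters=vocab_size, n_init=10, random_state=0)
    km.fit(descriptors)
    return km

def encode_bovw(vocab, image_descriptors):
    """Histogram of visual-word assignments for one image, L1-normalised."""
    words = vocab.predict(image_descriptors)
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return hist / hist.sum()

# toy example: 200 random 64-D descriptors, vocabulary of 10 words
rng = np.random.default_rng(0)
desc = rng.normal(size=(200, 64))
vocab = build_vocabulary(desc, vocab_size=10)
hist = encode_bovw(vocab, desc)
```

Repeating this with vocab_size 10, 20 and 50 mirrors the vocabulary-size sweep listed above.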
Make script skeleton
Balance categories
Loop over some parameters
Implement cross-validation
Improve the output
Publish
Determine the best classifier from the tested vocabulary sizes and SURF location points
Run with the best classifier options, save the model, and publish.
Share with partners
Perhaps the best place for such a utility is
/satsets/util/shapefile.py
See the cell under
Function to display a multipolygon from a shape file on a figure axis with given color and extent
currently cell 22 (plus the relevant import from the first cell, currently 19) from this source notebook
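A sketch of such a display utility, assuming shapely and matplotlib are available; `show_multipolygon` and its signature are hypothetical, not the notebook's actual cell:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted use
import matplotlib.pyplot as plt
from shapely.geometry import MultiPolygon, Polygon

def show_multipolygon(ax, multipolygon, color, extent):
    """Draw each polygon of a shapely MultiPolygon on the given axis,
    then fix the view to extent = (xmin, xmax, ymin, ymax)."""
    for poly in multipolygon.geoms:
        x, y = poly.exterior.xy
        ax.fill(x, y, color=color, alpha=0.5)
    ax.set_xlim(extent[0], extent[1])
    ax.set_ylim(extent[2], extent[3])
    return ax

# toy usage: one unit square drawn in red on a 3x3 extent
square = MultiPolygon([Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])])
fig, ax = plt.subplots()
show_multipolygon(ax, square, color="red", extent=(0, 3, 0, 3))
```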
Modify demo notebook: toy example JI between multipolygons
Modify demo notebook: JI between 2 shapefiles
Write draft of extended abstract
Generate new figures for the abstract
Save the trained models and confusion matrices for each data set.
px83m50
px167m100
px250m150
px333m200
px417m250
Add missing slums
Merge missing slums to given slums -> all_slums
Improve rough built-up mask
Make slums and built-up disjoint => proper built-up mask
Generate non-built-up mask as all remaining pixels
The ML white paper by MATLAB is available here
Create multiclass_mask2oneclass_masks_Kalyan
Use multiclass_mask2oneclass_masks.
See oneclass_masks2multiclass_mask_Kalyan
Generate at least 1 image per class per dataset visualizing the SURF features (interest points/blobs). Related to issue #22.
We have the HoG feature. But this needs to be summarized into 5 features per window.
Features are:
Mean
Location of the peak
Absolute sine difference between the two highest peaks.
For more info see: https://digital.library.adelaide.edu.au/dspace/bitstream/2440/56300/1/hdl_56300.pdf
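The window summary for the three features listed above can be sketched as follows, assuming a HoG orientation histogram and its bin orientations are already computed per window; `summarize_hog` is a hypothetical name, and the remaining two of the 5 features are not specified here:

```python
import numpy as np

def summarize_hog(hist, bin_angles):
    """Summarise one window's HoG orientation histogram into the three
    listed features: mean magnitude, orientation of the highest peak,
    and |sin| of the angle between the two highest peaks."""
    order = np.argsort(hist)[::-1]          # bins sorted by descending weight
    peak1 = bin_angles[order[0]]
    peak2 = bin_angles[order[1]]
    return np.array([
        hist.mean(),
        peak1,
        abs(np.sin(peak1 - peak2)),
    ])

# toy example: 9 orientation bins over [0, pi)
angles = np.linspace(0, np.pi, 9, endpoint=False)
hist = np.array([1, 5, 2, 0, 7, 1, 0, 3, 1], dtype=float)
feats = summarize_hog(hist, angles)
```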
The segmentation result could be either pixel or grid Binary Mask (BM) or a (multipolygonal) Shape (S).
The ground truth can be either a shape file or sometimes a pixel mask. In this issue, consider comparisons of the same type: BM <-> BM and S <-> S.
Use the scikit-learn Python implementation of the Jaccard similarity score.
Install scikit-learn
Test the simple example from the documentation.
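In current scikit-learn the relevant function is sklearn.metrics.jaccard_score (the older jaccard_similarity_score has been removed). A simple example on two flattened binary masks:

```python
import numpy as np
from sklearn.metrics import jaccard_score

# two binary masks flattened to 1-D label vectors
gt   = np.array([[1, 1, 0],
                 [0, 1, 0]]).ravel()
pred = np.array([[1, 0, 0],
                 [0, 1, 1]]).ravel()

# |intersection| / |union| over the positive class:
# 2 pixels agree on '1', 4 pixels carry a '1' in either mask -> 0.5
ji = jaccard_score(gt, pred)
```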
Source notebook for comparison of label masks.
Function for comparison of label masks.
Demo notebook using the function
Source notebook: toy example Jaccard index multipolygons
Demo notebook: toy example Jaccard index multipolygons
Source notebook for comparison of shape files. Fix corrupt shape file.
Function for shape files.
Demo notebook using the function for shape files
Decide how to do it: via a datastore or not. Check whether the method predict
can work on a single image tile or on a datastore.
Create function and script for predicting the label of each pixel based on tile (100x100px = 80x80m) around it using a learned classifier.
Segment the whole image using codes: 1 = Slum, 2 = NonBuiltUp and 3 = BuiltUp
Convert the 3 single masks to a single multi-class mask where 1 is BuiltUp, 2 is NonBuiltUp and 3 is Slum. Visualize the ground truth indexed image (pure and overlaid on the original image).
Visualize overlaid ground truth and segmentation on top of original image
Fill in missing pixels of the segmentation. Visualize the resulting segmentation image (pure and overlaid on the original image).
De-noise (via majority filter?) the segmentation. Visualize the resulting segmentation image (pure and overlaid on the original image).
Make a script to compare ground truth to partial segmentation
Compare interpolated segmentation to ground truth
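The majority-filter de-noising step above can be sketched with scipy, assuming an integer label image with the class codes used here; `majority_filter` is a hypothetical helper, not repository code:

```python
import numpy as np
from scipy.ndimage import generic_filter

def majority_filter(labels, size=3):
    """De-noise a label image: each pixel takes the most frequent
    label in its size x size neighbourhood (edges use nearest values)."""
    def majority(window):
        vals, counts = np.unique(window, return_counts=True)
        return vals[np.argmax(counts)]
    return generic_filter(labels, majority, size=size, mode="nearest")

seg = np.array([
    [1, 1, 1, 1],
    [1, 2, 1, 1],   # isolated '2' should be smoothed away
    [1, 1, 1, 1],
    [3, 3, 3, 3],
])
clean = majority_filter(seg)
```

The same helper can be run on the stitched segmentation before comparing it against the ground truth.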
Currently the tiles do not overlap; we should be able to overlap them, controlled by the tiling step size.
Perhaps the best place for such a utility is
/satsets/util/shapefile.py
It's a one-liner, but it's nice to have as a utility. See the cell under
Load the contents of the shapefiles as multipolygons.
currently cell 24 (plus the relevant import from the first cell, currently 19) from this source notebook
The function should return also the multipolygon's bounds (extent). See the same notebook.
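A sketch of that loader, assuming shapely (and fiona for the file reading); the helper names are illustrative, not the utility's final API:

```python
from shapely.geometry import MultiPolygon, Polygon, shape
from shapely.ops import unary_union

def as_multipolygon(geoms):
    """Merge a list of shapely polygons into one MultiPolygon and
    return it together with its bounds (xmin, ymin, xmax, ymax)."""
    merged = unary_union(geoms)
    if merged.geom_type == "Polygon":
        merged = MultiPolygon([merged])
    return merged, merged.bounds

def load_shapefile(path):
    """Read every record of a shapefile and merge into a MultiPolygon."""
    import fiona  # assumed to be installed alongside shapely
    with fiona.open(path) as src:
        return as_multipolygon([shape(rec["geometry"]) for rec in src])

# toy usage without a file on disk: two disjoint unit squares
squares = [Polygon([(0, 0), (1, 0), (1, 1), (0, 1)]),
           Polygon([(2, 0), (3, 0), (3, 1), (2, 1)])]
mp, bounds = as_multipolygon(squares)
```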
Dataset 1 (417px=250m)
Dataset 2 (333px=200m)
Dataset 3 (250px=150m)
Dataset 4 (167px=100m)
Dataset 6 (100px=80m)
For all datasets:
Create functions to generate performance plots as required by issue #24.
Write code which takes the 5 datasets and generates 5 imageDatastores with functionality to count labels, show distributions, and preview samples.
See also this MATLAB FileExchange code
Bag of features needs imageSet still!
Make list of issues related to the data preparation to satisfy the MATLAB datastore command requirements.
Label an image as belonging to a certain class if at least 75% of the image pixels are labelled of that class.
Generate sub-images (aka tiles or patches) and make them 5 datasets with 3 classes
Slum, BuiltUp, and NonBuiltUp each corresponding to the following tile grid sizes (use 1/4th of the size as step_size):
250 m = 417 px
200 m = 333 px
150 m = 250 px
100 m = 167 px
50 m = 83 px
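The 75% labelling rule combined with grid tiling can be sketched as follows, assuming an integer class mask; the function name, class-code mapping, and toy mask are illustrative:

```python
import numpy as np

CLASS_NAMES = {1: "Slum", 2: "BuiltUp", 3: "NonBuiltUp"}

def tile_labels(mask, tile_px, step_px, min_fraction=0.75):
    """Slide a tile_px window over the class mask with the given step and
    label a tile only when at least min_fraction of its pixels share one
    class; 'Mixed' tiles are skipped. Yields (row, col, label)."""
    h, w = mask.shape
    for r in range(0, h - tile_px + 1, step_px):
        for c in range(0, w - tile_px + 1, step_px):
            tile = mask[r:r + tile_px, c:c + tile_px]
            vals, counts = np.unique(tile, return_counts=True)
            best = counts.argmax()
            if counts[best] >= min_fraction * tile.size:
                yield r, c, CLASS_NAMES[vals[best]]

# toy 8x8 mask: mostly BuiltUp (2), with one mixed top-left tile
mask = np.full((8, 8), 2)
mask[0:2, 0:4] = 1
labels = list(tile_labels(mask, tile_px=4, step_px=4))
```

Using step_px = tile_px // 4 reproduces the 1/4-tile step size mentioned above.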
vocabulary size 10
create BoVW
convert to table
vocabulary size 20
create BoVW
convert to table
vocabulary size 50
create BoVW
convert to table
Use the best linear SVM classifier trained and tested on Dataset6, i.e. px100m80, so the tile size is 100x100 pixels (80x80 meters). Use the vocabulary size and mode (Detector/Grid) which produced the best validation results.
Function to generate tile image(s) from random location(s) of a class mask, such that at least 80% of the pixels from the tile belong to the desired class. See nonSlumTiling.m
Script to generate 10 random tiles per class
Publish
Script to test the tile class label prediction using the best pre-trained imageCategoryClassifier
on the 10 random tiles from each class
Run the prediction script
Publish
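The >=80%-purity random tiling above (cf. nonSlumTiling.m) might look like this in outline; `random_class_tiles` is a hypothetical Python counterpart, not the repository's MATLAB code:

```python
import numpy as np

def random_class_tiles(mask, class_id, tile_px, n_tiles,
                       min_fraction=0.8, seed=0, max_attempts=10000):
    """Draw random tile positions until n_tiles tiles have been found
    whose pixels are at least min_fraction of class_id."""
    rng = np.random.default_rng(seed)
    h, w = mask.shape
    tiles = []
    attempts = 0
    while len(tiles) < n_tiles and attempts < max_attempts:
        r = rng.integers(0, h - tile_px + 1)
        c = rng.integers(0, w - tile_px + 1)
        tile = mask[r:r + tile_px, c:c + tile_px]
        if (tile == class_id).mean() >= min_fraction:
            tiles.append((int(r), int(c)))
        attempts += 1
    return tiles

# toy mask: top half is class 1, bottom half class 2
mask = np.full((50, 50), 2)
mask[:25, :] = 1
tiles = random_class_tiles(mask, class_id=1, tile_px=10, n_tiles=10)
```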
Tools for dataset preparation
Create BoVW using SURF (default feature extractor) for the 6 training datastores:
250 m = 417 px
200 m = 333 px
150 m = 250 px
100 m = 167 px
80 m = 100 px
50 m = 83 px
Instead of the paper's approach we could use Canny edge detection + probabilistic Hough lines and calculate the features from that:
Generate train, validation and test sub-image datastores for the 5 image datastores and save them as MAT files:
250 m = 417 px
200 m = 333 px
150 m = 250 px
100 m = 167 px
50 m = 83 px
Urban should be called 'built-up',
and Rural should be 'non-built-up'.
Instead of the paper's feature, let's look at the LSD detector from OpenCV:
http://docs.opencv.org/3.1.0/df/dfa/tutorial_line_descriptor_main.html
Function to compute evaluation numbers given vectors of actual and predicted classes or a pre-computed confusion matrix
Function to evaluate the performance of a pre-trained classifier on a test feature set
(Unit-test) script to compute those for the test datasets
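A sketch of the evaluation function, accepting either label vectors or a pre-computed confusion matrix; scikit-learn is assumed only for building the matrix, and the chosen metrics (accuracy, per-class precision/recall) are illustrative:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

def evaluation_numbers(y_true=None, y_pred=None, cm=None):
    """Overall accuracy plus per-class precision and recall, computed
    either from actual/predicted class vectors or from a pre-computed
    confusion matrix (rows = actual, columns = predicted)."""
    if cm is None:
        cm = confusion_matrix(y_true, y_pred)
    cm = np.asarray(cm, dtype=float)
    accuracy = np.trace(cm) / cm.sum()
    precision = np.diag(cm) / cm.sum(axis=0)   # per predicted class
    recall = np.diag(cm) / cm.sum(axis=1)      # per actual class
    return accuracy, precision, recall

# toy 3-class example
y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
acc, prec, rec = evaluation_numbers(y_true, y_pred)
```

Passing cm= with a saved confusion matrix covers the second use case listed above.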
The new datastore should be of images/tiles of resolution 80m = 100px.
All slum pixels should be used: test whether a tile contains at least 80% slum pixels. If yes => use it for training; if no, check whether it has at least 80% pixels of another class. If yes => use it for training; if not, discard it from training.
See how many slum tiles are obtained above. Randomly sample the same (remaining) number of pixels/80% tiles of the remaining 2 classes: BuiltUp and NonBuiltUp.
Create functions and scripts to do the 'slum conditional' tiling.
Generate training image tiles of the 3 classes.
Clean up by hand some of the bad training images.
Record the new dataset in the Excel sheet.
The feature is implemented in the GLCM source notebook.
It needs to be put into the Python package.
Take care of the tuple of windows.
Make a script for generating a final "segmented" result image by stitching tiles with a given class label.
Add a new Excel sheet to the
C:\Projects\DynaSlum\Results\Classification3Classes\PerformanceComparision\ClassifiersPerformance.odt
Make a function with unit test for tiling an image into different size tiles and assigning class labels to them.