Building a neural network that can geotag an outdoor image, and how to catch a cheating neural network with Grad-CAM
Please check out the Medium article for a quick overview.
To train the model:
python train.py
Building a CNN model to geotag an image: it takes an image as input and predicts the location of that image as output.
The model was trained on a dataset of Google Street View images, which I scraped from random locations in India. Its predictions are reasonably good: they generally fall in the vicinity of the actual location.
These are handpicked good examples. Even when the model's predicted location is wrong, the predicted grids are reasonable:
I cropped out the bottom portion of each image. The model's accuracy was lower after that, but it was still able to pick up general patterns such as landscape, buildings, vegetation, roads, and terrain when making predictions.
I overlaid an isometric grid onto the map of India. The resulting grid cells were the target classes the model needed to predict.
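To make the idea of "grid cells as classes" concrete, here is a minimal sketch of turning a latitude/longitude pair into a class label. It is not the code used in this repo: it swaps the isometric grid for a plain rectangular latitude/longitude grid because that is simpler to show, and the bounding box, the cell size, and the names `cell_id` and `cell_center` are all assumptions made for illustration.

```python
import math

# Rough bounding box around India (assumed values, for illustration only).
LAT_MIN, LAT_MAX = 6.0, 38.0
LON_MIN, LON_MAX = 68.0, 98.0
CELL_DEG = 2.0  # hypothetical cell size in degrees

N_ROWS = math.ceil((LAT_MAX - LAT_MIN) / CELL_DEG)
N_COLS = math.ceil((LON_MAX - LON_MIN) / CELL_DEG)


def cell_id(lat, lon):
    """Return the integer class label of the cell containing (lat, lon).

    Returns None when the point lies outside the bounding box.
    """
    if not (LAT_MIN <= lat < LAT_MAX and LON_MIN <= lon < LON_MAX):
        return None
    row = int((lat - LAT_MIN) // CELL_DEG)
    col = int((lon - LON_MIN) // CELL_DEG)
    return row * N_COLS + col


def cell_center(cid):
    """Return the (lat, lon) of the center of cell `cid`."""
    row, col = divmod(cid, N_COLS)
    return (LAT_MIN + (row + 0.5) * CELL_DEG,
            LON_MIN + (col + 0.5) * CELL_DEG)
```

With a setup like this, the model becomes a classifier over cell labels, and a predicted label can be mapped back to a point on the map with `cell_center`.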
I then sampled points uniformly in each grid cell, used Google's Street View API to get the nearest location with a Street View image, and grabbed 4 images from the 360-degree view at headings of 0, 90, 180, and 270 degrees.
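The sampling step can be sketched roughly as follows. This is an illustration, not the scraper from this repo: it samples inside a rectangular box rather than an isometric cell, and the helper names `sample_points` and `image_urls`, the image size, and the API key placeholder are assumptions. The URL follows the Street View Static API, which takes `size`, `location`, `heading`, and `key` as query parameters. The code only builds the URLs and makes no network requests.

```python
import random
from urllib.parse import urlencode

STREETVIEW_URL = "https://maps.googleapis.com/maps/api/streetview"
HEADINGS = (0, 90, 180, 270)  # the four view angles described above


def sample_points(lat_min, lat_max, lon_min, lon_max, n, seed=0):
    """Draw n points uniformly in latitude/longitude inside a box.

    Note: uniform in degrees is not exactly uniform in area on the globe.
    """
    rng = random.Random(seed)
    return [(rng.uniform(lat_min, lat_max), rng.uniform(lon_min, lon_max))
            for _ in range(n)]


def image_urls(lat, lon, api_key, size="640x640"):
    """Build one Street View Static API URL per heading for a location."""
    urls = []
    for heading in HEADINGS:
        params = {
            "size": size,
            "location": f"{lat:.6f},{lon:.6f}",
            "heading": heading,
            "key": api_key,
        }
        urls.append(f"{STREETVIEW_URL}?{urlencode(params)}")
    return urls
```

For example, `image_urls(28.61, 77.21, "YOUR_API_KEY")` returns four URLs, one per heading, which a downloader could then fetch and store under the label of the cell the point was sampled from.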
Method: Group KFold - 10 splits - grouped by location
Average Accuracy: 25%
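Grouping by location matters because the four images taken at one location are near-duplicates of the same scene; if they were split across train and test, the accuracy would be inflated. Below is a minimal pure-Python sketch of a grouped k-fold split, written only to show the idea. The function name `group_k_fold` is hypothetical and this is not necessarily the code in `train.py`; scikit-learn provides the same behavior as `sklearn.model_selection.GroupKFold`.

```python
from collections import defaultdict


def group_k_fold(groups, n_splits=10):
    """Yield (train_idx, test_idx) pairs where no group spans both sides.

    `groups` holds one group label per sample, e.g. a location id.
    """
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    if len(by_group) < n_splits:
        raise ValueError("need at least as many groups as splits")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples.
    folds = [[] for _ in range(n_splits)]
    for group in sorted(by_group, key=lambda g: -len(by_group[g])):
        min(folds, key=len).extend(by_group[group])

    for k in range(n_splits):
        test_idx = sorted(folds[k])
        train_idx = sorted(i for j, fold in enumerate(folds)
                           if j != k for i in fold)
        yield train_idx, test_idx
```

Here `groups` would be the location id of each image, so all four headings of a location always land in the same fold.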
- PlaNet - Photo Geolocation with Convolutional Neural Networks - https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45488.pdf
- DeepGeo: Photo Localization with Deep Neural Network - https://arxiv.org/abs/1810.03077
- GradCam on ResNext - https://www.kaggle.com/skylord/grad-cam-on-resnext