NeurAI-Lab / biHomE

This is the official repo for the CVPR 2021 IMW paper: "Perceptual Loss for Robust Unsupervised Homography Estimation"

License: MIT License

Python 100.00%
homography-estimation unsupervised-learning computer-vision cvpr deep-learning

biHomE's Introduction

Perceptual Loss for Robust Unsupervised Homography Estimation

This is the official code for the CVPR'21 IMW Paper "Perceptual Loss for Robust Unsupervised Homography Estimation" by Daniel Koguciuk, Elahe Arani, and Bahram Zonooz.

Abstract

Homography estimation is often an indispensable step in many computer vision tasks. The existing approaches, however, are not robust to illumination and/or larger viewpoint changes. In this paper, we propose the bidirectional implicit Homography Estimation (biHomE) loss for unsupervised homography estimation. biHomE minimizes the distance in the feature space between the warped image from the source viewpoint and the corresponding image from the target viewpoint. Since we use a fixed pre-trained feature extractor and the only learnable component of our framework is the homography network, we effectively decouple the homography estimation from representation learning. We use an additional photometric distortion step in the synthetic COCO dataset generation to better represent the illumination variation of real-world scenarios. We show that biHomE achieves state-of-the-art performance on the synthetic COCO dataset, comparable to or better than supervised approaches. Furthermore, the empirical results demonstrate the robustness of our approach to illumination variation compared to existing methods.
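To make the idea concrete, here is a rough, one-direction sketch of the biHomE objective in PyTorch (my illustration, not the repository's code; homo_net, feature_net and the use of kornia's warp_perspective are assumptions):

import torch
import torch.nn.functional as F
import kornia.geometry as KG

def bihome_loss_sketch(img_src, img_tgt, homo_net, feature_net):
    # Conceptual sketch: homo_net is the only trainable component,
    # feature_net is a fixed, pre-trained feature extractor.
    h, w = img_src.shape[-2:]
    H_st = homo_net(img_src, img_tgt)                              # (B, 3, 3) source -> target homography
    warped_src = KG.warp_perspective(img_src, H_st, dsize=(h, w))  # differentiable warp
    feat_warped = feature_net(warped_src)                          # gradients flow back to homo_net only
    with torch.no_grad():
        feat_tgt = feature_net(img_tgt)                            # target features need no gradient
    # Distance in feature space between the warped source and the target view;
    # the full biHomE loss applies this in both directions.
    return F.l1_loss(feat_warped, feat_tgt)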


For details, please see the Paper and Presentation.

Environment setup

python3 -m venv venv
source venv/bin/activate
pip3 install --upgrade pip
pip3 install -r requirements.txt

Dataset preparation

To speed up training, we can do part of the data preparation offline:

python3 src/data/coco/preprocess_offline.py --raw_dataset_root /data/input/datasets/COCO/train2014 --output_dataset_root /data/input/datasets/ImageRegistration/coco/train2014
python3 src/data/coco/preprocess_offline.py --raw_dataset_root /data/input/datasets/COCO/val2014 --output_dataset_root /data/input/datasets/ImageRegistration/coco/val2014
ln -s /data/input/datasets/ImageRegistration/coco data/coco/dataset
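
After these commands, the layout seen by the code through the symlink should look roughly like this (inferred from the paths above):

data/coco/dataset -> /data/input/datasets/ImageRegistration/coco
data/coco/dataset/train2014/   (preprocessed train split)
data/coco/dataset/val2014/     (preprocessed val split)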

Training

python3 train.py --config_file config/pds-coco/zeng-bihome-lr-1e-3.yaml

Testing

python3 eval.py --config_file config/pds-coco/zeng-bihome-lr-1e-3.yaml --ckpt log/zeng-bihome-pdscoco-lr-1e-3/model_90000.pth

Cite our work:

@inproceedings{koguciuk2021perceptual,
  title={Perceptual Loss for Robust Unsupervised Homography Estimation},
  author={Koguciuk, Daniel and Arani, Elahe and Zonooz, Bahram},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={4274--4283},
  year={2021}
}

License

This project is licensed under the terms of the MIT license.

biHomE's People

Contributors

dkoguciuk, elahearani


biHomE's Issues

Some questions about FLIR Dataset

Thank you for your brilliant work, but I have some questions:
(1) As far as I know, the image pairs in the FLIR dataset are not aligned, so how could you calculate MACE without ground truth?
(2) In your paper, you said "To test this vulnerability we compare Zhang et al. [44] method with the original and our biHomE loss on FLIR Dataset [7] preprocessed similar to S-COCO." Could you please provide a more detailed description of this process?
(3) I tried to replace the original FLIR dataset with the manually registered dataset (https://github.com/StaRainJ/road-scene-infrared-visible-images), but only the Zeng + Orig strategy trains successfully (maybe because there is too little training data; MACE is only 13 on the test data), and the other unsupervised methods do not converge (the loss does not decrease and its value is abnormally large).
I'm confused about how the FLIR experiment in your paper works.

Why do we need to calculate the homography matrix twice?

Thank you very much for your work, but I have some questions:

  1. In https://github.com/NeurAI-Lab/biHomE/blob/main/src/heads/PerceptualHead.py#L177, DSAC is used to compute the homography and the four-point offsets.
  2. In https://github.com/NeurAI-Lab/biHomE/blob/main/src/heads/PerceptualHead.py#L371, the four-point offsets are used to compute a homography and warp the feature map.
    So the homography matrix is actually computed twice, but the two matrices are slightly different (see the sketch after this question).
    Thanks!
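
For context, a minimal sketch (my assumption about the idea, not the repo's exact code) of how a batched four-corner offset parameterization is typically turned back into a 3x3 homography with kornia; the direct 4-point solve can differ slightly from a robust estimate such as DSAC fitted over many correspondences:

import torch
import kornia.geometry as KG

def four_point_to_homography(delta, h, w):
    # delta: (B, 4, 2) predicted pixel offsets of the four patch corners.
    b = delta.shape[0]
    corners = torch.tensor([[0., 0.], [w - 1., 0.],
                            [w - 1., h - 1.], [0., h - 1.]],
                           device=delta.device, dtype=delta.dtype)
    corners = corners.unsqueeze(0).expand(b, -1, -1)
    # Exact solve of the 4-point system; a robust estimator over many points
    # (e.g. DSAC) generally returns a slightly different matrix.
    return KG.get_perspective_transform(corners, corners + delta)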

The total loss is a negative value.

Thank you for your work and your contribution.

I have a question about the training process.

When I train the model, the total loss is negative.

I didn't change anything and only used the following command:

CUDA_VISIBLE_DEVICES=7 python train.py --config_file config/pds-coco/zhang-bihome-lr-1e-2.yaml

Do you have any idea what might cause this?

Also, could you provide a config file for FLIR ADAS? @dkoguciuk

A question about RANSAC

Hello, in PFNet we should use RANSAC to sample points from the perspective field (PF). How did you implement it? When I use kornia, I run out of CUDA memory.
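
Not an official answer, but one way to avoid the out-of-memory error is to evaluate RANSAC hypotheses in small chunks rather than all at once; a rough sketch with kornia (correspondences pts1, pts2 of shape (N, 2) are assumed, e.g. points sampled from the perspective field):

import torch
from kornia.geometry import transform_points
from kornia.geometry.homography import find_homography_dlt

def ransac_homography_chunked(pts1, pts2, iters=512, chunk=64, thresh=3.0):
    # pts1, pts2: (N, 2) putative correspondences.
    best_h, best_inliers = None, -1
    for start in range(0, iters, chunk):
        b = min(chunk, iters - start)
        # Draw b minimal samples of 4 correspondences each.
        idx = torch.randint(0, pts1.shape[0], (b, 4), device=pts1.device)
        hs = find_homography_dlt(pts1[idx], pts2[idx])              # (b, 3, 3)
        # Score every hypothesis on all correspondences at once.
        proj = transform_points(hs, pts1.unsqueeze(0).expand(b, -1, -1))
        err = (proj - pts2.unsqueeze(0).expand(b, -1, -1)).norm(dim=-1)
        inliers = (err < thresh).sum(dim=-1)
        i = int(inliers.argmax())
        if int(inliers[i]) > best_inliers:
            best_inliers, best_h = int(inliers[i]), hs[i]
    return best_h, best_inliers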

Pretrained checkpoints?

Hello guys! Awesome work - I really liked reading the paper. It's very interesting that you've managed to get non-VGG backbones to work as a perceptual loss.

Unfortunately, I currently don't have compute resources to train the model from scratch, so would you mind sharing some pretrained checkpoints?

Thanks!

MACE values

Hello.

  1. What are the expected MACE values for models trained with the following config files?
  • config/pds-coco/zeng-bihome-lr-1e-3.yaml
  • config/pds-coco/zhang-bihome-lr-1e-2.yaml
  • config/s-coco/nguyen-orig-lr-5e-3.yaml

I am doing everything as written in the README.md, and after training the models I run the following commands:

python3 eval.py --config_file config/pds-coco/zeng-bihome-lr-1e-3.yaml --ckpt log/zeng-bihome-pdscoco-lr-1e-3/model_090000.pth
python3 eval.py --config_file config/pds-coco/zhang-bihome-lr-1e-2.yaml --ckpt log/zhang-bihome-pdscoco-lr-1e-2/model_090000.pth
python3 eval.py --config_file config/s-coco/nguyen-orig-lr-1e-3.yaml --ckpt log/nguyen-orig-scoco-lr-1e-3/model_090000.pth

I get the following MACE values:

  • MACE: 2.9905893802642822 for zeng+biHomE
  • MACE: 2.0459039211273193 for zhang+biHomE
  • MACE: 1.5663892030715942 for nguyen

Based on what you have written in the paper, the MACE should be the smallest for Zeng.

  2. Also, are the MACE values for Nguyen and Zhang in Figure 1 of your paper obtained from models trained with the following config files?
  • config/s-coco/zhang-bihome-lr-1e-2.yaml
  • config/s-coco/zhang-bihome-lr-1e-2.yaml
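
For reference, MACE (mean average corner error) is usually computed as the mean L2 distance between the ground-truth and the predicted displacements of the four patch corners; a minimal sketch (not the repo's evaluation code):

import torch

def mace(delta_gt, delta_pred):
    # delta_gt, delta_pred: (B, 4, 2) ground-truth / predicted corner offsets in pixels.
    return (delta_gt - delta_pred).norm(dim=-1).mean()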

Can you provide the trained model?

Hi,
Thanks for providing the source code. I want to test your model with my own data. I would be very grateful if you could provide a trained model.

Weak performance on a real dataset?

I trained on the pseudo homography dataset (augmented with photometric distortion), but when I evaluate on a real dataset the performance is weak (the smaller the better).
[screenshot: evaluation metrics on the real dataset]

Question about the triplet loss you use

Hi, I have a question about the triplet loss in your paper.
Here you say that the margin you use is infinite, and you write the formulation as follows:
[screenshot: the loss formulation from the paper]
I wonder whether the value of this loss can be negative, i.e. ap_{i,j} < an_{i,j} (the distance from the anchor to the positive sample is smaller than that to the negative sample, which seems easy to reach). What does it mean when the loss is less than zero, and how can it be optimized?
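
For readers without the screenshot, my transcription of the marginless (infinite-margin) per-pixel term being discussed, in the question's own notation, is

L = \sum_{i,j} \left( ap_{i,j} - an_{i,j} \right),

where ap_{i,j} and an_{i,j} are the anchor-positive and anchor-negative feature distances at pixel (i, j). Because there is no hinge term \max(0, \cdot + m), the sum is unbounded below: a negative value simply means the anchor is already closer to the positive than to the negative, and minimizing the loss still pushes ap_{i,j} down and an_{i,j} up.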

Pre-train Model

Hi,
Are trained model weights available for download?
Thanks in advance
