feixue94 / sfd2

[CVPR 2023] SFD2: Semantic-guided Feature Detection and Description. Embedding semantics into local features implicitly for long-term visual localization

Home Page: https://feixue94.github.io/

Python 99.25% Shell 0.75%
end-to-end local-features semantics long-term-localization semantic-aware-feature semantic-aware-localization

sfd2's Introduction

SFD2: Semantic-guided Feature Detection and Description

In this work, we propose to leverage global instances, which are robust to illumination and season changes for both coarse and fine localization. For coarse localization, instead of performing global reference search directly, we search for reference images from recognized global instances progressively. The recognized instances are further utilized for instance-wise feature detection and matching to enhance the localization accuracy.

Dependencies

  • Python >= 3.6
  • PyTorch >= 1.8
  • OpenCV >= 3.4
  • NumPy >= 1.18
  • segmentation-models-pytorch == 0.1.3
  • COLMAP
  • pycolmap == 0.0.1
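The version floors above can be sanity-checked with a small helper; this is a minimal sketch that compares dotted version strings numerically (no pre-release handling), not part of the repository:

```python
def meets_min(installed: str, minimum: str) -> bool:
    """Numerically compare dotted version strings, e.g. '1.8.1' >= '1.8'.
    Pre-release suffixes (e.g. '1.8.0rc1') are not handled."""
    parse = lambda v: tuple(int(p) for p in v.split("."))
    inst, mini = parse(installed), parse(minimum)
    # Pad the shorter tuple with zeros so '1.8' compares like '1.8.0'.
    n = max(len(inst), len(mini))
    pad = lambda t: t + (0,) * (n - len(t))
    return pad(inst) >= pad(mini)

print(meets_min("1.8.1", "1.8"))   # True
print(meets_min("1.7.0", "1.8"))   # False
```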

Data preparation

  • Training data. We use the same training dataset as R2D2; please download it following the instructions provided in the R2D2 repository.
  • Segmentation model. ConvNeXt is used to provide semantic labels and semantic-aware features for stability learning during training.
  • Local feature model. SuperPoint is used to provide local reliability during training.
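The idea behind using a segmentation model for stability learning can be sketched as weighting pixels by how stable their semantic class is over time. The class IDs and weights below are purely illustrative assumptions, not the mapping used in the paper:

```python
import numpy as np

# Hypothetical label -> stability mapping (IDs and weights are illustrative):
# long-term stable structures get high weight, transient objects low weight.
STABILITY = {
    0: 1.0,   # building
    1: 0.8,   # road
    2: 0.2,   # vegetation (changes with season)
    3: 0.0,   # sky
    4: 0.0,   # car / pedestrian (dynamic)
}

def stability_map(labels: np.ndarray) -> np.ndarray:
    """Map an (H, W) integer label image to per-pixel stability weights."""
    weights = np.zeros(labels.shape, dtype=np.float32)
    for cls, w in STABILITY.items():
        weights[labels == cls] = w
    return weights

labels = np.array([[0, 3], [4, 1]])
print(stability_map(labels))  # [[1.0, 0.0], [0.0, 0.8]]
```

Such a map could then down-weight detection or description losses on unstable regions.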

Pretrained weights

The pretrained weights of SFD2 are in the weights directory. If you want to retrain the model, please also download the weights of ConvNeXt and SuperPoint from here and put them into the weights directory.

Localization results

Please download the datasets, e.g. Aachen_v1.1, RobotCar-Seasons v2, and Extended CMU-Seasons, from the visual localization benchmark for evaluation.

  • localization on Aachen_v1.1
./test_aachenv_1_1

you will get results like this:

| Day | Night |
| :---: | :---: |
| 88.2 / 96.0 / 98.7 | 78.0 / 92.1 / 99.5 |
  • localization on RobotCar-Seasons
./test_robotcar

you will get results like this:

| day | night | night-rain |
| :---: | :---: | :---: |
| 56.9 / 81.6 / 97.4 | 27.6 / 66.2 / 90.2 | 43.0 / 71.1 / 90.0 |
  • localization on Extended CMU-Seasons
./test_ecmu

you will get results like this:

| urban | suburban | park |
| :---: | :---: | :---: |
| 95.0 / 97.5 / 98.6 | 90.5 / 92.7 / 95.3 | 86.4 / 89.1 / 91.2 |
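The three numbers in each cell are recall percentages, i.e. the fraction of queries whose estimated pose falls within progressively looser error thresholds (the benchmark conventionally uses thresholds like (0.25m, 2°) / (0.5m, 5°) / (5m, 10°)). A minimal sketch of computing such recall from per-query pose errors:

```python
def recall_at_thresholds(errors, thresholds):
    """errors: list of (translation_error_m, rotation_error_deg) per query.
    thresholds: list of (max_t, max_r) pairs.
    Returns the percentage of queries within each threshold."""
    out = []
    for max_t, max_r in thresholds:
        ok = sum(1 for t, r in errors if t <= max_t and r <= max_r)
        out.append(100.0 * ok / len(errors))
    return out

errors = [(0.1, 1.0), (0.4, 3.0), (2.0, 8.0), (10.0, 20.0)]
print(recall_at_thresholds(errors, [(0.25, 2), (0.5, 5), (5, 10)]))
# [25.0, 50.0, 75.0]
```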

Training

./train.sh

BibTeX Citation

If you use any ideas from the paper or code from this repo, please consider citing:

@inproceedings{xue2023sfd2,
  author    = {Fei Xue and Ignas Budvytis and Roberto Cipolla},
  title     = {SFD2: Semantic-guided Feature Detection and Description},
  booktitle = {CVPR},
  year      = {2023}
}

Acknowledgements

Part of the code is from previous excellent works, including SuperPoint, R2D2, HLoc, ConvNeXt, and LBR. You can find more details in their released repositories if you are interested in their work.


sfd2's Issues

Missing file in the repository

Hi there,
thanks for open sourcing this work!
thanks for open-sourcing this work!
I am trying to run your default training script, and I realised the file mentioned here is missing, which prevents me from running the training script. Ignoring the checkpoint or changing the path to 'weights/convxts-base_ade20k.pth' does not work.
Also, the pretrained weights from the ConvNeXt repo do not seem to include that configuration. Could you please share this file?

Demo

Could you provide a demo script to run your great work end to end on arbitrary images?
Thanks.

config_train_sfd2.json

Thanks for sharing. I would like to train on my own data, but the purpose of many parameters in config_train_sfd2.json is unclear. Could you add some comments for the parameters? Thanks!

test file

In the test_aachenv_1_1 file, 'matcher = NNM' is an invalid choice of conf, and I would like to know why. Should I choose from 'superglue', 'superglue-fast', 'NN-superpoint', 'NN-ratio', 'NN-mutual', or 'adalam'? Or is there an error I am not aware of?
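For context, the 'NN' entries above are nearest-neighbor matchers, and 'NNM' presumably denotes mutual nearest-neighbor matching (cf. 'NN-mutual'). A minimal numpy sketch of what a mutual-NN matcher does (this is not the repository's implementation):

```python
import numpy as np

def mutual_nn_match(desc_a: np.ndarray, desc_b: np.ndarray):
    """Return index pairs (i, j) where descriptor i in A and descriptor j in B
    are each other's nearest neighbor under Euclidean distance."""
    # Pairwise squared distances between the two descriptor sets.
    d2 = ((desc_a[:, None, :] - desc_b[None, :, :]) ** 2).sum(-1)
    nn_ab = d2.argmin(axis=1)  # best match in B for each descriptor in A
    nn_ba = d2.argmin(axis=0)  # best match in A for each descriptor in B
    return [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]

a = np.array([[0.0, 0.0], [1.0, 1.0]])
b = np.array([[1.1, 0.9], [0.1, 0.0]])
print(mutual_nn_match(a, b))  # [(0, 1), (1, 0)]
```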

Issues faced during running the training and test scripts

Hello again,
I would like to share some issues that I faced while running the training script. Note that I have prepared the datasets as explained in the R2D2 repository.

  1. Mismatch of dimensions during det_loss calculation:
    While executing this line I get
RuntimeError: The size of tensor a (64) must match the size of tensor b (65) at non-singleton dimension 1

I solved this by replacing line 357 as follows:

        elif self.detloss in ['ce']:
            # det_loss = self.det_loss(pred_score=output["semi"], gt_score=output["gt_semi"], weight=output["weight"],
            #                          stability_map=None)
            det_loss = self.det_loss(pred_score=output["semi"], gt_score=output["gt_semi_norm"], weight=output["weight"],
                                     stability_map=None)

I think this error is caused by parsing the wrong values. In the inputs, we have

output["gt_semi"].shape = (4,64,64,64) (==gt_score)
output["semi"].shape = (4,65,64,64) (==pred_score)

In the output dict we also had

output["gt_semi_norm"] with shape (4,65,64,64)

So I replaced gt_semi with gt_semi_norm, which has matching dimensions. I am not sure whether this is a valid solution.

  2. Learning rate decay parameters are not specified. In trainer.py, line 166, the interpreter complains that self.args.decay_rate and self.args.decay_iter cannot be found. Indeed, they are specified neither in the argparser nor in the config file. The workaround for now is to disable learning rate decay by replacing line 166 with
#lr = min(self.args.lr * self.args.decay_rate ** (self.iteration - self.args.decay_iter), self.args.lr)
lr = self.args.lr

I think this change will prevent us from replicating the results in the paper.
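Regarding the 64-vs-65 channel mismatch in the first point above: SuperPoint-style detection heads predict scores over 8×8 pixel cells plus one extra "dustbin" (no-keypoint) channel, giving 65 channels. A numpy sketch of building such a target from a binary keypoint map, assuming 8×8 cells with the dustbin channel last (an illustration, not the repository's code):

```python
import numpy as np

def to_cells_with_dustbin(kp_map: np.ndarray, cell: int = 8) -> np.ndarray:
    """Convert an (H, W) binary keypoint map into a (cell*cell + 1, H//cell, W//cell)
    target: cell*cell channels for in-cell positions plus a dustbin channel
    that is 1 where the cell contains no keypoint."""
    H, W = kp_map.shape
    hc, wc = H // cell, W // cell
    # space-to-depth: (H, W) -> (hc, wc, cell*cell) -> (cell*cell, hc, wc)
    cells = kp_map.reshape(hc, cell, wc, cell).transpose(0, 2, 1, 3)
    cells = cells.reshape(hc, wc, cell * cell).transpose(2, 0, 1)
    dustbin = (cells.sum(axis=0) == 0).astype(cells.dtype)[None]
    return np.concatenate([cells, dustbin], axis=0)  # (65, hc, wc)

kp = np.zeros((16, 16), dtype=np.float32)
kp[3, 5] = 1.0  # one keypoint in the top-left cell, in-cell offset (3, 5)
target = to_cells_with_dustbin(kp)
print(target.shape)  # (65, 2, 2)
```

A ground-truth map with only 64 channels (no dustbin) would produce exactly the shape mismatch reported above.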

Also, while running the test script test_aachenv_1_1 there are some matters I would like to mention:

  1. I am not sure whether to use the Aachen dataset that we prepared during training, or the Aachen v1.1 dataset that can be found online (for example, I downloaded it from here, as mentioned in the readme file). Since the datasets might differ, are there any specific preprocessing steps I should follow to reproduce your results?
  2. In line 31 of the test script, the file pairs-db-covis20.txt is missing. I found it here, but since I found this file and the Aachen v1.1 database from different sources, is there some other source I should download the Aachen v1.1 dataset from, ideally one that already includes this file?
  3. We need to specify an outputs folder as shown here. Does that mean I should first run some other script to do inference and collect the results under the outputs folder I specified?
  4. Missing file aachen_db_imglist.txt here. A Google search for this file was not successful.
  5. Missing file day_night_time_queries_with_intrinsics.txt here. A Google search for this file was not successful. The Aachen v1.1 dataset I mentioned above only has night_time_queries_with_intrinsics.txt.

Thank you very much and best regards,

Bad result in Aachen v1.1

Hello, thanks for your impressive work.
When I try to test on Aachen v1.1, I get a bad result.
(screenshot of obtained results)
It is far from the result reported in the paper.
(screenshot of paper results)

I use the script with small modifications to adapt it to my system.
```bash
#!/bin/bash
dataset=./dataset
image_dir=$dataset/images/images_upright
outputs=./output
query_pair=pairs/aachen_v1.1/pairs-query-netvlad50.txt
gt_pose_fn=assets/Aachen-v1.1_hloc_superpoint_n4096_r1600+superglue_netvlad50.txt
save_root=./output

feat=ressegnetv2-20220810-wapv2-sd2mfsf-uspg-0001-n4096-r1600
#feat=ressegnetv2-20220810-wapv2-sd2mfsf-uspg-0001-n3000-r1600
#feat=ressegnetv2-20220810-wapv2-sd2mfsf-uspg-0001-n2000-r1600
#feat=ressegnetv2-20220810-wapv2-sd2mfsf-uspg-0001-n1000-r1600

matcher=NNM

extract_feat_db=1
match_db=1
triangulation=1
localize=1

if [ "$extract_feat_db" -gt "0" ]; then
  echo "-----------------extract_feat_db-----------------"
  python3 -m extract_localization --image_dir $dataset/images/images_upright --export_dir $outputs/ --conf $feat
fi

if [ "$match_db" -gt "0" ]; then
  echo "-----------------match_db-----------------"
  python3 -m hloc.match_features --pairs pairs/aachen_v1.1/pairs-db-covis20.txt --export_dir $outputs/ --conf $matcher --features feats-$feat
fi

if [ "$triangulation" -gt "0" ]; then
  echo "-----------------triangulation-----------------"
  python3 -m hloc.triangulation \
    --sfm_dir $outputs/sfm_$feat-$matcher \
    --reference_sfm_model $dataset/3D-models/aachen_v_1_1 \
    --image_dir $dataset/images/images_upright \
    --pairs pairs/aachen_v1.1/pairs-db-covis20.txt \
    --features $outputs/feats-$feat.h5 \
    --matches $outputs/feats-$feat-$matcher-pairs-db-covis20.h5
fi

ransac_thresh=15
opt_thresh=15
covisibility_frame=50
#init_type="clu"
init_type="sng"
opt_type="clurefobs"
inlier_thresh=10
iters=5
radius=30
obs_thresh=3

if [ "$localize" -gt "0" ]; then
  echo "-----------------localize-----------------"
  python3 -m it_loc.localizer \
    --dataset aachen_v1.1 \
    --image_dir $image_dir \
    --save_root $save_root \
    --gt_pose_fn $gt_pose_fn \
    --db_imglist_fn datasets/aachen/aachen_db_imglist.txt \
    --retrieval $query_pair \
    --reference_sfm $outputs/sfm_$feat-$matcher \
    --queries $dataset/queries/day_night_time_queries_with_intrinsics.txt \
    --features $outputs/feats-$feat.h5 \
    --matcher_method $matcher \
    --ransac_thresh $ransac_thresh \
    --with_match \
    --covisibility_frame $covisibility_frame \
    --iters $iters \
    --radius $radius \
    --obs_thresh $obs_thresh \
    --opt_thresh $opt_thresh \
    --init_type $init_type \
    --inlier_thresh $inlier_thresh \
    --opt_type $opt_type \
    --do_covisible_opt
fi
```
Did I make a mistake somewhere?
