tsattler / geometric_burstiness
License: BSD 3-Clause "New" or "Revised" License
@tsattler Is there a ready-to-use visual vocabulary available, or must I train one myself?
Hi @tsattler, I ran into the same problem mentioned in #2. The database images were captured uniformly along the camera's trajectory inside a room, so adjacent images look the same. Does that matter? I trained the vocabulary on the ANN_SIFT1M dataset (http://corpus-texmex.irisa.fr/), and the features of both the database and query images were extracted with hesaff. Is anything wrong?
Here is the log:

```text
Loading the inverted index
Index loaded
Weights loaded
Found 2 query files
Found 836 db images in image_db.txt
Query 0 : ./q/000230.jpg.bin
Prepared 1287 query descriptors
Computing the self-similarity for the query image from 1287 query descriptors
Self-similarity: 26333.7
Normalization weight 0.00616232
Query found 836 potentially relevant database images
Score of most relevant image 0.0912435 (image 726) with 4322 matches
Starting spatial verification
```
In the function DetermineDBImageSelfSimilarities() in inverted_file.h, the self-similarity of each database image is computed for use as the normalization factor in the final similarity score between the query and the database images.
However, the self-similarity seems to be computed simply by accumulating the squared idf weight of each visual word. The relevant code is:
```cpp
for (int i = 0; i < num_entries; ++i) {
  current_image_id = entries_[i].image_id;
  score_ref[current_image_id] += idf_squared;
}
```
However, this ignores repeated occurrences of the same visual word within a database image, which violates the original definition of self-similarity: the similarity score of an image with itself should be 1, as stated in the ACCV 2014 DisLoc paper.
By contrast, we modified the computation of the self-similarity as follows:
```cpp
int num_score = score_ref.size();
std::vector<double> num_vw(num_score, 0.0);
for (int i = 0; i < num_entries; ++i) {
  current_image_id = entries_[i].image_id;
  num_vw[current_image_id] += 1;
}
for (int i = 0; i < num_score; ++i) {
  score_ref[i] += num_vw[i] * num_vw[i] * idf_squared;
}
```
I have found that this modification improves recall in my experiments on the Pittsburgh 250k dataset: recall@1 improves from 0.508 to 0.527 without the spatial verification step.
I'm not sure whether this is a bug or whether the computation in the public code is the intended definition of self-similarity.
1. Could you tell me the detailed recall (e.g., recall@1) after the initial retrieval, without spatial verification, in your implementation?
I have evaluated two different feature extraction approaches: hesaff (https://github.com/perdoch/hesaff) and the VGG binaries with a low cornerness threshold (http://www.robots.ox.ac.uk/~vgg/research/affine/detectors.html#binaries).
The latter with threshold 100 extracts about 260M local features, and the former extracts about 218M local features.
However, with either feature extraction step, the recall falls short of the results reported in the ACCV 2014 paper and in your paper on both place recognition datasets.
2. Could you tell me how you convert the JPEG images to PPM images? I use the jpegtopnm command on Linux.
3. If you have time, could you share the feature extraction output for the first image of the Pittsburgh dataset, i.e., the imgname.hesaff file? I want to verify that I have extracted enough features.
Thanks!
Thank you for sharing your source code; it is really helpful to the CBIR research community.
The disclaimer mentions that this source code may differ from the code used for the paper. However, I observe a huge difference.
The reported performance on Oxf105k and Par106k is:
| | retrieval_rank | inliers | eff. inliers | inter-image | inter-place | inter-place+pop |
|---|---|---|---|---|---|---|
| Oxf105k (mAP) | - | 0.710 | 0.730 | 0.708 | 0.735 | 0.745 |
| Par106k (mAP) | - | 0.613 | 0.619 | 0.611 | 0.649 | 0.682 |
I tried to reproduce the results on Oxford 5k with a 200k vocabulary trained on Paris 6k.
The results I got seem wrong for two reasons:
| | retrieval_rank | inliers | eff. inliers | inter-image | inter-place | inter-place+pop |
|---|---|---|---|---|---|---|
| Oxf5k (mAP) | 0.700 | 0.679 | 0.682 | 0.674 | x | x |
Maybe I made some obvious mistake. I assume you have run many experiments; if you have any clues, please give me a hint.
@tsattler Hi tsattler. When running compute_word_assignments, the input requires a visual vocabulary. So to compute word assignments, I need to write all the 128-D RootSIFT descriptors extracted from the images to a text file (i.e., the vocabulary). Did I understand that correctly?
@tsattler Hey tsattler, I wonder how long a query should take. I collected 340 images as the database, used a 20k vocabulary trained on INRIA Holidays, and changed the num_words parameter to 20k in computing_hamming_thresholds.cc.
Since I set the number of nearest words to 1 in every step (as asked in the README), I also changed kNumNNWords to 1 in query.cc. However, with these settings the run takes very long: the result file is updated only every few minutes. Is there anything wrong with my settings?
Dear Dr. Sattler:
I'm trying to evaluate the code on the Pittsburgh dataset.
I selected 20k images from the dataset and extracted Hessian-affine SIFT features (using the code from CUVT, Perdoch, sift_type=2) to train a 200k codebook, which gives about 17M features.
However, I only achieve recall@1 = 0.47, which is lower than the DisLoc result (about 0.55).
Could you make the codebook public?
If I have the detect_points.ln and compute_descriptors.ln binaries for extracting local features, such as the software provided at http://www.robots.ox.ac.uk/~vgg/research/affine/detectors.html#binaries, how should I extract the local features?
Could you give us the demo shell script you used to extract the local features of an image in your experiments?
I want to verify that I am extracting the local features correctly.
Thanks in advance!
The purpose of lines 255-234 of the function ReRankingInterPlaceGeometricBurstiness in ranking_schemes.h confuses me.
When a new place is found, the place information of the other images is updated based on the previous minimum distance and the new distance between each image and the place.
However, the distance is still computed as the distance from the first image.
Is this right?