
geometric_burstiness's People

Contributors

tsattler


geometric_burstiness's Issues

Query Time

Hi @tsattler, I met the same problem as mentioned in #2. The database images are captured uniformly along the camera's trajectory inside a room, so adjacent images look very similar. Does that matter? I trained the vocabulary on the ANN_SIFT1M dataset (http://corpus-texmex.irisa.fr/), and the features for both the database and the query images are extracted with hesaff. Is anything wrong?

Here is the log:

 Loading the inverted index 
 Index loaded 
 Weights loaded 
 Found 2 query files 
 Found 836 db images in image_db.txt

 Query 0 : ./q/000230.jpg.bin

  Prepared 1287 query descriptors 
  Computing the self-similarity for the query image from 1287 query descriptors 
  Self-similarity: 26333.7
  Normalization weight 0.00616232
  Query found 836 potentially relevant database images
  Score of most relevant image 0.0912435 (image 726) with 4322 matches
  Starting spatial verification
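
As a side note, the normalization weight in the log looks like the inverse square root of the self-similarity; this is my own guess from the numbers, not something the code confirms:

    % Assumption: normalization weight w = 1 / sqrt(self-similarity s).
    \[
      w = \frac{1}{\sqrt{s}} = \frac{1}{\sqrt{26333.7}} \approx 0.00616232
    \]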

The computation of the self-similarity of database images

In the function DetermineDBImageSelfSimilarities() in inverted_file.h, the self-similarity of each database image is computed and used as the normalization factor for the final similarity score between the query and the database images.

However, the computation of the self-similarity seems to simply accumulate the squared idf weight of each visual word. The relevant code is given below.

    // For every inverted file entry of the current visual word, add the
    // word's squared idf weight to the image's self-similarity score.
    for (int i = 0; i < num_entries; ++i) {
      current_image_id = entries_[i].image_id;
      score_ref[current_image_id] += idf_squared;
    }

However, this ignores repeated occurrences of the same visual word in a database image, which violates the original definition of self-similarity: the similarity score of an image with itself should be 1, as illustrated in the ACCV 2014 DisLoc paper.
In contrast, we modify the computation of the self-similarity as follows.

    // Count the number of occurrences of the current visual word in each
    // database image.
    int num_score = score_ref.size();
    std::vector<double> num_vw(num_score, 0.0);
    for (int i = 0; i < num_entries; ++i) {
      current_image_id = entries_[i].image_id;
      num_vw[current_image_id] += 1;
    }
    // Square the per-word contribution: add (n_w * idf_w)^2 instead of
    // n_w * idf_w^2.
    for (int i = 0; i < num_score; ++i) {
      score_ref[i] += num_vw[i] * num_vw[i] * idf_squared;
    }
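
To make the difference explicit (my own notation, not taken from the code): let n_w be the number of features of a database image assigned to visual word w, and idf_w the word's idf weight. The original loop computes the first sum below, the modification the second; the two agree exactly when no word occurs more than once in an image:

    % n_w: occurrences of visual word w in the image; idf_w: its idf weight.
    \[
      s_{\text{orig}} = \sum_{w} n_w \,\mathrm{idf}_w^2
      \qquad\text{vs.}\qquad
      s_{\text{mod}} = \sum_{w} \bigl(n_w \,\mathrm{idf}_w\bigr)^2
                     = \sum_{w} n_w^2 \,\mathrm{idf}_w^2
    \]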

I have found that this modification improves recall in my experiments on the Pittsburgh 250k dataset: recall@1 improves from 0.508 to 0.527 without the spatial verification step.
I'm not sure whether this is a bug or whether the original computation in the public code is the intended definition of self-similarity.


1. Could you tell me the detailed recall, such as recall@1, after the initial retrieval without spatial verification in your implementation?
I have evaluated two different feature extraction approaches: hesaff (https://github.com/perdoch/hesaff) and the VGG binaries with a low cornerness threshold (http://www.robots.ox.ac.uk/~vgg/research/affine/detectors.html#binaries).
The latter with threshold 100 extracts about 260M local features, the former about 218M local features.
However, with both feature extractors it is hard to reach the recall reported in the ACCV 2014 paper and in your paper on either place recognition dataset.

2. Could you tell me how you convert the jpg images to ppm images? I use the jpegtopnm command on Linux (e.g., jpegtopnm image.jpg > image.ppm).

3. If you have time, could you share the feature extraction result for the first image of the Pittsburgh dataset, i.e., the imgname.hesaff file? I want to verify that I have extracted enough features.

Thanks!

Reproducing benchmark results on Oxford 5k dataset

Thank you for sharing your source code. It is really helpful for the CBIR research community.

In the disclaimer you mention that this source code may differ from the code used for the paper. However, I observe a huge difference.

The reported performance on Oxf105k and Par106k is as follows:

                retrieval_rank  inliers  eff.inliers  inter-image  inter-place  inter-place+pop
oxf105k (mAP)   -               0.710    0.730        0.708        0.735        0.745
Par106k (mAP)   -               0.613    0.619        0.611        0.649        0.682

I tried to reproduce the results on Oxford 5k with a 200k vocabulary trained on Paris 6k.

The results I got are weird for two reasons:

  • I thought the model should give better results on Oxford 5k without the 100k distractor images, but the results are worse than on oxf105k.
  • Using a supposedly better measure makes the results worse. Do you see the same, i.e., does eff.inliers give worse results than just using retrieval_rank?

                retrieval_rank  inliers  eff.inliers  inter-image  inter-place  inter-place+pop
oxf5k (mAP)     0.700           0.679    0.682        0.674        x            x

Maybe I made some obvious mistake. I assume you have run a lot of experiments; if you have any clues, please give me a hint.

Build a visual vocabulary

@tsattler Hi tsattler. When running compute_word_assignments, the input requires a visual vocabulary. So, to compute word assignments, do I need to write all the 128-D RootSIFT descriptors extracted from the images to a text file (i.e., the vocabulary)? Did I understand that right?
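
For reference, a visual vocabulary is usually the set of k-means cluster centers computed over the training descriptors, not the raw descriptors themselves. Below is a minimal sketch of building one, assuming OpenCV's cv::kmeans and assuming the vocabulary file format is one 128-D center per line; both are assumptions on my side, not confirmed by this repository. Note that for 200k words an approximate k-means would be needed in practice, since exact k-means does not scale that far.

    // Hedged sketch: cluster RootSIFT descriptors into visual words and write
    // the cluster centers to a text file. Assumes OpenCV; the output format
    // (one 128-D center per line) is an assumption, not confirmed by the repo.
    #include <fstream>
    #include <string>
    #include <opencv2/core.hpp>

    void BuildVocabulary(const cv::Mat& descriptors,  // N x 128, CV_32F
                         int num_words,               // e.g., 200000
                         const std::string& out_file) {
      cv::Mat labels, centers;
      cv::kmeans(descriptors, num_words, labels,
                 cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT,
                                  20, 1e-4),
                 /*attempts=*/1, cv::KMEANS_PP_CENTERS, centers);
      // Write each center as one whitespace-separated line.
      std::ofstream out(out_file);
      for (int i = 0; i < centers.rows; ++i) {
        for (int j = 0; j < centers.cols; ++j) {
          out << centers.at<float>(i, j) << (j + 1 < centers.cols ? ' ' : '\n');
        }
      }
    }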

About the query time

@tsattler Hey tsattler, I just wonder how long a query should take? I collected 340 images as the database and used a 20k vocabulary trained on INRIA Holidays, and I modified the parameter num_words to 20k in computing_hamming_thresholds.cc.
Since I set the number of nearest words to 1 (as asked in the README) in every step, I also modified kNumNNWords to 1 in query.cc. However, when I ran with these settings it took very long: the result file was only updated every few minutes. Is there anything wrong with my settings?

How to train the codebook?

Dear Dr. Sattler:

I'm trying to evaluate the code on the Pittsburgh dataset.
I selected 20k images from the dataset and extracted Hessian-affine SIFT features (using the code from CVUT by Perdoch, sift_type=2) to train the 200k codebook, which yields about 17M features.

In the end, I only achieve recall@1 = 0.47, which is lower than the result reported for DisLoc (about 0.55).

Could you make the codebook public?

The definition of place

For the function ReRankingInterPlaceGeometricBurstiness in ranking_schemes.h, the purpose of lines 234-255 confuses me.
When a new place is found, the place information of the other images is updated based on the previous minimum distance and the new distance between each image and the place.
However, that distance is still computed as the distance from the first image.
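
To make the question concrete, here is a hedged sketch of the update step as I read it; all names are invented for illustration, the actual code is in ReRankingInterPlaceGeometricBurstiness in ranking_schemes.h:

    // Hedged illustration with invented names: when an image founds a new
    // place, every other candidate keeps the minimum of its previous place
    // distance and its distance to the new place -- but that distance is
    // measured from the image that founded the place ("the first image").
    #include <algorithm>
    #include <vector>
    #include <Eigen/Dense>

    struct Candidate {
      Eigen::Vector2d position;      // position of the database image
      double min_dist_to_place;      // distance to the closest place so far
    };

    void UpdateAfterNewPlace(const Candidate& place_founder,
                             std::vector<Candidate>* candidates) {
      for (Candidate& c : *candidates) {
        const double dist = (c.position - place_founder.position).norm();
        c.min_dist_to_place = std::min(c.min_dist_to_place, dist);
      }
    }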

Is this right?
