
Comments (4)

TheProjectsGuy avatar TheProjectsGuy commented on May 27, 2024

Hey @euncheolChoi
Thanks for taking an interest in our work. I'll first review the vocabulary terminology, then answer your two questions.

Vocabulary Terminology

A vocabulary is a "superset" of images that we use to build the database descriptors (for the VLAD aggregation technique). By superset, I mean a collection of datasets. We experiment with the following vocabulary types:

  • Global: where we include the database images from all the datasets (in Table III and IV).
  • Structured: where we include the database images from only the structured datasets (in Table III)
  • Unstructured: where we include the database images from only the unstructured datasets (in Table IV)
  • Map-Specific: where we include the database images from only a single dataset (the one we're testing on). For example, if we're testing on Oxford, we "train" VLAD with only images from the Oxford database set (no other dataset). This can also be read as "dataset-specific".
  • Domain-Specific: A "domain" is a collection of datasets with similar properties. We find this through tSNE and PCA projections (Figure 1, for example). We color-coded these domains in the paper. For example, the "Urban" domain comprises the Pitts-30k, St. Lucia, and Oxford datasets; the "Aerial" domain contains the Nardo Air datasets and the VP-Air dataset. In the paper, we project the GeM-pooled descriptors.

Unless specified otherwise, the results in Tables III and IV use the domain-specific vocabularies. We found these to give the best performance (see Table V).

This is further described in Sections III.D, V.B, and A2 of our paper on arXiv.
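To make the vocabulary's role concrete, here is a minimal, self-contained sketch of VLAD aggregation (hard assignment with per-cluster intra-normalization, which is standard for VLAD). This is an illustration only, not the repo's implementation; the random cluster centers stand in for a domain vocabulary:

```python
import torch
import torch.nn.functional as F

def vlad_aggregate(local_descs, centers):
    """Aggregate local descriptors into a VLAD vector.

    local_descs: (N, D) local features from one image
    centers:     (K, D) vocabulary cluster centers
    Returns a flattened, L2-normalized (K*D,) descriptor.
    """
    # Hard-assign each local descriptor to its nearest center
    dists = torch.cdist(local_descs, centers)   # (N, K)
    assign = dists.argmin(dim=1)                # (N,)
    K, D = centers.shape
    vlad = torch.zeros(K, D)
    for k in range(K):
        members = local_descs[assign == k]
        if len(members) > 0:
            # Sum of residuals to the center, intra-normalized per cluster
            vlad[k] = F.normalize((members - centers[k]).sum(dim=0), dim=0)
    return F.normalize(vlad.flatten(), dim=0)

# Example: 256 local descriptors of dim 1536 against 32 clusters
descs = torch.randn(256, 1536)
centers = torch.randn(32, 1536)
v = vlad_aggregate(descs, centers)
print(v.shape)  # torch.Size([49152])
```

Note how the 49152-dim descriptor size in the torch.hub examples below follows directly from 32 clusters times the 1536-dim DINOv2 ViT-G/14 features.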

Getting Vocabularies

How can I get the vocabularies made by the AnyLoc method?

You could do it in either of two ways; I suggest the first.

Method 1: Use the torch.hub model

This is still in beta but should suit your needs for benchmarking. See issue #11 for more details. A simple tutorial follows:

# Load model
import torch
model = torch.hub.load("AnyLoc/DINO", "get_vlad_model", 
        backbone="DINOv2", device="cuda")
# Images
img = torch.rand(1, 3, 224, 224)
# Result: VLAD descriptors of shape [1, 49152]
res = model(img)
# Also supports batching
img = torch.rand(16, 3, 224, 224)
# Result: VLAD descriptors of shape [16, 49152]
res = model(img)
# More help
print(torch.hub.list("AnyLoc/DINO"))
r = torch.hub.help("AnyLoc/DINO", "get_vlad_model")
print(r)

The default is the indoor domain (since we show major improvement there, and it's from the more widely available structured set). However, you can use the aerial domain by loading

model = torch.hub.load("AnyLoc/DINO", "get_vlad_model", 
        backbone="DINOv2", domain="aerial", device="cuda")

The above will load the model we used for the aerial dataset columns (Nardo Air and VP-Air) in Table IV.
This method works for our paper's indoor, urban, and aerial domains.
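Once you have global descriptors, place recognition reduces to nearest-neighbor search over the database. A minimal sketch with random placeholder descriptors (in practice, db and qu would come from model(img) as above):

```python
import torch
import torch.nn.functional as F

# Placeholder database and query descriptors; in practice these come
# from model(img) and are already L2-normalized global descriptors
db = F.normalize(torch.randn(100, 49152), dim=1)
qu = F.normalize(torch.randn(5, 49152), dim=1)

# Cosine similarity: with L2-normalized vectors a dot product suffices
sims = qu @ db.T                          # (5, 100)
top1 = sims.argmax(dim=1)                 # best database match per query
topk = sims.topk(k=5, dim=1).indices      # top-5 candidates per query
print(top1.shape, topk.shape)
```

For large databases you would typically replace the dense matrix product with an approximate nearest-neighbor index, but the retrieval logic is the same.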

Method 2: Using the repo's codebase

I suggest this only if you're going to replicate the paper's results rather than benchmark your own data, because the repo setup is more tedious and requires setting up a container.

  1. Set up the repository and the datasets as described here

  2. Create the cluster centers using dino_v2_global_vocab_vlad.py. The call to the script is documented in dino_v2_global_vocab_vlad_ablations.sh; it creates a GlobalVLADVocabularyDataset object and loads the datasets accordingly. Note that you must run the scripts from the repo's home folder (instead of cd-ing into the ./scripts folder) so that the script can find the repo's other files and utilities. The following set of arguments gives a more accurate picture of which datasets are used for which vocabularies:

    if [ "$global_vocab" == "indoor" ]; then
        python_cmd+=" --db-samples.baidu-datasets 1"
        python_cmd+=" --db-samples.gardens 1"
        python_cmd+=" --db-samples.17places 1"
    elif [ "$global_vocab" == "urban" ]; then
        python_cmd+=" --db-samples.Oxford 1"
        python_cmd+=" --db-samples.st-lucia 1"
        python_cmd+=" --db-samples.pitts30k 4"
    elif [ "$global_vocab" == "aerial" ]; then
        python_cmd+=" --db-samples.Tartan-GNSS-test-rotated 1"
        python_cmd+=" --db-samples.Tartan-GNSS-test-notrotated 1"
        python_cmd+=" --db-samples.VPAir 2"
    elif [ "$global_vocab" == "hawkins" ]; then
        python_cmd+=" --db-samples.hawkins 1"
    elif [ "$global_vocab" == "laurel_caverns" ]; then
        python_cmd+=" --db-samples.laurel-caverns 1"
    elif [ "$global_vocab" == "structured" ]; then
        python_cmd+=" --db-samples.Oxford 1"
        python_cmd+=" --db-samples.gardens 1"
        python_cmd+=" --db-samples.17places 1"
        python_cmd+=" --db-samples.baidu-datasets 1"
        python_cmd+=" --db-samples.st-lucia 1"
        python_cmd+=" --db-samples.pitts30k 4"
    elif [ "$global_vocab" == "unstructured" ]; then
        python_cmd+=" --db-samples.Tartan-GNSS-test-rotated 1"
        python_cmd+=" --db-samples.Tartan-GNSS-test-notrotated 1"
        python_cmd+=" --db-samples.hawkins 1"
        python_cmd+=" --db-samples.laurel-caverns 1"
        python_cmd+=" --db-samples.eiffel 1"
        python_cmd+=" --db-samples.VPAir 2"
    elif [ "$global_vocab" == "both" ]; then # Global vocabulary
        # Structured
        python_cmd+=" --db-samples.Oxford 1"
        python_cmd+=" --db-samples.gardens 1"
        python_cmd+=" --db-samples.17places 1"
        python_cmd+=" --db-samples.baidu-datasets 1"
        python_cmd+=" --db-samples.st-lucia 1"
        python_cmd+=" --db-samples.pitts30k 4"
        # Unstructured
        python_cmd+=" --db-samples.Tartan-GNSS-test-rotated 1"
        python_cmd+=" --db-samples.Tartan-GNSS-test-notrotated 1"
        python_cmd+=" --db-samples.hawkins 1"
        python_cmd+=" --db-samples.laurel-caverns 1"
        python_cmd+=" --db-samples.eiffel 1"
        python_cmd+=" --db-samples.VPAir 2"
    else
        echo "Invalid global vocab!"
        exit 1
    fi

  3. When running the above script, you might want to "test" it on a small dataset first (the bulk of the time is spent computing and caching the cluster centers). You can later call it with the same arguments to test on other datasets with the same vocabulary; the cluster centers will then be loaded from the cache (if caching is enabled).

If you do not want to run step 2 (due to data or compute constraints), we also release the cluster centers in the public material. Download and unzip the Colab1/cache.zip file and navigate to the ./cache/vocabulary/dinov2_vitg14/l31_value_c32 folder; you'll find a c_centers.pt file in each of the listed vocabulary folders. You could also get these from the torch.hub release.
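As a rough illustration of loading and sanity-checking a downloaded c_centers.pt: the folder layout below follows the path quoted above, the load_centers helper is hypothetical (not part of the repo), and the (32, 1536) shape assumes 32 clusters of ViT-G/14 features — verify it against your actual download. The demo saves dummy centers so the snippet is self-contained:

```python
import os
import tempfile
import torch

def load_centers(cache_dir, domain,
                 ext_specifier="dinov2_vitg14/l31_value_c32"):
    """Load cached VLAD cluster centers for a domain vocabulary.
    (Hypothetical helper mirroring the cache layout described above.)"""
    path = os.path.join(cache_dir, "vocabulary", ext_specifier,
                        domain, "c_centers.pt")
    return torch.load(path, map_location="cpu")

# Round-trip demo with dummy centers (32 clusters, 1536-dim features)
root = tempfile.mkdtemp()
d = os.path.join(root, "vocabulary", "dinov2_vitg14/l31_value_c32", "aerial")
os.makedirs(d)
torch.save(torch.randn(32, 1536), os.path.join(d, "c_centers.pt"))
print(load_centers(root, "aerial").shape)  # torch.Size([32, 1536])
```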

You can visualize the tSNE and PCA clusters by downloading the datasets and using the scripts dino_v2_datasets_gem_tsne_clustering.py and dino_v2_datasets_gem_pca_clustering.py, respectively. See this for other methods/scripts.
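For intuition, the projection step can be reproduced in miniature. The sketch below GeM-pools dummy dense features and projects the pooled descriptors to 2D with PCA via torch.pca_lowrank; the paper's scripts do this on real data (and use tSNE as well), so treat this only as an outline of the idea:

```python
import torch

def gem_pool(feat_map, p=3.0, eps=1e-6):
    """GeM pooling: generalized mean over spatial locations.
    feat_map: (B, D, H, W) dense features -> (B, D) global descriptor."""
    return feat_map.clamp(min=eps).pow(p).mean(dim=(2, 3)).pow(1.0 / p)

torch.manual_seed(0)
# Three dummy "datasets" of 50 images each, shifted so they separate
pooled = torch.cat([gem_pool(torch.rand(50, 1536, 16, 16)) + i
                    for i in range(3)])          # (150, 1536)

# Project to 2D with PCA; similar datasets should cluster together
U, S, V = torch.pca_lowrank(pooled, q=2)
proj = pooled @ V[:, :2]                          # (150, 2) scatter points
print(proj.shape)
```

Each row of proj is one image's 2D coordinate; coloring points by dataset reproduces the kind of domain grouping shown in Figure 1.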

Vocabulary used in Table IV

In Table IV of the paper, the recall values are calculated for each dataset. For AnyLoc-VLAD-DINOv2, I am curious which vocabulary was used to obtain each of these results. In particular, I am interested in the results for the datasets in the Aerial domain.

The Hawkins, Laurel Caverns, and Mid-Atlantic Ridge results use the respective datasets alone. We used map-specific vocabularies here because we only tested with one dataset from each of their domains. If, for example, we wanted to include another underwater dataset (say, DB-A), we would compute the cluster centers using both DB-A and Mid-Atlantic Ridge, since those two datasets would belong to the "underwater" domain. Some code changes (specifically in step 2 of Method 2 above) would be needed for this to work, and you would also need to write a Dataset class for the new DB-A dataset in the custom_datasets folder.
Note that although Hawkins (degraded) and Laurel Caverns (subterranean) have similar imagery (camera properties and a lack of distinct features), they are projected to different places in Figure 1.

The results for Nardo-Air, Nardo-Air R, and VP-Air use the "aerial" domain vocabulary. All the database images from these datasets are used (actually, every second image from VP-Air, to avoid out-of-memory errors and to counter VP-Air's class imbalance). You can verify this here:

    elif [ "$global_vocab" == "aerial" ]; then
        python_cmd+=" --db-samples.Tartan-GNSS-test-rotated 1"
        python_cmd+=" --db-samples.Tartan-GNSS-test-notrotated 1"
        python_cmd+=" --db-samples.VPAir 2"
    elif [ "$global_vocab" == "hawkins" ]; then
        python_cmd+=" --db-samples.hawkins 1"
    elif [ "$global_vocab" == "laurel_caverns" ]; then
        python_cmd+=" --db-samples.laurel-caverns 1"

Additionally, to get the results of AnyLoc-VLAD-DINOv2 after creating the cluster centers, can I use anyloc_vlad_generate.py?

After you get the cluster centers (assuming you're following Method 2 above), you'll have to point c_centers_file to the new .pt file here:

c_centers_file = os.path.join(cache_dir, "vocabulary",
ext_specifier, domain, "c_centers.pt")

Assuming you've configured the dataset directory, it should work fine. However, I recommend using Method 1 directly, as it doesn't require setting up the repository.

from anyloc.

euncheolChoi avatar euncheolChoi commented on May 27, 2024

Thank you so much for the detailed guidelines. Of the methods you mentioned, I am using
model = torch.hub.load("AnyLoc/DINO", "get_vlad_model", backbone="DINOv2", domain="aerial", device="cuda")
to load the model and extract the descriptors. However, I am getting the following error while loading the model. Is this a temporary error? If you have a solution, I would appreciate it if you could share it.

------------ Dataset loaded ------------
------- Generating global descriptors -------
Using model : Anyloc-dino-VLAD_aerial_domain
Using cache found in /root/.cache/torch/hub/AnyLoc_DINO_main
Exception: invalid syntax (, line 1)
[ERROR]: Exit is not safe
Traceback (most recent call last):
File "/root/workspace/aerial_pr/Anyloc_docker_aerial/AnyLoc/aeria_scripts/dino_v2_global_vpr.py", line 327, in
main(largs)
File "/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/root/workspace/aerial_pr/Anyloc_docker_aerial/AnyLoc/aeria_scripts/dino_v2_global_vpr.py", line 239, in main
db_descs, qu_descs = build_cache(largs, gpr_ds)
File "/opt/conda/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
return func(*args, **kwargs)
File "/root/workspace/aerial_pr/Anyloc_docker_aerial/AnyLoc/aeria_scripts/dino_v2_global_vpr.py", line 129, in build_cache
anyloc = torch.hub.load("AnyLoc/DINO", "get_vlad_model", backbone="DINOv2", device="cuda")
File "/opt/conda/lib/python3.7/site-packages/torch/hub.py", line 540, in load
model = _load_local(repo_or_dir, model, *args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/torch/hub.py", line 566, in _load_local
hub_module = _import_module(MODULE_HUBCONF, hubconf_path)
File "/opt/conda/lib/python3.7/site-packages/torch/hub.py", line 89, in _import_module
spec.loader.exec_module(module)
File "", line 724, in exec_module
File "", line 860, in get_code
File "", line 791, in source_to_code
File "", line 219, in _call_with_frames_removed
File "", line 1
(backbone = )


TheProjectsGuy avatar TheProjectsGuy commented on May 27, 2024

Hey @euncheolChoi,

We haven't experienced this error before. Are you trying this from a clean install? Make sure your torch hub directory (usually ~/.cache/torch/hub) doesn't contain stale entries before trying again. Also ensure that you've activated a conda environment with Python and PyTorch installed.

# Install dependencies
conda install -c conda-forge einops
pip install fast_pytorch_kmeans

And you can run the following

import torch
# Load the model (this will download the vocabulary and the ViT model from torch.hub)
model = torch.hub.load("AnyLoc/DINO", "get_vlad_model", backbone="DINOv2", domain="aerial", device="cuda")
# Your images here
img = torch.rand(16, 3, 224, 224)
# Global descriptors
res = model(img)    # (16, 49152) dim

Are you following the same steps as above?

I get the following when I run this

Python 3.10.13 | packaged by conda-forge | (main, Dec 23 2023, 15:36:39) [GCC 12.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> model = torch.hub.load("AnyLoc/DINO", "get_vlad_model", backbone="DINOv2", domain="aerial", device="cuda")
Using cache found in /home/avneesh/.cache/torch/hub/AnyLoc_DINO_main
Storing (torch.hub) cache in: /home/avneesh/.cache/torch/hub/checkpoints/anyloc_files
100.0%
Downloading: "https://github.com/facebookresearch/dinov2/zipball/main" to /home/avneesh/.cache/torch/hub/main.zip
/home/avneesh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/swiglu_ffn.py:51: UserWarning: xFormers is not available (SwiGLU)
  warnings.warn("xFormers is not available (SwiGLU)")
/home/avneesh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/attention.py:33: UserWarning: xFormers is not available (Attention)
  warnings.warn("xFormers is not available (Attention)")
/home/avneesh/.cache/torch/hub/facebookresearch_dinov2_main/dinov2/layers/block.py:40: UserWarning: xFormers is not available (Block)
  warnings.warn("xFormers is not available (Block)")
Downloading: "https://dl.fbaipublicfiles.com/dinov2/dinov2_vitg14/dinov2_vitg14_pretrain.pth" to /home/avneesh/.cache/torch/hub/checkpoints/dinov2_vitg14_pretrain.pth
100.0%
VLAD caching is disabled.
Desc dim set to 1536
>>> img = torch.rand(16, 3, 224, 224)
>>> res = model(img)
>>> res.shape
torch.Size([16, 49152])

This takes about 5.3 GB on the GPU.


euncheolChoi avatar euncheolChoi commented on May 27, 2024

Oh, it was just an issue with my environment. Now I can get the results. Thank you.

