sample4geo's Issues
Training results on the university-1652 dataset
Thank you for your reply.
May I ask which weight file you use for training? I used the weight file downloaded from Hugging Face. Why is the final AP value higher than the 91.39 reported in your paper? The following are the results for different weight files:
1.Model: convnext_base.fb_in22k_ft_in1k_384
{'input_size': (3, 224, 224), 'interpolation': 'bicubic', 'mean': (0.485, 0.456, 0.406), 'std': (0.229, 0.224, 0.225), 'crop_pct': 0.875, 'crop_mode': 'center'}
Start from: pretrained/university/convnext_base.fb_in22k_ft_in1k_384/weights_e1_0.9515.pth
GPUs available: 8
Image Size Query: (384, 384)
Image Size Ground: (384, 384)
Mean: (0.485, 0.456, 0.406)
Std: (0.229, 0.224, 0.225)
Query Images Test: 701
Gallery Images Test: 51355
------------------------------[University-1652]------------------------------
Extract Features:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:04<00:00, 1.43it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 402/402 [01:43<00:00, 3.87it/s]
Compute Scores:
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 701/701 [00:03<00:00, 193.97it/s]
Recall@1: 95.1498 - Recall@5: 97.0043 - Recall@10: 97.4322 - Recall@top1: 99.2867 - AP: 91.3901
2.Model: convnext_base.fb_in22k_ft_in1k_384
{'input_size': (3, 224, 224), 'interpolation': 'bicubic', 'mean': (0.485, 0.456, 0.406), 'std': (0.229, 0.224, 0.225), 'crop_pct': 0.875, 'crop_mode': 'center'}
Start from: /slurm-files/sgy/code/Sample4Geo/pretrained/university/convnext_base.fb_in22k_ft_in1k_384/weights_e1_0.9501.pth
GPUs available: 8
Image Size Query: (384, 384)
Image Size Ground: (384, 384)
Mean: (0.485, 0.456, 0.406)
Std: (0.229, 0.224, 0.225)
Query Images Test: 701
Gallery Images Test: 51355
------------------------------[University-1652]------------------------------
Extract Features:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:11<00:00, 2.00s/it]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 402/402 [03:41<00:00, 1.82it/s]
Compute Scores:
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 701/701 [00:06<00:00, 104.41it/s]
Recall@1: 95.0071 - Recall@5: 96.7190 - Recall@10: 97.1469 - Recall@top1: 99.2867 - AP: 91.6377
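For readers comparing the Recall@K and AP figures in the logs above, the following is a minimal sketch of the textbook definitions of both metrics for a single query. It is not the repository's evaluation script, which may use a slightly different AP interpolation; the function names and the toy labels are assumptions made for illustration.

```python
def recall_at_k(ranked_labels, query_label, k):
    """Return 1.0 if any of the top-k ranked gallery labels matches the query."""
    return 1.0 if query_label in ranked_labels[:k] else 0.0


def average_precision(ranked_labels, query_label):
    """Mean of the precision values at each rank where a correct match occurs."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy gallery ranking (most similar first); the labels are made up.
ranked = ["0001", "0002", "0001", "0003"]
r1 = recall_at_k(ranked, "0001", 1)
ap = average_precision(ranked, "0001")
```

The reported numbers are these per-query values averaged over all queries, so a small change in the ranking of a few queries can move AP by a few tenths of a point while Recall@1 stays nearly the same.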
Model performance about convnext_tiny model
Hello, thank you for your great work. We tested your method using the 'convnext_tiny.fb_in22k_ft_in1k_384' model with the default parameters, training for 40 epochs. However, the model only reached around 40% mean average precision on the University-1652 dataset, which deviates from the results reported in your paper. Could this be due to a parameter issue, such as the number of epochs defaulting to 1 in the training code?
About CVUSA
I tried training on CVUSA after applying the polar coordinate transformation, but the metrics were abnormally high, and I don't know why. This is very important to me, thank you!
About break_counter in dataset shuffle
Thanks for your implementation.
I would like to ask about break_counter. What does this variable mean? And why do different datasets use different values for ending the shuffle loop that avoids having the same ID twice in one batch?
Thanks for your time.
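As one possible reading of the question above, the sketch below shows how a break counter can terminate a shuffle that keeps every class ID unique within a batch. It is an illustration under assumptions, not the repository's implementation: the function name, the `max_rejections` parameter, and the `(pair_name, class_id)` layout are hypothetical.

```python
import random


def shuffle_without_duplicate_ids(pairs, batch_size, max_rejections=512, seed=0):
    """Order (pair_name, class_id) tuples so that no batch repeats a class_id.

    break_counter counts consecutive candidates that could not be placed;
    the loop stops once it reaches max_rejections.
    """
    rng = random.Random(seed)
    pool = list(pairs)
    rng.shuffle(pool)

    ordered, batch, ids_in_batch = [], [], set()
    break_counter = 0
    while pool:
        pair = pool.pop(0)
        _, class_id = pair
        if class_id not in ids_in_batch:
            batch.append(pair)
            ids_in_batch.add(class_id)
            break_counter = 0
        else:
            pool.append(pair)  # put it back and retry in a later batch
            break_counter += 1
        if len(batch) == batch_size:
            ordered.extend(batch)
            batch, ids_in_batch = [], set()
        if break_counter >= max_rejections:
            break  # the remaining pairs cannot complete a duplicate-free batch
    return ordered  # an incomplete last batch is dropped


# Toy data: 4 classes with 3 images each (names are made up).
pairs = [(f"img_{c}_{i}", c) for c in range(4) for i in range(3)]
ordered = shuffle_without_duplicate_ids(pairs, batch_size=4)
```

Under this reading, the limit only needs to be large enough that the loop does not stop while a valid candidate is still in the pool, which would explain why a dataset with more images per class could use a different value.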
How to get CVACT dataset
There is no link to the official CVACT dataset and the author's email is not working, can you share the CVACT dataset with me? Thank you very much!
About the setting of the InfoNCE loss temperature parameter τ
Thank you very much for your paper and code. I would like to ask about the InfoNCE loss.
Does the temperature parameter τ have an initial value? According to the paper, it is a learnable parameter. How is it learned?
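To make the question concrete, here is a small sketch of how a temperature can be treated as an ordinary trainable scalar. It follows the common CLIP-style pattern of storing log(1/τ) and is not taken from the repository; the initial value τ = 0.07, the learning rate, and the finite-difference gradient (used here only so that the example runs without an autograd library) are assumptions.

```python
import math


def info_nce(similarities, logit_scale):
    """Mean InfoNCE loss for a square cosine-similarity matrix.

    similarities[i][j] compares query i with reference j; the matching pair
    sits on the diagonal. exp(logit_scale) equals 1 / tau.
    """
    scale = math.exp(logit_scale)
    total = 0.0
    for i, row in enumerate(similarities):
        logits = [scale * s for s in row]
        peak = max(logits)  # subtract the maximum for numerical stability
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]  # negative log-softmax of the positive pair
    return total / len(similarities)


def temperature_step(similarities, logit_scale, lr=0.1, eps=1e-5):
    """One gradient-descent step on logit_scale via a central finite difference."""
    grad = (info_nce(similarities, logit_scale + eps)
            - info_nce(similarities, logit_scale - eps)) / (2 * eps)
    return logit_scale - lr * grad


sims = [[0.9, 0.1, 0.2],
        [0.0, 0.8, 0.3],
        [0.2, 0.1, 0.7]]
logit_scale = math.log(1 / 0.07)  # assumed initial value, tau = 0.07
updated = temperature_step(sims, logit_scale)
```

In a deep learning framework the same scalar would simply be registered as a parameter and updated by the optimizer together with the network weights, so the temperature is learned through the loss gradient like any other weight.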
Training result error
Thank you for your wonderful work.
I encountered a problem during training: as the number of epochs increased, the Recall@1 and AP values actually decreased.
This is the output from different epochs:
------------------------------[Epoch: 1]------------------------------
100%|████████████████████████████████████████████████████████████████████████| 295/295 [04:40<00:00, 1.05it/s, loss=0.8647, loss_avg=1.0110, lr=0.000999]
Epoch: 1, Train Loss = 1.011, Lr = 0.000999
------------------------------[Evaluate]------------------------------
Extract Features:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:02<00:00, 2.08it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 402/402 [01:32<00:00, 4.35it/s]
Compute Scores:
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 701/701 [00:03<00:00, 197.38it/s]
Recall@1: 93.5806 - Recall@5: 96.5763 - Recall@10: 97.1469 - Recall@top1: 99.2867 - AP: 89.7420
Shuffle Dataset:
42428it [00:00, 224071.11it/s]
Original Length: 37854 - Length after Shuffle: 37760
Break Counter: 512
Pairs left out of last batch to avoid creating noise: 94
First Element ID: 0945 - Last Element ID: 1646
------------------------------[Epoch: 25]------------------------------
100%|████████████████████████████████████████████████████████████████████████| 295/295 [04:45<00:00, 1.04it/s, loss=0.8206, loss_avg=0.8082, lr=0.000313]
Epoch: 25, Train Loss = 0.808, Lr = 0.000313
------------------------------[Evaluate]------------------------------
Extract Features:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:02<00:00, 2.12it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 402/402 [01:38<00:00, 4.10it/s]
Compute Scores:
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 701/701 [00:03<00:00, 197.02it/s]
Recall@1: 77.8887 - Recall@5: 85.8773 - Recall@10: 88.7304 - Recall@top1: 98.5735 - AP: 54.7721
Shuffle Dataset:
42376it [00:00, 223377.40it/s]
Original Length: 37854 - Length after Shuffle: 37760
Break Counter: 512
Pairs left out of last batch to avoid creating noise: 94
First Element ID: 0956 - Last Element ID: 1375
Geo-Localization for Drone Images Using Cross-View Image Embeddings
Thanks for your amazing work!
Problem
When the model extracts embeddings, the positional information of the image is not retained. This makes it impossible to pinpoint the exact position of the drone's target view based on the similarity of embeddings alone. The inability to retain positional data within the embeddings hampers the accuracy and reliability of the geo-localization process.
Steps to Reproduce
- Use the model to calculate embeddings for cross-view images.
- Compare the similarity between embeddings of different images.
- Attempt to determine the exact position of the drone's target view based on the embeddings.
Expected Behavior
The model should retain the positional information within the embeddings, allowing for accurate determination of the drone's target view position based on the similarity of embeddings.
Actual Behavior
The positional information is lost during the embedding extraction process, making it impossible to accurately pinpoint the drone's target view position.
Possible Solution
Explore modifications to the embedding extraction process to retain positional information.
Investigate alternative architectures or additional components that can preserve positional data within the embeddings.
Consider combining the current embedding approach with a complementary method that retains positional information.
Additional Context
The model is implemented for geo-localization purposes in drone imagery.
The primary objective is to accurately determine the position of the drone's target view using image embeddings.
Is there any other optimized algorithm or approach for obtaining the correct drone target view position with this model?
Thanks!
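One complementary approach of the kind suggested above can be sketched as follows: keep the coordinates as metadata next to each gallery embedding and derive the position from the best retrieval matches. This is only an illustration under assumptions. The `(embedding, (lat, lon))` gallery layout, the function names, and the similarity-weighted averaging are hypothetical, and averaging latitude and longitude directly is only reasonable over small areas.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def estimate_position(query_embedding, gallery, top_k=3):
    """Similarity-weighted mean of the coordinates of the top-k gallery tiles.

    gallery is a list of (embedding, (lat, lon)) tuples; the coordinates are
    stored alongside each tile and are not contained in the embedding itself.
    """
    scored = sorted(
        ((cosine_similarity(query_embedding, emb), coord) for emb, coord in gallery),
        key=lambda item: item[0],
        reverse=True,
    )[:top_k]
    weights = [max(score, 0.0) for score, _ in scored]
    total = sum(weights)
    if total == 0.0:
        return scored[0][1]  # no positive similarity: fall back to the best match
    lat = sum(w * coord[0] for w, (_, coord) in zip(weights, scored)) / total
    lon = sum(w * coord[1] for w, (_, coord) in zip(weights, scored)) / total
    return (lat, lon)


# Toy gallery with made-up embeddings and coordinates.
gallery = [([1.0, 0.0], (10.0, 20.0)), ([0.0, 1.0], (30.0, 40.0))]
position = estimate_position([1.0, 0.0], gallery, top_k=1)
```

The resolution of such an estimate is bounded by the spacing of the gallery tiles, so finer localization would still need a separate step such as local feature matching inside the retrieved tile.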
Model performance with ConvNext-tiny
Thanks for your response! However, when I train with the default parameters you suggested on 4×3090 GPUs, the network performs poorly on the University-1652 dataset with Epoch=1.
The problem we noticed is that the loss does not decrease at any point during training.
The only difference from the default training code is that we downloaded the ConvNeXt-T model fine-tuned on the ImageNet-1k dataset from https://github.com/facebookresearch/ConvNeXt and load it locally via
model_state_dict = torch.load('./pretrained/university/{}.pth'.format(config.model))
model.load_state_dict(model_state_dict, strict=False)
Originally posted by @MingkunLishigure in #1 (comment)
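One thing worth checking here, offered as a guess rather than a diagnosis: with strict=False, PyTorch's load_state_dict skips every parameter whose name does not match and reports the skipped names in its return value (missing_keys and unexpected_keys) instead of raising an error. If the checkpoint uses different key names than the model, nothing is loaded and training starts from random weights. The sketch below compares two sets of key names in plain Python; the key names are hypothetical and only illustrate a naming mismatch.

```python
def compare_state_dict_keys(model_keys, checkpoint_keys):
    """Report which parameter names a non-strict load would skip."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        "matched": sorted(model_keys & checkpoint_keys),
        "missing_in_checkpoint": sorted(model_keys - checkpoint_keys),
        "unexpected_in_checkpoint": sorted(checkpoint_keys - model_keys),
    }


# Hypothetical key names, chosen only to show what a mismatch looks like.
model_keys = ["model.stem.0.weight", "model.stem.0.bias"]
checkpoint_keys = ["downsample_layers.0.0.weight", "downsample_layers.0.0.bias"]
report = compare_state_dict_keys(model_keys, checkpoint_keys)
```

In a real run, the two lists would come from model.state_dict().keys() and from the keys of the loaded checkpoint dictionary. An empty "matched" list would mean the pretrained weights were never applied.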
Get log file
About drawing heat maps
Thank you for your great work and your excellent and responsible answers! But I have a question. Regarding the drawing of feature heat maps in your paper, can you provide the code?
Could you please provide the CVACT's ACT_data.mat file?
Hi,
Thank you for sharing this great work. Could you please provide the "ACT_data.mat" file, which is used in the "cvusa_and_cvact.py"?
inverse polar
find a mistake in code
Thanks to the author for the well-commented code. I found a mistake in it.
In train_university.py (lines 94-95):
config.query_folder_train = './data/U1652/train/satellite'
config.gallery_folder_train = './data/U1652/train/drone'
should be changed to:
config.query_folder_train = './data/U1652/train/drone'
config.gallery_folder_train = './data/U1652/train/satellite'
loss=NAN when training
Question about the memory
Hi, thank you very much for your excellent work and code. Your code structure is intuitive and effective and has inspired me a lot. However, I do not have a high-performance machine for my reproduction experiments, so I reduced the batch size from 128 to 16 at the expense of slightly lower performance. During the evaluation phase on the VIGOR dataset, the program is often killed for lack of memory. I bought an extra memory module to increase my computer's RAM to 32 GB, and it still gets killed. I would appreciate it if you could share the amount of memory in your experimental setup for reference.
In addition, I noticed the following settings in the training scripts:
neighbour_select: int = 64 # max selection size from pool
neighbour_range: int = 128 # pool size for selection
Based on your description in Section 3.4 of the paper, when reducing the batch size to 16, should I adjust neighbour_range to 16 and neighbour_select to 8?
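To illustrate what the two settings control, the sketch below draws a selection from a similarity-ranked pool and scales both values in proportion to the batch size. It relies only on the two code comments quoted above (pool size and maximum selection size). The function names, the uniform random draw, and the proportional scaling rule are assumptions, and this thread does not establish whether proportional scaling preserves accuracy.

```python
import random


def select_neighbours(ranked_neighbours, neighbour_select, neighbour_range, seed=0):
    """Pick up to neighbour_select ids from the neighbour_range most similar ids."""
    pool = ranked_neighbours[:neighbour_range]
    k = min(neighbour_select, len(pool))
    return random.Random(seed).sample(pool, k)


def scale_with_batch(batch_size, ref_batch=128, ref_select=64, ref_range=128):
    """Scale both settings in proportion to the batch size (a heuristic only)."""
    factor = batch_size / ref_batch
    select = max(1, round(ref_select * factor))
    pool_size = max(1, round(ref_range * factor))
    return select, pool_size


# Batch size 16 instead of 128, as in the question above.
select, pool_size = scale_with_batch(16)
chosen = select_neighbours(list(range(100)), select, pool_size)
```

With these assumptions, a batch size of 16 gives a selection size of 8 and a pool size of 16, which matches the values proposed in the question.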
ViT model
Thank you for your excellent work. I would like to ask which ViT model you used in the ablation experiment?
Confused
Thank you for your excellent work. I found that retrieval with the polar-transformed image performs worse than retrieval with the original satellite view. This is counterintuitive, because the polar coordinate transformation seems to be beneficial in other networks.
Accuracy of the cross-area mode in VIGOR
Hello! Thank you very much for sharing your research results! I have a question: in this paper, the accuracy on the VIGOR dataset in cross-area mode is very high. Is this due to the use of metric learning?
Where can I get the calc_distance_university.py file and the *.mat file?
Are there any instructions?
Hello. Good to see your great work!
I wonder if there are any instructions about the packages and dependencies to be installed.
Thank you!
Training results on the university-1652 dataset
Thank you for your reply.
May I ask which weight file you use for training? I used the weight file downloaded from Hugging Face. Why is the final AP value higher than the 91.39 reported in your paper? Here are my training results:
------------------------------[Epoch: 1]------------------------------
100%|████████████████████████████████████████████████████████████████████████| 295/295 [04:45<00:00, 1.03it/s, loss=0.8654, loss_avg=1.0052, lr=0.000000]
Epoch: 1, Train Loss = 1.005, Lr = 0.000000
------------------------------[Evaluate]------------------------------
Extract Features:
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:03<00:00, 1.94it/s]
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 402/402 [01:43<00:00, 3.90it/s]
Compute Scores:
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 701/701 [00:03<00:00, 194.62it/s]
Recall@1: 95.0071 - Recall@5: 96.7190 - Recall@10: 97.1469 - Recall@top1: 99.2867 - AP: 91.6379
Shuffle Dataset:
42428it [00:00, 191806.88it/s]
Original Length: 37854 - Length after Shuffle: 37760
Break Counter: 512
Pairs left out of last batch to avoid creating noise: 94
First Element ID: 0945 - Last Element ID: 1646
Results in CVACT_val.
Hi @Skyy93 ,
I want to report your "Random" results on CVACT_val in my paper, but I cannot find them. Can you tell me what they are?
Best,
Guopeng.
test_160k.py
Hi, I really appreciate your method employed in this competition. It appears that your novel approach also places emphasis on the university_160k dataset. Could u pls upload test_160k.py? Thx a lot:)
sharing weights
Thanks for the great work, have you tried ConvNext-B without sharing weights?
Missing implementation of DSS in University-1652
Thanks for your interesting work.
However, I cannot find the implementation of dynamic similarity sampling in the code; there is only a custom sampler that prevents the occurrence of multiple images from the same class.
Does this mean that there is no special sampling method for University-1652?
get train script
Thank you for your excellent work. Can you provide the train script?
The training process of university-1652
May I ask which initial weight file is used to train on the University-1652 dataset? When I train with Hugging Face's convnext_base.fb_in22k_ft_in1k_384/pytorch_model.bin, why do the AP and Recall@1 values get lower and lower?
training steps for the University-1652 dataset
Can you tell me the training steps for the University-1652 dataset? For example, the order in which the .py files should be run.
Training result error
Thank you for your reply.
Does this mean that the number of epochs should be set to 1 during training? Won't that cause underfitting?
GPS sampling
Thank you for your excellent work, but I have a question. You mention in the ablation section of the paper that GPS sampling can also be used with CVUSA, but the dataset does not seem to provide latitude and longitude.