Hello, thanks for your awesome work! I have a question about <a href="https://gith

about "select_cams" about openunreid HOT 16 CLOSED

SunskyF commented on May 27, 2024

about "select_cams"

from openunreid.

Comments (16)

SunskyF commented on May 27, 2024

Hello~ I have tested it on duke->market.

# case 1: origin
cams = pid_cam[pid_i]
index = pid_index[pid_i]
select_cams = No_index(cams, i_cam)

Mean AP: 77.7%
CMC Scores:
  top-1          90.8%
  top-5          96.6%
  top-10         98.0%

# case 2: no select cams
cams = pid_cam[pid_i]
index = pid_index[pid_i]
select_cams = []

Mean AP: 76.5%
CMC Scores:
  top-1          89.5%
  top-5          96.0%
  top-10         97.3%

# case 3: no select cams wrong
cams = pid_cam[pid_i]
index = pid_index[pid_i]
select_cams = list(range(len(cams)))

Mean AP: 8.6%
CMC Scores:
  top-1          17.2%
  top-5          31.7%
  top-10         39.6%

Conclusion:

Removing the index of i from the sample list seems crucial; otherwise "you may use the image of index i twice in a mini-batch."
Sampling data according to their camera IDs may bring a small improvement (~1%).

yxgeee commented on May 27, 2024

select_cams is meant to give priority to sampling images from different cameras. The values in select_cams are actually instance indexes, not camera indexes. So if you use select_cams = list(range(len(cams))), you will always use the data samples with the first len(cams) indexes, i.e. the first 6 images, which is totally wrong.

See https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py#L17: the values in the returned list are the indexes i, not the original values j.
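
For readers following along, the behaviour described above can be sketched as below. This is only an illustration consistent with the description (the returned values are positions i, not the stored values j). The name positions_not_equal and the sample list are hypothetical; the repository's own helper at the linked line is the authority.

```python
def positions_not_equal(values, target):
    """Return the positions i whose stored value j differs from target."""
    return [i for i, j in enumerate(values) if j != target]


# Hypothetical camera IDs for five images of one identity.
example_cams = [0, 0, 1, 2, 1]

# Positions of the images that were NOT taken by camera 1.
print(positions_not_equal(example_cams, 1))  # [0, 1, 3]
```

The result contains positions into the per-identity list, so it can be used to look up images of that identity, but it says nothing about how many cameras exist.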


SunskyF commented on May 27, 2024

Thanks for your reply!
An example:

select_cams = No_index(cams, i_cam)
print(len(cams), select_cams)
select_cams = list(range(len(cams)))
print(len(cams), select_cams)

21 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
21 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

I think they are similar?


yxgeee commented on May 27, 2024

If you do not want to sample data by camera, simply remove lines 104-118 in https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py

yxgeee commented on May 27, 2024

Thanks for your reply!
An example:

select_cams = No_index(cams, i_cam)
print(len(cams), select_cams)
select_cams = list(range(len(cams)))
print(len(cams), select_cams)

21 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
21 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

I think they are similar?

If you use select_cams = list(range(len(cams))) and, for example, you have 21 cameras here, you will always sample your data from the first 21 images of each class.
Two things may hurt your performance:

In some classes there are more than 21 images, but the remaining ones (from the 22nd on) will never be used;
You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.
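
The second point can be illustrated with a small sketch. It is not the repository's sampler; the function sample_partners and its parameters are hypothetical names chosen for this example. It shows only that a candidate list which still contains the anchor's own position can return the anchor image again.

```python
import random


def sample_partners(num_images, anchor_pos, k, exclude_anchor, rng):
    """Draw k distinct partner positions for the anchor at anchor_pos.

    If exclude_anchor is False, the anchor's own position stays in the
    candidate list and can be drawn a second time.
    """
    candidates = [
        p for p in range(num_images)
        if not (exclude_anchor and p == anchor_pos)
    ]
    return rng.sample(candidates, k)


rng = random.Random(0)

# Anchor kept in the candidates: drawing all 4 positions must include it.
with_anchor = sample_partners(4, anchor_pos=2, k=4, exclude_anchor=False, rng=rng)

# Anchor removed: it can never be drawn again.
without_anchor = sample_partners(4, anchor_pos=2, k=3, exclude_anchor=True, rng=rng)

print(2 in with_anchor, 2 in without_anchor)
```

With the anchor removed from the candidates, a mini-batch cannot contain the same image twice for that identity.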


SunskyF commented on May 27, 2024

If list(range(len(cams))) refers to 21 cameras, does select_cams = No_index(cams, i_cam) also contain camera indexes, given that the two outputs look similar?

You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.
Thanks for pointing this out.

SunskyF commented on May 27, 2024

select_cams = list(range(len(cams))) gives instance indexes, which we can use to pick images with the same ID.
The problem may be

You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.

But the performance seems weird. I overwrote the log directory, so I can't find the result.

yxgeee commented on May 27, 2024

If list(range(len(cams))) refers to 21 cameras, does select_cams = No_index(cams, i_cam) also contain camera indexes, given that the two outputs look similar?

You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.
Thanks for pointing this out.

Please look carefully at https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py#L17: the values in the returned list are the indexes i (positions of images within the same pid), not the original values j (the camera IDs of the images within the same pid).

yxgeee commented on May 27, 2024

select_cams = list(range(len(cams))) gives instance indexes, which we can use to pick images with the same ID.
The problem may be

You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.

But the performance seems weird. I overwrote the log directory, so I can't find the result.

As I mentioned: for example, if you have 21 cameras here, you will always sample your data from the first 21 images of each class. In some classes there are more than 21 images, but the remaining ones (from the 22nd on) will never be used. This is the core problem.

Using list(range(len(cams))) is totally wrong. If you do not want to sample data by camera, simply remove lines 104-118 in https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py

SunskyF commented on May 27, 2024

If list(range(len(cams))) refers to 21 cameras, does select_cams = No_index(cams, i_cam) also contain camera indexes, given that the two outputs look similar?

You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.
Thanks for pointing this out.

Please look carefully at https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py#L17: the values in the returned list are the indexes i (positions of images within the same pid), not the original values j (the camera IDs of the images within the same pid).

select_cams = list(range(len(cams))) also gives indexes of images within the same pid.

An example:

datasets: {'market1501': 'trainval', 'dukemtmcreid': 'trainval'}
unsup_dataset_indexes: [0,]

cams = pid_cam[pid_i]
index = pid_index[pid_i]
print("-" * 16)
print(cams)   # [5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7]
print(index)  # [15014, 15015, 15016, 15017, 15018, 15019, 15020, 15021, 15022, 15023, 15024, 15025, 15026]
select_cams = No_index(cams, i_cam)
print(len(cams), select_cams)  # 13 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
select_cams = list(range(len(cams)))
print(len(cams), select_cams)  # 13 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print("-" * 16)

[5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7]
[15014, 15015, 15016, 15017, 15018, 15019, 15020, 15021, 15022, 15023, 15024, 15025, 15026]
13 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
13 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

select_cams = No_index(cams, i_cam) gives positions into index, i.e. which element of [15014, 15015, 15016, 15017, 15018, 15019, 15020, 15021, 15022, 15023, 15024, 15025, 15026] to take. So list(range(len(cams))) also gives positions into index, because index and cams correspond one-to-one?
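
The one-to-one correspondence can be checked with the printed values above. The sketch below is an illustration only: i_cam = 7 is an assumption inferred from the printed result (positions 0 to 9 remain, so the camera-7 images were excluded), and the list comprehension stands in for the repository helper.

```python
# Values copied from the printed example above.
cams = [5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7]
index = list(range(15014, 15027))  # 15014 .. 15026, one entry per image

i_cam = 7  # assumption: the anchor image was taken by camera 7

# Positions of images from a different camera than the anchor.
other_cam_positions = [i for i, c in enumerate(cams) if c != i_cam]
# Positions of every image of this identity, the anchor's camera included.
all_positions = list(range(len(cams)))

# Both lists are positions into `index`, so both map to dataset indexes.
other_cam_images = [index[p] for p in other_cam_positions]
all_images = [index[p] for p in all_positions]

print(other_cam_positions)
print(other_cam_images)
```

Both lists hold positions into index. The difference is that the second one also covers the images from the anchor's own camera, including the anchor itself.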


yxgeee commented on May 27, 2024

What I mentioned is that the core problem hurting the final performance is that the number of cameras is fixed, e.g. 13 in your example.
So if you use list(range(len(cams))) as the candidate list, you can only sample images from the first len(cams) indexes. For example, if there are 20 images in class A, you can only sample the mini-batch from the first 13 images, while the other 7 images are never considered.

SunskyF commented on May 27, 2024

What I mentioned is that the core problem hurting the final performance is that the number of cameras is fixed, e.g. 13 in your example.
So if you use list(range(len(cams))) as the candidate list, you can only sample images from the first len(cams) indexes. For example, if there are 20 images in class A, you can only sample the mini-batch from the first 13 images, while the other 7 images are never considered.

The point is that there are only 13 images in total, since index has length 13?

SunskyF commented on May 27, 2024

What I mentioned is that the core problem hurting the final performance is that the number of cameras is fixed, e.g. 13 in your example.
So if you use list(range(len(cams))) as the candidate list, you can only sample images from the first len(cams) indexes. For example, if there are 20 images in class A, you can only sample the mini-batch from the first 13 images, while the other 7 images are never considered.

Also, I am using the Market and Duke datasets, so there cannot be 13 cameras.

yxgeee commented on May 27, 2024

OK, sorry for misunderstanding cams in list(range(len(cams))). I thought it was the overall list of camera IDs, but I now see it is a variable in the code with the same length as the list of images.
So the conclusion is that sampling data according to their camera IDs is crucial to the final performance.
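
As a rough illustration of what "sampling data according to their camera IDs" means in this thread, here is a minimal sketch. It is a simplified stand-in and not the repository's sampler: pick_partners, its parameters, and the fallback order are assumptions made for this example.

```python
import random


def pick_partners(cams, index, anchor_pos, num_instances, rng):
    """Pick num_instances - 1 partner images for the anchor at anchor_pos.

    cams[p] is the camera ID and index[p] the dataset index of the p-th
    image of one identity.
    """
    anchor_cam = cams[anchor_pos]
    # Prefer images of the same identity seen by a different camera.
    candidates = [p for p, c in enumerate(cams) if c != anchor_cam]
    if not candidates:
        # Fall back to any other image of the same identity.
        candidates = [p for p in range(len(index)) if p != anchor_pos]
    if not candidates:
        return []  # the identity has a single image
    need = num_instances - 1
    if len(candidates) >= need:
        chosen = rng.sample(candidates, need)  # without replacement
    else:
        chosen = [rng.choice(candidates) for _ in range(need)]  # with replacement
    return [index[p] for p in chosen]


rng = random.Random(0)
print(pick_partners([5, 5, 6, 6, 7], [10, 11, 12, 13, 14], 0, 4, rng))
```

In both branches the anchor's own position is excluded, so the anchor image is never drawn a second time.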


SunskyF commented on May 27, 2024

Yes, it could also be because of

You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.

I am trying the

just simply remove Line 104-118 in https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py

approach.
Thank you for your reply.

yxgeee commented on May 27, 2024

You are welcome to post your comparison results here when training finishes. I am also curious how much it affects performance.
