about "select_cams" (openunreid issue, closed, 16 comments)

SunskyF commented on May 27, 2024
about "select_cams"

from openunreid.

Comments (16)

SunskyF commented on May 27, 2024

Hello~ I have tested it on duke->market.

# case 1: original
cams = pid_cam[pid_i]
index = pid_index[pid_i]
select_cams = No_index(cams, i_cam)

Mean AP: 77.7%
CMC Scores:
  top-1          90.8%
  top-5          96.6%
  top-10         98.0%

# case 2: no select cams
cams = pid_cam[pid_i]
index = pid_index[pid_i]
select_cams = []

Mean AP: 76.5%
CMC Scores:
  top-1          89.5%
  top-5          96.0%
  top-10         97.3%

# case 3: no select cams (wrong)
cams = pid_cam[pid_i]
index = pid_index[pid_i]
select_cams = list(range(len(cams)))

Mean AP: 8.6%
CMC Scores:
  top-1          17.2%
  top-5          31.7%
  top-10         39.6%

Conclusion:

  1. The point that "you did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch" seems crucial.
  2. Sampling data according to camera IDs may bring a small improvement (about 1%: 77.7% vs. 76.5% mAP between case 1 and case 2).

yxgeee commented on May 27, 2024

select_cams is meant to give priority to sampling images from different cameras. The values in select_cams are actually instance indexes, not camera indexes. So if you use select_cams = list(range(len(cams))), you will always use the data samples with the first len(cams) indexes, i.e. the first 6 images, which is totally wrong.

See https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py#L17: the values in the returned list are the indexes i, not the original values j.
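
To make the distinction concrete, here is a minimal sketch of what a helper like No_index does according to the description above: it returns the positions i of the entries that differ from the excluded value, not the entries j themselves. The name no_index and the sample camera list are illustrative assumptions, not copied from the repository.

```python
def no_index(values, excluded):
    """Return the positions of all entries in `values` that differ from `excluded`."""
    return [i for i, j in enumerate(values) if j != excluded]


# Hypothetical camera IDs for the images of one pid (illustration only).
cams = [1, 1, 2, 3]
i_cam = 1  # camera of the anchor image

select_cams = no_index(cams, i_cam)
print(select_cams)  # [2, 3]: positions within the pid, not camera IDs
```

The result is a list of positions that can be used to look up entries in the per-pid index list, which is why it cannot be replaced by a list built from camera counts.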

SunskyF commented on May 27, 2024

Thanks for your reply!
An example:

select_cams = No_index(cams, i_cam)
print(len(cams), select_cams)
select_cams = list(range(len(cams)))
print(len(cams), select_cams)
21 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
21 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

I think they are similar?

yxgeee commented on May 27, 2024

If you do not want to sample data by camera, simply remove lines 104-118 in https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py

yxgeee commented on May 27, 2024

> Thanks for your reply!
> An example:
>
> select_cams = No_index(cams, i_cam)
> print(len(cams), select_cams)
> select_cams = list(range(len(cams)))
> print(len(cams), select_cams)
> 21 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
> 21 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>
> I think they are similar?

If you use select_cams = list(range(len(cams))) and, for example, you have 21 cameras here, you will always sample your data from the first 21 images of each class.
Two things may hurt your performance:

  1. Some classes have more than 21 images, but the remaining ones (beyond the first 21) will never be used;
  2. You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.

SunskyF commented on May 27, 2024

If the result of list(range(len(cams))) means 21 cameras, does select_cams = No_index(cams, i_cam) also mean camera indexes, since the two are similar?

> You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.

Thanks for pointing this out.

SunskyF commented on May 27, 2024

select_cams = list(range(len(cams))) contains instance indexes; we can use it to get indexes from the same ID.
The problem may be:

> You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.

But the performance seems weird. I overwrote the log directory, so I can't find the result.

yxgeee commented on May 27, 2024

> If the result of list(range(len(cams))) means 21 cameras, does select_cams = No_index(cams, i_cam) also mean camera indexes, since the two are similar?
>
> > You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.
>
> Thanks for pointing this out.

Please look carefully at https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py#L17: the values in the returned list are the indexes i (indexes of images within the same pid), not the original values j (the camera IDs of the images within the same pid).

yxgeee commented on May 27, 2024

> select_cams = list(range(len(cams))) contains instance indexes; we can use it to get indexes from the same ID.
> The problem may be:
>
> > You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.
>
> But the performance seems weird. I overwrote the log directory, so I can't find the result.

As I mentioned: for example, if you have 21 cameras here, you will always sample your data from the first 21 images of each class. Some classes have more than 21 images, but the remaining ones (beyond the first 21) will never be used. This is the core problem.

Using list(range(len(cams))) is totally wrong. If you do not want to sample data by camera, simply remove lines 104-118 in https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py
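
For readers who want to see what sampling without camera priority could look like, below is a rough sketch. It assumes that the camera-agnostic path simply draws the remaining instances from the other images of the same pid. It is not the code at the linked lines, which is not reproduced in this thread; sample_same_pid, num_instances and rng are hypothetical names.

```python
import random


def sample_same_pid(index, anchor_pos, num_instances, rng=random):
    """Pick `num_instances` dataset indexes for one pid, ignoring cameras.

    `index` holds the dataset indexes of all images of the pid and
    `anchor_pos` is the position of the anchor image inside `index`.
    """
    # Candidate positions exclude the anchor, so it cannot be drawn twice.
    candidates = [k for k in range(len(index)) if k != anchor_pos]
    if not candidates:
        return [index[anchor_pos]]  # pid with a single image

    extra = num_instances - 1
    if len(candidates) >= extra:
        picked = rng.sample(candidates, extra)  # without replacement
    else:
        picked = [rng.choice(candidates) for _ in range(extra)]  # with replacement
    return [index[anchor_pos]] + [index[k] for k in picked]


# Hypothetical pid with five images; every non-anchor image is a candidate.
batch = sample_same_pid([100, 101, 102, 103, 104], anchor_pos=2,
                        num_instances=4, rng=random.Random(0))
print(batch)
```

In this sketch the candidate list covers every image of the pid except the anchor, so the anchor image appears exactly once in the returned group.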

SunskyF commented on May 27, 2024

> > If the result of list(range(len(cams))) means 21 cameras, does select_cams = No_index(cams, i_cam) also mean camera indexes, since the two are similar?
> >
> > > You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.
> >
> > Thanks for pointing this out.
>
> Please look carefully at https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py#L17: the values in the returned list are the indexes i (indexes of images within the same pid), not the original values j (the camera IDs of the images within the same pid).

select_cams = list(range(len(cams))) also gives indexes of images within the same pid.

An example:

datasets: {'market1501': 'trainval', 'dukemtmcreid': 'trainval'}
unsup_dataset_indexes: [0,]

cams = pid_cam[pid_i]
index = pid_index[pid_i]
print("-" * 16)
print(cams)   # [5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7]
print(index)  # [15014, 15015, 15016, 15017, 15018, 15019, 15020, 15021, 15022, 15023, 15024, 15025, 15026]
select_cams = No_index(cams, i_cam)
print(len(cams), select_cams)  # 13 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
select_cams = list(range(len(cams)))
print(len(cams), select_cams)  # 13 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print("-" * 16)

[5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7]
[15014, 15015, 15016, 15017, 15018, 15019, 15020, 15021, 15022, 15023, 15024, 15025, 15026]
13 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
13 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

select_cams = No_index(cams, i_cam) gives positions into index, i.e. which element of [15014, 15015, 15016, 15017, 15018, 15019, 15020, 15021, 15022, 15023, 15024, 15025, 15026] to take. So list(range(len(cams))) also gives positions into index, because index and cams correspond one to one?
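
The question can be checked with the values printed above. The sketch below assumes i_cam = 7, which is consistent with the printed select_cams but is not stated in the thread, and re-implements the helper under the hypothetical name no_index.

```python
def no_index(values, excluded):
    # Positions of the entries that differ from `excluded`.
    return [i for i, j in enumerate(values) if j != excluded]


cams = [5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7]
index = list(range(15014, 15027))  # 15014 ... 15026, as printed above
assert len(cams) == len(index)     # one camera ID per image of this pid

i_cam = 7  # assumed anchor camera (not stated in the thread)
by_camera = no_index(cams, i_cam)        # positions 0 ... 9
all_positions = list(range(len(cams)))   # positions 0 ... 12

# Both lists hold positions into `index`; map them to dataset indexes.
print([index[k] for k in by_camera])      # 15014 ... 15023
print([index[k] for k in all_positions])  # 15014 ... 15026
```

Both lists are positions into index. The difference is that the second one also contains the anchor's own position and the images taken by the anchor's camera.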

yxgeee commented on May 27, 2024

What I meant is that the core problem that hurts the final performance is the following:
The number of cameras is fixed, e.g. 13 in your example.
So if you use list(range(len(cams))) as the candidate list, you can only sample images from the first len(cams) indexes. For example, if class A has 20 images, you can only sample the mini-batch from the first 13 images, and the other 7 images are never considered.

SunskyF commented on May 27, 2024

> What I meant is that the core problem that hurts the final performance is the following:
> The number of cameras is fixed, e.g. 13 in your example.
> So if you use list(range(len(cams))) as the candidate list, you can only sample images from the first len(cams) indexes. For example, if class A has 20 images, you can only sample the mini-batch from the first 13 images, and the other 7 images are never considered.

The point is that there are only 13 images in total, since index has length 13?

SunskyF commented on May 27, 2024

> What I meant is that the core problem that hurts the final performance is the following:
> The number of cameras is fixed, e.g. 13 in your example.
> So if you use list(range(len(cams))) as the candidate list, you can only sample images from the first len(cams) indexes. For example, if class A has 20 images, you can only sample the mini-batch from the first 13 images, and the other 7 images are never considered.

Also, I am using the Market and Duke datasets, so there cannot be 13 cameras.

yxgeee commented on May 27, 2024

OK, sorry for misunderstanding cams in list(range(len(cams))). I thought it was the overall list of camera IDs, but I now see that it is a variable in the code with the same length as the list of images.
So the conclusion is that sampling data according to camera IDs is crucial to the final performance.

SunskyF commented on May 27, 2024

Yes, it could also be because of this:

> You did not remove the index of i from your sample list, which means that you may use the image of index i twice in a mini-batch.

I am trying the suggested approach:

> just simply remove Line 104-118 in https://github.com/open-mmlab/OpenUnReID/blob/master/openunreid/data/samplers/distributed_identity_sampler.py

Thanks for your reply.

yxgeee commented on May 27, 2024

You are welcome to post your comparison results here when you finish training. I am also curious how much it affects the performance.
