
Comments (4)

meyerjo commented on August 24, 2024

@HobbitLong Since I don't know if there is any timeline on this issue, and just to be sure I understand everything correctly: to replicate the NYU RGB-D experiments, the following steps would be required, wouldn't they?

  1. Initialize the different models per modality:

     feat_l, feat_ab, feat_depth = model(inputs)

  2. Create different contrast objects for the different pairs (e.g., in the core-view scheme, contrast_l_ab and contrast_l_depth), which would enable something like:

     out_l, out_ab = contrast_l_and_ab(feat_l, feat_ab, index)
     out_l_2, out_depth = contrast_l_and_depth(feat_l, feat_depth, index)

  3. Create a criterion object for each of the modalities (twice for the core modality "L"), which would lead to something like:

     l_loss_from_l_and_depth = criterion_l2(out_l_2)
     depth_loss_from_l_and_depth = criterion_depth(out_depth)

  4. Add the results from above to the general loss term (a combined sketch follows below).
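
Put together, a minimal self-contained sketch of these four steps could look like the following (PairContrast, the criterion helper, and the toy encoders are illustrative stand-ins only, not the repo's actual API, which uses memory-bank based contrast objects):

import torch
import torch.nn as nn
import torch.nn.functional as F

class PairContrast(nn.Module):
    """Illustrative stand-in for a contrast object: scores each sample
    against its paired view, using the rest of the batch as negatives."""
    def __init__(self, temperature=0.07):
        super().__init__()
        self.t = temperature

    def forward(self, feat_a, feat_b, index=None):
        a = F.normalize(feat_a, dim=1)
        b = F.normalize(feat_b, dim=1)
        logits = a @ b.t() / self.t   # [N, N], positives on the diagonal
        return logits, logits.t()     # one score matrix per direction

def criterion(logits):
    """N-way classification: the matching sample is the positive."""
    target = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, target)

# 1. one (toy) encoder per modality: L, ab, depth
dim = 128
encoder_l = nn.Sequential(nn.Flatten(), nn.Linear(1 * 32 * 32, dim))
encoder_ab = nn.Sequential(nn.Flatten(), nn.Linear(2 * 32 * 32, dim))
encoder_depth = nn.Sequential(nn.Flatten(), nn.Linear(1 * 32 * 32, dim))

# 2. one contrast object per view pair (core-view scheme anchored at L)
contrast_l_and_ab = PairContrast()
contrast_l_and_depth = PairContrast()

# fake batch: three views of the same 8 images
l = torch.randn(8, 1, 32, 32)
ab = torch.randn(8, 2, 32, 32)
depth = torch.randn(8, 1, 32, 32)
feat_l, feat_ab, feat_depth = encoder_l(l), encoder_ab(ab), encoder_depth(depth)

out_l, out_ab = contrast_l_and_ab(feat_l, feat_ab)
out_l_2, out_depth = contrast_l_and_depth(feat_l, feat_depth)

# 3./4. one criterion term per output, summed into the general loss
loss = criterion(out_l) + criterion(out_ab) + criterion(out_l_2) + criterion(out_depth)
loss.backward()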

Is this the correct way to go forward?
How exactly did you do the patch-based training on the "L" modality? Could you provide some hyper-parameters for that?

Thanks for the otherwise very nice repo.


HobbitLong commented on August 24, 2024

@meyerjo ,

To your question: we used a "patch-based contrastive objective" for this task; please refer to Section 3.5.2.

That being said, for each of the modalities we extract a global feature as well as local features. Then we contrast (1) the global feature from modality A against the local features from modality B, and (2) the global feature from modality B against the local features from modality A. We build this global-local contrastive loss in exactly the same way as the DIM paper. The code for this loss can be found here.
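
As a rough, self-contained illustration of that global-to-local contrast (the feature shapes and the in-batch softmax formulation below are assumptions for the sketch; the linked loss code is the reference):

import torch
import torch.nn.functional as F

def global_local_loss(global_a, local_b, temperature=0.07):
    """Contrast the global feature of modality A against every spatial
    location of modality B's local feature map.

    global_a: [N, C]        pooled (global) feature of modality A
    local_b:  [N, C, H, W]  spatial (local) feature map of modality B
    """
    n, c, h, w = local_b.shape
    g = F.normalize(global_a, dim=1)            # [N, C]
    l = F.normalize(local_b.flatten(2), dim=1)  # [N, C, S] with S = H*W
    # scores[i, j, s]: global feature of image i vs. location s of image j
    scores = torch.einsum('ic,jcs->ijs', g, l) / temperature
    # classify, for each local location, which global feature it belongs to
    logits = scores.permute(1, 2, 0).reshape(n * h * w, n)
    target = torch.arange(n).repeat_interleave(h * w)
    return F.cross_entropy(logits, target)

# directions (1) and (2) for one modality pair (A, B)
g_a, map_a = torch.randn(8, 128), torch.randn(8, 128, 7, 7)
g_b, map_b = torch.randn(8, 128), torch.randn(8, 128, 7, 7)
loss_ab = global_local_loss(g_a, map_b) + global_local_loss(g_b, map_a)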

The way of extracting the global and local features can be found here.

We then sum up those pair-wise losses as described in our paper.
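
For example, with the core-view scheme anchored at "L" and the global_local_loss sketch above (the feature names here are made up), the summed objective over the (L, ab) and (L, depth) pairs would be:

loss = (global_local_loss(g_l, map_ab) + global_local_loss(g_ab, map_l)
        + global_local_loss(g_l, map_depth) + global_local_loss(g_depth, map_l))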


jnyjxn commented on August 24, 2024

@HobbitLong thanks for this additional info.

Just to confirm: is the use of a patch-wise method here only motivated by the small dataset?

I am also trying to find an explanation of the global/local feature concept in your CMC paper; is this a deviation from the method presented there?

Many thanks!


HobbitLong commented on August 24, 2024

@jnyjxn ,

Yes, the NYU dataset has fewer than 2k images.

This is different from the ImageNet experiment, but it is described in the supplementary material of the paper.

