Why do the test results vary greatly, by up to 10 points, when using the pre-trained model you provided without any code changes?

one23sunnyQQ commented on August 19, 2024
Why is it that, using the pre-trained model you provided and without any changes to the code, the test results vary greatly, fluctuating by as much as 10 points?

from urt.

Comments (9)

liulu112601 commented on August 19, 2024

> Hi, thank you for sharing your code. But why is it that, using the pre-trained model you provided and without any changes to the code, the test results vary greatly, fluctuating by as much as 10 points? May I ask how the test results reported in your paper can be taken as final when the performance fluctuates so much? Looking forward to your reply.

Hi there, thanks for your question. Can I ask which pretrained model you used? I don't provide a pretrained model for URT. Do you mean the pretrained backbones provided by SUR?


one23sunnyQQ commented on August 19, 2024

Yes, I used the pretrained backbones provided by SUR.


liulu112601 commented on August 19, 2024

> Yes, I used the pretrained backbones provided by SUR.

For SUR, please refer to their repo for more details: https://github.com/dvornikita/SUR

For URT, our reported results are the average of three runs, and I have not observed fluctuations as large as 10 percent.

Hope this answers your question.


xialeiliu commented on August 19, 2024

Do you have updated Traffic Sign results with the loader issue fixed?

Using your repo with the latest Meta-Dataset loader, we get about 50% on Traffic Sign.

Could you please help confirm that?


sudarshan1994 commented on August 19, 2024

Hi,
Thank you for your contribution. I tried training URT using the pre-trained weights of SUR, as instructed in your repo's README. However, my results differ from the reported ones; I have pasted them below. Any help in clarifying this discrepancy would be much appreciated.

| model \ data | sur-paper | sur-exp | urt | ok-06/10 |
|---|---|---|---|---|
| ilsvrc_2012 | 56.30 +- 0.00 | 56.30 +- 0.00 | 58.75 +- 0.00 | 2.45 |
| omniglot | 93.10 +- 0.00 | 93.10 +- 0.00 | 75.17 +- 0.00 | -17.93 |
| aircraft | 85.40 +- 0.00 | 85.40 +- 0.00 | 94.00 +- 0.00 | 8.60 |
| cu_birds | 71.40 +- 0.00 | 71.40 +- 0.00 | 75.00 +- 0.00 | 3.60 |
| dtd | 71.50 +- 0.00 | 71.50 +- 0.00 | 86.00 +- 0.00 | 14.50 |
| quickdraw | 81.30 +- 0.00 | 81.30 +- 0.00 | 77.02 +- 0.00 | -4.28 |
| fungi | 63.10 +- 0.00 | 63.10 +- 0.00 | 42.42 +- 0.00 | -20.68 |
| vgg_flower | 82.80 +- 0.00 | 82.80 +- 0.00 | 92.22 +- 0.00 | 9.42 |
| traffic_sign | 70.40 +- 0.00 | 70.40 +- 0.00 | 90.00 +- 0.00 | 19.60 |
| mscoco | 52.40 +- 0.00 | 52.40 +- 0.00 | 52.22 +- 0.00 | -0.18 |
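
Editor's note: the last column is not explained in the post. It appears to be the `urt` value minus the `sur-exp` value, and the `ok-06/10` label may be a count of the datasets where URT comes out ahead; both readings are inferences from the numbers, not something the poster states. A minimal Python sketch that checks this arithmetic, using only the values from the table above (the names `results` and `differences` are made up for illustration):

```python
# Accuracies copied from the table above as (SUR, URT) pairs per dataset.
results = {
    "ilsvrc_2012": (56.30, 58.75),
    "omniglot": (93.10, 75.17),
    "aircraft": (85.40, 94.00),
    "cu_birds": (71.40, 75.00),
    "dtd": (71.50, 86.00),
    "quickdraw": (81.30, 77.02),
    "fungi": (63.10, 42.42),
    "vgg_flower": (82.80, 92.22),
    "traffic_sign": (70.40, 90.00),
    "mscoco": (52.40, 52.22),
}


def differences(table):
    """Return URT minus SUR for each dataset, rounded to two decimals."""
    return {name: round(urt - sur, 2) for name, (sur, urt) in table.items()}


diffs = differences(results)

# Assumption: "ok" means URT scored higher than SUR on that dataset.
urt_ahead = sum(1 for d in diffs.values() if d > 0)

for name, d in diffs.items():
    print(f"{name:<14}{d:+7.2f}")
print(f"URT ahead on {urt_ahead} of {len(diffs)} datasets")
```

The per-dataset gaps are much larger in both directions than the differences reported in the paper, which is consistent with the poster's later suspicion that the data setup, rather than the model, was at fault.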


sudarshan1994 commented on August 19, 2024

I used the resnet features released in the repo and got the same results as in the paper for all the datasets except for traffic sign and MNIST. Thanks for releasing the features!


liulu112601 commented on August 19, 2024

> I used the resnet features released in the repo and got the same results as in the paper for all the datasets except for traffic sign and MNIST. Thanks for releasing the features!

Thanks for raising this issue. Just a kind reminder: because of the shuffling issue described in google-research/meta-dataset#54, the results were affected, especially for Traffic Sign. The updated results have been posted in the open review system.


sudarshan1994 commented on August 19, 2024

Yes, I am aware of the bug, but thanks for letting me know. I am still using the old, buggy dataloader just to see whether I can reproduce the results you got. I am trying to calibrate my Meta-Dataset setup against your code, and I think something is not right in my setup.

Would it be possible to release the TFRecords you used? I understand that is a lot of work, but it would be super helpful. Any help would be much appreciated.


sudarshan1994 commented on August 19, 2024

One more question: are the standard deviations reported in the paper calculated across the 3 different runs, or across the 600 test tasks within a single run? Thank you for your time!
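
Editor's note: the two options in this question can give very different numbers, which is why the distinction matters. The Python sketch below contrasts them on synthetic accuracies. It is not URT's evaluation code; the helper names, the simulated accuracy distribution, and the 95% interval formula (1.96 times the standard error, a common convention in few-shot reporting) are all assumptions made for illustration, and the thread does not say which convention the paper uses.

```python
import random
import statistics


def simulate_run(seed, num_tasks=600, mean_acc=0.70, task_spread=0.10):
    """Return synthetic per-task accuracies for one run (illustrative data only)."""
    rng = random.Random(seed)
    return [min(1.0, max(0.0, rng.gauss(mean_acc, task_spread)))
            for _ in range(num_tasks)]


def within_run_stats(accs):
    """Mean, per-task standard deviation, and 95% interval half-width for one run."""
    mean = statistics.fmean(accs)
    std = statistics.stdev(accs)
    ci95 = 1.96 * std / len(accs) ** 0.5  # assumed convention, see note above
    return mean, std, ci95


def across_run_stats(runs):
    """Mean and standard deviation of the per-run mean accuracies."""
    run_means = [statistics.fmean(r) for r in runs]
    return statistics.fmean(run_means), statistics.stdev(run_means)


runs = [simulate_run(seed) for seed in (0, 1, 2)]  # three hypothetical runs
mean_0, std_0, ci_0 = within_run_stats(runs[0])
overall_mean, run_std = across_run_stats(runs)

print(f"run 0: mean={mean_0:.4f}  per-task std={std_0:.4f}  95% CI=+-{ci_0:.4f}")
print(f"3 runs: mean of means={overall_mean:.4f}  std of means={run_std:.4f}")
```

With hundreds of tasks per run, the spread of the run means is typically far smaller than the per-task spread, so a reported "+-" value is only interpretable once it is clear which of the two was computed.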
