
Comments (7)

wangg12 avatar wangg12 commented on May 18, 2024

@Foristkirito Could you provide a small snippet to reproduce your bug?

from examples.

Foristkirito avatar Foristkirito commented on May 18, 2024

@wangg12 Of course. I just used the code I modified, with the command `python main.py -a alexnet -j 6 --resume ./alexnet_cp --epochs 90 -b 256 ./data`. I think the problem is that the loss grows large enough to overflow.
I also ran with `--pretrained`; the loss is fine, but after 90 epochs the performance barely changed, as shown below:

 * Prec@1 10.000 Prec@5 50.000
Epoch: [89][0/196]      Time 1.455 (1.455)      Data 1.072 (1.072)      Loss 2.3118 (2.3118)    Prec@1 9.375 (9.375)    Prec@5 49.219 (49.219)
Epoch: [89][10/196]     Time 0.406 (0.507)      Data 0.001 (0.100)      Loss 2.3118 (2.3118)    Prec@1 10.547 (10.085)  Prec@5 51.172 (50.604)
Epoch: [89][20/196]     Time 0.408 (0.460)      Data 0.001 (0.053)      Loss 2.3118 (2.3118)    Prec@1 8.594 (10.212)   Prec@5 51.172 (50.930)
Epoch: [89][30/196]     Time 0.401 (0.444)      Data 0.001 (0.036)      Loss 2.3119 (2.3118)    Prec@1 8.594 (9.929)    Prec@5 44.922 (50.441)
Epoch: [89][40/196]     Time 0.004 (0.436)      Data 0.001 (0.027)      Loss 2.3118 (2.3118)    Prec@1 10.156 (9.861)   Prec@5 55.859 (50.210)
Epoch: [89][50/196]     Time 0.410 (0.431)      Data 0.001 (0.022)      Loss 2.3119 (2.3118)    Prec@1 10.547 (9.934)   Prec@5 49.609 (50.444)
Epoch: [89][60/196]     Time 0.415 (0.428)      Data 0.001 (0.019)      Loss 2.3117 (2.3118)    Prec@1 11.719 (10.028)  Prec@5 55.078 (50.506)
Epoch: [89][70/196]     Time 0.407 (0.426)      Data 0.001 (0.016)      Loss 2.3119 (2.3118)    Prec@1 8.594 (10.030)   Prec@5 49.219 (50.539)
Epoch: [89][80/196]     Time 0.393 (0.428)      Data 0.001 (0.014)      Loss 2.3118 (2.3118)    Prec@1 6.641 (9.968)    Prec@5 50.781 (50.236)
Epoch: [89][90/196]     Time 0.392 (0.426)      Data 0.001 (0.013)      Loss 2.3119 (2.3118)    Prec@1 8.984 (10.045)   Prec@5 49.219 (50.206)
Epoch: [89][100/196]    Time 0.591 (0.425)      Data 0.001 (0.011)      Loss 2.3118 (2.3118)    Prec@1 10.156 (9.998)   Prec@5 50.000 (50.085)
Epoch: [89][110/196]    Time 0.399 (0.423)      Data 0.001 (0.011)      Loss 2.3118 (2.3118)    Prec@1 13.672 (10.015)  Prec@5 51.953 (50.070)
Epoch: [89][120/196]    Time 0.395 (0.422)      Data 0.001 (0.010)      Loss 2.3119 (2.3118)    Prec@1 8.203 (9.985)    Prec@5 48.438 (49.913)
Epoch: [89][130/196]    Time 0.389 (0.422)      Data 0.001 (0.009)      Loss 2.3118 (2.3118)    Prec@1 10.938 (9.951)   Prec@5 50.781 (49.860)
Epoch: [89][140/196]    Time 0.404 (0.421)      Data 0.001 (0.008)      Loss 2.3119 (2.3118)    Prec@1 8.984 (9.912)    Prec@5 50.000 (49.986)
Epoch: [89][150/196]    Time 0.397 (0.421)      Data 0.001 (0.008)      Loss 2.3119 (2.3118)    Prec@1 7.422 (9.910)    Prec@5 49.609 (50.000)
Epoch: [89][160/196]    Time 0.408 (0.419)      Data 0.001 (0.008)      Loss 2.3119 (2.3118)    Prec@1 9.766 (9.899)    Prec@5 47.266 (49.939)
Epoch: [89][170/196]    Time 0.399 (0.419)      Data 0.001 (0.007)      Loss 2.3119 (2.3118)    Prec@1 12.109 (9.875)   Prec@5 48.438 (49.836)
Epoch: [89][180/196]    Time 0.405 (0.419)      Data 0.001 (0.007)      Loss 2.3119 (2.3118)    Prec@1 11.328 (9.874)   Prec@5 48.828 (49.767)
Epoch: [89][190/196]    Time 0.398 (0.419)      Data 0.000 (0.006)      Loss 2.3119 (2.3118)    Prec@1 8.203 (9.835)    Prec@5 49.219 (49.691)
Test: [0/40]    Time 0.845 (0.845)      Loss 2.3119 (2.3119)    Prec@1 8.984 (8.984)    Prec@5 46.875 (46.875)
Test: [10/40]   Time 0.155 (0.264)      Loss 2.3118 (2.3118)    Prec@1 10.547 (10.298)  Prec@5 54.688 (49.503)
Test: [20/40]   Time 0.168 (0.214)      Loss 2.3118 (2.3118)    Prec@1 10.938 (10.305)  Prec@5 53.125 (50.186)
Test: [30/40]   Time 0.338 (0.203)      Loss 2.3117 (2.3118)    Prec@1 9.766 (10.131)   Prec@5 57.812 (50.101)
 * Prec@1 10.000 Prec@5 50.000
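One quick way to check an overflow suspicion like this is to verify the loss stays finite every iteration; a minimal sketch (the `loss` tensor here is a stand-in for whatever `criterion(output, target)` returns in the training loop):

```python
import torch

# Stand-in for a loss value produced during training; in main.py it
# would come from criterion(output, target).
loss = torch.tensor(float('inf'))

# torch.isfinite is False for inf and NaN, so this catches an
# overflowed loss before its gradients poison the weights.
if not torch.isfinite(loss):
    print(f"Loss overflowed: {loss.item()}")
```

If the check ever fires, skipping the optimizer step (or lowering the learning rate) is usually safer than letting inf/NaN gradients propagate.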


Foristkirito avatar Foristkirito commented on May 18, 2024

@wangg12 I figured it out. I made a mistake. Problem solved, thank you.


wangg12 avatar wangg12 commented on May 18, 2024

@Foristkirito What is the problem?


Foristkirito avatar Foristkirito commented on May 18, 2024

@wangg12 The problem is the directories. You need to keep the project's directory structure and put it directly in your home directory. I do not understand why, but it does not work if you put imagenet in your home directory. However, resnet always works fine; I will spend some time figuring out the real problem.


wangg12 avatar wangg12 commented on May 18, 2024

@Foristkirito Do you get better results with alexnet on cifar-10 now?

Also, IMO alexnet is not suitable for cifar-10. The architecture was designed for much bigger images, not 32x32 cifar-10 images.
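To see why, it helps to trace the spatial size of a 32x32 input through alexnet's first conv/pool layers; a rough sketch using the standard output-size formula (layer parameters taken from the usual torchvision alexnet definition):

```python
def out_size(n, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool layer: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

n = 32  # cifar-10 input resolution
n = out_size(n, kernel=11, stride=4, padding=2)  # conv1   -> 7
n = out_size(n, kernel=3, stride=2)              # maxpool -> 3
n = out_size(n, kernel=5, padding=2)             # conv2   -> 3
n = out_size(n, kernel=3, stride=2)              # maxpool -> 1
print(n)  # prints 1
```

The feature map is already 1x1 halfway through the network, so the remaining layers have almost no spatial information left to work with; on 224x224 ImageNet inputs the same stack still leaves a usable map.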

Besides, if you do not use pre-trained weights, you should be careful with the learning rate and the weight initialization (the random behavior can be fixed with `torch.manual_seed(seed)`).
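For reference, calling `torch.manual_seed` before building the model makes the random initialization reproducible across runs; a minimal sketch (the `nn.Linear` layer is just an illustration, not the actual alexnet code):

```python
import torch
import torch.nn as nn

def make_layer(seed):
    # Fixing the global RNG seed makes the weight init deterministic.
    torch.manual_seed(seed)
    return nn.Linear(8, 4)

a = make_layer(0)
b = make_layer(0)
# Two layers built with the same seed start from identical weights.
print(torch.equal(a.weight, b.weight))  # True
```

With the seed fixed, differences between runs come only from the hyperparameters you actually changed, which makes learning-rate debugging much easier.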


Foristkirito avatar Foristkirito commented on May 18, 2024

@wangg12 The accuracy of alexnet is still low. That sounds like a good solution; I will try it.

