
Comments (3)

wuyujack commented on July 19, 2024

I changed the PyTorch version to 0.4 and finally it works. It would be better if you could specify the versions of the packages you used in the README :)

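For reference, one lightweight way to make the expected environment explicit, besides pinning versions in the README, is a runtime check near the top of train.py. This is only a sketch: the version string is the one foamliu reports later in this thread, and the warning wording is illustrative.

import sys

import torch

EXPECTED_TORCH = "1.0.1.post2"  # version the repo author reports in this thread

if torch.__version__ != EXPECTED_TORCH:
    print("Warning: found torch %s, but this code was developed against %s"
          % (torch.__version__, EXPECTED_TORCH), file=sys.stderr)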

wuyujack commented on July 19, 2024

After training for one epoch, a new issue appears:

2019-07-07 00:07:49,035 INFO Epoch: [0][0/1078] Loss 0.3504 (0.3504)
2019-07-07 00:09:40,503 INFO Epoch: [0][100/1078] Loss 0.3535 (0.3414)
2019-07-07 00:11:28,066 INFO Epoch: [0][200/1078] Loss 0.3366 (0.3431)
2019-07-07 00:13:25,253 INFO Epoch: [0][300/1078] Loss 0.3389 (0.3427)
2019-07-07 00:15:20,560 INFO Epoch: [0][400/1078] Loss 0.2967 (0.3424)
2019-07-07 00:17:14,953 INFO Epoch: [0][500/1078] Loss 0.3370 (0.3422)
2019-07-07 00:19:08,667 INFO Epoch: [0][600/1078] Loss 0.3207 (0.3417)
2019-07-07 00:21:05,087 INFO Epoch: [0][700/1078] Loss 0.2947 (0.3408)
2019-07-07 00:23:00,023 INFO Epoch: [0][800/1078] Loss 0.3078 (0.3394)
2019-07-07 00:24:54,484 INFO Epoch: [0][900/1078] Loss 0.3516 (0.3383)
2019-07-07 00:26:46,944 INFO Epoch: [0][1000/1078] Loss 0.3108 (0.3370)
Current effective learning rate: 0.0001

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory

Traceback (most recent call last):
File "train.py", line 171, in <module>
main()
File "train.py", line 167, in main
train_net(args)
File "train.py", line 73, in train_net
logger=logger)
File "train.py", line 146, in valid
alpha_out = model(img) # [N, 320, 320]
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 114, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 124, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 65, in parallel_apply
raise output
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 41, in _worker
output = module(*input, **kwargs)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/mingfu/Deep-Image-Matting-v2/models.py", line 121, in forward
down2, indices_2, unpool_shape2 = self.down2(down1)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/mingfu/Deep-Image-Matting-v2/models.py", line 55, in forward
outputs = self.conv1(inputs)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/mingfu/Deep-Image-Matting-v2/models.py", line 43, in forward
outputs = self.cbr_unit(inputs)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 49, in forward
self.training or not self.track_running_stats, self.momentum, self.eps)
File "/home/mingfu/anaconda3/envs/pytorch-0.4/lib/python3.6/site-packages/torch/nn/functional.py", line 1194, in batch_norm
training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/generic/THCStorage.cu:58

Could you share how many GPUs you used for the default batch size, --batch-size 32?

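For reference, the traceback shows the out-of-memory error occurs inside valid() (train.py line 146), not in the training loop, so besides the GPU count it is worth checking that the validation forward pass runs under torch.no_grad(); without it, autograd keeps activation buffers for every layer and memory use is far higher than evaluation needs. A minimal sketch of that pattern follows, where valid_loader, compute_loss, and alpha_label are stand-in names rather than the repo's actual identifiers.

import torch

def valid(valid_loader, model, compute_loss):
    model.eval()  # BatchNorm uses running stats, Dropout is disabled
    total_loss = 0.0
    with torch.no_grad():  # no autograd buffers are kept, so GPU memory stays low
        for img, alpha_label in valid_loader:
            img = img.cuda()
            alpha_label = alpha_label.cuda()
            alpha_out = model(img)  # [N, 320, 320]
            total_loss += compute_loss(alpha_out, alpha_label).item()
    return total_loss / len(valid_loader)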

foamliu commented on July 19, 2024

Two GPUs. The PyTorch version is 1.0.1.post2.

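For context on how that maps onto memory: nn.DataParallel splits each batch across the visible GPUs, so --batch-size 32 on two GPUs is roughly 16 crops per card, while a single smaller GPU has to hold the whole batch. A minimal sketch of the wrapping, with a stand-in model rather than the repo's actual network:

import torch
import torch.nn as nn

# Stand-in model; the real network in this repo is the matting encoder-decoder.
model = nn.Sequential(nn.Conv2d(4, 8, kernel_size=3, padding=1), nn.ReLU())

# With two visible GPUs, DataParallel splits a batch of 32 into two chunks of 16,
# runs them in parallel, and gathers the outputs on the first device.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)

if torch.cuda.is_available():
    model = model.cuda()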
