Having problems with OOM (ssd_detectors, open issue)

mvoelk commented on July 30, 2024
having problems with OOM

from ssd_detectors.

Comments (5)

mvoelk commented on July 30, 2024

In DSOD_train.ipynb, the batch size is actually 6, and the gradients are accumulated with AdamAccumulate over 128//6 = 21 batches before a gradient update is performed. This results in a virtual batch size of 126, but the log is updated after each batch.
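
The arithmetic behind the virtual batch size can be checked with a short sketch. The function and variable names here are illustrative and are not taken from DSOD_train.ipynb; only the values 6 and 128 come from the comment above, and the assumption is that the accumulation count is always computed as 128 // batch_size.

```python
def virtual_batch(batch_size, target_batch_size=128):
    """Return (accumulation steps, effective batch size per weight update).

    Illustrative helper, assuming the number of accumulated batches is
    target_batch_size // batch_size, as in the 128//6 example above.
    """
    accum_steps = target_batch_size // batch_size
    return accum_steps, accum_steps * batch_size


# Per-step batch sizes mentioned in this thread.
for bs in (6, 4, 2):
    steps, effective = virtual_batch(bs)
    print(f"batch_size={bs}: {steps} accumulated batches, virtual batch size {effective}")
```

Under that assumption, lowering the per-step batch size reduces the memory needed per step while the virtual batch size stays close to 128, because more batches are accumulated before each update.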

Setting the batch size to 4 or even 2 should solve the issue. How large is your GPU memory?

mustardlove commented on July 30, 2024

Thank you so much for your kind help!
I changed the 512's batch size to 4 and the training code is running!

I'm using two Titan Xp GPUs, and the memory spec is as follows:
Memory Speed: 11.4 Gbps
Standard Memory Config: 12 GB GDDR5X
Memory Interface Width: 384-bit
Memory Bandwidth: 547.7 GB/s

Currently the execution is using only 1 GPU, and I don't know why.

I have one more question!

In your data_coco.py, there is a convert_to_voc function.
I'm only using the COCO dataset, so in DSOD_train I commented out the code related to the VOC dataset and did
gt_util_train = gt_util_coco.convert_to_voc()
gt_util_val = gt_util_coco_val.convert_to_voc()

Does this code make DSOD_train train on only 21 categories? I figured you only have 21 initial weights.

mvoelk commented on July 30, 2024

I've always used 1 GPU for training a model, but it should work with multiple GPUs as well. The documentation of Model.fit_generator() explains how to do this.

convert_to_voc in the COCO case returns a new GTUtility with COCO data, but restricted to the 20 VOC classes (21 including background), which leads to a model with 21 categories.
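
The idea of reducing COCO labels to the VOC classes can be illustrated with a rough sketch. This is not the code from data_coco.py: `COCO_TO_VOC`, `to_voc_labels`, and the plain-list label format are hypothetical and only show how COCO class names that have a VOC counterpart could be kept and renamed while all others are dropped.

```python
# Hypothetical sketch, NOT the implementation of convert_to_voc in data_coco.py.

# The 20 Pascal VOC object classes.
VOC_CLASSES = [
    'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
    'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
    'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor',
]

# COCO names that differ from their VOC counterparts.
COCO_TO_VOC = {
    'airplane': 'aeroplane',
    'dining table': 'diningtable',
    'motorcycle': 'motorbike',
    'potted plant': 'pottedplant',
    'couch': 'sofa',
    'tv': 'tvmonitor',
}


def to_voc_labels(coco_labels):
    """Rename COCO labels to VOC names and drop classes VOC does not have."""
    renamed = (COCO_TO_VOC.get(name, name) for name in coco_labels)
    return [name for name in renamed if name in VOC_CLASSES]


# Background plus the 20 object classes gives the 21 categories of the model.
classes = ['Background'] + VOC_CLASSES
print(len(classes))
print(to_voc_labels(['person', 'couch', 'zebra', 'tv']))
```

In this sketch, objects such as `zebra` simply disappear from the ground truth, which is why a model trained on the converted data predicts only the 21 VOC categories.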

The weights you mentioned are not trainable parameters... See #14 for more details.

mustardlove commented on July 30, 2024

Thank you for the reply!

I tried a few parameters in fit_generator() (use_multiprocessing=True, workers=2), but still only one GPU was in use.

I also tried using multi_gpu_model from keras.utils, but it failed with an error saying that _TfDeviceCaptureOp does not have the method _set_device_from_string.
I found that the class _TfDeviceCaptureOp in tensorflow/python/keras/backend.py does have _set_device_from_string, but the one in keras/backend/tensorflow_backend.py does not.

If anyone has solved this issue, please share your knowledge.
Thank you!

mvoelk commented on July 30, 2024

Search for keras.utils.multi_gpu_model. The arguments use_multiprocessing=True and workers=2 refer only to data loading, not to the number of GPUs used.
