Comments (3)
Oh alright. You can try using Google Colab to train them then. You can fit FasterRCNN+ResNet-50 (and other models with similar param count) over there.
from pedestrian-detection.
Changing parameters (besides batch size) won't help your case that much if you're using pretrained models. The model faster_rcnn_inception_resnet_v2_atrous_coco_2018_01_28 is pretty large. I would suggest using a smaller model such as FasterRCNN_ResNet50 or SSD_MobileNet.
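Since batch size is the one parameter singled out above, here is a minimal sketch of lowering it in a pipeline config. It assumes the config is the usual text-protobuf file with a `batch_size: N` field that appears directly inside `train_config` before any nested block. The helper name `set_train_batch_size` and the sample config text are hypothetical, made up for illustration.

```python
import re

# Matches the batch_size field that sits directly inside train_config,
# before any nested block. This layout is an assumption about the config.
_TRAIN_BATCH = re.compile(
    r"(train_config\s*:?\s*\{[^{}]*?\bbatch_size\s*:\s*)\d+"
)


def set_train_batch_size(config_text, new_size):
    """Return config_text with the train_config batch_size replaced."""
    if new_size < 1:
        raise ValueError("batch size must be at least 1")
    updated, count = _TRAIN_BATCH.subn(
        lambda m: m.group(1) + str(new_size), config_text, count=1
    )
    if count == 0:
        raise ValueError("no batch_size found inside train_config")
    return updated


# Hypothetical config excerpt, not taken from the thread.
sample = """train_config: {
  batch_size: 24
  optimizer {
    rms_prop_optimizer: {}
  }
}
"""

smaller = set_train_batch_size(sample, 8)
print(smaller)
```

A smaller batch reduces activation memory roughly in proportion, at the cost of noisier gradient estimates, so the learning rate may need retuning.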
Same problem occurring again, even with SSD_MobileNet :/
I have a P400 GPU
2018-10-18 16:09:49.781181: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ____**********___********************************************************************xxxxx
2018-10-18 16:09:49.781198: W tensorflow/core/framework/op_kernel.cc:1275] OP_REQUIRES failed at conv_ops.cc:693 : Resource exhausted: OOM when allocating tensor with shape[24,128,75,75] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.ResourceExhaustedError'>, OOM when allocating tensor with shape[24,128,75,75] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6, FeatureExtractor/MobilenetV1/Conv2d_3_pointwise/weights/read/_3593)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[Node: Loss/Where_260/_6409 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_14144_Loss/Where_260", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Caused by op 'FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Conv2D', defined at:
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/train.py", line 165, in <module>
tf.app.run()
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/train.py", line 161, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/object_detection/trainer.py", line 228, in train
clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/deployment/model_deploy.py", line 193, in create_clones
outputs = model_fn(*args, **kwargs)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/object_detection/trainer.py", line 165, in _create_losses
prediction_dict = detection_model.predict(images)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/object_detection/meta_architectures/ssd_meta_arch.py", line 264, in predict
preprocessed_inputs)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/object_detection/models/ssd_mobilenet_v1_feature_extractor.py", line 106, in extract_features
scope=scope)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/nets/mobilenet_v1.py", line 258, in mobilenet_v1_base
scope=end_point)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
return func(*args, **current_args)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1154, in convolution2d
conv_dims=2)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
return func(*args, **current_args)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1057, in convolution
outputs = layer.apply(inputs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 805, in apply
return self.__call__(inputs, *args, **kwargs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 362, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 736, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/keras/layers/convolutional.py", line 186, in call
outputs = self._convolution_op(inputs, self.kernel)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 868, in __call__
return self.conv_op(inp, filter)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 520, in __call__
return self.call(inp, filter)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 204, in __call__
name=self.name)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 956, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
op_def=op_def)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
self._traceback = tf_stack.extract_stack()
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[24,128,75,75] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6, FeatureExtractor/MobilenetV1/Conv2d_3_pointwise/weights/read/_3593)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[Node: Loss/Where_260/_6409 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_14144_Loss/Where_260", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Traceback (most recent call last):
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1278, in _do_call
return fn(*args)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1350, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[24,128,75,75] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6, FeatureExtractor/MobilenetV1/Conv2d_3_pointwise/weights/read/_3593)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[Node: Loss/Where_260/_6409 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_14144_Loss/Where_260", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/train.py", line 165, in <module>
tf.app.run()
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/train.py", line 161, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/object_detection/trainer.py", line 332, in train
saver=saver)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 770, in train
sess, train_op, global_step, train_step_kwargs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 487, in train_step
run_metadata=run_metadata)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 877, in run
run_metadata_ptr)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1272, in _do_run
run_metadata)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[24,128,75,75] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6, FeatureExtractor/MobilenetV1/Conv2d_3_pointwise/weights/read/_3593)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[Node: Loss/Where_260/_6409 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_14144_Loss/Where_260", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Caused by op 'FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Conv2D', defined at:
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/train.py", line 165, in <module>
tf.app.run()
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/train.py", line 161, in main
worker_job_name, is_chief, FLAGS.train_dir)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/object_detection/trainer.py", line 228, in train
clones = model_deploy.create_clones(deploy_config, model_fn, [input_queue])
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/deployment/model_deploy.py", line 193, in create_clones
outputs = model_fn(*args, **kwargs)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/object_detection/trainer.py", line 165, in _create_losses
prediction_dict = detection_model.predict(images)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/object_detection/meta_architectures/ssd_meta_arch.py", line 264, in predict
preprocessed_inputs)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/object_detection/models/ssd_mobilenet_v1_feature_extractor.py", line 106, in extract_features
scope=scope)
File "/home/mounir/PycharmProjects/Pedestrian-detection-DL/nets/mobilenet_v1.py", line 258, in mobilenet_v1_base
scope=end_point)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
return func(*args, **current_args)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1154, in convolution2d
conv_dims=2)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 183, in func_with_args
return func(*args, **current_args)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1057, in convolution
outputs = layer.apply(inputs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 805, in apply
return self.__call__(inputs, *args, **kwargs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 362, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 736, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/keras/layers/convolutional.py", line 186, in call
outputs = self._convolution_op(inputs, self.kernel)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 868, in __call__
return self.conv_op(inp, filter)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 520, in __call__
return self.call(inp, filter)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/ops/nn_ops.py", line 204, in __call__
name=self.name)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 956, in conv2d
data_format=data_format, dilations=dilations, name=name)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3155, in create_op
op_def=op_def)
File "/home/mounir/anaconda3/envs/tflow-gpu/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1717, in __init__
self._traceback = tf_stack.extract_stack()
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[24,128,75,75] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[Node: FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_pointwise/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](FeatureExtractor/MobilenetV1/MobilenetV1/Conv2d_3_depthwise/Relu6, FeatureExtractor/MobilenetV1/Conv2d_3_pointwise/weights/read/_3593)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[[Node: Loss/Where_260/_6409 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_14144_Loss/Where_260", tensor_type=DT_INT64, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
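To put the failing allocation in perspective, here is a rough sketch that reads the tensor shape out of the OOM message above and estimates its size. It assumes "type float" means 4-byte float32 and that the leading dimension (24) of the NCHW shape is the batch size. The helper name `oom_tensor_bytes` is made up for illustration.

```python
import re
from math import prod

# Bytes per element for the dtype names that may appear in the message
# (assumed mapping; "float" is float32 in TensorFlow's log output).
BYTES_PER_ELEMENT = {"half": 2, "float": 4, "double": 8}


def oom_tensor_bytes(message):
    """Extract (shape, size_in_bytes) from a TensorFlow OOM message."""
    match = re.search(r"shape\[([\d,]+)\] and type (\w+)", message)
    if match is None:
        raise ValueError("no tensor shape found in message")
    shape = [int(dim) for dim in match.group(1).split(",")]
    return shape, prod(shape) * BYTES_PER_ELEMENT[match.group(2)]


msg = (
    "OOM when allocating tensor with shape[24,128,75,75] and type float "
    "on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc"
)
shape, nbytes = oom_tensor_bytes(msg)
print(shape, nbytes, round(nbytes / 2**20, 1))  # [24, 128, 75, 75] 69120000 65.9

# The same activation at smaller batch sizes scales linearly with the batch.
for batch in (24, 8, 4):
    print(batch, nbytes * batch // shape[0])
```

This single activation is only about 66 MiB, so the failure is most likely the sum of many such tensors exceeding the card's memory rather than this one being unusually large. Lowering the batch size shrinks each of them proportionally.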