Hi, I see that TensorRT supports ResNet for classification tasks.

TensorRT supported detection networks, about NVIDIA-AI-IOT/tf_trt_models

Comments (30)

bezero commented on May 31, 2024

I was able to convert "faster_rcnn_resnet101_coco"; however, to use it you must modify the config file so that it uses fixed-size input images. Modify lines 4-8:
keep_aspect_ratio_resizer { min_dimension: 600 max_dimension: 1024 } ==> fixed_shape_resizer { height: 600 width: 1024 }
You can use any dimensions you like.
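
For anyone who prefers to script this change, here is a minimal sketch. It assumes the pipeline config is plain text and that the keep_aspect_ratio_resizer block contains no nested braces. The function name use_fixed_shape_resizer is made up for illustration and is not part of tf_trt_models.

```python
import re


def use_fixed_shape_resizer(config_text, height=600, width=1024):
    """Replace a keep_aspect_ratio_resizer block with a fixed_shape_resizer."""
    # [^{}]* also spans newlines, so multi-line blocks are matched too.
    pattern = re.compile(r"keep_aspect_ratio_resizer\s*\{[^{}]*\}")
    replacement = "fixed_shape_resizer { height: %d width: %d }" % (height, width)
    return pattern.sub(replacement, config_text)


# Hypothetical excerpt of a pipeline config, used only to exercise the helper.
sample = """
image_resizer {
  keep_aspect_ratio_resizer {
    min_dimension: 600
    max_dimension: 1024
  }
}
"""

patched = use_fixed_shape_resizer(sample)
print(patched)
```

In practice you would read the .config file into a string, pass it through the helper, and write it back before building the graph.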

from tf_trt_models.

commented on May 31, 2024

It's possible that it would work, but we haven't tested it. Currently, the build_detection_graph method that we provide in this repository is tested to work only against the listed models.

That said, it is possible that for similar meta-architectures (SSD), configurations with different feature extractors would work. The feature extractors registered in the tensorflow/models repository are listed here:

https://github.com/tensorflow/models/blob/master/research/object_detection/builders/model_builder.py#L47

You would need to update the object detection configuration proto to select the desired feature extractor.

Theoretically, the TensorRT integration in TensorFlow should support any model, as the operations that are not supported by TensorRT are run in native TensorFlow. That said, there may be caveats.

Please let me know if you run into issues.

HilmiK commented on May 31, 2024

Thank you for answer. I will report here after I try faster-rcnn with different backbones.

jkjung-avt commented on May 31, 2024

@jaybdub-nv, @bezero, when I tried to convert "faster_rcnn_resnet50_coco" with TF-TRT on TX2, I ran into a few other issues. I wonder how you got around them. Any help or suggestions would be highly appreciated.

1. TX2 ran out of memory, especially when I tried to load an image and call tf_sess.run(...), and the program just got killed.

2. Same issue as #11:

File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tensorrt/python/trt_convert.py", line 115, in create_inference_graph
int(msg[0]))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid graph: Frame ids for node BatchMultiClassNonMaxSuppression/map/while/Reshape_1 does not match frame ids for it's fanout.

3. The following error, which seems to be solved by bezero's fix shown above:

<log time omitted>: E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger Parameter check failed at: ../builder/Network.cpp::addInput::377, condition: isValidDims(dims)

4. The following error, which I think occurs because the second-stage classifier needs to handle an input tensor with a larger batch size (300):

<log time omitted>: F tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:82] input tensor batch larger than max_batch_size: 1

bezero commented on May 31, 2024

@jkjung-avt I also had memory issues. I solved them by closing my browser, since it uses a lot of memory (if possible, close all idle applications that are using memory). If I am not wrong, the scripts in this repo work with max_batch_size=1, so try to work with single images. For batch sizes >1, TX2 memory might not be sufficient.

jkjung-avt commented on May 31, 2024

@bezero Thanks for the reply. But closing the web browser and all other applications on TX2 did not solve the OOM issue for me. I also used single-image input for faster_rcnn_resnet50. I had to reduce the number of proposals/detections in the model config to some very small numbers to get around that...
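
A rough sketch of how such a reduction could be scripted is below. The field names (first_stage_max_proposals, max_detections_per_class, max_total_detections) are the ones used in the stock Faster R-CNN sample configs as far as I know; the value 32 is only a placeholder, since the exact numbers are not given here, and set_int_field is a hypothetical helper.

```python
import re


def set_int_field(config_text, field, value):
    """Set every `field: <int>` occurrence in the config text to `value`."""
    pattern = re.compile(r"\b(%s\s*:\s*)\d+" % re.escape(field))
    # A function replacement avoids ambiguity between the group reference
    # and the digits of the new value.
    new_text, count = pattern.subn(lambda m: m.group(1) + str(value), config_text)
    if count == 0:
        raise ValueError("field not found: %s" % field)
    return new_text


# Hypothetical excerpt of a Faster R-CNN pipeline config.
sample = """
first_stage_max_proposals: 300
second_stage_post_processing {
  batch_non_max_suppression {
    max_detections_per_class: 100
    max_total_detections: 300
  }
}
"""

smaller = sample
for name, value in [("first_stage_max_proposals", 32),   # placeholder value
                    ("max_detections_per_class", 32),    # placeholder value
                    ("max_total_detections", 32)]:       # placeholder value
    smaller = set_int_field(smaller, name, value)
print(smaller)
```

Fewer proposals mean less work for the second stage, at the cost of possibly missing objects, so the numbers need tuning per dataset.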

tevisgehr commented on May 31, 2024

I am able to run faster_rcnn_resnet50_coco, which is included in the list of supported models, but I don't seem to be getting any speedup, which makes me skeptical that any subgraphs are being optimized at all.

In order to get it to run, I used the following command to build the graph (along with the other code included in the Jupyter example):

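import tensorflow.contrib.tensorrt as trt  # TF 1.x contrib module, as in the Jupyter example

# frozen_graph and output_names come from the earlier cells of the Jupyter example
trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,  # 32 MiB
    precision_mode='FP16',
    minimum_segment_size=3,
    maximum_cached_engines=3
)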

I am wondering if anyone has had success in speeding up any form of Faster R-CNN, and if so, could you share some insight into what settings need to be adjusted or how to go about getting the graph conversions to work correctly?
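
One way to check whether any subgraphs were actually converted is to count the TRTEngineOp nodes in the returned graph. The sketch below assumes only that the graph exposes a .node list whose items have an .op string, as a GraphDef does; summarize_ops is a hypothetical helper and the fake graph is a stand-in so the snippet runs without TensorFlow.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_ops(graph_def):
    """Return (number of TRTEngineOp nodes, total node count) for a GraphDef-like object."""
    counts = Counter(node.op for node in graph_def.node)
    return counts["TRTEngineOp"], sum(counts.values())


# Stand-in for a real GraphDef; with the code above you would pass trt_graph instead.
fake_graph = SimpleNamespace(node=[
    SimpleNamespace(op="Placeholder"),
    SimpleNamespace(op="TRTEngineOp"),
    SimpleNamespace(op="TRTEngineOp"),
    SimpleNamespace(op="NonMaxSuppressionV2"),
])

trt_nodes, total_nodes = summarize_ops(fake_graph)
print("TRTEngineOp nodes: %d of %d" % (trt_nodes, total_nodes))
```

If the count is zero, nothing was converted and no speedup should be expected. A non-zero count only shows that conversion happened, not that the converted segments dominate the runtime.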

jkjung-avt commented on May 31, 2024

I shared my test results on Jetson TX2 developer forum before: https://devtalk.nvidia.com/default/topic/1037019/jetson-tx2/tensorflow-object-detection-and-image-classification-accelerated-for-nvidia-jetson/post/5288250/#5288250

Note that I had to reduce the number of region proposals in the Faster R-CNN models; otherwise they run too slowly. All the code I used for testing can be found in my GitHub repository: https://github.com/jkjung-avt/tf_trt_models

inders commented on May 31, 2024

I am facing the following error while trying to run the Faster R-CNN model with TensorRT. I have tried changing the resizer as per @bezero's comment above, but it still doesn't help. Any pointers would be highly appreciated.

.cc:724] Can't determine the device, constructing an allocator at device 0
2018-12-13 09:49:37.182205: E tensorflow/contrib/tensorrt/log/trt_logger.cc:38] DefaultLogger Parameter check failed at: Network.cpp::addInput::281, condition: isIndexedCHW(dims) && volume(dims) < MAX_TENSOR_SIZE
2018-12-13 09:49:37.182317: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:857] Engine creation for segment 0, composed of 3 nodes failed: Invalid argument: Failed to create Input layer tensor InputPH_0 rank=-2. Skipping...
2018-12-13 09:49:37.182353: W tensorflow/contrib/tensorrt/convert/convert_graph.cc:724] Can't determine the device, constructing an allocator at device 0

atyshka commented on May 31, 2024

@bezero @jkjung-avt Did you run your Faster R-CNN models in the Jupyter notebook? The notebook code works fine for me with the SSD models, but if I try the Faster R-CNN models I get "Engine buffer is full. buffer limit=1, current entries=1, requested batch=100". I'm using the exact notebook code with three modifications:
1: MODEL = 'faster_rcnn_resnet50_coco'
2: removed score_threshold=0.3 from build_detection_graph(...
3: changed to fixed_shape_resizer { height: 600 width: 1024 } in the config file

Can either of you reproduce this issue? I'm using Tensorflow 1.12 and TensorRT 5.0
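
The numbers in that message are worth collecting over a run, since they show which batch sizes the engines are being asked for. A minimal sketch follows, assuming the message format is exactly as quoted above; parse_buffer_full is a hypothetical helper name.

```python
import re

# Pattern built from the message text quoted in this comment.
BUFFER_FULL = re.compile(
    r"buffer limit=(\d+), current entries=(\d+), requested batch=(\d+)")


def parse_buffer_full(line):
    """Extract the three counters from an 'Engine buffer is full' line, or return None."""
    match = BUFFER_FULL.search(line)
    if match is None:
        return None
    limit, entries, batch = (int(group) for group in match.groups())
    return {"buffer_limit": limit, "current_entries": entries, "requested_batch": batch}


line = "Engine buffer is full. buffer limit=1, current entries=1, requested batch=100"
info = parse_buffer_full(line)
print(info)
```

Feeding the whole log through this helper and collecting the distinct requested_batch values gives a quick picture of what the graph asks for at runtime.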

jkjung-avt commented on May 31, 2024

I haven't managed to get the faster_rcnn_resnet50 model to work with tensorflow 1.12.0 and TensorRT. Previously I got it to work using tensorflow 1.8.0, with some tweaks. Details are all in my GitHub repository: https://github.com/jkjung-avt/tf_trt_models/blob/master/data/faster_rcnn_resnet50_egohands.config

CharlieXie commented on May 31, 2024

Hi @jkjung-avt, I used your config file https://github.com/jkjung-avt/tf_trt_models/blob/master/data/faster_rcnn_inception_v2_egohands.config to train a model on my dataset (13 classes) and then tried to convert it to TRT, but I still got the error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid graph: Frame ids for node BatchMultiClassNonMaxSuppression/map/while/Reshape_1 does not match frame ids for it's fanout.
How did you get rid of this error?
Another issue I'm facing is that my trained FRCNN-inception-v2 checkpoint file (103.5MB) is about twice the size of the fine-tuned checkpoint file (53.3MB). Do you have any idea why?
Thanks in advance.

jkjung-avt commented on May 31, 2024

@CharlieXie, try setting 'remove_assert' to False. I recall that's how I got rid of the problem previously.

https://github.com/NVIDIA-AI-IOT/tf_trt_models/blob/master/tf_trt_models/detection.py#L108

xiaowenhe commented on May 31, 2024

@jkjung-avt, I am using your tf_trt_models. When I run python3 camera_tf_trt.py --image --filename=xxx, --model=faster_rcnn_resnet50_coco --build, I get an error like this:

I do not know how to deal with it. Can you help me?

And when I run python3 camera_tf_trt.py --image --filename=xxx, --model=faster_rcnn_resnet50_coco without --build, there is no error, but the detection result is not ideal, like this:

jkjung-avt commented on May 31, 2024

@xiaowenhe The segmentation fault could be caused by an "out of memory" issue. You could use 'tegrastats' to monitor JTX2 memory usage and try to confirm whether that's the case.

As to the bad detection result by TF-TRT optimized faster_rcnn_resnet50_coco model, I'm not exactly sure what the problem is. There could be many causes, e.g.

mismatching tensorflow versions between training and inferencing,
TF-TRT does not optimize certain operations in the model correctly,
...

xiaowenhe commented on May 31, 2024

@jkjung-avt, thank you! But I am not using a TX2 yet. I want to test it on another machine first and then move to the TX2. The GPU usage looks like this:

From the picture, only 5285M is used!

hoangtuanvu commented on May 31, 2024

I cannot get better performance from the TensorRT-optimized graph. Can someone tell me why? After optimizing the frozen graph, I get a bigger model.

bezero commented on May 31, 2024

@hoangtuanvu What do you mean by not being able to optimize? TensorRT optimizes your frozen model for inference, which does not mean that you get a smaller model. Did you compare inference time before and after TensorRT optimization?

hoangtuanvu commented on May 31, 2024

@bezero I used TensorRT to optimize the frozen graph, but I did not get better speed for inference. I am currently working on person detection.

TomKomar commented on May 31, 2024

I'm having a similar situation to @atyshka - no improvement whatsoever. Only difference after generating an 'optimized' graph is that with every frame I'm getting a warning "Engine buffer is full". Has anyone figured out how to deal with this?
Xavier TF1.12+TRT5

zhucheng725 commented on May 31, 2024

I ran the detection demo using ssd_mobilenet_v1_coco.pb. With FP16 in trt.create_inference_graph(), the benchmark shows about 0.041013 seconds; with INT8, it shows about 0.383557 seconds. Why is INT8 slower than FP16?
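
One thing worth ruling out when comparing such timings is one-time cost: the logs elsewhere in this thread show TensorRT engines being built during the first executions, so a fair comparison should exclude warm-up runs. Below is a minimal timing sketch under that assumption. The benchmark helper and its warmup/runs defaults are hypothetical, and the lambda is a stand-in for the real inference call.

```python
import time


def benchmark(fn, warmup=5, runs=20):
    """Return the mean wall-clock seconds per call of fn, excluding warm-up calls."""
    for _ in range(warmup):
        fn()  # discarded: may include one-time setup cost
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs


# Stand-in workload so the sketch runs anywhere; in practice fn would wrap
# the tf_sess.run(...) call for a single image.
calls = []
mean_seconds = benchmark(lambda: calls.append(1), warmup=2, runs=3)
print("mean seconds per call: %.6f" % mean_seconds)

# Ratio of the two timings reported above.
slowdown = 0.383557 / 0.041013
print("INT8 / FP16 time ratio: %.2f" % slowdown)
```

If the gap of roughly 9x remains after warm-up is excluded, the cause lies elsewhere and the timing method can be ruled out.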

ashispapu commented on May 31, 2024

@hoangtuanvu I am facing an issue while running inference on a TensorFlow object detection model (242MB). I have TF 1.13 and TensorRT 5.1.2. Below are the log details.
2019-06-03 15:05:21.432164: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:1030] TensorRT node resnet_v1_101/conv1/TRTEngineOp_123 added for segment 123 consisting of 2 nodes succeeded.
2019-06-03 15:05:21.432437: I tensorflow/contrib/tensorrt/convert/convert_graph.cc:1030] TensorRT node rpn_proposals/softmax/TRTEngineOp_124 added for segment 124 consisting of 3 nodes succeeded.
2019-06-03 15:05:22.384389: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:616] Optimization results for grappler item: tf_graph
2019-06-03 15:05:22.384595: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:618] constant folding: Graph size after: 2014 nodes (-599), 2353 edges (-637), time = 4514.9751ms.
2019-06-03 15:05:22.384653: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:618] layout: Graph size after: 2063 nodes (49), 2422 edges (69), time = 462.632ms.
2019-06-03 15:05:22.384702: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:618] constant folding: Graph size after: 2059 nodes (-4), 2422 edges (0), time = 908.786ms.
2019-06-03 15:05:22.384748: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:618] TensorRTOptimizer: Graph size after: 1653 nodes (-406), 2000 edges (-422), time = 57351.3477ms.
time(s) (trt_conversion): 72.7292
graph_size(MB)(native_tf): 230.8
graph_size(MB)(trt): 493.0
num_nodes(trt_only): 125
2019-06-03 15:05:49.531006: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for TRTEngineOp_0 with batch size 720
2019-06-03 15:05:49.543881: W tensorflow/contrib/tensorrt/log/trt_logger.cc:34] DefaultLogger Tensor DataType is determined at build time for tensors not marked as input or output.
2019-06-03 15:05:55.369363: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for resnet_v1_101/TRTEngineOp_23 with batch size 1
2019-06-03 15:05:55.837386: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for resnet_v1_101/conv1/TRTEngineOp_123 with batch size 1
2019-06-03 15:05:57.403776: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for resnet_v1_101/TRTEngineOp_24 with batch size 1
2019-06-03 15:06:10.529445: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for resnet_v1_101/block1/unit_1/bottleneck_v1/TRTEngineOp_25 with batch size 1
2019-06-03 15:06:13.628441: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for resnet_v1_101/block1/unit_1/bottleneck_v1/TRTEngineOp_26 with batch size 1
2019-06-03 15:06:20.675574: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for resnet_v1_101/block1/TRTEngineOp_27 with batch size 1
2019-06-03 15:06:25.591558: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for resnet_v1_101/block1/unit_2/bottleneck_v1/TRTEngineOp_28 with batch size 1
2019-06-03 15:06:28.377901: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for resnet_v1_101/block1/unit_2/bottleneck_v1/TRTEngineOp_29 with batch size 1
2019-06-03 15:06:35.168358: I tensorflow/contrib/tensorrt/kernels/trt_engine_op.cc:496] Building a new TensorRT engine for resnet_v1_101/block1/TRTEngineOp_31 with batch size 1
Killed

========================================================================
When I run dmesg --follow to check the process details, I see:

[10663.666441] [12648] 1000 12648 6243614 1507814 3582 12 0 0 python3
[10663.666444] Out of memory: Kill process 12648 (python3) score 751 or sacrifice child
[10663.674368] Killed process 12648 (python3) total-vm:24974456kB, anon-rss:5768628kB, file-rss:262628kB, shmem-rss:0kB
[10664.011176] oom_reaper: reaped process 12648 (python3), now anon-rss:0kB, file-rss:262708kB, shmem-rss:0kB

Any suggestion or feedback is appreciated.
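
To read the dmesg output above: anon-rss is the anonymous resident memory the process held when it was killed. Here is a small sketch that converts those kB figures to GiB, assuming the line format shown above; parse_oom_kill is a hypothetical helper name.

```python
import re


def parse_oom_kill(line):
    """Parse a 'Killed process' dmesg line into pid, name, and memory figures in GiB."""
    head = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if head is None:
        return None
    result = {"pid": int(head.group(1)), "name": head.group(2)}
    # Each figure looks like "anon-rss:5768628kB"; convert kB to GiB.
    for key, kb in re.findall(r"([\w-]+):(\d+)kB", line):
        result[key + "_gib"] = int(kb) / (1024.0 * 1024.0)
    return result


line = ("[10663.674368] Killed process 12648 (python3) total-vm:24974456kB, "
        "anon-rss:5768628kB, file-rss:262628kB, shmem-rss:0kB")
info = parse_oom_kill(line)
print("%s (pid %d) anon-rss: %.1f GiB" % (info["name"], info["pid"], info["anon-rss_gib"]))
```

Comparing the anon-rss figure with the machine's total RAM shows how close the engine-building phase came to exhausting memory.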

VincentChong123 commented on May 31, 2024

Hi @zhucheng725

Why is INT8 slower than FP16?

Do you have any update on this?

Thanks

srkm009 commented on May 31, 2024

Hello,
Did anyone manage to resolve this issue? or is it still an issue from the TF-TRT?
I see the same issue with TF2.0 as well.

zhucheng725 commented on May 31, 2024

@VincentChong123 Not yet.

spurani commented on May 31, 2024

I tried to run faster_rcnn_inception_v2 and got the following error. Does anyone have any clue about this? Any suggestion or advice would definitely help me to continue my learning by understanding these concepts.
thanks

InvalidArgumentError: node BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/Slice (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) has inputs from different frames. The input node BatchMultiClassNonMaxSuppression/map/while/Reshape_1 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) is in frame 'BatchMultiClassNonMaxSuppression/map/while/while_context'. The input node BatchMultiClassNonMaxSuppression/map/while/MultiClassNonMaxSuppression/Slice/begin (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) is in frame ''.

Some-random commented on May 31, 2024

@spurani I'm having the same issue. Is there any update on this one? What does this error mean, anyway?

spurani commented on May 31, 2024

If I am not wrong, the error states that the system does not have enough memory to run the faster_rcnn_inception_v2 model.

Some-random commented on May 31, 2024

@spurani Thanks for the quick answer! I'm running a different model using TRT and my memory is normal during execution... Do you know the meaning of 'has inputs from different frames' in the error message?

TClan8023 commented on May 31, 2024

@spurani Hi, I'm using TF-TRT on Windows 10, with tf_gpu 2.10.0 and TensorRT 7.2.3, based on CUDA 11.2 and cuDNN 8.1.0. I hit the same "has inputs from different frames" error while building the TRT engine for inference. Do you know how to deal with it? Thanks a lot for your reply.
