tannergilbert / tensorflow-object-detection-api-train-model

Train an object detection model with the Tensorflow Object Detection API and Tensorflow 2.

Home Page: https://gilberttanner.com/blog/creating-your-own-objectdetector

License: MIT License

Python 0.15% Jupyter Notebook 99.85%

tensorflow tensorflow2 object-detection tensorflow-object-detection-api

tensorflow-object-detection-api-train-model's Introduction

How to train a custom object detection model with the Tensorflow Object Detection API

(ReadME inspired by EdjeElectronics)

Update: This README and repository are now fully updated for Tensorflow 2. If you want to use Tensorflow 1 instead, check out my article. If you want to train your model in Google Colab, check out the Tensorflow_2_Object_Detection_Train_model notebook.

Introduction

Steps:

1. Installation

You can install the TensorFlow Object Detection API either with the Python package installer (pip) or with Docker, an open-source platform for deploying and managing containerized applications. For running the Tensorflow Object Detection API locally, Docker is recommended. If you aren't familiar with Docker, though, installing with pip might be easier.

First clone the master branch of the Tensorflow Models repository:

git clone https://github.com/tensorflow/models.git

Docker Installation

# From the root of the git repository (inside the models directory)
docker build -f research/object_detection/dockerfiles/tf2/Dockerfile -t od .
docker run -it od

Python Package Installation

cd models/research
# Compile protos.
protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
cp object_detection/packages/tf2/setup.py .
python -m pip install .

Note: The *.proto wildcard, which selects all files, does not work with protobuf version 3.5 and higher. If you are using version 3.5 or higher, you have to compile each file individually. To make this easier, I created a Python script that loops through a directory and converts all proto files one at a time.

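import os
import subprocess
import sys

# Usage: python use_protobuf.py <path to directory> <path to protoc file>
directory = sys.argv[1]
protoc_path = sys.argv[2]

# Compile each .proto file on its own instead of relying on the *.proto wildcard.
for file in os.listdir(directory):
    if file.endswith(".proto"):
        # Passing a list avoids problems with spaces in paths; check=True stops on the first error.
        subprocess.run([protoc_path, os.path.join(directory, file), "--python_out=."], check=True)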

python use_protobuf.py <path to directory> <path to protoc file>

To test the installation run:

# Test the installation.
python object_detection/builders/model_builder_tf2_test.py

If everything installed correctly you should see something like:

...
[       OK ] ModelBuilderTF2Test.test_create_ssd_models_from_config
[ RUN      ] ModelBuilderTF2Test.test_invalid_faster_rcnn_batchnorm_update
[       OK ] ModelBuilderTF2Test.test_invalid_faster_rcnn_batchnorm_update
[ RUN      ] ModelBuilderTF2Test.test_invalid_first_stage_nms_iou_threshold
[       OK ] ModelBuilderTF2Test.test_invalid_first_stage_nms_iou_threshold
[ RUN      ] ModelBuilderTF2Test.test_invalid_model_config_proto
[       OK ] ModelBuilderTF2Test.test_invalid_model_config_proto
[ RUN      ] ModelBuilderTF2Test.test_invalid_second_stage_batch_size
[       OK ] ModelBuilderTF2Test.test_invalid_second_stage_batch_size
[ RUN      ] ModelBuilderTF2Test.test_session
[  SKIPPED ] ModelBuilderTF2Test.test_session
[ RUN      ] ModelBuilderTF2Test.test_unknown_faster_rcnn_feature_extractor
[       OK ] ModelBuilderTF2Test.test_unknown_faster_rcnn_feature_extractor
[ RUN      ] ModelBuilderTF2Test.test_unknown_meta_architecture
[       OK ] ModelBuilderTF2Test.test_unknown_meta_architecture
[ RUN      ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
[       OK ] ModelBuilderTF2Test.test_unknown_ssd_feature_extractor
----------------------------------------------------------------------
Ran 20 tests in 91.767s

OK (skipped=1)

2. Gathering data

Now that the Tensorflow Object Detection API is ready to go, we need to gather the images needed for training.

To train a robust model, the pictures should be as diverse as possible. So they should have different backgrounds, varying lighting conditions, and unrelated random objects in them.

You can either take pictures yourself, or you can download pictures from the internet. For my microcontroller detector, I took about 25 pictures of each individual microcontroller and 25 pictures containing multiple microcontrollers.

You can use the resize_images script to resize the images to the desired resolution.

python resize_images.py -d images/ -s 800 600

After you have all the images, move about 80% to the object_detection/images/train directory and the other 20% to the object_detection/images/test directory. Make sure that the images in both directories have a good variety of classes.
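The 80/20 split can also be done with a short script. The following is a minimal sketch, not part of this repository: the function name split_dataset, the list of image suffixes, and the fixed random seed are all assumptions made for illustration. It assumes the images have not been labeled yet, so no XML files need to move with them.

```python
import random
import shutil
from pathlib import Path

# Assumed set of image file extensions; adjust to your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def split_dataset(source_dir, train_dir, test_dir, train_fraction=0.8, seed=42):
    """Move a random train_fraction of the images to train_dir and the rest to test_dir."""
    source = Path(source_dir)
    images = sorted(
        p for p in source.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    random.Random(seed).shuffle(images)  # fixed seed keeps the split reproducible

    n_train = round(len(images) * train_fraction)
    for target, chunk in ((Path(train_dir), images[:n_train]),
                          (Path(test_dir), images[n_train:])):
        target.mkdir(parents=True, exist_ok=True)
        for image in chunk:
            shutil.move(str(image), str(target / image.name))
    return n_train, len(images) - n_train
```

Calling split_dataset("images", "images/train", "images/test") would move the files and return the number of images in each part. A random split does not guarantee that every class appears in both directories, so it is still worth checking the result by hand.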

3. Labeling data

With all the pictures gathered, we come to the next step - labeling the data. Labeling is the process of drawing bounding boxes around the desired objects.

LabelImg is a great tool for creating an object detection data-set.

LabelImg GitHub

LabelImg Download

Download and install LabelImg. Then point it to your images/train and images/test directories, and draw a box around each object in each image.

LabelImg supports two formats, PascalVOC and Yolo. For this tutorial, make sure to select PascalVOC. LabelImg saves an XML file containing the label data for each image. These files will be used to create a TFRecord file, which can be used to train the model.
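To get a feeling for what these XML files contain, here is a minimal sketch that reads one annotation in the Pascal VOC layout using only the standard library. The function name read_voc_boxes and the sample annotation are made up for illustration; this is not the repository's xml_to_csv.py script.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Return one (filename, width, height, label, xmin, ymin, xmax, ymax) tuple per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append((
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ))
    return rows


# A hand-written annotation in the Pascal VOC layout (all values are made up).
EXAMPLE_XML = """
<annotation>
  <filename>board_01.jpg</filename>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>Arduino_Nano</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>360</xmax><ymax>240</ymax></bndbox>
  </object>
</annotation>
"""

for row in read_voc_boxes(EXAMPLE_XML):
    print(row)
```

Each object in an image becomes one row, which is the shape of data a CSV of labels needs.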

4. Generating Training data

With the images labeled, we need to create TFRecords that can serve as input data for training the object detector. To create the TFRecords, we will first convert the XML label files created with LabelImg to one CSV file per directory using the xml_to_csv.py script.

python xml_to_csv.py

The above command creates two files in the images directory. One is called test_labels.csv, and another one is called train_labels.csv. Next, we'll convert the CSV files into TFRecords files. For this, open the generate_tfrecord.py file and replace the labelmap inside the class_text_to_int method with your own label map.

Old:

# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'basketball':
        return 1
    elif row_label == 'shirt':
        return 2
    elif row_label == 'shoe':
        return 3
    else:
        return None

New:

def class_text_to_int(row_label):
    if row_label == 'Raspberry_Pi_3':
        return 1
    elif row_label == 'Arduino_Nano':
        return 2
    elif row_label == 'ESP8266':
        return 3
    elif row_label == 'Heltec_ESP32_Lora':
        return 4
    else:
        return None
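If the list of classes grows, the if/elif chain can become tedious to maintain. One possible alternative, shown here only as a sketch, is a dictionary lookup; the names CLASS_NAMES and CLASS_IDS are assumptions and do not appear in generate_tfrecord.py.

```python
# Order defines the ids: the first name gets id 1, the second id 2, and so on.
CLASS_NAMES = ["Raspberry_Pi_3", "Arduino_Nano", "ESP8266", "Heltec_ESP32_Lora"]
CLASS_IDS = {name: index for index, name in enumerate(CLASS_NAMES, start=1)}


def class_text_to_int(row_label):
    # Returns None for unknown labels, like the if/elif version.
    return CLASS_IDS.get(row_label)
```

The behavior is the same as the version above, but adding a class only requires appending one name to the list.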

Now the TFRecord files can be generated by typing:

python generate_tfrecord.py --csv_input=images/train_labels.csv --image_dir=images/train --output_path=train.record
python generate_tfrecord.py --csv_input=images/test_labels.csv --image_dir=images/test --output_path=test.record

These two commands generate a train.record and a test.record file, which can be used to train our object detector.

5. Getting ready for training

The last thing we need to do before training is to create a label map and a training configuration file.

5.1 Creating a label map

The label map maps an id to a name. We will put it in a folder called training, which is located in the object_detection directory. The labelmap for my detector can be seen below.

item {
    id: 1
    name: 'Raspberry_Pi_3'
}
item {
    id: 2
    name: 'Arduino_Nano'
}
item {
    id: 3
    name: 'ESP8266'
}
item {
    id: 4
    name: 'Heltec_ESP32_Lora'
}

The id number of each item should match the id specified in the generate_tfrecord.py file.
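Because the ids have to stay in sync, it can help to generate the label map text from one list of class names. The following is a small sketch under that assumption; build_label_map is a hypothetical helper, not something provided by the Object Detection API.

```python
def build_label_map(class_names):
    """Render class names as label map text, numbering the ids from 1."""
    items = []
    for class_id, name in enumerate(class_names, start=1):
        items.append(f"item {{\n    id: {class_id}\n    name: '{name}'\n}}")
    return "\n".join(items) + "\n"


label_map_text = build_label_map(
    ["Raspberry_Pi_3", "Arduino_Nano", "ESP8266", "Heltec_ESP32_Lora"]
)
print(label_map_text)
```

The resulting text can then be written to training/labelmap.pbtxt, for example with pathlib.Path("training/labelmap.pbtxt").write_text(label_map_text, encoding="utf-8").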

5.2 Creating the training configuration

Lastly, we need to create a training configuration file. As a base model, I will use EfficientDet, a recent family of SOTA models discovered with the help of Neural Architecture Search. The Tensorflow OD API provides a lot of different models. For more information, check out the Tensorflow 2 Detection Model Zoo.

The base config for the model can be found inside the configs/tf2 folder.

Copy the config file to the training directory. Then open it inside a text editor and make the following changes:

Line 13: change num_classes to the number of classes you want to detect (4 in my case)
Line 141: change fine_tune_checkpoint to the path of the pretrained checkpoint:
- fine_tune_checkpoint: "<path>/efficientdet_d0_coco17_tpu-32/checkpoint/ckpt-0"
Line 143: change fine_tune_checkpoint_type to detection
Line 182: change input_path to the path of the train.record file:
- input_path: "<path>/train.record"
Line 197: change input_path to the path of the test.record file:
- input_path: "<path>/test.record"
Lines 180 and 193: change label_map_path to the path of the label map:
- label_map_path: "<path>/labelmap.pbtxt"
Lines 144 and 189: change batch_size to a number appropriate for your hardware, like 4, 8, or 16.
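The same edits can be scripted instead of made by hand. The sketch below treats the config as plain text and rewrites the fields with regular expressions; set_field is a hypothetical helper, the sample config is a shortened, made-up stand-in for the real file, and the checkpoint path is only a placeholder. It assumes the train reader appears before the eval reader in the file.

```python
import re


def set_field(text, name, *values):
    """Set `name: value` entries; the n-th occurrence gets the n-th value, the last value repeats."""
    seen = 0

    def replace(match):
        nonlocal seen
        value = values[min(seen, len(values) - 1)]
        seen += 1
        return f"{name}: {value}"

    # Matches either a quoted string or a bare number/identifier after the field name.
    return re.sub(rf'\b{re.escape(name)}:\s*("[^"]*"|[\w.\-]+)', replace, text)


# Shortened, made-up stand-in for a pipeline config.
SAMPLE = '''
model { ssd { num_classes: 90 } }
train_config {
  batch_size: 128
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED"
  fine_tune_checkpoint_type: "classification"
}
train_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
  tf_record_input_reader { input_path: "PATH_TO_BE_CONFIGURED/train" }
}
eval_input_reader {
  label_map_path: "PATH_TO_BE_CONFIGURED/label_map.txt"
  tf_record_input_reader { input_path: "PATH_TO_BE_CONFIGURED/val" }
}
'''

config = SAMPLE
config = set_field(config, "num_classes", "4")
config = set_field(config, "batch_size", "8")
config = set_field(config, "fine_tune_checkpoint", '"pretrained/checkpoint/ckpt-0"')
config = set_field(config, "fine_tune_checkpoint_type", '"detection"')
config = set_field(config, "label_map_path", '"training/labelmap.pbtxt"')
# First occurrence is the train reader, second the eval reader.
config = set_field(config, "input_path", '"train.record"', '"test.record"')
print(config)
```

A text-based approach like this is fragile if the config layout differs from what is assumed here, so it is worth reading the printed result before training with it.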

6. Training the model

To train the model, execute the following command in the command line:

python model_main_tf2.py --pipeline_config_path=training/ssd_efficientdet_d0_512x512_coco17_tpu-8.config --model_dir=training --alsologtostderr

If everything was set up correctly, the training should begin shortly, and the per-step loss should start appearing in the console output.

Every few minutes, the current loss gets logged to Tensorboard. Open Tensorboard by opening a second command line, navigating to the object_detection folder and typing:

tensorboard --logdir=training/train

This will open a webpage at localhost:6006.

The training script saves checkpoints about every five minutes. Train the model until the loss is satisfactory, then terminate the training process by pressing Ctrl+C.

7. Exporting the inference graph

Now that we have a trained model, we need to generate an inference graph that can be used to run the model.

python /content/models/research/object_detection/exporter_main_v2.py \
    --trained_checkpoint_dir training \
    --output_directory inference_graph \
    --pipeline_config_path training/ssd_efficientdet_d0_512x512_coco17_tpu-8.config

8. Using the model for inference

After training the model it can be used in many ways. For examples on how to use the model check out my other repositories.

Tensorflow-Object-Detection-with-Tensorflow-2.0

Appendix

Common Questions

1. How do I extract the images inside the bounding boxes?

import datetime

import cv2
import numpy as np
from PIL import Image

# Assumed to exist from the inference step: output_dict, category_index,
# image_show (the image the detections were computed on), image_width,
# image_height, and df (a pandas DataFrame with a timestamp and a file path column).
output_directory = 'some dir'

# get label and coordinates of detected objects
output = []
for index, score in enumerate(output_dict['detection_scores']):
    label = category_index[output_dict['detection_classes'][index]]['name']
    ymin, xmin, ymax, xmax = output_dict['detection_boxes'][index]
    output.append((label, int(xmin * image_width), int(ymin * image_height), int(xmax * image_width), int(ymax * image_height)))

# Save images and labels
# The channel swap assumes image_show is in BGR order (e.g. an OpenCV frame); PIL expects RGB.
array = cv2.cvtColor(np.array(image_show), cv2.COLOR_RGB2BGR)
image = Image.fromarray(array)
for label, x_min, y_min, x_max, y_max in output:
    cropped_img = image.crop((x_min, y_min, x_max, y_max))
    file_path = output_directory+'/images/'+str(len(df))+'.jpg'
    cropped_img.save(file_path, "JPEG", icc_profile=cropped_img.info.get('icc_profile'))
    df.loc[len(df)] = [datetime.datetime.now(), file_path]
df.to_csv(output_directory+'/results.csv', index=False)
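The coordinate conversion in the snippet above can be isolated into a small function. The following sketch also skips low-confidence detections, which the snippet does not do; to_pixel_boxes and the min_score default of 0.5 are assumptions for illustration, and the boxes and scores are made-up values.

```python
def to_pixel_boxes(boxes, scores, image_width, image_height, min_score=0.5):
    """Convert normalized (ymin, xmin, ymax, xmax) boxes to pixel (left, top, right, bottom)."""
    pixel_boxes = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue  # skip low-confidence detections
        pixel_boxes.append((
            int(xmin * image_width), int(ymin * image_height),
            int(xmax * image_width), int(ymax * image_height),
        ))
    return pixel_boxes


# Made-up detections: the second one falls below the score threshold.
boxes = [(0.25, 0.125, 0.75, 0.5), (0.0, 0.0, 1.0, 1.0)]
scores = [0.9, 0.3]
print(to_pixel_boxes(boxes, scores, image_width=800, image_height=600))
```

The returned tuples are in the (left, top, right, bottom) order that PIL's Image.crop expects.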

2. How do I host a model?

There are multiple ways to host a model. You can create a RESTful API with Tensorflow Serving or with your own web server. You can also run the model directly in a website by converting it to Tensorflow.js, or on mobile and embedded devices by converting it to Tensorflow Lite.

Contribution

Anyone is welcome to contribute to this repository. If you decide to do so, I would appreciate it if you took a moment to review the guidelines.

Author

Gilbert Tanner

License

This project is licensed under the MIT License - see the LICENSE.md file for details

tensorflow-object-detection-api-train-model's Issues

google.protobuf.text_format.ParseError: 161:14 : Message type "object_detection.protos.Optimizer" has no field named "i".

I am following your documentation:
Ubuntu : Ubuntu 16.04.7 LTS
Tensorflow: 2.2
GPU Cuda 10.0

Step: Train Model:
python model_main_tf2.py
--pipeline_config_path=training/ssd_efficientdet_d0_512x512_coco17_tpu-8.config
--model_dir=training
--alsologtostderr

Error:

File "model_main_tf2.py", line 113, in <module>
tf.compat.v1.app.run()
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/tensor
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/absl/a
_run_main(main, args)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/absl/a
sys.exit(main(argv))
File "model_main_tf2.py", line 104, in main
model_lib_v2.train_loop(
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/object
configs = get_configs_from_pipeline_file(
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/object
text_format.Merge(proto_str, pipeline_config)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/google
return MergeLines(
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/google
return parser.MergeLines(lines, message)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/google
self._ParseOrMerge(lines, message)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/google
self._MergeField(tokenizer, message)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/google
merger(tokenizer, message, field)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/google
self._MergeField(tokenizer, sub_message)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/google
merger(tokenizer, message, field)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/google
self._MergeField(tokenizer, sub_message)
File "/home/aarju/anaconda3/envs/demo_model/lib/python3.8/site-packages/google
raise tokenizer.ParseErrorPreviousToken(
google.protobuf.text_format.ParseError: 161:14 : Message type "object_detection.

Please resolve it as soon as possible

Thank you

Dataset?

Hello!
Thanks for the whole article. I would like to try it.... Would it be possible to get the dataset?

Best
Holger

self._read_buf = _pywrap_file_io.BufferedInputStream( UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 80: invalid start byte

Hi There,
Thanks for your beautiful work and for the extra-effort of sharing and documenting. I am forever thankful.
I am a newbie and I am facing the following issue.
I have tried other tutorials (different models, different datasets,...) but I keep getting this UnicodeDecodeError, and I have googled the issue extensively without any success.
Any idea anyone?
Thanks!

==================
Traceback (most recent call last):
File "C:/Program Files/Python38/models/research/object_detection/model_main_tf2.py", line 113, in <module>
tf.compat.v1.app.run()
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\platform\app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "C:\Program Files\Python38\lib\site-packages\absl\app.py", line 303, in run
_run_main(main, args)
File "C:\Program Files\Python38\lib\site-packages\absl\app.py", line 251, in _run_main
sys.exit(main(argv))
File "C:/Program Files/Python38/models/research/object_detection/model_main_tf2.py", line 104, in main
model_lib_v2.train_loop(
File "C:\Program Files\Python38\lib\site-packages\object_detection\model_lib_v2.py", line 545, in train_loop
train_input = strategy.experimental_distribute_datasets_from_function(
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\util\deprecation.py", line 340, in new_func
return func(*args, **kwargs)
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\distribute\distribute_lib.py", line 1143, in experimental_distribute_datasets_from_function
return self.distribute_datasets_from_function(dataset_fn, options)
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\distribute\distribute_lib.py", line 1134, in distribute_datasets_from_function
return self._extended._distribute_datasets_from_function( # pylint: disable=protected-access
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\distribute\mirrored_strategy.py", line 545, in _distribute_datasets_from_function
return input_lib.get_distributed_datasets_from_function(
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\distribute\input_lib.py", line 161, in get_distributed_datasets_from_function
return DistributedDatasetsFromFunction(dataset_fn, input_workers,
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\distribute\input_lib.py", line 1272, in __init__
_create_datasets_from_function_with_input_context(
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\distribute\input_lib.py", line 1936, in _create_datasets_from_function_with_input_context
dataset = dataset_fn(ctx)
File "C:\Program Files\Python38\lib\site-packages\object_detection\model_lib_v2.py", line 536, in train_dataset_fn
train_input = inputs.train_input(
File "C:\Program Files\Python38\lib\site-packages\object_detection\inputs.py", line 893, in train_input
dataset = INPUT_BUILDER_UTIL_MAP['dataset_build'](
File "C:\Program Files\Python38\lib\site-packages\object_detection\builders\dataset_builder.py", line 210, in build
decoder = decoder_builder.build(input_reader_config)
File "C:\Program Files\Python38\lib\site-packages\object_detection\builders\decoder_builder.py", line 52, in build
decoder = tf_example_decoder.TfExampleDecoder(
File "C:\Program Files\Python38\lib\site-packages\object_detection\data_decoders\tf_example_decoder.py", line 414, in __init__
_ClassTensorHandler(
File "C:\Program Files\Python38\lib\site-packages\object_detection\data_decoders\tf_example_decoder.py", line 88, in __init__
name_to_id = label_map_util.get_label_map_dict(
File "C:\Program Files\Python38\lib\site-packages\object_detection\utils\label_map_util.py", line 201, in get_label_map_dict
label_map = load_labelmap(label_map_path_or_proto)
File "C:\Program Files\Python38\lib\site-packages\object_detection\utils\label_map_util.py", line 168, in load_labelmap
label_map_string = fid.read()
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\lib\io\file_io.py", line 118, in read
self._preread_check()
File "C:\Users\Admin\AppData\Roaming\Python\Python38\site-packages\tensorflow\python\lib\io\file_io.py", line 80, in _preread_check
self._read_buf = _pywrap_file_io.BufferedInputStream(
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 80: invalid start byte

Issue while installing python -m pip install .

Error during the installation process

Describe the bug
I get lots of errors during the installation process when I execute the python install command.

To Reproduce
Steps to reproduce the behavior:
I've been following the guide step by step, but when I run the command python -m pip install ., halfway through its execution I get a bunch of errors in red and it exits. The errors say things like Connection closed. I've been struggling with it for a while and nothing works. I've attached a screenshot of the errors.

Desktop (please complete the following information):

OS: Windows 10
Python version: 3.7.0
Protoc version: protoc-3.15.6-win64

Use config: faster_rcnn_resnet50_v1_800x1333_coco17_gpu-8.config Error

How can I fix this? I have been searching for a solution for a long time. Please help.

my environment:
rtx3070 cuda11.1
tf-nightly-gpu==2.5.0.dev20201226
Python ==3.8.5

Traceback (most recent call last):
File "model_main_tf2_1.py", line 114, in <module>
tf.compat.v1.app.run()
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "model_main_tf2_1.py", line 104, in main
model_lib_v2.train_loop(
File "/code/model/research/object_detection/model_lib_v2.py", line 522, in train_loop
train_input = strategy.experimental_distribute_datasets_from_function(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/util/deprecation.py", line 337, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 1147, in experimental_distribute_datasets_from_function
return self.distribute_datasets_from_function(dataset_fn, options)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 1138, in distribute_datasets_from_function
return self._extended._distribute_datasets_from_function( # pylint: disable=protected-access
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/mirrored_strategy.py", line 545, in _distribute_datasets_from_function
return input_lib.get_distributed_datasets_from_function(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/input_lib.py", line 161, in get_distributed_datasets_from_function
return DistributedDatasetsFromFunction(dataset_fn, input_workers,
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/input_lib.py", line 1271, in __init__
_create_datasets_from_function_with_input_context(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/input_lib.py", line 1935, in _create_datasets_from_function_with_input_context
dataset = dataset_fn(ctx)
File "/code/model/research/object_detection/model_lib_v2.py", line 513, in train_dataset_fn
train_input = inputs.train_input(
File "/code/model/research/object_detection/inputs.py", line 870, in train_input
dataset = INPUT_BUILDER_UTIL_MAP['dataset_build'](
File "/code/model/research/object_detection/builders/dataset_builder.py", line 228, in build
batch_size = input_context.get_per_replica_batch_size(batch_size)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 516, in get_per_replica_batch_size
raise ValueError("The global_batch_size %r is not divisible by "
ValueError: The global_batch_size 16 is not divisible by num_replicas_in_sync 3
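The ValueError at the end of this traceback is an arithmetic constraint: the batch_size from the config is treated as a global batch size and has to split evenly across the replicas (3 in this report). A minimal sketch of that check follows; both helper names are hypothetical and not part of Tensorflow.

```python
def per_replica_batch_size(global_batch_size, num_replicas):
    """Split the global batch evenly across replicas, or fail like the traceback above."""
    if global_batch_size % num_replicas != 0:
        raise ValueError(
            f"The global_batch_size {global_batch_size} is not divisible "
            f"by num_replicas_in_sync {num_replicas}"
        )
    return global_batch_size // num_replicas


def neighboring_valid_batch_sizes(global_batch_size, num_replicas):
    """Return the largest multiple of num_replicas not above the value, and the next multiple."""
    lower = (global_batch_size // num_replicas) * num_replicas
    return lower, lower + num_replicas


print(neighboring_valid_batch_sizes(16, 3))
```

Under this reading, setting batch_size in the config to a multiple of 3 (for example 15 or 18) should satisfy the check on a three-replica setup.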

Predicted Bounding Box not drawn in some of the test images.

Hi. After creating the model, I tested it on my set of 20 test images using "detect_from_image.py". For some of the test images, the model successfully predicts a bounding box around the object. For others, no predicted bounding box is drawn at all.
I also tried passing those particular images individually, but still no bounding box is drawn for them.

Can you please let me know what is the issue.

training issue

Describe the bug
Training fails when I try to run it; the rest of the guide worked without problems.
I have searched online and have not found any solution to
ImportError: libGL.so.1: cannot open shared object file: No such file or directory

Train model on Google colab with TPU

Thanks for the tutorials. I've managed to get the training to work on Google Colab in a runtime with a GPU enabled. In the object_detection folder there is a script to train the model with Google's TPU (model_tpu_main.py). When I start this script with the same flags you've used for model_main.py, it does, surprisingly, detect the TPU. But it crashes because of a mismatch:

INFO:tensorflow:TPU job name tpu_worker

I1122 02:07:09.691519 139921392695168 tpu_estimator.py:506] TPU job name tpu_worker
INFO:tensorflow:Graph was finalized.
I1122 02:07:13.428853 139921392695168 monitored_session.py:240] Graph was finalized.
ERROR:tensorflow:Error recorded from training_loop: From /job:tpu_worker/replica:0/task:0:
Unsuccessful TensorSliceReader constructor: Failed to get matching files on /root/datalab/pretrained_model/model.ckpt: Unimplemented: File system scheme '[local]' not implemented (file:

'/root/datalab/pretrained_model/model.ckpt')

Do you maybe know why? Do I have to use different flags? Or do you know another way to train on Google Colab with a TPU?

'utf-8' codec can't decode byte 0xfd in position 97: invalid start byte

Hi There,
Thanks for your beautiful work.
I am facing the following issue;
I have tried with other tutorials (different models, different datasets,...) but I keep getting this UnicodeDecodeError and I google the issue to the bones without any success.
Any idea anyone?
Thanks!
In the train detector part,
_ClassTensorHandler(
File "C:\Users\DELL\anaconda3\lib\site-packages\object_detection\data_decoders\tf_example_decoder.py", line 92, in __init__
name_to_id = label_map_util.get_label_map_dict(
File "C:\Users\DELL\anaconda3\lib\site-packages\object_detection\utils\label_map_util.py", line 201, in get_label_map_dict
label_map = load_labelmap(label_map_path_or_proto)
File "C:\Users\DELL\anaconda3\lib\site-packages\object_detection\utils\label_map_util.py", line 168, in load_labelmap
label_map_string = fid.read()
File "C:\Users\DELL\anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 114, in read
self._preread_check()
File "C:\Users\DELL\anaconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 76, in _preread_check
self._read_buf = _pywrap_file_io.BufferedInputStream(
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfd in position 97: invalid start byte

how to train the custom models in tensorflow 2.0 please help me out

What is image_show !?

For the code for saving the bounding boxes, I want to know what is meant by image_show in the 7th line !?

output_directory = 'some dir'

# get label and coordinates of detected objects
output = []
for index, score in enumerate(output_dict['detection_scores']):
    label = category_index[output_dict['detection_classes'][index]]['name']
    ymin, xmin, ymax, xmax = output_dict['detection_boxes'][index]
    output.append((label, int(xmin * image_width), int(ymin * image_height), int(xmax * image_width), int(ymax * image_height)))

# Save images and labels
for l, x_min, y_min, x_max, y_max in output:
    array = cv2.cvtColor(np.array(image_show), cv2.COLOR_RGB2BGR)
    image = Image.fromarray(array)
    cropped_img = image.crop((x_min, y_min, x_max, y_max))
    file_path = output_directory+'/images/'+str(len(df))+'.jpg'
    cropped_img.save(file_path, "JPEG", icc_profile=cropped_img.info.get('icc_profile'))
    df.loc[len(df)] = [datetime.datetime.now(), file_path]
    df.to_csv(output_directory+'/results.csv', index=None

Getting Error 'Function call stack: _dummy_computation_fn' after step 6 of your tutorial "Training the model"

Hi. I have followed the steps in your tutorial on my Windows 10 machine with Tensorflow version 2.4.1. My images (train + test) are 750 x 750 pixels, so I used the "ssd_efficientdet_d2_768x768_coco17_tpu-8.config" model instead of "ssd_efficientdet_d0_512x512_coco17_tpu-8.config". Everything else remains the same as in your tutorial. But after running the final command to train the model, I get the error in the attached text file.

error.txt

Can you please tell me what is the issue and how to solve it.

There are non-GPU devices - GPU devices not detected.

Describe the bug
After building and running the Docker image, I receive an error stating that no GPUs were detected, along with a failure to run the model_main_tf2.py script.

There were several different suggestions and solutions between your repo and Tensorflow around similar issues so I attempted a few of them...

changing the nvidia gpu apt-key (this appeared to be an issue at one point but reverting it recently seemed to not cause any change)
disabling gcloud and gsutil commands
adding a gpu_device_name check to the model_main_XX.py

I tried to install the Nvidia-Docker directly with SUDO however a password prompt appeared and my attempts to set a password in the docker-run section or to find a password did not lead to any success.

I have run my current docker files and the originals side by side and seem to get the same effect.

To Reproduce
Steps to reproduce the behavior:

Build docker following instructions on Git and/or blog page (ex : docker build -f research/object_detection/dockerfiles/tf2/Dockerfile -t od . ) (Docker - Linux Containers)
Run docker (ex : docker run -it od)
In docker after creating train.record and test.record successfully, attempt to run the training script (ex : python object_detection/model_main_tf2.py --pipeline_config_path=object_detection/training/ssd_efficientdet_d0_512x512_coco17_tpu-8.config --model_dir=object_detection/training/ --alsologtostderr)
See error listed below.

WARNING:tensorflow:There are non-GPU devices in tf.distribute.Strategy, not using nccl allreduce.
W0622 03:08:07.678852 140015840864064 cross_device_ops.py:1386] There are non-GPU devices in tf.distribute.Strategy, not using nccl allreduce.
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)
I0622 03:08:07.691202 140015840864064 mirrored_strategy.py:374] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)
INFO:tensorflow:Maybe overwriting train_steps: None
I0622 03:08:07.694155 140015840864064 config_util.py:552] Maybe overwriting train_steps: None
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0622 03:08:07.694274 140015840864064 config_util.py:552] Maybe overwriting use_bfloat16: False
I0622 03:08:07.699931 140015840864064 ssd_efficientnet_bifpn_feature_extractor.py:145] EfficientDet EfficientNet backbone version: efficientnet-b0
I0622 03:08:07.700029 140015840864064 ssd_efficientnet_bifpn_feature_extractor.py:147] EfficientDet BiFPN num filters: 64
I0622 03:08:07.700090 140015840864064 ssd_efficientnet_bifpn_feature_extractor.py:148] EfficientDet BiFPN num iterations: 3
I0622 03:08:07.702906 140015840864064 efficientnet_model.py:143] round_filter input=32 output=32
I0622 03:08:07.740919 140015840864064 efficientnet_model.py:143] round_filter input=32 output=32
I0622 03:08:07.741045 140015840864064 efficientnet_model.py:143] round_filter input=16 output=16
I0622 03:08:07.798309 140015840864064 efficientnet_model.py:143] round_filter input=16 output=16
I0622 03:08:07.798432 140015840864064 efficientnet_model.py:143] round_filter input=24 output=24
I0622 03:08:07.944522 140015840864064 efficientnet_model.py:143] round_filter input=24 output=24
I0622 03:08:07.944638 140015840864064 efficientnet_model.py:143] round_filter input=40 output=40
I0622 03:08:08.091527 140015840864064 efficientnet_model.py:143] round_filter input=40 output=40
I0622 03:08:08.091642 140015840864064 efficientnet_model.py:143] round_filter input=80 output=80
I0622 03:08:08.317637 140015840864064 efficientnet_model.py:143] round_filter input=80 output=80
I0622 03:08:08.317753 140015840864064 efficientnet_model.py:143] round_filter input=112 output=112
I0622 03:08:08.537171 140015840864064 efficientnet_model.py:143] round_filter input=112 output=112
I0622 03:08:08.537288 140015840864064 efficientnet_model.py:143] round_filter input=192 output=192
I0622 03:08:08.839897 140015840864064 efficientnet_model.py:143] round_filter input=192 output=192
I0622 03:08:08.840018 140015840864064 efficientnet_model.py:143] round_filter input=320 output=320
I0622 03:08:08.912957 140015840864064 efficientnet_model.py:143] round_filter input=1280 output=1280
I0622 03:08:08.947752 140015840864064 efficientnet_model.py:453] Building model efficientnet with params ModelConfig(width_coefficient=1.0, depth_coefficient=1.0, resolution=224, dropout_rate=0.2, blocks=(BlockConfig(input_filters=32, output_filters=16, kernel_size=3, num_repeat=1, expand_ratio=1, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=16, output_filters=24, kernel_size=3, num_repeat=2, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=24, output_filters=40, kernel_size=5, num_repeat=2, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=40, output_filters=80, kernel_size=3, num_repeat=3, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=80, output_filters=112, kernel_size=5, num_repeat=3, expand_ratio=6, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=112, output_filters=192, kernel_size=5, num_repeat=4, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=192, output_filters=320, kernel_size=3, num_repeat=1, expand_ratio=6, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise')), stem_base_filters=32, top_base_filters=1280, activation='simple_swish', batch_norm='default', bn_momentum=0.99, bn_epsilon=0.001, weight_decay=5e-06, drop_connect_rate=0.2, depth_divisor=8, min_depth=None, use_se=True, input_channels=3, num_classes=1000, model_name='efficientnet', rescale_input=False, data_format='channels_last', dtype='float32')
WARNING:tensorflow:From /home/tensorflow/.local/lib/python3.8/site-packages/object_detection/model_lib_v2.py:563: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
Instructions for updating:
rename to distribute_datasets_from_function
W0622 03:08:08.973673 140015840864064 deprecation.py:350] From /home/tensorflow/.local/lib/python3.8/site-packages/object_detection/model_lib_v2.py:563: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
Instructions for updating:
rename to distribute_datasets_from_function
INFO:tensorflow:Reading unweighted datasets: ['object_detection/training/train.record']
I0622 03:08:08.980458 140015840864064 dataset_builder.py:162] Reading unweighted datasets: ['object_detection/training/train.record']
INFO:tensorflow:Reading record datasets for input file: ['object_detection/training/train.record']
I0622 03:08:08.980628 140015840864064 dataset_builder.py:79] Reading record datasets for input file: ['object_detection/training/train.record']
INFO:tensorflow:Number of filenames to read: 0
I0622 03:08:08.980718 140015840864064 dataset_builder.py:80] Number of filenames to read: 0
Traceback (most recent call last):
File "object_detection/model_main_tf2.py", line 120, in <module>
tf.compat.v1.app.run()
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/platform/app.py", line 36, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/usr/local/lib/python3.8/dist-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "object_detection/model_main_tf2.py", line 111, in main
model_lib_v2.train_loop(
File "/home/tensorflow/.local/lib/python3.8/site-packages/object_detection/model_lib_v2.py", line 563, in train_loop
train_input = strategy.experimental_distribute_datasets_from_function(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/util/deprecation.py", line 357, in new_func
return func(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 1195, in experimental_distribute_datasets_from_function
return self.distribute_datasets_from_function(dataset_fn, options)
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 1186, in distribute_datasets_from_function
return self._extended._distribute_datasets_from_function( # pylint: disable=protected-access
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/mirrored_strategy.py", line 593, in _distribute_datasets_from_function
return input_util.get_distributed_datasets_from_function(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/input_util.py", line 132, in get_distributed_datasets_from_function
return input_lib.DistributedDatasetsFromFunction(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/input_lib.py", line 1372, in __init__
self.build()
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/input_lib.py", line 1393, in build
_create_datasets_from_function_with_input_context(
File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/distribute/input_lib.py", line 1875, in _create_datasets_from_function_with_input_context
dataset = dataset_fn(ctx)
File "/home/tensorflow/.local/lib/python3.8/site-packages/object_detection/model_lib_v2.py", line 554, in train_dataset_fn
train_input = inputs.train_input(
File "/home/tensorflow/.local/lib/python3.8/site-packages/object_detection/inputs.py", line 908, in train_input
dataset = INPUT_BUILDER_UTIL_MAP['dataset_build'](
File "/home/tensorflow/.local/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py", line 243, in build
dataset = read_dataset(
File "/home/tensorflow/.local/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py", line 163, in read_dataset
return _read_dataset_internal(file_read_func, input_files,
File "/home/tensorflow/.local/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py", line 82, in _read_dataset_internal
raise RuntimeError('Did not find any input files matching the glob pattern '
RuntimeError: Did not find any input files matching the glob pattern ['object_detection/training/train.record']

Expected behavior
Based on the instructions I should see some form of image learning begin to occur but instead I receive a series of messages and errors suggesting the process has halted or failed.
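
The `RuntimeError` in the traceback says that the pattern `object_detection/training/train.record` matched no files, which agrees with the earlier log line `Number of filenames to read: 0`. Because the path is relative, it is resolved against the directory the script is started from. The sketch below is one way to check what the pattern matches from a given directory. It uses the standard-library `glob` module as a stand-in for TensorFlow's own file matching (an assumption), and `find_matches` is a hypothetical helper, not part of the Object Detection API.

```python
import glob
import os


def find_matches(pattern, base_dir=None):
    """Return the files that `pattern` matches, resolved against `base_dir`.

    `base_dir` defaults to the current working directory, which is how a
    relative path would normally be resolved. Hypothetical helper.
    """
    base_dir = base_dir or os.getcwd()
    if os.path.isabs(pattern):
        full_pattern = pattern
    else:
        full_pattern = os.path.join(base_dir, pattern)
    return sorted(glob.glob(full_pattern))


if __name__ == "__main__":
    # Path copied from the error message above.
    pattern = "object_detection/training/train.record"
    matches = find_matches(pattern)
    print("working directory:", os.getcwd())
    print("matches:", matches if matches else "none")
```

If this prints `none` when run from the same directory as the training command, the record file is probably in a different location than the config expects, or was created in a container that no longer exists.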

**Desktop**

Windows 11 Pro
Chrome
102.0.5005.115

Additional context
During one attempt I received a slightly different message. However, after trying some workarounds to build the Nvidia Docker image, these messages have not reappeared in subsequent attempts:

tensorflow@943f2e0f8488:~/models/research$ python object_detection/model_main_tf2.py --pipeline_config_path=object_detection/training/ssd_efficientdet_d0_512x512_coco17_tpu-8.config --model_dir=object_detection/training/ --alsologtostderr
WARNING:tensorflow:There are non-GPU devices in tf.distribute.Strategy, not using nccl allreduce.
W0622 00:20:06.607538 140269787281216 cross_device_ops.py:1386] There are non-GPU devices in tf.distribute.Strategy, not using nccl allreduce.
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)
I0622 00:20:06.611269 140269787281216 mirrored_strategy.py:374] Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)
INFO:tensorflow:Maybe overwriting train_steps: None
I0622 00:20:06.613795 140269787281216 config_util.py:552] Maybe overwriting train_steps: None
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0622 00:20:06.613889 140269787281216 config_util.py:552] Maybe overwriting use_bfloat16: False
I0622 00:20:06.618684 140269787281216 ssd_efficientnet_bifpn_feature_extractor.py:145] EfficientDet EfficientNet backbone version: efficientnet-b0
I0622 00:20:06.618793 140269787281216 ssd_efficientnet_bifpn_feature_extractor.py:147] EfficientDet BiFPN num filters: 64
I0622 00:20:06.618869 140269787281216 ssd_efficientnet_bifpn_feature_extractor.py:148] EfficientDet BiFPN num iterations: 3
I0622 00:20:06.621786 140269787281216 efficientnet_model.py:143] round_filter input=32 output=32
I0622 00:20:06.710441 140269787281216 efficientnet_model.py:143] round_filter input=32 output=32
I0622 00:20:06.710567 140269787281216 efficientnet_model.py:143] round_filter input=16 output=16
I0622 00:20:06.767254 140269787281216 efficientnet_model.py:143] round_filter input=16 output=16
I0622 00:20:06.767369 140269787281216 efficientnet_model.py:143] round_filter input=24 output=24
I0622 00:20:06.913850 140269787281216 efficientnet_model.py:143] round_filter input=24 output=24
I0622 00:20:06.913977 140269787281216 efficientnet_model.py:143] round_filter input=40 output=40
I0622 00:20:07.055300 140269787281216 efficientnet_model.py:143] round_filter input=40 output=40
I0622 00:20:07.055412 140269787281216 efficientnet_model.py:143] round_filter input=80 output=80
I0622 00:20:07.269554 140269787281216 efficientnet_model.py:143] round_filter input=80 output=80
I0622 00:20:07.269668 140269787281216 efficientnet_model.py:143] round_filter input=112 output=112
I0622 00:20:07.485285 140269787281216 efficientnet_model.py:143] round_filter input=112 output=112
I0622 00:20:07.485399 140269787281216 efficientnet_model.py:143] round_filter input=192 output=192
I0622 00:20:07.789512 140269787281216 efficientnet_model.py:143] round_filter input=192 output=192
I0622 00:20:07.789628 140269787281216 efficientnet_model.py:143] round_filter input=320 output=320
I0622 00:20:07.861017 140269787281216 efficientnet_model.py:143] round_filter input=1280 output=1280
I0622 00:20:07.895739 140269787281216 efficientnet_model.py:453] Building model efficientnet with params ModelConfig(width_coefficient=1.0, depth_coefficient=1.0, resolution=224, dropout_rate=0.2, blocks=(BlockConfig(input_filters=32, output_filters=16, kernel_size=3, num_repeat=1, expand_ratio=1, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=16, output_filters=24, kernel_size=3, num_repeat=2, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=24, output_filters=40, kernel_size=5, num_repeat=2, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=40, output_filters=80, kernel_size=3, num_repeat=3, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=80, output_filters=112, kernel_size=5, num_repeat=3, expand_ratio=6, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=112, output_filters=192, kernel_size=5, num_repeat=4, expand_ratio=6, strides=(2, 2), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise'), BlockConfig(input_filters=192, output_filters=320, kernel_size=3, num_repeat=1, expand_ratio=6, strides=(1, 1), se_ratio=0.25, id_skip=True, fused_conv=False, conv_type='depthwise')), stem_base_filters=32, top_base_filters=1280, activation='simple_swish', batch_norm='default', bn_momentum=0.99, bn_epsilon=0.001, weight_decay=5e-06, drop_connect_rate=0.2, depth_divisor=8, min_depth=None, use_se=True, input_channels=3, num_classes=1000, model_name='efficientnet', rescale_input=False, data_format='channels_last', dtype='float32')
WARNING:tensorflow:From /home/tensorflow/.local/lib/python3.8/site-packages/object_detection/model_lib_v2.py:563: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
Instructions for updating:
rename to distribute_datasets_from_function
W0622 00:20:07.921413 140269787281216 deprecation.py:350] From /home/tensorflow/.local/lib/python3.8/site-packages/object_detection/model_lib_v2.py:563: StrategyBase.experimental_distribute_datasets_from_function (from tensorflow.python.distribute.distribute_lib) is deprecated and will be removed in a future version.
Instructions for updating:
rename to distribute_datasets_from_function
INFO:tensorflow:Reading unweighted datasets: ['object_detection/training/train.record']
I0622 00:20:07.925122 140269787281216 dataset_builder.py:162] Reading unweighted datasets: ['object_detection/training/train.record']
INFO:tensorflow:Reading record datasets for input file: ['object_detection/training/train.record']
I0622 00:20:07.925260 140269787281216 dataset_builder.py:79] Reading record datasets for input file: ['object_detection/training/train.record']
INFO:tensorflow:Number of filenames to read: 1
I0622 00:20:07.925341 140269787281216 dataset_builder.py:80] Number of filenames to read: 1
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0622 00:20:07.925419 140269787281216 dataset_builder.py:86] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /home/tensorflow/.local/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.deterministic.
W0622 00:20:07.926657 140269787281216 deprecation.py:350] From /home/tensorflow/.local/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.AUTOTUNE) instead. If sloppy execution is desired, use tf.data.Options.deterministic.
WARNING:tensorflow:From /home/tensorflow/.local/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py:235: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
W0622 00:20:07.940346 140269787281216 deprecation.py:350] From /home/tensorflow/.local/lib/python3.8/site-packages/object_detection/builders/dataset_builder.py:235: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.data.Dataset.map()
WARNING:tensorflow:From /home/tensorflow/.local/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
W0622 00:20:12.002727 140269787281216 deprecation.py:350] From /home/tensorflow/.local/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a tf.sparse.SparseTensor and use tf.sparse.to_dense instead.
WARNING:tensorflow:From /home/tensorflow/.local/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
W0622 00:20:14.406469 140269787281216 deprecation.py:350] From /home/tensorflow/.local/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py:1082: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
/home/tensorflow/.local/lib/python3.8/site-packages/keras/backend.py:450: UserWarning: tf.keras.backend.set_learning_phase is deprecated and will be removed after 2020-10-11. To update it, simply pass a True/False value to the training argument of the __call__ method of your layer or model.
warnings.warn('tf.keras.backend.set_learning_phase is deprecated and '
WARNING:tensorflow:From /home/tensorflow/.local/lib/python3.8/site-packages/tensorflow/python/util/deprecation.py:629: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Use fn_output_signature instead
W0622 00:20:38.413714 140262147876608 deprecation.py:554] From /home/tensorflow/.local/lib/python3.8/site-packages/tensorflow/python/util/deprecation.py:629: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Use fn_output_signature instead
WARNING:tensorflow:Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss. If you're using model.compile(), did you forget to provide a lossargument?
W0622 00:20:45.288741 140262147876608 utils.py:76] Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss. If you're using model.compile(), did you forget to provide a lossargument?
WARNING:tensorflow:Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss. If you're using model.compile(), did you forget to provide a lossargument?
W0622 00:20:53.888063 140262147876608 utils.py:76] Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss. If you're using model.compile(), did you forget to provide a lossargument?
WARNING:tensorflow:Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss. If you're using model.compile(), did you forget to provide a lossargument?
W0622 00:21:02.133720 140262147876608 utils.py:76] Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss. If you're using model.compile(), did you forget to provide a lossargument?
WARNING:tensorflow:Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss. If you're using model.compile(), did you forget to provide a lossargument?
W0622 00:21:12.010699 140262147876608 utils.py:76] Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss. If you're using model.compile(), did you forget to provide a lossargument?
Killed

label map path issue

Having changed all lines in the config file, I get the following error when I run the training, which makes me think another label map path is being used somewhere.

This is my input:

Does the docker file work on the Mac?

Firstly, thank you for making a very 2020-easy-to-setup version of the Tensorflow Object Detection API. Most tutorials make the setup process a huge pain in the ass.

I'm trying to run your docker file on my mac. Does it not work on a Mac? I'm getting the following error when I let docker-compose up run fully:

Successfully built 9365a006aa6c
Successfully tagged docker_tensorflow_object_detection_api:latest
Creating tensorflow_object_detection_api ... error

ERROR: for tensorflow_object_detection_api  Cannot create container for service tensorflow_object_detection_api: Unknown runtime specified nvidia

ERROR: for tensorflow_object_detection_api  Cannot create container for service tensorflow_object_detection_api: Unknown runtime specified nvidia
ERROR: Encountered errors while bringing up the project.

Can this only be run on Linux machines, or can it be run on a Mac as well?

python -m pip install . error

ERROR: Command errored out with exit status 1:
command: 'C:\Users\mskak\AppData\Local\Programs\Python\Python37\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\mskak\AppData\Local\Temp\pip-install-s5ovt5o5\pycocotools\setup.py'"'"'; file='"'"'C:\Users\mskak\AppData\Local\Temp\pip-install-s5ovt5o5\pycocotools\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\mskak\AppData\Local\Temp\pip-record-ngx97fie\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\mskak\AppData\Local\Programs\Python\Python37\Include\pycocotools'
cwd: C:\Users\mskak\AppData\Local\Temp\pip-install-s5ovt5o5\pycocotools
Complete output (21 lines):
running install
running build
running build_py
creating build
creating build\lib.win-amd64-3.7
creating build\lib.win-amd64-3.7\pycocotools
copying pycocotools\coco.py -> build\lib.win-amd64-3.7\pycocotools
copying pycocotools\cocoeval.py -> build\lib.win-amd64-3.7\pycocotools
copying pycocotools\mask.py -> build\lib.win-amd64-3.7\pycocotools
copying pycocotools\__init__.py -> build\lib.win-amd64-3.7\pycocotools
running build_ext
skipping 'pycocotools\_mask.c' Cython extension (up-to-date)
building 'pycocotools._mask' extension
creating build\temp.win-amd64-3.7
creating build\temp.win-amd64-3.7\Release
creating build\temp.win-amd64-3.7\Release\common
creating build\temp.win-amd64-3.7\Release\pycocotools
C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Users\mskak\AppData\Local\Programs\Python\Python37\lib\site-packages\numpy\core\include -I./common -IC:\Users\mskak\AppData\Local\Programs\Python\Python37\include -IC:\Users\mskak\AppData\Local\Programs\Python\Python37\include "-IC:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\include" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.7.2\include\um" /Tc./common/maskApi.c /Fobuild\temp.win-amd64-3.7\Release./common/maskApi.obj
maskApi.c
./common/maskApi.c(8): fatal error C1083: Cannot open include file: 'math.h': No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.27.29110\bin\HostX86\x64\cl.exe' failed with exit status 2
----------------------------------------
ERROR: Command errored out with exit status 1: 'C:\Users\mskak\AppData\Local\Programs\Python\Python37\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\Users\mskak\AppData\Local\Temp\pip-install-s5ovt5o5\pycocotools\setup.py'"'"'; file='"'"'C:\Users\mskak\AppData\Local\Temp\pip-install-s5ovt5o5\pycocotools\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record 'C:\Users\mskak\AppData\Local\Temp\pip-record-ngx97fie\install-record.txt' --single-version-externally-managed --compile --install-headers 'C:\Users\mskak\AppData\Local\Programs\Python\Python37\Include\pycocotools' Check the logs for full command output.

File "...\site-packages\tensorflow\python\training\py_checkpoint_reader.py", line 48, in error_translator raise errors_impl.OpError(None, None, error_message, errors_impl.UNKNOWN) tensorflow.python.framework.errors_impl.OpError: not an sstable (bad magic number)

Hi, I followed your article but got this error. I think there is something wrong with the checkpoint file? Are there other steps needed before I'd be able to use the model from http://download.tensorflow.org/models/object_detection/tf2/20200711/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8.tar.gz?

I changed this line.
fine_tune_checkpoint: "training/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8/checkpoint/ckpt-0"

Thanks!
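
A TF2 checkpoint is normally stored as a `<prefix>.index` file plus one or more `<prefix>.data-*` shards, and `fine_tune_checkpoint` is expected to name the prefix (here `.../checkpoint/ckpt-0`) rather than any single file. One possible cause of the error above, which the report does not confirm, is that the prefix does not resolve to those files from the directory the training script runs in. The sketch below checks this with the standard library only; `checkpoint_prefix_status` is a hypothetical helper.

```python
import glob
import os


def checkpoint_prefix_status(prefix):
    """Report which files exist on disk for a TF2-style checkpoint prefix.

    Hypothetical helper: it only inspects file names and does not read
    the checkpoint contents.
    """
    return {
        # The index file that accompanies the data shards.
        "index_file": os.path.isfile(prefix + ".index"),
        # Data shards such as ckpt-0.data-00000-of-00001.
        "data_shards": sorted(glob.glob(glob.escape(prefix) + ".data-*")),
        # The bare prefix is normally not a file itself.
        "prefix_is_file": os.path.isfile(prefix),
    }


if __name__ == "__main__":
    # Path copied from the config line quoted above.
    prefix = "training/mask_rcnn_inception_resnet_v2_1024x1024_coco17_gpu-8/checkpoint/ckpt-0"
    print(checkpoint_prefix_status(prefix))
```

If `index_file` is `False` or `data_shards` is empty, the path in the config likely needs adjusting before looking for other causes.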

Supported Model / Compatibility with USB-Accelerator

Hey,

I read that TensorFlow Lite only supports the SSD models from the TensorFlow 2 model zoo, for example SSD MobileNet V2 FPNLite 640x640. Do you have a tutorial for converting such a model to TFLite, and can the converted model be run with the Google Coral hardware accelerator (USB stick)?

Training Aborts with 1 checkpoint and WARNING:tensorflow:Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss. W0206 04:50:37.530761 140648355514112 utils.py:83] Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss.

I am running your Colab notebook on my Colab Pro account with 32 GB of RAM and the GPU enabled. I have set the following parameters:

batch_size = 1
num_steps = 8
num_eval_steps = 1

It runs fine until the first checkpoint is created, but after that it just stops with the following in the output cell:

W0206 04:50:15.624153 140651468793728 util.py:169] A checkpoint was restored (e.g. tf.train.Checkpoint.restore or tf.keras.Model.load_weights) but not all checkpointed values were used. See above for specific issues. Use expect_partial() on the load status object, e.g. tf.train.Checkpoint.restore(...).expect_partial(), to silence these warnings, or use assert_consumed() to make the check explicit. See https://www.tensorflow.org/guide/checkpoint#loading_mechanics for details.
WARNING:tensorflow:Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss.
W0206 04:50:25.243972 140648355514112 utils.py:83] Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:605: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Use fn_output_signature instead
W0206 04:50:26.765114 140648355514112 deprecation.py:537] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:605: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Use fn_output_signature instead
WARNING:tensorflow:Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss.
W0206 04:50:37.530761 140648355514112 utils.py:83] Gradients do not exist for variables ['top_bn/gamma:0', 'top_bn/beta:0'] when minimizing the loss.

Please help

accuracy calculation

Is there any command to get the accuracy of a model that I created with TensorFlow 1.13 for a custom object?

Training Epoch stuck at (1/2)

Question: If I restart the training, will the training start off from the last checkpoint?

Hi Tanner

Like you in your video, I stopped the training and tested the predictions. I was not happy with the results so I want to restart the training.

It looks like it restarts from the last checkpoint, but I could not tell from TensorBoard whether that was actually the case.

Thanks again for a great blog post and repo to learn from.
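
TensorFlow records the most recent checkpoint of a directory in a small text file named `checkpoint`, in a line of the form `model_checkpoint_path: "ckpt-3"`. Reading that file is one way to see which checkpoint a restarted run would most likely pick up from the model directory. The sketch below parses it with the standard library only; `latest_checkpoint_name` is a hypothetical helper and the directory name in the usage part is an assumption.

```python
import os
import re


def latest_checkpoint_name(model_dir):
    """Return the checkpoint named in `<model_dir>/checkpoint`, or None.

    Hypothetical helper: it only reads the `model_checkpoint_path` entry
    of the text file and does not load the checkpoint.
    """
    state_file = os.path.join(model_dir, "checkpoint")
    if not os.path.isfile(state_file):
        return None
    with open(state_file, encoding="utf-8") as f:
        text = f.read()
    match = re.search(r'^model_checkpoint_path:\s*"([^"]*)"', text, re.MULTILINE)
    return match.group(1) if match else None


if __name__ == "__main__":
    # "training" is an assumed model directory; use your own --model_dir.
    print(latest_checkpoint_name("training"))
```

With TensorFlow installed, `tf.train.latest_checkpoint(model_dir)` gives the same information as a full path prefix.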

CuDNN version not up-to-date?

Describe the bug
Your out-of-the-box example no longer works (at least in Colab) because the cuDNN library version does not match.

To Reproduce

Just run your own notebook example in Colab.

If I do not install TensorFlow 2.6.0 as requested in the example notebook, it works. Maybe the version that Google Colab ships out of the box (2.7.0) uses the correct cuDNN version?

Expected behavior
The error should not appear while running the training session.
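
The first line of the log below states the compatibility rule: the cuDNN library loaded at runtime must have the same major version as the one TensorFlow was compiled against, and an equal or higher minor version. A minimal sketch of that rule, applied to the two versions from the log (`cudnn_compatible` is a hypothetical helper, not a TensorFlow function):

```python
def cudnn_compatible(loaded, compiled):
    """Apply the rule quoted in the log message.

    The major versions must match, and the loaded minor version must be
    equal to or higher than the compiled one. Patch versions are ignored.
    """
    loaded_major, loaded_minor = (int(p) for p in loaded.split(".")[:2])
    compiled_major, compiled_minor = (int(p) for p in compiled.split(".")[:2])
    return loaded_major == compiled_major and loaded_minor >= compiled_minor


if __name__ == "__main__":
    # Versions from the log: loaded 8.0.5, compiled against 8.1.0.
    print(cudnn_compatible("8.0.5", "8.1.0"))  # prints False
```

The result is `False` because minor version 0 is lower than the required minor version 1, which matches the failure reported in the log.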

Log-File with error

2021-12-14 13:30:57.846447: E tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0.  CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
2021-12-14 13:30:57.847597: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at conv_ops.cc:1120 : UNKNOWN: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
Traceback (most recent call last):
  File "/content/models/research/object_detection/model_main_tf2.py", line 115, in <module>
    tf.compat.v1.app.run()
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 303, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "/content/models/research/object_detection/model_main_tf2.py", line 112, in main
    record_summaries=FLAGS.record_summaries)
  File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 609, in train_loop
    train_input, unpad_groundtruth_tensors)
  File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 400, in load_fine_tune_checkpoint
    _ensure_model_is_built(model, input_dataset, unpad_groundtruth_tensors)
  File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 178, in _ensure_model_is_built
    labels,
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 1316, in run
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/distribute_lib.py", line 2892, in call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/mirrored_strategy.py", line 678, in _call_for_each_replica
    self._container_strategy(), fn, args, kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/distribute/mirrored_run.py", line 86, in call_for_each_replica
    return wrapped(args, kwargs)
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node EfficientDet-D0/model/stem_conv2d/Conv2D
 (defined at /usr/local/lib/python3.7/dist-packages/keras/layers/convolutional.py:238)
]] [Op:__inference__dummy_computation_fn_32281]

Errors may have originated from an input operation.
Input Source operations connected to node EfficientDet-D0/model/stem_conv2d/Conv2D:
In[0] args_1 (defined at /usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py:178)	
In[1] EfficientDet-D0/model/stem_conv2d/Conv2D/ReadVariableOp:

Operation defined at: (most recent call last)
>>>   File "/usr/lib/python3.7/threading.py", line 890, in _bootstrap
>>>     self._bootstrap_inner()
>>> 
>>>   File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
>>>     self.run()
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 170, in _dummy_computation_fn
>>>     return _compute_losses_and_predictions_dicts(model, features, labels,
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/object_detection/model_lib_v2.py", line 123, in _compute_losses_and_predictions_dicts
>>>     prediction_dict = model.predict(
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 569, in predict
>>>     if self._feature_extractor.is_keras_model:
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 570, in predict
>>>     feature_maps = self._feature_extractor(preprocessed_inputs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 64, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py", line 1083, in __call__
>>>     outputs = call_fn(inputs, *args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 92, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/object_detection/meta_architectures/ssd_meta_arch.py", line 251, in call
>>>     return self._extract_features(inputs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/object_detection/models/ssd_efficientnet_bifpn_feature_extractor.py", line 225, in _extract_features
>>>     base_feature_maps = self._efficientnet(
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 64, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py", line 1083, in __call__
>>>     outputs = call_fn(inputs, *args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 92, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py", line 452, in call
>>>     inputs, training=training, mask=mask)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/engine/functional.py", line 589, in _run_internal_graph
>>>     outputs = node.layer(*args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 64, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py", line 1083, in __call__
>>>     outputs = call_fn(inputs, *args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 92, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/layers/convolutional.py", line 246, in call
>>>     outputs = self.convolution_op(inputs, self.kernel)
>>> 
>>>   File "/usr/local/lib/python3.7/dist-packages/keras/layers/convolutional.py", line 238, in convolution_op
>>>     name=self.__class__.__name__)
>>>

Nevertheless: many thanks for making this great example available!
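
A commonly reported trigger for the "Failed to get convolution algorithm" / "cuDNN failed to initialize" error is that TensorFlow cannot reserve the GPU memory it wants at start-up, for example because another process or notebook already holds it. The following is a minimal sketch of one frequently suggested mitigation, not a confirmed fix for this report: it sets the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable so that TensorFlow allocates GPU memory on demand. It assumes memory pressure is the cause; it will not help if the installed cuDNN version does not match the TensorFlow build.

```python
import os

# Ask TensorFlow to grow GPU memory on demand instead of reserving it all up front.
# Set this before TensorFlow initializes the GPU; doing it before the import is the
# simplest way to be safe.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])

# import tensorflow as tf  # import TensorFlow only after the variable is set
```

In a notebook, processes started afterwards from the same kernel (such as a training script) inherit the variable, so it should be set in a cell that runs before training is launched.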

ImportError: cannot import name 'abs'

When I run the command "python object_detection/builders/model_builder_tf2_test.py", I get the error shown in the title.

using tensorflow-gpu

The "python -m pip install ." command installs tensorflow but not tensorflow-gpu. How can I use tensorflow-gpu instead?
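
Since TensorFlow 2.1, the regular `tensorflow` pip package includes GPU support, so a separate `tensorflow-gpu` install is generally not needed for TensorFlow 2, provided compatible CUDA and cuDNN libraries are present. A minimal sketch for checking whether the installed TensorFlow can see a GPU; `visible_gpus` is a hypothetical helper name, not part of the Object Detection API:

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # list_physical_devices is available in TensorFlow 2.1 and later.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment.")
elif not gpus:
    print("TensorFlow is installed but sees no GPU.")
else:
    print("GPUs visible to TensorFlow:", gpus)
```

An empty list usually points at the driver, CUDA, or cuDNN setup rather than at the pip package that was installed.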

Docker Error: The following signatures couldn't be verified because the public key is not available

Describe the bug
The Docker installation is failing because the public key for the CUDA repository cannot be verified. The error is shown below:

To Reproduce

models git:(master) ✗ docker build -f research/object_detection/dockerfiles/tf2/Dockerfile -t od .
[+] Building 4.5s (6/15)
 => [internal] load build definition from Dockerfile                                                                                                                                           0.0s
 => => transferring dockerfile: 37B                                                                                                                                                            0.0s
 => [internal] load .dockerignore                                                                                                                                                              0.0s
 => => transferring context: 2B                                                                                                                                                                0.0s
 => [internal] load metadata for docker.io/tensorflow/tensorflow:2.2.0-gpu                                                                                                                     0.7s
 => CACHED [ 1/11] FROM docker.io/tensorflow/tensorflow:2.2.0-gpu@sha256:3f8f06cdfbc09c54568f191bbc54419b348ecc08dc5e031a53c22c6bba0a252e                                                      0.0s
 => [internal] load build context                                                                                                                                                              0.2s
 => => transferring context: 264.21kB                                                                                                                                                          0.2s
 => ERROR [ 2/11] RUN apt-get update && apt-get install -y     git     gpg-agent     python3-cairocffi     protobuf-compiler     python3-pil     python3-lxml     python3-tk     wget          3.6s
------
 > [ 2/11] RUN apt-get update && apt-get install -y     git     gpg-agent     python3-cairocffi     protobuf-compiler     python3-pil     python3-lxml     python3-tk     wget:
#5 0.537 Get:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease [1581 B]
#5 0.595 Hit:2 http://archive.ubuntu.com/ubuntu bionic InRelease
#5 0.625 Get:3 http://archive.ubuntu.com/ubuntu bionic-updates InRelease [88.7 kB]
#5 0.634 Err:1 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease
#5 0.634   The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
#5 0.659 Ign:4 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  InRelease
#5 0.676 Get:5 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release [564 B]
#5 0.693 Get:6 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Release.gpg [833 B]
#5 0.704 Get:7 http://security.ubuntu.com/ubuntu bionic-security InRelease [88.7 kB]
#5 0.731 Get:8 http://archive.ubuntu.com/ubuntu bionic-backports InRelease [74.6 kB]
#5 0.818 Get:9 https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64  Packages [73.8 kB]
#5 0.914 Get:10 http://archive.ubuntu.com/ubuntu bionic-updates/main amd64 Packages [3221 kB]
#5 1.105 Get:11 http://archive.ubuntu.com/ubuntu bionic-updates/multiverse amd64 Packages [29.8 kB]
#5 1.106 Get:12 http://archive.ubuntu.com/ubuntu bionic-updates/universe amd64 Packages [2278 kB]
#5 1.156 Get:13 http://archive.ubuntu.com/ubuntu bionic-updates/restricted amd64 Packages [986 kB]
#5 1.172 Get:14 http://archive.ubuntu.com/ubuntu bionic-backports/universe amd64 Packages [12.9 kB]
#5 1.172 Get:15 http://archive.ubuntu.com/ubuntu bionic-backports/main amd64 Packages [12.2 kB]
#5 1.280 Get:16 http://security.ubuntu.com/ubuntu bionic-security/multiverse amd64 Packages [22.8 kB]
#5 1.422 Get:17 http://security.ubuntu.com/ubuntu bionic-security/main amd64 Packages [2781 kB]
#5 2.126 Get:18 http://security.ubuntu.com/ubuntu bionic-security/restricted amd64 Packages [949 kB]
#5 2.174 Get:19 http://security.ubuntu.com/ubuntu bionic-security/universe amd64 Packages [1503 kB]
#5 2.451 Reading package lists...
#5 3.560 W: GPG error: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A4B469963BF863CC
#5 3.560 E: The repository 'https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64  InRelease' is no longer signed.
------
executor failed running [/bin/bash -c apt-get update && apt-get install -y     git     gpg-agent     python3-cairocffi     protobuf-compiler     python3-pil     python3-lxml     python3-tk     wget]: exit code: 100

Desktop (please complete the following information):

OS: macOS Big Sur
Version 11.6.1

Val_Los SSD Mobilenet V2 is this normal?

Hello, I am using SSD MobileNet V2 from the TensorFlow 2 model zoo.
I chose SSD MobileNet V2 because I want to run real-time detection on Android later.
I am training on a Google Colab GPU and followed all of your instructions.
My batch size is 32
and I trained for 8000 steps.

My dataset is from Google Open Images V6:
1000 human_body
1000 bicycle
1000 car
1000 motorcycle
336 Stop_sign

I don't understand; is this really okay?

This is the training progress:
I0912 11:40:27.956218 139735407601536 model_lib_v2.py:652] Step 100 per-step time 0.813s loss=1.396
INFO:tensorflow:Step 200 per-step time 0.831s loss=1.192
I0912 11:41:51.976096 139735407601536 model_lib_v2.py:652] Step 200 per-step time 0.831s loss=1.192
INFO:tensorflow:Step 300 per-step time 0.844s loss=1.346
I0912 11:43:14.231966 139735407601536 model_lib_v2.py:652] Step 300 per-step time 0.844s loss=1.346
INFO:tensorflow:Step 400 per-step time 0.771s loss=1.360
I0912 11:44:32.873070 139735407601536 model_lib_v2.py:652] Step 400 per-step time 0.771s loss=1.360
INFO:tensorflow:Step 500 per-step time 0.701s loss=1.444
I0912 11:45:50.882271 139735407601536 model_lib_v2.py:652] Step 500 per-step time 0.701s loss=1.444
INFO:tensorflow:Step 600 per-step time 0.827s loss=1.204
I0912 11:47:12.726692 139735407601536 model_lib_v2.py:652] Step 600 per-step time 0.827s loss=1.204
INFO:tensorflow:Step 700 per-step time 0.736s loss=1.774
I0912 11:48:34.281836 139735407601536 model_lib_v2.py:652] Step 700 per-step time 0.736s loss=1.774
INFO:tensorflow:Step 800 per-step time 0.748s loss=2.155
I0912 11:49:56.143587 139735407601536 model_lib_v2.py:652] Step 800 per-step time 0.748s loss=2.155
INFO:tensorflow:Step 900 per-step time 0.813s loss=2.374
I0912 11:51:17.713723 139735407601536 model_lib_v2.py:652] Step 900 per-step time 0.813s loss=2.374
INFO:tensorflow:Step 1000 per-step time 0.824s loss=2.285
I0912 11:52:35.221740 139735407601536 model_lib_v2.py:652] Step 1000 per-step time 0.824s loss=2.285
INFO:tensorflow:Step 1100 per-step time 0.775s loss=2.309
I0912 11:53:52.562232 139735407601536 model_lib_v2.py:652] Step 1100 per-step time 0.775s loss=2.309
INFO:tensorflow:Step 1200 per-step time 0.757s loss=2.085
I0912 11:55:10.645867 139735407601536 model_lib_v2.py:652] Step 1200 per-step time 0.757s loss=2.085
INFO:tensorflow:Step 1300 per-step time 0.795s loss=2.252
I0912 11:56:29.934082 139735407601536 model_lib_v2.py:652] Step 1300 per-step time 0.795s loss=2.252
INFO:tensorflow:Step 1400 per-step time 0.778s loss=2.198
I0912 11:57:50.456187 139735407601536 model_lib_v2.py:652] Step 1400 per-step time 0.778s loss=2.198
INFO:tensorflow:Step 1500 per-step time 0.912s loss=1.814
I0912 11:59:12.039975 139735407601536 model_lib_v2.py:652] Step 1500 per-step time 0.912s loss=1.814
INFO:tensorflow:Step 1600 per-step time 0.890s loss=2.095
I0912 12:00:32.703614 139735407601536 model_lib_v2.py:652] Step 1600 per-step time 0.890s loss=2.095
INFO:tensorflow:Step 1700 per-step time 0.931s loss=1.791
I0912 12:01:55.470209 139735407601536 model_lib_v2.py:652] Step 1700 per-step time 0.931s loss=1.791
INFO:tensorflow:Step 1800 per-step time 0.832s loss=1.960
I0912 12:03:16.882377 139735407601536 model_lib_v2.py:652] Step 1800 per-step time 0.832s loss=1.960
INFO:tensorflow:Step 1900 per-step time 0.834s loss=1.919
I0912 12:04:38.863093 139735407601536 model_lib_v2.py:652] Step 1900 per-step time 0.834s loss=1.919
INFO:tensorflow:Step 2000 per-step time 0.781s loss=1.836
I0912 12:06:00.716773 139735407601536 model_lib_v2.py:652] Step 2000 per-step time 0.781s loss=1.836
INFO:tensorflow:Step 2100 per-step time 0.784s loss=1.699
I0912 12:07:23.374910 139735407601536 model_lib_v2.py:652] Step 2100 per-step time 0.784s loss=1.699
INFO:tensorflow:Step 2200 per-step time 0.742s loss=1.667
I0912 12:08:45.287693 139735407601536 model_lib_v2.py:652] Step 2200 per-step time 0.742s loss=1.667
INFO:tensorflow:Step 2300 per-step time 0.885s loss=1.775
I0912 12:10:07.288780 139735407601536 model_lib_v2.py:652] Step 2300 per-step time 0.885s loss=1.775
INFO:tensorflow:Step 2400 per-step time 0.802s loss=1.972
I0912 12:11:27.960621 139735407601536 model_lib_v2.py:652] Step 2400 per-step time 0.802s loss=1.972
INFO:tensorflow:Step 2500 per-step time 0.870s loss=1.679
I0912 12:12:52.460410 139735407601536 model_lib_v2.py:652] Step 2500 per-step time 0.870s loss=1.679
INFO:tensorflow:Step 2600 per-step time 0.855s loss=1.824
I0912 12:14:17.548085 139735407601536 model_lib_v2.py:652] Step 2600 per-step time 0.855s loss=1.824
INFO:tensorflow:Step 2700 per-step time 0.850s loss=1.824
I0912 12:15:39.991292 139735407601536 model_lib_v2.py:652] Step 2700 per-step time 0.850s loss=1.824
INFO:tensorflow:Step 2800 per-step time 0.907s loss=1.557
I0912 12:17:01.983814 139735407601536 model_lib_v2.py:652] Step 2800 per-step time 0.907s loss=1.557
INFO:tensorflow:Step 2900 per-step time 0.712s loss=1.685
I0912 12:18:25.104320 139735407601536 model_lib_v2.py:652] Step 2900 per-step time 0.712s loss=1.685
INFO:tensorflow:Step 3000 per-step time 0.896s loss=1.839
I0912 12:19:47.908390 139735407601536 model_lib_v2.py:652] Step 3000 per-step time 0.896s loss=1.839
INFO:tensorflow:Step 3100 per-step time 0.840s loss=1.662
I0912 12:21:11.121691 139735407601536 model_lib_v2.py:652] Step 3100 per-step time 0.840s loss=1.662
INFO:tensorflow:Step 3200 per-step time 0.923s loss=1.586
I0912 12:22:33.857348 139735407601536 model_lib_v2.py:652] Step 3200 per-step time 0.923s loss=1.586
INFO:tensorflow:Step 3300 per-step time 0.739s loss=1.439
I0912 12:23:53.232469 139735407601536 model_lib_v2.py:652] Step 3300 per-step time 0.739s loss=1.439
INFO:tensorflow:Step 3400 per-step time 0.734s loss=1.660
I0912 12:25:11.282731 139735407601536 model_lib_v2.py:652] Step 3400 per-step time 0.734s loss=1.660
INFO:tensorflow:Step 3500 per-step time 0.932s loss=1.530
I0912 12:26:33.676065 139735407601536 model_lib_v2.py:652] Step 3500 per-step time 0.932s loss=1.530
INFO:tensorflow:Step 3600 per-step time 0.880s loss=1.334
I0912 12:27:55.216676 139735407601536 model_lib_v2.py:652] Step 3600 per-step time 0.880s loss=1.334
INFO:tensorflow:Step 3700 per-step time 0.832s loss=1.363
I0912 12:29:19.527122 139735407601536 model_lib_v2.py:652] Step 3700 per-step time 0.832s loss=1.363
INFO:tensorflow:Step 3800 per-step time 0.766s loss=1.343
I0912 12:30:41.062087 139735407601536 model_lib_v2.py:652] Step 3800 per-step time 0.766s loss=1.343
INFO:tensorflow:Step 3900 per-step time 0.764s loss=1.710
I0912 12:32:04.439736 139735407601536 model_lib_v2.py:652] Step 3900 per-step time 0.764s loss=1.710
INFO:tensorflow:Step 4000 per-step time 0.827s loss=1.531
I0912 12:33:28.149232 139735407601536 model_lib_v2.py:652] Step 4000 per-step time 0.827s loss=1.531
INFO:tensorflow:Step 4100 per-step time 0.949s loss=1.545
I0912 12:34:55.035795 139735407601536 model_lib_v2.py:652] Step 4100 per-step time 0.949s loss=1.545
INFO:tensorflow:Step 4200 per-step time 0.912s loss=1.541
I0912 12:36:20.388214 139735407601536 model_lib_v2.py:652] Step 4200 per-step time 0.912s loss=1.541
INFO:tensorflow:Step 4300 per-step time 0.865s loss=1.448
I0912 12:37:44.852299 139735407601536 model_lib_v2.py:652] Step 4300 per-step time 0.865s loss=1.448
INFO:tensorflow:Step 4400 per-step time 0.833s loss=1.298
I0912 12:39:09.155400 139735407601536 model_lib_v2.py:652] Step 4400 per-step time 0.833s loss=1.298
INFO:tensorflow:Step 4500 per-step time 0.827s loss=1.421
I0912 12:40:32.768919 139735407601536 model_lib_v2.py:652] Step 4500 per-step time 0.827s loss=1.421
INFO:tensorflow:Step 4600 per-step time 0.778s loss=1.347
I0912 12:41:53.501947 139735407601536 model_lib_v2.py:652] Step 4600 per-step time 0.778s loss=1.347
INFO:tensorflow:Step 4700 per-step time 0.826s loss=1.466
I0912 12:43:13.038845 139735407601536 model_lib_v2.py:652] Step 4700 per-step time 0.826s loss=1.466
INFO:tensorflow:Step 4800 per-step time 0.755s loss=1.288
I0912 12:44:31.372616 139735407601536 model_lib_v2.py:652] Step 4800 per-step time 0.755s loss=1.288
INFO:tensorflow:Step 4900 per-step time 0.933s loss=1.212
I0912 12:45:51.460758 139735407601536 model_lib_v2.py:652] Step 4900 per-step time 0.933s loss=1.212
INFO:tensorflow:Step 5000 per-step time 0.813s loss=1.464
I0912 12:47:13.726693 139735407601536 model_lib_v2.py:652] Step 5000 per-step time 0.813s loss=1.464
INFO:tensorflow:Step 5100 per-step time 0.854s loss=1.244
I0912 12:48:35.392923 139735407601536 model_lib_v2.py:652] Step 5100 per-step time 0.854s loss=1.244
INFO:tensorflow:Step 5200 per-step time 0.679s loss=1.202
I0912 12:49:53.332885 139735407601536 model_lib_v2.py:652] Step 5200 per-step time 0.679s loss=1.202
INFO:tensorflow:Step 5300 per-step time 0.713s loss=1.580
I0912 12:51:10.413735 139735407601536 model_lib_v2.py:652] Step 5300 per-step time 0.713s loss=1.580
INFO:tensorflow:Step 5400 per-step time 0.812s loss=1.327
I0912 12:52:29.304504 139735407601536 model_lib_v2.py:652] Step 5400 per-step time 0.812s loss=1.327
INFO:tensorflow:Step 5500 per-step time 0.802s loss=1.397
I0912 12:53:49.318393 139735407601536 model_lib_v2.py:652] Step 5500 per-step time 0.802s loss=1.397
INFO:tensorflow:Step 5600 per-step time 0.792s loss=1.503
I0912 12:55:11.936688 139735407601536 model_lib_v2.py:652] Step 5600 per-step time 0.792s loss=1.503
INFO:tensorflow:Step 5700 per-step time 0.933s loss=1.430
I0912 12:56:33.722716 139735407601536 model_lib_v2.py:652] Step 5700 per-step time 0.933s loss=1.430
INFO:tensorflow:Step 5800 per-step time 0.817s loss=1.384
I0912 12:57:55.663592 139735407601536 model_lib_v2.py:652] Step 5800 per-step time 0.817s loss=1.384
INFO:tensorflow:Step 5900 per-step time 0.926s loss=1.433
I0912 12:59:21.656138 139735407601536 model_lib_v2.py:652] Step 5900 per-step time 0.926s loss=1.433
INFO:tensorflow:Step 6000 per-step time 0.886s loss=1.201
I0912 13:00:49.301700 139735407601536 model_lib_v2.py:652] Step 6000 per-step time 0.886s loss=1.201
INFO:tensorflow:Step 6100 per-step time 0.909s loss=1.172
I0912 13:02:14.654530 139735407601536 model_lib_v2.py:652] Step 6100 per-step time 0.909s loss=1.172
INFO:tensorflow:Step 6200 per-step time 0.796s loss=1.339
I0912 13:03:39.308925 139735407601536 model_lib_v2.py:652] Step 6200 per-step time 0.796s loss=1.339
INFO:tensorflow:Step 6300 per-step time 0.882s loss=1.223
I0912 13:05:04.184717 139735407601536 model_lib_v2.py:652] Step 6300 per-step time 0.882s loss=1.223
INFO:tensorflow:Step 6400 per-step time 0.859s loss=1.135
I0912 13:06:30.135948 139735407601536 model_lib_v2.py:652] Step 6400 per-step time 0.859s loss=1.135
INFO:tensorflow:Step 6500 per-step time 0.833s loss=1.351
I0912 13:07:55.786441 139735407601536 model_lib_v2.py:652] Step 6500 per-step time 0.833s loss=1.351
INFO:tensorflow:Step 6600 per-step time 0.925s loss=1.251
I0912 13:09:20.490083 139735407601536 model_lib_v2.py:652] Step 6600 per-step time 0.925s loss=1.251
INFO:tensorflow:Step 6700 per-step time 0.883s loss=1.101
I0912 13:10:44.277643 139735407601536 model_lib_v2.py:652] Step 6700 per-step time 0.883s loss=1.101
INFO:tensorflow:Step 6800 per-step time 0.904s loss=1.522
I0912 13:12:09.175756 139735407601536 model_lib_v2.py:652] Step 6800 per-step time 0.904s loss=1.522
INFO:tensorflow:Step 6900 per-step time 0.797s loss=1.457
I0912 13:13:34.616225 139735407601536 model_lib_v2.py:652] Step 6900 per-step time 0.797s loss=1.457
INFO:tensorflow:Step 7000 per-step time 0.835s loss=1.203
I0912 13:14:57.843070 139735407601536 model_lib_v2.py:652] Step 7000 per-step time 0.835s loss=1.203
INFO:tensorflow:Step 7100 per-step time 0.760s loss=1.240
I0912 13:16:23.126715 139735407601536 model_lib_v2.py:652] Step 7100 per-step time 0.760s loss=1.240
INFO:tensorflow:Step 7200 per-step time 0.822s loss=1.347
I0912 13:17:45.772905 139735407601536 model_lib_v2.py:652] Step 7200 per-step time 0.822s loss=1.347
INFO:tensorflow:Step 7300 per-step time 0.887s loss=1.413
I0912 13:19:11.714976 139735407601536 model_lib_v2.py:652] Step 7300 per-step time 0.887s loss=1.413
INFO:tensorflow:Step 7400 per-step time 0.881s loss=1.129
I0912 13:20:38.058959 139735407601536 model_lib_v2.py:652] Step 7400 per-step time 0.881s loss=1.129
INFO:tensorflow:Step 7500 per-step time 0.906s loss=1.265
I0912 13:22:03.379301 139735407601536 model_lib_v2.py:652] Step 7500 per-step time 0.906s loss=1.265
INFO:tensorflow:Step 7600 per-step time 0.885s loss=1.283
I0912 13:23:29.195028 139735407601536 model_lib_v2.py:652] Step 7600 per-step time 0.885s loss=1.283
INFO:tensorflow:Step 7700 per-step time 0.807s loss=1.330
I0912 13:24:54.521943 139735407601536 model_lib_v2.py:652] Step 7700 per-step time 0.807s loss=1.330
INFO:tensorflow:Step 7800 per-step time 0.784s loss=1.258
I0912 13:26:17.378603 139735407601536 model_lib_v2.py:652] Step 7800 per-step time 0.784s loss=1.258
INFO:tensorflow:Step 7900 per-step time 0.866s loss=1.098
I0912 13:27:42.443229 139735407601536 model_lib_v2.py:652] Step 7900 per-step time 0.866s loss=1.098
INFO:tensorflow:Step 8000 per-step time 0.893s loss=1.235
I0912 13:29:03.420507 139735407601536 model_lib_v2.py:652] Step 8000 per-step time 0.893s loss=1.235

I then tested the model on 3 photos of people:
The first is a formal photo of myself, and it was detected as a motorcycle (58%).
The second is a formal photo of my friend, and nothing was detected.
The third is a photo of 5 people, and nothing was detected.

Can anybody help me with this
or give me some advice?
Is this a problem with my dataset?
A batch size problem?
Or do I need more steps?
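
Per-step losses are noisy, so it can help to smooth them before judging the trend. In the log above, the loss starts near 1.4, climbs to about 2.37 around step 900, and ends between roughly 1.1 and 1.3 by step 8000. A minimal sketch, assuming the log format shown above ("Step N per-step time Xs loss=Y"); `parse_losses` and `moving_average` are hypothetical helper names, and the sample text is copied from the first lines of the posted log:

```python
import re

# Matches both the "INFO:tensorflow:" lines and the "I0912 ..." lines.
LOG_PATTERN = re.compile(r"Step (\d+) per-step time [\d.]+s loss=([\d.]+)")


def parse_losses(log_text):
    """Return {step: loss}; duplicated log lines for a step collapse to one entry."""
    losses = {}
    for match in LOG_PATTERN.finditer(log_text):
        losses[int(match.group(1))] = float(match.group(2))
    return losses


def moving_average(values, window=5):
    """Trailing moving average; early entries average over fewer values."""
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


sample_log = """\
I0912 11:40:27.956218 139735407601536 model_lib_v2.py:652] Step 100 per-step time 0.813s loss=1.396
INFO:tensorflow:Step 200 per-step time 0.831s loss=1.192
I0912 11:41:51.976096 139735407601536 model_lib_v2.py:652] Step 200 per-step time 0.831s loss=1.192
INFO:tensorflow:Step 300 per-step time 0.844s loss=1.346
"""

losses = parse_losses(sample_log)
steps = sorted(losses)
smoothed = moving_average([losses[step] for step in steps], window=2)
for step, value in zip(steps, smoothed):
    print(step, round(value, 3))
```

Training loss alone does not show how well the model generalizes; evaluation metrics on held-out images are a better guide to whether more data or more steps are needed.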

Colab notebook training errors

ERROR: Could not find a version that satisfies the requirement tensorflow>=2.3.0 (from tf-models-official->object-detection==0.1)

Hi Gilbert,

when I run python3 -m pip install .
this error occurs:

ERROR: Could not find a version that satisfies the requirement tensorflow>=2.3.0 (from tf-models-official->object-detection==0.1) (from versions: none)
ERROR: No matching distribution found for tensorflow>=2.3.0 (from tf-models-official->object-detection==0.1)
(At the bottom you can see the complete output.)
I'm using an Nvidia Jetson Nano with Ubuntu 18.04, Python 3.6.9 and TensorFlow 2.2.0.

I thought TensorFlow version 2.2.0 should be OK.

So unfortunately I don't understand why there is an error that refers to tensorflow 2.3.0 although I followed your instructions. I don't know what I can do now.

Is there maybe a file where I can change the requirement from

tensorflow>=2.3.0

I cloned the TensorFlow Models repository a few days before I started your tutorial, but I can't imagine that this has anything to do with the error.
This is where I cloned the repository: https://github.com/tensorflow/models
Or would it have been right to clone it from here because I'm using Tensorflow 2.2.0: https://github.com/tensorflow/models/tree/r2.2.0

Do you have any idea what I can do?

Best regards
chris

nvidia@nvidia-desktop:~/models/research$ python3 -m pip install .
Defaulting to user installation because normal site-packages is not writeable
Processing /home/nvidia/models/research
Collecting avro-python3
  Downloading avro-python3-1.10.0.tar.gz (37 kB)
Collecting apache-beam
  Downloading apache-beam-2.25.0.zip (2.3 MB)
     |████████████████████████████████| 2.3 MB 5.2 MB/s 
Requirement already satisfied: pillow in /usr/local/lib/python3.6/dist-packages/Pillow-8.0.1-py3.6-linux-aarch64.egg (from object-detection==0.1) (8.0.1)
Requirement already satisfied: lxml in /usr/lib/python3/dist-packages (from object-detection==0.1) (4.2.1)
Requirement already satisfied: matplotlib in /home/nvidia/.local/lib/python3.6/site-packages (from object-detection==0.1) (3.3.2)
Requirement already satisfied: Cython in /home/nvidia/.local/lib/python3.6/site-packages (from object-detection==0.1) (0.29.21)
Requirement already satisfied: contextlib2 in /usr/local/lib/python3.6/dist-packages (from object-detection==0.1) (0.6.0.post1)
Collecting tf-slim
  Downloading tf_slim-1.1.0-py2.py3-none-any.whl (352 kB)
     |████████████████████████████████| 352 kB 8.1 MB/s 
Requirement already satisfied: six in /home/nvidia/.local/lib/python3.6/site-packages (from object-detection==0.1) (1.15.0)
Requirement already satisfied: pycocotools in /usr/local/lib/python3.6/dist-packages (from object-detection==0.1) (2.0.2)
Collecting lvis
  Downloading lvis-0.5.3-py3-none-any.whl (14 kB)
Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from object-detection==0.1) (1.4.1)
Requirement already satisfied: pandas in /home/nvidia/.local/lib/python3.6/site-packages (from object-detection==0.1) (1.1.3)
Collecting tf-models-official
  Downloading tf_models_official-2.3.0-py2.py3-none-any.whl (840 kB)
     |████████████████████████████████| 840 kB 7.9 MB/s 
Collecting crcmod<2.0,>=1.7
  Downloading crcmod-1.7.tar.gz (89 kB)
     |████████████████████████████████| 89 kB 2.3 MB/s 
Collecting dill<0.3.2,>=0.3.1.1
  Downloading dill-0.3.1.1.tar.gz (151 kB)
     |████████████████████████████████| 151 kB 8.4 MB/s 
Collecting fastavro<2,>=0.21.4
  Downloading fastavro-1.1.0.tar.gz (656 kB)
     |████████████████████████████████| 656 kB 8.0 MB/s 
Requirement already satisfied: future<1.0.0,>=0.18.2 in /home/nvidia/.local/lib/python3.6/site-packages (from apache-beam->object-detection==0.1) (0.18.2)
Requirement already satisfied: grpcio<2,>=1.29.0 in /usr/local/lib/python3.6/dist-packages (from apache-beam->object-detection==0.1) (1.33.2)
Collecting hdfs<3.0.0,>=2.1.0
  Downloading hdfs-2.5.8.tar.gz (41 kB)
     |████████████████████████████████| 41 kB 313 kB/s 
Requirement already satisfied: httplib2<0.18.0,>=0.8 in /usr/lib/python3/dist-packages (from apache-beam->object-detection==0.1) (0.9.2)
Collecting mock<3.0.0,>=1.0.1
  Downloading mock-2.0.0-py2.py3-none-any.whl (56 kB)
     |████████████████████████████████| 56 kB 1.6 MB/s 
Requirement already satisfied: numpy<2,>=1.14.3 in /usr/local/lib/python3.6/dist-packages (from apache-beam->object-detection==0.1) (1.16.1)
Collecting pymongo<4.0.0,>=3.8.0
  Downloading pymongo-3.11.0-cp36-cp36m-manylinux2014_aarch64.whl (507 kB)
     |████████████████████████████████| 507 kB 8.0 MB/s 
Collecting oauth2client<5,>=2.0.1
  Downloading oauth2client-4.1.3-py2.py3-none-any.whl (98 kB)
     |████████████████████████████████| 98 kB 2.2 MB/s 
Requirement already satisfied: protobuf<4,>=3.12.2 in /usr/local/lib/python3.6/dist-packages (from apache-beam->object-detection==0.1) (3.13.0)
Collecting pydot<2,>=1.2.0
  Downloading pydot-1.4.1-py2.py3-none-any.whl (19 kB)
Requirement already satisfied: python-dateutil<3,>=2.8.0 in /home/nvidia/.local/lib/python3.6/site-packages (from apache-beam->object-detection==0.1) (2.8.1)
Requirement already satisfied: pytz>=2018.3 in /home/nvidia/.local/lib/python3.6/site-packages (from apache-beam->object-detection==0.1) (2020.1)
Requirement already satisfied: requests<3.0.0,>=2.24.0 in /usr/local/lib/python3.6/dist-packages (from apache-beam->object-detection==0.1) (2.24.0)
Collecting typing-extensions<3.8.0,>=3.7.0
  Downloading typing_extensions-3.7.4.3-py3-none-any.whl (22 kB)
Collecting pyarrow<0.18.0,>=0.15.1
  Downloading pyarrow-0.17.1.tar.gz (2.6 MB)
     |████████████████████████████████| 2.6 MB 8.9 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3 in /usr/lib/python3/dist-packages (from matplotlib->object-detection==0.1) (2.2.0)
Requirement already satisfied: certifi>=2020.06.20 in /home/nvidia/.local/lib/python3.6/site-packages (from matplotlib->object-detection==0.1) (2020.6.20)
Requirement already satisfied: kiwisolver>=1.0.1 in /home/nvidia/.local/lib/python3.6/site-packages (from matplotlib->object-detection==0.1) (1.2.0)
Requirement already satisfied: cycler>=0.10 in /usr/lib/python3/dist-packages (from matplotlib->object-detection==0.1) (0.10.0)
Requirement already satisfied: absl-py>=0.2.2 in /usr/local/lib/python3.6/dist-packages (from tf-slim->object-detection==0.1) (0.11.0)
Requirement already satisfied: setuptools>=18.0 in /usr/local/lib/python3.6/dist-packages (from pycocotools->object-detection==0.1) (49.6.0)
Collecting opencv-python>=4.1.0.25
  Downloading opencv-python-4.4.0.46.tar.gz (88.9 MB)
     |████████████████████████████████| 88.9 MB 6.3 kB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: pyyaml in /usr/lib/python3/dist-packages (from tf-models-official->object-detection==0.1) (3.12)
Requirement already satisfied: tensorflow-hub>=0.6.0 in /home/nvidia/.local/lib/python3.6/site-packages (from tf-models-official->object-detection==0.1) (0.10.0)
Collecting dataclasses
  Downloading dataclasses-0.7-py3-none-any.whl (18 kB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.94-cp36-cp36m-manylinux2014_aarch64.whl (1.1 MB)
     |████████████████████████████████| 1.1 MB 9.9 MB/s 
Collecting kaggle>=1.3.9
  Downloading kaggle-1.5.9.tar.gz (58 kB)
     |████████████████████████████████| 58 kB 2.6 MB/s 
ERROR: Could not find a version that satisfies the requirement tensorflow>=2.3.0 (from tf-models-official->object-detection==0.1) (from versions: none)
ERROR: No matching distribution found for tensorflow>=2.3.0 (from tf-models-official->object-detection==0.1)
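
The "(from versions: none)" part of the message means pip found no installable tensorflow distribution at all for this interpreter and platform, not merely that the available versions were too old. The output also shows that the requirement comes from tf-models-official 2.3.0, which asks for tensorflow>=2.3.0, so an installed TensorFlow 2.2.0 would not satisfy it either. A minimal sketch for collecting the facts that matter here (Python version, CPU architecture, installed TensorFlow version); `installed_version` and `environment_report` are hypothetical helper names:

```python
import platform
import sys


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        from importlib import metadata  # available in Python 3.8+
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # fallback for older interpreters such as Python 3.6
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


def environment_report():
    """Summarize what pip has to match: interpreter, architecture, installed TensorFlow."""
    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "machine": platform.machine(),
        "tensorflow": installed_version("tensorflow"),
    }


print(environment_report())
```

On a Jetson Nano the machine value would be expected to be aarch64, as the wheel names in the output above suggest.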

No bounding boxes detected

How can I run detection in real time?
I loaded my saved_model.pb. There is no error, but no bounding boxes are drawn.

import cv2
import tensorflow as tf
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util
import numpy as np
def load_model(model_path):
model = tf.saved_model.load(model_path)
return model
cap = cv2.VideoCapture(0) # or cap = cv2.VideoCapture("")

List of the strings that is used to add correct label for each box.

PATH_TO_LABELS = '/root/object/models/research/object_detection/training/label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
model_name = '/root/object/models/research/object_detection/inference_graph/saved_model'
detection_model = load_model(model_name)

print(detection_model.signatures['serving_default'].inputs)
detection_model.signatures['serving_default'].output_dtypes
detection_model.signatures['serving_default'].output_shapes

def run_inference_for_single_image(model, image):
    image = np.asarray(image)
    # The input needs to be a tensor, convert it using tf.convert_to_tensor.
    input_tensor = tf.convert_to_tensor(image)
    # The model expects a batch of images, so add an axis with tf.newaxis.
    input_tensor = input_tensor[tf.newaxis]

    # Run inference
    model_fn = model.signatures['serving_default']
    output_dict = model_fn(input_tensor)

    # All outputs are batched tensors.
    # Convert to numpy arrays, and take index [0] to remove the batch dimension.
    # We're only interested in the first num_detections.
    num_detections = int(output_dict.pop('num_detections'))
    output_dict = {key: value[0, :num_detections].numpy()
                   for key, value in output_dict.items()}
    output_dict['num_detections'] = num_detections

    # detection_classes should be ints.
    output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)

    # Handle models with masks:
    if 'detection_masks' in output_dict:
        # Reframe the bbox mask to the image size.
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            output_dict['detection_masks'], output_dict['detection_boxes'],
            image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5,
                                           tf.uint8)
        output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()

    return output_dict

def run_inference(model, cap):
    while cap.isOpened():
        ret, image_np = cap.read()
        if not ret:
            # No frame could be read (camera unavailable or end of video).
            break
        # Actual detection.
        output_dict = run_inference_for_single_image(model, image_np)
        # Visualization of the results of a detection.
        vis_util.visualize_boxes_and_labels_on_image_array(
            image_np,
            output_dict['detection_boxes'],
            output_dict['detection_classes'],
            output_dict['detection_scores'],
            category_index,
            instance_masks=output_dict.get('detection_masks_reframed', None),
            use_normalized_coordinates=True,
            max_boxes_to_draw=10,
            min_score_thresh=.30,
            agnostic_mode=False,
            line_thickness=8)

        cv2.imshow('object_detection', cv2.resize(image_np, (640, 480)))
        if cv2.waitKey(25) & 0xFF == ord('q'):
            break

    # Release the capture device and close the window once the loop ends.
    cap.release()
    cv2.destroyAllWindows()


run_inference(detection_model, cap)
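
Two things may be worth ruling out in a script like the one above, though neither is a confirmed diagnosis. First, cap.read() returns frames in BGR channel order, while Object Detection API models are typically trained on RGB images. Second, min_score_thresh=.30 hides every box that scores lower, so a weak model can look like it detects nothing. The sketch below uses synthetic data only; the helper names bgr_to_rgb and summarize_scores are hypothetical and not part of the Object Detection API.

```python
import numpy as np


def bgr_to_rgb(frame):
    """Reverse the channel axis of an H x W x 3 frame (BGR <-> RGB)."""
    return np.ascontiguousarray(frame[..., ::-1])


def summarize_scores(scores, min_score_thresh=0.30):
    """Return (best score, number of detections at or above the threshold)."""
    scores = np.asarray(scores, dtype=np.float32)
    if scores.size == 0:
        return 0.0, 0
    return float(scores.max()), int((scores >= min_score_thresh).sum())


# Synthetic stand-ins for one webcam frame and one set of detection scores.
frame_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
frame_bgr[..., 0] = 255  # pure blue in OpenCV's BGR channel order
frame_rgb = bgr_to_rgb(frame_bgr)

best, kept = summarize_scores([0.12, 0.28, 0.05])
print(f"best score={best:.2f}, boxes at or above threshold={kept}")
```

In the real loop, the frame could be converted before calling run_inference_for_single_image, and summarize_scores could be fed output_dict['detection_scores'] to see whether the threshold is what removes the boxes.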

Error when running the installation test

Hi,
After installing with python -m pip install ., I ran python object_detection/builders/model_builder_tf2_test.py to test the installation, but I'm getting the errors shown in the image attached below.
I'm currently running this on a virtual machine in GCP that has a GPU. When I run pip install -U tensorflow==2.3.0, the command completes but still shows some errors and breaks my GPU setup, so I can no longer access the GPU.

Any help would be much appreciated!

Best,
Aravind

path to label_map

Having changed all lines in the config file, when I run the training I get the following error, which makes me think another label map path is being used somewhere?

This is my input:

Issue when: 4. Generating Training data

Hi Tanner,
The generate_tfrecord.py script referenced in section 4 (Generating Training data), from Dat Tran's raccoon detector, targets TF 1.x, while your code targets TF 2.x.
Running it generates:
Traceback (most recent call last):
  File "csv_a_tf.py", line 24, in <module>
    flags = tf.app.flags
AttributeError: module 'tensorflow' has no attribute 'app'

I recommend updating the reference to another repo (maybe https://github.com/douglasrizzo/detection_util_scripts).
Best regards,
yo0x
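
The traceback comes from tf.app no longer being available at the top level in TensorFlow 2. The TF 1.x flags are still reachable as tf.compat.v1.app.flags, and another option is to drop the flags dependency and parse the options with argparse from the standard library. Below is a minimal sketch of the second option. The option names csv_input, image_dir and output_path and the example paths are assumptions about what the script expects, not something taken from the traceback.

```python
import argparse


def parse_args(argv=None):
    """Parse the options that a TF 1.x script would read through tf.app.flags."""
    parser = argparse.ArgumentParser(
        description="Convert CSV annotations to a TFRecord file")
    # Option names are assumed; adjust them to match the script being ported.
    parser.add_argument("--csv_input", required=True,
                        help="Path to the CSV annotations")
    parser.add_argument("--image_dir", required=True,
                        help="Directory containing the images")
    parser.add_argument("--output_path", required=True,
                        help="Path of the TFRecord file to write")
    return parser.parse_args(argv)


# Example invocation with placeholder paths.
args = parse_args([
    "--csv_input", "images/train_labels.csv",
    "--image_dir", "images/train",
    "--output_path", "train.record",
])
print(args.csv_input, args.image_dir, args.output_path)
```

The rest of the script would then read args.csv_input instead of FLAGS.csv_input, and so on for the other options.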

extract the images inside the bounding boxes

I am still confused about how to extract the bounding boxes as separate images. Where exactly do I need to use your code from the Common Questions section? Could you please elaborate?

Below are the functions I am using; I am testing them on the test images folder.

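import pathlib

import numpy as np
import tensorflow as tf
from IPython.display import display
from PIL import Image
from object_detection.utils import ops as utils_ops
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util


def load_custom_model(model_name):
  model_file = model_name
  model_dir = pathlib.Path(model_file)/"saved_model"
  model = tf.saved_model.load(str(model_dir))
  return model

model_name = 'exported-models/my_model'
detection_model = load_custom_model(model_name)

PATH_TO_LABELS = 'training/label_map.pbtxt'
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS, use_display_name=True)
PATH_TO_TEST_IMAGES_DIR = pathlib.Path('test_images')
TEST_IMAGE_PATHS = sorted(list(PATH_TO_TEST_IMAGES_DIR.glob("*.jpg")))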


def run_inference_for_single_image(model, image):
  image = np.asarray(image)
  # The input needs to be a tensor, convert it using `tf.convert_to_tensor`.
  input_tensor = tf.convert_to_tensor(image)
  # The model expects a batch of images, so add an axis with `tf.newaxis`.
  input_tensor = input_tensor[tf.newaxis,...]

  # Run inference
  model_fn = model.signatures['serving_default']
  output_dict = model_fn(input_tensor)

  # All outputs are batched tensors.
  # Convert to numpy arrays, and take index [0] to remove the batch dimension.
  # We're only interested in the first num_detections.
  num_detections = int(output_dict.pop('num_detections'))
  output_dict = {key:value[0, :num_detections].numpy() 
                 for key,value in output_dict.items()}
  output_dict['num_detections'] = num_detections

  # detection_classes should be ints.
  output_dict['detection_classes'] = output_dict['detection_classes'].astype(np.int64)
   
  # Handle models with masks:
  if 'detection_masks' in output_dict:
    # Reframe the bbox mask to the image size.
    detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
              output_dict['detection_masks'], output_dict['detection_boxes'],
               image.shape[0], image.shape[1])      
    detection_masks_reframed = tf.cast(detection_masks_reframed > 0.5,
                                       tf.uint8)
    output_dict['detection_masks_reframed'] = detection_masks_reframed.numpy()
    
  return output_dict

def show_inference(model, image_path):
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = np.array(Image.open(image_path))
  # Actual detection.
  output_dict = run_inference_for_single_image(model, image_np)
  # Visualization of the results of a detection.
  vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks_reframed', None),
      use_normalized_coordinates=True,
      line_thickness=8)

  display(Image.fromarray(image_np))


for image_path in TEST_IMAGE_PATHS:
  show_inference(detection_model, image_path)
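
One place the cropping could go is inside show_inference, right after run_inference_for_single_image returns and before the visualization call, because the visualization draws boxes directly onto image_np. The sketch below assumes only what the Object Detection API provides: detection_boxes holds normalized [ymin, xmin, ymax, xmax] rows and detection_scores holds one score per box. The function name crop_detections and the 0.5 score cut-off are hypothetical choices for illustration, and the data is synthetic.

```python
import numpy as np


def crop_detections(image_np, boxes, scores, min_score=0.5):
    """Return one cropped array per box whose score is at least min_score.

    boxes holds normalized [ymin, xmin, ymax, xmax] rows, the layout of
    output_dict['detection_boxes'].
    """
    height, width = image_np.shape[:2]
    crops = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue
        # Scale the normalized coordinates to pixel indices.
        top, bottom = int(ymin * height), int(ymax * height)
        left, right = int(xmin * width), int(xmax * width)
        crops.append(image_np[top:bottom, left:right].copy())
    return crops


# Synthetic example: a 100 x 200 image with one confident and one weak box.
image_np = np.zeros((100, 200, 3), dtype=np.uint8)
boxes = np.array([[0.25, 0.25, 0.75, 0.5], [0.0, 0.0, 1.0, 1.0]])
scores = np.array([0.9, 0.2])
crops = crop_detections(image_np, boxes, scores)
print([crop.shape for crop in crops])
```

With real data the call would be crop_detections(image_np, output_dict['detection_boxes'], output_dict['detection_scores']), and each crop could be written out with Image.fromarray(crop).save(...).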

Created a script to run inferences on the test images and display in OpenCV window

First - thank you for putting together your YouTube video and this repo. It definitely helped a lot to get started with understanding how to use Tensorflow2 Object Detection.

I wanted to share a script I put together to run inferences and display the results locally using opencv windows. You can find a gist of my script here:

https://gist.github.com/youngsoul/ae49b39e35cc66d34d564958dda66f35
