ibm / max-object-detector

Localize and identify multiple objects in a single image.

Home Page: https://developer.ibm.com/exchanges/models/all/max-object-detector/

License: Apache License 2.0

Python 76.21% Dockerfile 5.50% Jupyter Notebook 18.29%
docker-image machine-learning machine-learning-models coco-dataset tensorflow-model

max-object-detector's Introduction


IBM Developer Model Asset Exchange: Object Detector

This repository contains code to instantiate and deploy an object detection model. The model recognizes the objects present in an image, drawn from the 80 high-level object classes of the COCO Dataset. It consists of a deep convolutional base network for image feature extraction, together with additional convolutional layers specialized for object detection, trained on the COCO Dataset. The input to the model is an image, and the output is a list of estimated class probabilities for the objects detected in the image.

The model is based on the SSD MobileNet V1 and Faster R-CNN ResNet-101 object detection models for TensorFlow. The model files are hosted on IBM Cloud Object Storage: ssd_mobilenet_v1.tar.gz and faster_rcnn_resnet101.tar.gz. The code in this repository deploys the model as a web service in a Docker container. This repository was developed as part of the IBM Developer Model Asset Exchange, and the public API is powered by IBM Cloud.

Model Metadata

Domain: Vision
Application: Object Detection
Industry: General
Framework: TensorFlow
Training Data: COCO Dataset
Input Data Format: Image (RGB/HWC)

References

Licenses

Component License Link
This repository Apache 2.0 LICENSE
Model Weights Apache 2.0 TensorFlow Models Repo
Model Code (3rd party) Apache 2.0 TensorFlow Models Repo
Test Samples Various Samples README

Prerequisites

  • docker: The Docker command-line interface. Follow the installation instructions for your system.
  • The minimum recommended resources for this model are 2 GB of memory and 2 CPUs.
  • If you are on x86-64/AMD64, your CPU must support at least AVX.

Deployment options

Deploy from Quay

To run the docker image, which automatically starts the model serving API, run:

Intel CPUs:

$ docker run -it -p 5000:5000 quay.io/codait/max-object-detector

ARM CPUs (e.g. Raspberry Pi):

$ docker run -it -p 5000:5000 quay.io/codait/max-object-detector:arm-arm32v7-latest

This will pull a pre-built image from the Quay.io container registry (or use an existing image if one is already cached locally) and run it. If you would rather check out and build the model locally, you can follow the run-locally steps below.

Deploy on Red Hat OpenShift

You can deploy the model-serving microservice on Red Hat OpenShift by following the instructions for the OpenShift web console or the OpenShift Container Platform CLI in this tutorial, specifying quay.io/codait/max-object-detector as the image name.

Deploy on Kubernetes

You can also deploy the model on Kubernetes using the latest docker image on Quay.

On your Kubernetes cluster, run the following command:

$ kubectl apply -f https://raw.githubusercontent.com/IBM/MAX-Object-Detector/master/max-object-detector.yaml

The model will be available internally at port 5000, but can also be accessed externally through the NodePort.

A more elaborate tutorial on how to deploy this MAX model to production on IBM Cloud can be found here.

Deploy on Code Engine

You can also deploy the model on IBM Cloud's Code Engine platform, which is based on the Knative serverless framework. Once authenticated with your IBM Cloud account, run the commands below.

Create a Code Engine project and give it a unique name:

$ ibmcloud ce project create --name sandbox

Run the container by pointing to the quay.io image and exposing port 5000:

$ ibmcloud ce application create --name max-object-detector --image quay.io/codait/max-object-detector --port 5000

Open the resulting URL in a browser; append /app to view the web app instead of the API.

Run Locally

  1. Build the Model
  2. Deploy the Model
  3. Use the Model
  4. Run the Notebook
  5. Development
  6. Cleanup

1. Build the Model

Clone this repository locally. In a terminal, run the following command:

$ git clone https://github.com/IBM/MAX-Object-Detector.git

Change directory into the repository base folder:

$ cd MAX-Object-Detector

To build the docker image locally for Intel CPUs, run:

$ docker build -t max-object-detector .

To select a model, pass in the --build-arg model=<desired-model> switch:

$ docker build --build-arg model=faster_rcnn_resnet101 -t max-object-detector .

Currently we support two models: ssd_mobilenet_v1 (default) and faster_rcnn_resnet101.

For ARM CPUs (e.g. Raspberry Pi), run:

$ docker build -f Dockerfile.arm32v7 -t max-object-detector .

All required model assets will be downloaded during the build process. Note that this Docker image is currently CPU-only (support for GPU images will be added later).

2. Deploy the Model

To run the docker image, which automatically starts the model serving API, run:

$ docker run -it -p 5000:5000 max-object-detector

3. Use the Model

The API server automatically generates an interactive Swagger documentation page. Go to http://localhost:5000 to load it. From there you can explore the API and also create test requests.

Use the model/predict endpoint to load a test image (you can use one of the test images from the samples folder) and get predicted labels for the image from the API. The coordinates of the bounding box are returned in the detection_box field, and contain the array of normalized coordinates (ranging from 0 to 1) in the form [ymin, xmin, ymax, xmax].
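Because the coordinates are normalized, they must be scaled by the image dimensions before drawing a box. A minimal sketch (the `to_pixels` helper is my own name, not part of the API):

```python
def to_pixels(detection_box, image_width, image_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to pixel coordinates."""
    ymin, xmin, ymax, xmax = detection_box
    return (
        int(round(ymin * image_height)),
        int(round(xmin * image_width)),
        int(round(ymax * image_height)),
        int(round(xmax * image_width)),
    )

# A box covering the whole of a 640x480 image maps to its full extent:
print(to_pixels([0.0, 0.0, 1.0, 1.0], 640, 480))  # (0, 0, 480, 640)
```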

Swagger Doc Screenshot

You can also test it on the command line, for example:

$ curl -F "image=@samples/dog-human.jpg" -XPOST http://127.0.0.1:5000/model/predict

You should see a JSON response like the one below:

{
  "status": "ok",
  "predictions": [
      {
          "label_id": "1",
          "label": "person",
          "probability": 0.944034993648529,
          "detection_box": [
              0.1242099404335022,
              0.12507188320159912,
              0.8423267006874084,
              0.5974075794219971
          ]
      },
      {
          "label_id": "18",
          "label": "dog",
          "probability": 0.8645511865615845,
          "detection_box": [
              0.10447660088539124,
              0.17799153923988342,
              0.8422801494598389,
              0.732001781463623
          ]
      }
  ]
}

You can also control the probability threshold for which objects are returned using the threshold argument, like below:

$ curl -F "image=@samples/dog-human.jpg" -XPOST "http://127.0.0.1:5000/model/predict?threshold=0.5"

The optional threshold parameter is the minimum probability value for predicted labels returned by the model. The default value for threshold is 0.7.
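The same request can be issued from Python. A hedged sketch using the third-party `requests` library (it assumes a server running on localhost as above; the helper names `predict` and `labels_above` are my own, not part of the API):

```python
def predict(image_path, threshold=0.7,
            url="http://127.0.0.1:5000/model/predict"):
    """POST an image to the model API; the server drops detections
    whose probability is below `threshold` (default 0.7)."""
    import requests  # imported here so the pure helper below works without it
    with open(image_path, "rb") as f:
        resp = requests.post(url, files={"image": f},
                             params={"threshold": threshold})
    resp.raise_for_status()
    return resp.json()["predictions"]

def labels_above(predictions, threshold):
    """Client-side equivalent of the server's threshold filter."""
    return [p["label"] for p in predictions if p["probability"] >= threshold]

# Usage (requires a running server):
#   for p in predict("samples/dog-human.jpg", threshold=0.5):
#       print(p["label"], round(p["probability"], 3))
```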

4. Run the Notebook

The demo notebook walks through how to use the model to detect objects in an image and visualize the results. By default, the notebook uses the hosted demo instance, but you can use a locally running instance (see the comments in Cell 3 for details). Note the demo requires jupyter, matplotlib, Pillow, and requests.

Run the following command from the model repo base folder, in a new terminal window:

$ jupyter notebook

This will start the notebook server. You can launch the demo notebook by clicking on demo.ipynb.

5. Development

To run the Flask API app in debug mode, edit config.py to set DEBUG = True under the application settings. You will then need to rebuild the docker image (see step 1).

6. Cleanup

To stop the Docker container, press Ctrl+C in your terminal.

Object Detector Web App

The latest release of the MAX Object Detector Web App is included in the Object Detector docker image.

When the model API server is running, the web app can be accessed at http://localhost:5000/app and provides interactive visualization of the bounding boxes and their related labels returned by the model.

Mini Web App Screenshot

If you wish to disable the web app, start the model serving API by running:

$ docker run -it -p 5000:5000 -e DISABLE_WEB_APP=true quay.io/codait/max-object-detector

Resources and Contributions

If you are interested in contributing to the Model Asset Exchange project or have any queries, please follow the instructions here.


max-object-detector's People

Contributors

ajbozarth, animeshsingh, bdwyer2, dependabot[bot], djalova, frreiss, imgbot[bot], kant, kmh4321, lresende, mlnick, nilmeier, ptitzler, splovyt, ssaishruthi, stevemar, xuhdev, xwu0226, yil532


max-object-detector's Issues

Extended support for Raspberry Pi 3

Hello!

My Raspberry Pi has the following characteristics:

processor	: 0
processor	: 1
processor	: 2
processor	: 3
model name	: **ARMv7** Processor rev 4 (v7l)
BogoMIPS	: 38.40
Features	: half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32 
CPU implementer	: 0x41
CPU architecture: 7
CPU variant	: 0x0
CPU part	: 0xd03
CPU revision	: 4

Hardware	: BCM2835
Revision	: a020d3
Serial		: 0000000000000000
Model		: Raspberry Pi 3 Model B Plus Rev 1.3
$ uname -a
Linux raspbari14 4.19.97-v7+ #1294 SMP Thu Jan 30 13:15:58 GMT 2020 **armv7l** GNU/Linux

I'm not sure whether your image is intended to support this older model. I do have TensorFlow installed on this computer. Could I run some scripts to build the image myself? :)

Also, I noticed that the version of Python being used is supposedly v2. Is there any support for v3?

These are not issues per se, but they are clear roadblocks to practicing your tutorials. Any guidance (other than purchasing an RPi4) would be sincerely appreciated.

Kind regards.

Image has critical security issues

Related to #168

Hi,
I just ran the Trivy scanner again and it is still finding the same 6 critical issues:

trivy i -s CRITICAL ai
2021-05-24T20:19:57.780-0400	INFO	Detected OS: debian
2021-05-24T20:19:57.780-0400	INFO	Detecting Debian vulnerabilities...
2021-05-24T20:19:57.793-0400	INFO	Number of PL dependency files: 0

ai (debian 10.9)
================
Total: 6 (CRITICAL: 6)

+----------------------+------------------+----------+-------------------+---------------+---------------------------------------+
|       LIBRARY        | VULNERABILITY ID | SEVERITY | INSTALLED VERSION | FIXED VERSION |                 TITLE                 |
+----------------------+------------------+----------+-------------------+---------------+---------------------------------------+
| libgnutls30          | CVE-2021-20231   | CRITICAL | 3.6.7-4+deb10u6   |               | gnutls: Use after free in             |
|                      |                  |          |                   |               | client key_share extension            |
|                      |                  |          |                   |               | -->avd.aquasec.com/nvd/cve-2021-20231 |
+                      +------------------+          +                   +---------------+---------------------------------------+
|                      | CVE-2021-20232   |          |                   |               | gnutls: Use after free                |
|                      |                  |          |                   |               | in client_send_params in              |
|                      |                  |          |                   |               | lib/ext/pre_shared_key.c              |
|                      |                  |          |                   |               | -->avd.aquasec.com/nvd/cve-2021-20232 |
+----------------------+------------------+          +-------------------+---------------+---------------------------------------+
| libpython2.7-minimal | CVE-2021-3177    |          | 2.7.16-2+deb10u1  |               | python: Stack-based buffer overflow   |
|                      |                  |          |                   |               | in PyCArg_repr in _ctypes/callproc.c  |
|                      |                  |          |                   |               | -->avd.aquasec.com/nvd/cve-2021-3177  |
+----------------------+                  +          +                   +---------------+                                       +
| libpython2.7-stdlib  |                  |          |                   |               |                                       |
|                      |                  |          |                   |               |                                       |
|                      |                  |          |                   |               |                                       |
+----------------------+                  +          +                   +---------------+                                       +
| python2.7            |                  |          |                   |               |                                       |
|                      |                  |          |                   |               |                                       |
|                      |                  |          |                   |               |                                       |
+----------------------+                  +          +                   +---------------+                                       +
| python2.7-minimal    |                  |          |                   |               |                                       |
|                      |                  |          |                   |               |                                       |
|                      |                  |          |                   |               |                                       |
+----------------------+------------------+----------+-------------------+---------------+---------------------------------------+

Running Object Detector on older AMD x86 machine

Hello, is there a way to run the detector on an older x86 machine? I am trying to reuse an older machine in my house for object detection and camera recording, but I cannot get either the :latest or the :v1.3.0 tag to run.

root@ubuntu-server:~# docker run -p 5000:5000 codait/max-object-detector:v1.3.0
2020-04-13 09:36:00.831007: F tensorflow/core/platform/cpu_feature_guard.cc:37] The TensorFlow library was compiled to use SSE4.1 instructions, but these aren't available on your machine.
Aborted (core dumped)
root@ubuntu-server:~# docker run -p 5000:5000 codait/max-object-detector:latest
Illegal instruction (core dumped)

This is my CPU information:

root@ubuntu-server:~# cat /proc/cpuinfo 
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 4
model name	: AMD Phenom(tm) II X4 955 Processor
stepping	: 3
microcode	: 0x10000c8
cpu MHz		: 800.000
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save
bugs		: tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs null_seg amd_e400 spectre_v1 spectre_v2
bogomips	: 6421.40
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor	: 1
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 4
model name	: AMD Phenom(tm) II X4 955 Processor
stepping	: 3
microcode	: 0x10000c8
cpu MHz		: 3200.000
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 1
cpu cores	: 4
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save
bugs		: tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs null_seg amd_e400 spectre_v1 spectre_v2
bogomips	: 6421.40
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor	: 2
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 4
model name	: AMD Phenom(tm) II X4 955 Processor
stepping	: 3
microcode	: 0x10000c8
cpu MHz		: 800.000
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 2
cpu cores	: 4
apicid		: 2
initial apicid	: 2
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save
bugs		: tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs null_seg amd_e400 spectre_v1 spectre_v2
bogomips	: 6421.40
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

processor	: 3
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 4
model name	: AMD Phenom(tm) II X4 955 Processor
stepping	: 3
microcode	: 0x10000c8
cpu MHz		: 800.000
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save
bugs		: tlb_mmatch apic_c1e fxsave_leak sysret_ss_attrs null_seg amd_e400 spectre_v1 spectre_v2
bogomips	: 6421.40
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

Detection result on OpenShift is not the same as the result on a local Mac

Hello, I deployed max-object-detector on OpenShift with the resource request below:

resources:
  requests:
    cpu: "300m"
    memory: "2500Mi"
  limits:
    cpu: "2"
    memory: "3000Mi"

The issue is that when I send a request to the detector, it returns status 200 OK, but the predictions array is empty ([]). When I run the detector on my local Mac, everything works fine. Running docker stats to monitor local memory and CPU usage shows around 1 GB of memory and 115% CPU (peak 400%). I am not sure if the resources on the OpenShift container are insufficient for the detector to run. If resources are limited, does that impact the detection precision, or does it impact how long the detection takes?

Many thanks for the help!

Node-Red with Object Detector only for humans

Hello, I have set up the MAX Object Detector Raspberry Pi Docker container and it works perfectly. It is used for a security surveillance setup on my boat, and I am basically only interested in detecting persons.

Using the Node-RED flow, can I filter out only detections of persons? I get some false positives and would like to remove these. I am not interested in "teddy bear" or "TV" matches.
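(Editorial sketch, not from the original thread:) the filtering itself is just a pass over the predictions array returned by the API; in a Node-RED function node the equivalent logic would be written in JavaScript, but the idea, shown here in Python, is:

```python
def only_persons(predictions, min_probability=0.7):
    """Keep only 'person' detections at or above a confidence floor.
    `min_probability` is a local choice, not an API parameter."""
    return [
        p for p in predictions
        if p["label"] == "person" and p["probability"] >= min_probability
    ]

sample = [
    {"label": "person", "probability": 0.94},
    {"label": "teddy bear", "probability": 0.81},
    {"label": "person", "probability": 0.41},
]
print(only_persons(sample))  # only the 0.94 person remains
```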

Apologies if this is not the right forum! I would be grateful for guidance in case! 💃

salong

POST sometimes returns HTTP code 500

I have a Node-RED flow which fetches screenshots from security cameras and POSTs them to this Docker image running under Docker on Windows. Most of the time it works as expected, but other times it returns an error 500.

I cannot even tell whether I am doing something wrong or if there is a bug in this Docker container. Can you help me? Where should I go from here?

The docker log shows this:

172.17.0.1 - - [14/Oct/2020 11:23:32] "POST /model/predict HTTP/1.1" 200 -
172.17.0.1 - - [14/Oct/2020 11:27:35] "POST /model/predict HTTP/1.1" 200 -
172.17.0.1 - - [14/Oct/2020 11:27:38] "POST /model/predict HTTP/1.1" 200 -
172.17.0.1 - - [14/Oct/2020 11:28:28] "POST /model/predict HTTP/1.1" 200 -
[2020-10-14 11:30:53,323] ERROR in app: Exception on /model/predict [POST]
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/opt/conda/lib/python3.7/site-packages/flask_restplus/api.py", line 319, in wrapper
resp = resource(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/flask/views.py", line 89, in view
return self.dispatch_request(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/flask_restplus/resource.py", line 44, in dispatch_request
resp = meth(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/flask_restplus/marshalling.py", line 136, in wrapper
resp = f(*args, **kwargs)
File "/workspace/api/predict.py", line 89, in post
image = model_wrapper._read_image(image_data)
File "/workspace/core/model.py", line 62, in _read_image
image = Image.open(io.BytesIO(image_data)).convert("RGB")
File "/opt/conda/lib/python3.7/site-packages/PIL/Image.py", line 2818, in open
raise IOError("cannot identify image file %r" % (filename if filename else fp))
OSError: cannot identify image file <_io.BytesIO object at 0x7f8394db9d70>
172.17.0.1 - - [14/Oct/2020 11:30:53] "POST /model/predict HTTP/1.1" 500 -

Docker build error

Getting error when building Docker image:
wget: unable to resolve host address ‘max-cdn.cdn.appdomain.cloud’

Illegal instruction (core dumped)

When I try to use :latest, either via docker run or via docker build, I get the following error and no further detail:

Illegal instruction (core dumped)

The operating system is Ubuntu 18.04 LTS with not much other than Docker CE installed, and it is sitting on a Proxmox VM. Are there any CPU requirements that may not be met because Proxmox abstracts them?

Error when sending some images

When submitting certain images I get the following error:

[2018-06-28 17:01:38,842] ERROR in app: Exception on /model/predict [POST]
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/flask/app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "/opt/conda/lib/python3.6/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/opt/conda/lib/python3.6/site-packages/flask_restplus/api.py", line 319, in wrapper
    resp = resource(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/flask/views.py", line 88, in view
    return self.dispatch_request(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/flask_restplus/resource.py", line 44, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/flask_restplus/marshalling.py", line 136, in wrapper
    resp = f(*args, **kwargs)
  File "/workspace/api/model.py", line 56, in post
    label_preds = self.model_wrapper.predict(image)
  File "/workspace/core/backend.py", line 60, in predict
    image = preprocess_image(imageRaw)
  File "/workspace/core/backend.py", line 27, in preprocess_image
    (im_height, im_width, 3)).astype(np.uint8)
ValueError: cannot reshape array of size 1398400 into shape (475,736,3)
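A diagnostic aside (my own inference, not stated in the issue): the failing size factors as 475 × 736 × 4, so the raw buffer appears to carry four channels (e.g. RGBA or CMYK) where the code expects three; forcing a 3-channel RGB conversion before reshaping would sidestep this:

```python
h, w = 475, 736      # dimensions from the traceback
size = 1398400       # array size from the traceback

# The buffer does not fit a 3-channel layout, but exactly fits a 4-channel one:
assert size != h * w * 3
assert size == h * w * 4

# Hypothetical fix sketch: force 3-channel RGB before converting to an array,
# e.g. with Pillow:  image = Image.open(path).convert("RGB")
```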

I've attached the images that are causing the error

failures.zip

Docker container crashes on specific image submit

I found a stock image that consistently breaks the API Docker container. This manifests in a few ways, all of which show no log output, including:

  • the Docker container just dies
  • the API stops responding but the Docker container is still listed as running (in this case, Docker itself usually slows down to the point that it is non-functional)

The image that consistently causes this behavior is attached and was originally downloaded here: https://www.pexels.com/photo/computer-desk-electronics-indoors-374074/

computer-desk-electronics-374074.jpg.zip

I haven't managed to recreate this with any other image.

Investigate various model options

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models has a few other models that we could make available, either via a Docker build flag or via an API call query parameter (at the cost of significantly increasing the Docker image build time as it would need to pull down multiple models).

In particular, we use the ssd_mobilenet_v1_coco model, which is among the fastest and smallest, but a bit less accurate.

There are a couple of new models in the TF model zoo that are slower (by about a factor of 2) but more accurate:

  • ssd_mobilenet_v1_fpn_coco
  • ssd_resnet_50_fpn_coco

Kubernetes configuration file: add environment variable for mini-app

Users can disable the mini-app endpoint by setting environment variable DISABLE_WEB_APP to true.
We should add this variable to the object-detector.yaml configuration file as follows:

spec:
  restartPolicy: Always
  containers:
  - env:
    - name: DISABLE_WEB_APP
      value: "false"
    name: object-detector
    image: codait/max-object-detector

Doing so will not change the default behavior (which is to expose the mini-app) but will make it easy for users to disable it, by simply changing the value from false to true.

Docker image cannot be deployed in RedHat OpenShift due to permission error

File "app.py", line 18, in <module>
  from api import ModelMetadataAPI, ModelLabelsAPI, ModelPredictAPI
File "/workspace/api/__init__.py", line 18, in <module>
  from .predict import ModelLabelsAPI, ModelPredictAPI  # noqa
File "/workspace/api/predict.py", line 35, in <module>
  model_wrapper = ModelWrapper()
File "/workspace/core/model.py", line 45, in __init__
  serialized_graph = fid.read()
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 125, in read
  self._preread_check()
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/lib/io/file_io.py", line 85, in _preread_check
  compat.as_bytes(self.__name), 1024 * 512, status)
File "/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in __exit__
  c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.PermissionDeniedError: assets/frozen_inference_graph.pb; Permission denied

inconsistent link to COS directory for model files

The README contains this link to the model files:

This works. However, it is different from the link in the Dockerfile:

I would expect both of these to point to the same set of model files.
In addition, both of the links should point to the new US cross region:

s3.us.cloud-object-storage.appdomain.cloud

because the old one has been deprecated:

s3-api.us-geo.objectstorage.softlayer.net

Image has CVE security issues

While using the Cloud Native Toolkit to run a pipeline, we use Trivy to scan the images.

The report came back with 6 critical CVEs to address.

Trivy Security Scan image in registry
/var/oci/image (debian 10.9)
============================
Total: 214 (UNKNOWN: 0, LOW: 150, MEDIUM: 20, HIGH: 38, CRITICAL: 6)

The following 6 critical issues need to be fixed:

Total: 6 (CRITICAL: 6)

+----------------------+------------------+----------+-------------------+---------------+--------------------------------+------------------------------------+
|       LIBRARY        | VULNERABILITY ID | SEVERITY | INSTALLED VERSION | FIXED VERSION |             TITLE              |                URL                 |
+----------------------+------------------+----------+-------------------+---------------+--------------------------------+------------------------------------+
| libgnutls30          | CVE-2021-20231   | CRITICAL | 3.6.7-4+deb10u6   |               | gnutls: Use after free in      | avd.aquasec.com/nvd/cve-2021-20231 |
|                      |                  |          |                   |               | client key_share extension     |                                    |
+                      +------------------+          +                   +---------------+--------------------------------+------------------------------------+
|                      | CVE-2021-20232   |          |                   |               | gnutls: Use after free         | avd.aquasec.com/nvd/cve-2021-20232 |
|                      |                  |          |                   |               | in client_send_params in       |                                    |
|                      |                  |          |                   |               | lib/ext/pre_shared_key.c       |                                    |
+----------------------+------------------+          +-------------------+---------------+--------------------------------+------------------------------------+
| libpython2.7-minimal | CVE-2021-3177    |          | 2.7.16-2+deb10u1  |               | python: Stack-based buffer     | avd.aquasec.com/nvd/cve-2021-3177  |
|                      |                  |          |                   |               | overflow in PyCArg_repr in     |                                    |
|                      |                  |          |                   |               | _ctypes/callproc.c             |                                    |
+----------------------+                  +          +                   +---------------+                                +                                    +
| libpython2.7-stdlib  |                  |          |                   |               |                                |                                    |
|                      |                  |          |                   |               |                                |                                    |
|                      |                  |          |                   |               |                                |                                    |
+----------------------+                  +          +                   +---------------+                                +                                    +
| python2.7            |                  |          |                   |               |                                |                                    |
|                      |                  |          |                   |               |                                |                                    |
|                      |                  |          |                   |               |                                |                                    |
+----------------------+                  +          +                   +---------------+                                +                                    +
| python2.7-minimal    |                  |          |                   |               |                                |                                    |
|                      |                  |          |                   |               |                                |                                    |
|                      |                  |          |                   |               |                                |                                    |
+----------------------+------------------+----------+-------------------+---------------+--------------------------------+------------------------------------+

Model training fails

Training log file:

Training with training/test data at:
  DATA_DIR: /mnt/data/a-od-in
  MODEL_DIR: /job/model-code
  TRAINING_JOB: 
  TRAINING_COMMAND: chmod +x *.sh && ./train-max-model.sh
Storing trained model at:
  RESULT_DIR: /mnt/results/a-od-out/training-uGH1W19ZR
Fri Mar 27 15:55:42 UTC 2020: Running Tensorflow job
# ************************************************************
# Preparing for model training
# ************************************************************
Training data is stored in /mnt/data/a-od-in
Training work files and results will be stored in /mnt/results/a-od-out/training-uGH1W19ZR
Installing prerequisite packages ...
Requirement already satisfied: Cython in /opt/conda/lib/python3.6/site-packages (from -r training_requirements.txt (line 1)) (0.29.15)
Requirement already satisfied: contextlib2 in /opt/conda/lib/python3.6/site-packages (from -r training_requirements.txt (line 2)) (0.6.0.post1)
Collecting pycocotools==2.0.0
  Downloading pycocotools-2.0.0.tar.gz (1.5 MB)
Collecting coremltools==2.0
  Downloading coremltools-2.0-cp36-none-manylinux1_x86_64.whl (2.7 MB)
Collecting tfcoreml==0.3.0
  Downloading tfcoreml-0.3.0-py3-none-any.whl (38 kB)
Collecting tensorflowjs==0.8.0
  Downloading tensorflowjs-0.8.0-py3-none-any.whl (39 kB)
Collecting tensorflow-hub==0.3.0
  Downloading tensorflow_hub-0.3.0-py2.py3-none-any.whl (73 kB)
Collecting h5py==2.8.0
  Downloading h5py-2.8.0-cp36-cp36m-manylinux1_x86_64.whl (2.8 MB)
Collecting numpy==1.17.5
  Downloading numpy-1.17.5-cp36-cp36m-manylinux1_x86_64.whl (20.0 MB)
Building wheels for collected packages: pycocotools
  Building wheel for pycocotools (setup.py): started
  Building wheel for pycocotools (setup.py): finished with status 'done'
  Created wheel for pycocotools: filename=pycocotools-2.0.0-cp36-cp36m-linux_x86_64.whl size=280641 sha256=a6286e9b8a13f150b736ef67703ea56d00109b65e9ff337d57146dd69f5cc4a9
  Stored in directory: /home/gpuuser/.cache/pip/wheels/64/7a/c0/ac8f633d11a5f1a6902c72acb9fa828a2bb3639afba4e94a6c
Successfully built pycocotools
Installing collected packages: pycocotools, coremltools, tfcoreml, tensorflowjs, tensorflow-hub, h5py, numpy
  WARNING: The script coremlconverter is installed in '/home/gpuuser/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script tensorflowjs_converter is installed in '/home/gpuuser/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The scripts f2py, f2py3 and f2py3.6 are installed in '/home/gpuuser/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed coremltools-2.0 h5py-2.8.0 numpy-1.17.5 pycocotools-2.0.0 tensorflow-hub-0.3.0 tensorflowjs-0.8.0 tfcoreml-0.3.0
Training completed. Output is stored in /mnt/results/a-od-out/training-uGH1W19ZR.
Running training command "./training_command.sh"
# ************************************************************
# Preparing data ...
# ************************************************************
/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:523: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:524: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/opt/conda/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:532: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/opt/conda/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/job/model-code/data/prepare_data_object_detection.py", line 213, in <module>
    main()
  File "/job/model-code/data/prepare_data_object_detection.py", line 172, in main
    opener.retrieve(download_base + model_file, tar_path)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 1791, in retrieve
    fp = self.open(url, data)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 1757, in open
    return getattr(self, name)(url)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 1967, in open_https
    return self._open_generic_http(self._https_connection, url, data)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 1932, in _open_generic_http
    response.status, response.reason, response.msg, data)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 1952, in http_error
    return self.http_error_default(url, fp, errcode, errmsg, headers)
  File "/opt/conda/lib/python3.6/urllib/request.py", line 1957, in http_error_default
    raise HTTPError(url, errcode, errmsg, headers, None)
urllib.error.HTTPError: HTTP Error 404: Not Found
84 training and 36 validation examples.
---------------------------------
 Using pre-trained model weights 
---------------------------------
Downloading model checkpoint...
Return code from task 1: 1
Error: Training run exited with status code 1
awk: cannot open /mnt/results/a-od-out/training-uGH1W19ZR/checkpoint/checkpoint (No such file or directory)

cp: cannot stat '/mnt/results/a-od-out/training-uGH1W19ZR/checkpoint/*': No such file or directory
cp: cannot stat '/mnt/results/a-od-out/training-uGH1W19ZR/model/frozen_inference_graph.pb': No such file or directory
cp: cannot stat '/mnt/results/a-od-out/training-uGH1W19ZR/model/saved_model/saved_model.pb': No such file or directory
# ************************************************************
# Packaging artifacts
# ************************************************************
Creating downloadable archive "/mnt/results/a-od-out/training-uGH1W19ZR/model_training_output.tar.gz".
./
./trained_model/
./trained_model/tensorflow/
./trained_model/tensorflow/checkpoint/
./trained_model/tensorflow/frozen_graph_def/
./trained_model/tensorflow/frozen_graph_def/label_map.pbtxt
./trained_model/tensorflow/saved_model/
./trained_model/tensorflow/saved_model/label_map.pbtxt
Packaging completed.

Add CORS support

Required (for illustrative purposes) by the MAX workshop app https://github.com/CODAIT/max-workshop-app-nodejs.git.

Simple demo example:
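A minimal sketch of what opt-in CORS support could look like (the `CORS_ENABLE` variable name and the exact header set are assumptions for illustration, not the project's actual implementation):

```python
import os

def cors_headers(enabled):
    """Headers to attach to API responses when CORS support is switched on."""
    if not enabled:
        return {}
    return {
        # Assumed permissive defaults; a real deployment may restrict the origin.
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
    }

# Disabled by default; an operator opts in via an environment variable.
enabled = os.getenv("CORS_ENABLE", "false").lower() == "true"
print(cors_headers(enabled))
```

With the flag off, responses carry no CORS headers, so browser apps on other origins (like the workshop app) are blocked by default.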

Deployment fails on Kubernetes 1.16+: no matches for kind "Deployment" in version "extensions/v1beta1"

I was following the "Deploy MAX models to the cloud with Kubernetes" tutorial:
https://developer.ibm.com/tutorials/deploy-max-models-to-ibm-cloud-with-kubernetes/

and could not execute the command in step 3, "Apply the configuration file to your Kubernetes cluster" (again, replacing max-object-detector.yaml with the .yaml file corresponding to your model):

kubectl apply -f ./max-object-detector.yaml

I changed the Deployment in max-object-detector.yaml to use apiVersion: apps/v1; the Service section (which stays at apiVersion: v1) looks like this:

apiVersion: v1
kind: Service
metadata:
  name: max-object-detector
spec:
  selector:
    app: max-object-detector
  ports:
  - port: 5000
  type: NodePort

and then kubectl apply -f ./max-object-detector.yaml works:
NAME                  READY   UP-TO-DATE   AVAILABLE   AGE
max-object-detector   1/1     1            1           17h

Docker: camera cannot be used in embedded sample application

The console log contains the following error message when the browser requires a secure context to access navigator.mediaDevices (as documented at https://developer.mozilla.org/en-US/docs/Web/API/Navigator/mediaDevices); on pages served over plain HTTP the property is undefined:

webapp.js:179 Uncaught TypeError: Cannot read property 'getUserMedia' of undefined
    at HTMLButtonElement.runWebcam (webapp.js:179)
    at HTMLButtonElement.dispatch (jquery-3.3.1.min.js:2)
    at HTMLButtonElement.y.handle (jquery-3.3.1.min.js:2)

Swagger UI doesn't work on OpenShift using a secure SSL/HTTPS route

Deploy the container and expose it using an OpenShift secure route.

I tested this on IBM Red Hat OpenShift Kubernetes Service (ROKS).

When accessing the route, the Swagger web UI doesn't load.

The UI tries to load the spec swagger.json via HTTP rather than HTTPS, and the browser blocks the request as mixed content once the page itself is served over HTTPS.


Error in the browser console log:

index.js:1 Mixed Content: The page at 'https://max-object-detector-test-dev......us-east.containers.appdomain.cloud/' was loaded over HTTPS, but requested an insecure resource 'http://max-object-detector-test-dev.....us-east.containers.appdomain.cloud/swagger.json'. This request has been blocked; the content must be served over HTTPS.
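One common remedy (an assumption on my part, not necessarily the project's eventual fix) is to reference swagger.json by a relative path so the browser reuses the page's own scheme and host. A small sketch of the URL rewrite:

```python
from urllib.parse import urlsplit

def relative_spec_url(absolute_url):
    """Strip scheme and host so the spec is fetched from the page's own origin."""
    parts = urlsplit(absolute_url)
    return parts.path or "/"

# An http:// spec URL on an https:// page triggers the mixed-content block;
# a relative path inherits https:// automatically.
print(relative_spec_url("http://example.com/swagger.json"))
```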


Disable embedded sample web app by default

Currently the MAX microservice exposes a mini sample app by default, which is great for demo/exploration purposes but less so for production-ish use. We might want to consider changing the default behavior to mirror other optional features, such as CORS support (disabled by default).

Support for ARM

Hello, will you add support for ARM CPUs so we can run it on, for instance, a Raspberry Pi 4? Is there a guide to build it manually for the RPi 4? I tried running it locally, but it fails because some of the prebuilt binaries are not supported on ARM.

Geo support other than US-South for training

I tried your training readme.md with the us-south region for the ML instance and buckets, but when I change it to eu-de I run into errors. Support for other regions can be very important, as some projects have geo restrictions.

Dockerfile doesn't pass linting using hadolint

We are working on creating Tekton Pipelines for the application.

One of the Tekton tasks lints the Dockerfile using hadolint.

Here are the findings from that task, which should be fixed or ignored via a comment in the Dockerfile:

Dockerfile:19 DL3004 error: Do not use sudo as it leads to unpredictable behavior. Use a tool like gosu to enforce root
Dockerfile:33 DL3059 info: Multiple consecutive `RUN` instructions. Consider consolidation.
Dockerfile:33 DL4006 warning: Set the SHELL option -o pipefail before RUN with a pipe in it. If you are using /bin/sh in an alpine image or if your shell is symlinked to busybox then consider explicitly setting your SHELL to /bin/ash, or disable this check
Dockerfile:36 DL3045 warning: `COPY` to a relative destination without `WORKDIR` set.
Dockerfile:37 DL3042 warning: Avoid use of cache directory with pip. Use `pip install --no-cache-dir <package>`
Dockerfile:39 DL3045 warning: `COPY` to a relative destination without `WORKDIR` set.
Dockerfile:44 SC1075 error: Use 'elif' instead of 'else if' (or put 'if' on new line if nesting).
Dockerfile:44 DL3059 info: Multiple consecutive `RUN` instructions. Consider consolidation.
Dockerfile:56 DL3025 warning: Use arguments JSON notation for CMD and ENTRYPOINT arguments
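Hedged sketches of the general remedies hadolint suggests for some of these rules (the surrounding Dockerfile content is assumed for illustration, not quoted from the repository):

```dockerfile
# DL3045: set a working directory before COPY with a relative destination
WORKDIR /opt/app
COPY requirements.txt .

# DL3042: avoid keeping pip's cache in the image layer
RUN pip install --no-cache-dir -r requirements.txt

# DL4006: make piped RUN commands fail early
SHELL ["/bin/bash", "-o", "pipefail", "-c"]

# DL3025: use JSON (exec) notation for CMD/ENTRYPOINT
CMD ["python", "app.py"]
```

DL3004 (sudo) and SC1075 (else if) depend on the specific RUN lines involved, so they are not sketched here.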

Make all MAX example repos templates to be compatible with Cloud Native Toolkit workflows

I got feedback from @mjperrins that when using the Cloud Native Toolkit we typically ask users to click the green "Use this template" button on the GitHub repository to get started by creating their own copy of the example.

Could you enable the template setting on all the MAX example repositories in the IBM GitHub org?
This would not have any impact on how you collaborate with forks; the same workflow continues as you develop these repositories.

@SSaishruthi what do you think about this enablement?

Here is an example https://github.com/IBM/template-java-spring

To enable this, check the "Template repository" box in the repository settings.

Model training fails due to pip install failures

# ************************************************************
# Preparing for model training
# ************************************************************
Training data is stored in /mnt/data/a-od-in
Training work files and results will be stored in /mnt/results/a-od-out/training-S0Vgh1qZR
Installing prerequisite packages ...
Collecting Cython==0.29.15
  Downloading Cython-0.29.15-cp36-cp36m-manylinux1_x86_64.whl (2.1 MB)
Collecting contextlib2
  Downloading contextlib2-0.6.0.post1-py2.py3-none-any.whl (9.8 kB)
Collecting pycocotools==2.0.0
  Downloading pycocotools-2.0.0.tar.gz (1.5 MB)
    ERROR: Command errored out with exit status 1:
     command: /opt/anaconda/envs/wmlce/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-ggn5v_rg/pycocotools/setup.py'"'"'; __file__='"'"'/tmp/pip-install-ggn5v_rg/pycocotools/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-install-ggn5v_rg/pycocotools/pip-egg-info
         cwd: /tmp/pip-install-ggn5v_rg/pycocotools/
    Complete output (5 lines):
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-install-ggn5v_rg/pycocotools/setup.py", line 2, in <module>
        from Cython.Build import cythonize
    ModuleNotFoundError: No module named 'Cython'
    ----------------------------------------
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
Running training command "./training_command.sh"
# ************************************************************
# Preparing data ...
# ************************************************************
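The install fails because pycocotools 2.0.0 imports Cython at the top of its setup.py, so the build breaks when Cython is merely queued in the same pip run rather than already installed. A small sketch (the function name is my own) of checking build-time dependencies up front so they can be installed in an earlier, separate pip invocation:

```python
import importlib.util

def missing_build_deps(modules):
    """Return the build-time modules that are not importable yet."""
    return [m for m in modules if importlib.util.find_spec(m) is None]

# pycocotools 2.0.0 needs Cython importable before its setup.py runs.
missing = missing_build_deps(["Cython"])
if missing:
    print("install these in a separate pip run first:", missing)
else:
    print("build dependencies present; pycocotools can be installed")
```

In practice the same effect comes from running `pip install Cython` to completion before the pip run that installs pycocotools, rather than listing both in one requirements pass.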
