matterport / mask_rcnn

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

License: Other

Python 100.00%
instance-segmentation keras mask-rcnn object-detection tensorflow

mask_rcnn's People

Contributors

bill2239, borda, cclauss, concerttttt, coreyhu, cpruce, jmtatsch, jningwei, keineahnung2345, llltttppp, manasbedmutha98, marluca, maxfrei750, moorage, mowoe, np-csu, philferriere, ps48, robaleman, rymalia, scitator, skt7, stevenhickson, veemon, viredery, waleedka, xelmirage, yanfengliu, yqnlp, ziyigogogo


mask_rcnn's Issues

Train on my own numpy dataset

Hi,
Thanks for the great work!

I want to train Mask R-CNN on my own dataset (a numpy array of images and masks, or two folders of images and masks). Could you explain how to do that?
I've looked at the COCO dataset and Shapes dataset code, but I couldn't work out how the Dataset class actually works.
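For anyone hitting the same question, below is a minimal, untested sketch of wrapping in-memory numpy images/masks in the utils.Dataset interface. The class and method names I introduce (NumpyDataset, load_numpy) are hypothetical; only add_class/add_image/prepare and the load_image/load_mask overrides come from utils.py and the Shapes example.

import numpy as np
import utils

class NumpyDataset(utils.Dataset):
    """Hypothetical wrapper: serves images/masks straight from numpy arrays."""

    def load_numpy(self, images, masks, class_ids):
        # images: list of [H, W, 3] uint8 arrays
        # masks: list of [H, W, instances] boolean arrays
        # class_ids: list of [instances] int arrays
        self.add_class("numpy_dataset", 1, "my_class")  # register every class ID you use
        for i in range(len(images)):
            self.add_image("numpy_dataset", image_id=i, path=None)
        self._images, self._masks, self._class_ids = images, masks, class_ids

    def load_image(self, image_id):
        # Override the default, which reads the image from image_info["path"]
        return self._images[image_id]

    def load_mask(self, image_id):
        return (self._masks[image_id].astype(np.bool),
                self._class_ids[image_id].astype(np.int32))

# dataset = NumpyDataset()
# dataset.load_numpy(images, masks, class_ids)
# dataset.prepare()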

How to load resnet50?

Hi @waleedka ,

I have a less powerful GPU and want to train on my own data, which contains only 3 classes. I would therefore like to use ResNet-50 to speed up the model. How can I switch to ResNet-50? Thanks for your help.
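For reference, a hedged sketch of one way to do this: newer revisions of the code expose the backbone as a config option, while on older revisions you would instead edit the hard-coded "resnet101" argument passed to resnet_graph() in model.py. The BACKBONE attribute below is an assumption about which code version you have.

import coco

class MyConfig(coco.CocoConfig):
    NAME = "my_dataset"
    NUM_CLASSES = 1 + 3     # background + 3 classes
    BACKBONE = "resnet50"   # assumption: only present in revisions that expose the backbone choice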

How much GPU memory is required for inference?

I have a GTX 780 card with 3 GB of memory, but when running the demo example, an error occurred: "W tensorflow/core/common_runtime/bfc_allocator.cc:217] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.07GiB."

So how much GPU memory is required to run an inference example?
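This won't shrink the model's peak usage, but here is a sketch that at least stops TensorFlow 1.x from grabbing all GPU memory up front (standard TF/Keras session configuration, nothing specific to this repo):

import tensorflow as tf
import keras.backend as K

tf_config = tf.ConfigProto()
tf_config.gpu_options.allow_growth = True  # allocate GPU memory on demand
K.set_session(tf.Session(config=tf_config))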

I used the trained model to test an image. There are some errors.

The code below follows demo.ipynb, but there are some errors.

import os
import sys
import random
import math
import numpy as np
import scipy.misc
import matplotlib
import matplotlib.pyplot as plt

import coco
import utils
import model as modellib
import visualize

# Create model object in inference mode.

model = modellib.MaskRCNN(mode="inference", model_dir='mask_rcnn_coco.h5', config=0)

# Load weights trained on MS-COCO

model.load_weights('mask_rcnn_coco.h5', by_name=True)

# COCO Class names
# Index of the class in the list is its ID. For example, to get the ID of
# the teddy bear class, use: class_names.index('teddy bear')

class_names = ['BG', 'person', 'bicycle', 'car', 'motorcycle', 'airplane',
'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird',
'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear',
'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie',
'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball',
'kite', 'baseball bat', 'baseball glove', 'skateboard',
'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup',
'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza',
'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed',
'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote',
'keyboard', 'cell phone', 'microwave', 'oven', 'toaster',
'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors',
'teddy bear', 'hair drier', 'toothbrush']

# Load a random image from the images folder

file_names = next(os.walk('/media/wxl/00063DAF000FECC3/Mask_RCNN-master/images'))
image = scipy.misc.imread(os.path.join('/media/wxl/00063DAF000FECC3/Mask_RCNN-master/images', random.choice(file_names)))

# Run detection

results = model.detect([image], verbose=1)

# Visualize results

r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
class_names, r['scores'])

The error occurs at:
File "/media/wxl/00063DAF000FECC3/Mask_RCNN-master/test.py", line 20, in
model = modellib.MaskRCNN(mode="inference", model_dir='mask_rcnn_coco.h5', config=0)
I think the model is loaded incorrectly, but I have no idea what the mistake is. Thanks for your help.
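For what it's worth, a hedged sketch of the corrected initialization, following demo.ipynb: config must be a config object (not 0), and model_dir is a directory for logs/checkpoints, not the weights file. The "./logs" directory name here is arbitrary.

import os
import coco
import model as modellib

class InferenceConfig(coco.CocoConfig):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1  # detect one image at a time

config = InferenceConfig()

model = modellib.MaskRCNN(mode="inference", model_dir="./logs", config=config)
model.load_weights("mask_rcnn_coco.h5", by_name=True)

# Also note: next(os.walk(dir)) returns (dirpath, dirnames, filenames),
# so the file list is element [2], as in demo.ipynb:
# file_names = next(os.walk(IMAGE_DIR))[2]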

model.py's data_generator() may not be threadsafe

I haven't been able to run train_shapes.ipynb to completion, perhaps because of threading issues. Training the head branches fails with the following output:

Starting at epoch 0. LR=0.002

Checkpoint Path: E:\repos\Mask_RCNN.wip\logs\shapes20171102T1726\mask_rcnn_shapes_{epoch:04d}.h5
Selecting layers to train
fpn_c5p5               (Conv2D)
fpn_c4p4               (Conv2D)
fpn_c3p3               (Conv2D)
fpn_c2p2               (Conv2D)
fpn_p5                 (Conv2D)
fpn_p2                 (Conv2D)
fpn_p3                 (Conv2D)
fpn_p4                 (Conv2D)
In model:  rpn_model
    rpn_conv_shared        (Conv2D)
    rpn_class_raw          (Conv2D)
    rpn_bbox_pred          (Conv2D)
mrcnn_mask_conv1       (TimeDistributed)
mrcnn_mask_bn1         (TimeDistributed)
mrcnn_mask_conv2       (TimeDistributed)
mrcnn_class_conv1      (TimeDistributed)
mrcnn_mask_bn2         (TimeDistributed)
mrcnn_class_bn1        (TimeDistributed)
mrcnn_mask_conv3       (TimeDistributed)
mrcnn_mask_bn3         (TimeDistributed)
mrcnn_class_conv2      (TimeDistributed)
mrcnn_class_bn2        (TimeDistributed)
mrcnn_mask_conv4       (TimeDistributed)
mrcnn_mask_bn4         (TimeDistributed)
mrcnn_bbox_fc          (TimeDistributed)
mrcnn_mask_deconv      (TimeDistributed)
mrcnn_class_logits     (TimeDistributed)
mrcnn_mask             (TimeDistributed)
e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\tensorflow\python\ops\gradients_impl.py:95: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\keras\engine\training.py:1987: UserWarning: Using a generator with `use_multiprocessing=True` and multiple workers may duplicate your data. Please consider using the`keras.utils.Sequence class.
  UserWarning('Using a generator with `use_multiprocessing=True`'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-2101606c7d8e> in <module>()
      6             learning_rate=config.LEARNING_RATE,
      7             epochs=1,
----> 8             layers='heads')

E:\repos\Mask_RCNN.wip\model.py in train(self, train_dataset, val_dataset, learning_rate, epochs, layers)
   2089             initial_epoch=self.epoch,
   2090             epochs=epochs,
-> 2091             **fit_kwargs
   2092             )
   2093         self.epoch = max(self.epoch, epochs)

e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\keras\legacy\interfaces.py in wrapper(*args, **kwargs)
     85                 warnings.warn('Update your `' + object_name +
     86                               '` call to the Keras 2 API: ' + signature, stacklevel=2)
---> 87             return func(*args, **kwargs)
     88         wrapper._original_function = func
     89         return wrapper

e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\keras\engine\training.py in fit_generator(self, generator, steps_per_epoch, epochs, verbose, callbacks, validation_data, validation_steps, class_weight, max_queue_size, workers, use_multiprocessing, shuffle, initial_epoch)
   2000                                              use_multiprocessing=use_multiprocessing,
   2001                                              wait_time=wait_time)
-> 2002             enqueuer.start(workers=workers, max_queue_size=max_queue_size)
   2003             output_generator = enqueuer.get()
   2004 

e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\keras\utils\data_utils.py in start(self, workers, max_queue_size)
    594                     thread = threading.Thread(target=data_generator_task)
    595                 self._threads.append(thread)
--> 596                 thread.start()
    597         except:
    598             self.stop()

e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\multiprocessing\process.py in start(self)
    103                'daemonic processes are not allowed to have children'
    104         _cleanup()
--> 105         self._popen = self._Popen(self)
    106         self._sentinel = self._popen.sentinel
    107         _children.add(self)

e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\multiprocessing\context.py in _Popen(process_obj)
    221     @staticmethod
    222     def _Popen(process_obj):
--> 223         return _default_context.get_context().Process._Popen(process_obj)
    224 
    225 class DefaultContext(BaseContext):

e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\multiprocessing\context.py in _Popen(process_obj)
    320         def _Popen(process_obj):
    321             from .popen_spawn_win32 import Popen
--> 322             return Popen(process_obj)
    323 
    324     class SpawnContext(BaseContext):

e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
     63             try:
     64                 reduction.dump(prep_data, to_child)
---> 65                 reduction.dump(process_obj, to_child)
     66             finally:
     67                 set_spawning_popen(None)

e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
     58 def dump(obj, file, protocol=None):
     59     '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60     ForkingPickler(file, protocol).dump(obj)
     61 
     62 #

AttributeError: Can't pickle local object 'GeneratorEnqueuer.start.<locals>.data_generator_task'

I have also tried to modify fit_generator()'s parameters as follows:

        # Common parameters to pass to fit_generator()
        fit_kwargs = {
            "steps_per_epoch": self.config.STEPS_PER_EPOCH,
            "callbacks": callbacks,
            "validation_data": next(val_generator),
            "validation_steps": self.config.VALIDATION_STPES,
            "max_queue_size": 100,
            "workers": 1,  # Phil: was max(self.config.BATCH_SIZE // 2, 2),
            "use_multiprocessing": False # Phil: was "use_multiprocessing": True,
        }

Unfortunately, that also results in a crash, with the Jupyter notebook dying without any output...

It could well be that the generator is already threadsafe; after a quick perusal, however, I haven't found any serializing code anywhere. Threadsafe data generators usually implement some kind of locking mechanism (see the sketch below). Here are examples that are threadsafe: keras-team/keras#1638 (comment) and http://anandology.com/blog/using-iterators-and-generators/
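For reference, a minimal sketch of the locking pattern those links describe (plain Python, not code from this repo): wrap the generator so that concurrent next() calls are serialized by a lock.

import threading

class ThreadSafeIter:
    """Wrap an iterator/generator so next() is protected by a lock."""

    def __init__(self, it):
        self.it = it
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        with self.lock:
            return next(self.it)

# Hypothetical usage with model.py's generator:
# train_generator = ThreadSafeIter(data_generator(dataset_train, config, shuffle=True))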

Here's a bit of information about my own GPU config:

(from notebook):

os: nt
sys: 3.6.1 |Continuum Analytics, Inc.| (default, May 11 2017, 13:25:24) [MSC v.1900 64 bit (AMD64)]
numpy: 1.13.3, e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\numpy\__init__.py
matplotlib: 2.0.2, e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\matplotlib\__init__.py
cv2: 3.3.0, e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\cv2.cp36-win_amd64.pyd
tensorflow: 1.3.0, e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\tensorflow\__init__.py
keras: 2.0.8, e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36coco\lib\site-packages\keras\__init__.py

(from jupyter notebook log):

2017-11-02 16:59:50.162182: I C:\tf_jenkins\home\workspace\rel-win\M\windows-gpu\PY\36\tensorflow\core\common_runtime\gpu\gpu_device.cc:955] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:03:00.0
Total memory: 12.00GiB
Free memory: 10.06GiB

Has anyone else observed similar issues?

The loss of Step 2

During training, the loss suddenly jumps from a low value (like 2.1) in step 1 to a high value (like 13) in step 2. Is that a normal situation?

AssertionError in run_graph

I am testing Mask R-CNN on my local computer and on a remote machine.
In inspect_model everything runs fine locally, but on the remote machine I get an assertion error at ### 1.b RPN Predictions:

# Run RPN sub-graph
pillar = model.keras_model.get_layer("ROI").output  # node to start searching from
rpn = model.run_graph([image], [
    ("rpn_class", model.keras_model.get_layer("rpn_class").output),
    ("pre_nms_anchors", model.ancestor(pillar, "ROI/pre_nms_anchors:0")),
    ("refined_anchors", model.ancestor(pillar, "ROI/refined_anchors:0")),
    ("refined_anchors_clipped", model.ancestor(pillar, "ROI/refined_anchors_clipped:0")),
    ("post_nms_anchor_ix", model.ancestor(pillar, "ROI/rpn_non_max_suppression:0")),
    ("proposals", model.keras_model.get_layer("ROI").output),
])

=>

AssertionError            Traceback (most recent call last)
<ipython-input-14-799ca4676404> in <module>()
      7     ("refined_anchors_clipped", model.ancestor(pillar, "ROI/refined_anchors_clipped:0")),
      8     ("post_nms_anchor_ix", model.ancestor(pillar, "ROI/rpn_non_max_suppression:0")),
----> 9     ("proposals", model.keras_model.get_layer("ROI").output),
     10 ])

/home/orestisz/repositories/Mask_RCNN/model.py in run_graph(self, images, outputs)
   2296         for o in outputs.values():
   2297             print(o)
-> 2298             assert o is not None
   2299 
   2300         # Build a Keras function to run parts of the computation graph

AssertionError:

when printing the outputs:

for o in outputs.values():
            print(o)
            assert o is not None

I get the following output locally:

Tensor("rpn_class/concat:0", shape=(?, ?, 2), dtype=float32, device=/device:CPU:0)
Tensor("ROI/pre_nms_anchors:0", shape=(1, 10000, 4), dtype=float32, device=/device:CPU:0)
Tensor("ROI/refined_anchors:0", shape=(1, 10000, 4), dtype=float32, device=/device:CPU:0)
Tensor("ROI/refined_anchors_clipped:0", shape=(1, 10000, 4), dtype=float32, device=/device:CPU:0)
Tensor("ROI/rpn_non_max_suppression:0", shape=(?,), dtype=int32, device=/device:CPU:0)
Tensor("ROI/packed_2:0", shape=(1, ?, 4), dtype=float32, device=/device:CPU:0)

and the following remotely:

Tensor("rpn_class/concat:0", shape=(?, ?, 2), dtype=float32, device=/device:CPU:0)
Tensor("ROI/pre_nms_anchors:0", shape=(1, 10000, 4), dtype=float32, device=/device:CPU:0)
Tensor("ROI/refined_anchors:0", shape=(1, 10000, 4), dtype=float32, device=/device:CPU:0)
Tensor("ROI/refined_anchors_clipped:0", shape=(1, 10000, 4), dtype=float32, device=/device:CPU:0)
None

So it looks like ("post_nms_anchor_ix", model.ancestor(pillar, "ROI/rpn_non_max_suppression:0")) is causing the issue.

Any suggestions? Thanks in advance.

Unable to train on multiple GPUs

I am trying to run coco.py on a machine with 8 Tesla P100 GPUs; however, something seems to go wrong when I try to use more than one GPU.
I was able to run the parallel_model.py file on all GPUs without a problem.
The error dumped in my terminal is the following:

Epoch 1/40
2017-11-09 14:49:19.639818: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Incompatible shapes: [15,4] vs. [30,4]
	 [[Node: rpn_bbox_loss/sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](rpn_bbox_loss/concat, rpn_bbox_loss/GatherNd)]]
2017-11-09 14:49:19.639903: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Incompatible shapes: [15,4] vs. [30,4]
	 [[Node: rpn_bbox_loss/sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](rpn_bbox_loss/concat, rpn_bbox_loss/GatherNd)]]
2017-11-09 14:49:19.640016: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Incompatible shapes: [15,4] vs. [30,4]
	 [[Node: rpn_bbox_loss/sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](rpn_bbox_loss/concat, rpn_bbox_loss/GatherNd)]]
2017-11-09 14:49:19.640329: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Incompatible shapes: [15,4] vs. [30,4]
	 [[Node: rpn_bbox_loss/sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](rpn_bbox_loss/concat, rpn_bbox_loss/GatherNd)]]
2017-11-09 14:49:19.640406: W tensorflow/core/framework/op_kernel.cc:1192] Invalid argument: Incompatible shapes: [15,4] vs. [30,4]
	 [[Node: rpn_bbox_loss/sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](rpn_bbox_loss/concat, rpn_bbox_loss/GatherNd)]]
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1323, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1302, in _run_fn
    status, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [15,4] vs. [30,4]
	 [[Node: rpn_bbox_loss/sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](rpn_bbox_loss/concat, rpn_bbox_loss/GatherNd)]]
	 [[Node: proposal_targets/roi_assertion_1/AssertGuard/Assert/Switch/_1627 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_5177_proposal_targets/roi_assertion_1/AssertGuard/Assert/Switch", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "coco.py", line 417, in <module>
    layers='heads')
  File "/projects/mask_rcnn/model.py", line 2110, in train
    **fit_kwargs
  File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 2077, in fit_generator
    class_weight=class_weight)
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/training.py", line 1797, in train_on_batch
    outputs = self.train_function(ins)
  File "/usr/local/lib/python3.5/dist-packages/keras/backend/tensorflow_backend.py", line 2332, in __call__
    **self.session_kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Incompatible shapes: [15,4] vs. [30,4]
	 [[Node: rpn_bbox_loss/sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](rpn_bbox_loss/concat, rpn_bbox_loss/GatherNd)]]
	 [[Node: proposal_targets/roi_assertion_1/AssertGuard/Assert/Switch/_1627 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_5177_proposal_targets/roi_assertion_1/AssertGuard/Assert/Switch", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'rpn_bbox_loss/sub', defined at:
  File "coco.py", line 365, in <module>
    model_dir=MODEL_DIR)
  File "/projects/mask_rcnn/model.py", line 1646, in __init__
    self.keras_model = self.build(mode=mode, config=config)
  File "/projects/mask_rcnn/model.py", line 1794, in build
    [input_rpn_bbox, input_rpn_match, rpn_bbox])
  File "/usr/local/lib/python3.5/dist-packages/keras/engine/topology.py", line 603, in __call__
    output = self.call(inputs, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/keras/layers/core.py", line 651, in call
    return self.function(inputs, **arguments)
  File "/projects/mask_rcnn/model.py", line 1793, in <lambda>
    rpn_bbox_loss = KL.Lambda(lambda x: rpn_bbox_loss_graph(config, *x), name="rpn_bbox_loss")(
  File "/projects/mask_rcnn/model.py", line 987, in rpn_bbox_loss_graph
    diff = K.abs(target_bbox - rpn_bbox)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/math_ops.py", line 894, in binary_op_wrapper
    return func(x, y, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 4636, in _sub
    "Sub", x=x, y=y, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): Incompatible shapes: [15,4] vs. [30,4]
	 [[Node: rpn_bbox_loss/sub = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](rpn_bbox_loss/concat, rpn_bbox_loss/GatherNd)]]
	 [[Node: proposal_targets/roi_assertion_1/AssertGuard/Assert/Switch/_1627 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_5177_proposal_targets/roi_assertion_1/AssertGuard/Assert/Switch", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Is there anyone that has faced a similar error here?
My system is running Python 3.5 and the latest Keras (2.0.9) and TensorFlow (1.4.0) versions.
Thanks!

Suggestions

1. At line 2159, in unmold_detections(self, detections, mrcnn_mask, image_shape, window), the bounding-box area calculation is wrong: exclude_ix = np.where((boxes[:, 2] - boxes[:, 0]) * (boxes[:, 2] - boxes[:, 0]) <= 0)[0] multiplies the height term by itself instead of height by width (see the corrected line below).
2. Also, in this function, the step that filters out zero-area boxes is better placed after the coordinates are translated back to the image domain, because that translation can itself produce zero-area boxes.
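i.e., the corrected line would read:

exclude_ix = np.where(
    (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1]) <= 0)[0]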

GPU and Mask_shape

Some questions about GPU_COUNT and MASK_SHAPE:

  1. In config.py, GPU_COUNT is described as the number of GPUs to use ("for training on CPU, use 1"). So if I want to use a single GPU on a multi-GPU machine, should I set GPU_COUNT to 1 or 2? In the ParallelModel class, GPU_COUNT appears to be the real number of GPUs, with no offset (I had read it as 1 meaning CPU and 2 meaning a single GPU...).

  2. Zero volatile GPU-Util but high GPU memory usage: most of the time the volatile GPU-Util is zero while GPU memory usage is very high, and training became slow after I set USE_MINI_MASK = False. Is this a bug? What should I do? The input image size is 1024*1024.

  3. config.py has two parameters, MASK_POOL_SIZE and MASK_SHAPE, but the FCN head has only one deconv layer, which implies MASK_SHAPE = 2 * MASK_POOL_SIZE. What should I do if I want a denser segmentation without resizing from 28*28 up to the ROI size?

Training the model on Custom Images

Please guide me on how to train the model and weights on custom images. I have a set of images. How do I import those images, create masks for them, and train this model?

Class imbalance for RPN

Hi,

For my master's thesis I am working on text detection in images and video frames. I implemented a modified version of Faster R-CNN/Mask R-CNN, very similar to your implementation but tailored for text detection.

On average there are ~54 positive text anchors in my images, which results in a mini batch of ~54 positive and ~200 negative examples per image (I use mini batches of size 254 per image). The problem I encounter is that my network overfits on only negative predictions (because, on average, there are many more negative than positive examples). A few simple solutions would be to 1) use a smaller mini-batch size (for example 112), 2) remove all images without enough positive examples from the dataset, or 3) use a weighted loss function.

I inspected your code very carefully, but (as far as I can see) this doesn't seem to be an issue in your implementation. Was this positive/negative class imbalance also a problem for you, and if so, how did you solve it? (See the anchor-sampling sketch after this post.)

Thanks!

Maurits
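For reference, the repository does handle this: build_rpn_targets() in model.py caps positives at half of RPN_TRAIN_ANCHORS_PER_IMAGE and neutralizes surplus negatives, so the RPN mini-batch stays roughly balanced. A paraphrased sketch (rpn_match is 1/-1/0 for positive/negative/neutral anchors, and neutral anchors don't contribute to the loss):

import numpy as np

def balance_rpn_match(rpn_match, train_anchors_per_image=256):
    # Don't let positives be more than half the anchors
    ids = np.where(rpn_match == 1)[0]
    extra = len(ids) - (train_anchors_per_image // 2)
    if extra > 0:
        rpn_match[np.random.choice(ids, extra, replace=False)] = 0
    # Neutralize surplus negatives so positives + negatives fit the budget
    ids = np.where(rpn_match == -1)[0]
    extra = len(ids) - (train_anchors_per_image - np.sum(rpn_match == 1))
    if extra > 0:
        rpn_match[np.random.choice(ids, extra, replace=False)] = 0
    return rpn_match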

Detection speed drops as batch size increases

I tested the detection speed with different batch sizes, and surprisingly the detection speed drops as the batch size increases.
I then checked my device and found it was OK.
Finally, I found the explanation in batch_slice's docstring in utils.py.

Batch Slicing
Some custom layers support a batch size of 1 only, and require a lot of work
to support batches greater than 1. This function slices an input tensor
across the batch dimension and feeds batches of size 1. Effectively,
an easy way to support batches > 1 quickly with little code modification.
In the long run, it's more efficient to modify the code to support large
batches and getting rid of this function. Consider this a temporary solution

Although I have changed the code in DetectionLayer, the fact is that the model only superficially supports batch sizes > 1.

I also found a bug: if we feed fewer images than config.BATCH_SIZE to detect(self, images, verbose=0), the program raises an exception.
I added an assert to catch it:

assert len(images) == self.config.BATCH_SIZE, "len(images) must be equal to BATCH_SIZE"

Run out of memory

Well, I have a VOC-like dataset with 7000 classes, so I use the following config:

    GPU_COUNT = 2
    IMAGES_PER_GPU = 1
    STEPS_PER_EPOCH = 150
    BASE_EPOCH = 10
    NUM_CLASSES = 1 + 7000  # WARNING: This dataset has  7000 classes
    MAX_GT_INSTANCES = 50
    POST_NMS_ROIS_TRAINING = 1000
    POST_NMS_ROIS_INFERENCE = 500
    DETECTION_MAX_INSTANCES = 50

The other config values are just the defaults.

And~ I keep fewer than 15 objects per image.

I run it on 2 Titan X GPUs, each of which has 12 GB of memory, but I still run out of memory during training:

41/150 [=======>......................] - ETA: 1:12 - loss: 3.6104 - rpn_class_loss: 0.0231 - rpn_bbox_loss: 0.7184 - mrcnn_class_loss: 1.5067 - mrcnn_bbox_loss: 0.6695 - mrcnn_mask_loss: 0.6922
42/150 [=======>......................] - ETA: 1:11 - loss: 3.6236 - rpn_class_loss: 0.0239 - rpn_bbox_loss: 0.7198 - mrcnn_class_loss: 1.5192 - mrcnn_bbox_loss: 0.6680 - mrcnn_mask_loss: 0.6921


2017-11-13 16:05:25.086033: W tensorflow/core/common_runtime/bfc_allocator.cc:273] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.62GiB.  Current allocation summary follows.
2017-11-13 16:05:25.086152: I tensorflow/core/common_runtime/bfc_allocator.cc:627] Bin (256): 	Total Chunks: 276, Chunks in use: 275. 69.0KiB allocated for chunks. 68.8KiB in use in bin. 11.9KiB client-requested in use in bin.
2017-11-13 16:05:25.086173: I tensorflow/core/common_runtime/bfc_allocator.cc:627] Bin (512): 	Total Chunks: 45, Chunks in use: 44. 23.2KiB allocated for chunks. 22.5KiB in use in bin. 22.1KiB client-requested in use in bin.
.....
.....

So, can anyone tell me how to prevent this?


Update: OK~ setting TRAIN_ROIS_PER_IMAGE = 32 makes it run well.

Data Generator - reduction operation minimum which has no identity

I'm playing around with the shapes dataset; I have scaled the settings up by 4x:

IMAGE_MIN_DIM = 4*128
IMAGE_MAX_DIM = 4*128
RPN_ANCHOR_SCALES = (4*8, 4*16, 4*32, 4*64, 4*128)  # anchor side in pixels
TRAIN_ROIS_PER_IMAGE = 4*32

I also generate 100x more samples. However, I get errors like the following that stop execution:

ValueError Traceback (most recent call last)
in ()
6 learning_rate=config.LEARNING_RATE,
7 epochs=1,
----> 8 layers='heads')
/home/iliauk/Mask_RCNN/model.py in train(self, train_dataset, val_dataset, learning_rate, epochs, layers)
2072 "steps_per_epoch": self.config.STEPS_PER_EPOCH,
2073 "callbacks": callbacks,
-> 2074 "validation_data": next(val_generator),
2075 "validation_steps": self.config.VALIDATION_STPES,
2076 "max_queue_size": 100,
/home/iliauk/Mask_RCNN/model.py in data_generator(dataset, config, shuffle, augment, random_rois, batch_size, detection_targets)
1525 image_id = image_ids[image_index]
1526 image, image_meta, gt_boxes, gt_masks =
-> 1527 load_image_gt(dataset, config, image_id, augment=augment, use_mini_mask=config.USE_MINI_MASK)
1528
1529 # Skip images that have no instances. This can happen in cases
/home/iliauk/Mask_RCNN/model.py in load_image_gt(dataset, config, image_id, augment, use_mini_mask)
1150 # Resize masks to smaller size to reduce memory usage
1151 if use_mini_mask:
-> 1152 mask = utils.minimize_mask(bbox, mask, config.MINI_MASK_SHAPE)
1153
1154 # Image meta data
/home/iliauk/Mask_RCNN/utils.py in minimize_mask(bbox, mask, mini_shape)
433 y1, x1, y2, x2 = bbox[i][:4]
434 m = m[y1:y2, x1:x2]
--> 435 m = scipy.misc.imresize(m.astype(float), mini_shape, interp='bilinear')
436 mini_mask[:, :, i] = np.where(m >= 128, 1, 0)
437 return mini_mask
/anaconda/envs/py35/lib/python3.5/site-packages/numpy/lib/utils.py in newfunc(*args, **kwds)
99 """arrayrange is deprecated, use arange instead!"""
100 warnings.warn(depdoc, DeprecationWarning, stacklevel=2)
--> 101 return func(*args, **kwds)
102
103 newfunc = _set_function_name(newfunc, old_name)
/anaconda/envs/py35/lib/python3.5/site-packages/scipy/misc/pilutil.py in imresize(arr, size, interp, mode)
552
553 """
--> 554 im = toimage(arr, mode=mode)
555 ts = type(size)
556 if issubdtype(ts, numpy.signedinteger):
/anaconda/envs/py35/lib/python3.5/site-packages/numpy/lib/utils.py in newfunc(*args, **kwds)
99 """arrayrange is deprecated, use arange instead!"""
100 warnings.warn(depdoc, DeprecationWarning, stacklevel=2)
--> 101 return func(*args, **kwds)
102
103 newfunc = _set_function_name(newfunc, old_name)
/anaconda/envs/py35/lib/python3.5/site-packages/scipy/misc/pilutil.py in toimage(arr, high, low, cmin, cmax, pal, mode, channel_axis)
334 if mode in [None, 'L', 'P']:
335 bytedata = bytescale(data, high=high, low=low,
--> 336 cmin=cmin, cmax=cmax)
337 image = Image.frombytes('L', shape, bytedata.tostring())
338 if pal is not None:
/anaconda/envs/py35/lib/python3.5/site-packages/numpy/lib/utils.py in newfunc(*args, **kwds)
99 """arrayrange is deprecated, use arange instead!"""
100 warnings.warn(depdoc, DeprecationWarning, stacklevel=2)
--> 101 return func(*args, **kwds)
102
103 newfunc = _set_function_name(newfunc, old_name)
/anaconda/envs/py35/lib/python3.5/site-packages/scipy/misc/pilutil.py in bytescale(data, cmin, cmax, high, low)
91
92 if cmin is None:
---> 93 cmin = data.min()
94 if cmax is None:
95 cmax = data.max()
/anaconda/envs/py35/lib/python3.5/site-packages/numpy/core/_methods.py in _amin(a, axis, out, keepdims)
27
28 def _amin(a, axis=None, out=None, keepdims=False):
---> 29 return umr_minimum(a, axis, None, out, keepdims)
30
31 def _sum(a, axis=None, dtype=None, out=None, keepdims=False):
ValueError: zero-size array to reduction operation minimum which has no identity
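The crash happens when an instance's bounding box collapses to zero area before the resize. Below is a hedged guard sketch for utils.minimize_mask() that fails loudly instead of dying inside scipy; the _safe suffix in the function name is mine, and the body otherwise mirrors the utils.minimize_mask shown in the traceback above.

import numpy as np
import scipy.misc

def minimize_mask_safe(bbox, mask, mini_shape):
    """Like utils.minimize_mask, but rejects zero-area boxes explicitly."""
    mini_mask = np.zeros(mini_shape + (mask.shape[-1],), dtype=bool)
    for i in range(mask.shape[-1]):
        m = mask[:, :, i]
        y1, x1, y2, x2 = bbox[i][:4]
        m = m[y1:y2, x1:x2]
        if m.size == 0:
            raise Exception("Invalid bounding box with area of zero")
        m = scipy.misc.imresize(m.astype(float), mini_shape, interp='bilinear')
        mini_mask[:, :, i] = np.where(m >= 128, 1, 0)
    return mini_mask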

Performance Improvement

Firstly, I want to congratulate you on this amazing work.

In your opinion, which modifications would have the biggest impact on the performance (frame rate) of the method?

I'm thinking of going with a smaller backbone network (for instance, SqueezeNet). Do you think that's where the performance bottleneck is?

scipy.misc has no attribute 'imread' when scipy version is 1.0.0

When I run the demo, there is an error:
scipy.misc has no attribute 'imread'.

I checked the code and found that 'imread' is deprecated in SciPy 1.0.0 and will be removed in 1.2.0.
But the fact is that the error already occurs with SciPy 1.0.0.
I changed the SciPy version to 1.0.0rc2 and then the demo ran OK.

Could you provide a dependency list for the project to guide users in preparing the environment? (A workaround sketch follows.)
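One workaround sketch: replace the deprecated scipy.misc.imread with skimage.io.imread, which other parts of the project already use (IMAGE_DIR and file_names as defined in demo.ipynb):

import os
import random
import skimage.io

image = skimage.io.imread(os.path.join(IMAGE_DIR, random.choice(file_names)))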

KL.UpSampling2D()

You used KL.UpSampling2D() to build the FPN network. Did you mean un-pooling or deconv?

When I looked at the Keras documentation, it says this layer repeats the rows and columns (more like un-pooling). See: https://keras.io/layers/convolutional/#upsampling2d

But it seems like this step should be more like deconv. Am I wrong in my understanding of FPN?
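You're not wrong about what UpSampling2D does, but the FPN paper itself specifies nearest-neighbour upsampling for the top-down path rather than a learned deconvolution, so the layer choice matches the paper. A paraphrased sketch of the FPN construction in model.py (C4 and C5 are feature maps from the ResNet backbone):

import keras.layers as KL

# Top-down pathway: upsample the coarser map, add the 1x1-conv lateral connection
P5 = KL.Conv2D(256, (1, 1), name='fpn_c5p5')(C5)
P4 = KL.Add(name="fpn_p4add")([
    KL.UpSampling2D(size=(2, 2), name="fpn_p5upsampled")(P5),
    KL.Conv2D(256, (1, 1), name='fpn_c4p4')(C4)])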

Does not work

I want to train on my own dataset. The image size is 1024*1024, on a Titan Xp GPU (12 GB). When I run the script, it just waits.
Configurations:

BACKBONE_SHAPES                [[256 256]
 [128 128]
 [ 64  64]
 [ 32  32]
 [ 16  16]]
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     1
BBOX_STD_DEV                   [ 0.1  0.1  0.2  0.2]
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.3
GPU_COUNT                      1
IMAGES_PER_GPU                 1
IMAGE_MAX_DIM                  1024
IMAGE_MIN_DIM                  1024
IMAGE_PADDING                  True
IMAGE_SHAPE                    [1024 1024    3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.002
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [ 96.  96.  96.]
MINI_MASK_SHAPE                (56, 56)
NAME                           guidewire
NUM_CLASSES                    2
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256)
RPN_ANCHOR_STRIDE              2
RPN_BBOX_STD_DEV               [ 0.1  0.1  0.2  0.2]
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                100
TRAIN_ROIS_PER_IMAGE           32
USE_MINI_MASK                  False
USE_RPN_ROIS                   True
VALIDATION_STEPS               1
WEIGHT_DECAY                   0.0001


2017-11-12 12:59:57.096834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: TITAN Xp, pci bus id: 0000:0a:00.0, compute capability: 6.1)

mrcnn_mask_conv1       (TimeDistributed)
mrcnn_mask_bn1         (TimeDistributed)
mrcnn_mask_conv2       (TimeDistributed)
mrcnn_class_conv1      (TimeDistributed)
mrcnn_mask_bn2         (TimeDistributed)
mrcnn_class_bn1        (TimeDistributed)
mrcnn_mask_conv3       (TimeDistributed)
mrcnn_mask_bn3         (TimeDistributed)
mrcnn_class_conv2      (TimeDistributed)
mrcnn_class_bn2        (TimeDistributed)
mrcnn_mask_conv4       (TimeDistributed)
mrcnn_mask_bn4         (TimeDistributed)
mrcnn_bbox_fc          (TimeDistributed)
mrcnn_mask_deconv      (TimeDistributed)
mrcnn_class_logits     (TimeDistributed)
mrcnn_mask             (TimeDistributed)
/home/wuyudong/anaconda3/envs/python34/lib/python3.4/site-packages/tensorflow/python/ops/gradients_impl.py:96: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/home/wuyudong/anaconda3/envs/python34/lib/python3.4/site-packages/keras/engine/training.py:2022: UserWarning: Using a generator with use_multiprocessing=True and multiple workers may duplicate your data. Please consider using the keras.utils.Sequence class.
  UserWarning(Using a generator with use_multiprocessing=True
Epoch 1/100

When I press Ctrl+C, it outputs:

Traceback (most recent call last):
  File "train_own_data.py", line 252, in <module>
    train_own_data()
  File "train_own_data.py", line 189, in train_own_data
    layers="all")
  File "/home/wuyudong/Project/scripts/Mask_RCNN/model.py", line 2088, in train
    **fit_kwargs
  File "/home/wuyudong/anaconda3/envs/python34/lib/python3.4/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/wuyudong/anaconda3/envs/python34/lib/python3.4/site-packages/keras/engine/training.py", line 2046, in fit_generator
    generator_output = next(output_generator)
  File "/home/wuyudong/anaconda3/envs/python34/lib/python3.4/site-packages/keras/utils/data_utils.py", line 661, in get
    time.sleep(self.wait_time)
KeyboardInterrupt

When I set the image size to 256*256, it works.
What should I do?

How to evaluate on a new image?

Hi,
First, thanks for sharing this great repo.

I tried the Jupyter notebook but got an error in the first cell (not sure why):

File "coco.py", line 339
    config.print()
               ^
SyntaxError: invalid syntax

Then I commented out that line, given that it is just a print, but then I got a ton more errors.

So, back to the command line, but I get the same error...

Can you please help fix this, or give the steps to properly evaluate on a new image (not COCO data)? (See the note below.)

Thanks for your help.
Tets
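A note for anyone else here: config.print() is valid syntax only under Python 3; under Python 2, print is a reserved keyword, so a SyntaxError at that line usually means the script ran with Python 2 (the repo requires Python 3.4+). A hedged sketch of the equivalent call, assuming a revision of the code where the method is named display():

config = InferenceConfig()
config.display()  # assumption: some revisions name this method display() rather than print()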

OOM when allocating tensor

First of all, I would like to thank the authors for such great work.
When running the train_shapes notebook, I get the following error message three times in the first epoch.

ResourceExhaustedError: OOM when allocating tensor with shape[256,256,28,28]
	 [[Node: mrcnn_mask_deconv/conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="VALID", strides=[1, 2, 2, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](mrcnn_mask_deconv/stack, mrcnn_mask_deconv/kernel/read, mrcnn_mask_deconv/Reshape)]]
	 [[Node: Mean_9/_5017 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_17513_Mean_9", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

I guess my humble 2 GB GPU is not enough... how could I solve this? I do not see any batch size option implemented; is there one at all? Do you think that would solve my issue? I would appreciate all kinds of suggestions. Thanks.
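There is no explicit batch-size option; the effective batch size is GPU_COUNT * IMAGES_PER_GPU, computed in config.py. A hedged sketch of config overrides that tend to shrink the memory footprint (the shapes notebook's ShapesConfig uses IMAGES_PER_GPU = 8 by default; the values below are illustrative, not tuned):

class LowMemConfig(ShapesConfig):
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1         # effective batch size = GPU_COUNT * IMAGES_PER_GPU
    TRAIN_ROIS_PER_IMAGE = 32  # fewer sampled ROIs also helps (see the 7000-class issue above)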

Pre-trained weights

The README mentions pre-trained weights trained on the COCO dataset; however, they don't seem to be in the repo. Could you please commit or link them? They would be really useful.

Wings, limbs and holes

The model seems to have problems with wings, limbs and holes. Is there theoretical awareness of this, and are there proposed countermeasures?

Detection results are wrong when input batch size > 1

I use 'model.detect(image_batch, verbose=1)' to detect images, but I find the results are wrong.
The results for '1.jpg' and '10.jpg' are the same.
I checked the code and found that the input values are OK, but detections[0] and detections[1] are duplicates.

code in model.py

detections, mrcnn_class, mrcnn_bbox, mrcnn_mask, \
        rois, rpn_class, rpn_bbox =\
    self.keras_model.predict([molded_images, image_metas], verbose=0)

my test code

image_batch = []
image_name_batch = []

image_path_1 = './1.jpg'
image_batch.append(skimage.io.imread(image_path_1))
image_name_batch.append(image_path_1)
image_path_10 = './10.jpg'
image_batch.append(skimage.io.imread(image_path_10))
image_name_batch.append(image_path_10)

results = model.detect(image_batch, verbose=1)
for j in range(len(image_batch)):
    r = results[j]
    visualize.display_instances(image_batch[j], r['rois'], r['masks'], r['class_ids'],
                      coco_class_names, r['scores'],title=image_name_batch[j])

No file mask_rcnn_coco.h5

When I want to run demo.ipynb, I can't find the model file mask_rcnn_coco.h5. Where can I get the file, or must I train the model first?

Training Policy Advice?

Hi, I want to train on my own dataset, consisting of 10 classes and 30000 images with segmentation annotations. Can you give me some advice on how many epochs to use for each training stage, and how much time it may take to train on two Titan X GPUs? Thanks a lot!

Do I need to resize mask after resizing image?

Hi, I made masks for some 2048*2048 images, then found they are too big for my memory, so I changed IMAGE_MAX_DIM in the config. But I couldn't find anywhere to change the size of the masks. Will the masks be resized automatically?
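As far as I can tell, yes: load_image_gt() in model.py resizes the masks with the same scale and padding as the image, so no manual change is needed. Paraphrased from that function:

image, window, scale, padding = utils.resize_image(
    image, min_dim=config.IMAGE_MIN_DIM, max_dim=config.IMAGE_MAX_DIM,
    padding=config.IMAGE_PADDING)
mask = utils.resize_mask(mask, scale, padding)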

Shape mismatch error

Hi @waleedka ,

Thanks a lot for an awesome repository. You've done great work.
I'm trying to trigger training on my own dataset, which is of the COCO dataset type but with fewer classes (only 5).

When I changed the number of classes to 5+1 in coco.py and triggered training from the COCO pretrained model, I got a shape mismatch error.

Is this because the pretrained model has 81 classes, or have I made a mistake? Please correct me if I'm wrong.

So does this mean I need to start a new training from scratch for 5 classes? (See the sketch below.)

Regards,
Sharath
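No need to train from scratch: the mismatch comes from the class-dependent head layers, whose weight shapes differ between 81 and 6 classes. The train_shapes notebook handles exactly this case by excluding those layers when loading the COCO weights, roughly:

model.load_weights(COCO_MODEL_PATH, by_name=True,
                   exclude=["mrcnn_class_logits", "mrcnn_bbox_fc",
                            "mrcnn_bbox", "mrcnn_mask"])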

ValueError: zero-size array to reduction operation minimum which has no identity

Hi~ I am training Mask R-CNN on my dataset. During training, an error occurred:

63/150 [===========>..................] - ETA: 1:05 - loss: 5.5540 - rpn_class_loss: 0.1711 - rpn_bbox_loss: 1.3187 - mrcnn_class_loss: 1.6656 - mrcnn_bbox_loss: 0.8899 - mrcnn_mask_loss: 0.6008
ERROR:root:Error processing image {'source': 'suncg', 'path': '/path/to/mlt/8a33bca7ed13c8d2698303625feba21a/000005.png', 'obj_mask_path': '/path/to/node/8a33bca7ed13c8d2698303625feba21a/000005.png', 'cls_mask_path': '/path/to/category/8a33bca7ed13c8d2698303625feba21a/000005.png', 'id': 8067}
Traceback (most recent call last):
  File "/home/haoyu/Workspace/Mask_RCNN/model.py", line 1523, in data_generator
    load_image_gt(dataset, config, image_id, augment=augment, use_mini_mask=config.USE_MINI_MASK)
  File "/home/haoyu/Workspace/Mask_RCNN/model.py", line 1148, in load_image_gt
    mask = utils.minimize_mask(bbox, mask, config.MINI_MASK_SHAPE)
  File "/home/haoyu/Workspace/Mask_RCNN/utils.py", line 436, in minimize_mask
    m = scipy.misc.imresize(m.astype(float), mini_shape, interp='bilinear')
  File "/home/haoyu/venv/lib/python3.4/site-packages/scipy/misc/pilutil.py", line 480, in imresize
    im = toimage(arr, mode=mode)
  File "/home/haoyu/venv/lib/python3.4/site-packages/scipy/misc/pilutil.py", line 299, in toimage
    cmin=cmin, cmax=cmax)
  File "/home/haoyu/venv/lib/python3.4/site-packages/scipy/misc/pilutil.py", line 90, in bytescale
    cmin = data.min()
  File "/home/haoyu/venv/lib/python3.4/site-packages/numpy/core/_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation minimum which has no identity

Can anyone tell me how to fix it? (The same error, and a guard sketch, appear in the "Data Generator" issue above.)

Suggestion: Add `pycocotools` and `cython` to requirements + update installation instructions

Hi Waleed,
This is wonderful work and I'm glad you made it available to the rest of us.
The two small edits to README.md might make it easier to experiment with your package:

## Requirements
* Python 3.4+
* TensorFlow 1.3+
* Keras 2.0.8+
* Jupyter Notebook
* Numpy, skimage, scipy
* pycocotools, cython

If you use Docker, the model has been verified to work on
[this Docker container](https://hub.docker.com/r/waleedka/modern-deep-learning/).

The package `pycocotools` requires `cython` and a C compiler to install correctly. See below for further instructions.

## Installation
1. Clone this repository
2. Download pre-trained COCO weights from the releases section of this repository.
3. Install `pycocotools` as follows:
    - On Linux, run `pip install git+https://github.com/waleedka/cocoapi.git#egg=pycocotools&subdirectory=PythonAPI`
    - On Windows, run `pip install git+https://github.com/philferriere/cocoapi.git#egg=pycocotools^&subdirectory=PythonAPI`

Note that on Windows, for the above to work, you must have the Visual C++ 2015 build tools on your path (see [this coco clone](https://github.com/philferriere/cocoapi) for additional details).

Let me know if you'd like me to PR this.
Thanks again!

Results reproduction

Hi, thanks a lot for publishing this code as an open source project.

I trained the net without making any changes to the code/configuration, initializing it with ImageNet weights. After training, the output weights give worse (though somewhat similar) results than the ones provided in the repository.
I have some thoughts on why the training could go wrong, although I'm not sure I'm correct. From the code, it seems to me that the mean is applied at the end of the loss calculations and results from different GPUs are concatenated during model parallelization, hence no learning-rate change is needed. I trained the net with 1 GPU, and as far as I understand, in that case each step trains on 2 images, whereas with 8 GPUs each step trains on 16 images. My thought is that the latter makes training more stable, since the impact of noise is smaller: the gradient direction is determined from 16 images instead of 2.
Please correct me if you think I have drawn the wrong conclusions; and do you have any ideas on why the training could go wrong?

Thanks in advance!

Here is an inference result with my trained weights.

[image: result_1]

[image: tensor_board]

Train own Dataset by taking actual images

Hi,

Thanks a lot for the awesome repository.

I went through the train_shapes file, which describes how to train on your own dataset.

But everything there is generated randomly. Could you explain the same process using actual images that have ground-truth mask, class, and bounding-box information?

Regards,
Pirag

ValueError: The channel dimension of the inputs should be defined. Found `None`. in demo.ipynb

Hey @waleedka

Great work. Now I have a man-crush on you.

A little PSA. If anyone is getting this error:

ValueError: The channel dimension of the inputs should be defined. Found None.

you're most likely having a backend issue (in ~/.keras/keras.json): it might be set to Theano instead of TF. And even if it is, for some reason the Jupyter notebook may not be reading it correctly.

Solution. You can do either of the following:

  1. Before this cell is executed:
model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)
# Load weights trained on MS-COCO
model.load_weights(COCO_MODEL_PATH, by_name=True)

you can add:

import keras.backend

K = keras.backend.backend()
if K=='tensorflow':
    keras.backend.set_image_dim_ordering('tf')

  2. Stick it in an IPython notebook startup file (~/.ipython/profile_default/startup/) like this guy
