broadinstitute / keras-rcnn Goto Github PK

View Code? Open in Web Editor NEW

554.0 39.0 223.0 5.2 MB

Keras package for region-based convolutional neural networks (RCNNs)

License: Other

Python 100.00%

deep-learning theano tensorflow cntk object-detection image-segmentation

keras-rcnn's Introduction

Keras-RCNN

keras-rcnn is the Keras package for region-based convolutional neural networks.

Requirements

Python 3

keras-resnet==0.2.0

numpy==1.16.2

tensorflow==1.13.1

Keras==2.2.4

scikit-image==0.15.0

Getting Started

Let’s read and inspect some data:

training_dictionary, test_dictionary = keras_rcnn.datasets.shape.load_data()

categories = {"circle": 1, "rectangle": 2, "triangle": 3}

generator = keras_rcnn.preprocessing.ObjectDetectionGenerator()

generator = generator.flow_from_dictionary(
    dictionary=training_dictionary,
    categories=categories,
    target_size=(224, 224)
)

validation_data = keras_rcnn.preprocessing.ObjectDetectionGenerator()

validation_data = validation_data.flow_from_dictionary(
    dictionary=test_dictionary,
    categories=categories,
    target_size=(224, 224)
)

target, _ = generator.next()

target_bounding_boxes, target_categories, target_images, target_masks, target_metadata = target

target_bounding_boxes = numpy.squeeze(target_bounding_boxes)

target_images = numpy.squeeze(target_images)

target_categories = numpy.argmax(target_categories, -1)

target_categories = numpy.squeeze(target_categories)

keras_rcnn.utils.show_bounding_boxes(target_images, target_bounding_boxes, target_categories)

Let’s create an RCNN instance:

model = keras_rcnn.models.RCNN((224, 224, 3), ["circle", "rectangle", "triangle"])

and pass our preferred optimizer to the compile method:

optimizer = keras.optimizers.Adam(0.0001)

model.compile(optimizer)

Finally, let’s use the fit_generator method to train our network:

model.fit_generator(
    epochs=10,
    generator=generator,
    validation_data=validation_data
)

External Data

The data is made up of a list of dictionaries corresponding to images.

For each image, add a dictionary with keys 'image', 'objects'
- 'image' is a dictionary, which contains keys 'checksum', 'pathname', and 'shape'
  
  'checksum' is the md5 checksum of the image
  
  'pathname' is the pathname of the image, put in full pathname
  
  'shape' is a dictionary with keys 'r', 'c', and 'channels'
  
  'c': number of columns
  
  'r': number of rows
  
  'channels': number of channels
- 'objects' is a list of dictionaries, where each dictionary has keys 'bounding_box', 'category'
  
  'bounding_box' is a dictionary with keys 'minimum' and 'maximum'
  
  'minimum': dictionary with keys 'r' and 'c'
  
  'r': smallest bounding box row
  
  'c': smallest bounding box column
  
  'maximum': dictionary with keys 'r' and 'c'
  
  'r': largest bounding box row
  
  'c': largest bounding box column
  
  'category' is a string denoting the class name

Suppose this data is save in a file called training.json. To load data,

import json

with open('training.json') as f:
    d = json.load(f)

Slack

We’ve been meeting in the #keras-rcnn channel on the keras.io Slack server.

You can join the server by inviting yourself from the following website:

https://keras-slack-autojoin.herokuapp.com/

keras-rcnn's People

Contributors

Stargazers

Watchers

Forkers

jgraving jihongju bityangke rulanchen jianchi8 fei9009 mayman99 cicobalico jhung0 ishanic thegreatshasha hyzcn fizyr-forks wubizhi benjamesbabala kulikovv goldsborough syeda27 boluoyu whytin 24hours dmitrysarov asylbek27 cells2numbers rok avansp bhaveshoswal mouhanedg56 fitrialif qinst64 milani zhoujs17 therahulkumar pengkiki thevivekpandey kinect59 zzdang mehrdad-shokri eynaij chrisakroyd rahasayantan yueqiw lianyi khaledto pierson-we fantasycz theone4ever akshaybapat04 daicoolb imparkss guilhermefscampos aglotero fanyishu123 panthole chenglongchen luogongning zarak codeaudit nodevel jackvial fenilsuchak osciiart ankitanaik rboyes lqchien aihill airoot dddzg xuanyangge dlombardo376 jorgeecardona mave5 k-sandhu brandenkmurray nsl2014fm drwaltman vz415 nanase-always andrearapuzzi aguang1201 neuwangmeng chaunh2 nguyenhongchau ngxbac nick917 yizhiyuzhang reiisky eliotandres senzaki chenghuige shixiaofu cnzjhdx murari023 sunshinezhihuo ifhubs sbordt grseb9s skarimik bw4sz afcarl

keras-rcnn's Issues

Anchor layer

It should produce anchors from ground truth bounding boxes.

This is the first of two layers needed by the regional proposal network (RPN) layers.

Documentation

Document all functions, classes, modules

Dictionary schema for datasets

Make a standard dictionary schema for all datasets and make sure existing ones (malaria, pascal) adhere.

Add tests to everything

Make sure there are tests for everything, including models, pascal, theano/cntk backend.
Also, use keras’ test layer functionality.

Mask R-CNN

keras-rcnn should include an implementation of Mask R-CNN.

Update readme

Update readme on current status of the project

keras-rcnn has two region proposal network (RPN) loss layers (RPNClassificationLoss and RPNRegressionLoss) and two region-based convolutional neural network (RCNN) layers (RCNNClassification and RCNNRegression). I think it makes sense to combine the two RPN layers into one RPN layer and the two RCNN layers into one RCNN layer in an effort to further simplify the keras-rcnn’s public API.

Cityscapes dataset

contributor.rst

Make a contributor.rst including info on how to contribute, what's expected, etc.

Implement theano, cntk backend functions

ImageSegmentationGenerator

keras-rcnn should include an ImageSegmentationGenerator class like Keras’s ImageDataGenerator.

Feature Pyramid Networks

FPN highlights the importance of leveraging the multi-scale feature hierarchy for object detection.

More details see the paper: Feature Pyramid Networks for Object Detection

Get tests passing on master

Use of TimeDistributed(BatchNormalization())

I believe there is an issue with using TimeDistributed(BatchNormalization()) in keras, as is done in keras-rcnn/keras_rcnn/classifiers/resnet.py (although I may be mistaken).As I understand, this leads to the moving mean and variance not being updated.

A small example:

from keras.layers import *
from keras.models import *
import keras.backend as K

img_size = 8
batch_size = 64
num_time_steps = 4
num_channels = 3

inputs = Input(shape=(num_time_steps, img_size, img_size, num_channels))
x = TimeDistributed(BatchNormalization(axis=3))(inputs)

model = Model(inputs=inputs, outputs=x)
model.compile(loss='mae', optimizer='sgd')

X = np.random.rand(batch_size, num_time_steps, img_size, img_size, num_channels)
Y = np.random.rand(batch_size, num_time_steps, img_size, img_size, num_channels)
history = model.fit(X, Y, epochs=4)

print(model.layers[1].get_weights()) # print the weights of the BN layer

And the output of the print() statement is:

[array([ 0.97972429,  0.97963089,  0.9796797 ], dtype=float32), array([ 0.00780913,  0.00766754, 
0.00772369], dtype=float32), array([ 0.,  0.,  0.], dtype=float32), array([ 1.,  1.,  1.], dtype=float32)]

Note that the [0, 0, 0] and [1, 1, 1] are the default, non-updated values of the mean and variance.
As an alternative, I think doing BatchNormalization(axis=bn_axis+1), with the +1 as an offset to account for the extra time dimension, is an ok fix to the problem.

`crop_and_resize` Theano backend function

Theano backend needs an implementation of keras_rcnn.backend.crop_and_resize.

Region proposal network (RPN) regression loss

It’s implanted here:

https://github.com/broadinstitute/keras-rcnn/blob/master/keras_rcnn/losses/proposals.py#L19-L38

but it needs documentation and unit tests.

gt Overlapping indices

Why is this line necessary/what is it for? Isn't gt_argmax_overlaps_inds defined before this line?
https://github.com/broadinstitute/keras-rcnn/blob/master/keras_rcnn/layers/object_detection/_anchor.py#L237

Add mask branch to residual classifier

Object detection generator should return a scale and scale boxes with images

R-CNN API design

I think it is the time to think about how the R-CNN API should look like. We have discussed it a bit in #7, but there it is more about the structure than the API.

From my understanding, a R-CNN framework should include a body, that predicts ROIs from images, and a couple of heads, that predicts scores, bounding boxes (and masks).

The idea is to easily attach different bodies, ResNets, (VGG, FPN etc.), and choose to include/exclude the mask head.

Do you have an elegant way of doing this in minds? @0x00b1 @jhung0

PASCAL dataset

anchor() issue

from keras_rcnn.backend.common import *
import tensorflow as tf
sess = tf.InteractiveSession()
anc = anchor(4,[1,2],[1])

[[ 0. 0. 3. 3. ]
[ 0. 0. 3. 3. ]
[ 0.5 -1. 2.5 4. ]
[ 0.5 -1. 2.5 4. ]]

I suppose that it should return list of two arrays.

ObjectDetection generator

keras-rcnn should include an ObjectDetection generator like Keras’s ImageDataGenerator.

Region proposal network (RPN) layer

The region proposal network (RPN) should take two inputs, image features (i.e. features extracted by ResNet) and ground truth bounding boxes and produce object proposals and corresponding “objectness” scores. I’m envisioning something like:

x = keras.layers.Input((223, 223, 3))

a = keras_resnet.ResNet50(x)

b = keras.layers.Input((None, 4))

y = keras_rcnn.layers.RPN((14, 14))([a, b])

set up travis lint for every commit

Travis lint should run on every commit

Alternating training

I only saw this library last week and I am not sure how complete it is. I was wondering if the alternating training technique mentioned in the paper (RPN-FastRCNN-RPN-FastRCNN-...) was implemented but I couldn't find the code for it here. Let's say I built a model successfully with keras-rcnn, now what should I do to train it? Should I just use model.fit()?

Check bounding boxes correctness

If any bounding box coordinates are outside the image (less than 0 or more than image size) or the order of the coordinates is wrong, the loss might become nan. Check that all coordinates are within image after scaling and that all x_max > x_min and y_max > y_min.

R-CNN regression loss

Document use with Apple’s Core ML framework

Keras-rcnn was written to be compatible with a number of third-party frameworks and services like Apple’s Core ML framework that enables developers to embed Keras models into their iOS applications. We should document how an Apple developer can create, train, and export their model to their Core ML-compatible iOS application.

Shape of y_pred and y_true

Why is the shape of y_pred and y_true (a, b, c, d) (e.g. see https://github.com/broadinstitute/keras-rcnn/blob/master/keras_rcnn/losses/rpn.py#L57)?
I think it'd make more sense to have shape (m, n). https://github.com/mitmul/chainer-faster-rcnn also supports that.

COCO dataset

MaskTargetLayer

Produce Mask targets for the mask branch.

Pep8 and Codecov

@0x00b1 Would you be keen to add codecov and pep8 check to standardize the contribution at some point?

RoIAlign Layer

Pooling layer proposed in Mask R-CNN

Mask Loss

Loss function for the mask branch of Mask R-CNN

return scores in objectproposal

Instead of returning just proposals, return scores as well https://github.com/broadinstitute/keras-rcnn/blob/master/keras_rcnn/backend/tensorflow_backend.py#L86

Get 100% code coverage

Mean average precision (mAP) metric

keras-rcnn should provide a mean average precision (mAP) Keras-compatiable metric that can be used to evaluate the performance of a model during training.

where is the starting point

How do start training on the malaria data...

If Somebody can Help...

I Did Run Setup.py but could not Figure out...

I am Finding it Difficult to start as there is no README...pls help regarding training on malaria data ...

R-CNN classification loss

Shape inference

Allow network to infer shapes and run without specifying image dimensions.
(Note: pascal has this issue but malaria does not)

Implement per batch mean average precision

For monitoring purposes, we should have independent, per batch mean average precision numbers.

Region proposal network (RPN) classification loss

It’s implanted here:

https://github.com/broadinstitute/keras-rcnn/blob/master/keras_rcnn/losses/proposals.py#L4-L16

but it needs documentation and unit tests.

TypeError when computing RPN loss

Starting from `0e28c4c` , there's a type error in the rpn loss.

TypeError Traceback (most recent call last)
in ()
7
8 deltas = keras_rcnn.layers.losses.RPNRegressionLoss(9)([deltas, bounding_box_targets, rpn_labels])
----> 9 scores = keras_rcnn.layers.losses.RPNClassificationLoss(9)([scores, rpn_labels])

/usr/local/lib/python3.6/site-packages/keras/engine/topology.py in call(self, inputs, **kwargs)
594
595 # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 596 output = self.call(inputs, **kwargs)
597 output_mask = self.compute_mask(inputs, previous_mask)
598

~/Documents/com/github/keras-rcnn/keras_rcnn/layers/losses/_rpn.py in call(self, inputs, **kwargs)
13 output, target = inputs
14
---> 15 loss = self.compute_loss(output, target)
16
17 self.add_loss(loss, inputs)

~/Documents/com/github/keras-rcnn/keras_rcnn/layers/losses/_rpn.py in compute_loss(output, target)
30 target = keras_rcnn.backend.gather_nd(target, indices)
31
---> 32 loss = keras.backend.sparse_categorical_crossentropy(target, output)
33 loss = keras.backend.mean(loss)
34

/usr/local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py in sparse_categorical_crossentropy(output, target, from_logits)
2777 output_shape = output.get_shape()
2778 targets = cast(flatten(target), 'int64')
-> 2779 logits = tf.reshape(output, [-1, int(output_shape[-1])])
2780 res = tf.nn.sparse_softmax_cross_entropy_with_logits(
2781 labels=targets,

TypeError: int returned non-int (type NoneType)

Add inside and outside bounding box weights

As in https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/fast_rcnn/config.py#L120 and used in https://github.com/rbgirshick/py-faster-rcnn/blob/master/lib/rpn/proposal_target_layer.py#L64 and regression losses.

Intersection over Union (IoU) algorithm

I think IoU is supposed to be like this:

But based on the code in this library, the definition of union is currently implemented as the rectangle defined by the top left corner of the first box and the bottom right corner of the second box, which is bigger than the true union.

code related to this issue here

Malaria dataset

test_malaria takes too long

Maybe we could mock the downloading procedure? @jhung0