keras-rcnn's Introduction

Keras-RCNN

keras-rcnn is the Keras package for region-based convolutional neural networks.

Requirements

Python 3

keras-resnet==0.2.0

numpy==1.16.2

tensorflow==1.13.1

Keras==2.2.4

scikit-image==0.15.0

Getting Started

Let’s read and inspect some data:

import keras

import keras_rcnn.datasets.shape
import keras_rcnn.models
import keras_rcnn.preprocessing
import keras_rcnn.utils

import numpy

training_dictionary, test_dictionary = keras_rcnn.datasets.shape.load_data()

categories = {"circle": 1, "rectangle": 2, "triangle": 3}

generator = keras_rcnn.preprocessing.ObjectDetectionGenerator()

generator = generator.flow_from_dictionary(
    dictionary=training_dictionary,
    categories=categories,
    target_size=(224, 224)
)

validation_data = keras_rcnn.preprocessing.ObjectDetectionGenerator()

validation_data = validation_data.flow_from_dictionary(
    dictionary=test_dictionary,
    categories=categories,
    target_size=(224, 224)
)

target, _ = generator.next()

target_bounding_boxes, target_categories, target_images, target_masks, target_metadata = target

target_bounding_boxes = numpy.squeeze(target_bounding_boxes)

target_images = numpy.squeeze(target_images)

target_categories = numpy.argmax(target_categories, -1)

target_categories = numpy.squeeze(target_categories)

keras_rcnn.utils.show_bounding_boxes(target_images, target_bounding_boxes, target_categories)

Let’s create an RCNN instance:

model = keras_rcnn.models.RCNN((224, 224, 3), ["circle", "rectangle", "triangle"])

and pass our preferred optimizer to the compile method:

optimizer = keras.optimizers.Adam(0.0001)

model.compile(optimizer)

Finally, let’s use the fit_generator method to train our network:

model.fit_generator(    
    epochs=10,
    generator=generator,
    validation_data=validation_data
)

External Data

The data is made up of a list of dictionaries, one per image (an example entry is sketched after the list below).

  • For each image, add a dictionary with keys 'image', 'objects'
    • 'image' is a dictionary, which contains keys 'checksum', 'pathname', and 'shape'
      • 'checksum' is the md5 checksum of the image
      • 'pathname' is the full pathname of the image
      • 'shape' is a dictionary with keys 'r', 'c', and 'channels'
        • 'c': number of columns
        • 'r': number of rows
        • 'channels': number of channels
    • 'objects' is a list of dictionaries, where each dictionary has keys 'bounding_box', 'category'
      • 'bounding_box' is a dictionary with keys 'minimum' and 'maximum'
        • 'minimum': dictionary with keys 'r' and 'c'
          • 'r': smallest bounding box row
          • 'c': smallest bounding box column
        • 'maximum': dictionary with keys 'r' and 'c'
          • 'r': largest bounding box row
          • 'c': largest bounding box column
      • 'category' is a string denoting the class name
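
For example, a single-image entry following this schema might look like the following (the pathname and checksum values are placeholders):

training_dictionary = [
    {
        "image": {
            "checksum": "d41d8cd98f00b204e9800998ecf8427e",  # placeholder md5
            "pathname": "/full/path/to/image_0001.png",  # placeholder pathname
            "shape": {"r": 224, "c": 224, "channels": 3}
        },
        "objects": [
            {
                "bounding_box": {
                    "minimum": {"r": 10, "c": 20},
                    "maximum": {"r": 60, "c": 80}
                },
                "category": "circle"
            }
        ]
    }
]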

Suppose this data is saved in a file called training.json. To load the data:

import json

with open('training.json') as f:
    d = json.load(f)

Slack

We’ve been meeting in the #keras-rcnn channel on the keras.io Slack server.

You can join the server by inviting yourself from the following website:

https://keras-slack-autojoin.herokuapp.com/

keras-rcnn's People

Contributors

0x00b1, 24hours, akshaybapat04, brandenkmurray, chrisakroyd, drwaltman, guilhermefscampos, hannarud, hgaiser, imparkss, jhung0, jihongju, mbroisinbi, milani, yanfengliu, yhenon

keras-rcnn's Issues

Combine the loss layers

keras-rcnn has two region proposal network (RPN) loss layers (RPNClassificationLoss and RPNRegressionLoss) and two region-based convolutional neural network (RCNN) layers (RCNNClassification and RCNNRegression). I think it makes sense to combine the two RPN layers into one RPN layer and the two RCNN layers into one RCNN layer in an effort to further simplify keras-rcnn's public API.
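
A rough sketch of the direction I have in mind, using a single layer that registers both terms with add_loss (compute_classification_loss and compute_regression_loss are hypothetical placeholders for the computations that currently live in the separate layers):

import keras


class RPNLoss(keras.layers.Layer):
    def call(self, inputs):
        scores, deltas, labels, bounding_box_targets = inputs

        # Hypothetical helpers standing in for the existing
        # RPNClassificationLoss and RPNRegressionLoss computations.
        classification_loss = compute_classification_loss(scores, labels)
        regression_loss = compute_regression_loss(deltas, bounding_box_targets, labels)

        # Register the combined loss, as the existing layers already do individually.
        self.add_loss(classification_loss + regression_loss, inputs)

        return [scores, deltas]

The same pattern would apply to a combined RCNN loss layer.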

Add tests to everything

Make sure there are tests for everything, including the models, the Pascal dataset code, and the Theano/CNTK backends.
Also, use Keras' layer-testing functionality.

Check bounding boxes correctness

If any bounding box coordinates are outside the image (less than 0 or greater than the image size), or the order of the coordinates is wrong, the loss can become NaN. Check that all coordinates are within the image after scaling and that x_max > x_min and y_max > y_min for every box.
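
A minimal sketch of the kind of per-box check that could be added after scaling, assuming boxes in (x_min, y_min, x_max, y_max) form:

def bounding_box_is_valid(box, image_height, image_width):
    x_min, y_min, x_max, y_max = box

    # Coordinates must stay inside the image and keep minimum < maximum.
    return (
        0 <= x_min < x_max <= image_width and
        0 <= y_min < y_max <= image_height
    )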

R-CNN API design

I think it is time to think about what the R-CNN API should look like. We have discussed it a bit in #7, but that discussion is more about the structure than the API.

From my understanding, an R-CNN framework should include a body, which predicts ROIs from images, and a couple of heads, which predict scores, bounding boxes (and masks).

The idea is to easily attach different bodies (ResNets, VGG, FPN, etc.) and to choose whether to include the mask head.

Do you have an elegant way of doing this in mind? @0x00b1 @jhung0
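
One very rough possibility; the layer names below (RPN, Detection) are placeholders for discussion rather than existing classes:

import keras
import keras_resnet
import keras_rcnn.layers

image = keras.layers.Input((None, None, 3))

# Interchangeable body that extracts image features (ResNet, VGG, FPN, ...).
features = keras_resnet.ResNet50(image)

# Region proposal head producing ROIs and objectness scores.
proposals = keras_rcnn.layers.RPN()(features)

# Score / bounding box heads; a mask head could be attached in the same way.
scores, bounding_boxes = keras_rcnn.layers.Detection()([features, proposals])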

where is the starting point

How do I start training on the malaria data?

I ran setup.py, but I could not figure out where to go from there.

I am finding it difficult to start because there is no README for this; please help with training on the malaria data.

Alternating training

I only saw this library last week and I am not sure how complete it is. I was wondering whether the alternating training technique mentioned in the paper (RPN-FastRCNN-RPN-FastRCNN-...) is implemented, but I couldn't find the code for it here. Let's say I built a model successfully with keras-rcnn; what should I do to train it? Should I just use model.fit()?

Document use with Apple’s Core ML framework

keras-rcnn was written to be compatible with a number of third-party frameworks and services, such as Apple's Core ML framework, which enables developers to embed Keras models in their iOS applications. We should document how an Apple developer can create, train, and export a model to a Core ML-compatible iOS application.
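
As a starting point, the documentation could show something like the following, using coremltools' Keras converter (keras-rcnn's custom layers may need extra conversion support, so treat this as a sketch rather than a verified recipe; model is assumed to be a trained model from the Getting Started example):

import coremltools

# Convert the trained Keras model and save it in Core ML's .mlmodel format.
coreml_model = coremltools.converters.keras.convert(model)

coreml_model.save("keras_rcnn.mlmodel")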

Anchor layer

It should produce anchors from ground truth bounding boxes.

This is the first of two layers needed by the region proposal network (RPN).

Pep8 and Codecov

@0x00b1 Would you be keen to add Codecov and a PEP 8 check to standardize contributions at some point?

anchor() issue

from keras_rcnn.backend.common import *
import tensorflow as tf
sess = tf.InteractiveSession()
anc = anchor(4,[1,2],[1])

[[ 0. 0. 3. 3. ]
[ 0. 0. 3. 3. ]
[ 0.5 -1. 2.5 4. ]
[ 0.5 -1. 2.5 4. ]]

I suppose that it should return a list of two arrays.

TypeError when computing RPN loss

Starting from 0e28c4c, there's a TypeError in the RPN loss.

TypeError Traceback (most recent call last)
in ()
7
8 deltas = keras_rcnn.layers.losses.RPNRegressionLoss(9)([deltas, bounding_box_targets, rpn_labels])
----> 9 scores = keras_rcnn.layers.losses.RPNClassificationLoss(9)([scores, rpn_labels])

/usr/local/lib/python3.6/site-packages/keras/engine/topology.py in call(self, inputs, **kwargs)
594
595 # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 596 output = self.call(inputs, **kwargs)
597 output_mask = self.compute_mask(inputs, previous_mask)
598

~/Documents/com/github/keras-rcnn/keras_rcnn/layers/losses/_rpn.py in call(self, inputs, **kwargs)
13 output, target = inputs
14
---> 15 loss = self.compute_loss(output, target)
16
17 self.add_loss(loss, inputs)

~/Documents/com/github/keras-rcnn/keras_rcnn/layers/losses/_rpn.py in compute_loss(output, target)
30 target = keras_rcnn.backend.gather_nd(target, indices)
31
---> 32 loss = keras.backend.sparse_categorical_crossentropy(target, output)
33 loss = keras.backend.mean(loss)
34

/usr/local/lib/python3.6/site-packages/keras/backend/tensorflow_backend.py in sparse_categorical_crossentropy(output, target, from_logits)
2777 output_shape = output.get_shape()
2778 targets = cast(flatten(target), 'int64')
-> 2779 logits = tf.reshape(output, [-1, int(output_shape[-1])])
2780 res = tf.nn.sparse_softmax_cross_entropy_with_logits(
2781 labels=targets,

TypeError: int returned non-int (type NoneType)

Use of TimeDistributed(BatchNormalization())

I believe there is an issue with using TimeDistributed(BatchNormalization()) in Keras, as is done in keras-rcnn/keras_rcnn/classifiers/resnet.py (although I may be mistaken). As I understand it, this leads to the moving mean and variance not being updated.

A small example:

from keras.layers import *
from keras.models import *
import keras.backend as K
import numpy as np

img_size = 8
batch_size = 64
num_time_steps = 4
num_channels = 3

inputs = Input(shape=(num_time_steps, img_size, img_size, num_channels))
x = TimeDistributed(BatchNormalization(axis=3))(inputs)

model = Model(inputs=inputs, outputs=x)
model.compile(loss='mae', optimizer='sgd')

X = np.random.rand(batch_size, num_time_steps, img_size, img_size, num_channels)
Y = np.random.rand(batch_size, num_time_steps, img_size, img_size, num_channels)
history = model.fit(X, Y, epochs=4)

print(model.layers[1].get_weights()) # print the weights of the BN layer

And the output of the print() statement is:

[array([ 0.97972429,  0.97963089,  0.9796797 ], dtype=float32), array([ 0.00780913,  0.00766754, 
0.00772369], dtype=float32), array([ 0.,  0.,  0.], dtype=float32), array([ 1.,  1.,  1.], dtype=float32)]

Note that the [0, 0, 0] and [1, 1, 1] are the default, non-updated values of the mean and variance.
As an alternative, I think doing BatchNormalization(axis=bn_axis+1), with the +1 as an offset to account for the extra time dimension, is an ok fix to the problem.
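
Concretely, for the example above I read that suggestion as applying the normalization directly to the 5D tensor with the channel axis offset by one:

# Apply BatchNormalization directly to the 5D (batch, time, rows, cols, channels)
# tensor; axis=3 + 1 skips the extra time dimension and points at the channels.
x = BatchNormalization(axis=3 + 1)(inputs)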

Intersection over Union (IoU) algorithm

I think IoU is supposed to be:

IoU(A, B) = area(A ∩ B) / area(A ∪ B)

But based on the code in this library, the definition of union is currently implemented as the rectangle defined by the top left corner of the first box and the bottom right corner of the second box, which is bigger than the true union.

The code related to this issue is here.
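
For reference, a minimal sketch of the intended computation for two axis-aligned boxes in (x_min, y_min, x_max, y_max) form (illustrative only, not the library's current code):

def intersection_over_union(a, b):
    # The overlap rectangle is bounded by the larger minima and smaller maxima.
    x_min = max(a[0], b[0])
    y_min = max(a[1], b[1])
    x_max = min(a[2], b[2])
    y_max = min(a[3], b[3])

    intersection = max(0.0, x_max - x_min) * max(0.0, y_max - y_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])

    # The union is the sum of both areas minus the overlap, not the
    # rectangle spanned by the two boxes.
    union = area_a + area_b - intersection

    return intersection / union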

contributor.rst

Make a contributor.rst including info on how to contribute, what's expected, etc.

Region proposal network (RPN) layer

The region proposal network (RPN) should take two inputs, image features (i.e. features extracted by ResNet) and ground truth bounding boxes and produce object proposals and corresponding “objectness” scores. I’m envisioning something like:

x = keras.layers.Input((223, 223, 3))

a = keras_resnet.ResNet50(x)

b = keras.layers.Input((None, 4))

y = keras_rcnn.layers.RPN((14, 14))([a, b])

Shape inference

Allow the network to infer shapes and run without specifying the image dimensions.
(Note: the Pascal dataset has this issue, but the malaria dataset does not.)
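
In other words, the goal would be for something like the following to work, leaving the spatial dimensions unspecified (aspirational, not current behaviour):

import keras_rcnn.models

# Spatial dimensions left as None so the network infers them at run time.
model = keras_rcnn.models.RCNN((None, None, 3), ["circle", "rectangle", "triangle"])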
