shubhampachori12110095 / pytorch_mask_rcnn

This project forked from soeaver/pytorch_mask_rcnn

Converted from [tf+keras version MASK-RCNN](https://github.com/matterport/Mask_RCNN)

Python 84.94% Shell 0.22% Cuda 5.52% C 9.24% C++ 0.08%

pytorch_mask_rcnn's Introduction

PyTorch has no equivalent of the tf.crop_and_resize function used with the feature pyramid network. Many thanks to longwc, who ported it from TensorFlow!

Notice: We have no time to continue this project. The model is converted and performs well, and the data pipeline is 95% complete. For training, you will need to study the loss function carefully. :)

Download the tf+keras model and run python convert_weights/convert_weights.py, or download the converted model from Dropbox.

INSTALLATION

CUDA CODE:

Build NMS and ROIAlign/CropAndResize

  • Change -arch in lib/make.sh to match your GPU
    # Which CUDA capabilities do we want to pre-build for?
    # https://developer.nvidia.com/cuda-gpus
    # Compute/shader model   Cards
    # 6.1                    P4, P40, Titan Xp, GTX 1080 Ti, GTX 1080
    # 6.0                    P100
    # 5.2                    M40, Titan X, GTX 980
    # 3.7                    K80
    # 3.5                    K40, K20
    # 3.0                    K10, Grid K520 (AWS G2)
    
cd lib
./make.sh
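
If it helps when editing make.sh, the table above can be expressed as a small lookup. This is only a sketch: `arch_flag` and `COMPUTE_CAPABILITY` are hypothetical names that are not part of this repository, and the `-arch=sm_XX` spelling assumes make.sh passes the value to nvcc in that form, so check the script before copying the output.

```python
# Compute capabilities copied from the table above (card -> capability).
COMPUTE_CAPABILITY = {
    "P4": "6.1", "P40": "6.1", "Titan Xp": "6.1",
    "GTX 1080 Ti": "6.1", "GTX 1080": "6.1",
    "P100": "6.0",
    "M40": "5.2", "Titan X": "5.2", "GTX 980": "5.2",
    "K80": "3.7",
    "K40": "3.5", "K20": "3.5",
    "K10": "3.0", "Grid K520": "3.0",
}


def arch_flag(card):
    """Return an nvcc-style -arch flag for a card listed in the table.

    Hypothetical helper: raises KeyError for cards that are not listed.
    """
    major, minor = COMPUTE_CAPABILITY[card].split(".")
    return "-arch=sm_{}{}".format(major, minor)


print(arch_flag("GTX 1080 Ti"))  # -arch=sm_61
```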

MS COCO Requirements:

Install pycocotools from one of the forks of the original pycocotools that include fixes for Python 3 and Windows (the official repo doesn't seem to be active anymore).

To train or test on MS COCO, you'll also need:

Structure

  • ./convert_weights: How to convert the weights from the tf+keras version of MASK-RCNN.

  • ./network: The definitions for the mask rcnn.

  • ./preprocess: All the scripts for the data pipeline: Transform raw image and labels.

  • ./postprocess: For the model's output...

  • ./README: Contains the images shown on GitHub.

Demo

Picture demo

Change the path of the model in demo.py, and then run:

python demo.py

Result:

Realtime webcam demo

Change the path of the model in realtime_demo.py, and then run:

python realtime_demo.py

Evaluation

Change the path of the model in eval.py, and then run:

python eval.py

mAP of Bbox: the tf+keras model reaches 0.347. Any difference from this port's result may come from an upsample function or from other issues, which would be worth investigating if you want to dive deep.

mAP of Segmentation: the tf+keras model reaches 0.296. Any difference from this port's result may come from an upsample function or from other issues, which would be worth investigating if you want to dive deep.

Training (not working for now)...

Data loader

python preprocess/test_data_loader

Loss function

The loss function is in network/mask_rcnn.py. You may need to study the loss function in the Keras code carefully and modify network/mask_rcnn.py accordingly.

Using the data loader and backpropagating the loss to train the model

The fit_loader function (ncullen93/torchsample#24) in torchsample may be a good replacement for the fit_generator used by Keras.

Pipeline Description

Overview:

Stage 0, ResNet101 and Feature Pyramid Network to Extract Features of the Image:

Stage 1, Region Proposal Network:

The Region Proposal Network (RPN) runs a lightweight binary classifier on a large number of boxes (anchors) over the image and returns object/no-object scores. Anchors with a high objectness score (positive anchors) are passed to stage two to be classified.

Often, even positive anchors don't cover objects fully. So the RPN also regresses a refinement (a delta in location and size) to be applied to the anchors, shifting and resizing them slightly to match the correct boundaries of the object.
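
As a rough illustration of what applying such a refinement can look like, here is a minimal sketch in plain Python. It assumes the common Faster R-CNN style parameterization, with boxes as (y1, x1, y2, x2) and deltas as (dy, dx, log(dh), log(dw)); the function name `apply_box_delta` is hypothetical, and the code in this repository may differ in details such as scaling of the deltas.

```python
import math


def apply_box_delta(box, delta):
    """Shift and resize one box by a predicted delta.

    box:   (y1, x1, y2, x2)
    delta: (dy, dx, dh, dw), where dy/dx are fractions of the box size
           and dh/dw are log-scale factors (assumed parameterization).
    """
    y1, x1, y2, x2 = box
    dy, dx, dh, dw = delta
    height, width = y2 - y1, x2 - x1
    # Move the center by a fraction of the current size.
    center_y = y1 + 0.5 * height + dy * height
    center_x = x1 + 0.5 * width + dx * width
    # Rescale height and width.
    height *= math.exp(dh)
    width *= math.exp(dw)
    return (center_y - 0.5 * height, center_x - 0.5 * width,
            center_y + 0.5 * height, center_x + 0.5 * width)


# Shift a 10x10 anchor down by 10% of its height.
print(apply_box_delta((0.0, 0.0, 10.0, 10.0), (0.1, 0.0, 0.0, 0.0)))
```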

1.1 RPN Targets:

The RPN targets are the training values for the RPN. To generate the targets, we start with a grid of anchors that cover the full image at different scales, and then we compute the IoU of the anchors with the ground truth objects. Positive anchors are those that have an IoU >= 0.7 with any ground truth object, and negative anchors are those whose IoU with every object is below 0.3. Anchors in between (i.e. those that cover an object with IoU >= 0.3 but < 0.7) are considered neutral and excluded from training.
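
The labelling rule above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the repository's code: the names `iou` and `label_anchors` are hypothetical, boxes are assumed to be (y1, x1, y2, x2), and the labels 1 / -1 / 0 for positive / negative / neutral are an arbitrary convention chosen here.

```python
def iou(box_a, box_b):
    """Intersection over union of two (y1, x1, y2, x2) boxes."""
    ay1, ax1, ay2, ax2 = box_a
    by1, bx1, by2, bx2 = box_b
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter = inter_h * inter_w
    union = (ay2 - ay1) * (ax2 - ax1) + (by2 - by1) * (bx2 - bx1) - inter
    return inter / union if union > 0 else 0.0


def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Label each anchor: 1 = positive, -1 = negative, 0 = neutral."""
    labels = []
    for anchor in anchors:
        # Best overlap of this anchor with any ground truth object.
        best = max((iou(anchor, gt) for gt in gt_boxes), default=0.0)
        if best >= pos_thresh:
            labels.append(1)
        elif best < neg_thresh:
            labels.append(-1)
        else:
            labels.append(0)  # in between: excluded from training
    return labels


gt = [(0, 0, 10, 10)]
anchors = [(0, 0, 10, 10), (0, 0, 10, 5), (20, 20, 30, 30)]
print(label_anchors(anchors, gt))  # [1, 0, -1]
```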

To train the RPN regressor, we also compute the shift and resizing needed to make the anchor cover the ground truth object completely.
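
One common way to express that shift and resizing as a regression target is sketched below. It assumes the usual Faster R-CNN style encoding with boxes as (y1, x1, y2, x2); the name `box_delta` is hypothetical, and any extra normalisation of the targets that the repository's code may apply is left out.

```python
import math


def box_delta(anchor, gt_box):
    """Delta (dy, dx, dh, dw) that maps an anchor onto a ground truth box.

    dy/dx: center offset as a fraction of the anchor size.
    dh/dw: log of the size ratio (assumed parameterization).
    """
    ay1, ax1, ay2, ax2 = anchor
    gy1, gx1, gy2, gx2 = gt_box
    a_h, a_w = ay2 - ay1, ax2 - ax1
    g_h, g_w = gy2 - gy1, gx2 - gx1
    a_cy, a_cx = ay1 + 0.5 * a_h, ax1 + 0.5 * a_w
    g_cy, g_cx = gy1 + 0.5 * g_h, gx1 + 0.5 * g_w
    return ((g_cy - a_cy) / a_h,
            (g_cx - a_cx) / a_w,
            math.log(g_h / a_h),
            math.log(g_w / a_w))


# A 10x10 anchor versus a 10x20 object whose center is shifted.
print(box_delta((0.0, 0.0, 10.0, 10.0), (2.0, 0.0, 12.0, 20.0)))
```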

1.2 RPN Predictions:

1.3 RoIAlign
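
For readers unfamiliar with the operation, the sketch below shows bilinear crop-and-resize on a single-channel image in plain Python. It is modelled on the sampling scheme of tf.image.crop_and_resize (normalized box coordinates, evenly spaced sample points), but it is an illustrative assumption rather than the CUDA code built in lib/. The function name and arguments are hypothetical.

```python
import math


def crop_and_resize(image, box, crop_h, crop_w, extrapolation_value=0.0):
    """Bilinearly sample a crop_h x crop_w grid from a box of the image.

    image: 2D list indexed as image[y][x]
    box:   normalized (y1, x1, y2, x2), with 1.0 mapping to the last pixel
    """
    height, width = len(image), len(image[0])
    y1, x1, y2, x2 = box

    def sample_coords(lo, hi, size, n):
        # Evenly spaced sample positions in pixel coordinates.
        if n > 1:
            step = (hi - lo) * (size - 1) / (n - 1)
            return [lo * (size - 1) + i * step for i in range(n)]
        return [0.5 * (lo + hi) * (size - 1)]

    def bilinear(y, x):
        # Points outside the image get the extrapolation value.
        if y < 0 or y > height - 1 or x < 0 or x > width - 1:
            return extrapolation_value
        top, left = int(math.floor(y)), int(math.floor(x))
        bottom, right = min(top + 1, height - 1), min(left + 1, width - 1)
        wy, wx = y - top, x - left
        upper = image[top][left] * (1 - wx) + image[top][right] * wx
        lower = image[bottom][left] * (1 - wx) + image[bottom][right] * wx
        return upper * (1 - wy) + lower * wy

    ys = sample_coords(y1, y2, height, crop_h)
    xs = sample_coords(x1, x2, width, crop_w)
    return [[bilinear(y, x) for x in xs] for y in ys]


feature_map = [[0, 1, 2],
               [3, 4, 5],
               [6, 7, 8]]
print(crop_and_resize(feature_map, (0.0, 0.0, 0.5, 0.5), 3, 3))
```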

Stage 2, Proposal Classification:

This stage takes the region proposals from the RPN and feeds them to the detection branch and the mask branch, respectively.

2.1 Detection

Per-Class Non-Max Suppression
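
A minimal sketch of per-class non-max suppression in plain Python, for illustration only. The function names and the 0.3 IoU threshold are assumptions made for this example, and the repository builds a CUDA NMS (see INSTALLATION) rather than using code like this.

```python
def iou(box_a, box_b):
    """Intersection over union of two (y1, x1, y2, x2) boxes."""
    ay1, ax1, ay2, ax2 = box_a
    by1, bx1, by2, bx2 = box_b
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter = inter_h * inter_w
    union = (ay2 - ay1) * (ax2 - ax1) + (by2 - by1) * (bx2 - bx1) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold):
    """Greedy NMS: keep the best box, drop boxes overlapping it too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order
                 if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


def per_class_nms(boxes, scores, class_ids, iou_threshold=0.3):
    """Run NMS separately for each class; return kept indices, sorted."""
    keep = []
    for cls in sorted(set(class_ids)):
        idx = [i for i, c in enumerate(class_ids) if c == cls]
        kept = nms([boxes[i] for i in idx],
                   [scores[i] for i in idx], iou_threshold)
        keep.extend(idx[k] for k in kept)
    return sorted(keep)


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7]
class_ids = [1, 1, 2]
print(per_class_nms(boxes, scores, class_ids))  # [0, 2]
```

Boxes 0 and 1 overlap heavily and share a class, so only the higher-scoring one survives. Box 2 has the same coordinates as box 0 but a different class, so it is kept.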

Detections after NMS

2.2 Bounding Box Refinement

This is an example of final detection boxes (dotted lines) and the refinement applied to them (solid lines) in the second stage.

Stage 3: Generating Masks

This stage takes the detections (refined bounding boxes and class IDs) from the previous layer and runs the mask head to generate segmentation masks for every instance.

4. Composing the different pieces into a final result
