havanagrawal / clomask

Capstone Project for Clobotics: Using Mask R-CNN for Rigid/Non-Rigid Retail Consumable Product Detection

License: MIT License

CSS 0.01% HTML 0.05% JavaScript 12.19% Python 0.77% Jupyter Notebook 86.98%

capstone clobotics deep-learning image-segmentation machine-learning mask-rcnn object-detection semantic-segmentation uw uw-msds

clomask's People

Forkers

pshivraj tejasmhos sparsh9012 hasnainrsyed

clomask's Issues

Create workflow for labeling images

In order to collect training data, we need a workflow for the team to label and store data.

In the ideal case, the workflow is:

A team member picks up an image from a queue (could be a common folder, dedicated folder, zip file, S3, whatever works)
Uploads it to LabelMe
Creates masks for the various retail products
Downloads the masks
Uploads masks to a specified location in the desired format (Drive/S3/etc)
Marks the item as done in the queue (either by deleting the original, or moving it to another directory)
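
The "pick up from a queue" and "mark as done" steps above could be sketched as follows, assuming the queue is a shared folder and that "done" means moving the original to another directory. The function names and directory layout are hypothetical and not part of the repo; the LabelMe upload/download steps stay manual.

```python
import shutil
from pathlib import Path

# Assumption: these are the image types that end up in the queue folder.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def pick_next_image(queue_dir):
    """Return the first unlabeled image in the queue (sorted by name), or None."""
    candidates = sorted(
        p for p in Path(queue_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    return candidates[0] if candidates else None


def mark_done(image_path, done_dir):
    """Mark an image as labeled by moving it out of the queue into done_dir."""
    done_dir = Path(done_dir)
    done_dir.mkdir(parents=True, exist_ok=True)
    target = done_dir / Path(image_path).name
    shutil.move(str(image_path), str(target))
    return target
```

Moving rather than deleting keeps the original around in case a mask has to be redone.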

Deliverable: A README/Wiki document

DeepLab Segmentation Model

I discovered a few more pre-trained DeepLab models.

Code for running inference with a pretrained model can be found here.

Might be useful for both bottles and non-bottles.

@pshivraj @havanagrawal @tejasmhos @vivanvish

Redundancy in the naming structure of synthesized data

Problem

The synthesized data directory looks something like:
data/synth_data_2019_01_09_22_45_08/image_0_2019_01_09_22_45_08/train_image/image_0_2019_01_09_22_45_08.jpg

This feels incredibly noisy to me. Can we instead favor something more concise, such as:
data/synth_data_2019_01_09_22_45_08/image_0/train_image/image_0.jpg

In other words, I don't see the point of embedding the timestamp at three levels of the path.

@vivanvish Was there any particular reason (that perhaps I am completely missing) for requiring the timestamp at each level?

Solution

Change the image filename format to:
data/synth_data_{timestamp}/image_{k}/train_image/image_{k}.jpg

@pshivraj I'm assuming this makes no difference to your training pipeline, since as far as I recall you were using os.walk?
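
A minimal sketch of the proposed layout, plus an os.walk-based lookup that does not depend on the filenames. Both helper names are hypothetical, and the lookup only illustrates why an os.walk-based pipeline should be unaffected; it is not the actual training code.

```python
import os


def synth_image_path(root, timestamp, k):
    """Build data/synth_data_{timestamp}/image_{k}/train_image/image_{k}.jpg."""
    name = f"image_{k}"
    return os.path.join(
        root, f"synth_data_{timestamp}", name, "train_image", f"{name}.jpg"
    )


def find_training_images(root):
    """Collect every .jpg that sits inside a train_image directory under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if os.path.basename(dirpath) == "train_image":
            found.extend(
                os.path.join(dirpath, f) for f in filenames if f.endswith(".jpg")
            )
    return sorted(found)
```

Because the lookup keys on the train_image directory name rather than on the image filename, dropping the timestamp from the lower path levels would not change what it finds.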

Improve website UI

Current Issues

The output mask shows up at a random point below the drop area.
There is no way to examine images that have already been processed by the application.

Deliverables:

Streamline the UI so that the generated mask appears in an intuitive location on the UI
Add a gallery section that displays images and their masks in pairs.
Explore the concept of overlaying the masks using JavaScript, so that the user can hover over an object to see the annotation, like the COCO dataset. (ambitious)

Generate Synthetic Boxes/Bags Data

Generate synthetic box/bag images and masks on empty fridge/shelf backgrounds to train a custom Mask R-CNN model

Create initial skeleton for website

As of now, the website should satisfy a single user story:

User uploads image from file system
Web app sends the image to the backend using an API call
Web app displays the generated mask for the image

Depending on the complexity of the model, this MAY be asynchronous, i.e. the user may be assigned an image ID, which they can use to query a separate section of the website to view the generated mask.
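
If the asynchronous route is taken, the image-ID flow could look roughly like the sketch below. It assumes an in-memory job store and a single-process backend; the MaskJobs class and predict_fn are hypothetical placeholders, not existing code in this repo, and a real deployment would likely need persistent storage.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class MaskJobs:
    """Hands out image IDs and runs mask prediction in the background."""

    def __init__(self, predict_fn, max_workers=1):
        # predict_fn is a stand-in for the model call: image bytes in, mask out.
        self._predict_fn = predict_fn
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}

    def submit(self, image_bytes):
        """Queue an image and return the ID the user can query later."""
        image_id = uuid.uuid4().hex
        self._futures[image_id] = self._pool.submit(self._predict_fn, image_bytes)
        return image_id

    def status(self, image_id):
        """Return 'unknown', 'pending', 'failed' or 'done' for an image ID."""
        future = self._futures.get(image_id)
        if future is None:
            return "unknown"
        if not future.done():
            return "pending"
        return "failed" if future.exception() is not None else "done"

    def result(self, image_id, timeout=None):
        """Block until the mask for image_id is ready and return it."""
        return self._futures[image_id].result(timeout=timeout)
```

The upload endpoint would call submit and return the ID; the "view mask" section would poll status and fetch result once it reports done.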

The web app should also:

Link back to this GitHub repo

Scope (limited on purpose)

The mask will be shown as a separate image. A nice-to-have feature would be to overlay it on the original image.
Nice to have: the web app should be responsive (renders correctly on mobile screens/tablets/varying laptop screen sizes)

Add initial research document for pre-trained model results

As part of preliminary research, evaluate matterport's MaskRCNN implementation in terms of:

Visual analysis (how close the predicted masks are to the true masks, how many objects it misses, etc.)
IoU (Intersection over Union) for each image
Mean Average Precision (mAP) over a set of images
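
For reference, the IoU of a predicted mask A and a true mask B is |A ∩ B| / |A ∪ B|. A minimal pure-Python sketch follows, assuming binary masks given as equally sized 2D lists of 0/1; the function name is illustrative, and a notebook would more likely use NumPy arrays or the helpers shipped with the MaskRCNN implementation.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over Union of two binary masks (2D lists of 0/1)."""
    same_shape = len(mask_a) == len(mask_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(mask_a, mask_b)
    )
    if not same_shape:
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1
            if a or b:
                union += 1

    # Convention chosen here (an assumption): two empty masks score 0.0.
    return intersection / union if union else 0.0
```

For example, two masks that share one pixel out of three covered pixels in total have an IoU of 1/3.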

Deliverables

A Jupyter notebook with the above metrics for a small dataset (~5 images), preferably reproducible.
Links to resources for IoU/mAP metrics (Kaggle perhaps)

Evaluate the effect of image transforms on model performance

As discussed in #16, it would be interesting to check whether brightness, clarity, and perspective have any effect on the model performance. My personal intuition is that:

Brightness and perspective are less likely to affect the model performance significantly
The clarity (blurring/low-pass filters) will have some effect on the model performance.
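
To make the brightness and clarity transforms concrete, here is a minimal pure-Python sketch that operates on a grayscale image stored as a 2D list of 0..255 integers. Both functions are illustrative assumptions: the actual analysis would more likely use an image library on color images, and the perspective transform is not covered here.

```python
def adjust_brightness(image, factor):
    """Scale every pixel by factor, clamping the result to 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in image]


def box_blur(image, radius=1):
    """Low-pass filter: replace each pixel with the mean of its neighborhood.

    The neighborhood is a (2 * radius + 1) square, clipped at the borders.
    """
    height, width = len(image), len(image[0])
    blurred = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(round(sum(window) / len(window)))
        blurred.append(row)
    return blurred
```

Running the model on the original and on each transformed copy, then comparing the per-image metrics, would test the intuition stated above.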

Deliverables

A Jupyter notebook with the above analysis
A README document with appropriate details

Evaluate performance change in model after training on no mask images.

Check whether adding no-mask images (in the context of a grocery store) to the training set improves the model by reducing false positives.

Comparison of image resolution on model performance

We observe that the masks for lower-resolution images (blurred backgrounds, no clear FG/BG separation) can be erroneous. We need a formal investigation to determine whether the resolution does in fact affect the prediction performance.

Deliverable:

A Jupyter notebook that compares the performance of the model on varying resolutions

Add details with synthetic data generation

Deliverables

Explore and document different data synthesis techniques
Add scripts/modules/images used for synthetic data generation

Evaluate Rigid Box and Non-Rigid Bags Performance

Replicate what the Puru/Havan notebook does for bottles, this time for rigid boxes and non-rigid bags.

clomask_notebook directory is missing the images directory

Problem
The research directory is missing the 'images/val_images/' directory. Additionally, the code cells are long-ish.

Proposed Solution

Add the directory. Make individual cells more concise.

Create a consistent way of referring to the MaskRCNN implementation

Since the matterport implementation is not available via a package manager, we want a way to consistently refer to the library from code.

One solution is to simply clone the repository as a subdirectory in this one; the license allows us to do this. The pro is that this is the simplest solution; the con is that it will bloat our own repo. We could strip out the parts that we don't need to alleviate that issue.
Another solution is to ensure that everyone has an environment variable that points to their local directory for MaskRCNN. For example:
MASK_CNN_DIR='/Users/havan/Dropbox/CP/Git/MaskRCNN'
We would then add all necessary paths from code, preferably from a central module/lib:

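import os
import sys

# Fail early with a clear message if the variable is not set.
mask_rcnn_path = os.environ.get('MASK_CNN_DIR')
if mask_rcnn_path is None:
    raise EnvironmentError('MASK_CNN_DIR is not set')
sys.path.append(os.path.join(mask_rcnn_path, 'mrcnn'))
sys.path.append(os.path.join(mask_rcnn_path, 'samples', 'coco'))

import utils  # resolved from $MASK_CNN_DIR/mrcnn
import coco   # resolved from $MASK_CNN_DIR/samples/coco
...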

A subsequent task would be to move all model-related code to a central library that we can simply import from. Computing mAP (mean average precision) is one such piece of code that is currently duplicated across notebooks.
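
As an illustration of what such a central helper might contain, here is a minimal sketch of average precision using all-point interpolation of the precision-recall curve. This is one common definition and an assumption on my part; it is not necessarily the variant the notebooks or the MaskRCNN implementation compute. It also assumes detections have already been matched to ground truth (for example by an IoU threshold), and the function names are hypothetical.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for one image."""
    if num_ground_truth == 0:
        return 0.0

    # Walk through detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, i in enumerate(order, start=1):
        if is_true_positive[i]:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Interpolate: precision at each point becomes the max precision to its right.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Sum precision over each increase in recall.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


def mean_average_precision(per_image_aps):
    """Mean of per-image AP values; 0.0 for an empty list."""
    return sum(per_image_aps) / len(per_image_aps) if per_image_aps else 0.0
```

Keeping a single copy of this in a shared module would let every notebook report the same metric definition.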

Please let me know what your thoughts are on this. Mentioning @pshivraj since we discussed this offline.

Demo.ipynb in the data-synthesizer module is not replicable

Problem

Since the notebook adds/deletes configurations, the first half of the notebook (Step 3) almost always errors out.
The code in Step 4 refers to a specific file generated on a specific date. This will also error out if anyone runs the notebook as is.

Solution

Reorder the cells so that the changes end with the original configuration of foregrounds.json and backgrounds.json being restored, OR use a temporary config in the example.
Have generate_synthetic_dataset return the paths to the train_image and train_mask directories.
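
The "temporary config" option could be sketched as below, assuming the configs are plain JSON files that can be copied. The temporary_config helper is hypothetical, and how the data-synthesizer module would be pointed at the copy is left open because that depends on its API.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_config(config_path):
    """Yield a throwaway copy of a config file so the original is never edited."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        tmp_path = Path(tmp_dir) / Path(config_path).name
        shutil.copy(config_path, tmp_path)
        # Callers add/delete entries in tmp_path; it is removed on exit.
        yield tmp_path
```

With this, the demo cells can add and delete foregrounds or backgrounds freely, and rerunning the notebook always starts from the committed configuration.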
