havanagrawal / clomask

Capstone Project for Clobotics: Using Mask R-CNN for Rigid/Non-Rigid Retail Consumable Product Detection

License: MIT License

Languages: Jupyter Notebook 86.98%, JavaScript 12.19%, Python 0.77%, HTML 0.05%, CSS 0.01%
Topics: capstone, clobotics, deep-learning, image-segmentation, machine-learning, mask-rcnn, object-detection, semantic-segmentation, uw, uw-msds

clomask's People

Contributors

dependabot[bot], havanagrawal, pshivraj, vivanvish

clomask's Issues

Create workflow for labeling images

In order to collect training data, we need a workflow for the team to label and store data.

Ideally, the workflow would be:

  1. Team member picks up an image from a queue (could be a common folder, dedicated folder, zip file, S3, whatever works)
  2. Uploads it to LabelMe
  3. Creates masks for the various retail products
  4. Downloads the masks
  5. Uploads masks to a specified location in the desired format (Drive/S3/etc)
  6. Marks the item as done in the queue (either by deleting the original, or moving it to another directory)
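
As a rough illustration of step 6, here is a minimal sketch that assumes the queue is a shared folder and that "done" means moving the original into a second folder. The function names `pending_images` and `mark_done` and the folder layout are hypothetical, not something the team has agreed on.

```python
import shutil
from pathlib import Path


def pending_images(queue_dir, extensions=(".jpg", ".jpeg", ".png")):
    """List images still waiting to be labeled, in a stable order."""
    return sorted(
        p for p in Path(queue_dir).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )


def mark_done(image_path, done_dir):
    """Move a labeled image out of the queue folder into done_dir."""
    image_path = Path(image_path)
    done_dir = Path(done_dir)
    done_dir.mkdir(parents=True, exist_ok=True)
    target = done_dir / image_path.name
    # Refuse to overwrite, so two people labeling the same image is noticed.
    if target.exists():
        raise FileExistsError(f"{target} is already marked as done")
    shutil.move(str(image_path), str(target))
    return target
```

Moving rather than deleting keeps the original around in case a mask needs to be redone.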

Deliverable: A README/Wiki document

Redundancy in the naming structure of synthesized data

Problem

The synthesized data directory looks something like:
data/synth_data_2019_01_09_22_45_08/image_0_2019_01_09_22_45_08/train_image/image_0_2019_01_09_22_45_08.jpg

This feels incredibly noisy to me. Can we instead favor something more concise, such as:
data/synth_data_2019_01_09_22_45_08/image_0/train_image/image_0.jpg

In other words, I don't see the point of embedding the timestamp at three levels of the path.

@vivanvish Was there any particular reason (that perhaps I am completely missing) for requiring the timestamp at each level?

Solution

Change the image filename format to:
data/synth_data_{timestamp}/image_{k}/train_image/image_{k}.jpg

@pshivraj I'm assuming this makes no difference to your training pipeline, since afair you were using os.walk?
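
To make the proposed layout concrete, here is a small sketch of how the paths could be built so that the timestamp appears only at the top level. The helper names `synth_root` and `train_image_path` are made up for illustration; the synthesizer's actual functions may look different.

```python
from datetime import datetime
from pathlib import PurePosixPath

# Matches the existing directory names, e.g. 2019_01_09_22_45_08
TIMESTAMP_FORMAT = "%Y_%m_%d_%H_%M_%S"


def synth_root(created_at, base="data"):
    """Top-level directory for one synthesis run; the only place with a timestamp."""
    return PurePosixPath(base) / f"synth_data_{created_at.strftime(TIMESTAMP_FORMAT)}"


def train_image_path(root, k):
    """Path of the k-th training image under a run directory."""
    return root / f"image_{k}" / "train_image" / f"image_{k}.jpg"


root = synth_root(datetime(2019, 1, 9, 22, 45, 8))
print(train_image_path(root, 0))
# data/synth_data_2019_01_09_22_45_08/image_0/train_image/image_0.jpg
```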

Improve website UI

Current Issues

  1. The output mask shows up at a random point below the drop area.
  2. There is no way to examine images that have already been processed by the application.

Deliverables:

  1. Streamline the UI so that the generated mask appears in an intuitive location on the UI
  2. Add a gallery section that displays images and their masks in pairs.
  3. Explore the concept of overlaying the masks using JavaScript, so that the user can hover over an object to see the annotation, like the COCO dataset. (ambitious)

Create initial skeleton for website

As of now, the website should satisfy a single user story:

  1. User uploads image from file system
  2. Web app sends the image to the backend using an API call
  3. Web app displays the generated mask for the image

Depending on the complexity of the model, this MAY be asynchronous, i.e. the user may be assigned an image ID, which they can use to query a separate section of the website and view the generated mask.
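
The asynchronous variant could look roughly like the sketch below. It is framework-agnostic and uses only the standard library; the `MaskJobs` class and the `predict_fn` callable (standing in for the model) are hypothetical names, and a real backend would also need persistence and cleanup of old jobs.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class MaskJobs:
    """In-memory registry mapping image IDs to background prediction jobs."""

    def __init__(self, predict_fn, max_workers=1):
        self._predict_fn = predict_fn  # stand-in for the model call
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}

    def submit(self, image_bytes):
        """Queue an image for prediction and return the ID handed to the user."""
        image_id = uuid.uuid4().hex
        self._futures[image_id] = self._executor.submit(self._predict_fn, image_bytes)
        return image_id

    def status(self, image_id):
        """Return 'unknown', 'pending', 'failed' or 'done' for an image ID."""
        future = self._futures.get(image_id)
        if future is None:
            return "unknown"
        if not future.done():
            return "pending"
        return "failed" if future.exception() else "done"

    def result(self, image_id, timeout=None):
        """Block until the mask is ready (or the timeout expires) and return it."""
        return self._futures[image_id].result(timeout=timeout)
```

The upload endpoint would call `submit` and return the ID; the "view mask" section would poll `status` and fetch `result` once it reports `done`.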

The web app should also:

  1. Link back to this GitHub repo

Scope (limited on purpose)

  1. The mask will be shown as a separate image. Nice to have: overlay the mask on the original image.
  2. Nice to have: the web app should be responsive (renders correctly on mobile screens/tablets/varying laptop screen sizes)

Add initial research document for pre-trained model results

As part of preliminary research, evaluate matterport's MaskRCNN implementation in terms of:

  1. Visual analysis (how close are the predicted masks to the true masks, how many does it miss, etc.)
  2. IoU (Intersection over Union) for each image
  3. Mean Average Precision (mAP) over a set of images

Deliverables

  1. A Jupyter notebook with the above metrics for a small dataset (~5 images), preferably reproducible.
  2. Links to resources for IoU/mAP metrics (Kaggle perhaps)
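
For reference, IoU for a single pair of binary masks can be sketched as below. This is a dependency-free illustration on nested lists; in the notebook the masks would more likely be NumPy arrays, and the name `mask_iou` is made up here. Returning 0.0 when both masks are empty is an assumed convention, not something fixed by the metric.

```python
def mask_iou(pred_mask, true_mask):
    """IoU of two binary masks given as equally sized 2D sequences of 0/1."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            intersection += bool(p) and bool(t)  # pixel set in both masks
            union += bool(p) or bool(t)          # pixel set in either mask
    # Assumed convention: two empty masks score 0.0 rather than dividing by zero.
    return intersection / union if union else 0.0


pred = [[1, 1, 0],
        [0, 1, 0]]
true = [[1, 0, 0],
        [0, 1, 1]]
print(mask_iou(pred, true))  # 2 shared pixels / 4 pixels in the union = 0.5
```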

Evaluate the effect of image transforms on model performance

As discussed in #16, it would be interesting to check whether the brightness, clarity, and perspective of the input images have any effect on model performance. My personal intuition is that:

  1. Brightness and perspective are less likely to affect the model performance significantly
  2. The clarity (blurring/low-pass filters) will have some effect on the model performance.

Deliverables

  1. A Jupyter notebook with the above analysis
  2. A README document with appropriate details

Comparison of image resolution on model performance

We observe that the masks for lower resolution images (blurred backgrounds, no clear FG/BG separation) can be erroneous. We need a formal investigation to determine whether the resolution does in fact affect the prediction performance.

Deliverable:

  • A Jupyter notebook that compares the performance of the model on varying resolutions

Create a consistent way of referring to the MaskRCNN implementation

Since the matterport implementation is not available via a package manager, we want a way to consistently refer to the library from code.

  • One solution is to simply clone the repository as a subdirectory in this one; the license allows us to do this. The pro is that this is the simplest solution; the con is that it will bloat our own repo. We could strip out the parts that we don't need to alleviate that issue.
  • Another solution is to ensure that everyone has an environment variable that points to their local directory for MaskRCNN. For example:
    MASK_CNN_DIR='/Users/havan/Dropbox/CP/Git/MaskRCNN'
    We would then add all necessary paths from code, preferably from a central module/lib:
import os
import sys

# Raises KeyError right away if MASK_CNN_DIR is not set
mask_rcnn_path = os.environ['MASK_CNN_DIR']
sys.path.append(os.path.join(mask_rcnn_path, 'mrcnn'))
sys.path.append(os.path.join(mask_rcnn_path, 'samples', 'coco'))

import utils
import coco
...
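
If we go with the environment variable, the path setup could live in one helper inside a central module, so notebooks do not repeat it. The sketch below is only a suggestion: the function name `add_mask_rcnn_to_path` is hypothetical, and it assumes the same `mrcnn` and `samples/coco` subdirectories used above.

```python
import os
import sys


def add_mask_rcnn_to_path(env_var="MASK_CNN_DIR"):
    """Append the MaskRCNN source directories to sys.path and return them."""
    root = os.environ.get(env_var)
    if not root:
        raise RuntimeError(f"Set {env_var} to your local MaskRCNN checkout")
    new_paths = [
        os.path.join(root, "mrcnn"),
        os.path.join(root, "samples", "coco"),
    ]
    for path in new_paths:
        # Avoid duplicate entries when a notebook cell is re-run.
        if path not in sys.path:
            sys.path.append(path)
    return new_paths
```

A notebook would then call `add_mask_rcnn_to_path()` once at the top, before importing `utils` or `coco`.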

A subsequent task would be to move all model-related code to a central library that we can simply import from. Computing mAP (mean average precision) is one such piece of code that is currently duplicated across notebooks.

Please let me know what your thoughts are on this. Mentioning @pshivraj since we discussed this offline.

Demo.ipynb in the data-synthesizer module is not replicable

Problem

  1. Since the notebook adds/deletes configurations, the first half of the notebook (Step 3) almost always errors out.
  2. The code in step 4 refers to a specific file generated on a specific date. This will also error out if anyone runs the notebook as is.

Solution

  1. Reorder the cells, so that the changes result in the original configuration of foregrounds.json and backgrounds.json being restored OR use a temporary config in the example.
  2. Have generate_synthetic_dataset return the paths to the train_image and train_mask directories.
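
The "temporary config" option from the first point could be sketched as follows. It assumes foregrounds.json and backgrounds.json sit together in one directory and that the notebook can be pointed at a different config directory; the context manager name `temporary_configs` is hypothetical.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temporary_configs(config_dir, names=("foregrounds.json", "backgrounds.json")):
    """Yield a throwaway directory holding copies of the config files.

    Edits made to the copies never touch the originals, and the directory
    is deleted when the with-block exits.
    """
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        for name in names:
            shutil.copy(Path(config_dir) / name, tmp_dir / name)
        yield tmp_dir
```

The demo cells that add or delete configurations would then run inside `with temporary_configs(...) as cfg_dir:` and operate on `cfg_dir`, so re-running the notebook starts from the same state every time.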
