havanagrawal / clomask
Capstone Project for Clobotics: Using Mask R-CNN for Rigid/Non-Rigid Retail Consumable Product Detection
License: MIT License
In order to collect training data, we need a workflow for the team to label and store data.
In general, in an ideal case:
Deliverable: A README/Wiki document
Problem
The synthesized data directory looks something like:
data/synth_data_2019_01_09_22_45_08/image_0_2019_01_09_22_45_08/train_image/image_0_2019_01_09_22_45_08.jpg
This feels incredibly noisy to me. Can we instead favor something more concise, such as:
data/synth_data_2019_01_09_22_45_08/image_0/train_image/image_0.jpg
In other words, I don't see the point of embedding the timestamp at three levels of the path.
@vivanvish Was there any particular reason (that perhaps I am completely missing) for requiring the timestamp at each level?
Solution
Change the image filename format to:
data/synth_data_{timestamp}/image_{k}/train_image/image_{k}.jpg
@pshivraj I'm assuming this makes no difference to your training pipeline, since afair you were using os.walk?
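A minimal sketch of the proposed layout, assuming the generator builds paths from a timestamp and an image index. The helper names `synth_image_path` and `find_images` are hypothetical and not part of the repository; the second function only illustrates why an os.walk-based loader should be indifferent to the rename.

```python
import os
import tempfile


def synth_image_path(root, timestamp, k):
    """Build the proposed path: the timestamp appears only in the top-level directory."""
    name = f"image_{k}"
    return os.path.join(root, f"synth_data_{timestamp}", name, "train_image", f"{name}.jpg")


def find_images(root):
    """Collect every .jpg under root with os.walk, independent of directory names."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".jpg"):
                found.append(os.path.join(dirpath, filename))
    return sorted(found)


if __name__ == "__main__":
    # Create two empty placeholder images in a throwaway directory and list them.
    with tempfile.TemporaryDirectory() as root:
        for k in range(2):
            path = synth_image_path(root, "2019_01_09_22_45_08", k)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "wb").close()
        print(find_images(root))
```

If the training pipeline parses the timestamp out of the image filename rather than just walking the tree, this assumption does not hold and the loader would need a matching change.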
Current Issues
Deliverables:
Generate box/bag masks on an empty fridge/shelf to train a custom Mask R-CNN model
As of now, the website should satisfy a single user story:
Depending on the complexity of the model, this MAY be asynchronous, i.e. the user may be assigned an image ID, which they can use to query a separate section of the website and view the generated mask.
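The asynchronous variant could be sketched as follows. This is an illustration only: `MaskJobStore` and its methods are hypothetical names, the in-memory dict stands in for whatever queue or database the web app would actually use, and the model worker that calls `complete` is assumed to exist elsewhere.

```python
import uuid


class MaskJobStore:
    """In-memory stand-in for the storage behind an asynchronous mask service."""

    def __init__(self):
        self._jobs = {}

    def submit(self, image_bytes):
        """Register an uploaded image and return the ID the user queries later."""
        image_id = uuid.uuid4().hex
        self._jobs[image_id] = {"status": "pending", "image": image_bytes, "mask": None}
        return image_id

    def complete(self, image_id, mask):
        """Called by the (assumed) model worker once the mask is ready."""
        job = self._jobs[image_id]
        job["status"] = "done"
        job["mask"] = mask

    def query(self, image_id):
        """Return (status, mask); mask is None until the job is done."""
        job = self._jobs.get(image_id)
        if job is None:
            return ("unknown", None)
        return (job["status"], job["mask"])
```

The upload endpoint would call `submit` and show the returned ID to the user; the separate section of the site would call `query` with that ID until the status is "done".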
The web app should also:
Scope (limited on purpose)
As part of preliminary research,
Evaluate matterport's MaskRCNN implementation, in terms of
Deliverables
As discussed in #16 it would be interesting to check if the brightness, clarity, and perspective have any effect on the model performance. My personal intuition is that:
Deliverables
See whether adding no-mask cases in the context of a grocery store improves model performance by reducing false positives.
We observe that the masks for lower-resolution images (blurred backgrounds, no clear FG/BG separation) can be erroneous. We need a formal investigation to validate whether resolution does in fact affect prediction performance.
Deliverable:
Deliverables
Replicate what the Puru/Havan notebook does, for bottles.
Problem
The research directory is missing the 'images/val_images/' directory. Additionally, the code cell is long-ish.
Proposed Solution
Add the directory. Make individual cells more concise.
Since the matterport implementation is not available via a package manager, we want a way to consistently refer to the library from code.
MASK_CNN_DIR='/Users/havan/Dropbox/CP/Git/MaskRCNN'
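import os
import sys

# MASK_CNN_DIR must be set in the environment (see the line above);
# indexing os.environ raises a KeyError if it is missing, instead of passing None on.
mask_rcnn_path = os.environ['MASK_CNN_DIR']
# Make the matterport modules importable by adding their directories to sys.path.
sys.path.append(os.path.join(mask_rcnn_path, 'mrcnn'))
sys.path.append(os.path.join(mask_rcnn_path, 'samples', 'coco'))
import utils
import coco
...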
A subsequent task would be to move all model-related code to a central library that we can simply import from. Computing mAP (mean average precision) is one such piece of code that is currently duplicated across notebooks.
Please let me know what your thoughts are on this. Mentioning @pshivraj since we discussed this offline.
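As a first step toward that central library, the path setup could live in one shared helper instead of being repeated per notebook. The sketch below assumes the MASK_CNN_DIR convention described above; the function names `mask_rcnn_import_paths` and `add_mask_rcnn_to_path` are hypothetical.

```python
import os
import sys


def mask_rcnn_import_paths(environ=os.environ):
    """Return the directories that must be on sys.path to import the matterport code."""
    base = environ.get("MASK_CNN_DIR")
    if not base:
        raise RuntimeError("MASK_CNN_DIR is not set; point it at the Mask R-CNN checkout")
    return [os.path.join(base, "mrcnn"), os.path.join(base, "samples", "coco")]


def add_mask_rcnn_to_path(environ=os.environ):
    """Append the Mask R-CNN directories to sys.path, skipping ones already present."""
    for path in mask_rcnn_import_paths(environ):
        if path not in sys.path:
            sys.path.append(path)
```

A notebook would then call `add_mask_rcnn_to_path()` once before `import utils` and `import coco`. Passing the environment as a parameter keeps the helper easy to test without touching the real environment.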
Problem
Solution
foregrounds.json and backgrounds.json being restored OR use a temporary config in the example.
generate_synthetic_dataset return the paths to the train_image and train_mask directories.