Laboro Tomato is an image dataset of growing tomatoes at different stages of ripening, designed for object detection and instance segmentation tasks. We also provide two subsets of tomatoes separated by size. The dataset was gathered at a local farm with two separate cameras, which differ in resolution and image quality.
Samples of raw/annotated images: IMG_1066, IMG_1246
Each tomato is assigned to one of 2 size categories (normal size and cherry tomato) and one of 3 categories depending on the stage of ripening:
- fully_ripened - completely red and ready to be harvested. Red color covers 90%* or more
- half_ripened - greenish and needs time to ripen. Red color covers 30-89%*
- green - completely green/white, sometimes with rare red parts. Red color covers 0-30%*
*All percentages are approximate and differ from case to case.
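The category rules above can be sketched as a small lookup. This is only an illustration, not part of the annotation process: the function names are hypothetical, the thresholds are the approximate ones listed above, and the overlap at 30% is resolved in favour of half_ripened as an assumption. Reading the `b_` and `l_` class prefixes as the big (normal size) and little (cherry) categories is inferred from the subset names tomato_big and tomato_little.

```python
def ripening_stage(red_percent: float) -> str:
    """Map an approximate red-coverage percentage to a ripening stage."""
    if not 0 <= red_percent <= 100:
        raise ValueError("red_percent must be between 0 and 100")
    if red_percent >= 90:
        return "fully_ripened"
    # Assumption: exactly 30% counts as half_ripened (the source ranges overlap there).
    if red_percent >= 30:
        return "half_ripened"
    return "green"


def class_name(size_prefix: str, red_percent: float) -> str:
    """Build a dataset class name such as 'b_green' or 'l_fully_ripened'."""
    if size_prefix not in ("b", "l"):
        raise ValueError("size_prefix must be 'b' or 'l'")
    return f"{size_prefix}_{ripening_stage(red_percent)}"
```

For example, `class_name("l", 95)` yields `l_fully_ripened`. Since the percentages differ from case to case, real annotations will not follow these cut-offs exactly.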
The dataset includes 804 images with the following details:
name: tomato_mixed
images: 643 train, 161 test
cls_num: 6
cls_names: b_fully_ripened, b_half_ripened, b_green, l_fully_ripened, l_half_ripened, l_green
total_bboxes: train[7781], test[1996]
bboxes_per_class:
*Train: b_fully_ripened[348], b_half_ripened[520], b_green[1467],
l_fully_ripened[982], l_half_ripened[797], l_green[3667]
*Test: b_fully_ripened[72], b_half_ripened[116], b_green[387],
l_fully_ripened[269], l_half_ripened[223], l_green[929]
image_resolutions: 3024x4032, 3120x4160
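As a quick sanity check of the numbers above, the per-class counts can be summed and compared with the stated totals. The sketch below only restates the figures from this list; the helper name `class_shares` is hypothetical.

```python
# Per-class bounding box counts copied from the dataset details above.
TRAIN_BBOXES = {
    "b_fully_ripened": 348, "b_half_ripened": 520, "b_green": 1467,
    "l_fully_ripened": 982, "l_half_ripened": 797, "l_green": 3667,
}
TEST_BBOXES = {
    "b_fully_ripened": 72, "b_half_ripened": 116, "b_green": 387,
    "l_fully_ripened": 269, "l_half_ripened": 223, "l_green": 929,
}


def class_shares(counts):
    """Return each class's fraction of all bounding boxes in a split."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# The per-class counts add up to the stated totals.
assert sum(TRAIN_BBOXES.values()) == 7781
assert sum(TEST_BBOXES.values()) == 1996

for name, share in class_shares(TRAIN_BBOXES).items():
    print(f"{name}: {share:.1%}")
```

The shares make the class imbalance visible: green tomatoes dominate both splits, which may be worth keeping in mind when reading per-class results.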
The Laboro Tomato dataset can be used to solve cutting-edge real-life tasks by fusing various technologies:
- Harvesting forecast based on tomato maturity
- Automatic harvest of only ripened tomatoes
- Identification and automatic thinning of deteriorated and obsolete tomatoes
- Spraying pesticides only on tomatoes at a specific ripening stage
- Temperature control in greenhouses according to ripening stage
- Quality control on production lines of food manufacturers, etc.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
For commercial use, please contact Laboro.AI Inc.
Laboro Tomato download link
The model was trained with mmdetection V2.0 on 4 Tesla V100 GPUs and is based on Mask R-CNN with an R-50-FPN 1x backbone:
Dataset | bbox AP | mask AP | Download |
---|---|---|---|
Laboro Tomato | 64.3 | 65.7 | model |
We did not tune hyperparameters for baseline model training and used the default values provided by the original mmdetection configs.
Training parameters:
lr = 0.01
step = [32, 44]
total epoch = 48
Image gallery with pretrained model output examples and a comparison between raw and annotated images.
To evaluate the pretrained models, please prepare an mmdetection environment by following the official installation guide.
It is recommended to symlink the dataset root to $MMDETECTION/data. If your folder structure is different, you may need to change the corresponding paths in config files.
mmdetection
├── mmdet
├── tools
├── configs
├── data
│ ├── laboro_tomato
│ │ ├── annotations
│ │ ├── train
│ │ ├── test
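The layout above can be set up and checked with a few lines of Python. This is a minimal sketch, assuming the dataset has been extracted to a local folder containing the `annotations`, `train` and `test` subfolders; the helper names `link_dataset` and `missing_subdirs` are hypothetical and not part of mmdetection.

```python
from pathlib import Path

# Subfolders expected under data/laboro_tomato, as in the tree above.
EXPECTED_SUBDIRS = ("annotations", "train", "test")


def link_dataset(dataset_dir, mmdet_root):
    """Symlink dataset_dir to <mmdet_root>/data/laboro_tomato and return the link path."""
    data_dir = Path(mmdet_root) / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    link = data_dir / "laboro_tomato"
    # Leave an existing folder or link untouched.
    if not (link.exists() or link.is_symlink()):
        link.symlink_to(Path(dataset_dir).resolve(), target_is_directory=True)
    return link


def missing_subdirs(mmdet_root):
    """Return the expected subfolders that are absent under data/laboro_tomato."""
    root = Path(mmdet_root) / "data" / "laboro_tomato"
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]
```

An empty list from `missing_subdirs` suggests the folder structure matches the one shown above; otherwise the paths in the config files need to be adjusted.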
To load the data, create a new dataset file mmdet/datasets/laboro_tomato.py with the corresponding subsets:
from .coco import CocoDataset
from .builder import DATASETS
@DATASETS.register_module()
class LaboroTomato(CocoDataset):
    CLASSES = ('b_fully_ripened', 'b_half_ripened', 'b_green',
               'l_fully_ripened', 'l_half_ripened', 'l_green')
And add the dataset name to mmdet/datasets/__init__.py:
from .laboro_tomato import LaboroTomato
__all__ = [
    ..., 'LaboroTomato'
]
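Because `LaboroTomato` extends `CocoDataset`, the annotation files are presumably COCO-style JSON. Under that assumption, the sketch below counts bounding boxes per class and flags category names that are missing from `CLASSES`, which can help catch a mismatch before training. The function names are hypothetical, and the annotation file path is left to the reader.

```python
import json
from collections import Counter

CLASSES = ('b_fully_ripened', 'b_half_ripened', 'b_green',
           'l_fully_ripened', 'l_half_ripened', 'l_green')


def load_coco(path):
    """Read a COCO-style annotation file into a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def count_boxes_per_class(coco):
    """Count annotations per category name in a COCO-style dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(id_to_name[a["category_id"]] for a in coco["annotations"])


def unexpected_classes(coco):
    """Return category names in the file that are not listed in CLASSES."""
    return sorted({c["name"] for c in coco["categories"]} - set(CLASSES))
```

If the assumption holds, running `count_boxes_per_class` on the train annotation file should give counts comparable to the bboxes_per_class figures listed earlier.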
Configuration file setup, using the Tomato Mixed dataset as an example:
- Create laboro_tomato_base.py in configs/_base_/datasets/ with the content of the coco_detection configuration file and change the dataset type, root and path parameters:
dataset_type = 'LaboroTomato'
data_root = 'data/laboro_tomato/'
...
- Create laboro_tomato_instance.py in configs/_base_/datasets/ with the content of coco_instance and replace its base with your base detection configuration file:
_base_ = 'laboro_tomato_base.py'
...
- Replace the number of classes in the model configuration file configs/_base_/models/mask_rcnn_r50_fpn.py:
...
num_classes = 6
...
num_classes = 6
...
- Replace the dataset configuration file name in configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py with the laboro_tomato_instance.py file created above:
_base_ = [
...
'../_base_/datasets/laboro_tomato_instance.py',
...
]
You can use the following commands to test a dataset:
# single-gpu testing
python tools/test.py configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
laboro_tomato_96ep.pth --show
# multi-gpu testing with 4 GPUs
./tools/dist_test.sh configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
laboro_tomato_96ep.pth 4 --out results.pkl --eval bbox segm
To train your model, finish all the steps from the Test a model section, then change the learning rate, total epochs and steps in configs/_base_/schedules/schedule_1x.py. The default learning rate in the config files is for 8 GPUs and 2 img/gpu (batch size = 8*2 = 16). According to the Linear Scaling Rule, you need to set the learning rate proportional to the batch size if you use a different number of GPUs or images per GPU, e.g., lr=0.01 for 4 GPUs * 2 img/gpu and lr=0.08 for 16 GPUs * 4 img/gpu.
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
...
step=[64, 88])
total_epochs = 96
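The Linear Scaling Rule described above amounts to a one-line calculation. The sketch below is illustrative only: the helper names are hypothetical, and the reference point of lr=0.02 at batch size 16 is the value implied by the two examples in the text (0.01 at batch size 8, 0.08 at batch size 64). The second helper assumes the decay steps keep the proportions of the stock 1x schedule (steps at epochs 8 and 11 out of 12), which matches both step settings shown in this document.

```python
def scaled_lr(num_gpus, imgs_per_gpu, base_lr=0.02, base_batch=16):
    """Scale the learning rate linearly with the total batch size."""
    return base_lr * (num_gpus * imgs_per_gpu) / base_batch


def scaled_steps(total_epochs, base_steps=(8, 11), base_epochs=12):
    """Stretch the lr decay steps proportionally to a longer schedule."""
    return [step * total_epochs // base_epochs for step in base_steps]


print(scaled_lr(4, 2))    # 4 GPUs * 2 img/gpu
print(scaled_lr(16, 4))   # 16 GPUs * 4 img/gpu
print(scaled_steps(48))
print(scaled_steps(96))
```

With 4 GPUs and 2 img/gpu this gives the lr=0.01 used here, and the step helper gives [32, 44] for 48 epochs and [64, 88] for 96 epochs.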
You can use the following commands to train a model:
# single-gpu train
python tools/train.py configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py \
--work-dir ./laboro_tomato
# multi-gpu train with 4 GPUs
./tools/dist_train.sh configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py 4 \
--work-dir ./laboro_tomato
name: tomato_big
images: 353 train, 89 test
cls_num: 3
cls_names: b_fully_ripened, b_half_ripened, b_green
total_bboxes: train[2360], test[550]
bboxes_per_class:
*Train: b_fully_ripened[343], b_half_ripened[506], b_green[1511]
*Test: b_fully_ripened[77], b_half_ripened[130], b_green[343]
image_resolutions: 3024x4032, 3120x4160
name: tomato_little
images: 289 train, 73 test
cls_num: 3
cls_names: l_fully_ripened, l_half_ripened, l_green
total_bboxes: train[5397], test[1470]
bboxes_per_class:
*Train: l_fully_ripened[963], l_half_ripened[805], l_green[3629]
*Test: l_fully_ripened[288], l_half_ripened[215], l_green[967]
image_resolutions: 3024x4032, 3120x4160
As with the main dataset, models for Laboro tomato big and Laboro tomato little were trained with mmdetection V2.0 on 4 Tesla V100 GPUs and are based on Mask R-CNN with an R-50-FPN 1x backbone:
Dataset | bbox AP | mask AP | Download |
---|---|---|---|
Laboro tomato big | 67.9 | 68.4 | model |
Laboro tomato little | 62.7 | 63.1 | model |
Training parameters:
lr = 0.01
step = [32, 44]
total epoch = 48