
Fast alternative to FiftyOne for creating a subset of the COCO dataset.

minicoco

This script is a quick alternative to FiftyOne for creating a subset of the 2017 COCO dataset. It generates training and validation datasets that share a single images folder, with a labels folder holding the annotations for both datasets in COCO (JSON) format. It is mainly inspired by the pycocoDemo notebook and, for the download method, by this Stack Overflow solution.

Its execution creates the following directory tree:

data/
    images/ *.jpg
    labels/ train.json
            val.json
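
Once the script has run, one way to confirm that the tree is complete is to check that every image listed in a labels file has a matching file in images/. The sketch below is not part of the script. It assumes that each entry in the JSON "images" list carries a "file_name" field (as in the COCO annotation format) and that images are saved under that name; the helper name missing_images is hypothetical.

```python
import json
from pathlib import Path


def missing_images(labels_file, images_dir):
    """Return file names listed in a COCO JSON file that are absent from images_dir."""
    with open(labels_file) as f:
        coco = json.load(f)
    images_dir = Path(images_dir)
    # Assumption: each image is stored under its COCO "file_name".
    return [
        img["file_name"]
        for img in coco["images"]
        if not (images_dir / img["file_name"]).is_file()
    ]


if __name__ == "__main__":
    # Paths follow the directory tree shown above.
    for split in ("train", "val"):
        labels = Path("data/labels") / f"{split}.json"
        if labels.is_file():
            missing = missing_images(labels, "data/images")
            print(f"{split}: {len(missing)} listed image(s) not found on disk")
```

A non-zero count for either split suggests that some images were not saved.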

Installation

The use of conda is recommended. The following steps are required in order to run the script:

conda create -n minicoco python=3.9
conda activate minicoco
git clone https://github.com/tikitong/minicoco.git 
cd minicoco
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip ./annotations_trainval2017.zip
pip install -r requirements.txt

Usage

usage: script.py [-h] [-t TRAINING] [-v VALIDATION] [-cat NARGS [NARGS ...]] annotation_file

positional arguments:
  annotation_file       annotations/instances_train2017.json path file.

optional arguments:
  -h, --help            show this help message and exit
  -t TRAINING, --training TRAINING
                        number of images in the training set.
  -v VALIDATION, --validation VALIDATION
                        number of images in the validation set.
  -cat NARGS [NARGS ...], --nargs NARGS [NARGS ...]
                        category names.
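
The help text above has the layout that Python's argparse module produces. As a rough illustration of how such an interface can be declared, here is a minimal sketch; it is not taken from script.py. The use of type=int for the two counts is an assumption, and build_parser is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser whose options mirror the usage text above."""
    parser = argparse.ArgumentParser(prog="script.py")
    parser.add_argument("annotation_file",
                        help="annotations/instances_train2017.json path file.")
    # Assumption: the image counts are parsed as integers.
    parser.add_argument("-t", "--training", type=int,
                        help="number of images in the training set.")
    parser.add_argument("-v", "--validation", type=int,
                        help="number of images in the validation set.")
    # nargs="+" collects one or more category names into a list (args.nargs).
    parser.add_argument("-cat", "--nargs", nargs="+",
                        help="category names.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["annotations/instances_train2017.json",
         "-t", "30", "-v", "10", "-cat", "car", "airplane", "person"]
    )
    print(args.annotation_file, args.training, args.validation, args.nargs)
```

With a parser declared this way, annotation_file should come before -cat on the command line, because nargs="+" consumes every value that follows it. A category name containing a space would have to be quoted in the shell to arrive as a single value; whether script.py accepts such names is not stated here.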

The 80 categories that can be used with the -cat argument are the following:

person, bicycle, car, motorcycle, airplane, bus, train, truck, boat, traffic light, fire hydrant, stop sign, parking meter, bench, bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe, backpack, umbrella, handbag, tie, suitcase, frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket, bottle, wine glass, cup, fork, knife, spoon, bowl, banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake, chair, couch, potted plant, bed, dining table, toilet, tv, laptop, mouse, remote, keyboard, cell phone, microwave, oven, toaster, sink, refrigerator, book, clock, vase, scissors, teddy bear, hair drier, toothbrush
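
The category names can be listed with the following code:

# From https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocoDemo.ipynb
from pycocotools.coco import COCO

# Load the annotation file and fetch every category record.
coco = COCO("annotations/instances_train2017.json")
cats = coco.loadCats(coco.getCatIds())
# Keep only the category names and print them on one line.
nms = [cat['name'] for cat in cats]
print('COCO categories: \n{}\n'.format(' '.join(nms)))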

For example, you can run:

python script.py annotations/instances_train2017.json -t 30 -v 10 -cat car airplane person
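
To give an idea of what building such a subset involves, the sketch below filters an in-memory COCO-style dictionary down to a fixed number of images for the requested categories. It is an illustration under stated assumptions, not the logic of script.py: it keeps images that contain at least one of the requested categories, samples them at random, and drops annotations from other categories. The function name subset_coco and the seed parameter are hypothetical.

```python
import random


def subset_coco(coco, category_names, n_images, seed=0):
    """Return a COCO-style dict limited to n_images images of the given categories."""
    wanted = set(category_names)
    cat_ids = {c["id"] for c in coco["categories"] if c["name"] in wanted}
    # Images that have at least one annotation in a requested category.
    candidates = sorted(
        {a["image_id"] for a in coco["annotations"] if a["category_id"] in cat_ids}
    )
    # Sample without replacement; never ask for more images than exist.
    rng = random.Random(seed)
    chosen = set(rng.sample(candidates, min(n_images, len(candidates))))
    return {
        "images": [im for im in coco["images"] if im["id"] in chosen],
        "annotations": [
            a for a in coco["annotations"]
            if a["image_id"] in chosen and a["category_id"] in cat_ids
        ],
        "categories": [c for c in coco["categories"] if c["id"] in cat_ids],
    }


if __name__ == "__main__":
    # Tiny hand-made example in COCO layout (not real COCO data).
    toy = {
        "images": [{"id": 1}, {"id": 2}, {"id": 3}],
        "annotations": [
            {"id": 10, "image_id": 1, "category_id": 3},
            {"id": 11, "image_id": 2, "category_id": 1},
            {"id": 12, "image_id": 3, "category_id": 3},
        ],
        "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}],
    }
    print(subset_coco(toy, ["car"], 2))
```

If fewer images match than were requested, this sketch returns all the matching images rather than raising an error.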

minicoco's Issues

Running issue for script.py

Hello,

Thank you very much for your effort and work.
I am trying to use your script to get part of the COCO dataset.
When I set the number of images for the training and validation sets to 2001:667, I get a dataset with both images and annotations.
But when I set the numbers to 3500:1500, I get an annotation JSON file without a folder of images.

Could you please help me understand why this is happening? Note that the first case has only two classes, while the second case has seven classes.

Another question: are the images that are downloaded while the code runs taken from train2017, or from the val or test images?
Please note that I am using only the COCO dataset, no other dataset.
