
pytorch-ssd300's Introduction

Pytorch Implementation of SSD300

We redesigned the original implementation, which targets PyTorch 0.4, and fixed its bugs.

This code supports PyTorch 1.0 or later with Python 3.6.

Please refer to the paper for details.

Objective

To build a model that can detect and localize specific objects in images.

This repository addresses Single Shot Multibox Detector (SSD), a popular, powerful, and especially nimble network for this task. The authors' original implementation can be found here.

Usage

A quick overview of the entire procedure.
The following sections elaborate on each step.

Overall Training

cd asset
bash download_voc.sh
cd ..
python create_data_list.py
python train.py

Overall Test

cd asset
bash download.sh
cd ..
python detect.py  # or: python eval.py

Dataset

We use the VOC2007 and VOC2012 datasets to train SSD300.
You can download them with the commands below.

cd asset
bash download_voc.sh

VOCdevkit
-| VOC2007
   -| Annotations
   -| ImageSets
   -| JPEGImages
   -| SegmentationClass
   -| SegmentationObject
-| VOC2012
   -| Annotations
   -| ImageSets
   -| JPEGImages
   -| SegmentationClass
   -| SegmentationObject
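
Before moving on, it can help to confirm that the download produced the layout above. The following is a minimal sketch, not part of this repository: the helper name missing_voc_dirs is hypothetical, and it checks only the folders listed in the tree.

```python
from pathlib import Path

# Sub-folders shown in the tree above for each VOC year.
EXPECTED_SUBDIRS = ("Annotations", "ImageSets", "JPEGImages",
                    "SegmentationClass", "SegmentationObject")


def missing_voc_dirs(devkit_root, years=("VOC2007", "VOC2012")):
    """Return the expected folders that are absent under devkit_root."""
    root = Path(devkit_root)
    missing = []
    for year in years:
        for sub in EXPECTED_SUBDIRS:
            if not (root / year / sub).is_dir():
                missing.append(str(Path(year) / sub))
    return missing
```

Pass the path of your own VOCdevkit folder; an empty list means every listed folder was found.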

Create Data List

Before you train the model, you need to preprocess the data.
Specify the data root in create_data_list.py.

from utils import create_data_lists

if __name__ == '__main__':
    create_data_lists(voc07_path='[VOC2007 Datapath]', # specify your data root
                      voc12_path='[VOC2012 Datapath]',
                      output_folder='./')

python create_data_list.py

This generates four files: TRAIN_images.json, TRAIN_objects.json, TEST_images.json, and TEST_objects.json.
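
For orientation, the preprocessing step has to read the Pascal VOC XML annotations. The sketch below shows one way to extract boxes and labels from a single annotation file using only the standard library. It is not the repository's create_data_lists; the function name parse_voc_annotation and the layout of the returned dictionary are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_source):
    """Parse one Pascal VOC annotation (file path or file object).

    Returns a dict with parallel lists: 'boxes' as [xmin, ymin, xmax, ymax],
    'labels' as lower-cased class names, and 'difficulties' as 0/1 flags.
    """
    root = ET.parse(xml_source).getroot()
    boxes, labels, difficulties = [], [], []
    for obj in root.iter('object'):
        bbox = obj.find('bndbox')
        boxes.append([int(float(bbox.find(tag).text))
                      for tag in ('xmin', 'ymin', 'xmax', 'ymax')])
        labels.append(obj.find('name').text.strip().lower())
        # The <difficult> tag may be absent; treat that as "not difficult".
        difficult = obj.find('difficult')
        difficulties.append(int(difficult.text) if difficult is not None else 0)
    return {'boxes': boxes, 'labels': labels, 'difficulties': difficulties}
```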

Training

If the json files were successfully generated, you can now train the SSD300 model.

python train.py 

We use the SGD optimizer with momentum=0.9 and decay the learning rate at iterations 80000 and 100000. grad_clip is useful if you can afford a large batch size (e.g., more than 32). We trained the model with batch_size=8 on a single TITAN RTX without grad_clip.

The training settings are as follows:

# Learning parameters
checkpoint = None  # path to model checkpoint, None if none
batch_size = 8  # batch size
iterations = 120000  # number of iterations to train
workers = 4  # number of workers for loading data in the DataLoader
print_freq = 200  # print training status every __ batches
lr = 1e-3  # learning rate
decay_lr_at = [80000, 100000]  # decay learning rate after these many iterations
decay_lr_to = 0.1  # decay learning rate to this fraction of the existing learning rate
momentum = 0.9  # momentum
weight_decay = 5e-4  # weight decay
grad_clip = None  # clip if gradients are exploding, which may happen at larger batch sizes (sometimes at 32) - you will recognize it by a sorting error in the MultiBox loss calculation
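
To make the step schedule concrete, the sketch below computes the learning rate in effect at a given iteration from the lr, decay_lr_at, and decay_lr_to settings above. The function name lr_at_iteration is hypothetical, and it assumes the decay applies from the milestone iteration onward; train.py may implement the schedule differently.

```python
def lr_at_iteration(iteration, base_lr=1e-3, decay_lr_at=(80000, 100000),
                    decay_lr_to=0.1):
    """Learning rate in effect at a given iteration.

    Assumption: each milestone multiplies the current rate by decay_lr_to
    once the iteration counter reaches it.
    """
    lr = base_lr
    for milestone in decay_lr_at:
        if iteration >= milestone:
            lr *= decay_lr_to
    return lr
```

With the defaults, the rate is 1e-3 until iteration 80000, roughly 1e-4 until iteration 100000, and roughly 1e-5 for the remaining iterations.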

Test

We provide a pre-trained model via a download link.
You can download it from that link or with the shell script download.sh:

cd asset
bash download.sh

You can detect objects in a single image using detect.py.
Given the path to an image, it runs object detection with the pre-trained model and saves the result.

from PIL import Image

# detect() is assumed to be defined earlier in detect.py.
if __name__ == '__main__':
    img_path = '[Path of single image]'  # e.g., /mnt2/datasets/VOCdevkit/VOC2007/JPEGImages/000131.jpg
    original_image = Image.open(img_path, mode='r')
    original_image = original_image.convert('RGB')
    annotated_image = detect(original_image, min_score=0.2, max_overlap=0.5, top_k=200)
    annotated_image.save('[Name of result image]')  # e.g., ./result.jpg
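
The three arguments passed to detect correspond to common SSD post-processing steps: drop candidates scoring below min_score, suppress boxes that overlap a higher-scoring box by more than max_overlap (non-maximum suppression), and keep at most top_k results. The following single-class, pure-Python sketch illustrates that idea. The names iou and filter_detections are hypothetical, and the repository's own implementation may differ in detail.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def filter_detections(boxes, scores, min_score=0.2, max_overlap=0.5, top_k=200):
    """Return indices of the boxes kept for one class, best score first."""
    # 1. Drop low-confidence candidates, then sort by descending score.
    order = sorted((i for i, s in enumerate(scores) if s > min_score),
                   key=lambda i: scores[i], reverse=True)
    kept = []
    # 2. Greedy non-maximum suppression against the boxes already kept.
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= max_overlap for j in kept):
            kept.append(i)
    # 3. Cap the number of detections.
    return kept[:top_k]
```

Raising min_score gives fewer but more confident boxes, while lowering max_overlap suppresses neighbouring boxes more aggressively.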

Evaluation

For evaluation, the SSD model uses mAP (mean Average Precision).
Details on how mAP is calculated are in calculate_mAP in utils.py.
Evaluation takes a few minutes (it definitely depends on your environment).
We obtain an mAP of 70.2%.

python eval.py

Model    data      mAP   aero  bike  bird  boat  bottle  bus   car   chair  cow   table  dog   horse  mbike  person  plant  sheep  sofa  train  tv
SSD300   VOC07+12  74.3  75.5  80.2  72.3  66.3  47.6    83.0  84.2  86.1   54.7  78.3   73.9  84.5   85.3   82.6    76.2   48.6   73.9  76.0   83.4
Trial_1  VOC07+12  70.2  70.1  80.4  64.8  61.7  39.6    81.1  80.1  79.2   51.0  75.6   74.3  75.2   81.4   79.1    73.4   41.5   71.7  73.8   82.9
Trial_2  VOC07+12
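
As background for the per-class numbers, the sketch below computes the 11-point interpolated average precision defined by the VOC2007 benchmark for one class, and then averages per-class values into mAP. It assumes detections have already been matched to ground-truth boxes. The function names are hypothetical, and calculate_mAP in utils.py may differ in detail.

```python
def average_precision_11pt(scores, is_true_positive, n_ground_truth):
    """11-point interpolated AP for one class.

    scores: confidence of each detection.
    is_true_positive: parallel flags, True if the detection matched a
        previously unmatched ground-truth box (matching is done beforehand).
    n_ground_truth: number of ground-truth objects of this class.
    """
    if n_ground_truth == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    precisions, recalls = [], []
    tp = fp = 0
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / n_ground_truth)
    # Average the best precision reachable at recall >= t, t = 0.0 ... 1.0.
    total = 0.0
    for k in range(11):
        t = k / 10
        candidates = [p for p, r in zip(precisions, recalls) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)
```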

Demo images

pytorch-ssd300's People

Contributors

jeffkang-94

pytorch-ssd300's Issues

Reason for commenting out the image normalization in utils.py

#new_image = FT.normalize(new_image, mean=mean, std=std)

Hello, Jeffkang's code helped me a lot while I was reviewing SSD code. Thank you.
I have a question: is there a reason this part is commented out?
(I was thinking that normalization should be done using the ImageNet std and mean values.)
