Introduction

This repository is an implementation of MTCNN in MXNet.

  • core: core routines for MTCNN training and testing
  • tools: utilities for training and testing
  • data: datasets; see Data Folder Structure for the layout. A dataset usually contains images and imglists.
  • model: folder for saved training symbols and models
  • prepare_data: scripts for generating training data for pnet, rnet and onet

Useful information

You're required to modify mxnet/src/regression_output-inl.h according to mxnet_diff.patch before using the code for training.

  • Dataset format: Training images are stored in ./data/dataset_name/images/ and the annotation file is placed in ./data/dataset_name/imglists/

    • For training: Each line of the annotation file describes one training sample.
      The format is: [path to image] [cls_label] [bbox_label]
      cls_label: 1 for positive, 0 for negative, -1 for part face.
      bbox_label: the offsets of x1, y1, x2, y2, calculated as (x_gt - x) / width for the x coordinates and (y_gt - y) / height for the y coordinates.
      An example line: 12/positive/28 1 -0.05 0.11 -0.05 -0.11
      Note that all fields are separated by spaces.

    • For testing: Similar to training, but only the path to the image is needed.
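
The annotation format above can be illustrated with a minimal Python sketch. The function names (bbox_offsets, format_annotation, parse_annotation) are hypothetical and not taken from this repository. The sketch assumes that boxes are given as (x1, y1, x2, y2) in pixels, that width and height are those of the cropped box computed as x2 - x1 and y2 - y1, and that offsets are written with two decimals; the actual scripts may differ on each of these points.

```python
def bbox_offsets(crop, gt):
    """Offsets of the ground-truth box relative to the crop box.

    Both boxes are (x1, y1, x2, y2). Assumption: width and height are
    those of the crop, computed as x2 - x1 and y2 - y1.
    """
    x1, y1, x2, y2 = crop
    gx1, gy1, gx2, gy2 = gt
    width = float(x2 - x1)
    height = float(y2 - y1)
    return ((gx1 - x1) / width, (gy1 - y1) / height,
            (gx2 - x2) / width, (gy2 - y2) / height)


def format_annotation(image_path, cls_label, offsets=None):
    """Build one space-separated annotation line."""
    fields = [image_path, str(cls_label)]
    if offsets is not None:
        fields += ["%.2f" % v for v in offsets]
    return " ".join(fields)


def parse_annotation(line):
    """Split a line into (path, cls_label, bbox_label).

    Missing fields are returned as None, so a test-time line that holds
    only the image path is accepted as well.
    """
    parts = line.split()
    path = parts[0]
    cls_label = int(parts[1]) if len(parts) > 1 else None
    bbox = [float(v) for v in parts[2:6]] if len(parts) >= 6 else None
    return path, cls_label, bbox


# Usage: a 12 x 12 crop whose ground-truth box is shifted by (3, -3) pixels.
offsets = bbox_offsets((10, 10, 22, 22), (13, 7, 25, 19))
print(format_annotation("12/positive/28", 1, offsets))
# prints: 12/positive/28 1 0.25 -0.25 0.25 -0.25
```

Dividing by the crop size keeps the offsets independent of the image resolution, which is why the same label format can serve the 12, 24 and 48 pixel inputs.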

  • Data Folder Structure (assuming the root is data)

cache (created by imdb)
-- name + image set + gt_roidb
-- results (created by detection and evaluation)
mtcnn # contains images and annotations for training mtcnn
-- images
---- 12 (images of size 12 x 12, used by pnet)
---- 24 (images of size 24 x 24, used by rnet)
---- 48 (images of size 48 x 48, used by onet)
-- imglists 
---- train_12.txt
---- train_24.txt
---- train_48.txt
custom (datasets for testing) 
-- images
-- imglists
---- image_set.txt
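
As a rough illustration of the layout above, the following Python sketch creates the training part of the tree and builds the path of an image list. The helpers make_layout and imglist_path are hypothetical names introduced here, not functions from this repository, and the cache folder is left out because it is created by imdb.

```python
from pathlib import Path


def make_layout(root, sizes=(12, 24, 48), dataset="mtcnn"):
    """Create <root>/<dataset>/images/<size> and <root>/<dataset>/imglists."""
    root = Path(root)
    created = []
    for size in sizes:
        image_dir = root / dataset / "images" / str(size)
        image_dir.mkdir(parents=True, exist_ok=True)
        created.append(image_dir)
    imglist_dir = root / dataset / "imglists"
    imglist_dir.mkdir(parents=True, exist_ok=True)
    created.append(imglist_dir)
    return created


def imglist_path(root, size, dataset="mtcnn"):
    """Path of the training list for one input size, e.g. train_12.txt."""
    return Path(root) / dataset / "imglists" / ("train_%d.txt" % size)


# Usage: the list consumed when training pnet.
print(imglist_path("data", 12).as_posix())
# prints: data/mtcnn/imglists/train_12.txt
```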
  • Scripts to generate training data (from the WIDER FACE dataset)
    • Run wider_annotations/transform.m (or transform.py) to get the annotation file in the required format.
    • gen_pnet_data.py: obtains training samples for pnet.
    • gen_hard_example.py: prepares hard examples. Set test_mode to "pnet" to get training data for rnet, or to "rnet" to get training data for onet.
    • gen_imglist.py: randomly samples the images generated by gen_pnet_data.py or gen_hard_example.py to form the training set.
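
The random sampling step in the last item can be sketched as follows. This is a minimal illustration and not the code of gen_imglist.py: the function sample_imglist, the grouping of lines by sample type and the per-group counts are all assumptions made for the example, and the counts actually used by the script are not stated here.

```python
import random


def sample_imglist(groups, counts, seed=None):
    """Randomly pick annotation lines from each group and shuffle them.

    groups: dict mapping a group name (for example "positive") to a list
            of annotation lines.
    counts: dict mapping a group name to the maximum number of lines to
            keep; groups without an entry are kept whole.
    """
    rng = random.Random(seed)
    chosen = []
    for name, lines in groups.items():
        k = min(counts.get(name, len(lines)), len(lines))
        chosen.extend(rng.sample(lines, k))  # sampling without replacement
    rng.shuffle(chosen)  # mix the groups so the list is not ordered by type
    return chosen


# Usage with made-up lines: keep 3 positives and 6 negatives.
groups = {
    "positive": ["12/positive/%d 1 0.00 0.00 0.00 0.00" % i for i in range(5)],
    "negative": ["12/negative/%d 0" % i for i in range(10)],
}
train_lines = sample_imglist(groups, {"positive": 3, "negative": 6}, seed=0)
print(len(train_lines))
# prints: 9
```

Passing a fixed seed makes the resulting list reproducible, which helps when comparing training runs.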

Results

[Image: results]

License

MIT LICENSE

Reference

Kaipeng Zhang, Zhanpeng Zhang, Zhifeng Li, Yu Qiao, "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks," IEEE Signal Processing Letters
