foreveryounggithub / mtcnn


Repository for "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks", implemented with Caffe, C++ interface.

CMake 1.72% C++ 92.21% Makefile 0.94% Shell 0.93% Perl 1.82% Python 2.37%

mtcnn's People

Contributors

foreveryounggithub


mtcnn's Issues

How does the pts loss layer ignore negative samples?

When preparing the data, my final format is as follows (each line: image path, 1 label, 4 box values, 10 landmark values):
pos1.jpg 1 0.1 0.2 0.3 0.4 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
neg1.jpg 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
part1.jpg -1 0.1 0.2 0.3 0.4 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
landmark1.jpg -1 -1 -1 -1 -1 0.1 0.3 0.3 0.4 0.5 0.5 0.3 0.2 0.7 0.7
When computing the landmark loss, how do I ignore the other sample types? Did you rewrite the loss layer, or do something else? Looking at the prototxt, it seems you did not rewrite it. Or did I miss something earlier? Thanks.
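For illustration only (not this repository's actual loss layer), a minimal C++ sketch of the masking idea being asked about, assuming landmark labels of -1 mark samples that carry no landmark annotation:

    #include <cstddef>
    #include <vector>

    // Hypothetical sample type: 10 landmark labels and 10 predictions per sample.
    struct Sample {
        float landmark_label[10];
        float landmark_pred[10];
    };

    // Accumulate the Euclidean landmark loss only over samples that actually
    // carry landmark labels; all other sample types are skipped.
    float MaskedLandmarkLoss(const std::vector<Sample>& batch) {
        float loss = 0.0f;
        std::size_t used = 0;
        for (const Sample& s : batch) {
            if (s.landmark_label[0] < 0.0f) continue;  // -1 placeholder: skip
            for (int k = 0; k < 10; ++k) {
                float d = s.landmark_pred[k] - s.landmark_label[k];
                loss += 0.5f * d * d;
            }
            ++used;
        }
        return used > 0 ? loss / used : 0.0f;          // average over valid samples only
    }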

When I use the CelebA database to train the P_Net network, an error is reported.

Hi:
When I use the CelebA database to train the P_Net network, the following error occurs:
I0520 22:22:14.954217 3971 net.cpp:84] Creating Layer loss_label
I0520 22:22:14.954236 3971 net.cpp:406] loss_label <- conv4-1
I0520 22:22:14.954241 3971 net.cpp:406] loss_label <- label
I0520 22:22:14.954246 3971 net.cpp:380] loss_label -> loss_label
I0520 22:22:14.954262 3971 layer_factory.hpp:77] Creating layer loss_label
F0520 22:22:14.954509 3971 softmax_loss_layer.cpp:47] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (4900 vs. 100) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be NHW, with integer values in {0, 1, ..., C-1}.
*** Check failure stack trace: ***
@ 0x7fefd7991daa (unknown)
@ 0x7fefd7991ce4 (unknown)
@ 0x7fefd79916e6 (unknown)
@ 0x7fefd7994687 (unknown)
@ 0x7fefd804eee0 caffe::SoftmaxWithLossLayer<>::Reshape()
@ 0x7fefd7fc1bd5 caffe::Net<>::Init()
@ 0x7fefd7fc3ad2 caffe::Net<>::Net()
@ 0x7fefd810e0d0 caffe::Solver<>::InitTrainNet()
@ 0x7fefd810f023 caffe::Solver<>::Init()
@ 0x7fefd810f2ff caffe::Solver<>::Solver()
@ 0x7fefd7fa2a31 caffe::Creator_SGDSolver<>()
@ 0x40ee6e caffe::SolverRegistry<>::CreateSolver()
@ 0x407efd train()
@ 0x40590c main
@ 0x7fefd699af45 (unknown)
@ 0x40617b (unknown)
@ (nil) (unknown)
Aborted (core dumped)
label.txt contains only the labels of the positive samples.
Looking forward to your reply.

Is the GPU performance nearly the same as the CPU?

Hi, I have used your code, selecting CPU or GPU via the macro for testing (not for training). But I find the performance of these two modes is nearly the same.
By the way, I would like to know: can I use your model directly to detect multiple faces, without any training?
Thank you!

Why does the loss of the landmark task not decrease?

I have trained with your code many times, but the loss of the landmark task does not converge. I don't know what is wrong. When I train only face classification and bounding-box regression, the losses of both tasks decrease. Why?

generate hdf5 file?

labels = np.concatenate((label, regression_box, landmark), axis=1)
with h5py.File(train_file_path, 'w') as f:
    f['data'] = a
    f['labels'] = labels
    f['regression'] = regression_box
    f['landmark'] = landmark

There is no need for f['regression'] = regression_box and f['landmark'] = landmark, because labels = np.concatenate((label, regression_box, landmark), axis=1) already contains the regression and landmark values.

Is that right?

Image label calculation

Hi Liuyang, many thanks. But how do you calculate the face regression training labels? In your example, face regression: [0.1, 0.1, 0.1, 0.1]. What is the calculation method to get these values: 0.1 0.1 0.1 0.1?
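One common convention (a sketch only, not necessarily this repository's exact method) is to store the offsets of the ground-truth box relative to the cropped candidate window, normalized by the crop size, so a value like 0.1 simply means "shift that edge by 10% of the crop":

    // Sketch: regression labels as normalized corner offsets between a crop
    // (x, y, w, h) and the ground-truth box (gx, gy, gw, gh). All names are
    // illustrative, not taken from the repository.
    struct BoxRegTarget { float dx1, dy1, dx2, dy2; };

    BoxRegTarget MakeRegressionLabel(float x, float y, float w, float h,
                                     float gx, float gy, float gw, float gh) {
        BoxRegTarget t;
        t.dx1 = (gx - x) / w;                 // left-edge offset / crop width
        t.dy1 = (gy - y) / h;                 // top-edge offset / crop height
        t.dx2 = ((gx + gw) - (x + w)) / w;    // right-edge offset / crop width
        t.dy2 = ((gy + gh) - (y + h)) / h;    // bottom-edge offset / crop height
        return t;
    }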

Training data

Dear author, can you provide a link to your prepared training data on Baidu Pan? Thanks.

About plotting the PR curve

Hello! I'd like to ask how the PR curve provided by the author was produced. I took the bboxes and scores output by O-Net and ran them through the WIDER FACE eval-tools, and the results were very poor... So how should the bboxes and scores be selected?

train data label

@foreverYoungGitHub Hi, my dear Liuyang, based on your latest reply: if an image is H*W, the ground truth is (x0,y0,x1,y1) where (x0,y0) is the top-left point of the ground truth, and a bounding box (x2,y2,x3,y3) has IoU with the ground truth > 0.65, we consider it a positive example. Is the training label
( (x0-x2)/W, (y0-y2)/H, (x1-x3)/W, (y1-y3)/H ) ?
But if this is right, the training labels are all near 0, and we also set the regression values of negatives to (0, 0, 0, 0). Is that OK? I mean that both negative and positive samples' regression values are near 0.
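For reference, the IoU check mentioned above can be computed as in the sketch below (per the MTCNN paper: IoU above 0.65 is a positive, 0.4 to 0.65 a part face, below 0.3 a negative); the (x, y, w, h) box layout is an assumption here:

    #include <algorithm>

    // Intersection-over-union of two axis-aligned boxes given as (x, y, w, h).
    float IoU(float x1, float y1, float w1, float h1,
              float x2, float y2, float w2, float h2) {
        float iw = std::max(0.0f, std::min(x1 + w1, x2 + w2) - std::max(x1, x2));
        float ih = std::max(0.0f, std::min(y1 + h1, y2 + h2) - std::max(y1, y2));
        float inter = iw * ih;
        float uni = w1 * h1 + w2 * h2 - inter;
        return uni > 0.0f ? inter / uni : 0.0f;
    }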

How many training examples do you use?

How many training examples do you use? I extract 6 non-face patches, 2 part-face patches, and 2 positive patches from every image, so I get 1 billion training samples in total. Is this too many?

How do you ignore the other three sample types in a batch when computing the landmark loss?

When preparing the data, my final format is as follows (each line: image path, 1 label, 4 box values, 10 landmark values):
pos1.jpg 1 0.1 0.2 0.3 0.4 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
neg1.jpg 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
part1.jpg -1 0.1 0.2 0.3 0.4 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
landmark1.jpg -1 -1 -1 -1 -1 0.1 0.3 0.3 0.4 0.5 0.5 0.3 0.2 0.7 0.7
When computing the landmark loss, how do I ignore the other sample types? Did you rewrite the loss layer, or do something else? Looking at the prototxt, it seems you did not rewrite it. Or did I miss something earlier? Thanks.

what's the format of label.txt

Hi ,

I wonder what the format of label.txt, landmark.txt, regression_box.txt, and crop_image.txt is.
Could you kindly tell me?

Thank you very much!

GPU version

When I try to run this code in the GPU version, it always shows errors like this (even when I try on a different GPU server).

...
}
I1030 17:39:38.964071 13238 layer_factory.hpp:77] Creating layer input
I1030 17:39:38.964112 13238 net.cpp:84] Creating Layer input
I1030 17:39:38.964131 13238 net.cpp:380] input -> data
I1030 17:39:38.965209 13238 net.cpp:122] Setting up input
I1030 17:39:38.966150 13238 net.cpp:129] Top shape: 1 3 12 12 (432)
I1030 17:39:38.966163 13238 net.cpp:137] Memory required for data: 1728
I1030 17:39:38.966179 13238 layer_factory.hpp:77] Creating layer conv1
I1030 17:39:38.966215 13238 net.cpp:84] Creating Layer conv1
I1030 17:39:38.966230 13238 net.cpp:406] conv1 <- data
I1030 17:39:38.966775 13238 net.cpp:380] conv1 -> conv1
Segmentation fault

How to generate training data from the CelebA database

Hello:
How did you generate, from the CelebA database, the files referenced in your generate_hdf5.py:
label_path = '../dataset/label.txt'
landmark_path = '../dataset/landmark.txt'
regression_box_path = '../dataset/regression_box.txt'
crop_image_path = '../dataset/crop_image.txt'
train_file_path = '../dataset/train_24.hd5'
I'm a Caffe newbie; looking forward to your reply. Thanks.

Can I know how you suppress the losses that are not used?

According to the MTCNN paper, "some of the losses are not used". For example, for a negative sample, no bounding box or landmark points will be detected, so the regression loss and landmark loss are not used. Can I know which part of your code does that? Thanks.

problem of training models

Sorry to bother you. I have trained the three models det1, det2, and det3 separately, but the detection performance is not as good as yours; the landmark locations in particular are inaccurate.
I crop the images, generate HDF5 files of the different sizes (12x12, 24x24, 48x48), and then train det1, det2, and det3 with the corresponding HDF5 files.
Am I doing something wrong? Should I train the det2 model based on the trained det1 model?

Face Alignment

Can I know if the code does face alignment after detection?
If so, can I know where the face alignment code starts?

train data proportion

Hi @foreverYoungGitHub, in your reply: "For example, positive : part : negative = 1 : 3 : 3 at the beginning, while it will change to positive : part : negative = 1 : 5 : 3 in the next iteration."
Do you mean that within one training run we need to change the data proportion of every batch? Or do you mean we use 1:3:3 to train a model A.caffemodel, then use this model to generate training data with the proportion set to 1:5:3, fine-tune on A.caffemodel, and thus get B.caffemodel?

bulk detect version?

MTCNN detection is slow on a mobile device with CPU only: a 720p image averages 500 ms (Samsung S7 Edge).
Could we implement a bulk (batched) version? It might be faster.

linking error

I'm using the caffe-1.0 release version, but a linking error occurs.
Which Caffe version did you originally use?
@foreverYoungGitHub

CMakeFiles/MTCNN.dir/MTCNN.cpp.o: In function `MTCNN::MTCNN(std::vector<std::string, std::allocator<std::string> >, std::vector<std::string, std::allocator<std::string> >)':
MTCNN.cpp:(.text+0x6df): undefined reference to `caffe::Net::Net(std::string const&, caffe::Phase, int, std::vector<std::string, std::allocator<std::string> > const*)'
collect2: error: ld returned 1 exit status
make[2]: *** [MTCNN] Error 1
make[1]: *** [CMakeFiles/MTCNN.dir/all] Error 2
make: *** [all] Error 2

Fine tuned landmark

Hi @foreverYoungGitHub,

Based on your sentence in README "You can also train the face detection and regression for the dataset without landmark label. The model is then used to train the face landmark."
Can we fine-tune ONet using only landmark data?

Thanks

train data label

Hi Liuyang, could you give us a sample of the labels for the three types of images?

memory leak?

In celeba_crop.cpp:
char *cstr = new char[path[i].length() + 1];
There is no matching delete[].
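A hedged sketch of one way to avoid the leak, assuming cstr is only needed as a temporary writable C string for path[i]; alternatively, adding a matching delete[] cstr; would also fix it:

    #include <string>
    #include <vector>

    void UsePathAsCString(const std::string& path_i) {
        // Copy the path into an automatically managed buffer and NUL-terminate it.
        std::vector<char> cstr(path_i.begin(), path_i.end());
        cstr.push_back('\0');
        // ... use cstr.data() wherever the raw char* was used ...
    }   // buffer is freed automatically when it goes out of scope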

little offset in face alignment

For trump.jpg, compared with your result, there is a small offset in the face alignment phase when main.cpp calls MTCNN.detection_TEST(). Could you tell me how I can improve the face alignment precision?

Train the network

I want to train MTCNN using my own dataset. Can you please give me some hints on how to do that? For example:

  1. The format of the dataset, such as how to name the folders and images.
  2. How to train the network to produce the model.

how to use the mtcnn.cpp

Hi:
I use the detection code to detect and align faces, but I got an error, and the code only outputs the face detection result, not the alignment result.
Can you provide a complete example?
Thanks.

license?

Hello! Nice implementation. Is this under the BSD 3-Clause license?

How to ignore the extra labels

Face detection is trained with three kinds of training data (non-face, face, part face) and two tasks, with classification labels 0, 1, and -1 respectively. How do you ignore the samples with classification label -1 when training the classification task, and how do you ignore the regression labels of non-face samples when training the regression task?

What is the format and content of the text files?

label_path = '../dataset/label.txt'

landmark_path = '../dataset/landmark.txt'

regression_box_path = '../dataset/regression_box.txt'

crop_image_path = '../dataset/crop_image.txt'
Could you post a sample? That way I could follow your pipeline to train.

Which version of Caffe is used?

When I run the detection on my PC (Windows), I find that some functions such as Forward() do not have the same definitions as in my Caffe.
It gives these errors:
Error 11 error C2661: 'caffe::Net::Net' : no overloaded function takes 2 arguments D:\my programs13.0\MTCNN_foreverYoung\detection\MTCNN.cpp 30 1 MTCNN
Error 26 error C2661: 'caffe::Net::Forward' : no overloaded function takes 0 arguments D:\my programs13.0\MTCNN_foreverYoung\detection\MTCNN.cpp 346 1 MTCNN
Error 27 error C2661: 'caffe::Net::Forward' : no overloaded function takes 0 arguments D:\my programs13.0\MTCNN_foreverYoung\detection\MTCNN.cpp 382 1 MTCNN
Error 12 error C2660: 'std::shared_ptr<caffe::Net>::reset' : function does not take 1 arguments D:\my programs13.0\MTCNN_foreverYoung\detection\MTCNN.cpp 30 1 MTCNN

about the label

You give an example of the labels for the true, part, and negative face images: true is 1, part is 1, neg is 0.
Why do true and part have the same label? Some other programmers set part to -1.
Is it practical to train models this way?

regression box bug?

            bbox.height = bounding_box_[j].height + regression_box_temp_[4*j+2] * bounding_box_[j].height;
            bbox.width = bounding_box_[j].width + regression_box_temp_[4*j+3] * bounding_box_[j].width;

Looking at the MATLAB code, the regression box values should correspond to the two corner coordinates x1, y1 and x2, y2, not to height and width. This makes the final bbox positions slightly off; after correcting it, the results are better.
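A self-contained sketch of the correction described above (all names illustrative, not the repository's actual types): the four regression outputs shift the corners x1, y1, x2, y2, each scaled by the box width or height as in the reference MATLAB bbreg, and the refined width/height are derived from the shifted corners:

    // Illustrative box type; the repository's own bounding-box struct may differ.
    struct Box { float x, y, width, height; };

    Box RefineBox(const Box& b, const float reg[4]) {
        float x1 = b.x            + reg[0] * b.width;   // shift left edge
        float y1 = b.y            + reg[1] * b.height;  // shift top edge
        float x2 = b.x + b.width  + reg[2] * b.width;   // shift right edge
        float y2 = b.y + b.height + reg[3] * b.height;  // shift bottom edge
        return Box{x1, y1, x2 - x1, y2 - y1};
    }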

Training

Cool project! I'm looking at using either this project or your Cascade CNN detector as a general object detector. I had a couple of questions; hoping you can help :)

  1. How fast is MTCNN on CPU?
  2. Is it possible to upload your training scripts? Or samples of the input for the lmdb databases that Caffe needs? I'll need to create my own training data -- not quite sure how to do it.

Thanks!

Error when running MTCNN

Thanks for your work!
I used cmake to compile the code.
But when I run the executable file MTCNN, I got this backtrace:

*** Error in `./MTCNN': free(): invalid next size (fast): 0x00000000023e9bd0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f49b12f17e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7fe0a)[0x7f49b12f9e0a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f49b12fd98c]
./MTCNN[0x416950]
./MTCNN[0x414d54]
./MTCNN[0x411edc]
./MTCNN[0x40eea1]
./MTCNN[0x40c5b9]
./MTCNN[0x407af5]
./MTCNN[0x407191]
./MTCNN[0x40638b]
./MTCNN[0x40429d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f49b129a830]
./MTCNN[0x403e49]
How can I deal with it?

O-Net error?

Hello! I am trying to get it running under Windows; it runs normally and then closes unexpectedly when entering the O-Net phase, using the MTCNN::detection_TEST function. Is anyone experiencing the same? Or do you know what could be going wrong, @foreverYoungGitHub? I will try to debug it later :)
cheers!
