foreverYoungGitHub / mtcnn
Repository for "Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Neural Networks", implemented with Caffe, C++ interface.
When processing the data, my final format is as follows (each line: image path, 1 label, 4 box values, 10 landmark values):
pos1.jpg 1 0.1 0.2 0.3 0.4 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
neg1.jpg 0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
part1.jpg -1 0.1 0.2 0.3 0.4 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1
landmark1.jpg -1 -1 -1 -1 -1 0.1 0.3 0.3 0.4 0.5 0.5 0.3 0.2 0.7 0.7
When computing the landmark loss, how do you ignore the other sample types? Did you rewrite the loss layer? Looking at the prototxt, you do not seem to have rewritten it, or did I miss something earlier? Thanks.
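For reference, one common way to ignore non-landmark samples in a Euclidean landmark loss (a minimal numpy sketch of the idea, not necessarily what this repo does) is to mask out rows whose landmark targets are all -1:

```python
import numpy as np

def masked_landmark_loss(pred, target, ignore_value=-1.0):
    """Euclidean landmark loss that skips samples whose target row is
    entirely filled with the ignore value (pos/neg/part samples).

    pred, target: (N, 10) arrays of landmark coordinates.
    """
    valid = ~np.all(target == ignore_value, axis=1)  # (N,) bool mask
    if not valid.any():
        return 0.0
    diff = pred[valid] - target[valid]
    # Mean over valid samples of the squared L2 distance, halved
    # (matching Caffe's EuclideanLoss convention).
    return float(np.mean(np.sum(diff ** 2, axis=1)) / 2.0)
```

In Caffe this masking would typically live in a custom loss layer or be emulated by zeroing the diff for invalid samples.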
Hi:
When I use the CelebA database to train the P_Net network, the following error occurs:
I0520 22:22:14.954217 3971 net.cpp:84] Creating Layer loss_label
I0520 22:22:14.954236 3971 net.cpp:406] loss_label <- conv4-1
I0520 22:22:14.954241 3971 net.cpp:406] loss_label <- label
I0520 22:22:14.954246 3971 net.cpp:380] loss_label -> loss_label
I0520 22:22:14.954262 3971 layer_factory.hpp:77] Creating layer loss_label
F0520 22:22:14.954509 3971 softmax_loss_layer.cpp:47] Check failed: outer_num_ * inner_num_ == bottom[1]->count() (4900 vs. 100) Number of labels must match number of predictions; e.g., if softmax axis == 1 and prediction shape is (N, C, H, W), label count (number of labels) must be NHW, with integer values in {0, 1, ..., C-1}.
*** Check failure stack trace: ***
@ 0x7fefd7991daa (unknown)
@ 0x7fefd7991ce4 (unknown)
@ 0x7fefd79916e6 (unknown)
@ 0x7fefd7994687 (unknown)
@ 0x7fefd804eee0 caffe::SoftmaxWithLossLayer<>::Reshape()
@ 0x7fefd7fc1bd5 caffe::Net<>::Init()
@ 0x7fefd7fc3ad2 caffe::Net<>::Net()
@ 0x7fefd810e0d0 caffe::Solver<>::InitTrainNet()
@ 0x7fefd810f023 caffe::Solver<>::Init()
@ 0x7fefd810f2ff caffe::Solver<>::Solver()
@ 0x7fefd7fa2a31 caffe::Creator_SGDSolver<>()
@ 0x40ee6e caffe::SolverRegistry<>::CreateSolver()
@ 0x407efd train()
@ 0x40590c main
@ 0x7fefd699af45 (unknown)
@ 0x40617b (unknown)
@ (nil) (unknown)
Aborted (core dumped)
label.txt contains only the labels of the positive samples.
Looking forward to your reply.
Hi, I have used your code, selecting CPU or GPU via a macro for testing (not training), but I find the two perform nearly the same.
By the way, can I use your model directly for multi-face detection, without training?
Thank you!
I have trained your code many times, but the loss of the landmark task does not converge; I don't know what is wrong. When I train only face classification and bounding-box regression, the losses of both tasks decrease. Why?
hi!
Good job! I noticed in the training prototxt that the ROI and landmark regression losses are Euclidean. How far is the accuracy of the final trained model from the original paper? Thanks!
I found there is a gap between your results and the author's. What causes the lower accuracy?
labels = np.concatenate((label, regression_box, landmark), axis=1)
with h5py.File(train_file_path, 'w') as f:
    f['data'] = a
    f['labels'] = labels
    f['regression'] = regression_box
    f['landmark'] = landmark
There is no need for f['regression'] = regression_box and f['landmark'] = landmark, because labels = np.concatenate((label, regression_box, landmark), axis=1) already contains the regression and landmark data.
Is that right?
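If only the concatenated dataset is kept, consumers can slice the 15 columns back apart. A numpy sketch of the assumed column layout (1 class label + 4 box values + 10 landmarks; the helper name is illustrative, not from the repo):

```python
import numpy as np

def split_labels(labels):
    """Split the concatenated (N, 15) label matrix back into its parts,
    assuming the column layout: 1 class label, 4 regression values,
    10 landmark coordinates."""
    label = labels[:, 0:1]
    regression_box = labels[:, 1:5]
    landmark = labels[:, 5:15]
    return label, regression_box, landmark
```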
Hi Liuyang, many thanks. But how are the face-regression training labels calculated? You give face regression: [0.1, 0.1, 0.1, 0.1]; what is the calculation method that produces these values?
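For context, the MTCNN paper defines the bounding-box regression target as the offset of the ground-truth box from the cropped candidate window, normalized by the crop size. A sketch under that assumption (corner format, function name illustrative):

```python
def bbox_regression_target(crop, gt):
    """Bounding-box regression labels for a square candidate crop,
    following the normalized-offset convention of the MTCNN paper:
    each target is the offset of a ground-truth corner from the
    corresponding crop corner, divided by the crop's side length.

    crop, gt: boxes in (x1, y1, x2, y2) corner format.
    """
    cx1, cy1, cx2, cy2 = crop
    gx1, gy1, gx2, gy2 = gt
    w = cx2 - cx1
    h = cy2 - cy1
    return ((gx1 - cx1) / w, (gy1 - cy1) / h,
            (gx2 - cx2) / w, (gy2 - cy2) / h)
```

For example, a 10x10 crop whose ground truth is shifted by 1 pixel in each direction yields exactly (0.1, 0.1, 0.1, 0.1).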
Hello, I would like to ask: if I want to run this on a machine with multiple GPU cards, do I need to set any parameters?
Dear author, can you provide a link to your prepared training data on Baidu Pan? Thanks.
@foreverYoungGitHub Thanks for sharing the code. Can it be used to train an MTCNN model?
Hello! I would like to ask how the PR curve you provide was produced. I took the bboxes and scores output by O-Net and ran them through the WIDER FACE eval tools, and the results were very poor... So how should the bbox and score be selected?
@foreverYoungGitHub Hi, my dear Liuyang, based on your latest reply: if an image is H*W, the ground truth is (x0, y0, x1, y1) with (x0, y0) its top-left point, and a bounding box (x2, y2, x3, y3) has IoU with the ground truth > 0.65, we consider it a positive example. Is the training label then
((x0-x2)/W, (y0-y2)/H, (x1-x3)/W, (y1-y3)/H)?
But if this is right, the training labels are near 0, and we also set the negative regression values to (0, 0, 0, 0). Is that OK? I mean, both negative and positive samples' regression values are near 0.
@foreverYoungGitHub Hi Yang, what is your training label format: x, y, w, h or x1, y1, x2, y2?
How many training examples do you use? I extract 6 non-face patches, 2 part-face patches, and 2 positive patches per image, so I get about 1 billion training samples in total. Is this too many?
Hi ,
I wonder what the format of label.txt, landmark.txt, regression_box.txt, and crop_image.txt is.
Could you kindly tell me?
Thank you very much!
When I try to run this code in the GPU version, it always shows errors like this (even when I try on a different GPU server).
...
}
I1030 17:39:38.964071 13238 layer_factory.hpp:77] Creating layer input
I1030 17:39:38.964112 13238 net.cpp:84] Creating Layer input
I1030 17:39:38.964131 13238 net.cpp:380] input -> data
I1030 17:39:38.965209 13238 net.cpp:122] Setting up input
I1030 17:39:38.966150 13238 net.cpp:129] Top shape: 1 3 12 12 (432)
I1030 17:39:38.966163 13238 net.cpp:137] Memory required for data: 1728
I1030 17:39:38.966179 13238 layer_factory.hpp:77] Creating layer conv1
I1030 17:39:38.966215 13238 net.cpp:84] Creating Layer conv1
I1030 17:39:38.966230 13238 net.cpp:406] conv1 <- data
I1030 17:39:38.966775 13238 net.cpp:380] conv1 -> conv1
Segmentation fault
Hello:
How did you generate, from the CelebA database, the files used in your generate_hdf5.py:
label_path = '../dataset/label.txt'
landmark_path = '../dataset/landmark.txt'
regression_box_path = '../dataset/regression_box.txt'
crop_image_path = '../dataset/crop_image.txt'
train_file_path = '../dataset/train_24.hd5'
I am a Caffe beginner; looking forward to your reply. Thanks!
According to the MTCNN paper, "some of the losses are not used" for certain sample types. For example, for a negative example no bounding box or landmark points are regressed, so the regression loss and landmark loss are not used. Can I know which part of your code does that? Thanks.
Sorry to bother you. I have trained the three models det1, det2, and det3 separately, but the detection performance is not as good as yours; the landmark locations are especially inaccurate.
I crop the images and then generate HDF5 files of the different sizes 12x12, 24x24, and 48x48,
and then train det1, det2, and det3 with the corresponding 12x12, 24x24, and 48x48 HDF5 files.
Am I wrong? Should I train the det2 model based on the trained det1 model?
Can I know if the code does face alignment after detection?
If so, where does the face alignment code start?
Hi @foreverYoungGitHub, in your reply: "For example, positive : part : negative = 1 : 3 : 3 at the beginning, while it will change to positive : part : negative = 1 : 5 : 3 in the next iteration."
Do you mean that within one training run we need to change the data proportion of every batch? Or do you mean we use 1:3:3 to train a model A.caffemodel, then use this model to generate training data with the proportion set to 1:5:3, and fine-tune on A.caffemodel to get B.caffemodel?
In MTCNN.cpp, could you tell me how bounding_box_ is calculated?
MTCNN detection is slow on mobile devices with CPU only: a 720p image takes about 500 ms on average (Samsung S7 Edge).
Maybe we could implement a batched version to make it faster?
I am using the caffe-1.0 release version, but a linking error occurs.
Which Caffe version did you use?
@foreverYoungGitHub
CMakeFiles/MTCNN.dir/MTCNN.cpp.o: In function `MTCNN::MTCNN(std::vector<std::string, std::allocator<std::string> >, std::vector<std::string, std::allocator<std::string> >)':
MTCNN.cpp:(.text+0x6df): undefined reference to `caffe::Net::Net(std::string const&, caffe::Phase, int, std::vector<std::string, std::allocator<std::string> > const*)'
collect2: error: ld returned 1 exit status
make[2]: *** [MTCNN] Error 1
make[1]: *** [CMakeFiles/MTCNN.dir/all] Error 2
make: *** [all] Error 2
Thanks for your code. Could you tell me how to implement face alignment?
Thank you.
Based on your sentence in README "You can also train the face detection and regression for the dataset without landmark label. The model is then used to train the face landmark."
Can we fine-tune the ONet using only landmark data?
Thanks
Hi Liuyang, could you give us a sample of the labels for the three types of images?
In celeba_crop.cpp,
char *cstr = new char[path[i].length() + 1];
is allocated but never deleted, which leaks memory; using std::string or std::vector<char> instead would avoid the leak.
For trump.jpg, compared with your result there is a small offset in the face-alignment phase when main.cpp calls MTCNN.detection_TEST(). Could you tell me how I can improve the face-alignment precision?
I want to train MTCNN on my own dataset. Can you please give me some hints on how to do that? For example:
@foreverYoungGitHub Could you share the code that generates the training data?
Hi:
I use the detection code to detect and align faces, but I get an error, and the code only outputs
the face detection result, not the alignment result.
Can you provide a complete example?
Thanks
Hello! Nice implementation. Is this under the BSD 3-Clause license?
Face-detection training uses three kinds of training data and two tasks: non-face, face, and part-face, with classification labels 0, 1, and -1 respectively. How do you ignore the -1 classification labels during classification training, and how do you ignore the regression labels of the non-face samples during regression training?
label_path = '../dataset/label.txt'
landmark_path = '../dataset/landmark.txt'
regression_box_path = '../dataset/regression_box.txt'
crop_image_path = '../dataset/crop_image.txt'
Could you send a sample? I think with one I could train following your pipeline.
When I run the detection on my PC (Windows),
I find that some functions, such as Forward(), do not have the same definitions as in my Caffe,
giving these errors:
Error 11 error C2661: 'caffe::Net::Net' : no overloaded function takes 2 arguments D:\my programs13.0\MTCNN_foreverYoung\detection\MTCNN.cpp 30 1 MTCNN
Error 26 error C2661: 'caffe::Net::Forward' : no overloaded function takes 0 arguments D:\my programs13.0\MTCNN_foreverYoung\detection\MTCNN.cpp 346 1 MTCNN
Error 27 error C2661: 'caffe::Net::Forward' : no overloaded function takes 0 arguments D:\my programs13.0\MTCNN_foreverYoung\detection\MTCNN.cpp 382 1 MTCNN
Error 12 error C2660: 'std::shared_ptr<caffe::Net>::reset' : function does not take 1 arguments D:\my programs13.0\MTCNN_foreverYoung\detection\MTCNN.cpp 30 1 MTCNN
According to your description, a 1080p image is processed in 0.179 s.
Which graphics card was used?
You give an example of labels for positive, part, and negative face images: positive is 1, part is 1, negative is 0.
Why do positive and part have the same label? Some other implementations set part to -1.
Is it practical to train models this way?
As the title says: I cloned the repo without modification, but on trump.jpg I cannot reproduce your accuracy. I would like to know why.
Thank you!
@foreverYoungGitHub Hi, thanks for your work. Could you show us an example of a training image's label?
Hi @foreverYoungGitHub, what is the proportion of the three types of training data? positive : part : negative = ? : ? : ?
bbox.height = bounding_box_[j].height + regression_box_temp_[4*j+2] * bounding_box_[j].height;
bbox.width = bounding_box_[j].width + regression_box_temp_[4*j+3] * bounding_box_[j].width;
Looking at the Matlab code, the regression box should correspond to the two coordinate pairs (x1, y1) and (x2, y2), not to height and width. This makes the final bbox position slightly off; after correcting it, the results are better.
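The correction described above, applied at inference time, would look roughly like this (a sketch assuming the four regression outputs are corner offsets scaled by box width/height; the function name is illustrative, not the repo's):

```python
def refine_box(x1, y1, x2, y2, dx1, dy1, dx2, dy2):
    """Apply the four regression outputs as offsets to the two corner
    coordinates, scaled by box width/height, rather than to height and
    width, matching the original Matlab implementation's convention."""
    w = x2 - x1
    h = y2 - y1
    return (x1 + dx1 * w, y1 + dy1 * h,
            x2 + dx2 * w, y2 + dy2 * h)
```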
Cool project! I'm looking at using either this project or your Cascade CNN detector for a general object detector. I have a couple of questions, hoping you can help :)
Thanks!
Thanks for your work!
I used cmake to compile the code,
but when I run the executable MTCNN, I get this backtrace:
*** Error in `./MTCNN': free(): invalid next size (fast): 0x00000000023e9bd0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f49b12f17e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7fe0a)[0x7f49b12f9e0a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f49b12fd98c]
./MTCNN[0x416950]
./MTCNN[0x414d54]
./MTCNN[0x411edc]
./MTCNN[0x40eea1]
./MTCNN[0x40c5b9]
./MTCNN[0x407af5]
./MTCNN[0x407191]
./MTCNN[0x40638b]
./MTCNN[0x40429d]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f49b129a830]
./MTCNN[0x403e49]
How could I deal with it ?
Hello! I am trying to get it running under Windows; it runs normally and then closes unexpectedly when entering the O-Net phase, using the MTCNN::detection_TEST function. Is anyone experiencing the same, or do you know what could be going wrong, @foreverYoungGitHub? I will try to debug it later :)
cheers!
At test time, why do you transpose the sample_float image in preprocessing? I cannot make sense of it.
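A likely reason (an assumption, not confirmed by the author): the reference MTCNN models were trained in Matlab, which stores images column-major, so C++/OpenCV ports commonly transpose the row-major image to match, and transpose the predicted coordinates back afterwards. A numpy sketch of the axis swap:

```python
import numpy as np

# OpenCV loads images as (H, W, C) row-major; the Matlab-trained
# models expect the spatial axes in the opposite order, so the image
# is transposed to (W, H, C) before the forward pass.
img = np.zeros((720, 1280, 3), dtype=np.float32)   # OpenCV-style H x W x C
transposed = np.transpose(img, (1, 0, 2))          # W x H x C, Matlab-style
```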