sfzhang15 / faceboxes
FaceBoxes: A CPU Real-time Face Detector with High Accuracy, IJCB, 2017
License: Apache License 2.0
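Because of GPU limitations, I can only use a smaller batch_size. In that case, how should I adjust the number of iterations and the learning rate? Could you share some experience?
This network design is not friendly to mobile devices: the first few layers account for 80% of the runtime. It runs very slowly with ncnn, possibly because ncnn does not optimize 7x7 convolutions well.
OpenCV supports all of the added layers: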
opencv\modules\dnn\src\layers\permute_layer.cpp
opencv\modules\dnn\src\layers\detection_output_layer.cpp
opencv\modules\dnn\src\layers\prior_box_layer.cpp
However, I found that the parameters of prior_box_layer.cpp differ between your implementation and OpenCV's. The error log printed is:
prior_box_param {
min_size: 32
min_size: 64
min_size: 128
clip: false
variance: 0.1
variance: 0.1
variance: 0.2
variance: 0.2
step: 32
offset: 0.5
}
Only one min_size is allowed inside prior_box_param, so parsing fails at this point.
@sfzhang15 HI
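I noticed that at test time:
What determines whether the image is scaled or not?
In FaceBoxes, a proportional 3x upscaling like this clearly improves detection. So why is the same upscaling not done in S3FD?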
Hi Dr. Zhang,
I have built the latest BVLC caffe on Windows 10 and VS2015, and I am trying to run your code draw_net.py on Python 3.5.
The input_net_proto_file is:
models\faceboxes\deploy.prototxt
The model file is downloaded from your tutorials link:
faceboxes.caffemodel
I am experiencing the following issue:
Traceback (most recent call last):
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.3.3\helpers\pydev\pydevd.py", line 1741, in <module>
main()
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.3.3\helpers\pydev\pydevd.py", line 1735, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.3.3\helpers\pydev\pydevd.py", line 1135, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.3.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "D:/code_cv/face_detection/FaceBoxes/python/draw_net.py", line 66, in <module>
main()
File "D:/code_cv/face_detection/FaceBoxes/python/draw_net.py", line 52, in main
text_format.Merge(open(args.input_net_proto_file).read(), net)
File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 525, in Merge
descriptor_pool=descriptor_pool)
File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 579, in MergeLines
return parser.MergeLines(lines, message)
File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 612, in MergeLines
self._ParseOrMerge(lines, message)
File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 627, in _ParseOrMerge
self._MergeField(tokenizer, message)
File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 727, in _MergeField
merger(tokenizer, message, field)
File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 815, in _MergeMessageField
self._MergeField(tokenizer, sub_message)
File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 695, in _MergeField
(message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 1378:3 : Message type "caffe.LayerParameter" has no field named "permute_param".
It seems that the caffe version I am using doesn't match the version you are using.
What is the caffe version you are using?
Do you have any other idea to fix this issue?
Thanks and Best Regards,
Ardeal
multibox_loss_layer.cpp:139] Check failed: num_priors_ * loc_classes_ * 4 == bottom[0]->channels() (13568 vs. 87296) Number of priors must match number of location predictions.
I figured it out: my Caffe build does not have this .cpp file. I will add it, recompile, and try again.
@sfzhang15
HI
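A model like the one in the code is one you designed yourself, so presumably there is no pretrained model to build on (in the traditional sense of a model trained on the ImageNet classification dataset). In other words, the training here is what is called training from scratch, right?
We are trying to transfer FaceBoxes to some other applications. However, we found that the algorithm runs at 25 fps on an NVIDIA GTX 1080 Ti, far below the paper, which reported about 100 fps. What's wrong? Could you give an explanation?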
Hi,
Could the code be run on Windows 10 with VS 2015 or VS 2017?
What versions of Windows and VS are you using to run the code?
Thanks and Best Regards,
Ardeal
hi
I want to use your train.sh command to train on my own data with your project, but I do not know how to add my data. Could you tell me what I should do? Thank you very much.
Can you provide the performance of FaceBoxes on WIDERFACE val?
Thanks for sharing another awesome work!
I'm wondering if you have compared this new work with the previous S3FD under similar computational cost. From my own test, I tried S3FD (320x180) vs FaceBoxes (960x540), both have similar speed on my device (not using CPU though), but S3FD still performs better on my test images.
Is this expected, or is there a reason behind it?
Is it possible to detect faces in batches of images?
In your train.prototxt, the height and width are (1024, 1024), but in your deploy.prototxt they are (640, 480).
What is the impact of different height and width?
Hi,
I checked all your source code, but I didn't find out the main function of your source code.
Could you please kindly tell me where the main function of faceboxes is?
Thanks and Best Regards,
Ardeal
conda install caffe-gpu
make all -j && make py
log:
compilation terminated.
make: *** [.build_release/src/caffe/layers/base_data_layer.o] Error 1
In file included from ./include/caffe/util/cudnn.hpp:8:0,
from ./include/caffe/util/device_alternate.hpp:40,
from ./include/caffe/common.hpp:19,
from src/caffe/data_reader.cpp:6:
.build_release/src/caffe/proto/caffe.pb.h:10:40: fatal error: google/protobuf/port_def.inc: No such file or directory
#include <google/protobuf/port_def.inc>
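Hello, I have been using PyTorch and am not very familiar with how prototxt files are generated in Caffe. Is there a quick way to generate them? For FaceBoxes, for example, the layers are all new, so I cannot simply adapt an existing network. :)
Why can't I compile this project on Linux, when I can compile the Caffe project? I compared the Makefile and Makefile.config with Caffe's, and many things have been changed in the Makefile. What do the changes mean? How do I compile in CPU-only mode? I don't have a GPU.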
@sfzhang15 HI
Section 3.4 of the paper says:
filter out these face boxes whose height or width is less than 20 pixels
Section 3.2.1 of FAN says something similar:
Besides, we calculate the statistics from the WiderFace train set based on the ground-truth face size. As Figure 3 shows, more than 80% faces have an object scale from 16 to 406 pixel. Faces with small size lack sufficient resolution and therefore it may not be a good choice to include in the training data.
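My question: removing faces smaller than 20/16 pixels directly from the training set makes the model insensitive to such faces, so at detection time it will surely perform poorly on them. Doesn't that directly hurt the model's detection performance? Is my understanding correct?
In addition, the authors of FDNet do exactly the opposite: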
As WIDER FACE dataset contain many extremely tinny faces (<16 pixels width/height), we keep these small proposals (<16 pixels width/height) valid in the training and testing time [13]. The experiments show that our method can achieve better performance.
That amounts to keeping all the small faces.
Hello, I used the following two commands to generate my own LMDB files:
./data/WIDER_FACE/create_list.sh
./data/WIDER_FACE/create_data.sh
However, they do not run successfully!
Are there detailed instructions on how to configure these two files?
While training Faceboxes model, I noticed that the output is strange:
I0316 10:16:10.963383 6670 solver.cpp:243] Iteration 30450, loss = 3.36783
I0316 10:16:10.963630 6670 solver.cpp:259] Train net output #0: mbox_loss = 3.34174 (* 1 = 3.34174 loss)
I0316 10:16:13.457460 6670 sgd_solver.cpp:151] Iteration 30450, lr = 0.0052975
I0316 10:18:44.811120 6670 solver.cpp:243] Iteration 30500, loss = 3.35862
I0316 10:18:44.812209 6670 solver.cpp:259] Train net output #0: mbox_loss = 4.07279 (* 1 = 4.07279 loss)
I0316 10:18:44.812288 6670 sgd_solver.cpp:151] Iteration 30500, lr = 0.005275
I0316 10:21:14.092057 6670 solver.cpp:243] Iteration 30550, loss = 3.36363
I0316 10:21:14.092276 6670 solver.cpp:259] Train net output #0: mbox_loss = 3.34809 (* 1 = 3.34809 loss)
I0316 10:21:14.092325 6670 sgd_solver.cpp:151] Iteration 30550, lr = 0.0052525
I0316 10:23:43.217934 6670 solver.cpp:243] Iteration 30600, loss = 3.35409
Why isn't there classification loss but only localization loss?
I looked into solver.cpp; the code is as below:
if (display) {
  LOG_IF(INFO, Caffe::root_solver()) << "Iteration " << iter_
      << ", loss = " << smoothed_loss_;
  const vector<Blob<Dtype>*>& result = net_->output_blobs();
  int score_index = 0;
  for (int j = 0; j < result.size(); ++j) {
    const Dtype* result_vec = result[j]->cpu_data();
    const string& output_name =
        net_->blob_names()[net_->output_blob_indices()[j]];
    const Dtype loss_weight =
        net_->blob_loss_weights()[net_->output_blob_indices()[j]];
    for (int k = 0; k < result[j]->count(); ++k) {
      ostringstream loss_msg_stream;
      if (loss_weight) {
        loss_msg_stream << " (* " << loss_weight
                        << " = " << loss_weight * result_vec[k] << " loss)";
      }
      LOG_IF(INFO, Caffe::root_solver()) << " Train net output #"
          << score_index++ << ": " << output_name << " = "
          << result_vec[k] << loss_msg_stream.str();
    }
  }
}
The for loop iterates over all the output losses and prints them. If so, why isn't there a classification loss?
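A possible explanation, assuming this repository's MultiBoxLoss layer behaves like the upstream SSD one: that layer exposes a single output blob, mbox_loss, which already sums the localization and confidence terms, so the loop above has only one value to print. A Python sketch of that combination (the function and argument names are illustrative, not taken from the code base):

```python
def multibox_loss(loc_loss, conf_loss, normalizer, loc_weight=1.0):
    """Combine both terms into the single value reported as mbox_loss.

    loc_loss / conf_loss: summed localization and confidence losses
    normalizer: e.g. the number of matched priors (assumed > 0 here)
    """
    if normalizer <= 0:
        return 0.0
    return (loc_weight * loc_loss + conf_loss) / normalizer


# Example: localization 10.0 and confidence 30.0 over 8 matched priors.
print(multibox_loss(10.0, 30.0, 8))  # 5.0
```

Under this reading, a separate classification loss never appears in the log because it is folded into mbox_loss.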
Hi,
Great job. Its CPU real-time speed is really fast. I wonder how to add facial landmark detection to your net.
Do you have any schedule to release this kind of work?
Thanks a lot.
Hi Dr. Zhang,
I tested 10 images on your CPU model. The time needed for the algorithm seems to be much longer than that in your paper:
-----0th images, image width=640, image height=480
-----0th images, Face Rect width=135, Face Rect height=160
-----0th images, time for **FD = 273.279480ms**, FL = 3.838500ms
-----1th images, image width=640, image height=426
-----1th images, Face Rect width=243, Face Rect height=335
-----1th images, time for **FD = 120.471107ms,** FL = 3.123300ms
-----2th images, image width=570, image height=856
-----2th images, Face Rect width=318, Face Rect height=428
-----2th images, time for **FD = 119.700096ms**, FL = 3.967700ms
-----3th images, image width=640, image height=480
-----3th images, Face Rect width=195, Face Rect height=282
-----3th images, time for **FD = 159.247803ms,** FL = 3.166000ms
-----4th images, image width=640, image height=632
-----4th images, Face Rect width=381, Face Rect height=497
-----4th images, time for **FD = 132.707703ms**, FL = 3.671800ms
-----5th images, image width=640, image height=677
-----5th images, Face Rect width=457, Face Rect height=574
-----5th images, time for **FD = 135.079193ms,** FL = 4.643900ms
-----6th images, image width=640, image height=442
-----6th images, Face Rect width=254, Face Rect height=349
-----6th images, time for **FD = 154.615097ms,** FL = 3.570300ms
-----7th images, image width=640, image height=563
-----7th images, Face Rect width=298, Face Rect height=415
-----7th images, time for **FD = 296.186584ms**, FL = 5.205200ms
-----8th images, image width=640, image height=480
-----8th images, Face Rect width=258, Face Rect height=326
-----8th images, time for **FD = 188.629303ms**, FL = 3.470700ms
-----9th images, image width=224, image height=224
-----9th images, Face Rect width=111, Face Rect height=167
-----9th images, time for **FD = 33.309402ms,** FL = 7.112700ms
For the 640*480 images, the time needed is around 150ms. Do you have any idea about the reason why the time needed is much longer?
Do you have a trained GPU model?
Thanks and Best Regards,
Ardeal
Why does inception_a3_concat_mbox_loc output 84 dimensions, while conv3_2_mbox_loc and conv4_2_mbox_loc output 4 dimensions? Thank you!!!
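One way to read these channel counts, assuming the anchor densification described in the FaceBoxes paper (32-pixel anchors tiled 4x4 per cell, 64-pixel anchors tiled 2x2, 128-pixel anchors placed once, and a single anchor size on each of the two later layers): every anchor needs 4 box offsets, so the loc channels are 4 times the anchors per cell. The constants and names below are illustrative and are not read from the prototxt.

```python
# Anchor size -> density; density d means the anchor is tiled d x d per cell.
# These values are assumptions taken from the paper's description.
INCEPTION3_DENSITY = {32: 4, 64: 2, 128: 1}
SINGLE_ANCHOR_DENSITY = {256: 1}  # stands in for conv3_2 or conv4_2


def loc_channels(density_by_size):
    """Number of loc output channels: 4 offsets per anchor per cell."""
    anchors_per_cell = sum(d * d for d in density_by_size.values())
    return 4 * anchors_per_cell


print(loc_channels(INCEPTION3_DENSITY))     # 84  (16 + 4 + 1 = 21 anchors)
print(loc_channels(SINGLE_ANCHOR_DENSITY))  # 4
```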
Hi,
I tried to run the demo.py file in python, but I got the following error:
F0311 15:58:18.148229 6000 common.cpp:76] Cannot use GPU in CPU-only Caffe: check mode.
However, I have already switched the mode to CPU-only. Should I modify some other code to disable GPU mode and enable CPU-only mode?
By the way, the faceboxes model was downloaded from the page where you published your source code:
https://github.com/sfzhang15/FaceBoxes
os.chdir(caffe_root)
sys.path.insert(0, 'python')
import caffe
caffe.set_device(0)
# caffe.set_mode_gpu()
caffe.set_mode_cpu()
model_def = 'D:/code_cv/face_detection/FaceBoxes/models/faceboxes/deploy.prototxt'
model_weights = 'D:/code_cv/face_detection/FaceBoxes/models/faceboxes/faceboxes.caffemodel'
net = caffe.Net(model_def, model_weights, caffe.TEST)
image = caffe.io.load_image('examples/images/1.jpg')
Thanks and Best Regards,
Ardeal
Hi,
I want to use this method to detect people. The aspect ratio of a person is generally not 1:1. In my initial test, I did not modify the aspect_ratio in your train.prototxt. This may be a potential problem.
First, after training for about 15,000 steps, the loss has not dropped and stays around 6. Is this normal, or have I trained for too few steps?
Thank you very much. -.-
I cannot download the trained model because of my internet connection. Please put it on BaiduYun. Thank you very much!
First of all, thank you very much for your work; it is great.
I have the following two questions about your model and hope you can answer them:
Hi,
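What ranges do the output score and the bbox regression values roughly fall in?
THX.
Hello, I tried to change your network into a multi-class structure, but the detection results contain many spurious boxes. What needs to be changed to make it multi-class?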
Hi Dr. Zhang,
A few days ago, I could run the faceboxes model on pycaffe correctly. The face detection results of the algorithm were good as well.
The code runs correctly, but the output of the algorithm seems weird:
The only difference is that I use C++ to call the model.
Why is the output for some images incorrect?
Do you have any idea about the issue?
The AP is above 95%.
What about the recall, for example the recall on faces larger than 30 pixels?
Hello! I reproduced the work on my computer and trained a model. When I tried to load the Caffe model from C++ with OpenCV, I got this error:
[libprotobuf ERROR /home/chan/software/opencv-3.3.0/3rdparty/protobuf/src/google/protobuf/text_format.cc:298] Error parsing text-format caffe.NetParameter: 1761:15: Message type "caffe.PriorBoxParameter" has no field named "fixed_size".
OpenCV Error: Unspecified error (FAILED: ReadProtoFromTextFile(param_file, param). Failed to parse NetParameter file: /home/chan/桌面/Face_demo/demo/faceboxes_deploy.prototxt) in ReadNetParamsFromTextFileOrDie, file /home/chan/software/opencv-3.3.0/modules/dnn/src/caffe/caffe_io.cpp, line 1137
Is OpenCV's dnn module incompatible with this Caffe version?
I am running into errors with fb_net = caffe.Net('deploy.prototxt', 'faceboxes.caffemodel', caffe.TEST).
When I use the BVLC/caffe master release (1.0):
[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 1378:17: Message type "caffe.LayerParameter" has no field named "permute_param".
F0213 01:06:12.385599 1 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: deploy.prototxt
*** Check failure stack trace: ***
To avoid this issue, I've tried caffe-ssd and have encountered:
I0213 02:06:34.258400 61 net.cpp:165] Memory required for data: 51944960
I0213 02:06:34.258419 61 layer_factory.hpp:77] Creating layer detection_out
I0213 02:06:34.258469 61 net.cpp:100] Creating Layer detection_out
I0213 02:06:34.258489 61 net.cpp:434] detection_out <- mbox_loc
I0213 02:06:34.258509 61 net.cpp:434] detection_out <- mbox_conf_flatten
I0213 02:06:34.258527 61 net.cpp:434] detection_out <- mbox_priorbox
I0213 02:06:34.258569 61 net.cpp:408] detection_out -> detection_out
F0213 02:06:34.258677 61 detection_output_layer.cpp:164] Check failed: num_priors_ * num_loc_classes_ * 4 == bottom[0]->channels() (4000 vs. 25600) Number of priors must match number of location predictions.
*** Check failure stack trace: ***
Can anyone help with these issues?
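The two numbers in the failed check can be reproduced with a short calculation. This is a sketch under stated assumptions: detection layers at strides 32, 64 and 128, feature-map sizes of ceil(size / stride), and 21 densified anchors per cell on the first layer (16 + 4 + 1, from the paper) versus 3 when a PriorBox layer emits one box per min_size. None of these values are read from the prototxt.

```python
import math

STRIDES = (32, 64, 128)  # assumed strides of the three detection layers


def total_priors(width, height, first_layer_anchors):
    """Count priors over all layers; the two later layers have one anchor per cell."""
    per_cell = (first_layer_anchors, 1, 1)
    total = 0
    for stride, anchors in zip(STRIDES, per_cell):
        cells = math.ceil(width / stride) * math.ceil(height / stride)
        total += cells * anchors
    return total


for w, h in ((640, 480), (1024, 1024)):
    plain = 4 * total_priors(w, h, 3)       # one prior per min_size
    densified = 4 * total_priors(w, h, 21)  # densified anchors
    print((w, h), plain, densified)
# (640, 480) 4000 25600
# (1024, 1024) 13568 87296
```

The 640x480 row matches the "4000 vs. 25600" above, and the 1024x1024 row matches the "13568 vs. 87296" reported for multibox_loss_layer.cpp earlier on this page. If the assumptions hold, both failures point at a PriorBox layer that generates one prior per min_size and does not implement the densification the model's loc predictions expect.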
I have downloaded the WIDER FACE dataset, but there are no further instructions on how to convert it to LMDB.
Is there any preprocessing in that step, such as filtering out faces smaller than 20*20?
Hi,
Which LMDB package is used in your code? Could you please share a link to the LMDB source code or binary?
On Linux:
Could you please list the version of each needed package?
Thanks and Best Regards,
Ardeal
Caffe version: bias=true
PyTorch version: nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)
Why is the bias setting different?
When I test on the AFW data, the error happens, but it did not happen when I trained FaceBoxes.
When will the PyTorch version be released?
Hello, I retrained following your steps and found that the loss stops decreasing once it reaches about 4; it mostly stays in the 3.x to 4.x range. Did you see the same thing when you trained?
if im_scale != 1.0:
    image = cv2.resize(image, None, None, fx=im_scale, fy=im_scale, interpolation=cv2.INTER_LINEAR)
n = 10
net.blobs['data'].reshape(n, 3, image.shape[0], image.shape[1])
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.array([104, 117, 123])) # mean pixel
transformer.set_raw_scale('data', 255) # the reference model operates on images in [0,255] range instead of [0,1]
transformer.set_channel_swap('data', (2, 1, 0)) # the reference model has channels in BGR order instead of RGB
transformed_image = transformer.preprocess('data', image)
#net.blobs['data'].data[...] = transformed_image
imgb = []
for i in range(n): imgb.append(transformed_image)
net.blobs['data'].data[...] = imgb
detections = net.forward()['detection_out']
print(detections.shape)
Following demo.py, I made the changes above. The printed result is (1, 1, 800, 7) rather than (10, 1, 80, 7). You wrote:
Yes, it is. You can put some images in a batch to detect faces. The first dimension of the output is the batch index.
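This is inconsistent with your description above.
When the images in a batch contain different numbers of faces, the results cannot be told apart.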
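If the layer follows the upstream SSD DetectionOutput convention (an assumption that is not confirmed in this thread), the output has shape (1, 1, N, 7) and each row is [image_id, label, score, xmin, ymin, xmax, ymax], so the batch index would live in column 0 of each row rather than in the first dimension. A sketch of splitting such rows per image, using plain lists in place of the NumPy array:

```python
from collections import defaultdict


def split_by_image(rows, batch_size):
    """Group detection rows by their image_id (column 0).

    rows: iterable of 7-element rows, e.g. detections[0, 0].tolist()
    Returns one (possibly empty) list of [label, score, box...] per image.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[int(row[0])].append(list(row[1:]))
    return [grouped[i] for i in range(batch_size)]


# Made-up rows: two faces in image 0, none in image 1, one in image 2.
rows = [
    [0, 1, 0.99, 0.10, 0.10, 0.30, 0.40],
    [0, 1, 0.80, 0.50, 0.20, 0.70, 0.50],
    [2, 1, 0.95, 0.25, 0.25, 0.60, 0.70],
]
per_image = split_by_image(rows, batch_size=3)
print([len(faces) for faces in per_image])  # [2, 0, 1]
```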
hi, you said,
"We keep the overlapped part of the
face box if its center is in the above processed image,
then filter out these face boxes whose height or width
is less than 20 pixels." .
But in the project, I can't find where you filter out the small faces.
Could you point it out for me? Thanks.
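For reference, the rule in the quoted sentence can be sketched as follows. This illustrates the rule as described in the paper and is not code located in the repository; the function name and the (xmin, ymin, xmax, ymax) pixel layout are assumptions.

```python
MIN_FACE_SIZE = 20  # pixels, from the quoted sentence


def filter_boxes_after_crop(boxes, crop):
    """Keep boxes whose center lies in the crop, clip them, drop small ones.

    boxes: list of (xmin, ymin, xmax, ymax) in image pixels
    crop:  (xmin, ymin, xmax, ymax) of the cropped region
    Returns the surviving boxes in crop-relative coordinates.
    """
    cx0, cy0, cx1, cy1 = crop
    kept = []
    for x0, y0, x1, y1 in boxes:
        center_x, center_y = (x0 + x1) / 2, (y0 + y1) / 2
        if not (cx0 <= center_x <= cx1 and cy0 <= center_y <= cy1):
            continue  # center falls outside the crop
        # Keep only the part of the box that overlaps the crop.
        x0, y0 = max(x0, cx0), max(y0, cy0)
        x1, y1 = min(x1, cx1), min(y1, cy1)
        if x1 - x0 < MIN_FACE_SIZE or y1 - y0 < MIN_FACE_SIZE:
            continue  # height or width below 20 pixels
        kept.append((x0 - cx0, y0 - cy0, x1 - cx0, y1 - cy0))
    return kept


boxes = [(90, 90, 150, 150), (200, 200, 210, 260), (0, 0, 50, 50)]
print(filter_boxes_after_crop(boxes, crop=(100, 100, 400, 400)))
# [(0, 0, 50, 50)]
```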
The paper says neg_pos_ratio is 3.0, but the code uses 7.0.
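For context, neg_pos_ratio caps how many negative anchors are kept per positive anchor during hard negative mining. A minimal sketch of that selection, assuming negatives are ranked by their classification loss (the helper name and inputs are hypothetical and this is not the layer's actual code):

```python
def select_hard_negatives(conf_losses, is_positive, neg_pos_ratio):
    """Return indices of the negative anchors kept for the loss.

    conf_losses: per-anchor classification loss
    is_positive: per-anchor bool, True if the anchor is matched to a face
    """
    num_pos = sum(is_positive)
    negatives = [i for i, pos in enumerate(is_positive) if not pos]
    # Hardest negatives (largest loss) first.
    negatives.sort(key=lambda i: conf_losses[i], reverse=True)
    num_neg = min(int(neg_pos_ratio * num_pos), len(negatives))
    return negatives[:num_neg]


losses = [5.0, 0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.4, 0.6, 0.5, 0.05]
positive = [True] + [False] * 10
print(select_hard_negatives(losses, positive, 3.0))  # [2, 4, 6]
print(select_hard_negatives(losses, positive, 7.0))  # [2, 4, 6, 8, 9, 7, 3]
```

With one positive anchor, a ratio of 3.0 keeps the three hardest negatives and a ratio of 7.0 keeps seven, so the larger value feeds more, and on average easier, negatives into the loss.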