
sfzhang15 / FaceBoxes


FaceBoxes: A CPU Real-time Face Detector with High Accuracy, IJCB, 2017

License: Apache License 2.0

CMake 2.35% Makefile 0.60% Shell 0.37% Dockerfile 0.06% C++ 80.21% Roff 0.12% Cuda 6.00% MATLAB 0.76% Python 9.54%

faceboxes's People

Contributors

sfzhang15


faceboxes's Issues

batch_size question

Due to limited GPU memory, I can only use a smaller batch_size. In that case, how should the number of iterations and the learning rate be adjusted? Could you share some experience?
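
Not an answer from the author, just a commonly used rule of thumb: when the batch size must be reduced by a factor k, the learning rate is often divided by k and the iteration counts (max_iter and the step values) multiplied by k. A minimal sketch; the helper name and the numbers are made up for illustration, not the repo's defaults:

# Linear-scaling heuristic for a reduced batch size (illustrative only).
def rescale_solver(base_lr, max_iter, stepvalues, base_batch, new_batch):
    k = base_batch // new_batch            # e.g. 32 -> 16 gives k = 2
    return base_lr / k, max_iter * k, [s * k for s in stepvalues]

print(rescale_solver(base_lr=0.001, max_iter=120000,
                     stepvalues=[80000, 100000], base_batch=32, new_batch=16))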

Running with ncnn is very slow

This network design is not friendly to mobile devices: the first few layers account for about 80% of the runtime. It runs very slowly with ncnn, possibly because ncnn does not optimize 7x7 convolutions well.

Can prior_box_param be modified? OpenCV's prior_box currently differs from yours

The added layers are all supported by OpenCV:
opencv\modules\dnn\src\layers\permute_layer.cpp
opencv\modules\dnn\src\layers\detection_output_layer.cpp
opencv\modules\dnn\src\layers\prior_box_layer.cpp
I found that for the prior_box_layer.cpp parameters, your implementation differs from OpenCV's. The parameter block reported in the error log is:
prior_box_param {
min_size: 32
min_size: 64
min_size: 128
clip: false
variance: 0.1
variance: 0.1
variance: 0.2
variance: 0.2
step: 32
offset: 0.5
}
OpenCV only allows a single min_size inside prior_box_param, so it fails at this point.

test size

@sfzhang15 HI

I noticed that at test time:

  1. Sometimes the test set images are uniformly rescaled, as in fddb_test.py of FaceBoxes.
     >> Here the scaling is proportional (a 3x enlargement), so the image aspect ratio is not affected (see the resize sketch at the end of this issue).
  2. Sometimes the original size of the test images is used directly, as in fddb_test.py of S3FD.

What determines whether to rescale or not?
In FaceBoxes, this proportional 3x enlargement clearly improves detection, so why is no such enlargement done in S3FD?
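
A minimal sketch of the proportional rescaling described in point 1, assuming OpenCV is available; the function name is mine and the code is not taken from fddb_test.py:

import cv2

# Proportional (aspect-ratio preserving) rescaling: width and height are
# multiplied by the same factor, so the image ratio is unchanged.
def resize_proportional(image, scale=3.0):
    return cv2.resize(image, None, None, fx=scale, fy=scale,
                      interpolation=cv2.INTER_LINEAR)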

caffe.LayerParameter has no field named permute_param

Hi Dr. Zhang,

I have built the latest BVLC Caffe on Windows 10 with VS2015, and I am trying to run your script draw_net.py on Python 3.5.

The input_net_proto_file is:
models\faceboxes\deploy.prototxt

The model file is downloaded from your tutorials link:
faceboxes.caffemodel

I am experiencing the following issue:

Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.3.3\helpers\pydev\pydevd.py", line 1741, in <module>
    main()
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.3.3\helpers\pydev\pydevd.py", line 1735, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.3.3\helpers\pydev\pydevd.py", line 1135, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2018.3.3\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "D:/code_cv/face_detection/FaceBoxes/python/draw_net.py", line 66, in <module>
    main()
  File "D:/code_cv/face_detection/FaceBoxes/python/draw_net.py", line 52, in main
    text_format.Merge(open(args.input_net_proto_file).read(), net)
  File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 525, in Merge
    descriptor_pool=descriptor_pool)
  File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 579, in MergeLines
    return parser.MergeLines(lines, message)
  File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 612, in MergeLines
    self._ParseOrMerge(lines, message)
  File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 627, in _ParseOrMerge
    self._MergeField(tokenizer, message)
  File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 727, in _MergeField
    merger(tokenizer, message, field)
  File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 815, in _MergeMessageField
    self._MergeField(tokenizer, sub_message)
  File "C:\ProgramData\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 695, in _MergeField
    (message_descriptor.full_name, name))
google.protobuf.text_format.ParseError: 1378:3 : Message type "caffe.LayerParameter" has no field named "permute_param".

It seems that the caffe version I am using doesn't match the version you are using.
Which Caffe version are you using?
Do you have any other ideas on how to fix this issue?

Thanks and Best Regards,
Ardeal

Both testing and training fail with a dimension mismatch error

multibox_loss_layer.cpp:139] Check failed: num_priors_ * loc_classes_ * 4 == bottom[0]->channels() (13568 vs. 87296) Number of priors must match number of location predictions.

I figured it out: my Caffe build is missing this cpp file. I will add it, recompile, and try again.

About pretraining

@sfzhang15
HI

A model like the one in the code is built by yourself, so there should be no pre-trained model (in the conventional sense of a model pre-trained on the ImageNet classification dataset) to start from. In other words, the training here is so-called training from scratch, right?

About training on my data

Hi,
I want to use train.sh in your project to train on my own data, but I don't know how to add my data. Could you tell me what I should do? Thank you very much.

Performance compared to S3FD

Thanks for sharing another awesome work!

I'm wondering if you have compared this new work with the previous S3FD under similar computational cost. In my own test, I tried S3FD (320x180) vs FaceBoxes (960x540); both have similar speed on my device (not using the CPU, though), but S3FD still performs better on my test images.

Is this expected, or is there a reason behind it?

main function of Faceboxes source code

Hi,

I checked all of your source code, but I could not find its main function.
Could you please tell me where the main function of FaceBoxes is?

Thanks and Best Regards,
Ardeal

Error 1 google/protobuf/port_def

conda install caffe-gpu
make all -j && make py

log:
compilation terminated.
make: *** [.build_release/src/caffe/layers/base_data_layer.o] Error 1
In file included from ./include/caffe/util/cudnn.hpp:8:0,
                 from ./include/caffe/util/device_alternate.hpp:40,
                 from ./include/caffe/common.hpp:19,
                 from src/caffe/data_reader.cpp:6:
.build_release/src/caffe/proto/caffe.pb.h:10:40: fatal error: google/protobuf/port_def.inc: No such file or directory
 #include <google/protobuf/port_def.inc>

prototxt

Hi, I used PyTorch before and am not very familiar with how prototxt files are generated in Caffe. Is there a quick way to generate them? For FaceBoxes, for example, the layers are all new, so an existing network definition cannot simply be modified. :)
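
Not part of the original question, but one common route: pycaffe's NetSpec can emit a prototxt programmatically, so a new layer stack does not have to be written by hand. A minimal sketch; the layer names and parameters below are illustrative and are not the FaceBoxes definition:

import caffe
from caffe import layers as L, params as P

n = caffe.NetSpec()
n.data = L.Input(shape=dict(dim=[1, 3, 1024, 1024]))
n.conv1 = L.Convolution(n.data, num_output=24, kernel_size=7, stride=4, pad=3,
                        weight_filler=dict(type='xavier'))
n.relu1 = L.ReLU(n.conv1, in_place=True)
n.pool1 = L.Pooling(n.relu1, pool=P.Pooling.MAX, kernel_size=3, stride=2)

with open('example.prototxt', 'w') as f:
    f.write(str(n.to_proto()))   # writes a text prototxt that Caffe can parse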

compile

Why can't I compile this project on Linux, even though I can compile the stock Caffe project? I compared the Makefile and Makefile.config with Caffe's, and many things have been changed in the Makefile. What does that mean? Also, how do I compile with CPU only? I don't have a GPU.

small(<20 pixels) face

@sfzhang15 HI
Section 3.4 of the paper says:
filter out these face boxes whose height or width is less than 20 pixels

Section 3.2.1 of FAN also mentions something similar:
Besides, we calculate the statistics from the WiderFace train set based on the ground-truth face size. As Figure 3 shows, more than 80% faces have an object scale from 16 to 406 pixel. Faces with small size lack sufficient resolution and therefore it may not be a good choice to include in the training data.

I have a question: directly removing faces smaller than 20/16 pixels from the training set makes the model insensitive to such faces, so at detection time it will surely perform poorly on faces smaller than 20/16 pixels. Doesn't this directly hurt the model's overall detection performance? Is this understanding correct?

On the other hand, in FDNet the authors do exactly the opposite:
As WIDER FACE dataset contain many extremely tinny faces (<16 pixels width/height), we keep these small proposals (<16 pixels width/height) valid in the training and testing time [13]. The experiments show that our method can achieve better performance.
In other words, all the small faces are kept.
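
For reference, a minimal sketch of the kind of size filtering described in the quoted passages; this is illustrative code, not the repo's data-augmentation implementation, and it assumes boxes in absolute pixel coordinates [xmin, ymin, xmax, ymax]:

import numpy as np

def filter_small_faces(boxes, min_size=20):
    # Drop ground-truth boxes whose width or height is below min_size pixels.
    boxes = np.asarray(boxes, dtype=np.float32)
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]
    return boxes[(w >= min_size) & (h >= min_size)]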

WIDER_FACE: generating custom LMDB files

Hello, I used the following two commands to generate my own LMDB files:
./data/WIDER_FACE/create_list.sh
./data/WIDER_FACE/create_data.sh
However, I cannot get them to run successfully!
Is there detailed documentation on how to configure these two scripts?

Loss in output

While training the FaceBoxes model, I noticed that the output looks strange:

I0316 10:16:10.963383  6670 solver.cpp:243] Iteration 30450, loss = 3.36783
I0316 10:16:10.963630  6670 solver.cpp:259]     Train net output #0: mbox_loss = 3.34174 (* 1 = 3.34174 loss)
I0316 10:16:13.457460  6670 sgd_solver.cpp:151] Iteration 30450, lr = 0.0052975
I0316 10:18:44.811120  6670 solver.cpp:243] Iteration 30500, loss = 3.35862
I0316 10:18:44.812209  6670 solver.cpp:259]     Train net output #0: mbox_loss = 4.07279 (* 1 = 4.07279 loss)
I0316 10:18:44.812288  6670 sgd_solver.cpp:151] Iteration 30500, lr = 0.005275
I0316 10:21:14.092057  6670 solver.cpp:243] Iteration 30550, loss = 3.36363
I0316 10:21:14.092276  6670 solver.cpp:259]     Train net output #0: mbox_loss = 3.34809 (* 1 = 3.34809 loss)
I0316 10:21:14.092325  6670 sgd_solver.cpp:151] Iteration 30550, lr = 0.0052525
I0316 10:23:43.217934  6670 solver.cpp:243] Iteration 30600, loss = 3.35409

Why isn't there classification loss but only localization loss?

I looked into solver.cpp; the relevant code is as follows:

    if (display) {
      LOG_IF(INFO, Caffe::root_solver()) << "Iteration " << iter_
          << ", loss = " << smoothed_loss_;
      const vector<Blob<Dtype>*>& result = net_->output_blobs();
      int score_index = 0;
      for (int j = 0; j < result.size(); ++j) {
        const Dtype* result_vec = result[j]->cpu_data();
        const string& output_name =
            net_->blob_names()[net_->output_blob_indices()[j]];
        const Dtype loss_weight =
            net_->blob_loss_weights()[net_->output_blob_indices()[j]];
        for (int k = 0; k < result[j]->count(); ++k) {
          ostringstream loss_msg_stream;
          if (loss_weight) {
            loss_msg_stream << " (* " << loss_weight
                            << " = " << loss_weight * result_vec[k] << " loss)";
          }
          LOG_IF(INFO, Caffe::root_solver()) << "    Train net output #"
              << score_index++ << ": " << output_name << " = "
              << result_vec[k] << loss_msg_stream.str();
        }
      }
    }

The for loop iterates over all the output blobs and prints each loss; if so, why is there no classification loss?
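
Not an answer, just a way to inspect which output blobs the solver loop would print; a minimal pycaffe sketch, where the prototxt path is a placeholder and constructing the net assumes the LMDB it references is reachable:

import caffe

caffe.set_mode_cpu()
net = caffe.Net('models/faceboxes/train.prototxt', caffe.TRAIN)
for name in net.outputs:                    # names of the net's output blobs
    print(name, net.blobs[name].data.shape)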

Facial landmarks detection

Hi,
Great job. Its real-time CPU speed is really fast. I wonder how to add facial landmark detection to your network.
Do you have any plans to release this kind of work?
Thanks a lot.

time needed for FaceBoxes algorithms

Hi Dr. Zhang,

I tested 10 images with your model on the CPU. The time taken seems to be much longer than reported in your paper:

----- image 0: 640x480, face rect 135x160, FD = 273.279480 ms, FL = 3.838500 ms
----- image 1: 640x426, face rect 243x335, FD = 120.471107 ms, FL = 3.123300 ms
----- image 2: 570x856, face rect 318x428, FD = 119.700096 ms, FL = 3.967700 ms
----- image 3: 640x480, face rect 195x282, FD = 159.247803 ms, FL = 3.166000 ms
----- image 4: 640x632, face rect 381x497, FD = 132.707703 ms, FL = 3.671800 ms
----- image 5: 640x677, face rect 457x574, FD = 135.079193 ms, FL = 4.643900 ms
----- image 6: 640x442, face rect 254x349, FD = 154.615097 ms, FL = 3.570300 ms
----- image 7: 640x563, face rect 298x415, FD = 296.186584 ms, FL = 5.205200 ms
----- image 8: 640x480, face rect 258x326, FD = 188.629303 ms, FL = 3.470700 ms
----- image 9: 224x224, face rect 111x167, FD = 33.309402 ms, FL = 7.112700 ms

For the 640x480 images, the time needed is around 150 ms. Do you have any idea why it is so much longer?
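
A minimal, self-contained sketch for timing the network forward pass alone (random input, placeholder paths); this separates the network time from image loading, preprocessing, and drawing, which are often included in end-to-end measurements:

import time
import numpy as np
import caffe

caffe.set_mode_cpu()
net = caffe.Net('deploy.prototxt', 'faceboxes.caffemodel', caffe.TEST)
net.blobs['data'].reshape(1, 3, 480, 640)
net.blobs['data'].data[...] = np.random.rand(1, 3, 480, 640).astype(np.float32)
net.forward()                                   # warm-up run
t0 = time.time()
net.forward()
print('forward time: %.1f ms' % ((time.time() - t0) * 1000.0))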

Do you have a trained GPU model?

Thanks and Best Regards,
Ardeal

Set CPU-only mode

Hi,

I tried to run the demo.py file in python, but I got the following error:

F0311 15:58:18.148229  6000 common.cpp:76] Cannot use GPU in CPU-only Caffe: check mode.

However, I have already switched to CPU-only mode. Do I need to modify some other code to disable GPU mode and enable CPU-only mode?

By the way, the FaceBoxes model was downloaded from the page where you published your source code:
https://github.com/sfzhang15/FaceBoxes

os.chdir(caffe_root)
sys.path.insert(0, 'python')
import caffe

caffe.set_device(0)
# caffe.set_mode_gpu()
caffe.set_mode_cpu()

model_def = 'D:/code_cv/face_detection/FaceBoxes/models/faceboxes/deploy.prototxt'
model_weights = 'D:/code_cv/face_detection/FaceBoxes/models/faceboxes/faceboxes.caffemodel'
net = caffe.Net(model_def, model_weights, caffe.TEST)

image = caffe.io.load_image('examples/images/1.jpg')

Thanks and Best Regards,
Ardeal

About training on other data

Hi,
I want to use this method to detect people. The aspect ratio of people is generally not 1:1, and in my initial test I did not modify the aspect_ratio in your train.prototxt, which may be a potential problem.
First, after training for about 15,000 steps, the loss has not dropped and stays around 6. Is this normal, or is the number of training steps too small?
Thank you very much. -.-

How many iterations was faceboxes.caffemodel trained for?

First of all, thank you very much for your work; it is excellent.
I have two questions about your model and hope you can answer them:

  1. How many training iterations does the provided faceboxes.caffemodel correspond to?
  2. Was this model trained on the complete WIDER FACE dataset?

Range of the output

Hi,
What ranges do the output scores and the bbox regression values roughly fall in?
Thanks.

Converting the FaceBoxes network to multi-class object detection

Hi, I tried to change your network into a multi-class structure, but the detections contain many cluttered boxes. If I want to modify it for multi-class detection, what parts need to be changed?

Is there a problem with the receptive field calculation?

[image]
Hello,
I looked at your receptive field calculation; the largest value reaches 911x911. However, the branches of each Inception block are joined together by concat, while the figure in the paper seems to chain the Inception branches serially and then compute the RF. In fact, with 3 Inception blocks of 4 branches each, there are 4x4x4 branch paths in total, and the receptive fields computed that way should look like this:
[image]
But the paper's result seems to chain all the branches serially from the bottom of the model to the top, i.e.:
[images]
In the figures above, the red parts mark the RF of each branch, plus the RFs of the two leading conv layers, the two pooling layers, and the two trailing conv layers.

Have I misunderstood something, or is there a problem with the receptive field calculation at this point in the paper?
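
For reference, a minimal sketch of the usual serial-chain receptive-field recurrence (rf grows by (kernel - 1) * jump per layer, and jump is multiplied by the stride); the layer chain in the example is made up and is not the FaceBoxes architecture:

def receptive_field(layers):
    # layers: list of (kernel_size, stride) for a serial chain of layers.
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # growth contributed by this layer
        jump *= stride              # effective stride w.r.t. the input
    return rf

# Example chain: conv 7x7/4, pool 3x3/2, conv 5x5/2, pool 3x3/2 -> RF = 79
print(receptive_field([(7, 4), (3, 2), (5, 2), (3, 2)]))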

The output of the C++ code is weird

Hi Dr. Zhang,

A few days ago, I was able to run the FaceBoxes model with pycaffe correctly, and the face detection results were good as well.


I followed the example code in SSD (https://github.com/weiliu89/caffe/blob/ssd/examples/ssd/ssd_detect.cpp) to call your FaceBoxes model using C++ in Caffe.

The code runs correctly, but the output of the algorithm seems to be weird:

  1. For some images, the output is very good (the same as in pycaffe).
  2. For some images, the output is weird.
    Please check the following output:
    For the abnormal images, the scores are very small and the boxes are not aligned either.
    For the normal images, the scores and boxes are both normal.

I used the same network and weights files downloaded from your repository, and the same processing as in pycaffe.

The only difference is that I use C++ to call the model.

Why is the output for some images not correct?
Do you have any idea about the issue?

[images: example outputs showing normal and abnormal detections]

How about the recall

The AP is above 95%.
What about the recall? For example, the recall on faces larger than 30 pixels?

Problem calling the model with OpenCV's dnn module

Hi! I reproduced the training on my own machine and got a model, then tried to load the Caffe model with OpenCV's dnn in C++ and got this error:
[libprotobuf ERROR /home/chan/software/opencv-3.3.0/3rdparty/protobuf/src/google/protobuf/text_format.cc:298] Error parsing text-format caffe.NetParameter: 1761:15: Message type "caffe.PriorBoxParameter" has no field named "fixed_size".
OpenCV Error: Unspecified error (FAILED: ReadProtoFromTextFile(param_file, param). Failed to parse NetParameter file: /home/chan/桌面/Face_demo/demo/faceboxes_deploy.prototxt) in ReadNetParamsFromTextFileOrDie, file /home/chan/software/opencv-3.3.0/modules/dnn/src/caffe/caffe_io.cpp, line 1137
Is OpenCV's dnn incompatible with this version of Caffe?

Error loading network

I am running into errors with fb_net = caffe.Net('deploy.prototxt', 'faceboxes.caffemodel', caffe.TEST).

When I use the BVLC/caffe master release (1.0):

[libprotobuf ERROR google/protobuf/text_format.cc:274] Error parsing text-format caffe.NetParameter: 1378:17: Message type "caffe.LayerParameter" has no field named "permute_param".
F0213 01:06:12.385599 1 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: deploy.prototxt
*** Check failure stack trace: ***

To avoid this issue, I've tried caffe-ssd and have encountered:

I0213 02:06:34.258400    61 net.cpp:165] Memory required for data: 51944960
I0213 02:06:34.258419    61 layer_factory.hpp:77] Creating layer detection_out
I0213 02:06:34.258469    61 net.cpp:100] Creating Layer detection_out
I0213 02:06:34.258489    61 net.cpp:434] detection_out <- mbox_loc
I0213 02:06:34.258509    61 net.cpp:434] detection_out <- mbox_conf_flatten
I0213 02:06:34.258527    61 net.cpp:434] detection_out <- mbox_priorbox
I0213 02:06:34.258569    61 net.cpp:408] detection_out -> detection_out
F0213 02:06:34.258677    61 detection_output_layer.cpp:164] Check failed: num_priors_ * num_loc_classes_ * 4 == bottom[0]->channels() (4000 vs. 25600) Number of priors must match number of location predictions.
*** Check failure stack trace: ***

Can anyone help with these issues?

Is there a script to convert it to VOC format?

I have downloaded the WIDER FACE dataset, but when trying to convert it to LMDB there is no further information on how to proceed.
Is there any preprocessing in that step, such as filtering out faces smaller than 20*20?

packages needed to run the code

Hi,

Which LMDB package is used in your code? Could you please share a link to the LMDB source code or binary?

On Linux:
Could you please list the required version of each package?

Thanks and Best Regards,
Ardeal

bias=false or true

caffe version: bias=true
pytorch version: nn.Conv2d(in_channels, out_channels, bias=False, **kwargs)

Why is the bias setting different?

The loss does not decrease

Hello, I retrained following your steps and found that the loss drops to about 4 and then stops decreasing, staying roughly in the 3-4 range. Did you see the same behavior when you trained?

Batch processing does not work

# continues from the pycaffe setup in demo.py: caffe, cv2, np, net, image and
# im_scale are assumed to be defined as in that script
if im_scale != 1.0:
    image = cv2.resize(image, None, None, fx=im_scale, fy=im_scale,
                       interpolation=cv2.INTER_LINEAR)

n = 10
net.blobs['data'].reshape(n, 3, image.shape[0], image.shape[1])
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
transformer.set_transpose('data', (2, 0, 1))
transformer.set_mean('data', np.array([104, 117, 123]))  # mean pixel
transformer.set_raw_scale('data', 255)  # the reference model operates on images in [0,255] instead of [0,1]
transformer.set_channel_swap('data', (2, 1, 0))  # the reference model expects BGR channel order instead of RGB
transformed_image = transformer.preprocess('data', image)

# net.blobs['data'].data[...] = transformed_image
imgb = [transformed_image for _ in range(n)]  # fill the whole batch with the same preprocessed image
net.blobs['data'].data[...] = imgb
detections = net.forward()['detection_out']
print(detections.shape)

I made the above modification based on demo.py, but the final printed shape is (1, 1, 800, 7) instead of (10, 1, 80, 7).

Yes, it is. You can put some images in a batch to detect faces. The first dimension of the output is the batch index.

This is inconsistent with your description above.
When batching, the number of faces per image differs, so the results cannot be told apart.
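
For reference, a sketch of how batched results are usually separated, assuming the standard SSD DetectionOutput layout in which each row is [image_id, label, score, xmin, ymin, xmax, ymax]; detections is the array printed above:

import numpy as np

rows = detections.reshape(-1, 7)
# Group detection rows by their image index (column 0).
per_image = {int(i): rows[rows[:, 0] == i] for i in np.unique(rows[:, 0])}
for image_id, dets in sorted(per_image.items()):
    print('image %d: %d detections' % (image_id, len(dets)))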

Where do you filter the small faces?

Hi, you said:
"We keep the overlapped part of the
face box if its center is in the above processed image,
then filter out these face boxes whose height or width
is less than 20 pixels." .
But in the project, I cannot find where the small faces are filtered.
Could you point it out for me? Thanks.
