sai-bi / FaceAlignment
Face Alignment by Explicit Shape Regression
License: MIT License
Hi,
First of all, thanks for putting this source code up online; it's very useful to have a reference implementation of that great paper.
I was curious about the rationale behind the 0.2 multiplier in your fern's threshold calculation, so I set out to inspect the max_diff values and noticed that some of them were greater than 255, which didn't seem to make sense. Ultimately, I tracked the problem down to what I believe is a bug in the implementation of calculate_covariance. In this function you pass the image intensities as references to const vector and build OpenCV Mats from them, sharing the data (i.e. copyData is left as false in the Mat constructor). However, at some point you modify the values inside these Mats, e.g.:
v1 = v1 - mean_1;
This seems to modify the values of the v_1 vector for me (despite the const qualifier on the argument), and as a result, after returning from this function, the values in the 'densities' array are changed.
Here is what I think the code should be:
double calculate_covariance(const vector<double>& v_1,
                            const vector<double>& v_2) {
    Mat_<double> v1(v_1);
    Mat_<double> v2(v_2);
    double mean_1 = mean(v1)[0];
    double mean_2 = mean(v2)[0];
    // FIXME: bug - this modifies v_1 and v_2 (supposed to be const)
    // v1 = v1 - mean_1;
    // v2 = v2 - mean_2;
    // return mean((v1).mul((v2)))[0];
    return mean((v1 - mean_1).mul(v2 - mean_2))[0];
}
Could you please confirm that you are not intentionally modifying the values passed in that code?
Once the densities values are back in [0, 255], do we still need the 0.2 multiplier in the threshold computation?
Thanks!
It looks like the sample results contain many more landmarks than the COFW dataset, which has only 29 landmarks. Could you please upload the dataset and trained model used for the sample results to Google Drive? Thank you!
When I started running TrainDemo.cpp, it took more than a day. What is the minimum number of images required for training, and what are the minimum values of first_level_num and second_level_num? In your code you have set first_level_num = 10.
If I use the model that you have trained and provided, which face detector would be preferred? The one provided by OpenCV?
First, I tried to test your code with the trained model already provided here: https://drive.google.com/file/d/0B0tUTCaZBkccOGZTcjJNcDMwa28/edit?usp=sharing
but I ran into a couple of issues.
So I thought something was wrong with the given model and decided to train my own. I trained on the training set of the COFW dataset, which took about 30 minutes. Now, when I try to test this model with the same test code (with updated paths to the new model), I keep getting a segmentation fault whenever I access current_shape(i,0) and current_shape(i,1).
Let me know what else can be done to fix this issue.
Could you give me more detailed information on the two-level cascaded regression, please? Do you have a presentation with more information about ESR than this one? Please provide more information about the regression and the regressors used in ESR.
solved
Hi, I used the test demo to test 157 images, and it took 5129 ms in total. I just ran a loop:
for i = 1:157
    current_shape = regressor.Predict(test_images[index], test_bounding_box[index], initial_number);
That averages about 30 ms per image, with the init number set to 2. Can you tell me why the speed is much slower than in the paper?
Also, 3000fps uses OpenMP support to accelerate the speed. Can your project also use OpenMP?
Hello,
How long does the prediction process take in your test? What was the parameter configuration you used in your experiment? (candidate_pixel_num, fern_pixel_num, first_level_num, second_level_num, landmark_num)
The prediction process takes around 500 ms on my machine; is that reasonable?
Thank you!
Hello! I'm confused: detecting the feature points in one image takes me 140 ms, but the paper says it takes about 15 ms. Can you tell me how to improve this?
Thank you for sharing your code; I think it is excellent. I have trained a 68-landmark model using it. I want to ask what we can do to improve this method (I think it is already very good)? Do you have any good ideas?
Thank you!
We trained a model on the LFPW dataset with 68 landmark points. We used 1300 images for training and provided bounding boxes from the Viola-Jones face detector as given in OpenCV. It took about 10 hours to create the model on an Intel(R) Xeon(R) processor, but we are not getting the expected results during testing.
Is there any relation between the number of training images and the number of landmarks? Any other suggestions?
I read the paper 《Random projection in dimensionality reduction: Applications to image and text data》, and it is quite different from your code.
// RNG random_generator(i);
Mat_<double> random_direction(landmark_num_, 2);
random_generator.fill(random_direction, RNG::UNIFORM, -1.1, 1.1);
normalize(random_direction, random_direction);
vector<double> projection_result(regression_targets.size(), 0); // size = (1, image_num)
// project regression targets along the random direction
for(int j = 0; j < regression_targets.size(); j++){ // for each sample
    double temp = 0;
    temp = sum(regression_targets[j].mul(random_direction))[0];
    projection_result[j] = temp;
}
However, in the paper,
the left side of the equation is the projected result, k is the lower dimension, d is the current dimension, and N is the number of data points. Your projection matrix (random_direction), however, is not consistent with the paper, and neither is the computation.
Second, you fill the matrix with random_generator.fill(random_direction, RNG::UNIFORM, -1.1, 1.1);
whereas in the paper,
it is subject to a particular distribution. So what was your rationale for generating a totally random matrix?
Thanks a lot for your kind sharing. But the model you shared is for 29 landmarks. Could you share a model for more landmarks, say, 64 or 83 landmarks?
Hello, the trained model link you provided is no longer available.
Could you send me the model?
My email: [email protected]
Hehe, I am lazy.
As the title says: why is random projection useful here?
Trying to compile this code under Linux with g++ 4.8.2 fails, because your code uses C++ 11 features. So adding these two lines to CMakeLists.txt fixes the problem:
set_property(TARGET TrainDemo.out PROPERTY CXX_STANDARD 11)
set_property(TARGET TestDemo.out PROPERTY CXX_STANDARD 11)
Thanks for your code! I compiled it with CMake and ran TrainDemo.out. It took 36 minutes to train the data. But after training, I couldn't find ./data/model.txt, which was supposed to appear in my current directory. How can I fix this problem? Thank you!
Hi,
I don't understand the meaning of the parameter initial number.
I am trying to train with 5 landmarks on the MTFL database.
I am a little puzzled by the two functions ProjectShape and ReProjectShape used in the code.
Can you give me some ideas on how to understand these two functions?
I found numbers over 10,000 in "boundingbox.txt". I think those are the width or height of the face box, but can they really be so large?
Can I also detect profile faces with the original code?
I have tried altering landmark_num, but it doesn't work.
Can you tell me the concrete way to get boundingbox.txt and keypoints.txt?
I have compiled the code successfully on my Ubuntu 14.04 machine, and I've changed the data paths in both TrainDemo.cpp and TestDemo.cpp. The training data I used is the COFW dataset, and when I ran TrainDemo.out, it crashed at line 104 of FernCascade.cpp while doing 1 out of 10 fern cascades of training. Even when I skipped the training step, used the model.txt you provided, and ran TestDemo.out, it still crashed at line 137 of ShapeRegressor.cpp, when loading the model.txt file. I checked the location of the model file and I think it is correct. Since I'm not familiar with the details of your implementation, I find it pretty hard to figure out a solution. Could you tell me how to solve these problems? Thank you.
Hi, I used my own face detection algorithm to train the Helen model, providing the images, bounding boxes, and the standard ground-truth shapes, but the test results are terrible. Do you know the reason?
My email is [email protected]
Thanks!
Can you share the training data and the model on GitHub?
Thanks for your code. But I am wondering whether there are EXE files that we can use directly. Many thanks!
Are the landmarks read from keypoints.txt wrong?
ifstream fin2;
locale::global(locale(""));
fin2.open("oursTrain/keypoints.txt");
locale::global(locale("C"));
string line;
int sample_freq = 0;
while(getline(fin2, line))
{
    sample_freq++;
    printf("loading test:%d\n", sample_freq);
    for(int j = 0; j < 29; j++)
    {
        fin2 >> shapes(j,0);
    }
    for(int j = 0; j < 29; j++)
    {
        fin2 >> shapes(j,1);
    }
}
for (int m = 0; m < shapes.rows; ++m)
{
    circle(testImg1, Point(shapes(m,0), shapes(m,1)), 3, Scalar(255));
}
imwrite("mark_face.jpg", testImg1);
I find that the landmarks drawn on testImg1 are wrong. Thanks!
Can the algorithm predict landmarks in 15 ms, as reported in the paper?
Thanks
Hello, I have trained on the MUCT face database, with 1000 pictures used as the test set and another 500 pictures as the training set. The alignment results are good, but when I use JAFFE as the test set, the results are poor. I think the reason is the bounding box, as shown in the attached picture.
But I don't know where I went wrong.
if (OpenCV_FOUND)
include_directories(${OpenCV_INCLUDE_DIRS})
endif()
or the compiler will fail to find cv.h etc.
There's a missing term in your implementation of the equation that calculates correlation: the sample variance of the Yprob, under the square root in the denominator of that equation (Equation 11). Please correct me if I am mistaken about this.