davidsandberg / facenet



Face recognition using Tensorflow

License: MIT License

Python 92.81% MATLAB 7.19%

face-recognition tensorflow facenet deep-learning computer-vision face-detection mtcnn


facenet's Issues

Can't train when running facenet_train.py

Hi, I ran into the following problem when I try to run facenet_train.py.

I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 750 Ti
major: 5 minor: 0 memoryClockRate (GHz) 1.0845
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 1.82GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 750 Ti, pci bus id: 0000:01:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:965] failed to allocate 2.00G (2146762752 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
Loading training data
Loaded 1800 images in 1.82 seconds
Selecting suitable triplets for training
E tensorflow/stream_executor/cuda/cuda_dnn.cc:354] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
E tensorflow/stream_executor/cuda/cuda_dnn.cc:321] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
F tensorflow/core/kernels/conv_ops.cc:457] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted (core dumped)

I'm sure I have successfully installed tensorflow 0.10, cuda 7.5 and cudnn7.5v5.
How can I solve this problem?

Write an L2 pooling that actually works

The implemented L2 pooling gives NaN in the gradient sometimes
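
One plausible cause (an assumption; the issue does not state it) is the square root: its derivative 1/(2*sqrt(s)) grows without bound as s approaches 0, so a window of all zeros can yield a non-finite gradient. A minimal plain-Python sketch of 1-D L2 pooling with a small epsilon inside the root follows. The names `l2_pool_1d`, `l2_pool_grad` and `eps` are illustrative and are not code from this repository.

```python
import math

def l2_pool_1d(values, size, eps=1e-12):
    """L2-pool non-overlapping windows: sqrt(sum(x^2) + eps) per window."""
    return [
        math.sqrt(sum(x * x for x in values[i:i + size]) + eps)
        for i in range(0, len(values) - size + 1, size)
    ]

def l2_pool_grad(values, size, eps=1e-12):
    """Gradient of each pooled output w.r.t. its own inputs: x / sqrt(sum(x^2) + eps)."""
    grads = []
    for i in range(0, len(values) - size + 1, size):
        window = values[i:i + size]
        norm = math.sqrt(sum(x * x for x in window) + eps)
        grads.extend(x / norm for x in window)
    return grads

# The second window is all zeros; eps keeps the division well defined there.
pooled = l2_pool_1d([3.0, 4.0, 0.0, 0.0], size=2)
grads = l2_pool_grad([3.0, 4.0, 0.0, 0.0], size=2)
```

The epsilon slightly biases the pooled value, so it is usually chosen far below the scale of the activations.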

tripERR

As you mentioned in the README, I use the “Facescrub” database. Unfortunately, when I ran your code the tripERR was swinging between different values (0.61, 0.48, 0.39, 0.44, 0.72, ...).
I changed some parameters of your code such as “learning_rate”, “alpha” and “moving_average_decay”, but the result did not change much.
May I ask why the tripERR is swinging? Could you help me fix this problem, please?

compare.py fails loading from the prebuilt checkpoint

This is my invocation
$ python compare.py --model_dir ~/facenet/ --dlib_face_predictor ~/dlib-18.18/shape_predictor_68_face_landmarks.dat --image1 ~/s1 --image2 ~/s2

The directory (~/facenet) has the prebuilt checkpoint file model-20160506.ckpt-500000. Upon running this, I see this:

Traceback (most recent call last):
  File "compare.py", line 73, in <module>
    main()
  File "compare.py", line 53, in main
    raise ValueError('Checkpoint not found')
ValueError: Checkpoint not found

Add data augmentation

Add random translation and horizontal flipping of images
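
A minimal sketch of the two requested augmentations on an image stored as a nested list of pixel values. This is an illustration only: the function names, the `max_shift` parameter and the zero padding are assumptions, not the repository's implementation.

```python
import random

def random_flip(image, rng):
    """Mirror the image left-right with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return [row[:] for row in image]

def random_translate(image, max_shift, rng, fill=0):
    """Shift the image by up to max_shift pixels in x and y, padding with `fill`."""
    height, width = len(image), len(image[0])
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_y, src_x = y - dy, x - dx
            if 0 <= src_y < height and 0 <= src_x < width:
                shifted[y][x] = image[src_y][src_x]
    return shifted

def augment(image, rng, max_shift=1):
    """Apply a random flip followed by a random translation."""
    return random_translate(random_flip(image, rng), max_shift, rng)

rng = random.Random(0)
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
augmented = augment(image, rng)
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible for a given seed.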

training error

I have used the facescrub dataset and dlib to align the images before running facenet_train.py.
After the 1st epoch of training, it fails with the following error during validation. Do you know what is wrong?

Traceback (most recent call last):
File "./facenet_train.py", line 274, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "./facenet_train.py", line 144, in main
global_step, embeddings, loss, 'validation', summary_writer)
File "./facenet_train.py", line 212, in validate
image_paths, num_per_class = facenet.sample_people(dataset, nrof_people, FLAGS.images_per_person)
File "/mnt/2TB/src/facenet/facenet/src/facenet.py", line 606, in sample_people
class_index = class_indices[i]
IndexError: index 52 is out of bounds for axis 0 with size 52

What is the meaning of the lfw_dir parameter?

Hi, David

When I run facenet_train.py, I get the following error:

Runnning forward pass on LFW images
Traceback (most recent call last):
  File "facenet_train.py", line 340, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "facenet_train.py", line 126, in main
    images_placeholder, phase_train_placeholder, embeddings, nrof_folds=args.lfw_nrof_folds)
  File "/home/dy/tensorflow/facenet-master/facenet/src/lfw.py", line 26, in validate
    emb_array = np.vstack(emb_list)  # Stack the embeddings to a nrof_examples_per_epoch x 128 matrix
  File "/usr/local/lib/python2.7/dist-packages/numpy/core/shape_base.py", line 230, in vstack
    return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: need at least one array to concatenate

What is the meaning of the lfw_dir parameter?
I noticed that --lfw_dir means "Path to the data directory containing aligned face patches."
But I do not understand it, and the default value '~/datasets/lfw/lfw_realigned/' does not exist, so lfw.get_paths() returns no paths, which leads to the validation error.

Waiting for your answer.
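
The "need at least one array to concatenate" error is consistent with no image paths being found under --lfw_dir (an assumption based on the report above). A minimal sketch of a pre-flight check that could run before validation follows. The helper `count_images`, the accepted extensions and the `<lfw_dir>/<person>/<image>` layout are illustrative and not part of facenet.

```python
import os
import tempfile

def count_images(lfw_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return the number of image files under lfw_dir, or raise if the directory is missing."""
    lfw_dir = os.path.expanduser(lfw_dir)
    if not os.path.isdir(lfw_dir):
        raise IOError("LFW directory not found: %s" % lfw_dir)
    count = 0
    for _root, _dirs, files in os.walk(lfw_dir):
        count += sum(1 for name in files if name.lower().endswith(extensions))
    return count

# Demo on a throwaway directory laid out as <lfw_dir>/<person>/<image>.
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "Some_Person"))
open(os.path.join(demo_dir, "Some_Person", "Some_Person_0001.jpg"), "w").close()
nrof_images = count_images(demo_dir)
```

Failing early with a clear message is easier to diagnose than an empty list reaching np.vstack.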

the accuracy is 81.3% which is much lower than 91%

Hi, I used the model you provided and ran validate_on_lfw.py directly, but the result is not as good as yours. Is there something wrong? Thank you.

Getting UnicodeDecodeError while trying to load checkpoint file

Hi,

I modified the facenet_train.py file to load the pretrained model that you provide but it seems to me there is some problem with the file, since whenever I try to load it using,

ckpt = tf.train.get_checkpoint_state(model_dir, latest_filename='model-20160506.ckpt-500000')

I get the following error:

Traceback (most recent call last):
File "/media/esb172/Hard_Disk_2/facenet_data/tightface/facenet-tf/facenet/src/facenet_train.py", line 268, in <module>
main(parse_arguments(sys.argv[1:]))
File "/media/esb172/Hard_Disk_2/facenet_data/tightface/facenet-tf/facenet/src/facenet_train.py", line 98, in main
ckpt = tf.train.get_checkpoint_state(model_dir, latest_filename='model-20160506.ckpt-500000')
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 662, in get_checkpoint_state
coord_checkpoint_filename).decode("utf-8")
File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid start byte

Process finished with exit code 1

Possibility to resume training

Implement the possibility to resume training by restoring a checkpoint file

Try training with dropout

Train a model with dropout and check if the learnt features become more sparse

GPU implementation

Do you plan on making a pure GPU implementation of facenet?

Training from scratch fails

I am trying to train from scratch, but when I start training I get the following error:
Traceback (most recent call last):
File "facenet_train.py", line 257, in <module>
main(parse_arguments(sys.argv[1:]))
File "facenet_train.py", line 47, in main
pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
File "/home/kalvik/facenet/facenet/src/lfw.py", line 64, in read_pairs
with open(pairs_filename, 'r') as f:
IOError: [Errno 2] No such file or directory: '../data/pairs.txt'

Duplicates between the LFW, Facescrub and Casia-WebFace datasets

In order to reproduce the LFW validation accuracy you posted on the Wiki using the Facescrub and Casia-WebFace datasets for training, I found that 84+ persons (with more than two images in the class) exist in both LFW and the combined Facescrub and Casia-WebFace datasets. Can you provide more details on how you deal with these duplicates?

The duplicates between Facescrub and Casia-WebFace datasets (about 300 duplicates)
The duplicates between LFW and the combined Facescrub and Casia-WebFace datasets.

Based on how the datasets can be specified for the --data_dir argument, it seems that if the duplicates are not merged together in advance, the same person will be treated as different classes.
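
A minimal sketch of how overlapping identities could be detected by comparing normalized class (directory) names across two datasets. The names in the lists are made up, and `normalize_name` and `find_duplicates` are hypothetical helpers. Name matching only catches identities that are spelled the same way in both datasets.

```python
import re

def normalize_name(class_name):
    """Lower-case a class name and collapse separators so 'Jane_Doe' matches 'jane doe'."""
    return re.sub(r"[^a-z0-9]+", " ", class_name.lower()).strip()

def find_duplicates(classes_a, classes_b):
    """Return pairs of class names (a, b) that normalize to the same identity."""
    index = {normalize_name(name): name for name in classes_b}
    return sorted(
        (name, index[normalize_name(name)])
        for name in classes_a
        if normalize_name(name) in index
    )

# Made-up class names standing in for two dataset directory listings.
evaluation_classes = ["Jane_Doe", "John_Smith"]
training_classes = ["jane doe", "Mary-Major"]
overlap = find_duplicates(evaluation_classes, training_classes)
```

The resulting pairs can then be merged into one class or dropped from the training set before passing the directories to --data_dir.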

compare 2 images

Do you have a similar utility to compare two jpeg face images and determine whether both are the same person or not, like compare.py in openface?

Thanks,

Deep funneling in FaceNet

Hello,
My question is about deep funneling in FaceNet. We've achieved high accuracy (about 92%) on deep-funneled LFW, but was the model "model-20160506.ckpt-500000" trained on deep-funneled FaceScrub and CASIA-Webface images? If so, could somebody share a link to an implementation of the deep-funneling algorithm, so we can apply it to the result from Viola-Jones or DLib before the FaceNet CNN?
Or, if it isn't necessary, could somebody please explain why?
I know this issue is not directly related to this FaceNet implementation, but I hope you will understand my curiosity)
Waiting for your answer.

Need some help to fully re-implement the FaceNet paper.

Hey David,
Thanks for your code, which, combined with the openface code, let me investigate.
I've forked your facenet and have been working on my own for about 2-3 weeks. I've followed the torch code from openface and seem to have re-implemented their training strategy, especially the one in the image below.

I see your code has some bugs and a strategy that seems odd to me.

The batch norm code you borrowed from Stack Overflow is wrong; refer to @jrocks' answer on the question for details. link
I see you want to use 45*40 images to generate triplets, but, limited by GPU memory, you don't follow the openface triplet selection strategy and instead use the first image of each person as the anchor. This doesn't seem very reasonable, right? I mean, the anchor doesn't have to stay fixed.
This is the most serious problem I'm facing. You split the selected triplets into mini-batches (say 90) and then run the backward pass one mini-batch at a time. What I have done in my code is expand the triplets and accumulate the gradient coefficients just like openface, then calculate the gradients for all variables by feeding 20 mini-batches (20*90 = 45*40), accumulate these gradients, and apply them once. This procedure is extended from openface, since they use 15*20 images and their torch SDK seems to have no API to get gradients. But for now my code isn't working: the training doesn't reach even 80% accuracy on LFW. I don't know why that is; could you please help me out?
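
The accumulate-then-apply-once idea described above can be illustrated on a toy problem. The sketch below fits a single weight of y = weight * x by averaging the gradients of several mini-batches before one update. It is not the facenet or openface code, and all names are illustrative.

```python
def gradient(weight, batch):
    """Gradient of the mean squared error of y = weight * x over one mini-batch."""
    return sum(2.0 * (weight * x - y) * x for x, y in batch) / len(batch)

def accumulated_step(weight, batches, learning_rate):
    """Average the gradients of all mini-batches, then apply a single update."""
    total = 0.0
    for batch in batches:
        total += gradient(weight, batch)
    return weight - learning_rate * total / len(batches)

data = [(float(x), 3.0 * x) for x in range(1, 9)]           # samples of y = 3x
batches = [data[i:i + 2] for i in range(0, len(data), 2)]   # 4 mini-batches of 2
weight = 0.0
for _ in range(200):
    weight = accumulated_step(weight, batches, learning_rate=0.01)
```

With equally sized mini-batches the averaged gradient equals the gradient of the whole batch, so the update matches what one large batch would produce while only one mini-batch needs to fit in memory at a time.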

You may check the code in the optimize branch of my fork. Sorry that I've heavily modified the file structure and Python coding style.

Please contact me if you have any ideas. I think we could work together to use both the 45*40 batch size and non-compact triplets.

Aligning facescrub data-set for training

Do we have to use align_dataset.py for facescrub? The dataset has folders with pictures of different people, but it also has a folder with cropped faces. If we are supposed to use only the cropped faces, how can I feed the images to facenet? And if I have to use align_dataset.py, does it drop the folder with the cropped faces?

Add a test phase

Periodically calculate and log triplet loss for the test dataset
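
For reference, a minimal sketch of the quantity to log: the mean triplet loss max(||a - p||^2 - ||a - n||^2 + alpha, 0) over a set of held-out triplets. The 2-D embeddings and the margin alpha=0.2 are illustrative values, and the helper names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def triplet_loss(anchor, positive, negative, alpha=0.2):
    """max(||a - p||^2 - ||a - n||^2 + alpha, 0) for one triplet."""
    margin = squared_distance(anchor, positive) - squared_distance(anchor, negative) + alpha
    return max(margin, 0.0)

def mean_triplet_loss(triplets, alpha=0.2):
    """Average the triplet loss over (anchor, positive, negative) tuples."""
    return sum(triplet_loss(a, p, n, alpha) for a, p, n in triplets) / len(triplets)

test_triplets = [
    ([0.0, 0.0], [0.0, 1.0], [3.0, 0.0]),  # negative is far away: contributes 0
    ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0]),  # margin violated: contributes alpha
]
test_loss = mean_triplet_loss(test_triplets)
```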

Start to use AdaDelta as optimizer

Start to use AdaDelta as optimizer when it's available in tensorflow
tensorflow/tensorflow#644

load_model failing when running validate_on_lfw

I'm trying to run your pre-trained model as linked from the wiki, but it seems that there's an inconsistency between the ckpt file and the meta file. The initial error I get when I try loading the model is

not found: Tensor name "conv1_7x7_1/batch_norm/batch_norm/beta/ExponentialMovingAverage_1" not found in checkpoint files /orpix/research/facenet/facenet/src/models/model.ckpt-500000

Can you let me know if I'm doing something wrong here? thanks

Visualize learnt features

Visualize the filters in the first convolutional layer.
Use conv2d_transpose to visualize higher layer features as described in "Visualizing and Understanding Convolutional Networks" (https://www.cs.nyu.edu/~fergus/papers/zeilerECCV2014.pdf)

Improve exporting of pre-trained models

Importing of pre-trained models using import_meta_graph(...) has been implemented here.
However, this code seems to initialize the graph with the instantaneous parameter values and not the EMA filtered ones, and this degrades LFW performance a bit.
Figure out how to import a graph and initialize it with filtered parameter values.
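
To illustrate why the two sets of values differ, here is a minimal sketch of an exponential moving average (EMA) tracking a parameter that oscillates during training. The numbers and the `update_ema` helper are made up for illustration and do not come from the model.

```python
def update_ema(shadow, value, decay):
    """One exponential-moving-average step: decay * shadow + (1 - decay) * value."""
    return decay * shadow + (1.0 - decay) * value

weights = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]   # a parameter that oscillates during training
shadow = weights[0]
for value in weights[1:]:
    shadow = update_ema(shadow, value, decay=0.9)

instantaneous = weights[-1]   # the value at the last training step
filtered = shadow             # the smoothed value an EMA-based evaluation would use
```

The filtered value stays close to the running level of the parameter, while the instantaneous value depends on where the last step happened to land.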

Error using the model provided in readme

I tried using the model shared in the readme via the Google Drive link, but encountered an error when verifying the model with the LFW dataset.

tensorflow.python.framework.errors.NotFoundError: Tensor name "incept5b/in4_conv1x1_55/weights/ExponentialMovingAverage" not found in checkpoint files model/20160306-500000/model.ckpt

I ran the verify command as python ./src/validate_on_lfw.py --model_dir model/20160306-500000/ --lfw_pairs data/pairs.txt --lfw_dir data/lfw\:dlib-affine-sz\:96/

Add the possibility to run with a fixed seed

Add the possibility to run with a fixed seed to simplify troubleshooting
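
A minimal sketch of seeded sampling with Python's standard library. The helper `sample_classes` and the seed value are illustrative. In a real training script the NumPy and TensorFlow random generators would need to be seeded as well.

```python
import random

def sample_classes(class_names, nrof_classes, seed=None):
    """Sample class names; the same seed always yields the same selection."""
    rng = random.Random(seed)   # local generator, leaves the global random state alone
    return rng.sample(class_names, nrof_classes)

names = ["person_%02d" % i for i in range(20)]
first = sample_classes(names, 5, seed=666)
second = sample_classes(names, 5, seed=666)
```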

How to improve accuracy

Hi, David,
As the FaceNet paper describes, the accuracy can reach over 99%. My question is how to train to get this accuracy. Is there any pre-trained model?

about the result

Hi!
I have used a small dataset, which has 600 different people, to train the network.
I have not changed any of the code or parameters you provided.
I tested the result after 150 epochs on LFW and the accuracy is only 0.66.
I used the code from OpenFace to do face alignment and I have not converted the images to BGR.
Instead, I feed the network with RGB images. (Have you done the same thing?)
Do you think the accuracy is all right? Do you have any advice for getting good performance?
Thank you for sharing the code.

The align function may shear the face image

H = cv2.getAffineTransform(npLandmarks[npLandmarkIndices],
                           imgDim * MINMAX_TEMPLATE[npLandmarkIndices]*scale + imgDim*(1-scale)/2)
thumbnail = cv2.warpAffine(rgbImg, H, (imgDim, imgDim))

I got the following result:

Why not use least squares?
Y = MX
M = (Y X^T)(X X^T)^{-1}
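
For reference, the closed form follows from minimizing the squared residual. A short derivation, assuming (this is not stated in the issue) that X holds the detected landmarks as columns, Y holds the corresponding template points, and X X^T is invertible:

```latex
\begin{aligned}
E(M) &= \lVert Y - MX \rVert_F^2 \\
\frac{\partial E}{\partial M} &= -2\,(Y - MX)\,X^{\mathsf T} = 0 \\
M\,XX^{\mathsf T} &= Y X^{\mathsf T} \\
M &= (Y X^{\mathsf T})(X X^{\mathsf T})^{-1}
\end{aligned}
```

Unlike an exact affine fit through three points, this uses all landmarks at once, so a single badly detected landmark has less influence on the result.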

Add regularization

Need to add dropout and L2 weight penalty

Train on the Ms-Celeb-V1 dataset

The dataset is available here.
A python program to decode the face thumbnails can be found here.

Check invariance

Check invariance with respect to scale, translation and rotation

error in training

Epoch: [0][999/1000] Time 0.223 Loss 0.164
Epoch: [0][1000/1000] Time 0.226 Loss 0.185
Runnning forward pass on LFW images
Traceback (most recent call last):
File "facenet_train.py", line 298, in <module>
main(parse_arguments(sys.argv[1:]))
File "facenet_train.py", line 134, in main
actual_issame, args.seed, 60, images_placeholder, phase_train_placeholder, embeddings, nrof_folds=args.lfw_nrof_folds)
File "/home/dl1/LXX/facenet-master_new/src/lfw.py", line 33, in validate
np.asarray(actual_issame), seed, nrof_folds=nrof_folds)
File "/home/dl1/LXX/facenet-master_new/src/facenet.py", line 428, in calculate_roc
folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=True, random_state=seed)
File "/home/dl1/anaconda/envs/tensorflow/lib/python2.7/site-packages/sklearn/cross_validation.py", line 326, in __init__
super(KFold, self).__init__(n, n_folds, shuffle, random_state)
File "/home/dl1/anaconda/envs/tensorflow/lib/python2.7/site-packages/sklearn/cross_validation.py", line 257, in __init__
" than the number of samples: {1}.").format(n_folds, n))
ValueError: Cannot have number of folds n_folds=10 greater than the number of samples: 0.
(tensorflow)dl1@dl1-B85M-DS3H:~/LXX/facenet-master_new/src$

train models.inception_resnet_v2

Hi,
when I run facenet_train.py to train inception_resnet_v2, I get this error:
Traceback (most recent call last):
File "facenet_train.py", line 339, in <module>
main(parse_arguments(sys.argv[1:]))
File "facenet_train.py", line 70, in main
phase_train=phase_train_placeholder, weight_decay=args.weight_decay)
File "/home/mqq/victorydong/source/face2/facenet/facenet/src/models/inception_resnet_v1.py", line 135, in inference
dropout_keep_prob=keep_probability)
File "/home/mqq/victorydong/source/face2/facenet/facenet/src/models/inception_resnet_v1.py", line 157, in inception_resnet_v1
with tf.variable_scope(scope, 'InceptionResnetV1', [inputs], reuse=reuse):
File "/home/mqq/anaconda2/lib/python2.7/contextlib.py", line 84, in helper
return GeneratorContextManager(func(*args, **kwds))
TypeError: variable_scope() got multiple values for keyword argument 'reuse'
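
One way this TypeError can arise (an assumption about the cause; the issue does not confirm it) is a mismatch between the signature the model code was written for and the one the installed library provides. The sketch below reproduces the mechanism in plain Python with a hypothetical stand-in function. `old_variable_scope` is not the TensorFlow API.

```python
def old_variable_scope(name_or_scope, reuse=None, initializer=None):
    """Hypothetical stand-in for an API whose second positional parameter is `reuse`."""
    return name_or_scope, reuse, initializer

def call_with_newer_convention(scope, reuse):
    # Written for a signature like (name_or_scope, default_name, values, reuse=...):
    # here the second positional argument lands in `reuse`, then reuse= is passed again.
    return old_variable_scope(scope, 'InceptionResnetV1', ['inputs'], reuse=reuse)

try:
    call_with_newer_convention(None, reuse=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

If this is the cause, checking which TensorFlow version the model file targets would be the first thing to try.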

Documentation for training models from scratch

I would like to help with several of the existing issues, including #19, #23, #22, #2, and many other future issues. One thing that will help me (and possibly others) speed up is a step-by-step documentation for training from scratch.

accuracy on provided model vs facenet paper reported accuracy

You report accuracy of 91% on the LFW data. The facenet paper reports 99+. Do you have any insight as to what they did differently vs your trained model? Just curious, thanks

problem in alignment

When I run alignment, I encounter the following problem:

warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Traceback (most recent call last):
  File "align_dataset.py", line 113, in <module>
    main(parse_arguments(sys.argv[1:]))
  File "align_dataset.py", line 42, in main
    img = misc.imread(image_path)
AttributeError: 'module' object has no attribute 'imread'

So what's the reason for the problem?

classify images

OpenFace has a way of classifying images to get a prediction based on the target labels. Is that implemented in facenet? From what I can see facenet does the embedding, but hasn't implemented training for image -> embedding -> label. If it isn't implemented, what would be the best way to do that?
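
One simple option, sketched here under the assumption that embeddings of labeled reference images are already available, is nearest-neighbor matching in embedding space. The 2-D vectors, the labels and the threshold are made up for illustration, and `classify` is a hypothetical helper, not part of facenet.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(embedding, gallery, threshold):
    """Return the label of the closest gallery embedding, or None if it is farther than threshold."""
    label, reference = min(gallery, key=lambda item: squared_distance(embedding, item[1]))
    if squared_distance(embedding, reference) > threshold:
        return None
    return label

gallery = [("alice", [0.0, 1.0]), ("bob", [1.0, 0.0])]   # made-up 2-D embeddings
known = classify([0.1, 0.9], gallery, threshold=0.5)
unknown = classify([5.0, 5.0], gallery, threshold=0.5)
```

Because the gallery is only a list of embeddings, new people can be added without retraining the network. A classifier trained on the embeddings is another common choice.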

classification using facenet

I've tried to read your code, but I didn't understand how to classify an image.
Say you want to classify one image with this network. Is there any simple way to feed the network with that picture and retrieve the class of that image?

Restoring pre-trained model error

Hi! I'm trying to use pre-trained model, but I get the following error during restoring:

NotFoundError: Tensor name "incept4a/in1_conv1x1_75/batch_norm/cond/incept4a/in1_conv1x1_75/batch_norm/moments/moments_1/mean/ExponentialMovingAverage" not found in checkpoint files model-20160506.ckpt-500000

Can you help me with this?

76.5% Accuracy on LFW

Hello,
I have low accuracy on LFW using this FaceNet implementation. I'm using "All images as gzipped tar file" - the first offered dataset from list of downloadable datasets on http://vis-www.cs.umass.edu/lfw/
Photo attached:

Am I doing something wrong? Which dataset should I use to check accuracy and get 0.919+-0.008?
It would be great if you could help me with that issue.
Thank you.

Try training with the VGG Face dataset

Links to the images can be found at
http://www.robots.ox.ac.uk/~vgg/data/vgg_face/
A script to download the images can be found at
https://github.com/davidsandberg/facenet/blob/master/facenet/src/download_vgg_face_dataset.py

how much time does it take you to train a model with 0.919 accuracy on LFW?

Try face alignment using FaceX

With Dlib alignment, some of the faces in e.g. the LFW dataset go undetected. This can potentially filter out some of the semi-hard examples from the training dataset and degrade performance.
Check if the performance can be improved by using FaceX face alignment.

embedding new faces

Is it possible to embed and classify new faces that were not used during training?

Fix remaining issues with python/tensorflow implementation of MTCNN

MTCNN has been implemented using python/tensorflow and can be found here. However this implementation gives slightly different results compared to the matlab/caffe implementation from the authors.

In the matlab/caffe implementation a call to imResample is used to down-sample image patches. For the python implementation, different implementations have been attempted, and OpenCV's
cv2.resize(img, (sz[1], sz[0]), interpolation=cv2.INTER_AREA) seems to perform best. But the result is not identical and this seems to impact performance quite a bit.
The probability scores for the bounding box hypotheses differ between the two implementations. This is probably because the convolutions are performed slightly differently.

Bad train/test split

In https://github.com/davidsandberg/facenet/blob/master/facenet/src/facenet.py#L540
You have split based on classes. I think it should be split within each class instead.
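
The two strategies can be contrasted with a small sketch. The dataset is a made-up mapping from class name to image names, and both helpers are illustrative rather than the code at the link above.

```python
def split_by_class(dataset, train_fraction):
    """Hold out whole classes: test identities never appear in training."""
    names = sorted(dataset)
    cut = int(len(names) * train_fraction)
    train = {name: dataset[name] for name in names[:cut]}
    test = {name: dataset[name] for name in names[cut:]}
    return train, test

def split_within_class(dataset, train_fraction):
    """Hold out images of every class: each identity appears in both sets."""
    train, test = {}, {}
    for name, images in dataset.items():
        cut = int(len(images) * train_fraction)
        train[name], test[name] = images[:cut], images[cut:]
    return train, test

dataset = {"a": ["a1", "a2", "a3", "a4"], "b": ["b1", "b2", "b3", "b4"]}
by_class = split_by_class(dataset, 0.5)
within_class = split_within_class(dataset, 0.5)
```

Which split is appropriate depends on what the test set should measure: splitting by class checks generalization to unseen identities, while splitting within each class checks generalization to new images of known identities.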

Should we add collection for triplet_loss?

I found a collection "losses" for cross_entropy_mean in facenet_train.py.

But in the function _add_loss_summaries we always use the collection "losses".
Should we add a collection for triplet_loss?

Improve face alignment using a better align-dlib.py

Per this new discussion on OpenFace, I think your models would benefit from a better dlib aligner. This is a drop-in replacement for the existing align-dlib.py. Check out the new align-dlib.py.zip

I think I found a spelling mistake

Hi, David

I think I found a spelling mistake on line 66 in facenet_train.py

# Placeholder for the learning rate
learning_rate_placeholder = tf.placeholder(tf.float32, name='learing_rate')

'learing_rate' should be 'learning_rate' ?

Is there any .meta file for the pre-trained model-20160506.ckpt-500000?

Hi David:
I am trying to run "validate_on_lfw.py". However, when using your provided model "model-20160506.ckpt-500000", I cannot find the corresponding ".meta" file. This results in the "'utf8' codec can't decode" error when loading the model file.

Best,
GQ