davidsandberg / facenet
Face recognition using Tensorflow
License: MIT License
Hi, I ran into the following problem when I tried to run facenet_train.py.
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:925] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX 750 Ti
major: 5 minor: 0 memoryClockRate (GHz) 1.0845
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 1.82GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 750 Ti, pci bus id: 0000:01:00.0)
E tensorflow/stream_executor/cuda/cuda_driver.cc:965] failed to allocate 2.00G (2146762752 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
Loading training data
Loaded 1800 images in 1.82 seconds
Selecting suitable triplets for training
E tensorflow/stream_executor/cuda/cuda_dnn.cc:354] could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
E tensorflow/stream_executor/cuda/cuda_dnn.cc:321] could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
F tensorflow/core/kernels/conv_ops.cc:457] Check failed: stream->parent()->GetConvolveAlgorithms(&algorithms)
Aborted (core dumped)
I'm sure I have successfully installed TensorFlow 0.10, CUDA 7.5 and cuDNN v5 (for CUDA 7.5).
How can I solve this problem?
The implemented L2 pooling gives NaN in the gradient sometimes
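One plausible source of the NaN (an assumption, not confirmed in this issue) is the square root inside L2 pooling: the derivative of sqrt(x) is 1/(2*sqrt(x)), which is unbounded at x = 0, so a pooling window of all-zero activations can produce 0/0 during backpropagation. A minimal pure-Python sketch of the effect and of the common epsilon workaround follows; the function name and the `eps` value are illustrative and not taken from facenet.py.

```python
import math

def l2_pool_grad(window, eps=0.0):
    """Gradient of sqrt(sum(x_i^2) + eps) with respect to each x_i.

    Analytically this is x_i / sqrt(sum(x_j^2) + eps). With eps == 0 and an
    all-zero window the expression is 0/0, which is where a NaN can appear.
    """
    norm = math.sqrt(sum(x * x for x in window) + eps)
    grads = []
    for x in window:
        if norm == 0.0:
            # Mimic IEEE float behaviour (0/0 -> NaN) instead of letting
            # plain Python raise ZeroDivisionError.
            grads.append(float("nan"))
        else:
            grads.append(x / norm)
    return grads

zero_window = [0.0, 0.0, 0.0, 0.0]
unsafe = l2_pool_grad(zero_window)          # eps = 0: gradient is undefined
safe = l2_pool_grad(zero_window, eps=1e-8)  # small eps keeps the gradient finite
```

Adding a small constant under the square root trades a tiny bias in the pooled value for a finite gradient everywhere.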
As you mentioned in the README, I use the “Facescrub” database. Unfortunately, when I ran your code the tripERR kept swinging between different values (0.61, 0.48, 0.39, 0.44, 0.72, ...).
I changed some parameters of your code such as “learning_rate”, “alpha” and “moving_average_decay”, but the result did not change much.
May I ask why the tripERR is swinging? Could you help me fix this problem, please?
This is my invocation
$ python compare.py --model_dir ~/facenet/ --dlib_face_predictor ~/dlib-18.18/shape_predictor_68_face_landmarks.dat --image1 ~/s1 --image2 ~/s2
The directory (~/facenet) contains the prebuilt checkpoint file model-20160506.ckpt-500000. Upon running this, I see:
Traceback (most recent call last):
File "compare.py", line 73, in <module>
main()
File "compare.py", line 53, in main
raise ValueError('Checkpoint not found')
ValueError: Checkpoint not found
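A possible explanation, assuming compare.py locates the model with tf.train.get_checkpoint_state: that call reads a small text state file named `checkpoint` inside the directory, so a directory holding only the downloaded model file would be reported as having no checkpoint. The stdlib-only sketch below inspects a model directory for that state file; `describe_model_dir` is a hypothetical helper, not part of facenet.

```python
import os

def describe_model_dir(model_dir):
    """Report whether model_dir has a 'checkpoint' state file and list model files."""
    model_dir = os.path.expanduser(model_dir)
    exists = os.path.isdir(model_dir)
    names = sorted(os.listdir(model_dir)) if exists else []
    return {
        "dir_exists": exists,
        # The state file TensorFlow looks for by default is literally 'checkpoint'.
        "has_state_file": "checkpoint" in names,
        # Files that look like saved model data, e.g. model-20160506.ckpt-500000.
        "checkpoint_files": [n for n in names if ".ckpt" in n],
    }
```

Calling `describe_model_dir("~/facenet")` with `has_state_file` false and a non-empty `checkpoint_files` list would be consistent with the error above.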
Add random translation and horizontal flipping of images
I used the facescrub dataset and aligned it with dlib before running facenet_train.py.
After the 1st epoch of training, the following error occurs during validation. Do you know what is wrong?
Traceback (most recent call last):
File "./facenet_train.py", line 274, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "./facenet_train.py", line 144, in main
global_step, embeddings, loss, 'validation', summary_writer)
File "./facenet_train.py", line 212, in validate
image_paths, num_per_class = facenet.sample_people(dataset, nrof_people, FLAGS.images_per_person)
File "/mnt/2TB/src/facenet/facenet/src/facenet.py", line 606, in sample_people
class_index = class_indices[i]
IndexError: index 52 is out of bounds for axis 0 with size 52
Hi David,
When I run facenet_train.py, I get the following error:
Runnning forward pass on LFW images
Traceback (most recent call last):
File "facenet_train.py", line 340, in <module>
main(parse_arguments(sys.argv[1:]))
File "facenet_train.py", line 126, in main
images_placeholder, phase_train_placeholder, embeddings, nrof_folds=args.lfw_nrof_folds)
File "/home/dy/tensorflow/facenet-master/facenet/src/lfw.py", line 26, in validate
emb_array = np.vstack(emb_list) # Stack the embeddings to a nrof_examples_per_epoch x 128 matrix
File "/usr/local/lib/python2.7/dist-packages/numpy/core/shape_base.py", line 230, in vstack
return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: need at least one array to concatenate
What is the meaning of the lfw_dir parameter?
I noticed that --lfw_dir means "Path to the data directory containing aligned face patches."
But I do not understand it, and the default value '~/datasets/lfw/lfw_realigned/' does not exist, so lfw.get_paths() finds no paths, which leads to the validation error.
Waiting for your answer.
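Since an empty list of image paths only surfaces later as "need at least one array to concatenate", a quick sanity check on the directory before training can make the cause obvious. This is a stdlib-only sketch; `count_images` and the extension list are assumptions for illustration and are not part of lfw.py.

```python
import os

def count_images(lfw_dir, extensions=(".png", ".jpg", ".jpeg")):
    """Count image files under lfw_dir, failing early if the directory is missing."""
    lfw_dir = os.path.expanduser(lfw_dir)
    if not os.path.isdir(lfw_dir):
        raise IOError("LFW directory does not exist: %s" % lfw_dir)
    count = 0
    for _root, _dirs, files in os.walk(lfw_dir):
        # str.endswith accepts a tuple, so one call covers every extension.
        count += sum(1 for name in files if name.lower().endswith(extensions))
    return count
```

A result of zero, or an IOError, means --lfw_dir does not point at a directory of aligned face images.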
Hi, I used the model you provided and ran validate_on_lfw.py directly, but the result is not as good as yours. Is there something wrong? Thank you.
Hi,
I modified the facenet_train.py file to load the pretrained model that you provide but it seems to me there is some problem with the file, since whenever I try to load it using,
ckpt = tf.train.get_checkpoint_state(model_dir, latest_filename='model-20160506.ckpt-500000')
I get the following error:
Traceback (most recent call last):
File "/media/esb172/Hard_Disk_2/facenet_data/tightface/facenet-tf/facenet/src/facenet_train.py", line 268, in <module>
main(parse_arguments(sys.argv[1:]))
File "/media/esb172/Hard_Disk_2/facenet_data/tightface/facenet-tf/facenet/src/facenet_train.py", line 98, in main
ckpt = tf.train.get_checkpoint_state(model_dir, latest_filename='model-20160506.ckpt-500000')
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 662, in get_checkpoint_state
coord_checkpoint_filename).decode("utf-8")
File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 2: invalid start byte
Process finished with exit code 1
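A likely cause, stated as an assumption: the `latest_filename` argument of tf.train.get_checkpoint_state names the small text state file (by default `checkpoint`), not the binary model file. Passing the model file there would make TensorFlow try to decode binary data as UTF-8, which matches the error above. The sketch below writes a state file by hand using the `model_checkpoint_path` field that TensorFlow stores in that file; `write_checkpoint_state` is a hypothetical helper.

```python
import os

def write_checkpoint_state(model_dir, checkpoint_name):
    """Write a minimal 'checkpoint' state file pointing at checkpoint_name."""
    model_dir = os.path.expanduser(model_dir)
    state_path = os.path.join(model_dir, "checkpoint")
    with open(state_path, "w") as f:
        # Text format of the checkpoint state: one quoted path per field.
        f.write('model_checkpoint_path: "%s"\n' % checkpoint_name)
        f.write('all_model_checkpoint_paths: "%s"\n' % checkpoint_name)
    return state_path
```

With such a file in place, calling tf.train.get_checkpoint_state(model_dir) without `latest_filename` should be able to resolve model-20160506.ckpt-500000, provided the model file itself is compatible with the installed TensorFlow version.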
Implement the possibility to resume training by restoring a checkpoint file
Train a model with dropout and check if the learnt features become more sparse
Do you plan on making a pure GPU implementation of facenet?
I am trying to train from scratch, but when I start training I get the following error:
Traceback (most recent call last):
File "facenet_train.py", line 257, in <module>
main(parse_arguments(sys.argv[1:]))
File "facenet_train.py", line 47, in main
pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
File "/home/kalvik/facenet/facenet/src/lfw.py", line 64, in read_pairs
with open(pairs_filename, 'r') as f:
IOError: [Errno 2] No such file or directory: '../data/pairs.txt'
In order to reproduce the LFW validation accuracy you posted on the Wiki using the Facescrub and Casia-WebFace datasets for training, I checked for overlap and found that 84+ persons (with more than two images per class) exist in both LFW and the combined Facescrub and Casia-WebFace datasets. Can you provide more details on how you deal with these duplicates?
Based on how the datasets can be specified for the --data_dir argument, it seems that if the duplicates are not merged together in advance, the same person will be treated as different classes.
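One way to locate such duplicates before training is to compare identity names across the datasets. The sketch below assumes that each identity is a directory name (for example the output of os.listdir on each dataset root) and that names differ only in case and separators; both are assumptions, and the helper names are hypothetical.

```python
def normalize_identity(name):
    """Lower-case and unify separators so 'Jane_Doe' matches 'jane doe'."""
    return " ".join(name.replace("_", " ").replace("-", " ").lower().split())

def overlapping_identities(train_names, lfw_names):
    """Return the training identities whose normalized name also occurs in LFW."""
    lfw_keys = {normalize_identity(n) for n in lfw_names}
    return sorted(n for n in train_names if normalize_identity(n) in lfw_keys)

# Hypothetical identity names, for illustration only.
train = ["Jane Doe", "john-smith", "Only In Training"]
lfw = ["Jane_Doe", "John_Smith", "Only_In_LFW"]
overlap = overlapping_identities(train, lfw)
```

The overlapping identities can then be dropped from the training set, or merged into a single class when they appear in several training datasets. Name matching will miss people whose names are spelled differently across datasets.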
Do you have a similar utility to compare two jpeg face images and determine whether both are the same person or not, like compare.py in openface?
Thanks,
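For reference, a comparison of two faces usually reduces to a distance between their embeddings. The sketch below assumes the two embeddings have already been computed by the network and uses squared Euclidean distance on L2-normalized vectors; the threshold of 1.0 is a placeholder that would have to be tuned on validation pairs, and the function names are hypothetical.

```python
import math

def l2_normalize(vector):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0.0 else [x / norm for x in vector]

def same_person(emb1, emb2, threshold=1.0):
    """Return (decision, squared_distance) for two embedding vectors."""
    a, b = l2_normalize(emb1), l2_normalize(emb2)
    dist_sq = sum((x - y) ** 2 for x, y in zip(a, b))
    # Smaller distance means more similar; threshold is an assumed value.
    return dist_sq < threshold, dist_sq
```

For unit-length embeddings the squared distance lies between 0 (identical direction) and 4 (opposite direction).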
Hello,
My question is about deep funneling in FaceNet. We've achieved high accuracy (about 92%) on deep-funneled LFW, but was the model "model-20160506.ckpt-500000" trained on deep-funneled FaceScrub and CASIA-Webface images? If so, could somebody share a link to an implementation of the deep-funneling algorithm, so that it can be applied to the output of Viola-Jones or DLib before the FaceNet CNN?
Or, if it isn't necessary, could somebody please explain why?
I know this issue is not directly related to this FaceNet implementation, but I hope you will understand my curiosity.
Waiting for your answer.
Hey David,
Thanks for your code, which, together with the openface code, lets me investigate.
I've forked your facenet and have been working on my own version for about 2-3 weeks. I've followed the torch code from openface and seem to have re-implemented their training strategy, especially the one in the image below.
I see that your code has some bugs and a strategy that looks odd to me.
[...] anchor, this does not seem very reasonable, right? I mean anchor doesn't mean not changed.
[...] openface and then calculate the gradients for all variables by feeding 20 mini-batches (20*90=45*40), then accumulate these gradients, then apply them once. This procedure is extended from openface since they use 15*20 images, and their torch SDK seems to have no API to get gradients. But for now my code isn't working, meaning the training doesn't seem to reach even 80% accuracy on LFW, and I don't know why. Could you please help me out? You may check the code in the optimize branch of my fork. Sorry that I've heavily modified the file structure and Python coding style.
Please contact me if you have any ideas. I think we could work together to use both the 45*40 batch size and non-compact triplets.
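The accumulate-then-apply procedure described above can be illustrated on a toy problem. The sketch below uses a one-parameter linear model with a mean squared error loss rather than the triplet loss, so it only shows the arithmetic: size-weighted gradients from several mini-batches add up to the gradient of the combined batch. All names are hypothetical and nothing here is taken from the fork.

```python
def mse_gradient(w, batch):
    """Gradient of mean((w*x - y)^2) over the batch with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_gradient(w, batches):
    """Accumulate per-batch gradients, weighted by batch size, into one gradient."""
    total = sum(len(b) for b in batches)
    acc = 0.0
    for b in batches:
        acc += mse_gradient(w, b) * len(b)
    return acc / total

def sgd_step(w, grad, learning_rate):
    """Apply a single gradient-descent update."""
    return w - learning_rate * grad

data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8), (6.0, 12.1)]
batches = [data[0:2], data[2:4], data[4:6]]
w = 0.5
grad_accumulated = accumulated_gradient(w, batches)
grad_full_batch = mse_gradient(w, data)
w_next = sgd_step(w, grad_accumulated, learning_rate=0.01)
```

With a triplet loss the equivalence only holds if the triplets are the same in both cases; selecting triplets separately inside each mini-batch changes which examples contribute to the loss.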
Do we have to use align_dataset.py for facescrub? The dataset has folders with pictures of different people, but it also has a folder with cropped faces. If we are supposed to use only the cropped faces, how can I feed the images to facenet? And if I have to use align_dataset.py, does it drop the folder with the cropped faces?
Periodically calculate and log triplet loss for the test dataset
Start to use AdaDelta as optimizer when it's available in tensorflow
tensorflow/tensorflow#644
I'm trying to run your pre-trained model as linked from the wiki, but there seems to be an inconsistency between the ckpt file and the meta file. The initial error I get when I try loading the model is
not found: Tensor name "conv1_7x7_1/batch_norm/batch_norm/beta/ExponentialMovingAverage_1" not found in checkpoint files /orpix/research/facenet/facenet/src/models/model.ckpt-500000
Can you let me know if I'm doing something wrong here? Thanks.
Visualize the filters in the first convolutional layer.
Use conv2d_transpose to visualize higher layer features as described in "Visualizing and Understanding Convolutional Networks" (https://www.cs.nyu.edu/~fergus/papers/zeilerECCV2014.pdf)
Importing of pre-trained models using import_meta_graph(...) has been implemented here.
However, this code seems to initialize the graph with the instantaneous parameter values and not the EMA filtered ones, and this degrades LFW performance a bit.
Figure out how to import a graph and initialize it with filtered parameter values.
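To make the difference between the two sets of values concrete, the following pure-Python sketch applies the exponential moving average update, shadow = decay * shadow + (1 - decay) * value, to a short sequence of parameter values. The numbers and the decay are made up for illustration.

```python
def ema_update(shadow, value, decay):
    """One exponential-moving-average step."""
    return decay * shadow + (1.0 - decay) * value

def filtered_value(values, decay=0.9):
    """Run the EMA over a sequence, starting the shadow at the first value."""
    shadow = values[0]
    for value in values[1:]:
        shadow = ema_update(shadow, value, decay)
    return shadow

# A noisy parameter trajectory: the last (instantaneous) value is an outlier.
trajectory = [1.0, 1.1, 0.9, 1.0, 1.6]
instantaneous = trajectory[-1]
filtered = filtered_value(trajectory)
```

Here `filtered` stays close to 1.0 while `instantaneous` is 1.6, which is the kind of gap that could explain a small accuracy drop when the unfiltered values are loaded. In TensorFlow versions that provide it, tf.train.ExponentialMovingAverage.variables_to_restore() is intended for building a Saver that loads the filtered values; whether it fits the import_meta_graph workflow is not something this sketch establishes.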
I tried using the model shared in the README via the Google Drive link, but encountered an error when verifying the model on the LFW dataset.
tensorflow.python.framework.errors.NotFoundError: Tensor name "incept5b/in4_conv1x1_55/weights/ExponentialMovingAverage" not found in checkpoint files model/20160306-500000/model.ckpt
I ran the verify command as python ./src/validate_on_lfw.py --model_dir model/20160306-500000/ --lfw_pairs data/pairs.txt --lfw_dir data/lfw\:dlib-affine-sz\:96/
Add the possibility to run with a fixed seed to simplify troubleshooting
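As a small illustration of what a fixed seed buys, the sketch below draws the same "random" sample twice from independently seeded generators. It uses only the standard library; in the training script the NumPy and TensorFlow generators would need seeding as well (np.random.seed and tf.set_random_seed exist for that), and the helper names here are hypothetical.

```python
import random

def make_rng(seed=None):
    """Return a private random generator; seed=None keeps runs non-deterministic."""
    return random.Random(seed)

def sample_indices(rng, population_size, sample_size):
    """Draw sample_size distinct indices from range(population_size)."""
    return rng.sample(range(population_size), sample_size)

first_run = sample_indices(make_rng(666), 1000, 5)
second_run = sample_indices(make_rng(666), 1000, 5)
```

Both runs select the same indices, so a failure seen once can be reproduced exactly.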
Hi David,
The FaceNet paper reports an accuracy of over 99%. My question is how to train on the data to reach this accuracy. Is there a pre-trained model?
Hi!
I have used a small dataset, which has 600 different people, to train the network.
I have not changed any of the code or parameters you provided.
I tested the result on LFW after 150 epochs and the accuracy is only 0.66.
I used the code from OpenFace to do face alignment and I have not converted the images to BGR.
Instead, I feed the network with RGB images. (Have you done the same thing?)
Do you think the accuracy is alright? Do you have any advice for getting good performance?
Thank you for sharing the code.
Need to add dropout and L2 weight penalty
Check invariance with respect to scale, translation and rotation
Epoch: [0][999/1000] Time 0.223 Loss 0.164
Epoch: [0][1000/1000] Time 0.226 Loss 0.185
Runnning forward pass on LFW images
Traceback (most recent call last):
File "facenet_train.py", line 298, in <module>
main(parse_arguments(sys.argv[1:]))
File "facenet_train.py", line 134, in main
actual_issame, args.seed, 60, images_placeholder, phase_train_placeholder, embeddings, nrof_folds=args.lfw_nrof_folds)
File "/home/dl1/LXX/facenet-master_new/src/lfw.py", line 33, in validate
np.asarray(actual_issame), seed, nrof_folds=nrof_folds)
File "/home/dl1/LXX/facenet-master_new/src/facenet.py", line 428, in calculate_roc
folds = KFold(n=nrof_pairs, n_folds=nrof_folds, shuffle=True, random_state=seed)
File "/home/dl1/anaconda/envs/tensorflow/lib/python2.7/site-packages/sklearn/cross_validation.py", line 326, in __init__
super(KFold, self).__init__(n, n_folds, shuffle, random_state)
File "/home/dl1/anaconda/envs/tensorflow/lib/python2.7/site-packages/sklearn/cross_validation.py", line 257, in __init__
" than the number of samples: {1}.").format(n_folds, n))
ValueError: Cannot have number of folds n_folds=10 greater than the number of samples: 0.
(tensorflow)dl1@dl1-B85M-DS3H:~/LXX/facenet-master_new/src$
Hi,
When I run facenet_train.py to train inception_resnet_v2, I get this error:
Traceback (most recent call last):
File "facenet_train.py", line 339, in <module>
main(parse_arguments(sys.argv[1:]))
File "facenet_train.py", line 70, in main
phase_train=phase_train_placeholder, weight_decay=args.weight_decay)
File "/home/mqq/victorydong/source/face2/facenet/facenet/src/models/inception_resnet_v1.py", line 135, in inference
dropout_keep_prob=keep_probability)
File "/home/mqq/victorydong/source/face2/facenet/facenet/src/models/inception_resnet_v1.py", line 157, in inception_resnet_v1
with tf.variable_scope(scope, 'InceptionResnetV1', [inputs], reuse=reuse):
File "/home/mqq/anaconda2/lib/python2.7/contextlib.py", line 84, in helper
return GeneratorContextManager(func(*args, **kwds))
TypeError: variable_scope() got multiple values for keyword argument 'reuse'
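This error pattern typically appears when a function is called with more positional arguments than the installed version expects. The sketch below reproduces the mechanism with two stand-in functions; the "old" and "new" parameter orders are assumptions about how tf.variable_scope differs between TensorFlow releases, not signatures copied from any specific version.

```python
def old_variable_scope(name_or_scope, reuse=None, initializer=None):
    """Stand-in for an assumed older layout: `reuse` is the second positional parameter."""
    return {"name_or_scope": name_or_scope, "reuse": reuse}

def new_variable_scope(name_or_scope, default_name=None, values=None, reuse=None):
    """Stand-in for an assumed newer layout: default_name and values come first."""
    return {"name_or_scope": name_or_scope, "default_name": default_name, "reuse": reuse}

def call_like_model(variable_scope):
    # Mirrors the call in the traceback: three positional arguments plus reuse=...
    return variable_scope(None, "InceptionResnetV1", ["inputs"], reuse=None)

try:
    call_like_model(old_variable_scope)
    old_error = None
except TypeError as exc:
    # 'InceptionResnetV1' is bound to `reuse` positionally, then reuse= is given again.
    old_error = str(exc)

new_result = call_like_model(new_variable_scope)
```

If this is what happens here, the model file was written for a newer TensorFlow than the one installed, and upgrading TensorFlow (or using a matching facenet revision) would be the thing to try.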
You report accuracy of 91% on the LFW data. The facenet paper reports 99+. Do you have any insight as to what they did differently vs your trained model? Just curious, thanks
When I run alignment, I encounter the following problems:
warnings.warn('Matplotlib is building the font cache using fc-list. This may take a moment.')
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Traceback (most recent call last):
File "align_dataset.py", line 113, in <module>
main(parse_arguments(sys.argv[1:]))
File "align_dataset.py", line 42, in main
img = misc.imread(image_path)
AttributeError: 'module' object has no attribute 'imread'
So what is the reason for this problem?
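A common reason for the missing `imread` attribute is that scipy.misc only exposes `imread` when the Python Imaging Library (PIL, usually installed as Pillow) is importable; newer SciPy releases have removed the function altogether. The stdlib-only sketch below checks whether PIL can be imported; `module_available` is a hypothetical helper.

```python
import importlib

def module_available(name):
    """Return True if the named module can be imported in this environment."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False

# If this is False, installing Pillow is the first thing to try.
pil_installed = module_available("PIL")
```

The two `fatal: ambiguous argument 'HEAD'` lines come from git and look unrelated to the AttributeError.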
OpenFace has a way of classifying images to get a prediction based on the target labels. Is that implemented in facenet? From what I can see, facenet does the embedding but hasn't implemented training for image -> embedding -> label. If it isn't implemented, what would be the best way to do that?
I've tried to read your code, but I didn't understand how to classify an image.
Suppose you want to classify one image with this network. Is there a simple way to feed the network with that picture and retrieve the class of that image?
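One simple way to go from embeddings to labels, sketched here under the assumption that embeddings for a few labelled images per person are already available, is a nearest-centroid classifier: average the embeddings of each person and assign a new embedding to the closest average. This is an illustrative technique, not something the repository is stated to implement, and all names are hypothetical.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = float(len(vectors))
    return [sum(column) / n for column in zip(*vectors)]

def fit_centroids(embeddings_by_label):
    """Map each label to the mean of its embeddings."""
    return {label: centroid(vs) for label, vs in embeddings_by_label.items()}

def classify(embedding, centroids):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist_sq(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: dist_sq(embedding, centroids[label]))

# Toy 2-D "embeddings" for two hypothetical people.
known = {
    "person_a": [[0.9, 0.1], [1.0, 0.0]],
    "person_b": [[0.0, 1.0], [0.1, 0.9]],
}
centroids = fit_centroids(known)
predicted = classify([0.8, 0.2], centroids)
```

Because the classifier works on embeddings only, people who were not in the network's training set can be added by computing embeddings for a few of their images.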
Hi! I'm trying to use pre-trained model, but I get the following error during restoring:
NotFoundError: Tensor name "incept4a/in1_conv1x1_75/batch_norm/cond/incept4a/in1_conv1x1_75/batch_norm/moments/moments_1/mean/ExponentialMovingAverage" not found in checkpoint files model-20160506.ckpt-500000
Can you help me with this?
Hello,
I have low accuracy on LFW using this FaceNet implementation. I'm using "All images as gzipped tar file" - the first offered dataset from list of downloadable datasets on http://vis-www.cs.umass.edu/lfw/
Photo attached:
Am I doing something wrong? Which dataset should I use to check accuracy and get 0.919+-0.008?
It would be great if you could help me with that issue.
Thank you.
Links to the images can be found at
http://www.robots.ox.ac.uk/~vgg/data/vgg_face/
A script to download the images can be found at
https://github.com/davidsandberg/facenet/blob/master/facenet/src/download_vgg_face_dataset.py
With Dlib alignment, some of the faces in e.g. the LFW dataset go undetected. This can potentially filter out some of the semi-hard examples from the training dataset and degrade performance.
Check if the performance can be improved by using FaceX face alignment.
Is it possible to embed and classify new faces that were not used during training?
MTCNN has been implemented using python/tensorflow and can be found here. However this implementation gives slightly different results compared to the matlab/caffe implementation from the authors.
cv2.resize(img, (sz[1], sz[0]), interpolation=cv2.INTER_AREA)
seems to perform best. But the result is not identical and this seems to impact performance quite a bit.
In https://github.com/davidsandberg/facenet/blob/master/facenet/src/facenet.py#L540
you have split based on classes. I think it should be split within each class instead.
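The suggested alternative can be sketched as follows: instead of assigning whole classes to either the training or the test set, a fraction of each class's images is held out. The function below is a hypothetical illustration that takes a mapping from class name to image paths; it is not the code at the linked line.

```python
import random

def split_within_classes(paths_by_class, test_fraction=0.2, seed=0):
    """Hold out test_fraction of the images of every class."""
    rng = random.Random(seed)
    train, test = {}, {}
    for label in sorted(paths_by_class):
        paths = list(paths_by_class[label])
        rng.shuffle(paths)
        n_test = int(round(len(paths) * test_fraction))
        # Every class keeps its label in both splits, with disjoint images.
        test[label] = paths[:n_test]
        train[label] = paths[n_test:]
    return train, test

dataset = {
    "person_a": ["a_%02d.png" % i for i in range(10)],
    "person_b": ["b_%02d.png" % i for i in range(5)],
}
train_set, test_set = split_within_classes(dataset)
```

Which scheme is appropriate depends on the goal: splitting within classes evaluates on new images of known identities, while splitting by classes evaluates on identities never seen in training.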
I found that cross_entropy_mean is added to a collection named "losses" in facenet_train.py,
and the function _add_loss_summaries always uses the collection "losses".
Should we also add triplet_loss to a collection?
Per this new discussion on OpenFace, I think your models would benefit from a better dlib aligner. This is a drop-in replacement for the existing align-dlib.py. Check out the new align-dlib.py.zip
Hi, David
I think I found a spelling mistake on line 66 in facenet_train.py
# Placeholder for the learning rate
learning_rate_placeholder = tf.placeholder(tf.float32, name='learing_rate')
Should 'learing_rate' be 'learning_rate'?
Hi David:
I am trying to run "validate_on_lfw.py". However, when using your provided model "model-20160506.ckpt-500000", I cannot find the corresponding ".meta" file. This results in the "'utf8' codec can't decode" error when loading the model file.
Best,
GQ