tamerthamoqa / facenet-pytorch-glint360k

A PyTorch implementation of the 'FaceNet' paper for training a facial recognition model with Triplet Loss using the glint360k dataset. A pre-trained model using Triplet Loss is available for download.

License: MIT License

Language: Python (100.00%)

Topics: face-recognition, facenet, lfw-dataset, multi-gpu, pretrained-model, pytorch, triplet-loss, vggface2-dataset

facenet-pytorch-glint360k's People

Contributors: agenchev, tamerthamoqa

facenet-pytorch-glint360k's Issues

torch.load pre-trained model error

Hello, thank you for your contribution. I downloaded your weight file and tried to load it to check its performance on LFW. However, when loading the .pt file according to your description, the error "_pickle.UnpicklingError: A load persistent id instruction was encountered, but no persistent_load function was specified." occurred. Could the pickle/save operation for your weight file be at fault?
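
For reference, this particular unpickling error usually means the file is a TorchScript archive (written with torch.jit.save) being opened with the plain torch.load. A minimal sketch of trying both loaders, with a placeholder file name:

import pickle
import torch

checkpoint_path = "model_resnet34_triplet.pt"  # placeholder name

try:
    # Works for checkpoints written with torch.save() (pickle-based)
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
except pickle.UnpicklingError:
    # "load persistent id" errors typically indicate a TorchScript
    # archive, which must be opened with torch.jit.load() instead
    model = torch.jit.load(checkpoint_path, map_location="cpu")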

Pre-trained model does not reproduce the results

Hi,

Thanks for sharing the repo. I tried to evaluate the pre-trained model on LFW. I couldn't reproduce the results you've reported.

The performance gap seems a bit large. I am sharing the ROC curve I've reproduced below:

[ROC curve image: lfw_roc_resnet34_epoch_0_triplet_evalonly_lfw]

Thanks for your help in advance!

Error with code in readme for pretrained model

Hello, I tried your pretrained model with the code in the readme file, but I get the error below. If I only pass 'img' to preprocess it works, but the results aren't that great. Is there something I'm doing wrong? Thanks!

img = preprocess(img.to(device))
AttributeError: 'numpy.ndarray' object has no attribute 'to'
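
For context, .to(device) is a torch.Tensor method, while an image loaded with OpenCV is a numpy.ndarray, so the device move has to happen after the transform produces a tensor. A minimal sketch of the likely intended order, assuming preprocess is the readme's torchvision transform pipeline and model is the pre-trained network (the file name is a placeholder):

import cv2
import torch
from PIL import Image

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

img = cv2.imread("face.jpg")                # numpy.ndarray in BGR order
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # convert to RGB
img = Image.fromarray(img)                  # torchvision transforms expect PIL

# Apply the transform first, then move the resulting tensor to the device
tensor = preprocess(img).unsqueeze(0).to(device)
embedding = model(tensor)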

About make Triplet dataset

I have a question: when generating triplets, shouldn't the distance between the positive and the anchor be less than the distance between the anchor and the negative?
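
For context, the FaceNet triplet loss L = max(d(a,p) - d(a,n) + margin, 0) is zero exactly when d(a,p) + margin <= d(a,n), so triplets that already satisfy the constraint contribute no gradient; that is why implementations often generate triplets randomly and then keep only the ones that still violate the margin. A minimal sketch of such a filter (tensor names are illustrative):

import torch.nn.functional as F

margin = 0.2  # margin value used in the FaceNet paper

def violating_triplet_mask(anchor, positive, negative, margin=margin):
    """True for triplets where d(a,p) + margin > d(a,n), i.e. loss > 0."""
    pos_dist = F.pairwise_distance(anchor, positive, p=2)
    neg_dist = F.pairwise_distance(anchor, negative, p=2)
    return pos_dist + margin > neg_dist

# Keep only the triplets that still produce a nonzero loss:
# mask = violating_triplet_mask(anchor_emb, positive_emb, negative_emb)
# loss = F.triplet_margin_loss(anchor_emb[mask], positive_emb[mask],
#                              negative_emb[mask], margin=margin)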

Embeddings are getting clustered together in a small region after training

Hi @tamerthamoqa,

Thanks a lot for such a fantastic repo, which we could use for our work.
I was recently working on building a face verification system using a Siamese network. Using models pre-trained on the CASIA-WebFace and VGGFace2 datasets, I was able to achieve close to 90% accuracy on my dataset. I then used hard triplet batch sampling and training to fine-tune the network further, but for some reason, after training, the embeddings for all images are clustered together; in other words, the embeddings of two different persons end up too close to each other. For example, where the pre-trained models gave a cosine distance of 0.45 between two embeddings, after training with this triplet loss we get 0.006, and the distance barely changes between same-person and different-person pairs.

If you could give me any insights on this, that would be helpful.
Thanks
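
One general diagnostic for this kind of collapse (not from this repo, just a common sanity check) is to measure the spread of pairwise distances over a batch of embeddings; a near-zero mean and standard deviation confirms the features have collapsed into a small region. A minimal sketch:

import torch
import torch.nn.functional as F

def embedding_spread_stats(embeddings):
    """Mean/std of pairwise L2 distances for a batch of embeddings."""
    embeddings = F.normalize(embeddings, p=2, dim=1)   # L2-normalize first
    distances = torch.cdist(embeddings, embeddings, p=2)
    # Drop the zero diagonal before computing the statistics
    off_diag = distances[~torch.eye(len(distances), dtype=torch.bool)]
    return off_diag.mean().item(), off_diag.std().item()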

Upload raw glint360k?

Hi!

I would like to experiment with different alignment strategies and more powerful detection/alignment models. For that, it would be nice to have the raw glint360k dataset. However, it is hard to get access to: the torrent is pretty much dead, and Baidu is not very cooperative.

Would it be possible for you to upload raw glint360k to Google Drive?

triplet_loss_dataloader.py

Hello, I'm Daniel.
While running your project, one question arose.

In dataloader/triplet_loss_dataloader, the (pos, neg) classes are generated randomly for the number of triplets allocated to each process, and images are selected randomly. However, when using np.random.choice, I confirmed that the same random values are produced in every process. So I used np.random.RandomState() instead, and each process then drew different random values (see the sketch after this issue).

Please let me know whether I have understood this correctly.

Thank you.
Daniel
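
For reference, this is expected NumPy behaviour when worker processes are forked: each child inherits the parent's global random state, so np.random.choice yields identical sequences. A minimal sketch of giving each process its own stream, along the lines Daniel describes (the class count and triplet count are illustrative):

import numpy as np
from multiprocessing import Pool

def generate_triplet_classes(args):
    num_triplets, seed = args
    # Per-process generator: forked workers would otherwise share the
    # parent's global np.random state and draw identical values
    rng = np.random.RandomState(seed)
    return rng.choice(1000, size=(num_triplets, 2))  # (pos, neg) class pairs

if __name__ == "__main__":
    num_processes = 4
    with Pool(num_processes) as pool:
        # Seed each worker differently, e.g. from its process index
        results = pool.map(generate_triplet_classes,
                           [(100, i) for i in range(num_processes)])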

LICENCE

It would be very much appreciated if you could add a licence to this project. Thank you.

About maintaining the aspect ratio of face

Hi,
I found that the faces in your training and LFW test datasets are stretched. From my perspective, if the trained model then runs inference on faces with a normal aspect ratio, this may result in performance degradation.

Do you think it is necessary to keep the original face aspect ratio? (One aspect-ratio-preserving alternative is sketched below.)
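
For illustration, a common way to preserve the aspect ratio (not what this repo does) is a letterbox-style resize: scale the longer side to the target size and pad the remainder. A minimal sketch with PIL:

from PIL import Image

def letterbox_resize(img, target=224):
    """Resize keeping the aspect ratio, padding the remainder with black."""
    width, height = img.size
    scale = target / max(width, height)
    resized = img.resize((round(width * scale), round(height * scale)))
    canvas = Image.new("RGB", (target, target))  # black padding by default
    # Paste the resized face centred on the square canvas
    offset = ((target - resized.width) // 2, (target - resized.height) // 2)
    canvas.paste(resized, offset)
    return canvas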

Questions about running validate_lfw() function in train_triplets_loss.py

Hello @tamerthamoqa
I used the validate_lfw() function in my FaceNet project without changing anything, but when evaluating, it took almost 2 hours to calculate the distances and other metrics, and I still didn't get a result. So the first question is whether evaluation is expected to take this long because it computes on the CPU instead of the GPU; if it does, evaluating every epoch would be costly. I also wonder how long it takes to train the whole model. I would be very thankful if you could share your training details so I can figure out whether something is incorrect in my code.
Thanks sincerely!
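
As an aside, the embedding and distance computations themselves vectorise easily on the GPU; a minimal sketch (this is not the repo's validate_lfw() implementation, and the model/image batch names are illustrative):

import torch
import torch.nn.functional as F

@torch.no_grad()
def pair_distances(model, images_a, images_b, device="cuda"):
    """L2 distances between embeddings of paired image batches, on the GPU."""
    emb_a = F.normalize(model(images_a.to(device)), p=2, dim=1)
    emb_b = F.normalize(model(images_b.to(device)), p=2, dim=1)
    return F.pairwise_distance(emb_a, emb_b, p=2).cpu()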

Face alignment for increased TAR@FAR (after training) and a couple more thoughts

@tamerthamoqa
Hello again! Your pre-trained model is trained on the unaligned VGGFace2 dataset, so it performs well under pose variation. But many projects pre-process the images to obtain aligned faces, which helps them increase the TAR @ FAR score for a given CNN model.
So I wonder: are you interested in testing what we can get with face alignment?
I implemented face alignment as a transformation for torchvision.transforms (the pattern is sketched below), which let me test your pre-trained model on the raw LFW with this transform. It obtained TAR: 0.6640+-0.0389 @ FAR: 0.0010 without training and without face stretching, which I think is promising. Unfortunately, it cannot be used with the cropped VGGFace2 and LFW for training/testing, because those faces are deformed/stretched (although the transform could be made to stretch the faces as well), and some face detections fail.
The next thing I'm not sure about is whether we can obtain fewer false positives if the input faces are not stretched but preserve their shape. This leads to the next question: why was the input chosen to be a 224×224 square? Couldn't we change it to a rectangle (for example 208×240) that better fits the human face, instead of stretching the (aligned) faces?
I also see that the normalized tensors' RGB values have the range [-2, 2]; is this the best range?
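
For reference, any callable can be slotted into a torchvision.transforms pipeline, so an alignment step fits naturally; a minimal sketch, where align_face is a hypothetical stand-in for the detector/aligner actually used, and the Normalize values are chosen only so that the output range is [-2, 2] as noted above (the repo's actual values may differ):

import torchvision.transforms as transforms

class FaceAlign:
    """Callable transform wrapping a face aligner (align_face is hypothetical)."""
    def __call__(self, img):
        # Hypothetical: detect landmarks and warp the face to a canonical pose
        return align_face(img)

preprocess = transforms.Compose([
    FaceAlign(),                    # align before any geometric resize
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.25, 0.25, 0.25]),
])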

Questions about L2 Normalization

Hi @tamerthamoqa ,
I'm curious about L2 normalization, which constrains the embeddings onto the unit hypersphere in Euclidean feature space, so shouldn't the maximum distance between two features in the feature space be 2? Why does the threshold range from 0.0 to 4.0?
Thanks!
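
For reference, both bounds are consistent, depending on whether the distance or its square is meant. For unit-norm embeddings $u$ and $v$:

$$\|u - v\|_2^2 = \|u\|_2^2 + \|v\|_2^2 - 2\,u^\top v = 2 - 2\,u^\top v \in [0, 4]$$

so the Euclidean distance $\|u - v\|_2$ lies in $[0, 2]$, while the squared distance lies in $[0, 4]$; sweeping thresholds over [0.0, 4.0] matches the squared distance (or is simply a conservative upper bound for the plain distance).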

Precision calculations

Hello @tamerthamoqa,
I am using this repo on a custom dataset, but I encountered some weird behaviour: every other metric changes during the epochs, but precision always stays the same at 0.5000+-0.5000. I have also defined a custom validation dataset for which I generated an equal number of positive and negative pairs, 422 pairs in total. Here's an example from one of the epochs:

100%|██████████| 100/100 [00:22<00:00, 4.45it/s]
Epoch 137: Number of valid training triplets in epoch: 4
Validating on LFW! ...
100%|██████████| 3/3 [00:01<00:00, 2.65it/s]
Accuracy on LFW: 0.8818+-0.0417 Precision 0.5000+-0.5000 Recall 0.4364+-0.4368 ROC Area Under Curve: 0.1977 Best distance threshold: 1.16+-0.03 TAR: 0.2068+-0.2145 @ FAR: 0.0000

I tried reducing the range of thresholds from the default to:
thresholds_roc = np.arange(0.5, 0.8, 0.1)
thresholds_val = np.arange(0.5, 0.8, 0.1)
But the precision stays the same. My question is: what is going on with the precision calculations? As far as I have reviewed it, the calculation logic checks out.
Thank you in advance.
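
For what it's worth, 0.5000+-0.5000 is the exact signature of every fold reporting either 0.0 or 1.0 (half each), which typically happens when some folds contain no predicted positives at the chosen threshold. A minimal sketch of the arithmetic (the per-fold values are hypothetical):

import numpy as np

# Hypothetical per-fold precisions: folds with no predicted positives
# report 0.0, folds where every predicted positive is correct report 1.0
fold_precisions = np.array([0.0, 1.0] * 5)

print(f"{fold_precisions.mean():.4f}+-{fold_precisions.std():.4f}")
# -> 0.5000+-0.5000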

Glint360k downloading issues

Hello, @tamerthamoqa!
I have tried to download glint360k (unpacked) from Google Drive, but each time the download fails in the middle due to network issues. I don't know why, but I have not found any other copy of the glint360k dataset except yours.

Could I kindly ask you to split the whole .zip file into pieces of 5 GB (with the command: split --verbose -b5G glint360_unpacked.zip glint360_unpacked.zip.) and upload those pieces to Google Drive so that I can download them without network errors?

This is really important for me: I am doing a thesis on face recognition, and this dataset shows great metrics on validation datasets.

Resnet 18

Hello @tamerthamoqa

May I ask why you chose ResNet-18 for training? From my understanding, the more data we have, the deeper the network we can use. Since VGGFace2 contains about 3M images, I think ResNet-50 would be a better choice, wouldn't it?

calculate_roc_values wrong function

Hello @tamerthamoqa,
I think there is an error in the calculate_roc_values function when it comes to the true_positive_rate and false_positive_rate calculation: the mean should be calculated outside the fold loop, since we average across all folds.

The corrected code (with the imports it needs; calculate_metrics comes from the repo's evaluation utilities):
import numpy as np
from sklearn.model_selection import KFold


def calculate_roc_values(thresholds, distances, labels, num_folds=10):
    num_pairs = min(len(labels), len(distances))
    num_thresholds = len(thresholds)
    k_fold = KFold(n_splits=num_folds, shuffle=False)

    true_positive_rates = np.zeros((num_folds, num_thresholds))
    false_positive_rates = np.zeros((num_folds, num_thresholds))
    precision = np.zeros(num_folds)
    recall = np.zeros(num_folds)
    accuracy = np.zeros(num_folds)
    best_distances = np.zeros(num_folds)

    indices = np.arange(num_pairs)

    for fold_index, (train_set, test_set) in enumerate(k_fold.split(indices)):
        # Find the best distance threshold for the k-fold cross validation
        # using the train set
        accuracies_trainset = np.zeros(num_thresholds)
        for threshold_index, threshold in enumerate(thresholds):
            _, _, _, _, accuracies_trainset[threshold_index] = calculate_metrics(
                threshold=threshold,
                dist=distances[train_set],
                actual_issame=labels[train_set],
            )
        best_threshold_index = np.argmax(accuracies_trainset)

        # Test on the test set using the best distance threshold
        for threshold_index, threshold in enumerate(thresholds):
            (
                true_positive_rates[fold_index, threshold_index],
                false_positive_rates[fold_index, threshold_index],
                _,
                _,
                _,
            ) = calculate_metrics(
                threshold=threshold,
                dist=distances[test_set],
                actual_issame=labels[test_set],
            )

        (
            _,
            _,
            precision[fold_index],
            recall[fold_index],
            accuracy[fold_index],
        ) = calculate_metrics(
            threshold=thresholds[best_threshold_index],
            dist=distances[test_set],
            actual_issame=labels[test_set],
        )

        best_distances[fold_index] = thresholds[best_threshold_index]

    # Calculate mean values of TPR and FPR across all folds
    # (outside the fold loop, which is the fix)
    true_positive_rate = np.mean(true_positive_rates, axis=0)
    false_positive_rate = np.mean(false_positive_rates, axis=0)

    return (
        true_positive_rate,
        false_positive_rate,
        precision,
        recall,
        accuracy,
        best_distances,
    )
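
A quick usage sketch with synthetic data (calculate_metrics must be importable from the repo's evaluation code for this to run):

import numpy as np

rng = np.random.default_rng(0)
distances = rng.uniform(0.0, 4.0, size=1000)           # synthetic pair distances
labels = rng.integers(0, 2, size=1000).astype(bool)    # synthetic same/different labels
thresholds = np.arange(0.0, 4.0, 0.01)                 # sweep matching the [0, 4] range

tpr, fpr, precision, recall, accuracy, best_distances = calculate_roc_values(
    thresholds=thresholds, distances=distances, labels=labels, num_folds=10
)
print(f"Accuracy: {accuracy.mean():.4f}+-{accuracy.std():.4f}")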

embedding vectors dimension

Hi @tamerthamoqa ,

Thanks a lot for your great repo.
According to the FaceNet paper, the best dimension for the embedding vector is 128. I am curious to know whether there is any specific reason you used an embedding dimension four times bigger, 512?
