arcface-pytorch's People

Contributors

ronghuaiyang

arcface-pytorch's Issues

What does easy_margin in models.metrics.ArcMarginProduct mean?

I don't understand the lines below:

    if self.easy_margin:
        phi = torch.where(cosine > 0, phi, cosine)
    else:
        phi = torch.where(cosine > self.th, phi, cosine - self.mm)

where self.mm = math.sin(math.pi - m) * m. What does mm mean?

BTW, I don't think the condition "cosine > 0" corresponds to the target position. The implementation seems to differ from the paper.
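A scalar sketch of the quoted branch may help here. It assumes `phi` is `cos(theta + m)` computed with the angle-sum identity, and that `th = cos(pi - m)` as stated in a later issue on this page. The function name and the sample values are hypothetical. Since `cos(theta + m)` stops decreasing once `theta + m` passes pi, the non-easy branch falls back to a plain subtractive penalty `cosine - mm` beyond that point. The easy branch applies the margin only while `theta` is below 90 degrees. Whether the repository picks out the target class before or after this step is not visible in the quoted lines.

```python
import math

def arc_margin_target_logit(cosine, m=0.5, easy_margin=False):
    """Margin-adjusted value for one cosine (hypothetical scalar helper)."""
    cos_m, sin_m = math.cos(m), math.sin(m)
    th = math.cos(math.pi - m)      # cosine value at which theta + m reaches pi
    mm = math.sin(math.pi - m) * m  # fixed penalty used past that point

    sine = math.sqrt(max(0.0, 1.0 - cosine * cosine))
    phi = cosine * cos_m - sine * sin_m  # cos(theta + m)

    if easy_margin:
        # Apply the margin only when theta < 90 degrees; otherwise keep cos(theta).
        return phi if cosine > 0 else cosine
    # Past theta = pi - m, cos(theta + m) would start rising again,
    # so use a subtractive penalty instead.
    return phi if cosine > th else cosine - mm


for c in (0.9, 0.2, -0.5, -0.95):
    print(c,
          round(arc_margin_target_logit(c), 4),
          round(arc_margin_target_logit(c, easy_margin=True), 4))
```

Note that the two branches of the non-easy case do not meet at `cosine == th`: `phi` equals -1 there, while `th - mm` is lower.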

the CASIA-webface-dataset and the train-validation list

Hi, thanks a lot for the PyTorch implementation of ArcFace!
I'm trying to train the model using your code, but I can't find anywhere to download the WebFace dataset; its homepage is missing.
I'd appreciate it if you could tell me where to download the 'CASIA-maxpy-clean-crop-144' dataset and the train/validation split lists.

Thanks in advance!

Cannot reach the same accuracy as the pretrained model

I didn't change any parameters in your code, but I can't reach the same accuracy as the pretrained model. Could you tell me the training parameters used for "resnet18_110.pth"? Thanks for your help.

Tue May 14 20:19:47 2019 train epoch 49 iter 86 1.9944349123113763 iters/s loss 5.390433311462402 acc 0.359375
Tue May 14 20:20:19 2019 train epoch 49 iter 186 3.1073992095805747 iters/s loss 4.969634532928467 acc 0.375
Tue May 14 20:20:52 2019 train epoch 49 iter 286 3.0889427377963514 iters/s loss 5.293170928955078 acc 0.37890625
Tue May 14 20:21:24 2019 train epoch 49 iter 386 3.084540290424234 iters/s loss 5.502295017242432 acc 0.361328125
Tue May 14 20:21:56 2019 train epoch 49 iter 486 3.1056642862816277 iters/s loss 4.713094234466553 acc 0.373046875
Tue May 14 20:22:29 2019 train epoch 49 iter 586 3.099455363037708 iters/s loss 5.360879421234131 acc 0.349609375
Tue May 14 20:23:01 2019 train epoch 49 iter 686 3.1000786826315307 iters/s loss 5.235039234161377 acc 0.375
Tue May 14 20:23:33 2019 train epoch 49 iter 786 3.081582291481597 iters/s loss 5.106305122375488 acc 0.349609375
(7701, 1024)
total time is 16.355839252471924, average time is 0.06364139786954055
lfw face verification accuracy: 0.8625 threshold: 0.31009173

test problem

When I run python test.py, I get the following error:

D:\anaconda\envs\pytorch\python.exe "D:/py program/arcface-pytorch-master/test.py"
read ./data/Datasets/lfw/lfw-align-128\Abel_Pacheco/Abel_Pacheco_0001.jpg error
Traceback (most recent call last):
  File "D:/py program/arcface-pytorch-master/test.py", line 172, in <module>
    lfw_test(model, img_paths, identity_list, opt.lfw_test_list, opt.test_batch_size)
  File "D:/py program/arcface-pytorch-master/test.py", line 143, in lfw_test
    features, cnt = get_featurs(model, img_paths, batch_size=batch_size)
  File "D:/py program/arcface-pytorch-master/test.py", line 62, in get_featurs
    if images.shape[0] % batch_size == 0 or i == len(test_list) - 1:
AttributeError: 'NoneType' object has no attribute 'shape'

Process finished with exit code 1

Why?
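The "read ... error" line printed just before the traceback suggests that the very first image could not be read, so `images` was still `None` when its shape was checked. A minimal sketch for finding such files before running the test, assuming the cause is a missing or misplaced file. The helper name and the example paths are hypothetical and not part of the repository.

```python
import os

def find_unreadable_paths(root, relative_paths):
    """Return the full paths of listed images that do not exist under root."""
    missing = []
    for rel in relative_paths:
        # Normalise mixed '/' and '\\' separators before joining.
        parts = rel.replace("\\", "/").split("/")
        full = os.path.join(root, *parts)
        if not os.path.isfile(full):
            missing.append(full)
    return missing


if __name__ == "__main__":
    # Hypothetical root and list entry; substitute the values from your config.
    root = "./data/Datasets/lfw/lfw-align-128"
    for path in find_unreadable_paths(root, ["Abel_Pacheco/Abel_Pacheco_0001.jpg"]):
        print("missing:", path)
```

If the list comes back empty, the files exist and the problem lies elsewhere, for example in an image format the reader cannot decode.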

Has the loss function issue been solved?

I saw a loss function error reported on this page.
Has it been solved?
If I train on my own dataset, will it work without the loss function error?

train

Can you give me the training list (txt) file?
If it's convenient, could you send it to my email, or share it some other way? My email: [email protected]
Thanks very much!

What is the lfw_test_list used for

Hi, thank you for your great repo.
I am new to machine learning and I wonder why I should use lfw_test_pair.txt. Can I just use the whole LFW dataset to test my model?
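LFW is normally used for verification: the model decides whether two images show the same person, so the test needs a list saying which pairs to compare and what the correct answer is. The people in LFW are not the training classes, so ordinary classification accuracy does not apply. A minimal sketch of scoring such a list, assuming each line has the form "path_a path_b label" with label 1 for the same person. That format, the function names, and the toy embeddings are assumptions for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def score_pairs(pair_lines, features):
    """pair_lines: 'path_a path_b label' strings (assumed format).
    features: dict mapping image path -> embedding (list of floats)."""
    scores, labels = [], []
    for line in pair_lines:
        path_a, path_b, label = line.split()
        scores.append(cosine_similarity(features[path_a], features[path_b]))
        labels.append(int(label))
    return scores, labels


# Toy embeddings standing in for model outputs.
features = {"a1.jpg": [1.0, 0.0], "a2.jpg": [0.9, 0.1], "b1.jpg": [0.0, 1.0]}
pairs = ["a1.jpg a2.jpg 1", "a1.jpg b1.jpg 0"]
scores, labels = score_pairs(pairs, features)
print(scores, labels)
```

The resulting scores and labels can then be turned into an accuracy by choosing a similarity threshold.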

Making a prediction

I want to make predictions on test data. What should I do?

acc 0.0

=> total time is 42.86408877372742, average time is 0.1667863376409627
=> lfw face verification accuracy: 0.8643333333333333 threshold: 0.98421407
Tue Oct 16 14:23:41 2018 train epoch 7 iter 75 1.8792263763362134 iters/s loss 14.678580284118652 acc 0.0
Tue Oct 16 14:23:51 2018 train epoch 7 iter 175 10.736604388208807 iters/s loss 14.29747486114502 acc 0.0
Tue Oct 16 14:24:00 2018 train epoch 7 iter 275 10.673344835993447 iters/s loss 15.466602325439453 acc 0.0
Tue Oct 16 14:24:10 2018 train epoch 7 iter 375 10.266020200663386 iters/s loss 15.988080978393555 acc 0.0
Tue Oct 16 14:24:20 2018 train epoch 7 iter 475 10.013510335681348 iters/s loss 14.602261543273926 acc 0.0
Tue Oct 16 14:24:30 2018 train epoch 7 iter 575 9.911183891830836 iters/s loss 15.66580581665039 acc 0.0
Tue Oct 16 14:24:40 2018 train epoch 7 iter 675 10.237420150127207 iters/s loss 15.294169425964355 acc 0.0
Tue Oct 16 14:24:49 2018 train epoch 7 iter 775 10.42534408847289 iters/s loss 15.03502082824707 acc 0.0
Tue Oct 16 14:24:59 2018 train epoch 7 iter 875 10.592710894452473 iters/s loss 15.485675811767578 acc 0.0
Tue Oct 16 14:25:08 2018 train epoch 7 iter 975 10.55012234010108 iters/s loss 14.71179485321045 acc 0.0
Tue Oct 16 14:25:18 2018 train epoch 7 iter 1075 10.474356568004414 iters/s loss 14.181792259216309 acc 0.0
Tue Oct 16 14:25:28 2018 train epoch 7 iter 1175 9.514813052351975 iters/s loss 15.342684745788574 acc 0.0
Tue Oct 16 14:25:38 2018 train epoch 7 iter 1275 10.08869761467461 iters/s loss 14.384868621826172 acc 0.0
Tue Oct 16 14:25:48 2018 train epoch 7 iter 1375 10.343309102628595 iters/s loss 15.508356094360352 acc 0.0

LICENSE

Hi!

What's the license of this code? I'd like to make drastic changes to this codebase and use it in my company.

Many thanks :)

Resnet50 output has 2048-dim feature

The ResNet class in resnet.py has the attribute 'self.fc5 = nn.Linear(512 * 8 * 8, 512)', where the first 512 should match the number of output channels of the backbone's last conv stage. This is fine for resnet18 or resnet34, but what happens with resnet50? Should I change the first 512 to 2048? If so, would that add too many parameters?
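The parameter cost can be worked out directly. A small sketch, assuming the final feature map stays 8 x 8 and the layer keeps its bias; `linear_params` is a hypothetical helper, not a function from the repository.

```python
def linear_params(in_features, out_features, bias=True):
    """Number of learnable values in a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


for channels in (512, 2048):
    n = linear_params(channels * 8 * 8, 512)
    print(f"{channels} channels: {n:,} parameters in fc5")
```

With 2048 channels the layer holds about 67 million parameters, four times the roughly 16.8 million of the 512-channel case. That is why some designs pool or reduce the feature map before the final linear layer.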

"y_test = (y_score >= th)" in test.py

Should the >= here be replaced by <=? The cosine distance between photos of the same face is smaller, and the distance between photos of different faces is larger.
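The direction of the comparison depends on what `y_score` holds. If it is a cosine similarity, higher means more alike, so `>=` predicts "same person". If it were a distance, the comparison would flip to `<=`. A minimal sketch of a threshold sweep, assuming the scores are similarities; the function name and the scores are made up for illustration.

```python
def best_threshold(scores, labels):
    """Try every score as a threshold; predict 'same' when score >= threshold."""
    best_acc, best_th = 0.0, None
    for th in scores:
        preds = [int(s >= th) for s in scores]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_acc, best_th = acc, th
    return best_acc, best_th


scores = [0.82, 0.75, 0.30, 0.10]  # cosine similarities (made-up values)
labels = [1, 1, 0, 0]              # 1 = same person, 0 = different people
print(best_threshold(scores, labels))
```

Replacing `>=` with `<=` on the same similarity scores would invert every prediction and score far worse on these values.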

ArcFace easy margin

I have a question about line 43 of https://github.com/ronghuaiyang/arcface-pytorch/blob/master/models/metrics.py

phi = torch.where(cosine > self.th, phi, cosine - self.mm)

let th = cos(pi - margin)
let mm = margin * sin(pi - margin)

If it isn't too much trouble, could you explain why you do cosine - mm if phi is lower than cos(pi - margin)? I understand if cosine < th then phi < cosine rendering the margin counterproductive, but is there a special geometric meaning by decreasing the value of cosine by mm or was this an arbitrary choice?

I've looked at the ArcFace paper but I didn't see any mention of an easy margin, do you have any references for this?
Also, if I am not mistaken I think that the ArcFaceLoss only uses:

loss1 = self.classify_loss(output, labels)

and not

loss2 = self.classify_loss(cosine, labels)

which I assume is the standard softmax loss. Was this just to help with the training process?

weight update

Hi Rong, thanks for providing this implementation.
I have a basic question about:

    torch.optim.Adam([{'params': model.parameters()}, {'params': metric_fc.parameters()}],
                     lr=opt.lr, weight_decay=opt.weight_decay)

Why do we need to list the parameters like this?
I thought using nn.Parameter() would register the newly added parameter, and that the weights would then be updated automatically at the end of each epoch.

the error of loss function

The loss does not decrease. At the 50th epoch it is still 12.137..., about the same as in the previous epochs.
Here is the output from two epochs:
image
image

What batch size was used?

What batch size did you use to reach 99.3%?
Did you use focal loss when you got 99.3%?

Also, what is the performance with other backbones (e.g. res50, res101)
and other classifier layers (e.g. ArcFace, CosFace)?

the lfw acc

Hello ronghuai, could you tell me the LFW accuracy you measured with your model? I used your training code but can only get 98.n% accuracy. I have tuned almost all the hyperparameters but cannot reach 99%.

config settings

train_root = '/data/Datasets/webface/CASIA-maxpy-clean-crop-144/'
train_list = '/data/Datasets/webface/train_data_13938.txt'
val_list = '/data/Datasets/webface/val_data_13938.txt'

test_root = '/data1/Datasets/anti-spoofing/test/data_align_256'
test_list = 'test.txt'

Where are these folders? I can't train without them.

What is the purpose of easy_margin?

I didn't see any mention of an easy margin for ArcMarginProduct in the authors' paper. What is it for? And why is cosine replaced when it is greater than zero?

loss nan

I am training LeNet-5 with the ArcFace implementation here, but the loss becomes NaN after several batches. Any help would be appreciated. Thanks.

image
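One possible cause, which cannot be confirmed from the post alone, is the sine term. ArcFace-style layers usually compute it as sqrt(1 - cosine^2). Rounding can push cosine^2 slightly above 1, and tensor libraries return NaN for the square root of a negative number. A minimal scalar sketch of the usual guard, which clamps before the square root; `safe_sine` is a hypothetical helper.

```python
import math

def safe_sine(cosine):
    """sqrt(1 - cosine^2) with the argument clamped to [0, 1]."""
    squared = 1.0 - cosine * cosine
    return math.sqrt(min(1.0, max(0.0, squared)))


# A cosine that overshoots 1.0 through rounding no longer breaks the sqrt.
print(safe_sine(1.0000001))
```

Other common sources of NaN, such as a learning rate that is too high or labels outside the class range, are worth ruling out as well.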

about load model

I used facenet_18 to train on my data, but when I load the model I get an error:
model_dict = model.state_dict()

Missing key(s) in state_dict: "layer1.0.se.fc.0.weight", "layer1.0.se.fc.0.bias", "layer1.0.se.fc.1.weight", "layer1.0.se.fc.2.weight", "layer1.0.se.fc.2.bias", "layer1.1.se.fc.0.weight", "layer1.1.se.fc.0.bias", "layer1.1.se.fc.1.weight", "layer1.1.se.fc.2.weight", "layer1.1.se.fc.2.bias", "layer2.0.se.fc.0.weight", "layer2.0.se.fc.0.bias", "layer2.0.se.fc.1.weight", "layer2.0.se.fc.2.weight", "layer2.0.se.fc.2.bias", "layer2.1.se.fc.0.weight", "layer2.1.se.fc.0.bias", "layer2.1.se.fc.1.weight", "layer2.1.se.fc.2.weight", "layer2.1.se.fc.2.bias", "layer3.0.se.fc.0.weight", "layer3.0.se.fc.0.bias", "layer3.0.se.fc.1.weight", "layer3.0.se.fc.2.weight", "layer3.0.se.fc.2.bias", "layer3.1.se.fc.0.weight", "layer3.1.se.fc.0.bias", "layer3.1.se.fc.1.weight", "layer3.1.se.fc.2.weight", "layer3.1.se.fc.2.bias", "layer4.0.se.fc.0.weight", "layer4.0.se.fc.0.bias", "layer4.0.se.fc.1.weight", "layer4.0.se.fc.2.weight", "layer4.0.se.fc.2.bias", "layer4.1.se.fc.0.weight", "layer4.1.se.fc.0.bias", "layer4.1.se.fc.1.weight", "layer4.1.se.fc.2.weight", "layer4.1.se.fc.2.bias".

My save code:
    torch.save(model.state_dict(), save_name)
My load code:
    model.load_state_dict(torch.load(os.path.join('checkpoints', 'resnet18_8.pth')))
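Every missing key in the message contains ".se.". That suggests the model built at load time has squeeze-and-excitation blocks that the saved checkpoint lacks, for example because the model was constructed with different options than during training. This is an inference from the key names, not a confirmed diagnosis. A small sketch for comparing the two key sets; the helper name and the sample keys are hypothetical.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Report keys present on one side but not the other."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    return {
        "missing_in_checkpoint": sorted(model_keys - checkpoint_keys),
        "unexpected_in_checkpoint": sorted(checkpoint_keys - model_keys),
    }


# Sample key names for illustration only.
model_keys = ["conv1.weight", "layer1.0.se.fc.0.weight"]
checkpoint_keys = ["conv1.weight"]
print(diff_state_dict_keys(model_keys, checkpoint_keys))
```

In practice the two arguments would be `model.state_dict().keys()` and the keys of the dictionary returned by `torch.load(...)`.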

pretrained model and dataset

Hi,
Thank you for sharing your code. It's awesome. What's your dataset for training and could you please share the pretrained model?

Tensor size mismatch

I tried to test the pre-trained model resnet18_110.pth on the LFW dataset with test_batch_size = 16, but I got a size mismatch error in 'resnet.py':

Traceback (most recent call last):
  File "test.py", line 171, in <module>
    lfw_test(model, img_paths, identity_list, opt.lfw_test_list, opt.test_batch_size)
  File "test.py", line 141, in lfw_test
    features, cnt = get_featurs(model, img_paths, batch_size=batch_size)
  File "test.py", line 65, in get_featurs
    output = model(data)
  File "/Users/zhangjiatao/anaconda3/envs/new_python/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/zhangjiatao/anaconda3/envs/new_python/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 138, in forward
    return self.module(*inputs, **kwargs)
  File "/Users/zhangjiatao/anaconda3/envs/new_python/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/zhangjiatao/Documents/MyProject/PingAn/arcface-pytorch/models/resnet.py", line 220, in forward
    x = self.fc5(x)
  File "/Users/zhangjiatao/anaconda3/envs/new_python/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/zhangjiatao/anaconda3/envs/new_python/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 67, in forward
    return F.linear(input, self.weight, self.bias)
  File "/Users/zhangjiatao/anaconda3/envs/new_python/lib/python3.6/site-packages/torch/nn/functional.py", line 1352, in linear
    ret = torch.addmm(torch.jit._unwrap_optional(bias), input, weight.t())
RuntimeError: size mismatch, m1: [16 x 131072], m2: [32768 x 512] at /Users/administrator/nightlies/pytorch-1.0.0/wheel_build_dirs/conda_3.6/conda/conda-bld/pytorch_1544137972173/work/aten/src/TH/generic/THTensorMath.cpp:940

The error occurs at x = self.fc5(x) in class ResNetFace. Do you have any idea why?
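The two sizes in the error message can be decoded with a little arithmetic. A sketch, assuming 512 channels reach `fc5` and that the backbone downsamples by a total factor of 16, so that a 128-pixel input gives an 8 x 8 map. The stride and the helper name are assumptions.

```python
import math

def implied_feature_map_side(flattened_size, channels=512):
    """Side length of a square feature map given its flattened size."""
    side = math.isqrt(flattened_size // channels)
    assert side * side * channels == flattened_size, "not a square feature map"
    return side


expected = implied_feature_map_side(32768)   # what fc5 was built for
actual = implied_feature_map_side(131072)    # what actually reached fc5
total_stride = 16  # assumption: 128-pixel input -> 8 x 8 feature map

print("fc5 expects", expected, "x", expected)
print("fc5 received", actual, "x", actual)
print("implied input side:", actual * total_stride)
```

Under these assumptions the images fed to the model are twice as large per side as the ones the checkpoint was trained on, so resizing the test images to the training resolution would be the first thing to check.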

loss decline, accuracy remains zero

I want to use the ArcFace loss for ordinary classification,
but I found that the accuracy is always close to 0.
I checked the logits: they are all close to -1. When the label is given and the margin m is applied, the softmax loss is low, and this is what the network has learned!
In the extreme case, if the network outputs a logit of -1 for every class, the ArcFace loss is very low, so the network learns nothing useful.

train.py questions

Hi, there are two learnable modules, CNN encoder and FC layer, in "train.py".
I have traced the code.
It seems that only the encoder is set to "model.train()", but "metric_fc" is not set to "metric_fc.train()".
Is it possible to update the weights of the "metric_fc" module without this setting during the training stage?

question about the label

Hey, when I run training, I get the error below in epoch 0:

/pytorch/aten/src/THC/THCTensorScatterGather.cu:176: void THCudaTensor_scatterFillKernel(TensorInfo<Real, IndexType>, TensorInfo<long, IndexType>, Real, int, IndexType) [with IndexType = unsigned int, Real = float, Dims = 2]: block: [0,0,0], thread: [49,0,0] Assertion indexValue >= 0 && indexValue < tensor.sizes[dim] failed.
THCudaCheck FAIL file=/pytorch/aten/src/THC/generated/../THCReduceAll.cuh line=317 error=59 : device-side assert triggered
Traceback (most recent call last):
  File "train.py", line 92, in <module>
    loss = criterion(output, label)
  File "/home/pris/anaconda2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/pris/yzh/arcface-pytorch/models/focal_loss.py", line 27, in forward
    loss = loss.mean()
RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/aten/src/THC/generated/../THCReduceAll.cuh:317

I would really appreciate your help!
I only changed the input size to (3, 112, 112) and changed the fc5 input features accordingly.
I also set batch_size = 64, the same as test_batch_size.
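The assertion `indexValue >= 0 && indexValue < tensor.sizes[dim]` comes from a scatter kernel and means that some index fell outside the valid range. In a classification setup the usual suspect is a label that is negative or not smaller than the number of classes, for example labels that start at 1. A minimal check that can be run over the label list before training; the helper name and the sample values are hypothetical.

```python
def find_out_of_range_labels(labels, num_classes):
    """Return (position, label) pairs that fall outside [0, num_classes)."""
    return [(i, y) for i, y in enumerate(labels) if not 0 <= y < num_classes]


# Sample labels for illustration; 13938 is used as the class count here.
labels = [0, 5, 13938, -1, 42]
print(find_out_of_range_labels(labels, num_classes=13938))
```

Running the failing step on the CPU usually gives a clearer error message than the device-side assert.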

0 accuracy

I always get 0 accuracy with both the ArcFace loss and the cross-entropy loss, even though the loss decreases normally.

Run test

Is it normal for test.py to take more than 20 seconds to run?

The loss function seems to be wrong

The focal loss function seems to be different from the theory.
The theory describes the loss as:
-alpha * (1-y')^gamma * log(y') - (1-alpha) * y'^gamma * log(1-y')
Besides, the arc metric also confuses me. ArcFace is a modification of softmax, but there is nothing similar to softmax in the metric code.
Maybe I have misunderstood the theory. I would be grateful if you could point out my errors.
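The formula quoted above is the binary, alpha-balanced form of focal loss. For a multi-class softmax output only the true-class term remains, giving -(1 - p_t)^gamma * log(p_t), where p_t is the softmax probability of the true class. A minimal sketch of that multi-class form without alpha weighting. It assumes the usual arrangement in which the metric layer returns raw logits and softmax is applied inside the loss. It is an illustration, not the repository's code.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, target, gamma=2.0):
    """Multi-class focal loss for one sample: (1 - p_t)^gamma * CE."""
    p_t = softmax(logits)[target]
    cross_entropy = -math.log(p_t)
    return (1.0 - p_t) ** gamma * cross_entropy


logits = [2.0, 0.5, -1.0]
print("focal:", focal_loss(logits, target=0))
print("gamma=0 (plain cross-entropy):", focal_loss(logits, target=0, gamma=0.0))
```

With gamma set to 0 the expression reduces to ordinary cross-entropy, which is one way to check an implementation against the theory.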

traindatasets

Could you please provide some information about the training data?

ModuleNotFoundError: No module named 'dataset'

Are all the relevant files of your project included?
When I run train.py, the following problem appears. I also don't know what dataset.py does. Can you tell me? Thanks a lot.
image
The files at the following paths, which are referenced in dataset.py, don't exist.
image

Can't Download pretrained model

Hi, Baidu doesn't let me sign up for an account without a Chinese phone number. Is it possible to publish the model some other way? Thanks.
