
circleloss's Introduction

Circle Loss

An unofficial PyTorch implementation of the paper "Circle Loss: A Unified Perspective of Pair Similarity Optimization".

https://arxiv.org/abs/2002.10857

Update

Use CircleLoss from circle_loss.py; a simple MNIST example is provided in mnist_example.py.
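
A minimal usage sketch (not taken from the repository; the m and gamma values and the exact call signatures are assumptions based on the examples quoted in the issues below, so check circle_loss.py for the real interface):

import torch
from torch import nn

from circle_loss import CircleLoss, convert_label_to_similarity

# 256 L2-normalized 64-d embeddings with labels in [0, 10), as in the example inputs quoted below.
feat = nn.functional.normalize(torch.rand(256, 64, requires_grad=True))
lbl = torch.randint(high=10, size=(256,))

# Split all pairs in the batch into within-class (sp) and between-class (sn) cosine similarities.
sp, sn = convert_label_to_similarity(feat, lbl)

# m and gamma follow the paper's notation; the concrete values here are only illustrative.
criterion = CircleLoss(m=0.25, gamma=256)
loss = criterion(sp, sn)
loss.backward()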

For pair-wise labels, another implementation https://github.com/xiangli13/circle-loss is suggested.

Early

Sorry for using the master branch as a development branch. Some early implementations are kept in circle_loss_early.py.

CircleLossLikeCE is an early implementation that uses CircleLoss in the paradigm of approaches like ArcFace. It is consistent with the paper only in a special case.

CircleLossBackward is an early implementation that avoids overflow by applying backward with handcrafted gradients. A negative sign is added to Eq. 10 of the paper in this code to fix the equation. It is correct and stable, but messy.

Other

It has been said that the official implementation will be included in https://github.com/MegEngine.

Many thanks to Yifan Sun for his advice!

circleloss's People

Contributors

skanderbug


circleloss's Issues

Metric in the pair-wise form

Hello, I have been reading the code for a while and would like to discuss it. As I understand it, your implementation is the pair-based form from the circle loss paper rather than the classification-based form, and the pair-based form is not really suitable for today's face recognition. You built a NormLinear fully connected layer in circle loss early, but how to use it is never covered. I would really like to know how to use the loss specifically for classification.
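
For reference, here is a minimal sketch of the class-level (classification) setting described in the paper: a learnable, normalized class-weight matrix supplies one positive similarity (to the ground-truth class) and C - 1 negative similarities per sample. This is not the code of this repository; the class name ClassLevelCircleLoss and all hyperparameter values are illustrative assumptions.

import torch
from torch import nn
import torch.nn.functional as F

class ClassLevelCircleLoss(nn.Module):
    # Hypothetical sketch of the paper's class-level setting; not the code in this repo.
    def __init__(self, in_features, n_classes, m=0.25, gamma=256.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_classes, in_features))
        self.m = m
        self.gamma = gamma

    def forward(self, feat, label):
        # Cosine similarities between embeddings and class proxies: shape (batch, n_classes).
        sim = F.normalize(feat) @ F.normalize(self.weight).t()
        one_hot = F.one_hot(label, sim.size(1)).bool()
        sp = sim[one_hot].view(-1, 1)                 # similarity to the ground-truth class
        sn = sim[~one_hot].view(sim.size(0), -1)      # similarities to all other classes
        ap = torch.clamp_min(1 + self.m - sp.detach(), min=0.0)
        an = torch.clamp_min(sn.detach() + self.m, min=0.0)
        logit_p = -self.gamma * ap * (sp - (1 - self.m))
        logit_n = self.gamma * an * (sn - self.m)
        return F.softplus(torch.logsumexp(logit_p, dim=1) + torch.logsumexp(logit_n, dim=1)).mean()

Used as a classification head, e.g. loss = ClassLevelCircleLoss(64, 10)(feat, lbl), it plays the role that a NormLinear layer plus a margin softmax plays in ArcFace-style training.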

To fix a bug

a[label] = torch.clamp_min(- inp[label] + 1 + self.m, min=0).detach()
->
src = torch.clamp_min(
    - inp.gather(dim=1, index=label.unsqueeze(1)) + 1 + self.m,
    min=0,
).detach()
a.scatter_(1, label.unsqueeze(1), src)

sigma[label] = 1 - self.m
->
src = torch.ones_like(label.unsqueeze(1), dtype=inp.dtype, device=inp.device) - self.m
sigma.scatter_(1, label.unsqueeze(1), src)
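
A small demonstration of why the fix matters (the shapes are made up for illustration): with a 2-D tensor, a[label] indexes whole rows, while gather/scatter_ address exactly one (row, class) entry per sample.

import torch

a = torch.zeros(4, 10)                   # (batch, n_classes), illustrative shape
label = torch.tensor([2, 5, 5, 9])

# Advanced row indexing: this fills entire rows 2, 5 and 9, not the entries a[i, label[i]].
a[label] = 1.0

# Per-sample element indexing: writes exactly one entry per row, as the fix above does.
a.zero_()
src = torch.full((4, 1), 1.0)
a.scatter_(1, label.unsqueeze(1), src)   # a[i, label[i]] = src[i, 0]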

Pair-wise level or Class-level?

Looking at circle_loss.py alone, what you have implemented seems to be circle loss with pair-wise level labels (pulling the embeddings of same-class samples together and pushing the embeddings of different-class samples apart). So why does the README add: "For pair-wise labels, another implementation https://github.com/xiangli13/circle-loss is suggested"? I have read that other implementation, and computing the loss with binary_crossentropy there does not seem right.

Why do a_p and a_n use .detach()?

Hi, thanks for your code!

I wonder why a_p and a_n use .detach() here?

ap = torch.clamp_min(- sp.detach() + 1 + self.m, min=0.)

According to Eq. (9) and (10) in the original paper, it seems that a_p and a_n are involved in the gradient computation.

Hope to get your answer, thanks!
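
For reference, the weighting factors that the line quoted above (and its a_n counterpart) computes are the paper's self-paced weights, written here in the paper's notation, where m is the relaxation margin:

$$\alpha_p^i = \big[O_p - s_p^i\big]_+,\qquad \alpha_n^j = \big[s_n^j - O_n\big]_+,\qquad O_p = 1 + m,\quad O_n = -m.$$

With .detach(), these weights are treated as constants during backpropagation, so only the (s - \Delta) factors in the logits receive gradient.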

How to obtain class labels?

I wonder how to obtain class labels in classification tasks?
I have checked circle_loss.py, but it only learns similarity between MNIST images.

What is the applicable scope of circle loss?

Hello, I have read your paper and the work is great! I would like to ask: if I want to train a binary classification task, can I use circle loss? I see that many people in the issues apply it to specific tasks, so I am not sure whether this loss is general-purpose.

softplus failed to compute its gradient

I use your version of circle loss for an image classification task.
When calling loss.backward(), an error occurred.

In your code, in the forward of the loss:

loss_temp = torch.logsumexp(logit_n, dim=0) + torch.logsumexp(logit_p, dim=0)
loss = self.soft_plus(loss_temp)

There is an error that softplus cannot compute its gradient; maybe F.softplus performs an in-place operation?

My environment is Python 3.7 and torch 1.5.0.

Warning: Error detected in SoftplusBackward. Traceback of forward call that caused the error:
  File "train.py", line 473, in <module>
    train_fun(args, train_loader, feat_loader, i)
  File "train.py", line 221, in train_fun
    loss = criterion(embed_feat, labels)
  File "/home/anaconda3/envs/ljm/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/Project/SDC-IL/SDC-IL/losses/circle_loss.py", line 51, in forward
    loss = F.softplus(loss_temp)
 (print_stack at /opt/conda/conda-bld/pytorch_1587428266983/work/torch/csrc/autograd/python_anomaly_mode.cpp:60)
iter 0
Traceback (most recent call last):
  File "train.py", line 473, in <module>
    train_fun(args, train_loader, feat_loader, i)
  File "train.py", line 230, in train_fun
    loss.backward()
  File "/home/anaconda3/envs/ljm/lib/python3.7/site-packages/torch/tensor.py", line 198, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/anaconda3/envs/ljm/lib/python3.7/site-packages/torch/autograd/__init__.py", line 100, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor []], which is output 0 of SoftplusBackward, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
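
This class of error means the saved softplus output (the loss tensor) was modified in place after the forward pass, rather than F.softplus doing something in place. A minimal sketch of how it can arise, assuming an older PyTorch such as the reported 1.5.0, where softplus saves its output for the backward pass:

import torch
import torch.nn.functional as F

x = torch.randn(4, requires_grad=True)
loss = F.softplus(x.sum())   # scalar output of softplus; saved for SoftplusBackward

loss += 1.0                  # any in-place op on that output bumps its version counter
loss.backward()              # RuntimeError: ... output 0 of SoftplusBackward ... is at version 1

Replacing the in-place update with an out-of-place one (e.g. loss = loss + other_term) keeps the saved output intact.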

Why are sp and sn both decreasing during training?

Hello, thanks for your great work!
I have a question. According to my understanding, the goal of circle loss is to increase s_p towards 1 and decrease s_n towards 0, and the final decision boundary is s_n - s_p + m = 0. But when I try the MNIST example, I find that s_p and s_n within a batch are both decreasing; the final values are both approximately 0.41.
In this case, the cosine similarity of positive pairs is close to that of negative pairs. Why does this happen? Or, put another way, how can the sample pairs then be classified?

bug?

loss = self.soft_plus(torch.logsumexp(logit_n, dim=0) + torch.logsumexp(logit_p, dim=0))
Shouldn't the '+' in this line be a '*'? It feels inconsistent with the description in the original paper.
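
For what it is worth, the '+' inside the softplus is consistent with the product in the paper, because both softplus and logsumexp work in log space:

$$\operatorname{softplus}\big(\mathrm{LSE}(\mathrm{logit}_n) + \mathrm{LSE}(\mathrm{logit}_p)\big)
  = \log\Big(1 + e^{\mathrm{LSE}(\mathrm{logit}_n)}\, e^{\mathrm{LSE}(\mathrm{logit}_p)}\Big)
  = \log\Big(1 + \textstyle\sum_j e^{\mathrm{logit}_n^j} \sum_i e^{\mathrm{logit}_p^i}\Big),$$

which is exactly the product of the two exponential sums inside the paper's log term.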

About the gradients in the circle loss implementation

Hello, the implementation uses the following two lines to compute ap and an:
ap = torch.clamp_min(- sp.detach() + 1 + self.m, min=0.)
an = torch.clamp_min(sn.detach() + self.m, min=0.)
They include a detach operation; with that, is the gradient computation still consistent with the derivation in the paper?

Summing sn/sp over the batch dimension directly seems wrong

The formulas in the paper are all for a single sample. Your implementation sums sn and sp over a whole batch, which does not seem to match the formula.
Suppose

for the first sample, l1 = log(1 + exp(a11 + a12) * exp(...))
for the second sample, l2 = log(1 + exp(a21) * exp(...))

Then l1 + l2 = log[(1 + exp(a11 + a12) * exp(...)) * (1 + exp(a21) * exp(...))] != log[1 + exp(a11 + a12 + a21) * exp(...)]

Is this CircleLoss computed over batched data?

Hello, a quick question:
The processing in your CircleLoss code and the final printed value look like the computation of CircleLoss for a single data point, but your input,
feat = nn.functional.normalize(torch.rand(256, 64, requires_grad=True))
lbl = torch.randint(high=10, size=(256,))
looks like a batched computation over 256 samples.
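
A quick way to see what is aggregated (same illustrative inputs and hyperparameters as in the sketch near the top): the whole 256-sample batch is first unrolled into positive and negative similarity pairs, and the loss is one scalar over all of them.

import torch
from torch import nn

from circle_loss import CircleLoss, convert_label_to_similarity

feat = nn.functional.normalize(torch.rand(256, 64, requires_grad=True))
lbl = torch.randint(high=10, size=(256,))

sp, sn = convert_label_to_similarity(feat, lbl)
print(sp.shape, sn.shape)                      # counts of positive / negative pairs drawn from the batch
print(CircleLoss(m=0.25, gamma=256)(sp, sn))   # a single scalar loss over all of those pairs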

RuntimeError: name not implemented for 'Bool'

Traceback (most recent call last):
  File "main_video_person_reid.py", line 378, in <module>
    main()
  File "main_video_person_reid.py", line 217, in main
    train(model, criterion_circ, optimizer, trainloader, use_gpu)
  File "main_video_person_reid.py", line 273, in train
    loss = criterion_circ(*convert_label_to_similarity(features, pids))
  File "/home/lz3316866/vidreid_cosegmentation/src/circle_loss.py", line 11, in convert_label_to_similarity
    positive_matrix = label_matrix.triu(diagonal=1)
RuntimeError: name not implemented for 'Bool'
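
A possible workaround for older PyTorch versions where triu is not implemented for bool tensors (the mask construction below is an assumption about what convert_label_to_similarity does): build the masks with an integer dtype and cast back to bool afterwards.

import torch

label = torch.tensor([0, 1, 0, 2])

# Assumption: label_matrix[i, j] is True iff samples i and j share a label.
label_eq = (label.unsqueeze(1) == label.unsqueeze(0)).to(torch.uint8)

# triu is implemented for uint8, so take the upper triangle there and convert back to bool.
positive_matrix = label_eq.triu(diagonal=1).bool()
negative_matrix = (1 - label_eq).triu(diagonal=1).bool()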

What if similarity_matrix[positive_matrix] is empty?

For convert_label_to_similarity, it is quite possible that positive_matrix is all False. In that case, similarity_matrix[positive_matrix] has no elements at all, which causes an error in the subsequent circle loss computation. How can we handle such a case?
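
One possible guard (not from this repository; the hyperparameters and the many-class setup are illustrative): skip the circle loss for batches that contain no positive pair, or use a P-identities-by-K-instances sampler so the degenerate case never occurs.

import torch
from torch import nn

from circle_loss import CircleLoss, convert_label_to_similarity

criterion = CircleLoss(m=0.25, gamma=256)
feat = nn.functional.normalize(torch.rand(8, 64, requires_grad=True))
lbl = torch.randint(high=1000, size=(8,))   # many classes, so a batch may hold no positive pair

sp, sn = convert_label_to_similarity(feat, lbl)
if sp.numel() == 0 or sn.numel() == 0:
    # Degenerate batch: nothing to pull together (or push apart), so contribute a zero loss.
    loss = torch.zeros((), device=feat.device, requires_grad=True)
else:
    loss = criterion(sp, sn)
loss.backward()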
