
bilinear-cnn's Issues

How to get the mean and std of the data normalization transform in your code?

Hi, I am confused about something in your code. The mean and std of the data normalization transform in your code are [(0.485, 0.456, 0.406), (0.229, 0.224, 0.225)]. But when I computed the mean and std of the training data, I got [(0.4856, 0.4994, 0.4324), (0.1817, 0.1811, 0.1927)]. When I used the values I computed, the test accuracy I got was lower than yours. At first I thought you might have used the mean and std of the whole dataset, but when I computed those I got values very close to my earlier result. So can you tell me how you got the mean and std of the data normalization transform in your code? Thank you!
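
For reference, a minimal sketch of how per-channel statistics are usually computed with torchvision; the dataset path and image size below are placeholders, not taken from this repo:

    import torch
    import torchvision

    # Hypothetical dataset path and resize; adjust for CUB-200 or your data.
    dataset = torchvision.datasets.ImageFolder(
        'data/cub200/train',
        transform=torchvision.transforms.Compose([
            torchvision.transforms.Resize((448, 448)),
            torchvision.transforms.ToTensor(),
        ]))
    loader = torch.utils.data.DataLoader(dataset, batch_size=64)

    # Accumulate per-channel pixel sums, then derive mean and std.
    n_pixels = 0
    channel_sum = torch.zeros(3)
    channel_sq_sum = torch.zeros(3)
    for images, _ in loader:
        n_pixels += images.numel() // 3
        channel_sum += images.sum(dim=(0, 2, 3))
        channel_sq_sum += (images ** 2).sum(dim=(0, 2, 3))

    mean = channel_sum / n_pixels
    std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()
    print(mean, std)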

Cannot reproduce accuracy 84% (after step2)

Hi Hao,

Thank you for a neat implementation.

I wonder if training with the hyperparameters written in the README

 --base_lr 1e-2 \
 --batch_size 64 --epochs 25 --weight_decay 1e-5 \
 --model "model.pth" 

gives 84.17% test accuracy?

I used exactly the commands you provide in the README:

    Step 1.
    $ CUDA_VISIBLE_DEVICES=0,1,2,3 ./src/bilinear_cnn_fc.py --base_lr 1.0 \
          --batch_size 64 --epochs 55 --weight_decay 1e-8 \
          | tee "[fc-] base_lr_1.0-weight_decay_1e-8-epoch_.log"

    Step 2. 
    $ CUDA_VISIBLE_DEVICES=0,1,2,3 ./src/bilinear_cnn_all.py --base_lr 1e-2 \
          --batch_size 64 --epochs 25 --weight_decay 1e-5 \
          --model "model.pth" \
          | tee "[all-] base_lr_1e-2-weight_decay_1e-5-epoch_.log"

I trained the step-1 model and got 76.67% accuracy on the test set. I used this as initialization for the step-2 model and fine-tuned all the layers further, but the accuracy saturates at 76.61% and doesn't improve.

Are there any extra tricks to get the desired performance?

out of memory

Hi! After 2 epochs the backward pass runs out of memory :( The first epoch is okay, but then it crashes on the second one. It seems like something is storing the graph; I changed a few things, but it still crashes:

    for X, y in self._train_loader:
        # Data.
        X = X.cuda()
        y = y.cuda()

        # Forward pass.
        score = self._net(X)
        loss = self._criterion(score, y.long())

        with torch.no_grad():
            epoch_loss += loss.item()
            # Prediction.
            prediction = torch.argmax(score, dim=1)
            num_total += y.size(0)
            num_correct += torch.sum(prediction == y.long()).item()

        # Backward pass: clear the existing gradients first.
        self._optimizer.zero_grad()
        loss.backward()
        self._optimizer.step()

        total_batches += 1
        del X, y, score, loss, prediction
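
A common cause of memory growth across epochs is keeping a reference to a tensor that is still attached to the autograd graph; the loop above already uses loss.item(), so the following is just the general pattern for reference (a toy, self-contained sketch, not this repo's code):

    import torch

    net = torch.nn.Linear(10, 2).cuda()
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

    for epoch in range(2):
        epoch_loss = 0.0
        for _ in range(5):
            X = torch.randn(8, 10, device='cuda')
            y = torch.randint(0, 2, (8,), device='cuda')

            loss = criterion(net(X), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            # Accumulate a Python float, not the loss tensor, so no
            # reference to the graph survives the iteration.
            epoch_loss += loss.item()

        # Optional: release cached, unused GPU blocks between epochs.
        torch.cuda.empty_cache()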

download the model.pth

Thank you very much for your code! But where can I find the model.pth for fine-tuning? Or does it need to be trained by myself?

bilinear sqrt with sign

Hi, this is a concise and useful implementation of bilinear CNNs. However, the paper says that the
"elementwise signed square-root (x ← sign(x)√|x|) and l2 normalization is applied to the matrix A",
which means the square root should be multiplied by the sign. But this code just does "X = torch.sqrt(X + 1e-5)".

Am I missing something? And even though this is not exactly the same, I got the same result (84.2%), which suggests it might be a right answer anyway?

About step1 and step2

Hi, is step 2's network initialized from step 1's FC parameters, or is it training a vgg16 net from scratch?
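
For context, the README's step-2 command passes --model "model.pth", which suggests step 2 restores the step-1 checkpoint rather than starting over. A minimal sketch of that pattern (the vgg16 stand-in here is my assumption, not the repo's actual class):

    import torch
    import torchvision

    # Toy stand-in for the repo's model; in step 1 only the new fc layer
    # is trained, and the result is saved as a checkpoint.
    net = torchvision.models.vgg16(num_classes=200)
    torch.save(net.state_dict(), 'model.pth')

    # Step 2: restore the step-1 parameters (not from scratch) and
    # fine-tune all layers with a smaller learning rate.
    net.load_state_dict(torch.load('model.pth'))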

Signed square root

Hi Hao

First of all thanks for the excellent implementation. I have used the code here as a reference for my own implementations.

In the original paper (http://vis-www.cs.umass.edu/bcnn/docs/bcnn_iccv15.pdf) the authors have used signed square root operation. Something like:

X = torch.mul(torch.sign(X),torch.sqrt(torch.abs(X)+1e-5))

instead of the plain square root you used: X = torch.sqrt(X + 1e-5)

Was there a particular reason for this?
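
My guess (not the author's statement): the VGG conv features come out of a ReLU, so every entry of the pooled matrix is already non-negative, and sign(x)√|x| reduces to √x, which would explain the matching accuracy. A quick self-contained check:

    import torch

    # Features after ReLU are non-negative, so every entry of the
    # pooled bilinear matrix A @ A.T is non-negative as well.
    A = torch.relu(torch.randn(512, 28 ** 2))
    X = A @ A.t() / (28 ** 2)

    plain = torch.sqrt(X + 1e-5)
    signed = torch.sign(X) * torch.sqrt(torch.abs(X) + 1e-5)
    print(torch.allclose(plain, signed))  # True, since X >= 0 here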

confusion about the parameters of Normalize

Regarding lines 129-130 in bilinear_cnn_fc.py, I'm confused about the magic numbers in

Normalize(mean=(0.485, 0.456, 0.406),
          std=(0.229, 0.224, 0.225))

where are these numbers from?
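
For context (my note, not from the repo): these match the per-channel mean and std conventionally used with ImageNet-pretrained torchvision models, and Normalize just applies (x - mean) / std channel-wise:

    import torch
    from torchvision import transforms

    normalize = transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                     std=(0.229, 0.224, 0.225))

    img = torch.rand(3, 4, 4)  # a fake image tensor in [0, 1]
    out = normalize(img)

    # Normalize is the channel-wise affine map (x - mean) / std.
    mean = torch.tensor(normalize.mean).view(3, 1, 1)
    std = torch.tensor(normalize.std).view(3, 1, 1)
    print(torch.allclose(out, (img - mean) / std))  # True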

Question about the outer product

X = torch.bmm(X, torch.transpose(X, 1, 2)) / (28**2) # Bilinear

Feature map A has size (C, M) and feature map B has size (C, N). The paper says:

If fA and fB extract features of size C × M and C × N respectively, then Φ(I) is of size M × N.

But with your code the result is C × C. However, the experiment section of the paper also seems to give your 512*512 result, i.e. C × C. I'm confused; I hope you can clarify. Thanks.
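
If it helps, in this implementation the two streams are identical (fA = fB are the same VGG conv features), and the feature vector at each spatial location has 512 channels, so the paper's M and N are both 512; the pooled descriptor is then 512 × 512, i.e. C × C in the notation above. A minimal shape check (my own sketch, same shapes as the repo):

    import torch

    N, C, H, W = 2, 512, 28, 28
    X = torch.randn(N, C, H * W)  # conv features: C channels at H*W locations

    # The batched matmul sums the per-location outer products:
    # sum_l X[:, :, l] @ X[:, :, l].T, giving shape (N, C, C).
    phi = torch.bmm(X, X.transpose(1, 2)) / (28 ** 2)
    print(phi.shape)  # torch.Size([2, 512, 512])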

Hi

Hi, I just used your class BCNN as a module, but what I get is the same classification result for different images. Is there something wrong?

Here is the output of the predicted class and the true class:

    predict_class: tensor([61, 61, 61, 61, 61, 61, 61, 61], device='cuda:0')
    truth_class: tensor([180, 151, 187, 33, 70, 36, 109, 54], device='cuda:0')

The training process:

    data = data.to(opt.device)
    label = label.to(opt.device)
    optimizer.zero_grad()
    score = bcnn_model(data)
    loss = criterion(score, label)
    loss.backward()
    optimizer.step()

question about the bilinear pooling operation

In the forward function of the BCNN class, the bilinear operation is

X = torch.bmm(X, torch.transpose(X, 1, 2)) / (28**2) # Bilinear

why does the result of the matrix multiplication need to be divided by (28 ** 2)?
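
If I read it right, 28 × 28 is the spatial size of the conv feature map (for 448 × 448 inputs), so dividing by 28 ** 2 turns the sum of per-location outer products into their average, a form of normalized sum pooling. A quick check that bmm / (28 ** 2) equals the mean outer product:

    import torch

    C, H, W = 512, 28, 28
    X = torch.randn(1, C, H * W)

    # bmm sums the outer product over all H*W spatial locations;
    # dividing by 28 ** 2 makes it the mean instead of the sum.
    pooled = torch.bmm(X, X.transpose(1, 2)) / (28 ** 2)
    mean_outer = torch.stack(
        [torch.outer(X[0, :, l], X[0, :, l]) for l in range(H * W)]).mean(dim=0)
    print(torch.allclose(pooled[0], mean_outer, atol=1e-4))  # True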

About step 2

@HaoMood, hi, thank you very much for your work.
When I run your code, the test acc in step 1 is 76%, and the best saved model is vgg_16_epoch_21.pth.
But in step 2, loading this epoch-21 model gives a train acc of 1% and a test acc of 0.
What could be the problem?

zombie process when using multiple gpu

Hi, thanks a lot for your code! Everything works well when I use only one GPU by setting CUDA_VISIBLE_DEVICES=0 (for example), but when I use multiple GPUs by setting CUDA_VISIBLE_DEVICES=0,1, the process becomes a zombie: it is not actually training, but it still holds the GPU and CPU resources. Worst of all, you cannot even kill it with "kill -9 PID"; the only fix is a reboot. Have you come across this issue before? Thanks a lot!

About memory

In your README, I see you used 4 GPUs. How much memory is used in total in your step 1?
