
bilinear-cnn's Issues

How to get the mean and std of the data normalization transform in your code?

Hi, I am confused about something in your code. The mean and std of the data normalization transform in your code are [(0.485, 0.456, 0.406), (0.229, 0.224, 0.225)]. But when I computed the mean and std of the training data, I got [(0.4856, 0.4994, 0.4324), (0.1817, 0.1811, 0.1927)]. When I used the values I computed, the test accuracy I got was lower than yours. At first I thought you might have used the mean and std of the whole dataset, but when I computed those I got values very close to my earlier result. So can you tell me how you got the mean and std of the data normalization transform in your code? Thank you!
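
For reference, a minimal sketch of how per-channel statistics are usually computed with torchvision; the dataset path and image size below are placeholders, not taken from this repo:

    import torch
    import torchvision

    # Hypothetical dataset path and resize; adjust for CUB-200 or your data.
    dataset = torchvision.datasets.ImageFolder(
        'data/cub200/train',
        transform=torchvision.transforms.Compose([
            torchvision.transforms.Resize((448, 448)),
            torchvision.transforms.ToTensor(),
        ]))
    loader = torch.utils.data.DataLoader(dataset, batch_size=64)

    # Accumulate per-channel pixel sums, then derive mean and std.
    n_pixels = 0
    channel_sum = torch.zeros(3)
    channel_sq_sum = torch.zeros(3)
    for images, _ in loader:
        n_pixels += images.numel() // 3
        channel_sum += images.sum(dim=(0, 2, 3))
        channel_sq_sum += (images ** 2).sum(dim=(0, 2, 3))

    mean = channel_sum / n_pixels
    std = (channel_sq_sum / n_pixels - mean ** 2).sqrt()
    print(mean, std)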

Cannot reproduce accuracy 84% (after step2)

Hi Hao,

Thank you for a neat implementation.

I wonder if training with the hyperparameters written in the README

 --base_lr 1e-2 \
 --batch_size 64 --epochs 25 --weight_decay 1e-5 \
 --model "model.pth" 

gives 84.17% test accuracy?

I used exactly the commands you provide in the README:

    Step 1.
    $ CUDA_VISIBLE_DEVICES=0,1,2,3 ./src/bilinear_cnn_fc.py --base_lr 1.0 \
          --batch_size 64 --epochs 55 --weight_decay 1e-8 \
          | tee "[fc-] base_lr_1.0-weight_decay_1e-8-epoch_.log"

    Step 2. 
    $ CUDA_VISIBLE_DEVICES=0,1,2,3 ./src/bilinear_cnn_all.py --base_lr 1e-2 \
          --batch_size 64 --epochs 25 --weight_decay 1e-5 \
          --model "model.pth" \
          | tee "[all-] base_lr_1e-2-weight_decay_1e-5-epoch_.log"

I trained the step-1 model and got 76.67% accuracy on the test set. I used this as initialization for the step-2 model and fine-tuned all the layers further, but the accuracy saturates at 76.61% and doesn't improve.

Are there any extra tricks to get the desired performance?

out of memory

Hi! After 2 epochs the backward pass runs out of memory :( The first epoch is okay, but then it crashes on the second one. It seems like something is storing the graph; I changed a few things, but it still crashes:

    for X, y in self._train_loader:
        # Data.
        X = X.cuda()
        y = y.cuda()

        # Forward pass.
        score = self._net(X)
        loss = self._criterion(score, y.long())

        with torch.no_grad():
            epoch_loss += loss.item()
            # Prediction.
            prediction = torch.argmax(score, dim=1)
            num_total += y.size(0)
            num_correct += torch.sum(prediction == y.long()).item()

        # Backward pass: clear the existing gradients first.
        self._optimizer.zero_grad()
        loss.backward()
        self._optimizer.step()

        total_batches += 1
        del X, y, score, loss, prediction
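
A common cause of memory growth across epochs is keeping a reference to a tensor that is still attached to the autograd graph; the loop above already uses loss.item(), so the following is just the general pattern for reference (a toy, self-contained sketch, not this repo's code):

    import torch

    net = torch.nn.Linear(10, 2).cuda()
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

    for epoch in range(2):
        epoch_loss = 0.0
        for _ in range(5):
            X = torch.randn(8, 10, device='cuda')
            y = torch.randint(0, 2, (8,), device='cuda')

            loss = criterion(net(X), y)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            # Accumulate a Python float, not the loss tensor, so no
            # reference to the graph survives the iteration.
            epoch_loss += loss.item()

        # Optional: release cached, unused GPU blocks between epochs.
        torch.cuda.empty_cache()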

download the model.pth

Thank you very much for your code! But where can I find the model.pth for fine-tuning? Or does it need to be trained by myself?

bilinear sqrt with sign

Hi, this is a concise and useful implementation of bilinear CNNs. However, the paper says that the
"elementwise signed square-root (x ← sign(x)√|x|) and l2 normalization is applied to the matrix A",
which means the square root should be multiplied by the sign. But this code just does "X = torch.sqrt(X + 1e-5)".

Am I missing something? And even though this is not exactly the same, I got the same result (84.2%), which suggests it might be a right answer anyway?

About step1 and step2

Hi, is step 2's network initialized from step 1's FC parameters, or is it training a vgg16 net from scratch?
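
For context, the README's step-2 command passes --model "model.pth", which suggests step 2 restores the step-1 checkpoint rather than starting over. A minimal sketch of that pattern (the vgg16 stand-in here is my assumption, not the repo's actual class):

    import torch
    import torchvision

    # Toy stand-in for the repo's model; in step 1 only the new fc layer
    # is trained, and the result is saved as a checkpoint.
    net = torchvision.models.vgg16(num_classes=200)
    torch.save(net.state_dict(), 'model.pth')

    # Step 2: restore the step-1 parameters (not from scratch) and
    # fine-tune all layers with a smaller learning rate.
    net.load_state_dict(torch.load('model.pth'))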

Signed square root

Hi Hao

First of all thanks for the excellent implementation. I have used the code here as a reference for my own implementations.

In the original paper (http://vis-www.cs.umass.edu/bcnn/docs/bcnn_iccv15.pdf) the authors have used signed square root operation. Something like:

X = torch.mul(torch.sign(X),torch.sqrt(torch.abs(X)+1e-5))

instead of the plain square root you used: X = torch.sqrt(X + 1e-5)

Was there a particular reason for this?
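
My guess (not the author's statement): the VGG conv features come out of a ReLU, so every entry of the pooled matrix is already non-negative, and sign(x)√|x| reduces to √x, which would explain the matching accuracy. A quick self-contained check:

    import torch

    # Features after ReLU are non-negative, so every entry of the
    # pooled bilinear matrix A @ A.T is non-negative as well.
    A = torch.relu(torch.randn(512, 28 ** 2))
    X = A @ A.t() / (28 ** 2)

    plain = torch.sqrt(X + 1e-5)
    signed = torch.sign(X) * torch.sqrt(torch.abs(X) + 1e-5)
    print(torch.allclose(plain, signed))  # True, since X >= 0 here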

confusion about the parameters of Normalize

Regarding lines 129-130 in bilinear_cnn_fc.py, I'm confused about the magic numbers in

Normalize(mean=(0.485, 0.456, 0.406),
          std=(0.229, 0.224, 0.225))

where are these numbers from?
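
For context (my note, not from the repo): these match the per-channel mean and std conventionally used with ImageNet-pretrained torchvision models, and Normalize just applies (x - mean) / std channel-wise:

    import torch
    from torchvision import transforms

    normalize = transforms.Normalize(mean=(0.485, 0.456, 0.406),
                                     std=(0.229, 0.224, 0.225))

    img = torch.rand(3, 4, 4)  # a fake image tensor in [0, 1]
    out = normalize(img)

    # Normalize is the channel-wise affine map (x - mean) / std.
    mean = torch.tensor(normalize.mean).view(3, 1, 1)
    std = torch.tensor(normalize.std).view(3, 1, 1)
    print(torch.allclose(out, (img - mean) / std))  # True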

Question about the outer product

X = torch.bmm(X, torch.transpose(X, 1, 2)) / (28**2) # Bilinear

Feature map A has size (C, M) and feature map B has size (C, N). The paper says:

If fA and fB extract features of size C × M and C × N respectively, then Φ(I) is of size M × N.

But with your code the result is C × C. However, the experiment section of the paper also seems to give your 512*512 result, i.e. C × C. I'm confused; I hope you can clarify. Thanks.
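
If it helps, in this implementation the two streams are identical (fA = fB are the same VGG conv features), and the feature vector at each spatial location has 512 channels, so the paper's M and N are both 512; the pooled descriptor is then 512 × 512, i.e. C × C in the notation above. A minimal shape check (my own sketch, same shapes as the repo):

    import torch

    N, C, H, W = 2, 512, 28, 28
    X = torch.randn(N, C, H * W)  # conv features: C channels at H*W locations

    # The batched matmul sums the per-location outer products:
    # sum_l X[:, :, l] @ X[:, :, l].T, giving shape (N, C, C).
    phi = torch.bmm(X, X.transpose(1, 2)) / (28 ** 2)
    print(phi.shape)  # torch.Size([2, 512, 512])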

Hi

Hi, I just used your class BCNN as a module, but what I get is the same classification result for different images. Is there something wrong?

Here is the output of the predicted class and the true class:

    predict_class: tensor([61, 61, 61, 61, 61, 61, 61, 61], device='cuda:0')
    truth_class: tensor([180, 151, 187, 33, 70, 36, 109, 54], device='cuda:0')

The training process:

    data = data.to(opt.device)
    label = label.to(opt.device)
    optimizer.zero_grad()
    score = bcnn_model(data)
    loss = criterion(score, label)
    loss.backward()
    optimizer.step()

question about the bilinear pooling operation

In the forward function of the BCNN class, the bilinear operation is

X = torch.bmm(X, torch.transpose(X, 1, 2)) / (28**2) # Bilinear

why does the result of the matrix multiplication need to be divided by (28 ** 2)?
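
If I read it right, 28 × 28 is the spatial size of the conv feature map (for 448 × 448 inputs), so dividing by 28 ** 2 turns the sum of per-location outer products into their average, a form of normalized sum pooling. A quick check that bmm / (28 ** 2) equals the mean outer product:

    import torch

    C, H, W = 512, 28, 28
    X = torch.randn(1, C, H * W)

    # bmm sums the outer product over all H*W spatial locations;
    # dividing by 28 ** 2 makes it the mean instead of the sum.
    pooled = torch.bmm(X, X.transpose(1, 2)) / (28 ** 2)
    mean_outer = torch.stack(
        [torch.outer(X[0, :, l], X[0, :, l]) for l in range(H * W)]).mean(dim=0)
    print(torch.allclose(pooled[0], mean_outer, atol=1e-4))  # True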

About step 2

@HaoMood, hi, thank you very much for your work.
When I run your code, the test acc in step 1 is 76%, and the best saved model is vgg_16_epoch_21.pth.
But in step 2, loading this epoch-21 model gives a train acc of 1% and a test acc of 0.
What could be the problem?

zombie process when using multiple gpu

Hi, thanks a lot for your code! Everything works well when I use only one GPU by setting CUDA_VISIBLE_DEVICES=0 (for example), but when I use multiple GPUs by setting CUDA_VISIBLE_DEVICES=0,1, the process becomes a zombie: it is not actually training, but it still holds the GPU and CPU resources. Worst of all, you cannot even kill it with "kill -9 PID"; the only fix is a reboot. Have you come across this issue before? Thanks a lot!

About memory

In your README, I see you used 4 GPUs. How much memory is used in total in your step 1?
