yuhuayc / da-faster-rcnn

An implementation of our CVPR 2018 work 'Domain Adaptive Faster R-CNN for Object Detection in the Wild'

License: Other



da-faster-rcnn's Issues

training problem

hi, @yuhuayc
In your Usage section, you mentioned that

source domain data should start with the filename 'source_', and target domain data with 'target_'.

1. However, I can't find any code referencing 'source_' or 'target_' in pascal_voc.py (which is used to load the data if I train on Pascal VOC). Could you explain why, and how to fix it?
2. Besides, could you tell me more details about how to train on pascal_voc or my own dataset:

  1. Which hyperparameters or cfg file should I modify?
  2. Which details should I pay attention to?
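For reference, the prefix convention described above could be consumed by a data loader along these lines (a minimal sketch; `split_by_domain` is an illustrative name, not a function from this repository):

```python
def split_by_domain(image_index):
    """Split an imdb image index into source/target lists by filename prefix."""
    source = [name for name in image_index if name.startswith('source_')]
    target = [name for name in image_index if name.startswith('target_')]
    return source, target

idx = ['source_000001', 'source_000002', 'target_aachen_000000']
src, tgt = split_by_domain(idx)
print(len(src), len(tgt))  # 2 1
```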

cannot reproduce DA results from Table I in the paper

Hi,

I'm trying to reproduce results of the Domain Adaptive Faster R-CNN for GTA Sim 10k -> Cityscapes domain shift (last row in Table I in your paper). However, I'm getting much lower numbers somehow.

My training data looks as follows (trainval.txt file):

source_3384639
source_3384643
source_3384645
...
target_aachen_000000_000019_leftImg8bit
target_aachen_000001_000019_leftImg8bit
target_aachen_000002_000019_leftImg8bit
...

source_* are GTA Sim 10k images from Driving in the Matrix, while target_* are Cityscapes train images (2,975), giving 12,975 images in trainval altogether. In the GTA Sim 10k annotations I have replaced the motorbike class with Cityscapes' motorcycle.

The test data (test.txt) consists of Cityscapes val images (500):

target_frankfurt_000001_023369_leftImg8bit
target_frankfurt_000001_075296_leftImg8bit
target_frankfurt_000000_006589_leftImg8bit
...
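For anyone setting up the same experiment, the two prefixed image lists above can be generated with a few lines of Python (a sketch; the id lists shown are just the examples from this post, and in practice they would come from the Sim 10k and Cityscapes file listings):

```python
# Illustrative id lists; real ones come from the Sim 10k and Cityscapes datasets.
sim10k_ids = ['3384639', '3384643', '3384645']
cityscapes_ids = ['aachen_000000_000019_leftImg8bit']

# Prefix each id with its domain and write the combined VOC-style index file.
lines = ['source_' + s for s in sim10k_ids] + ['target_' + c for c in cityscapes_ids]
with open('trainval.txt', 'w') as f:
    f.write('\n'.join(lines) + '\n')
```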

For the training I'm just following instructions from your GitHub page:

./tools/train_net.py --gpu 0 --solver models/da_faster_rcnn/solver.prototxt --weights data/imagenet_models/VGG16.v2.caffemodel --imdb voc_2007_trainval --iters 70000 --cfg models/da_faster_rcnn/faster_rcnn_end2end.yml

I'm getting the following car AP values:

10k: 32.72
20k: 31.12
30k: 33.51
40k: 33.98
50k: 30.90
60k: 33.53
70k: 33.94

These numbers are much lower than the 38.97 reported in the paper. Do you have any idea what might be wrong in my pipeline?

Thanks a lot in advance for your help! Very much appreciated!

Best,
Alexey

Code complete?

Hi, did you release all of your code?
I found that all the files except the prototxt files are the same as in py-faster-rcnn.

Run prepare_data.m error

Unable to perform assignment because the left and right sides have a different number of elements.

Error in prepare_data (line 103)
gt.category(i_inst) = lb_filter(inst_lb);

How can I solve this error?
Thank you.

Details of source target setting for KITTI to Cityscapes and Cityscapes to KITTI

@yuhuayc Hi yuhua, thanks for your wonderful work! I'm wondering: for K->C, do you use the full KITTI training data as source, Cityscapes train as unlabeled target data, and Cityscapes val as target test data? For C->K, do you use the full Cityscapes trainval as source training data, and split KITTI into unlabeled target data and test data? I'm reproducing this part and haven't found these details in the code or the paper. Looking forward to your reply!

Experimental detail

When performing the K->C task, is only the car class used when training on the KITTI dataset? Or are all classes trained, with only the AP of the car class shown in the final result?

Questions about experiment setting.

Hi!
Nice work! I want to try the Foggy Cityscapes dataset, but I cannot find the exact experiment setting in the paper. Do you take the Cityscapes training set as source and the foggy validation set as target? And is the quantitative result computed on the foggy validation set?
By the way, I find that the instance labels of the Cityscapes test set are all 3. So there are actually no instance labels for the Cityscapes test set?
Thanks.

Questions

  1. The paper says you are using batch_size = 2; however, the py-faster-rcnn code yours is based on does not seem to support batch_size = 2.
  2. What does the slice_feats node do in the training phase?
  3. Why is slice_point set to 128?
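For context on questions 2 and 3: a Caffe Slice layer splits a blob along an axis at the given slice_point. With two images per batch and (presumably) 128 sampled ROIs per image, slicing the 256 ROI features at index 128 would separate the source-image ROIs from the target-image ROIs, e.g. in NumPy terms:

```python
import numpy as np

# Assume a batch of 256 ROI feature vectors: the first 128 from the source
# image, the last 128 from the target image (2 images, 128 ROIs each).
roi_feats = np.zeros((256, 4096))

# Equivalent of a Caffe Slice layer with slice_point: 128 on axis 0.
source_feats, target_feats = np.split(roi_feats, [128], axis=0)
print(source_feats.shape, target_feats.shape)  # (128, 4096) (128, 4096)
```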

What should I do if I want to use 2 or more GPUs?

./tools/train_net.py --gpu {GPU_ID} --solver models/da_faster_rcnn/solver.prototxt --weights data/faster_rcnn_models/VGG16_faster_rcnn_final.caffemodel --imdb voc_2007_trainval --iters 70000 --cfg models/da_faster_rcnn/faster_rcnn_end2end.yml

Weight (lambda) value for image level adaptation and hyperparameter request

Hi, I found that the weight for the image-level adaptation loss in "train.prototxt" is set to 1.0, which is not consistent with your paper (all lambda values set to 0.1).

layer {
  name: "da_conv_loss"
  type: "SoftmaxWithLoss"
  bottom: "da_score_ss"
  bottom: "da_label_ss_resize"
  top: "da_conv_loss"
  loss_param {
    ignore_label: 255
    normalize: 1
  }
  propagate_down: 1
  propagate_down: 0
  loss_weight: 1
}

Also, the "lr_mult" values for the instance-level domain classifier are 10 times larger than for the other conv and fc layers.

layer {
  name: "dc_ip3"
  type: "InnerProduct"
  bottom: "dc_ip2"
  top: "dc_ip3"
  param {
    lr_mult: 10
  }
  param {
    lr_mult: 20
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "gaussian"
      # std: 0.3
      std: 0.05
    }
    bias_filler {
      type: "constant"
    }
  }
}

Can you provide the exact hyperparameters ("loss_weight", "lr_mult", "gradient_scaler_param") used in your paper?
It would be appreciated to get the hyperparameters for each setting (image-level DA; image + instance-level DA; image + instance-level DA + consistency loss) and dataset (sim10k -> cityscapes, cityscapes -> foggy cityscapes, kitti <-> cityscapes). Thank you.

How to download foggy image set?

Hi. I'm trying to download the Cityscapes dataset for training. I cannot download leftImg8bit_trainvaltest_foggy unless I send a request. However, I couldn't find contact information for sending one.

How can I send a request to download the foggy dataset? Thanks.

Is "need_backprop" used in the "GradientSilent" layer?

@yuhuayc I think the input "need_backprop" is used to decide whether the gradient should be propagated backward or not. However, I cannot find any reference to it in the definition of the "GradientSilent" layer in 'caffe-fast-rcnn/src/caffe/layers/gradient_silent_layer.cpp'. I wonder if I have misunderstood the usage of the "need_backprop" input.

layer {
  name: "loss_cls_filter"
  type: "GradientSilent"
  bottom: "cls_score"
  bottom: "need_backprop"
  top: "cls_score_filter"
}
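For illustration, here is the behavior the layer's name and inputs suggest: identity in the forward pass, with the gradient passed through only when need_backprop is set. This is a NumPy sketch of the presumed semantics, not the repository's actual C++ implementation; if gradient_silent_layer.cpp never reads need_backprop, the gating may happen elsewhere.

```python
import numpy as np

def gradient_silent_forward(x, need_backprop):
    """Identity in the forward pass; the flag only matters on the way back."""
    return x

def gradient_silent_backward(grad_top, need_backprop):
    """Pass the gradient through only when need_backprop is set (e.g. source images)."""
    return grad_top if need_backprop else np.zeros_like(grad_top)

g = np.ones((4,))
print(gradient_silent_backward(g, 1))  # [1. 1. 1. 1.]
print(gradient_silent_backward(g, 0))  # [0. 0. 0. 0.]
```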

Few questions for adaptation strategy.

@yuhuayc For domain adaptation of detectors from the source domain to the target domain, we always train from the ImageNet pre-trained model. However, in practical applications, a model pre-trained on the source domain is usually available. Why don't we fine-tune from the model pre-trained on the source domain, instead of fine-tuning from the ImageNet pre-trained model? The latter seems to take more time. Could you explain the reason for this?

problem about prepare_data.m

system(sprintf('find %s -name "*.png"', ...
fullfile(source_data_dir,'leftImg8bit','train')))
Access denied - CITYSCAPES\LEFTIMG8BIT\TRAIN
File not found - -NAME

ans =

 1

How do you select the ROIs for unlabeled images from the target domain?

The problem is that, by definition, you do not have ground truth for unlabeled images.

However, you need ground truth

  • in the definition of your blob objects, in lib.roi_data_da_layer.minibatch
  • in order to select the ROIs

I do not see in the code how you made this possible... and I am trying to reproduce the paper in TF.
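One common way to reconcile this (an assumption about the intended setup, not a claim about this repository's code) is to give each unlabeled target image a dummy ground-truth box so the minibatch blobs keep their expected shapes, and to rely on a need_backprop flag to silence the detection losses for those images:

```python
import numpy as np

def make_gt_boxes(name, annotations):
    """Return gt boxes for an image; target images get a dummy box and no backprop.
    Hypothetical helper, illustrating one possible minibatch convention."""
    if name.startswith('target_'):
        # One dummy background box so downstream blobs keep their expected shape.
        gt_boxes = np.array([[0, 0, 1, 1, 0]], dtype=np.float32)  # x1, y1, x2, y2, class
        need_backprop = 0
    else:
        gt_boxes = annotations[name]
        need_backprop = 1
    return gt_boxes, need_backprop

boxes, flag = make_gt_boxes('target_aachen_000000', {})
print(boxes.shape, flag)  # (1, 5) 0
```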

I can't run the prepare_data.m

At line 103, gt.category(i_inst) = lb_filter(inst_lb) fails with "the number of elements in A and B must be the same." Has anybody else faced this issue? My MATLAB is R2016b.

question about anchor scales

Hello, I am running the original baseline from KITTI to Cityscapes. When I use ANCHOR_SCALES = [8, 16, 32], I get results similar to those in your paper, but when I change ANCHOR_SCALES to [4, 8, 16, 32], I get a much higher baseline result of about 42%, well above the results in your paper. It then seems that domain adaptation on this dataset doesn't work, but it is also possible that something went wrong in my implementation. Could you set ANCHOR_SCALES to [4, 8, 16, 32] and run it? Thank you!
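For anyone comparing anchor settings: in py-faster-rcnn, each entry of ANCHOR_SCALES multiplies the 16-pixel base anchor, so adding scale 4 introduces 64-pixel (at aspect ratio 1) anchors that better cover smaller, more distant cars. A quick sketch of the arithmetic (re-implemented here, not imported from the repo):

```python
def anchor_sizes(base_size=16, scales=(4, 8, 16, 32)):
    """Side length of the square (ratio-1) anchor for each scale, as in py-faster-rcnn."""
    return [base_size * s for s in scales]

print(anchor_sizes())  # [64, 128, 256, 512]
```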
