yuhuayc / da-faster-rcnn

An implementation of our CVPR 2018 work 'Domain Adaptive Faster R-CNN for Object Detection in the Wild'

License: Other



da-faster-rcnn's Issues

training problem

hi, @yuhuayc
In your Usage section, you mentioned that

source domain data should start with the filename 'source_', and target domain data with 'target_'.

1. However, I can't find any code referencing 'source_' or 'target_' in pascal_voc.py (which is used to load the data if I train on Pascal VOC). Could you explain why, and how to fix it?
2. Besides, could you tell me more details about how to train on pascal_voc or my own dataset:

  1. Which hyperparameters or cfg file should I modify?
  2. Which details should I pay attention to?
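For reference, the prefix convention described above could be consumed by a data loader along these lines (a minimal sketch; `split_by_domain` is an illustrative name, not a function from this repository):

```python
def split_by_domain(image_index):
    """Split an imdb image index into source/target lists by filename prefix."""
    source = [name for name in image_index if name.startswith('source_')]
    target = [name for name in image_index if name.startswith('target_')]
    return source, target

idx = ['source_000001', 'source_000002', 'target_aachen_000000']
src, tgt = split_by_domain(idx)
print(len(src), len(tgt))  # 2 1
```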

cannot reproduce DA results from Table I in the paper

Hi,

I'm trying to reproduce results of the Domain Adaptive Faster R-CNN for GTA Sim 10k -> Cityscapes domain shift (last row in Table I in your paper). However, I'm getting much lower numbers somehow.

My training data looks as follows (trainval.txt file):

source_3384639
source_3384643
source_3384645
...
target_aachen_000000_000019_leftImg8bit
target_aachen_000001_000019_leftImg8bit
target_aachen_000002_000019_leftImg8bit
...

source_* are GTA Sim 10k images from Driving in the Matrix, while target_* are Cityscapes train images (2,975), giving 12,975 images in trainval altogether. In the GTA Sim 10k annotations I have replaced the motorbike class with Cityscapes' motorcycle.

The test data (test.txt) consists of Cityscapes val images (500):

target_frankfurt_000001_023369_leftImg8bit
target_frankfurt_000001_075296_leftImg8bit
target_frankfurt_000000_006589_leftImg8bit
...
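For anyone setting up the same experiment, the two prefixed image lists above can be generated with a few lines of Python (a sketch; the id lists shown are just the examples from this post, and in practice they would come from the Sim 10k and Cityscapes file listings):

```python
# Illustrative id lists; real ones come from the Sim 10k and Cityscapes datasets.
sim10k_ids = ['3384639', '3384643', '3384645']
cityscapes_ids = ['aachen_000000_000019_leftImg8bit']

# Prefix each id with its domain and write the combined VOC-style index file.
lines = ['source_' + s for s in sim10k_ids] + ['target_' + c for c in cityscapes_ids]
with open('trainval.txt', 'w') as f:
    f.write('\n'.join(lines) + '\n')
```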

For the training I'm just following instructions from your GitHub page:

./tools/train_net.py --gpu 0 --solver models/da_faster_rcnn/solver.prototxt --weights data/imagenet_models/VGG16.v2.caffemodel --imdb voc_2007_trainval --iters 70000 --cfg models/da_faster_rcnn/faster_rcnn_end2end.yml

I'm getting the following car AP values:

10k: 32.72
20k: 31.12
30k: 33.51
40k: 33.98
50k: 30.90
60k: 33.53
70k: 33.94

These numbers are much lower than the 38.97 reported in the paper. Do you have any idea what might be wrong in my pipeline?

Thanks a lot in advance for your help! Very much appreciated!

Best,
Alexey

Code complete?

Hi, did you release all of your code?
I found that all the files except the prototxt files are the same as in py-faster-rcnn.

Run prepare_data.m error

Unable to perform assignment because the left and right sides have a different number of elements.

Error in prepare_data (line 103)
gt.category(i_inst) = lb_filter(inst_lb);

How can I solve this error?
Thank you.

Details of source target setting for KITTI to Cityscapes and Cityscapes to KITTI

@yuhuayc Hi yuhua, thanks for your wonderful work! I'm wondering: for K->C, do you use the full KITTI training data as source, Cityscapes train as unlabeled target data, and Cityscapes val as target test data? For C->K, do you use the full Cityscapes trainval as source training data, and split KITTI into unlabeled target data and test data? I'm reproducing this part and haven't found these details in the code or the paper. Looking forward to your reply!

Experimental detail

When performing the K->C task, is only the car class used when training on the KITTI dataset? Or are all classes trained, with only the AP of the car class shown in the final result?

Questions about experiment setting.

Hi!
Nice work! I want to try the Foggy Cityscapes dataset, but I cannot find the exact experiment setting in the paper. Do you take the Cityscapes training set as source and the foggy validation set as target? And is the quantitative result computed on the foggy validation set?
By the way, I find that the instance labels of the Cityscapes test set are all 3. So there are actually no instance labels for the Cityscapes test set?
Thanks.

Questions

  1. The paper says you are using batch_size = 2; however, the py-faster-rcnn code yours is based on does not seem to support batch_size = 2.
  2. What does the slice_feats node do in the training phase?
  3. Why is slice_point set to 128?
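For context on questions 2 and 3: a Caffe Slice layer splits a blob along an axis at the given slice_point. With two images per batch and (presumably) 128 sampled ROIs per image, slicing the 256 ROI features at index 128 would separate the source-image ROIs from the target-image ROIs, e.g. in NumPy terms:

```python
import numpy as np

# Assume a batch of 256 ROI feature vectors: the first 128 from the source
# image, the last 128 from the target image (2 images, 128 ROIs each).
roi_feats = np.zeros((256, 4096))

# Equivalent of a Caffe Slice layer with slice_point: 128 on axis 0.
source_feats, target_feats = np.split(roi_feats, [128], axis=0)
print(source_feats.shape, target_feats.shape)  # (128, 4096) (128, 4096)
```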

What should I do if I want to use 2 or more GPUs?

./tools/train_net.py --gpu {GPU_ID} --solver models/da_faster_rcnn/solver.prototxt --weights data/faster_rcnn_models/VGG16_faster_rcnn_final.caffemodel --imdb voc_2007_trainval --iters 70000 --cfg models/da_faster_rcnn/faster_rcnn_end2end.yml

Weight (lambda) value for image level adaptation and hyperparameter request

Hi, I found that the weight for the image-level adaptation loss in "train.prototxt" is set to 1.0, which is not consistent with your paper (all lambda values set to 0.1).

layer {
  name: "da_conv_loss"
  type: "SoftmaxWithLoss"
  bottom: "da_score_ss"
  bottom: "da_label_ss_resize"
  top: "da_conv_loss"
  loss_param {
    ignore_label: 255
    normalize: 1
  }
  propagate_down: 1
  propagate_down: 0
  loss_weight: 1
}

Also, the "lr_mult" values for the instance-level domain classifier are 10 times larger than for the other conv and fc layers.

layer {
  name: "dc_ip3"
  type: "InnerProduct"
  bottom: "dc_ip2"
  top: "dc_ip3"
  param {
    lr_mult: 10
  }
  param {
    lr_mult: 20
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "gaussian"
      # std: 0.3
      std: 0.05
    }
    bias_filler {
      type: "constant"
    }
  }
}

Can you provide the exact hyperparameters ("loss_weight", "lr_mult", "gradient_scaler_param") used in your paper?
It would be appreciated to get the hyperparameters for each setting (image-level DA; image + instance-level DA; image + instance-level DA + consistency loss) and dataset (sim10k -> cityscapes, cityscapes -> foggy cityscapes, kitti <-> cityscapes). Thank you.

How to download foggy image set?

Hi. I'm trying to download the Cityscapes dataset for training. I cannot download leftImg8bit_trainvaltest_foggy unless I send a request. However, I couldn't find contact information for sending one.

How can I send a request to download the foggy dataset? Thanks.

Is "need_backprop" used in the "GradientSilent" layer?

@yuhuayc I think the input "need_backprop" is used to decide whether the gradient should be propagated backward or not. However, I cannot find any reference to it in the definition of the "GradientSilent" layer in 'caffe-fast-rcnn/src/caffe/layers/gradient_silent_layer.cpp'. I wonder if I have misunderstood the usage of the "need_backprop" input.

layer {
  name: "loss_cls_filter"
  type: "GradientSilent"
  bottom: "cls_score"
  bottom: "need_backprop"
  top: "cls_score_filter"
}
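For illustration, here is the behavior the layer's name and inputs suggest: identity in the forward pass, with the gradient passed through only when need_backprop is set. This is a NumPy sketch of the presumed semantics, not the repository's actual C++ implementation; if gradient_silent_layer.cpp never reads need_backprop, the gating may happen elsewhere.

```python
import numpy as np

def gradient_silent_forward(x, need_backprop):
    """Identity in the forward pass; the flag only matters on the way back."""
    return x

def gradient_silent_backward(grad_top, need_backprop):
    """Pass the gradient through only when need_backprop is set (e.g. source images)."""
    return grad_top if need_backprop else np.zeros_like(grad_top)

g = np.ones((4,))
print(gradient_silent_backward(g, 1))  # [1. 1. 1. 1.]
print(gradient_silent_backward(g, 0))  # [0. 0. 0. 0.]
```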

Few questions for adaptation strategy.

@yuhuayc For domain adaptation of detectors from the source domain to the target domain, we always train from the ImageNet pre-trained model. However, in practical applications, a model pre-trained on the source domain is usually available. Why don't we fine-tune from the model pre-trained on the source domain, instead of fine-tuning from the ImageNet pre-trained model? The latter seems to take more time. Could you explain the reason for this?

problem about prepare_data.m

system(sprintf('find %s -name "*.png"', ...
fullfile(source_data_dir,'leftImg8bit','train')))
Access denied - CITYSCAPES\LEFTIMG8BIT\TRAIN
File not found - -NAME

ans =

 1

How do you select the ROIs for unlabeled images from the target domain?

The problem is that, by definition, you do not have ground truth for unlabeled images.

However, you need ground truth

  • in the definition of your blob objects, in lib.roi_data_da_layer.minibatch
  • in order to select the ROIs

I do not see in the code how you made this possible... and I am trying to reproduce the paper in TF.
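One common way to reconcile this (an assumption about the intended setup, not a claim about this repository's code) is to give each unlabeled target image a dummy ground-truth box so the minibatch blobs keep their expected shapes, and to rely on a need_backprop flag to silence the detection losses for those images:

```python
import numpy as np

def make_gt_boxes(name, annotations):
    """Return gt boxes for an image; target images get a dummy box and no backprop.
    Hypothetical helper, illustrating one possible minibatch convention."""
    if name.startswith('target_'):
        # One dummy background box so downstream blobs keep their expected shape.
        gt_boxes = np.array([[0, 0, 1, 1, 0]], dtype=np.float32)  # x1, y1, x2, y2, class
        need_backprop = 0
    else:
        gt_boxes = annotations[name]
        need_backprop = 1
    return gt_boxes, need_backprop

boxes, flag = make_gt_boxes('target_aachen_000000', {})
print(boxes.shape, flag)  # (1, 5) 0
```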

I can't run the prepare_data.m

At line 103, gt.category(i_inst) = lb_filter(inst_lb) fails with "the number of elements in A and B must be the same." Has anybody else faced this issue? My MATLAB is R2016b.

question about anchor scales

Hello, I am running the original baseline from KITTI to Cityscapes. When I use ANCHOR_SCALES = [8, 16, 32], I get results similar to those in your paper, but when I change ANCHOR_SCALES to [4, 8, 16, 32], I get a much higher baseline result of about 42%, well above the results in your paper. It then seems that domain adaptation on this dataset doesn't work, but it is also possible that something went wrong in my implementation. Could you set ANCHOR_SCALES to [4, 8, 16, 32] and run it? Thank you!
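For anyone comparing anchor settings: in py-faster-rcnn, each entry of ANCHOR_SCALES multiplies the 16-pixel base anchor, so adding scale 4 introduces 64-pixel (at aspect ratio 1) anchors that better cover smaller, more distant cars. A quick sketch of the arithmetic (re-implemented here, not imported from the repo):

```python
def anchor_sizes(base_size=16, scales=(4, 8, 16, 32)):
    """Side length of the square (ratio-1) anchor for each scale, as in py-faster-rcnn."""
    return [base_size * s for s in scales]

print(anchor_sizes())  # [64, 128, 256, 512]
```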
