adamdad / consistentteacher

[CVPR2023 Highlight] Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection

License: Apache License 2.0

Makefile 0.11% Python 99.06% Shell 0.83%
mmdetection object-detection semi-supervised-learning semi-supervised-object-detection ssod sample-efficiency

consistentteacher's Introduction

🧑‍🏫 Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection 🧑‍🏫

This repository contains the official implementation for our CVPR-2023 paper.

✨We are now able to train a detector on 10% MS-COCO to 40 mAP✨

Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection

[arxiv] [code] [project page]

Xinjiang Wang*, Xingyi Yang*, Shilong Zhang, Yijiang Li, Litong Feng, Shijie Fang, Chengqi Lyu, Kai Chen, Wayne Zhang

(*: Co-first Author)

  • Selected as a Highlight for CVPR 2023 🔥 (235/2360, top 10% of accepted papers)

In this paper, we systematically investigate the inconsistency problems in semi-supervised object detection, where the pseudo boxes may be highly inaccurate and vary greatly at different stages of training. To alleviate the aforementioned problem, we present a holistic semi-supervised object detector termed Consistent-Teacher. Consistent-Teacher achieves compelling improvement on a wide range of evaluations and serves as a new solid baseline for SSOD.

Main Results

All results, logs, configs and checkpoints are listed here. Enjoy 👀!

MS-COCO 1%/2%/5%/10% Labeled Data

| Method | Data | mAP | config | Links | Google Drive | Baidu Drive |
|---|---|---|---|---|---|---|
| ConsistentTeacher | MS-COCO 1% | 25.50 | config | log/ckpt | log/ckpt | log/ckpt |
| ConsistentTeacher | MS-COCO 2% | 30.70 | config | log/ckpt | log/ckpt | log/ckpt |
| ConsistentTeacher | MS-COCO 5% | 36.60 | config | log/ckpt | log/ckpt | log/ckpt |
| ConsistentTeacher | MS-COCO 10% | 40.20 | config | log/ckpt | log/ckpt | log/ckpt |
| ConsistentTeacher 2x8 | MS-COCO 10% | 38.00 | config | log/ckpt | log/ckpt | log/ckpt |
| ConsistentTeacher 2x8 (FP16) | MS-COCO 10% | 37.90 | config | log/ckpt | log/ckpt | log/ckpt |

MS-COCO 100% Labeled + Unlabeled Data

| Method | Data | mAP | config | Links | Google Drive | Baidu Drive |
|---|---|---|---|---|---|---|
| ConsistentTeacher 5x8 | MS-COCO 100% + unlabeled | 48.20 | config | log/ckpt | log/ckpt | log/ckpt |

PASCAL VOC07 Label + VOC12 Unlabel

| Method | Data | mAP | AP50 | config | Links |
|---|---|---|---|---|---|
| ConsistentTeacher | PASCAL VOC07 Label + VOC12 Unlabel | 59.00 | 81.00 | config | log/ckpt |

Notes

  • By default, all models are trained on 8 V100 GPUs with 5 images per GPU.
  • Additionally, we support the 2x8 and fp16 training settings so that everyone can run the code, even with only 12 GB graphics cards.
  • With 2x8 + fp16, the total training time for MS-COCO is less than 1 day.
  • We carefully tuned the hyper-parameters after submitting the paper, which is why the results in the repository are slightly higher than those reported in the paper.

Visualizations

Zoom in for a better view.

File Organization

├── configs
    ├── baseline
    |   |-- mean_teacher_retinanet_r50_fpn_coco_180k_10p.py
    |       # Mean Teacher COCO 10% config
    |   |-- mean_teacher_retinanet_r50_fpn_voc0712_72k.py
    |       # Mean Teacher VOC0712 config
    ├── consistent-teacher
    |   |-- consistent_teacher_r50_fpn_coco_360k_fulldata.py
    |       # Consistent Teacher COCO label+unlabel config
    |
    |   |-- consistent_teacher_r50_fpn_coco_180k_1/2/5/10p.py
    |       # Consistent Teacher COCO 1%/2%/5%/10% config
    |   |-- consistent_teacher_r50_fpn_coco_180k_10p_2x8.py
    |       # Consistent Teacher COCO 10% config with the 2x8 setting
    |   |-- consistent_teacher_r50_fpn_voc0712_72k.py
    |       # Consistent Teacher VOC0712 config
├── ssod
    |-- models/mean_teacher.py
    |   # Mean Teacher Class file
    |-- models/consistent_teacher.py
    |   # Consistent Teacher Class file
    |-- models/dense_heads/fam3d.py
    |   # FAM-3D Class file
    |-- models/dense_heads/improved_retinanet.py
    |   # ImprovedRetinaNet baseline file
    |-- core/bbox/assigners/dynamic_assigner.py
    |   # Adaptive Sample Assignment Class file
├── tools
    |-- dataset/semi_coco.py
    |   # COCO data preprocessing
    |-- train.py/test.py
    |   # Main files for training and evaluating the models

Usage

Requirements

  • PyTorch=1.9.0
  • mmdetection=2.25.0
  • mmcv=1.3.9
  • wandb=0.10.31

or

  • mmdetection=2.28.1
  • mmcv=1.7.1
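
Before installing, it can help to compare the environment against the first version combination listed above. The sketch below is only an illustration: the distribution names (`torch`, `mmdet`, `mmcv-full`, `wandb`) are assumptions about how the packages are published, and the helper names `installed_version` and `report` are hypothetical.

```python
from importlib import metadata

# Versions from the first combination in the Requirements list.
# The distribution names used as keys are assumptions.
EXPECTED = {
    "torch": "1.9.0",
    "mmdet": "2.25.0",
    "mmcv-full": "1.3.9",
    "wandb": "0.10.31",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(expected):
    """Build one status line per package comparing installed and expected versions."""
    lines = []
    for name, want in expected.items():
        have = installed_version(name)
        # Local build tags such as "+cu111" are ignored in the comparison.
        ok = have is not None and have.split("+")[0] == want
        status = "OK" if ok else "MISMATCH"
        lines.append(f"{name}: installed={have} expected={want} {status}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(EXPECTED)))
```

A mismatch is not necessarily fatal, since the second combination (mmdetection=2.28.1, mmcv=1.7.1) is also listed as supported.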

Notes

  • We use wandb for visualization. If you don't want to use it, comment out lines 328-339 in configs/consistent-teacher/consistent_teacher_r50_fpn_coco_180k_10p.py.
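
As an alternative to commenting lines out by hand, the wandb hook could be filtered out of the logging config programmatically. This is a minimal sketch, assuming the config exposes a `log_config` dict with a `hooks` list containing a `WandbLoggerHook` entry, as in the config dump quoted in the issues further down; `drop_wandb_hook` is a hypothetical helper, not part of the repository.

```python
def drop_wandb_hook(log_config):
    """Return a copy of `log_config` without any WandbLoggerHook entry."""
    hooks = [
        hook for hook in log_config.get("hooks", [])
        if hook.get("type") != "WandbLoggerHook"
    ]
    return dict(log_config, hooks=hooks)


# Shortened stand-in for the log_config found in the training config.
log_config = dict(
    interval=50,
    hooks=[
        dict(type="TextLoggerHook", by_epoch=False),
        dict(type="WandbLoggerHook", by_epoch=False),
    ],
)

cleaned = drop_wandb_hook(log_config)
print([hook["type"] for hook in cleaned["hooks"]])
```

The original dict is left untouched, so the same config file can still be used with wandb enabled.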

Installation

Install all the requirements (see INSTALL), then git clone the mmdetection repo and ConsistentTeacher under the same folder:

git clone https://github.com/open-mmlab/mmdetection.git
git clone https://github.com/Adamdad/ConsistentTeacher.git
cd ConsistentTeacher/
make install

Data Preparation

COCO Dataset

  • Download the COCO dataset
  • Execute the following command to generate data set splits:
# YOUR_DATA should be a directory that contains the COCO dataset.
# For example:
# YOUR_DATA/
#  coco_semi/
#     instances_train2017.${fold}@${percent}.json
#  coco/
#     train2017/
#     val2017/
#     unlabeled2017/
#     annotations/
ln -s ${YOUR_DATA} data
bash tools/dataset/prepare_coco_data.sh conduct

For concrete instructions on what should be downloaded, please refer to lines 11-24 of tools/dataset/prepare_coco_data.sh.
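
To check that the data directory matches the layout shown above, a small script can list whatever is missing. This is a sketch under stated assumptions: the directory names and the `instances_train2017.${fold}@${percent}.json` pattern are taken from the layout comment, while the fold and percent values passed in, and the helper names, are hypothetical.

```python
from pathlib import Path


def split_annotation_name(fold, percent):
    """File name of one labeled split, following the pattern in the layout comment."""
    return f"instances_train2017.{fold}@{percent}.json"


def missing_paths(data_root, folds, percents):
    """Return the expected files and folders that do not exist under `data_root`."""
    root = Path(data_root)
    expected = [
        root / "coco" / name
        for name in ("train2017", "val2017", "unlabeled2017", "annotations")
    ]
    expected += [
        root / "coco_semi" / split_annotation_name(fold, percent)
        for fold in folds
        for percent in percents
    ]
    return [str(path) for path in expected if not path.exists()]


if __name__ == "__main__":
    # Fold 1 and the 1/2/5/10 percent settings are example values only.
    for path in missing_paths("data", folds=[1], percents=[1, 2, 5, 10]):
        print("missing:", path)
```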

VOC0712 Dataset

  • Download the JSON files for the unlabeled PASCAL VOC images in COCO format:
cd ${DATAROOT}

wget https://storage.cloud.google.com/gresearch/ssl_detection/STAC_JSON.tar
tar -xf STAC_JSON.tar
# voc/VOCdevkit/VOC2007/instances_test.json
# voc/VOCdevkit/VOC2007/instances_trainval.json
# voc/VOCdevkit/VOC2012/instances_trainval.json

Training

  • To train a model in the partially labeled or fully labeled data setting:
# CONFIG_FILE_PATH: the config file for the experiment.
# NUM_GPUS: number of GPUs to run the job on.
bash tools/dist_train.sh <CONFIG_FILE_PATH> <NUM_GPUS>

For example, to train our R50 model with 8 GPUs:

bash tools/dist_train.sh configs/consistent-teacher/consistent_teacher_r50_fpn_coco_180k_10p.py 8
  • To train a model on a new dataset:

The core idea is to convert the new dataset to COCO format. Details can be found in the "adding new dataset" documentation.
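
For orientation, the conversion target can be sketched as follows. This is a minimal illustration of the COCO detection annotation layout (`images`, `annotations`, `categories`, with boxes stored as `[x, y, width, height]`), not the repository's own conversion tool; the input record format, the file names and the function name `to_coco` are hypothetical.

```python
import json


def to_coco(records, class_names):
    """Convert simple records into a COCO-style detection dict.

    records: list of (file_name, width, height, boxes), where boxes is a list
    of (class_name, x, y, w, h) with (x, y) the top-left corner in pixels.
    """
    cat_ids = {name: idx + 1 for idx, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": cid, "name": name} for name, cid in cat_ids.items()],
    }
    ann_id = 1
    for img_id, (file_name, width, height, boxes) in enumerate(records, start=1):
        coco["images"].append(
            {"id": img_id, "file_name": file_name, "width": width, "height": height}
        )
        for name, x, y, w, h in boxes:
            coco["annotations"].append({
                "id": ann_id,
                "image_id": img_id,
                "category_id": cat_ids[name],
                "bbox": [x, y, w, h],
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


# Made-up example data: one labeled image and one image without boxes.
records = [
    ("img_0001.jpg", 640, 480, [("cat", 10, 20, 100, 50), ("dog", 200, 100, 80, 120)]),
    ("img_0002.jpg", 640, 480, []),
]
coco = to_coco(records, class_names=["cat", "dog"])
print(json.dumps(coco["annotations"][0]))
```

An unlabeled split can use the same structure with an empty `annotations` list, which is presumably why the configs set `filter_empty_gt=False` for the unsupervised branch.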

Inference and Demo

  • To run inference with the pretrained models on images and videos and plot the bounding boxes, we provide two scripts:
    • tools/inference.py for image inference
    • tools/inference_vido.py for video inference

License

This project is released under the Apache 2.0 license.

Citation

@article{wang2023consistent,
    author    = {Xinjiang Wang and Xingyi Yang and Shilong Zhang and Yijiang Li and Litong Feng and Shijie Fang and Chengqi Lyu and Kai Chen and Wayne Zhang},
    title     = {Consistent-Teacher: Towards Reducing Inconsistent Pseudo-targets in Semi-supervised Object Detection},
    journal   = {The IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR)},
    year      = {2023},
}

Acknowledgement

consistentteacher's Issues

MMDetection 3.0

Hi, great paper! Are there plans to add this to MMDetection main repo?

make install

Hello, is there any way to run make install so that my mmdet is not replaced by the latest version?

Negative assignment

Hello, thank you for sharing the code for your work!

I have a question about ASA.
You make a positive assignment for the model's predictions, but how do you assign the negative labels?

Multi card parallel

May I ask if there is a parallel issue with multiple cards here? Is there any solution? I used four v100 graphics cards to run.

imagedemo

May I ask, if I want to output the image detection result of the model, can I use the official mmdetection imagedemo.py file directly?

AttributeError: 'NoneType' object has no attribute 'GaussianMixture'

Hello, when I use a custom COCO-format dataset on a single GPU with the command:
python tools/train.py configs/consistent-teacher/consistent_teacher_r50_fpn_coco_720k_fulldata.py
I encounter the following problem.
My versions: cuda=11.4 pytorch=1.9.0, mmcv=1.7.1,mmdet=2.28.1

Traceback (most recent call last):
  File "tools/train.py", line 198, in <module>
    main()
  File "tools/train.py", line 193, in main
    meta=meta,
  File "/home/yzf/Desktop/semi/ConsistentTeacher/ssod/apis/train.py", line 209, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/yzf/miniconda3/envs/consemi/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 144, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/yzf/miniconda3/envs/consemi/lib/python3.7/site-packages/mmcv/runner/iter_based_runner.py", line 64, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File "/home/yzf/miniconda3/envs/consemi/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 77, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/yzf/Desktop/semi/ConsistentTeacher/thirdparty/mmdetection/mmdet/models/detectors/base.py", line 248, in train_step
    losses = self(**data)
  File "/home/yzf/miniconda3/envs/consemi/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/yzf/miniconda3/envs/consemi/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 119, in new_func
    return old_func(*args, **kwargs)
  File "/home/yzf/Desktop/semi/ConsistentTeacher/thirdparty/mmdetection/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/yzf/Desktop/semi/ConsistentTeacher/ssod/models/consistent_teacher.py", line 73, in forward_train
    data_groups["unsup_teacher"], data_groups["unsup_student"]
  File "/home/yzf/Desktop/semi/ConsistentTeacher/ssod/models/consistent_teacher.py", line 119, in foward_unsup_train
    gt_bboxes=[teacher_data['gt_bboxes'][idx] for idx in tidx],
  File "/home/yzf/Desktop/semi/ConsistentTeacher/ssod/models/consistent_teacher.py", line 327, in extract_teacher_info
    policy=self.train_cfg.get('policy', 'high'))
  File "/home/yzf/Desktop/semi/ConsistentTeacher/ssod/models/consistent_teacher.py", line 242, in gmm_policy
    gmm = skm.GaussianMixture(
AttributeError: 'NoneType' object has no attribute 'GaussianMixture'

What is this problem and what to do?
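
The traceback shows that the name `skm` is `None` at the point where `skm.GaussianMixture` is called. One plausible explanation, which is an assumption and not confirmed by the thread, is that the module imports `sklearn.mixture` inside a try/except block and falls back to `None` when scikit-learn is not installed. The sketch below reproduces that pattern with a hypothetical helper `optional_import` and reports whether the dependency is available.

```python
import importlib


def optional_import(name):
    """Return the module called `name`, or None if it cannot be imported."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Assumed pattern: the GMM code only works when this import succeeds.
skm = optional_import("sklearn.mixture")

if skm is None:
    print("scikit-learn is not importable; installing it "
          "(for example with `pip install scikit-learn`) may resolve the error.")
else:
    print("sklearn.mixture is available:", hasattr(skm, "GaussianMixture"))
```

If the second message is printed inside the training environment, the cause lies elsewhere and the guess above does not apply.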
Thank you!

Question about the missing ASA in consistent_teacher.py.

The ASA code confuses me. In the released version of consistent_teacher.py, the unsup loss is calculated directly after the GMM filter.
It seems that you use ASA only in the sup loss; when I look back at the RetinaNet model part, the assigner exists.
But according to your paper, you use both GMM and ASA to get better pseudo labels for the pseudo-label loss?
I need help! Thanks!

ASA

Hi,

Congrats on your paper acceptance. I would like to know which part of the code contains the logic to "Adaptive Sample Assignment (ASA)".

unsup_loss is 0

Is it normal for unsup_loss_cls and unsup_loss_bbox to be 0 at this stage?

Error: "DynamicSoftLabelAssigner is not in the bbox_assigner registry"

Hi,
When I run tools/test.py, an error occurs:
My versions: cuda=10.1 pytorch=1.9.0, mmcv=1.7.1,mmdet=2.28.1

Traceback (most recent call last):
  File "/home/iry/software/anaconda3/envs/ct/lib/python3.8/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    return obj_cls(**args)
  File "/home/iry/Code/ConsistentTeacher/ssod/models/dense_heads/fam3d.py", line 50, in __init__
    super().__init__(num_classes, in_channels, **kwargs)
  File "/home/iry/Code/ConsistentTeacher/mmdetection/mmdet/models/dense_heads/atss_head.py", line 50, in __init__
    super(ATSSHead, self).__init__(
  File "/home/iry/Code/ConsistentTeacher/mmdetection/mmdet/models/dense_heads/anchor_head.py", line 83, in __init__
    self.assigner = build_assigner(self.train_cfg.assigner)
  File "/home/iry/Code/ConsistentTeacher/mmdetection/mmdet/core/bbox/builder.py", line 11, in build_assigner
    return build_from_cfg(cfg, BBOX_ASSIGNERS, default_args)
  File "/home/iry/software/anaconda3/envs/ct/lib/python3.8/site-packages/mmcv/utils/registry.py", line 61, in build_from_cfg
    raise KeyError(
KeyError: 'DynamicSoftLabelAssigner is not in the bbox_assigner registry'

Thank You!!
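
In registry-based frameworks, a "not in the registry" KeyError commonly means that the module defining the class was never imported, so its registration decorator never ran. Whether that is the cause here is an assumption. The toy registry below is not mmcv's implementation; it only illustrates why the same config fails before the class is registered and builds afterwards.

```python
class Registry:
    """Toy name-to-class registry, for illustration only."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, cls):
        # Used as a decorator: runs only when the defining module is imported.
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        cfg = dict(cfg)
        type_name = cfg.pop("type")
        if type_name not in self._modules:
            raise KeyError(f"{type_name} is not in the {self.name} registry")
        return self._modules[type_name](**cfg)


BBOX_ASSIGNERS = Registry("bbox_assigner")
cfg = dict(type="DynamicSoftLabelAssigner", topk=13, iou_factor=2.0)

try:
    BBOX_ASSIGNERS.build(cfg)
except KeyError as err:
    print("before registration:", err)


# Stand-in class; the real assigner lives in the repository's ssod package.
@BBOX_ASSIGNERS.register_module
class DynamicSoftLabelAssigner:
    def __init__(self, topk, iou_factor):
        self.topk = topk
        self.iou_factor = iou_factor


assigner = BBOX_ASSIGNERS.build(cfg)
print("after registration:", type(assigner).__name__)
```

By analogy, it may be worth checking that the script being run imports the package that defines the custom assigner before the model is built.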

The value of the dynamic_ratio in GMM

I noticed that the config file sets dynamic_ratio to 1. Does this setting have any effect?
thrs = []  # one score threshold per image
for i, proposals in enumerate(proposal_list):
    dynamic_ratio = self.train_cfg.dynamic_ratio
    scores = proposals[:, 4].clone()
    scores = scores.sort(descending=True)[0]
    if len(scores) == 0:
        thrs.append(1)  # no kept pseudo boxes
    else:
        # num_gt = int(scores.sum() + 0.5)
        num_gt = int(scores.sum() * dynamic_ratio + 0.5)
        num_gt = min(num_gt, len(scores) - 1)
        thrs.append(scores[num_gt] - 1e-5)
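
To see what dynamic_ratio does in the snippet above, the same arithmetic can be replayed on plain Python lists. This is a sketch that mirrors the quoted lines without PyTorch; the function name `dynamic_threshold` and the example scores are made up for illustration.

```python
def dynamic_threshold(scores, dynamic_ratio=1.0):
    """Score threshold for one image, following the quoted snippet."""
    scores = sorted(scores, reverse=True)
    if len(scores) == 0:
        return 1.0  # no kept pseudo boxes
    # The summed confidence, scaled by dynamic_ratio, estimates the box count.
    num_gt = int(sum(scores) * dynamic_ratio + 0.5)
    num_gt = min(num_gt, len(scores) - 1)
    return scores[num_gt] - 1e-5


example_scores = [0.9, 0.8, 0.3, 0.1]  # made-up confidences
for ratio in (0.5, 1.0, 2.0):
    print(ratio, dynamic_threshold(example_scores, ratio))
```

With these scores, a ratio of 1.0 places the threshold just below the third-highest score, 0.5 moves it up to just below the second-highest, and 2.0 pushes it down to just below the lowest. With a ratio of 1.0 the scaled line therefore behaves exactly like the commented-out unscaled one.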

Evaluating released checkpoints with test.py

Thanks for your interesting work!
How can I evaluate the released checkpoints with test.py?

I got an error: KeyError: "FAM3DHead: 'DynamicSoftLabelAssigner is not in the bbox_assigner registry'"

Thanks!

Question about the possibility of providing mmdetection code

Hello author, the latest mmdetection code on GitHub requires mmcv>=2.0.0, so some functions have been moved and can no longer be imported. I tried to adapt your code to an earlier version of mmdetection, but my version is too old and much of the code has since been refactored. Could you provide the matching mmdetection code together with your code, as the PseCo project does? Thanks a lot.

RuntimeError: DeformConv is not compiled with GPU support

I got an error while running the detection code, although the inference code works.
RuntimeError: DeformConv is not compiled with GPU support
my env
mmcv-full : 1.7.1
mmdetection : 2.28.0
python :3.9
torch:1.12.1
cuda:11.6
Has anyone solved this issue?

Thanks!

Congratulations on your paper being accepted by CVPR!

When I was developing a semi-supervised detection library, I followed your work closely. I was sorry to see that your paper was withdrawn from ICLR, but congratulations on getting it accepted at CVPR!! It is a very meaningful piece of work.

why Loss==0

I set up an environment on Windows and ran python tools/train.py configs/consistent-teacher/consistent_teacher_r50_fpn_coco_180k_1p.py. How can I solve the problem of loss == 0 when running the project?

2023-04-23 12:49:22,302 - mmdet.ssod - INFO - [<StreamHandler (INFO)>, <FileHandler E:\Object-Detection\Github\consist\ConsistentTeacher\work_dirs\consistent_teacher_r50_fpn_coco_180k_1p\20230423_124922.log (INFO)>]
2023-04-23 12:49:22,303 - mmdet.ssod - INFO - Environment info:

sys.platform: win32
Python: 3.7.11 (default, Jul 27 2021, 09:42:29) [MSC v.1916 64 bit (AMD64)]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3070 Laptop GPU
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1
NVCC: Build cuda_11.1.relgpu_drvr455TC455_06.29190527_0
GCC: gcc (Rev2, Built by MSYS2 project) 10.3.0
PyTorch: 1.9.0+cu111
PyTorch compiling details: PyTorch built with:

  • C++ Version: 199711
  • MSVC 192829337
  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  • OpenMP 2019
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.0.5
  • Magma 2.5.4
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=C:/w/b/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/w/b/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON,

TorchVision: 0.10.0+cu111
OpenCV: 4.5.5
MMCV: 1.4.2
MMCV Compiler: MSVC 192930137
MMCV CUDA Compiler: 11.1
MMDetection: 2.25.0+1fa6477

2023-04-23 12:49:24,106 - mmdet.ssod - INFO - Distributed training: False
2023-04-23 12:49:25,727 - mmdet.ssod - INFO - Config:
dataset_type = 'CocoDataset'
data_root = 'data/coco/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
dict(
type='Sequential',
transforms=[
dict(
type='RandResize',
img_scale=[(1333, 400), (1333, 1200)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandFlip', flip_ratio=0.5),
dict(
type='OneOf',
transforms=[
dict(type='Identity'),
dict(type='AutoContrast'),
dict(type='RandEqualize'),
dict(type='RandSolarize'),
dict(type='RandColor'),
dict(type='RandContrast'),
dict(type='RandBrightness'),
dict(type='RandSharpness'),
dict(type='RandPosterize')
])
],
record=True),
dict(type='Pad', size_divisor=32),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ExtraAttrs', tag='sup'),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels'],
meta_keys=('filename', 'ori_shape', 'img_shape', 'img_norm_cfg',
'pad_shape', 'scale_factor', 'tag'))
]
test_pipeline = [
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]
data = dict(
samples_per_gpu=5,
workers_per_gpu=1,
train=dict(
type='SemiDataset',
sup=dict(
type='CocoDataset',
ann_file='droot_4classes\json\voc07_train_0.3_.json',
img_prefix='droot_4classes',
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='LoadAnnotations', with_bbox=True),
dict(
type='Sequential',
transforms=[
dict(
type='RandResize',
img_scale=[(1333, 400), (1333, 1200)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandFlip', flip_ratio=0.5),
dict(
type='OneOf',
transforms=[
dict(type='Identity'),
dict(type='AutoContrast'),
dict(type='RandEqualize'),
dict(type='RandSolarize'),
dict(type='RandColor'),
dict(type='RandContrast'),
dict(type='RandBrightness'),
dict(type='RandSharpness'),
dict(type='RandPosterize')
])
],
record=True),
dict(type='Pad', size_divisor=32),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ExtraAttrs', tag='sup'),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels'],
meta_keys=('filename', 'ori_shape', 'img_shape',
'img_norm_cfg', 'pad_shape', 'scale_factor',
'tag'))
]),
unsup=dict(
type='CocoDataset',
ann_file='droot_4classes\json\voc07_train_unsup_0.3_.json',
img_prefix='droot_4classes',
pipeline=[
dict(type='LoadImageFromFile'),
dict(type='PseudoSamples', with_bbox=True),
dict(
type='MultiBranch',
unsup_teacher=[
dict(
type='Sequential',
transforms=[
dict(
type='RandResize',
img_scale=[(1333, 400), (1333, 1200)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandFlip', flip_ratio=0.5),
dict(
type='ShuffledSequential',
transforms=[
dict(
type='OneOf',
transforms=[
dict(type='Identity'),
dict(type='AutoContrast'),
dict(type='RandEqualize'),
dict(type='RandSolarize'),
dict(type='RandColor'),
dict(type='RandContrast'),
dict(type='RandBrightness'),
dict(type='RandSharpness'),
dict(type='RandPosterize')
]),
dict(
type='OneOf',
transforms=[{
'type': 'RandTranslate',
'x': (-0.1, 0.1)
}, {
'type': 'RandTranslate',
'y': (-0.1, 0.1)
}, {
'type': 'RandRotate',
'angle': (-30, 30)
},
[{
'type':
'RandShear',
'x': (-30, 30)
}, {
'type':
'RandShear',
'y': (-30, 30)
}]])
]),
dict(
type='RandErase',
n_iterations=(1, 5),
size=[0, 0.2],
squared=True)
],
record=True),
dict(type='Pad', size_divisor=32),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ExtraAttrs', tag='unsup_student'),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels'],
meta_keys=('filename', 'ori_shape', 'img_shape',
'img_norm_cfg', 'pad_shape',
'scale_factor', 'tag',
'transform_matrix'))
],
unsup_student=[
dict(
type='Sequential',
transforms=[
dict(
type='RandResize',
img_scale=[(1333, 400), (1333, 1200)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandFlip', flip_ratio=0.5)
],
record=True),
dict(type='Pad', size_divisor=32),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ExtraAttrs', tag='unsup_teacher'),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels'],
meta_keys=('filename', 'ori_shape', 'img_shape',
'img_norm_cfg', 'pad_shape',
'scale_factor', 'tag',
'transform_matrix'))
])
],
filter_empty_gt=False)),
val=dict(
type='CocoDataset',
ann_file='droot_4classes\json\voc07_val_unsup_1_.json',
img_prefix='droot_4classes',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]),
test=dict(
type='CocoDataset',
ann_file='droot_4classes\json\voc07_val_unsup_1_.json',
img_prefix='data/coco/val2017/',
pipeline=[
dict(type='LoadImageFromFile'),
dict(
type='MultiScaleFlipAug',
img_scale=(1333, 800),
flip=False,
transforms=[
dict(type='Resize', keep_ratio=True),
dict(type='RandomFlip'),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='Pad', size_divisor=32),
dict(type='ImageToTensor', keys=['img']),
dict(type='Collect', keys=['img'])
])
]),
sampler=dict(
train=dict(
type='SemiBalanceSampler',
sample_ratio=[1, 5],
by_prob=False,
epoch_length=500)))
evaluation = dict(interval=1000, metric='bbox', type='SubModulesDistEvalHook')
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=dict(max_norm=20, norm_type=2))
lr_config = dict(
policy='step',
warmup='linear',
warmup_iters=500,
warmup_ratio=0.001,
step=[2000, 12000])
runner = dict(type='IterBasedRunner', max_iters=20000)
checkpoint_config = dict(interval=1000, by_epoch=False, max_keep_ckpts=2)
log_config = dict(
interval=50,
hooks=[
dict(type='TextLoggerHook', by_epoch=False),
dict(
type='WandbLoggerHook',
init_kwargs=dict(
project='consistent-teacher',
name='consistent_teacher_r50_fpn_coco_180k_1p',
config=dict(
fold=1,
percent=1,
work_dirs=
'./work_dirs\consistent_teacher_r50_fpn_coco_180k_1p',
total_step=20000)),
by_epoch=False)
])
custom_hooks = [
dict(type='NumClassCheckHook'),
dict(type='WeightSummary'),
dict(type='SetIterInfoHook'),
dict(type='MeanTeacher', momentum=0.9995, interval=1, warm_up=0)
]
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
opencv_num_threads = 0
mp_start_method = 'fork'
auto_scale_lr = dict(enable=False, base_batch_size=16)
mmdet_base = '../../../mmdetection/configs/base'
model = dict(
type='ConsistentTeacher',
model=dict(
type='RetinaNet',
backbone=dict(
type='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
frozen_stages=1,
norm_cfg=dict(type='BN', requires_grad=True),
norm_eval=True,
style='pytorch',
init_cfg=dict(
type='Pretrained', checkpoint='torchvision://resnet50')),
neck=dict(
type='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
start_level=1,
add_extra_convs='on_output',
num_outs=5),
bbox_head=dict(
type='FAM3DHead',
num_classes=4,
in_channels=256,
stacked_convs=4,
feat_channels=256,
anchor_type='anchor_based',
anchor_generator=dict(
type='AnchorGenerator',
ratios=[1.0],
octave_base_scale=8,
scales_per_octave=1,
strides=[8, 16, 32, 64, 128]),
bbox_coder=dict(
type='DeltaXYWHBBoxCoder',
target_means=[0.0, 0.0, 0.0, 0.0],
target_stds=[0.1, 0.1, 0.2, 0.2]),
loss_cls=dict(
type='FocalLoss',
use_sigmoid=True,
activated=True,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
loss_bbox=dict(type='GIoULoss', loss_weight=2.0)),
train_cfg=dict(
assigner=dict(
type='DynamicSoftLabelAssigner', topk=13, iou_factor=2.0),
alpha=1,
beta=6,
allowed_border=-1,
pos_weight=-1,
debug=False),
test_cfg=dict(
nms_pre=1000,
min_bbox_size=0,
score_thr=0.05,
nms=dict(type='nms', iou_threshold=0.6),
max_per_img=100)),
train_cfg=dict(
num_scores=100,
dynamic_ratio=1.0,
warmup_step=500,
min_pseduo_box_size=0,
unsup_weight=2.0),
test_cfg=dict(inference_on='teacher'))
strong_pipeline = [
dict(
type='Sequential',
transforms=[
dict(
type='RandResize',
img_scale=[(1333, 400), (1333, 1200)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandFlip', flip_ratio=0.5),
dict(
type='ShuffledSequential',
transforms=[
dict(
type='OneOf',
transforms=[
dict(type='Identity'),
dict(type='AutoContrast'),
dict(type='RandEqualize'),
dict(type='RandSolarize'),
dict(type='RandColor'),
dict(type='RandContrast'),
dict(type='RandBrightness'),
dict(type='RandSharpness'),
dict(type='RandPosterize')
]),
dict(
type='OneOf',
transforms=[{
'type': 'RandTranslate',
'x': (-0.1, 0.1)
}, {
'type': 'RandTranslate',
'y': (-0.1, 0.1)
}, {
'type': 'RandRotate',
'angle': (-30, 30)
},
[{
'type': 'RandShear',
'x': (-30, 30)
}, {
'type': 'RandShear',
'y': (-30, 30)
}]])
]),
dict(
type='RandErase',
n_iterations=(1, 5),
size=[0, 0.2],
squared=True)
],
record=True),
dict(type='Pad', size_divisor=32),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ExtraAttrs', tag='unsup_student'),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels'],
meta_keys=('filename', 'ori_shape', 'img_shape', 'img_norm_cfg',
'pad_shape', 'scale_factor', 'tag', 'transform_matrix'))
]
weak_pipeline = [
dict(
type='Sequential',
transforms=[
dict(
type='RandResize',
img_scale=[(1333, 400), (1333, 1200)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandFlip', flip_ratio=0.5)
],
record=True),
dict(type='Pad', size_divisor=32),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ExtraAttrs', tag='unsup_teacher'),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels'],
meta_keys=('filename', 'ori_shape', 'img_shape', 'img_norm_cfg',
'pad_shape', 'scale_factor', 'tag', 'transform_matrix'))
]
unsup_pipeline = [
dict(type='LoadImageFromFile'),
dict(type='PseudoSamples', with_bbox=True),
dict(
type='MultiBranch',
unsup_teacher=[
dict(
type='Sequential',
transforms=[
dict(
type='RandResize',
img_scale=[(1333, 400), (1333, 1200)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandFlip', flip_ratio=0.5),
dict(
type='ShuffledSequential',
transforms=[
dict(
type='OneOf',
transforms=[
dict(type='Identity'),
dict(type='AutoContrast'),
dict(type='RandEqualize'),
dict(type='RandSolarize'),
dict(type='RandColor'),
dict(type='RandContrast'),
dict(type='RandBrightness'),
dict(type='RandSharpness'),
dict(type='RandPosterize')
]),
dict(
type='OneOf',
transforms=[{
'type': 'RandTranslate',
'x': (-0.1, 0.1)
}, {
'type': 'RandTranslate',
'y': (-0.1, 0.1)
}, {
'type': 'RandRotate',
'angle': (-30, 30)
},
[{
'type': 'RandShear',
'x': (-30, 30)
}, {
'type': 'RandShear',
'y': (-30, 30)
}]])
]),
dict(
type='RandErase',
n_iterations=(1, 5),
size=[0, 0.2],
squared=True)
],
record=True),
dict(type='Pad', size_divisor=32),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ExtraAttrs', tag='unsup_student'),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels'],
meta_keys=('filename', 'ori_shape', 'img_shape',
'img_norm_cfg', 'pad_shape', 'scale_factor', 'tag',
'transform_matrix'))
],
unsup_student=[
dict(
type='Sequential',
transforms=[
dict(
type='RandResize',
img_scale=[(1333, 400), (1333, 1200)],
multiscale_mode='range',
keep_ratio=True),
dict(type='RandFlip', flip_ratio=0.5)
],
record=True),
dict(type='Pad', size_divisor=32),
dict(
type='Normalize',
mean=[123.675, 116.28, 103.53],
std=[58.395, 57.12, 57.375],
to_rgb=True),
dict(type='ExtraAttrs', tag='unsup_teacher'),
dict(type='DefaultFormatBundle'),
dict(
type='Collect',
keys=['img', 'gt_bboxes', 'gt_labels'],
meta_keys=('filename', 'ori_shape', 'img_shape',
'img_norm_cfg', 'pad_shape', 'scale_factor', 'tag',
'transform_matrix'))
])
]
fold = 1
percent = 1
classes = ['loose_l', 'loose_s', 'poor_l', 'porous']
fp16 = None
work_dir = './work_dirs\consistent_teacher_r50_fpn_coco_180k_1p'
cfg_name = 'consistent_teacher_r50_fpn_coco_180k_1p'
gpu_ids = range(0, 1)

2023-04-23 12:49:26,187 - mmdet.ssod - INFO - initialize ResNet with init_cfg {'type': 'Pretrained', 'checkpoint': 'torchvision://resnet50'}
2023-04-23 12:49:26,410 - mmdet.ssod - INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2023-04-23 12:49:26,478 - mmdet.ssod - INFO - initialize ResNet with init_cfg {'type': 'Pretrained', 'checkpoint': 'torchvision://resnet50'}
2023-04-23 12:49:26,638 - mmdet.ssod - INFO - initialize FPN with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
Name of parameter - Initialization information
2023-04-23 12:50:24,305 - mmdet.ssod - INFO - Iter [50/20000] lr: 9.890e-04, eta: 4:02:58, time: 0.731, data_time: 0.019, memory: 3454, ema_momentum: 0.9800, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0240, unsup_gmm_thr: 0.0015, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:50:51,673 - mmdet.ssod - INFO - Iter [100/20000] lr: 1.988e-03, eta: 3:31:56, time: 0.547, data_time: 0.014, memory: 3454, ema_momentum: 0.9900, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0027, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:51:18,728 - mmdet.ssod - INFO - Iter [150/20000] lr: 2.987e-03, eta: 3:20:36, time: 0.541, data_time: 0.014, memory: 3454, ema_momentum: 0.9933, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0027, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:51:46,522 - mmdet.ssod - INFO - Iter [200/20000] lr: 3.986e-03, eta: 3:15:56, time: 0.556, data_time: 0.015, memory: 3454, ema_momentum: 0.9950, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0040, unsup_gmm_thr: 0.0067, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:52:15,543 - mmdet.ssod - INFO - Iter [250/20000] lr: 4.985e-03, eta: 3:14:34, time: 0.580, data_time: 0.015, memory: 3454, ema_momentum: 0.9960, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0064, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:52:43,742 - mmdet.ssod - INFO - Iter [300/20000] lr: 5.984e-03, eta: 3:12:35, time: 0.564, data_time: 0.015, memory: 3454, ema_momentum: 0.9967, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0013, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:53:12,320 - mmdet.ssod - INFO - Iter [350/20000] lr: 6.983e-03, eta: 3:11:24, time: 0.572, data_time: 0.015, memory: 3454, ema_momentum: 0.9971, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0040, unsup_gmm_thr: 0.0131, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:53:40,363 - mmdet.ssod - INFO - Iter [400/20000] lr: 7.982e-03, eta: 3:09:57, time: 0.561, data_time: 0.015, memory: 3454, ema_momentum: 0.9975, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0040, unsup_gmm_thr: 0.0101, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:54:08,155 - mmdet.ssod - INFO - Iter [450/20000] lr: 8.981e-03, eta: 3:08:32, time: 0.556, data_time: 0.015, memory: 3454, ema_momentum: 0.9978, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0021, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:54:35,883 - mmdet.ssod - INFO - Iter [500/20000] lr: 9.980e-03, eta: 3:07:16, time: 0.555, data_time: 0.015, memory: 3454, ema_momentum: 0.9980, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0013, loss: 0.0000, grad_norm: 0.0000
2023-04-23 12:55:14,925 - mmdet.ssod - INFO - Iter [550/20000] lr: 1.000e-02, eta: 3:12:49, time: 0.781, data_time: 0.229, memory: 3454, ema_momentum: 0.9982, unsup_loss_cls: 0.0004, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0004, grad_norm: 0.0186
2023-04-23 12:55:42,908 - mmdet.ssod - INFO - Iter [600/20000] lr: 1.000e-02, eta: 3:11:22, time: 0.560, data_time: 0.015, memory: 3454, ema_momentum: 0.9983, unsup_loss_cls: 0.0001, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0001, grad_norm: 0.0020
2023-04-23 12:56:10,335 - mmdet.ssod - INFO - Iter [650/20000] lr: 1.000e-02, eta: 3:09:48, time: 0.549, data_time: 0.015, memory: 3454, ema_momentum: 0.9985, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0000, grad_norm: 0.0015
2023-04-23 12:56:38,082 - mmdet.ssod - INFO - Iter [700/20000] lr: 1.000e-02, eta: 3:08:32, time: 0.555, data_time: 0.014, memory: 3454, ema_momentum: 0.9986, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0000, grad_norm: 0.0012
2023-04-23 12:57:05,840 - mmdet.ssod - INFO - Iter [750/20000] lr: 1.000e-02, eta: 3:07:23, time: 0.555, data_time: 0.015, memory: 3454, ema_momentum: 0.9987, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0000, grad_norm: 0.0010
2023-04-23 12:57:33,505 - mmdet.ssod - INFO - Iter [800/20000] lr: 1.000e-02, eta: 3:06:17, time: 0.553, data_time: 0.014, memory: 3454, ema_momentum: 0.9988, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0000, grad_norm: 0.0008
2023-04-23 12:58:01,648 - mmdet.ssod - INFO - Iter [850/20000] lr: 1.000e-02, eta: 3:05:26, time: 0.563, data_time: 0.015, memory: 3454, ema_momentum: 0.9988, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0000, grad_norm: 0.0008
2023-04-23 12:58:29,195 - mmdet.ssod - INFO - Iter [900/20000] lr: 1.000e-02, eta: 3:04:25, time: 0.551, data_time: 0.014, memory: 3454, ema_momentum: 0.9989, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0000, grad_norm: 0.0007
2023-04-23 12:58:57,751 - mmdet.ssod - INFO - Iter [950/20000] lr: 1.000e-02, eta: 3:03:48, time: 0.571, data_time: 0.014, memory: 3454, ema_momentum: 0.9989, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0000, grad_norm: 0.0006
2023-04-23 12:59:25,490 - mmdet.ssod - INFO - Saving checkpoint at 1000 iterations
2023-04-23 12:59:27,535 - mmdet.ssod - INFO - Exp name: consistent_teacher_r50_fpn_coco_180k_1p.py
2023-04-23 12:59:27,536 - mmdet.ssod - INFO - Iter [1000/20000] lr: 1.000e-02, eta: 3:03:35, time: 0.596, data_time: 0.014, memory: 3454, ema_momentum: 0.9990, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 0.0000, unsup_gmm_thr: 0.0000, loss: 0.0000, grad_norm: 0.0005
`

Box mAP is low on my dataset

My dataset:
labeled images: 6K
unlabeled images: 45K
number of classes: 7
image size: 1024*1024

The config follows consistent_teacher_r50_fpn_coco_180k_10p.py; I only modified the dataset root and image_scale.

config.load_from is set to consistent_teacher_r50_fpn_coco_720k_fulldata_iter_720000-d932808f.pth

After training for 120k iterations, the sup box mAP is still low, only about 0.05.

[image: train log]

[image: train loss]

[image: eval]

Question about long-tail scenarios

Nice work! Is this method suitable for long-tail scenarios? In such scenarios, most of the unlabeled images do not contain any target objects. Will a teacher model trained on a small dataset then produce a large number of false detections on the unlabeled data, leading to poor performance of the student model?

deform_conv_forward

I get an error when running the training code, but the inference code works fine.

RuntimeError: DeformConv is not compiled with GPU support

My environment:
mmcv: 1.7.1
mmdetection: 2.28.2
python: 3.8
cuda_version: 11.7

Has anyone solved this issue?

Thanks in advance!

Semi-supervised training with custom datasets

Thank you for your excellent work. I want to run the program on a custom dataset, but I only have a JSON file for the labeled images (instances_train2017.json). How can I generate a JSON file for the unlabeled images (unlabeled.json), so that the unlabeled images can be passed to the teacher model during training? Thank you for your guidance.
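
One possible way to produce such a file is sketched below. It assumes that the unlabeled branch accepts a COCO-style JSON whose `annotations` list is empty; this should be checked against the repository's own data preparation scripts. The names `build_unlabeled_coco` and `save_coco`, and the `(file_name, width, height)` input format, are hypothetical; the image sizes can be read with any image library.

```python
import json


def build_unlabeled_coco(image_infos, labeled_json_path=None):
    """Build a COCO-style dict that lists images but carries no annotations.

    image_infos: iterable of (file_name, width, height) tuples
        (a hypothetical input format chosen for this sketch).
    labeled_json_path: optional path to the labeled COCO file; it is used
        only to copy the category list so both files share the same classes.
    """
    categories = []
    if labeled_json_path is not None:
        with open(labeled_json_path, "r", encoding="utf-8") as f:
            categories = json.load(f).get("categories", [])

    images = [
        {"id": idx, "file_name": name, "width": int(w), "height": int(h)}
        for idx, (name, w, h) in enumerate(image_infos, start=1)
    ]
    # No boxes are known for unlabeled images, so "annotations" stays empty.
    return {"images": images, "annotations": [], "categories": categories}


def save_coco(coco_dict, out_path):
    """Write the dict to disk as JSON."""
    with open(out_path, "w", encoding="utf-8") as f:
        json.dump(coco_dict, f)
```

A call such as `save_coco(build_unlabeled_coco(infos, "instances_train2017.json"), "unlabeled.json")` would then write the file, where `infos` is the list of image tuples collected from the unlabeled image folder.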

My results are lower than yours.

When I train the model with consistent_teacher_r50_fpn_coco_180k_10p_2x8.py on one GPU, the results are much lower than the reported ones, even though I didn't change any parameters.
`[>>>>>>>>>>>>>>>>>>>>>>>>>>] 5000/5000, 43.4 task/s, elapsed: 115s, ETA: 0s
2023-05-20 11:33:56,083 - mmdet.ssod - INFO - Evaluating bbox...
Loading and preparing results...
DONE (t=1.21s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=17.84s).
Accumulating evaluation results...
DONE (t=5.17s).
2023-05-20 11:34:22,052 - mmdet.ssod - INFO -
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.123
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.197
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.126
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.067
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.142
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.156
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.342
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.342
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.342
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.146
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.359
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.496

[>>>>>>>>>>>>>>>>>>>>>>>>>>] 5000/5000, 43.1 task/s, elapsed: 116s, ETA: 0s
2023-05-20 11:36:22,808 - mmdet.ssod - INFO - Evaluating bbox...
Loading and preparing results...
DONE (t=1.24s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=15.87s).
Accumulating evaluation results...
DONE (t=6.67s).
2023-05-20 11:36:48,355 - mmdet.ssod - INFO -
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.098
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=1000 ] = 0.165
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=1000 ] = 0.099
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.051
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.113
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.125
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.305
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=300 ] = 0.305
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=1000 ] = 0.305
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.131
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.315
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.446

2023-05-20 11:36:48,895 - mmdet.ssod - INFO - Exp name: consistent_teacher_r50_fpn_coco_180k_10p_2x8.py
2023-05-20 11:36:48,898 - mmdet.ssod - INFO - Iter(val) [180000] teacher.bbox_mAP: 0.1230, teacher.bbox_mAP_50: 0.1971, teacher.bbox_mAP_75: 0.1262, teacher.bbox_mAP_s: 0.0674, teacher.bbox_mAP_m: 0.1415, teacher.bbox_mAP_l: 0.1562, teacher.bbox_mAP_copypaste: 0.1230 0.1971 0.1262 0.0674 0.1415 0.1562, student.bbox_mAP: 0.0984, student.bbox_mAP_50: 0.1653, student.bbox_mAP_75: 0.0991, student.bbox_mAP_s: 0.0513, student.bbox_mAP_m: 0.1129, student.bbox_mAP_l: 0.1249, student.bbox_mAP_copypaste: 0.0984 0.1653 0.0991 0.0513 0.1129 0.1249
wandb: Waiting for W&B process to finish... (success).`

The Threshold Problem of GMM

I have carefully read Section 3.4 of the article, "Temporary consistency using Gaussian Mixture Model (GMM)".
You fit a Gaussian mixture distribution for each category. Each mixture contains a positive and a negative Gaussian component, and the posterior probability is inferred per category with the EM algorithm. However, during model training the log shows only a single GMM threshold. Does the GMM ultimately generate one threshold per category, or a single unified threshold?
Looking forward to your answer.
Thank you.

Q&A and clarification

Hello, thank you for doing such a great job and sharing it. I have been working on SSOD recently, but now that large models are so capable, I have some concerns I would like to ask you about. Do you think SSOD is still meaningful or necessary in the era of big models? Looking forward to your reply and clarification.

train.py: error: unrecognized arguments: --local-rank=0

Thanks for your awesome work.

I am trying to run the experiments, but I got the following errors:

train.py: error: unrecognized arguments: --local-rank=0
usage: train.py [-h] [--work-dir WORK_DIR] [--resume-from RESUME_FROM] [--no-validate] [--gpus GPUS | --gpu-ids GPU_IDS [GPU_IDS ...]] [--seed SEED] [--deterministic] [--options OPTIONS [OPTIONS ...]] [--cfg-options CFG_OPTIONS [CFG_OPTIONS ...]]
                [--launcher {none,pytorch,slurm,mpi}] [--local_rank LOCAL_RANK]
                config
train.py: error: unrecognized arguments: --local-rank=1

I use PyTorch 2.0.0, mmdetection 2.28.2, and mmcv-full 1.7.1.

I found that train.py defines the argument --local_rank, while the launcher passes --local-rank. I therefore changed --local_rank in train.py to --local-rank, but then a new error appears:

Traceback (most recent call last):
  File "tools/train.py", line 198, in <module>
    main()
  File "tools/train.py", line 130, in main
    init_dist(args.launcher, **cfg.dist_params)
  File "/home/username/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/runner/dist_utils.py", line 41, in init_dist
    _init_dist_pytorch(backend, **kwargs)
  File "/home/username/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/runner/dist_utils.py", line 72, in _init_dist_pytorch
    torch.cuda.set_device(rank % num_gpus)
ZeroDivisionError: integer division or modulo by zero

Could you give me some suggestions to fix this error?
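
The usage message above shows that train.py defines `--local_rank`, while the launcher passes `--local-rank`. A minimal sketch of one way to tolerate both spellings, assuming it is acceptable to edit the argument parser: `argparse` allows several option strings for one argument. Only the relevant argument is reproduced here, and `build_parser` is a hypothetical name, not the repository's code.

```python
import argparse


def build_parser():
    # Only the launcher-related argument is sketched; the real
    # tools/train.py defines many more options.
    parser = argparse.ArgumentParser(description="sketch of a train.py parser")
    parser.add_argument("config", help="train config file path")
    # Accept both the underscore and the hyphen spelling. argparse derives
    # the attribute name from the first long option, so it stays `local_rank`.
    parser.add_argument("--local_rank", "--local-rank", type=int, default=0)
    return parser


args = build_parser().parse_args(["cfg.py", "--local-rank=1"])
print(args.local_rank)  # 1
```

As for the second error, `rank % num_gpus` can only raise ZeroDivisionError when `num_gpus` is 0, which would suggest that PyTorch saw no CUDA device in that process. This is an inference from the traceback alone and is not confirmed in the thread.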

Dataset

I ran into a problem when building the dataset. How can I solve this?

Traceback (most recent call last):
  File "/data1/pankai/project/ConsistentTeacher/ConsistentTeacher-main/tools/train.py", line 201, in <module>
    main()
  File "/data1/pankai/project/ConsistentTeacher/ConsistentTeacher-main/tools/train.py", line 176, in main
    datasets = [build_dataset(cfg.data.train)]
  File "/home/pankai/.conda/envs/consistent/lib/python3.9/site-packages/mmdet/datasets/builder.py", line 82, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "/home/pankai/.conda/envs/consistent/lib/python3.9/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: CocoDataset: __init__() got an unexpected keyword argument 'sup'
Traceback (most recent call last):
  File "/home/pankai/.conda/envs/consistent/lib/python3.9/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    return obj_cls(**args)
TypeError: __init__() got an unexpected keyword argument 'sup'

AttributeError: 'ConfigDict' object has no attribute 'dist_params'

Hello,
I am not sure what causes the problem, but I keep getting "AttributeError: 'ConfigDict' object has no attribute 'dist_params'" with different versions of mmcv-full. I have used SoftTeacher before, and the installation is no different. I tried a range of mmcv-full versions that are compatible with the given mmdet versions, but the problem persists. I even tried a single GPU with python tools/train ... instead of the bash script, but it still fails with:

  File "tools/train.py", line 198, in <module>
    main()
  File "tools/train.py", line 169, in main
    cfg.model, train_cfg=cfg.get("train_cfg"), test_cfg=cfg.get("test_cfg")
  File "/home/sam/ConsistentTeacher/thirdparty/mmdetection/mmdet/models/builder.py", line 55, in build_detector
    'train_cfg specified in both outer field and model field '
AssertionError: train_cfg specified in both outer field and model field

With multiple GPUs, the error is:

  File "tools/train.py", line 130, in main
    init_dist(args.launcher, **cfg.dist_params)
  File "/home/sam/anaconda3/envs/consistentTeacher/lib/python3.7/site-packages/mmcv/utils/config.py", line 519, in __getattr__
    return getattr(self._cfg_dict, name)
  File "/home/sam/anaconda3/envs/consistentTeacher/lib/python3.7/site-packages/mmcv/utils/config.py", line 50, in __getattr__
    raise ex
AttributeError: 'ConfigDict' object has no attribute 'dist_params'

Thanks,

mmcv._ext error when start training.

Hi, congratulations on the acceptance at CVPR23!
I'm trying to use your code for my custom dataset.
I installed the packages following your README file and tried to run the training code,

but mmdet 2.25.0 does not run with mmcv 1.3.9.
mmdet 2.25.0 should be run with mmcv>=1.3.17 and mmcv<=1.6.0, as stated in mmdet/__init__.py.

So I modified mmdet/__init__.py and ran the training code, but it failed again with the following error:

return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'mmcv._ext'

My environment setting is as below.

centos7.4
gcc5.4.0
python3.9
cuda11.1
torch1.9
mmdet2.25.0
mmcv1.3.9

Could you try running your code with the above settings?
If it works for you, could you tell me any other settings that are not mentioned in the README file?
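
The version constraint quoted in this report can be checked before training starts. A minimal sketch that uses only the bounds stated above (mmcv >= 1.3.17 and <= 1.6.0 for mmdet 2.25.0); `parse_version` and `mmcv_in_range` are hypothetical helper names, and the parser handles plain numeric versions only.

```python
def parse_version(version):
    """Turn '1.3.17' into (1, 3, 17); non-numeric suffixes are not handled."""
    return tuple(int(part) for part in version.split("."))


def mmcv_in_range(mmcv_version, minimum="1.3.17", maximum="1.6.0"):
    """Return True when minimum <= mmcv_version <= maximum."""
    low, high = parse_version(minimum), parse_version(maximum)
    return low <= parse_version(mmcv_version) <= high


print(mmcv_in_range("1.3.9"))  # False: the version from this report
print(mmcv_in_range("1.6.0"))  # True: the upper bound is inclusive
```

Comparing tuples of integers avoids the string-comparison trap in which "1.3.9" sorts after "1.3.17". Separately, a missing `mmcv._ext` module is commonly reported when the lite `mmcv` package is installed instead of `mmcv-full`, which ships the compiled ops; whether that applies here is not confirmed in the thread.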

ImportError: cannot import name 'Config' from 'mmcv'

I installed mmcv 2.0.0 and ran "bash tools/dist_train.sh configs/consistent-teacher/consistent_teacher_r50_fpn_coco_180k_10p.py 8".
However, I got:
ImportError: cannot import name 'Config' from 'mmcv' (/home/wangkai/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/__init__.py)
Traceback (most recent call last):
File "tools/train.py", line 11, in <module>
from mmcv import Config, DictAction
ImportError: cannot import name 'Config' from 'mmcv'

What is the reason, and what should I do? Thanks!

Environmental configuration

I have been experiencing various errors while configuring the environment. Do you have a detailed and complete set of environment configuration steps?

Problems with the loss function

Why does the unsup loss stay at 0 during my training process?
2023-12-20 18:55:58,991 - mmdet.ssod - INFO - Iter [4050/18000] lr: 1.000e-02, eta: 6:02:24, time: 5.155, data_time: 4.165, memory: 7971, ema_momentum: 0.9995, sup_loss_cls: 0.3681, sup_loss_bbox: 0.4506, sup_num_gts: 4.1150, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 3.3975, unsup_gmm_thr: 0.3067, loss: 0.8188, grad_norm: 2.2128
2023-12-20 18:56:50,239 - mmdet.ssod - INFO - Iter [4100/18000] lr: 1.000e-02, eta: 5:59:35, time: 1.025, data_time: 0.037, memory: 7971, ema_momentum: 0.9995, sup_loss_cls: 0.3944, sup_loss_bbox: 0.4616, sup_num_gts: 4.0700, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 3.2725, unsup_gmm_thr: 0.3134, loss: 0.8560, grad_norm: 2.4340
2023-12-20 18:57:41,404 - mmdet.ssod - INFO - Iter [4150/18000] lr: 1.000e-02, eta: 5:56:50, time: 1.023, data_time: 0.036, memory: 7971, ema_momentum: 0.9995, sup_loss_cls: 0.3677, sup_loss_bbox: 0.4412, sup_num_gts: 3.5800, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 3.2400, unsup_gmm_thr: 0.3092, loss: 0.8088, grad_norm: 2.2602
2023-12-20 18:58:32,620 - mmdet.ssod - INFO - Iter [4200/18000] lr: 1.000e-02, eta: 5:54:07, time: 1.024, data_time: 0.036, memory: 7971, ema_momentum: 0.9995, sup_loss_cls: 0.3538, sup_loss_bbox: 0.4497, sup_num_gts: 3.9700, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 3.5275, unsup_gmm_thr: 0.3017, loss: 0.8035, grad_norm: 2.0193
2023-12-20 18:59:23,881 - mmdet.ssod - INFO - Iter [4250/18000] lr: 1.000e-02, eta: 5:51:26, time: 1.025, data_time: 0.037, memory: 7971, ema_momentum: 0.9995, sup_loss_cls: 0.3371, sup_loss_bbox: 0.4167, sup_num_gts: 3.8450, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 3.6300, unsup_gmm_thr: 0.3083, loss: 0.7538, grad_norm: 2.1498
2023-12-20 19:00:15,035 - mmdet.ssod - INFO - Iter [4300/18000] lr: 1.000e-02, eta: 5:48:48, time: 1.023, data_time: 0.037, memory: 7971, ema_momentum: 0.9995, sup_loss_cls: 0.3616, sup_loss_bbox: 0.4435, sup_num_gts: 3.9450, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 3.4550, unsup_gmm_thr: 0.3053, loss: 0.8050, grad_norm: 2.2147
2023-12-20 19:01:06,246 - mmdet.ssod - INFO - Iter [4350/18000] lr: 1.000e-02, eta: 5:46:13, time: 1.024, data_time: 0.037, memory: 7971, ema_momentum: 0.9995, sup_loss_cls: 0.3860, sup_loss_bbox: 0.4596, sup_num_gts: 4.8100, unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, unsup_num_gts: 3.2375, unsup_gmm_thr: 0.3129, loss: 0.8456, grad_norm: 2.1909
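
To check over a whole log whether the unsupervised losses ever become non-zero, the scalar fields of each line can be parsed. A minimal sketch, assuming the `key: value` layout of the lines above; `parse_log_line` is a hypothetical helper, not part of the repository, and the sample line is an excerpt of the first log line in this report.

```python
import re

# Matches "name: number" pairs such as "unsup_loss_cls: 0.0000" or
# "lr: 1.000e-02". Values like "eta: 6:02:24" are skipped on purpose,
# because the number must be followed by a comma or the end of the line.
FIELD_RE = re.compile(r"(\w+): (-?\d+(?:\.\d+)?(?:e[+-]?\d+)?)(?=,|$)")


def parse_log_line(line):
    """Return the numeric fields of one training log line as floats."""
    return {key: float(value) for key, value in FIELD_RE.findall(line)}


# Excerpt of the first log line above (fields in between omitted).
sample = ("Iter [4050/18000] lr: 1.000e-02, eta: 6:02:24, "
          "unsup_loss_cls: 0.0000, unsup_loss_bbox: 0.0000, "
          "unsup_num_gts: 3.3975, unsup_gmm_thr: 0.3067")
fields = parse_log_line(sample)
print(fields["unsup_loss_cls"], fields["unsup_num_gts"])  # 0.0 3.3975
```

Parsed this way, the lines above show `unsup_num_gts` between roughly 3.2 and 3.6 while both unsupervised losses are exactly zero.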

Can AnchorGenerator support multiple ratios?

When multiple ratios such as [0.5, 1, 2.0] are set, training crashes in FAM3DHead.

  File "/home/tiger/ConsistentTeacher/ssod/models/dense_heads/fam3d.py", line 169, in forward
    b, h, w, 4).permute(0, 3, 1, 2) / stride[0]
RuntimeError: shape '[4, 104, 180, 4]' is invalid for input of size 898560

So will multiple ratios be supported?
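
The numbers in the error message can be checked by hand. A minimal sketch of the arithmetic, assuming the reshape target `[b, h, w, 4]` in fam3d.py expects exactly one box (4 values) per spatial location; the variable names are illustrative.

```python
# Values taken from the RuntimeError above.
b, h, w, box_dim = 4, 104, 180, 4
input_size = 898560

expected = b * h * w * box_dim          # elements the reshape target can hold
boxes_per_location = input_size // expected

print(expected)                    # 299520
print(boxes_per_location)          # 3
print(input_size % expected == 0)  # True
```

The input holds exactly three times as many elements as the target shape. That factor matches the three ratios [0.5, 1, 2.0], which is consistent with the head assuming a single anchor per location. This reading comes from the error message alone and has not been confirmed against the implementation.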

A question about the pipeline setting

Hi, Adam.
I have a question about the pipeline setting at line 233 of configs/consistent-teacher/consistent_teacher_r50_fpn_coco_180k_10p.py, where type="MultiBranch", unsup_teacher=strong_pipeline, unsup_student=weak_pipeline differs from the setting described in your paper (as I understand it, the strong_pipeline and weak_pipeline tags are swapped).
Is this a mistake in the config, or do I misunderstand it?
Looking forward to your answer, thanks!
