dtennant / reid_baseline_with_syncbn

Reimplementation of Bag of Tricks and A Strong Baseline for Deep Person Re-identification

License: MIT License

Python 99.91% Makefile 0.09%
reidentification deep-learning vehicle-reid person-reid

reid_baseline_with_syncbn's Issues

VeRi-776

By tuning the parameters and using focal loss, the mAP on VeRi-776 can reach 80.0%.
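
For reference, a minimal sketch of a focal-loss variant of softmax cross entropy (in the spirit of Lin et al.'s focal loss; this is not necessarily the exact loss or settings used to reach the 80.0% above):

import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0, alpha=0.25):
    # Per-sample cross entropy equals -log(p_t), the negative log-probability
    # of the true class.
    ce = F.cross_entropy(logits, targets, reduction='none')
    pt = torch.exp(-ce)
    # Down-weight well-classified (easy) samples by (1 - p_t)^gamma.
    return (alpha * (1.0 - pt) ** gamma * ce).mean()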

How to train ResNet on person and vehicle ReID datasets like MSMT17 and VeRi?

Hi, I want to train ResNet-50 or ResNet-34 for both person and vehicle ReID on the MSMT17 and VeRi datasets.

1. Can I train a single ResNet on both object classes, using both datasets at the same time, or is it only suitable for one class at a time?
2. What is the training process?
3. Can I also use ResNet-18 or ResNet-34 in the same way to increase speed?

My ultimate goal is to train a ResNet ReID model for both persons and cars, and then combine it with this project (https://github.com/GeekAlexis/FastMOT) for fast multi-object tracking.

How can I achieve this goal?
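
One common way to approach this (a sketch under assumptions, not something the repo does out of the box) is to train a single model on the union of a person and a vehicle dataset by concatenating their training lists and offsetting the IDs of the second dataset so identities do not collide. The code below assumes each dataset object exposes a train list of (img_path, pid, camid) tuples, as is typical for ReID baselines; the function name is illustrative:

def merge_reid_datasets(dataset_a, dataset_b):
    # Offset the IDs of the second dataset so its labels do not collide with
    # those of the first one.
    offset = max(pid for _, pid, _ in dataset_a.train) + 1
    combined = list(dataset_a.train)
    combined += [(path, pid + offset, camid) for path, pid, camid in dataset_b.train]
    num_classes = len({pid for _, pid, _ in combined})
    return combined, num_classes

The classifier head then needs num_classes outputs; camera IDs are normally only used for evaluation-time filtering, so collisions between the two datasets' camera IDs are usually harmless during training.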

Learning rate of the bias when training on multiple GPUs

Hi, should the learning rate of the bias parameters also be multiplied by the number of GPUs in the following code?

import torch

def make_optimizer(cfg, model, num_gpus=1):
    params = []
    for key, value in model.named_parameters():
        # Skip frozen parameters.
        if not value.requires_grad:
            continue
        # Linear scaling rule: scale the base lr by the number of GPUs.
        lr = cfg.SOLVER.BASE_LR * num_gpus
        weight_decay = cfg.SOLVER.WEIGHT_DECAY
        if "bias" in key:
            # Note: the bias lr is taken from the *unscaled* base lr here.
            lr = cfg.SOLVER.BASE_LR * cfg.SOLVER.BIAS_LR_FACTOR
            weight_decay = cfg.SOLVER.WEIGHT_DECAY_BIAS
        params += [{"params": [value], "lr": lr, "weight_decay": weight_decay}]
    if cfg.SOLVER.OPTIMIZER_NAME == 'SGD':
        optimizer = getattr(torch.optim, cfg.SOLVER.OPTIMIZER_NAME)(params, momentum=cfg.SOLVER.MOMENTUM)
    else:
        optimizer = getattr(torch.optim, cfg.SOLVER.OPTIMIZER_NAME)(params)
    return optimizer
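
If the linear scaling rule should also cover the bias parameters, the per-parameter lr computation could look like the following variant (a sketch, not the repo's current behaviour; the helper name is illustrative):

def param_lr_and_decay(cfg, key, num_gpus=1):
    # Variant in which the linear scaling rule also covers the biases.
    lr = cfg.SOLVER.BASE_LR * num_gpus
    weight_decay = cfg.SOLVER.WEIGHT_DECAY
    if "bias" in key:
        lr = lr * cfg.SOLVER.BIAS_LR_FACTOR          # scale from the GPU-scaled lr
        weight_decay = cfg.SOLVER.WEIGHT_DECAY_BIAS
    return lr, weight_decay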

multi-GPU training problem

Hi, when I trained the model with multiple GPUs, training still had not started after more than 30 minutes and I don't know why. Could you give me some suggestions? Thank you!

2019-09-25 14:56:36,708 reid_baseline.train INFO: More than one gpu used, convert model to use SyncBN.
2019-09-25 14:56:40,504 reid_baseline.train INFO: Using pytorch SyncBN implementation
2019-09-25 14:56:40,535 reid_baseline.train INFO: Trainer Built

How do I resume training after it has been interrupted?

After loading the checkpoint in train.py, I get: RuntimeError: Error(s) in loading state_dict for DataParallelWithCallback:
Missing key(s) in state_dict: "module.base.conv1.weight", "module.base.bn1.weight", "module.base.bn1.bias", "module.base.bn1.running_mean", "module.base.bn1.running_var", "module.base.layer1.0.conv1.weight", "module.base.layer1.0.bn1.weight", "module.base.layer1.0.bn1.bias", "module.base.layer1.0.bn1.running_mean", "module.base.layer1.0.bn1.running_var", "module.base.layer1.0.conv2.weight", "module.base.layer1.0.bn2.weight", "module.base.layer1.0.bn2.bias", "module.base.layer1.0.bn2.running_mean", "module.base.layer1.0.bn2.running_var", "module.base.layer1.0.conv3.weight", "module.base.layer1.0.bn3.weight", "module.base.layer1.0.bn3.bias", "module.base.layer1.0.bn3.running_mean", "module.base.layer1.0.bn3.running_var", "module.base.layer1.0.downsample.0.weight", "module.base.layer1.0.downsample.1.weight", "module.base.layer1.0.downsample.1.bias", "module.base.layer1.0.downsample.1.running_mean", "module.base.layer1.0.downsample.1.running_var", "module.base.layer1.1.conv1.weight", "module.base.layer1.1.bn1.weight", "module.base.layer1.1.bn1.bias", "module.base.layer1.1.bn1.running_mean", "module.base.layer1.1.bn1.running_var", "module.base.layer1.1.conv2.weight", "module.base.layer1.1.bn2.weight", "module.base.layer1.1.bn2.bias", "module.base.layer1.1.bn2.running_mean", "module.base.layer1.1.bn2.running_var", "module.base.layer1.1.conv3.weight", "module.base.layer1.1.bn3.weight", "module.base.layer1.1.bn3.bias", "module.base.layer1.1.bn3.running_mean", "module.base.layer1.1.bn3.running_var", "module.base.layer1.2.conv1.weight", "module.base.layer1.2.bn1.weight", "module.base.layer1.2.bn1.bias", "module.base.layer1.2.bn1.running_mean", "module.base.layer1.2.bn1.running_var", "module.base.layer1.2.conv2.weight", "module.base.layer1.2.bn2.weight", "module.base.layer1.2.bn2.bias", "module.base.layer1.2.bn2.running_mean", "module.base.layer1.2.bn2.running_var", "module.base.layer1.2.conv3.weight", "module.base.layer1.2.bn3.weight", "module.base.layer1.2.bn3.bias", "module.base.layer1.2.bn3.running_mean", "module.base.layer1.2.bn3.running_var", "module.base.layer2.0.conv1.weight", "module.base.layer2.0.bn1.weight", "module.base.layer2.0.bn1.bias", "module.base.layer2.0.bn1.running_mean", "module.base.layer2.0.bn1.running_var", "module.base.layer2.0.conv2.weight", "module.base.layer2.0.bn2.weight", "module.base.layer2.0.bn2.bias", "module.base.layer2.0.bn2.running_mean", "module.base.layer2.0.bn2.running_var", "module.base.layer2.0.conv3.weight", "module.base.layer2.0.bn3.weight", "module.base.layer2.0.bn3.bias", "module.base.layer2.0.bn3.running_mean", "module.base.layer2.0.bn3.running_var", "module.base.layer2.0.downsample.0.weight", "module.base.layer2.0.downsample.1.weight", "module.base.layer2.0.downsample.1.bias", "module.base.layer2.0.downsample.1.running_mean", "module.base.layer2.0.downsample.1.running_var", "module.base.layer2.1.conv1.weight", "module.base.layer2.1.bn1.weight", "module.base.layer2.1.bn1.bias", "module.base.layer2.1.bn1.running_mean", "module.base.layer2.1.bn1.running_var", "module.base.layer2.1.conv2.weight", "module.base.layer2.1.bn2.weight", "module.base.layer2.1.bn2.bias", "module.base.layer2.1.bn2.running_mean", "module.base.layer2.1.bn2.running_var", "module.base.layer2.1.conv3.weight", "module.base.layer2.1.bn3.weight", "module.base.layer2.1.bn3.bias", "module.base.layer2.1.bn3.running_mean", "module.base.layer2.1.bn3.running_var", "module.base.layer2.2.conv1.weight", "module.base.layer2.2.bn1.weight", "module.base.layer2.2.bn1.bias", 
"module.base.layer2.2.bn1.running_mean", "module.base.layer2.2.bn1.running_var", "module.base.layer2.2.conv2.weight", "module.base.layer2.2.bn2.weight", "module.base.layer2.2.bn2.bias", "module.base.layer2.2.bn2.running_mean", "module.base.layer2.2.bn2.running_var", "module.base.layer2.2.conv3.weight", "module.base.layer2.2.bn3.weight", "module.base.layer2.2.bn3.bias", "module.base.layer2.2.bn3.running_mean", "module.base.layer2.2.bn3.running_var", "module.base.layer2.3.conv1.weight", "module.base.layer2.3.bn1.weight", "module.base.layer2.3.bn1.bias", "module.base.layer2.3.bn1.running_mean", "module.base.layer2.3.bn1.running_var", "module.base.layer2.3.conv2.weight", "module.base.layer2.3.bn2.weight", "module.base.layer2.3.bn2.bias", "module.base.layer2.3.bn2.running_mean", "module.base.layer2.3.bn2.running_var", "module.base.layer2.3.conv3.weight", "module.base.layer2.3.bn3.weight", "module.base.layer2.3.bn3.bias", "module.base.layer2.3.bn3.running_mean", "module.base.layer2.3.bn3.running_var", "module.base.layer3.0.conv1.weight", "module.base.layer3.0.bn1.weight", "module.base.layer3.0.bn1.bias", "module.base.layer3.0.bn1.running_mean", "module.base.layer3.0.bn1.running_var", "module.base.layer3.0.conv2.weight", "module.base.layer3.0.bn2.weight", "module.base.layer3.0.bn2.bias", "module.base.layer3.0.bn2.running_mean", "module.base.layer3.0.bn2.running_var", "module.base.layer3.0.conv3.weight", "module.base.layer3.0.bn3.weight", "module.base.layer3.0.bn3.bias", "module.base.layer3.0.bn3.running_mean", "module.base.layer3.0.bn3.running_var", "module.base.layer3.0.downsample.0.weight", "module.base.layer3.0.downsample.1.weight", "module.base.layer3.0.downsample.1.bias", "module.base.layer3.0.downsample.1.running_mean", "module.base.layer3.0.downsample.1.running_var", "module.base.layer3.1.conv1.weight", "module.base.layer3.1.bn1.weight", "module.base.layer3.1.bn1.bias", "module.base.layer3.1.bn1.running_mean", "module.base.layer3.1.bn1.running_var", "module.base.layer3.1.conv2.weight", "module.base.layer3.1.bn2.weight", "module.base.layer3.1.bn2.bias", "module.base.layer3.1.bn2.running_mean", "module.base.layer3.1.bn2.running_var", "module.base.layer3.1.conv3.weight", "module.base.layer3.1.bn3.weight", "module.base.layer3.1.bn3.bias", "module.base.layer3.1.bn3.running_mean", "module.base.layer3.1.bn3.running_var", "module.base.layer3.2.conv1.weight", "module.base.layer3.2.bn1.weight", "module.base.layer3.2.bn1.bias", "module.base.layer3.2.bn1.running_mean", "module.base.layer3.2.bn1.running_var", "module.base.layer3.2.conv2.weight", "module.base.layer3.2.bn2.weight", "module.base.layer3.2.bn2.bias", "module.base.layer3.2.bn2.running_mean", "module.base.layer3.2.bn2.running_var", "module.base.layer3.2.conv3.weight", "module.base.layer3.2.bn3.weight", "module.base.layer3.2.bn3.bias", "module.base.layer3.2.bn3.running_mean", "module.base.layer3.2.bn3.running_var", "module.base.layer3.3.conv1.weight", "module.base.layer3.3.bn1.weight", "module.base.layer3.3.bn1.bias", "module.base.layer3.3.bn1.running_mean", "module.base.layer3.3.bn1.running_var", "module.base.layer3.3.conv2.weight", "module.base.layer3.3.bn2.weight", "module.base.layer3.3.bn2.bias", "module.base.layer3.3.bn2.running_mean", "module.base.layer3.3.bn2.running_var", "module.base.layer3.3.conv3.weight", "module.base.layer3.3.bn3.weight", "module.base.layer3.3.bn3.bias", "module.base.layer3.3.bn3.running_mean", "module.base.layer3.3.bn3.running_var", "module.base.layer3.4.conv1.weight", "module.base.layer3.4.bn1.weight", 
"module.base.layer3.4.bn1.bias", "module.base.layer3.4.bn1.running_mean", "module.base.layer3.4.bn1.running_var", "module.base.layer3.4.conv2.weight", "module.base.layer3.4.bn2.weight", "module.base.layer3.4.bn2.bias", "module.base.layer3.4.bn2.running_mean", "module.base.layer3.4.bn2.running_var", "module.base.layer3.4.conv3.weight", "module.base.layer3.4.bn3.weight", "module.base.layer3.4.bn3.bias", "module.base.layer3.4.bn3.running_mean", "module.base.layer3.4.bn3.running_var", "module.base.layer3.5.conv1.weight", "module.base.layer3.5.bn1.weight", "module.base.layer3.5.bn1.bias", "module.base.layer3.5.bn1.running_mean", "module.base.layer3.5.bn1.running_var", "module.base.layer3.5.conv2.weight", "module.base.layer3.5.bn2.weight", "module.base.layer3.5.bn2.bias", "module.base.layer3.5.bn2.running_mean", "module.base.layer3.5.bn2.running_var", "module.base.layer3.5.conv3.weight", "module.base.layer3.5.bn3.weight", "module.base.layer3.5.bn3.bias", "module.base.layer3.5.bn3.running_mean", "module.base.layer3.5.bn3.running_var", "module.base.layer4.0.conv1.weight", "module.base.layer4.0.bn1.weight", "module.base.layer4.0.bn1.bias", "module.base.layer4.0.bn1.running_mean", "module.base.layer4.0.bn1.running_var", "module.base.layer4.0.conv2.weight", "module.base.layer4.0.bn2.weight", "module.base.layer4.0.bn2.bias", "module.base.layer4.0.bn2.running_mean", "module.base.layer4.0.bn2.running_var", "module.base.layer4.0.conv3.weight", "module.base.layer4.0.bn3.weight", "module.base.layer4.0.bn3.bias", "module.base.layer4.0.bn3.running_mean", "module.base.layer4.0.bn3.running_var", "module.base.layer4.0.downsample.0.weight", "module.base.layer4.0.downsample.1.weight", "module.base.layer4.0.downsample.1.bias", "module.base.layer4.0.downsample.1.running_mean", "module.base.layer4.0.downsample.1.running_var", "module.base.layer4.1.conv1.weight", "module.base.layer4.1.bn1.weight", "module.base.layer4.1.bn1.bias", "module.base.layer4.1.bn1.running_mean", "module.base.layer4.1.bn1.running_var", "module.base.layer4.1.conv2.weight", "module.base.layer4.1.bn2.weight", "module.base.layer4.1.bn2.bias", "module.base.layer4.1.bn2.running_mean", "module.base.layer4.1.bn2.running_var", "module.base.layer4.1.conv3.weight", "module.base.layer4.1.bn3.weight", "module.base.layer4.1.bn3.bias", "module.base.layer4.1.bn3.running_mean", "module.base.layer4.1.bn3.running_var", "module.base.layer4.2.conv1.weight", "module.base.layer4.2.bn1.weight", "module.base.layer4.2.bn1.bias", "module.base.layer4.2.bn1.running_mean", "module.base.layer4.2.bn1.running_var", "module.base.layer4.2.conv2.weight", "module.base.layer4.2.bn2.weight", "module.base.layer4.2.bn2.bias", "module.base.layer4.2.bn2.running_mean", "module.base.layer4.2.bn2.running_var", "module.base.layer4.2.conv3.weight", "module.base.layer4.2.bn3.weight", "module.base.layer4.2.bn3.bias", "module.base.layer4.2.bn3.running_mean", "module.base.layer4.2.bn3.running_var", "module.bottleneck.weight", "module.bottleneck.bias", "module.bottleneck.running_mean", "module.bottleneck.running_var", "module.classifier.weight".
Unexpected key(s) in state_dict: "state", "param_groups".
What do I need to do?
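
The 'Unexpected key(s) in state_dict: "state", "param_groups"' part of the message means an optimizer checkpoint (the output of optimizer.state_dict()) was loaded into the model. A minimal sketch of resuming, assuming the training script saved the model weights and the optimizer state to separate files (the argument names below are only illustrative):

import torch

def resume(model, optimizer, model_ckpt, optim_ckpt):
    # Load the model weights and the optimizer state from their own files;
    # loading the optimizer file (which only contains 'state' and
    # 'param_groups') into the model gives exactly the error above.
    # If the weights were saved from a DataParallel-wrapped model and are
    # loaded into an unwrapped one (or vice versa), the "module." prefix in
    # the keys may also need to be added or stripped.
    model.load_state_dict(torch.load(model_ckpt, map_location='cpu'))
    optimizer.load_state_dict(torch.load(optim_ckpt, map_location='cpu'))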

VeRi-776 accuracy

Hello, I also trained on the VeRi-776 dataset with this code (all tricks with center loss, resnet_ibn_a) for 40 epochs. The validation accuracy was above 90, but when testing with test.py the mAP is only 3% and the rank scores are also very low.
I would like to know which model type you trained, how many epochs, and which pretrained weights you used; I would really like to reach your accuracy!
Thanks in advance, I really appreciate any reply!

Pre-trained model

Hi, where can I find the VeRi pre-trained model for testing only?

no module amp_C?

How can I solve this problem? I have already installed apex, but when I run main.py this error happens.
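
For context, amp_C is the compiled CUDA extension that ships with apex; it only exists when apex is built from source with its C++/CUDA extensions enabled, not after a plain pip install. A quick hedged check:

try:
    import amp_C  # only present when apex was built with its CUDA extensions
    print("apex CUDA extensions are available")
except ImportError:
    print("apex is installed without its CUDA extensions; "
          "reinstall it from source with the extensions enabled")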

Testing on Windows

Hello, can this be tested on Windows? When I test on Windows it always reports Permission denied: 'configs'. The full error report is:
Traceback (most recent call last):
File "main.py", line 186, in
main()
File "main.py", line 59, in main
train(args)
File "main.py", line 63, in train
cfg.merge_from_file(args.config_file)
File "E:\Anaconda\lib\site-packages\yacs\config.py", line 211, in merge_from_file
with open(cfg_filename, "r") as f:
PermissionError: [Errno 13] Permission denied: 'configs'

(base) E:\code\reid_baseline_with_syncbn-master>python main.py -c configs / debug.yml
Traceback (most recent call last):
File "main.py", line 186, in
main()
File "main.py", line 59, in main
train(args)
File "main.py", line 63, in train
cfg.merge_from_file(args.config_file)
File "E:\Anaconda\lib\site-packages\yacs\config.py", line 211, in merge_from_file
with open(cfg_filename, "r") as f:
PermissionError: [Errno 13] Permission denied: 'configs'
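
The traceback shows yacs receiving 'configs' (a directory) as the config path, which is what the shell passes when the argument is split as configs / debug.yml; the path has to be a single token such as configs/debug.yml. A small sketch of a guard in front of merge_from_file (the function name is illustrative):

import os

def load_config(cfg, config_file):
    # merge_from_file expects a YAML file; passing the 'configs' directory
    # (e.g. from "-c configs / debug.yml" with stray spaces) fails with a
    # PermissionError or IsADirectoryError depending on the platform.
    if not os.path.isfile(config_file):
        raise FileNotFoundError(
            "'%s' is not a config file; pass e.g. configs/debug.yml" % config_file)
    cfg.merge_from_file(config_file)
    return cfg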

VeRi-Wild parameters?

This dataset is really too large for an ordinary lab to train on. Could you share the preprocessing script you used for this dataset?
To train on it, you also need to rename the images in the dataset to the form 0001_c014_00001.jpg (vehicleid_cameraid_imageid.jpg).
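
A minimal sketch of such a renaming step, assuming the (image path, vehicle ID, camera ID) records have already been parsed from the VeRi-Wild annotation files (that parsing is dataset-specific and not shown here):

import os
import shutil

def rename_to_reid_format(records, dst_dir):
    # records: iterable of (src_path, vehicle_id, camera_id); images are copied
    # to dst_dir as vehicleid_cameraid_imageid.jpg, e.g. 0001_c014_00001.jpg.
    os.makedirs(dst_dir, exist_ok=True)
    per_id_counter = {}
    for src_path, vid, camid in records:
        idx = per_id_counter.get(vid, 0) + 1
        per_id_counter[vid] = idx
        new_name = "%04d_c%03d_%05d.jpg" % (vid, camid, idx)
        shutil.copy(src_path, os.path.join(dst_dir, new_name))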

Regarding VeRi-776

The results are really good. I just wanted to make sure whether you followed the VeRi-776 evaluation as described in the paper, i.e. removing same-camera retrievals from the test set?
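
For reference, the usual VeRi-776 (and Market-1501) protocol drops gallery images that share both the identity and the camera with the query before computing CMC/mAP; a minimal sketch of that filter (array names are illustrative):

import numpy as np

def valid_gallery_mask(q_pid, q_camid, g_pids, g_camids):
    # Keep a gallery entry unless it has the same identity AND the same camera
    # as the query; such entries are excluded from the ranking.
    return ~((g_pids == q_pid) & (g_camids == q_camid))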

Multi-GPU training problem?

Hello,
with multiple GPUs the run gets stuck at this point and does not move on,
while the single-GPU script works fine.
2022-06-19 19:09:40,134 reid_baseline.train INFO: Trainer Built

I only modified the following:

MODEL:
  PRETRAIN_PATH: '/home/wgj233/.cache/torch/checkpoints/resnet50-19c8e357.pth'

INPUT:
  SIZE_TRAIN: [384, 384]
  SIZE_TEST: [384, 384]
  PIXEL_MEAN: [0.5, 0.5, 0.5]
  PIXEL_STD: [0.5, 0.5, 0.5]
  PROB: 0.5 # random horizontal flip
  RE_PROB: 0.5 # random erasing
  PADDING: 0

DATASETS:
  NAMES: 'FVRID_sum' # 'market1501'
  DATA_PATH: '/home/wgj233/Datasets/FVRID_sum' # '#/home/zbc/data/market1501'
  TRAIN_PATH: 'train_foggy' # 'bounding_box_train'
  QUERY_PATH: 'query_foggy' # 'query'
  GALLERY_PATH: 'gallery_foggy' # 'bounding_box_test'

DATALOADER:
  SAMPLER: 'softmax_triplet'
  NUM_INSTANCE: 8
  NUM_WORKERS: 4

SOLVER:
  OPTIMIZER_NAME: 'Adam'
  MAX_EPOCHS: 30
  BASE_LR: 0.0001
  BIAS_LR_FACTOR: 1
  WEIGHT_DECAY: 0.0005
  WEIGHT_DECAY_BIAS: 0.0005
  IMS_PER_BATCH: 16

  STEPS: [20, 30, 45, 60, 75, 90, 105, 120, 135, 150, 165, 180, 195, 210, 225, 240, 255]
  GAMMA: 0.6

  WARMUP_FACTOR: 0.01
  WARMUP_ITERS: 10
  WARMUP_METHOD: 'linear'

  CHECKPOINT_PERIOD: 1
  LOG_PERIOD: 100
  EVAL_PERIOD: 1

TEST:
  IMS_PER_BATCH: 16
  DEBUG: True
  WEIGHT: "path"
  MULTI_GPU: True

OUTPUT_DIR: "/home/wgj233/reid_baseline_with_syncbn-master/outputs/debug_multi-gpu"

Training with VehicleID

How should QUERY and GALLERY be chosen for the VehicleID dataset? I personally picked one image for each of the 800 IDs in one of the test sets as QUERY and used the rest as GALLERY. Then I renamed all images to the format "00000001_c0000_00000005.jpg". When I trained with veri.yml for comparison, I got the following:
ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable RANK expected, but not set.
If I comment out FP16, the code just sits there waiting (I later found that reducing batch_size fixes it), but training seems a bit slow: one epoch takes over an hour on three 1080Ti GPUs. Is that speed normal?
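
On the QUERY/GALLERY question: the split usually reported for VehicleID randomly places one image per identity in the gallery and uses the remaining images as probes (the opposite assignment to the one described above); a minimal sketch, assuming the test records are (img_path, vehicle_id, camid) tuples:

import random
from collections import defaultdict

def split_one_per_id(test_records, seed=0):
    # One randomly chosen image per vehicle ID goes to the gallery, the rest
    # become queries; swap the two lists for the split described above.
    random.seed(seed)
    by_id = defaultdict(list)
    for record in test_records:
        by_id[record[1]].append(record)
    gallery, query = [], []
    for records in by_id.values():
        random.shuffle(records)
        gallery.append(records[0])
        query.extend(records[1:])
    return gallery, query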

VeRi-Wild dataset question?

Hello, I am a student just getting started with ReID. While running your code I noticed that the VeRi-Wild dataset seems a bit different from VeRi-776; it looks like VeRi-Wild was preprocessed and its images were renamed. Could you please share the script file? Thanks, I have already starred the repo.
