layumi / aicity-reid-2020


446.0 15.0 111.0 8.97 MB

:red_car: The 1st Place Submission to AICity Challenge 2020 re-id track (Baidu-UTS submission)

License: MIT License

Python 100.00%

vehicle pytorch paddlepaddle vehicle-reid veri-776 aicity cityflow cvpr2020

aicity-reid-2020's Introduction

Hi there 👋

Contact Me:

✉ Email: zhedongzheng AT um.edu.mo

✧ Website: http://zdzheng.xyz

✧ Linkedin: https://www.linkedin.com/in/zhedongzheng/

✧ Google Scholar: https://scholar.google.com/citations?hl=en&user=XT17oUEAAAAJ


aicity-reid-2020's Issues

What is the meaning of "gallery" in the test_2020.py file?

Hi,
I don't understand the meaning of "gallery", and I could not find any mention of "gallery" in the AICity dataset.
Thanks

Cannot download feature extraction models

Hi,

It seems that the google drive URL does not exist anymore and the OneDrive URL requires authentication.

Can you take a look and reupload those models somewhere?

Thanks a lot

sharing trained model

Hi, I don't have the resources to train it. It would be helpful if you could share the trained model.

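Merging the features of 13 models

Hello Dr. Zheng, fast_submit.py merges query_feature0 through query_feature12 into the final feature. Is feature0 the only one that comes from the PyTorch side?

prepare_2020.py is hard to understand

I could not see where the AI City 2020 data is fed in for processing, and there are no comments anywhere. Which parameter is the input data?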

multi-scale model performance compared to single scale model

I would like to know how much the performance improves under the same conditions.

I was going through your research paper and the code in parallel. I could not find how you used UNIT (cycle gan) to perform style transfer and then content manipulation.
Kindly help us understand the flow of your code, or provide a link to where you trained synthetic images on CycleGAN.

Copy&Paste method

Hello,

Is the Copy&Paste technique implemented in the code of this repo? If so, in which file is it implemented?
Thank you in advance.

python train_2020.py

@layumi Dr. Zheng, running python train_2020.py directly starts training correctly, but running the long command given in README.md produces the error below. Why is that?

requests.exceptions.SSLError: HTTPSConnectionPool(host='data.lip6.fr', port=443): Max retries exceeded with url: /cadene/pretrainedmodels/se_resnext101_32x4d-3b2fe3d8.pth (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))

Thanks for sharing. Will this competition solution be written up as a paper and posted on arXiv?

test_2020.py

At test time the dataset should not be the total number of real images. The gallery should be 56277 images, and for "image_query/": "This dir contains 1052 images as queries."
But the result shows
Gallery Size: 1477400
Query Size: 465

train_2020.py problem

FileNotFoundError: [Errno 2] No such file or directory: './data/pytorch2020/train+virtual'

Hello, I ran train_2020.py as you described, but the above error occurs. What is the train+virtual folder for?
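Judging from the copy of train_2020.py quoted further down this page, the folder name appears to be built from the dataset flags: `--train_virtual` sets the suffix to `+virtual`, and the script then opens `<data_dir>/train+virtual`. The lookup can be sketched as follows. The helper name `resolve_train_dir` is hypothetical and not part of the repository; only the flag names and suffixes are taken from the quoted script.

```python
import os

# Flag -> suffix pairs as they appear in the quoted train_2020.py.
# The script checks them in this order, so a later flag overrides an earlier one.
SUFFIXES = [
    ("train_all", "_all"),
    ("train_veri", "+veri"),
    ("train_comp", "+comp"),
    ("train_virtual", "+virtual"),
    ("train_pku", "+pku"),
    ("train_comp_veri", "+comp+veri"),
    ("train_milktea", "+comp+veri+pku"),
]


def resolve_train_dir(data_dir, **flags):
    """Hypothetical helper: return the folder the training script would try to open."""
    suffix = ""
    for flag, value in SUFFIXES:
        if flags.get(flag, False):
            suffix = value
    return os.path.join(data_dir, "train" + suffix)


print(resolve_train_dir("./data/pytorch2020", train_virtual=True))
print(resolve_train_dir("./data/pytorch2020"))
```

If that reading is right, the error means the folder combining the real training images with the synthetic ones has not been created under the data directory yet.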

Vehicle reid based on data fusion

Hello, Dr. Zheng.

Thank you for your excellent work. I have a question for you. In vehicle re-identification, if a three-dimensional point cloud and a two-dimensional image are used together, can the accuracy of vehicle recognition be improved?

Thank you very much!

where is SE_imbalance_s1_384_p0.5_lr2_mt_d0_b24+v+aug/opts.yaml

Hello,
Can you help me solve this issue? I am testing the pretrained weights and this is the error I am getting.

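Question about using VeRi-776

Hello Dr. Zheng. I am trying to run stage_two training, i.e. fine-tuning on VeRi, starting from the Res50_imbalance_s1_256_p0.5_lr2_mt_d0_b48 (TMM) model you provided. I hope to end up with a model whose mAP is close to that of your ft_Res50_imbalance_s1_256_p0.5_lr1_mt_d0.2_b48_w5 (TMM).

However, during training I noticed that the vehicle IDs in the VeRi dataset go above 700, while the final classifier layer of ft_Res50_imbalance_s1_256_p0.5_lr1_mt_d0.2_b48_w5 (TMM) has 576 outputs.

Did you use only part of VeRi for fine-tuning? How did you split it?

Below are the test results I obtained with nclass=769 (training on all vehicles in the VeRi training set) and stage_two=12:
Rank@1:0.846246 Rank@5:0.937425 Rank@10:0.967819 mAP:0.547631

That is still well below the mAP of 83.4% of the model you provided.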

Training problem

Hello Dr. Zheng, a question: when running python train_2020.py... I get "FileNotFoundError: [Errno 2] No such file or directory: './train.py'". There is no train.py in the pytorch folder. What is the cause?

efficientnet_pytorch?

Hello, where is efficientnet_pytorch? Thanks.

Models

Which 12 models were used in the final ensemble?

pretrainedmodels.py ?

I was looking through your research and tried to run train_2020.py, but it says that it can't import pretrainedmodels.
What should I do?

It seems your code does not match your description of your workshop paper.

I have read your paper and code roughly, but have a few questions.
Your code uses SE-ResNeXt101, but the paper uses ResNeXt101; your code does not use Block3; your learning rate is 0.02 instead of the 0.01 in the paper; the LR scheduler is MultiStepLR instead of cosine; the feature dim is 4092 instead of 1024. May I ask why?

cropped_aicity folder order???

I am confused about how to provide the cropped_aicity folder.
Should I structure the cropped images folder the same way as test_data?
My test_data structure is: test_data/gallery/im_test and test_data/query/im_query

When I run fast_submit_baidu.py I am getting the following error:
RuntimeError: CUDA out of memory. Tried to allocate 1.39 GiB (GPU 0; 7.80 GiB total capacity; 5.19 GiB already allocated; 174.31 MiB free; 5.59 GiB reserved in total by PyTorch)

@layumi Could you please give me some guidance about how to prepare test_dir and crop_dir to run fast_submit_baidu.py?

Thanks again for your great job.

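train_2020.py fails with: Couldn't find any class folder in ./data/pytorch2020/train

Hello Dr. Zheng, I placed the image data produced by prepare_2020.py in the folder marked with the red box at position 1 in the screenshot, but train_2020.py then reports that no class can be found under ./data/pytorch2020/train. What is going wrong? Dr. Zheng, could the README spell out every file path used in the code, and which data should be placed under which path?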
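For context, torchvision's `ImageFolder` treats each subdirectory of the root as one class and raises "Couldn't find any class folder" when the root contains no subdirectories, for example when images sit directly in the root. A minimal sketch of a pre-flight check, using only the standard library; the helper name `list_class_folders` and the ID folder names are illustrative assumptions, not taken from the repository:

```python
import os
import tempfile


def list_class_folders(root):
    """Hypothetical check: sorted subdirectory names, i.e. what would count as classes."""
    return sorted(entry.name for entry in os.scandir(root) if entry.is_dir())


# Demo on a throwaway directory: a loose image directly under the root is
# not a class; only subfolders (assumed here: one per vehicle ID) are.
with tempfile.TemporaryDirectory() as root:
    open(os.path.join(root, "000001.jpg"), "w").close()  # loose image, ignored
    flat = list_class_folders(root)
    os.makedirs(os.path.join(root, "0001"))  # illustrative vehicle-ID folders
    os.makedirs(os.path.join(root, "0002"))
    nested = list_class_folders(root)

print(flat)
print(nested)
```

Running such a check on ./data/pytorch2020/train before training shows whether the per-ID folders exist at the level the script expects.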

mAP calculation

Hello @layumi
I didn't quite understand your mAP calculation.
Is there any explanation you could share with me?
Why are you adding old_precision even though every ap is already being summed up?
I also don't understand why we should divide by 2 before summing them up (here: ap = ap + d_recall*(old_precision + precision)/2)

def compute_mAP(index, good_index, junk_index):
    ap = 0
    cmc = torch.IntTensor(len(index)).zero_()
    if good_index.size==0:   # if empty
        cmc[0] = -1
        return ap,cmc

    # remove junk_index
    mask = np.in1d(index, junk_index, invert=True)
    index = index[mask]

    # find good_index index
    ngood = len(good_index)
    mask = np.in1d(index, good_index)
    rows_good = np.argwhere(mask==True)
    rows_good = rows_good.flatten()
    
    cmc[rows_good[0]:] = 1
    for i in range(ngood):
        d_recall = 1.0/ngood
        precision = (i+1)*1.0/(rows_good[i]+1)
        if rows_good[i]!=0:
            old_precision = i*1.0/rows_good[i]
        else:
            old_precision=1.0
        ap = ap + d_recall*(old_precision + precision)/2

    return ap, cmc
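One way to read the `/2`: each correct match adds a recall step of `1/ngood`, and the precision just before and just after that match are averaged, which is the trapezoidal rule applied to the precision-recall curve. The sketch below restates the quoted loop in plain Python so the arithmetic can be followed by hand. The function name `average_precision` is an assumption for illustration, and it takes the 0-based rank positions of the correct matches directly.

```python
def average_precision(rows_good):
    """Trapezoidal AP from the 0-based rank positions of the correct matches."""
    ngood = len(rows_good)
    if ngood == 0:
        return 0.0
    ap = 0.0
    d_recall = 1.0 / ngood  # recall gained per correct match
    for i, row in enumerate(sorted(rows_good)):
        precision = (i + 1) / (row + 1)  # precision right after this match
        # precision just before this match; taken as 1.0 at the top of the list
        old_precision = i / row if row != 0 else 1.0
        # area of one trapezoid: width d_recall, heights old_precision and precision
        ap += d_recall * (old_precision + precision) / 2
    return ap


perfect = average_precision([0, 1])  # both matches ranked first
mixed = average_precision([1, 3])    # matches at ranks 2 and 4
print(perfect, mixed)
```

For the second call the two trapezoids are 0.5 * (0 + 1/2) / 2 and 0.5 * (1/3 + 1/2) / 2, so old_precision enters only as the left edge of each trapezoid and is not a second sum of AP.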

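Negative mining code

Dr. Zheng, which part of the code corresponds to the negative mining described in the paper? Thanks.

Testing directly

Hello, I am a beginner and would like to run the test code directly, skipping the training part. I tried running python fast_submit.py but found that the dataset file structure is wrong:
No such file or directory: './data/test_data/gallery'
I prepared the data following https://github.com/PaddlePaddle/Research/tree/master/CV/PaddleReid/process_aicity_data, but './data/test_data/gallery' still does not exist. How should the test dataset be prepared? Thanks.

Getting the data

Hello, how can I obtain the data? The official site requires a password to download it. Could you share it?

About the gallery folder generated by prepare_2020

Hello, prepare_2020.py creates the gallery folder, but no program adds anything to it, and running test_2020.py then fails with an error saying the folder contains no images.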

About pre-trained model

Thanks for your work! Since we cannot get the data, could you share your pre-trained model for testing? If not, why not?

Missing pkl features

Could you please provide all the necessary features in order to run fast_submit_baidu.py?
For example, "query_fea_ResNeXt101_vd_64x4d_cos_alldata_final.pkl" is missing.

Dataset question

camera-aware model

Hello @layumi @miraclebiu. Thank you for your work once more.
I have a question about training the camera-aware model.
Nothing is said about it in the README. Could you please give me some guidance if possible?
I would like to know how to train it and how to implement it.

Thank you a lot

CUDA out of memory

I am running the fast_submit_baidu.py file. I have set --batchsize 1, but I am still getting an error.

python fast_submit_baidu.py --gpu_ids 0,1 --batchsize 1
./data/test_data
torch.Size([1052, 9718])
Gallery Cluster Class Number: 798
Gallery Cluster Image per Class: 22.92
Low Qualtiy Image in Query: 92
Low Qualtiy Image in Gallery: 2760
torch.Size([1052, 2048])

RuntimeError: CUDA out of memory. Tried to allocate 1.39 GiB (GPU 0; 7.80 GiB total capacity; 5.19 GiB already allocated; 113.00 MiB free; 5.59 GiB reserved in total by PyTorch)

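It says "CUDA out of memory". Could anyone please give me some guidance? How can I solve this problem?

Negative mining and triplet loss

Hello Dr. Zheng!
In the paper you mention that 50% of the images in each minibatch are used for negative mining, and that a metric learning method is used. Where in the code are negative mining and the triplet loss implemented?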

About Content Manipulation

Hi, Professor Zheng. I'm interested in the Content Manipulation operation for generating a new car. I have read your other paper, DG-Net, and run the demo of your code. I would like to use it on the vehicle dataset and want to know how to implement the same operation as the Content Manipulation. Looking forward to your reply. Thanks!

evaluate_gpu.py

@layumi Hello Dr. Zheng. Line 53 of evaluate_gpu.py, cmc[rows_good[0]:] = 1, does not look right to me. The length of cmc is 36935, which is the length of index before junk_index is removed, whereas rows_good holds positions in index after junk_index is removed. After "index = index[mask]" on line 45, index has already become shorter, 36875. So isn't "cmc[rows_good[0]:] = 1" questionable?

Visualization

How can I produce the visualization shown at the end of the paper?
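To make the length mismatch concrete, the quoted `compute_mAP` logic can be replayed on a tiny ranking in plain Python. This is a sketch under assumptions: the function name `cmc_after_junk_removal` and the toy IDs are made up for illustration, and lists stand in for the tensors used in the repository.

```python
def cmc_after_junk_removal(index, good_index, junk_index):
    """Replay the quoted logic: cmc is sized before junk removal,
    while rank positions are taken in the junk-filtered ranking."""
    cmc = [0] * len(index)
    if len(good_index) == 0:
        cmc[0] = -1  # same sentinel as in the quoted code
        return cmc, list(index)
    junk = set(junk_index)
    good = set(good_index)
    filtered = [g for g in index if g not in junk]
    rows_good = [r for r, g in enumerate(filtered) if g in good]
    for k in range(rows_good[0], len(cmc)):
        cmc[k] = 1  # hit at this rank and every later rank
    return cmc, filtered


# Ranking of four gallery items; item 7 is junk, item 9 is the correct match.
cmc, filtered = cmc_after_junk_removal([7, 3, 9, 5], good_index=[9], junk_index=[7])
print(cmc, filtered)
```

In this toy case the filtered ranking has three entries while cmc has four. The first hit is placed at its position in the filtered ranking, and the surplus entries at the tail are all 1, so under this reading the extra length would not change Rank@k for any k within the filtered ranking.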

AICity2020 data

Hello, where can I download the AICity 2020 data? Do you have a Baidu Cloud link? Thank you.

how to generate heatmap like the example img?

submit_result_multimmodel.py

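Hello, at lines 46-47:
parser.add_argument('--test_dir', default='../data/test_data', type=str, help='./test_data')
parser.add_argument('--crop_dir', default='../data/cropped_aicity', type=str, help='./test_data')
I could not find any code that uses these.

What Python and PyTorch versions does this code use, and what are the versions of the other important packages? Please ignore the first comment.

As in the title: see position "1" in the screenshot. What is the val_2020.txt file? Is it all_trainval_pids.txt, obtained by merging the two files real_trainval_list and syn_trainval_list generated by following https://github.com/PaddlePaddle/Research/tree/master/CV/PaddleReid/process_aicity_data?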

ff = torch.FloatTensor(n,512).zero_().cuda()

Hi @layumi
sorry for bothering you. May I ask you again?

def extract_feature(*args):  # arguments elided in the original question
    # ...
    for data in tqdm(dataloaders):
        # ...
        ff = torch.FloatTensor(n,512).zero_().cuda()

Where did 512 come from?
I am curious why you set this parameter to 512.

train_2020.py

Hello Dr. Zheng and everyone, I have a problem with the code: before even one epoch has finished, the accuracy is 1 and the loss is essentially 0.
The settings are: --name SE_imbalance_s1_384_p0.5_lr2_mt_d0_b24+v+aug --warm_epoch 5 --droprate 0 --stride 1 --erasing_p 0.5 --autoaug --inputsize 384 --lr 0.02 --use_SE --gpu_ids 0 --train_virtual --batchsize 8

Here is the code:

from __future__ import print_function, division

import argparse
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torch.autograd import Variable
from torchvision import datasets, transforms
import torch.backends.cudnn as cudnn
import matplotlib

matplotlib.use('agg')
import matplotlib.pyplot as plt

from PIL import Image

import time
import os
from losses import AngleLoss, ArcLoss
from model import (ft_net, ft_net_dense, ft_net_EF4, ft_net_EF5, ft_net_EF6, ft_net_IR, ft_net_NAS, ft_net_SE,
                   ft_net_DSE, PCB, CPB, ft_net_angle, ft_net_arc)
from random_erasing import RandomErasing
import yaml
from AugFolder import AugFolder
from shutil import copyfile
import random
from autoaugment import ImageNetPolicy
from utils import get_model_list, load_network, save_network, make_weights_for_balanced_classes

version = torch.__version__

# fp16
try:
    from apex.fp16_utils import *
    from apex import amp, optimizers
except ImportError:  # will be 3.x series
    print(
        'This is not an error. If you want to use low precision, i.e., fp16, please install the apex with cuda support (https://github.com/NVIDIA/apex) and update pytorch to 1.0')

# make the output
if not os.path.isdir('/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/outputs'):
    os.mkdir('/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/outputs')
######################################################################
# Options
# --------

parser = argparse.ArgumentParser(description='Training')
parser.add_argument('--gpu_ids', default='0', type=str, help='gpu_ids: e.g. 0 0,1,2 0,2')
parser.add_argument('--adam', action='store_true', help='use the Adam optimizer instead of SGD')
parser.add_argument('--name', default='ft_ResNet50', type=str, help='output model name')
parser.add_argument('--init_name', default='imagenet', type=str, help='initial with ImageNet')
parser.add_argument('--data_dir', default='/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/pytorch2020',
                    type=str, help='training dir path')
parser.add_argument('--train_all', action='store_true', help='use all training data')
parser.add_argument('--train_veri', action='store_true', help='use part training data + veri')
parser.add_argument('--train_virtual', action='store_true', help='use part training data + virtual')
parser.add_argument('--train_comp', action='store_true', help='use part training data + comp')
parser.add_argument('--train_pku', action='store_true', help='use part training data + pku')
parser.add_argument('--train_comp_veri', action='store_true', help='use part training data + comp +veri')
parser.add_argument('--train_milktea', action='store_true', help='use part training data + com + veri+pku')
parser.add_argument('--color_jitter', action='store_true', help='use color jitter in training')
parser.add_argument('--batchsize', default=32, type=int, help='batchsize')
parser.add_argument('--inputsize', default=299, type=int, help='input image size')
parser.add_argument('--h', default=299, type=int, help='height')
parser.add_argument('--w', default=299, type=int, help='width')
parser.add_argument('--stride', default=2, type=int, help='stride')
parser.add_argument('--pool', default='avg', type=str, help='last pool')
parser.add_argument('--autoaug', action='store_true', help='use Color Data Augmentation')
parser.add_argument('--erasing_p', default=0, type=float, help='Random Erasing probability, in [0,1]')
parser.add_argument('--use_dense', action='store_true', help='use densenet121')
parser.add_argument('--use_NAS', action='store_true', help='use nasnetalarge')
parser.add_argument('--use_SE', action='store_true', help='use se_resnext101_32x4d')
parser.add_argument('--use_DSE', action='store_true', help='use senet154')
parser.add_argument('--use_IR', action='store_true', help='use InceptionResNetv2')
parser.add_argument('--use_EF4', action='store_true', help='use EF4')
parser.add_argument('--use_EF5', action='store_true', help='use EF5')
parser.add_argument('--use_EF6', action='store_true', help='use EF6')
parser.add_argument('--lr', default=0.05, type=float, help='learning rate')
parser.add_argument('--droprate', default=0.5, type=float, help='drop rate')
parser.add_argument('--PCB', action='store_true', help='use PCB+ResNet50')
parser.add_argument('--CPB', action='store_true', help='use Center+ResNet50')
parser.add_argument('--fp16', action='store_true',
                    help='use float16 instead of float32, which will save about 50% memory')
parser.add_argument('--balance', action='store_true', help='balance sample')
parser.add_argument('--angle', action='store_true', help='use angle loss')
parser.add_argument('--arc', action='store_true', help='use arc loss')
parser.add_argument('--warm_epoch', default=0, type=int, help='the first K epoch that needs warm up')
parser.add_argument('--resume', action='store_true', help='resume training from a saved model')
opt = parser.parse_args()

if opt.resume:
    model, opt, start_epoch = load_network(opt.name, opt)
else:
    start_epoch = 0

print(start_epoch)

fp16 = opt.fp16
data_dir = opt.data_dir
name = opt.name

if not opt.resume:
    str_ids = opt.gpu_ids.split(',')
    gpu_ids = []
    for str_id in str_ids:
        gid = int(str_id)
        if gid >= 0:
            gpu_ids.append(gid)
    opt.gpu_ids = gpu_ids

# set gpu ids
if len(opt.gpu_ids) > 0:
    cudnn.enabled = True
    cudnn.benchmark = True
######################################################################
# Load Data
# ---------

if opt.h == opt.w:
    transform_train_list = [
        # transforms.RandomRotation(30),
        transforms.Resize((opt.inputsize, opt.inputsize), interpolation=3),
        transforms.Pad(15),
        # transforms.RandomCrop((256,256)),
        transforms.RandomResizedCrop(size=opt.inputsize, scale=(0.75, 1.0), ratio=(0.75, 1.3333), interpolation=3),  # Image.BICUBIC
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]

    transform_val_list = [
        transforms.Resize(size=opt.inputsize, interpolation=3),  # Image.BICUBIC
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]

else:
    transform_train_list = [
        # transforms.RandomRotation(30),
        transforms.Resize((opt.h, opt.w), interpolation=3),
        transforms.Pad(15),
        # transforms.RandomCrop((256,256)),
        transforms.RandomResizedCrop(size=(opt.h, opt.w), scale=(0.75, 1.0), ratio=(0.75, 1.3333), interpolation=3),  # Image.BICUBIC
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]

    transform_val_list = [
        transforms.Resize((opt.h, opt.w), interpolation=3),  # Image.BICUBIC
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]

if opt.PCB:
    transform_train_list = [
        transforms.Resize((384, 192), interpolation=3),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]
    transform_val_list = [
        transforms.Resize(size=(384, 192), interpolation=3),  # Image.BICUBIC
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]

if opt.erasing_p > 0:
    transform_train_list = transform_train_list + [RandomErasing(probability=opt.erasing_p, mean=[0.0, 0.0, 0.0])]

if opt.color_jitter:
    transform_train_list = [transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1,
                                                   hue=0)] + transform_train_list

transform_train_list_aug = [ImageNetPolicy()] + transform_train_list

print(transform_train_list)
data_transforms = {
    'train': transforms.Compose(transform_train_list),
    'train_aug': transforms.Compose(transform_train_list_aug),
    'val': transforms.Compose(transform_val_list),
}

train_all = ''
if opt.train_all:
    train_all = '_all'

if opt.train_veri:
    train_all = '+veri'

if opt.train_comp:
    train_all = '+comp'

if opt.train_virtual:
    train_all = '+virtual'

if opt.train_pku:
    train_all = '+pku'

if opt.train_comp_veri:
    train_all = '+comp+veri'

if opt.train_milktea:
    train_all = '+comp+veri+pku'

image_datasets = {}

if not opt.autoaug:
    image_datasets['train'] = datasets.ImageFolder(os.path.join(data_dir, 'train' + train_all),
                                                   data_transforms['train'])
else:
    image_datasets['train'] = AugFolder(os.path.join(data_dir, 'train' + train_all),
                                        data_transforms['train'], data_transforms['train_aug'])

if opt.balance:
    dataset_train = image_datasets['train']
    weights = make_weights_for_balanced_classes(dataset_train.imgs, len(dataset_train.classes))
    weights = torch.DoubleTensor(weights)
    sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, len(weights))
    dataloaders = {}
    dataloaders['train'] = torch.utils.data.DataLoader(image_datasets['train'], batch_size=opt.batchsize,
                                                       sampler=sampler, num_workers=8,
                                                       pin_memory=True)  # 8 workers may work faster
else:
    dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=opt.batchsize,
                                                  shuffle=True, num_workers=8, pin_memory=True)
                   # 8 workers may work faster
                   for x in ['train']}

dataset_sizes = {x: len(image_datasets[x]) for x in ['train']}
class_names = image_datasets['train'].classes

use_gpu = torch.cuda.is_available()

since = time.time()

inputs, classes = next(iter(dataloaders['train']))

print(time.time()-since)

######################################################################
# Training the model
# ------------------
#
# Now, let's write a general function to train a model. Here, we will
# illustrate:
#
# - Scheduling the learning rate
# - Saving the best model
#
# In the following, parameter `scheduler` is an LR scheduler object from
# `torch.optim.lr_scheduler`.

y_loss = {}  # loss history
y_loss['train'] = []
y_loss['val'] = []
y_err = {}
y_err['train'] = []
y_err['val'] = []
def train_model(model, criterion, optimizer, scheduler, start_epoch=0, num_epochs=25):
    since = time.time()

    warm_up = 0.1  # We start from the 0.1*lrRate
    gamma = 0.0  # auto_aug
    warm_iteration = round(dataset_sizes['train'] / opt.batchsize) * opt.warm_epoch  # first 5 epoch
    total_iteration = round(dataset_sizes['train'] / opt.batchsize) * num_epochs

    best_model_wts = model.state_dict()
    best_loss = 9999
    best_epoch = 0

    for epoch in range(num_epochs - start_epoch):
        epoch = epoch + start_epoch
        print('gamma: %.4f' % gamma)
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 50)

        # Each epoch has a training and validation phase
        for phase in ['train']:
            if phase == 'train':
                scheduler.step()
                model.train(True)  # Set model to training mode
            else:
                model.train(False)  # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0.0
            global_step = 0
            iterate_num = 0
            # Iterate over data.
            for data in dataloaders[phase]:
                # get the inputs
                if opt.autoaug:
                    inputs, inputs2, labels = data
                    if random.uniform(0, 1) > gamma:
                        inputs = inputs2
                    gamma = min(1.0, gamma + 1.0 / total_iteration)
                else:
                    inputs, labels = data

                now_batch_size, c, h, w = inputs.shape
                if now_batch_size < opt.batchsize:  # skip the last batch
                    continue
                # print(inputs.shape)
                # wrap them in Variable
                if use_gpu:
                    inputs = Variable(inputs.cuda().detach())
                    labels = Variable(labels.cuda().detach())
                else:
                    inputs, labels = Variable(inputs), Variable(labels)
                # if we use low precision, input also need to be fp16
                # if fp16:
                #    inputs = inputs.half()

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                if phase == 'val':
                    with torch.no_grad():
                        outputs = model(inputs)
                else:
                    outputs = model(inputs)

                if opt.PCB:
                    part = {}
                    sm = nn.Softmax(dim=1)
                    num_part = 6
                    for i in range(num_part):
                        part[i] = outputs[i]

                    score = sm(part[0]) + sm(part[1]) + sm(part[2]) + sm(part[3]) + sm(part[4]) + sm(part[5])
                    _, preds = torch.max(score.data, 1)

                    loss = criterion(part[0], labels)
                    for i in range(num_part - 1):
                        loss += criterion(part[i + 1], labels)
                elif opt.CPB:
                    part = {}
                    sm = nn.Softmax(dim=1)
                    num_part = 4
                    for i in range(num_part):
                        part[i] = outputs[i]

                    score = sm(part[0]) + sm(part[1]) + sm(part[2]) + sm(part[3])
                    _, preds = torch.max(score.data, 1)

                    loss = criterion(part[0], labels)
                    for i in range(num_part - 1):
                        loss += criterion(part[i + 1], labels)
                else:
                    loss = criterion(outputs, labels)
                    if opt.angle or opt.arc:
                        outputs = outputs[0]
                    _, preds = torch.max(outputs.data, 1)

                # scale the loss during the warm-up epochs
                if epoch < opt.warm_epoch and phase == 'train':
                    warm_up = min(1.0, warm_up + 0.9 / warm_iteration)
                    loss *= warm_up

                # backward + optimize only if in training phase
                if phase == 'train':
                    if fp16:  # we use optimier to backward loss
                        with amp.scale_loss(loss, optimizer) as scaled_loss:
                            scaled_loss.backward()
                    else:
                        loss.backward()
                    optimizer.step()
                    global_step += 1
                    iterate_num += now_batch_size
                print('Epoch: [{0}] [{1} / {2}]\t'
                      'Global_step: {3:}\t'
                      'Loss: {4:.3f}\t'
                      'Accurcy: {5:.3f}\t'.format(epoch, iterate_num, dataset_sizes[phase], global_step, loss.item(),
                                                  float(torch.sum(preds == labels.data)) / now_batch_size))
                # print('Epoch:%d Iteration:%d Total:%d Global_step:%d loss:%.2f accuracy:%.2f' % (
                # epoch, iterate_num, dataset_sizes[phase], global_step, loss.item(), float(torch.sum(preds == labels.data)) / now_batch_size))
                # statistics
                if int(version[0]) > 0 or int(version[2]) > 3:  # for the new version like 0.4.0, 0.5.0 and 1.0.0
                    running_loss += loss.item() * now_batch_size
                else:  # for the old version like 0.3.0 and 0.3.1
                    running_loss += loss.data[0] * now_batch_size
                running_corrects += float(torch.sum(preds == labels.data))

                del (loss, outputs, inputs, preds)

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            y_loss[phase].append(epoch_loss)
            y_err[phase].append(1.0 - epoch_acc)
            # deep copy the model
            if len(opt.gpu_ids) > 1:
                save_network(model.module, opt.name, epoch + 1)
            else:
                save_network(model, opt.name, epoch + 1)
            draw_curve(epoch)

        time_elapsed = time.time() - since
        print('Training complete in {:.0f}m {:.0f}s'.format(
            time_elapsed // 60, time_elapsed % 60))
        print()
        if epoch_loss < best_loss:
            best_loss = epoch_loss
            best_epoch = epoch
            last_model_wts = model.state_dict()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best epoch: {:d} Best Train Loss: {:4f}'.format(best_epoch, best_loss))

    # load best model weights
    model.load_state_dict(last_model_wts)
    save_network(model, opt.name, 'last')
    return model

######################################################################
# Draw Curve
# ---------------------------

x_epoch = []
fig = plt.figure()
ax0 = fig.add_subplot(121, title="loss")
ax1 = fig.add_subplot(122, title="top1err")

def draw_curve(current_epoch):
    x_epoch.append(current_epoch)
    ax0.plot(x_epoch, y_loss['train'], 'bo-', label='train')
    # ax0.plot(x_epoch, y_loss['val'], 'ro-', label='val')
    ax1.plot(x_epoch, y_err['train'], 'bo-', label='train')
    # ax1.plot(x_epoch, y_err['val'], 'ro-', label='val')
    if current_epoch == 0:
        ax0.legend()
        ax1.legend()
    fig.savefig(os.path.join('/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/outputs', name, 'train.png'))

######################################################################
# Finetuning the convnet
# ----------------------
#
# Load a pretrained model and reset final fully connected layer.

if not opt.resume:
    opt.nclasses = len(class_names)
    if opt.use_dense:
        model = ft_net_dense(len(class_names), opt.droprate, opt.stride, None, opt.pool)
    elif opt.use_NAS:
        model = ft_net_NAS(len(class_names), opt.droprate, opt.stride)
    elif opt.use_SE:
        model = ft_net_SE(len(class_names), opt.droprate, opt.stride, opt.pool)
    elif opt.use_DSE:
        model = ft_net_DSE(len(class_names), opt.droprate, opt.stride, opt.pool)
    elif opt.use_IR:
        model = ft_net_IR(len(class_names), opt.droprate, opt.stride)
    elif opt.use_EF4:
        model = ft_net_EF4(len(class_names), opt.droprate)
    elif opt.use_EF5:
        model = ft_net_EF5(len(class_names), opt.droprate)
    elif opt.use_EF6:
        model = ft_net_EF6(len(class_names), opt.droprate)
    else:
        model = ft_net(len(class_names), opt.droprate, opt.stride, None, opt.pool)

    if opt.PCB:
        model = PCB(len(class_names))

    if opt.CPB:
        model = CPB(len(class_names))

    if opt.angle:
        model = ft_net_angle(len(class_names), opt.droprate, opt.stride)
    elif opt.arc:
        model = ft_net_arc(len(class_names), opt.droprate, opt.stride)

if opt.init_name != 'imagenet':
    old_opt = parser.parse_args()
    init_model, old_opt, _ = load_network(opt.init_name, old_opt)
    print(old_opt)
    opt.stride = old_opt.stride
    opt.pool = old_opt.pool
    opt.use_dense = old_opt.use_dense
    if opt.use_dense:
        model = ft_net_dense(opt.nclasses, droprate=opt.droprate, stride=opt.stride, init_model=init_model,
                             pool=opt.pool)
    else:
        model = ft_net(opt.nclasses, droprate=opt.droprate, stride=opt.stride, init_model=init_model, pool=opt.pool)

##########################

Put model parameter in front of the optimizer!!!

For resume:

if start_epoch >= 60:
opt.lr = opt.lr * 0.1
if start_epoch >= 75:
opt.lr = opt.lr * 0.1

if len(opt.gpu_ids) > 1:
    model = torch.nn.DataParallel(model, device_ids=opt.gpu_ids).cuda()
    if not opt.CPB:
        ignored_params = list(map(id, model.module.classifier.parameters()))
        base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
        optimizer_ft = optim.SGD([
            {'params': base_params, 'lr': 0.1 * opt.lr},
            {'params': model.module.classifier.parameters(), 'lr': opt.lr}
        ], weight_decay=5e-4, momentum=0.9, nesterov=True)
    else:
        ignored_params = (list(map(id, model.module.classifier0.parameters()))
                          + list(map(id, model.module.classifier1.parameters()))
                          + list(map(id, model.module.classifier2.parameters()))
                          + list(map(id, model.module.classifier3.parameters()))
                          )
        base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
        optimizer_ft = optim.SGD([
            {'params': base_params, 'lr': 0.1 * opt.lr},
            {'params': model.module.classifier0.parameters(), 'lr': opt.lr},
            {'params': model.module.classifier1.parameters(), 'lr': opt.lr},
            {'params': model.module.classifier2.parameters(), 'lr': opt.lr},
            {'params': model.module.classifier3.parameters(), 'lr': opt.lr},
        ], weight_decay=5e-4, momentum=0.9, nesterov=True)
else:
    model = model.cuda()
    if not opt.CPB:
        ignored_params = list(map(id, model.classifier.parameters()))
        base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
        optimizer_ft = optim.SGD([
            {'params': base_params, 'lr': 0.1 * opt.lr},
            {'params': model.classifier.parameters(), 'lr': opt.lr}
        ], weight_decay=5e-4, momentum=0.9, nesterov=True)
    else:
        ignored_params = (list(map(id, model.classifier0.parameters()))
                          + list(map(id, model.classifier1.parameters()))
                          + list(map(id, model.classifier2.parameters()))
                          + list(map(id, model.classifier3.parameters()))
                          )
        base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
        optimizer_ft = optim.SGD([
            {'params': base_params, 'lr': 0.1 * opt.lr},
            {'params': model.classifier0.parameters(), 'lr': opt.lr},
            {'params': model.classifier1.parameters(), 'lr': opt.lr},
            {'params': model.classifier2.parameters(), 'lr': opt.lr},
            {'params': model.classifier3.parameters(), 'lr': opt.lr},
        ], weight_decay=5e-4, momentum=0.9, nesterov=True)

if opt.adam:
    optimizer_ft = optim.Adam(model.parameters(), opt.lr, weight_decay=5e-4)

# Decay LR by a factor of 0.1 every 40 epochs (disabled)
# exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=40, gamma=0.1)
# Active schedule: decay LR by a factor of 0.1 at epochs 60 and 75,
# with the milestones shifted by start_epoch when resuming.
exp_lr_scheduler = lr_scheduler.MultiStepLR(optimizer_ft, milestones=[60 - start_epoch, 75 - start_epoch], gamma=0.1)

######################################################################
# Train and evaluate
# ^^^^^^^^^^^^^^^^^^
#
# It should take around 1-2 hours on GPU.
#

dir_name = os.path.join('/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/outputs', name)

if not opt.resume:
    if not os.path.isdir(dir_name):
        os.mkdir(dir_name)
    # record every run
    copyfile('./train_2020.py', dir_name + '/train.py')
    copyfile('./model.py', dir_name + '/model.py')
    # save opts
    with open('%s/opts.yaml' % dir_name, 'w') as fp:
        yaml.dump(vars(opt), fp, default_flow_style=False)

# model to gpu
if fp16:
    # model = network_to_half(model)
    # optimizer_ft = FP16_Optimizer(optimizer_ft, dynamic_loss_scale=True)
    model, optimizer_ft = amp.initialize(model, optimizer_ft, opt_level="O1")

if opt.angle:
    criterion = AngleLoss()
elif opt.arc:
    criterion = ArcLoss()
else:
    criterion = nn.CrossEntropyLoss()

print(model)
model = train_model(model, criterion, optimizer_ft, exp_lr_scheduler,
                    start_epoch=start_epoch, num_epochs=80)
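
The `MultiStepLR` call in the script above subtracts `start_epoch` from both milestones, presumably because a freshly constructed scheduler starts counting from 0 when training is resumed. The following is a minimal pure-Python sketch of that arithmetic, not the script's own code. It assumes `start_epoch` is below the first milestone and counts scheduler steps rather than training iterations; the helper names `multistep_lr` and `lr_at_absolute_epoch` are hypothetical.

```python
from bisect import bisect_right


def multistep_lr(base_lr, milestones, gamma, step):
    """LR after `step` scheduler steps: one factor of gamma per milestone reached."""
    return base_lr * gamma ** bisect_right(sorted(milestones), step)


def lr_at_absolute_epoch(base_lr, start_epoch, epoch, gamma=0.1):
    # Milestones are shifted by start_epoch, mirroring the call above.
    # Assumption: start_epoch < 60, so both shifted milestones stay positive.
    milestones = [60 - start_epoch, 75 - start_epoch]
    # A scheduler built at start_epoch has taken (epoch - start_epoch) steps.
    return multistep_lr(base_lr, milestones, gamma, epoch - start_epoch)


# A fresh run and a run resumed at epoch 30 decay at the same absolute epochs.
for start in (0, 30):
    print(start, [lr_at_absolute_epoch(0.02, start, e) for e in (59, 60, 75)])
```

With the shift, a resumed run reaches the first decay at absolute epoch 60 just as a fresh run does. Without it, the decay would land `start_epoch` epochs later.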

With batch_size set to 24, one step takes 492 ms. Is this normal? I would like to know how long one step takes for the author.
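
When comparing per-step times like this, it helps to measure them the same way. Below is a rough sketch of one way to do it, assuming the training step can be wrapped in a callable. The helper `mean_step_time` and the stand-in workload `fake_step` are hypothetical and not part of the repository. Because CUDA kernels run asynchronously, a synchronization call such as `torch.cuda.synchronize` should be passed as `sync` when timing GPU work.

```python
import time


def mean_step_time(step_fn, n_warmup=3, n_steps=10, sync=None):
    """Average wall-clock seconds per call of step_fn, measured after warm-up."""
    for _ in range(n_warmup):
        step_fn()
    if sync is not None:
        sync()  # wait for queued GPU work before starting the clock
    start = time.perf_counter()
    for _ in range(n_steps):
        step_fn()
    if sync is not None:
        sync()  # wait again so the measured interval covers all the work
    return (time.perf_counter() - start) / n_steps


def fake_step():
    # Stand-in workload; replace with one forward/backward/optimizer step.
    sum(i * i for i in range(10_000))


seconds = mean_step_time(fake_step)
print('%.3f ms per step' % (seconds * 1000))
```

Step time also depends on input size, GPU model, and data loading speed, so a single number is only comparable when those match.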

--init_name pretrained model

@layumi Hi. May I ask you something again? I would like to use my own previously trained model as a pre-trained model.
Say I have already trained a model named SE_imbalance_s1_128_p0.5_lr2_mt_d0_b24+v+aug_20220701, and I would like to use it as the pre-trained model for my next training run. How can I do this?

I tried the following:
python train_2020.py --data_dir ../../data/reid_data --name SE_imbalance_s1_128_p0.5_lr2_mt_d0_b24+v+aug --warm_epoch 5 --droprate 0 --stride 1 --erasing_p 0.5 --autoaug --inputsize 128 --lr 0.02 --use_SE --gpu_ids 0,1 --train_virtual --batchsize 128 --init_name SE_imbalance_s1_128_p0.5_lr2_mt_d0_b24+v+aug_20220701

However, I got the following error:
ModuleAttributeError("'{}' object has no attribute '{}'".format(
torch.nn.modules.module.ModuleAttributeError: 'SENet' object has no attribute 'conv1'

Thanks
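
One generic way to reuse weights from an earlier run in plain PyTorch is sketched below. It does not use the repository's `--init_name` code path, whose internals are not shown in this thread, and it is not claimed to fix the error above. The helper `load_matching_weights` and the two toy networks are hypothetical stand-ins. In practice the state dict would come from `torch.load` on the saved checkpoint.

```python
import torch
import torch.nn as nn


def load_matching_weights(model, state_dict):
    """Copy only the entries whose name and shape match the target model."""
    target = model.state_dict()
    matched = {k: v for k, v in state_dict.items()
               if k in target and v.shape == target[k].shape}
    # strict=False lets the unmatched layers keep their fresh initialization.
    model.load_state_dict(matched, strict=False)
    return sorted(matched)


# Hypothetical stand-ins: same backbone layer, different number of classes.
old = nn.Sequential(nn.Linear(8, 4), nn.Linear(4, 10))
new = nn.Sequential(nn.Linear(8, 4), nn.Linear(4, 20))

loaded = load_matching_weights(new, old.state_dict())
print(loaded)
```

If the checkpoint was saved from a model wrapped in `torch.nn.DataParallel`, its keys carry a `module.` prefix and would need to be renamed before matching.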


    print(query_feature.shape)
    threshold = 0.5
    # query cluster
    nq = query_feature.shape[0]
    nf = query_feature.shape[1]
    q_q_dist = torch.mm(query_feature, torch.transpose(query_feature, 0, 1))
    q_q_dist = q_q_dist.cpu().numpy()
    q_q_dist[q_q_dist > 1] = 1  # due to the epsilon
    q_q_dist = 2 - 2 * q_q_dist
    eps = threshold
    # first cluster
    min_samples = 2
    cluster1 = DBSCAN(eps=eps, min_samples=min_samples, metric='precomputed', algorithm='auto', n_jobs=-1)
    cluster1 = cluster1.fit(q_q_dist)
    qlabels = cluster1.labels_
    nlabel_q = len(np.unique(cluster1.labels_))
    # gallery cluster
    ng = gallery_feature.shape[0]
    ### Using tracking ID
    g_g_dist = torch.ones(ng, ng).numpy()
    nlabel_g = 0
    glabels = torch.zeros(ng).numpy() - 1
    with open('data/test_track_id.txt', 'r') as f:
        for line in f:
            line = line.replace('\n', '')
            g_name = line.split(' ')
            g_name.remove('')
            g_name = list(map(int, g_name))
            for i in g_name:
                glabels[i-1] = nlabel_g
                for j in g_name:
                    g_g_dist[i-1, j-1] = 0
            nlabel_g += 1
    nimg_g = len(np.argwhere(glabels != -1))
    print('Gallery Cluster Class Number: %d' % nlabel_g)
    print('Gallery Cluster Image per Class: %.2f' % (nimg_g / nlabel_g))
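
The query clustering above turns a similarity matrix into a distance matrix with `2 - 2 * q_q_dist`. If the features are L2-normalized (an assumption here, since the normalization step is not shown in this excerpt), that expression equals the squared Euclidean distance between two features. A small pure-Python check of the identity, using made-up vectors and hypothetical helper names:

```python
import math


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Two made-up feature vectors, normalized to unit length.
a = l2_normalize([3.0, 4.0, 0.0])
b = l2_normalize([1.0, 2.0, 2.0])

# For unit vectors: |a - b|^2 = |a|^2 + |b|^2 - 2 a.b = 2 - 2 a.b
from_cosine = 2 - 2 * dot(a, b)
direct = squared_euclidean(a, b)
print(from_cosine, direct)
```

Under that assumption, `eps = 0.5` in the `DBSCAN` call would correspond to treating two query images as neighbours when their cosine similarity is at least 0.75.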

