layumi / aicity-reid-2020

:red_car: The 1st Place Submission to AICity Challenge 2020 re-id track (Baidu-UTS submission)

License: MIT License

Python 100.00%
vehicle pytorch paddlepaddle vehicle-reid veri-776 aicity cityflow cvpr2020

aicity-reid-2020's Issues

Cannot download feature extraction models

Hi,

It seems that the Google Drive URL no longer exists and the OneDrive URL requires authentication.

Could you take a look and re-upload those models somewhere?

Thanks a lot

sharing trained model

Hi, I don't have the resources to train it. It would be helpful if you could share the trained model.

Meaning of the cluster step in the code

print(query_feature.shape)
threshold = 0.5
# query cluster
nq = query_feature.shape[0]
nf = query_feature.shape[1]
q_q_dist = torch.mm(query_feature, torch.transpose(query_feature, 0, 1))
q_q_dist = q_q_dist.cpu().numpy()
q_q_dist[q_q_dist>1] = 1  # due to the epsilon
q_q_dist = 2-2*q_q_dist
eps = threshold
# first cluster
min_samples = 2
cluster1 = DBSCAN(eps=eps, min_samples=min_samples, metric='precomputed', algorithm='auto', n_jobs=-1)
cluster1 = cluster1.fit(q_q_dist)
qlabels = cluster1.labels_
nlabel_q = len(np.unique(cluster1.labels_))
# gallery cluster
ng = gallery_feature.shape[0]
### Using tracking ID
g_g_dist = torch.ones(ng,ng).numpy()
nlabel_g = 0
glabels = torch.zeros(ng).numpy() - 1
with open('data/test_track_id.txt','r') as f:
    for line in f:
        line = line.replace('\n','')
        g_name = line.split(' ')
        g_name.remove('')
        g_name = list(map(int, g_name))
        for i in g_name:
            glabels[i-1] = nlabel_g
            for j in g_name:
                g_g_dist[i-1,j-1] = 0
        nlabel_g += 1
nimg_g = len(np.argwhere(glabels!=-1))
print('Gallery Cluster Class Number: %d'%nlabel_g)
print('Gallery Cluster Image per Class: %.2f'%(nimg_g/nlabel_g))

Dr. Zheng, could you explain what this part of the code is doing? Thanks!
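A minimal sketch of the two ideas the quoted snippet appears to rely on, assuming the features are L2-normalized (the snippet itself does not show that step). For unit vectors, 2 - 2 * dot(a, b) equals the squared Euclidean distance, which is what gets passed to DBSCAN as a precomputed distance. For the gallery, images listed in the same track get distance 0 and everything else stays at 1. The helper names below are hypothetical and plain Python is used instead of torch/numpy.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def squared_euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


a = l2_normalize([1.0, 2.0, 2.0])
b = l2_normalize([2.0, 1.0, 2.0])

cosine = min(dot(a, b), 1.0)   # clamp at 1, as the quoted code does, to absorb rounding
distance = 2 - 2 * cosine      # the kind of value stored in q_q_dist
assert abs(distance - squared_euclidean(a, b)) < 1e-12


def track_distance_matrix(num_images, tracks):
    """Distance 0 inside a track, 1 everywhere else (image ids are 1-based)."""
    dist = [[1.0] * num_images for _ in range(num_images)]
    labels = [-1] * num_images          # -1 means "not in any track"
    for label, track in enumerate(tracks):
        for i in track:
            labels[i - 1] = label
            for j in track:
                dist[i - 1][j - 1] = 0.0
    return dist, labels


# Hypothetical track file content: one track with images 1 and 3, one with image 2.
dist, labels = track_distance_matrix(4, [[1, 3], [2]])
```

Under these assumptions, the query side groups images whose squared distance is below the 0.5 threshold, while the gallery side takes its groups directly from the tracking IDs rather than from feature distances.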

Merging the features of 13 models

Hello Dr. Zheng, fast_submit.py merges query_feature0 through query_feature12 into the final feature. Is feature0 the only one that comes from the PyTorch side?

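prepare_2020.py is hard to understand

I cannot see where the AI City 2020 data is fed in for processing, and there are no comments anywhere. Which parameter is the input data?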

USE of UNIT (cycle GAN)

I was going through your research paper and reading the code in parallel, but I could not find how you used UNIT (cycle GAN) to perform style transfer and then content manipulation.
Could you help us understand the flow of your code, or provide a link to where you trained on the synthetic images with CycleGAN?

Copy&Paste method

Hello,

Is the Copy&Paste technique implemented in the code of this repo? If so, in which file is it implemented?
Thank you in advance.

python train_2020.py

@layumi Dr. Zheng, running python train_2020.py directly starts training correctly, but running the long command given in README.md produces the error below. Why is that?

requests.exceptions.SSLError: HTTPSConnectionPool(host='data.lip6.fr', port=443): Max retries exceeded with url: /cadene/pretrainedmodels/se_resnext101_32x4d-3b2fe3d8.pth (Caused by SSLError(SSLError("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",),))

test_2020.py

At test time, shouldn't the dataset size be the number of all real images? That should be 56277 images, and for "image_query/": This dir contains 1052 images as queries.
But the result shows:
Gallery Size: 1477400
Query Size: 465

train_2020.py problem

FileNotFoundError: [Errno 2] No such file or directory: './data/pytorch2020/train+virtual'

Hello, I ran train_2020.py as you described, but the error above occurs. What is the train+virtual folder for?

Vehicle reid based on data fusion

Hello, Dr. Zheng.

Thank you for your excellent work. I have a question for you. In vehicle re-identification, if a three-dimensional point cloud and a two-dimensional image are used together, can the accuracy of vehicle recognition be improved?

Thank you very much!

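Question about applying VeRi-776

Hello Dr. Zheng. I am trying to run stage_two training, i.e. fine-tuning on VeRi, starting from the model you provided, Res50_imbalance_s1_256_p0.5_lr2_mt_d0_b48 (TMM). I hope to end up with a model whose mAP is close to that of the ft_Res50_imbalance_s1_256_p0.5_lr1_mt_d0.2_b48_w5 (TMM) model you provided.

However, during training I found that the vehicle IDs in the VeRi dataset go above 700, while the output of the final classifier layer of ft_Res50_imbalance_s1_256_p0.5_lr1_mt_d0.2_b48_w5 (TMM) is 576.

Did you use only part of VeRi for fine-tuning? If so, how did you split it?

Below are my test results after training with nclass=769 (all vehicles in the VeRi training set) and stage_two= 12:
Rank@1:0.846246 Rank@5:0.937425 Rank@10:0.967819 mAP:0.547631

That is still well short of the mAP=83.4% of the model you provided.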

Training problem

Hello Dr. Zheng, when running python train_2020.py... I get "FileNotFoundError: [Errno 2] No such file or directory: './train.py'". There is no train.py in the pytorch folder. What is the cause?

Models

Which 12 models were used in the final ensemble?

pretrainedmodels.py ?

I was looking into your research and tried to run train_2020.py, but it says that it can't import pretrainedmodels.
What should I do?
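A quick way to check the environment, as a sketch only. It assumes the missing module is the PyPI package named pretrainedmodels (the download URL in the SSL error quoted in another issue on this page contains a cadene/pretrainedmodels path, which suggests so). The helper name is_installed is made up for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("pretrainedmodels"):
    # Assumption: the dependency is the PyPI package of the same name.
    print("pretrainedmodels is missing; try: pip install pretrainedmodels")
```

If the check passes but the import in train_2020.py still fails, the script is probably running under a different Python interpreter than the one the package was installed into.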

It seems your code does not match your description of your workshop paper.

I have read your paper and code roughly, but have a few questions.
Your code uses SE-ResNeXt101, but the paper says ResNeXt101; your code does not use Block3; your learning rate is 0.02 instead of the 0.01 in the paper; the LR scheduler is MultiStepLR instead of cosine; the feature dim is 4092 instead of 1024. May I ask why?

cropped_aicity folder order???

I am confused about how to provide the cropped_aicity folder.
Should I give the cropped images folder the same structure as test_data?
My test_data structure is: test_data/gallery/im_test and test_data/query/im_query

When I run fast_submit_baidu.py I am getting the following error:
RuntimeError: CUDA out of memory. Tried to allocate 1.39 GiB (GPU 0; 7.80 GiB total capacity; 5.19 GiB already allocated; 174.31 MiB free; 5.59 GiB reserved in total by PyTorch)

@layumi Could you please give me some guidance about how to prepare test_dir and crop_dir to run fast_submit_baidu.py?

Thanks again for your great job.

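train_2020.py fails with: Couldn't find any class folder in ./data/pytorch2020/train

Hello Dr. Zheng, I put the image data produced by prepare_2020.py into the folder marked with red box 1 in the screenshot, but when I run train_2020.py it reports that no class can be found under ./data/pytorch2020/train. What is going on? Dr. Zheng, could the README spell out what every file path in the code is, and which data needs to go under which path?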
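For context, torchvision's ImageFolder expects one sub-folder per class directly under the root it is given, and raises this "Couldn't find any class folder" error when the root contains no sub-folders. The sketch below is a hypothetical helper (not part of the repo) for checking a directory before training; the demo builds a throwaway directory rather than touching ./data/pytorch2020/train.

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def summarize_class_folders(root):
    """Map each class sub-folder of `root` to the number of image files in it."""
    summary = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            images = [f for f in os.listdir(path)
                      if f.lower().endswith(IMAGE_EXTENSIONS)]
            summary[entry] = len(images)
    return summary


# Tiny self-contained demo with a temporary directory.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "0001"))
    open(os.path.join(root, "0001", "a.jpg"), "w").close()
    open(os.path.join(root, "loose.jpg"), "w").close()  # not inside a class folder
    demo = summarize_class_folders(root)

print(demo)  # {'0001': 1}
```

An empty result for the real training directory would mean the images sit directly in the folder, or one level too deep, instead of in per-ID sub-folders.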

mAP calculation

Hello @layumi
I didn't quite understand your mAP calculation.
Is there any explanation you could share with me?
Why do you add old_precision, given that every AP term is already being summed up?
I also don't see why we should divide by 2 before summing (here: ap = ap + d_recall*(old_precision + precision)/2)

def compute_mAP(index, good_index, junk_index):
    ap = 0
    cmc = torch.IntTensor(len(index)).zero_()
    if good_index.size==0:   # if empty
        cmc[0] = -1
        return ap,cmc

    # remove junk_index
    mask = np.in1d(index, junk_index, invert=True)
    index = index[mask]

    # find good_index index
    ngood = len(good_index)
    mask = np.in1d(index, good_index)
    rows_good = np.argwhere(mask==True)
    rows_good = rows_good.flatten()
    
    cmc[rows_good[0]:] = 1
    for i in range(ngood):
        d_recall = 1.0/ngood
        precision = (i+1)*1.0/(rows_good[i]+1)
        if rows_good[i]!=0:
            old_precision = i*1.0/rows_good[i]
        else:
            old_precision=1.0
        ap = ap + d_recall*(old_precision + precision)/2

    return ap, cmc
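One way to read the quoted function, offered as an interpretation rather than the author's explanation: each good match adds one trapezoid to the area under the precision-recall curve. The trapezoid has width d_recall = 1/ngood, and its height is the average of the precision just before the match (old_precision) and the precision at the match, which is where the division by 2 comes from. Below is a plain-Python sketch of the same arithmetic; the function name and the boolean-list input are made up for illustration.

```python
def average_precision(is_good):
    """AP for one query.

    `is_good[k]` tells whether the k-th ranked result (junk already removed)
    is a true match. Mirrors the trapezoid arithmetic of the quoted code.
    """
    rows_good = [k for k, good in enumerate(is_good) if good]
    ngood = len(rows_good)
    if ngood == 0:
        return 0.0
    ap = 0.0
    d_recall = 1.0 / ngood                      # width of each trapezoid
    for i, row in enumerate(rows_good):
        precision = (i + 1) / (row + 1)         # precision at this match
        old_precision = i / row if row != 0 else 1.0  # precision just before it
        ap += d_recall * (old_precision + precision) / 2
    return ap


# True matches at ranks 1 and 3 out of 4 results.
ap = average_precision([True, False, True, False])
```

For this example the two trapezoids contribute 1/2 and 7/24, so the AP is 19/24; a perfect ranking gives exactly 1.0.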

Getting the data

Hello, how can I obtain the data? The official website requires a password to download it. Could you share it?

About pre-trained model

Thanks for your work! Since we cannot get the data, could you share your pre-trained model for testing? If not, why not?

Missing pkl features

Could you please provide all the necessary features in order to run fast_submit_baidu.py?
For example the "query_fea_ResNeXt101_vd_64x4d_cos_alldata_final.pkl" is missing

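Dataset question

Is the dataset used for training and testing Track 2 from the official AICity20 website? I downloaded the archive from the official site, but after extraction it is not in the following format: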
|- 2020AICITY
|- ...
|- 000345_c020_9.jpg
|- ...
|- 002028_c036_4_9_95_2.jpg
Do I need to reorganize the official dataset?

camera-aware model

Hello @layumi @miraclebiu. Thank you once more for your work.
I have a question about training the camera-aware model.
The README says nothing about it. Could you please give me some guidance if possible?
I would like to know how to train and how to implement it.

Thanks a lot

CUDA out of memory

I am running fast_submit_baidu.py with --batchsize 1, but I am still getting an error.

python fast_submit_baidu.py --gpu_ids 0,1 --batchsize 1
./data/test_data
torch.Size([1052, 9718])
Gallery Cluster Class Number: 798
Gallery Cluster Image per Class: 22.92
Low Qualtiy Image in Query: 92
Low Qualtiy Image in Gallery: 2760
torch.Size([1052, 2048])

RuntimeError: CUDA out of memory. Tried to allocate 1.39 GiB (GPU 0; 7.80 GiB total capacity; 5.19 GiB already allocated; 113.00 MiB free; 5.59 GiB reserved in total by PyTorch)

It says CUDA out of memory. Could anyone please give me some guidance? How can I solve this problem?

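Negative mining and triplet loss

Hello Dr. Zheng!
In the paper you mention that 50% of the images in each minibatch are used for negative mining, and that a metric learning method is used. Where in the code are negative mining and triplet loss implemented?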

About Content Manipulation

Hi, Professor Zheng. I'm interested in the Content Manipulation operation for generating a new car. I have read your other paper, DG-Net, and run the demo of your code. I want to use it on a vehicle dataset, so I would like to know how to implement an operation equivalent to Content Manipulation. Looking forward to your reply, thanks!

evaluate_gpu.py

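@layumi Hello Dr. Zheng. Line 53 of evaluate_gpu.py, cmc[rows_good[0]:] = 1, does not look right to me. The length of cmc is 36935, which is the length of index before junk_index is removed, whereas the positions in rows_good refer to index after junk_index is removed. After "index = index[mask]" on line 45, index has become shorter: 36875. So isn't "cmc[rows_good[0]:] = 1" inappropriate?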
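A small sketch of the situation described, as one possible reading and not the author's answer. CMC is cumulative: once the first true match appears at some rank in the junk-free list, every later rank is a hit as well. The entries of cmc beyond the junk-free length therefore all come out as 1, and the first len(filtered) entries are what the ranking actually determines. The function below is hypothetical, uses plain lists instead of tensors, and assumes at least one good match exists (the quoted compute_mAP returns early otherwise).

```python
def cmc_after_junk_removal(index, good, junk):
    """Rebuild the cmc vector the way the quoted code sizes and fills it."""
    cmc = [0] * len(index)                          # sized BEFORE junk removal
    filtered = [i for i in index if i not in junk]  # index = index[mask]
    rows_good = [k for k, i in enumerate(filtered) if i in good]
    for k in range(rows_good[0], len(cmc)):         # cmc[rows_good[0]:] = 1
        cmc[k] = 1
    return cmc, len(filtered)


# Ranked gallery ids; id 3 is junk and id 5 is the only true match.
cmc, n_filtered = cmc_after_junk_removal(index=[7, 3, 9, 5, 1], good={5}, junk={3})
```

Here the junk-free list is [7, 9, 5, 1], the first match sits at position 2, and cmc becomes [0, 0, 1, 1, 1]; the last entry lies past the junk-free length, but it is 1 regardless.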

Visualization

How can I produce the visualization shown at the end of the paper?

AICity2020 data

Hello, where can I download the AICity 2020 data? Do you have a Baidu Cloud link? Thank you.

submit_result_multimmodel.py

Hello, on lines 46-47:
parser.add_argument('--test_dir', default='../data/test_data', type=str, help='./test_data')
parser.add_argument('--crop_dir', default='../data/cropped_aicity', type=str, help='./test_data')
I could not find any code that uses these.

ff = torch.FloatTensor(n,512).zero_().cuda()

Hi @layumi
sorry for bothering you. May I ask you again?

def extract_feature(/*...*/):    
   // ...
    for data in tqdm(dataloaders):
        //...
        ff = torch.FloatTensor(n,512).zero_().cuda()

Where did 512 come from?
I am curious why you set this parameter to 512.

train_2020.py

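Hello Dr. Zheng and everyone, I have a problem with the code: before even one epoch has finished, the accuracy is 1 and the loss is essentially 0.
The options are: --name SE_imbalance_s1_384_p0.5_lr2_mt_d0_b24+v+aug --warm_epoch 5 --droprate 0 --stride 1 --erasing_p 0.5 --autoaug --inputsize 384 --lr 0.02 --use_SE --gpu_ids 0 --train_virtual --batchsize 8

Here is the code: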

from __future__ import print_function, division

import argparse
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torch.autograd import Variable
from torchvision import datasets, transforms
import torch.backends.cudnn as cudnn
import matplotlib

matplotlib.use('agg')
import matplotlib.pyplot as plt

from PIL import Image

import time
import os
from losses import AngleLoss, ArcLoss
from model import ft_net, ft_net_dense, ft_net_EF4, ft_net_EF5, ft_net_EF6, ft_net_IR, ft_net_NAS, ft_net_SE, \
    ft_net_DSE, PCB, CPB, ft_net_angle, ft_net_arc
from random_erasing import RandomErasing
import yaml
from AugFolder import AugFolder
from shutil import copyfile
import random
from autoaugment import ImageNetPolicy
from utils import get_model_list, load_network, save_network, make_weights_for_balanced_classes

version = torch.__version__

# fp16
try:
    from apex.fp16_utils import *
    from apex import amp, optimizers
except ImportError:  # will be 3.x series
    print(
        'This is not an error. If you want to use low precision, i.e., fp16, please install the apex with cuda support (https://github.com/NVIDIA/apex) and update pytorch to 1.0')

# make the output
if not os.path.isdir('/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/outputs'):
    os.mkdir('/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/outputs')
######################################################################

# Options
# --------

parser = argparse.ArgumentParser(description='Training')
parser.add_argument('--gpu_ids', default='0', type=str, help='gpu_ids: e.g. 0 0,1,2 0,2')
parser.add_argument('--adam', action='store_true', help='use all training data')
parser.add_argument('--name', default='ft_ResNet50', type=str, help='output model name')
parser.add_argument('--init_name', default='imagenet', type=str, help='initial with ImageNet')
parser.add_argument('--data_dir', default='/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/pytorch2020',
type=str, help='training dir path')
parser.add_argument('--train_all', action='store_true', help='use all training data')
parser.add_argument('--train_veri', action='store_true', help='use part training data + veri')
parser.add_argument('--train_virtual', action='store_true', help='use part training data + virtual')
parser.add_argument('--train_comp', action='store_true', help='use part training data + comp')
parser.add_argument('--train_pku', action='store_true', help='use part training data + pku')
parser.add_argument('--train_comp_veri', action='store_true', help='use part training data + comp +veri')
parser.add_argument('--train_milktea', action='store_true', help='use part training data + com + veri+pku')
parser.add_argument('--color_jitter', action='store_true', help='use color jitter in training')
parser.add_argument('--batchsize', default=32, type=int, help='batchsize')
parser.add_argument('--inputsize', default=299, type=int, help='batchsize')
parser.add_argument('--h', default=299, type=int, help='height')
parser.add_argument('--w', default=299, type=int, help='width')
parser.add_argument('--stride', default=2, type=int, help='stride')
parser.add_argument('--pool', default='avg', type=str, help='last pool')
parser.add_argument('--autoaug', action='store_true', help='use Color Data Augmentation')
parser.add_argument('--erasing_p', default=0, type=float, help='Random Erasing probability, in [0,1]')
parser.add_argument('--use_dense', action='store_true', help='use densenet121')
parser.add_argument('--use_NAS', action='store_true', help='use nasnetalarge')
parser.add_argument('--use_SE', action='store_true', help='use se_resnext101_32x4d')
parser.add_argument('--use_DSE', action='store_true', help='use senet154')
parser.add_argument('--use_IR', action='store_true', help='use InceptionResNetv2')
parser.add_argument('--use_EF4', action='store_true', help='use EF4')
parser.add_argument('--use_EF5', action='store_true', help='use EF5')
parser.add_argument('--use_EF6', action='store_true', help='use EF6')
parser.add_argument('--lr', default=0.05, type=float, help='learning rate')
parser.add_argument('--droprate', default=0.5, type=float, help='drop rate')
parser.add_argument('--PCB', action='store_true', help='use PCB+ResNet50')
parser.add_argument('--CPB', action='store_true', help='use Center+ResNet50')
parser.add_argument('--fp16', action='store_true',
help='use float16 instead of float32, which will save about 50% memory')
parser.add_argument('--balance', action='store_true', help='balance sample')
parser.add_argument('--angle', action='store_true', help='use angle loss')
parser.add_argument('--arc', action='store_true', help='use arc loss')
parser.add_argument('--warm_epoch', default=0, type=int, help='the first K epoch that needs warm up')
parser.add_argument('--resume', action='store_true', help='use arc loss')
opt = parser.parse_args()

if opt.resume:
    model, opt, start_epoch = load_network(opt.name, opt)
else:
    start_epoch = 0

print(start_epoch)

fp16 = opt.fp16
data_dir = opt.data_dir
name = opt.name

if not opt.resume:
    str_ids = opt.gpu_ids.split(',')
    gpu_ids = []
    for str_id in str_ids:
        gid = int(str_id)
        if gid >= 0:
            gpu_ids.append(gid)
    opt.gpu_ids = gpu_ids

# set gpu ids
if len(opt.gpu_ids) > 0:
    cudnn.enabled = True
    cudnn.benchmark = True
######################################################################

# Load Data
# ---------

if opt.h == opt.w:
    transform_train_list = [
        # transforms.RandomRotation(30),
        transforms.Resize((opt.inputsize, opt.inputsize), interpolation=3),
        transforms.Pad(15),
        # transforms.RandomCrop((256,256)),
        transforms.RandomResizedCrop(size=opt.inputsize, scale=(0.75, 1.0), ratio=(0.75, 1.3333), interpolation=3),
        # Image.BICUBIC)
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]

    transform_val_list = [
        transforms.Resize(size=opt.inputsize, interpolation=3),  # Image.BICUBIC
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]
else:
    transform_train_list = [
        # transforms.RandomRotation(30),
        transforms.Resize((opt.h, opt.w), interpolation=3),
        transforms.Pad(15),
        # transforms.RandomCrop((256,256)),
        transforms.RandomResizedCrop(size=(opt.h, opt.w), scale=(0.75, 1.0), ratio=(0.75, 1.3333), interpolation=3),
        # Image.BICUBIC)
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]

    transform_val_list = [
        transforms.Resize((opt.h, opt.w), interpolation=3),  # Image.BICUBIC
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]

if opt.PCB:
    transform_train_list = [
        transforms.Resize((384, 192), interpolation=3),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]
    transform_val_list = [
        transforms.Resize(size=(384, 192), interpolation=3),  # Image.BICUBIC
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]

if opt.erasing_p > 0:
    transform_train_list = transform_train_list + [RandomErasing(probability=opt.erasing_p, mean=[0.0, 0.0, 0.0])]

if opt.color_jitter:
    transform_train_list = [transforms.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1,
                                                   hue=0)] + transform_train_list

transform_train_list_aug = [ImageNetPolicy()] + transform_train_list

print(transform_train_list)
data_transforms = {
    'train': transforms.Compose(transform_train_list),
    'train_aug': transforms.Compose(transform_train_list_aug),
    'val': transforms.Compose(transform_val_list),
}

train_all = ''
if opt.train_all:
    train_all = '_all'

if opt.train_veri:
    train_all = '+veri'

if opt.train_comp:
    train_all = '+comp'

if opt.train_virtual:
    train_all = '+virtual'

if opt.train_pku:
    train_all = '+pku'

if opt.train_comp_veri:
    train_all = '+comp+veri'

if opt.train_milktea:
    train_all = '+comp+veri+pku'

image_datasets = {}

if not opt.autoaug:
    image_datasets['train'] = datasets.ImageFolder(os.path.join(data_dir, 'train' + train_all),
                                                   data_transforms['train'])
else:
    image_datasets['train'] = AugFolder(os.path.join(data_dir, 'train' + train_all),
                                        data_transforms['train'], data_transforms['train_aug'])

if opt.balance:
    dataset_train = image_datasets['train']
    weights = make_weights_for_balanced_classes(dataset_train.imgs, len(dataset_train.classes))
    weights = torch.DoubleTensor(weights)
    sampler = torch.utils.data.sampler.WeightedRandomSampler(weights, len(weights))
    dataloaders = {}
    dataloaders['train'] = torch.utils.data.DataLoader(image_datasets['train'], batch_size=opt.batchsize,
                                                       sampler=sampler, num_workers=8,
                                                       pin_memory=True)  # 8 workers may work faster
else:
    dataloaders = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=opt.batchsize,
                                                  shuffle=True, num_workers=8, pin_memory=True)
                   # 8 workers may work faster
                   for x in ['train']}

dataset_sizes = {x: len(image_datasets[x]) for x in ['train']}
class_names = image_datasets['train'].classes

use_gpu = torch.cuda.is_available()

since = time.time()

inputs, classes = next(iter(dataloaders['train']))

print(time.time()-since)

######################################################################

# Training the model
# ------------------
#
# Now, let's write a general function to train a model. Here, we will
# illustrate:
#
# - Scheduling the learning rate
# - Saving the best model
#
# In the following, parameter scheduler is an LR scheduler object from
# torch.optim.lr_scheduler.

y_loss = {} # loss history
y_loss['train'] = []
y_loss['val'] = []
y_err = {}
y_err['train'] = []
y_err['val'] = []

def train_model(model, criterion, optimizer, scheduler, start_epoch=0, num_epochs=25):
    since = time.time()

    warm_up = 0.1  # We start from the 0.1*lrRate
    gamma = 0.0  # auto_aug
    warm_iteration = round(dataset_sizes['train'] / opt.batchsize) * opt.warm_epoch  # first 5 epoch
    total_iteration = round(dataset_sizes['train'] / opt.batchsize) * num_epochs

    best_model_wts = model.state_dict()
    best_loss = 9999
    best_epoch = 0

    for epoch in range(num_epochs - start_epoch):
        epoch = epoch + start_epoch
        print('gamma: %.4f' % gamma)
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 50)

        # Each epoch has a training and validation phase
        for phase in ['train']:
            if phase == 'train':
                scheduler.step()
                model.train(True)  # Set model to training mode
            else:
                model.train(False)  # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0.0
            global_step = 0
            iterate_num = 0
            # Iterate over data.
            for data in dataloaders[phase]:
                # get the inputs
                if opt.autoaug:
                    inputs, inputs2, labels = data
                    if random.uniform(0, 1) > gamma:
                        inputs = inputs2
                    gamma = min(1.0, gamma + 1.0 / total_iteration)
                else:
                    inputs, labels = data

                now_batch_size, c, h, w = inputs.shape
                if now_batch_size < opt.batchsize:  # skip the last batch
                    continue
                # print(inputs.shape)
                # wrap them in Variable
                if use_gpu:
                    inputs = Variable(inputs.cuda().detach())
                    labels = Variable(labels.cuda().detach())
                else:
                    inputs, labels = Variable(inputs), Variable(labels)
                # if we use low precision, input also need to be fp16
                # if fp16:
                #    inputs = inputs.half()

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                if phase == 'val':
                    with torch.no_grad():
                        outputs = model(inputs)
                else:
                    outputs = model(inputs)

                if opt.PCB:
                    part = {}
                    sm = nn.Softmax(dim=1)
                    num_part = 6
                    for i in range(num_part):
                        part[i] = outputs[i]

                    score = sm(part[0]) + sm(part[1]) + sm(part[2]) + sm(part[3]) + sm(part[4]) + sm(part[5])
                    _, preds = torch.max(score.data, 1)

                    loss = criterion(part[0], labels)
                    for i in range(num_part - 1):
                        loss += criterion(part[i + 1], labels)
                elif opt.CPB:
                    part = {}
                    sm = nn.Softmax(dim=1)
                    num_part = 4
                    for i in range(num_part):
                        part[i] = outputs[i]

                    score = sm(part[0]) + sm(part[1]) + sm(part[2]) + sm(part[3])
                    _, preds = torch.max(score.data, 1)

                    loss = criterion(part[0], labels)
                    for i in range(num_part - 1):
                        loss += criterion(part[i + 1], labels)
                else:
                    loss = criterion(outputs, labels)
                    if opt.angle or opt.arc:
                        outputs = outputs[0]
                    _, preds = torch.max(outputs.data, 1)

                # backward + optimize only if in training phase
                if epoch < opt.warm_epoch and phase == 'train':
                    warm_up = min(1.0, warm_up + 0.9 / warm_iteration)
                    loss *= warm_up

                # backward + optimize only if in training phase
                if phase == 'train':
                    if fp16:  # we use optimier to backward loss
                        with amp.scale_loss(loss, optimizer) as scaled_loss:
                            scaled_loss.backward()
                    else:
                        loss.backward()
                    optimizer.step()
                    global_step += 1
                    iterate_num += now_batch_size
                print('Epoch: [{0}] [{1} / {2}]\t'
                      'Global_step: {3:}\t'
                      'Loss: {4:.3f}\t'
                      'Accurcy: {5:.3f}\t'.format(epoch, iterate_num, dataset_sizes[phase], global_step, loss.item(),
                                                        float(torch.sum(preds == labels.data)) / now_batch_size))
                # print('Epoch:%d Iteration:%d Total:%d Global_step:%d loss:%.2f accuracy:%.2f' % (
                # epoch, iterate_num, dataset_sizes[phase], global_step, loss.item(), float(torch.sum(preds == labels.data)) / now_batch_size))
                # statistics
                if int(version[0]) > 0 or int(version[2]) > 3:  # for the new version like 0.4.0, 0.5.0 and 1.0.0
                    running_loss += loss.item() * now_batch_size
                else:  # for the old version like 0.3.0 and 0.3.1
                    running_loss += loss.data[0] * now_batch_size
                running_corrects += float(torch.sum(preds == labels.data))

                del (loss, outputs, inputs, preds)

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(
                phase, epoch_loss, epoch_acc))

            y_loss[phase].append(epoch_loss)
            y_err[phase].append(1.0 - epoch_acc)
            # deep copy the model
            if len(opt.gpu_ids) > 1:
                save_network(model.module, opt.name, epoch + 1)
            else:
                save_network(model, opt.name, epoch + 1)
            draw_curve(epoch)

        time_elapsed = time.time() - since
        print('Training complete in {:.0f}m {:.0f}s'.format(
            time_elapsed // 60, time_elapsed % 60))
        print()
        if epoch_loss < best_loss:
            best_loss = epoch_loss
            best_epoch = epoch
            last_model_wts = model.state_dict()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(
        time_elapsed // 60, time_elapsed % 60))
    print('Best epoch: {:d} Best Train Loss: {:4f}'.format(best_epoch, best_loss))

    # load best model weights
    model.load_state_dict(last_model_wts)
    save_network(model, opt.name, 'last')
    return model

######################################################################

# Draw Curve
# ---------------------------
x_epoch = []
fig = plt.figure()
ax0 = fig.add_subplot(121, title="loss")
ax1 = fig.add_subplot(122, title="top1err")

def draw_curve(current_epoch):
    x_epoch.append(current_epoch)
    ax0.plot(x_epoch, y_loss['train'], 'bo-', label='train')
    # ax0.plot(x_epoch, y_loss['val'], 'ro-', label='val')
    ax1.plot(x_epoch, y_err['train'], 'bo-', label='train')
    # ax1.plot(x_epoch, y_err['val'], 'ro-', label='val')
    if current_epoch == 0:
        ax0.legend()
        ax1.legend()
    fig.savefig(os.path.join('/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/outputs', name, 'train.png'))

######################################################################

# Finetuning the convnet
# ----------------------
#
# Load a pretrained model and reset final fully connected layer.

if not opt.resume:
    opt.nclasses = len(class_names)
    if opt.use_dense:
        model = ft_net_dense(len(class_names), opt.droprate, opt.stride, None, opt.pool)
    elif opt.use_NAS:
        model = ft_net_NAS(len(class_names), opt.droprate, opt.stride)
    elif opt.use_SE:
        model = ft_net_SE(len(class_names), opt.droprate, opt.stride, opt.pool)
    elif opt.use_DSE:
        model = ft_net_DSE(len(class_names), opt.droprate, opt.stride, opt.pool)
    elif opt.use_IR:
        model = ft_net_IR(len(class_names), opt.droprate, opt.stride)
    elif opt.use_EF4:
        model = ft_net_EF4(len(class_names), opt.droprate)
    elif opt.use_EF5:
        model = ft_net_EF5(len(class_names), opt.droprate)
    elif opt.use_EF6:
        model = ft_net_EF6(len(class_names), opt.droprate)
    else:
        model = ft_net(len(class_names), opt.droprate, opt.stride, None, opt.pool)

    if opt.PCB:
        model = PCB(len(class_names))

    if opt.CPB:
        model = CPB(len(class_names))

    if opt.angle:
        model = ft_net_angle(len(class_names), opt.droprate, opt.stride)
    elif opt.arc:
        model = ft_net_arc(len(class_names), opt.droprate, opt.stride)

if opt.init_name != 'imagenet':
    old_opt = parser.parse_args()
    init_model, old_opt, _ = load_network(opt.init_name, old_opt)
    print(old_opt)
    opt.stride = old_opt.stride
    opt.pool = old_opt.pool
    opt.use_dense = old_opt.use_dense
    if opt.use_dense:
        model = ft_net_dense(opt.nclasses, droprate=opt.droprate, stride=opt.stride, init_model=init_model,
                             pool=opt.pool)
    else:
        model = ft_net(opt.nclasses, droprate=opt.droprate, stride=opt.stride, init_model=init_model, pool=opt.pool)

##########################

# Put model parameter in front of the optimizer!!!

# For resume:
if start_epoch >= 60:
    opt.lr = opt.lr * 0.1
if start_epoch >= 75:
    opt.lr = opt.lr * 0.1

if len(opt.gpu_ids) > 1:
    model = torch.nn.DataParallel(model, device_ids=opt.gpu_ids).cuda()
    if not opt.CPB:
        ignored_params = list(map(id, model.module.classifier.parameters()))
        base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
        optimizer_ft = optim.SGD([
            {'params': base_params, 'lr': 0.1 * opt.lr},
            {'params': model.module.classifier.parameters(), 'lr': opt.lr}
        ], weight_decay=5e-4, momentum=0.9, nesterov=True)
    else:
        ignored_params = (list(map(id, model.module.classifier0.parameters()))
                          + list(map(id, model.module.classifier1.parameters()))
                          + list(map(id, model.module.classifier2.parameters()))
                          + list(map(id, model.module.classifier3.parameters()))
                          )
        base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
        optimizer_ft = optim.SGD([
            {'params': base_params, 'lr': 0.1 * opt.lr},
            {'params': model.module.classifier0.parameters(), 'lr': opt.lr},
            {'params': model.module.classifier1.parameters(), 'lr': opt.lr},
            {'params': model.module.classifier2.parameters(), 'lr': opt.lr},
            {'params': model.module.classifier3.parameters(), 'lr': opt.lr},
        ], weight_decay=5e-4, momentum=0.9, nesterov=True)
else:
    model = model.cuda()
    if not opt.CPB:
        ignored_params = list(map(id, model.classifier.parameters()))
        base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
        optimizer_ft = optim.SGD([
            {'params': base_params, 'lr': 0.1 * opt.lr},
            {'params': model.classifier.parameters(), 'lr': opt.lr}
        ], weight_decay=5e-4, momentum=0.9, nesterov=True)
    else:
        ignored_params = (list(map(id, model.classifier0.parameters()))
                          + list(map(id, model.classifier1.parameters()))
                          + list(map(id, model.classifier2.parameters()))
                          + list(map(id, model.classifier3.parameters()))
                          )
        base_params = filter(lambda p: id(p) not in ignored_params, model.parameters())
        optimizer_ft = optim.SGD([
            {'params': base_params, 'lr': 0.1 * opt.lr},
            {'params': model.classifier0.parameters(), 'lr': opt.lr},
            {'params': model.classifier1.parameters(), 'lr': opt.lr},
            {'params': model.classifier2.parameters(), 'lr': opt.lr},
            {'params': model.classifier3.parameters(), 'lr': opt.lr},
        ], weight_decay=5e-4, momentum=0.9, nesterov=True)

if opt.adam:
    optimizer_ft = optim.Adam(model.parameters(), opt.lr, weight_decay=5e-4)

# Decay LR by a factor of 0.1 every 40 epochs
# exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=40, gamma=0.1)
exp_lr_scheduler = lr_scheduler.MultiStepLR(optimizer_ft, milestones=[60 - start_epoch, 75 - start_epoch], gamma=0.1)

######################################################################

# Train and evaluate
# ^^^^^^^^^^^^^^^^^^
#
# It should take around 1-2 hours on GPU.

dir_name = os.path.join('/home/ubuntu-guangzhaodai/Desktop/AICIty-reID-2020/data/outputs', name)

if not opt.resume:
    if not os.path.isdir(dir_name):
        os.mkdir(dir_name)
    # record every run
    copyfile('./train_2020.py', dir_name + '/train.py')
    copyfile('./model.py', dir_name + '/model.py')
    # save opts
    with open('%s/opts.yaml' % dir_name, 'w') as fp:
        yaml.dump(vars(opt), fp, default_flow_style=False)

# model to gpu
if fp16:
    # model = network_to_half(model)
    # optimizer_ft = FP16_Optimizer(optimizer_ft, dynamic_loss_scale=True)
    model, optimizer_ft = amp.initialize(model, optimizer_ft, opt_level="O1")

if opt.angle:
    criterion = AngleLoss()
elif opt.arc:
    criterion = ArcLoss()
else:
    criterion = nn.CrossEntropyLoss()

print(model)
model = train_model(model, criterion, optimizer_ft, exp_lr_scheduler,
                    start_epoch=start_epoch, num_epochs=80)
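When reading the per-iteration log of the script above, it may help to note that the printed loss is taken after the "loss *= warm_up" step, so during the warm-up epochs it is scaled by a factor that starts near 0.1 and grows to 1.0. The sketch below reproduces that factor and the MultiStepLR decay (milestones 60 and 75, gamma 0.1) in plain Python. The function names are hypothetical, and the resume offset (start_epoch) is ignored for simplicity. It does not by itself explain an accuracy of 1 with a loss near 0.

```python
def warmup_factors(warm_iteration, start=0.1):
    """Loss multipliers for the warm-up iterations, mirroring `warm_up` in the script."""
    factors = []
    warm_up = start
    for _ in range(warm_iteration):
        warm_up = min(1.0, warm_up + 0.9 / warm_iteration)
        factors.append(warm_up)
    return factors


def multistep_lr(base_lr, epoch, milestones=(60, 75), gamma=0.1):
    """Learning rate after multiplying by `gamma` once per milestone reached."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


# With 9 warm-up iterations the factor climbs from about 0.2 to 1.0.
factors = warmup_factors(warm_iteration=9)

# With --lr 0.02: 0.02 until epoch 59, about 0.002 from epoch 60, about 0.0002 from epoch 75.
schedule = [multistep_lr(0.02, e) for e in (0, 59, 60, 75)]
```

In the real script warm_iteration is the number of batches per epoch times --warm_epoch, so the factor rises far more slowly than in this toy run.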

--init_name pretrained model

@layumi Hi. May I ask you something again? I would like to use my own previously trained model as a pre-trained model.
Say I have an already-trained model, SE_imbalance_s1_128_p0.5_lr2_mt_d0_b24+v+aug_20220701, and I would like to use it as the pre-trained model for my next training run. How could I do this?

I tried the following:
python train_2020.py --data_dir ../../data/reid_data --name SE_imbalance_s1_128_p0.5_lr2_mt_d0_b24+v+aug --warm_epoch 5 --droprate 0 --stride 1 --erasing_p 0.5 --autoaug --inputsize 128 --lr 0.02 --use_SE --gpu_ids 0,1 --train_virtual --batchsize 128 --init_name SE_imbalance_s1_128_p0.5_lr2_mt_d0_b24+v+aug_20220701

However, I got the following error:
ModuleAttributeError("'{}' object has no attribute '{}'".format( torch.nn.modules.module.ModuleAttributeError: 'SENet' object has no attribute 'conv1'

Thanks

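Question for the author: how to understand the score

Hello, I am just getting started with re-ID and have been reading your code. In the final evaluation, evaluate_result, the score is computed as the product of two feature matrices, score = torch.mm(query_feature, torch.transpose( gallery_feature, 0, 1)). I don't quite understand this: to measure one query's feature against the gallery features, why use multiplication? Any pointers would be appreciated.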
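A minimal sketch of why a matrix product can serve as the matching score, assuming the features are L2-normalized before the multiplication (the clustering code quoted in another issue on this page relies on the same 2 - 2 * dot relation). For unit vectors the dot product is the cosine similarity, and sorting by descending dot product gives the same order as sorting by ascending Euclidean distance. Plain Python lists stand in for the torch tensors, and all names are illustrative.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = l2_normalize([1.0, 0.0])
gallery = [l2_normalize(g) for g in ([0.0, 1.0], [1.0, 1.0], [2.0, 0.1])]

# One row of torch.mm(query_feature, gallery_feature^T): one score per gallery image.
scores = [dot(query, g) for g in gallery]

by_score = sorted(range(len(gallery)), key=lambda k: -scores[k])
by_distance = sorted(range(len(gallery)), key=lambda k: euclidean(query, gallery[k]))
```

Both orderings put the third gallery vector first, since it points in almost the same direction as the query. The matrix product simply computes all query-gallery scores in one call.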

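Very low mAP at test time

Hello Dr. Zheng. I am training on the train+virtual dataset. After about 40 epochs it was too slow, so I stopped it and ran the test set, but Rank@5 is below 90% and mAP is only 0.35. Is this normal? What could have gone wrong?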

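One epoch takes too long

@layumi Hello Dr. Zheng, how long should one epoch take? Mine has been running for more than an hour and still has not finished. Does it really need this long, or has something gone wrong?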
