hou-yz / mvdet

[ECCV 2020] Codes and MultiviewX dataset for "Multiview Detection with Feature Perspective Transformation".

Home Page: https://hou-yz.github.io/publication/2020-eccv2020-mvdet

Topics: pytorch, multiview, pedestrian-detection, dataset, detection, detector

mvdet's Introduction

Multiview Detection with Feature Perspective Transformation [Website] [arXiv]

@inproceedings{hou2020multiview,
  title={Multiview Detection with Feature Perspective Transformation},
  author={Hou, Yunzhong and Zheng, Liang and Gould, Stephen},
  booktitle={ECCV},
  year={2020}
}

Please visit this link for our new work MVDeTr, a transformer-powered multiview detector that achieves a new state of the art!

Overview

We release the PyTorch code for MVDet, a state-of-the-art multiview pedestrian detector, and the MultiviewX dataset, a novel synthetic multiview pedestrian detection dataset.

(Figure: side-by-side example views from Wildtrack and MultiviewX.)

Content

MultiviewX dataset

Using pedestrian models from PersonX, we build the novel synthetic dataset MultiviewX in Unity.


The MultiviewX dataset covers a ground area of 16 meters by 25 meters, which we quantize into a 640x1000 grid. MultiviewX has 6 cameras with overlapping fields of view, each outputting images at 1080x1920 resolution. We also generate annotations for 400 frames at 2 fps (the same as Wildtrack). On average, 4.41 cameras cover each location.
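
For concreteness, a minimal sketch of the grid quantization, assuming a uniform cell size of 16/640 = 25/1000 = 0.025 m per cell (inferred from the numbers above; the helper name is hypothetical, not from the official toolkit):

import numpy as np

# MultiviewX quantizes a 16m x 25m ground plane into a 640x1000 grid,
# i.e. 0.025 m per grid cell (inferred from the dataset description above).
CELL_SIZE_M = 0.025

def worldgrid_to_worldcoord(grid_xy):
    # Map integer grid indices to metric ground-plane coordinates.
    return np.asarray(grid_xy, dtype=float) * CELL_SIZE_M

print(worldgrid_to_worldcoord([640, 1000]))  # [16. 25.] -> the full area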

Download MultiviewX

Please refer to this link for download.

Build your own version

Please refer to this repo for a detailed guide & toolkits you might need.

MVDet Code

This repo is dedicated to the code for MVDet.


Dependencies

This code uses the following libraries:

  • python 3.7+
  • pytorch 1.4+ & torchvision
  • numpy
  • matplotlib
  • pillow
  • opencv-python
  • kornia
  • matlab & the matlab engine for python (required for evaluation; see this link for a detailed guide)

Data Preparation

By default, all datasets are in ~/Data/. We use MultiviewX and Wildtrack in this project.

Your ~/Data/ folder should look like this (a quick sanity check follows the tree):

Data
├── MultiviewX/
│   └── ...
└── Wildtrack/ 
    └── ...
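
A small convenience sketch to verify this layout (the paths simply mirror the tree above):

from pathlib import Path

# Check that the default ~/Data/ layout expected by this repo exists.
data_root = Path.home() / 'Data'
for name in ('MultiviewX', 'Wildtrack'):
    print(name, 'found' if (data_root / name).is_dir() else 'MISSING')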

Training

In order to train the detector, please run the following:

CUDA_VISIBLE_DEVICES=0,1 python main.py -d wildtrack

This should automatically return evaluation results similar to the reported 88.2% MODA on the Wildtrack dataset.

Pre-trained models

You can download the checkpoints at this link.
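
A minimal sketch for inspecting a downloaded checkpoint before loading it (the file name follows the issue reports below; whether the keys match your model depends on the code revision):

import torch

# Load the checkpoint onto the CPU and list its parameter keys; comparing
# them against model.state_dict().keys() catches version mismatches early.
state_dict = torch.load('MultiviewDetector.pth', map_location='cpu')
print(len(state_dict), 'tensors; first keys:', list(state_dict)[:5])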

mvdet's People

Contributors

hou-yz, zichengduan

mvdet's Issues

Lightweight MVDet.

Thank you for this article; it has been a great reference for me! Now I want to apply it to a project using the Jetson SUB DEVELOPER KIT. Its CPU and GPU share a combined 16GB of memory, which obviously does not meet MVDet's requirement of two graphics cards with more than 11GB of memory each. So I would like to ask: is there a lightweight version of MVDet, or a lightweight approach to optimize it? Hope to hear from you soon. Best wishes.

Question about visualizing overlapping camera views

Hello author!
Let me ask you a question. I used the test() function in your datasets/frameDatasets.py to visualize the inverse projective transformation. Overlaying the visualization results from the 7 cameras of the WILDTRACK dataset gives the result in the first attached screenshot, but the overlap shown in the WILDTRACK dataset paper looks like the second attached screenshot. The two images are not similar. I checked each person's position projected into the images and found those projections to be correct (third attached screenshot).
So my question is: why does the overlap region I render differ so much from the one in the paper? It makes me doubt whether the inverse perspective transformation is correct...
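
For readers hitting the same question, a hedged sketch of the overlay operation itself, assuming per-camera image-to-ground-plane homographies H are already computed (an illustration, not the repo's test() code):

import cv2
import numpy as np

def overlay_on_ground(images, homographies, grid_hw=(480, 1440)):
    # Warp every view onto the ground-plane grid and average the overlap;
    # a wrong homography makes the composite look nothing like the paper's.
    acc = np.zeros((*grid_hw, 3), np.float32)
    cnt = np.zeros((*grid_hw, 1), np.float32)
    for img, H in zip(images, homographies):
        warp = cv2.warpPerspective(img.astype(np.float32), H, grid_hw[::-1])
        mask = (warp.sum(axis=2, keepdims=True) > 0).astype(np.float32)
        acc += warp * mask
        cnt += mask
    return (acc / np.maximum(cnt, 1)).astype(np.uint8)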

Unity-related questions

Hello, I am experiencing some difficulty getting from the PersonX tutorial on the official PersonX repo to achieving some likeness of your MultiviewX dataset.

Could you perhaps shed some light on how you managed all of the individual people (game objects) and their movement? I am new to Unity and somehow there doesn't seem to be that much documentation on such uses.

For example, the default PersonX code sequentially loads, moves around, and destroys the models. I tried loading multiple models using a for loop and such but they seemed to malfunction and stay in one fixed position.

Any help would be much appreciated. Thank you.

Visualization of the large convolution kernel

Is the "Output: pedestrian occupancy map" visualization reflected anywhere in the MVDet source code?
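
In case it helps, a hedged sketch of rendering such an occupancy map as a heatmap (the score map itself would come from the detector's forward pass; here a random array stands in):

import numpy as np
import matplotlib.pyplot as plt

def show_occupancy(score_map, path='occupancy_map.png'):
    # Render a 2D ground-plane occupancy score map as a heatmap.
    plt.imshow(score_map, cmap='hot')
    plt.colorbar(label='occupancy score')
    plt.savefig(path, bbox_inches='tight')

show_occupancy(np.random.rand(120, 360))  # dummy stand-in for a real map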

Error when loading checkpoint: MultiviewDetector.pth

When I run the code:

resume_fname = resume_dir + '/MultiviewDetector.pth'
model.load_state_dict(torch.load(resume_fname, map_location='cuda:0'))

I come across an error:
RuntimeError: Error(s) in loading state_dict for PerspTransDetector: Missing key(s) in state_dict: "map_classifier.0.weight", "map_classifier.0.bias", "map_classifier.2.weight", "map_classifier.2.bias", "map_classifier.4.weight". Unexpected key(s) in state_dict: "base_pt1.0.weight", "base_pt1.1.weight", "base_pt1.1.bias", "base_pt1.1.running_mean", "base_pt1.1.running_var", "base_pt1.1.num_batches_tracked", "base_pt1.4.0.conv1.weight", "base_pt1.4.0.bn1.weight", "base_pt1.4.0.bn1.bias", "base_pt1.4.0.bn1.running_mean", "base_pt1.4.0.bn1.running_var", "base_pt1.4.0.bn1.num_batches_tracked", "base_pt1.4.0.conv2.weight", "base_pt1.4.0.bn2.weight", "base_pt1.4.0.bn2.bias", "base_pt1.4.0.bn2.running_mean", "base_pt1.4.0.bn2.running_var", "base_pt1.4.0.bn2.num_batches_tracked", "base_pt1.4.1.conv1.weight", "base_pt1.4.1.bn1.weight", "base_pt1.4.1.bn1.bias", "base_pt1.4.1.bn1.running_mean", "base_pt1.4.1.bn1.running_var", "base_pt1.4.1.bn1.num_batches_tracked", "base_pt1.4.1.conv2.weight", "base_pt1.4.1.bn2.weight", "base_pt1.4.1.bn2.bias", "base_pt1.4.1.bn2.running_mean", "base_pt1.4.1.bn2.running_var", "base_pt1.4.1.bn2.num_batches_tracked", "base_pt1.5.0.conv1.weight", "base_pt1.5.0.bn1.weight", "base_pt1.5.0.bn1.bias", "base_pt1.5.0.bn1.running_mean", "base_pt1.5.0.bn1.running_var", "base_pt1.5.0.bn1.num_batches_tracked", "base_pt1.5.0.conv2.weight", "base_pt1.5.0.bn2.weight", "base_pt1.5.0.bn2.bias", "base_pt1.5.0.bn2.running_mean", "base_pt1.5.0.bn2.running_var", "base_pt1.5.0.bn2.num_batches_tracked", "base_pt1.5.0.downsample.0.weight", "base_pt1.5.0.downsample.1.weight", "base_pt1.5.0.downsample.1.bias", "base_pt1.5.0.downsample.1.running_mean", "base_pt1.5.0.downsample.1.running_var", "base_pt1.5.0.downsample.1.num_batches_tracked", "base_pt1.5.1.conv1.weight", "base_pt1.5.1.bn1.weight", "base_pt1.5.1.bn1.bias", "base_pt1.5.1.bn1.running_mean", "base_pt1.5.1.bn1.running_var", "base_pt1.5.1.bn1.num_batches_tracked", "base_pt1.5.1.conv2.weight", "base_pt1.5.1.bn2.weight", "base_pt1.5.1.bn2.bias", "base_pt1.5.1.bn2.running_mean", "base_pt1.5.1.bn2.running_var", "base_pt1.5.1.bn2.num_batches_tracked", "base_pt1.6.0.conv1.weight", "base_pt1.6.0.bn1.weight", "base_pt1.6.0.bn1.bias", "base_pt1.6.0.bn1.running_mean", "base_pt1.6.0.bn1.running_var", "base_pt1.6.0.bn1.num_batches_tracked", "base_pt1.6.0.conv2.weight", "base_pt1.6.0.bn2.weight", "base_pt1.6.0.bn2.bias", "base_pt1.6.0.bn2.running_mean", "base_pt1.6.0.bn2.running_var", "base_pt1.6.0.bn2.num_batches_tracked", "base_pt1.6.0.downsample.0.weight", "base_pt1.6.0.downsample.1.weight", "base_pt1.6.0.downsample.1.bias", "base_pt1.6.0.downsample.1.running_mean", "base_pt1.6.0.downsample.1.running_var", "base_pt1.6.0.downsample.1.num_batches_tracked", "base_pt1.6.1.conv1.weight", "base_pt1.6.1.bn1.weight", "base_pt1.6.1.bn1.bias", "base_pt1.6.1.bn1.running_mean", "base_pt1.6.1.bn1.running_var", "base_pt1.6.1.bn1.num_batches_tracked", "base_pt1.6.1.conv2.weight", "base_pt1.6.1.bn2.weight", "base_pt1.6.1.bn2.bias", "base_pt1.6.1.bn2.running_mean", "base_pt1.6.1.bn2.running_var", "base_pt1.6.1.bn2.num_batches_tracked", "world_classifier.0.weight", "world_classifier.0.bias", "world_classifier.2.weight", "world_classifier.2.bias", "world_classifier.4.weight".
Please tell me how I can resolve it.

Error when loading checkpoint: Missing key(s) in state_dict & Unexpected key(s) in state_dict

Help: I ran into the error shown below; how can I solve it?

Missing key(s) in state_dict: "map_classifier.0.weight", "map_classifier.0.bias", "map_classifier.2.weight", "map_classifier.2.bias", "map_classifier.4.weight".

Unexpected key(s) in state_dict: "world_classifier.0.weight", "world_classifier.0.bias", "world_classifier.2.weight", "world_classifier.2.bias", "world_classifier.4.weight".
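
For what it's worth, the missing and unexpected keys above differ only in the module prefix (map_classifier vs. world_classifier), which suggests the checkpoint was saved from a different code revision. A hedged workaround sketch, assuming only the module name changed and the layer shapes are identical:

import torch

def rename_classifier_keys(state_dict):
    # Map 'world_classifier.*' keys to 'map_classifier.*' so an older
    # checkpoint matches the renamed module in the current code.
    return {k.replace('world_classifier', 'map_classifier'): v
            for k, v in state_dict.items()}

# usage (model = your constructed detector):
# model.load_state_dict(rename_classifier_keys(
#     torch.load('MultiviewDetector.pth', map_location='cpu')))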

What is the best way to express the performance of each frame?

Hello author, thank you for your article; I have benefited a lot from it. I have a question for you. The final performance indicators for this task are MODA and MODP, but they characterize performance over a series of frames or the entire test set. If I want to express the performance of each individual frame, which indicator do you think is more appropriate? Thank you again for your great contribution to this field, and I look forward to your reply. I wish you good luck in your work!

Ground truth bounding boxes are not perfect.

Hi!
Thank you for sharing your work.

I found that the bounding box annotations in the MultiviewX dataset are not perfect: some people are not covered by a bounding box annotation.

I guess the reason you didn't label them is that those people are outside of the occupancy map?

Model checkpoints

Thank you for your valuable contribution to the field.

When I start training the model, I am not able to see the model checkpoints or weights, and training also takes a long time. I would appreciate any suggestions.

Best wishes,

Problem with black screen during training

I can successfully run MVDet, but during training my computer suddenly goes black and there is no response to any keyboard or mouse input. The monitor shows no HDMI signal, but the graphics card fans keep running at full speed. I can train up to epoch 4 at most, but most of the time I only reach epoch 1. I am running Ubuntu 18.04, and my graphics cards are two NVIDIA GeForce RTX 3080s with 10GB of memory each; I am not sure if that is enough. GPU memory usage does not stay at 100% during execution, and the temperature reaches up to 86°C. I would like to know if the author or others have encountered similar problems, and whether you can suggest possible causes. This is really important to me, thank you.

What is the minimum hardware requirement?

Hi, could you please tell me the GPU memory consumption of training? I only have two RTX 2070 Super cards at home; would it be possible to train the network with only this equipment? Many thanks.

Are the occupancy-map detections after NMS equivalent to anchor-based bounding-box confidences?

Hello author, thank you very much for your work, which has brought me great inspiration. I have a question: are the pixel-wise detection results obtained after NMS on the pedestrian occupancy map equivalent to the confidence of anchor-based bounding boxes in object detection? I want to optimize inference based on this work, and I wonder whether the average of these probability values can be used as a performance indicator for inference results when no ground truth is available. Looking forward to your reply, thanks again!
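
A hedged sketch of the kind of distance-based NMS involved, returning per-detection scores that could then be averaged as a rough, ground-truth-free confidence signal (an assumption about usefulness, not a validated metric):

import numpy as np

def occupancy_nms(score_map, thresh=0.4, min_dist=5, top_k=50):
    # Greedy NMS over a 2D occupancy score map: keep high-scoring cells
    # that are at least min_dist grid cells away from anything kept so far.
    ys, xs = np.where(score_map > thresh)
    scores = score_map[ys, xs]
    keep = []
    for i in np.argsort(-scores):
        if all((ys[i] - ys[j]) ** 2 + (xs[i] - xs[j]) ** 2 >= min_dist ** 2
               for j in keep):
            keep.append(i)
        if len(keep) >= top_k:
            break
    return xs[keep], ys[keep], scores[keep]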

Different grid units between the two datasets

Hi, thanks for sharing such nice work! I am still a little confused about the worldgrid2worldcoord_matrix: why is the grid unit in MultiviewX meters, while in Wildtrack it is centimeters? Does this setting have any special meaning? If I set the Wildtrack one to meters as well, will it have any effect?
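
A hedged illustration of why the unit only matters insofar as it matches the calibration: the world coordinates fed through a camera's extrinsics must share the unit of the extrinsic translation, so switching Wildtrack's grid to meters would also require rescaling the calibration's translation. Rescaling both by the same factor leaves the projected pixels unchanged:

import numpy as np

def project(K, R, t, world_xyz):
    # Pinhole projection; world_xyz and t must share one unit, whatever it is.
    cam = R @ world_xyz + t
    uv = K @ cam
    return uv[:2] / uv[2]

K = np.diag([1000.0, 1000.0, 1.0])
R, t = np.eye(3), np.array([0.0, 0.0, 500.0])  # translation in cm
p = np.array([30.0, 40.0, 0.0])                # point in cm
# cm -> m on both the point and the translation: same pixel coordinates.
assert np.allclose(project(K, R, t, p), project(K, R, t / 100, p / 100))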

Pretrained model for WILDTRACK dataset

Congratulations on your work, I liked it very much! Would you be able to share a pretrained model for the WILDTRACK dataset? I do not have two GPUs available for training right now due to the COVID-19 pandemic. Best regards.

How to show the ground plane?

I'm running the code and don't know how to display the ground plane. Can you show me how? Also, how can I parse the .json files in annotations_positions? Thank you very much.
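
A minimal sketch of reading one of those files, assuming the Wildtrack-style annotation format (field names like personID and positionID should be verified against the actual files):

import json
from pathlib import Path

# Each annotations_positions/*.json file holds one frame's annotations.
frame = json.loads(Path('annotations_positions/00000000.json').read_text())
for person in frame:
    print(person.get('personID'), person.get('positionID'))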

Wrong MultiviewX grid visualization

I used the calibration from MultiviewX to run grid_visualize.py; the code is shown below.

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import cv2
from multiview_detector.utils import projection
from multiview_detector.datasets.MultiviewX import MultiviewX

if __name__ == '__main__':
    img = Image.open('D:/dataset/tracking/CMC/data/images/MultiviewX/Image_subsets/C1/0000.png')
    dataset = MultiviewX('D:/dataset/tracking/CMC/data/images/MultiviewX')
    # NOTE: these ranges correspond to Wildtrack's 480x1440 grid; per the
    # dataset description above, MultiviewX uses a 640x1000 grid, which may
    # be the source of the mismatch shown below.
    xi = np.arange(0, 480, 40)
    yi = np.arange(0, 1440, 40)
    world_grid = np.stack(np.meshgrid(xi, yi, indexing='ij')).reshape([2, -1])
    world_coord = dataset.get_worldcoord_from_worldgrid(world_grid)
    img_coord = projection.get_imagecoord_from_worldcoord(world_coord, dataset.intrinsic_matrices[0],
                                                          dataset.extrinsic_matrices[0])
    # keep only the grid points that project inside the 1920x1080 image
    img_coord = img_coord[:, np.where((img_coord[0] > 0) & (img_coord[1] > 0) &
                                      (img_coord[0] < 1920) & (img_coord[1] < 1080))[0]]
    plt.imshow(img)
    plt.show()
    img_coord = img_coord.astype(int).transpose()
    img = cv2.cvtColor(np.array(img), cv2.COLOR_RGB2BGR)
    for point in img_coord:
        cv2.circle(img, tuple(point), 5, (0, 255, 0), -1)
    img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    img.save('img_grid_visualize_mx.png')
    plt.imshow(img)
    plt.show()

However, the result of the grid visualization does not seem to be correct (screenshot attached). Could you please help me correct it?

For comparison, here is the grid visualization for the Wildtrack dataset (screenshot attached).
