
TFDet: Target-aware Fusion for RGB-T Pedestrian Detection

This is the official repository for our paper "TFDet: Target-aware Fusion for RGB-T Pedestrian Detection" (arXiv paper link).

Our main contributions are summarized as follows:

  • We comprehensively analyze the adverse impact of false positives on detection performance and identify noisy feature maps as a key factor contributing to these false positives.
  • To address the noisy feature map issue, we propose a target-aware fusion strategy that effectively fuses complementary features from both modalities, highlighting feature representations in pedestrian areas while suppressing those in background areas.
  • Experiments show that TFDet generates discriminative feature maps and significantly reduces false positives. TFDet achieves state-of-the-art performance on two challenging multispectral pedestrian detection benchmarks, KAIST and LLVIP, while remaining computationally efficient, with an inference time comparable to previous state-of-the-art approaches.
  • Notably, TFDet performs especially well in challenging night scenes.

Dataset and Models

  • Datasets and model checkpoints can be downloaded from this cloud link (extraction code: tfde).
  • Since the KAIST dataset has been updated by several previous works, such as Hwang et al., Li et al., Liu et al., and Zhang et al., we upload the updated dataset for your convenience when using our code.
  • The LLVIP dataset can be downloaded from its official repository.

In the cloud link, files are organized as follows:

TFDet:
├─datasets
│  ├─kaist
│  │  └─zx-sanitized-kaist-keepPerson-fillNonPerson
│  │      ├─annotations.zip
│  │      ├─coco_format.zip
│  │      ├─images
│  │      │  ├─test.zip
│  │      │  ├─train_lwir.zip
│  │      │  └─train_visible.zip
│  │      ├─test.avi
│  │      └─train.avi
│  └─LLVIP		# LLVIP should be downloaded here
│      ├─LLVIP
│      │  ├─coco_format	# For mmdetection
│      │  ├─lwir
│      │  └─visible
│      └─yolov5_format	# For yolov5
│          ├─images
│          │  ├─lwir
│          │  │  ├─test
│          │  │  └─train
│          │  └─visible
│          │      ├─test
│          │      └─train
│          └─labels
│              ├─lwir
│              │  ├─test
│              │  └─train
│              └─visible
│                  ├─test
│                  └─train
├─mmdetection
│  ├─runs
│  │  └─FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1
│  │      ├─epoch_
│  │      │  ├─epoch_3-test-all.txt
│  │      │  ├─epoch_3-test-day.txt
│  │      │  └─epoch_3-test-night.txt
│  │      ├─epoch_3.pkl
│  │      └─epoch_3.pth
│  └─runs_llvip
│      └─FasterRCNN_r50wMask_ROIFocalLoss5_CIOU20_cosineSE_dcnGWConvGlobalCC_1024x1280
│          ├─20230825_171907.log
│          ├─20230825_171907.log.json
│          └─epoch_7.pth
└─yolov5-master
    └─runs
        └─train
            └─modifiedDCN_MaskSup_negCorr_1024
                └─weights
                    └─best.pt

KAIST

Environmental Requirements

  • We use Faster R-CNN implemented in the MMDetection toolbox to detect pedestrians on the KAIST dataset. Please follow the MMDetection documentation to set up the environment.

In our environment, we use:

python==3.7.13
torch==1.10.1+cu111
torchvision==0.11.2+cu111
mmcv==1.6.0
mmdet==2.24.1
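
To quickly check that your installed versions roughly match ours, a minimal sketch (package names as listed above):

# Minimal environment check (sketch): print the installed versions.
import torch
import torchvision
import mmcv
import mmdet

print("torch:", torch.__version__)              # we used 1.10.1+cu111
print("torchvision:", torchvision.__version__)  # we used 0.11.2+cu111
print("mmcv:", mmcv.__version__)                # we used 1.6.0
print("mmdet:", mmdet.__version__)              # we used 2.24.1
print("CUDA available:", torch.cuda.is_available())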

Dataset

  • Please download the KAIST dataset and checkpoint from the above cloud link, and save them following the structure shown in the Dataset and Models section.
  • If you only intend to evaluate the inference results, it is sufficient to download the following files: images/test.zip, annotations.zip, and coco_format.zip (see the sketch below).
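
The downloaded archives can then be extracted into the layout shown in the directory tree above, for example (a rough sketch; whether each zip already contains its own top-level folder may differ, so adjust the target directories accordingly):

# Sketch: extract the downloaded archives into the expected layout.
import os
import zipfile

root = 'datasets/kaist/zx-sanitized-kaist-keepPerson-fillNonPerson'
for archive, target in [('annotations.zip', ''),
                        ('coco_format.zip', ''),
                        ('images/test.zip', 'images')]:
    with zipfile.ZipFile(os.path.join(root, archive)) as zf:
        zf.extractall(os.path.join(root, target))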

Inference

Note: the data_root and img_prefix fields in the configuration files should be modified according to your local dataset path. Please refer to the MMDetection documentation for more details.
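
For example, the dataset-related fields in an MMDetection 2.x config typically look like the following (a sketch only; the paths and file names here are illustrative, not the exact ones used in our configs):

# Illustrative MMDetection 2.x dataset settings; adjust to your local layout.
data_root = '/path/to/datasets/kaist/zx-sanitized-kaist-keepPerson-fillNonPerson/'
data = dict(
    test=dict(
        ann_file=data_root + 'coco_format/test.json',  # hypothetical annotation file
        img_prefix=data_root + 'images/'))             # directory holding the test images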

Since the KAIST dataset is evaluated with the log-average miss rate metric, three files need to be run: mmdetection/tools/test.py, mmdetection/myCodesZoo/cvtpkl2txt_person.py, and KAISTdevkit-matlab-wrapper/demo_test.m.
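
For reference, the log-average miss rate averages the miss rate at nine FPPI reference points log-spaced between 10^-2 and 10^0. A minimal NumPy sketch (assuming fppi and miss_rate arrays obtained by sweeping the detection score threshold; the authoritative implementations are the MLPD Python code and the MATLAB devkit):

import numpy as np

def log_average_miss_rate(fppi, miss_rate):
    # fppi and miss_rate are same-length arrays sorted by increasing FPPI.
    fppi, miss_rate = np.asarray(fppi), np.asarray(miss_rate)
    refs = np.logspace(-2.0, 0.0, num=9)  # 9 reference points in [1e-2, 1e0]
    mrs = []
    for ref in refs:
        idx = np.where(fppi <= ref)[0]
        # miss rate at the largest FPPI not exceeding the reference point;
        # if the curve never reaches this FPPI, fall back to a miss rate of 1.0
        mrs.append(miss_rate[idx[-1]] if len(idx) else 1.0)
    mrs = np.clip(np.asarray(mrs), 1e-10, None)  # avoid log(0)
    return float(np.exp(np.mean(np.log(mrs))))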

First, run tools/test.py to generate the detection results.

cd mmdetection

# generate detection result in pkl format
python tools/test.py configs/faster_rcnn/faster_rcnn_vgg16_fpn_sanitized-kaist_v5.py runs/FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1/epoch_3.pth --work-dir runs/FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1 --gpu-id 7 --eval bbox --out runs/FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1/epoch_3.pkl

After a few minutes, you will obtain a pkl file named runs/FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1/epoch_3.pkl. Then, run myCodesZoo/cvtpkl2txt_person.py to parse the pkl file:

cd myCodesZoo

# convert detections to txt format
python cvtpkl2txt_person.py
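
For reference, this conversion step roughly amounts to the following (a sketch only; the exact column layout and output file names expected by the KAIST devkit are defined in cvtpkl2txt_person.py):

# Rough sketch of converting MMDetection pkl results to per-image txt lines.
import pickle

pkl_path = 'runs/FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1/epoch_3.pkl'
with open(pkl_path, 'rb') as f:
    results = pickle.load(f)  # one entry per test image

lines = []
for img_idx, per_class in enumerate(results, start=1):
    for x1, y1, x2, y2, score in per_class[0]:  # class 0: person
        # convert (x1, y1, x2, y2) to (x, y, w, h)
        lines.append(f"{img_idx},{x1:.2f},{y1:.2f},{x2 - x1:.2f},{y2 - y1:.2f},{score:.4f}")

with open('epoch_3-test-all.txt', 'w') as f:  # illustrative output name
    f.write('\n'.join(lines))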

Then, you will obtain a folder runs/FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1/epoch_, which includes epoch_3-test-all.txt, epoch_3-test-day.txt, and epoch_3-test-night.txt. Using the Python evaluation code provided by MLPD to compute the log-average miss rate, we obtain state-of-the-art performance: MR_all: 4.37, MR_day: 5.08, and MR_night: 3.36, significantly better than previous state-of-the-art approaches. Nevertheless, for fair comparison with other approaches, we report the log-average miss rate computed by the commonly-used MATLAB code.

cd ../../KAISTdevkit-matlab-wrapper

run demo_test.m

Finally, you will get the result:

| Methods | MR-All($\downarrow$) | MR-Day($\downarrow$) | MR-Night($\downarrow$) | MR-Near($\downarrow$) | MR-Medium($\downarrow$) |
|---|---|---|---|---|---|
| MSR (AAAI 2022) | 11.39 | 15.28 | 6.48 | - | - |
| AR-CNN (ICCV 2019) | 9.34 | 9.94 | 8.38 | 0.00 | 16.08 |
| MBNet (ECCV 2020) | 8.13 | 8.28 | 7.86 | 0.00 | 16.07 |
| DCMNet (ACM MM 2022) | 5.84 | 6.48 | 4.60 | 0.02 | 16.07 |
| ProbEn3 (ECCV 2022) | 5.14 | 6.04 | 3.59 | 0.00 | 9.59 |
| TFDet (Ours) | 4.47 | 5.22 | 3.36 | 0.00 | 9.29 |

LLVIP

Environmental Requirements

We use the official MMDetection and YOLOv5 repositories in our experiments. Please configure your environment following the official documentation. For MMDetection, we use the same environment as in the KAIST experiments. For YOLOv5, some dependencies in our environment include:

python==3.7.16
torch==1.12.1
torchvision==0.13.1

Dataset

  • Please download the LLVIP dataset from this link.

  • For MMDetection, you should convert the annotations into COCO format following this document.

  • For YOLOv5, you should convert the annotations into YOLO format following this document (see the sketch below).
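
As a rough illustration of the YOLO-format conversion (a sketch only; it assumes LLVIP's VOC-style XML annotations with a single person class, and the linked document remains the authoritative procedure):

# Sketch: convert one VOC-style XML annotation to a YOLO-format label file.
import xml.etree.ElementTree as ET

def voc_xml_to_yolo(xml_path, txt_path, class_id=0):
    root = ET.parse(xml_path).getroot()
    img_w = float(root.find('size/width').text)
    img_h = float(root.find('size/height').text)
    lines = []
    for obj in root.iter('object'):
        box = obj.find('bndbox')
        x1, y1 = float(box.find('xmin').text), float(box.find('ymin').text)
        x2, y2 = float(box.find('xmax').text), float(box.find('ymax').text)
        # YOLO format: class x_center y_center width height, normalized to [0, 1]
        cx, cy = (x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h
        bw, bh = (x2 - x1) / img_w, (y2 - y1) / img_h
        lines.append(f"{class_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}")
    with open(txt_path, 'w') as f:
        f.write('\n'.join(lines))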

Inference

For MMDetection, we evaluate TFDet at two resolutions. Since the LLVIP dataset contains a large number of multispectral images, we use distributed inference:

# 640 x 512 resolution
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7,8 PORT=29500 bash tools/zx_dist_test_llvip_640x512.sh 8 --eval bbox
# 1280 x 1024 resolution
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7,8 PORT=29500 bash tools/zx_dist_test_llvip_1024x1280.sh 8 --eval bbox

Then, you will get:

| Methods | Resolution | AP.50($\uparrow$) | AP($\uparrow$) |
|---|---|---|---|
| DetFusion (ACM MM 2022) | 640 x 512 | 80.7 | - |
| ProbEn (ECCV 2022) | 640 x 512 | 93.4 | 51.5 |
| TFDet (Ours) | 640 x 512 | 95.7 | 56.1 |
| DCMNet (ACM MM 2022) | 1280 x 1024 | - | 58.4 |
| TFDet (Ours) | 1280 x 1024 | 96.0 | 59.4 |

For YOLOv5, run:

python val.py --device 0 --data LLVIP.yaml --weights runs/train/modifiedDCN_MaskSup_negCorr_1024/weights/best.pt --batch-size 32 --img 1024 --conf-thres 0.008 --iou-thres 0.4 --exist-ok

Finally, you will get:

| Methods | AP.50($\uparrow$) | AP.75($\uparrow$) | AP($\uparrow$) |
|---|---|---|---|
| RGB (ICCV 2021) | 90.8 | 56.4 | 52.7 |
| Thermal (ICCV 2021) | 96.5 | 76.4 | 67.0 |
| TFDet (Ours) | 97.9 | 83.4 | 71.1 |

Citation

If you find our TFDet useful, please cite our paper:

@article{tfdet,
  title={TFDet: Target-aware Fusion for RGB-T Pedestrian Detection},
  author={Zhang, Xue and Zhang, Xiaohan and Sheng, Zehua and Shen, Hui-Liang},
  journal={arXiv preprint arXiv:2305.16580},
  year={2023}
}


tfdet's Issues

Doubts about shared weights for backbone networks

As described in the article, the benefit of shared weights is that they reduce computation and parameters. In addition, I think that when a backbone with shared weights extracts features from RGB-T images, the feature maps of corresponding channels naturally become more similar, which leads to the conclusion that feature similarity is greatest for the diagonal channels. Does this conclusion still hold for networks whose weights are not shared?

the code

Hello, thank you for your excellent work!
Could you release the code for this paper soon?
Thank you!

No training instructions found

Hello, I am honored to have read your paper.
It presents a clear and innovative approach.
I would like to build on your code in my research.
Could you provide the training code?

[Errno 2] No such file or directory: '/tmp/tmpmgy02awm/tmpnede1i17.py'

Thank you for your excellent work. When I run the test code for LLVIP:
"python tools/test.py configs/faster_rcnn/faster_rcnn_vgg16_fpn_sanitized-kaist_v5.py runs/FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1/epoch_3.pth --work-dir runs/FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1 --gpu-id 7 --eval bbox --out runs/FasterRCNN_vgg16_channelRelation_dscSEFusion_similarityMax_1/epoch_3.pkl"

An error is reported: [Errno 2] No such file or directory: '/tmp/tmpmgy02awm/tmpnede1i17.py'.

I tried to run the training code "python tools/train.py configs/faster_rcnn/faster_rcnn_vgg16_fpn_sanitized-kaist_v5.py" and it reported the same error, but the COCO config "python tools/train.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py" runs correctly.

Is there a problem with the config file "configs/faster_rcnn/faster_rcnn_r50_fpn_dcnGWConvGloballCC_llvip.py"?
Looking forward to your reply.
