tjubiit / tju-dhd

A newly built high-resolution dataset for object detection and pedestrian detection (IEEE TIP 2020)

Home Page: https://arxiv.org/abs/2011.09170.pdf

License: MIT License

dataset object-detection pedestrian-detection high-resolution cross-scene-evaluation cross-domain diverse

tju-dhd's Introduction

TJU-DHD dataset (object detection and pedestrian detection)

This is the official website for "TJU-DHD: A Diverse High-Resolution Dataset for Object Detection (TIP2020)", which is a newly built high-resolution dataset for object detection and pedestrian detection.

  • 115k+ images and 700k+ instances
  • Scenes: traffic and campus, Tasks: object detection and pedestrian detection
  • High resolution: images are at least 1624x1200 pixels, and object heights range from 11 to 4152 pixels.
  • Diversity: A large variance in appearance, scale, illumination, season, and weather
  • Cross-scene evaluation and same-scene evaluation on pedestrian detection
  • If you are interested in pedestrian detection, please refer to our IEEE T-PAMI paper or our GitHub project.
  • Leaderboard on Papers with Code: TJU-Ped-campus, TJU-Ped-traffic

Examples of DHD

Table of Contents

  1. Introduction
  2. Object detection dataset
    2.1 TJU-DHD-traffic
    2.2 TJU-DHD-campus
  3. Pedestrian detection dataset
    3.1 TJU-Ped-traffic
    3.2 TJU-Ped-campus
  4. Benchmark
    4.1 TJU-DHD-traffic
    4.2 TJU-DHD-campus
    4.3 TJU-DHD-pedestrian
  5. Citation
  6. Evaluation on the test set
  7. Contact

1. Introduction

Vehicles, pedestrians, and riders are the most important and interesting objects in the perception modules of self-driving vehicles and video surveillance. However, the state-of-the-art performance in detecting such objects (especially small objects) is far from satisfying the demands of practical systems. Large-scale, diverse, and high-resolution vehicle and pedestrian datasets play an important role in developing better object detection methods to meet these demands. Existing public large-scale datasets such as MS COCO, which are collected from websites, do not focus on these specific scenarios. Moreover, the popular datasets collected from these scenarios (e.g., KITTI and CityPersons) are limited in the number of images and instances, the resolution, and the diversity in season, weather, and illumination. To address this problem, in this paper, we build a diverse high-resolution dataset (called TJU-DHD). The dataset contains 115,354 high-resolution images (52% of the images have a resolution of 1624x1200 pixels and 48% have a resolution of at least 2,560x1,440 pixels) and 709,330 labeled objects in total, with a large variance in scale and appearance. The dataset also has a rich diversity in season, illumination, and weather. Based on this object dataset, a new diverse pedestrian dataset is further built. With four different detectors (i.e., the one-stage RetinaNet, anchor-free FCOS, two-stage FPN, and Cascade R-CNN), experiments on object detection and pedestrian detection are conducted. We hope that the newly built dataset can help promote research on object detection and pedestrian detection in these two scenes.

2. Object detection dataset

| name | DHD-traffic (#images) | DHD-traffic (#instances) | DHD-campus (#images) | DHD-campus (#instances) |
|---|---|---|---|---|
| training | 45,266 | 239,980 | 39,727 | 267,445 |
| validation | 5,000 | 30,679 | 5,204 | 41,620 |
| test | 10,000 | 60,963 | 10,157 | 68,643 |
| total | 60,266 | 331,622 | 55,088 | 377,708 |
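
As a quick sanity check, the per-split counts in the table can be summed to confirm the totals row. The sketch below only re-enters the numbers from the table; the dictionary layout and the function name are illustrative.

```python
# Per-split counts copied from the table above: (images, instances).
SPLITS = {
    "DHD-traffic": {
        "training": (45266, 239980),
        "validation": (5000, 30679),
        "test": (10000, 60963),
    },
    "DHD-campus": {
        "training": (39727, 267445),
        "validation": (5204, 41620),
        "test": (10157, 68643),
    },
}


def totals(splits):
    """Return (total_images, total_instances) for one subset."""
    images = sum(n_images for n_images, _ in splits.values())
    instances = sum(n_instances for _, n_instances in splits.values())
    return images, instances


for name, splits in SPLITS.items():
    print(name, totals(splits))
```

Adding the two subsets together gives the 115,354 images and 709,330 labeled objects quoted in the introduction.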

2.1 TJU-DHD-traffic

2.2 TJU-DHD-campus

The training image set is too large for a single file, so it is zipped as a 4-part archive. After downloading all four parts, you can open the .zip.001 file with your favorite zip extractor. On Linux, the multi-part archive can also be joined and unzipped with:

cat dhd_campus_train_images.zip.* > dhd_campus_train_images.zip
unzip dhd_campus_train_images.zip -d /path/to/your/folder
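
On systems without cat and unzip, the same two steps can be done with the Python standard library. This is a minimal sketch that mirrors the commands above (concatenate the parts in name order, then extract); the function name join_and_extract is hypothetical, and the paths are whatever you pass in.

```python
import glob
import shutil
import zipfile


def join_and_extract(parts_pattern, joined_zip, dest_dir):
    """Concatenate split archive parts in name order, then extract the result."""
    parts = sorted(glob.glob(parts_pattern))
    if not parts:
        raise FileNotFoundError(f"no parts match {parts_pattern!r}")
    # Step 1: byte-wise concatenation, equivalent to `cat parts > joined_zip`.
    with open(joined_zip, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    # Step 2: extract the joined archive, equivalent to `unzip joined_zip -d dest_dir`.
    with zipfile.ZipFile(joined_zip) as archive:
        archive.extractall(dest_dir)
    return parts
```

For the campus training images this would be called as join_and_extract("dhd_campus_train_images.zip.*", "dhd_campus_train_images.zip", "/path/to/your/folder").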

3. Pedestrian detection dataset

| name | Ped-traffic (#images) | Ped-traffic (#instances) | Ped-campus (#images) | Ped-campus (#instances) |
|---|---|---|---|---|
| training | 13,858 | 27,650 | 39,727 | 234,455 |
| validation | 2,136 | 5,244 | 5,204 | 36,161 |
| test | 4,344 | 10,724 | 10,157 | 59,007 |
| total | 20,338 | 43,618 | 55,088 | 329,623 |

3.1 TJU-Ped-traffic

(Note that the images are the same as those in TJU-DHD-traffic.)

3.2 TJU-Ped-campus

(Note that the images are the same as those in TJU-DHD-campus.)

4. Benchmark

4.1 TJU-DHD-traffic

  • Results on validation

    | method | backbone | input size | AP | AP@0.5 | AP@0.75 | AP_s | AP_m | AP_l |
    |---|---|---|---|---|---|---|---|---|
    | RetinaNet | ResNet50 | 1333x800 | 53.5 | 80.9 | 60.0 | 24.0 | 50.5 | 68.0 |
    | FCOS | ResNet50 | 1333x800 | 53.8 | 80.0 | 60.1 | 24.6 | 50.6 | 68.8 |
    | FPN | ResNet50 | 1333x800 | 55.4 | 83.4 | 63.0 | 30.4 | 52.2 | 68.2 |
    | Cascade RCNN | ResNet50 | 1333x800 | 57.9 | 82.7 | 66.6 | 32.6 | 54.4 | 71.4 |
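
The AP@0.5 and AP@0.75 columns differ only in how much a detection must overlap a ground-truth box to count as correct: in the usual COCO-style protocol, the intersection-over-union (IoU) must reach 0.5 or 0.75 respectively. The sketch below shows that matching test. The [x, y, width, height] box layout and the function names are assumptions for illustration; this is not the evaluation code behind the table.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as [x, y, width, height] (assumed layout)."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, iou_threshold):
    """True if the prediction overlaps the ground truth enough to count as a hit."""
    return box_iou(pred_box, gt_box) >= iou_threshold
```

A prediction shifted by 2 pixels against a 10x10 ground-truth box has an IoU of 80/120, so it is a hit at the 0.5 threshold but a miss at 0.75.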

4.2 TJU-DHD-campus

  • Results on validation

    | method | backbone | input size | AP | AP@0.5 | AP@0.75 | AP_t | AP_s | AP_m | AP_l |
    |---|---|---|---|---|---|---|---|---|---|
    | RetinaNet | ResNet50 | 1333x800 | 48.4 | 79.3 | 52.4 | 4.7 | 27.3 | 56.2 | 73.8 |
    | FCOS | ResNet50 | 1333x800 | 49.3 | 73.8 | 53.8 | 5.6 | 29.6 | 55.9 | 74.3 |
    | FPN | ResNet50 | 1333x800 | 52.4 | 77.5 | 58.4 | 8.5 | 37.4 | 58.6 | 74.9 |
    | Cascade RCNN | ResNet50 | 1333x800 | 55.1 | 77.6 | 60.9 | 10.8 | 40.1 | 61.2 | 78.8 |

4.3 TJU-DHD-pedestrian

  • TJU-Ped-campus

    | Method | publication | R | RS | HO | R+HO | A | link |
    |---|---|---|---|---|---|---|---|
    | RetinaNet | ICCV2017 | 34.73 | 82.99 | 71.31 | 42.26 | 44.34 | Paper |
    | FCOS | ICCV2019 | 31.89 | 69.04 | 81.28 | 39.38 | 41.62 | Paper |
    | FPN | CVPR2017 | 27.92 | 67.52 | 73.14 | 35.67 | 38.08 | Paper |
    | CrowdDet | CVPR2020 | 25.73 | - | 66.38 | 33.63 | 35.90 | Paper |
    | EGCL | IEEE TIP2023 | 24.84 | - | 65.27 | 32.39 | 34.87 | Paper |
    | DeFCN | CVPR2021 | 32.1 | 62.7 | 72.7 | 39.9 | 42.1 | Paper |
    | OPL | CVPR2023 | 31.5 | 61.7 | 72.4 | 39.3 | 41.5 | Paper |
    | MTOM | WACV2023 | 21.8 | 37.04 | 57.08 | - | - | Paper |

  • TJU-Ped-traffic

    | Method | publication | R | RS | HO | R+HO | A | link |
    |---|---|---|---|---|---|---|---|
    | RetinaNet | ICCV2017 | 23.89 | 37.92 | 61.60 | 28.45 | 41.40 | Paper |
    | FCOS | ICCV2019 | 24.35 | 37.40 | 63.73 | 28.86 | 40.02 | Paper |
    | FPN | CVPR2017 | 22.30 | 35.19 | 60.30 | 26.71 | 37.78 | Paper |
    | CrowdDet | CVPR2020 | 20.82 | - | 61.22 | 25.28 | 36.94 | Paper |
    | EGCL | IEEE TIP2023 | 19.73 | - | 60.05 | 24.19 | 35.76 | Paper |
    | DeFCN | CVPR2021 | 24.2 | 29.1 | 62.8 | 29.0 | 39.7 | Paper |
    | Pedestron | CVPR2021 | 18.9 | 24.0 | 56.3 | - | - | Paper |
    | OPL | CVPR2023 | 23.4 | 28.8 | 62.7 | 28.0 | 38.7 | Paper |
    | LSFM | CVPR2023 | 18.7 | 24.9 | 56.2 | - | - | Paper |
    | MTOM | WACV2023 | 17.4 | 24.7 | 52.68 | - | - | Paper |

  • Cross-scene evaluation

    | method | R / R+HO (TJU-Ped-campus -> traffic) | R / R+HO (TJU-Ped-traffic -> campus) |
    |---|---|---|
    | FPN | 30.62 / 33.89 | 42.08 / 50.55 |
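
The pedestrian tables report one number per subset (R, RS, HO, R+HO, A). This page does not spell out the metric; pedestrian benchmarks in the Caltech tradition usually report the log-average miss rate, where lower is better, so treat that reading as an assumption. Under that assumption, a minimal sketch of the computation from a miss-rate versus false-positives-per-image (FPPI) curve could look like this. The function name, the nine reference points between 10^-2 and 10^0, and the fallback value of 1.0 follow the common Caltech convention, not anything stated here.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Log-average miss rate over FPPI reference points in [1e-2, 1e0].

    Assumes `fppi` is sorted in ascending order and `miss_rate[i]` is the
    miss rate reached at `fppi[i]`.
    """
    # Reference points evenly spaced in log-space between 1e-2 and 1e0.
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    sampled = []
    for ref in refs:
        # Miss rate at the largest FPPI that does not exceed the reference.
        reachable = [mr for f, mr in zip(fppi, miss_rate) if f <= ref]
        # If the curve never reaches this FPPI, count it as a full miss.
        sampled.append(reachable[-1] if reachable else 1.0)
    # Geometric mean; the floor avoids log(0) for a perfect detector.
    return math.exp(sum(math.log(max(mr, 1e-10)) for mr in sampled) / num_points)
```

Averaging in log-space means the score is a geometric mean, so it is driven by the low-FPPI end of the curve rather than by a single operating point.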

5. Citation

If this project helps your research, please consider citing our work.

@article{Pang_DHD_TIP_2020,
         author = {Yanwei Pang and Jiale Cao and Yazhao Li and Jin Xie and Hanqing Sun and Jinfeng Gong},
         title = {TJU-DHD: A Diverse High-Resolution Dataset for Object Detection},
         journal = {IEEE Transactions on Image Processing},
         year = 2021
        }

@article{Cao_PDR_TPAMI_2020,
         author = {Jiale Cao and Yanwei Pang and Jin Xie and Fahad Shahbaz Khan and Ling Shao},
         title = {From Handcrafted to Deep Features for Pedestrian Detection: A Survey},
         journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
         year = 2022
        }

6. Evaluation on the test set

Ablation studies can be conducted on the validation set. If you would like to evaluate your model on the test set, you can send your detection results in JSON format to us (connor#tju.edu.cn, replace # with @).
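
The exact schema of the results file is not specified here. A common choice for detection results is the COCO-style list of records with image_id, category_id, bbox as [x, y, width, height], and score; the sketch below writes that layout as an assumption, so please confirm the expected format with the maintainers before submitting. The function name and the input tuple layout are hypothetical.

```python
import json


def detections_to_json(detections, out_path):
    """Write detections as a COCO-style results list (assumed schema).

    `detections` is an iterable of (image_id, category_id, bbox, score),
    where bbox is [x, y, width, height] in pixels.
    """
    records = [
        {
            "image_id": int(image_id),
            "category_id": int(category_id),
            "bbox": [round(float(v), 2) for v in bbox],
            "score": float(score),
        }
        for image_id, category_id, bbox, score in detections
    ]
    with open(out_path, "w") as f:
        json.dump(records, f)
    return len(records)
```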

7. Contact

If you have any questions or want to add your results, please feel free to contact us.

tju-dhd's Issues

Data license

Thanks for sharing this dataset with the community. Could you please let me know whether the dataset (including images and ground truth) can be used for commercial applications? I wasn't able to find the data license.

The quality of this dataset

This dataset contains:

  • duplicated bboxes
  • mislabeled bbox classes
  • one of every 5 images contains a random bbox of a random class
  • most of the bboxes are not properly cropped
  • some images that are either extremely dark or extremely bright

At first I thought it was a problem on my end, but after further investigation it doesn't seem to be the case.
Please fix

Dataset statistics and missing images

There are duplicate images in the dataset. For example, dhd_campus/images/train/000000016083.jpg and dhd_campus/images/test/000000051853.jpg are the same image.

Total number of images (no duplicates) = 105076
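
One way to reproduce a duplicate count like this is to group image files by a hash of their contents. The sketch below is only an illustration of that idea, not the script used for the numbers in this issue; the function name is hypothetical, and it finds byte-identical files only, so re-encoded copies of the same picture would be missed.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(root, pattern="*.jpg"):
    """Group files under `root` by SHA-256 of their bytes; return groups of size > 1."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob(pattern)):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Running it on the dataset root would list each set of identical files, which can then be checked against the train and test splits.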

Combining both campus and traffic datasets, I have the following statistics:

| Class label | #Instances |
|---|---|
| Car | 152860 |
| Cyclist | 76765 |
| Pedestrian | 349743 |
| Truck | 16634 |
| Van | 29877 |

| Type | #Instances |
|---|---|
| campus train | 291688 |
| campus val | 47713 |
| traffic train | 251691 |
| traffic val | 34787 |

Total number of bounding boxes = 625879
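
Per-class counts like these can be derived directly from an annotation file. The sketch below assumes the files follow the COCO layout (a top-level "categories" list with "id" and "name", and an "annotations" list with "category_id"); that layout is an assumption based on the file names, and the function name is hypothetical.

```python
import json
from collections import Counter


def count_instances(annotation_path):
    """Count bounding boxes per class name in a COCO-style annotation file."""
    with open(annotation_path) as f:
        data = json.load(f)
    id_to_name = {cat["id"]: cat["name"] for cat in data["categories"]}
    return Counter(id_to_name[ann["category_id"]] for ann in data["annotations"])
```

Summing the returned counters over the train and val files of both subsets gives figures comparable to the two tables above.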

Missing images

The following 3 image paths are present in the annotation files but the images seem to be missing in your dataset.
dhd_campus/images/train/1499435044634.jpg
dhd_campus/images/train/1497315953923.jpg
dhd_campus/images/train/1497405070612.jpg
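
A check like this can be scripted by comparing the file names in an annotation file against the image folder. The sketch assumes a COCO-style file with a top-level "images" list whose entries carry "file_name"; both that layout and the function name are assumptions.

```python
import json
from pathlib import Path


def find_missing_images(annotation_path, image_dir):
    """Return file names listed in the annotations that are absent from `image_dir`."""
    with open(annotation_path) as f:
        data = json.load(f)
    image_dir = Path(image_dir)
    return sorted(
        img["file_name"]
        for img in data["images"]
        if not (image_dir / img["file_name"]).is_file()
    )
```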

We are running some experiments and would be grateful if you could check these numbers on your end.

P.S.
In dhd_campus_train.json, you have pedstrian as a label. In dhd_traffic_train.json, you have Car, Van and Pedestrian.
I renamed pedstrian to Pedestrian.

The size of inputs when training and testing?

In your paper, the size of the input image is set to (1333, 800), but this setting doesn't match the aspect ratio of the original images. Should we keep the aspect ratio when training and testing?
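
For context, many detection codebases read a (1333, 800) setting as bounds rather than an exact output size: the image is scaled by a single factor so that its long side is at most 1333 and its short side at most 800, which preserves the aspect ratio. Whether the authors used this convention is not confirmed here. The sketch below shows the arithmetic under that assumption; the function name is hypothetical.

```python
def keep_ratio_size(width, height, max_long=1333, max_short=800):
    """Scale (width, height) by one factor so it fits within the two bounds."""
    scale = min(max_long / max(width, height), max_short / min(width, height))
    new_width = int(width * scale + 0.5)
    new_height = int(height * scale + 0.5)
    return new_width, new_height, scale
```

Under this reading, a 1624x1200 image is limited by its short side and becomes 1083x800, while a 2560x1440 image is limited by its long side and becomes 1333x750.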
