fan23j / yolov7-pose-whole-body

Yolov7-pose with variable keypoint support. Trained models with COCO Wholebody.

License: Other

Shell 1.53% Python 92.38% Dockerfile 0.17% Jupyter Notebook 5.93%
coco-wholebody pose-estimation yolov7

yolov7-pose-whole-body's Introduction

yolov7-pose-whole-body

Implementation of "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors" combined with "Whole-Body Human Pose Estimation in the Wild".

This repo seeks to combine the aforementioned papers/repos to add extra keypoints to yolo-pose models.

Pose estimation implementation is based on YOLO-Pose.

Pretrained models

yolov7-tiny-pose

python train.py --data data/coco_kpts.yaml --cfg cfg/yolov7-tiny-pose.yaml --batch-size 64 --img 640 --kpt-label --sync-bn --device 0 --hyp data/hyp.pose.yaml --nkpt 133 --weights PATH_TO_PRETRAINED_WEIGHTS --epochs 500

Dataset preparation

[Keypoints Labels of MS COCO 2017]

COCO Whole-Body: https://github.com/jin-s13/COCO-WholeBody

Handy COCO to YOLO conversion script in utils/coco2yolo.py.
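
As a rough illustration of what such a conversion involves, the sketch below turns one COCO-WholeBody annotation into a YOLO-Pose style label line. It is not the code in utils/coco2yolo.py. The function name `wholebody_ann_to_yolo_line`, the order in which the keypoint groups are concatenated, and the normalization of keypoints by image size are all assumptions. The field names are those of the COCO-WholeBody annotation format.

```python
# Hypothetical sketch, not the repository's utils/coco2yolo.py.
# Assumed group order: body, foot, face, left hand, right hand (17+6+68+21+21 = 133).
KPT_FIELDS = ["keypoints", "foot_kpts", "face_kpts", "lefthand_kpts", "righthand_kpts"]


def wholebody_ann_to_yolo_line(ann, img_w, img_h, class_id=0):
    """Build 'class cx cy w h x1 y1 v1 ... x133 y133 v133' with normalized coordinates."""
    # COCO boxes are [top-left x, top-left y, width, height] in pixels.
    x, y, w, h = ann["bbox"]
    fields = [str(class_id)]
    box = [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]
    fields.extend(f"{value:.6f}" for value in box)

    # Concatenate every keypoint group into one flat [x, y, v, x, y, v, ...] list.
    flat = []
    for key in KPT_FIELDS:
        flat.extend(ann[key])

    for i in range(0, len(flat), 3):
        kx, ky, v = flat[i:i + 3]
        fields.append(f"{kx / img_w:.6f}")
        fields.append(f"{ky / img_h:.6f}")
        fields.append(str(int(v)))
    return " ".join(fields)
```

Under these assumptions each label line has 5 + 133 * 3 = 404 fields.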

Training

yolov7-w6-person.pt

python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --data data/coco_kpts.yaml --cfg cfg/yolov7-w6-pose.yaml --weights weights/yolov7-w6-person.pt --batch-size 128 --img 960 --kpt-label --sync-bn --device 0,1,2,3,4,5,6,7 --name yolov7-w6-pose --hyp data/hyp.pose.yaml

Deploy

TensorRT: https://github.com/nanmi/yolov7-pose

Testing

yolov7-w6-pose.pt

python test.py --data data/coco_kpts.yaml --img 960 --conf 0.001 --iou 0.65 --weights yolov7-w6-pose.pt --kpt-label

Citation

@article{wang2022yolov7,
  title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
  author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
  journal={arXiv preprint arXiv:2207.02696},
  year={2022}
}

yolov7-pose-whole-body's Issues

The mapping is a big issue in the model prediction

Hi Jack,

We get results similar to your examples:
https://github.com/fan23j/yolov7-pose-whole-body/tree/main/onnx_inference/

The mapping is a big issue in the model prediction.

Here are more examples from our training (left side is ground truth, right side is prediction):
         

     ![pose_wb_comparison](https://user-images.githubusercontent.com/5216043/234075684-7b2fbd9b-422a-4d4f-afad-56ac8ac4e5a2.jpg)

How to decode the output of the onnx

Hi,

I converted the yolov7_pose_whole_body_tiny_baseline.pt to onnx so that I can use C++ and TensorRT to infer the video.

There are 405 output values per line. According to https://github.com/fan23j/yolov7-pose-whole-body/blob/main/data/data_format.md, the COCO-WholeBody annotation file format is:
// 405
//
// 133 x 3 = 399
// "keypoints": list([x, y, v] * 17),
// "foot_kpts" : list([x, y, v] * 6),
// "face_kpts" : list([x, y, v] * 68),
// "lefthand_kpts" : list([x, y, v] * 21),
// "righthand_kpts" : list([x, y, v] * 21),
//
//
//
// "score" : float, //400
// "foot_score" : float, //401
// "face_score" : float, //402
// "lefthand_score" : float, //403
// "righthand_score" : float, //404
// "wholebody_score" : float, //405

But when I try to decode the output according to that layout, it does not work.

If I decode the output the way I do for yolov7-w6-pose (which I have been using since last year), reading the first 57 values or the first 5 values (step_width_coco_pose = 57;//num_class + 5 + 2 * num_lines_coco_pose + 2 * kp_face_pose), it does not work either.

Could you please show me how to decode the output?

Many thanks!

-Scott
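
One layout that is arithmetically consistent with both numbers in this question is the YOLO-Pose convention: 4 box values, 1 objectness score, 1 class score, then (x, y, confidence) per keypoint. That gives 5 + 1 + 17 * 3 = 57 for the 17-keypoint model and 5 + 1 + 133 * 3 = 405 for 133 keypoints. The sketch below splits one 405-value row under that assumption. Whether this ONNX export really uses that layout, and whether the keypoint groups follow the order of the listing above, is not confirmed here. `decode_row` and `GROUPS` are hypothetical names.

```python
# Hypothetical sketch: split one 405-value output row, ASSUMING the YOLO-Pose
# layout [cx, cy, w, h, obj_conf, cls_conf, (x, y, conf) * 133].
NUM_KPTS = 133
ROW_WIDTH = 5 + 1 + NUM_KPTS * 3  # 405

# Assumed group order, taken from the COCO-WholeBody listing in the question.
GROUPS = [("body", 17), ("foot", 6), ("face", 68), ("lefthand", 21), ("righthand", 21)]


def decode_row(row):
    """Return box, scores, and keypoints (grouped by body part) for one row."""
    if len(row) != ROW_WIDTH:
        raise ValueError(f"expected {ROW_WIDTH} values, got {len(row)}")
    flat = row[6:]
    # Each keypoint is an (x, y, confidence) triple.
    kpts = [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

    groups = {}
    start = 0
    for name, count in GROUPS:
        groups[name] = kpts[start:start + count]
        start += count

    return {
        "box": list(row[0:4]),   # cx, cy, w, h
        "obj_conf": row[4],
        "cls_conf": row[5],
        "kpts": kpts,
        "groups": groups,
    }
```

Rows would normally be filtered by confidence and passed through non-maximum suppression before the keypoints are used. That step is omitted here.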

the keypoints results are deviated and not stable

Good job!
When I ran detect.py using your pretrained model file yolov7-tiny-baseline.pt, I got weird results, as shown in the following picture.
In addition, the degree to which the keypoints deviate from the ground truth varies from frame to frame in my test video.
That is, the keypoint results are not stable.
What is the problem?


image

Training errors

Hi Jack,

python3 train.py --data data/coco_kpts.yaml --cfg cfg/yolov7-tiny-pose.yaml --batch-size 8 --img 256 --kpt-label --sync-bn --device 0 --hyp data/hyp.pose.yaml --nkpt 133 --weights yolov7-tiny-baseline.pt --epochs 500

I tried to use the command above to train a model, but got the following errors. Any comment would be appreciated.

 Epoch   gpu_mem       box       obj       cls       kpt      kptv     total    labels  img_size

0%| | 0/7075 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 564, in <module>
    train(hyp, opt, device, tb_writer)
  File "train.py", line 290, in train
    for i, (imgs, targets, paths, _) in pbar: # batch -------------------------------------------------------------
  File "/home/haixiatan/.local/lib/python3.8/site-packages/tqdm/std.py", line 1195, in __iter__
    for obj in iterable:
  File "/home/haixiatan/yolov7-pose-whole-body/utils/datasets.py", line 108, in __iter__
    yield next(self.iterator)
  File "/home/haixiatan/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 530, in __next__
    data = self._next_data()
  File "/home/haixiatan/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1224, in _next_data
    return self._process_data(data)
  File "/home/haixiatan/.local/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1250, in _process_data
    data.reraise()
  File "/home/haixiatan/.local/lib/python3.8/site-packages/torch/_utils.py", line 457, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/haixiatan/.local/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/haixiatan/.local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/haixiatan/.local/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/haixiatan/yolov7-pose-whole-body/utils/datasets.py", line 578, in __getitem__
    img, labels = load_mosaic(self, index)
  File "/home/haixiatan/yolov7-pose-whole-body/utils/datasets.py", line 791, in load_mosaic
    img4, labels4 = random_perspective(img4, labels4, segments4,
  File "/home/haixiatan/yolov7-pose-whole-body/utils/datasets.py", line 1012, in random_perspective
    xy_kpts[:, :2] = targets[:,5:].reshape(n*num_kpt, 2)
ValueError: cannot reshape array of size 136 into shape (532,2)
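
One possible reading of the numbers in this error, assuming num_kpt is 133 as passed via --nkpt: 532 = 4 * 133, so there are n = 4 targets, while 136 = 4 * 34 = 4 * 17 * 2, so each target carries only 17 (x, y) pairs. That would mean the label files hold 17-keypoint COCO labels rather than 133-keypoint whole-body labels. The sketch below is a hypothetical pre-training check, not part of the repository. It assumes the label layout `class cx cy w h` followed by `x y v` per keypoint.

```python
# Hypothetical sanity check, assuming each label line is
# 'class cx cy w h' followed by 'x y v' for every keypoint.
def expected_label_fields(nkpt):
    """Number of whitespace-separated fields a label line should have."""
    return 5 + 3 * nkpt


def infer_nkpt(label_line):
    """Infer how many keypoints a label line holds, or raise if it is malformed."""
    extra = len(label_line.split()) - 5
    if extra < 0 or extra % 3 != 0:
        raise ValueError("label line does not match 'class cx cy w h (x y v)*'")
    return extra // 3


def label_matches(label_line, nkpt):
    """True if the label line holds exactly nkpt keypoints."""
    return infer_nkpt(label_line) == nkpt
```

Running `label_matches(line, 133)` over a few lines of the training labels would show whether they match --nkpt 133 before training starts.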

Models

Hello,

Could you please share your trained models?

Many thanks!

-Scott

Detection fail

Hi,

I ran the detection and it failed. Here is the log:

python3 detect.py --weights yolov7_pose_whole_body_tiny_baseline.pt --source onnx_inference/img.png --save-crop
Namespace(agnostic_nms=False, augment=False, classes=None, conf_thres=0.25, device='', exist_ok=False, hide_conf=False, hide_labels=False, img_size=640, iou_thres=0.45, kpt_label=False, line_thickness=3, name='exp', nkpt=17, nosave=False,
project='runs/detect', save_bin=False, save_conf=False, save_crop=True, save_txt=False, save_txt_tidl=False, source='onnx_inference/img.png', update=False, view_img=False, weights=['yolov7_pose_whole_body_tiny_baseline.pt'])
YOLOv5 2023-6-7 torch 2.0.1+cu117 CUDA:0 (NVIDIA GeForce RTX 3090 Ti, 24563.375MB)

Fusing layers...
Model Summary: 314 layers, 8862259 parameters, 0 gradients
/usr/local/lib/python3.8/dist-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
image 1/1 /mnt/d/Workspace2021/models/yolov7_pose_whole_body/yolov7-pose-whole-body-main/onnx_inference/img.png: tensor(0.83447, device='cuda:0')
Traceback (most recent call last):
  File "detect.py", line 205, in <module>
    detect(opt=opt)
  File "detect.py", line 102, in detect
    scale_coords(img.shape[2:], det[:, 6:], im0.shape, kpt_label=kpt_label, step=3)
  File "/mnt/d/Workspace2021/models/yolov7_pose_whole_body/yolov7-pose-whole-body-main/utils/general.py", line 385, in scale_coords
    coords[:, [0, 2]] -= pad[0] # x padding
IndexError: index is out of bounds for dimension with size 0

Thanks,

-Scott
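
The Namespace line in the log shows kpt_label=False and nkpt=17, so this run did not ask for keypoints. One possible reading, not confirmed by the author, is that each detection then has only six columns, so the slice `det[:, 6:]` is empty and indexing into it fails. The sketch below reproduces that failure mode with plain Python lists rather than the repository's tensors. The function names are hypothetical.

```python
# Hypothetical illustration with plain lists (the real code uses torch tensors).
# Assumes a detection row is [x1, y1, x2, y2, conf, cls] plus optional keypoint columns.
def keypoint_columns(det_row):
    """Return everything after the six box columns."""
    return det_row[6:]


def shift_first_keypoint_x(det_row, pad_x):
    """Subtract horizontal padding from the first keypoint's x coordinate."""
    coords = keypoint_columns(det_row)
    # Raises IndexError when the row has no keypoint columns at all.
    return coords[0] - pad_x
```

If this reading is right, passing the keypoint flags used in the training commands (--kpt-label and --nkpt 133) to detect.py may avoid the error. That has not been tested here.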

Annotation Mapping Bug

I am currently trying to train this model on custom data. After a single epoch of training, however, the train_batch.jpg files show the keypoints in the wrong places: they do not appear to lie inside the bounding boxes.
I suspect that when non-square images are padded to be square, the keypoints are not adjusted correctly. In addition, the x and y values of the keypoints appear to be swapped, since the keypoints look mirrored across the x = y diagonal.

Example:
For a blank 640*400 image with a single detection and three keypoints, each located at a corner of the bounding box, the corresponding label file should contain:
0 0.3 0.7 0.1 0.1 0.25 0.65 2 0.25 0.75 2 0.35 0.65 2

The resulting train_batch.jpg should be this:
0 png

However: when actually training on this single image, this image is the output:
train_batch0

Is this an error in the repository code that handles keypoints, or have I misunderstood the label format?
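
To make the expected positions concrete, the sketch below converts the example label line into pixel coordinates. It assumes that 640*400 means width 640 and height 400, and that x values are normalized by width and y values by height. `label_to_pixels` is a hypothetical helper, not repository code.

```python
# Hypothetical helper: convert a normalized 'class cx cy w h (x y v)*' label
# line to pixel coordinates, assuming x is scaled by width and y by height.
def label_to_pixels(line, img_w, img_h):
    vals = line.split()
    cls = int(vals[0])
    cx, cy, w, h = (float(v) for v in vals[1:5])
    # Box as (left, top, right, bottom) in pixels.
    box = (
        (cx - w / 2) * img_w,
        (cy - h / 2) * img_h,
        (cx + w / 2) * img_w,
        (cy + h / 2) * img_h,
    )
    kpts = []
    rest = vals[5:]
    for i in range(0, len(rest), 3):
        x = float(rest[i]) * img_w
        y = float(rest[i + 1]) * img_h
        kpts.append((x, y, int(rest[i + 2])))
    return cls, box, kpts
```

For the label above and a 640 by 400 image, this gives a box from approximately (160, 260) to (224, 300), with keypoints at approximately (160, 260), (160, 300) and (224, 260). These are three corners of the box, which matches the intended layout.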
