depthai_blazepose's Introduction

Blazepose tracking with DepthAI

Running Google Mediapipe single body pose tracking models on DepthAI hardware (OAK-1, OAK-D, ...).

The Blazepose landmark models available in this repository are the "full", "lite" and "heavy" versions of mediapipe 0.8.6 (2021/07).

The pose detection model comes from mediapipe 0.8.4 and is compatible with the 3 landmark models (the 0.8.6 version currently cannot be converted into a Myriad blob).

For the challenger Movenet on DepthAI, please visit: depthai_movenet

For an OpenVINO version of Blazepose, please visit: openvino_blazepose

Architecture: Host mode vs Edge mode

Two modes are available:

  • Host mode : aside from the neural networks that run on the device, almost all the processing runs on the host (the only processing done on the device is the letterboxing operation before the pose detection network when using the device camera as video source). Use this mode when you want to infer on external input sources (videos, images).
  • Edge mode : most of the processing (neural networks, post-processing, image manipulations) runs on the device thanks to the depthai scripting node feature. It works only with the device camera but is definitely the best option when working with the internal camera (much faster than Host mode). The data exchanged between the host and the device is minimal: the landmarks of the detected body (~3kB/frame). In Edge mode, you can choose not to send the camera video frames to the host (by specifying 'rgb_laconic' as input).
Landmark model (Edge mode)    FPS    FPS with 'xyz' option
Full                          20     18
Lite                          26     22
Heavy                          8      7

Host mode

Edge mode

For depth-capable devices, when measuring the 3D location of a reference point, additional nodes are used that are not represented here (2 mono cameras, stereo node, spatial location calculator).

Note : the Edge mode schema is missing a custom NeuralNetwork node between the ImageManip node on the right and the landmark NeuralNetwork. This custom NeuralNetwork runs a very simple model that normalizes (divides by 255) the output image from the ImageManip node. This is a temporary fix that should be removed once the depthai ImageManip node supports setFrameType(RGBF16F16F16p).

Inferred 3D vs Measured 3D

  • Inferred 3D : the Landmark model is able to infer 3D (x,y,z) landmarks. Actually, the 0.8.6 model yields 2 outputs with 3D landmarks : Identity (accessed via body.landmarks with the API) and Identity_4 (accessed via body.world_landmarks). It may sound like redundant information but there is a difference: world landmarks are real-world 3D coordinates in meters with the origin at the center between the hips. World landmarks share the same landmark topology as landmarks. However, landmarks provide coordinates (in pixels) of a 3D object projected onto the 2D image surface, while world landmarks provide coordinates (in meters) of the 3D object itself.
  • Measured 3D : for devices able to measure depth (like OAK-D), we can determine the real 3D location of a point of the image in the camera coordinate system. So one idea would be to measure the 3D location of each inferred 2D body landmark. It turns out not to be a good idea in practice, for at least 2 reasons. First, an inferred keypoint may stand "outside" of its counterpart in the image and therefore in the aligned depth frame. This probably happens more frequently with extremities, and can be explained by the inaccuracy of the model, which is never 100% perfect. Secondly, we can't get depth for hidden keypoints. An alternative solution, implemented here, is to combine the inferred 3D world landmarks with the measured 3D location of one reference point, the center between the hips (see the sketch below). Compared to extremities, this reference point can be measured more robustly.
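The sketch below is a minimal illustration of this combination; it is not the repository's exact code. The inferred world landmarks, whose origin is the mid-hips point, are simply translated by the measured 3D location of that reference point. The millimeter unit of the depth measurement and the shared axis conventions are assumptions.

import numpy as np

def mixed_3d_landmarks(world_landmarks_m, ref_xyz_mm):
    # world_landmarks_m: (N, 3) inferred world landmarks in meters, origin at mid-hips
    # ref_xyz_mm: (3,) measured location of the reference point in camera coordinates (mm)
    # Returns the landmarks expressed in the camera coordinate system (meters).
    ref_m = np.asarray(ref_xyz_mm) / 1000.0           # assumed: depth reported in millimeters
    return np.asarray(world_landmarks_m) + ref_m      # rigid translation, no rotation applied

# Example with dummy data: a body whose mid-hips point is 1.5 m in front of the camera
world = np.zeros((33, 3))                  # 33 Blazepose landmarks, all at the origin
ref = np.array([100.0, -50.0, 1500.0])
print(mixed_3d_landmarks(world, ref)[0])   # -> [ 0.1  -0.05  1.5 ]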

The image below demonstrates the 3 modes of 3D visualization:

  1. Image mode (top-right), based on body.landmarks. Note that the size of the drawn skeleton depends on the camera-body distance, and that the mid-hips reference point is restricted: it can only move within a plane parallel to the wall grid;
  2. World mode (bottom-left), based on body.world_landmarks. Note that the mid-hips reference point is fixed and the size of the skeleton does not change;
  3. Mixed mode (bottom-right), mixing body.world_landmarks with the measured 3D location of the reference point. Like in World mode, the size of the skeleton does not change, but the mid-hips reference point is no longer restricted.

3D visualizations

Install

Install the python packages (depthai, opencv, open3d) with the following command:

python3 -m pip install -r requirements.txt

Run

Usage:

-> python3 demo.py -h
usage: demo.py [-h] [-e] [-i INPUT] [--pd_m PD_M] [--lm_m LM_M] [-xyz] [-c]
               [--no_smoothing] [-f INTERNAL_FPS]
               [--internal_frame_height INTERNAL_FRAME_HEIGHT] [-s] [-t]
               [--force_detection] [-3 {None,image,mixed,world}]
               [-o OUTPUT]

optional arguments:
  -h, --help            show this help message and exit
  -e, --edge            Use Edge mode (postprocessing runs on the device)

Tracker arguments:
  -i INPUT, --input INPUT
                        'rgb' or 'rgb_laconic' or path to video/image file to
                        use as input (default=rgb)
  --pd_m PD_M           Path to an .blob file for pose detection model
  --lm_m LM_M           Landmark model ('full' or 'lite' or 'heavy') or path
                        to an .blob file
  -xyz, --xyz           Get (x,y,z) coords of reference body keypoint in
                        camera coord system (only for compatible devices)
  -c, --crop            Center crop frames to a square shape before feeding
                        pose detection model
  --no_smoothing        Disable smoothing filter
  -f INTERNAL_FPS, --internal_fps INTERNAL_FPS
                        Fps of internal color camera. Too high value lower NN
                        fps (default= depends on the model)
  --internal_frame_height INTERNAL_FRAME_HEIGHT
                        Internal color camera frame height in pixels
                        (default=640)
  -s, --stats           Print some statistics at exit
  -t, --trace           Print some debug messages
  --force_detection     Force person detection on every frame (never use
                        landmarks from previous frame to determine ROI)

Renderer arguments:
  -3 {None,image,mixed,world}, --show_3d {None,image,mixed,world}
                        Display skeleton in 3d in a separate window. See
                        README for description.
  -o OUTPUT, --output OUTPUT
                        Path to output video file

Examples :

  • To use default internal color camera as input with the model "full" in Host mode:

    python3 demo.py

  • To use default internal color camera as input with the model "full" in Edge mode [preferred]:

    python3 demo.py -e

  • To use a file (video or image) as input :

    python3 demo.py -i filename

  • To use the model "lite" :

    python3 demo.py --lm_m lite

  • To measure body spatial location in the camera coordinate system (only for depth-capable devices like OAK-D):

    python3 demo.py -e -xyz

    The measure is made only on one reference point:
    - the middle of the hips if both hips are visible;
    - the middle of the shoulders if hips are not visible and both shoulders are visible.

  • To show the skeleton in 3D 'world' mode (-xyz flag needed):

    python3 demo.py -e -xyz -3 world

    World mode

    Note that the floor and wall grids do not correspond to a real floor and wall. Each grid square is 1m x 1m.

  • When using the internal camera, to change its FPS to 15 :

    python3 demo.py --internal_fps 15

    Note: by default, the internal camera FPS depends on the model, the mode (Edge vs Host), and the use of depth ("-xyz"). These default values are based on my own observations. Please don't hesitate to play with this parameter to find the optimal value. If you observe that your FPS is well below the default value, lower the FPS with this option until the set FPS is just above the observed FPS.

  • When using the internal camera, you probably don't need to work with the full resolution. You can set a lower resolution (and gain a bit of FPS) by using this option:

    python3 demo.py --internal_frame_height 450

    Note: currently, depthai supports only some possible values for this argument. The value you specify will be replaced by the closest supported value (here 432 instead of 450).

  • By default, temporal filters smooth the landmark positions. Use --no_smoothing to disable the filter (a minimal illustration of such a filter is sketched after this list).
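The sketch below only illustrates the idea of temporal smoothing with a simple exponential moving average; the repository uses its own, more elaborate filter, and the alpha value and call pattern here are assumptions.

import numpy as np

class SimpleLandmarkSmoother:
    # Illustrative only: blends each new set of landmarks with the previous smoothed set.
    def __init__(self, alpha=0.5):
        self.alpha = alpha   # 0 < alpha <= 1; lower values smooth more but add lag
        self.prev = None

    def apply(self, landmarks):
        landmarks = np.asarray(landmarks, dtype=float)
        if self.prev is None:
            self.prev = landmarks
        else:
            self.prev = self.alpha * landmarks + (1 - self.alpha) * self.prev
        return self.prev

smoother = SimpleLandmarkSmoother(alpha=0.5)
# Inside a tracking loop, one would call: smoothed = smoother.apply(body.landmarks)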

Keypress in OpenCV window    Function
Esc       Exit
space     Pause
r         Show/hide the rotated bounding rectangle around the body
l         Show/hide landmarks
s         Show/hide landmark score
f         Show/hide FPS
x         Show/hide (x,y,z) coordinates (only on depth-capable devices and if using the "-xyz" flag)
z         Show/hide the square zone used to measure depth (only on depth-capable devices and if using the "-xyz" flag)

If using a 3D visualization mode ("-3" or "--show_3d"):

Keypress in Open3d window    Function
o                Oscillate (rotate back and forth) the view
r                Rotate the view continuously
s                Stop oscillating or rotating
Up               Increase the rotating or oscillating speed
Down             Decrease the rotating or oscillating speed
Right or Left    Change the point of view to a predefined position
Mouse            Freely change the point of view

Mediapipe models

You can directly find the model files (.xml and .bin) under the 'models' directory. Below I describe how to get the files in case you need to regenerate the models.

  1. Clone this github repository in a local directory (DEST_DIR)

  2. In DEST_DIR/models directory, download the tflite models from this archive. The archive contains:

    • Pose detection model from Mediapipe 0.8.4,
    • Full, Lite and Heavy pose landmark models from Mediapipe 0.8.6.

    Note: the Pose detection model from Mediapipe 0.8.6 can't currently be converted (more info here).

  3. Install PINTO's amazing tflite2tensorflow tool. Use the docker installation, which includes many packages, including a recent version of OpenVINO.

  4. From DEST_DIR, run the tflite2tensorflow container: ./docker_tflite2tensorflow.sh

  5. From the running container:

cd workdir/models
./convert_models.sh

The convert_models.sh script converts the tflite models to TensorFlow (.pb), then converts the pb files into OpenVINO IR format (.xml and .bin), and finally converts the IR files into MyriadX format (.blob).

  6. By default, the number of SHAVES associated with the blob files is 4. In case you want to generate new blobs with a different number of shaves, you can use the script gen_blob_shave.sh:
# Example: to generate blobs for 6 shaves
./gen_blob_shave.sh -m pd -n 6     # will generate pose_detection_sh6.blob
./gen_blob_shave.sh -m full -n 6   # will generate pose_landmark_full_sh6.blob

Explanation of the Model Optimizer params:

  • The preview of the OAK-* color camera outputs BGR [0, 255] frames. The original tflite pose detection model expects RGB [-1, 1] frames. --reverse_input_channels converts BGR to RGB. --mean_values [127.5,127.5,127.5] --scale_values [127.5,127.5,127.5] normalizes the frames to [-1, 1].
  • The original landmark model expects RGB [0, 1] frames. Therefore --reverse_input_channels is used here as well, but unlike for the detection model, we choose to do the normalization in the Python code rather than in the model (which would use --scale_values). Indeed, we have observed better accuracy with FP16 models when normalizing the inputs outside of the models (a possible explanation). The sketch below illustrates both normalization schemes.
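As a standalone numeric check (not code from the repository) of the two normalization schemes just described:

import numpy as np

bgr_frame = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)

# Pose detection model: normalization baked into the blob via
# --mean_values [127.5,...] --scale_values [127.5,...]  ->  values in [-1, 1]
detection_input = (bgr_frame.astype(np.float32) - 127.5) / 127.5
assert detection_input.min() >= -1.0 and detection_input.max() <= 1.0

# Landmark model: normalization done outside the model as a division by 255
# (in the Python code in Host mode, by the DivideBy255 custom model in Edge mode)
# ->  values in [0, 1]
landmark_input = bgr_frame.astype(np.float32) / 255.0
assert landmark_input.min() >= 0.0 and landmark_input.max() <= 1.0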

Custom models

The custom_models directory contains the code to build the following custom models:

  • DetectionBestCandidate: this model processes the outputs of the pose detection network (a 1x2254x1 tensor for the scores and a 1x2254x12 tensor for the regressors) and yields the regressor with the highest score (a host-side sketch of this logic is shown below).
  • DivideBy255: this model transforms a 256x256 RGB888p ([0, 255]) image into a 256x256 RGBF16F16F16p image ([0., 1.]).
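As an illustration only, here is a host-side numpy equivalent of what DetectionBestCandidate computes; the actual on-device model is built differently, and the sigmoid on the raw scores mirrors the one used in mediapipe_utils.py:

import numpy as np

def best_candidate(scores, regressors):
    # scores: (1, 2254, 1) raw score logits; regressors: (1, 2254, 12)
    scores = 1 / (1 + np.exp(-np.asarray(scores).reshape(-1)))   # sigmoid
    best = int(np.argmax(scores))
    return scores[best], np.asarray(regressors).reshape(-1, 12)[best]

# Dummy call with random tensors of the expected shapes:
score, regressor = best_candidate(np.random.randn(1, 2254, 1), np.random.randn(1, 2254, 12))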

The method used to build these models is well explained in rahulrav's blog.

Landmarks

Source

Code

To facilitate reusability, the code is split into 2 classes:

  • BlazeposeDepthai, which is responsible for computing the body landmarks. The import of this class depends on the mode:
# For Host mode:
from BlazeposeDepthai import BlazeposeDepthai
# For Edge mode:
from BlazeposeDepthaiEdge import BlazeposeDepthai
  • BlazeposeRenderer, which is responsible for rendering the landmarks and the skeleton on the video frame.

This way, you can replace the renderer from this repository with your own personalized renderer (for some projects, you may not even need a renderer).

The file demo.py is a representative example of how to use these classes:

from BlazeposeDepthaiEdge import BlazeposeDepthai
from BlazeposeRenderer import BlazeposeRenderer

# The argparse stuff has been removed to keep only the important code

tracker = BlazeposeDepthai(input_src=args.input, 
            pd_model=args.pd_m,
            lm_model=args.lm_m,
            smoothing=not args.no_smoothing,   
            xyz=args.xyz,           
            crop=args.crop,
            internal_fps=args.internal_fps,
            internal_frame_height=args.internal_frame_height,
            force_detection=args.force_detection,
            stats=args.stats,
            trace=args.trace)   

renderer = BlazeposeRenderer(
                tracker, 
                show_3d=args.show_3d, 
                output=args.output)

while True:
    # Run blazepose on next frame
    frame, body = tracker.next_frame()
    if frame is None: break
    # Draw 2d skeleton
    frame = renderer.draw(frame, body)
    key = renderer.waitKey(delay=1)
    if key == 27 or key == ord('q'):
        break
renderer.exit()
tracker.exit()

For more information on:

  • the arguments of the tracker, please refer to the docstring of class BlazeposeDepthai or BlazeposeDepthaiEdge in BlazeposeDepthai.py or BlazeposeDepthaiEdge.py;
  • the attributes of the 'body' element you can exploit in your program, please refer to the docstring of class Body in mediapipe_utils.py.
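For instance, inside the tracking loop of the demo above, one could access the attributes that appear in this README as follows; this is only a fragment, the shape/meaning comments are assumptions, and the Body docstring remains the authoritative reference:

    frame, body = tracker.next_frame()
    if body is not None:
        print(body.landmarks.shape)        # landmarks projected onto the image (pixels)
        print(body.world_landmarks.shape)  # real-world 3D landmarks (meters, origin at mid-hips)
        print(body.visibility[0])          # per-landmark visibility score
        print(body.presence[0])            # per-landmark presence score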

Examples

Semaphore alphabet

Credits

depthai_blazepose's People

Contributors

geaxgx, ibaigorordo, spmallick, szabolcsgergely


depthai_blazepose's Issues

Multiple errors running "python demo.py --xyz" with OAK-D camera

Hello @geaxgx and thank you so much for this terrific software! I have been able to get depthai_blazepose to work in many modes with my OAK-D camera. For example when I run python demo.py --edge --lm_m='lite' I see 26 FPS which is awesome!

However, I really want to be able to see the xyz coordinates, and when I run python demo.py --xyz I get several errors (see below), including a RuntimeWarning and a TypeError, and the program quits. Am I doing something wrong?

Pose detection blob file : /Volumes/GoogleDrive-102536951206467607597/My Drive/Projects/Punchout/SkeletalTracker/depthai_blazepose/models/pose_detection_sh4.blob
Landmarks using blob file : /Volumes/GoogleDrive-102536951206467607597/My Drive/Projects/Punchout/SkeletalTracker/depthai_blazepose/models/pose_landmark_full_sh4.blob
Internal camera FPS set to: 13
Sensor resolution: (1920, 1080)
Internal camera image size: 1152 x 648 - crop_w:0 pad_h: 252
2254 anchors have been created
Creating pipeline...
Creating Color Camera...
Creating Pose Detection pre processing image manip...
Creating Pose Detection Neural Network...
Creating Landmark Neural Network...
Pipeline created.
[14442C1001FA60D700] [61.040] [NeuralNetwork(12)] [warning] Number of inference threads assigned for network is 1, assigning 2 will likely yield in better performance
[14442C1001FA60D700] [61.055] [NeuralNetwork(12)] [warning] The issued warnings are orientative, based on optimal settings for a single network, if multiple networks are running in parallel the optimal settings may vary
Pipeline started - USB speed: SUPER
/Volumes/GoogleDrive-102536951206467607597/My Drive/Projects/Punchout/SkeletalTracker/depthai_blazepose/mediapipe_utils.py:246: RuntimeWarning: overflow encountered in exp
scores = 1 / (1 + np.exp(-scores))
Traceback (most recent call last):
File "/Volumes/GoogleDrive-102536951206467607597/My Drive/Projects/Punchout/SkeletalTracker/depthai_blazepose/demo.py", line 68, in
frame = renderer.draw(frame, body)
File "/Volumes/GoogleDrive-102536951206467607597/My Drive/Projects/Punchout/SkeletalTracker/depthai_blazepose/BlazeposeRenderer.py", line 158, in draw
self.draw_landmarks(body)
File "/Volumes/GoogleDrive-102536951206467607597/My Drive/Projects/Punchout/SkeletalTracker/depthai_blazepose/BlazeposeRenderer.py", line 114, in draw_landmarks
cv2.rectangle(self.frame, body.xyz_zone[0:2], body.xyz_zone[2:4], (180,0,180), 2)
TypeError: Argument 'thickness' is required to be an integer

Error in the edge mode

Hi,
first of all thank you for your great work!

I got a problem that I'm not able to explain. I'm working with an OAK-D-PRO FF.

If I use the edge mode specifying also the xyz flag everything is fine, but if I just launch python3 demo.py -e (without -xyz) I always get this error:

[18443010E1A6AB0F00] [3.6] [7.446] [system] [critical] Fatal error. Please report to developers. Log: 'ResourceLocker' '358'
Traceback (most recent call last):
File "/home/user/Development/body/depthai_blazepose/depthai_blazepose/demo.py", line 65, in
frame, body = tracker.next_frame()
File "/home/user/Development/body/depthai_blazepose/depthai_blazepose/BlazeposeDepthaiEdge.py", line 486, in next_frame
res = marshal.loads(self.q_manager_out.get().getData())
RuntimeError: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'manager_out' (X_LINK_ERROR)'

Do you have any idea?

Thank you!

align the person

The paper says: "we align the person so that the point between the hips is located at the center of the square image passed as the neural network input". Can you tell me where this part of the code is?

AttributeError: 'depthai.node.Script' object has no attribute 'setScriptData'

I found a program using blazepose: https://github.com/Atzingen/QuickiumGymI
I downloaded the requirements as requirements.yml specifies, and when I try to run quickium_solution.py, this happens:
Pose detection blob file : /home/ycj/QuickiumGym-main/scores/blazepose/models/pose_detection_sh4.blob
Landmarks using blob file : /home/ycj/QuickiumGym-main/scores/blazepose/models/pose_landmark_full_sh4.blob
Internal camera image size: 576 x 324 - pad_h: 126
Creating pipeline...
Creating Color Camera...
Traceback (most recent call last):
File "quickium_solution.py", line 105, in
pose = BlazeposeDepthai()
File "/home/ycj/QuickiumGym-main/scores/blazepose/BlazeposeDepthaiEdge.py", line 98, in init
self.device = dai.Device(self.create_pipeline())
File "/home/ycj/QuickiumGym-main/scores/blazepose/BlazeposeDepthaiEdge.py", line 143, in create_pipeline
manager_script.setScriptData(self.build_manager_script())
AttributeError: 'depthai.node.Script' object has no attribute 'setScriptData'
(screenshot: 2022-10-08 20-13-36)
What could be the problem? Should I replace the whole "blazepose" folder in that program with yours? Thank you very much!

run as standalone network device on OAK-D Pro POE

Hello.

I know you might not have a POE device, but I was wondering if you know how I can make your code run without the need for a terminal window being open.

Right now I cd to my project repo and call python3.8 demo_osc.py -e --oscIP 10.100.0.101 --oscPort 12345 -xyz
I have already modified your code to not show a live video, nor any window other than the terminal window.
I added code to send the skeleton data over UDP / OSC.

Thanks for your advice.

How are Y coordinates actually meant to be interpreted?

In the BlazePoseRenderer.py file, there is a curious comment block:

Beware, the y value of landmarks_world coordinates is negative for landmarks 
above the mid hips (like shoulders) and negative for landmarks below (like feet).
The y value of (x,y,z) coordinates given by depth sensor is negative in the lower part
 of the image and positive in the upper part.

Is there in fact a mistake in the first sentence?

Surely it is not possible for the Y values to be negative BOTH for landmarks above AND below the mid hips?

Should the sentence not read something like (change in bold):

the y value of landmarks_world coordinates is negative for landmarks above the mid hips (like shoulders) and positive for landmarks below (like feet)

This is important, because I'm trying to translate the landmark points to absolute positions in the same way in my own project, and this "explanation" made no sense to me.

how to specify IP address of OAK-D Pro

I know this might be a question for the depthai github account but their examples do not seem to use your code structure.

I have multiple OAK-D pros connected to the same network, each with their own fixed IP address.
I am hoping to define in the code which device to connect to.

But doing this does not work:

        device_info = dai.DeviceInfo("10.100.0.21")
        self.device = dai.Device(device_info)

https://github.com/geaxgx/depthai_blazepose/blob/main/BlazeposeDepthaiEdge.py#L104

But in their example they are using a structure that does not seem to match the way you write your code:

device_info = dai.DeviceInfo("169.254.1.222")
# device_info = depthai.DeviceInfo("14442C108144F1D000") # MXID
# device_info = depthai.DeviceInfo("3.3.3") # USB port name

with dai.Device(pipeline, device_info) as device:
    qRgb = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
    while True:
        cv2.imshow("rgb", qRgb.get().getCvFrame())
        if cv2.waitKey(1) == ord('q'):
            break
https://docs.luxonis.com/projects/hardware/en/latest/pages/guides/getting-started-with-poe.html#manually-specify-device-ip

Thanks again for any advice.

License

Hi!

I'm from the ReDrawing team for the OpenCV AI 2021 competition, and we used a part of this repository's code. We noticed that the repository doesn't have any license, so we can't derive from it, technically not even run it.

If this is not on purpose, could you please add a license? We also used your
depthai_hand_tracker project. Here is some help about licenses.

Our project repository: link

Input for landmark regression too small

I'm trying to implement a version of the DepthAI Blazepose project integrated with Yolo in order to pass multiple crops.
The problem is that if I walk 2 meters away from the camera, the body score is 0 or very low.
I've noticed that the frame for the landmark regression is very small.
image

And a lot of times the cropped frame with warping is hidden in the corner like this:
image

I'm referring to that piece of code:

frame_nn = mpu.warp_rect_img(body.rect_points, square_frame, lm_input_length, lm_input_length)
# print(f"Body rect points: {body.rect_points} - Dim cropped frame: {cropped_frame.shape}")

cv2.imshow("Input landmark", frame_nn)

# cv2.imwrite("debug_pre_lm_yolo.png", frame_nn)

frame_nn = frame_nn / 255.

# cv2.imshow("landmark input", frame_nn)
nn_data = dai.NNData()
nn_data.setSequenceNum(pose_detection.getSequenceNum())
nn_data.setTimestamp(body_time)
nn_data.setLayer("input_1", to_planar(frame_nn, (lm_input_length, lm_input_length)))
q_landmark_in.send(nn_data)

How can this problem be solved?

How to replace ColorCamera with XLinkIn node?

I want to replace the ColorCamera with an XLinkIn node.
image

I know that it will not be on edge, but I'm trying to use this pipeline since it is optimized and works at a high FPS rate.

I've tried to send data through the XLinkIn node in the following way:

frame_time = time.monotonic()
frame_nn = dai.ImgFrame()
frame_nn.setTimestamp(frame_time)
frame_nn.setData(to_planar(video_frame, (video_frame.shape[1], video_frame.shape[0])))
frame_nn.setData(video_frame)
self.q_image_in.send(frame_nn)       

But it gives me this error:
[14442C10310D57D700] [1.4] [180.825] [system] [critical] Fatal error. Please report to developers. Log: 'Fatal error on MSS CPU: trap: 2A, address: 8009C784' '0'

Question

Hi, I just have a question about the two parameters: lm_input_length & pd_input_length
You have assigned static values to them, but what do they mean? Because I saw that later you used those values to divide the landmarks returned by the camera.

Thanks in advance

About the support for depthai==2.7.1.0

I'm working on a logic fix to make it compatible with the latest depthai package (2.7.1.0). I'd like to issue a pull request to your repository if I can. πŸ˜ƒ
PINTO0309@d5cd3f9

I've already got everything working properly except Edge-only, but I'm wondering how to fix the following part. If you know of a way to do this, could you please let me know?

# Define manager script node
manager_script = pipeline.create(dai.node.Script)
manager_script.setScriptData(self.build_manager_script())

The reason why I want to fix this is because another engineer reported an issue that the latest OAK-D does not work properly with depthai==2.3.0. I feel that I want everyone in the world to enjoy your wonderful repository.

I have successfully gotten it to work with 2.7.1.0 if it is not FullEdge.
ezgif com-gif-maker (15)

Wrong 3d hand pose

Hi, this is really great work. However, when I am using your code, I notice that the overall pose estimation is excellent, but the estimated position of the thumbs is not so precise, especially when the palms are upwards. Actually, the palms are still backwards in the 3D pose estimation while in reality they are upwards. It seems that the model cannot detect the real position of the pinky, index and thumb. Have you solved this problem?

Question about getting spatial data of each landmark

(I asked on the depthai forums as well, but was also curious about your take!)

I am trying to use Oak-D to get spatial data of the pose landmarks. For example, XYZ data of all the points that mark the joints.

I think the way Oak-D works for spatial detection on objects is that it draws a box around an object, and then averages out all of the xyz data of the points in the box from a depth map (created by using disparity matching), is this correct?

If so, what happens when a box is drawn around a person, and then some of the points in the box are objects that are further behind the person (such as the wall)?

In the case of getting spatial data for the body landmarks, do you think it's wiser to draw boxes around the landmarks and then average out the depth data? Or just get the depth data precisely at the landmark points?

Thanks in advance,
Jae

--show_3d might not be working on Raspberry Pi?

Hi, sorry to bother you again.
Thank you for your previous help and quick replies; --show_3d is now working amazingly on my Mac and I am very happy about it.
Now I tried to run it on my Raspberry Pi 4B. I have opencv and all the depthai dependencies installed. I also installed your requirements.txt file.

However when running the BlazeposeDepthai.py script with -3 / --show_3d I get the following error:

python3 BlazeposeDepthai.py -3
/home/pi/.virtualenvs/cv-installatie/lib/python3.7/site-packages/open3d/__init__.py:21: UserWarning: failed to import module _transformations
  warnings.warn('failed to import module %s' % name)
896 anchors have been created
Traceback (most recent call last):
  File "BlazeposeDepthai.py", line 610, in <module>
    internal_fps=args.internal_fps)
  File "BlazeposeDepthai.py", line 166, in __init__
    self.vis3d = o3d.visualization.Visualizer()
AttributeError: module 'open3d' has no attribute 'visualization'

If I run it without that flag it works fine. Is this not possible on a Raspberry Pi, or am I missing something?
Thank you for your time and this great Repo!

BlazeposeDepthai.py -3 throws GLFW Error

All the requirements seemed to have installed OK, and running the basic python3 BlazeposeDepthai.py appears to work, although my OAK-D is at present not mounted where I can get far enough away for a full-length body picture (I'll need a USB3 extension cable); I do see some segments being tracked.

But running python3 BlazeposeDepthai.py -3 I get this error:

~/depthai_blazepose-main$ python3 BlazeposeDepthai.py -3
896 anchors have been created
[Open3D WARNING] GLFW Error: GLX: Failed to create context: BadValue (integer parameter out of range for operation)
[Open3D WARNING] Failed to create window
Traceback (most recent call last):
  File "BlazeposeDepthai.py", line 591, in <module>
    ht = BlazeposeDepthai(input_src=args.input,
  File "BlazeposeDepthai.py", line 165, in __init__
    opt.background_color = np.asarray([0, 0, 0])
AttributeError: 'NoneType' object has no attribute 'background_color'
Suggestions?
I've zero experience with the open-3d python package. I'm running on Ubuntu 20.04 with all updates as of a few hours ago.

"No available devices" on Windows

I was trying to run BlazeposeDepthai.py either with a webcam or a filename on Windows, but each time I get an error.
Can you help?

Creating pipeline...
Creating Pose Detection Neural Network...
Creating Landmark Neural Network...
Pipeline created.
Traceback (most recent call last):
  File "D:/Users/Downloads/depthai_blazepose-main/depthai_blazepose-main/BlazeposeDepthai.py", line 604, in <module>
    ht.run()
  File "D:/Users/Downloads/depthai_blazepose-main/depthai_blazepose-main/BlazeposeDepthai.py", line 408, in run
    device = dai.Device(self.create_pipeline())
RuntimeError: No available devices

3d absolute position of any keypoint

Hi, how would I go about determining the absolute 3D location of a keypoint besides the reference point? Right now, I'm able to find the 3D position of the midpoint between my hips or shoulders, but I haven't been able to figure out a way to do this for, say, my right hand, or any other arbitrary body part. For my project, I'm able to assume certain keypoints such as the hand are in frame, so I'm not sure if the points mentioned in the readme would still be an issue.

Question about shift_x = 0 and shift_y = 0 in mediapipe_utils.py

Hi,

I've been exploring your custom BlazePose implementation and have a question on L398 and L399 in

A couple of lines down on L408 and L409 (in the case of rotation != 0) the following assignments are given:

x_shift = (w * width * shift_x * cos(rotation) - h * height * shift_y * sin(rotation)) 
y_shift = (w * width * shift_x * sin(rotation) + h * height * shift_y * cos(rotation)) 

These 0 values result in x_shift and y_shift being zero as well.

Here are (I believe) the defaults set by mediapipe, https://github.com/google/mediapipe/blob/ecb5b5f44ab23ea620ef97a479407c699e424aa7/mediapipe/calculators/util/rect_transformation_calculator.proto - 5 and 6 seems a bit high. Does this mean that no shift is necessary in the BlazePose implementation of DepthAI?

does each landmark have a visibility or confidence value?

Hello.
Thanks for making this great project.

I got it all running and am now trying to iterate through all landmarks in order to send them over the network via OSC.
I am able to send xyz:

 def send_pose(self, client: udp_client, body):
              
        if body is None:
            client.send_message(OSC_ADDRESS, 0)
            return

        # create message and send
        builder = OscMessageBuilder(address=OSC_ADDRESS)
        builder.add_arg(1)
        for i, oneLM in enumerate(body.landmarks):
            if self.is_present(body, i):
#                print ("lm",i, "x", oneLM[0], "y", oneLM[1], "z", oneLM[2])
                builder.add_arg(i, arg_type='i')
                builder.add_arg(oneLM[0], arg_type='i')
                builder.add_arg(oneLM[1], arg_type='i')
                builder.add_arg(oneLM[2], arg_type='i')
                # visibility
                # builder.add_arg(x_y_z_v[3])
                
        msg = builder.build()
        client.send(msg)

But I was hoping to also include the visibility score for each landmark.

In your code I see

        body.visibility = 1 / (1 + np.exp(-lm_raw[:,3]))
        body.presence = 1 / (1 + np.exp(-lm_raw[:,4]))

But I think this only refers to the full body, not each landmark?

if I do

for i, oneBody in enumerate(body.landmarks):

I get this error TypeError: 'Body' object is not iterable

thanks for any advice.

using IR LED and mono cameras for skeleton detection

I am able to display the right mono camera and can see the IR LED is illuminating the scene.
But blazepose seems to run on the RGB camera.
I would like to use the camera in a dark environment and would like to learn how to let blazepose use the mono camera, which can see in the dark thanks to the IR LED.

Screen Shot 2022-08-13 at 11 16 54 AM

Screen Shot 2022-08-13 at 11 17 06 AM

problem on landmarks shift

Hi, I have this problem when I run your demo. The following picture shows that when I run the demo, the landmarks have a large shift from where they ought to be.
blazepose
However, if I stream the image to the host and use mediapipe solution directly, the result is good.
pose on host

I wonder if this problem is related to the blob file generated from tflite? How can I solve this problem?
Thanks.

OAK-D Pro POE

Your current example uses cam = pipeline.create(dai.node.ColorCamera)
I wonder if you can help me take advantage of the IR laser projector to improve the depth readings.

I know Blazepose uses 2D images to infer the depth. I am hoping to use the OAK-D Pro POE in low light. This means I should at least use the IR LED to illuminate the scene, and even better, use the IR laser dot projector to sample the depths.

Thanks,
Stephan.

Depth value in script

Hi, again :x,

I would like to know if there's an easy way to modify your script in order to replace the z value generated by the model with the z value perceived by the stereo capture. At the moment, I've implemented a queue with the stereo output, but the number of requests on the pipeline for the depth slowed the retrieval of the body data.

RuntimeError: No available devices

Hi, I'm trying to run and test your repository, but every time I launch python3 demo.py -i path/to/video.mp4, I get the following error:

Pose detection blob file : depthai_blazepose/models/pose_detection_sh4.blob
Landmarks using blob file : depthai_blazepose/models/pose_landmark_full_sh4.blob
Video FPS: 30
Original frame size: 852x480
Padding on height : 186
Frame working size: 852x480
2254 anchors have been created
Creating pipeline...
Creating Pose Detection Neural Network...
Creating Landmark Neural Network...
Pipeline created.
Traceback (most recent call last):
File "demo.py", line 46, in
pose = BlazeposeDepthai(input_src=args.input,
File "depthai_blazepose/BlazeposeDepthai.py", line 175, in init
self.device = dai.Device(self.create_pipeline())
RuntimeError: No available devices

Can you explain to me why this error appears?
Thank you,
Bests

Problem with Spatial Location

Hi, I come back to you for a simple question. I'm trying to add to your library the possibility to have the approximate position in millimeters of a specific point as proposed here https://docs.luxonis.com/projects/api/en/latest/samples/SpatialDetection/spatial_location_calculator/

But when I try it, I get unrealistic values for x and y. As I understand it, the values are calculated relative to the center of the frame. But I get values that are too similar and I don't know how to troubleshoot it. Have you done something special to make it work for the center of the hips or shoulders?

I'm trying with this two point :
image

For each point, their real positions in the frame are: [1149, 842] and [1256, 405]

And my spatial locator gives me the respective positions:
[-252.84371948242188, -120.76118469238281, 1234.7139892578125] and [-255.68605041503906, -121.71033477783203, 1234.7718505859375]

For the z position it seems to be legitimate, because the farther away from the camera I go, the bigger the distance is.

I don't know if you will be able to help me

'Couldn't read data from stream: 'manager_out' (X_LINK_ERROR)' on OAK-D S2, M1 Mac

Getting the following error on my M1 Mac with the OAK-D S2 camera:

(venv) ~/Luxonis/depthai_blazepose βŽ‡ (main)
βœ— ➜  python3 demo.py -e
Pose detection blob file : /Users/liuyue/Luxonis/depthai_blazepose/models/pose_detection_sh4.blob
Landmarks using blob file : /Users/liuyue/Luxonis/depthai_blazepose/models/pose_landmark_full_sh4.blob
Internal camera FPS set to: 20
Internal camera image size: 1152 x 648 - pad_h: 252
Creating pipeline...
Creating Color Camera...
Creating Pose Detection pre processing image manip...
Creating Pose Detection Neural Network...
Creating Pose Detection post processing Neural Network...
Creating Landmark pre processing image manip...
Creating DiveideBy255 Neural Network...
Creating Landmark Neural Network...
Pipeline created.
[19443010C1351C1300] [1.1.4] [1.199] [NeuralNetwork(5)] [warning] Network compiled for 1 shaves, maximum available 13, compiling for 6 shaves likely will yield in better performance
[19443010C1351C1300] [1.1.4] [1.200] [NeuralNetwork(9)] [warning] Network compiled for 4 shaves, maximum available 13, compiling for 6 shaves likely will yield in better performance
[19443010C1351C1300] [1.1.4] [1.433] [NeuralNetwork(4)] [warning] Network compiled for 4 shaves, maximum available 13, compiling for 6 shaves likely will yield in better performance
[19443010C1351C1300] [1.1.4] [1.446] [NeuralNetwork(5)] [warning] The issued warnings are orientative, based on optimal settings for a single network, if multiple networks are running in parallel the optimal settings may vary
Pipeline started - USB speed: SUPER
[19443010C1351C1300] [1.1.4] [1.446] [NeuralNetwork(9)] [warning] The issued warnings are orientative, based on optimal settings for a single network, if multiple networks are running in parallel the optimal settings may vary
[19443010C1351C1300] [1.1.4] [1.446] [NeuralNetwork(4)] [warning] The issued warnings are orientative, based on optimal settings for a single network, if multiple networks are running in parallel the optimal settings may vary
[19443010C1351C1300] [1.1.4] [7.415] [system] [critical] Fatal error. Please report to developers. Log: 'ResourceLocker' '358'
libc++abi: terminating due to uncaught exception of type std::runtime_error: Communication exception - possible device error/misconfiguration. Original message 'Couldn't read data from stream: 'manager_out' (X_LINK_ERROR)'
Stack trace (most recent call last):
#10   Object "libc++.1.dylib", at 0x180d313ef, in std::rethrow_exception(std::exception_ptr) + 23
#9    Object "libc++abi.dylib", at 0x180dbfeeb, in std::terminate() + 55
#8    Object "libc++abi.dylib", at 0x180dbff47, in std::__terminate(void (*)()) + 15
#7    Object "libobjc.A.dylib", at 0x180a8703b, in _objc_terminate() + 159
#6    Object "libc++abi.dylib", at 0x180db03b3, in demangling_terminate_handler() + 319
#5    Object "libc++abi.dylib", at 0x180dc0b83, in abort_message + 131
#4    Object "libsystem_c.dylib", at 0x180d15ae7, in abort + 179
#3    Object "libsystem_pthread.dylib", at 0x180e07c27, in pthread_kill + 287
#2    Object "libsystem_platform.dylib", at 0x180e36a83, in _sigtramp + 55
#1    Object "depthai.cpython-310-darwin.so", at 0x289cff467, in backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 19
#0    Object "depthai.cpython-310-darwin.so", at 0x289cff4bf, in backward::SignalHandling::sig_handler(int, __siginfo*, void*) + 107
[1]    75155 abort      python3 demo.py -e

Does this repository support Apple Silicon macs?

[ImageManip(7)] [error] Invalid configuration or input image -skipping frame

My camera has this problem after running for about 5 hours, and the program gets stuck, even though the program has not been changed. I suspect the reason is that the temperature of my camera is too high, but this error occurs when no one has been in view for a long time and then someone suddenly appears, at which point it starts to jam. Have you ever encountered this situation? I hope you can reply to me as soon as possible. Thank you.

'depthai.Pipeline' object has no attribute 'create'

Hello, I receive the error "'depthai.Pipeline' object has no attribute 'create'". Any ideas where the problem is? Thank you for your answer!

Pose detection blob file : C:\Users\Admin\depthai_blazepose-main\models\pose_detection_sh4.blob
Landmarks using blob file : C:\Users\Admin\depthai_blazepose-main\models\pose_landmark_full_sh4.blob
Internal camera FPS set to: 8
Sensor resolution: (1920, 1080)
Internal camera image size: 1792 x 1008 - crop_w:0 pad_h: 392
2254 anchors have been created
Creating pipeline...
Creating Color Camera...
Creating Pose Detection pre processing image manip...
Traceback (most recent call last):
File "C:\Users\Admin\depthai_blazepose-main\examples\semaphore_alphabet\demo.py", line 51, in
pose = BlazeposeDepthai(input_src=args.input, lm_model=args.model)
File "../..\BlazeposeDepthai.py", line 218, in init
self.device.startPipeline(self.create_pipeline())
File "../..\BlazeposeDepthai.py", line 287, in create_pipeline
pre_pd_manip = pipeline.create(dai.node.ImageManip)
AttributeError: 'depthai.Pipeline' object has no attribute 'create'
[Finished in 3.5s with exit code 1]
[shell_cmd: python -u "C:\Users\Admin\depthai_blazepose-main\examples\semaphore_alphabet\demo.py"]
[dir: C:\Users\Admin\depthai_blazepose-main\examples\semaphore_alphabet]

Pose estimation for multiple people

Hi, I'm using this repository to test the pose estimation on an OAK-IOT device for the edge-programming and not host-programming. It works perfectly for one person but it can't detect other people. Is there any setting or way to do this?

glitchiness

Hello,

So I am trying to run this project on my Oak-D camera; my main goal is to extract data for all joints/landmarks in 3D. However, I am experiencing a drop in FPS and glitchiness whenever a skeleton appears. I don't think it is a processing power issue.

Device not recognized for OAK-D W OV9782

Script returns error:
self.device = dai.Device()
RuntimeError: Failed to find device after booting, error message: X_LINK_DEVICE_NOT_FOUND

However, the same code works on OAK-D W with IMX378 sensor.

I wonder if it's related to a setup script.

--show_3d isn't working

Hi thank you so much for sharing this amazing demo!

Maybe it's something I am doing wrong, but I can't seem to get --show_3d to work.
When I run it, it gives this error:
Traceback (most recent call last):
  File "/Users/corlangerak/depthai_blazepose/BlazeposeDepthai.py", line 626, in <module>
    ht.run()
  File "/Users/corlangerak/depthai_blazepose/BlazeposeDepthai.py", line 412, in run
    device = dai.Device(self.create_pipeline())
  File "/Users/corlangerak/depthai_blazepose/BlazeposeDepthai.py", line 213, in create_pipeline
    pd_nn.setBlobPath(str(Path(self.pd_path).resolve().absolute()))
RuntimeError: NeuralNetwork node | Blob at path: /Library/Frameworks/Python.framework/Versions/3.8/Resources/Python.app/Contents/Resources/models/pose_detection.blob doesn't exist
Am I missing something? The path is actually defined in the code, so I do not really understand.
Thanks again!

Question for xyz mode

Hi, I'm trying to include your library in my Bachelor project, but as I'm trying to understand it correctly, a question came to me.

When the xyz mode is enabled, it creates the stereo depth capture by linking the right and left captures, and as I can see it directly gives us the correct measure for the center of the hips or the shoulders. But when I print the landmarks, I can see that there are also 3 output values per landmark, even if the xyz mode is disabled. Is this tangible data for the landmark representation in 3D in pixels, even if xyz is disabled?

Thanks in advance for your answer

Renderer Argument -3 World Attribute Error

On a Raspberry Pi 4B with Ubuntu 22.04: Running python3 demo.py -e -xyz -3 world generates:
File "/home/ubuntu/depthai_blazepose/o3d_utils.py", line 119, in __init__
  opt.background_color = np.asarray(bg_color)
AttributeError: 'NoneType' object has no attribute 'background_color'

BTW: Running python3 demo.py -e -xyz without error required downgrading depthai to version 2.20.2 from 2.21.2.
