wolterlw / hand_tracking
Minimal Python interface for Google's Mediapipe HandTracking pipeline
License: Apache License 2.0
@ahmed-shariff added support for 3D poses on this code base:
ahmed-shariff@310fb95
However, using hand_tracker.py like this:
detector = HandTracker(palm_model_path, landmark_model_path, anchors_path, box_shift=0.2, box_enlarge=1.3)
d = detector(img)
and printing d[0], I am still getting the 2D estimation.
@ahmed-shariff, would you mind clarifying?
[[457.47368348, 359.82393988], [429.87579848, 298.62916644], [379.11063081, 253.75389192], [317.62233118, 238.91130884], [263.22237604, 230.28409888], [350.34224016, 210.99346312], [283.266783 , 173.77211398], [237.94263826, 172.72647489], [213.60732161, 182.52631797], [342.0273813 , 235.3966395 ], [272.1424833 , 220.17817074], [233.09361395, 237.22693846], [223.68429643, 260.37553656], [342.1249241 , 265.83757083], [281.07808576, 271.40460841], [262.76236076, 303.5830144 ], [266.86327714, 327.15547596], [348.08264754, 291.63407504], [320.50589946, 311.77354425], [316.85192093, 341.60134386], [323.00332217, 360.62679611]]
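One way to check which landmark model is actually loaded is to look at the shape of the returned keypoints; the array printed above has two columns per point, so the 2D model appears to be in use. A minimal sketch (`landmark_dims` is a hypothetical helper; the 21×2 vs. 21×3 output shapes are an assumption based on the array above):

```python
import numpy as np

def landmark_dims(joints):
    """Return 2 or 3 depending on whether the loaded landmark model
    produced 2D or 3D keypoints. `joints` is the array printed as d[0]."""
    joints = np.asarray(joints)
    assert joints.shape[0] == 21, "expected 21 hand keypoints"
    return joints.shape[1]
```

If this returns 2 even with hand_landmark_3d.tflite in place, the 3D model file is probably not the one being loaded.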
When I import the models, the error below is shown:
ValueError: Didn't find custom op for name 'Convolution2DTransposeBias' with version 1
Registration failed
So I want to know which versions of the three packages are required.
Thanks!
Can you clarify the first two steps?
Now we need to change the BUILD file. Is the BUILD file inside tensorflow/tensorflow?
Regarding "now we are inside the tensorflow repo, you should remove the @org_tensorflow from dependency paths": can you provide more information on where I need to remove @org_tensorflow?
I am getting an error:
fatal error: mediapipe/util/tflite/operations/transpose_conv_bias.h: No such file or directory
Hi,
I have a dataset of sequences of images (containing right/left hands), and I want to use mediapipe to extract the hand regions from these images (both hands, or simply the right hand) and then save them to a destination folder.
I used the hand_tracker script to extract the hands. However, I've noticed that the model sometimes fails to track the same hand. You mentioned here #4 (comment) that hand_tracker2.py supports multiple-hand tracking. How can I use that script to keep track of the correct hand (the right hand) over a given sequence of frames?
@wolterlw @AmitMY
I appreciate any feedback, thank you.
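One simple way to keep following the same hand across frames is to pick, in each new frame, the detection whose centroid is nearest to the previously tracked one. A sketch under the assumption that each detection is a (21, 2) keypoint array, as in this repo's output; `match_hand` is a hypothetical helper, not part of hand_tracker2.py:

```python
import numpy as np

def match_hand(prev_joints, candidates):
    """Among this frame's detections, return the one whose keypoint
    centroid is closest to last frame's tracked hand."""
    prev_c = np.asarray(prev_joints).mean(axis=0)
    dists = [np.linalg.norm(np.asarray(c).mean(axis=0) - prev_c)
             for c in candidates]
    return candidates[int(np.argmin(dists))]
```

Seeding the track with the right hand in the first frame (e.g. by picking it manually or by image position) and then chaining `match_hand` frame to frame is one way to keep the identity stable.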
Hello~
Thank you for your work.
In the hand_tracker.py,
I want to enlarge the input size before detecting, but I don't know which parts need to change. Why do you resize to 256×256? Is it okay to change to other sizes?
I also wonder whether it can detect multiple hands, and how many hands can be detected at most?
I put in hand_landmark_3d.tflite but got this error instead:
kp_orig = (self._pad1(joints) @ Minv.T)[:,:2]
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 4)
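The mismatch suggests the 3D model returns (21, 3) joints, so `_pad1` produces a (21, 4) matrix that no longer matches the 3×3 inverse crop matrix. A hedged sketch of one possible fix (assuming `Minv` is the 3×3 inverse affine and z is a relative depth that should not be re-projected): apply the affine to x, y only and carry z through unchanged.

```python
import numpy as np

def pad1(x):
    # append a column of ones for homogeneous coordinates
    return np.hstack([x, np.ones((x.shape[0], 1))])

def back_project(joints, Minv):
    """Map cropped-space joints back to image space.
    joints: (21, 2) from the 2D model or (21, 3) from the 3D model.
    Minv:   3x3 inverse of the affine crop matrix (an assumption)."""
    xy = (pad1(joints[:, :2]) @ Minv.T)[:, :2]   # re-project x, y only
    if joints.shape[1] == 3:
        # keep z as-is: it is a relative depth, not an image coordinate
        return np.hstack([xy, joints[:, 2:3]])
    return xy
```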
I am getting the following error while running the provided jupyter notebook
"Model provided has model identifier 'd d
', should be 'TFL3' "
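That identifier error usually means the downloaded file is not a real TFLite FlatBuffer (e.g. an HTML error page saved with a .tflite name). A quick hedged check (`tflite_magic` is a hypothetical helper): a valid model carries the identifier b'TFL3' at byte offset 4.

```python
def tflite_magic(path):
    # a valid .tflite FlatBuffer has the identifier b'TFL3' at byte offset 4
    with open(path, "rb") as f:
        return f.read(8)[4:8]
```

If `tflite_magic("models/hand_landmark.tflite")` returns anything other than b'TFL3', the download is corrupt and should be re-fetched.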
Hi, I'm a newbie. I tried to run hand_tracker.py in the terminal, but nothing happened.
I want to train hand gestures using your hand tracking, but I don't know where to start.
Thanks for your guide! :)
But when it completed, I got the error "cannot find operation Convolution2DTransposeBias". It seems the "CustomOperations" are not included in the compilation.
Could you please share the files you modified compared to TensorFlow's source, including the "CustomOperations" files and the "BUILD" files?
The frame rate I get when running metalwhale's repo's hand tracking model is extremely slow. The output frames don't form a smooth stream; instead, I see jittery, rough transitions as I move my hand slowly across the camera. My output is significantly slower than the GIF in that repository. Is there a way to fix this? If this is optimized better in this repo, it would be great to try it. Please let me know how to run your model. Thanks.
I read the SSD paper, but I still don't understand what the numbers in the anchor file mean. From your code, I know that the first two columns of each anchor represent the predicted target's center point, but the last two columns are always 1. I don't know what they mean; I hope you can help, thank you.
Hi @wolterlw, thank you very much for your kind guidance about building tensorflow with custom operations!
May I ask you a question? How did you generate the data/anchors.csv
file? I tried to search around inside the mediapipe source code but found nothing.
I'm a newbie to machine learning, so please forgive me if this is a stupid question...
Hello,
Thanks for the nice work! I tried the multihand branch on Epic Kitchens and the keypoints are visible. I was wondering, however, is there a way to visualize the connections between keypoints (a complete 2D pose estimation)? I may have missed it because I'm new to this, so please let me know if that's the case and how I can visualize that.
Thanks!
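For reference, the connections can be built from MediaPipe's published 21-keypoint hand topology (the edge list below is taken from that layout; `skeleton_segments` is a hypothetical helper, not part of this repo):

```python
import numpy as np

# MediaPipe's 21-keypoint hand topology: wrist = 0, then four joints per finger
HAND_CONNECTIONS = [
    (0, 1), (1, 2), (2, 3), (3, 4),          # thumb
    (0, 5), (5, 6), (6, 7), (7, 8),          # index
    (5, 9), (9, 10), (10, 11), (11, 12),     # middle
    (9, 13), (13, 14), (14, 15), (15, 16),   # ring
    (13, 17), (17, 18), (18, 19), (19, 20),  # pinky
    (0, 17),                                 # palm edge
]

def skeleton_segments(joints):
    """One (2, 2) start/end pair per bone, for a (21, 2) keypoint array."""
    joints = np.asarray(joints)
    return [joints[[a, b]] for a, b in HAND_CONNECTIONS]
```

Each segment can then be drawn on top of the existing scatter plot, e.g. `ax.plot(seg[:, 0], seg[:, 1], "g-")`.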
Hi, thank you for sharing your code. I have been trying to change the threshold (even to 0.95) to eliminate false positives, but it did not help.
Line 125 in 0cccb81
Has anyone had the same situation? Any suggestion is greatly appreciated.
Not an actual issue, more of a clarification request.
I'm trying to do hand pose estimation to a video.
This is what OpenPose returns:
It doesn't return the hands because it can't detect them.
Next, I tried to run this hand-tracking code, the simplest way I can:
for frame in frames:
    hands = detector(frame)
As you can see, there are hands flying out, so I added a detector.reset()
before invoking it every time:
Which does work better regarding flying hands, but has its own artifacts.
I then tried to perhaps feed the previously detected hands back:
if len(last_hands) > 0:
    hands = detector(img, last_hands)
else:
    hands = detector(img)
last_hands = hands
or to add a detector.reset()
every time no hand is detected:
and for completeness sake, here is a synchronized video of everything I tried:
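For reference, the reset-on-miss variant above can be sketched as plain control flow (FakeDetector is a stand-in; the real HandTracker call signature is only assumed here):

```python
class FakeDetector:
    """Stand-in for HandTracker, used only to show the control flow."""
    def __init__(self, per_frame_results):
        self._results = iter(per_frame_results)
        self.resets = 0

    def __call__(self, frame):
        return next(self._results)

    def reset(self):
        self.resets += 1

def track(detector, frames):
    all_hands = []
    for frame in frames:
        hands = detector(frame)
        if not hands:
            detector.reset()  # drop stale state whenever the hand is lost
        all_hands.append(hands)
    return all_hands
```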
Hi,
Thank you for your efforts, I've been looking for something like this.
I'm using MediaPipe multihandtracking to get 42 keypoints for both hands (21 keypoints/hand).
From C++, I'm able to get them and write them to a txt file, but I was hoping to use your code to get these 42 keypoints into Python in real time. Is this possible?
Thanks
ref: https://ai.googleblog.com/2019/08/on-device-real-time-hand-tracking-with.html
Currently hand detections are not aggregated, instead the detection with the largest bounding box size is selected. This often leads to incorrect bounding box rotations, which impedes landmark detection.
Accuracy can be improved by implementing non-max suppression or a variation of it.
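A minimal sketch of score-ordered non-max suppression over the palm detections (assuming boxes in (x1, y1, x2, y2) form; merging overlapping detections by weighted average, as MediaPipe does, would be a further refinement):

```python
import numpy as np

def iou(box, boxes):
    # intersection-over-union of one box against an array of boxes
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, thresh=0.3):
    # greedily keep the highest-scoring box, drop everything overlapping it
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= thresh]
    return keep
```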
Hi @wolterlw, thanks for your nice work. I want to know why hand_tracker.py doesn't use the GPU (I find it only uses the CPU).
If I want to use it on a computer with a GPU, what should I do?
Looking forward to your reply! Thanks a lot.
I was following the instructions and trying to reproduce the examples. But it raises a recursion error when calling the sigmoid function here.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\user\Desktop\temp\hand_tracking\hand_tracker.py", line 170, in __call__
source, keypoints = self.detect_hand(img_norm)
File "C:\Users\user\Desktop\temp\hand_tracking\hand_tracker.py", line 125, in detect_hand
detecion_mask = self._sigm(out_clf) > 0.7
File "C:\Users\user\Desktop\temp\hand_tracking\hand_tracker.py", line 95, in _sigm
return 1 / (1 + np.exp(-x) )
RecursionError: maximum recursion depth exceeded while calling a Python object
Any idea how to deal with this? I am using TensorFlow 2.0 and Python 3.5 in an Anaconda virtualenv. Much appreciated!
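One guess: if `out_clf` is not a plain ndarray (e.g. some wrapper type), `np.exp` can dispatch back into Python-level operators and recurse. A hedged workaround is to coerce to a float array first, and, while at it, use the numerically stable piecewise form of the sigmoid:

```python
import numpy as np

def sigm(x):
    # coerce to a plain float ndarray first: if x is a wrapper type,
    # np.exp may dispatch back into Python code and recurse
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    pos = x >= 0
    # stable piecewise form: never exponentiates a large positive value
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out
```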
Can I get the values of the z coordinate using this hand_tracking module?
Hi!
Firstly, one of the files you are trying to load using load_models.sh, has been renamed:
hand_landmark_3d.tflite >> hand_landmark.tflite
This alone took me a few hours to figure out and to modify the bash script so that the command runs without a 404 error.
Secondly, I am not able to load the two other files, which are from the metalwhale/hand_tracking repo. When I ran the original load_models.sh commands for this repo, I got an Error 404 Not Found. I then changed the raw file address to what I get when I right-click on the Download button in metalwhale/hand_tracking/models/palm_detection_without_custom_op.tflite. So I change:
wget --header 'Host: raw.githubusercontent.com' --user-agent 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:71.0) Gecko/20100101 Firefox/71.0' --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' --header 'Accept-Language: en-US,en;q=0.5' --referer 'https://github.com/metalwhale/hand_tracking/blob/master/palm_detection_without_custom_op.tflite' --header 'Upgrade-Insecure-Requests: 1' 'https://raw.githubusercontent.com/metalwhale/hand_tracking/master/palm_detection_without_custom_op.tflite' --output-document './models/palm_detection.tflite'
to:
wget --header 'Host: raw.githubusercontent.com' --user-agent 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:71.0) Gecko/20100101 Firefox/71.0' --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' --header 'Accept-Language: en-US,en;q=0.5' --referer 'https://github.com/metalwhale/hand_tracking/blob/master/palm_detection_without_custom_op.tflite' --header 'Upgrade-Insecure-Requests: 1' 'https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite' --output-document './models/palm_detection.tflite'
And now the error has changed to the following:
--2020-08-26 18:58:50-- https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite [following]
--2020-08-26 18:58:50-- https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite
Reusing existing connection to github.com:443.
HTTP request sent, awaiting response... 301 Moved Permanently
.
.
.
Location: https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite [following]
20 redirections exceeded.
Could you please help me with this? Not sure why these redirections happen and why they're unsuccessful.
Thanks!!
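A guess at the cause: the command pins the header 'Host: raw.githubusercontent.com' while connecting to github.com, so every redirect resolves back to the same URL and the chain never terminates. Dropping the forced headers (wget follows redirects fine on its own) may break the loop; a sketch, not verified against the current repo layout:

```shell
wget 'https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite' \
     --output-document './models/palm_detection.tflite'
```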
I used the code in the current master branch to predict the bbox and joints for an image:
Works as well as expected.
Then I used the hand_tracker2
code in the other branch, which supports multiple hands detection and joints prediction, and got this:
Notice that the second hand from the right (the one detected in both tests) seems to get a better joint prediction when the rotation is predicted.
I'm not sure if anything else is going on here, or if this is just a bad sample-size-of-one example, but it should be taken into consideration.
Code:
palm_model_path = "./models/palm_detection.tflite"
landmark_model_path = "./models/hand_landmark.tflite"
anchors_path = "data/anchors.csv"
img = cv2.imread('data/hands.jpg')[:, :, ::-1]
# box_shift determines
detector = HandTracker(palm_model_path, landmark_model_path, anchors_path, box_shift=0.2, box_enlarge=1.3)
hands = detector(img)
f, ax = plt.subplots(1, 1, figsize=(10, 10))
ax.imshow(img)
for hand in hands:
    ax.scatter(hand["joints"][:, 0], hand["joints"][:, 1])
    ax.add_patch(Polygon(hand["bbox"], color="#00ff00", fill=False))
f.savefig("data/hands_out.png")
I cannot build TensorFlow for macOS!
It seems I do everything according to the instructions, but I get an error in the output:
ValueError: Didn't find custom op for name 'Convolution2DTransposeBias' with version 1
Step 1:
Moved all files from folder .../mediapipe/util/tflite/operations to folder .../tensorflow/tensorflow/lite/python/custom_ops/
Step 2:
Fixed the BUILD file in folder .../tensorflow/tensorflow/lite/python/custom_ops/:
licenses(["notice"]) # Apache 2.0
package(default_visibility = [
"//visibility:public",
])
cc_library(
name = "max_pool_argmax",
srcs = ["max_pool_argmax.cc"],
hdrs = ["max_pool_argmax.h"],
deps = [
"//tensorflow/lite/kernels:kernel_util",
"//tensorflow/lite/kernels:padding",
"//tensorflow/lite/kernels/internal:common",
"//tensorflow/lite/kernels/internal:tensor",
"//tensorflow/lite/kernels/internal:tensor_utils",
],
)
cc_library(
name = "max_unpooling",
srcs = ["max_unpooling.cc"],
hdrs = ["max_unpooling.h"],
deps = [
"//tensorflow/lite/kernels:kernel_util",
"//tensorflow/lite/kernels:padding",
"//tensorflow/lite/kernels/internal:common",
"//tensorflow/lite/kernels/internal:tensor",
"//tensorflow/lite/kernels/internal:tensor_utils",
],
)
cc_library(
name = "transpose_conv_bias",
srcs = ["transpose_conv_bias.cc"],
hdrs = ["transpose_conv_bias.h"],
deps = [
"//tensorflow/lite/kernels:kernel_util",
"//tensorflow/lite/kernels:padding",
"//tensorflow/lite/kernels/internal:tensor",
"//tensorflow/lite/kernels/internal:tensor_utils",
"//tensorflow/lite/kernels/internal:types",
],
)
Step 3:
Configure the build:
./configure
Step 4:
bazel build --action_env PATH="$PATH" --noincompatible_strict_action_env --config=opt --incompatible_disable_deprecated_attr_params=false //tensorflow/tools/pip_package:build_pip_package
Step 5:
./bazel-bin/tensorflow/tools/pip_package/build_pip_package ./tensorflow_pkg
Thanks so much for this Python version of the hand tracker!
When I adjust the values of box_shift and box_enlarge, it affects the hand detection. Should I adjust these two parameters for different input images, and if so, how?
Thanks.
Hi @wolterlw, thank you very much for this mediapipe wrapper for python!
I have a pretty noob question: I am new to this environment and I would like to use my webcam with the hand tracker. Is this possible? Do you think you could upload a mini example? I've been trying VideoCapture from OpenCV, but it didn't work for me.
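A minimal webcam sketch, assuming the HandTracker constructor and the `d = detector(img)` call shown elsewhere in these issues; treating `d[0]` as the (21, 2) keypoint array is an assumption, so take this as a starting point rather than a verified demo:

```python
import cv2
from hand_tracker import HandTracker

detector = HandTracker("models/palm_detection.tflite",
                       "models/hand_landmark.tflite",
                       "data/anchors.csv",
                       box_shift=0.2, box_enlarge=1.3)

cap = cv2.VideoCapture(0)            # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    d = detector(frame[:, :, ::-1])  # OpenCV gives BGR; the model expects RGB
    if d is not None:
        for x, y in d[0]:            # assumed: d[0] is the (21, 2) keypoint array
            cv2.circle(frame, (int(x), int(y)), 3, (0, 255, 0), -1)
    cv2.imshow("hand", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```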
Hello, I modified the program to take a single picture as input, but when a similar picture is input, the coordinates of the output hand keypoints differ. How can I eliminate this difference? Thank you!
I wrote a webcam demo and found it uses only the CPU. How can I use the GPU?
Because of the non-trivial build needed here, would you mind creating a Dockerfile / Docker image so that using this repo is easier?
Thanks for making it available in python!
I am getting around 2.1 fps on a desktop CPU, which is nowhere around real-time 😄
Is this the same for everyone?
Also, in the Google AI blog post they said they have a gesture classification head, but I can't find it in the code. Can somebody point out where I should look?
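To compare numbers across machines, the frame rate can be measured around the detector call alone (a sketch with a stand-in workload; swap in `detector(frame)` for the real measurement):

```python
import time

def measure_fps(step, frames):
    """Average frames per second of `step` over an iterable of frames."""
    t0 = time.perf_counter()
    n = 0
    for frame in frames:
        step(frame)
        n += 1
    return n / (time.perf_counter() - t0)

# usage sketch: measure_fps(lambda f: detector(f), frames)
```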