hand_tracking's People

Contributors

amitmy, farnaznouraei, wolterlw


hand_tracking's Issues

3D Pose Estimation

@ahmed-shariff added support for 3D poses on this code base:
ahmed-shariff@310fb95

However, when using hand_tracker.py like this:

detector = HandTracker(palm_model_path, landmark_model_path, anchors_path, box_shift=0.2, box_enlarge=1.3)
d = detector(img)

With print(d[0]), I am still getting the 2D estimation:
@ahmed-shariff mind clarifying?

   [[457.47368348, 359.82393988],
   [429.87579848, 298.62916644],
   [379.11063081, 253.75389192],
   [317.62233118, 238.91130884],
   [263.22237604, 230.28409888],
   [350.34224016, 210.99346312],
   [283.266783  , 173.77211398],
   [237.94263826, 172.72647489],
   [213.60732161, 182.52631797],
   [342.0273813 , 235.3966395 ],
   [272.1424833 , 220.17817074],
   [233.09361395, 237.22693846],
   [223.68429643, 260.37553656],
   [342.1249241 , 265.83757083],
   [281.07808576, 271.40460841],
   [262.76236076, 303.5830144 ],
   [266.86327714, 327.15547596],
   [348.08264754, 291.63407504],
   [320.50589946, 311.77354425],
   [316.85192093, 341.60134386],
   [323.00332217, 360.62679611]]
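A quick way to check whether a landmark model is returning 2D or 3D output is to inspect the array shape (a sketch; the helper name is mine, and `d[0]` above is assumed to be the keypoint array):

```python
import numpy as np

def landmark_dims(joints):
    """Number of coordinate dimensions per keypoint: 2 for the stock
    hand_landmark model, 3 if a 3D-capable model is actually loaded."""
    return np.asarray(joints).shape[1]
```

If this still reports 2 with hand_landmark_3d.tflite configured, the 2D model path is probably still the one being loaded.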

Python / OpenCV / TensorFlow versions

When I import the models, the error below shows up:
ValueError: Didn't find custom op for name 'Convolution2DTransposeBias' with version 1
Registration failed

So I want to know which versions of the three packages are needed.
Thanks!
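When comparing environments, a small helper can report the relevant versions in one go (a sketch; the names in the tuple are import names, not pip package names):

```python
import sys

def environment_versions():
    """Collect the Python, OpenCV and TensorFlow versions, which is the
    information needed to reproduce a working custom-op build."""
    info = {"python": sys.version.split()[0]}
    for name in ("cv2", "tensorflow"):
        try:
            module = __import__(name)
            info[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            info[name] = "not installed"
    return info
```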

Following the build steps

Can you clarify the first two steps?

  1. "Now we need to change the BUILD file." Is this the BUILD file inside tensorflow/tensorflow?

  2. "We are now inside the tensorflow repo, so you should remove the @org_tensorflow from dependency paths." Can you give more detail on where I need to remove @org_tensorflow?

  3. I am getting an error:
    fatal error: mediapipe/util/tflite/operations/transpose_conv_bias.h: No such file or directory

Hand tracking

Hi,

I have a dataset of image sequences (containing right/left hands), and I want to use mediapipe to extract the hand regions from these images (both hands, or just the right hand) and then save the hand crops to a destination folder.
I used the hand_tracker script to extract the hands. However, I've noticed that the model sometimes fails to track the same hand. You mentioned here #4 (comment) that hand_tracker2.py supports tracking multiple hands. How can I use that script to keep track of the correct hand (the right hand) across a given sequence of frames?
@wolterlw @AmitMY

i appreciate any feedback, thank you.

Why must img_norm.shape be (256, 256, 3)?

Hello~
Thank you for your work.
In hand_tracker.py, I want to run detection at a larger input size, but I don't know which parts need to change. Why do you resize to 256×256? Is it okay to change to another size?
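For context, 256×256 is baked into the .tflite model's input tensor, so simply changing the resize target will not work without re-exporting the model; larger images are instead padded to a square and scaled down. A minimal sketch of that padding step (the function name is mine, not the repo's):

```python
import numpy as np

def pad_to_square(img):
    """Zero-pad an HxWxC image so height == width, as the tracker does
    before scaling the crop down to the model's fixed 256x256 input."""
    h, w, c = img.shape
    side = max(h, w)
    padded = np.zeros((side, side, c), dtype=img.dtype)
    padded[:h, :w] = img
    return padded
```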

MultiHand Detection

I wonder whether it can detect multiple hands, and how many hands can be detected at most?

Hand landmark 3D model

I put in hand_landmark_3d.tflite but got this error instead:

kp_orig = (self._pad1(joints) @ Minv.T)[:,:2]
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 4)
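The error says the joints matrix has 3 columns while the transform expects homogeneous 2D points (2 + 1 columns). One workaround sketch, assuming Minv is the 3×3 inverse crop transform and dropping the predicted z before back-projection (the helper name is mine):

```python
import numpy as np

def joints_to_image(joints, Minv):
    """Map predicted joints back to image space with the inverse crop
    transform. Keeps only x, y so a (21, 3) output from a 3D landmark
    model also works; z stays in crop-local units."""
    xy = np.asarray(joints)[:, :2]
    homogeneous = np.concatenate([xy, np.ones((xy.shape[0], 1))], axis=1)
    return (homogeneous @ np.asarray(Minv).T)[:, :2]
```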

Slow FPS for MetalWhale's repo, how to run this one?

The frame rate I get when running metalwhale's hand tracking model is extremely slow. The output frames don't form a smooth stream; instead I see jittery, rough transitions as I move my hand slowly across the camera. My output is significantly slower than the GIF in that repository. Is there a way to fix this? If this repo is better optimized, it would be great to try it. Please let me know how to run your model. Thanks.

what does the anchor file mean?

I read the SSD paper, but I still don't understand what the numbers in the anchor file mean. From your code, I know that the first two columns are the predicted box center points, but the last two columns are always 1. I don't know what they mean; I hope you can help. Thank you.
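For what it's worth, in SSD-style detectors the anchor file typically stores (x_center, y_center, width, height) per anchor; the width/height columns being all 1 is consistent with the palm model regressing box size directly rather than scaling fixed anchor sizes. The centers are just a grid over each feature map, roughly like this sketch (sizes are illustrative, not this model's exact configuration):

```python
import numpy as np

def grid_anchor_centers(feature_map_size, anchors_per_cell=1):
    """Normalized (x, y) centers tiled over a square feature map, the
    way SSD-style detectors lay out their anchors."""
    coords = (np.arange(feature_map_size) + 0.5) / feature_map_size
    xs, ys = np.meshgrid(coords, coords)
    centers = np.stack([xs.ravel(), ys.ravel()], axis=1)
    return np.repeat(centers, anchors_per_cell, axis=0)
```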

Where did the `data/anchors.csv` file come from?

Hi @wolterlw, thank you very much for your kind guidance about building tensorflow with custom operations!
May I ask you a question? How did you generate the data/anchors.csv file? I tried searching the mediapipe source code but found nothing.
I'm a newbie to machine learning, so please forgive me if this is a silly question...

2D Pose Estimation (including connections between joints) possible?

Hello,

Thanks for the nice work! I tried the multihand branch on Epic Kitchens and the keypoints are visible. I was wondering, however: is there a way to visualize the connections between keypoints (a complete 2D pose estimation)? I may have missed it, since I'm new to this, so please let me know if that's the case and how I can visualize it.

Thanks!

How to reduce false positive?

Hi, thank you for sharing your code. I have been trying to raise the threshold (even to 0.95) to eliminate false positives, but it did not help.

detecion_mask = self._sigm(out_clf) > 0.7

Has anyone had the same situation? Any suggestion is greatly appreciated.
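For reference, here is the thresholding step in isolation (a sketch, using the same sigmoid-over-logits idea as the line quoted above). Note that raising the threshold only removes low-confidence detections; confidently wrong detections need something like non-max suppression or temporal filtering instead:

```python
import numpy as np

def detection_mask(out_clf, threshold=0.7):
    """Boolean mask of anchors whose sigmoid score exceeds threshold."""
    scores = 1.0 / (1.0 + np.exp(-np.asarray(out_clf, dtype=np.float64)))
    return scores > threshold
```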

Recommended way to process videos

Not an actual issue, more of a clarification request.

I'm trying to do hand pose estimation to a video.

Given this video:
[GIF: ezgif-6-8ce658e05744]

This is what OpenPose returns:
[GIF: openpose]
It doesn't return the hands because it can't detect them.

Next, I tried to run this hand-tracking code, the simplest way I can:

for frame in frames:
  hands = detector(frame)

[GIF: googlepose]

As you can see, there are hands flying out, so I added a detector.reset() before invoking it every time:
[GIF: ezgif-6-c80b24c39b10]
Which does work better regarding flying hands, but has its own artifacts.

I then tried to perhaps feed the previously detected hands back:

    if len(last_hands) > 0:
        hands = detector(img, last_hands)
    else:
        hands = detector(img)
    last_hands = hands

[GIF: googlepose_feedback_no_reset]

or to add a detector.reset() every time no hand is detected:
[GIF: googlepose_feedback_with_reset]

and for completeness sake, here is a synchronized video of everything I tried:
[GIF: ezgif-6-db38a7989fbf]

Improve hand detection aggregation

[image]
Currently hand detections are not aggregated; instead, the detection with the largest bounding box is selected. This often leads to incorrect bounding box rotations, which impedes landmark detection.
Accuracy could be improved by implementing non-max suppression or a variation of it.
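A minimal greedy NMS sketch over axis-aligned [x1, y1, x2, y2] boxes (the repo's boxes are rotated, so this would need an IoU that handles rotation, or would have to run on the un-rotated detector output):

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-max suppression; returns kept indices in descending
    score order, suppressing boxes whose IoU with a kept box is high."""
    boxes = np.asarray(boxes, dtype=np.float64)
    scores = np.asarray(scores, dtype=np.float64)
    order = scores.argsort()[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        # Intersection rectangle between the kept box and the rest.
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]
    return keep
```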

run it on a GPU

Hi @wolterlw, thanks for your nice work. I want to know why hand_tracker.py doesn't use the GPU (I find it only uses the CPU).

If I want to run it on a computer with a GPU, what should I do?

Looking forward to your reply! Thanks a lot.

maximum recursion depth exceeded

I was following the instructions and trying to reproduce the examples, but it raises a recursion error when calling the sigmoid function here.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\user\Desktop\temp\hand_tracking\hand_tracker.py", line 170, in __call__
    source, keypoints = self.detect_hand(img_norm)
  File "C:\Users\user\Desktop\temp\hand_tracking\hand_tracker.py", line 125, in detect_hand
    detecion_mask = self._sigm(out_clf) > 0.7
  File "C:\Users\user\Desktop\temp\hand_tracking\hand_tracker.py", line 95, in _sigm
    return 1 / (1 + np.exp(-x) )
RecursionError: maximum recursion depth exceeded while calling a Python object

Any idea how to deal with this? I am using TensorFlow 2.0 and Python 3.5 in an Anaconda virtualenv. Much appreciated!

301 Moved Permanently received when running load_models.sh

Hi!

Firstly, one of the files load_models.sh tries to download has been renamed:
hand_landmark_3d.tflite >> hand_landmark.tflite
This alone took me a few hours to figure out and to modify the bash script so the command runs without a 404 error.

Secondly, I am not able to download the two other files, which come from the metalwhale/hand_tracking repo. When I run the original load_models.sh commands, I get a 404 Not Found error. I then changed the raw file address to the one I get when I right-click the Download button on metalwhale/hand_tracking/models/palm_detection_without_custom_op.tflite. So I changed:

wget --header 'Host: raw.githubusercontent.com' --user-agent 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:71.0) Gecko/20100101 Firefox/71.0' --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' --header 'Accept-Language: en-US,en;q=0.5' --referer 'https://github.com/metalwhale/hand_tracking/blob/master/palm_detection_without_custom_op.tflite' --header 'Upgrade-Insecure-Requests: 1' 'https://raw.githubusercontent.com/metalwhale/hand_tracking/master/palm_detection_without_custom_op.tflite' --output-document './models/palm_detection.tflite'
to:
wget --header 'Host: raw.githubusercontent.com' --user-agent 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:71.0) Gecko/20100101 Firefox/71.0' --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' --header 'Accept-Language: en-US,en;q=0.5' --referer 'https://github.com/metalwhale/hand_tracking/blob/master/palm_detection_without_custom_op.tflite' --header 'Upgrade-Insecure-Requests: 1' 'https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite' --output-document './models/palm_detection.tflite'

And now the error has changed to the following:

--2020-08-26 18:58:50--  https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite
Resolving github.com (github.com)... 140.82.112.3
Connecting to github.com (github.com)|140.82.112.3|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite [following]
--2020-08-26 18:58:50--  https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite
Reusing existing connection to github.com:443.
HTTP request sent, awaiting response... 301 Moved Permanently
.
.
.
Location: https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite [following]
20 redirections exceeded.

Could you please help me with this? I'm not sure why these redirections happen or why they fail.
Thanks!!
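One thing worth checking (an assumption on my part, not a confirmed diagnosis): the modified command still sends `--header 'Host: raw.githubusercontent.com'` while the URL now points at github.com, and a mismatched Host header can produce exactly this kind of redirect loop. Dropping the custom headers entirely and letting the client follow redirects may be enough:

```shell
# Plain download without the stale Host header; -L makes curl follow
# the github.com -> raw storage redirects that tripped up wget.
curl -L -o ./models/palm_detection.tflite \
  'https://github.com/metalwhale/hand_tracking/raw/master/models/palm_detection_without_custom_op.tflite'
```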

Removing rotation estimation decreases accuracy

I used the code in the current master branch to predict the bbox and joints for an image:
[image]

Works as well as expected.

Then I used the hand_tracker2 code in the other branch, which supports multiple hands detection and joints prediction, and got this:
[image]

Notice that the second hand from the right (the one detected in both tests) seems to get a better joints prediction when the rotation is predicted.

I'm not sure whether something else is going on here, or whether this is just a sample of size 1, but it should be taken into consideration.

Code:

palm_model_path = "./models/palm_detection.tflite"
landmark_model_path = "./models/hand_landmark.tflite"
anchors_path = "data/anchors.csv"

img = cv2.imread('data/hands.jpg')[:, :, ::-1]

# box_shift determines
detector = HandTracker(palm_model_path, landmark_model_path, anchors_path, box_shift=0.2, box_enlarge=1.3)
hands = detector(img)

f, ax = plt.subplots(1, 1, figsize=(10, 10))
ax.imshow(img)
for hand in hands:
    ax.scatter(hand["joints"][:, 0], hand["joints"][:, 1])
    ax.add_patch(Polygon(hand["bbox"], color="#00ff00", fill=False))

f.savefig("data/hands_out.png")

I cannot build the custom TensorFlow for macOS

It seems I do everything according to the instructions, but I get this error on the output:

ValueError: Didn't find custom op for name 'Convolution2DTransposeBias' with version 1

Step 1:

Moved all files from folder .../mediapipe/util/tflite/operations to folder .../tensorflow/tensorflow/lite/python/custom_ops/

Step 2:

Fixed the BUILD file in folder .../tensorflow/tensorflow/lite/python/custom_ops/:

licenses(["notice"])  # Apache 2.0

package(default_visibility = [
    "//visibility:public",
])

cc_library(
    name = "max_pool_argmax",
    srcs = ["max_pool_argmax.cc"],
    hdrs = ["max_pool_argmax.h"],
    deps = [
        "//tensorflow/lite/kernels:kernel_util",
        "//tensorflow/lite/kernels:padding",
        "//tensorflow/lite/kernels/internal:common",
        "//tensorflow/lite/kernels/internal:tensor",
        "//tensorflow/lite/kernels/internal:tensor_utils",
    ],
)

cc_library(
    name = "max_unpooling",
    srcs = ["max_unpooling.cc"],
    hdrs = ["max_unpooling.h"],
    deps = [
        "//tensorflow/lite/kernels:kernel_util",
        "//tensorflow/lite/kernels:padding",
        "//tensorflow/lite/kernels/internal:common",
        "//tensorflow/lite/kernels/internal:tensor",
        "//tensorflow/lite/kernels/internal:tensor_utils",
    ],
)

cc_library(
    name = "transpose_conv_bias",
    srcs = ["transpose_conv_bias.cc"],
    hdrs = ["transpose_conv_bias.h"],
    deps = [
        "//tensorflow/lite/kernels:kernel_util",
        "//tensorflow/lite/kernels:padding",
        "//tensorflow/lite/kernels/internal:tensor",
        "//tensorflow/lite/kernels/internal:tensor_utils",
        "//tensorflow/lite/kernels/internal:types",
    ],
)

Step 3:

Configured the build:

./configure

Step 4:

bazel build --action_env PATH="$PATH" --noincompatible_strict_action_env --config=opt --incompatible_disable_deprecated_attr_params=false //tensorflow/tools/pip_package:build_pip_package

Step 5:

./bazel-bin/tensorflow/tools/pip_package/build_pip_package ./tensorflow_pkg

Any idea what I'm doing wrong?

How can I use hand_tracking with my webcam?

Hi @wolterlw, thank you very much for this mediapipe wrapper for python!

I have a pretty noob question: I am new to this environment and would like to use my webcam with the hand tracker. Is this possible? Could you upload a mini example? :( I've been trying VideoCapture from OpenCV, but it didn't work for me.
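A rough webcam sketch, assuming the HandTracker interface and model paths from this repo's README (untested wiring; the `(keypoints, bbox)` return value of `detector(img)` is an assumption):

```python
import numpy as np

def bgr_to_rgb(frame):
    """OpenCV delivers frames in BGR channel order; the tracker expects RGB."""
    return frame[:, :, ::-1]

def main():
    # Hypothetical wiring: cv2 and the repo's HandTracker are imported lazily
    # so this file can be imported without a camera attached.
    import cv2
    from hand_tracker import HandTracker

    detector = HandTracker("models/palm_detection.tflite",
                           "models/hand_landmark.tflite",
                           "data/anchors.csv",
                           box_shift=0.2, box_enlarge=1.3)
    cap = cv2.VideoCapture(0)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        kp, box = detector(bgr_to_rgb(frame))  # assumed (keypoints, bbox) return
        if kp is not None:
            for x, y in kp:
                cv2.circle(frame, (int(x), int(y)), 3, (0, 255, 0), -1)
        cv2.imshow("hand_tracking", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()

if __name__ == "__main__":
    main()
```

Press q to quit; the BGR-to-RGB flip matters because the models were trained on RGB input.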

GPU not used.

I wrote a webcam demo and found it uses only the CPU. How can I use the GPU?

Docker image?

Because of the non-trivial build needed here, would you mind creating a Dockerfile / Docker image so that using this repo is easier?

Thanks for making it available in python!

Inference speed and classification head

I am getting around 2.1 fps on a desktop CPU, which is nowhere near real-time 😄
Is this the same for everyone?

Also, in the Google AI blog post they said they have a gesture classification head, but I can't find it in the code. Can somebody point out where I should look?
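To compare fps numbers meaningfully across machines, it may help to time the detector alone (a small sketch; `process_frame` stands for any per-frame callable, e.g. `lambda f: detector(f)`):

```python
import time

def measure_fps(process_frame, frames, warmup=2):
    """Average frames per second of process_frame, excluding a few
    warm-up calls (interpreter allocation skews the first frames)."""
    for frame in frames[:warmup]:
        process_frame(frame)
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed
```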
