A unified convolutional neural network (CNN) algorithm for both hand gesture recognition and fingertip detection is presented here. The proposed algorithm uses a single network to predict both finger class probabilities and fingertip positions in one evaluation. The gesture is recognized from the finger class probabilities, and the fingertips are localized using both pieces of information. Instead of directly regressing the fingertip positions from the fully connected (FC) layer of the CNN, we regress an ensemble of fingertip positions from the fully convolutional network (FCN) and subsequently take the ensemble average to obtain the final fingertip positions.
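The ensemble-average step described above can be sketched as follows. The tensor layout (one row of (x, y) pairs per FCN cell) is an illustrative assumption, not the exact layout used in the model.

```python
import numpy as np

def ensemble_average(fcn_output):
    """Average an ensemble of fingertip position predictions.

    fcn_output: array of shape (ensemble_size, 2 * num_fingers),
    holding (x, y) pairs predicted by each spatial cell of the FCN.
    (Shape and ordering here are illustrative assumptions.)
    """
    return fcn_output.mean(axis=0)

# Toy ensemble of 3 predictions for 2 fingertips: (x1, y1, x2, y2)
preds = np.array([[10.0, 20.0, 30.0, 40.0],
                  [12.0, 22.0, 28.0, 38.0],
                  [11.0, 21.0, 29.0, 39.0]])
final = ensemble_average(preds)  # averages to [11, 21, 29, 39]
```

Averaging over many per-cell regressions smooths out individual prediction noise, which is the motivation for regressing an ensemble from the FCN rather than a single FC-layer output.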
- TensorFlow-GPU==1.11.0
- Keras==2.2.4
- OpenCV==3.4.4
- ImgAug==0.2.6
- Weights: Download the trained weights files of the gesture recognition and fingertip detection model and put the `weights` folder in the working directory. The folder contains three weights files: `comparison.h5` is for the first five classes, `performance.h5` is for the first eight classes, and `solo.h5` is for hand detection.
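As a quick sanity check before running anything, a snippet like the following can confirm the weights files are in place. The file names come from the list above; treating `weights/` as a subfolder of the working directory is an assumption.

```python
import os

# File names from the description above; the weights/ folder location
# relative to the working directory is an assumption.
WEIGHTS_DIR = 'weights'
EXPECTED = ['comparison.h5', 'performance.h5', 'solo.h5']

def missing_weights(weights_dir=WEIGHTS_DIR, expected=EXPECTED):
    """Return the expected weights files not found in weights_dir."""
    return [f for f in expected
            if not os.path.isfile(os.path.join(weights_dir, f))]
```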
The proposed gesture recognition and fingertip detection model is trained using the SCUT-Ego-Gesture dataset, which contains eleven different single-hand gesture datasets. Among these, eight are considered for the experimentation. A detailed explanation of the dataset partition, along with the lists of images used in the training, validation, and test sets, is provided in the `dataset/` folder.
To implement the algorithm, the following network architecture is proposed where a single CNN is utilized for both hand gesture recognition and fingertip detection.
To run in real-time, simply clone the repository, download the weights files, and then run the `real-time.py` file.

```
directory > python real-time.py
```
Real-time execution has two stages. In the first stage, the hand is detected using the single object localization (SOLO) algorithm, a simple single-object, multi-class object detection algorithm originally published in my Fingertip Mixed Reality repository. The detected hand region is then cropped and fed to the second stage for gesture recognition and fingertip detection.
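The two-stage pipeline can be sketched as below. Here `detect_hand` and `recognize` are placeholder stubs standing in for the SOLO detector and the unified network; they are not the project's actual API, only an illustration of the data flow.

```python
import numpy as np

# Placeholder stubs for the two stages (assumptions for illustration):
# the real project uses solo.h5 for stage one and the unified
# gesture/fingertip network for stage two.

def detect_hand(frame):
    """Stage 1 (SOLO): return a hand bounding box (x1, y1, x2, y2).
    A fixed box stands in for the real single object localization."""
    return (50, 50, 150, 150)

def crop(frame, box):
    """Crop the detected hand region out of the frame."""
    x1, y1, x2, y2 = box
    return frame[y1:y2, x1:x2]

def recognize(hand_crop):
    """Stage 2: placeholder for the unified network's two outputs."""
    probs = np.full(5, 0.5)    # finger class probabilities
    tips = np.zeros((5, 2))    # fingertip (x, y) within the crop
    return probs, tips

frame = np.zeros((240, 320, 3), dtype=np.uint8)  # dummy camera frame
box = detect_hand(frame)
probs, tips = recognize(crop(frame, box))
```

Cropping before stage two means the second network only ever sees a tight hand region, which simplifies both classification and fingertip regression.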
Here is the output of the unified gesture recognition and fingertip detection model for all classes of the dataset, where not only is each fingertip detected but each finger is also classified.
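Interpreting the two outputs jointly might look like the following sketch, where a fingertip is reported only if its finger-class probability passes a threshold. The 0.5 threshold and the NaN masking of hidden fingers are assumptions for illustration, not the model's exact post-processing.

```python
import numpy as np

def localize_fingertips(class_probs, positions, threshold=0.5):
    """Combine the two network outputs: keep a fingertip only when its
    finger-class probability passes the threshold, mark the rest NaN.
    (The 0.5 threshold and NaN masking are illustrative assumptions.)"""
    visible = class_probs >= threshold
    out = positions.astype(float).copy()
    out[~visible] = np.nan
    return visible, out

probs = np.array([0.9, 0.1, 0.8, 0.2, 0.95])   # per-finger probabilities
pos = np.array([[10, 20], [0, 0], [30, 40], [0, 0], [50, 60]])
visible, pts = localize_fingertips(probs, pos)
# fingers 0, 2, and 4 are kept; the other two are masked out
```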