

Patuli Bisindo Sign Language Model

This repository contains the code and resources for building a real-time Indonesian sign language (Bisindo) object detection model specifically designed for a mobile app. The model utilizes transfer learning techniques with SSD MobileNet V2 FPNLite architecture to accurately detect and localize sign language gestures in real-time video streams.


Description

This real-time Bisindo sign language object detection model is built using transfer learning, leveraging the pre-trained weights of SSD MobileNet V2 FPNLite 320x320 on a large-scale image recognition dataset. By fine-tuning the model on a custom dataset of sign language images, it has been trained to recognize and classify sign language gestures in real-time.
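
In the TensorFlow Object Detection API, fine-tuning of this kind is driven by a pipeline.config file. The excerpt below is an illustrative sketch only: the paths, batch size, and step count are our assumptions, not the exact values used for these models.

```
model {
  ssd {
    num_classes: 26  # e.g. the abjad model: one class per letter
    # ... feature extractor, anchor, and loss settings omitted
  }
}
train_config {
  batch_size: 16
  num_steps: 40000
  fine_tune_checkpoint: "pre-trained-models/ssd_mobilenet_v2_fpnlite_320x320_coco17_tpu-8/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"  # reuse detection weights, retrain the heads
}
```

Setting `fine_tune_checkpoint_type` to `"detection"` is what makes this transfer learning: the pre-trained backbone weights are restored, and only the classification heads start from scratch.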

More information about the model: here

The project consists of three separate models, each trained to detect a specific category:

  • Abjad (Alphabet): Detects and classifies various alphabet signs (26 classes).
  • Angka (Number): Detects and classifies various number signs (11 classes).
  • Kata (Word): Detects and classifies various word signs (23 classes).
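
The class counts above can be summarized in code. The alphabet and number label names here are our assumptions (A-Z and 0-10); the 23 word-sign names are not listed in this README, so only their count is recorded.

```python
import string

# Illustrative label sets; the authoritative class names live in each
# model's label map, so these are assumptions for the sketch.
ABJAD_LABELS = list(string.ascii_uppercase)   # 26 alphabet signs, A-Z
ANGKA_LABELS = [str(n) for n in range(11)]    # 11 number signs, assumed 0-10
NUM_KATA_CLASSES = 23                         # 23 word signs (names not listed here)

print(len(ABJAD_LABELS), len(ANGKA_LABELS), NUM_KATA_CLASSES)  # 26 11 23
```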

The trained models have been converted into the optimized .tflite format for efficient deployment on mobile devices. They are also quantized to reduce model size while maintaining accuracy.
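
Quantization stores weights as 8-bit integers plus a scale and zero-point instead of 32-bit floats, roughly a 4x size reduction. A minimal, TensorFlow-independent sketch of the affine scheme that post-training quantization is based on:

```python
def quantize(values, num_bits=8):
    """Affine (asymmetric) quantization: map floats into [0, 2^bits - 1]."""
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0   # guard against a constant tensor
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, s, z = quantize(weights)
restored = dequantize(q, s, z)  # close to the originals, within one scale step
```

In TFLite the same idea is applied per tensor (or per channel) automatically by the converter; this sketch only shows why accuracy survives: the round-trip error is bounded by the scale step.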

Model Performance

We trained the models three times, with each version using an improved dataset and different training parameters. The following links point to the TensorBoard visualizations of the training process for each model:

  • Model V1

    V1 is our experimental model, trained with varying step counts on a dataset that was not yet optimized.

  • Model V2 (the version used in the application)

    V2 is our first production model, trained on the optimized dataset for 40k steps.

  • Model V3

    V3 is our second production model, trained on the same dataset as V2 but with fewer training steps (20k).

Our Dataset

The training dataset used for this project consists of a large collection of annotated sign language images. The dataset includes diverse samples of different sign language gestures, captured under various lighting conditions, backgrounds, and hand orientations. The annotations provide bounding box coordinates and corresponding labels for each sign language gesture.
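
The annotations are exported in Pascal VOC format (one XML file per image, as used in the training steps below). A minimal sketch of reading a bounding box and label from one, using a made-up example annotation:

```python
import xml.etree.ElementTree as ET

# A made-up VOC annotation; real files come from the exported dataset.
VOC_XML = """
<annotation>
  <filename>abjad_A_001.jpg</filename>
  <size><width>320</width><height>320</height></size>
  <object>
    <name>A</name>
    <bndbox><xmin>48</xmin><ymin>60</ymin><xmax>210</xmax><ymax>250</ymax></bndbox>
  </object>
</annotation>
"""

def parse_voc(xml_text):
    """Return a list of {label, box} dicts from one VOC annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),
            "box": tuple(int(bb.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax")),
        })
    return boxes

print(parse_voc(VOC_XML))  # [{'label': 'A', 'box': (48, 60, 210, 250)}]
```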

Link to the dataset:

Training the Model

You can train the model locally or on Google Colab. In our case, we trained the model locally on a machine with a CUDA-enabled GPU in a Linux environment using WSL2 (Ubuntu 20 LTS). A Linux environment is required because some of the commands used in the training process are Linux-specific.

To train the model locally, you can follow the steps below:

  1. Ensure that CUDA and cuDNN are installed on your machine. You can follow the steps here to install CUDA and here to install cuDNN. We followed this guide to install CUDA on our WSL2 Ubuntu.
  2. Install the WSL extension in VS Code.
  3. Prepare a virtual environment for training. You can follow the steps here to create one.
  4. Get your API key from your Roboflow account settings and store it in a .env file. Alternatively, you can download the dataset manually: visit the dataset links above, export in VOC format, and move the files into the images folder of the project directory. Doing this lets you skip the dataset download step in the notebook.
  5. Clone the repository and open the notebooks in their own directory or as the root directory.
  6. Follow the steps in the notebook to train the model.
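
Step 4 stores the Roboflow API key in a .env file; notebooks typically load it with a package such as python-dotenv. A dependency-free sketch of the same idea (the variable name ROBOFLOW_API_KEY and the file layout are our assumptions):

```python
import os

def load_dotenv(path=".env"):
    """Tiny stand-in for python-dotenv: read KEY=VALUE lines into os.environ."""
    if not os.path.exists(path):
        return
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                os.environ[key.strip()] = value.strip()

# Example: write a throwaway .env, then load it.
with open(".env", "w") as fh:
    fh.write("ROBOFLOW_API_KEY=your-key-here\n")
load_dotenv()
print(os.environ["ROBOFLOW_API_KEY"])  # your-key-here
```

Keeping the key in .env (and out of version control) means the notebook can read it without the key ever appearing in the repository.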

To train the model on Google Colab, you will need to set the Python version to 3.8.10. You can follow the steps below:

  1. Create a new notebook on Google Colab.
  2. Import the notebook from this repository.
  3. Get your API key from your Roboflow account settings and store it in a .env file. Alternatively, you can download the dataset manually: visit the dataset links above, export in VOC format, and move the files into the images folder of the project directory. Doing this lets you skip the dataset download step in the notebook.
  4. Follow the steps in the notebook to train the model.
  5. After training finishes, you can download the model from the notebook.

Contributors

ammarsufyan, dadangdut33, dizzyme09

