
gaurav-aditya / computer-automation


Computer Automation using OpenCv and MediaPipe - Virtually controlling computer using hand-gestures and voice commands.

License: GNU General Public License v3.0

Python 85.60% CSS 9.27% HTML 2.44% JavaScript 2.68%

computer-automation's Introduction

Computer Automation using Gesture Controls

Computer Automation is a gesture-controlled virtual mouse that simplifies human-computer interaction through hand gestures and voice commands. The computer requires almost no direct contact: all I/O operations can be controlled virtually using static and dynamic hand gestures along with a voice assistant. The project uses machine learning and computer vision algorithms to recognize hand gestures and voice commands, and works without any additional hardware requirements. It leverages models such as the CNN implemented by MediaPipe, which runs on top of pybind11. It consists of two modules: one works directly on bare hands using MediaPipe hand detection, and the other works with gloves of any uniform color. Currently it works on the Windows platform.

Note: Use Python version: 3.8.5

Features


Gesture Recognition:

Neutral Gesture: Used to halt/stop execution of the current gesture.
Move Cursor: The cursor is assigned to the midpoint of the index and middle fingertips. This gesture moves the cursor to the desired location. The speed of the cursor movement is proportional to the speed of the hand.
Left Click: Gesture for a single left click.
Right Click: Gesture for a single right click.
Double Click: Gesture for a double click.
Scrolling: Dynamic gestures for horizontal and vertical scroll. The scroll speed is proportional to the distance the pinch gesture has moved from its start point. Vertical and horizontal scrolls are controlled by vertical and horizontal pinch movements respectively.
Drag and Drop: Gesture for drag and drop functionality. Can be used to move/transfer files from one directory to another.
Multiple Item Selection: Gesture to select multiple items.
Volume Control: Dynamic gestures for volume control. The rate of increase/decrease of volume is proportional to the distance the pinch gesture has moved from its start point.
Brightness Control: Dynamic gestures for brightness control. The rate of increase/decrease of brightness is proportional to the distance the pinch gesture has moved from its start point.
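
The two mappings described in this list (cursor at the fingertip midpoint, and a rate proportional to pinch displacement) can be sketched in plain Python. This is only an illustration, not the project's code: `cursor_position`, `pinch_rate`, `gain` and `dead_zone` are hypothetical names, the landmarks are assumed to be normalized (x, y) pairs as MediaPipe Hands reports them, and the cursor is mapped by absolute position rather than by hand speed to keep the example short.

```python
from typing import Sequence, Tuple

# MediaPipe Hands landmark indices for the two fingertips named above.
INDEX_FINGER_TIP = 8
MIDDLE_FINGER_TIP = 12

Point = Tuple[float, float]


def cursor_position(landmarks: Sequence[Point], screen_w: int, screen_h: int) -> Tuple[int, int]:
    """Map the midpoint of the index and middle fingertips to screen pixels.

    `landmarks` holds normalized (x, y) pairs in [0, 1], one per hand landmark.
    """
    ix, iy = landmarks[INDEX_FINGER_TIP]
    mx, my = landmarks[MIDDLE_FINGER_TIP]
    mid_x, mid_y = (ix + mx) / 2, (iy + my) / 2
    # Clamp so a hand partly outside the frame cannot push the cursor off-screen.
    px = min(max(int(mid_x * screen_w), 0), screen_w - 1)
    py = min(max(int(mid_y * screen_h), 0), screen_h - 1)
    return px, py


def pinch_rate(start: float, current: float, gain: float = 1.0, dead_zone: float = 0.02) -> float:
    """Return a rate proportional to the pinch displacement from its start point.

    `gain` and `dead_zone` are assumed tuning parameters, not values from the project.
    """
    delta = current - start
    if abs(delta) < dead_zone:
        return 0.0  # ignore small jitter around the start point
    return gain * delta
```

The same `pinch_rate` helper could drive scroll, volume or brightness, since all three are described as proportional to the distance moved from the start point; only the axis and the gain would differ.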

Voice Assistant ( Proton ):

Launch / Stop Gesture Recognition
  • Proton Launch Gesture Recognition
    Turns on webcam for hand gesture recognition.
  • Proton Stop Gesture Recognition
    Turns off webcam and stops gesture recognition. (Termination of Gesture controller can also be done via pressing Enter key in webcam window)
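
One common way to let a voice command start and stop a long-running loop is a background thread guarded by a `threading.Event`. The sketch below assumes that pattern; `GestureWorker` is a hypothetical stand-in that only counts iterations instead of reading webcam frames, and it is not taken from the repository.

```python
import threading
import time


class GestureWorker:
    """Hypothetical stand-in for the gesture recognition loop."""

    def __init__(self) -> None:
        self._stop = threading.Event()
        self._thread = None
        self.frames = 0

    def _run(self) -> None:
        # A real loop would read and process webcam frames here.
        while not self._stop.is_set():
            self.frames += 1
            time.sleep(0.01)

    def is_running(self) -> bool:
        return self._thread is not None and self._thread.is_alive()

    def launch(self) -> bool:
        """Start the loop; return False if it is already running."""
        if self.is_running():
            return False
        self._stop.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()
        return True

    def stop(self) -> bool:
        """Signal the loop to exit and wait for it; return False if not running."""
        if not self.is_running():
            return False
        self._stop.set()
        self._thread.join()
        return True
```

Returning a boolean from `launch` and `stop` lets the assistant tell the user whether recognition was already in the requested state.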
Google Search
  • Proton search {text_you_wish_to_search}
    Opens a new tab on Chrome Browser if it is running, else opens a new window. Searches the given text on Google.
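
A minimal sketch of turning the spoken command into a Google query, using only the standard library. `build_search_url` and `search` are hypothetical helpers, and `webbrowser` opens the system default browser, which is not necessarily Chrome as described above.

```python
import webbrowser
from urllib.parse import quote_plus

COMMAND_PREFIX = "proton search"


def build_search_url(command: str) -> str:
    """Strip the 'Proton search' prefix and URL-encode the remaining text."""
    text = command.strip()
    if text.lower().startswith(COMMAND_PREFIX):
        text = text[len(COMMAND_PREFIX):].strip()
    return "https://www.google.com/search?q=" + quote_plus(text)


def search(command: str) -> bool:
    """Open the query in a new tab of the default browser."""
    return webbrowser.open_new_tab(build_search_url(command))
```

For example, `build_search_url("Proton search hand gesture")` yields `https://www.google.com/search?q=hand+gesture`.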
Find a Location on Google Maps
  1. Proton Find a Location
    Will ask the user for the location to be searched.
  2. {Location_you_wish_to_find}
    Will find the required location on Google Maps in a new Chrome tab.
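
The two-step exchange above (command first, location second) needs a little state between utterances. The sketch below models that with a flag; `LocationDialog` and its prompt text are hypothetical, and the URL follows Google's documented Maps URLs search format rather than whatever the project itself builds.

```python
from typing import Optional
from urllib.parse import urlencode


class LocationDialog:
    """Tracks whether the next utterance should be treated as a location."""

    def __init__(self) -> None:
        self.awaiting_location = False

    def handle(self, utterance: str) -> Optional[str]:
        """Return a prompt, a Maps URL, or None if the utterance is unrelated."""
        text = utterance.strip()
        if self.awaiting_location:
            # Step 2: whatever the user says now is the place to look up.
            self.awaiting_location = False
            query = urlencode({"api": 1, "query": text})
            return "https://www.google.com/maps/search/?" + query
        if "find a location" in text.lower():
            # Step 1: remember that a location is expected next.
            self.awaiting_location = True
            return "Which place are you looking for?"
        return None
```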
File Navigation
  • Proton list files / Proton list
    Will list the files and respective file_numbers in your Current Directory (by default C:)
  • Proton open {file_number}
    Opens the file / directory corresponding to specified file_number.
  • Proton go back / Proton back
    Changes the Current Directory to Parent Directory and lists the files.
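
The list / open / go back commands can be modelled as a small navigator that remembers the current directory and the most recent listing. This is a sketch under assumptions: `FileNavigator` is a hypothetical class, numbering is assumed to start at 1, and files are only resolved to a path here rather than launched.

```python
import os
from typing import List


class FileNavigator:
    """Keeps the current directory and the most recent numbered listing."""

    def __init__(self, start: str) -> None:
        self.cwd = os.path.abspath(start)
        self.entries: List[str] = []

    def list_files(self) -> List[str]:
        """Return 'number: name' lines for the current directory."""
        self.entries = sorted(os.listdir(self.cwd))
        return [f"{i}: {name}" for i, name in enumerate(self.entries, start=1)]

    def open_entry(self, file_number: int) -> str:
        """Enter a directory, or return the path of a file, by its listed number."""
        if not 1 <= file_number <= len(self.entries):
            raise IndexError(f"no entry numbered {file_number}")
        target = os.path.join(self.cwd, self.entries[file_number - 1])
        if os.path.isdir(target):
            self.cwd = target
            self.list_files()  # refresh numbering for the new directory
        # For a regular file, a Windows build could call os.startfile(target) here.
        return target

    def go_back(self) -> List[str]:
        """Move to the parent directory and list it."""
        self.cwd = os.path.dirname(self.cwd)
        return self.list_files()
```

Keeping the last listing in `entries` is what lets "Proton open {file_number}" refer back to the numbers the user just heard.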
Current Date and Time
  • Proton what is today's date / Proton date
    Proton what is the time / Proton time
    Returns the current date and time.
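
A small sketch of how the date and time commands might be answered with the standard `datetime` module. `date_time_reply` and the output formats are assumptions chosen for illustration; the optional `now` argument exists only to make the function easy to test.

```python
from datetime import datetime
from typing import Optional


def date_time_reply(command: str, now: Optional[datetime] = None) -> Optional[str]:
    """Answer 'date' or 'time' commands; return None for anything else."""
    now = now or datetime.now()
    text = command.lower()
    if "date" in text:
        return now.strftime("%d/%m/%Y")
    if "time" in text:
        return now.strftime("%H:%M:%S")
    return None
```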
Copy and Paste
  • Proton Copy
    Copies the selected text to clipboard.
  • Proton Paste
    Pastes the copied text.
Sleep / Wake up Proton
  • Sleep
    Proton bye
    Pauses voice command execution till the assistant is woken up.
  • Wake up
    Proton wake up
    Resumes voice command execution.
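
The sleep / wake behaviour amounts to a two-state machine: while asleep, every utterance except the wake phrase is ignored. A minimal sketch, assuming a hypothetical `Assistant` class that simply records which commands it would have executed:

```python
from typing import List


class Assistant:
    """Two-state command gate: awake (executes) or asleep (ignores)."""

    def __init__(self) -> None:
        self.awake = True
        self.executed: List[str] = []

    def handle(self, utterance: str) -> str:
        text = utterance.lower().strip()
        if not self.awake:
            # Only the wake phrase is honoured while paused.
            if "wake up" in text:
                self.awake = True
                return "resumed"
            return "ignored"
        if "bye" in text:
            self.awake = False
            return "paused"
        self.executed.append(text)  # placeholder for real command dispatch
        return "executed"
```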
Exit
  • Proton Exit
    Terminates the voice assistant thread. The GUI window needs to be closed manually.

Getting Started

Pre-requisites

Python: (3.6 - 3.8.5)
Anaconda Distribution: To download click here.
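
Before installing, it may help to confirm that the active interpreter falls inside the supported range listed above. The check below is a hypothetical helper, not part of the repository; it treats 3.6.0 and 3.8.5 as inclusive bounds.

```python
import sys
from typing import Optional, Tuple

MIN_VERSION = (3, 6, 0)
MAX_VERSION = (3, 8, 5)


def is_supported(version: Optional[Tuple[int, int, int]] = None) -> bool:
    """Return True if the version lies within the range this README lists."""
    v = tuple(version or sys.version_info[:3])
    return MIN_VERSION <= v <= MAX_VERSION


if not is_supported():
    print("Warning: this project lists Python 3.6 - 3.8.5; found",
          ".".join(map(str, sys.version_info[:3])))
```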

Procedure

git clone https://github.com/gaurav-aditya/Computer-Automation.git

For detailed information about cloning visit here.

Step 1:

conda create --name gest python=3.8.5

Step 2:

conda activate gest

Step 3:

pip install -r requirements.txt

Step 4:

conda install PyAudio
conda install pywin32

Step 5:

Change directory to the src folder of the cloned repository.

Command may look like: cd C:\Users\.....\Computer-Automation\src

Step 6:

For running Voice Assistant:

python Proton.py

( You can enable Gesture Recognition by using the command "Proton Launch Gesture Recognition" )

Or, to run only Gesture Recognition without the voice assistant:

Uncomment the last 2 lines of code in the file Gesture_Controller.py, then run:

python Gesture_Controller.py

Collaborators

Aditya Prakash
Amit Tiwari
Bhuwan Chauhan

About me: https://linktr.ee/echoaditya
