m-davies / eye-of-horus

A facial and gesture recognition authentication system

License: GNU General Public License v3.0

Python 52.96% HTML 0.91% CSS 1.06% JavaScript 43.98% Pug 0.13% Shell 0.97%
all-in-one authentication aws biometrics boto3 camera face facial-recognition gesture gesture-recognition gestures hybrid kinesis mobile rekognition security security-camera unlock vision

eye-of-horus's People

Contributors

dependabot[bot], m-davies

eye-of-horus's Issues

No sanity checking for invalid gestures

We have a set number of gesture types but no logic in the python scripts to limit which gestures a user can pass to them. We will need to find a way to pull the labels we have specified from AWS and ensure the user is using one of them.
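
A minimal sketch of that validation step, assuming the allowed labels can be loaded or hardcoded up front (the gesture names below mirror the ones proposed in the gesture types issue and are placeholders until they are actually pulled from AWS):

# Sketch only: reject gestures that are not in the set of trained labels.
# ALLOWED_GESTURES is a placeholder; ideally it would be pulled from the
# AWS Custom Labels project rather than hardcoded here.
ALLOWED_GESTURES = {"thumbs_up", "thumbs_down", "open_hand", "closed_hand"}

def validate_gestures(requested):
    """Return the gesture names that are not recognised (empty list = all valid)."""
    return [gesture for gesture in requested if gesture not in ALLOWED_GESTURES]

invalid = validate_gestures(["thumbs_up", "wave"])
if invalid:
    raise ValueError(f"Unknown gesture types: {invalid}")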

Remove ability to turn off the model

This is more of a note issue for later down the line. Normal users should not be able to turn the rekog model on and off. That should be down to S3 Admins only (which is not going to be a feature of this project).

Implement custom gesture pin lengths

Currently the combination length is fixed at a minimum and maximum of 4, but the maximum could probably be expanded to be larger. However, this is not a high priority given the timescale this project needs to be completed in.
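
If the bounds were made configurable, the length check could look something like this minimal sketch (the constants and function name are illustrative, not existing code):

MIN_COMBINATION_LENGTH = 4  # current fixed value
MAX_COMBINATION_LENGTH = 8  # hypothetical expanded maximum

def check_combination_length(gestures):
    # Reject combinations outside the configured bounds
    if not MIN_COMBINATION_LENGTH <= len(gestures) <= MAX_COMBINATION_LENGTH:
        raise ValueError(
            f"Gesture combination must contain between {MIN_COMBINATION_LENGTH} "
            f"and {MAX_COMBINATION_LENGTH} gestures"
        )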

Add in unit tests for python

Although an ongoing topic, it would be good to have a number of tests that thoroughly exercise the different entry points of the application, alongside the upcoming website tests. pytest seems to be a reasonable industry standard to accomplish this.
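
As a rough illustration of the pytest style (the module import path and function names below are assumptions, not the real entry points):

# test_gestures.py -- illustrative only; import path and functions are assumed
import pytest

from scripts import manager  # hypothetical import path

def test_rejects_unknown_gesture():
    with pytest.raises(ValueError):
        manager.check_combination(["not_a_real_gesture"])

def test_accepts_known_combination():
    assert manager.check_combination(["thumbs_up", "open_hand", "closed_hand", "thumbs_up"]) is True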

Address slow gesture recog

Could be due to sleeping prior to checking frames, general latency, Python performance, or just that we are using a web API for a streaming service (probably not the best idea). Regardless, this will require investigation and debugging.

Originally posted in #22
Arguably more serious (and possibly related to point 1) (EDIT: Unlikely, the model is now pretty accurate and timestamp debug output shows the frames are quite slow to come in), the model is too slow at picking up gesture types. At the moment, if the gesture combination given is not immediately correct, the rate at which old frames from the stream are identified is slower than the rate at which new frames are produced by the stream. This is obviously an issue where a user could find themselves denied access if they are not quick enough, the ultimate antithesis of this project, which focuses on accessibility. I think this is more likely a limitation of calling a web resource (vs a locally stored resource) rather than raw performance. Regardless, I will ask about Pythonic ways to improve perf in the standup tomorrow, but if that doesn't help, my remaining bleak options are thus:

  • Only capture every other frame in the stream -> Reduces the rate of incoming frames but also reduces accuracy, as only every other snapshot from the livestream will be picked up (see the sketch after this list)
  • Record a video instead of a live stream and instead run that through detect_custom_labels -> This actually exists for standard labels but not for custom labels (which is what I need). Nevertheless, this would eliminate the race condition problem altogether but would bring up security concerns about opening an attack surface for gesture combination "brute forcing"; essentially, allowing a video file to be passed in would create big problems. Furthermore, it may still take a long time to search for the gesture.
  • Take pictures instead of a stream -> Definitely the quickest method and adds a lot more structure and reliability to the process, especially when it comes to the website. However, this still has the same problems as above and has no failsafe (there is no "next frame" in a still image). This could also be brute forced even more easily. This is probably my preferred method, but the security concerns do worry me as this is supposed to be a good authentication system. I will need to give this some thought.
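
For illustration, the first option (frame skipping) might look roughly like the sketch below, reusing the existing gesture_recog.checkForGestures call; the capture source and skip rate are assumptions:

import cv2

import gesture_recog  # existing module, used here the same way manager.py does

FRAME_SKIP = 2  # only submit 1 in every 2 frames to the model

capture = cv2.VideoCapture(0)  # placeholder source; the real code streams via Kinesis
frame_count = 0
while capture.isOpened():
    ok, frame = capture.read()
    if not ok:
        break
    frame_count += 1
    if frame_count % FRAME_SKIP != 0:
        continue  # drop this frame so recognition can keep up with the stream
    if gesture_recog.checkForGestures(cv2.imencode(".jpg", frame)[1].tobytes()):
        break
capture.release()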

Investigate AWS Rekognition for facial comparison capabilities

Why?

AWS provides more powerful and refined services than local Python libraries for a reasonably cheap, if not free, price. Albeit harder to set up and understand initially, it will save me a lot of development time later in the project.

Aims

Investigate AWS Rekognition, mainly following this tutorial to produce a prototype script. Collect video evidence to confirm it works.

This script would take in a video stream (from my laptop or phone camera) and compare the faces in it to stored images on an AWS bucket. The response would be similar to the examples here. A successful test would be a positive match with my face stored in AWS to a group of faces in the video stream.
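
A rough prototype of that comparison step might be a single boto3 call, sketched below under the assumption that the reference faces have already been indexed into a Rekognition collection (the collection name is a placeholder):

import boto3

rekognition = boto3.client("rekognition")

def match_face(frame_bytes, collection_id="eye-of-horus-faces", threshold=90):
    # Search the indexed reference faces for a match against one captured frame
    response = rekognition.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": frame_bytes},
        FaceMatchThreshold=threshold,
        MaxFaces=1,
    )
    matches = response.get("FaceMatches", [])
    return matches[0]["Similarity"] if matches else None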

Setup AWS Custom Labels

Follow up to #6

Once we have a stream image extractor, we can begin working on setting up the AWS Custom Label framework using the console (following this tutorial)

Add in ability to edit user face or gesture combination

Currently, only the ability to create/overwrite a profile or delete it exists. It would be useful to include an edit option that allows objects to be edited without having to delete them first. This could either be handled by the current Python scripts (preferable from a development standpoint) or via the website backend (preferable from a security standpoint).

Add in file copying and renaming sanitisation in python scripts

Right now, if you try to upload a file called 20210301_120317(0).jpg it won't be expanded. I am unsure of the consequences of this, but the path does end up as 20210301_120317\(0\).jpg because the brackets are escaped (and that's without considering the effect of files with spaces). As a result, we should sanitise these names or at least warn the user about these issues.
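
A minimal sanitisation sketch (the replacement scheme is just one option):

import os
import re

def sanitise_filename(path):
    # Replace characters that would need shell escaping (brackets, spaces, etc.)
    directory, name = os.path.split(path)
    safe = re.sub(r"[^\w.\-]", "_", name)
    return os.path.join(directory, safe)

# "20210301_120317(0).jpg" becomes "20210301_120317_0_.jpg"
print(sanitise_filename("20210301_120317(0).jpg"))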

Add a start stream button to frontend and hook up with backend

The skeleton is in place (#3), now we just need to ensure the frontend can actually utilise and query the backend. Since we'll be using S3 as a database, this will be considered complete when we can query the frontend and it will execute a backend script.

This involves hooking up startStream.sh with a frontend button that will begin the streaming process to aws

Deprecation warning: tostring()

Follow-up from #22: we need to figure out an alternative, as I don't believe the output of tobytes() is accepted by AWS

/Users/morgan/Documents/Repos/eye-of-horus/src/scripts/manager.py:588: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
  foundGesture = gesture_recog.checkForGestures(cv2.imencode(".jpg", frame)[1].tostring())
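
For what it's worth, numpy's tobytes() returns the same byte string as the deprecated tostring(), so the swap may well be a drop-in replacement (still needs verifying against the AWS call):

foundGesture = gesture_recog.checkForGestures(cv2.imencode(".jpg", frame)[1].tobytes())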

Improve Gesture ML model

Right now, the machine learning gesture recognition model is at 95% accuracy. This is acceptable for development, but a production ML model should consistently achieve at least 98% (my own standard). As such, we need tighter and more plentiful training data in the model (related: #21).

Add in logout route and view

A logout page, along with the authentication logic and routes to support it, will be needed to prove that logging out works just as well as logging in.

Remove unlock from login form

This is currently there as it shares usage with the create form. However, we will need to have this toggled on/off depending on whether we are logging in or not.

Expand gesture functionality to allow objects combinations to act as keys too

Random idea, and perhaps a bad pitch: instead of just allowing for gesture recog, why not allow combinations to consist of ANY objects! This would dramatically increase the key space (making brute force attacks far less feasible) AND perhaps allow for a more accurate model if it only contains pictures of distinct objects rather than similar looking gestures (see #30 for the struggles I have had with this).

Remove setUserExists

It's not really needed anymore since we are going to generate the authenticate.js page based on the registering bool instead of a token.

Allow scripts to return an identifiable response to their queries

Case in point, index_photo.py will say in the console log that it was successful at adding the image to the Rekognition collection but doesn't return any JSON or API code response that the NodeJS backend will be able to utilise.

We will need to add some sort of framework (early on) that will return the result (both payload and code) of the python script execution to the NodeJS backend for use in error handling and/or api redirection.
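
One minimal shape for that framework, assuming a JSON blob printed to stdout plus a matching exit code (the field names are illustrative):

import json
import sys

def emit_result(code, payload):
    # Print a machine-readable result for the NodeJS backend to parse,
    # then exit with a status the backend can also check
    print(json.dumps({"code": code, "payload": payload}))
    sys.exit(0 if code < 400 else 1)

# e.g. at the end of a successful index_photo.py run:
emit_result(200, {"message": "Image added to Rekognition collection"})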

Define a set number of gesture types a user can make

For testing, I'm thinking 4 (thumbs up, thumbs down, open hand, closed hand) but this is interchangeable and not set in stone. Furthermore, we also need to develop the actual test images for AWS Custom Labels (at least 30 of each category, ideally 40 to be safe, in a variety of lighting conditions and locations and with different people in the image).

Create Website Skeleton

Deploy a basic website skeleton that will hold the nodejs backend and reactjs frontend

To save time and avoid errors, I'm using Facebook's create-react-app tool to generate the skeleton

See this article for further reasons why it's important to get this correct

image_manager.py is in the wrong place

If we plan on having this as a master file, we will need to either adjust it so it is no longer the master file and create a new one (or adjust commons.py to be the new master file), OR move it up a directory to scripts/ and make it the new master file. This is because it will also need to make use of the gesture scripts in the directory alongside it, which creates awkward pathing problems from going up and down directories.

Investigate AWS Rekognition and Azure CustomVision for gesture comparison capabilities

Why?

AWS provides more powerful and refined services than local Python libraries for a reasonably cheap, if not free, price. Albeit harder to set up and understand initially, it will save me a lot of development time later in the project.

Aims

COMPLETE ONE OF THESE TWO

  • Investigate AWS Rekognition, mainly following this tutorial to produce a prototype script. Collect video evidence to confirm it works.
  • Investigate Azure Custom Vision, partially following this tutorial to produce a prototype script. Collect video evidence to confirm it works.

The reason I say one of these two is because I haven't explored AWS Custom Labels (the branch of AWS Rekognition that allows for user labeling of training data) that well so I'm not sure if it conforms to our spec or if it's free.

This script would IDEALLY take in a video stream (from my laptop or phone camera) and compare performed gestures in it to stored images of gestures on an online bucket. I say IDEALLY because AWS Custom Labels doesn't allow for video streaming and I'm unsure if Azure does or not. In the case of both of these not supporting streams or video, I will have to adapt my solution slightly so it takes pictures of gestures instead of streamed video.

A successful test would be a positive match with my performed gesture stored in the cloud to a gesture performed in the video stream or image.
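
If the still-image route is taken, the prototype check could boil down to a single detect_custom_labels call, sketched below (the project version ARN and confidence threshold are placeholders):

import boto3

rekognition = boto3.client("rekognition")

def detect_gesture(image_bytes, model_arn, min_confidence=90):
    # Ask the trained Custom Labels model which gesture (if any) is in the image
    response = rekognition.detect_custom_labels(
        ProjectVersionArn=model_arn,
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,
    )
    labels = response.get("CustomLabels", [])
    return labels[0]["Name"] if labels else None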

Add redirects for when specific tokens are not set

E.g. authenticated, username and userexists will need to be set in order to access the dashboard. This can be achieved by automatically redirecting, with an error alert to explain the situation, whenever the required tokens are not set.

if(!token) {
  return <Login setToken={setToken} />
}

Define an acceptance policy for recognition

To save time and processing power, we could define a policy to accept high confidence faces or gestures from the camera instead of the static "90% confidence or more" policy we have now. Idea:

  • Face recog of 90% or above = Access Granted
  • Face recog of 70-90% and gesture combination was correctly entered = Access Granted
  • Face recog of less than 70% leading to an average gesture recog of 90% or above OR higher than 70% but the wrong gesture combination was given = Access Denied
  • Either face or gesture is less than 70% or non-existent = Access Denied

NOTE: The strikethroughs are due to me reevaluating what will actually be needed in light of the web frontend and feasibility of an "average" gesture combination given
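
Encoded as a rough sketch (thresholds are the ones listed above; the struck-through tiers may not apply):

def access_decision(face_confidence, gesture_combination_correct):
    # Rough encoding of the tiered policy above
    if face_confidence >= 90:
        return True   # strong face match alone is enough
    if 70 <= face_confidence < 90 and gesture_combination_correct:
        return True   # medium face match backed up by the correct gestures
    return False      # everything else is denied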

Wrap scripts and website inside docker files

Our backend (and probably our frontend too eventually) contains a lot of dependencies across quite a few different systems. To make life easier for end users, and for me in case I lose my computer, it would be useful to wrap the backend and frontend in Docker containers (perhaps even encapsulated in a Kubernetes cluster) via Dockerfiles.

Documentation on running the scripts locally

The Python scripts are going to be designed with modularity in mind, meaning you can run parts of the process (or none at all) from different parts of the program. This may be confusing to new users/examiners, so we will need to document this as we go in a docs/ folder.

Flesh out the frontend

The website is rather dead and lacks any navbar or login features. Once the backend and testing phases are sorted, we'll need to work on getting a number of features implemented into the frontend for the backend to hook to. This list is not complete:

  • Navbar (Login/Logout, Home, About, Contributing, etc)
  • Pagination (Register, About Us, Feedback/Contact form, etc)

Verbosity option for python scripts

Sometimes we don't want to print out all the debugging data, just the errors. Other times, we do. It would be good to have a -v/--verbose option to toggle this.
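
A minimal sketch of how the flag could be wired up with argparse and logging:

import argparse
import logging

parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="store_true",
                    help="Print debugging output as well as errors")
args = parser.parse_args()

logging.basicConfig(level=logging.DEBUG if args.verbose else logging.ERROR)
logging.debug("Only shown when -v/--verbose is passed")
logging.error("Always shown")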

Add in user account gesture storage logic for AWS

This basically means that we need to figure out how we're going to store a user's gesture combination in AWS. Could be:

  • Some sort of file (JSON?) that lists the order and type of gesture (sketched after this list)
  • The files themselves in a folder marked 1/ or equivalent
  • A combination of both methods
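
As a sketch of the first option, a small JSON manifest uploaded alongside the gesture images (the bucket, key and field names are placeholders):

import json
import boto3

s3 = boto3.client("s3")

# Hypothetical manifest recording the order and type of a user's gestures
gesture_manifest = {
    "username": "example-user",
    "combination": ["thumbs_up", "open_hand", "closed_hand", "thumbs_down"],
}

s3.put_object(
    Bucket="eye-of-horus-users",  # placeholder bucket
    Key="example-user/gestures/combination.json",
    Body=json.dumps(gesture_manifest),
)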
