m-davies / eye-of-horus

A facial and gesture recognition authentication system

License: GNU General Public License v3.0

Python 52.96% HTML 0.91% CSS 1.06% JavaScript 43.98% Pug 0.13% Shell 0.97%
all-in-one authentication aws biometrics boto3 camera face facial-recognition gesture gesture-recognition gestures hybrid kinesis mobile rekognition security security-camera unlock vision

eye-of-horus's People

Contributors

dependabot[bot], m-davies

eye-of-horus's Issues

No sanity checking for invalid gestures

We have a set number of gesture types but no logic in the python scripts to limit which gestures a user can pass to them. We will need to find a way to pull the labels we have specified from AWS and ensure the user is using one of them.
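
A minimal sketch of that validation step, assuming the allowed labels can be loaded or hardcoded up front (the gesture names below mirror the ones proposed in the gesture types issue and are placeholders until they are actually pulled from AWS):

# Sketch only: reject gestures that are not in the set of trained labels.
# ALLOWED_GESTURES is a placeholder; ideally it would be pulled from the
# AWS Custom Labels project rather than hardcoded here.
ALLOWED_GESTURES = {"thumbs_up", "thumbs_down", "open_hand", "closed_hand"}

def validate_gestures(requested):
    """Return the gesture names that are not recognised (empty list = all valid)."""
    return [gesture for gesture in requested if gesture not in ALLOWED_GESTURES]

invalid = validate_gestures(["thumbs_up", "wave"])
if invalid:
    raise ValueError(f"Unknown gesture types: {invalid}")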

Remove ability to turn off the model

This is more of a note issue for later down the line. Normal users should not be able to turn the rekog model on and off. That should be down to S3 Admins only (which is not going to be a feature of this project).

Implement custom gesture pin lengths

Currently the combination length is fixed at a minimum and maximum of 4, but the maximum could probably be expanded to be larger. However, this is not a high priority given the timescale this project needs to be completed in.
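
If the bounds were made configurable, the length check could look something like this minimal sketch (the constants and function name are illustrative, not existing code):

MIN_COMBINATION_LENGTH = 4  # current fixed value
MAX_COMBINATION_LENGTH = 8  # hypothetical expanded maximum

def check_combination_length(gestures):
    # Reject combinations outside the configured bounds
    if not MIN_COMBINATION_LENGTH <= len(gestures) <= MAX_COMBINATION_LENGTH:
        raise ValueError(
            f"Gesture combination must contain between {MIN_COMBINATION_LENGTH} "
            f"and {MAX_COMBINATION_LENGTH} gestures"
        )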

Add in unit tests for python

Although an ongoing topic, it would be good to have a number of tests that thoroughly exercise the different entry points of the application, alongside the upcoming website tests. pytest seems to be a reasonable industry standard to accomplish this.
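
As a rough illustration of the pytest style (the module import path and function names below are assumptions, not the real entry points):

# test_gestures.py -- illustrative only; import path and functions are assumed
import pytest

from scripts import manager  # hypothetical import path

def test_rejects_unknown_gesture():
    with pytest.raises(ValueError):
        manager.check_combination(["not_a_real_gesture"])

def test_accepts_known_combination():
    assert manager.check_combination(["thumbs_up", "open_hand", "closed_hand", "thumbs_up"]) is True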

Address slow gesture recog

Could be due to sleeping prior to checking frames, general latency, Python performance, or just that we are using a web API for a streaming service (probably not the best idea). Regardless, this will require investigation and debugging.

Originally posted in #22
Arguably more serious (and possibly related to point 1) (EDIT: Unlikely, the model is now pretty accurate and timestamp debug output shows the frames are quite slow to come in), the model is too slow at picking up gesture types. At the moment, if the gesture combination given is not immediately correct, the rate at which old frames from the stream are identified is slower than the rate at which new frames are produced by the stream. This is obviously an issue where a user could find themselves denied access if they are not quick enough, the ultimate antithesis of this project, which focuses on accessibility. I think this is more likely a limitation of calling a web resource (vs a locally stored resource) rather than raw performance. Regardless, I will ask about Pythonic ways to improve perf in the standup tomorrow, but if that doesn't help, my remaining bleak options are thus:

  • Only capture every other frame in the stream -> Reduces the rate of incoming frames but also reduces accuracy, as only every other snapshot from the livestream will be picked up (see the sketch after this list)
  • Record a video instead of a live stream and instead run that through detect_custom_labels -> This actually exists for standard labels but not for custom labels (which is what I need). Nevertheless, this would eliminate the race condition problem altogether but would bring up security concerns about opening an attack surface for gesture combination "brute forcing"; essentially, allowing a video file to be passed in would create big problems. Furthermore, it may still take a long time to search for the gesture.
  • Take pictures instead of a stream -> Definitely the quickest method and adds a lot more structure and reliability to the process, especially when it comes to the website. However, this still has the same problems as above and has no failsafe (there is no "next frame" in a still image). This could also be brute forced even more easily. This is probably my preferred method, but the security concerns do worry me as this is supposed to be a good authentication system. I will need to give this some thought.
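
For illustration, the first option (frame skipping) might look roughly like the sketch below, reusing the existing gesture_recog.checkForGestures call; the capture source and skip rate are assumptions:

import cv2

import gesture_recog  # existing module, used here the same way manager.py does

FRAME_SKIP = 2  # only submit 1 in every 2 frames to the model

capture = cv2.VideoCapture(0)  # placeholder source; the real code streams via Kinesis
frame_count = 0
while capture.isOpened():
    ok, frame = capture.read()
    if not ok:
        break
    frame_count += 1
    if frame_count % FRAME_SKIP != 0:
        continue  # drop this frame so recognition can keep up with the stream
    if gesture_recog.checkForGestures(cv2.imencode(".jpg", frame)[1].tobytes()):
        break
capture.release()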

Investigate AWS Rekognition for facial comparison capabilities

Why?

AWS provides more powerful and refined services than local Python libraries for a reasonably cheap, if not free, price. Albeit harder to set up and understand initially, it will save me a lot of development time later in the project.

Aims

Investigate AWS Rekognition, mainly following this tutorial to produce a prototype script. Collect video evidence to confirm it works.

This script would take in a video stream (from my laptop or phone camera) and compare the faces in it to stored images on an AWS bucket. The response would be similar to the examples here. A successful test would be a positive match with my face stored in AWS to a group of faces in the video stream.
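
A rough prototype of that comparison step might be a single boto3 call, sketched below under the assumption that the reference faces have already been indexed into a Rekognition collection (the collection name is a placeholder):

import boto3

rekognition = boto3.client("rekognition")

def match_face(frame_bytes, collection_id="eye-of-horus-faces", threshold=90):
    # Search the indexed reference faces for a match against one captured frame
    response = rekognition.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": frame_bytes},
        FaceMatchThreshold=threshold,
        MaxFaces=1,
    )
    matches = response.get("FaceMatches", [])
    return matches[0]["Similarity"] if matches else None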

Setup AWS Custom Labels

Follow up to #6

Once we have a stream image extractor, we can begin working on setting up the AWS Custom Label framework using the console (following this tutorial)

Add in ability to edit user face or gesture combination

Currently, only the ability to create/overwrite a profile or delete it exists. It would be useful to include an edit option that allows objects to be edited without having to delete them first. This could either be handled by the current Python scripts (preferable from a development standpoint) or via the website backend (preferable from a security standpoint).

Add in file copying and renaming sanitisation in python scripts

Right now, if you try to upload a file called 20210301_120317(0).jpg it won't be expanded. I am unsure of the consequences of this, but the path does end up as 20210301_120317\(0\).jpg because the brackets are escaped (and that's without considering the effect of files with spaces). As a result, we should sanitise these names or at least warn the user about these issues.
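
A minimal sanitisation sketch (the replacement scheme is just one option):

import os
import re

def sanitise_filename(path):
    # Replace characters that would need shell escaping (brackets, spaces, etc.)
    directory, name = os.path.split(path)
    safe = re.sub(r"[^\w.\-]", "_", name)
    return os.path.join(directory, safe)

# "20210301_120317(0).jpg" becomes "20210301_120317_0_.jpg"
print(sanitise_filename("20210301_120317(0).jpg"))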

Add a start stream button to frontend and hook up with backend

The skeleton is in place (#3), now we just need to ensure the frontend can actually utilise and query the backend. Since we'll be using S3 as a database, this will be considered complete when we can query the frontend and it will execute a backend script.

This involves hooking up startStream.sh with a frontend button that will begin the streaming process to aws

Deprecation warning: tostring()

Follow-up from #22: we need to figure out an alternative, as I don't believe the output of tobytes() is accepted by AWS

/Users/morgan/Documents/Repos/eye-of-horus/src/scripts/manager.py:588: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
  foundGesture = gesture_recog.checkForGestures(cv2.imencode(".jpg", frame)[1].tostring())
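
For what it's worth, numpy's tobytes() returns the same byte string as the deprecated tostring(), so the swap may well be a drop-in replacement (still needs verifying against the AWS call):

foundGesture = gesture_recog.checkForGestures(cv2.imencode(".jpg", frame)[1].tobytes())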

Improve Gesture ML model

Right now, the machine learning gesture recognition model is at 95% accuracy. This is acceptable for development, but a production ML model should consistently achieve at least 98% (my own standard). As such, we need tighter and more plentiful training data in the model (related: #21).

Add in logout route and view

A logout page, along with the authentication logic and routes to support it, will be needed to prove that logging out works just as well as logging in.

Remove unlock from login form

This is currently there as it shares usage with the create form. However, we will need to have this toggled on/off depending on whether we are logging in or not.

Expand gesture functionality to allow objects combinations to act as keys too

Random idea, and perhaps a bad pitch: instead of just allowing for gesture recog, why not allow combinations to consist of ANY objects! This would dramatically increase the key space (making brute force attacks far less feasible) AND perhaps allow for a more accurate model if it only contains pictures of distinct objects rather than similar looking gestures (see #30 for the struggles I have had with this).

Remove setUserExists

It's not really needed anymore since we are going to generate the authenticate.js page based on the registering bool instead of a token.

Allow scripts to return an identifiable response to their queries

Case in point, index_photo.py will say in the console log that it was successful at adding the image to the Rekognition collection but doesn't return any JSON or API code response that the NodeJS backend will be able to utilise.

We will need to add some sort of framework (early on) that will return the result (both payload and code) of the python script execution to the NodeJS backend for use in error handling and/or api redirection.
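
One minimal shape for that framework, assuming a JSON blob printed to stdout plus a matching exit code (the field names are illustrative):

import json
import sys

def emit_result(code, payload):
    # Print a machine-readable result for the NodeJS backend to parse,
    # then exit with a status the backend can also check
    print(json.dumps({"code": code, "payload": payload}))
    sys.exit(0 if code < 400 else 1)

# e.g. at the end of a successful index_photo.py run:
emit_result(200, {"message": "Image added to Rekognition collection"})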

Define a set number of gesture types a user can make

For testing, I'm thinking 4 (thumbs up, thumbs down, open hand, closed hand) but this is interchangeable and not set in stone. Furthermore, we also need to develop the actual test images for AWS Custom Labels (at least 30 of each category, ideally 40 to be safe, in a variety of lighting conditions and locations and with different people in the image).

Create Website Skeleton

Deploy a basic website skeleton that will hold the nodejs backend and reactjs frontend

To save time and avoid errors, I'm using Facebook's create-react-app tool to generate the skeleton

See this article for further reasons why it's important to get this correct

image_manager.py is in the wrong place

If we plan on having this as a master file, we will need to either adjust it so it is no longer the master file and create a new one (or adjust commons.py to be the new master file), OR move it up a directory to scripts/ and make it the new master file. This is because it will also need to make use of the gesture scripts in the directory alongside it, which creates awkward pathing problems from going up and down directories.

Investigate AWS Rekognition and Azure CustomVision for gesture comparison capabilities

Why?

AWS provides more powerful and refined services than local Python libraries for a reasonably cheap, if not free, price. Albeit harder to set up and understand initially, it will save me a lot of development time later in the project.

Aims

COMPLETE ONE OF THESE TWO

  • Investigate AWS Rekognition, mainly following this tutorial to produce a prototype script. Collect video evidence to confirm it works.
  • Investigate Azure Custom Vision, partially following this tutorial to produce a prototype script. Collect video evidence to confirm it works.

The reason I say one of these two is because I haven't explored AWS Custom Labels (the branch of AWS Rekognition that allows for user labeling of training data) that well so I'm not sure if it conforms to our spec or if it's free.

This script would IDEALLY take in a video stream (from my laptop or phone camera) and compare performed gestures in it to stored images of gestures on an online bucket. I say IDEALLY because AWS Custom Labels doesn't allow for video streaming and I'm unsure if Azure does or not. In the case of both of these not supporting streams or video, I will have to adapt my solution slightly so it takes pictures of gestures instead of streamed video.

A successful test would be a positive match with my performed gesture stored in the cloud to a gesture performed in the video stream or image.
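
If the still-image route is taken, the prototype check could boil down to a single detect_custom_labels call, sketched below (the project version ARN and confidence threshold are placeholders):

import boto3

rekognition = boto3.client("rekognition")

def detect_gesture(image_bytes, model_arn, min_confidence=90):
    # Ask the trained Custom Labels model which gesture (if any) is in the image
    response = rekognition.detect_custom_labels(
        ProjectVersionArn=model_arn,
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,
    )
    labels = response.get("CustomLabels", [])
    return labels[0]["Name"] if labels else None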

Add redirects for when specific tokens are not set

E.g. authenticated, username and userexists will need to be set in order to access the dashboard. This can be achieved by automatically redirecting, with an error alert to explain the situation, whenever the required tokens are not set.

if(!token) {
  return <Login setToken={setToken} />
}

Define an acceptance policy for recognition

To save time and processing power, we could define a policy to accept high confidence faces or gestures from the camera instead of the static "90% confidence or more" policy we have now. Idea:

  • Face recog of 90% or above = Access Granted
  • Face recog of 70-90% and gesture combination was correctly entered = Access Granted
  • Face recog of less than 70% leading to an average gesture recog of 90% or above OR higher than 70% but the wrong gesture combination was given = Access Denied
  • Either face or gesture is less than 70% or non-existent = Access Denied

NOTE: The strikethroughs are due to me reevaluating what will actually be needed in light of the web frontend and feasibility of an "average" gesture combination given
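
Encoded as a rough sketch (thresholds are the ones listed above; the struck-through tiers may not apply):

def access_decision(face_confidence, gesture_combination_correct):
    # Rough encoding of the tiered policy above
    if face_confidence >= 90:
        return True   # strong face match alone is enough
    if 70 <= face_confidence < 90 and gesture_combination_correct:
        return True   # medium face match backed up by the correct gestures
    return False      # everything else is denied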

Wrap scripts and website inside docker files

Our backend (and probably our frontend too eventually) contains a lot of dependencies across quite a few different systems. To make life easier for end users, and for me in case I lose my computer, it would be useful to wrap the backend and frontend in Docker containers (perhaps even encapsulated in a Kubernetes cluster) via Dockerfiles.

Documentation on running the scripts locally

The Python scripts are going to be designed with modularity in mind, meaning you can run parts of the process (or none at all) from different parts of the program. This may be confusing to new users/examiners, so we will need to document this as we go in a docs/ folder.

Flesh out the frontend

The website is rather dead and lacks any navbar or login features. Once the backend and testing phases are sorted, we'll need to work on getting a number of features implemented into the frontend for the backend to hook to. This list is not complete:

  • Navbar (Login/Logout, Home, About, Contributing, etc)
  • Pagination (Register, About Us, Feedback/Contact form, etc)

Verbosity option for python scripts

Sometimes we don't want to print out all the debugging data, just the errors. Other times, we do. It would be good to have a -v/--verbose option to toggle this.
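
A minimal sketch of how the flag could be wired up with argparse and logging:

import argparse
import logging

parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="store_true",
                    help="Print debugging output as well as errors")
args = parser.parse_args()

logging.basicConfig(level=logging.DEBUG if args.verbose else logging.ERROR)
logging.debug("Only shown when -v/--verbose is passed")
logging.error("Always shown")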

Add in user account gesture storage logic for AWS

This basically means that we need to figure out how we're going to store a user's gesture combination in AWS. Could be:

  • Some sort of file (JSON?) that lists the order and type of gesture (sketched after this list)
  • The files themselves in a folder marked 1/ or equivalent
  • A combination of both methods
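
As a sketch of the first option, a small JSON manifest uploaded alongside the gesture images (the bucket, key and field names are placeholders):

import json
import boto3

s3 = boto3.client("s3")

# Hypothetical manifest recording the order and type of a user's gestures
gesture_manifest = {
    "username": "example-user",
    "combination": ["thumbs_up", "open_hand", "closed_hand", "thumbs_down"],
}

s3.put_object(
    Bucket="eye-of-horus-users",  # placeholder bucket
    Key="example-user/gestures/combination.json",
    Body=json.dumps(gesture_manifest),
)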
