m-davies / eye-of-horus
A facial and gesture recognition authentication system
License: GNU General Public License v3.0
Follow-up to #6
Once we have a stream image extractor, we can begin working on setting up the AWS Custom Labels framework using the console (following this tutorial)
A subsequent logout page, along with the authentication and routes to support it, will be needed to prove that logging out works just as well as logging in
Added by #43
Random idea, perhaps a bad pitch: instead of just allowing for gesture recognition, why not allow combinations to consist of ANY objects! This would dramatically increase the brute-force search space AND perhaps allow for a more accurate model if it only contains pictures of distinct objects rather than similar-looking gestures (see #30 for the struggles I have had with this).
This basically means that we need to figure out how we're going to store a user's gesture combination in AWS. Could be:
1/
or equivalent. Otherwise someone could just brute-force a gesture combination. Perhaps the value could be stored in the gesture config file?
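For illustration, a minimal sketch assuming the config-file route floated above, with the combination salted and hashed rather than stored raw (the helper name is hypothetical, not from the repo):

import hashlib
import os

# One possible scheme (an assumption, not decided in this issue): salt and
# hash the ordered gesture names before writing them to the gesture config
# file, so the raw combination is never stored in plaintext
def hash_combination(gestures, salt=None):
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", "|".join(gestures).encode(), salt, 100_000)
    return salt.hex(), digest.hex()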
E.g. authenticated, username, and userexists will need to be set in order to access the dashboard. This can be achieved by automatically redirecting with an error alert to explain the situation if the specific tokens are not set:
// Without a valid session token, render the login page instead of the app
if (!token) {
  return <Login setToken={setToken} />
}
Right now, if you try to upload a file called 20210301_120317(0).jpg, it won't be expanded. I am unsure of the consequences of this, but the path does end up as 20210301_120317\(0\).jpg to escape the brackets (and that's without considering the effect of files with spaces). As a result, we should sanitise these filenames or at least warn the user about these issues
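A minimal sketch of what that sanitising step could look like (the function name is hypothetical, not from the repo):

import re
from pathlib import Path

# Hypothetical sanitiser: swap shell-unfriendly characters for underscores
# while keeping the extension intact
def sanitise_filename(name):
    stem, suffix = Path(name).stem, Path(name).suffix
    return re.sub(r"[^A-Za-z0-9._-]", "_", stem) + suffix

print(sanitise_filename("20210301_120317(0).jpg"))  # 20210301_120317_0_.jpg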
This is probably more of a long-term interest, but it would be cool to have an easy script that adds a new label and collects the necessary training data so it's ready to go within 2 hours, all driven by an end user
Exactly as it sounds, it would be a good way to obtain free training data but at the same time, that's the disadvantage (I imagine GDPR would frown upon this). This will require extra research and feedback.
Deploy a basic website skeleton that will hold the nodejs backend and reactjs frontend
To save time and avoid errors, I'm using Facebook's create-react-app tool to generate the skeleton
See this article for further reasons why it's important to get this correct
Could be due to sleeping prior to checking frames, general latency, Python performance, or just that we are using a web API for a streaming service (probably not the best idea). Regardless, this will require investigating and debugging.
Originally posted in #22
Arguably more serious (and possibly related to point 1) (EDIT: unlikely, the model is now pretty accurate and timestamp debug output shows the frames are quite slow to come in): the model is too slow at picking up gesture types. At the moment, if the gesture combination given is not immediately correct, the rate at which old frames from the stream are identified is slower than the rate at which new frames are produced by the stream. This is obviously an issue: a user could find themselves denied access if they are not quick enough, the ultimate antithesis of a project that focuses on accessibility. I think this is more likely a limitation of calling a web resource (vs a locally stored resource) than raw performance. Regardless, I will ask about Pythonic ways to improve perf in the standup tomorrow, but if that doesn't help, my remaining bleak options are thus:
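Separately from whatever those options end up being, one cheap mitigation to test (a sketch assuming an OpenCV capture, not code from the repo) is to drop the backlog so classification always sees the newest frame:

import cv2

# grab() only advances the stream without decoding, so skipping a
# backlog of stale frames before each classification call is cheap
def latest_frame(capture, backlog=5):
    for _ in range(backlog):
        capture.grab()
    ok, frame = capture.read()
    return frame if ok else None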
The python scripts are going to be designed with modularity in mind, meaning you can run parts of the process or none at all from different parts of the program. This may be confusing to new users/examiners, so we will need to document this as we go in a docs/ folder
There are existing scripts for adding and removing users, as well as an as-yet-untested script to compare faces spotted in a stream with these users stored in AWS.
I should focus on getting this script tested and production ready ASAP
Mainly thinking of scenarios where a user cannot log themselves out (although that could also be handled with timeouts)
The skeleton is in place (#3), now we just need to ensure the frontend can actually utilise and query the backend. Since we'll be using S3 as a database, this will be considered complete when we can query the frontend and it will execute a backend script.
This involves hooking up startStream.sh with a frontend button that will begin the streaming process to aws
It's not really needed anymore since we are going to generate the authenticate.js page via the registering bool instead of a token
For testing, I'm thinking 4 (thumbs up, thumbs down, open hand, closed hand), but this is interchangeable and not set in stone. Furthermore, we also need to develop the actual test images for AWS Custom Labels (at least 30 of each category, ideally 40 to be safe, in a variety of different lighting conditions and locations, and with different people in the image).
This may be difficult, if not next to impossible, since there is no way I know of on the AWS side to detect if we are actively streaming. This will require extra investigation...
The frontend is present in a basic form but needs improving with CSS, and the server routes need adding
This is needed for AWS Custom Labels (which cannot take a stream as input) which we need for gesture recognition. We will need to figure out how many frames to extract and how often (needs to be limited). https://theailearner.com/2018/10/15/extracting-and-saving-video-frames-using-opencv-python/ seems like a good tutorial to start with.
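A minimal sketch along the lines of the linked tutorial (the every-Nth-frame policy and the 40-frame cap are assumptions, chosen to keep AWS calls limited):

import cv2

# Save every Nth frame from a stream or video file, up to a hard cap,
# so we never flood Custom Labels with near-duplicate frames
def extract_frames(source, every_n=30, limit=40):
    capture = cv2.VideoCapture(source)
    saved = index = 0
    while saved < limit:
        ok, frame = capture.read()
        if not ok:
            break
        if index % every_n == 0:
            cv2.imwrite("frame_%04d.jpg" % saved, frame)
            saved += 1
        index += 1
    capture.release()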
Not really a big issue, as people can drag files into a directory and do it that way. But again, it would be nice for quality of life to change that
This was missed in the initial work during #43 to prevent the PR becoming too big. This is needed now
No page or route exists currently
Although an ongoing topic, it would be good to have a number of tests that thoroughly test the different entry points of the application alongside the upcoming website tests. pytest seems to be a reasonable industry standard to look into to accomplish this.
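As a taster, a hypothetical entry-point test (manager.validate_gesture is illustrative only, not an existing function in the repo):

import pytest

from scripts import manager  # assumed import path; adjust to the repo layout

# Hypothetical: a bad gesture name passed to a script entry point
# should raise rather than silently hit AWS
def test_rejects_unknown_gesture():
    with pytest.raises(ValueError):
        manager.validate_gesture("not-a-real-gesture")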
The website is rather dead and lacks any navbar or login features. Once the backend and testing phases are sorted, we'll need to work on getting a number of features implemented into the frontend for the backend to hook to. This list is not complete:
AWS provides more powerful and refined services than local Python libraries for a reasonably cheap, if not free, price. Albeit harder to set up and understand initially, it'll save me a lot of development time later in the project
COMPLETE ONE OF THESE TWO
The reason I say one of these two is because I haven't explored AWS Custom Labels (the branch of AWS Rekognition that allows for user labeling of training data) that well, so I'm not sure if it conforms to our spec or if it's free.
This script would IDEALLY take in a video stream (from my laptop or phone camera) and compare performed gestures in it to stored images of gestures on an online bucket. I say IDEALLY because AWS Custom Labels doesn't allow for video streaming and I'm unsure if Azure does or not. In the case of both of these not supporting streams or video, I will have to adapt my solution slightly so it takes pictures of gestures instead of streamed video.
A successful test would be a positive match between a gesture performed in the video stream or image and my gesture stored in the cloud.
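For the image-based fallback, a sketch of the AWS call involved (the model ARN is a placeholder; MinConfidence mirrors the 90% policy mentioned elsewhere in these notes):

import boto3

rekognition = boto3.client("rekognition")

# Placeholder model_arn: the ProjectVersionArn of the trained
# Custom Labels model, not a value from this repo
def check_for_gestures(image_bytes, model_arn, min_confidence=90.0):
    response = rekognition.detect_custom_labels(
        ProjectVersionArn=model_arn,
        Image={"Bytes": image_bytes},
        MinConfidence=min_confidence,
    )
    return [label["Name"] for label in response["CustomLabels"]]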
It's not an awful issue but it is rather annoying to press ok twice
Our backend (and probably our frontend too eventually) contains a lot of dependencies across quite a few different systems. To make life easier for end users, and for me in case I lose my computer, it would be useful to wrap the backend and frontend in Docker containers (perhaps even encapsulated in a Kubernetes cluster) via Dockerfiles
Would be easier than having to force the file extension every time. If AWS and Python can handle it without the extension, why should we create extra work for ourselves in the first place?
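For what it's worth, Python can sniff the image format from file contents alone, so the extension shouldn't be needed at all (a standard-library sketch; the path is a placeholder):

import imghdr

# Determine the image format from the file's bytes; "upload" stands in
# for a file saved without any extension
kind = imghdr.what("upload")
if kind not in ("jpeg", "png"):
    raise ValueError("Unsupported or unrecognised image type: %s" % kind)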
Added by #41. Most can probably be disabled or ignored
NodeJS Backend implementations of users (frontend handled by #8)
We heavily use session storage and we don't clear it when we actually need to. It is imperative to ensure session storage is cleared or updated when key events occur
Will allow for an easier replication of the build script environment by other systems.
This is currently there as it shares usage with the create form. However, we will need to have this toggled on/off depending on whether we are logging in or not
Follow-up from #22: we need to figure out an alternative, as I don't believe the output of tobytes() is accepted by AWS
/Users/morgan/Documents/Repos/eye-of-horus/src/scripts/manager.py:588: DeprecationWarning: tostring() is deprecated. Use tobytes() instead.
foundGesture = gesture_recog.checkForGestures(cv2.imencode(".jpg", frame)[1].tostring())
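For what it's worth, NumPy's docs describe tostring() as a compatibility alias for tobytes() with exactly the same behaviour, so the bytes sent to AWS should be unchanged; the drop-in fix would be:

# tostring() is a deprecated alias of tobytes() in NumPy, so this
# produces identical bytes and silences the DeprecationWarning
foundGesture = gesture_recog.checkForGestures(cv2.imencode(".jpg", frame)[1].tobytes())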
Right now, the machine learning gesture recognition model is at 95% accuracy. This is acceptable for development, but a production ML model should be consistently hitting at least 98% (my own standard). As such, we need tighter and more plentiful training data in the model (related: #21)
To save time and processing power, we could define a policy that accepts the highest-confidence faces or gestures from the camera instead of a static 90%-or-more confidence policy (what we have now). Idea:
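For illustration only (my sketch, not necessarily the idea meant above): accept the single best match per frame rather than filtering at a fixed threshold.

# Take the highest-confidence label per frame, with a low floor
# only to discard outright noise (values are illustrative)
def best_label(custom_labels, floor=50.0):
    candidates = [l for l in custom_labels if l["Confidence"] >= floor]
    if not candidates:
        return None
    return max(candidates, key=lambda l: l["Confidence"])["Name"]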
NOTE: The strikethroughs are due to me reevaluating what will actually be needed in light of the web frontend and feasibility of an "average" gesture combination given
Sometimes we don't want to print out all the debugging data, just the errors. Other times, we do. It would be good to have a -v/--verbose option to toggle this
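A minimal sketch of that flag, assuming the scripts route their output through the standard logging module:

import argparse
import logging

parser = argparse.ArgumentParser()
parser.add_argument("-v", "--verbose", action="store_true",
                    help="print debug output as well as errors")
args = parser.parse_args()

# Verbose shows everything; otherwise only errors reach the console
logging.basicConfig(level=logging.DEBUG if args.verbose else logging.ERROR)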
If we plan on having this as a master file, we will need to either adjust it so that's not the case and create a new master file (or adjust commons.py to be the new master file), OR move it up the dir to scripts/ and make it the new master file. This is because it will also need to make use of the gesture scripts in the dir alongside it, creating weird pathing problems from going up and down dirs.
Once the AWS and gesture extractor framework is set up, we'll need another script that will connect to AWS Custom Labels and check the image for any gestures.
Replace this with a README that links to the projects instead
AWS provides more powerful and refined services than local Python libraries for a reasonably cheap, if not free, price. Albeit harder to set up and understand initially, it'll save me a lot of development time later in the project
Investigate AWS Rekognition, mainly following this tutorial to produce a prototype script. Collect video evidence to confirm it works.
This script would take in a video stream (from my laptop or phone camera) and compare the faces in it to stored images on an AWS bucket. The response would be similar to the examples here. A successful test would be a positive match with my face stored in AWS to a group of faces in the video stream.
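A sketch of the comparison call this implies, assuming the stored user images have already been indexed into a Rekognition collection (the collection ID is a placeholder):

import boto3

rekognition = boto3.client("rekognition")

# Search an existing collection for faces matching the largest face
# in the supplied frame; threshold mirrors the 90% policy used elsewhere
def find_matching_faces(frame_bytes, collection_id="eye-of-horus-faces"):
    response = rekognition.search_faces_by_image(
        CollectionId=collection_id,
        Image={"Bytes": frame_bytes},
        FaceMatchThreshold=90,
        MaxFaces=5,
    )
    return response["FaceMatches"]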
Case in point: index_photo.py will say in the console log that it was successful at adding the image to the Rekognition collection, but it doesn't return any JSON or API code response that the NodeJS backend will be able to utilise.
We will need to add some sort of framework (early on) that will return the result (both payload and code) of the python script execution to the NodeJS backend for use in error handling and/or api redirection.
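A sketch of what that could look like on the Python side (the payload shape and helper name are assumptions):

import json
import sys

# Emit a machine-readable payload on stdout and a meaningful exit code,
# so the NodeJS side gets both the result and the status
def finish(success, payload):
    print(json.dumps({"success": success, **payload}))
    sys.exit(0 if success else 1)

# e.g. finish(True, {"faces_indexed": 1})

On the Node side, child_process.execFile already exposes stdout and the exit code, so no extra plumbing should be needed there.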
This is pretty much a must-have considering how long some tasks will take
Currently the combination length is both minned and maxed at 4, but the max could probably be expanded to be larger. However, this is not a high priority given the timescale this project needs to be completed in
The reason being that this would be quicker and easier to achieve on the website side, but only if the streaming option does not work on the nodejs server
We have a set number of gesture types but no logic in the python scripts to limit which gestures a user can pass to them. We will need to find a way to pull the labels we have specified from AWS and ensure the user is using one of them.
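A sketch of the validation half, with a static allowlist standing in until we know how to pull the labels live from AWS (label names taken from the four test gestures listed above):

# Stand-in allowlist; fetching these from the Custom Labels dataset
# is the open question this issue describes
KNOWN_GESTURES = {"thumbs up", "thumbs down", "open hand", "closed hand"}

def validate_combination(gestures):
    unknown = [g for g in gestures if g not in KNOWN_GESTURES]
    if unknown:
        raise ValueError("Unknown gesture label(s): %s" % unknown)
    return gestures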
This is more of a note issue for later down the line. Normal users should not be able to turn the rekog model on and off. That should be down to S3 Admins only (which is not going to be a feature of this project).
Currently, only the ability to create/overwrite a profile or delete it exists. It would be useful to include an edit field that will allow for the editing of objects without having to delete them. This could either be handled by the current Python scripts (preferable from a development standpoint) or via the website backend (preferable from a security standpoint)