Facial Expression Recognition

Training, inference and evaluation of MobileNet/MobileFaceNet models suitable for inference on edge devices.

This is an adaptation of the MobileNet and MobileFaceNet Keras models for the facial expression recognition task. The reference model was trained on a refined version of the FER2013 dataset, but it can be fine-tuned or retrained on other datasets, as long as they follow the same convention.

This repository was specifically created to train smaller models that are suitable for running on embedded devices with constrained resources while still achieving acceptable accuracy.

In the refined version of the FER2013 dataset, called FER+, each image has been labeled by 10 crowd-sourced taggers, which provides better-quality ground truth for still-image emotion than the original FER labels. The model is therefore trained to predict the percentage of each emotion for a given face: the labels are NOT one-hot encoded, but the prediction outputs sum to 1.0. For example, a face can be described as 60% neutral and 38% sad, which would be a moderately sad expression, as opposed to 95% sad, which would be crying, i.e. extremely sad.

As such, the model is evaluated during training using Mean Absolute Error (MAE) rather than top-1 accuracy. Although top-1 accuracy can be obtained with the evaluation script, it should not be used as the main performance metric for this particular approach.
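
To make the metric concrete, here is a minimal sketch of MAE on soft labels; the prediction values are hypothetical and only the class order is taken from the evaluation output below:

import numpy as np

# Class order follows the FER+ emotion list printed by the evaluation script.
# y_true is a hypothetical soft label: 60% neutral, 38% sadness, 2% fear.
y_true = np.array([0.60, 0.00, 0.00, 0.38, 0.00, 0.00, 0.02, 0.00, 0.00])
y_pred = np.array([0.55, 0.05, 0.00, 0.35, 0.00, 0.00, 0.05, 0.00, 0.00])

mae = np.mean(np.abs(y_true - y_pred))  # average absolute per-class error
print(f"MAE: {mae:.4f}")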

The final MAE of the pre-trained MobileFaceNet model (alpha = 1.0, total params: 212,617, model size: 3.1 MB, Raspberry Pi 4 CPU inference time: ) is 0.0578; top-1 accuracy is 77% on the test subset of FER+.
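
The Raspberry Pi 4 inference-time figure above was left blank; if you want to measure it yourself, a minimal timing sketch could look like the following (the model path and the 48x48 grayscale input shape are assumptions, check the repository code for the exact values):

import time
import numpy as np
import tensorflow as tf

# Hypothetical file name; point this at the downloaded pretrained weights.
model = tf.keras.models.load_model(
    "pretrained_models/mobilefacenet.h5", compile=False)

x = np.random.rand(1, 48, 48, 1).astype("float32")  # assumed input shape
model.predict(x)  # warm-up run so one-time graph tracing isn't timed

n = 100
start = time.perf_counter()
for _ in range(n):
    model.predict(x)
elapsed = (time.perf_counter() - start) / n
print(f"mean inference time: {elapsed * 1000:.1f} ms")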

Dependencies

  • Python 3.5+
  • TensorFlow 2.4.1
  • NumPy, pandas, tqdm
  • dlib (for the demo)
  • OpenCV 3
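
A typical way to install these (the package names below are the standard PyPI ones, not taken from this repository; pin versions to match the list above):

pip3 install tensorflow==2.4.1 numpy pandas tqdm opencv-python dlib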

Tested on:

  • Ubuntu 18.04, Python 3.9.4, TensorFlow 2.4.1, CUDA 10.1.243, cuDNN 7.6.5
  • CPU: Intel Core i5-6500 @ 3.20 GHz, GPU: GeForce GTX 1080

Usage

Use the pretrained model for the demo

Run the demo script (requires a webcam). You can use the --image_dir [IMAGE_DIR] option to run on images in the [IMAGE_DIR] directory instead.

python3 tests/demo.py

The pretrained model for the TensorFlow backend will be automatically downloaded to the pretrained_models directory.
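
Once downloaded, you can also run the model directly on a cropped face image. The sketch below is an assumption-heavy illustration: the weight file name, the 48x48 grayscale input, and the [0, 1] scaling are guesses, while the emotion order follows the class list printed by the evaluation script:

import cv2
import numpy as np
import tensorflow as tf

# Class order as printed by evaluate_test.py in the Evaluation section.
EMOTIONS = ["neutral", "happiness", "surprise", "sadness",
            "anger", "disgust", "fear", "contempt", "unknown"]

# Hypothetical file name; use whatever lands in pretrained_models/.
model = tf.keras.models.load_model(
    "pretrained_models/mobilefacenet.h5", compile=False)

face = cv2.imread("face.png", cv2.IMREAD_GRAYSCALE)  # a pre-cropped face
face = cv2.resize(face, (48, 48)).astype("float32") / 255.0
probs = model.predict(face[None, :, :, None])[0]

for name, p in sorted(zip(EMOTIONS, probs), key=lambda t: -t[1]):
    print(f"{name:10s} {p:.2f}")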

Download or create training data

First, download the dataset. It is downloaded and extracted to the data directory by running:

./download.sh

You can re-create the dataset yourself by downloading:

  • FER2013
  • FER+

and then running generate_training_data.py from the FER+ repository to get train/valid/test images in .png format. Then use notebooks/convert_dataset.py to process the .csv data and divide it into three .csv files.
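
The key labeling step is turning the per-image tagger votes into the percentage targets described above. A minimal sketch, assuming one vote-count column per emotion (the actual column layout produced by convert_dataset.py may differ):

import pandas as pd

# Hypothetical column names; FER+ stores one tagger-vote count per emotion.
EMOTIONS = ["neutral", "happiness", "surprise", "sadness",
            "anger", "disgust", "fear", "contempt", "unknown"]

df = pd.read_csv("data/train.csv")
votes = df[EMOTIONS].to_numpy(dtype="float32")     # raw tagger votes
labels = votes / votes.sum(axis=1, keepdims=True)  # each row now sums to 1.0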

Train network

Train the network using the training data created above.

python train.py --train_csv data/train.csv  --validation_csv data/valid.csv --lr 1e-3 --alpha 1.0 --aug

Trained weight files are stored as checkpoints/weights.*.hdf5 after each epoch in which the mean absolute error improves on the best previous epoch.
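
This behavior matches a Keras ModelCheckpoint with save_best_only; a minimal sketch of an equivalent callback (the exact configuration inside train.py may differ, and the monitored metric name depends on how the model was compiled):

from tensorflow.keras.callbacks import ModelCheckpoint

# Save weights only when the validation MAE improves on the best epoch so far.
checkpoint = ModelCheckpoint(
    "checkpoints/weights.{epoch:02d}-{val_mean_absolute_error:.4f}.hdf5",
    monitor="val_mean_absolute_error",
    mode="min",
    save_best_only=True,
)
# model.fit(..., callbacks=[checkpoint])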

usage: train.py [-h] --train_csv TRAIN_CSV --validation_csv VALIDATION_CSV [--batch_size BATCH_SIZE] [--img_size IMG_SIZE] [--nb_epochs NB_EPOCHS]
                [--lr LR] [--opt OPT] [--alpha ALPHA] [--bottleneck_weights BOTTLENECK_WEIGHTS] [--checkpoint CHECKPOINT] [--model_type MODEL_TYPE]
                [--aug] [--output_path OUTPUT_PATH]

This script trains the CNN model for facial emotion estimation.

optional arguments:
  -h, --help            show this help message and exit
  --train_csv TRAIN_CSV
                        path to train database csv file (default: None)
  --validation_csv VALIDATION_CSV
                        path to validation database csv file (default: None)
  --batch_size BATCH_SIZE
                        batch size (default: 32)
  --img_size IMG_SIZE   image size (default: 48)
  --nb_epochs NB_EPOCHS
                        number of epochs (default: 50)
  --lr LR               initial learning rate (default: 0.001)
  --opt OPT             optimizer name; 'sgd' or 'adam' (default: adam)
  --alpha ALPHA         alpha parameter for filter number (default: 0.5)
  --bottleneck_weights BOTTLENECK_WEIGHTS
                        bottleneck_weights (default: None)
  --checkpoint CHECKPOINT
                        checkpoint to continue training (default: None)
  --model_type MODEL_TYPE
                        MobileNet or MobileFaceNet (default: MobileFaceNet)
  --aug                 use data augmentation if set true (default: False)
  --output_path OUTPUT_PATH
                        checkpoint dir (default: checkpoints/2021-07-28_11-31-08)
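
For example, to resume training from a saved checkpoint with a lower learning rate (the weight file name here is hypothetical):

python train.py --train_csv data/train.csv --validation_csv data/valid.csv --checkpoint checkpoints/weights.10-0.0650.hdf5 --lr 1e-4 --alpha 1.0 --aug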

Use the trained network

python3 tests/demo.py
usage: demo.py [-h] [--model_file MODEL_FILE] [--margin MARGIN] [--image_dir IMAGE_DIR]

This script detects faces from web cam input, and estimates face emotion for the detected faces.

optional arguments:
  -h, --help            show this help message and exit
  --model_file MODEL_FILE
                        path to weight file (e.g. weights.28-3.73.hdf5) (default: None)
  --margin MARGIN       margin around detected face (default: 0.1)
  --image_dir IMAGE_DIR
                        target image directory; if set, images in image_dir are used instead of webcam (default: None)

Please use the best model among checkpoints/weights.*.hdf5 for MODEL_FILE if you use your own trained models.
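
For example (the weight file name is hypothetical; pick your best checkpoint):

python3 tests/demo.py --model_file checkpoints/weights.28-0.0578.hdf5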

Evaluation

Evaluation on the FER+ dataset

You can evaluate a trained model on the FER+ test subset with evaluate_test.py.
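
For example (the weight file name is hypothetical; adjust the script path if it lives in a subdirectory):

python3 evaluate_test.py --model_file checkpoints/weights.28-0.0578.hdf5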

usage: evaluate_test.py [-h] [--model_file MODEL_FILE]

This script evaluates the face emotion estimation model using the FER+ test data.

optional arguments:
  -h, --help            show this help message and exit
  --model_file MODEL_FILE
                        path to weight file (e.g. weights.28-3.73.hdf5) (default: None)

Please refer to the FER+ repository for the details of the FER+ dataset.

The results of the pretrained model are:

Top result accuracy: 0.7742025741466144
Class-wise MAE: 
neutral      0.123708
happiness    0.066834
surprise     0.057518
sadness      0.084724
anger        0.061439
disgust      0.017457
fear         0.035667
contempt     0.021012
unknown      0.051911
Total MAE: 0.05780777047146134


For further improvement

If you want better results, there are several options:

  • increase the alpha parameter to see how large a model you can train without overfitting (see the example after this list)
  • when overfitting becomes a problem, it would be sensible to use the EMOTIC or Google Facial Expression Comparison datasets instead
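
For example, a wider model can be trained by raising alpha (whether values above 1.0 are supported depends on the model implementation, so treat this as an assumption):

python train.py --train_csv data/train.csv --validation_csv data/valid.csv --lr 1e-3 --alpha 1.5 --aug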

License

This project is released under the MIT license.

All images were collected from the Internet, and the copyright belongs to their original owners. If any of the images belongs to you and you would like it removed, please contact the original dataset creators.
