The human voice is highly versatile and conveys a wide range of emotions. Emotion in speech offers extra insight into human actions: by analysing it, we can better understand people’s motives, which is why emotion recognition plays an important role in human-computer interaction.
In my initial literature review, I found that the majority of Speech Emotion Recognition (SER) studies consider emotions solely from the viewpoint of a single language. For my Major Research Project (MRP), my aim is therefore to build a model that classifies basic emotions in English speech, then test it on a Canadian French dataset to understand how the model performs there and why its performance differs.
For my MRP, I’ll be using the following two datasets:
English Language Dataset: https://zenodo.org/record/1188976#.Xo0dCFNKjOS
French Language Dataset: https://zenodo.org/record/1478765#.Xo0cFVNKjOQ
In more detail, the MRP has two parts. First, build a model that classifies basic human emotions in English speech. From the dataset, I’ll classify the happy, sad, fearful and surprised emotions, which are the emotions present in both datasets. Second, test the model on Canadian French speech, explain any differences in performance, and highlight how emotions may differ depending on language.
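Restricting the data to the four shared emotions can be done from the file names alone. The sketch below assumes the English dataset follows the RAVDESS naming convention, where each file name has seven hyphen-separated fields and the third field is the emotion code. That assumption, the `EMOTION_CODES` mapping and the helper name `label_from_filename` are illustrative and should be checked against the dataset’s documentation. The French dataset uses its own naming scheme and would need a separate mapping.

```python
from pathlib import Path

# Assumed emotion codes (RAVDESS-style naming); only the four emotions
# targeted in this project are kept, all other codes are ignored.
EMOTION_CODES = {
    "03": "happy",
    "04": "sad",
    "06": "fearful",
    "08": "surprised",
}


def label_from_filename(path):
    """Return the emotion label for a file, or None if it is not a target class.

    Assumes names like '03-01-06-01-02-01-12.wav', where the third
    hyphen-separated field encodes the emotion.
    """
    parts = Path(path).stem.split("-")
    if len(parts) != 7:
        return None  # not in the assumed naming scheme
    return EMOTION_CODES.get(parts[2])


if __name__ == "__main__":
    print(label_from_filename("Actor_12/03-01-06-01-02-01-12.wav"))
```

Files whose label comes back as `None` can simply be skipped when building the training set.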
I’ll be using the following tools and techniques: Google Colab as the working environment; Python’s librosa library to convert WAV files to NumPy arrays; feature extraction methods (audio pitch, frequency, etc.) to reduce feature dimensions; and ensemble methods and deep learning models to classify emotions, from which I’ll pick the best-performing model.