
ProxEmo: Gait-based Emotion Learning and Multi-view Proxemic Fusion for Socially-Aware Robot Navigation

ProxEmo is a novel end-to-end emotion prediction algorithm for socially aware robot navigation among pedestrians. The approach predicts the perceived emotion of a pedestrian from walking gaits, which is then used for emotion-guided navigation that takes social and proxemic constraints into account. A multi-view skeleton graph convolution-based model uses a commodity camera mounted on a moving robot to classify emotions. Our emotion recognition is integrated into a mapless navigation scheme and makes no assumptions about the environment or pedestrian motion.

Overview

We first capture an RGB video from an onboard camera, then extract pedestrian poses and track them at each frame. The tracked poses over a predefined time window are embedded into an image, which is passed to our ProxEmo model to classify the emotion into one of four classes. The predicted emotions then undergo proxemic fusion with the LIDAR data and are finally passed to the navigation stack.
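In pseudocode, the pipeline looks roughly like the sketch below. This is a minimal illustration: the callables pose_tracker, embed, fuse, and plan are hypothetical placeholders standing in for the repo's actual pose-extraction, gait-embedding, proxemic-fusion, and planning components.

import numpy as np

EMOTIONS = ["happy", "sad", "angry", "neutral"]  # the four emotion classes

def classify_emotion(frames, pose_tracker, embed, model):
    """Track poses over a window of frames, embed them, and classify the gait."""
    poses = [pose_tracker(frame) for frame in frames]  # per-frame skeleton joints
    gait_image = embed(poses)                          # pose sequence -> image embedding
    scores = model(gait_image)                         # ProxEmo pass (simplified to class scores)
    return EMOTIONS[int(np.argmax(scores))]

def navigate_step(frames, lidar_scan, pose_tracker, embed, model, fuse, plan):
    """One emotion-guided navigation step, as described above."""
    emotion = classify_emotion(frames, pose_tracker, embed, model)
    costmap = fuse(lidar_scan, emotion)                # proxemic fusion with LIDAR
    return plan(costmap)                               # command for the navigation stack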

Model Architecture

The network is trained on image embeddings of the 5D gait set G, which are scaled up to 244×244. The architecture consists of four group convolution (GC) layers. Each GC layer consists of four groups stacked together, representing the four group-convolution outcomes for each of the four emotion labels. The group convolutions are stacked in two stages, Stage 1 and Stage 2. After passing through a softmax layer, the output of the network has a dimension of 4×4. The final predicted emotion is given by the maximum of this 4×4 output.
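A rough PyTorch sketch of this two-stage group-convolution design is shown below. The channel widths, kernel sizes, and the non-grouped stem convolution are assumptions made for illustration; the actual VS-GCNN definition lives under emotion_classification/modeling in the repo.

import torch
import torch.nn as nn

class GroupConvStage(nn.Module):
    """One stage: two group convolutions with 4 groups (one per emotion label)."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, kernel_size=3, padding=1, groups=4),
            nn.ReLU(inplace=True),
            nn.Conv2d(c_out, c_out, kernel_size=3, padding=1, groups=4),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2))

    def forward(self, x):
        return self.block(x)

class ProxEmoSketch(nn.Module):
    def __init__(self, in_channels=3):
        super().__init__()
        # Stem lifts the gait image to a channel count divisible by 4 (assumption).
        self.stem = nn.Conv2d(in_channels, 32, kernel_size=3, padding=1)
        self.stage1 = GroupConvStage(32, 64)    # Stage 1
        self.stage2 = GroupConvStage(64, 128)   # Stage 2
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, 16))

    def forward(self, x):                       # x: (N, in_channels, 244, 244)
        logits = self.head(self.stage2(self.stage1(self.stem(x))))
        probs = torch.softmax(logits, dim=1)    # softmax over the 16 outputs
        return probs.view(-1, 4, 4)             # 4x4 map; emotion = its maximum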

Prerequisites

The code is implemented in Python and has the following dependencies:

  1. Python3
  2. Pytorch >= 1.4
  3. torchlight
  4. OpenCV 3+

To run the demo with an Intel RealSense D435 camera, the following libraries are required:

  1. OpenCV 3+
  2. pyrealsense2
  3. Cubemos SDK (works with Ubuntu 18.04)

Before running the code

Dataset

A sample dataset can be downloaded from EWalk: Emotion Walk. Sample H5 files can be found on GitHub. For the full dataset, please contact the authors.
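To sanity-check a downloaded H5 file before training or inference, a short h5py snippet can list its contents. The file name below is a placeholder, and the internal dataset keys are not documented here, so the snippet simply prints whatever groups and datasets are present.

import h5py

with h5py.File("features.h5", "r") as f:  # placeholder file name
    f.visit(print)                        # print every group/dataset name in the file
    first_key = next(iter(f.keys()))
    print(f[first_key])                   # shape and dtype of the first top-level entry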

Pretrained model

The VS-GCNN model trained on the above dataset can be downloaded from Google Drive.

Config file changes

Below are the basic changes to be made in the config file. Open the config file under [proxemo folder]/emotion_classification/modeling/config and make the following changes; an illustrative example follows the list.

  1. Set the mode:
GENERAL : MODE : ['train' | 'test']
  2. Specify the pretrained model path if running in inference/test mode or warm-starting the training:
MODEL : PRETRAIN_PATH : <path to model dir>
MODEL : PRETRAIN_NAME : <model file name>
  3. Specify the features and labels H5 files:
DATA : FEATURES_FILE : <path to features file>
DATA : LABELS_FILE : <path to labels file>
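For reference, an inference-style config assembled from the keys above might look like the sketch below. The exact nesting, paths, and file names are assumptions; the provided settings files in the config folder are authoritative.

GENERAL:
  MODE: 'test'                          # 'train' or 'test'
MODEL:
  PRETRAIN_PATH: './pretrained'         # illustrative directory holding the checkpoint
  PRETRAIN_NAME: 'vsgcnn_model.pt'      # illustrative checkpoint file name
DATA:
  FEATURES_FILE: './data/features.h5'   # illustrative path
  LABELS_FILE: './data/labels.h5'       # illustrative path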

Running the code

Clone the repo.

git clone https://github.com/vijay4313/proxemo.git
cd <proxemo directory>

Find the latest tag among the released versions and check out that release.

git checkout tags/<latest_tag_name>

For example:

git fetch --all --tags
git checkout tags/v1.0

All settings are configured as YAML files under [proxemo folder]/emotion_classification/modeling/config. We provide two settings files: one for inference and one for training the model.

To run the code with a specific settings file, run the command below.

python3 main.py --settings infer

To run the demo, connect an Intel RealSense D435 camera (with the prerequisites mentioned above installed) and execute the command below.

python3 demo.py --model ./emotion_classification/modeling/config/infer.yaml
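Before launching the demo, a quick pyrealsense2 check can confirm that the D435 is streaming. The sketch below only grabs a single color frame; the skeleton tracking itself is handled by the Cubemos SDK inside the demo.

import numpy as np
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
pipeline.start(config)
try:
    frames = pipeline.wait_for_frames()
    color = np.asanyarray(frames.get_color_frame().get_data())
    print("Got color frame:", color.shape)      # expect (480, 640, 3)
finally:
    pipeline.stop()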


Results

We use the emotions detected by ProxEmo along with the LIDAR data to perform proxemic fusion. This gives us a comfort distance around a pedestrian for emotion-guided navigation. The green arrows represent the path after accounting for the comfort distance, while the violet arrows indicate the path without considering it. Observe the significant change in the path taken in the sad case. Note that the overhead image is representational; ProxEmo works entirely from an egocentric camera on the robot.
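Conceptually, proxemic fusion enlarges the clearance kept around each pedestrian according to the predicted emotion. The sketch below illustrates the idea only; the comfort radii are made-up placeholders, not values from the paper, and the repo's actual fusion operates on the planner's costmap.

COMFORT_RADIUS_M = {"happy": 0.5, "neutral": 0.7, "angry": 1.0, "sad": 1.2}  # placeholders

def comfort_distance(emotion, base_radius_m=0.5):
    """Clearance (meters) to keep around a pedestrian with this emotion."""
    return max(base_radius_m, COMFORT_RADIUS_M.get(emotion, base_radius_m))

def fuse_with_lidar(pedestrians, lidar_points):
    """Flag LIDAR returns inside any pedestrian's comfort zone as blocked."""
    blocked = []
    for px, py, emotion in pedestrians:          # (x, y, predicted emotion)
        r = comfort_distance(emotion)
        blocked += [(x, y) for x, y in lidar_points
                    if (x - px) ** 2 + (y - py) ** 2 <= r * r]
    return blocked                               # feed into the planner's costmap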

Comparison of ProxEmo with other state-of-the-art emotion classification algorithms.

Here we present the performance metrics of our ProxEmo network compared with state-of-the-art arbitrary-view action recognition models. We perform a comprehensive comparison across multiple distances of the skeletal gaits from the camera and across multiple view groups. Our ProxEmo network outperforms the other state-of-the-art networks by an average of 50% in prediction accuracy.

Confusion Matrix

Cite this paper

@article{narayanan2020proxemo,
  title={ProxEmo: Gait-based Emotion Learning and Multi-view Proxemic Fusion for Socially-Aware Robot Navigation},
  author={Narayanan, Venkatraman and Manoghar, Bala Murali and Dorbala, Vishnu Sashank and Manocha, Dinesh and Bera, Aniket},
  journal={arXiv preprint arXiv:2003.01062},
  year={2020}
}

Contact authors

Venkatraman Narayanan [email protected]

Bala Murali Manoghar Sai Sudhakar [email protected]

Vishnu Sashank Dorbala [email protected]

Aniket Bera [email protected]

