
posture-classification-models's Introduction

Image Classification Models to Improve Posture Classification

This repository contains data visualizations and documentation of our model implementations for predicting the postures of subjects in images. It also includes code that processes videos so they can be loaded into the models for posture prediction.

We did some exploratory data analysis and created visualizations in the visualizations_doc.ipynb file. Our goal was to see whether the distribution of images labeled occluded or not occluded is significant, and whether there are noticeable patterns in the dimensions of the images in relation to each image's primary_posture label.

stack bar of posture distribution

We also visualized height to width ratios of the images.

boxplot occluded

boxplot how many people

As we can see, images labeled 'Standing' typically have higher height-to-width ratios than those labeled 'Sitting' or 'Lying', and the 'Lying' images have the lowest height-to-width ratios overall. We also notice many positive outliers above the upper bound of the boxplots across all 3 posture labels.

When grouped by occludedness, the 'Standing' images still generally have the highest height-to-width ratios. For the 'Sitting' and 'Standing' images, those that are not occluded usually have higher height-to-width ratios than the occluded and unknown ones. This trend is reversed for the 'Lying' images.

When grouped by number of people, we see that when the number of people is 0, the height-to-width ratios tend to be higher than for images with people in them. The spread of the ratio is also very large for 'Standing'. Looking into that, we found that most of the images where how_many == None are labeled 'Standing', and many of them are occluded. Displaying these images shows that many of the photos are either cut off (the entire body is not pictured) or show side or back profiles of people, which explains why many of them are occluded and may be classified as having no one in them. The cut-off photos are usually cut vertically so that the width is very small, which explains why the height-to-width ratios of these images are generally higher than those of images labeled with more than 0 people. Also note that photos that contain mannequins instead of live human beings (such as images 1 and 4 in the plot above) may be labeled with how_many == None.
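The comparison of height-to-width ratios across posture labels can be sketched in a few lines. This is a minimal illustration, not the notebook's code: the records and the `width` and `height` field names are made up, and only `primary_posture` is a name taken from the text.

```python
from collections import defaultdict
from statistics import median

# Hypothetical metadata records (the sizes are invented for illustration).
records = [
    {"width": 100, "height": 300, "primary_posture": "Standing"},
    {"width": 120, "height": 300, "primary_posture": "Standing"},
    {"width": 200, "height": 220, "primary_posture": "Sitting"},
    {"width": 300, "height": 150, "primary_posture": "Lying"},
]

def median_ratio_by_posture(rows):
    """Return the median height-to-width ratio for each posture label."""
    ratios = defaultdict(list)
    for row in rows:
        ratios[row["primary_posture"]].append(row["height"] / row["width"])
    return {label: median(values) for label, values in ratios.items()}

summary = median_ratio_by_posture(records)
for label, value in summary.items():
    print(f"{label}: {value:.2f}")
```

The same grouping could be keyed on an occlusion field or on how_many instead of the posture label to reproduce the other two comparisons.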

Overall Findings: The data is unbalanced: the majority of the sample images are labeled 'Standing', while 'Lying' is the label with the fewest images. Further, images labeled 'Standing' tend to have the highest height-to-width ratios, whereas those labeled 'Lying' have the lowest. Additionally, although images that are not occluded seem to have higher height-to-width ratios than the occluded ones for the 'Standing' and 'Sitting' labels, this trend no longer holds for images labeled 'Lying', probably because there are far fewer pictures labeled 'Lying' than 'Standing' or 'Sitting'. Furthermore, there is a potential relationship between the number of people an image contains (from 0 to 3) and its height-to-width ratio: the more people a picture contains, the lower its height-to-width ratio tends to be across all 3 posture labels ('Standing', 'Sitting', and 'Lying'). As a result, it might be worth splitting the data according to the number of people in each image before preprocessing and running it through a model.
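As a rough sketch of the suggested split, the rows could be grouped by number of people while counting posture labels within each group, which also makes the class imbalance visible. The rows below are invented, and `posture_counts_by_people` is a hypothetical helper. Only the `how_many` and `primary_posture` names come from the text.

```python
from collections import Counter, defaultdict

def posture_counts_by_people(rows):
    """Group rows by how_many and count primary_posture labels per group."""
    groups = defaultdict(Counter)
    for row in rows:
        groups[row["how_many"]][row["primary_posture"]] += 1
    return dict(groups)

# Hypothetical rows; None stands for images labeled as containing no people.
rows = [
    {"how_many": None, "primary_posture": "Standing"},
    {"how_many": 1, "primary_posture": "Standing"},
    {"how_many": 1, "primary_posture": "Sitting"},
    {"how_many": 2, "primary_posture": "Lying"},
]

counts = posture_counts_by_people(rows)
for people, labels in counts.items():
    print(people, dict(labels))
```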

MobileNet

Runtime for a single prediction: 3.12ms

Since a blank MobileNet model (random initialization) overfits, we used MobileNet pre-trained on ImageNet, froze its convolutional layers, and retrained its fully connected layers on our data. Our data set is all three tranches, excluding the null and unknown values, with about 36k images. We trained on 30 thousand images with a 0.5 validation split and tested the model on the remaining 6 thousand. The model was trained on Google Colab. The results are as follows:

mobilenet1

mobilenet2
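The data split described above can be sketched with round numbers from the text (36,000 images, 6,000 held out for testing, and a 0.5 validation split on the remaining 30,000). This is only an illustration under assumptions: `split_dataset` is a hypothetical helper, and shuffling before slicing is not necessarily how the training framework applied the validation split.

```python
import random

def split_dataset(items, n_test, validation_split, seed=0):
    """Shuffle items, hold out n_test for testing, then split the rest."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    test_set = shuffled[:n_test]
    rest = shuffled[n_test:]
    n_val = int(len(rest) * validation_split)
    val_set, train_set = rest[:n_val], rest[n_val:]
    return train_set, val_set, test_set

# Integer ids stand in for image files.
train_set, val_set, test_set = split_dataset(
    range(36_000), n_test=6_000, validation_split=0.5
)
print(len(train_set), len(val_set), len(test_set))
```

Under these assumptions, a 0.5 validation split leaves roughly half of the 30 thousand images for fitting the weights and the other half for validation.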

Inception-ResNet v2

As with MobileNet, we used the same training, validation, and test sets. We froze the convolutional layers pre-trained on ImageNet and retrained the fully connected layers. The model was trained on Google Colab. The results are as follows:

inceptionresnetv2

MobileNet + Edge Detection

Runtime for a single prediction: 2.89ms

We also implemented some preprocessing techniques. The edge detection code was from group 2. We combined the original images and the edge-detected images into one training set. The following models were trained on the CARC system at USC. We used MobileNet pre-trained on ImageNet, first training on the original images only. We then trained the model on edge-detected images combined with the original images to see whether there was any improvement.
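The idea of combining original and edge-detected images into one training set can be sketched as follows. The edge detector here is a deliberately simple stand-in (absolute forward differences), not group 2's code, and `simple_edges` and `with_edge_copies` are hypothetical names. Images are represented as plain nested lists of grayscale values so the example has no dependencies.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel intensities

def simple_edges(img: Image) -> Image:
    """Stand-in edge detector: absolute forward differences in x and y."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0
            out[y][x] = min(255, abs(dx) + abs(dy))
    return out

def with_edge_copies(samples: List[Tuple[Image, str]]) -> List[Tuple[Image, str]]:
    """Return the original samples plus an edge-detected copy of each one."""
    combined = list(samples)
    combined.extend((simple_edges(img), label) for img, label in samples)
    return combined

# A tiny image with one vertical edge between the second and third columns.
image = [[0, 0, 255],
         [0, 0, 255]]
training_set = with_edge_copies([(image, "Standing")])
print(len(training_set), training_set[1][0])
```

Each image then appears twice with the same label, once as pixels and once as edges, so the combined set is twice the size of the original one.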

Original Images

mobilenet_edgedet1

mobilenet_edgedet2

We did not see a considerable improvement in accuracy. This may be because the convolutional layers are frozen or because edge detection is not effective. To find out which, we unfroze the convolutional layers in MobileNet and retrained the model.

MobileNet Unfrozen Without Edge Detection

mobilenet_edgedet3

Runtime for a single prediction: 2.90ms

mobilenet_edgedet4

Surprisingly, the accuracy increased when we unfroze the layers. The validation accuracy is now about 90%.

The video-processing code cuts a video into individual frames, stored as a list of images, so that posture predictions can be made on them. Any of the above models can be loaded to make the predictions.

Summary Table

summary

Limitations and Improvements

The height-to-width ratio of the pictures was affected by posture and the number of people. 'Standing' pictures had the highest ratios, followed by 'Sitting' and then 'Lying'. The fewer the people, the higher the ratio. Non-occluded pictures also had higher ratios for 'Standing' and 'Sitting'.

One improvement our group identified is dropping the images with extreme ratios, because outliers in picture ratios can dilute the actual results. We also found that object detection could improve results further: focusing on the center of the object instead of its edges could lead to higher accuracy than the edge-detection approach. Furthermore, taking a region of the image instead of resizing the whole image could lead to better accuracy. Finally, we decided that we would split the data by occludedness and/or number of people in the image for more complete data.
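Two of these ideas can be sketched briefly. The outlier rule below uses the common boxplot convention of 1.5 times the interquartile range, and the region is a plain center crop. Both are assumptions for illustration, since the text does not say which threshold or region would be used, and `drop_extreme_ratios` and `center_crop_box` are hypothetical helpers.

```python
from statistics import quantiles

def drop_extreme_ratios(ratios, k=1.5):
    """Keep only ratios inside the boxplot whiskers (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = quantiles(ratios, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in ratios if low <= r <= high]

def center_crop_box(width, height, crop_w, crop_h):
    """Return a (left, upper, right, lower) box centered in the image."""
    crop_w, crop_h = min(crop_w, width), min(crop_h, height)
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return left, top, left + crop_w, top + crop_h

# Invented ratios with one extreme value, and an invented 100x300 image size.
ratios = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 9.0]
kept = drop_extreme_ratios(ratios)
box = center_crop_box(100, 300, 100, 100)
print(kept, box)
```

The box uses the (left, upper, right, lower) order so that it could be passed to an image library's crop function that expects that layout.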

posture-classification-models's People

Contributors

alan556022, drewm8080, justinezth, tommypolikj
