babyjokes's Introduction

Baby Jokes Video Analysis

Caspar Addyman [email protected]

A demonstration project using machine learning models to analyse a dataset of videos of parents demonstrating jokes to babies. This dataset was assembled for the Sage Ethical AI Hackathon 2023. It serves as a small test case to explore challenges with machine learning models of parent-child interactions. You can watch a video motivating the project here: Sage Hackathon 2023 - PCI Video Analysis (6m20).

Dataset

A small test dataset is provided in the LookitLaughter.test folder. It consists of 54 videos of parents demonstrating simple jokes to their babies. Metadata is provided in _LookitLaughter.xlsx. Each video shows one joke from a set of five possibilities [Peekaboo, TearingPaper, NomNomNom, ThatsNotAHat, ThatsNotACat]. For each joke, parents rated how funny the child found it [Not Funny, Slightly Funny, Funny, Extremely Funny] and whether they laughed [Yes, No]. A larger dataset with 1425 videos is available on request.

Code

All notebooks and supporting code are in the code folder. The numbered notebooks should be run in order to process the data, train the models and generate the results.

#TODO - visualise data
#TODO - build models & analysis

Installation / Key Requirements

This project makes use of the following libraries and versions:

  • Python 3.11
  • PyTorch 2.1.0 (for YOLOv8, deepface, whisper)
  • ultralytics 8.0 (wrapper for YOLOv8 object detection model)
  • deepface 0.0.68 (Facial Expression Recognition)
  • speechbrain 0.5 (Speech Recognition)
  • openai-whisper (OpenAI's Whisper speech recognition, open-source version)

Installing with Conda

A Conda environment.yml file is provided, but the dependencies are complex and can fail to install in a single step. The culprit seems to be the PyTorch dependencies. Instead, run the following commands in the terminal.

  1. Create a new Python 3.11 environment.
conda create --name "babyjokes" python=3.11
  2. Activate the environment.
conda activate babyjokes
  3. Install PyTorch. It is advisable to follow the instructions at pytorch.org to get the correct version for your system.
  4. Add the other dependencies.
    Run the following command from the root directory of this project.
conda env update --file environment.yml

Installing with Pip

We also provide a pip requirements.txt file. This should work but has not been tested. We recommend following similar steps to the Conda installation above.

  1. Create a new Python 3.11 environment.
  2. Install PyTorch.
  3. Install the other dependencies:
pip install -r requirements.txt

If you get this working, please let us know what you did (and what OS you are using) so we can update this README.
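After either installation route, a quick check that the key libraries can be found in the active environment may save some debugging. The sketch below is not part of the repository; the function name `check_modules` is hypothetical, and the module names are the usual import names for the packages listed above (openai-whisper is imported as `whisper`).

```python
from importlib.util import find_spec
from typing import Dict, Iterable

# Import names of the key requirements (assumed to match the list above).
REQUIRED_MODULES = ("torch", "ultralytics", "deepface", "speechbrain", "whisper")


def check_modules(names: Iterable[str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Return a mapping of module name to whether it can be located.

    find_spec only locates a top-level module; it does not import it,
    so this is fast but will not catch errors raised at import time.
    """
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any module reported as MISSING points to the dependency that failed to install.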

Sage Hackathon

Sage data scientist Yu-Cheng has written up his team's approach to the problem on the Sage-AI blog: Quantifying Parent-Child Interactions: Advancing Video Understanding with Multi-Modal LLMs. Repositories from the hackathon are found here:

babyjokes's Issues

Automatically label parent and infant

Add a helper function that works out which person is which.
We expect each video to contain a parent and a child, but YOLO just labels them person1 and person2, so we want to assign the roles algorithmically. This could be based on the relative size of the participants:
baby << adult
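A minimal sketch of the size-based idea, assuming each detected person is available as a bounding box in (x1, y1, x2, y2) pixel form. The function names and the dictionary layout are hypothetical and not part of the existing code.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), an assumed format


def box_area(box: Box) -> float:
    """Area of a bounding box; degenerate boxes count as zero."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def label_parent_and_infant(boxes: Dict[str, Box]) -> Dict[str, str]:
    """Map exactly two detection ids to 'infant' / 'parent' by box area."""
    if len(boxes) != 2:
        raise ValueError("expected exactly two detected people")
    # Sort ids from smallest to largest box: baby << adult.
    small, large = sorted(boxes, key=lambda k: box_area(boxes[k]))
    return {small: "infant", large: "parent"}
```

A single frame can mislead (for example when the parent is partly out of shot), so in practice it is probably safer to feed this function a per-person summary such as the median box over the whole video.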

Function to generate annotated videos for all rows of ProcessedVideos.xlsx

In code\06_ca_clean_label_actors_remove_ghosts.ipynb we want to call display.createAnnotatedVideo for all the videos in our dataset.
Store the outputs in data\2_final.
Follow similar logic to how steps 2 & 3 generate the annotations:

  • Don't run if already processed, unless forceCreate = True.
  • Update ProcessedVideos.xlsx with the path to the annotated videos.
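The skip-unless-forced logic might look roughly like the sketch below. It makes several assumptions: the rows of ProcessedVideos.xlsx are represented as plain dictionaries, the column names `video_path` and `annotated_path` are hypothetical, and the real display.createAnnotatedVideo call is stood in for by a `create_video(source, target)` callable, since its actual signature is not shown here.

```python
from pathlib import Path
from typing import Callable, Dict, List


def annotate_all(
    rows: List[Dict[str, str]],
    create_video: Callable[[str, str], None],
    out_dir: str,
    force_create: bool = False,
) -> List[Dict[str, str]]:
    """Create an annotated video per row, skipping rows already processed."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    for row in rows:
        existing = row.get("annotated_path", "")
        # Skip if an annotated file is recorded and present, unless forced.
        if existing and Path(existing).exists() and not force_create:
            continue
        target = out_path / (Path(row["video_path"]).stem + "_annotated.mp4")
        create_video(row["video_path"], str(target))
        # Record where the annotated video was written.
        row["annotated_path"] = str(target)
    return rows
```

The updated rows would then be written back to the spreadsheet by whatever code already maintains it.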

Normalise all x & y coordinates.

Let's convert all coordinates to values between 0 and 1.

Might want two options for this.

  1. (Required) Normalise by height and width: divide x coordinates by the width and y coordinates by the height. This seems to be how most of the models we are using work.
  2. (Optional) Preserve aspect ratios: divide all x and y coordinates by max(height, width). This also has the advantage of being easier to code.

We want to do this for pose-estimation coordinates and for object and face bounding boxes (and anything else we label on the videos).
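Both options can be covered by one small function. This is a sketch under the assumption that coordinates arrive as (x, y) pixel pairs; the name `normalise_points` and the `keep_aspect` flag are hypothetical. A bounding box can be handled by passing its two corners as points.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels, an assumed format


def normalise_points(
    points: Iterable[Point],
    width: float,
    height: float,
    keep_aspect: bool = False,
) -> List[Point]:
    """Scale pixel coordinates into the range 0 to 1.

    keep_aspect=False: x is divided by width and y by height (option 1).
    keep_aspect=True: both are divided by max(width, height) (option 2).
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if keep_aspect:
        x_scale = y_scale = max(width, height)
    else:
        x_scale, y_scale = width, height
    return [(x / x_scale, y / y_scale) for x, y in points]
```

For a 640 x 480 frame, the centre point (320, 240) maps to (0.5, 0.5) under option 1 and to (0.5, 0.375) under option 2.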
