kabyru / facetomidi


A Python project that uses the locations of facial action units generated by OpenFace's facial recognition algorithm as note pitches for songs written with pretty-MIDI.

FaceToMIDI

"FaceToMIDI" is a Python project that feeds the output of OpenFace's facial recognition algorithm into pretty-MIDI's music-writing capabilities. As you will find by testing different faces, the distance between the face and the camera determines the octave range of the song, and each face produces a different variation of a similar melody. This project won't sing you The Beatles, but it will show you that, in the eyes of the OpenFace algorithm, faces are relatively similar, with each of us contributing a different variation on the same tune.

This project requires the OpenFace Windows x64 Release, which can be found here. Follow the instructions on the OpenFace installation page to make sure all reference models are installed; without them, this software will not work properly. Then drop the release ZIP into the FaceToMIDI directory and extract it.

This software uses the OpenFace 2.0.5 x64 Release. YMMV with other releases, but I recommend that you stick with this one for now, as it is the most recent version made public. The main script searches for an OpenFace 2.0.5 release and will require slight modification to work with other versions.

How It Works

FaceToMIDI takes a .jpg or .jpeg image file as input and passes it to FaceLandmarkImg.exe, the OpenFace executable designed specifically for still images. Once the analysis completes, OpenFace generates an output image and a CSV file containing its results. For example, the output image generated for the example photo in the repo looks like this:

[Image: OpenFace output image for the example photo]
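
The call into OpenFace might be sketched as below. This is only an illustration, not the project's actual script: the folder name `OpenFace_2.0.5_win_x64`, the output directory `processed`, and the helper `build_openface_command` are assumptions, while `-f` and `-out_dir` are the input-file and output-directory options of FaceLandmarkImg.exe.

```python
from pathlib import Path

# Assumed name of the folder extracted from the release ZIP (hypothetical).
OPENFACE_DIR = Path("OpenFace_2.0.5_win_x64")


def build_openface_command(image_path, out_dir="processed"):
    """Return the argument list for running FaceLandmarkImg.exe on one image."""
    image = Path(image_path)
    # FaceToMIDI only accepts .jpg and .jpeg inputs.
    if image.suffix.lower() not in {".jpg", ".jpeg"}:
        raise ValueError(f"expected a .jpg or .jpeg file, got {image.name}")
    exe = OPENFACE_DIR / "FaceLandmarkImg.exe"
    # -f selects the input image; -out_dir is where the image and CSV are written.
    return [str(exe), "-f", str(image), "-out_dir", str(out_dir)]


if __name__ == "__main__":
    command = build_openface_command("example.jpg")
    print(" ".join(command))
    # On a Windows machine with OpenFace extracted, the command could be run with:
    #   import subprocess
    #   subprocess.run(command, check=True)
```

Building the argument list separately from running it makes the command easy to inspect before OpenFace is actually invoked.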

Once these output files are generated, the script uses Python's built-in CSV reading capabilities to read the locations of the generated facial action units, which then serve as the MIDI note inputs.
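
As a rough sketch of this step, the standard `csv` module can pull numeric columns out of the first data row. The column names (`x_0`, `y_0`, ...) and the sample values below are made up for illustration, and `read_columns` is a hypothetical helper; which columns the real script reads is not stated here.

```python
import csv
import io

# Made-up sample in the style of an OpenFace CSV (values are not real output).
SAMPLE = (
    "face, confidence, x_0, x_1, y_0, y_1\n"
    "0, 0.98, 312.4, 318.9, 410.2, 455.7\n"
)


def read_columns(csv_file, prefix):
    """Return floats from the first data row whose header starts with prefix."""
    # skipinitialspace drops the blank that follows each comma in the header.
    reader = csv.DictReader(csv_file, skipinitialspace=True)
    row = next(reader)  # one image containing one face yields one data row
    return [float(value) for name, value in row.items()
            if name and name.strip().startswith(prefix)]


if __name__ == "__main__":
    print(read_columns(io.StringIO(SAMPLE), "x_"))  # [312.4, 318.9]
```

With a real file, `open(path, newline="")` would replace the `io.StringIO` wrapper.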

To convert this data into acceptable MIDI note inputs, the script uses the pretty-MIDI Python library (which installs itself automatically via pip). MIDI note numbers run from 0 to 127, and the script restricts pitches to 21 through 108, which matches the range of a standard 88-key piano (A0 to C8). To stay within this range, the script takes each value from the generated CSV and repeatedly divides it by 2 until it falls inside the range.

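
The halving rule can be sketched in a few lines. The function name `fold_into_midi_range`, the rounding to the nearest integer, and the clamping of small values up to 21 are assumptions made for this example; the description above only covers values that are too large.

```python
def fold_into_midi_range(value, low=21, high=108):
    """Halve value until it is at most high, then round to a MIDI pitch."""
    value = abs(float(value))
    while value > high:
        value /= 2
    # Assumption: values that end up below low are clamped up to low,
    # since the handling of small inputs is not described.
    return max(low, round(value))


if __name__ == "__main__":
    print(fold_into_midi_range(640))    # 640 -> 320 -> 160 -> 80
    print(fold_into_midi_range(312.4))  # 312.4 -> 156.2 -> 78.1 -> 78
    print(fold_into_midi_range(100))    # already in range: 100
```

Because each halving cuts the value by a factor of two, any input above 108 stops somewhere between 54 and 108, so large inputs all land within a single octave.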

One concern with this method is that all generated notes may sound relatively similar to each other, since repeated halving lands every value above 108 in roughly the top octave of the range (about 54 to 108). Other methods of fitting values into the MIDI pitch range will therefore be investigated.

Once the script generates MIDI notes for each facial action unit, an output MIDI is generated and located in the root directory of the project. For the example photo in the repo, the generated MIDI sounds like this:

A Beautiful Output Face Melody
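
Writing the notes out with the `pretty_midi` package could look roughly like the sketch below. The equal note lengths, the velocity of 100, the piano program, the file name `output.mid`, and both helper names are assumptions for illustration; how the real script times its notes is not described here.

```python
def schedule_notes(pitches, note_length=0.5):
    """Place pitches back to back, returning (pitch, start, end) tuples."""
    return [(pitch, i * note_length, (i + 1) * note_length)
            for i, pitch in enumerate(pitches)]


def write_midi(events, path="output.mid", velocity=100):
    """Write scheduled notes to a MIDI file on a single piano track."""
    import pretty_midi  # third-party package: pip install pretty_midi

    song = pretty_midi.PrettyMIDI()
    piano = pretty_midi.Instrument(program=0)  # program 0: Acoustic Grand Piano
    for pitch, start, end in events:
        piano.notes.append(
            pretty_midi.Note(velocity=velocity, pitch=pitch, start=start, end=end)
        )
    song.instruments.append(piano)
    song.write(path)


if __name__ == "__main__":
    events = schedule_notes([80, 78, 100])
    print(events)
    # write_midi(events)  # requires pretty_midi to be installed
```

Keeping the scheduling separate from the file writing means the timing can be checked without the MIDI library present.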

This Music Isn't My Taste. What's the Point of This?

I created this project to test what musical capabilities could come from OpenFace's facial recognition algorithm and, in turn, what musical applications could come from this computer vision innovation. I plan to push this project further by implementing real-time musical output from a webcam stream, which opens up many opportunities, including using facial action units as inputs to multiple musical instruments: in other words, a face orchestra!

Let's work together!

If you are interested in pushing this project further in one of the directions mentioned above, feel free to fork it and begin your own work! You are also welcome to contact me with ideas for musical applications of facial input, and together we can make something truly extraordinary.

Yours in Musicality, Kaleb Byrum [email protected] GitHub: kabyru
