
Comments (13)

HalfdanJ avatar HalfdanJ commented on June 23, 2024

This is high on our wishlist. The issue we haven't solved yet is that the pre-processing that converts audio data into the format the network expects hasn't been written for Python yet. As I understand it, FFT processing of audio is handled differently natively in JavaScript and Python, making it tricky.

The model used for audio training is Speech Commands (https://github.com/tensorflow/tfjs-models/tree/master/speech-commands), which unfortunately does not yet have a Python counterpart.

Contributions here are very welcome!

from teachablemachine-community.

td0m avatar td0m commented on June 23, 2024

Thanks for the reply, @HalfdanJ. Does the speech-commands package work on Node.js? I tried running it in a non-browser environment and it didn't seem to work.


HalfdanJ avatar HalfdanJ commented on June 23, 2024

I don't believe so


caisq avatar caisq commented on June 23, 2024

@d0minikt Can you say a little more about your use case? The audio model is tied to WebAudio's frequency analyzer (FFT). This means that in order to use the model in Python, you'll need to find a way to replicate the audio input parameters and frequency analysis.
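To make "replicating the frequency analysis" concrete, here is a minimal sketch of the per-frame steps the Web Audio specification describes for an AnalyserNode: a Blackman window (alpha 0.16), a 1/N-scaled Fourier transform, exponential smoothing over time, and conversion to dB. It is an assumption that this is enough to match the model's input; the frame size, sample rate, and any bin truncation or normalization used by Speech Commands are not covered here and would have to be taken from its source. The function names and the `floor_db` stand-in for the browser's negative infinity are hypothetical.

```python
import cmath
import math


def blackman_window(n_samples, alpha=0.16):
    """Blackman window with the alpha the Web Audio spec uses (0.16)."""
    a0 = (1 - alpha) / 2
    a1 = 0.5
    a2 = alpha / 2
    return [
        a0
        - a1 * math.cos(2 * math.pi * n / n_samples)
        + a2 * math.cos(4 * math.pi * n / n_samples)
        for n in range(n_samples)
    ]


def magnitude_spectrum(frame):
    """Windowed, 1/N-scaled DFT magnitudes for bins 0 .. N/2 - 1.

    A naive O(N^2) DFT keeps the sketch dependency-free; a real FFT
    library is the practical choice for realistic frame sizes.
    """
    n_samples = len(frame)
    window = blackman_window(n_samples)
    windowed = [x * w for x, w in zip(frame, window)]
    mags = []
    for k in range(n_samples // 2):
        acc = sum(
            windowed[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
            for n in range(n_samples)
        )
        mags.append(abs(acc) / n_samples)
    return mags


def analyser_frames_db(frames, smoothing=0.8, floor_db=-200.0):
    """Yield one dB spectrum per frame, smoothed over time.

    smoothing=0.8 mirrors the AnalyserNode default smoothingTimeConstant.
    floor_db is a hypothetical placeholder for bins whose magnitude is 0.
    """
    smoothed = None
    for frame in frames:
        mags = magnitude_spectrum(frame)
        if smoothed is None:
            smoothed = [0.0] * len(mags)
        smoothed = [
            smoothing * prev + (1 - smoothing) * m
            for prev, m in zip(smoothed, mags)
        ]
        yield [20 * math.log10(v) if v > 0 else floor_db for v in smoothed]
```

Feeding consecutive frames of microphone samples through `analyser_frames_db` gives a linear-frequency spectrogram in dB; whether it lines up with what the browser feeds the model would still need checking against real browser output.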


td0m avatar td0m commented on June 23, 2024

@caisq If that's hard to do, would it be easier to port the speech-commands package so that it also runs on Node.js? I'm sure I'm not the only one with a headless use case.


lc0 avatar lc0 commented on June 23, 2024

There are DSP ops in TensorFlow itself [1], but I guess they are hard to maintain across platforms like TFLite and TF.js.

Also, you most likely rely on the browser's optimized FFT.

  1. https://www.tensorflow.org/api_docs/python/tf/signal


lc0 avatar lc0 commented on June 23, 2024

Also, are there any plans to support exporting a SavedModel? Currently I only see export to TensorFlow.js.


nickoala avatar nickoala commented on June 23, 2024

> As I understand it, FFT processing of audio is handled differently natively in JavaScript and Python, making it tricky.
>
> The model that is used for audio training is ...... and does unfortunately not yet have a Python counterpart.

In case anyone hasn't noticed, the Coral example project Keyphrase detector seems to have the necessary pre-processing code. I'm not sure it's equivalent to the pre-processing in Speech Commands, but at least both compute a Mel spectrogram.

I am just saying this, in case it may be helpful to someone.


caisq avatar caisq commented on June 23, 2024

@nickoala To be clear, I'm pretty sure the preprocessing steps in the Coral example don't fit TF.js Speech Commands, because Speech Commands is based on the browser's WebAudio FFT, which produces a linear-frequency spectrum, not a Mel one.


nickoala avatar nickoala commented on June 23, 2024

@caisq, but there is a SOFT_FFT option to speechCommands.create(), right? This file does seem to compute a Mel spectrogram.


caisq avatar caisq commented on June 23, 2024

@nickoala My apologies: the documentation is not very clear and some of the code is obsolete. The SOFT_FFT mode does use a Mel spectrum, but the default mode of Speech Commands (BROWSER_FFT) uses the linear spectrum from WebAudio.
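For readers unsure what separates the two modes, the sketch below shows how a Mel spectrum is commonly derived from a linear-frequency one: triangular filters, evenly spaced on the mel axis, pool the linear bins into a smaller number of bands. This is a generic textbook construction using the HTK-style mel formula, not the SOFT_FFT code from Speech Commands or the Coral example; all names and parameters are hypothetical.

```python
import math


def hz_to_mel(hz):
    """HTK-style mel scale; other variants of the formula exist."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_bins, sample_rate):
    """Build n_mels triangular filters over n_bins linear-frequency bins.

    Bin k is assumed to sit at k * (sample_rate / 2) / n_bins Hz, i.e. the
    bins cover 0 Hz up to just below the Nyquist frequency.
    """
    nyquist = sample_rate / 2.0
    # n_mels + 2 edge points, evenly spaced on the mel axis.
    top_mel = hz_to_mel(nyquist)
    hz_points = [
        mel_to_hz(top_mel * i / (n_mels + 1)) for i in range(n_mels + 2)
    ]
    bin_hz = [nyquist * k / n_bins for k in range(n_bins)]
    filters = []
    for m in range(1, n_mels + 1):
        lo, center, hi = hz_points[m - 1], hz_points[m], hz_points[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= center:
                row.append((f - lo) / (center - lo))  # rising edge
            elif center < f < hi:
                row.append((hi - f) / (hi - center))  # falling edge
            else:
                row.append(0.0)
        filters.append(row)
    return filters


def linear_to_mel(linear_spectrum, filters):
    """Pool a linear magnitude (or power) spectrum into mel bands."""
    return [
        sum(w * x for w, x in zip(row, linear_spectrum)) for row in filters
    ]
```

A model trained on one representation will not accept the other, which is why pre-processing code written for a Mel pipeline cannot simply be reused for the default BROWSER_FFT mode.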


nickoala avatar nickoala commented on June 23, 2024

I see. Thank you for the clarification.


charlielito avatar charlielito commented on June 23, 2024

Hey guys, I had the same problem trying to run an audio model on a headless device with Python. I got it working with Node.js, and the same trick should also work with Python. The trick is to launch a headless Chromium with Puppeteer, let the JavaScript in the page run the model, and parse the predictions inside the Node.js script. From there you are free to do whatever you want with the predictions.

I used it to turn my room's light on and off. If you want to check out the code and see how to do it, go to: https://github.com/charlielito/teachable-machines-audio-demo
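One way the Python side of this trick could look is sketched below. It assumes, hypothetically, that the Node.js and Puppeteer script prints each prediction to stdout as one JSON line such as `{"label": "clap", "score": 0.93}`; that output format, the function names, and the threshold are illustrative assumptions and not what the linked repository does.

```python
import json
import subprocess


def parse_prediction(line):
    """Parse one JSON line into (label, score).

    Returns None for anything that is not a prediction, such as log output.
    """
    try:
        data = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or "label" not in data or "score" not in data:
        return None
    return data["label"], float(data["score"])


def watch_predictions(command, on_prediction, threshold=0.8):
    """Run `command` and call on_prediction(label, score) for confident hits.

    `command` is an argument list for the process that drives the headless
    browser; its stdout is read line by line until the process exits.
    """
    with subprocess.Popen(command, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            parsed = parse_prediction(line.strip())
            if parsed is not None and parsed[1] >= threshold:
                on_prediction(*parsed)
```

With a placeholder script name, a call such as `watch_predictions(["node", "run_model.js"], handler)` would invoke `handler` for every confident prediction, and the handler could then toggle a light or do anything else.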

Any feedback is welcome!

