Comments (6)
Notes on hotg drive: https://docs.google.com/document/d/1IeJjxcj8VIca_nFGxnNVsbxsuQvGmnI0-Lga0Wy5Tg8/edit#
Overall summary:
- The spectrogram-computer function used from Python (really implemented in C/C++) is quite complicated; we probably want to simplify it and retrain the model with the simpler version.
- The Rust spectrogram-computer library sonogram does not do exactly what we need (it has no mel spectrum) and gives rather poor access to its parameters; we probably want to replace sonogram with (1) our own windowing function, written in Rust, and (2) an existing FFT crate.
- OR we just call the individual C/C++ steps in the TF library from Rust.
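As a rough idea of what "our own windowing function" could look like, here is a minimal sketch in plain Rust. The names `hann_window` and `frames` are hypothetical, and the periodic Hann window is an assumption; the coefficients actually used in the TF code may differ and would need to be checked before matching outputs.

```rust
use std::f32::consts::PI;

/// Periodic Hann window coefficients for a frame of `size` samples.
/// (Assumed window shape; not taken from the TF implementation.)
fn hann_window(size: usize) -> Vec<f32> {
    (0..size)
        .map(|n| 0.5 - 0.5 * (2.0 * PI * n as f32 / size as f32).cos())
        .collect()
}

/// Split `samples` into overlapping frames of `window_size` samples,
/// advancing by `step` samples each time, and multiply each frame by
/// the window. Trailing samples that do not fill a frame are dropped.
fn frames(samples: &[f32], window_size: usize, step: usize) -> Vec<Vec<f32>> {
    assert!(window_size > 0 && step > 0, "window_size and step must be non-zero");
    let window = hann_window(window_size);
    let mut out = Vec::new();
    let mut start = 0;
    while start + window_size <= samples.len() {
        let frame: Vec<f32> = samples[start..start + window_size]
            .iter()
            .zip(&window)
            .map(|(s, w)| s * w)
            .collect();
        out.push(frame);
        start += step;
    }
    out
}
```

Each returned frame could then be handed to whichever FFT crate we pick.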
Here's the TF Ops repo with all the parts of the TF spectrogram-computer, implemented in C/C++: TF spectrogram-computer repo
The steps in the TF spectrogram-computer (they are all sequentially called from frontend.c) are, with links to the relevant code:
- Step 1: A windowing function that chops the incoming audio samples into windows: window.c - this is currently part of sonogram; it should not be too difficult to figure out / reverse engineer
- Step 2: FFT - applied to each window: fft.cc - this exists in Rust already
- Step 3: Filterbank calculations - convert the real and imaginary parts of the FFT output into energy: filterbank.c (FilterbankConvertFftComplexToEnergy & FilterbankAccumulateChannels)
- Step 4: Noise reduction - apply a low pass filter on each of the windows: noise_reduction.c (NoiseReductionApply)
- Step 5: Auto gain control - this might be complicated to reimplement, the algorithm is explained in Wang et al. 2016: pcan_gain_control.c (PcanGainControlApply)
- Step 6: Logarithmic scaling: log_scale.c (LogScaleApply)
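To make steps 3 and 6 more concrete, here is a floating-point sketch in Rust. It is not a port of filterbank.c or log_scale.c: the `Complex` and `Channel` types, the function names, and the `floor` parameter are all hypothetical, and the scaling and number format used by the C code are assumed to differ.

```rust
/// One frequency bin of an FFT output (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug)]
struct Complex {
    re: f32,
    im: f32,
}

/// Step 3a: the energy of each bin is re^2 + im^2.
fn fft_to_energy(bins: &[Complex]) -> Vec<f32> {
    bins.iter().map(|c| c.re * c.re + c.im * c.im).collect()
}

/// A filterbank channel: weights applied to consecutive bins,
/// starting at bin index `start` (hypothetical layout).
struct Channel {
    start: usize,
    weights: Vec<f32>,
}

/// Step 3b: weighted sum of bin energies for each channel.
fn accumulate_channels(energy: &[f32], channels: &[Channel]) -> Vec<f32> {
    channels
        .iter()
        .map(|ch| {
            energy
                .iter()
                .skip(ch.start)
                .zip(&ch.weights)
                .map(|(e, w)| e * w)
                .sum::<f32>()
        })
        .collect()
}

/// Step 6: natural log of each channel value.
/// Values below `floor` are clamped to `floor` to avoid ln(0).
fn log_scale(values: &[f32], floor: f32) -> Vec<f32> {
    values.iter().map(|&v| v.max(floor).ln()).collect()
}
```

The mel spacing of the channels would live entirely in how the `Channel` weights are built, which this sketch leaves open.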
I think a reasonable plan to match the model might be (given that especially step 5 might be quite complicated):
- Stage 1: Retrain the TF model with noise reduction and gain control turned off, and match with a Rust proc block that does steps (1) windowing, (2) FFT, (3) filterbank, (6) log scaling - this should be doable with sonogram + some hacking
- Stage 2: Match steps (4) noise reduction and (5) gain control in Rust (or do we call the C/C++ functions from Rust?)
- Stage 3: Go back to using the original model (now that the FFT proc block is fully matched)
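For each stage we need some way to decide whether the Rust output is "matched" to the TF output. A minimal sketch of such a check, assuming both spectrograms are flattened to `f32` slices; the function names and the idea of a fixed tolerance are assumptions, not something we have agreed on:

```rust
/// Largest absolute element-wise difference between two flattened
/// spectrograms. Returns None if the lengths differ, and Some(NaN)
/// if any difference is NaN.
fn max_abs_diff(actual: &[f32], expected: &[f32]) -> Option<f32> {
    if actual.len() != expected.len() {
        return None;
    }
    let mut worst = 0.0_f32;
    for (a, e) in actual.iter().zip(expected) {
        let d = (a - e).abs();
        if d.is_nan() {
            return Some(f32::NAN);
        }
        worst = worst.max(d);
    }
    Some(worst)
}

/// True when both slices have the same length and every element is
/// within `tolerance` of the reference. NaN never counts as a match.
fn matches_reference(actual: &[f32], expected: &[f32], tolerance: f32) -> bool {
    matches!(max_abs_diff(actual, expected), Some(d) if d <= tolerance)
}
```

The reference values would come from running the TF spectrogram-computer on the same audio clip; how tight the tolerance needs to be for the model to behave the same is an open question.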
from rune.
Starting with microspeech, with the FFT fix.
@Michael-F-Bryan might have a better idea here.
https://github.com/kthakore/json-eater
!!! If we can test proc blocks in Python, that is a HUGE deal
Need to make an implementation in Rust (copying over a Python function) for microspeech. More notes from @meelislootus.
- Should we make proc block libraries of these? Users could then use them in their own proc blocks.
It looks like microspeech is good so I'll close this and #113.