flakm / jupiter-search
Convert podcast RSS feed to transcriptions using whisper model
License: Apache License 2.0
Currently, in `ffmpeg_decoder.rs`, the conversion from mp3 to wav is done by spawning a child subprocess and calling `ffmpeg` directly.
This is suboptimal for a number of reasons:
- it can break when the `ffmpeg` version changes (or when it is missing completely)
I've tried to prepare an alternative Rust-only solution inside `decoder.rs`, but it is not working, possibly because I'm a complete tool when it comes to audio.
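For context, the subprocess approach described above can be sketched roughly as follows. This is not the actual code from `ffmpeg_decoder.rs`: the function names are hypothetical, and the flags assume the 16 kHz mono 16-bit PCM WAV input that whisper.cpp expects.

```rust
use std::io;
use std::process::Command;

/// Builds the ffmpeg argument list for an mp3 -> wav conversion.
/// Assumption: 16 kHz, mono, signed 16-bit PCM output (whisper.cpp's input format).
pub fn ffmpeg_args(input: &str, output: &str) -> Vec<String> {
    [
        "-y", // overwrite the output file if it already exists
        "-i", input,
        "-ar", "16000", // sample rate in Hz
        "-ac", "1", // single (mono) channel
        "-c:a", "pcm_s16le", // 16-bit little-endian PCM
        output,
    ]
    .iter()
    .map(|s| s.to_string())
    .collect()
}

/// Hypothetical wrapper: spawns ffmpeg as a child process and waits for it.
pub fn convert_to_wav(input: &str, output: &str) -> io::Result<()> {
    let status = Command::new("ffmpeg")
        .args(ffmpeg_args(input, output))
        .status()?; // fails here if the ffmpeg binary cannot be started
    if status.success() {
        Ok(())
    } else {
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("ffmpeg exited with {}", status),
        ))
    }
}
```

If `ffmpeg` is not on `PATH`, `Command::status` returns an `io::Error` of kind `NotFound`, so the missing-binary case at least surfaces as an ordinary error rather than a panic.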
Why it is not a blocker:
- `ffmpeg` is pretty stable, so the flags won't change from version to version
- `ffmpeg` is missing (?)
According to ggerganov/whisper.cpp#394, it should be possible to speed up the audio and still get some results.
Here are the ffmpeg instructions for this: https://trac.ffmpeg.org/wiki/How%20to%20speed%20up%20/%20slow%20down%20a%20video
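As a rough sketch of what the speed-up could look like, the helper below builds an `atempo` audio filter string that could be passed to `ffmpeg -filter:a`. The function name is hypothetical, and it conservatively assumes each `atempo` stage only accepts factors between 0.5 and 2.0 (the limit in older ffmpeg builds), chaining stages for anything outside that range.

```rust
/// Builds an ffmpeg `atempo` filter chain for the given speed factor.
/// Returns `None` for non-positive or non-finite factors.
/// Assumption: each atempo stage is kept within [0.5, 2.0].
pub fn atempo_chain(mut factor: f64) -> Option<String> {
    if !factor.is_finite() || factor <= 0.0 {
        return None;
    }
    let mut stages: Vec<f64> = Vec::new();
    // Peel off 2x stages until the remainder fits in a single stage.
    while factor > 2.0 {
        stages.push(2.0);
        factor /= 2.0;
    }
    // Same idea for slow-downs below 0.5x.
    while factor < 0.5 {
        stages.push(0.5);
        factor /= 0.5;
    }
    stages.push(factor);
    let parts: Vec<String> = stages
        .iter()
        .map(|s| format!("atempo={}", s))
        .collect();
    Some(parts.join(","))
}
```

For example, `atempo_chain(3.0)` yields `atempo=2,atempo=1.5`, which multiplies out to the requested 3x speed-up.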
Acceptance criteria:
Since we need to download the audio anyway, we might as well parse all of the metadata from the mp3 and attach it to the transcript.
Awesome crate for this specific task: https://docs.rs/lofty/latest/lofty/
Since transcription is very time-consuming (roughly 0.66 × the audio length), the results of STT should be cached, keyed by a hash of the RSS data, in S3 or some other storage.
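One possible shape for such a cache is sketched below. Everything here is an assumption for illustration: the issue does not say what exactly gets hashed (the sketch hashes an arbitrary RSS item string such as a GUID or enclosure URL), the type and function names are hypothetical, and an in-memory `HashMap` stands in for S3. FNV-1a is used instead of `DefaultHasher` because a persistent cache key needs a hash that stays stable across Rust versions.

```rust
use std::collections::HashMap;

/// 64-bit FNV-1a hash; stable across platforms and compiler versions.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Derives an object key from an RSS item identifier (hypothetical layout).
pub fn cache_key(rss_item: &str) -> String {
    format!("transcripts/{:016x}.json", fnv1a_64(rss_item.as_bytes()))
}

/// In-memory stand-in for S3 or another object store.
pub struct TranscriptCache {
    store: HashMap<String, String>,
}

impl TranscriptCache {
    pub fn new() -> Self {
        TranscriptCache { store: HashMap::new() }
    }

    /// Returns the cached transcript, running `transcribe` only on a miss.
    pub fn get_or_transcribe<F>(&mut self, rss_item: &str, transcribe: F) -> &str
    where
        F: FnOnce() -> String,
    {
        self.store
            .entry(cache_key(rss_item))
            .or_insert_with(transcribe)
            .as_str()
    }
}
```

With this layout the expensive transcription closure runs at most once per RSS item; swapping the `HashMap` for a real object store would change only the body of `get_or_transcribe`.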
Potential segfault in the time crate
Details | |
---|---|
Package | time |
Version | 0.1.44 |
URL | time-rs/time#293 |
Date | 2020-11-18 |
Patched versions | >=0.2.23 |
Unaffected versions | =0.2.0,=0.2.1,=0.2.2,=0.2.3,=0.2.4,=0.2.5,=0.2.6 |
Unix-like operating systems may segfault due to dereferencing a dangling pointer in specific circumstances. This requires an environment variable to be set in a different thread than the affected functions. This may occur without the user's knowledge, notably in a third-party library.
The affected functions from time 0.2.7 through 0.2.22 are:
- `time::UtcOffset::local_offset_at`
- `time::UtcOffset::try_local_offset_at`
- `time::UtcOffset::current_local_offset`
- `time::UtcOffset::try_current_local_offset`
- `time::OffsetDateTime::now_local`
- `time::OffsetDateTime::try_now_local`
The affected functions in time 0.1 (all versions) are:
- `at`
- `at_utc`
- `now`
Non-Unix targets (including Windows and wasm) are unaffected.
Pending a proper fix, the internal method that determines the local offset has been modified to always return `None` on the affected operating systems. This has the effect of returning an `Err` on the `try_*` methods and `UTC` on the non-`try_*` methods.
Users and library authors with time in their dependency tree should perform `cargo update`, which will pull in the updated, unaffected code.
Users of time 0.1 do not have a patch and should upgrade to an unaffected version: time 0.2.23 or greater or the 0.3 series.
No workarounds are known.
See advisory page for additional details.
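Find out why the unit test that does the same thing as the example code in `get_transcript.rs` hangs. oO

    #[test]
    fn stt_works() {
        // Load the tiny English model from the resources directory.
        let mut ctx = SttContext::try_new("resources/ggml-tiny.en.bin").unwrap();
        // Transcribe a very short sample file.
        let t = ctx
            .get_transcript_file("resources/super_short.mp3", false)
            .unwrap();
        println!("{:?}", t);
        // At least one utterance should have been recognized.
        assert!(t.utterances.len() > 0)
    }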
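While the root cause is unknown, one way to keep a hanging test from blocking the whole suite is to run the body on a separate thread and give up after a deadline. The helper below is only a sketch with a hypothetical name. It does not kill the stuck thread, it just lets the caller fail fast.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Runs `f` on a new thread and waits at most `timeout` for its result.
/// Returns `None` if the deadline passes or if `f` panics.
/// Note: a timed-out thread keeps running in the background.
pub fn run_with_timeout<T, F>(timeout: Duration, f: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(f());
    });
    rx.recv_timeout(timeout).ok()
}
```

Inside the test, the transcription call could be wrapped in `run_with_timeout(Duration::from_secs(60), ...)` and the result checked with `expect("transcription timed out")`, which turns the hang into a visible failure.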
Maybe those crates won't get used so much but having a hosted version of documentation would be awesome ;)
Currently the results of the generic models are less than satisfactory.
The language model clearly doesn't know any Linux-y, techy words, so it should be tuned for this specific purpose.
Training requires checkpoints, which can be taken from the releases page (https://github.com/coqui-ai/STT/releases?q=1.3.0&expanded=true). All 1.x.x releases are backward compatible with the generic models.