Comments (4)

sandrohanea commented on July 28, 2024

Hello @drajvver,
Not every flag in the main example of whisper.cpp has a corresponding With~ fluent API, but every field of whisper.cpp's whisper_full_params (https://github.com/ggerganov/whisper.cpp/blob/master/whisper.h#L332) has a corresponding fluent API method in Whisper.net.

Some of the arguments are implemented only on the client side (e.g. diarization), but I added an example of this as well: https://github.com/sandrohanea/whisper.net/tree/main/examples/Diarization

For -ml (--max-len), the main example changes several whisper_full_params fields:
https://github.com/ggerganov/whisper.cpp/blob/master/examples/main/main.cpp#LL776C1-L779C1

The Whisper.net equivalent of that would be:

        .WithTokenTimestamps()
        .WithMaxSegmentLength(15)

from whisper.net.

drajvver commented on July 28, 2024

So I think either it does not work as it should, or I'm making some silly mistake.
For this video: https://www.youtube.com/shorts/g9IYllmOtUc

And these settings:

await using var processor = whisperFactory.CreateBuilder()
    .WithLanguage("en")
    .WithTemperature(0.2f)
    .WithTokenTimestamps()
    .WithMaxSegmentLength(4)
    .WithPrintProgress()
    .WithPrintResults()
    .WithPrintTimestamps()
    .Build();

I get output like this:

[00:00:00.000 --> 00:00:06.140] My friend Julius just moved into his new home and needed to go grab some tools, so he asked me to watch his place.
[00:00:06.140 --> 00:00:13.060] I watched his kitchen and found what I thought was his only ramen stash, but I looked to the left and saw another bag of ramen packages.
[00:00:13.060 --> 00:00:19.700] I started digging through it to see if any of them sounded good. Then I looked to the right and found even more instant ramen in a box.
[00:00:19.700 --> 00:00:28.060] Then I felt the urge to turn around and boom, there's another bag of noodles. I grabbed the super spicy ones and started to quickly make them before Julius got back.
[00:00:28.060 --> 00:00:35.900] I felt like I had spent too much time perusing his ramen stash, so I didn't add much to this. Now was it super spicy as advertised? Eh.
[00:00:35.900 --> 00:00:40.980] It definitely had a pleasant kick and the noodles were nice and chewy, but spice was probably a 3 out of 10.

It's completely possible that I'm doing something very wrong, but I can't see what that would be.
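
For anyone who wants to check output like the above mechanically, here is a rough sketch of a helper that flags printed segments whose text is longer than the configured limit. It assumes the limit is a character count (which is how whisper.cpp documents --max-len, as far as I know) and that lines look like the "[start --> end] text" output shown above. SegmentLengthCheck and FindOverlong are hypothetical names, not part of Whisper.net.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, NOT part of Whisper.net or whisper.cpp.
// Assumption: the maximum segment length is measured in characters.
public static class SegmentLengthCheck
{
    // Matches printed lines such as "[00:00:00.000 --> 00:00:06.140] some text".
    private static readonly Regex SegmentLine =
        new Regex(@"^\[[\d:.]+ --> [\d:.]+\]\s*(?<text>.*)$");

    // Returns the text of every segment line longer than maxLength characters.
    public static List<string> FindOverlong(IEnumerable<string> lines, int maxLength)
    {
        var overlong = new List<string>();
        foreach (var line in lines)
        {
            var match = SegmentLine.Match(line);
            if (!match.Success)
            {
                continue; // not a segment line (e.g. progress output), skip it
            }

            var text = match.Groups["text"].Value.Trim();
            if (text.Length > maxLength)
            {
                overlong.Add(text);
            }
        }
        return overlong;
    }
}
```

Calling SegmentLengthCheck.FindOverlong(outputLines, 4) on the six lines above would return all of them, since each segment is a full sentence or more. Treat the check as a heuristic: a single long token can probably still exceed the limit slightly even when splitting works.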

sandrohanea commented on July 28, 2024

It does indeed sound like a bug, but I haven't had time to check it yet :(

sandrohanea commented on July 28, 2024

Hello again @drajvver,
I tried to reproduce the bug with the tiny model but couldn't: Whisper.net returned the same output as whisper.cpp.
[image]

Could you please create a repro zip (including the model) using Whisper.net 1.4.4?