sandrohanea / whisper.net

Whisper.net. Speech to text made simple using Whisper Models

License: MIT License

CMake 11.85% C# 30.35% PowerShell 1.11% Makefile 3.93% Metal 52.74%

cross-platform dotnet dotnetcore speech-recognition speech-to-text translation

whisper.net's Issues

WithNoSpeechThreshold doesn't seem to do anything

Either I misunderstand what this is supposed to do, or it doesn't work.
I tried a variety of values from 0.1f to 10f and still had hallucinations in my output, such as "you" repeated over and over. This seems to occur whenever an empty audio stream is passed to Whisper, likely because such input was not included in the model's training data.

Faster transcription

I tried using WithThreads.
I have a very powerful processor,
but it didn't help: it drew more power from the processor yet took the same amount of time, so I gained nothing.
Is there something I'm doing wrong?
I also tried
WithSpeedUp2x,
and then it just doesn't transcribe anything!
Thanks for this wonderful library
and for your help.

Add Android support

Thanks for creating this project!
I'd like to use it in an Android app with Maui. Are you planning to add support for Android?

OpenBLAS support

I've managed to hack OpenBLAS support into the cmake files for linux-x64. For some reason, find_package(BLAS) does not work, and I had to set some variables manually. This resulted in greatly improved processing time on a 2 thread VM. On a random audio file I've been testing with, originally it took 1511 seconds, now it takes 770 seconds!

I'm very rusty with CMake, so I'm open to ideas.

Where should I insert my api key?

I implemented this library per the example provided here and I get the following exception when trying to submit requests:
System.Security.Authentication.AuthenticationException: Authentication failed

I guess this is due to missing api key.

Unable to compile and build when integrating into a .NET Framework 4.7.2 project

Hello there, greetings.

I have a desktop application into which I am trying to integrate the Whisper.net library. The project is based on .NET Framework 4.7.2 and WPF, and I am using Visual Studio 2019.

After adding the package from NuGet, when I add the sample code to the project I get a compilation error on the following lines.

await foreach (var result in processor.ProcessAsync(fileStream))
           {
               Console.WriteLine($"{result.Start}->{result.End}: {result.Text}");
           }

The error is: The type 'IAsyncEnumerable<>' is defined in an assembly that is not referenced. You must add a reference to assembly 'Microsoft.Bcl.AsyncInterfaces, Version=7.0.0.0, Culture=neutral, PublicKeyToken=cc7b13ffcd2ddd51'

After some research I found that IAsyncEnumerable is part of the .NET Core SDK, and that to use it on .NET Framework I need the Microsoft.Bcl.AsyncInterfaces package. Even after installing it, the error is still there. I also have .NET Core installed on the system and tried changing the C# language version from 7.3 to 8.0, but the error remains.

I have successfully built the sample project, which is based on .NET Core.

So my question is: is this library compatible with .NET Framework, or is .NET Core a must?
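
For readers in the same situation, here is a minimal sketch of how the result loop could be written without the C# 8 `await foreach` syntax. It does not fix the missing-assembly error by itself: it assumes the project already references Microsoft.Bcl.AsyncInterfaces (with a matching version or binding redirect) so that `IAsyncEnumerable<T>` resolves. The helper name `ForEachAsync` is hypothetical and is not part of Whisper.net.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncEnumerableCompat
{
    // Hypothetical helper: walks an IAsyncEnumerable<T> by hand,
    // which is what `await foreach` expands to in C# 8.
    public static async Task ForEachAsync<T>(IAsyncEnumerable<T> source, Action<T> onItem)
    {
        IAsyncEnumerator<T> enumerator = source.GetAsyncEnumerator();
        try
        {
            while (await enumerator.MoveNextAsync())
            {
                onItem(enumerator.Current);
            }
        }
        finally
        {
            // Always release the enumerator, even if onItem throws.
            await enumerator.DisposeAsync();
        }
    }
}
```

With this in place, the loop from the sample would become something like `await AsyncEnumerableCompat.ForEachAsync(processor.ProcessAsync(fileStream), r => Console.WriteLine(r.Text));`.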

WithDuration() is not working?

Hi!
I want to transcribe only a specific span of the audio. When I use both WithOffset() and WithDuration(), Whisper often outputs text that extends beyond the configured duration. Is the problem on my side?

Debug a code made in maui for osx (maccatalyst-arm64)

I can't get it to work in a project made with MAUI for macOS (net7.0-maccatalyst/maccatalyst-arm64).

I downloaded the specific libs for osx-arm64 and maccatalyst, but the code:
await foreach (var segment in processor.ProcessAsync(fileStream, CancellationToken.None))

It doesn't enter the foreach and doesn't give an error. Can someone help me?

Thanks.

Using a microphone

Is there a way in the library to use the microphone rather than only transcribing an existing recording?
I ask because the original library, whisper.cpp, has this.

Failed to load native whisper library.

When trying to build the WhisperProcessor I get the following error: 'Failed to load native whisper library.'

I am using Windows 11 Pro on ARM64 with the latest Visual Studio running .NET 7.

I don't know why this error occurs because I took this code from the example on GitHub and the model file is downloaded correctly.
Can you help me out?

Any version for .netframework 4.7.x ?

Great job! Will it be available for .NET Framework 4.7?

thanks!

Controlling the length of the generated text segments

Please ignore this post.

is there a possibility to add CUDA or OpenCL support to whisper.net?

I'd love to use whisper.net with a graphics card. Waiting half an hour every time I run the code gets a tad tedious after some time. Is there a way we could add support for GPU cores and memory?

Anyways, thanks for the port :)

I hope this project can support .net framework 4.7.2

Sometimes we have to use WinForms to develop our software, so it would be good for WinForms if .NET Framework 4.7.2 could be supported.

Repeats Previous Segments

Hello, thanks for porting this to .NET! I was playing around with it last night and found that each time a new segment is generated, the event handler receives all previous segments.

Here's a short example output from my program:

await using var fileStream = File.OpenRead("/home/evan/Downloads/audio/output.wav");
using var processor = WhisperProcessorBuilder.Create()
    .WithSegmentEventHandler((sender, e) => Console.WriteLine("{0} - {1} - {2}", e.Start, e.End, e.Segment))
    .WithFileModel("ggml-base.en.bin")
    .WithThreads(1)
    .WithLanguage("en")
    .Build();

00:00:00 - 00:00:25.8400000 -  CHAPTER I
00:00:00 - 00:00:25.8400000 -  CHAPTER I
00:00:25.8400000 - 00:00:32.1600000 -  The Jacques-Arde bathrobe hanging on his bedpost bore the monogram Hotel Ritz Paris.
00:00:00 - 00:00:25.8400000 -  CHAPTER I
00:00:25.8400000 - 00:00:32.1600000 -  The Jacques-Arde bathrobe hanging on his bedpost bore the monogram Hotel Ritz Paris.
00:00:32.1600000 - 00:00:36.4000000 -  Slowly the fog began to lift. Langdon picked up the receiver.
00:00:00 - 00:00:25.8400000 -  CHAPTER I
00:00:25.8400000 - 00:00:32.1600000 -  The Jacques-Arde bathrobe hanging on his bedpost bore the monogram Hotel Ritz Paris.
00:00:32.1600000 - 00:00:36.4000000 -  Slowly the fog began to lift. Langdon picked up the receiver.
00:00:36.4000000 - 00:00:37.4000000 -  "Hello?"
00:00:00 - 00:00:25.8400000 -  CHAPTER I
00:00:25.8400000 - 00:00:32.1600000 -  The Jacques-Arde bathrobe hanging on his bedpost bore the monogram Hotel Ritz Paris.
00:00:32.1600000 - 00:00:36.4000000 -  Slowly the fog began to lift. Langdon picked up the receiver.
00:00:36.4000000 - 00:00:37.4000000 -  "Hello?"
00:00:37.4000000 - 00:00:43.1600000 -  "Mr. Langdon?" a man's voice said. "I hope I have not awoken you."

Each time we get another segment, we also receive all previous segments... Is this by design?
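
Until the behavior is clarified, a caller-side filter could hide the repeats. The sketch below is only a workaround idea, not the library's intended usage: it assumes segments arrive in order and that `e.End` is a `TimeSpan`, as the printed output suggests. The class name `SegmentDeduplicator` is hypothetical.

```csharp
using System;

public sealed class SegmentDeduplicator
{
    // End time of the latest segment that was reported as new.
    private TimeSpan _lastEnd = TimeSpan.MinValue;

    // Returns true only for a segment that ends later than every
    // segment seen so far; replays of earlier segments return false.
    public bool IsNew(TimeSpan end)
    {
        if (end <= _lastEnd)
        {
            return false;
        }

        _lastEnd = end;
        return true;
    }
}
```

Inside the handler this would look like `if (dedup.IsNew(e.End)) Console.WriteLine(...)`. Note that two distinct segments sharing the same end time would be collapsed by this filter.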

Using WhisperProcessor.ProcessAsync more than once

Thanks for this great project!
I am attempting to use a WhisperProcessor instance more than once, calling ProcessAsync serially. However, after the first successful recognition, it only sometimes returns SegmentDatas. It appears that OnSegmentHandler is not being called while in whisper_full_with_state.

To work around this, I save the builder and then Build() before every ProcessAsync. Everything gets recognized successfully.

The program has exited with code -1073741795 (0xc000001d) 'Illegal Instruction'.

Hi,
I got this error message, which indicates that a program has attempted to execute an invalid or unsupported CPU instruction. My CPU is Intel(R) Core(TM) i5-2430M CPU @ 2.40GHz , X64-based processor.

Any idea?

The code

void FullDetection()
{
    var processor = WhisperProcessorBuilder.Create()
        .WithSegmentEventHandler(OnNewSegment)
        .WithFileModel(modelName)
        .WithTranslate()
        .WithLanguage("auto")
        .Build();

    void OnNewSegment(object sender, OnSegmentEventArgs e)
    {
        textBox1.Text= ($"CSSS {e.Start} ==> {e.End} : {e.Segment}");
    }

    lock (new object())
    {
        using (var fileStream = File.OpenRead(filename))
        {
            processor.Process(fileStream);
        }
    }
}
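
One possible explanation, offered as an assumption rather than a diagnosis, is that the native binary was compiled with instruction sets this CPU lacks. As far as I know, the i5-2430M is a Sandy Bridge part that supports AVX but not AVX2 or FMA. The following sketch reports what the running CPU offers, using the `System.Runtime.Intrinsics.X86` types available on .NET Core 3.0 and later (not on .NET Framework). The class name is hypothetical.

```csharp
using System;
using System.Runtime.Intrinsics.X86;

public static class CpuFeatureReport
{
    // Summarizes the SIMD features the current process can use.
    // On non-x86 hardware every flag simply reports False.
    public static string Describe()
    {
        return $"AVX={Avx.IsSupported}, AVX2={Avx2.IsSupported}, FMA={Fma.IsSupported}";
    }
}
```

Printing `CpuFeatureReport.Describe()` before building the processor would show whether AVX2 is missing on the affected machine.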

AccessViolation when try to cancel ProcessAsync()

Whisper.Net version: 1.2.2
Environment: win10-x64
.NET version: Framework 4.7.2
Model: Small.bin
wav file language: Japanese

Use Whisper.net.Demo sample code, in Program.cs, pass a 1 minute cancellation token to ProcessAsync():

    var cts = new CancellationTokenSource(TimeSpan.FromMinutes(1));
    await foreach (var segment in processor.ProcessAsync(fileStream, cts.Token))
    {
        Console.WriteLine($"New Segment: {segment.Start} ==> {segment.End} : {segment.Text}");
    }

After 1 minute, the demo crashed with an AccessViolationException:
Exception thrown at 0x00007FFB245934AE (whisper.dll) in Whisper.net.Demo.exe: 0xC0000005: Access violation reading location 0x000001D0863A1D90.

How to handle real-time sound streams

thank u

NuGet library is not able to load the whisper library on Android

The NativeLibraryLoader uses conditional preprocessor directives (#if ANDROID) which are not being applied so the existing NuGet ends up trying to load the Linux arm64 library and fails. Might be wrong but I think looking for the platform at runtime would be the correct way of supporting the various platforms. Cheers!
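
The runtime check suggested above could be sketched as follows. This is an illustration only, not the library's loader: the helper name and the returned identifier strings are assumptions, and Android is tested first so the result does not depend on whether the runtime also reports Android as Linux.

```csharp
using System;
using System.Runtime.InteropServices;

public static class RuntimePlatform
{
    // Hypothetical helper: builds an "os-arch" string at run time
    // instead of relying on compile-time symbols such as ANDROID.
    public static string GetRuntimeIdentifier()
    {
        string os;
        if (RuntimeInformation.IsOSPlatform(OSPlatform.Create("ANDROID")))
        {
            os = "android";
        }
        else if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            os = "win";
        }
        else if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
        {
            os = "osx";
        }
        else if (RuntimeInformation.IsOSPlatform(OSPlatform.Linux))
        {
            os = "linux";
        }
        else
        {
            throw new PlatformNotSupportedException("Unrecognized operating system.");
        }

        // The native library must match the process, not the machine.
        string arch = RuntimeInformation.ProcessArchitecture.ToString().ToLowerInvariant();
        return os + "-" + arch;
    }
}
```

A loader could then look under a folder named after the returned value, for example `runtimes/<identifier>/`, to pick the matching native binary.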

Invalid wave file header when using Win11 Recorder

I was able to get the demo code running using the Kennedy.wav file. But when I recorded a file using the Windows 11 Recorder it said the wave file header was invalid.

Whisper.net.Wave.CorruptedWaveException: 'Invalid wave file header.'

Windows 11 Sound Recorder can generate Wav files of various qualities.

I took your suggestion from Issue #33 and wrote out the headers for each quality level.

Kennedy.wav: RIFF?¶WAVEfmt
Auto.wav: RIFF??☻WAVEJUNK
Medium.wav: RIFFJ?☺WAVEJUNK
Best.wav: RIFFB?WAVEJUNK
High.wav: RIFFv?♠WAVEJUNK

I would have expected these files to be valid. Is there something I'm missing?
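
The dumps show a JUNK chunk directly after WAVE, where Kennedy.wav has fmt. The RIFF container allows extra chunks before fmt, so a parser that expects fmt at a fixed offset would reject these files. Whether that is what the Whisper.net parser did at the time is an assumption. The sketch below lists the chunks of a WAV stream so the layout can be inspected; the class name is hypothetical and it uses only the .NET base class library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class RiffChunkLister
{
    // Returns the id and declared size of every top-level chunk
    // that follows the 12-byte "RIFF....WAVE" header.
    public static List<(string Id, uint Size)> ListChunks(Stream stream)
    {
        var chunks = new List<(string Id, uint Size)>();
        using var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true);

        string riff = Encoding.ASCII.GetString(reader.ReadBytes(4));
        reader.ReadUInt32(); // overall RIFF size, not needed here
        string wave = Encoding.ASCII.GetString(reader.ReadBytes(4));
        if (riff != "RIFF" || wave != "WAVE")
        {
            throw new InvalidDataException("Not a RIFF/WAVE stream.");
        }

        // Each chunk is: 4-byte id, 4-byte little-endian size, payload,
        // plus one pad byte when the size is odd.
        while (stream.Position + 8 <= stream.Length)
        {
            string id = Encoding.ASCII.GetString(reader.ReadBytes(4));
            uint size = reader.ReadUInt32();
            chunks.Add((id, size));

            long next = stream.Position + size + (size & 1);
            if (next > stream.Length)
            {
                break; // declared size runs past the end of the stream
            }

            stream.Position = next;
        }

        return chunks;
    }
}
```

Running this on the recorder files should list JUNK first, followed by fmt and data, which would confirm that the files are valid but laid out differently from Kennedy.wav.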

Process terminated. A callback was made on a garbage collected delegate of type 'Whisper.net!Whisper.net.Native.WhisperNewSegmentCallback::Invoke'

Process terminated. A callback was made on a garbage collected delegate of type 'Whisper.net!Whisper.net.Native.WhisperNewSegmentCallback::Invoke'.
Repeat 2 times:

at Whisper.net.Native.NativeMethods.whisper_full(IntPtr, Whisper.net.Native.WhisperFullParams, IntPtr, Int32)

at Whisper.net.WhisperProcessor.Process(System.IO.Stream)
at WhisperAI.AudioProcessor+d__1.MoveNext()
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.__Canon ByRef)
at WhisperAI.AudioProcessor.ProcessAudio(System.String)
at Program+<<Main>$>d__0.MoveNext()
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.__Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.__Canon ByRef)
at Program.<Main>$(System.String[])
at Program.<Main>(System.String[])

This error happens on big, long .WAV files (200 MB, 40 minutes) and occurs seemingly at random.
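
This message generally means native code called a function pointer whose managed delegate had already been collected. The general remedy is to keep the delegate referenced for as long as native code may call it. The sketch below shows that pattern with a `GCHandle`; the delegate signature and the class name are hypothetical and do not reproduce Whisper.net's internal callback type.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class CallbackHolder : IDisposable
{
    // Hypothetical callback shape, for illustration only.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate void NewSegmentCallback(IntPtr context, IntPtr userData);

    // The handle roots the delegate so the GC cannot collect it
    // while native code still holds the function pointer.
    private GCHandle _handle;

    public IntPtr FunctionPointer { get; }

    public CallbackHolder(NewSegmentCallback callback)
    {
        _handle = GCHandle.Alloc(callback);
        FunctionPointer = Marshal.GetFunctionPointerForDelegate(callback);
    }

    // Call only after native code can no longer invoke the callback.
    public void Dispose()
    {
        if (_handle.IsAllocated)
        {
            _handle.Free();
        }
    }
}
```

The holder would be created before the native call and disposed after it returns, so the callback stays valid however long the processing of a large file takes.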

how to retrieve diarization and how to use prompt?

Is there no interface for this in the binding?

GetAvgSamplesAsync reads beyond data chunk

The GetAvgSamples() method reads the number of samples from the stream based on the value in SamplesCount. However the async version of that method reads until the end of the stream without considering SamplesCount. If there are other chunks after the data chunk this leads to an out of bounds exception.
A workaround for now is processor.ProcessAsync(new WaveParser(fileStream).GetAvgSamples()), which reads synchronously but still processes asynchronously.
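
The fix described above amounts to stopping the asynchronous read at the declared chunk size instead of at the end of the stream. A minimal sketch of such a bounded read follows. It is not the library's code; the names `BoundedChunkReader` and `ReadChunkAsync` are hypothetical.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedChunkReader
{
    // Reads at most chunkSize bytes from the current position and
    // leaves any following chunks (for example LIST) untouched.
    public static async Task<byte[]> ReadChunkAsync(
        Stream stream, int chunkSize, CancellationToken cancellationToken = default)
    {
        var buffer = new byte[chunkSize];
        int total = 0;
        while (total < buffer.Length)
        {
            int read = await stream.ReadAsync(buffer, total, buffer.Length - total, cancellationToken);
            if (read == 0)
            {
                break; // stream ended before the declared size was reached
            }

            total += read;
        }

        if (total == buffer.Length)
        {
            return buffer;
        }

        // Return only the bytes that were actually available.
        var trimmed = new byte[total];
        Array.Copy(buffer, trimmed, total);
        return trimmed;
    }
}
```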

Explanation of FluentAPI settings

Hello!

Is there any information which "With~" in the fluent api corresponds to which settings/flags in whisper.cpp?
I'm mostly interested in -ml flag, which allows for limiting output length per line.

It looks like WithMaxSegmentLength() should work the same way as -ml, but I don't think it does.

Thanks!

[BUG] Getting error "System.Net.Http.HttpRequestException" when trying to use WhisperGgmlDownloader.GetGgmlModelAsync with the large model

When trying to use the WhisperGgmlDownloader.GetGgmlModelAsync method with the large model
using var modelStream = await WhisperGgmlDownloader.GetGgmlModelAsync(GgmlType.Large);
I get the following error:
System.Net.Http.HttpRequestException: Cannot write more bytes to the buffer than the configured maximum buffer size: 2147483647
Pull request for bugfix is out. Changed issue to solved
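
For context, this error typically appears when `HttpClient` buffers a whole response in memory and the body exceeds the 2147483647-byte limit, which the large model does. The sketch below shows the general streaming pattern with `HttpCompletionOption.ResponseHeadersRead`. It is not the code from the pull request, and the helper name and parameters are hypothetical.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class StreamingDownloader
{
    // Copies the response body to disk as it arrives instead of
    // buffering it fully in memory first.
    public static async Task DownloadToFileAsync(
        HttpClient client, string url, string destinationPath,
        CancellationToken cancellationToken = default)
    {
        using var response = await client.GetAsync(
            url, HttpCompletionOption.ResponseHeadersRead, cancellationToken);
        response.EnsureSuccessStatusCode();

        using var source = await response.Content.ReadAsStreamAsync();
        using var target = File.Create(destinationPath);
        await source.CopyToAsync(target, 81920, cancellationToken);
    }
}
```

Because only an 80 KB copy buffer is held at a time, memory use stays flat regardless of the model size.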

Error Message of "Invalid wave file RIFF header" with various valid wav files

I copied and pasted the demo code into the file whisper.cs and installed the packages from NuGet. The only changes I made were the model (base to large) and the file name inside Default="" in the 'f' option. Apart from that, the code is exactly the same as the demo code in this repo.
The wav files are in the project folder and registered by whisper.net.

Unfortunately, I get the following error message every time, no matter which wav file I try:

   at Whisper.net.Wave.WaveParser.InitializeAsync()
   at Whisper.net.Wave.WaveParser.GetAvgSamplesAsync(CancellationToken cancellationToken)
   at Whisper.net.WhisperProcessor.ProcessAsync(Stream waveStream, CancellationToken cancellationToken)+MoveNext()
   at Whisper.net.WhisperProcessor.ProcessAsync(Stream waveStream, CancellationToken cancellationToken)+System.Threading.Tasks.Sources.IValueTaskSource<System.Boolean>.GetResult()
   at Program.<<Main>$>g__FullDetection|0_2(Options opt) in C:\Users\huddeij\RiderProjects\whisperTest\whisper2.cs:line 80
   at Program.<<Main>$>g__FullDetection|0_2(Options opt) in C:\Users\huddeij\RiderProjects\whisperTest\whisper2.cs:line 80
   at Program.<<Main>$>g__Demo|0_0(Options opt) in C:\Users\huddeij\RiderProjects\whisperTest\whisper2.cs:line 33
   at CommandLine.ParserResultExtensions.WithParsedAsync[T](ParserResult`1 result, Func`2 action)
   at Program.<Main>$(String[] args) in C:\Users\huddeij\RiderProjects\whisperTest\whisper2.cs:line 13
   at Program.<Main>(String[] args)

The output before the error message:

whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1280
whisper_model_load: n_audio_head  = 20
whisper_model_load: n_audio_layer = 32
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1280
whisper_model_load: n_text_head = 20
whisper_model_load: n_text_layer = 32
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: type = 5
whisper_model_load: mem required = 3557.00 MB (+ 71.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx = 2950.97 MB
whisper_model_load: model size = 2950.66 MB

I tried the sample wav files from this repo and audio recordings converted into wav via cloudconvert and ffmpeg.

Environment:
MS Windows 11 Pro 22H2
.Net v7.0.203
Jetbrains Rider 2023.1.1

What am I doing wrong here?

Failed to load native whisper library

I can't get it to work; I keep getting the error "Failed to load native whisper library". I'm not sure if I'm supposed to do something other than adding the packages to the project, downloading the model and creating the processor. I'm doing exactly what is done in the Simple example.
The error appears at the following line:
using var whisperFactory = WhisperFactory.FromPath("ggml-base.bin");

I'm integrating into a .NET 5 application running on a linux-x64 machine.
Should I manually run the whisper library or add it somewhere? I'm not sure I understand the process at all.

    // This section detects whether the "ggml-base.bin" file exists in our project disk. If it doesn't, it downloads it from the internet
    if (!System.IO.File.Exists(modelFileName))
    {
      await DownloadModel(modelFileName, ggmlType);
    }

    // This section creates the whisperFactory object which is used to create the processor object.
    using var whisperFactory = WhisperFactory.FromPath("ggml-base.bin");

    // This section creates the processor object which is used to process the audio file, it uses language `auto` to detect the language of the audio file.
    using var processor = whisperFactory.CreateBuilder()
        .WithLanguage("auto")
        .Build();

Catching the native call exception

If you give WhisperFactory.FromPath an invalid path, or a path that does not exist yet, the program does not fail until NativeMethods.whisper_init_state is called. It then fails hard by throwing a non-recoverable AccessViolationException. If validation is the right approach, could you add it to the Whisper factory to prevent this?
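
Until the factory validates its input, a caller-side guard can turn the hard crash into an ordinary exception. The following is a small sketch of such a check. The helper `ModelPathGuard.EnsureExists` is hypothetical and is not part of Whisper.net.

```csharp
using System.IO;

public static class ModelPathGuard
{
    // Throws a catchable FileNotFoundException before any native
    // code is reached; returns the absolute path when the file exists.
    public static string EnsureExists(string modelPath)
    {
        string fullPath = Path.GetFullPath(modelPath);
        if (!File.Exists(fullPath))
        {
            throw new FileNotFoundException("Whisper model file not found.", fullPath);
        }

        return fullPath;
    }
}
```

The returned path would then be passed on, for example `WhisperFactory.FromPath(ModelPathGuard.EnsureExists("ggml-base.bin"))`. This only checks that the file exists, not that it is a valid model.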

netstandard2.0 dll is not installed

I am running this project in Visual Studio 2022. All DLLs are installed except netstandard2.0. I installed .NET Core 2 and the Desktop development with C++ workload, but the error remains.

Issue with Wave Memorystream obtained through ffmpegcore

I seem to have run into an issue with a MemoryStream created by letting FFMpegCore download a video and strip the audio from it.
As soon as execution reaches the ProcessAsync call, the application uses almost 10 GB of memory. Calling Process instead of the async variant leads to "unable to read beyond the end of the stream".

I debugged part of it already and saw that dataChunkSize is massive compared to the MemoryStream's length (888910) (https://github.com/sandrohanea/whisper.net/blob/main/Whisper.net/Wave/WaveParser.cs#L356).
When I hardcoded dataChunkSize to the stream length, it read the stream fine and gave me the expected output.

I wondered if you could tell me what the issue might be: either the settings used for the wave (an unsupported codec or something else), or something going wrong when reading the created MemoryStream. I added an example project to this post. (You might need to get the required ffmpeg binaries from https://ffbinaries.com/downloads)

WhisperIssueExample project
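
A possible cause, stated as an assumption because the post does not give the actual value, is that the WAV was written to a non-seekable output. In that case the writer cannot go back and patch the header, so the data size may be left as a placeholder such as 0xFFFFFFFF. A defensive reader can clamp the declared size to what the stream actually holds, as sketched below. The helper name is hypothetical.

```csharp
using System;

public static class WaveDataSize
{
    // Returns the number of data bytes that can really be read:
    // the declared size, limited to the bytes left in the stream.
    public static long Effective(uint declaredSize, long dataStart, long streamLength)
    {
        long available = Math.Max(0, streamLength - dataStart);
        return Math.Min(declaredSize, available);
    }
}
```

For a stream of 888910 bytes whose data starts at offset 44, `WaveDataSize.Effective(0xFFFFFFFF, 44, 888910)` yields 888866, which matches the effect of hardcoding the size to the stream length.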

Identifying two speakers

Is there any built-in way to identify two speakers?
Thank you

How do you do it faster?

I'm looking for ways to make the transcription faster.
This library is excellent and I really enjoy it, but the transcription takes me a long time.
I have a powerful CPU
and a powerful GPU,
and still it's very slow. It doesn't use anywhere near all the CPU it could, and certainly not the GPU.
Is there a way I can make it work faster?

Most of the usage is in the CPU and not in the GPU

I tested the new version, and most of the usage is on the CPU; only occasionally does it use the GPU for a moment. I don't know why not all the work runs on the GPU, which would be much faster.
You can also look here.
If you use what he did, it runs only on the GPU and works very fast:
https://github.com/Const-me/Whisper
Thanks for everything. It definitely improved performance, but not as much as I expected.

Unable to find an entry point named 'whisper_init_from_file_no_state' in DLL 'whisper'

I get an error when initializing WhisperFactory:

System.EntryPointNotFoundException: Unable to find an entry point named 'whisper_init_from_file_no_state' in DLL 'whisper'.
   at Whisper.net.Native.NativeMethods.whisper_init_from_file_no_state(String path)
   at Whisper.net.Internals.ModelLoader.WhisperProcessorModelFileLoader.LoadNativeContext()
   at Whisper.net.WhisperFactory..ctor(IWhisperProcessorModelLoader loader, Boolean delayInit, String libraryPath, Boolean bypassLoading)
   at Whisper.net.WhisperFactory.FromPath(String path, Boolean delayInitialization, String libraryPath, Boolean bypassLoading)

Tested with both version 1.4.3 and 1.4.2 on Windows Server 2022 x64 AMD EPYC.

            using (var whisperFactory = WhisperFactory.FromPath(modelPath)) // this gives exception
            {
                 // ...
            }

Win-x64 library issue

Hello

In both 1.4 (and the newly released 1.4.2) I'm getting the error
System.Exception : Failed to load native whisper library. Error: The specified module could not be found.

I've dug into the Whisper.net.Runtime package and it looks like the win-x64/whisper.dll is not supported. The Rider assembly explorer is flagging it as a Win32 resource - perhaps the pipeline built the wrong file?

Suggestions on how to fix?

Many thanks

Working with MP3s

How can we use the library with MP3 files? At the moment when working with MP3, the error "Invalid wave file RIFF header" is thrown. The original Whisper supports MP3 files.

Memory required with model medium and large

I downloaded the Medium and Large models from https://ggml.ggerganov.com/
After running:
whisper_init_from_file: loading model from 'ggml-model-whisper-medium.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 1024
whisper_model_load: n_audio_head = 16
whisper_model_load: n_audio_layer = 24
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 1024
whisper_model_load: n_text_head = 16
whisper_model_load: n_text_layer = 24
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 2
whisper_model_load: type = 4
whisper_model_load: mem required = 1720.00 MB (+ 43.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx = 1462.35 MB

How to convert whisper model to GGML

Is there a way to do this in C#?

Unable to find an entry point named 'whisper_full_default_params_by_ref' in shared library 'whisper'.

Hi y'all, thank you for working on this .NET implementation for Whisper!

I'm trying to run the "Simple" example from the repo but run into issues on macOS Ventura (ARM, M1 Pro).
It appears to find the native library but can't call it correctly.

Older Whisper.net versions (1.4.4, 1.4.3) are exhibiting the same behavior.

whisper.net/examples/Simple on  main [✘] via .NET 7.0.101 
➜ dotnet run --framework net6.0
Downloading Model ggml-base.bin
whisper_init_from_file_no_state: loading model from 'ggml-base.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 2
whisper_model_load: mem required  =  310.00 MB (+    6.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     =  140.66 MB
whisper_model_load: model size    =  140.54 MB
Unhandled exception. System.EntryPointNotFoundException: Unable to find an entry point named 'whisper_full_default_params_by_ref' in shared library 'whisper'.
   at Whisper.net.Native.NativeMethods.whisper_full_default_params_by_ref(WhisperSamplingStrategy strategy)
   at Whisper.net.WhisperProcessor.GetWhisperParams()
   at Whisper.net.WhisperProcessor..ctor(WhisperProcessorOptions options)
   at Whisper.net.WhisperProcessorBuilder.Build()
   at Program.Main(String[] args) in /Users/philippbauer/Learning/whisper.net/examples/Simple/Program.cs:line 29
   at Program.<Main>(String[] args)

Always english

When using .WithLanguageDetection() or .WithLanguage("auto"), the detected language is always English (auto-detected language: en (p = 0.515557)), but if you specify the correct language then everything is fine.

Other "port" and "bindings" libraries work fine and detect the language correctly

Only 16KHz sample rate is supported.

Whisper.net.Wave.NotSupportedWaveException
HResult=0x80131500
Message=Only 16KHz sample rate is supported.

How can I solve this problem?
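
The exception says the input must be sampled at 16 kHz, so audio recorded at another rate has to be resampled first. The sketch below is a deliberately naive linear-interpolation resampler for mono 16-bit samples, meant only to illustrate the idea. It applies no low-pass filter, so downsampling can introduce aliasing, and a dedicated audio library or ffmpeg would normally be the better choice. The class name is hypothetical.

```csharp
using System;

public static class NaiveResampler
{
    // Converts mono 16-bit PCM samples from sourceRate to targetRate
    // by linear interpolation between neighboring input samples.
    public static short[] Resample(short[] input, int sourceRate, int targetRate = 16000)
    {
        if (input.Length == 0 || sourceRate == targetRate)
        {
            return (short[])input.Clone();
        }

        int outputLength = (int)((long)input.Length * targetRate / sourceRate);
        var output = new short[outputLength];
        double step = (double)sourceRate / targetRate;

        for (int i = 0; i < outputLength; i++)
        {
            double position = i * step;
            int left = Math.Min((int)position, input.Length - 1);
            int right = Math.Min(left + 1, input.Length - 1);
            double fraction = position - left;

            // Blend the two nearest input samples.
            output[i] = (short)Math.Round(input[left] + (input[right] - input[left]) * fraction);
        }

        return output;
    }
}
```

The resampled samples would still need to be written back into a WAV container with a 16000 Hz header before being handed to the processor.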

Do you have any synchronous examples?

Hello, the code examples are all asynchronous. Do you have any synchronous examples?

System.Exception: 'Failed to load native whisper library.'

Hi,

I have been trying to run this code, but I keep getting this error message:
System.Exception: 'Failed to load native whisper library.'

using System;
using System.IO;
using System.Threading.Tasks;
using System.Windows.Forms;
using CommandLine;
using Whisper.net;
using Whisper.net.Ggml;
using Whisper.net.Wave;

namespace NameSpace
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private async void button1_Click(object sender, EventArgs e)
        {
            await Parser.Default.ParseArguments<object>(new string[] { })
                .WithParsedAsync(this.Demo);
        }

        string modelName = "ggml-base.bin";
        string filename = "1min.wav";

        async Task Demo(object opt)
        {
            if (!File.Exists(modelName))
            {
                Console.WriteLine($"Downloading Model ggml-base.bin");
                var modelStream = await WhisperGgmlDownloader.GetGgmlModelAsync(GgmlType.BaseEn);
                var fileWriter = File.OpenWrite(modelName);
                await modelStream.CopyToAsync(fileWriter);
            }

            FullDetection();
        }

        void FullDetection()
        {
            var processor = WhisperProcessorBuilder.Create()
                .WithSegmentEventHandler(OnNewSegment)
                .WithFileModel(modelName)
                .WithTranslate()
                .WithLanguage("auto")
                .Build();

            void OnNewSegment(object sender, OnSegmentEventArgs e)
            {
                Console.WriteLine($"CSSS {e.Start} ==> {e.End} : {e.Segment}");
            }

            var fileStream = File.OpenRead(filename);
            processor.Process(fileStream);
        }
    }
}

Improve Model Downloader

Improve model downloader with new HuggingFace link, to include quantized models and CoreML models.

.WithMaxTokensPerSegment(1) returns only one segment

If you specify .WithMaxTokensPerSegment(1), there will be only one segment in the output. Everything is fine in the whisper.cpp library.

API doesn't permit loading native .dylib from non standard location

Hi,

Thanks for an awesome library. When looking to incorporate Whisper.net into a product, we would need the ability to load the native .dylib from a location other than where the NativeLibraryLoader is currently trying to find the binary (under runtimes/...).

Would it be possible to make an extension to the API to allow the library user to specify search paths manually? We'd much prefer this over having to fork the library just for this purpose.

try
{
    await foreach (var segment in processor.ProcessAsync(decodedFileStream, ctx))
        yield return segment;
}
finally
{
    processor.Dispose();
}

// CPU Usage is still 100% here

I can see 100% CPU usage after ProcessAsync throws.

If I start to process another file before the CPU usage drops to zero, it pretty much crawls to a halt for minutes until the original instance terminates.
