This project forked from mybigday/whisper.rn

whisper.rn

React Native binding of whisper.cpp.

whisper.cpp: High-performance inference of OpenAI's Whisper automatic speech recognition (ASR) model

Run example with release mode on iPhone 13 Pro Max

Installation

npm install whisper.rn

Then run npx pod-install for iOS.

For Expo, you will need to prebuild the project before using it. See Expo guide for more details.

Add Microphone Permissions (Optional)

If you want to use realtime transcription, you need to add the microphone permission to your app.

iOS

Add these lines to ios/[YOUR_APP_NAME]/Info.plist

<key>NSMicrophoneUsageDescription</key>
<string>This app requires microphone access in order to transcribe speech</string>

For tvOS, please note that the microphone is not supported.

Android

Add the following line to android/app/src/main/AndroidManifest.xml

<uses-permission android:name="android.permission.RECORD_AUDIO" />

Usage

import { initWhisper } from 'whisper.rn'

const filePath = 'file://.../ggml.base.en.bin'
const sampleFilePath = 'file://.../sample.wav'

const whisperContext = await initWhisper({ filePath })

const options = { language: 'en' }
const { stop, promise } = whisperContext.transcribe(sampleFilePath, options)

const { result } = await promise
// result: (The inference text result from audio file)
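
Because transcribe returns both a stop function and a promise, a long-running transcription can be given a time limit. The following is a minimal sketch, not part of whisper.rn: TranscribeHandle, delay and transcribeWithTimeout are hypothetical names, the handle shape is assumed from the snippet above, and what the promise does after stop is called is not specified here.

```typescript
// Assumed shape of the object returned by whisperContext.transcribe(...),
// inferred from the usage snippet above (not taken from the library's typings).
interface TranscribeHandle {
  stop: () => void | Promise<void>
  promise: Promise<{ result: string }>
}

// Hypothetical helper: a promise that resolves to null after `ms`, plus a way to cancel it.
function delay(ms: number): { promise: Promise<null>; cancel: () => void } {
  let timer: ReturnType<typeof setTimeout>
  const promise = new Promise<null>((resolve) => {
    timer = setTimeout(() => resolve(null), ms)
  })
  return { promise, cancel: () => clearTimeout(timer) }
}

// Hypothetical helper: resolves with the transcribed text, or with null if the
// time limit is reached first, in which case stop() is called.
async function transcribeWithTimeout(
  handle: TranscribeHandle,
  timeoutMs: number,
): Promise<string | null> {
  const timeout = delay(timeoutMs)
  try {
    const outcome = await Promise.race([handle.promise, timeout.promise])
    if (outcome === null) {
      await handle.stop()
      return null
    }
    return outcome.result
  } finally {
    // Always clear the timer so it does not keep running after we return.
    timeout.cancel()
  }
}
```

With the context from the snippet above, this could be called as transcribeWithTimeout(whisperContext.transcribe(sampleFilePath, options), 30000), where the 30 second limit is an arbitrary choice.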

Use realtime transcription:

const { stop, subscribe } = await whisperContext.transcribeRealtime(options)

subscribe(evt => {
  const { isCapturing, data, processTime, recordingTime } = evt
  console.log(
    `Realtime transcribing: ${isCapturing ? 'ON' : 'OFF'}\n` +
      // The inference text result from audio record:
      `Result: ${data.result}\n\n` + 
      `Process time: ${processTime}ms\n` +
      `Recording time: ${recordingTime}ms`,
  )
  if (!isCapturing) console.log('Finished realtime transcribing')
})
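
When only the final text matters, the subscription can be wrapped in a promise that resolves once capturing ends. This is a minimal sketch under assumptions: RealtimeEvent mirrors the fields destructured in the snippet above, data is treated as possibly absent as a precaution, and waitForFinalResult is a hypothetical helper rather than a whisper.rn export.

```typescript
// Assumed event shape, taken from the fields used in the snippet above.
interface RealtimeEvent {
  isCapturing: boolean
  data?: { result: string } // treated as optional here as a precaution (assumption)
  processTime: number
  recordingTime: number
}

type Subscribe = (callback: (evt: RealtimeEvent) => void) => void

// Hypothetical helper: resolves with the most recent result text
// as soon as an event reports that capturing has stopped.
function waitForFinalResult(subscribe: Subscribe): Promise<string> {
  return new Promise((resolve) => {
    let latest = ''
    subscribe((evt) => {
      if (evt.data) latest = evt.data.result
      if (!evt.isCapturing) resolve(latest)
    })
  })
}
```

Given the subscribe function returned by transcribeRealtime, the final text could then be read with const text = await waitForFinalResult(subscribe).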

On Android, you may need to request the microphone permission at runtime with React Native's PermissionsAndroid.
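
A runtime request on Android could look like the sketch below. It assumes React Native's PermissionsAndroid module and passes it in as a parameter so that the helper does not import react-native directly. PermissionsApi is a hypothetical interface covering only the members used here, and ensureMicrophonePermission and the dialog texts are illustrative.

```typescript
// Hypothetical subset of React Native's PermissionsAndroid used by this sketch.
interface PermissionsApi {
  PERMISSIONS: { RECORD_AUDIO: string }
  RESULTS: { GRANTED: string }
  request(
    permission: string,
    rationale?: { title: string; message: string; buttonPositive: string },
  ): Promise<string>
}

// Hypothetical helper: returns true when recording may proceed.
// On non-Android platforms no runtime request is made here.
async function ensureMicrophonePermission(
  platformOS: string,
  permissions: PermissionsApi,
): Promise<boolean> {
  if (platformOS !== 'android') return true
  const status = await permissions.request(permissions.PERMISSIONS.RECORD_AUDIO, {
    title: 'Microphone permission', // illustrative wording
    message: 'This app requires microphone access in order to transcribe speech',
    buttonPositive: 'OK',
  })
  return status === permissions.RESULTS.GRANTED
}

// In an app (assumption): ensureMicrophonePermission(Platform.OS, PermissionsAndroid)
// with Platform and PermissionsAndroid imported from 'react-native'.
```

Calling this before transcribeRealtime and skipping the recording when it returns false keeps the permission handling in one place.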

The documentation is not ready yet. For now, please see the comments in the index file for more details.

Run with example

The example app uses react-native-fs to download the model file and audio file.

Model: base.en in https://huggingface.co/datasets/ggerganov/whisper.cpp
Sample file: jfk.wav in https://github.com/ggerganov/whisper.cpp/tree/master/samples
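
The download step can be sketched as follows. This is not the example app's code: FileSystemApi is a hypothetical interface covering only the exists and downloadFile calls of react-native-fs, ensureDownloaded is an illustrative helper, and the source URL and target path are left as parameters for the caller to supply.

```typescript
// Hypothetical subset of react-native-fs used by this sketch.
interface FileSystemApi {
  exists(path: string): Promise<boolean>
  downloadFile(options: { fromUrl: string; toFile: string }): {
    promise: Promise<{ statusCode: number }>
  }
}

// Hypothetical helper: downloads `fromUrl` to `toFile` unless the file already
// exists, and returns the local path either way.
async function ensureDownloaded(
  fs: FileSystemApi,
  fromUrl: string,
  toFile: string,
): Promise<string> {
  if (await fs.exists(toFile)) return toFile
  const { statusCode } = await fs.downloadFile({ fromUrl, toFile }).promise
  if (statusCode < 200 || statusCode >= 300) {
    throw new Error(`Download failed with HTTP status ${statusCode}`)
  }
  return toFile
}
```

In an app, the react-native-fs module itself would be passed as fs, and the returned path could then be used to build the filePath given to initWhisper.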

To test transcription with better performance, you can run the app in Release mode.

  • iOS: yarn example ios --configuration Release
  • Android: yarn example android --mode release

Please follow the Development Workflow section of the contributing guide to run the example app.

Mock whisper.rn

We provide a mock version of whisper.rn for testing purposes, which you can use with Jest:

jest.mock('whisper.rn', () => require('whisper.rn/jest/mock'))

Contributing

See the contributing guide to learn how to contribute to the repository and the development workflow.

License

MIT


Made with create-react-native-library


Built and maintained by BRICKS.
