
I have one question about sherpa-onnx (CLOSED)

k2-fsa commented on May 16, 2024
I have one question.

from sherpa-onnx.

Comments (8)

csukuangfj commented on May 16, 2024

@dongqianzhuan

Please use the latest master. It should be fixed in #74.


csukuangfj commented on May 16, 2024

Are you using Python, and which operating system are you using?


csukuangfj commented on May 16, 2024

Also, could you describe in detail which commands you are using?
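
The details asked for above (Python version, operating system, CPU architecture, installed package version) can be gathered in one go. This is only a minimal sketch: `describe_environment` is a hypothetical helper, not part of sherpa-onnx, and it assumes the package was installed under the distribution name `sherpa-onnx`.

```python
import platform
import sys
from importlib import metadata


def describe_environment(package: str = "sherpa-onnx") -> dict:
    """Collect the details typically requested in a bug report."""
    try:
        # Look up the installed version by distribution name (assumed name).
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        version = "not installed"
    return {
        "python": sys.version.split()[0],
        "os": platform.system(),
        "machine": platform.machine(),  # e.g. x86_64 or aarch64
        package: version,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Pasting this output next to the exact command line makes it easier to tell an installation problem from a bug in the library.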


erquren commented on May 16, 2024
#!/usr/bin/env python3

# Real-time speech recognition from a microphone with sherpa-onnx Python API
# with endpoint detection.
#
# Please refer to
# https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html
# to download pre-trained models

import argparse
import sys
from pathlib import Path

try:
    import sounddevice as sd
except ImportError:
    print("Please install sounddevice first. You can use")
    print()
    print("  pip install sounddevice")
    print()
    print("to install it")
    sys.exit(-1)

import sherpa_onnx


def assert_file_exists(filename: str):
    assert Path(
        filename
    ).is_file(), f"{filename} does not exist!\nPlease refer to https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html to download it"


def get_args():
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter
    )

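    # The model paths are required: create_recognizer() checks each of them
    # with assert_file_exists(), which would fail on a missing (None) value.
    parser.add_argument(
        "--tokens",
        type=str,
        required=True,
        help="Path to tokens.txt",
    )

    parser.add_argument(
        "--encoder",
        type=str,
        required=True,
        help="Path to the encoder model",
    )

    parser.add_argument(
        "--decoder",
        type=str,
        required=True,
        help="Path to the decoder model",
    )

    parser.add_argument(
        "--joiner",
        type=str,
        required=True,
        help="Path to the joiner model",
    )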

    parser.add_argument(
        "--wave-filename",
        type=str,
        help="""Path to the wave filename. Must be 16 kHz,
        mono with 16-bit samples""",
    )

    return parser.parse_args()


def create_recognizer():
    args = get_args()
    assert_file_exists(args.encoder)
    assert_file_exists(args.decoder)
    assert_file_exists(args.joiner)
    assert_file_exists(args.tokens)
    # Please replace the model files if needed.
    # See https://k2-fsa.github.io/sherpa/onnx/pretrained_models/index.html
    # for download links.
    recognizer = sherpa_onnx.OnlineRecognizer(
        tokens=args.tokens,
        encoder=args.encoder,
        decoder=args.decoder,
        joiner=args.joiner,
        enable_endpoint_detection=True,
        rule1_min_trailing_silence=2.4,
        rule2_min_trailing_silence=1.2,
        rule3_min_utterance_length=300,  # it essentially disables this rule
    )
    return recognizer


def main():
    print("Started! Please speak")
    recognizer = create_recognizer()
    sample_rate = 16000
    samples_per_read = int(0.1 * sample_rate)  # 0.1 second = 100 ms
    stream = recognizer.create_stream()

    last_result = ""
    segment_id = 0
    frame_num = 0
    with sd.InputStream(channels=1, dtype="float32", samplerate=sample_rate) as s:
        while True:
            frame_num += 1
            # print(f"frame_num: {frame_num}")
            samples, _ = s.read(samples_per_read)  # a blocking read
            samples = samples.reshape(-1)
            stream.accept_waveform(sample_rate, samples)
            while recognizer.is_ready(stream):
                recognizer.decode_stream(stream)

            is_endpoint = recognizer.is_endpoint(stream)

            result = recognizer.get_result(stream)

            if result and (last_result != result):
                last_result = result
                print(f"{segment_id}: {result}")

            if is_endpoint:
                if result:
                    segment_id += 1
                recognizer.reset(stream)


if __name__ == "__main__":
    devices = sd.query_devices()
    print(devices)
    default_input_device_idx = sd.default.device[0]
    print(f'Use default device: {devices[default_input_device_idx]["name"]}')

    try:
        main()
    except KeyboardInterrupt:
        print("\nCaught Ctrl + C. Exiting")
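
The segment bookkeeping in the loop above can be checked without a microphone or a model. The sketch below is an illustration only: `track_segments` is a hypothetical helper that replays `(result, is_endpoint)` pairs, standing in for what `recognizer.get_result(stream)` and `recognizer.is_endpoint(stream)` return on each read, and collects the lines the loop would print.

```python
from typing import Iterable, List, Tuple


def track_segments(frames: Iterable[Tuple[str, bool]]) -> List[str]:
    """Replay (result, is_endpoint) pairs and return the lines the loop prints."""
    printed = []
    last_result = ""
    segment_id = 0
    for result, is_endpoint in frames:
        # Print only when the partial result is non-empty and has changed.
        if result and last_result != result:
            last_result = result
            printed.append(f"{segment_id}: {result}")
        if is_endpoint:
            # Advance the segment counter only if something was recognized.
            if result:
                segment_id += 1
            # The real loop calls recognizer.reset(stream) at this point.
    return printed


if __name__ == "__main__":
    frames = [
        ("", False),
        ("hello", False),
        ("hello world", True),
        ("", False),
        ("hi", True),
    ]
    for line in track_segments(frames):
        print(line)
```

If the real script prints nothing while this replay behaves as expected, the problem is more likely in decoding than in the printing logic.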


erquren commented on May 16, 2024

I am running this ONNX model:

https://huggingface.co/csukuangfj/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tree/main


erquren commented on May 16, 2024

"Are you using Python, and which operating system are you using?"

I am using Python. I installed it with both method 1 and method 3:
https://k2-fsa.github.io/sherpa/onnx/python/install.html#method-1-from-pre-compiled-wheels
https://k2-fsa.github.io/sherpa/onnx/python/install.html#method-3-for-developers

I tried both x86 and ARM devices and see the same behavior on both.


csukuangfj commented on May 16, 2024

@dongqianzhuan

Sorry, this is a bug introduced in #69.

I am fixing it.


erquren commented on May 16, 2024

It works normally now!
Thank you.
