Hi John, I was trying to use it to classify my audio frames into speech and silenc

Error: Error while processing frame (py-webrtcvad, closed)

great-thoughts commented on May 20, 2024

Error: Error while processing frame

Comments (2)

wiseman commented on May 20, 2024

The underlying webrtc code can only handle frames that are 10, 20, or 30 ms long. You can use webrtcvad.valid_rate_and_frame_length to check whether a sample rate/frame size is valid (see e.g. https://github.com/wiseman/py-webrtcvad/blob/master/test_webrtcvad.py#L20).
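To illustrate the frame-length constraint above, here is a minimal sketch of cutting 16-bit mono PCM into equal 30 ms frames and dropping any shorter chunk left at the end, since a chunk that is not 10, 20, or 30 ms long would be rejected. The helper names `frame_length_in_samples` and `split_into_frames` are hypothetical and not part of py-webrtcvad.

```python
# Sketch only: these helper names are hypothetical, not part of py-webrtcvad.

def frame_length_in_samples(sample_rate, frame_ms):
    """Number of samples in one frame lasting frame_ms milliseconds."""
    return sample_rate * frame_ms // 1000


def split_into_frames(pcm_bytes, sample_rate, frame_ms=30):
    """Yield equal-sized frames of 16-bit mono PCM.

    A trailing chunk shorter than one full frame is dropped, not yielded.
    """
    # Each int16 sample occupies 2 bytes.
    bytes_per_frame = frame_length_in_samples(sample_rate, frame_ms) * 2
    last_start = len(pcm_bytes) - bytes_per_frame
    for offset in range(0, last_start + 1, bytes_per_frame):
        yield pcm_bytes[offset:offset + bytes_per_frame]


# One second of silence at 16 kHz plus 100 stray samples at the end.
pcm = b"\x00\x00" * (16000 + 100)
frames = list(split_into_frames(pcm, sample_rate=16000, frame_ms=30))
print(len(frames), {len(f) for f in frames})  # 33 {960}
```

Each yielded frame holds 480 samples, the value that `webrtcvad.valid_rate_and_frame_length` would be asked to check for a 16000 Hz sample rate, and can then be passed to `vad.is_speech` together with that sample rate.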

Prakash2608 commented on May 20, 2024

import webrtcvad
import soundfile as sf
import numpy as np
import librosa

def extract_speech_segments(audio_path, output_path):
    # Load the audio file
    audio, sr = librosa.load(audio_path, sr=16000)

    # Set the VAD parameters
    vad = webrtcvad.Vad()
    vad.set_mode(3)  # Aggressiveness level (0-3)

    # Set the frame duration for VAD analysis
    frame_duration = 30  # in milliseconds

    # Convert the frame duration to the number of samples
    frame_size = int(sr * (frame_duration / 1000.0))

    # Initialize variables
    speech_segments = []
    current_segment_start = 0
    current_segment_end = 0

    # Iterate over the audio frames
    for i in range(0, len(audio), frame_size):
        frame = audio[i:i + frame_size]

        # Convert the frame to int16 format
        frame = np.int16(frame * 32768)

        # Check if the frame contains speech
        if vad.is_speech(frame.tobytes(), sample_rate=sr):
            # If it's a new speech segment, update the current segment start
            if current_segment_start == 0:
                current_segment_start = i

            # Update the current segment end
            current_segment_end = i + frame_size

        # If the frame does not contain speech
        else:
            # If we were in a speech segment, add it to the list
            if current_segment_start != 0:
                speech_segments.append((current_segment_start, current_segment_end))
                current_segment_start = 0
                current_segment_end = 0

    # Save the speech segments as individual audio files
    for idx, (start, end) in enumerate(speech_segments):
        segment_audio = audio[start:end]
        segment_output_path = f"{output_path}_segment{idx}.wav"
        sf.write(segment_output_path, segment_audio, sr)

    return speech_segments

# Example usage
audio_path = '/kaggle/working/audio.wav'
output_path = '/kaggle/working/'
speech_segments = extract_speech_segments(audio_path, output_path)

This is my code.

Error Traceback (most recent call last)
Cell In[10], line 60
58 audio_path = '/kaggle/working/audio.wav'
59 output_path = '/kaggle/working/'
---> 60 speech_segments = extract_speech_segments(audio_path, output_path)

Cell In[10], line 33, in extract_speech_segments(audio_path, output_path)
30 frame = np.int16(frame * 32768)
32 # Check if the frame contains speech
---> 33 if vad.is_speech(frame.tobytes(), sample_rate=sr):
34 # If it's a new speech segment, update the current segment start
35 if current_segment_start == 0:
36 current_segment_start = i

File /opt/conda/lib/python3.10/site-packages/webrtcvad.py:27, in Vad.is_speech(self, buf, sample_rate, length)
23 if length * 2 > len(buf):
24 raise IndexError(
25 'buffer has %s frames, but length argument was %s' % (
26 int(len(buf) / 2.0), length))
---> 27 return _webrtcvad.process(self._vad, sample_rate, buf, length)

Error: Error while processing frame

And I am getting this error. I have also checked the prerequisites.
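The traceback does not show which frame failed, but one plausible cause is the final slice: `range(0, len(audio), frame_size)` also produces a last frame shorter than 30 ms whenever the audio length is not a multiple of `frame_size`. The posted loop also uses `0` to mean "not in a segment", so a segment starting at sample 0 is not tracked, and a segment still open when the audio ends is never appended. The sketch below shows one way to restructure the loop under those assumptions. The function `find_speech_segments` and its `is_speech` callback are hypothetical; in practice the callback would wrap `vad.is_speech`.

```python
def find_speech_segments(num_samples, frame_size, is_speech):
    """Return (start, end) sample ranges covering runs of speech frames.

    is_speech is a hypothetical callable that takes a frame's start index
    and returns True or False.
    """
    segments = []
    start = None  # None means "not inside a segment"; 0 is a valid start
    # Stop before any trailing partial frame.
    for i in range(0, num_samples - frame_size + 1, frame_size):
        if is_speech(i):
            if start is None:
                start = i
        elif start is not None:
            # First non-speech frame after a run of speech closes the segment.
            segments.append((start, i))
            start = None
    if start is not None:
        # Audio ended while still in speech: close at the last full frame.
        end = (num_samples // frame_size) * frame_size
        segments.append((start, end))
    return segments


# Toy example: the frames starting at samples 0, 480 and 1920 count as speech.
voiced = {0, 480, 1920}
print(find_speech_segments(2500, 480, lambda i: i in voiced))
# [(0, 960), (1920, 2400)]
```

Passing the detector in as a callback keeps the segment bookkeeping separate from the audio conversion, so the loop can be checked with toy data before it is wired up to real frames.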
