jonathanfly / bark
This project forked from suno-ai/bark
BARK INFINITY GUI CMD: Powered-Up Bark Text-Prompted Generative Audio Model
License: MIT License
What do I have to do to make JonathanFly/bark work? I already have the Bark WebUI installed (one-click installer), copied the files from your repo into the bark folder, and ran "pip install soundfile" from cmd.
I get the following error when I run this command in cmd:
python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3
Microsoft Windows [Version 10.0.22621.1555]
(c) Microsoft Corporation. Alle Rechte vorbehalten.
D:\AI\Bark_WebUI\bark>python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3
Traceback (most recent call last):
File "D:\AI\Bark_WebUI\bark\bark_speak.py", line 3, in <module>
from bark import SAMPLE_RATE, generate_audio, preload_models
File "D:\AI\Bark_WebUI\bark\bark\__init__.py", line 1, in <module>
from .api import generate_audio, text_to_semantic, semantic_to_waveform
File "D:\AI\Bark_WebUI\bark\bark\api.py", line 5, in <module>
from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
File "D:\AI\Bark_WebUI\bark\bark\generation.py", line 7, in <module>
from encodec import EncodecModel
ModuleNotFoundError: No module named 'encodec'
D:\AI\Bark_WebUI\bark>
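For what it's worth, a quick stdlib-only way to check which of Bark's dependencies are importable before running the scripts (a sketch; the module list below is only an example, not an exhaustive requirements list):

```python
import importlib.util

# Hypothetical pre-flight check: list which dependencies are importable.
required = ["encodec", "soundfile", "torch"]
missing = [m for m in required if importlib.util.find_spec(m) is None]
if missing:
    print("Missing modules:", ", ".join(missing))
    print("Try 'python -m pip install .' from the repo root to pull them in.")
else:
    print("All listed dependencies are importable.")
```

Installing the package itself from the repo root also pulls in encodec as a dependency, as another user notes further down.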
You should make this a PR.
Request to add this feature in the future. Especially with your gradio one click installer coming soon :)
https://huggingface.co/QuickWick/Music-AI-Voices and https://github.com/neonbjb/tortoise-tts
This would allow people to easily bring voices they made in Tortoise into Bark.
Could someone describe to me how to clone my own voice?
I see .wav and .npz files here, but is there any way to add my own voice?
1080 Ti not detected, while it works with stable-diffusion and GPTchat.
First I encountered: No module named 'encodec'
Fixed by running: python -m pip install . (yes, include the .)
Second, the "No GPU being used. Careful, Inference might be extremely slow!" message:
1. Type python and hit enter.
2. Then type import torch and hit enter.
3. Now type torch.cuda.is_available() and see if it says True or False.
4. If False, go to the PyTorch site and follow the steps for a manual reinstall.
5. If it still isn't working while the rest is (like in my case), download Anaconda or Miniconda, create a clean environment, and start over.
This finally did the trick for me, though I still had to repeat the steps above.
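The interactive steps above can be collapsed into one small script (a sketch; it also degrades gracefully when torch itself is absent):

```python
# Diagnose GPU visibility without an interactive session.
try:
    import torch
    status = "cuda-ok" if torch.cuda.is_available() else "cuda-missing"
except ImportError:
    status = "torch-not-installed"

if status == "cuda-ok":
    print("GPU detected:", torch.cuda.get_device_name(0))
elif status == "cuda-missing":
    print("torch imports but sees no GPU; reinstall a CUDA build of PyTorch.")
else:
    print("torch is not installed in this environment.")
```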
I want to generate long, consistent, natural audio to voice over my YouTube videos: https://www.youtube.com/SECourses
I tried Tortoise TTS and even cloned a voice, but it wasn't natural or high-quality enough.
I need consistency and high quality.
Reading the repo, it mentions a "Ramshackle Gradio App",
but I can't find this dev branch or anything.
I can volunteer to test and make a tutorial video.
My discord: MonsterMMORPG#2198
git clone -b https://github.com/JonathanFly/bark.git
--> leads to an error
git clone -b main https://github.com/JonathanFly/bark.git
--> works
It would greatly help to have a guide for structuring commands so they work properly, so we can concentrate on only making music or sound effects. A GUI would also be greatly appreciated.
It seems that line breaks might help with song/rap structure, or it may have no effect; it is very hard to tell!
Anyway, is there a way to do that from the command line?
(This is a great project, thank you so much!)
Thank you for your project!
Is it possible to make a Colab version?
Multiple attempts to generate audio resulted in this runtime error.
Relevant image attached.
I followed the mamba install, using Windows 10, running on an RTX 3090. The only change was the directory of the bark clone, from the C: drive to the G: drive (the place for my AI-related stuff).
It also occurs when I try to pre-load models. I double-checked, and the models are located at their supposedly correct locations:
C:\Users\Admin\.cache\suno\bark_v0
I have encountered a bug in the Gradio UI where the audio preview does not update if the file name has a number appended to it. For example, if the original file name is "audio.wav" and a new file is generated with the name "audio_1.wav", the UI only loads the original audio preview and does not update to the newly generated file.
To reproduce this issue, please follow these steps:
I have checked the console output and it is clear that the script is not correctly processing numbered files.
File "C:\Users\darkl\bark\bark_webui.py", line 172, in generate_audio_long_gradio
trim_logs()
File "C:\Users\darkl\bark\bark_webui.py", line 686, in trim_logs
lines = f.readlines()
File "C:\Users\darkl\mambaforge\envs\bark-infinity-oneclick\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 189: character maps to <undefined>
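A likely fix, assuming trim_logs() opens the log with Windows' default codepage (cp1252): pass an explicit encoding so stray bytes can't crash the decode. A minimal sketch:

```python
# cp1252's charmap decoder raises on bytes like 0x8f; utf-8 with
# errors="replace" substitutes U+FFFD instead of crashing.
data = b"log line caf\xc3\xa9 \x8f end"
text = data.decode("utf-8", errors="replace")
print(text)

# Equivalent change inside trim_logs (hypothetical):
#   with open(log_path, encoding="utf-8", errors="replace") as f:
#       lines = f.readlines()
```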
A few minor bugs, nothing game-breaking!
When there is a ":" in the text prompt, such as "TV AD: Blah blah text here", the script attempts to save the files with an unsanitized filename. On Windows, ":" is an illegal character in a filename, which causes no files to be saved, only a broken 0 KB file with the name up to the ":". Might be best to strip out everything that isn't regular text.
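A hypothetical sanitizer along those lines (the character class below is just Windows' reserved filename characters):

```python
import re

def sanitize_filename(name: str) -> str:
    """Replace characters Windows forbids in filenames with underscores."""
    return re.sub(r'[<>:"/\\|?*]', "_", name).strip()

print(sanitize_filename("TV AD: Blah blah text here"))
```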
When executing the same prompt twice, then any already existing speaker file will be overwritten. Only the wave file gets a _1 appended to the filename. Would be nice if this also applied to the speaker files.
Thanks for making this wrapper!
Is it possible to include a requirements.txt file for this project?
While attempting to run the script as written in the readme, I get:
Traceback (most recent call last):
File "D:\tmp\bark\bark-inf\bark_perform.py", line 3, in <module>
from bark import SAMPLE_RATE, generate_audio, preload_models
File "D:\tmp\bark\bark-inf\bark\__init__.py", line 1, in <module>
from .api import generate_audio, text_to_semantic, semantic_to_waveform
File "D:\tmp\bark\bark-inf\bark\api.py", line 3, in <module>
from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
File "D:\tmp\bark\bark-inf\bark\generation.py", line 24, in <module>
torch.cuda.is_bf16_supported()
File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 92, in is_bf16_supported
return torch.cuda.get_device_properties(torch.cuda.current_device()).major >= 8 and cuda_maj_decide
File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 481, in current_device
_lazy_init()
File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 216, in _lazy_init
torch._C._cuda_init()
RuntimeError: Unrecognized CachingAllocator option: garbage_collection_threshold
I have gtx 1060 with 6gb vram, windows 10 and python 3.10.6
Oh, and PYTORCH_CUDA_ALLOC_CONF is garbage_collection_threshold:0.6,max_split_size_mb:128
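A workaround to try, assuming the installed PyTorch build is too old to recognize the garbage_collection_threshold option: drop it from the variable and keep only the option it does understand (shown in POSIX-shell syntax; on Windows cmd use `set` instead of `export`):

```shell
# Remove the unsupported allocator option, keep max_split_size_mb.
export PYTORCH_CUDA_ALLOC_CONF="max_split_size_mb:128"
echo "$PYTORCH_CUDA_ALLOC_CONF"
```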
D:\bark>python bark_webui.py
Running on local URL: http://127.0.0.1:7860
To create a public link, set share=True in launch().
After opening http://127.0.0.1:7860, the interface shows an error.
The BAT file appears to expect that everything is installed on C: under the username:
@echo off
call %USERPROFILE%\mambaforge\Scripts\activate.bat bark-infinity-oneclick
python %USERPROFILE%\bark\bark_webui.py
pause
But I installed on E:\Bark
How do I modify this to properly run from E:?
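One way to adapt it, assuming the environment still lives under your user profile's mambaforge but the repo was cloned to E:\Bark (these paths are guesses; adjust them to your actual install locations):

```bat
@echo off
rem activate.bat still lives wherever mambaforge itself was installed
call %USERPROFILE%\mambaforge\Scripts\activate.bat bark-infinity-oneclick
rem point at the clone on E: instead of %USERPROFILE%\bark
python E:\Bark\bark_webui.py
pause
```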
So do I need to install both Mambaforge and Miniforge for Windows? Because when installing Mambaforge I see nothing about Miniforge.
python bark_perform.py --use_smaller_models --text_prompt "Hello, world testing this for Bark Infinity since it's giving CUDA out of memory error so to overcome this error just simply use --use_smaller_models as args" --split_by_words 35
In the web-ui version of bark and in suno-ai/bark we can set ["SUNO_USE_SMALL_MODELS"] = "True", but we can't in this Infinity version. So to overcome the CUDA out-of-memory error, simply pass --use_smaller_models as an arg in cmd.
Hi everyone,
I have been working on creating a one-click PowerShell script to install and set up Bark from this repo. The script is designed to download and install Mambaforge, Git, and other required packages, then clone the Bark repository, create a virtual environment, and activate it.
However, I'm experiencing issues with the Mambaforge installation step. Although the script downloads the Mambaforge installer and seemingly starts the installation, it appears that nothing actually gets installed. The script then continues to the next steps, which eventually fail due to the missing Mambaforge installation.
I am looking for assistance in identifying and resolving the issue with the Mambaforge installation. If you have experience with PowerShell scripting, any suggestions or insights would be greatly appreciated. Your input will help improve the script and make the Bark setup process smoother and more efficient for users.
Thank you for your help!
if (-not ([Security.Principal.WindowsPrincipal] [Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole] "Administrator")) {
    Write-Host "This script requires Administrator privileges. Please run the script as Administrator."
    return
}

function Check-Error {
    param (
        [string]$ErrorMessage
    )
    if ($LASTEXITCODE -ne 0) {
        throw $ErrorMessage
    }
}

function Test-CommandExists {
    param (
        [string]$Command
    )
    try {
        Get-Command $Command -ErrorAction Stop | Out-Null
        return $true
    } catch {
        return $false
    }
}

if (-not (Test-CommandExists "mamba")) {
    # Download Mambaforge
    $mambaForgeExe = "Mambaforge-Windows-x86_64.exe"
    if (-not (Test-Path $mambaForgeExe)) {
        Write-Host "Downloading Mambaforge..."
        try {
            Invoke-WebRequest -Uri "https://github.com/conda-forge/miniforge/releases/latest/download/$mambaForgeExe" -OutFile $mambaForgeExe
        } catch {
            throw "Failed to download Mambaforge: $_"
        }
    } else {
        Write-Host "Mambaforge installer already downloaded, skipping download."
    }

    # Install Mambaforge.
    # Note: %UserProfile% is cmd.exe syntax and is NOT expanded by PowerShell,
    # which is the likely reason the silent install appeared to do nothing.
    # Use $Env:UserProfile instead, and pass /S and /D= as separate arguments
    # (the NSIS installer requires /D= to be the last argument and unquoted).
    Write-Host "Installing Mambaforge..."
    try {
        Start-Process -FilePath ".\$mambaForgeExe" -ArgumentList "/S", "/D=$Env:UserProfile\miniforge" -Wait
    } catch {
        throw "Failed to install Mambaforge: $_"
    }

    # Add Mambaforge to Path
    $Env:Path = "$Env:UserProfile\miniforge\Scripts;$Env:Path"
} else {
    Write-Host "Mambaforge is already installed, skipping download and installation."
}

# Check if git is installed
if (-not (Test-CommandExists "git")) {
    $gitExe = "Git-2.40.1-64-bit.exe"
    if (-not (Test-Path $gitExe)) {
        Write-Host "Downloading Git..."
        Invoke-WebRequest -Uri "https://github.com/git-for-windows/git/releases/download/v2.40.1.windows.1/$gitExe" -OutFile $gitExe
    } else {
        Write-Host "Git installer already downloaded, skipping download."
    }
    Write-Host "Installing Git..."
    Start-Process -FilePath ".\$gitExe" -ArgumentList "/VERYSILENT" -Wait
} else {
    Write-Host "Git is already installed, skipping installation."
}

# Clone Bark repository
Write-Host "Cloning Bark repository..."
if (-not (Test-Path ".\bark")) {
    git clone https://github.com/JonathanFly/bark.git
    Check-Error "Failed to clone Bark repository"
} else {
    Write-Host "Bark repository already exists, skipping cloning."
}

# Change to Bark directory
Set-Location ".\bark"

# Create and activate environment
Write-Host "Creating and activating environment..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe env create -f environment-cuda.yml
Check-Error "Failed to create environment"

Write-Host "Activating environment..."
# Note: 'mamba activate' normally only works in a shell that has run the
# conda/mamba hook, so this step may fail when invoked from a script.
& $Env:UserProfile\miniforge\Scripts\mamba.exe activate bark-infinity-oneclick
Check-Error "Failed to activate environment"

# Install additional packages
Write-Host "Installing additional packages..."
pip install encodec
pip install rich-argparse

# Uninstall and reinstall soundfile
Write-Host "Fixing soundfile installation..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe uninstall pysoundfile
pip install soundfile

Write-Host "Installation complete. Please use the Miniforge Prompt to start Bark."
Here is the current PowerShell script: one_click_install.zip
Traceback (most recent call last):
File "/home/xxx/sound/bark/bark_webui.py", line 350, in <module>
with gr.Blocks(theme=default_theme,css=bark_console_style) as demo:
File "/home/xxx/.local/lib/python3.11/site-packages/gradio/blocks.py", line 1285, in __exit__
self.config = self.get_config_file()
^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxx/.local/lib/python3.11/site-packages/gradio/blocks.py", line 1261, in get_config_file
"input": list(block.input_api_info()), # type: ignore
^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxx/anaconda3/envs/bark/lib/python3.11/site-packages/gradio_client/serializing.py", line 40, in input_api_info
return (api_info["serialized_input"][0], api_info["serialized_input"][1])
~~~~~~~~^^^^^^^^^^^^^^^^^^^^
KeyError: 'serialized_input'
Loading Bark models...
Loading text model from C:\Users\Ariana\.cache\suno\bark_v0\text_2.pt to cpu
Loading coarse model from C:\Users\Ariana\.cache\suno\bark_v0\coarse_2.pt to cpu
Traceback (most recent call last):
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_perform.py", line 128, in <module>
main(namespace_args)
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_perform.py", line 95, in main
generation.preload_models(args.text_use_gpu, args.text_use_small, args.coarse_use_gpu, args.coarse_use_small, args.fine_use_gpu, args.fine_use_small, args.codec_use_gpu, args.force_reload)
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 888, in preload_models
_ = load_model(
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 781, in load_model
model = _load_model_f(ckpt_path, device)
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 813, in _load_model
checkpoint = torch.load(ckpt_path, map_location=device)
File "C:\Users\Ariana\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\serialization.py", line 797, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "C:\Users\Ariana\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\serialization.py", line 283, in __init__
super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
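"failed finding central directory" usually means a checkpoint download was cut short: the .pt files are zip archives, and a truncated one can't be opened. A sketch that checks the cached files and deletes any that aren't valid zips, so that preload_models() re-downloads them (the cache path is the one shown in the log above):

```python
import zipfile
from pathlib import Path

# Bark's default checkpoint cache (from the traceback above).
cache = Path.home() / ".cache" / "suno" / "bark_v0"
for ckpt in cache.glob("*.pt"):
    if not zipfile.is_zipfile(ckpt):
        print(f"{ckpt.name} is corrupt; deleting so it re-downloads")
        ckpt.unlink()
```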
I mostly test in WSL2 and just noticed miniforge in base windows has some annoying bits. The console has errors, even though everything is still working. And you can't control-c it, it just sits there.
When I attempt to use the symbol "♪" in the prompt to indicate a singing voice, it just gives me an error. Do I need to format it in a specific way (besides this -> ♪ I'm the king of the jungle ♪), or is there a setting that I need to check first? (Typing [singing] seems to sometimes work, making the voice sing what comes after it; is that intentional or what?)
Also, using speakers like in (Man: Hi ... Woman: Hello) isn't consistent at all. Is there a setting I need to adjust for it to work?
I made a quick batch file to run the script with args. On the weekend I will make a GUI with tkinter so that anyone can copy-paste their prompt and run the program with their selected speaker and settings. If anyone wants to take over, I will provide the files I have made.
And many thanks to OP for making such awesome AI work with larger text.
BTW, I noticed one thing: when it splits text, the first segment gives --history_prompt in gen: (the chosen speaker), but the next generations give --history_prompt in gen: none. Could that be fixed?
D:\AI\bark>python bark_perform.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes... or can you? (and the next page, and the next page...)" --split_by_words 35
Traceback (most recent call last):
File "D:\AI\bark\bark_perform.py", line 3, in <module>
from bark import SAMPLE_RATE, generate_audio, preload_models
File "D:\AI\bark\bark\__init__.py", line 1, in <module>
from .api import generate_audio, text_to_semantic, semantic_to_waveform
File "D:\AI\bark\bark\api.py", line 3, in <module>
from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
File "D:\AI\bark\bark\generation.py", line 7, in <module>
from encodec import EncodecModel
File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\encodec\__init__.py", line 12, in <module>
from .model import EncodecModel
File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\encodec\model.py", line 14, in <module>
import torch
File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\__init__.py", line 133, in <module>
raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.
I tried to reinstall torch, but it still didn't work.
I am getting the out-of-memory issue on my 8 GB card. I noticed you had posted a comment about using --use_smaller_models.
I installed the GUI version from the instructions and then updated the files with the ones in your repo. I am unsure where to put --use_smaller_models and
would appreciate some help. Thank you.
Hi, when I try to run a test voice, Bark starts to download a few GB of models:
python3 bark_perform.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes... or can you? (and the next page, and the next page...)" --split_by_words 35
Loading Bark models...
Finally it gives me:
No GPU being used. Careful, Inference might be extremely slow!
No GPU being used. Careful, Inference might be extremely slow!
No GPU being used. Careful, Inference might be extremely slow!
Downloading: "https://dl.fbaipublicfiles.com/encodec/v0/encodec_24khz-d7cc33bc.th" to /Users/paulinajaskulska/.cache/torch/hub/checkpoints/encodec_24khz-d7cc33bc.th
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1348, in do_open
h.request(req.get_method(), req.selector, req.data, headers,
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1282, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1328, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1277, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1037, in _send_output
self.send(msg)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 975, in send
self.connect()
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1454, in connect
self.sock = self._context.wrap_socket(self.sock,
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 513, in wrap_socket
return self.sslsocket_class._create(
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1071, in _create
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1342, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark_perform.py", line 312, in <module>
main(args)
File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark_perform.py", line 245, in main
preload_models()
File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 312, in preload_models
_ = load_codec_model(use_gpu=use_gpu, force_reload=True)
File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 290, in load_codec_model
model = _load_codec_model(device)
File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 254, in _load_codec_model
model = EncodecModel.encodec_model_24khz()
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/encodec/model.py", line 279, in encodec_model_24khz
state_dict = EncodecModel._get_pretrained(checkpoint_name, repository)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/encodec/model.py", line 262, in _get_pretrained
return torch.hub.load_state_dict_from_url(url, map_location='cpu', check_hash=True) # type:ignore
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/hub.py", line 746, in load_state_dict_from_url
download_url_to_file(url, cached_file, hash_prefix, progress=progress)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/hub.py", line 611, in download_url_to_file
u = urlopen(req)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
result = self._call_chain(self.handle_open, protocol, protocol +
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
return self.do_open(http.client.HTTPSConnection, req,
File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>
What can I do about it?
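On macOS, python.org builds of Python don't use the system certificate store; the usual fix for this exact SSL error is to run "Install Certificates.command" from the /Applications/Python 3.10 folder. If that's not an option, a stdlib workaround is to disable verification for this one download session (insecure, so treat it strictly as a last resort):

```python
import ssl

# LAST-RESORT workaround: skip certificate verification process-wide.
# Prefer running macOS's "Install Certificates.command" instead.
ssl._create_default_https_context = ssl._create_unverified_context
```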
Hello!
Thank you for making this application.
I find it very interesting, but I'm stuck with an inconvenient way to install it.
It would be great for someone like me who isn't familiar with Python to have a full installer or a step-by-step installation guide,
because it's frustrating to figure out how to install Python and pip and how to use them.
Thank you for your work.
Actually, I just created a virtual environment, did a git clone, and installed the pip requirements.
It works fine so far; the webui also starts. However, it says voice generation only works on CPU, so I guess that mamba thing is for CUDA.
Automatic1111 seems to work without mamba?
I tried the new version. There is no --stable-voice option in the commands...
Even though I selected en_speaker_1, the voices changed in the middle of the process. Can this be fixed? Sending the file.
UnicodeEncodeError: 'charmap' codec can't encode characters in position 122-241: character maps to <undefined>
*** You may need to add PYTHONIOENCODING=utf-8 to your environment ***
This error seems to randomly appear and does not go away. No idea what causes it. Sometimes the prompts work, and other times the same prompt triggers that error, and just continues to get worse after it starts.
Sorry for the basic question, but I'd love to use this on my laptop (RTX 3050 Ti) in one of my Python programs.
I pulled to the newest version of bark, installed the requirements using
pip install -r requirements-pip.txt
and tried running the webui. However, it errors out here:
Traceback (most recent call last):
File "/home/rlt/bark/bark_webui.py", line 5, in <module>
import gradio as gr
File "/home/rlt/.local/lib/python3.10/site-packages/gradio/__init__.py", line 3, in <module>
import gradio.components as components
File "/home/rlt/.local/lib/python3.10/site-packages/gradio/components.py", line 55, in <module>
from gradio import processing_utils, utils
File "/home/rlt/.local/lib/python3.10/site-packages/gradio/utils.py", line 38, in <module>
import matplotlib
File "/home/rlt/.local/lib/python3.10/site-packages/matplotlib/__init__.py", line 107, in <module>
from collections import MutableMapping
ImportError: cannot import name 'MutableMapping' from 'collections' (/usr/lib/python3.10/collections/__init__.py)
I am using Python 3.10.6. Any advice? Same result in miniconda.
Thanks in advance.
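The root cause is likely an old matplotlib: collections.MutableMapping was a deprecated alias removed in Python 3.10, where only collections.abc.MutableMapping remains, so upgrading matplotlib (for example with `pip install -U matplotlib`) should resolve the import error. A quick check of the alias situation on your interpreter:

```python
import collections
import collections.abc
import sys

# collections.abc.MutableMapping always exists; the top-level
# collections.MutableMapping alias was removed in Python 3.10.
print("Python:", sys.version_info[:2])
print("collections.MutableMapping present:", hasattr(collections, "MutableMapping"))
print("collections.abc.MutableMapping present:", hasattr(collections.abc, "MutableMapping"))
```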
I would like to request the implementation of a multiple pass quality assurance process for the voice generation program. The aim of this process is to regenerate any output audio segments that do not meet a specified quality standard.
According to ChatGPT, a quality assessment function can be created to evaluate the output audio segment. This function can use a combination of signal processing and machine learning techniques to analyze the audio and determine if it is distorted or of poor quality. Some useful metrics for this assessment include signal-to-noise ratio (SNR), total harmonic distortion (THD), and perceptual evaluation of speech quality (PESQ).
I believe that implementing this multiple-pass quality assurance process would significantly improve the overall quality of the generated voice output.
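As an illustration of one such metric, here is a pure-Python SNR estimate between a reference segment and a candidate. This is a hypothetical helper sketching the idea only: real pipelines would use PESQ or similar, and generative audio has no clean reference signal, so a practical implementation would need a proxy reference.

```python
import math

def snr_db(reference, candidate):
    """Signal-to-noise ratio in dB, treating (candidate - reference) as noise."""
    signal_power = sum(s * s for s in reference)
    noise_power = sum((c - s) ** 2 for s, c in zip(reference, candidate))
    if noise_power == 0:
        return float("inf")  # identical signals
    return 10 * math.log10(signal_power / noise_power)

# A segment could be flagged for regeneration when its score falls
# below some threshold, e.g. snr_db(ref, cand) < 10.
print(snr_db([1.0, 1.0], [1.1, 0.9]))
```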
".editorconfig" that forces formatting to prevent the 100+ line "changes"
Originally posted by @rmcc3 in #44 (comment)
super cool stuff!
Hi there, getting an error just before my audio file is generated:
RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (8, 7, 0) but found runtime version (8, 6, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.
Any help on this would be appreciated. Thank you!
Full log before it breaks -
--Segment 1/1: est. 0.40s
test
Loading text model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\text_2.pt to cpu
_load_model model loaded: 312.3M params, 1.269 loss generation.py:840Loading coarse model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\coarse_2.pt to cpu
_load_model model loaded: 314.4M params, 2.901 loss generation.py:840Loading fine model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\fine_2.pt to cpu
_load_model model loaded: 302.1M params, 2.079 loss generation.py:840Traceback (most recent call last):
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\blocks.py", line 1075, in process_api
result = await self.call_function(
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\blocks.py", line 884, in call_function
prediction = await anyio.to_thread.run_sync(
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\to_thread.py", line 28, in run_sync
return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\_backends\_asyncio.py", line 818, in run_sync_in_worker_thread
return await future
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\_backends\_asyncio.py", line 754, in run
result = context.run(func, *args)
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\helpers.py", line 587, in tracked_fn
response = fn(*args)
File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_webui.py", line 205, in generate_audio_long_gradio
full_generation_segments, audio_arr_segments, final_filename_will_be = api.generate_audio_long_from_gradio(**kwargs)
File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 467, in generate_audio_long_from_gradio
full_generation_segments, audio_arr_segments, final_filename_will_be = generate_audio_long(**kwargs)
File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 577, in generate_audio_long
full_generation, audio_arr = generate_audio_barki(text=segment_text, **kwargs)
File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 434, in generate_audio_barki
audio_arr = codec_decode(fine_tokens)
File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\generation.py", line 747, in codec_decode
model.to(models_devices["codec"])
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 1145, in to
return self._apply(convert)
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
module._apply(fn)
[Previous line repeated 1 more time]
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 202, in _apply
self._init_flat_weights()
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 139, in _init_flat_weights
self.flatten_parameters()
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 169, in flatten_parameters
not torch.backends.cudnn.is_acceptable(fw.data)):
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\backends\cudnn\__init__.py", line 97, in is_acceptable
if not _init():
File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\backends\cudnn\__init__.py", line 60, in _init
raise RuntimeError(base_error_msg)
RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (8, 7, 0) but found runtime version (8, 6, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.
As an integrator, I would like access to the same logic used by the bark_perform.py script via bark.api.
Please relocate these methods to the api and instead import and use them through the perform script.
Additionally, the methods you have updated in the api have not had their docstrings updated correctly, so it confusingly returns a tuple now instead of the audio array.
Hey, I'm a bit confused about how to use the prompt_file_separator command... Would it be possible to provide a basic example of how to do so?
Hello there, so the question is: how do I use the GPU with bark?
Windows 10, RTX 4090
E:\Magazyn\Grafika\AI\Text2Voice\bark>python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3
Loading Bark models...
Models loaded.
Estimated time: 6.00 seconds.
Generating: It is a mistake to think you can solve any major problems just with potatoes.
Using speaker: en_speaker_3
history_prompt in gen: en_speaker_3
en_speaker_3
aa
100%|████████████████████████████████████████████████████████████████████████████████| 100/100 [00:12<00:00,  8.20it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 31/31 [00:35<00:00,  1.15s/it]
Traceback (most recent call last):
File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 175, in <module>
main(args)
File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 157, in main
gen_and_save_audio(prompt, history_prompt, text_temp, waveform_temp, filename, output_dir)
File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 97, in gen_and_save_audio
save_audio_to_file(filename, audio_array, SAMPLE_RATE, output_dir=output_dir)
File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 63, in save_audio_to_file
sf.write(filepath, audio_array, sample_rate, format=format, subtype=subtype)
File "C:\Users\jurandfantom\miniconda3\lib\site-packages\soundfile.py", line 338, in write
data = np.asarray(data)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.
E:\Magazyn\Grafika\AI\Text2Voice\bark>
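That "inhomogeneous shape" error typically means sf.write received a tuple rather than a flat sample array, for example when a generate call returns generation data and audio together. A hypothetical guard before the write (the tuple below only stands in for the real return value):

```python
# Stand-in for what a generate call might return: (metadata, samples).
result = ({"semantic_prompt": "..."}, [0.1, -0.2, 0.3])

# Unpack defensively so only the sample array reaches sf.write(...).
audio_array = result[1] if isinstance(result, tuple) else result
print(audio_array)
```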
Traceback (most recent call last):
File "D:\5118\movielearning\testbark\test.py", line 1, in <module>
from bark import SAMPLE_RATE, generate_audio
File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\__init__.py", line 1, in <module>
from .api import generate_audio, text_to_semantic, semantic_to_waveform
File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\api.py", line 5, in <module>
from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\generation.py", line 24, in <module>
torch.cuda.is_bf16_supported()
AttributeError: module 'torch.cuda' has no attribute 'is_bf16_supported'
Just noticed this bug. In the meantime, if you run into it you can go into the 'Even More Options' tab and set
--output_filename my_filename
And it won't try and use the prompt. It will just add numbers for duplicated files.
I'm trying to install on macOS 13.2.1. When I get to mamba env create -f environment-cuda.yml,
it gives me the error:
Could not solve for environment specs
The following packages are incompatible
├─ cudatoolkit 11.8.0** does not exist (perhaps a typo or a missing channel);
└─ pytorch-cuda 11.8** is uninstallable because it requires
   └─ cuda 11.8.*, which does not exist (perhaps a missing channel).
On NVIDIA's website it says "NVIDIA® CUDA Toolkit 11.8 no longer supports development or running applications on macOS."
So am I out of luck? Does anyone know a way to get this running on MacOS?
Thanks!
This is what I get.
Loading Bark models... No GPU being used. Careful, Inference might be extremely slow!
I have a 2080 Ti and it works with SD and GPTs.
Any ideas?