jonathanfly / bark Goto Github PK

View Code? Open in Web Editor NEW

This project forked from suno-ai/bark

968.0 968.0 91.0 19.81 MB

🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bark Text-prompted Generative Audio Model

License: MIT License

Python 3.66% Jupyter Notebook 96.34% Batchfile 0.01% Dockerfile 0.01%

ai audio bark grado machine-learning text-to-speech torch tts

bark's People

Contributors

Stargazers

Watchers

Forkers

markrosos mirprog ausboss ethanpisani rwl4 iammarcin bigchain h-h-h-h purplesparkle devxpy drewthomasson kromond szhaomsft user-0123456789 arcadianer chikiuso geomorillo diruuu poechant alex625051 soidaken tronmetatron ndhuyquant mrwonderfulness architectindustries marjan2k mwang87 iambaney limlabs siris-rose twenty-3rd personthe boringcrypto chugarah zhengzaixiazai shoutian38 jawharkod t-mc4 melvinebenezer lik1122 anrize-ltd oojjcorp dmater01 jojooojjcorp melmass tyler-newton yatesdr pansapiens uetuluk devwearsprada mywdka jpollard-cs platform-kit-team fbradyirl mikekmiller voklur 0xjace csyderek steinhaug skyprolk rhardock paulsunnypark g-force78 rahul-sindhu jmanhype harryvu bigsk1 deepu-89 xiaomazzzz deephansda diaeryani revmagi gersooonn anngdev fengnet0769 zarevi4 jason571 dimka2555 id006701 heyafro gotomypc chatbotsgpt kudoushinichi6 wsk3373 epietrocola harshagowda gignus79 blueskyscorpio bheardnetwork quickpanda youssef-sourour

bark's Issues

ModuleNotFoundError: No module named 'encodec'

What do I have to do to make JonathanFly/bark work? I already have barkwebui inst. (oneclickinst.) and then copied the files from your repo to the bark folder and "pip install soundfile" via cmd inst.
I get the following error when i run this python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3 command in cmd.

D:\AI\Bark_WebUI\bark>python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3
Traceback (most recent call last):
File "D:\AI\Bark_WebUI\bark\bark_speak.py", line 3, in
from bark import SAMPLE_RATE, generate_audio, preload_models
File "D:\AI\Bark_WebUI\bark\bark_init_.py", line 1, in
from .api import generate_audio, text_to_semantic, semantic_to_waveform
File "D:\AI\Bark_WebUI\bark\bark\api.py", line 5, in
from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
File "D:\AI\Bark_WebUI\bark\bark\generation.py", line 7, in
from encodec import EncodecModel
ModuleNotFoundError: No module named 'encodec'

D:\AI\Bark_WebUI\bark>

Make this a PR?

You make this a PR

[feature request] oobabooga extension

Request to add this feature in the future. Especially with your gradio one click installer coming soon :)

[feature request] pth voice compatibility

https://huggingface.co/QuickWick/Music-AI-Voices and https://github.com/neonbjb/tortoise-tts

this would allow people to easily take voices they made in tortoise into bark

Cloning own voice

Could someone describe to me how to clone my own voice?
I see .wav and .npz files here but Is there any way to add my own voice?

GPU problems with CUDA detection, No module named 'encoded' and "No GPU being used. Careful" fixed

1080ti not detected, while working with stable-diffusion and GPTchat

First encountered No module named 'encoded'
Fixed with running: python -m pip install . (yes, include the .)

Second "No GPU being used. Careful, Inference might be extremely slow!" message

type python, hit enter
then type import torch and hit enter
now type torch.cuda.is_available() and see if it says true or false

if false go to pytorch and follow the steps for manual reinstall

if still not working while rest is working (like in my case)
download anaconda or miniconda, create a clean environment and start over
this did the trick finally for me, while I still had to take again the steps above.

where is Ramshackle Gradio App and dev branch?

I want to generate long consistent and natural audio for my youtube videos to make a voice over : https://www.youtube.com/SECourses

I tried tortoise TTS even cloned a voice but natural and high quality enough

I need consistency and high quality

Reading the repo it says Ramshackle Gradio App

But can't find this dev branch or anything

I can be volunteer to test and make a tutorial video

my discord : MonsterMMORPG#2198

branch missing in howto

git clone -b https://github.com/JonathanFly/bark.git
--> Leads to an error

git clone -b main https://github.com/JonathanFly/bark.git
--> works

More comprehensive command list structure and gui

It would greatly help to have a guide for structuring commands so they work properly, and we can concentrate on only making music or sound effects. A Gui would also be greatly appreciated.

Is there a way to do line breaks like with the GUI?

It seems that line breaks might help with song/rap structure -or it may have no effect, is very hard to tell!

Anyway, is there a way to do that with command line prompt?

(This is a great project, thank you so much!)

Request for a version of the new model that will work as a notebook in Google Collab

Thank you for youre project!!!
It is possible to make a Colab version?

PyTorch Stream Reader failed reading zip archive: failed finding central directory

multiple attempts to generate audio resulted in this runtime error.
relevant img attached.
followed the mamba install, using windows 10. running on rtx 3090. only change was the directory of the bark clone from c-drive to g-drive (place for my ai-related stuff).
also occurs when i try to pre-load models. double checked, the models are located at their supposedly correct locations:
C:\Users\Admin\.cache\suno\bark_v0

Bug in Gradio UI: Audio preview not updating with numbered file names

I have encountered a bug in the Gradio UI where the audio preview does not update if the file name has a number appended to it. For example, if the original file name is "audio.wav" and a new file is generated with the name "audio_1.wav", the UI only loads the original audio preview and does not update to the newly generated file.

To reproduce this issue, please follow these steps:

Generate a prompt
Generate another prompt
Observe that the UI only displays the original audio preview and does not update to the newly generated file.

I have checked the console output and it is clear that the script is not correctly processing numbered files.

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 189: character maps to <undefined>

 File "C:\Users\darkl\bark\bark_webui.py", line 172, in generate_audio_long_gradio
    trim_logs()
  File "C:\Users\darkl\bark\bark_webui.py", line 686, in trim_logs
    lines = f.readlines()
  File "C:\Users\darkl\mambaforge\envs\bark-infinity-oneclick\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 189: character maps to <undefined>

Minor bugs

A few minor bugs, nothing game-breaking!

When there is a : in the text prompt, such as "TV AD: Blah blah text here", then the script attempts to save the files with an unsanitized filename. On Windows, : is an illegal character in the filename, which causes no files to be saved out, only a broken 0kb file with the name up to the :. Might be best to strip out everything that isn't regular text.
When executing the same prompt twice, then any already existing speaker file will be overwritten. Only the wave file gets a _1 appended to the filename. Would be nice if this also applied to the speaker files.

requirements.txt

Thanks for making this wrapper!
Is it possible to include a requirements.txt file for this project?

Output iterations in the webgui don't reset the history of the prompt

RuntimeError: Unrecognized CachingAllocator option: garbage_collection_threshold

While attempting to run the script as written in readme i get

Traceback (most recent call last):
  File "D:\tmp\bark\bark-inf\bark_perform.py", line 3, in <module>
    from bark import SAMPLE_RATE, generate_audio, preload_models
  File "D:\tmp\bark\bark-inf\bark\__init__.py", line 1, in <module>
    from .api import generate_audio, text_to_semantic, semantic_to_waveform
  File "D:\tmp\bark\bark-inf\bark\api.py", line 3, in <module>
    from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
  File "D:\tmp\bark\bark-inf\bark\generation.py", line 24, in <module>
    torch.cuda.is_bf16_supported()
  File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 92, in is_bf16_supported
    return torch.cuda.get_device_properties(torch.cuda.current_device()).major >= 8 and cuda_maj_decide
  File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 481, in current_device
    _lazy_init()
  File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 216, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Unrecognized CachingAllocator option: garbage_collection_threshold

I have gtx 1060 with 6gb vram, windows 10 and python 3.10.6
oh and PYTORCH_CUDA_ALLOC_CONF is garbage_collection_threshold:0.6,max_split_size_mb:128

error：Something went wrong Expecting value: line 1 column 1 (char 0)

D:\bark>python bark_webui.py
Running on local URL: http://127.0.0.1:7860
To create a public link, set share=True in launch().

after open http://127.0.0.1:7860，the interface shows error：

C-drive installation? How to launch from non-C?

The BAT appears to have the expectation it will install on C: under username
@echo off
call %USERPROFILE%\mambaforge\Scripts\activate.bat bark-infinity-oneclick
python %USERPROFILE%\bark\bark_webui.py
pause

But I installed on E:\Bark
How do I modify this to properly run from E:?

a bit confusing instruction

so need to install both mambaforge and miniforge for Windows? Because when installing the mambaforge I see nothing about miniforge

Out of GPU VRAM luanching using python bark_perform.py command

Win10

rtx 2060 super 8gb gpu

Low VRAM or CUDA out of memory for BARK INFINITY Solved

python bark_perform.py --use_smaller_models --text_prompt "Hello, world testing this for Bark Infinity since it's giving CUDA out of memory error so to overcome this error just simply use --use_smaller_models as args" --split_by_words 35

in web-ui version of bark and in suno-ai/bark we can do ["SUNO_USE_SMALL_MODELS"] = "True" but can't in this Infinity so to overcome this CUDA out of memory error simply use --use_smaller_models as args in cmd

Feature: Troubleshooting Mambaforge Installation in One-Click PowerShell Script

Hi everyone,

I have been working on creating a one-click PowerShell script to install and set up Bark from this repo. The script is designed to download and install Mambaforge, Git, and other required packages, then clone the Bark repository, create a virtual environment, and activate it.

However, I'm experiencing issues with the Mambaforge installation step. Although the script downloads the Mambaforge installer and seemingly starts the installation, it appears that nothing actually gets installed. The script then continues to the next steps, which eventually fail due to the missing Mambaforge installation.

I am looking for assistance in identifying and resolving the issue with the Mambaforge installation. If you have experience with PowerShell scripting, any suggestions or insights would be greatly appreciated. Your input will help improve the script and make the Bark setup process smoother and more efficient for users.

Thank you for your help!

if (-not ([Security.Principal.WindowsPrincipal] [Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole] "Administrator")) {
    Write-Host "This script requires Administrator privileges. Please run the script as Administrator."
    return
}

function Check-Error {
    param (
        [string]$ErrorMessage
    )

    if ($LASTEXITCODE -ne 0) {
        throw $ErrorMessage
    }
}

function Test-CommandExists {
    param (
        [string]$Command
    )

    try {
        Get-Command $Command -ErrorAction Stop | Out-Null
        return $true
    } catch {
        return $false
    }
}

if (-not (Test-CommandExists "mamba")) {
    # Download Mambaforge
    $mambaForgeExe = "Mambaforge-Windows-x86_64.exe"
    if (-not (Test-Path $mambaForgeExe)) {
        Write-Host "Downloading Mambaforge..."
        try {
            Invoke-WebRequest -Uri "https://github.com/conda-forge/miniforge/releases/latest/download/$mambaForgeExe" -OutFile $mambaForgeExe
        } catch {
            throw "Failed to download Mambaforge: $_"
        }
    } else {
        Write-Host "Mambaforge installer already downloaded, skipping download."
    }

    # Install Mambaforge
    Write-Host "Installing Mambaforge..."
    try {
        Start-Process -FilePath ".\$mambaForgeExe" -ArgumentList "/S /D=%UserProfile%\miniforge" -Wait
    } catch {
        throw "Failed to install Mambaforge: $_"
    }

    # Add Mambaforge to Path
    $Env:Path = "$Env:UserProfile\miniforge\Scripts;$Env:Path"
} else {
    Write-Host "Mambaforge is already installed, skipping download and installation."
}

# Check if git is installed
if (-not (Test-CommandExists "git")) {
    $gitExe = "Git-2.40.1-64-bit.exe"
    if (-not (Test-Path $gitExe)) {
        Write-Host "Downloading Git..."
        Invoke-WebRequest -Uri "https://github.com/git-for-windows/git/releases/download/v2.40.1.windows.1/$gitExe" -OutFile $gitExe
    } else {
        Write-Host "Git installer already downloaded, skipping download."
    }

    Write-Host "Installing Git..."
    Start-Process -FilePath ".\$gitExe" -ArgumentList "/VERYSILENT" -Wait
} else {
    Write-Host "Git is already installed, skipping installation."
}

# Clone Bark repository
Write-Host "Cloning Bark repository..."
if (-not (Test-Path ".\bark")) {
    git clone https://github.com/JonathanFly/bark.git
    Check-Error "Failed to clone Bark repository"
} else {
    Write-Host "Bark repository already exists, skipping cloning."
}

# Change to Bark directory
Set-Location ".\bark"

# Create and activate environment
Write-Host "Creating and activating environment..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe env create -f environment-cuda.yml
Check-Error "Failed to create environment"

Write-Host "Activating environment..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe activate bark-infinity-oneclick
Check-Error "Failed to activate environment"

# Install additional packages
Write-Host "Installing additional packages..."
pip install encodec
pip install rich-argparse

# Uninstall and reinstall soundfile
Write-Host "Fixing soundfile installation..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe uninstall pysoundfile
pip install soundfile

Write-Host "Installation complete. Please use the Miniforge Prompt to start Bark."

Here is the current PowerShell script: one_click_install.zip

Error when run "python bark_webui.py"

Traceback (most recent call last):
File "/home/xxx/sound/bark/bark_webui.py", line 350, in
with gr.Blocks(theme=default_theme,css=bark_console_style) as demo:
File "/home/xxx/.local/lib/python3.11/site-packages/gradio/blocks.py", line 1285, in exit
self.config = self.get_config_file()
^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxx/.local/lib/python3.11/site-packages/gradio/blocks.py", line 1261, in get_config_file
"input": list(block.input_api_info()), # type: ignore
^^^^^^^^^^^^^^^^^^^^^^
File "/home/xxx/anaconda3/envs/bark/lib/python3.11/site-packages/gradio_client/serializing.py", line 40, in input_api_info
return (api_info["serialized_input"][0], api_info["serialized_input"][1])
~~~~~~~~^^^^^^^^^^^^^^^^^^^^
KeyError: 'serialized_input'

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Loading Bark models...
Loading text model from C:\Users\Ariana.cache\suno\bark_v0\text_2.pt to cpu
Loading coarse model from C:\Users\Ariana.cache\suno\bark_v0\coarse_2.pt to cpu
Traceback (most recent call last):
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_perform.py", line 128, in
main(namespace_args)
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_perform.py", line 95, in main
generation.preload_models(args.text_use_gpu, args.text_use_small, args.coarse_use_gpu, args.coarse_use_small, args.fine_use_gpu, args.fine_use_small, args.codec_use_gpu, args.force_reload)
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 888, in preload_models
_ = load_model(
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 781, in load_model
model = _load_model_f(ckpt_path, device)
File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 813, in _load_model
checkpoint = torch.load(ckpt_path, map_location=device)
File "C:\Users\Ariana\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\serialization.py", line 797, in load
with _open_zipfile_reader(opened_file) as opened_zipfile:
File "C:\Users\Ariana\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\serialization.py", line 283, in init
super().init(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

miniforge annoyances in regular windows version

I mostly test in WSL2 and just noticed miniforge in base windows has some annoying bits. The console has errors, even though everything is still working. And you can't control-c it, it just sits there.

I can't use certain formatting in prompts!

When I attempt to use the symbol "♪" in the prompt to indicate a singing voice it just gives me an error?, do I need to format it in a specific way (besides this -> ♪ I'm the king of the jungle ♪ ) or is there a setting that I need to check first (typing [singing] seems to sometimes work making the voice sing what comes after it, is that intentional or what?)

Also the using the speaker like in (Man: Hi ... Woman: Hello) isn't consistent at all, is there a setting I need to adjust for it to work?

made batch file for autorun and probably an issue with --history_prompt

I made a quick batch file to run the script with args. On weekend i will make GUI with tkinter so that anyone can copy paste their prompt and run program with their selected speaker and selected settings. If anyone wants to take over i will provide the files i have made.

And many thanks to OP for making such awesome AI work with larger text.

BTW: i noticed one thing, when it splits text the first one gives --history_prompt in gen: chosen speaker but on next generations it gives --history_prompt in gen: none. could that be fixed?

OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.

D:\AI\bark>python bark_perform.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes... or can you? (and the next page, and the next page...)" --split_by_words 35
Traceback (most recent call last):
  File "D:\AI\bark\bark_perform.py", line 3, in <module>
    from bark import SAMPLE_RATE, generate_audio, preload_models
  File "D:\AI\bark\bark\__init__.py", line 1, in <module>
    from .api import generate_audio, text_to_semantic, semantic_to_waveform
  File "D:\AI\bark\bark\api.py", line 3, in <module>
    from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
  File "D:\AI\bark\bark\generation.py", line 7, in <module>
    from encodec import EncodecModel
  File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\encodec\__init__.py", line 12, in <module>
    from .model import EncodecModel
  File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\encodec\model.py", line 14, in <module>
    import torch
  File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\__init__.py", line 133, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.

I tried to reinstall torch but it still didnt work

torch.cuda.OutOfMemoryError: CUDA out of memory

I am getting the out of memory issue on my 8GB card i noticed you had posted a comment about using --use_smaller_models. I installed the gui version from the instructions and then updated the files with the one in your repo. I am unsure where to put the --use_smaller_models would appreciate some help. Thank you.

problem with bark models during test run

Hi, when I try to run test voice Bark starts to download few GB of models

python3 bark_perform.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes... or can you? (and the next page, and the next page...)" --split_by_words 35
Loading Bark models...

finally it gives me

No GPU being used. Careful, Inference might be extremely slow!
No GPU being used. Careful, Inference might be extremely slow!
No GPU being used. Careful, Inference might be extremely slow!
Downloading: "https://dl.fbaipublicfiles.com/encodec/v0/encodec_24khz-d7cc33bc.th" to /Users/paulinajaskulska/.cache/torch/hub/checkpoints/encodec_24khz-d7cc33bc.th
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1071, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark_perform.py", line 312, in <module>
    main(args)
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark_perform.py", line 245, in main
    preload_models()
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 312, in preload_models
    _ = load_codec_model(use_gpu=use_gpu, force_reload=True)
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 290, in load_codec_model
    model = _load_codec_model(device)
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 254, in _load_codec_model
    model = EncodecModel.encodec_model_24khz()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/encodec/model.py", line 279, in encodec_model_24khz
    state_dict = EncodecModel._get_pretrained(checkpoint_name, repository)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/encodec/model.py", line 262, in _get_pretrained
    return torch.hub.load_state_dict_from_url(url, map_location='cpu', check_hash=True)  # type:ignore
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/hub.py", line 746, in load_state_dict_from_url
    download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/hub.py", line 611, in download_url_to_file
    u = urlopen(req)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>

what can I do about it?

Installation stuck

Hello!
Thank you for making this application.

I find it very interesting but I have stuck with inconvenient way to install it.
It would be great to me an who doesn't familiar with python to make a full installer or provide a step by step instruction of installation.
Cause its frustration to find out how to install python, pip and use them.
Thank you for your work.

is this mamba thing really required ?

Actually I just created a virtual environment and made a git + install pip requirements.
Works fine so far, also webui starts, however Voice generation only works on CPU it says. So i guess that mamba thing is for cuda.
Automatic1111 seems to work without mamba?

Long text file change voices in the middle

I try the new version. thare is no option of --stable-voice in the commands...
Even though I selected en_speaker_1 the voices changed in the middle of the process. Can this be fixed? Sending the file.

of_finding_him_so_de-SPK-en_speaker_1.mp4

UnicodeEncodeError: 'charmap' codec can't encode characters in position 122-241:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 122-241: character maps to
*** You may need to add PYTHONIOENCODING=utf-8 to your environment ***

This error seems to randomly appear and does not go away. No idea what causes it. Sometimes the prompts work, and other times the same prompt triggers that error, and just continues to get worse after it starts.

how can I run this from another script?

Sorry for the basic question, but I'd love to use this on my laptop (rtx 3050ti) in one of my python programs

WebUI won't open.

I pulled to the newest version of bark, installed the requirements using
pip install -r requirements-pip.txt
and tried running the webui. However, it errors out here:

Traceback (most recent call last):
  File "/home/rlt/bark/bark_webui.py", line 5, in <module>
    import gradio as gr
  File "/home/rlt/.local/lib/python3.10/site-packages/gradio/__init__.py", line 3, in <module>
    import gradio.components as components
  File "/home/rlt/.local/lib/python3.10/site-packages/gradio/components.py", line 55, in <module>
    from gradio import processing_utils, utils
  File "/home/rlt/.local/lib/python3.10/site-packages/gradio/utils.py", line 38, in <module>
    import matplotlib
  File "/home/rlt/.local/lib/python3.10/site-packages/matplotlib/__init__.py", line 107, in <module>
    from collections import MutableMapping
ImportError: cannot import name 'MutableMapping' from 'collections' (/usr/lib/python3.10/collections/__init__.py)

I am using Python 3.10.6. Any advice? Same result in miniconda.

Thanks in advance.

Request for Multiple Pass Quality Assurance for Voice Generation

I would like to request the implementation of a multiple pass quality assurance process for the voice generation program. The aim of this process is to regenerate any output audio segments that do not meet a specified quality standard.

According to ChatGPT, a quality assessment function can be created to evaluate the output audio segment. This function can use a combination of signal processing and machine learning techniques to analyze the audio and determine if it is distorted or of poor quality. Some useful metrics for this assessment include signal-to-noise ratio (SNR), total harmonic distortion (THD), and perceptual evaluation of speech quality (PESQ).

I believe that implementing this multiple pass quality assurance process will significantly improve the overall quality of the generated voice output

".editorconfig" that forces formatting to prevent too many changes

".editorconfig" that forces formatting to prevent the 100+ line "changes"

Originally posted by @rmcc3 in #44 (comment)

summarizing the contributions

voices are randomly summoned without a history prompt. bark infinity lets you persist these voices for future use.
chunk arbitrarily long texts into smaller pieces, then generate each separately.
travolta mode: ignore the EOS token and keep on generating audio.

super cool stuff!

cuDNN Version Incompatibility

Hi there, getting an error just before my audio file is generated:

RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (8, 7, 0) but found runtime version (8, 6, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.

Any help on this is possible? Thank you!

Full log before it breaks -

--Segment 1/1: est. 0.40s
test
Loading text model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\text_2.pt to cpu
_load_model model loaded: 312.3M params, 1.269 loss                      generation.py:840Loading coarse model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\coarse_2.pt to cpu
_load_model model loaded: 314.4M params, 2.901 loss                      generation.py:840Loading fine model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\fine_2.pt to cpu
_load_model model loaded: 302.1M params, 2.079 loss                      generation.py:840Traceback (most recent call last):
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\blocks.py", line 1075, in process_api
    result = await self.call_function(
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\blocks.py", line 884, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\to_thread.py", line 28, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\_backends\_asyncio.py", line 818, in run_sync_in_worker_thread
    return await future
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\_backends\_asyncio.py", line 754, in run
    result = context.run(func, *args)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\helpers.py", line 587, in tracked_fn
    response = fn(*args)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_webui.py", line 205, in generate_audio_long_gradio
    full_generation_segments, audio_arr_segments, final_filename_will_be = api.generate_audio_long_from_gradio(**kwargs)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 467, in generate_audio_long_from_gradio
    full_generation_segments, audio_arr_segments, final_filename_will_be = generate_audio_long(**kwargs)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 577, in generate_audio_long
    full_generation, audio_arr = generate_audio_barki(text=segment_text, **kwargs)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 434, in generate_audio_barki
    audio_arr = codec_decode(fine_tokens)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\generation.py", line 747, in codec_decode
    model.to(models_devices["codec"])
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 1145, in to
    return self._apply(convert)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 202, in _apply
    self._init_flat_weights()
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 139, in _init_flat_weights
    self.flatten_parameters()
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 169, in flatten_parameters
    not torch.backends.cudnn.is_acceptable(fw.data)):
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\backends\cudnn\__init__.py", line 97, in is_acceptable
    if not _init():
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\backends\cudnn\__init__.py", line 60, in _init
    raise RuntimeError(base_error_msg)
RuntimeError: cuDNN version incompatibility: PyTorch was compiled  against (8, 7, 0) but found runtime version (8, 6, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.```

new API interfaces for the new methods

As an integrator, I would like to have access to the same logic being used by the bark_perform.py script via bark.api.

Please relocate these methods to the api and instead, import and use them through the perform script.

Additionally, the methods you have updated in the api have not had their doc strings updated correctly, so it confusingly returns a tuple now instead of the audio array.

Using prompt_file_separator in the CLI

Hey, I'm a bit confused on how to use the prompt_file_separator command... Would it be possible to provide a basic example on how to do so ?

How to use GPU using bark?

Hello there, so the question is "How to use GPU using bark"?

"The requested array has an inhomogeneous shape after 1 dimensions"

Windows 10, RTX 4090

Git cloned repo
missing encodec - instal it via "pip install -U encodec"
missing funcy - instal it via "pip install -U funcy"
missing scipy - instal it via "pip install -U scipy"
as following

E:\Magazyn\Grafika\AI\Text2Voice\bark>python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3
Loading Bark models...
Models loaded.
Estimated time: 6.00 seconds.
Generating: It is a mistake to think you can solve any major problems just with potatoes.
Using speaker: en_speaker_3
history_prompt in gen: en_speaker_3
en_speaker_3
aa
100%|████████████████████████████████████████████████████████████████████████████████| 100/100 [00:12<00:00, 8.20it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 31/31 [00:35<00:00, 1.15s/it]
Traceback (most recent call last):
File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 175, in
main(args)
File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 157, in main
gen_and_save_audio(prompt, history_prompt, text_temp, waveform_temp, filename, output_dir)
File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 97, in gen_and_save_audio
save_audio_to_file(filename, audio_array, SAMPLE_RATE, output_dir=output_dir)
File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 63, in save_audio_to_file
sf.write(filepath, audio_array, sample_rate, format=format, subtype=subtype)
File "C:\Users\jurandfantom\miniconda3\lib\site-packages\soundfile.py", line 338, in write
data = np.asarray(data)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

E:\Magazyn\Grafika\AI\Text2Voice\bark>

is_bf16_supported

Traceback (most recent call last):
File "D:\5118\movielearning\testbark\test.py", line 1, in
from bark import SAMPLE_RATE, generate_audio
File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark_init_.py", line 1, in
from .api import generate_audio, text_to_semantic, semantic_to_waveform
File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\api.py", line 5, in
from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\generation.py", line 24, in
torch.cuda.is_bf16_supported()
AttributeError: module 'torch.cuda' has no attribute 'is_bf16_supported'

Some foreign characters in prompts can fail to save in windows

Just noticed this bug. In the meantime, if you run into it you can go into the 'Even More Options' tab and set

--output_filename my_filename

And it won't try and use the prompt. It will just add numbers for duplicated files.

Installing on MacOS -- No CUDA available?

I'm trying to install on MacOS 13.2.1. When I get to mamba env create -f environment-cuda.yml it gives me the error

Could not solve for environment specs
The following packages are incompatible
├─ cudatoolkit 11.8.0**  does not exist (perhaps a typo or a missing channel);
└─ pytorch-cuda 11.8**  is uninstallable because it requires
   └─ cuda 11.8.* , which does not exist (perhaps a missing channel).

On NVIDA's website it says "NVIDIA® CUDA Toolkit 11.8 no longer supports development or running applications on macOS."

So am I out of luck? Does anyone know a way to get this running on MacOS?

Thanks!

I can't get it to run on the GPU

This is what I get.

Loading Bark models... No GPU being used. Careful, Inference might be extremely slow!
I have a 2080 TI and it works with with SD and GPTs.

Any ideas?