
jonathanfly / bark


This project forked from suno-ai/bark


🚀 BARK INFINITY GUI CMD 🎶 Powered Up Bark Text-prompted Generative Audio Model

License: MIT License

Python 3.66% Jupyter Notebook 96.34% Batchfile 0.01% Dockerfile 0.01%
ai audio bark gradio machine-learning text-to-speech torch tts

bark's People

Contributors

afrogthatexists, alyxdow, gkucsko, jn-jairo, jonathanfly, kmfreyberg, marjan2k, mcamac, melmass, mikeyshulman, pansapiens, pleonard212, steinhaug, uetuluk, vaibhavs10, zygi



bark's Issues

ModuleNotFoundError: No module named 'encodec'

What do I have to do to make JonathanFly/bark work? I already have the Bark WebUI installed (one-click installer), copied the files from your repo into the bark folder, and ran pip install soundfile from cmd.
I get the following error when I run the command python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3 in cmd.

Microsoft Windows [Version 10.0.22621.1555]
(c) Microsoft Corporation. All rights reserved.

D:\AI\Bark_WebUI\bark>python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3
Traceback (most recent call last):
  File "D:\AI\Bark_WebUI\bark\bark_speak.py", line 3, in <module>
    from bark import SAMPLE_RATE, generate_audio, preload_models
  File "D:\AI\Bark_WebUI\bark\bark\__init__.py", line 1, in <module>
    from .api import generate_audio, text_to_semantic, semantic_to_waveform
  File "D:\AI\Bark_WebUI\bark\bark\api.py", line 5, in <module>
    from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
  File "D:\AI\Bark_WebUI\bark\bark\generation.py", line 7, in <module>
    from encodec import EncodecModel
ModuleNotFoundError: No module named 'encodec'

D:\AI\Bark_WebUI\bark>

Cloning own voice

Could someone describe to me how to clone my own voice?
I see .wav and .npz files here, but is there any way to add my own voice?

GPU problems with CUDA detection, No module named 'encodec', and "No GPU being used. Careful" fixed

My 1080 Ti was not detected, even though it works with stable-diffusion and GPTchat.

First, I encountered No module named 'encodec'.
Fixed by running: python -m pip install . (yes, include the dot)

Second, the "No GPU being used. Careful, Inference might be extremely slow!" message:

Type python and hit Enter, then type import torch and hit Enter. Now type torch.cuda.is_available() and see whether it says True or False.

If False, go to the PyTorch site and follow the steps for a manual reinstall.

If it still isn't working while everything else is (as in my case), download Anaconda or Miniconda, create a clean environment, and start over. That finally did the trick for me, though I still had to repeat the steps above.

where is Ramshackle Gradio App and dev branch?

I want to generate long, consistent, and natural audio for voice-overs on my YouTube videos: https://www.youtube.com/SECourses

I tried Tortoise TTS and even cloned a voice, but it wasn't natural or high-quality enough.

I need consistency and high quality

Reading the repo, it mentions a Ramshackle Gradio App.

But I can't find this dev branch or anything.

I can volunteer to test and make a tutorial video.

my discord : MonsterMMORPG#2198

Is there a way to do line breaks like with the GUI?

It seems that line breaks might help with song/rap structure, or they may have no effect; it is very hard to tell!

Anyway, is there a way to do that with command line prompt?

(This is a great project, thank you so much!)

PyTorch Stream Reader failed reading zip archive: failed finding central directory

Multiple attempts to generate audio resulted in this runtime error.
Relevant image attached.
I followed the Mamba install, using Windows 10, running on an RTX 3090. The only change was moving the Bark clone from the C: drive to the G: drive (where I keep my AI-related stuff).
It also occurs when I try to preload models. I double-checked; the models are located at their supposedly correct locations:
C:\Users\Admin\.cache\suno\bark_v0

Bug in Gradio UI: Audio preview not updating with numbered file names

I have encountered a bug in the Gradio UI where the audio preview does not update if the file name has a number appended to it. For example, if the original file name is "audio.wav" and a new file is generated with the name "audio_1.wav", the UI only loads the original audio preview and does not update to the newly generated file.

To reproduce this issue, please follow these steps:

  1. Generate a prompt
  2. Generate another prompt
  3. Observe that the UI only displays the original audio preview and does not update to the newly generated file.

I have checked the console output and it is clear that the script is not correctly processing numbered files.
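A sketch of the kind of collision-free naming this implies (next_free_name is a hypothetical helper, not the actual function in bark_webui.py):

```python
import os

def next_free_name(base: str, ext: str, exists=os.path.exists) -> str:
    """Return base.ext, or base_1.ext, base_2.ext, ... — the first name not taken."""
    candidate = f"{base}{ext}"
    counter = 0
    while exists(candidate):
        counter += 1
        candidate = f"{base}_{counter}{ext}"
    return candidate
```

The UI would then need to be handed this exact name, rather than reloading the original "audio.wav" path.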

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 189: character maps to <undefined>

 File "C:\Users\darkl\bark\bark_webui.py", line 172, in generate_audio_long_gradio
    trim_logs()
  File "C:\Users\darkl\bark\bark_webui.py", line 686, in trim_logs
    lines = f.readlines()
  File "C:\Users\darkl\mambaforge\envs\bark-infinity-oneclick\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 189: character maps to <undefined>
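The crash happens because trim_logs opens the log with the Windows default codec (cp1252), which has no mapping for bytes like 0x8f. Opening the file with an explicit UTF-8 encoding and a replacement error handler avoids the exception. A minimal sketch (read_log_lines is a hypothetical stand-in for the file read in trim_logs):

```python
def read_text_robust(raw: bytes) -> str:
    """Decode log bytes as UTF-8, replacing invalid bytes with U+FFFD instead of raising."""
    return raw.decode("utf-8", errors="replace")

def read_log_lines(path: str):
    """Open the log with an explicit encoding instead of the platform default (cp1252 on Windows)."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.readlines()
```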

Minor bugs

A few minor bugs, nothing game-breaking!

  • When there is a : in the text prompt, such as "TV AD: Blah blah text here", the script attempts to save the files with an unsanitized filename. On Windows, : is an illegal character in a filename, so no files are saved at all, only a broken 0 KB file with the name truncated at the :. It might be best to strip out everything that isn't regular text.

  • When executing the same prompt twice, any already existing speaker file is overwritten; only the wave file gets a _1 appended to its name. It would be nice if the numbering also applied to the speaker files.
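For the first bug, a minimal sketch of a Windows-safe filename sanitizer (sanitize_filename is illustrative, not the project's actual code):

```python
import re

# Windows forbids < > : " / \ | ? * and control characters in filenames;
# trailing dots and spaces are also invalid. Replace offenders with "_".
_ILLEGAL = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

def sanitize_filename(name: str) -> str:
    cleaned = _ILLEGAL.sub("_", name).rstrip(" .")
    return cleaned or "untitled"
```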

requirements.txt

Thanks for making this wrapper!
Is it possible to include a requirements.txt file for this project?

RuntimeError: Unrecognized CachingAllocator option: garbage_collection_threshold

While attempting to run the script as written in the readme, I get:

Traceback (most recent call last):
  File "D:\tmp\bark\bark-inf\bark_perform.py", line 3, in <module>
    from bark import SAMPLE_RATE, generate_audio, preload_models
  File "D:\tmp\bark\bark-inf\bark\__init__.py", line 1, in <module>
    from .api import generate_audio, text_to_semantic, semantic_to_waveform
  File "D:\tmp\bark\bark-inf\bark\api.py", line 3, in <module>
    from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
  File "D:\tmp\bark\bark-inf\bark\generation.py", line 24, in <module>
    torch.cuda.is_bf16_supported()
  File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 92, in is_bf16_supported
    return torch.cuda.get_device_properties(torch.cuda.current_device()).major >= 8 and cuda_maj_decide
  File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 481, in current_device
    _lazy_init()
  File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 216, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Unrecognized CachingAllocator option: garbage_collection_threshold

I have a GTX 1060 with 6 GB VRAM, Windows 10, and Python 3.10.6.
Oh, and PYTORCH_CUDA_ALLOC_CONF is set to garbage_collection_threshold:0.6,max_split_size_mb:128.
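The garbage_collection_threshold option is parsed when CUDA initializes, and a PyTorch build that predates the option rejects it with exactly this RuntimeError. If the goal is just to get past the crash, the variable can be cleared before any CUDA call runs (a sketch; upgrading to a PyTorch build that understands the option is the cleaner fix):

```python
import os

# PYTORCH_CUDA_ALLOC_CONF is read by the CUDA caching allocator at initialization.
# An older PyTorch that does not know garbage_collection_threshold raises
# "Unrecognized CachingAllocator option". Clearing it BEFORE `import torch`
# triggers any CUDA work sidesteps the error for this process.
os.environ.pop("PYTORCH_CUDA_ALLOC_CONF", None)
# import torch  # safe to import and use CUDA afterwards
```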

C-drive installation? How to launch from non-C?

The BAT file appears to expect that it was installed on C: under the user profile:
@echo off
call %USERPROFILE%\mambaforge\Scripts\activate.bat bark-infinity-oneclick
python %USERPROFILE%\bark\bark_webui.py
pause

But I installed on E:\Bark.
How do I modify this to run properly from E:?

a bit confusing instruction

So do I need to install both Mambaforge and Miniforge for Windows? Because when installing Mambaforge, I see nothing about Miniforge.

Low VRAM or CUDA out of memory for BARK INFINITY Solved

python bark_perform.py --use_smaller_models --text_prompt "Hello, world testing this for Bark Infinity since it's giving CUDA out of memory error so to overcome this error just simply use --use_smaller_models as args" --split_by_words 35

In the web UI version of Bark and in suno-ai/bark we can set os.environ["SUNO_USE_SMALL_MODELS"] = "True", but we can't in Infinity, so to overcome the CUDA out-of-memory error simply pass --use_smaller_models as an argument on the command line.
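For reference, the upstream trick alluded to here has to happen before Bark is imported, since the flag is read at module import time (a sketch for suno-ai/bark; in Infinity, --use_smaller_models is the supported route):

```python
import os

# Must be set BEFORE `from bark import ...` — the flag is read when the
# generation module is first imported, so setting it afterwards has no effect.
os.environ["SUNO_USE_SMALL_MODELS"] = "True"
# from bark import SAMPLE_RATE, generate_audio, preload_models
```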

Feature: Troubleshooting Mambaforge Installation in One-Click PowerShell Script

Hi everyone,

I have been working on creating a one-click PowerShell script to install and set up Bark from this repo. The script is designed to download and install Mambaforge, Git, and other required packages, then clone the Bark repository, create a virtual environment, and activate it.

However, I'm experiencing issues with the Mambaforge installation step. Although the script downloads the Mambaforge installer and seemingly starts the installation, it appears that nothing actually gets installed. The script then continues to the next steps, which eventually fail due to the missing Mambaforge installation.

I am looking for assistance in identifying and resolving the issue with the Mambaforge installation. If you have experience with PowerShell scripting, any suggestions or insights would be greatly appreciated. Your input will help improve the script and make the Bark setup process smoother and more efficient for users.

Thank you for your help!

if (-not ([Security.Principal.WindowsPrincipal] [Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole] "Administrator")) {
    Write-Host "This script requires Administrator privileges. Please run the script as Administrator."
    return
}

function Check-Error {
    param (
        [string]$ErrorMessage
    )

    if ($LASTEXITCODE -ne 0) {
        throw $ErrorMessage
    }
}

function Test-CommandExists {
    param (
        [string]$Command
    )

    try {
        Get-Command $Command -ErrorAction Stop | Out-Null
        return $true
    } catch {
        return $false
    }
}

if (-not (Test-CommandExists "mamba")) {
    # Download Mambaforge
    $mambaForgeExe = "Mambaforge-Windows-x86_64.exe"
    if (-not (Test-Path $mambaForgeExe)) {
        Write-Host "Downloading Mambaforge..."
        try {
            Invoke-WebRequest -Uri "https://github.com/conda-forge/miniforge/releases/latest/download/$mambaForgeExe" -OutFile $mambaForgeExe
        } catch {
            throw "Failed to download Mambaforge: $_"
        }
    } else {
        Write-Host "Mambaforge installer already downloaded, skipping download."
    }

    # Install Mambaforge
    Write-Host "Installing Mambaforge..."
    try {
        # Note: %UserProfile% is cmd.exe syntax and is NOT expanded by PowerShell,
        # so the installer would receive the literal path. Use $Env:UserProfile.
        Start-Process -FilePath ".\$mambaForgeExe" -ArgumentList "/S /D=$Env:UserProfile\miniforge" -Wait
    } catch {
        throw "Failed to install Mambaforge: $_"
    }

    # Add Mambaforge to Path
    $Env:Path = "$Env:UserProfile\miniforge\Scripts;$Env:Path"
} else {
    Write-Host "Mambaforge is already installed, skipping download and installation."
}

# Check if git is installed
if (-not (Test-CommandExists "git")) {
    $gitExe = "Git-2.40.1-64-bit.exe"
    if (-not (Test-Path $gitExe)) {
        Write-Host "Downloading Git..."
        Invoke-WebRequest -Uri "https://github.com/git-for-windows/git/releases/download/v2.40.1.windows.1/$gitExe" -OutFile $gitExe
    } else {
        Write-Host "Git installer already downloaded, skipping download."
    }

    Write-Host "Installing Git..."
    Start-Process -FilePath ".\$gitExe" -ArgumentList "/VERYSILENT" -Wait
} else {
    Write-Host "Git is already installed, skipping installation."
}

# Clone Bark repository
Write-Host "Cloning Bark repository..."
if (-not (Test-Path ".\bark")) {
    git clone https://github.com/JonathanFly/bark.git
    Check-Error "Failed to clone Bark repository"
} else {
    Write-Host "Bark repository already exists, skipping cloning."
}

# Change to Bark directory
Set-Location ".\bark"

# Create and activate environment
Write-Host "Creating and activating environment..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe env create -f environment-cuda.yml
Check-Error "Failed to create environment"

Write-Host "Activating environment..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe activate bark-infinity-oneclick
Check-Error "Failed to activate environment"

# Install additional packages
Write-Host "Installing additional packages..."
pip install encodec
pip install rich-argparse

# Uninstall and reinstall soundfile
Write-Host "Fixing soundfile installation..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe uninstall pysoundfile
pip install soundfile

Write-Host "Installation complete. Please use the Miniforge Prompt to start Bark."

Here is the current PowerShell script: one_click_install.zip

Error when run "python bark_webui.py"

Traceback (most recent call last):
  File "/home/xxx/sound/bark/bark_webui.py", line 350, in <module>
    with gr.Blocks(theme=default_theme,css=bark_console_style) as demo:
  File "/home/xxx/.local/lib/python3.11/site-packages/gradio/blocks.py", line 1285, in __exit__
    self.config = self.get_config_file()
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/.local/lib/python3.11/site-packages/gradio/blocks.py", line 1261, in get_config_file
    "input": list(block.input_api_info()),  # type: ignore
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/anaconda3/envs/bark/lib/python3.11/site-packages/gradio_client/serializing.py", line 40, in input_api_info
    return (api_info["serialized_input"][0], api_info["serialized_input"][1])
            ~~~~~~~~^^^^^^^^^^^^^^^^^^^^
KeyError: 'serialized_input'

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Loading Bark models...
Loading text model from C:\Users\Ariana\.cache\suno\bark_v0\text_2.pt to cpu
Loading coarse model from C:\Users\Ariana\.cache\suno\bark_v0\coarse_2.pt to cpu
Traceback (most recent call last):
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_perform.py", line 128, in <module>
    main(namespace_args)
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_perform.py", line 95, in main
    generation.preload_models(args.text_use_gpu, args.text_use_small, args.coarse_use_gpu, args.coarse_use_small, args.fine_use_gpu, args.fine_use_small, args.codec_use_gpu, args.force_reload)
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 888, in preload_models
    _ = load_model(
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 781, in load_model
    model = _load_model_f(ckpt_path, device)
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 813, in _load_model
    checkpoint = torch.load(ckpt_path, map_location=device)
  File "C:\Users\Ariana\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "C:\Users\Ariana\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

miniforge annoyances in regular windows version

I mostly test in WSL2 and just noticed that Miniforge on plain Windows has some annoying bits. The console shows errors even though everything still works, and you can't Ctrl-C it; it just sits there.

I can't use certain formatting in prompts!

When I attempt to use the symbol ♪ in the prompt to indicate a singing voice, it just gives me an error. Do I need to format it in a specific way (besides this: ♪ I'm the king of the jungle ♪), or is there a setting I need to check first? (Typing [singing] seems to sometimes work, making the voice sing what comes after it. Is that intentional?)

Also, using speaker labels like (Man: Hi ... Woman: Hello) isn't consistent at all. Is there a setting I need to adjust for it to work?

made batch file for autorun and probably an issue with --history_prompt

I made a quick batch file to run the script with args. On the weekend I will make a GUI with tkinter so that anyone can copy-paste their prompt and run the program with their selected speaker and settings. If anyone wants to take over, I will provide the files I have made.

And many thanks to OP for making such awesome AI work with larger text.

BTW, I noticed one thing: when it splits the text, the first generation reports the chosen speaker for --history_prompt in gen:, but the following generations report --history_prompt in gen: none. Could that be fixed?
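The behaviour described, where only the first chunk sees the chosen speaker, matches a loop that forgets to thread the voice history forward. A sketch of the intended pattern (generate_fn is a stand-in for the real Bark generation call, assumed to return the audio plus a reusable history):

```python
def generate_chunks(chunks, speaker, generate_fn):
    """Generate each text chunk, feeding each chunk's output history into the
    next call so the voice stays consistent across splits.

    generate_fn(text, history_prompt) -> (audio, new_history) is hypothetical;
    `speaker` (e.g. a speaker preset name) seeds only the first chunk.
    """
    history = speaker
    audio_parts = []
    for chunk in chunks:
        audio, history = generate_fn(chunk, history)
        audio_parts.append(audio)
    return audio_parts
```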

OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.

D:\AI\bark>python bark_perform.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes... or can you? (and the next page, and the next page...)" --split_by_words 35
Traceback (most recent call last):
  File "D:\AI\bark\bark_perform.py", line 3, in <module>
    from bark import SAMPLE_RATE, generate_audio, preload_models
  File "D:\AI\bark\bark\__init__.py", line 1, in <module>
    from .api import generate_audio, text_to_semantic, semantic_to_waveform
  File "D:\AI\bark\bark\api.py", line 3, in <module>
    from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
  File "D:\AI\bark\bark\generation.py", line 7, in <module>
    from encodec import EncodecModel
  File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\encodec\__init__.py", line 12, in <module>
    from .model import EncodecModel
  File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\encodec\model.py", line 14, in <module>
    import torch
  File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\__init__.py", line 133, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.

I tried to reinstall torch, but it still didn't work.

torch.cuda.OutOfMemoryError: CUDA out of memory

I am getting the out-of-memory issue on my 8 GB card. I noticed you had posted a comment about using --use_smaller_models. I installed the GUI version from the instructions and then updated the files with the ones in your repo. I am unsure where to put --use_smaller_models and would appreciate some help. Thank you.

problem with bark models during test run

Hi, when I try to run the test voice, Bark starts to download a few GB of models:

python3 bark_perform.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes... or can you? (and the next page, and the next page...)" --split_by_words 35
Loading Bark models...

Finally, it gives me:

No GPU being used. Careful, Inference might be extremely slow!
No GPU being used. Careful, Inference might be extremely slow!
No GPU being used. Careful, Inference might be extremely slow!
Downloading: "https://dl.fbaipublicfiles.com/encodec/v0/encodec_24khz-d7cc33bc.th" to /Users/paulinajaskulska/.cache/torch/hub/checkpoints/encodec_24khz-d7cc33bc.th
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1071, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark_perform.py", line 312, in <module>
    main(args)
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark_perform.py", line 245, in main
    preload_models()
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 312, in preload_models
    _ = load_codec_model(use_gpu=use_gpu, force_reload=True)
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 290, in load_codec_model
    model = _load_codec_model(device)
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 254, in _load_codec_model
    model = EncodecModel.encodec_model_24khz()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/encodec/model.py", line 279, in encodec_model_24khz
    state_dict = EncodecModel._get_pretrained(checkpoint_name, repository)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/encodec/model.py", line 262, in _get_pretrained
    return torch.hub.load_state_dict_from_url(url, map_location='cpu', check_hash=True)  # type:ignore
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/hub.py", line 746, in load_state_dict_from_url
    download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/hub.py", line 611, in download_url_to_file
    u = urlopen(req)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>

what can I do about it?
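On macOS framework builds of Python (as in this traceback), CERTIFICATE_VERIFY_FAILED usually means the root CA certificates were never installed for that interpreter; running the bundled "Install Certificates.command" for your Python version is the usual fix. Alternatively, the SSL machinery can be pointed at an explicit CA bundle before the download runs. A sketch (use_ca_bundle is a hypothetical helper; the PEM path is assumed to exist, e.g. certifi.where() if the certifi package is installed):

```python
import os

def use_ca_bundle(pem_path: str) -> None:
    """Point Python's SSL default verify paths (and requests) at an explicit
    CA bundle. Must be called before the download starts in this process."""
    os.environ["SSL_CERT_FILE"] = pem_path
    os.environ["REQUESTS_CA_BUNDLE"] = pem_path
```

For example, call use_ca_bundle(certifi.where()) at the top of bark_perform.py before preload_models() runs.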

Installation stuck

Hello!
Thank you for making this application.

I find it very interesting, but I'm stuck because of the inconvenient installation process.
It would be great for someone like me who isn't familiar with Python if you provided a full installer or a step-by-step installation guide,
because it's frustrating to figure out how to install Python and pip and how to use them.
Thank you for your work.

is this mamba thing really required?

Actually, I just created a virtual environment, did a git clone, and installed the pip requirements.
It works fine so far, and the web UI starts; however, it says voice generation only works on the CPU. So I guess the mamba thing is for CUDA?
Automatic1111 seems to work without mamba.

Long text file change voices in the middle

I tried the new version. There is no --stable-voice option in the commands...
Even though I selected en_speaker_1, the voice changed in the middle of the process. Can this be fixed? Sending the file.

of_finding_him_so_de-SPK-en_speaker_1.mp4

UnicodeEncodeError: 'charmap' codec can't encode characters in position 122-241:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 122-241: character maps to <undefined>
*** You may need to add PYTHONIOENCODING=utf-8 to your environment ***

This error seems to appear at random and does not go away. I have no idea what causes it. Sometimes the prompts work; other times the same prompt triggers the error, and it just keeps getting worse after it starts.
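The error comes from printing (or logging) characters that the Windows console codec, cp1252, cannot represent. Setting PYTHONIOENCODING=utf-8 before launching Python, as the message suggests, fixes it at the source; a sketch of the failure mode and a replacement-based fallback (safe_console_text is illustrative, not project code):

```python
def safe_console_text(text: str, encoding: str) -> str:
    """Round-trip text through a console encoding, substituting '?' for
    characters the codec cannot represent instead of raising."""
    return text.encode(encoding, errors="replace").decode(encoding)

print(safe_console_text("♪ la la la ♪", "cp1252"))  # prints "? la la la ?"
```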

WebUI won't open.

I pulled to the newest version of bark, installed the requirements using
pip install -r requirements-pip.txt
and tried running the webui. However, it errors out here:

Traceback (most recent call last):
  File "/home/rlt/bark/bark_webui.py", line 5, in <module>
    import gradio as gr
  File "/home/rlt/.local/lib/python3.10/site-packages/gradio/__init__.py", line 3, in <module>
    import gradio.components as components
  File "/home/rlt/.local/lib/python3.10/site-packages/gradio/components.py", line 55, in <module>
    from gradio import processing_utils, utils
  File "/home/rlt/.local/lib/python3.10/site-packages/gradio/utils.py", line 38, in <module>
    import matplotlib
  File "/home/rlt/.local/lib/python3.10/site-packages/matplotlib/__init__.py", line 107, in <module>
    from collections import MutableMapping
ImportError: cannot import name 'MutableMapping' from 'collections' (/usr/lib/python3.10/collections/__init__.py)

I am using Python 3.10.6. Any advice? Same result in miniconda.

Thanks in advance.
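For reference, Python 3.10 removed the long-deprecated ABC aliases from the collections module; they now live only in collections.abc. A matplotlib old enough to import MutableMapping from collections therefore fails on 3.10, and upgrading matplotlib (pip install -U matplotlib) is the real fix. The import-path change itself:

```python
# Fails on Python 3.10+:
#   from collections import MutableMapping   -> ImportError
# Works on all currently supported versions:
from collections.abc import MutableMapping

# dict satisfies the MutableMapping ABC.
assert issubclass(dict, MutableMapping)
```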

Request for Multiple Pass Quality Assurance for Voice Generation

I would like to request the implementation of a multiple pass quality assurance process for the voice generation program. The aim of this process is to regenerate any output audio segments that do not meet a specified quality standard.

According to ChatGPT, a quality assessment function can be created to evaluate the output audio segment. This function can use a combination of signal processing and machine learning techniques to analyze the audio and determine if it is distorted or of poor quality. Some useful metrics for this assessment include signal-to-noise ratio (SNR), total harmonic distortion (THD), and perceptual evaluation of speech quality (PESQ).

I believe that implementing this multiple-pass quality assurance process would significantly improve the overall quality of the generated voice output.
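Of the metrics mentioned, SNR is the simplest to sketch in plain Python (THD and PESQ need proper DSP libraries). A hypothetical per-segment gate might look like this, assuming the noise component has somehow been estimated separately:

```python
import math

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB: 10 * log10(P_signal / P_noise),
    where power is the sum of squared samples."""
    p_signal = sum(s * s for s in signal)
    p_noise = sum(n * n for n in noise)
    return 10 * math.log10(p_signal / p_noise)

def needs_regeneration(signal, noise, threshold_db=20.0):
    """Flag a segment for another generation pass when its SNR falls below
    the quality threshold (threshold value chosen arbitrarily here)."""
    return snr_db(signal, noise) < threshold_db
```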

summarizing the contributions

  1. voices are randomly summoned without a history prompt. bark infinity lets you persist these voices for future use.
  2. chunk arbitrarily long texts into smaller pieces, then generate each separately.
  3. travolta mode: ignore the EOS token and keep on generating audio.

super cool stuff!
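Point 2 is easy to sketch in plain Python; this mirrors the idea behind the --split_by_words flag (a sketch, not the project's actual splitter):

```python
def split_by_words(text: str, words_per_chunk: int):
    """Split text into chunks of at most words_per_chunk whitespace-separated
    words, preserving word order across chunks."""
    words = text.split()
    return [
        " ".join(words[i : i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]
```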

cuDNN Version Incompatibility

Hi there, getting an error just before my audio file is generated:

RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (8, 7, 0) but found runtime version (8, 6, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.

Any help on this would be appreciated. Thank you!

Full log before it breaks -

--Segment 1/1: est. 0.40s
test
Loading text model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\text_2.pt to cpu
_load_model model loaded: 312.3M params, 1.269 loss                      generation.py:840
Loading coarse model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\coarse_2.pt to cpu
_load_model model loaded: 314.4M params, 2.901 loss                      generation.py:840
Loading fine model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\fine_2.pt to cpu
_load_model model loaded: 302.1M params, 2.079 loss                      generation.py:840
Traceback (most recent call last):
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\blocks.py", line 1075, in process_api
    result = await self.call_function(
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\blocks.py", line 884, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\to_thread.py", line 28, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\_backends\_asyncio.py", line 818, in run_sync_in_worker_thread
    return await future
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\_backends\_asyncio.py", line 754, in run
    result = context.run(func, *args)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\helpers.py", line 587, in tracked_fn
    response = fn(*args)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_webui.py", line 205, in generate_audio_long_gradio
    full_generation_segments, audio_arr_segments, final_filename_will_be = api.generate_audio_long_from_gradio(**kwargs)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 467, in generate_audio_long_from_gradio
    full_generation_segments, audio_arr_segments, final_filename_will_be = generate_audio_long(**kwargs)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 577, in generate_audio_long
    full_generation, audio_arr = generate_audio_barki(text=segment_text, **kwargs)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 434, in generate_audio_barki
    audio_arr = codec_decode(fine_tokens)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\generation.py", line 747, in codec_decode
    model.to(models_devices["codec"])
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 1145, in to
    return self._apply(convert)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 202, in _apply
    self._init_flat_weights()
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 139, in _init_flat_weights
    self.flatten_parameters()
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 169, in flatten_parameters
    not torch.backends.cudnn.is_acceptable(fw.data)):
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\backends\cudnn\__init__.py", line 97, in is_acceptable
    if not _init():
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\backends\cudnn\__init__.py", line 60, in _init
    raise RuntimeError(base_error_msg)
RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (8, 7, 0) but found runtime version (8, 6, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.

new API interfaces for the new methods

As an integrator, I would like access via bark.api to the same logic that the bark_perform.py script uses.

Please relocate these methods into the api module, and have the perform script import them from there instead.

Additionally, the docstrings of the methods you updated in the api were not updated to match: the functions now confusingly return a tuple instead of the audio array.

"The requested array has an inhomogeneous shape after 1 dimensions"

Windows 10, RTX 4090

  1. Git cloned the repo
  2. missing encodec - install it via "pip install -U encodec"
  3. missing funcy - install it via "pip install -U funcy"
  4. missing scipy - install it via "pip install -U scipy"
  5. then the following error occurred:

E:\Magazyn\Grafika\AI\Text2Voice\bark>python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3
Loading Bark models...
Models loaded.
Estimated time: 6.00 seconds.
Generating: It is a mistake to think you can solve any major problems just with potatoes.
Using speaker: en_speaker_3
history_prompt in gen: en_speaker_3
en_speaker_3
aa
100%|████████████████████████████████████████████████████████████████| 100/100 [00:12<00:00,  8.20it/s]
100%|██████████████████████████████████████████████████████████████████| 31/31 [00:35<00:00,  1.15s/it]
Traceback (most recent call last):
  File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 175, in <module>
    main(args)
  File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 157, in main
    gen_and_save_audio(prompt, history_prompt, text_temp, waveform_temp, filename, output_dir)
  File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 97, in gen_and_save_audio
    save_audio_to_file(filename, audio_array, SAMPLE_RATE, output_dir=output_dir)
  File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 63, in save_audio_to_file
    sf.write(filepath, audio_array, sample_rate, format=format, subtype=subtype)
  File "C:\Users\jurandfantom\miniconda3\lib\site-packages\soundfile.py", line 338, in write
    data = np.asarray(data)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

E:\Magazyn\Grafika\AI\Text2Voice\bark>
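The "(2,) + inhomogeneous part" shape in the error suggests soundfile was handed a 2-tuple rather than a plain waveform, consistent with the api change noted above where the generate function now returns a tuple instead of the audio array. A minimal sketch of the failure mode and fix, using stand-in data rather than Bark's actual code (mock_generate_audio is hypothetical):

```python
import numpy as np

def mock_generate_audio():
    """Stand-in for the updated API, which now returns a tuple
    (full_generation, audio_array) rather than just the audio array."""
    full_generation = {"semantic_prompt": np.zeros(3)}   # metadata dict
    audio_array = np.zeros(24000, dtype=np.float32)      # 1-D waveform
    return full_generation, audio_array

result = mock_generate_audio()
# Passing `result` straight to sf.write makes soundfile call np.asarray on a
# ragged 2-tuple, which raises the "inhomogeneous shape" ValueError.
# Unpack the tuple and write only the 1-D waveform:
full_generation, audio_array = result
print(audio_array.ndim)  # 1
```

So if bark_speak.py still passes the raw return value to save_audio_to_file, unpacking it first should resolve the error.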

is_bf16_supported

Traceback (most recent call last):
  File "D:\5118\movielearning\testbark\test.py", line 1, in <module>
    from bark import SAMPLE_RATE, generate_audio
  File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\__init__.py", line 1, in <module>
    from .api import generate_audio, text_to_semantic, semantic_to_waveform
  File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\api.py", line 5, in <module>
    from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
  File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\generation.py", line 24, in <module>
    torch.cuda.is_bf16_supported()
AttributeError: module 'torch.cuda' has no attribute 'is_bf16_supported'

Installing on MacOS -- No CUDA available?

I'm trying to install on MacOS 13.2.1. When I get to mamba env create -f environment-cuda.yml it gives me the error

Could not solve for environment specs
The following packages are incompatible
├─ cudatoolkit 11.8.0**  does not exist (perhaps a typo or a missing channel);
└─ pytorch-cuda 11.8**  is uninstallable because it requires
   └─ cuda 11.8.* , which does not exist (perhaps a missing channel).

On NVIDIA's website it says "NVIDIA® CUDA Toolkit 11.8 no longer supports development or running applications on macOS."

So am I out of luck? Does anyone know a way to get this running on MacOS?

Thanks!
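Since the CUDA packages cannot be solved on macOS, one workaround is a CPU-only environment spec with the CUDA-specific packages removed. A hypothetical sketch (the file name and exact package list are assumptions, not something the repo ships):

```yaml
# environment-cpu.yml (sketch): same environment minus cudatoolkit / pytorch-cuda
name: bark-infinity
channels:
  - pytorch
  - conda-forge
dependencies:
  - python=3.10
  - pytorch        # CPU build; on Apple silicon, torch can use the MPS backend
  - torchaudio
  - pip
```

Then `mamba env create -f environment-cpu.yml` should solve, with the caveat that CPU inference will be much slower.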

I can't get it to run on the GPU

This is what I get.

Loading Bark models... No GPU being used. Careful, Inference might be extremely slow!

I have a 2080 Ti and it works with SD and GPTs.

Any ideas?
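Two common causes when torch ignores a working GPU are a CPU-only PyTorch wheel, or a shell environment that hides the devices. The second can be checked without loading any models; a small diagnostic sketch (gpus_hidden_by_env is a hypothetical helper, not part of Bark):

```python
import os

def gpus_hidden_by_env(env):
    """True if CUDA_VISIBLE_DEVICES is set to "" or "-1", which hides
    every GPU from CUDA applications regardless of the hardware."""
    value = env.get("CUDA_VISIBLE_DEVICES")
    return value is not None and value.strip() in ("", "-1")

print(gpus_hidden_by_env({"CUDA_VISIBLE_DEVICES": ""}))  # True: GPUs hidden
print(gpus_hidden_by_env(dict(os.environ)))              # check your own shell
```

If the environment is clean, verifying that the installed torch build was compiled with CUDA support (rather than the CPU-only package) is the next thing to rule out.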
