sortanon / controllabletalknet
A web app that lets you play around with TalkNet models
License: GNU Affero General Public License v3.0
IIRC there are other compatible GANs and a lot of new stuff coming out. Is UnivNet possible? Fre-GAN 2?
Hello SortAnon, please update the model lists and make new models if possible. Regardless of other tech, TalkNet still has its benefits.
What does Drive ID for custom model imply? the full directory to the file? This is confusing.
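One plausible reading, assuming the field expects the file ID portion of a Google Drive share link and not a local path (nothing on this page confirms that): the ID is the token after `id=` or between `/d/` and the next slash. The helper name `extract_drive_id` is hypothetical.

```python
import re
from typing import Optional

# Hypothetical helper: pull the file ID out of common Google Drive link shapes.
_DRIVE_PATTERNS = (
    re.compile(r"[?&]id=([A-Za-z0-9_-]+)"),  # .../uc?id=<ID> or .../open?id=<ID>
    re.compile(r"/d/([A-Za-z0-9_-]+)"),      # .../file/d/<ID>/view
)


def extract_drive_id(link_or_id: str) -> Optional[str]:
    """Return the Drive file ID found in a share link, or None if none is found."""
    text = link_or_id.strip()
    for pattern in _DRIVE_PATTERNS:
        match = pattern.search(text)
        if match:
            return match.group(1)
    # A bare token with no URL punctuation is treated as an ID already.
    if re.fullmatch(r"[A-Za-z0-9_-]+", text):
        return text
    return None
```

Under this assumption, a full directory path would not be accepted, since it contains characters that never appear in a Drive ID.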
I wonder if it's possible to add Derpy Whooves and Doctor Whooves, and other ponies like Gallus and some of the others like that.
I've been running into a ton of issues when running setup.bat on my Windows 10 laptop. I think I fixed the initial issues ("io.h not found", "error cannot open file 'kernel32.lib'") by setting up environment variables, but now there's constant errors that seem to appear when attempting to build a wheel.
I'm probably an idiot and might not have installed something properly, since I don't have any knowledge of python, c++, or errors in general, but I would appreciate any pointers that could help me fix these issues and run your webapp offline. Thank you!
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
Did I set Docker up wrong?
I noticed the current version of Dash requires an older Werkzeug. This pull request edits the Dockerfile to force that dependency to downgrade to a working version.
It appears that the two Google Colab notebooks for training and using Controllable TalkNet do not work properly anymore. I constantly get errors that various modules are missing and even when I add lines to install them manually, some of them still don't work (NeMo in particular). These notebooks used to work for me with no problems before, but not anymore. It appears that many of the dependencies have been updated and function differently. If that is the case, will the notebooks be updated at some point?
please make these two models for talk net
fid = open(filename, 'rb')
IsADirectoryError: [Errno 21] Is a directory: '/home/jordancruz/Tools/ControllableTalkNet/training/basil-training-data/basil-data/Basiliska-the-lamia-locally-trained/_training_data/wavs/'
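The path in the error ends in `wavs/`, which suggests (an assumption, since the report does not show the file list) that one entry in the training file list has an empty file name, so the wavs directory itself gets opened. A minimal sketch that scans a list in an assumed `filename.wav|transcript` format for such entries; `find_bad_entries` is a hypothetical helper.

```python
import os
from typing import List, Tuple


def find_bad_entries(lines: List[str], wavs_dir: str) -> List[Tuple[int, str]]:
    """Return (line_number, reason) for entries that would not open as audio files.

    Assumes each line looks like 'filename.wav|transcript' (an assumed format).
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        name = line.split("|", 1)[0].strip()
        path = os.path.join(wavs_dir, name)
        if not name:
            # An empty name makes the joined path point at the directory itself.
            problems.append((number, "empty file name"))
        elif os.path.isdir(path):
            problems.append((number, "path is a directory"))
        elif not os.path.isfile(path):
            problems.append((number, "file not found"))
    return problems
```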
FYI:
I would have posted this bug report to the PPP first, but due to said area not having an "audio captcha" for Visually Impaired Anons, this is the next best thing.
After the interface loads and I navigate via the tab key to reach the combo box featuring a list of voices, nothing speaks when I arrow down through said choices. The only way I am able to confirm my choice is to tab once then arrow back by line. Otherwise, I need to shift-tab back or arrow up to the combo box itself to choose a different voice.
In addition, if there is a way you could please point out where the button is for accessing the folder icon (by mentioning the icon's title), that would be most helpful.
NB. I use the VoiceOver screen reader created by Apple, thus Mileage may vary with other assistive technology products found on either Linux, Windows or mobile.
Hello SortAnon, please solve this issue. New anons coming onto the scene are unable to download the model because it says "download failed" even though the link is completely fine.
Hi, I have a question. I’ve been trying to use controllable Talknet on colab to synthesize speech using an audio reference. This seems to work fine for very short samples but when trying somewhat longer reference audio colab does not seem to work.
To get around this problem I tweaked some bits of the code to make it run locally on a jupyter notebook. I can use the TTS functionality just fine, however, I am not able to synthesize using reference audio of any length. When I try I get this error message:
line 437, in tensors=[torch.repeat_interleave(text1, durs1) for text1, durs1 in zip(x, reps)], value=pad, dtype=x.dtype, RuntimeError: repeats has to be Long tensor.
Is this something you ever encountered?
Starting TalkNet server. Close this window to shut down the server.
Traceback (most recent call last):
File "talknet_offline.py", line 3, in <module>
from controllable_talknet import *
File "C:\Users\alex_\Desktop\talknet controller\ControllableTalkNet\controllable_talknet.py", line 5, in <module>
from jupyter_dash import JupyterDash
ModuleNotFoundError: No module named 'jupyter_dash'
Hello,
I've noticed that in the singing models there is a TalkNetSinger.nemo file that is not present in the non-singing models; however, there is nothing in the training code provided in the offline training notebook wrt generating this file. How do we generate this file?
Hello, I'm getting the following errors when trying to run a Docker container:
root@DESKTOP-7A6UGRU:/home/snufas/github_projects/docker_talknet# docker run -it --gpus all -p 8050:8050 talknet-offline
Updating TalkNet...
Updating HiFi-GAN...
Updating Python dependencies...
ERROR: pytorch-lightning 1.7.0 has requirement tensorboard>=2.9.1, but you'll have tensorboard 2.4.1 which is incompatible.
ERROR: pytorch-lightning 1.7.0 has requirement torch>=1.9.*, but you'll have torch 1.8.1+cu111 which is incompatible.
ERROR: pytorch-lightning 1.7.0 has requirement typing-extensions>=4.0.0, but you'll have typing-extensions 3.7.4.3 which is incompatible.
ERROR: tensorboard 2.9.1 has requirement protobuf<3.20,>=3.9.2, but you'll have protobuf 3.20.1 which is incompatible.
ERROR: pytorch-lightning 1.7.0 has requirement torch>=1.9.*, but you'll have torch 1.8.1+cu111 which is incompatible.
ERROR: pytorch-lightning 1.7.0 has requirement typing-extensions>=4.0.0, but you'll have typing-extensions 3.7.4.3 which is incompatible.
Launching TalkNet...
Traceback (most recent call last):
File "talknet_offline.py", line 3, in <module>
from controllable_talknet import *
File "/talknet/controllable_talknet.py", line 3, in <module>
import dash
File "/usr/local/lib/python3.8/dist-packages/dash/__init__.py", line 5, in <module>
from .dash import Dash, no_update # noqa: F401,E402
File "/usr/local/lib/python3.8/dist-packages/dash/dash.py", line 20, in <module>
import flask
File "/usr/local/lib/python3.8/dist-packages/flask/__init__.py", line 4, in <module>
from . import json as json
File "/usr/local/lib/python3.8/dist-packages/flask/json/__init__.py", line 8, in <module>
from ..globals import current_app
File "/usr/local/lib/python3.8/dist-packages/flask/globals.py", line 56, in <module>
app_ctx: "AppContext" = LocalProxy( # type: ignore[assignment]
TypeError: __init__() got an unexpected keyword argument 'unbound_message'
Can you update the Dockerfile?
Thanks
When trying to run the "# Extract phoneme duration" step of the TalkNet_Training_Offline notebook, I'm getting random errors in the backward_extractor function. See the output below:
[NeMo I 2023-05-29 13:30:11 features:252] PADDING: 1
[NeMo I 2023-05-29 13:30:11 features:262] STFT using conv
[NeMo I 2023-05-29 13:30:12 modelPT:439] Model EncDecCTCModel was successfully restored from /home/mmmmllll1/.cache/torch/NeMo/NeMo_1.0.2/qn5x5_libri_tts_phonemes/656c7439dd3a0d614978529371be498b/qn5x5_libri_tts_phonemes.nemo.
[NeMo I 2023-05-29 13:30:13 collections:173] Dataset loaded with 642 files totalling 0.67 hours
[NeMo I 2023-05-29 13:30:13 collections:174] 0 files were filtered totalling 0.00 hours
18% 114/642 [00:48<02:57, 2.98it/s]
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[18], line 94
91 target_tokens = preprocess_tokens(seq_ids, blank_id)
93 f, p = forward_extractor(target_tokens, log_probs, blank_id)
---> 94 durs = backward_extractor(f, p)
96 dur_key = Path(dl.dataset.collection[sample_idx].audio_file).stem
97 dur_data[dur_key] = {
98 'blanks': torch.tensor(durs[::2], dtype=torch.long).cpu().detach(),
99 'tokens': torch.tensor(durs[1::2], dtype=torch.long).cpu().detach()
100 }
Cell In[18], line 45, in backward_extractor(f, p)
43 t -= 1
44 assert durs.shape[0] == n
---> 45 assert np.sum(durs) == m
46 assert np.all(durs[1::2] > 0)
47 return durs
AssertionError:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[20], line 94
91 target_tokens = preprocess_tokens(seq_ids, blank_id)
93 f, p = forward_extractor(target_tokens, log_probs, blank_id)
---> 94 durs = backward_extractor(f, p)
96 dur_key = Path(dl.dataset.collection[sample_idx].audio_file).stem
97 dur_data[dur_key] = {
98 'blanks': torch.tensor(durs[::2], dtype=torch.long).cpu().detach(),
99 'tokens': torch.tensor(durs[1::2], dtype=torch.long).cpu().detach()
100 }
Cell In[20], line 41, in backward_extractor(f, p)
39 s, t = n - 1, m
40 while s > 0:
---> 41 durs[s - 1] += 1
42 s -= p[s, t]
43 t -= 1
IndexError: index 4720093899646973286 is out of bounds for axis 0 with size 49
I'm unsure how to debug what is causing these issues. I assume there is something wrong with my training input?
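One way to narrow this down, sketched under the assumption that the failures are tied to specific clips: run the extraction per sample inside a try/except and record which files fail, then inspect those clips and their transcripts. `collect_failures` and its `extract` callback are hypothetical stand-ins for the notebook's loop body, not code from the notebook.

```python
from typing import Callable, Dict, Iterable, Tuple


def collect_failures(
    samples: Iterable[Tuple[str, object]],
    extract: Callable[[object], object],
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Run `extract` on every (name, data) pair; keep results and errors apart."""
    results: Dict[str, object] = {}
    failures: Dict[str, str] = {}
    for name, data in samples:
        try:
            results[name] = extract(data)
        except (AssertionError, IndexError) as err:
            # Record the exception type and message instead of stopping the loop.
            failures[name] = f"{type(err).__name__}: {err}"
    return results, failures
```

The keys of `failures` then give a short list of files to listen to and compare against their transcripts.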
Using the instructions given as-is, I'm running into this as the sequence of boot events every time I go to run sudo docker run -it --gpus all -p 8050:8050 talknet-offline:
Updating TalkNet...
Updating HiFi-GAN...
Updating Python dependencies...
Launching TalkNet...
2023-04-09 23:50:33.232655: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
[NeMo W 2023-04-09 23:50:34 optimizers:47] Apex was not found. Using the lamb optimizer will error out.
Traceback (most recent call last):
File "talknet_offline.py", line 3, in <module>
from controllable_talknet import *
File "/talknet/controllable_talknet.py", line 14, in <module>
from nemo.collections.tts.models import TalkNetSpectModel
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/__init__.py", line 15, in <module>
import nemo.collections.tts.data
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/data/__init__.py", line 15, in <module>
import nemo.collections.tts.data.datalayers
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/tts/data/datalayers.py", line 58, in <module>
from nemo.collections.asr.parts.preprocessing.features import WaveformFeaturizer
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/asr/__init__.py", line 15, in <module>
from nemo.collections.asr import data, losses, models, modules
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/asr/models/__init__.py", line 16, in <module>
from nemo.collections.asr.models.classification_models import EncDecClassificationModel
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/asr/models/classification_models.py", line 28, in <module>
from nemo.collections.asr.data import audio_to_label_dataset
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/asr/data/audio_to_label_dataset.py", line 15, in <module>
from nemo.collections.asr.data import audio_to_label
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/asr/data/audio_to_label.py", line 23, in <module>
from nemo.collections.asr.parts.preprocessing.segment import available_formats as valid_sf_formats
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/asr/parts/preprocessing/__init__.py", line 16, in <module>
from nemo.collections.asr.parts.preprocessing.features import (
File "/usr/local/lib/python3.8/dist-packages/nemo/collections/asr/parts/preprocessing/features.py", line 42, in <module>
from librosa.util import tiny
File "/usr/local/lib/python3.8/dist-packages/lazy_loader/__init__.py", line 76, in __getattr__
submod = importlib.import_module(submod_path)
File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/usr/local/lib/python3.8/dist-packages/librosa/util/utils.py", line 17, in <module>
from numpy.typing import ArrayLike, DTypeLike
ModuleNotFoundError: No module named 'numpy.typing'
The most I can work out so far is that some packages fell into dependency hell and are expecting numpy to be 1.20 and not 1.19.2; I can't figure out the issues with loading NeMo. The Windows build works fine with multiboot, but this error seems to persist across anything Arch related, seeing as it's happened on two fresh installs along with Manjaro.
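`numpy.typing` was added in NumPy 1.20, so a 1.19.x install cannot satisfy that import. A small sketch of a version check that could be run inside the container; `supports_numpy_typing` is a hypothetical helper and only compares version strings.

```python
import re


def supports_numpy_typing(version: str) -> bool:
    """True if a NumPy version string is at least 1.20 (when numpy.typing appeared)."""
    numbers = []
    for piece in version.split(".")[:2]:
        # Keep only the leading digits so suffixes such as 'rc1' are ignored.
        match = re.match(r"\d+", piece)
        numbers.append(int(match.group()) if match else 0)
    while len(numbers) < 2:
        numbers.append(0)
    return tuple(numbers) >= (1, 20)
```

In practice the argument would be `numpy.__version__` as reported inside the failing environment.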
Please implement Derpy in the next model update, talking and singing, and maybe Doctor Whooves as well if possible.
There's a problem: it reports that the tensorflow_hub module is not found. When I delete that line from extracts.py it works, but it is in some kind of beta and isn't as good as the stable build before. Please fix this, it's a major issue.
In Poland, voice cloning AI is very popular, but tacotron2 does not allow adding emotions and singing. TalkNet technology seems to be brilliant, I would like to make a version for Polish language, but I don't have much IT knowledge and I need some light help.
I have been training for a week on a 30-hour Polish audiobook, "The Doll", in this Colab notebook: https://colab.research.google.com/drive/1VqSWRU1H3KIU6au_ojOGFtU0HQPUFa6t
However, despite quite a bit of training, it still twists words a lot. I have discovered that the problem is not necessarily with the model, but perhaps with the synthesis notebook, which is tailored exclusively for English: https://colab.research.google.com/drive/1aj6Jk8cpRw7SsN3JSYCv57CrR6s0gYPB
Everything you type into the generator field is converted to English ARPAbet.
Is it possible to disable this conversion?
Alternatively, is it possible to adapt this ARPAbet for the Polish language? But here there is a problem, because in Polish there are consonants which are not present in English, for example "ć", "ś", "ń", "ź".
Would it be possible to make the script able to directly read MIDI files to get durations and pitch? It'd be very helpful in cases where you don't have clean vocals, but you have a MIDI based on the vocals.
I've been looking into the code, and it looks like it might be possible if you make it able to read the note durations and pitches in a MIDI and convert it to the proper format, but I'm not a skilled enough coder to do it.
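As a rough sketch of the conversion described here, assuming the notes have already been read from a MIDI file as `(midi_note, start_seconds, end_seconds)` tuples (reading the file itself is left out), each note's pitch follows the standard equal-temperament formula f = 440 * 2^((n - 69) / 12). The 22050 Hz sample rate matches the training config quoted elsewhere on this page; the hop length of 256 samples and both function names are assumptions, not values taken from the repository.

```python
from typing import List, Tuple


def midi_note_to_hz(note: int) -> float:
    """Equal-temperament frequency with A4 (MIDI note 69) at 440 Hz."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def notes_to_f0_frames(
    notes: List[Tuple[int, float, float]],
    sample_rate: int = 22050,
    hop_length: int = 256,  # assumed frame hop, not taken from the source
) -> List[float]:
    """Build a per-frame pitch curve; frames without a note stay at 0.0."""
    if not notes:
        return []
    frames_per_second = sample_rate / hop_length
    total_frames = int(round(max(end for _, _, end in notes) * frames_per_second))
    f0 = [0.0] * total_frames
    for note, start, end in notes:
        first = int(round(start * frames_per_second))
        last = min(int(round(end * frames_per_second)), total_frames)
        for index in range(first, last):
            f0[index] = midi_note_to_hz(note)
    return f0
```

Whether the resulting curve can be fed to the model directly depends on the format the synthesis code expects, which this sketch does not cover.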
Alternatively, I use a concatenative singing synthesizer called UTAU, and the .ust files made with it seem fairly simple in terms of structure, so it might even be possible to import durations and pitch from it instead. UST files even contain lyrics for each note, so a transcript could be extracted.
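A minimal sketch of reading notes from UST text, based on the commonly seen layout (numbered `[#0000]` sections with `Length=`, `Lyric=` and `NoteNum=` lines, and a `Tempo=` line under `[#SETTING]`). The 480 ticks per quarter note, the field names, and the function name `parse_ust_notes` are assumptions here and should be checked against real files; tempo changes inside individual notes are ignored.

```python
from typing import Dict, List, Optional

TICKS_PER_QUARTER = 480  # assumed UST resolution


def parse_ust_notes(text: str) -> List[Dict[str, object]]:
    """Extract lyric, MIDI note number and duration in seconds for each note."""
    tempo = 120.0  # fallback if no global Tempo= line is found
    raw_notes: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[#") and line.endswith("]"):
            # Only numbered sections such as [#0000] describe notes.
            current = {} if line[2:-1].isdigit() else None
            if current is not None:
                raw_notes.append(current)
        elif "=" in line:
            key, value = line.split("=", 1)
            if key == "Tempo" and current is None:
                tempo = float(value)
            elif current is not None:
                current[key] = value
    seconds_per_tick = 60.0 / tempo / TICKS_PER_QUARTER
    return [
        {
            "lyric": note.get("Lyric", ""),
            "midi_note": int(note.get("NoteNum", 0)),
            "seconds": int(note["Length"]) * seconds_per_tick,
        }
        for note in raw_notes
        if "Length" in note
    ]
```

Joining the `lyric` fields in order would give a rough transcript; rests would need to be filtered out first.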
Hello SortAnon, I know others are moving on to different things, but is it possible to release new models? I still like TalkNet because it's very versatile; I can even change the lyrics of songs. Please make a Doctor Whooves model?
Is it possible to make a CPU version of Controllable TalkNet on Windows? It should be, as someone has already done this on Colab.
Thank you!
The README only explains how to launch the main program to make voices, not how to train them.
Selecting "Reduce metallic noise" gives the error "Reconstruction VQGAN failed to download"
However in the terminal, I can see the Google Drive link
Access denied with the following error:
Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses.
You may still be able to access the file from the browser:
https://drive.google.com/uc?id=1wlilvBtlBiAUEqqdqE0AEqo-UKx2X_cL
I was able to manually download this in my browser. Where shall I put this so that ControllableTalkNet can find it? I can add a docker mount if needed.
DiffSVC_gui doesn't have code for it.
Do you have code for DiffSVC?
Hello! I'm currently using your TalkNet training script on Google Colab (https://colab.research.google.com/drive/1Nb8TWjUBJIVg7QtIazMl64PAY4-QznzI?usp=sharing#scrollTo=nM7-bMpKO7U2) and there's an error that appears on step 7 where the console lists a bunch of missing and unexpected keys. I have absolutely zero experience with Python, so I would appreciate any pointers or tips on how to fix this.
Full Console Log:
https://pastebin.com/8XtFRTQN
The cell in TalkNet_Training_Offline.ipynb @ https://github.com/SortAnon/ControllableTalkNet/blame/5ee364f5bb1fe63fcde2b690507bd7cd89bfe268/TalkNet_Training_Offline.ipynb#L818-L823 runs
!python train.py --fine_tuning True --config config_v1b.json \
{start_from_universal} \
--checkpoint_interval 250 --checkpoint_path "{os.path.join(output_dir, 'HiFiGAN')}" \
--input_training_file "{hifi_train}" \
--input_validation_file "{hifi_val}" \
--input_wavs_dir "{hifi_wavs}"
But where do train.py and config_v1b.json come from? They don't seem to be included in this repository?
Could you add compatibility with languages other than English, such as by compatibility with CSS10 as used by NANSY? https://github.com/Kyubyong/css10
Hello, I'm facing an issue while trying to run this tool. Upon launching it, I encounter an error message stating the following:
"If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
I attempted to downgrade the protobuf package, but unfortunately, it didn't resolve the issue.
The training notebook seems to install the wrong versions of the following libraries:
toad 0.1.0 which requires numpy>=1.20, but you have numpy 1.19.5 which is incompatible.
konoha 4.6.5 which requires importlib-metadata<4.0.0,>=3.7.0, but you have importlib-metadata 4.11.3 which is incompatible.
google-colab 1.0.0 which requires requests~=2.23.0, but you have requests 2.27.1 which is incompatible.
flair 0.8.0 which requires torch<=1.7.1,>=1.5.0, but you have torch 1.8.1 which is incompatible.
datascience 0.10.6 which requires folium==0.2.1, but you have folium 0.8.3 which is incompatible.
albumentations 0.1.12 which requires imgaug<0.2.7,>=0.2.5, but you have imgaug 0.2.9 which is incompatible.
Currently running the docker container on a linux environment, however when running step 3, it returns the following error:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In[4], line 4
1 # Extract phoneme duration
3 import json
----> 4 from nemo.collections.asr.models import EncDecCTCModel
5 asr_model = EncDecCTCModel.from_pretrained(model_name="asr_talknet_aligner").cpu().eval()
7 def forward_extractor(tokens, log_probs, blank):
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/nemo/collections/asr/__init__.py:15
1 # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
2 #
3 # Licensed under the Apache License, Version 2.0 (the "License");
(...)
12 # See the License for the specific language governing permissions and
13 # limitations under the License.
---> 15 from nemo.collections.asr import data, losses, models, modules
16 from nemo.package_info import __version__
18 # Set collection version equal to NeMo version.
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/nemo/collections/asr/losses/__init__.py:15
1 # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
2 #
3 # Licensed under the Apache License, Version 2.0 (the "License");
(...)
12 # See the License for the specific language governing permissions and
13 # limitations under the License.
---> 15 from nemo.collections.asr.losses.angularloss import AngularSoftmaxLoss
16 from nemo.collections.asr.losses.audio_losses import SDRLoss
17 from nemo.collections.asr.losses.ctc import CTCLoss
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/nemo/collections/asr/losses/angularloss.py:18
1 # ! /usr/bin/python
2 # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
3 #
(...)
13 # See the License for the specific language governing permissions and
14 # limitations under the License.
16 import torch
---> 18 from nemo.core.classes import Loss, Typing, typecheck
19 from nemo.core.neural_types import LabelsType, LogitsType, LossType, NeuralType
21 __all__ = ['AngularSoftmaxLoss']
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/nemo/core/__init__.py:16
1 # Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
2 #
3 # Licensed under the Apache License, Version 2.0 (the "License");
(...)
12 # See the License for the specific language governing permissions and
13 # limitations under the License.
15 import nemo.core.neural_types
---> 16 from nemo.core.classes import *
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/nemo/core/classes/__init__.py:18
16 import hydra
17 import omegaconf
---> 18 import pytorch_lightning
20 from nemo.core.classes.common import (
21 FileIO,
22 Model,
(...)
27 typecheck,
28 )
29 from nemo.core.classes.dataset import Dataset, IterableDataset
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/pytorch_lightning/__init__.py:20
17 _PACKAGE_ROOT = os.path.dirname(__file__)
18 _PROJECT_ROOT = os.path.dirname(_PACKAGE_ROOT)
---> 20 from pytorch_lightning import metrics # noqa: E402
21 from pytorch_lightning.callbacks import Callback # noqa: E402
22 from pytorch_lightning.core import LightningDataModule, LightningModule # noqa: E402
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/pytorch_lightning/metrics/__init__.py:15
1 # Copyright The PyTorch Lightning team.
2 #
3 # Licensed under the Apache License, Version 2.0 (the "License");
(...)
12 # See the License for the specific language governing permissions and
13 # limitations under the License.
---> 15 from pytorch_lightning.metrics.classification import ( # noqa: F401
16 Accuracy,
17 AUC,
18 AUROC,
19 AveragePrecision,
20 ConfusionMatrix,
21 F1,
22 FBeta,
23 HammingDistance,
24 IoU,
25 Precision,
26 PrecisionRecallCurve,
27 Recall,
28 ROC,
29 StatScores,
30 )
31 from pytorch_lightning.metrics.metric import Metric, MetricCollection # noqa: F401
32 from pytorch_lightning.metrics.regression import ( # noqa: F401
33 ExplainedVariance,
34 MeanAbsoluteError,
(...)
39 SSIM,
40 )
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/__init__.py:14
1 # Copyright The PyTorch Lightning team.
2 #
3 # Licensed under the Apache License, Version 2.0 (the "License");
(...)
12 # See the License for the specific language governing permissions and
13 # limitations under the License.
---> 14 from pytorch_lightning.metrics.classification.accuracy import Accuracy # noqa: F401
15 from pytorch_lightning.metrics.classification.auc import AUC # noqa: F401
16 from pytorch_lightning.metrics.classification.auroc import AUROC # noqa: F401
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/pytorch_lightning/metrics/classification/accuracy.py:18
14 from typing import Any, Callable, Optional
16 from torchmetrics import Accuracy as _Accuracy
---> 18 from pytorch_lightning.metrics.utils import deprecated_metrics
21 class Accuracy(_Accuracy):
23 @deprecated_metrics(target=_Accuracy)
24 def __init__(
25 self,
(...)
32 dist_sync_fn: Callable = None,
33 ):
File ~/anaconda3/envs/talknet/lib/python3.8/site-packages/pytorch_lightning/metrics/utils.py:22
20 from torchmetrics.utilities.data import dim_zero_mean as _dim_zero_mean
21 from torchmetrics.utilities.data import dim_zero_sum as _dim_zero_sum
---> 22 from torchmetrics.utilities.data import get_num_classes as _get_num_classes
23 from torchmetrics.utilities.data import select_topk as _select_topk
24 from torchmetrics.utilities.data import to_categorical as _to_categorical
ImportError: cannot import name 'get_num_classes' from 'torchmetrics.utilities.data' (/home/ghostdog/anaconda3/envs/talknet/lib/python3.8/site-packages/torchmetrics/utilities/data.py)
I've tried re-installing torchmetrics version 0.6.0 using the command conda install -c conda-forge torchmetrics=0.6.0
What can I do to remedy this?
Love what you're doing for the community + the world.
We have been working on adapting this tool to work with this repo, might be helpful in ongoing research:
https://github.com/Appen/UHV-OTS-Speech
I don't know how to contact you so I thought this was my best bet. I've noticed that when using custom voicebanks on long words and long notes, it glitches out. The pony singing banks don't have this glitch, so I was wondering how to fix this in my own banks. I was also wondering if there's any way to make or edit the phoneme converter, because I was getting vowel conversion errors.
Set up everything, installed C++, got an error on launch.
[NeMo W 2022-12-15 10:29:07 optimizers:47] Apex was not found. Using the lamb optimizer will error out.
[NeMo W 2022-12-15 10:29:07 nemo_logging:349] C:\Users\chlyw\Desktop\Talknet\miniconda\lib\site-packages\torchaudio\extension\extension.py:13: UserWarning: torchaudio C++ extension is not available.
warnings.warn('torchaudio C++ extension is not available.')
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
[NeMo W 2022-12-02 09:58:37 modelPT:138] If you intend to do training or fine-tuning, please call the ModelPT.setup_training_data() method and provide a valid configuration file to setup the train data loader.
Train config :
dataset:
_target_: nemo.collections.asr.data.audio_to_text.AudioToCharWithDursF0Dataset
manifest_filepath: H:/ControllableTalkNet/tTrump\trainfiles.json
max_duration: null
min_duration: 0.1
int_values: false
load_audio: false
normalize: false
sample_rate: 22050
trim: false
durs_file: H:/ControllableTalkNet/tTrump\durations.pt
f0_file: H:/ControllableTalkNet/tTrump\f0s.pt
blanking: true
vocab:
notation: phonemes
punct: true
spaces: true
stresses: false
add_blank_at: last
dataloader_params:
drop_last: false
shuffle: true
batch_size: 16
num_workers: 4
[NeMo W 2022-12-02 09:58:37 modelPT:145] If you intend to do validation, please call the ModelPT.setup_validation_data() or ModelPT.setup_multiple_validation_data() method and provide a valid configuration file to setup the validation data loader(s).
Validation config :
dataset:
_target_: nemo.collections.asr.data.audio_to_text.AudioToCharWithDursF0Dataset
manifest_filepath: H:/ControllableTalkNet/tTrump\valfiles.json
max_duration: null
min_duration: 0.1
int_values: false
load_audio: false
normalize: false
sample_rate: 22050
trim: false
durs_file: H:/ControllableTalkNet/tTrump\durations.pt
f0_file: H:/ControllableTalkNet/tTrump\f0s.pt
blanking: true
vocab:
notation: phonemes
punct: true
spaces: true
stresses: false
add_blank_at: last
dataloader_params:
drop_last: false
shuffle: false
batch_size: 16
num_workers: 1
[NeMo I 2022-12-02 09:58:37 modelPT:439] Model TalkNetDursModel was successfully restored from H:\ControllableTalkNet\talknet_durs.nemo.
[NeMo I 2022-12-02 09:58:37 collections:173] Dataset loaded with 134 files totalling 0.21 hours
[NeMo I 2022-12-02 09:58:37 collections:174] 0 files were filtered totalling 0.00 hours
[NeMo I 2022-12-02 09:58:37 collections:173] Dataset loaded with 134 files totalling 0.21 hours
[NeMo I 2022-12-02 09:58:37 collections:174] 0 files were filtered totalling 0.00 hours
[NeMo W 2022-12-02 09:58:37 modelPT:660] The lightning trainer received accelerator: dp. We recommend to use 'ddp' instead.
[NeMo I 2022-12-02 09:58:37 modelPT:751] Optimizer config = Adam (
Parameter Group 0
amsgrad: False
betas: (0.9, 0.999)
eps: 1e-08
lr: 0.001
weight_decay: 1e-06
)
[NeMo I 2022-12-02 09:58:37 lr_scheduler:621] Scheduler "<nemo.core.optim.lr_scheduler.CosineAnnealing object at 0x0000021A2DF86EB0>"
will be used during training (effective maximum steps = 180) -
Parameters :
(min_lr: 3.0e-06
warmup_ratio: 0.02
max_steps: 180
)
Warm-starting from H:\ControllableTalkNet\talknet_durs.nemo
[NeMo I 2022-12-02 09:58:37 exp_manager:216] Experiments will be logged at H:\ControllableTalkNet\tTrump\TalkNetDurs\2022-12-02_09-57-24
[NeMo I 2022-12-02 09:58:37 exp_manager:563] TensorboardLogger has been set up
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[NeMo W 2022-12-02 09:58:38 modelPT:660] The lightning trainer received accelerator: dp. We recommend to use 'ddp' instead.
[NeMo I 2022-12-02 09:58:38 modelPT:751] Optimizer config = Adam (
Parameter Group 0
amsgrad: False
betas: (0.9, 0.999)
eps: 1e-08
lr: 0.001
weight_decay: 1e-06
)
[NeMo I 2022-12-02 09:58:38 lr_scheduler:621] Scheduler "<nemo.core.optim.lr_scheduler.CosineAnnealing object at 0x0000021A2E22DCD0>"
will be used during training (effective maximum steps = 180) -
Parameters :
(min_lr: 3.0e-06
warmup_ratio: 0.02
max_steps: 180
)
PicklingError Traceback (most recent call last)
Cell In[6], line 68
66 initialize(config_path="conf")
67 cfg = compose(config_name="talknet-durs")
---> 68 train(cfg)
Cell In[6], line 62, in train(cfg)
60 exp_manager(trainer, cfg.get('exp_manager', None))
61 trainer.callbacks.extend([pl.callbacks.LearningRateMonitor(), LogEpochTimeCallback()]) # noqa
---> 62 trainer.fit(model)
File ~\anaconda3\envs\talknet\lib\site-packages\pytorch_lightning\trainer\trainer.py:460, in Trainer.fit(self, model, train_dataloader, val_dataloaders, datamodule)
455 # links data to the trainer
456 self.data_connector.attach_data(
457 model, train_dataloader=train_dataloader, val_dataloaders=val_dataloaders, datamodule=datamodule
458 )
--> 460 self._run(model)
462 assert self.state.stopped
463 self.training = False
File ~\anaconda3\envs\talknet\lib\site-packages\pytorch_lightning\trainer\trainer.py:758, in Trainer._run(self, model)
755 self.pre_dispatch()
757 # dispatch `start_training` or `start_evaluating` or `start_predicting`
--> 758 self.dispatch()
760 # plugin will finalized fitting (e.g. ddp_spawn will load trained model)
761 self.post_dispatch()
File ~\anaconda3\envs\talknet\lib\site-packages\pytorch_lightning\trainer\trainer.py:799, in Trainer.dispatch(self)
797 self.accelerator.start_predicting(self)
798 else:
--> 799 self.accelerator.start_training(self)
File ~\anaconda3\envs\talknet\lib\site-packages\pytorch_lightning\accelerators\accelerator.py:96, in Accelerator.start_training(self, trainer)
95 def start_training(self, trainer: 'pl.Trainer') -> None:
---> 96 self.training_type_plugin.start_training(trainer)
File ~\anaconda3\envs\talknet\lib\site-packages\pytorch_lightning\plugins\training_type\training_type_plugin.py:144, in TrainingTypePlugin.start_training(self, trainer)
142 def start_training(self, trainer: 'pl.Trainer') -> None:
143 # double dispatch to initiate the training loop
--> 144 self._results = trainer.run_stage()
File ~\anaconda3\envs\talknet\lib\site-packages\pytorch_lightning\trainer\trainer.py:809, in Trainer.run_stage(self)
807 if self.predicting:
808 return self.run_predict()
--> 809 return self.run_train()
File ~\anaconda3\envs\talknet\lib\site-packages\pytorch_lightning\trainer\trainer.py:844, in Trainer.run_train(self)
841 if not self.is_global_zero and self.progress_bar_callback is not None:
842 self.progress_bar_callback.disable()
--> 844 self.run_sanity_check(self.lightning_module)
846 self.checkpoint_connector.has_trained = False
848 # enable train mode
File ~\anaconda3\envs\talknet\lib\site-packages\pytorch_lightning\trainer\trainer.py:1112, in Trainer.run_sanity_check(self, ref_model)
1109 self.on_sanity_check_start()
1111 # run eval step
-> 1112 self.run_evaluation()
1114 self.on_sanity_check_end()
1116 self.state.stage = stage
File ~\anaconda3\envs\talknet\lib\site-packages\pytorch_lightning\trainer\trainer.py:954, in Trainer.run_evaluation(self, on_epoch)
951 dataloader = self.accelerator.process_dataloader(dataloader)
952 dl_max_batches = self.evaluation_loop.max_batches[dataloader_idx]
--> 954 for batch_idx, batch in enumerate(dataloader):
955 if batch is None:
956 continue
File ~\anaconda3\envs\talknet\lib\site-packages\torch\utils\data\dataloader.py:355, in DataLoader.__iter__(self)
353 return self._iterator
354 else:
--> 355 return self._get_iterator()
File ~\anaconda3\envs\talknet\lib\site-packages\torch\utils\data\dataloader.py:301, in DataLoader._get_iterator(self)
299 else:
300 self.check_worker_number_rationality()
--> 301 return _MultiProcessingDataLoaderIter(self)
File ~\anaconda3\envs\talknet\lib\site-packages\torch\utils\data\dataloader.py:914, in _MultiProcessingDataLoaderIter.__init__(self, loader)
907 w.daemon = True
908 # NB: Process.start() actually take some time as it needs to
909 # start a process and pass the arguments over via a pipe.
910 # Therefore, we only add a worker to self._workers list after
911 # it started, so that we do not call .join() if program dies
912 # before it starts, and __del__ tries to join but will get:
913 # AssertionError: can only join a started process.
--> 914 w.start()
915 self._index_queues.append(index_queue)
916 self._workers.append(w)
File ~\anaconda3\envs\talknet\lib\multiprocessing\process.py:121, in BaseProcess.start(self)
118 assert not _current_process._config.get('daemon'),
119 'daemonic processes are not allowed to have children'
120 _cleanup()
--> 121 self._popen = self._Popen(self)
122 self._sentinel = self._popen.sentinel
123 # Avoid a refcycle if the target function holds an indirect
124 # reference to the process object (see bpo-30775)
File ~\anaconda3\envs\talknet\lib\multiprocessing\context.py:224, in Process._Popen(process_obj)
222 @staticmethod
223 def _Popen(process_obj):
--> 224 return _default_context.get_context().Process._Popen(process_obj)
File ~\anaconda3\envs\talknet\lib\multiprocessing\context.py:327, in SpawnProcess._Popen(process_obj)
324 @staticmethod
325 def _Popen(process_obj):
326 from .popen_spawn_win32 import Popen
--> 327 return Popen(process_obj)
File ~\anaconda3\envs\talknet\lib\multiprocessing\popen_spawn_win32.py:93, in Popen.__init__(self, process_obj)
91 try:
92 reduction.dump(prep_data, to_child)
---> 93 reduction.dump(process_obj, to_child)
94 finally:
95 set_spawning_popen(None)
File ~\anaconda3\envs\talknet\lib\multiprocessing\reduction.py:60, in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
PicklingError: Can't pickle <class 'nemo.collections.common.parts.preprocessing.collections.AudioTextEntity'>: attribute lookup AudioTextEntity on nemo.collections.common.parts.preprocessing.collections failed
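The traceback ends inside `multiprocessing` on Windows (`popen_spawn_win32`), where DataLoader worker processes are started by pickling their arguments. A workaround that is often tried in this situation, offered here as an assumption and not a confirmed fix for this repository, is to set `num_workers` to 0 so batches are loaded in the main process. The sketch below applies that to a plain nested dict shaped like the logged config; `disable_workers` is a hypothetical helper.

```python
import copy
from typing import Any, Dict


def disable_workers(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a nested config with every `num_workers` set to 0."""
    patched = copy.deepcopy(config)

    def visit(node: Any) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "num_workers":
                    node[key] = 0  # load data in the main process
                else:
                    visit(value)
        elif isinstance(node, list):
            for item in node:
                visit(item)

    visit(patched)
    return patched
```

The original config is left untouched, so the change is easy to revert if it makes no difference.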