Comments (3)
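Diffusion has a higher ceiling than sovits on high-quality data, so it is normal for its inference results to sound better.
However, diffusion is far less robust on UVR-processed datasets.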
from so-vits-svc.
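> Diffusion has a higher ceiling than sovits on high-quality data, so it is normal for its inference results to sound better. However, diffusion is far less robust on UVR-processed datasets.

I didn't know that was a factor. Next time I'll try diffusion on UVR data and see how it does.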
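Please check the confirmation boxes below.
* [x] I have carefully read [README.md](https://github.com/svc-develop-team/so-vits-svc/blob/4.0/README_zh_CN.md) and the [Quick solution in the wiki](https://github.com/svc-develop-team/so-vits-svc/wiki/Quick-solution).
* [x] I have investigated the problem through various search engines, and the question I am asking is not a common one.
* [x] I am not using a one-click package/environment package provided by a third-party user.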
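System platform and version
Windows 10
GPU model
3060 12 GB
Python version
3.10.6
PyTorch version
2.0.1+cu118
sovits branch
4.0 (default)
Dataset source (used to judge dataset quality)
Self-recorded; half of it is singing
Step or command where the problem occurs
Training a so-vits-svc 4.0 model
Problem description
I'm a beginner using the so-vits-svc 4.0 inference webUI. Why does the diffusion model produce far better results than the sovits model?
I typically use 30 to 45 minutes of data with the default config and an unchanged batch size. The sovits model was trained for about 200,000 steps, yet the singing it produces is always hoarse, or an electrical buzzing noise suddenly appears.
The diffusion model, by contrast, was trained for only 30,000 steps, and the singing it produces is already quite good. It is not perfect yet, but it is much better than the sovits model.
After combining the sovits model with the diffusion model, the result seems slightly worse than diffusion alone, but better than sovits alone.
Why does this happen? Is the model undertrained?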
Logs
N/A
Screenshot the `so-vits-svc` and `logs/44k` folders and paste them here
Additional notes
No response
Hi, can you share your diffusion.yaml?
model:
  k_step_max: 0
  n_chans: 512
  n_hidden: 256
  n_layers: 20
  n_spk: 1
  timesteps: 1000  # <---- you set 1000 here too?
  type: Diffusion
  use_pitch_aug: true
spk:
  Ai_Beyond_KaKui: 0
train:
  amp_dtype: fp32
  batch_size: 48
  cache_all_data: true
  cache_device: cuda
  cache_fp16: true
  decay_step: 100000
  epochs: 100000
  gamma: 0.5
  interval_force_save: 2000
  interval_log: 10
  interval_val: 2000
  lr: 0.0002
  num_workers: 2
  save_opt: false
  weight_decay: 0
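
For anyone reading the `train` block above: if `lr`, `gamma` and `decay_step` are applied as an ordinary step decay (an assumption on my part, since the thread does not show the trainer code), the learning rate at a given training step could be sketched roughly as follows. `lr_at_step` is a hypothetical helper written for illustration and is not part of so-vits-svc.

```python
# Values copied from the `train` section of the config quoted above.
BASE_LR = 0.0002
GAMMA = 0.5
DECAY_STEP = 100_000


def lr_at_step(step: int,
               base_lr: float = BASE_LR,
               gamma: float = GAMMA,
               decay_step: int = DECAY_STEP) -> float:
    """Learning rate under a plain step decay.

    Assumption: the rate is multiplied by `gamma` once every `decay_step`
    steps. This is a hypothetical helper, not the project's own code.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    # Integer division counts how many full decay intervals have elapsed.
    return base_lr * gamma ** (step // decay_step)


if __name__ == "__main__":
    # 30,000 is the diffusion step count mentioned in the issue.
    for step in (30_000, 100_000, 200_000):
        print(step, lr_at_step(step))
```

Under that assumption, the 30,000-step diffusion run described in the issue would still be training at the initial learning rate, because the first decay would only happen at step 100,000.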