Comments (3)

ylzz1997 commented on June 15, 2024

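Diffusion has a higher quality ceiling than sovits when the data is high quality, so it is normal for its inference output to sound better.
However, diffusion is not robust to UVR-processed datasets.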

from so-vits-svc.

happyman2025 commented on June 15, 2024

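> Diffusion has a higher quality ceiling than sovits when the data is high quality, so it is normal for its inference output to sound better. However, diffusion is not robust to UVR-processed datasets.

I didn't know that was a factor. Next time I'll try diffusion on UVR data and see how it performs.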


iiallgaii commented on June 15, 2024

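Please check the confirmation boxes below.

* [x]  I have carefully read [README.md](https://github.com/svc-develop-team/so-vits-svc/blob/4.0/README_zh_CN.md) and the [Quick solution in the wiki](https://github.com/svc-develop-team/so-vits-svc/wiki/Quick-solution).

* [x]  I have investigated the problem with various search engines, and the question I am asking is not a common one.

* [x]  I am not using a one-click package/environment package provided by a third-party user.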

System platform and version

win 10

GPU model

3060 12 GB

Python version

3.10.6

PyTorch version

2.0.1+cu118

sovits branch

4.0 (default)

Dataset source (used to judge dataset quality)

Self-recorded; half of it is singing

Step where the problem occurred, or the command executed

Training a so-vits-svc 4.0 model

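Problem description

I'm a beginner using the so-vits-svc 4.0 inference webUI. Why does the diffusion model produce far better results than the sovits model?

I usually use 30 to 45 minutes of data with the default config, without changing the batch size. The sovits model was trained for about 200,000 steps, but the singing it produces at inference is always hoarse, or an electrical buzzing noise suddenly appears.

The diffusion model, however, was trained for only 30,000 steps, and the singing it produces is already quite good: not perfect yet, but much better than the sovits model.

After combining the sovits model and the diffusion model, the result seems slightly worse than using diffusion alone, but better than using the sovits model alone.

Why does this happen? Is it that I haven't trained enough?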

Logs

N/A

Screenshot the so-vits-svc and logs/44k folders and paste them here

image

Additional notes

No response

Hi, can you share your diffusion.yaml?

model:
  k_step_max: 0
  n_chans: 512
  n_hidden: 256
  n_layers: 20
  n_spk: 1
  timesteps: 1000  # <---- you set 1000 here too?
  type: Diffusion
  use_pitch_aug: true
spk:
  Ai_Beyond_KaKui: 0
train:
  amp_dtype: fp32
  batch_size: 48
  cache_all_data: true
  cache_device: cuda
  cache_fp16: true
  decay_step: 100000
  epochs: 100000
  gamma: 0.5
  interval_force_save: 2000
  interval_log: 10
  interval_val: 2000
  lr: 0.0002
  num_workers: 2
  save_opt: false
  weight_decay: 0
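
For anyone comparing step counts against this config, here is a rough sketch of how `lr`, `decay_step` and `gamma` would interact. It assumes the trainer applies a plain step decay (the learning rate is multiplied by `gamma` once every `decay_step` optimizer steps). That schedule is an assumption and is not confirmed anywhere in this thread, and `step_decay_lr` is a hypothetical helper, not a function from so-vits-svc.

```python
def step_decay_lr(step: int,
                  base_lr: float = 0.0002,
                  decay_step: int = 100000,
                  gamma: float = 0.5) -> float:
    """Learning rate after `step` optimizer steps under a step-decay schedule.

    Defaults are the `lr`, `decay_step` and `gamma` values from the config
    above. Assumption: the rate is multiplied by `gamma` once per completed
    block of `decay_step` steps.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    completed_decays = step // decay_step
    return base_lr * gamma ** completed_decays


if __name__ == "__main__":
    # 30,000 is the diffusion step count mentioned in the issue;
    # 100,000 is where the first decay would happen under this assumption.
    for step in (0, 30_000, 99_999, 100_000):
        print(f"step {step:>7}: lr = {step_decay_lr(step):.6f}")
```

If that assumption holds, a diffusion run stopped at 30,000 steps never reached the first decay and trained at the base learning rate the whole time.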
