Comments (4)
@nuaazs The 'lr_scheduler' is a warmup cosine scheduler. If you only set 'num_epoch' to 200 and then resume training, the cosine schedule is stretched over 200 epochs, so the learning rate jumps back up at epoch 100 instead of staying near its minimum. To avoid this, I recommend adjusting the 'lr_scheduler' configuration so that the lr stays low after resuming. Alternatively, you can simply keep training for more epochs until the lr decays again and performance recovers.
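To make the jump concrete, here is a minimal sketch of a generic warmup cosine schedule. It is not the speakerlab `WarmupCosineScheduler` code: the function `warmup_cosine_lr` is hypothetical, it works per epoch rather than per step, it assumes a linear warmup, and it assumes the original run was configured for 100 epochs, as the comment above implies. The `max_lr`, `min_lr`, and warmup values are taken from the config posted in this thread.

```python
import math


def warmup_cosine_lr(epoch, total_epochs, max_lr=0.1, min_lr=1e-4, warmup_epochs=5):
    """Hypothetical per-epoch warmup cosine schedule (an illustration only)."""
    if epoch < warmup_epochs:
        # Linear warmup from min_lr up to max_lr (assumed warmup shape).
        return min_lr + (max_lr - min_lr) * epoch / warmup_epochs
    # Cosine decay from max_lr down to min_lr over the remaining epochs.
    progress = min((epoch - warmup_epochs) / (total_epochs - warmup_epochs), 1.0)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# lr at epoch 100 when the schedule was planned for 100 epochs (end of the run).
lr_end_of_run = warmup_cosine_lr(100, total_epochs=100)
# lr at epoch 100 when the same schedule is stretched to 200 epochs on resume.
lr_after_resume = warmup_cosine_lr(100, total_epochs=200)

print(f"planned for 100 epochs: lr = {lr_end_of_run:.4f}")
print(f"planned for 200 epochs: lr = {lr_after_resume:.4f}")
```

Under these assumptions the finished run ends at 'min_lr', while the resumed run restarts at roughly half of 'max_lr', which is several hundred times higher. That is the increase described above.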
from 3d-speaker.
config.yaml
aug_prob: 0.2
augmentations:
  args:
    aug_prob: <aug_prob>
    noise_file: <noise>
    reverb_file: <reverb>
  obj: speakerlab.process.processor.SpkVeriAug
batch_size: 256
checkpointer:
  args:
    checkpoints_dir: <exp_dir>/models
    recoverables:
      classifier: <classifier>
      embedding_model: <embedding_model>
      epoch_counter: <epoch_counter>
  obj: speakerlab.utils.checkpoint.Checkpointer
classifier:
  args:
    input_dim: <embedding_size>
    out_neurons: <num_classes>
  obj: speakerlab.models.campplus.classifier.CosineClassifier
data: data/vox2_dev/train.csv
dataloader:
  args:
    batch_size: <batch_size>
    dataset: <dataset>
    drop_last: true
    num_workers: <num_workers>
    pin_memory: true
  obj: torch.utils.data.DataLoader
dataset:
  args:
    data_file: <data>
    preprocessor: <preprocessor>
  obj: speakerlab.dataset.dataset.WavSVDataset
embedding_model:
  args:
    embed_dim: <embedding_size>
    feat_dim: <fbank_dim>
    num_blocks:
    - 3
    - 3
    - 9
    - 3
    pooling_func: GSP
  obj: speakerlab.models.dfresnet.resnet.DFResNet
embedding_size: 512
epoch_counter:
  args:
    limit: <num_epoch>
  obj: speakerlab.utils.epoch.EpochCounter
exp_dir: exp/dfresnet56
fbank_dim: 80
feature_extractor:
  args:
    mean_nor: true
    n_mels: <fbank_dim>
    sample_rate: <sample_rate>
  obj: speakerlab.process.processor.FBank
label_encoder:
  args:
    data_file: <data>
  obj: speakerlab.process.processor.SpkLabelEncoder
log_batch_freq: 100
loss:
  args:
    easy_margin: false
    margin: 0.2
    scale: 32.0
  obj: speakerlab.loss.margin_loss.ArcMarginLoss
lr: 0.1
lr_scheduler:
  args:
    fix_epoch: <num_epoch>
    max_lr: <lr>
    min_lr: <min_lr>
    optimizer: <optimizer>
    step_per_epoch: null
    warmup_epoch: 5
  obj: speakerlab.process.scheduler.WarmupCosineScheduler
margin_scheduler:
  args:
    criterion: <loss>
    final_margin: 0.2
    fix_epoch: 25
    increase_start_epoch: 15
    initial_margin: 0.0
    step_per_epoch: null
  obj: speakerlab.process.scheduler.MarginScheduler
min_lr: 0.0001
noise: data/musan/wav.scp
num_classes: 5994
num_epoch: 200
num_workers: 16
optimizer:
  args:
    lr: <lr>
    momentum: 0.9
    nesterov: true
    params: null
    weight_decay: 0.0001
  obj: torch.optim.SGD
preprocessor:
  augmentations: <augmentations>
  feature_extractor: <feature_extractor>
  label_encoder: <label_encoder>
  wav_reader: <wav_reader>
reverb: data/rirs/wav.scp
sample_rate: 16000
save_epoch_freq: 2
speed_pertub: true
wav_len: 3.0
wav_reader:
  args:
    duration: <wav_len>
    sample_rate: <sample_rate>
    speed_pertub: <speed_pertub>
  obj: speakerlab.process.processor.WavReader
Thank you for your response, @wanghuii1.
It seems the poor results were indeed caused by the learning rate being set too high.
Everything works fine after I adjusted the learning rate again.