
Comments (4)

wanghuii1 commented on July 30, 2024

@nuaazs The 'lr_scheduler' is a warmup cosine scheduler whose decay length ('fix_epoch') is tied to 'num_epoch'. If you only raise 'num_epoch' to 200 and then resume training, the cosine curve is stretched over 200 epochs, so the learning rate jumps back up at epoch 100. To avoid this, I recommend adjusting the 'lr_scheduler' configuration so that the lr stays low after resuming. Alternatively, you can simply keep training for more epochs so the learning rate decays again, which should bring performance back.
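To make the jump concrete, here is a minimal sketch of a generic per-epoch warmup cosine schedule. It is not the speakerlab `WarmupCosineScheduler` code: the function `warmup_cosine_lr` is hypothetical, the linear warmup shape is an assumption, and so is the premise that the first run used a total of 100 epochs. The `max_lr`, `min_lr`, and warmup values are taken from the config posted in this thread.

```python
import math


def warmup_cosine_lr(epoch, total_epochs, max_lr=0.1, min_lr=1e-4, warmup_epochs=5):
    """Hypothetical warmup cosine schedule, evaluated once per epoch.

    This is an illustrative stand-in, not the speakerlab implementation.
    """
    if epoch < warmup_epochs:
        # Assumed linear warmup from min_lr up to max_lr.
        return min_lr + (max_lr - min_lr) * epoch / warmup_epochs
    # Cosine decay from max_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    progress = min(progress, 1.0)  # hold at min_lr once the schedule is finished
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Assumed scenario: the first run was scheduled for 100 epochs in total.
lr_end_of_first_run = warmup_cosine_lr(100, total_epochs=100)

# Resuming at epoch 100 after only changing the total to 200 epochs.
lr_after_resume = warmup_cosine_lr(100, total_epochs=200)

print(f"lr at epoch 100, 100-epoch schedule: {lr_end_of_first_run:.6f}")
print(f"lr at epoch 100, 200-epoch schedule: {lr_after_resume:.6f}")
```

Under these assumptions the resumed run starts at roughly half of `max_lr` rather than at `min_lr`, which is the increase described above. Passing a smaller `max_lr` for the resumed run is one way to keep the value low.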

from 3d-speaker.

nuaazs commented on July 30, 2024

train.log


nuaazs commented on July 30, 2024

config.yaml

aug_prob: 0.2
augmentations:
  args:
    aug_prob: <aug_prob>
    noise_file: <noise>
    reverb_file: <reverb>
  obj: speakerlab.process.processor.SpkVeriAug
batch_size: 256
checkpointer:
  args:
    checkpoints_dir: <exp_dir>/models
    recoverables:
      classifier: <classifier>
      embedding_model: <embedding_model>
      epoch_counter: <epoch_counter>
  obj: speakerlab.utils.checkpoint.Checkpointer
classifier:
  args:
    input_dim: <embedding_size>
    out_neurons: <num_classes>
  obj: speakerlab.models.campplus.classifier.CosineClassifier
data: data/vox2_dev/train.csv
dataloader:
  args:
    batch_size: <batch_size>
    dataset: <dataset>
    drop_last: true
    num_workers: <num_workers>
    pin_memory: true
  obj: torch.utils.data.DataLoader
dataset:
  args:
    data_file: <data>
    preprocessor: <preprocessor>
  obj: speakerlab.dataset.dataset.WavSVDataset
embedding_model:
  args:
    embed_dim: <embedding_size>
    feat_dim: <fbank_dim>
    num_blocks:
    - 3
    - 3
    - 9
    - 3
    pooling_func: GSP
  obj: speakerlab.models.dfresnet.resnet.DFResNet
embedding_size: 512
epoch_counter:
  args:
    limit: <num_epoch>
  obj: speakerlab.utils.epoch.EpochCounter
exp_dir: exp/dfresnet56
fbank_dim: 80
feature_extractor:
  args:
    mean_nor: true
    n_mels: <fbank_dim>
    sample_rate: <sample_rate>
  obj: speakerlab.process.processor.FBank
label_encoder:
  args:
    data_file: <data>
  obj: speakerlab.process.processor.SpkLabelEncoder
log_batch_freq: 100
loss:
  args:
    easy_margin: false
    margin: 0.2
    scale: 32.0
  obj: speakerlab.loss.margin_loss.ArcMarginLoss
lr: 0.1
lr_scheduler:
  args:
    fix_epoch: <num_epoch>
    max_lr: <lr>
    min_lr: <min_lr>
    optimizer: <optimizer>
    step_per_epoch: null
    warmup_epoch: 5
  obj: speakerlab.process.scheduler.WarmupCosineScheduler
margin_scheduler:
  args:
    criterion: <loss>
    final_margin: 0.2
    fix_epoch: 25
    increase_start_epoch: 15
    initial_margin: 0.0
    step_per_epoch: null
  obj: speakerlab.process.scheduler.MarginScheduler
min_lr: 0.0001
noise: data/musan/wav.scp
num_classes: 5994
num_epoch: 200
num_workers: 16
optimizer:
  args:
    lr: <lr>
    momentum: 0.9
    nesterov: true
    params: null
    weight_decay: 0.0001
  obj: torch.optim.SGD
preprocessor:
  augmentations: <augmentations>
  feature_extractor: <feature_extractor>
  label_encoder: <label_encoder>
  wav_reader: <wav_reader>
reverb: data/rirs/wav.scp
sample_rate: 16000
save_epoch_freq: 2
speed_pertub: true
wav_len: 3.0
wav_reader:
  args:
    duration: <wav_len>
    sample_rate: <sample_rate>
    speed_pertub: <speed_pertub>
  obj: speakerlab.process.processor.WavReader


nuaazs commented on July 30, 2024

Thank you for your response, @wanghuii1.
The poor results were indeed caused by the learning rate being too high.
Everything works fine after I adjusted the learning rate.

