Comments (11)

YuanGongND commented on August 20, 2024

Hi,

Can you elaborate on which argument you are referring to? Is it

target_length=512

Thanks!

-Yuan

from ssast.

fanOfJava commented on August 20, 2024

Yes.

YuanGongND commented on August 20, 2024

Can you explain why the value would be 1024?

It seems to me that it changes

ssast/src/run.py

Lines 97 to 101 in a1a3eec

audio_conf = {'num_mel_bins': args.num_mel_bins, 'target_length': args.target_length, 'freqm': args.freqm, 'timem': args.timem, 'mixup': args.mixup, 'dataset': args.dataset,
              'mode': 'train', 'mean': args.dataset_mean, 'std': args.dataset_std, 'noise': args.noise}
val_audio_conf = {'num_mel_bins': args.num_mel_bins, 'target_length': args.target_length, 'freqm': 0, 'timem': 0, 'mixup': 0, 'dataset': args.dataset,
                  'mode': 'evaluation', 'mean': args.dataset_mean, 'std': args.dataset_std, 'noise': False}

and

ssast/src/run.py

Lines 132 to 138 in a1a3eec

    audio_model = ASTModel(fshape=args.fshape, tshape=args.tshape, fstride=args.fshape, tstride=args.tshape,
                           input_fdim=args.num_mel_bins, input_tdim=args.target_length, model_size=args.model_size, pretrain_stage=True)
# in the fine-tuning stage
else:
    audio_model = ASTModel(label_dim=args.n_class, fshape=args.fshape, tshape=args.tshape, fstride=args.fstride, tstride=args.tstride,
                           input_fdim=args.num_mel_bins, input_tdim=args.target_length, model_size=args.model_size, pretrain_stage=False,
                           load_pretrained_mdl_path=args.pretrained_mdl_path)

for both dataloading and model instantiation.
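
To make the dependence on target_length concrete, here is a small sketch of the patch arithmetic. It assumes patches are cut by an unpadded strided convolution (as is common in ViT-style models), 128 mel bins, and 16x16 patches with stride 16. The helper names `patch_grid` and `num_patches` are made up for illustration and are not part of ssast.

```python
def patch_grid(input_fdim, input_tdim, fshape, tshape, fstride, tstride):
    """Return (freq_patches, time_patches) for an unpadded strided patch split."""
    f_dim = (input_fdim - fshape) // fstride + 1
    t_dim = (input_tdim - tshape) // tstride + 1
    return f_dim, t_dim


def num_patches(input_fdim, input_tdim, fshape, tshape, fstride, tstride):
    """Total number of patches the model would see for one spectrogram."""
    f_dim, t_dim = patch_grid(input_fdim, input_tdim, fshape, tshape, fstride, tstride)
    return f_dim * t_dim


# Assumed values: 128 mel bins, 16x16 patches, no overlap (stride == patch size).
for target_length in (512, 1024):
    n = num_patches(128, target_length, 16, 16, 16, 16)
    print(f"target_length={target_length}: {n} patches")
# target_length=512: 256 patches
# target_length=1024: 512 patches
```

Under these assumptions, changing target_length changes the patch count, so anything sized by the number of patches has to agree between the data loader and the model.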

fanOfJava commented on August 20, 2024

Because loading the pretrained model file ssast-base-patch-400.pth changes target_length. The relevant code is:
try:
    p_fshape, p_tshape = sd['module.v.patch_embed.proj.weight'].shape[2], sd['module.v.patch_embed.proj.weight'].shape[3]
    p_input_fdim, p_input_tdim = sd['module.p_input_fdim'].item(), sd['module.p_input_tdim'].item()
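
A quick way to see the effect described here is to compare the time dimension stored in the checkpoint with the target_length you passed on the command line. The sketch below is only an illustration: it uses a plain dict as a stand-in for the loaded state dict, the stored values (128 and 1024) are assumed, and the helpers `read_scalar` and `check_input_tdim` are hypothetical names, not functions from ssast.

```python
def read_scalar(value):
    """Accept either a plain number or a tensor-like object exposing .item()."""
    return value.item() if hasattr(value, "item") else value


def check_input_tdim(sd, requested_tdim, key="module.p_input_tdim"):
    """Return (stored_tdim, matches) for the time dimension saved in a checkpoint."""
    stored = read_scalar(sd[key])
    return stored, stored == requested_tdim


# Stand-in for a loaded state dict; a real one would hold tensors,
# which read_scalar() also handles via .item().
fake_sd = {"module.p_input_fdim": 128, "module.p_input_tdim": 1024}

stored, matches = check_input_tdim(fake_sd, requested_tdim=512)
print(f"checkpoint p_input_tdim={stored}, requested target_length=512, match={matches}")
```

With a real checkpoint, `fake_sd` would be replaced by the dict that is indexed as `sd` in the snippet above.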

fanOfJava commented on August 20, 2024

I suspect this is also why loading the model file for pure inference after finetuning raises an error. I am not sure whether my understanding is correct.

YuanGongND commented on August 20, 2024

Can you paste the error message here?

fanOfJava commented on August 20, 2024

> Can you paste the error message here?

You can print p_input_tdim before line 156 of ast_model; you will then see the error.

YuanGongND commented on August 20, 2024

I don't have enough time to run it again. The code is a cleaned-up version of the development version. It went through a brief test, and I believe I took care of this. So if you already have an error message, that would be very helpful. It might be due to something else.

fanOfJava commented on August 20, 2024

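I believe many people have the same problem. The model saved after finetuning cannot be loaded for pure inference at all, and I don't know how to test the real performance of the trained model.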

YuanGongND commented on August 20, 2024

Oh I see, yes, that is a known problem. It should be fine if you finetune a pretrained model that has a different target_length, but if you want to take the finetuned model for deployment, you will get an error.

For checking the performance, once you finetune a pretrained model, the script will print out the accuracy (or mAP) and also save the result on disk.

To deploy the model for inference, you will need to fix the bug.
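
As a starting point for tracking it down, one option is to compare parameter shapes between the freshly instantiated model and the finetuned checkpoint before loading. The sketch below is a generic diagnostic, not the actual fix, and does not claim to identify the cause. `find_shape_mismatches` is a hypothetical helper, and the `module.v.pos_embed` key and all shapes in the example dicts are made up for illustration.

```python
def find_shape_mismatches(model_shapes, ckpt_shapes):
    """List (key, model_shape, ckpt_shape) for keys that are missing or differ."""
    problems = []
    for key in sorted(set(model_shapes) | set(ckpt_shapes)):
        in_model = model_shapes.get(key)  # None if the model lacks this key
        in_ckpt = ckpt_shapes.get(key)    # None if the checkpoint lacks this key
        if in_model != in_ckpt:
            problems.append((key, in_model, in_ckpt))
    return problems


# Illustrative shapes only. With PyTorch, such dicts could be built as
# {k: tuple(v.shape) for k, v in state_dict.items()}.
model_shapes = {
    "module.v.patch_embed.proj.weight": (768, 1, 16, 16),
    "module.v.pos_embed": (1, 258, 768),
}
ckpt_shapes = {
    "module.v.patch_embed.proj.weight": (768, 1, 16, 16),
    "module.v.pos_embed": (1, 514, 768),
}

for key, in_model, in_ckpt in find_shape_mismatches(model_shapes, ckpt_shapes):
    print(f"{key}: model={in_model}, checkpoint={in_ckpt}")
```

Any key reported here points at a tensor whose size depends on an argument that differed between training and inference.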

YuanGongND commented on August 20, 2024

Can you check this: #4
