lucasnewman / best-rq-pytorch
Implementation of BEST-RQ - a model for self-supervised learning of speech signals using a random projection quantizer, in Pytorch.
License: MIT License
What does the final result look like? Could you share a screenshot?
This is my result:
trainable parameters: 657398946
training with dataset of 2052 samples and validating with randomly splitted 109 samples
do you want to clear previous experiment checkpoints and results? (y/n)
y
0: loss: 6.983
0: valid loss 5.440
0: saving model to results
Process end, code exit -1073741819 (0xC0000005)
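For anyone reading the last log line: the two numbers are the same value, once as a signed 32-bit integer and once as unsigned hex. As far as I know, 0xC0000005 is the Windows access-violation status, which usually points to a crash in native code rather than a Python exception. A minimal sketch of the conversion (the helper name is made up for illustration):

```python
def to_unsigned_hex(exit_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex string."""
    # Masking with 0xFFFFFFFF maps negative values to their two's-complement form.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_unsigned_hex(-1073741819))  # 0xC0000005
```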
Thanks for your great work.
I would like to know how long pre-training takes for the 0.3B model.
Could you share your experience of the cost of pre-training BEST-RQ (batch size, GPU model, number of GPUs, training time, etc.)?
Hi Lucas,
I've been looking over the code and I believe the two convolution subsampling layers are missing from conformer.py. The paper says:
4.1.1. NON-STREAMING MODELS
The model has two convolution layers at the bottom which provide 4 times temporal-dimension reduction for the input sequences. The rest of the layers are a stack of Conformer models. We explore 0.6B model size which is extensively studied in the previous works. The model contains 24 layers of Conformer models.
If you'd like I can create a pull request and implement this for you now.
Thanks - If I've misunderstood the paper, please call me out! 😅
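To make the "4 times temporal-dimension reduction" concrete, here is a rough sketch of the sequence-length arithmetic for two stacked stride-2 convolutions. The kernel size of 3 and padding of 1 are my assumptions; the quoted passage only states the overall 4x reduction, and the function names are hypothetical.

```python
def conv_out_len(length: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    """Output length of a 1-D convolution along the time axis (dilation 1)."""
    return (length + 2 * padding - kernel) // stride + 1


def subsampled_len(length: int, layers: int = 2) -> int:
    """Apply the assumed stride-2 convolution `layers` times in sequence."""
    for _ in range(layers):
        length = conv_out_len(length)
    return length


print(subsampled_len(100))   # 100 -> 50 -> 25
print(subsampled_len(1000))  # 1000 -> 500 -> 250
```

With these settings each layer halves the number of frames, so the Conformer stack would see a sequence about a quarter as long as the input features.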
Hi,
I am trying to pretrain the full model (which should have ~650M parameters) on a 24 GB GPU, and it only runs if I set the batch size to 1, which is useless for training. How much memory is needed to run the full training with the preset batch size?
Also, once training finished, I tried to run the k-means fitting script, and it seems to require even more memory. Any idea what is needed there as well?
Thanks!
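A back-of-the-envelope estimate may help explain why 24 GB is tight. The sketch below counts only the memory held per parameter, assuming fp32 weights, fp32 gradients, and an Adam-style optimizer with two extra states per parameter. Those are assumptions on my part, not something confirmed for this repository, and activations (which grow with batch size and sequence length) come on top.

```python
def training_state_gib(num_params: int, bytes_per_param: int = 4, optimizer_states: int = 2) -> float:
    """Memory for weights + gradients + optimizer states, in GiB (activations excluded)."""
    copies = 1 + 1 + optimizer_states  # weights, gradients, optimizer moments
    return num_params * bytes_per_param * copies / 2**30


# 657,398,946 is the trainable parameter count printed in the training log above.
print(f"{training_state_gib(657_398_946):.1f} GiB before activations")
```

Under these assumptions a bit under 10 GiB is already taken before the first batch is processed, which leaves limited room for activations on a 24 GB card.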
Hi Lucas,
Thanks for sharing your implementation of the framework. I don't quite understand why the labels are passed into the conformer model instead of the original data. To my understanding, the conformer is used to encode the original data and predict the corresponding labels (indices in the codebook), so the input here shouldn't be the labels, right?
best-rq-pytorch/best_rq_pytorch/best_rq.py
Lines 144 to 149 in b4b0d8d
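For context, this is how I understand the label side of BEST-RQ from the paper: a frozen random projection and a frozen random codebook map each input frame to a codebook index, and the encoder is trained to predict that index from the masked input. The sketch below is plain Python for illustration only. It is not the repository's code, the class and method names are made up, and the Gaussian initialization of the projection is a simplification.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class RandomProjectionQuantizer:
    """Hypothetical sketch: frozen random projection plus frozen random codebook."""

    def __init__(self, input_dim, code_dim, num_codes, seed=0):
        rng = random.Random(seed)
        # Both matrices are drawn once and never trained.
        self.projection = [[rng.gauss(0, 1) for _ in range(input_dim)]
                           for _ in range(code_dim)]
        self.codebook = [l2_normalize([rng.gauss(0, 1) for _ in range(code_dim)])
                         for _ in range(num_codes)]

    def label(self, frame):
        """Return the index of the codebook vector nearest to the projected frame."""
        projected = l2_normalize([sum(w * x for w, x in zip(row, frame))
                                  for row in self.projection])
        distances = [sum((p - c) ** 2 for p, c in zip(projected, code))
                     for code in self.codebook]
        return min(range(len(distances)), key=distances.__getitem__)


quantizer = RandomProjectionQuantizer(input_dim=8, code_dim=4, num_codes=16)
frame = [0.1 * i for i in range(8)]
print(quantizer.label(frame))  # an integer in [0, 16)
```

In this picture the labels are only the prediction targets; the encoder input would be the masked speech features.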
brq = BestRQ(
sample_rate = 22050,
win_length = 1024,
hop_length = 256,
Hi, I changed the three parameters shown above, and then the program seems to have crashed. Does this mean these three parameters cannot be changed?
Can you tell me which versions of CUDA, torch, torchaudio and torchvision you use?
I found that "torch.cuda.is_available()" returned False.
I then changed the versions of torch, torchaudio and torchvision to match my CUDA version.
After that, "torch.cuda.is_available()" returned True,
but an "ImportError" related to torchaudio appeared.
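When reporting this kind of problem it helps to post the exact versions. torch, torchaudio and torchvision are released as matched sets, and a mismatch between them is a common cause of import errors. A minimal sketch that collects the versions and survives a failing import (the helper name is hypothetical):

```python
import importlib


def report_versions(packages=("torch", "torchaudio", "torchvision")):
    """Return {package: version string}, or the import error text if it fails."""
    versions = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:  # ImportError, OSError from native libs, etc.
            versions[name] = f"import failed: {exc!r}"
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version}")
    try:
        import torch
        print("CUDA version torch was built with:", torch.version.cuda)
        print("CUDA available:", torch.cuda.is_available())
    except Exception as exc:
        print("torch is not usable:", exc)
```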
Hello,
I am encountering the following issue when testing the pretrain.py script from the examples folder.
beartype.roar.BeartypeDecorHintNonpepException: Method best_rq_pytorch.conformer.ConformerWrapper.__init__() parameter "conformer" type hint <built-in function any> either PEP-noncompliant or currently unsupported by @beartype.
I do not understand how to address this. Could you please provide some guidance? Perhaps my data is not formatted correctly; if so, could you point me to the correct format?
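Going only by the error text, the hint on the "conformer" parameter is `<built-in function any>`, that is, the lowercase built-in `any` rather than `typing.Any`. That would be unrelated to data formatting. I have not checked the repository source, so the classes below are hypothetical stand-ins that just show the difference between the two hints:

```python
import builtins
from typing import Any


class BrokenWrapper:
    # Lowercase `any` is the built-in function, not a type hint.
    def __init__(self, conformer: any):
        self.conformer = conformer


class FixedWrapper:
    # `typing.Any` is the hint that accepts every value.
    def __init__(self, conformer: Any):
        self.conformer = conformer


def hint_is_builtin_any(cls) -> bool:
    """True if the 'conformer' annotation is the built-in function `any`."""
    return cls.__init__.__annotations__.get("conformer") is builtins.any


print(hint_is_builtin_any(BrokenWrapper))  # True
print(hint_is_builtin_any(FixedWrapper))   # False
```

If the annotation in the installed package really is lowercase `any`, changing it to `Any` (or a concrete class) in a local copy would be one thing to try.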
Hey! Thanks for your code. I was wondering if you have plans to implement the BestRQTrainerASR class, BestRQTrainerASR class and the BestRQTrainASRWrapper class. If not, could you please guide me on how to implement it?
Is there a pretrained model available?