Comments (11)
Hey Beatrice, in fact it does work with the new Colab, just not in the old way: you need to restart the kernel.
Here is the test notebook I put together after this change landed. Run it step by step; before the last step it shows you a button to restart the kernel, and after that you continue from where you left off. Do not initialize any variables before that point, as they will be lost (or if you have to, redefine them afterwards).
https://colab.research.google.com/drive/1CfZbtNLht4h0ShOJR1qUqucNg893rsOP?usp=share_link
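In case the notebook link goes stale: the restart step usually boils down to killing the kernel process so Colab spins up a fresh one. A minimal sketch (the function name is mine, not from the notebook):

```python
import os
import signal

def restart_colab_kernel():
    """Kill the current kernel process; Colab restarts it automatically.
    All in-memory variables are lost, so define anything you still need
    only after the restart."""
    os.kill(os.getpid(), signal.SIGKILL)
```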
AFAIK, there is no immediate plan to update the code to TF v2 as it would be a huge undertaking.
Bülent
from stt.
Ok, thanks for the information! I can confirm that after installing Python 3.7 on Colab and creating a virtual env with 3.7, TensorFlow 1.15.4 and STT 1.4.0 install successfully.
Meanwhile, I finished the grid search on my local machine, and it seems that increasing the train batch size leads to a higher WER and CER, so I will go for a lower batch size with my audio data samples.
I can train up to batch size 16 with my graphics card. I wanted to include batch size 32 as a comparison in my thesis as well, but I can do that later if I have time :)
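For reference, the install steps described above look roughly like this on a fresh Colab VM. This is a hedged sketch: the deadsnakes PPA and the venv path are my assumptions, not from the thread.

```shell
# Install Python 3.7 alongside Colab's default interpreter
# (deadsnakes PPA is one common source; an assumption here).
sudo apt-get update
sudo apt-get install -y python3.7 python3.7-venv python3.7-dev

# Create a 3.7 virtual environment (path is a placeholder)
python3.7 -m venv /content/venv37
/content/venv37/bin/pip install --upgrade pip

# Under Python 3.7, TF 1.15.4 and STT 1.4.0 resolve again
/content/venv37/bin/pip install tensorflow==1.15.4 coqui-stt-training==1.4.0
```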
Hey Bülent,
thanks for your answer and for sharing your notebook! It seems they changed a lot; pip on the new Colab no longer finds any version below 2:
ERROR: Could not find a version that satisfies the requirement tensorflow==1.15.4 (from coqui-stt-training) (from versions: 2.2.0, 2.2.1, 2.2.2, 2.2.3, 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.3.4, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 2.5.0, 2.5.1, 2.5.2, 2.5.3, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.6.3, 2.6.4, 2.6.5, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.7.1, 2.7.2, 2.7.3, 2.7.4, 2.8.0rc0, 2.8.0rc1, 2.8.0, 2.8.1, 2.8.2, 2.8.3, 2.8.4, 2.9.0rc0, 2.9.0rc1, 2.9.0rc2, 2.9.0, 2.9.1, 2.9.2, 2.9.3, 2.10.0rc0, 2.10.0rc1, 2.10.0rc2, 2.10.0rc3, 2.10.0, 2.10.1, 2.11.0rc0, 2.11.0rc1, 2.11.0rc2, 2.11.0)
ERROR: No matching distribution found for tensorflow==1.15.4
This error occurs when installing STT from git and also when trying to manually install TensorFlow in a later cell. I'll try to install it from source; maybe that works.
Beatrice
I found the reason: they switched the image to Python 3.8, which only supports TF 2.2+.
I checked; they still have Python 3.6 installed. Can you try creating a virtual env that uses Python 3.6?
Hey Bülent,
thanks for the hint about the installed Python version! I tried it with a virtual environment; it works, but then the following error occurs:
ERROR: Package 'coqui-stt-training' requires a different Python: 3.6.9 not in '<3.9,>=3.7'
I'll try installing version 3.7 when I have time and see what happens. Thanks for helping, though!
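The error above comes from the package's `Requires-Python` metadata (`>=3.7,<3.9`). A quick way to sanity-check an interpreter before building a venv for it (the helper name is mine):

```python
import sys

def python_ok(version=sys.version_info[:2], lower=(3, 7), upper=(3, 9)):
    """Return True if a (major, minor) version falls inside
    the half-open range [lower, upper), matching pip's
    '>=3.7,<3.9' constraint from the error message."""
    return lower <= tuple(version) < upper

# Colab's 3.6.9 fails the check; 3.7 and 3.8 pass, 3.9 does not.
```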
br
Beatrice
For a quick check you could use an older version of STT. I couldn't find the exact point where Python 3.6 support was dropped, but you can use v1.0.0, for example. With respect to the underlying DeepSpeech model nothing changed, but there have been changes to the parameters, so use that version's documentation for the parameters.
It would be very nice if you could share the relevant cells here for people with the same question in the future.
it seems that increasing the train batch size leads to a higher WER and CER
That might be data dependent. I did a similar test last year; the results were not conclusive (rather erratic), but I'll share them here anyway:
As you can see, with the training batch size set to 32 (and 16), the best epoch was reached too early. I did not check the loss graphs at the time, but maybe you should for the thesis, to pinpoint possible overfitting etc.
Sure, here you go:
https://colab.research.google.com/drive/1mLXfqVXIQLbgyfa2pXzVay0fWoh9Geod?usp=sharing
I'm not a heavy Colab user; somehow one has to activate the virtual env in every cell, I think.
Thanks for sharing your results! I don't have mine ready yet (but I'm happy to share the thesis once it's finished).
Thank you for sharing the solution Beatrice.
AFAIK, in Colab (probably in all IPython/notebook implementations), each cell starts a new shell, so you need to re-activate every time. I found that defining functions and calling them in succession from a single cell makes it easier. It kind of defeats the purpose of using a notebook and results in many linting underlines, but it works.
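One way to make the per-cell activation less painful is a small helper that wraps each command in a freshly activated shell. A sketch, assuming a venv path you created earlier (the helper name and path are mine):

```python
import subprocess

def run_in_venv(venv_dir, cmd):
    """Run `cmd` in a bash shell with the given venv activated.
    Each Colab cell spawns a fresh shell, so the activation has
    to be repeated for every command."""
    return subprocess.run(
        f"source {venv_dir}/bin/activate && {cmd}",
        shell=True,
        executable="/bin/bash",
        capture_output=True,
        text=True,
    )

# e.g. run_in_venv("/content/venv37", "pip list")
```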
Good luck with your thesis :)
AFAIK, there is no immediate plan to update the code to TF v2 as it would be a huge undertaking
I hate it when that happens :-/ And since almost all frameworks released in recent years are basically beta versions that constantly introduce breaking changes and drop backwards compatibility, it has become the everyday nightmare of programmers 😞.
The question is how long will you be able to live with TF < 2 :-|.
Maybe the tf.compat module can help?
I'm already facing a situation where I need to use two libraries in the same program (one being Coqui) and one requires TF 2 :-(
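For what it's worth, the `tf.compat.v1` shim lets TF1-style graph code run under a TF2 install, which can help when mixing libraries. A minimal sketch (not Coqui's code; the guard makes it a no-op when TensorFlow isn't installed):

```python
# TF1-style code under a TF2 installation via the compat shim.
try:
    import tensorflow.compat.v1 as tf
    tf.disable_v2_behavior()  # restore graph-mode / Session semantics
    TF_AVAILABLE = True
except ImportError:
    TF_AVAILABLE = False

def tf1_style_demo():
    """Build a tiny TF1 graph and run it in a Session."""
    if not TF_AVAILABLE:
        return None
    a = tf.placeholder(tf.float32)
    b = a * 2.0
    with tf.Session() as sess:
        return sess.run(b, feed_dict={a: 21.0})
```

Note this only papers over API differences; it doesn't help when one dependency genuinely needs TF2 eager behavior enabled.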
You can totally use TFv2 for inference already.
Training is another beast in itself.
The question is how long will you be able to live with TF < 2 :-|.
As long as we can't use TFv2 for training. We have some very specific requirements for training models that TFv2 is lacking for now.
I'm already facing a situation where I need to use two libraries in the same program
You shouldn't mix dependencies like that. Training should be performed inside its own dedicated environment.
Meaning you should have one notebook for training using STT, and create other notebooks for your other needs.
@HarikalarKutusu can tell you that notebooks are not made for training models. They are good tools to learn and play with code, but not to seriously produce models at scale.
If you followed the docs, you'll have seen that we actually recommend using our Docker image to train your models, as it's the easiest way to train and comes with everything you need out of the box.
We suggest you use our Docker image as a base for training.
- https://stt.readthedocs.io/en/latest/TRAINING_INTRO.html
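The Docker route from the training docs looks roughly like this. The image tag matches the STT documentation; treat the data paths and flags shown here as placeholders for your own setup.

```shell
# Pull the official training image (tag per the STT training docs)
docker pull ghcr.io/coqui-ai/stt-train:latest

# Run training inside the container, mounting your data directory
docker run --gpus all -it \
  -v /path/to/data:/data \
  ghcr.io/coqui-ai/stt-train:latest \
  python -m coqui_stt_training.train \
    --train_files /data/train.csv \
    --dev_files /data/dev.csv \
    --test_files /data/test.csv
```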
I'll move this ticket to a discussion as there is really not much we can do about it. We have made some progress towards it but there is still a long way before we can fully use TF2 as base for training.