amazon-science / semimtr-text-recognition
Multimodal Semi-Supervised Learning for Text Recognition (SemiMTR)
License: Apache License 2.0
The language model reports "NaN or Inf found in input tensor."
train.txt
Can anyone help with why it fails to train on non-English characters?
Hi, I'm a little confused about the forward function in SeqCLR. In seqclr_proj.py, line 59, the output of the visual backbone is reshaped as # (N, E, H, W) -> (N, H*W, E),
but in OCR tasks the features are usually processed as # (N, E, H, W) -> (N, W, E*H).
The explanation in the paper is "Note that the sequence length depends on the width of the input image".
So what is the right shape to feed into the projector? Thanks!
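The two conventions being compared can be sketched with numpy (torch's permute/reshape behaves the same way); the sizes N=2, E=512, H=8, W=32 are illustrative assumptions, not the repo's actual dimensions:

```python
import numpy as np

# Toy visual-backbone output: batch N=2, channels E=512, height H=8, width W=32.
features = np.zeros((2, 512, 8, 32))

# Convention used in seqclr_proj.py: every spatial position becomes a frame,
# giving a sequence of length H*W.
seq_hw = features.transpose(0, 2, 3, 1).reshape(2, 8 * 32, 512)

# Common OCR convention: one frame per image column, channels and height
# collapsed together, giving a sequence of length W.
seq_w = features.transpose(0, 3, 1, 2).reshape(2, 32, 512 * 8)

print(seq_hw.shape)  # (2, 256, 512)
print(seq_w.shape)   # (2, 32, 4096)
```

The difference matters because the first form yields H*W frames per image, while the second ties the sequence length directly to the image width, as the paper's note suggests.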
A problem occurred when trying to train SemiMTR:
The training stage stops automatically after 5 epochs, regardless of the epoch value I set in the config file 'configs/semimtr_finetune.yaml'.
However, if I set the epoch count to less than 5, everything seems to work just fine.
AttributeError: module 'imgaug.augmenters' has no attribute 'MultiplyBrightness'
train.txt
I tried to check from which path each of these files imports imgaug, but I couldn't find the highlighted packages.
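If I recall the imgaug changelog correctly, MultiplyBrightness first appeared in imgaug 0.4.0, so this AttributeError usually means an older release is installed. A minimal, stdlib-only sketch of the version check (the helper name is mine, not from the repo):

```python
def meets_min_version(installed: str, required: tuple) -> bool:
    """True if a dotted version string is at least the (major, minor) minimum."""
    parts = []
    for piece in installed.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    parts += [0, 0]  # pad so short versions like "1" still compare safely
    return tuple(parts[:2]) >= required

# imgaug 0.2.9 predates MultiplyBrightness (assumption based on the changelog),
# so the usual fix is: pip install -U "imgaug>=0.4.0"
print(meets_min_version("0.2.9", (0, 4)))  # False
print(meets_min_version("0.4.0", (0, 4)))  # True
```

Comparing the output of `import imgaug; imgaug.__version__` against 0.4.0 this way should tell you whether an upgrade is needed.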
Very nice work! I noticed in your implementation details that ST is used for training, for 25 epochs with a batch size of 304. I would like to ask:
Is the batch size here the total across 4 GPUs, meaning the batch size on each V100 is 76?
How long does it take to train one epoch?
I have also tried to train ABINet, but each epoch seemed to take a very long time.
Thanks for open-sourcing this. I want to test the model's recognition of Chinese text. What changes should I make?
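One change Chinese support generally requires is an output charset that covers the target characters. A hedged sketch of deriving such a charset from training labels; the tab-separated "path<TAB>text" label format here is an assumption, not necessarily the repo's format:

```python
# Build a sorted charset from annotation lines of the form "image_path\ttext".
# The label format and file layout are assumptions for illustration only.
def build_charset(label_lines):
    chars = set()
    for line in label_lines:
        _, text = line.rstrip("\n").split("\t", 1)
        chars.update(text)
    return sorted(chars)  # sorted by Unicode codepoint for a stable ordering

labels = ["img_0.jpg\t你好", "img_1.jpg\t世界"]
charset = build_charset(labels)
print(len(charset))  # 4 distinct characters
```

The resulting character list would then replace the default alphabet in the model's config, alongside retraining, since the classifier head's size depends on the charset length.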
[2022-10-09 21:10:13,443 main.py:283 INFO consistency-regularization] Construct dataset.
[2022-10-09 21:10:13,478 main.py:129 INFO consistency-regularization] 4485536 training items found.
[2022-10-09 21:10:13,478 main.py:131 INFO consistency-regularization] 62944 valid items found.
[2022-10-09 21:10:13,478 main.py:133 INFO consistency-regularization] 147209 test items found.
[2022-10-09 21:10:13,478 main.py:289 INFO consistency-regularization] Construct model.
[2022-10-09 21:10:13,789 model_vision.py:38 INFO consistency-regularization] Read vision model from workdir/semimtr_vision_model_real_l_and_u.pth.
[2022-10-09 21:10:33,487 model_language.py:38 INFO consistency-regularization] Read language model from workdir/abinet_language_model.pth.
[2022-10-09 21:10:33,537 main.py:292 INFO consistency-regularization] Construct learner.
[2022-10-09 21:10:33,597 main.py:301 INFO consistency-regularization] Start testing
Traceback (most recent call last):
File "main.py", line 306, in <module>
main()
File "main.py", line 302, in main
test_on_each_ds(learner)
File "/media/disk4/flbl/cdsme/semimtr/semimtr/utils/test.py", line 19, in test_on_each_ds
last_metrics = learner.validate(dl=dl)
File "/home/lbl/miniconda3/envs/semimtr/lib/python3.7/site-packages/fastai/basic_train.py", line 391, in validate
val_metrics = validate(self.model, dl, self.loss_func, cb_handler)
File "/home/lbl/miniconda3/envs/semimtr/lib/python3.7/site-packages/fastai/basic_train.py", line 59, in validate
val_loss = loss_batch(model, xb, yb, loss_func, cb_handler=cb_handler)
File "/home/lbl/miniconda3/envs/semimtr/lib/python3.7/site-packages/fastai/basic_train.py", line 30, in loss_batch
loss = loss_func(out, *yb)
File "/home/lbl/miniconda3/envs/semimtr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/media/disk4/flbl/cdsme/semimtr/semimtr/losses/consistency_regularization_loss.py", line 73, in forward
pt_lengths_teacher, *args[2:], mask=threshold_mask)
File "/home/lbl/miniconda3/envs/semimtr/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "/media/disk4/flbl/cdsme/semimtr/semimtr/losses/losses.py", line 62, in forward
gt_labels, gt_lengths = gt_dict['label'], gt_dict['length']
IndexError: too many indices for tensor of dimension 3
Hi, thanks for your great work!
When I run the code, I run into a problem at the following line:
features = features.permute(0, 3, 2, 1).flatten(1, 2)
[2023-08-26 09:23:35,166 main.py:283 INFO pretrain-language-model] Construct dataset.
/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:560: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
warnings.warn(_create_warning_msg(
[2023-08-26 09:23:36,980 main.py:84 INFO pretrain-language-model] 102085 training items found.
[2023-08-26 09:23:36,980 main.py:86 INFO pretrain-language-model] 50000 valid items found.
[2023-08-26 09:23:36,980 main.py:289 INFO pretrain-language-model] Construct model.
[2023-08-26 09:23:41,078 main.py:292 INFO pretrain-language-model] Construct learner.
[2023-08-26 09:23:46,050 main.py:296 INFO pretrain-language-model] Start training.
[2023-08-26 09:23:46,569 callbacks.py:179 INFO pretrain-language-model] Train ended
[2023-08-26 09:23:46,569 callbacks.py:143 INFO pretrain-language-model] average data time = 0.0000s, average running time = 0.0000s
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/fastai/basic_train.py", line 99, in fit
for xb,yb in progress_bar(learn.data.train_dl, parent=pbar):
File "/usr/local/lib/python3.10/dist-packages/fastprogress/fastprogress.py", line 39, in __iter__
if self.total != 0: self.update(0)
File "/usr/local/lib/python3.10/dist-packages/fastprogress/fastprogress.py", line 59, in update
self.update_bar(0)
File "/usr/local/lib/python3.10/dist-packages/fastprogress/fastprogress.py", line 81, in update_bar
self.on_update(val, f'{pct}[{val}/{tot} {elapsed_t}{self.lt}{remaining_t}{end}]')
File "/usr/local/lib/python3.10/dist-packages/fastprogress/fastprogress.py", line 134, in on_update
elif self.parent is not None: self.parent.show()
File "/usr/local/lib/python3.10/dist-packages/fastprogress/fastprogress.py", line 177, in show
self.out.update(HTML(self.html_code))
AttributeError: 'NoneType' object has no attribute 'update'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/content/drive/.shortcut-targets-by-id/12Fpyqv7ac7VkBnm5mHDZrlJ-pRtYx53q/AIO/Competition_Module/BKAI/semimtr-text-recognition/main.py", line 306, in <module>
main()
File "/content/drive/.shortcut-targets-by-id/12Fpyqv7ac7VkBnm5mHDZrlJ-pRtYx53q/AIO/Competition_Module/BKAI/semimtr-text-recognition/main.py", line 297, in main
learner.fit(epochs=config.training_epochs,
File "/usr/local/lib/python3.10/dist-packages/fastai/basic_train.py", line 200, in fit
fit(epochs, self, metrics=self.metrics, callbacks=self.callbacks+callbacks)
File "/usr/local/lib/python3.10/dist-packages/fastai/basic_train.py", line 112, in fit
finally: cb_handler.on_train_end(exception)
File "/usr/local/lib/python3.10/dist-packages/fastai/callback.py", line 323, in on_train_end
self('train_end', exception=exception)
File "/usr/local/lib/python3.10/dist-packages/fastai/callback.py", line 251, in __call__
for cb in self.callbacks: self._call_and_update(cb, cb_name, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/fastai/callback.py", line 241, in _call_and_update
new = ifnone(getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs), dict())
File "/content/drive/.shortcut-targets-by-id/12Fpyqv7ac7VkBnm5mHDZrlJ-pRtYx53q/AIO/Competition_Module/BKAI/semimtr-text-recognition/semimtr/callbacks/callbacks.py", line 180, in on_train_end
self._eval_model()
File "/content/drive/.shortcut-targets-by-id/12Fpyqv7ac7VkBnm5mHDZrlJ-pRtYx53q/AIO/Competition_Module/BKAI/semimtr-text-recognition/semimtr/callbacks/callbacks.py", line 146, in _eval_model
last_metrics = self._validate()
File "/content/drive/.shortcut-targets-by-id/12Fpyqv7ac7VkBnm5mHDZrlJ-pRtYx53q/AIO/Competition_Module/BKAI/semimtr-text-recognition/semimtr/callbacks/callbacks.py", line 62, in _validate
val_metrics = validate(self.learn.model, dl, self.loss_func, cb_handler)
File "/usr/local/lib/python3.10/dist-packages/fastai/basic_train.py", line 57, in validate
for xb,yb in progress_bar(dl, parent=pbar, leave=(pbar is not None)):
File "/usr/local/lib/python3.10/dist-packages/fastprogress/fastprogress.py", line 39, in __iter__
if self.total != 0: self.update(0)
File "/usr/local/lib/python3.10/dist-packages/fastprogress/fastprogress.py", line 59, in update
self.update_bar(0)
File "/usr/local/lib/python3.10/dist-packages/fastprogress/fastprogress.py", line 81, in update_bar
self.on_update(val, f'{pct}[{val}/{tot} {elapsed_t}{self.lt}{remaining_t}{end}]')
File "/usr/local/lib/python3.10/dist-packages/fastprogress/fastprogress.py", line 133, in on_update
if self.display: self.out.update(HTML(self.progress))
AttributeError: 'NoneType' object has no attribute 'update'
Didn't you use EMA (exponential moving average) between the teacher & student models?
I found related code (https://github.com/amazon-science/semimtr-text-recognition/blob/main/semimtr/modules/model_fusion_teacher_student_ema.py), but there isn't any usage of it.
Thank you for the nice repository!
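For context, the scheme that module's name suggests is the standard teacher-student EMA update. A generic sketch of that technique, not the repo's exact code; the decay value is an illustrative assumption:

```python
# Standard EMA update: the teacher's weights drift slowly toward the
# student's, controlled by the decay factor (often 0.99-0.999 in practice).
def ema_update(teacher_params, student_params, decay=0.999):
    return [decay * t + (1.0 - decay) * s
            for t, s in zip(teacher_params, student_params)]

teacher = [1.0]
student = [0.0]
teacher = ema_update(teacher, student, decay=0.9)
print(teacher)  # [0.9]
```

In a training loop this update would run after each optimizer step on the student, keeping the teacher a smoothed copy of the student rather than a separately trained model.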
Hello, thanks for your great work. I have some questions about the inputs to the SeqCLR model. Should the images in the same batch have the same size (especially the same width)? Since the text in images varies in length, images with longer text may be distorted if all images are scaled to the same size; and if padding is applied instead, some frames in the sequence feature will lose their semantics. So how do you preprocess the image sizes within one batch?
Looking forward to your response. Thank you very much!
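One common answer to this trade-off is to scale every image to a fixed height while preserving aspect ratio, then right-pad to a fixed width, so long text is not squashed. A self-contained numpy sketch (nearest-neighbor resize to stay stdlib+numpy only; the 32x128 target geometry is an assumption, not the repo's setting):

```python
import numpy as np

def resize_and_pad(img, target_h=32, target_w=128, pad_value=0):
    """Scale a grayscale image to target_h keeping aspect ratio, pad to target_w."""
    h, w = img.shape[:2]
    new_w = min(target_w, max(1, round(w * target_h / h)))
    # Nearest-neighbor resize via index sampling.
    rows = (np.arange(target_h) * h / target_h).astype(int)
    cols = (np.arange(new_w) * w / new_w).astype(int)
    resized = img[rows][:, cols]
    # Right-pad to a fixed width so all images in a batch share one shape.
    out = np.full((target_h, target_w), pad_value, dtype=img.dtype)
    out[:, :new_w] = resized
    return out

img = np.ones((16, 40), dtype=np.uint8)
print(resize_and_pad(img).shape)  # (32, 128)
```

The padded frames on the right carry no text, which is exactly the concern raised above; implementations that take this route typically rely on the attention/decoder learning to ignore the padded region.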
Is it possible to fine-tune SemiMTR without using a trained language model?
If possible, please let me know how.
Thank you.