I'm a Data Scientist interested in AI and Machine Learning. I like to build things; you can find everything I build here on my GitHub account.
The best way to contact me is usually through Facebook or email.
Transformer OCR
License: Apache License 2.0
Dear Quốc,
I see that VietOCR currently handles small images well. How should I handle images at A3 size?
Hi,
I retrained the transformer OCR model on my own data, and now I want to deploy it with better performance, so I plan to convert the model to ONNX format.
The problem is that getting from the raw VietOCR model output to the final text takes several processing steps, and those steps don't inherit from nn.Module.
So I'd like to ask whether the model or the library supports this, or whether you have any suggestions for converting the model to ONNX.
Thank you.
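A sketch of one possible direction (my own assumption, not an official export path in the library): keep the decoding loop in Python and export only the traceable tensor ops by wrapping them in an nn.Module. The forward_encoder call below is an assumed method name based on what vietocr/tool/translate.py uses; verify it against your installed version.

import torch
import torch.nn as nn

class EncoderWrapper(nn.Module):
    # image -> encoder memory: pure tensor ops, so torch.onnx.export can trace it
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, img):
        src = self.model.cnn(img)
        return self.model.transformer.forward_encoder(src)  # assumed method name

# Hypothetical usage:
# wrapper = EncoderWrapper(model).eval()
# dummy = torch.randn(1, 3, 32, 128)  # batch x channels x height x width
# torch.onnx.export(wrapper, dummy, 'encoder.onnx',
#                   input_names=['img'], output_names=['memory'],
#                   dynamic_axes={'img': {0: 'batch', 3: 'width'}})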
@pbcquoc Hello,
For your 10M-sample dataset, how many steps did you train to obtain that pretrained model?
Thank you!
Hello @pbcquoc,
When deploying to production (specifically a website) and using the Seq2Seq architecture for prediction, I notice prediction takes quite long, about 3-4 s. Is there any way to improve this?
Thank you.
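For reference, two settings that usually dominate inference latency here are beam search and the device; a hedged sketch using the config keys quoted elsewhere in this thread (measure in your own deployment):

from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

config = Cfg.load_config_from_name('vgg_seq2seq')
config['device'] = 'cuda:0'                # GPU inference; CPU is several times slower
config['predictor']['beamsearch'] = False  # greedy decoding is much faster than beam search
detector = Predictor(config)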
Hello, when I run the notebook I hit a few errors.
I worked around them by switching to the keys OneCycleLR expects:
# swap the old warmup-scheduler keys for OneCycleLR-compatible ones
config["optimizer"]["max_lr"] = 0.1
config["optimizer"]["total_steps"] = 4000
del config["optimizer"]["init_lr"]
del config["optimizer"]["n_warmup_steps"]
I also see "missmatching shape" warnings from the model.
When I run prediction, the accuracy is 0%. Could this be because the model hasn't been trained long enough?
Thank you!
Could you share the pretrained seq2seq weights? I could only find the transformer pretrained weights. Thank you.
Hello, first of all thank you for the OCR library; it is very useful for me and the community. I have a question I hope you can answer: when generating data to train the attention OCR model, should the text lines be meaningful sentences? And if I also need to OCR digits and special characters, how should I create the data? I'm currently generating random words mixed with digits and symbols, e.g. "Đồng bào con gà 12389 * &^^", and training on that gives poor results. Thank you.
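One common remedy, sketched below under the assumption that the test-time text mixes words, digits, and punctuation: sample real words from a corpus and inject digits and symbols at controlled ratios, rather than fully random character soup, so the training label distribution matches deployment. All names here are illustrative, not part of the library.

import random

WORDS = ['Đồng', 'bào', 'con', 'gà', 'ngày', 'tháng']  # in practice, sample from a real corpus
PUNCT = list('!"#$%&\'()*+,-./:;<=>?@[]^_`{|}~')

def random_label(n_tokens=5, p_digit=0.2, p_punct=0.1):
    # mix word / digit / punctuation tokens at controlled ratios
    tokens = []
    for _ in range(n_tokens):
        r = random.random()
        if r < p_digit:
            tokens.append(str(random.randint(0, 99999)))
        elif r < p_digit + p_punct:
            tokens.append(random.choice(PUNCT))
        else:
            tokens.append(random.choice(WORDS))
    return ' '.join(tokens)

print(random_label())  # e.g. "ngày 12389 tháng / con"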
I'd also like to ask you a few theory questions.
Dear anh,
I followed the guide and created the demo script from https://github.com/pbcquoc/vietocr/blob/master/vietocr_gettingstart.ipynb
When I run it I get this error, because my machine has no network connection.
Reading the code, I see this part connects to GitHub. Can I switch it to an offline file instead?
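A sketch of the offline route, assuming you download the config YAML and the weights once beforehand; Cfg.load_config_from_file is the file-based counterpart of load_config_from_name (confirm the name in your installed version), and pointing config['weights'] at a local file avoids the download step:

from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

config = Cfg.load_config_from_file('config/vgg-transformer.yml')  # local copy of the YAML
config['weights'] = './weights/transformerocr.pth'                # local file instead of a URL
config['device'] = 'cpu'
detector = Predictor(config)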
Hello, how should I set up the config to train on grayscale images? I tried convert('L') in process_image() in translate.py, but then I hit an error in the image augmentor.
RuntimeError: CUDA error: no kernel image is available for execution on the device
Hello, I'm using vietocr to retrain on my own dataset. The model works very well on most text, but it often fails on the '/' character, e.g. dates with '/' in the middle. Is there any way to improve this? Thank you!
Hi, I can't find the attention OCR model in the code or as a public release, only the transformer OCR. Could you share a download link?
Thanks to the author.
Hi,
I downloaded the vgg-transformer weights and got this error when calling load_state_dict:
size mismatch for transformer.embed_tgt.weight: copying a param with shape torch.Size([232, 256]) from checkpoint, the shape in current model is torch.Size([233, 256]).
size mismatch for transformer.fc.weight: copying a param with shape torch.Size([232, 256]) from checkpoint, the shape in current model is torch.Size([233, 256]).
size mismatch for transformer.fc.bias: copying a param with shape torch.Size([232]) from checkpoint, the shape in current model is torch.Size([233]).
Here is my config file:
vocab: 'aAàÀảẢãÃáÁạẠăĂằẰẳẲẵẴắẮặẶâÂầẦẩẨẫẪấẤậẬbBcCdDđĐeEèÈẻẺẽẼéÉẹẸêÊềỀểỂễỄếẾệỆfFgGhHiIìÌỉỈĩĨíÍịỊjJkKlLmMnNoOòÒỏỎõÕóÓọỌôÔồỒổỔỗỖốỐộỘơƠờỜởỞỡỠớỚợỢpPqQrRsStTuUùÙủỦũŨúÚụỤưƯừỪửỬữỮứỨựỰvVwWxXyYỳỲỷỶỹỸýÝỵỴzZ0123456789!"#$%&''()*+,-./:;<=>?@[\]^_`{|}~ '
device: cuda
weights: weights/transformerocr.pth
backbone: vgg19_bn
cnn:
  # pooling stride size
  ss:
    - [2, 2]
    - [2, 2]
    - [2, 1]
    - [2, 1]
    - [1, 1]
  # pooling kernel size
  ks:
    - [2, 2]
    - [2, 2]
    - [2, 1]
    - [2, 1]
    - [1, 1]
transformer:
  d_model: 256
  nhead: 8
  num_encoder_layers: 6
  num_decoder_layers: 6
  dim_feedforward: 2048
  max_seq_length: 1024
  pos_dropout: 0.1
  trans_dropout: 0.1
seq_modeling: 'transformer'
beamsearch: False
I'm also hitting a size-mismatch error with the resnet_transformer model (version 0.1.9).
Here is the code I ran:
from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

config = Cfg.load_config_from_name('resnet_transformer')
config['device'] = 'cuda:0'
config['predictor']['beamsearch'] = False
detector = Predictor(config)
Here is the error message:
File "/home/duycuong/PycharmProjects/research_py3/text_recognition/classifier/vietocr/vietocr/eval.py", line 24, in <module>
detector = Predictor(config)
File "/home/duycuong/PycharmProjects/research_py3/text_recognition/classifier/vietocr/vietocr/tool/predictor.py", line 19, in __init__
model.load_state_dict(torch.load(weights, map_location=torch.device(device)))
File "/home/duycuong/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for VietOCR:
Missing key(s) in state_dict: "cnn.model.conv0_1.weight", "cnn.model.bn0_1.weight", "cnn.model.bn0_1.bias", "cnn.model.bn0_1.running_mean", "cnn.model.bn0_1.running_var", [... all remaining "cnn.model.*" backbone keys ...]
Unexpected key(s) in state_dict: "cnn.cnn.features.0.weight", "cnn.cnn.features.0.bias", "cnn.cnn.features.1.weight", "cnn.cnn.features.1.bias", [... all remaining "cnn.cnn.features.*" and "cnn.cnn.classifier.*" keys ...]
size mismatch for transformer.embed_tgt.weight: copying a param with shape torch.Size([232, 512]) from checkpoint, the shape in current model is torch.Size([233, 256]).
size mismatch for transformer.pos_enc.pe: copying a param with shape torch.Size([10000, 1, 512]) from checkpoint, the shape in current model is torch.Size([1024, 1, 256]).
size mismatch for transformer.transformer.encoder.layers.0.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
size mismatch for transformer.transformer.encoder.layers.0.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
size mismatch for transformer.transformer.encoder.layers.0.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
size mismatch for transformer.transformer.encoder.layers.0.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for transformer.transformer.encoder.layers.0.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
size mismatch for transformer.transformer.encoder.layers.0.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
size mismatch for transformer.transformer.encoder.layers.0.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for transformer.transformer.encoder.layers.0.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for transformer.transformer.encoder.layers.0.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for transformer.transformer.encoder.layers.0.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for transformer.transformer.encoder.layers.0.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
[... identical 512-vs-256 size mismatches repeated for encoder layers 1-5, the encoder norm, decoder layers 0-5 (including their multihead_attn and norm3 parameters), and the decoder norm ...]
size mismatch for transformer.fc.weight: copying a param with shape torch.Size([232, 512]) from checkpoint, the shape in current model is torch.Size([233, 256]).
size mismatch for transformer.fc.bias: copying a param with shape torch.Size([232]) from checkpoint, the shape in current model is torch.Size([233]).
I got this message while training. Could you show me how to fix it?
Computing MD5: /tmp/tranformerorc.pth
MD5 matches: /tmp/tranformerorc.pth
transformer.embed_tgt.weight missmatching shape
transformer.fc.weight missmatching shape
transformer.fc.bias missmatching shape
Traceback (most recent call last):
File "main.py", line 26, in <module>
trainer = Trainer(config, pretrained=True)
File "/Volumes/DATA/DRAF/Deep Learning/DetectHand2/vietocr/model/trainer.py", line 62, in __init__
self.scheduler = OneCycleLR(self.optimizer, **config['optimizer'])
TypeError: __init__() got an unexpected keyword argument 'n_warmup_steps'
Here is my code:
from vietocr.tool.config import Cfg
from vietocr.model.trainer import Trainer
if __name__ == '__main__':
    config = Cfg.load_config_from_name('vgg_transformer')

    dataset_params = {
        'name': 'hw',
        'data_root': './data/',
        'train_annotation': 'train_annotation.txt',
        'valid_annotation': 'test_annotation.txt'
    }

    params = {
        'print_every': 200,
        'valid_every': 15 * 200,
        'iters': 20000,
        'checkpoint': './checkpoint/transformerocr_checkpoint.pth',
        'export': './weights/transformerocr.pth',
        'metrics': 10000
    }

    config['trainer'].update(params)
    config['dataset'].update(dataset_params)

    trainer = Trainer(config, pretrained=True)
    trainer.visualize_dataset()
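For this n_warmup_steps error, the key swap quoted earlier in the thread appears to be the workaround: OneCycleLR expects max_lr/total_steps, not the old warmup keys. A hedged sketch, applied before constructing the Trainer (the max_lr value is illustrative):

# OneCycleLR takes max_lr and total_steps; drop the old warmup keys first
config['optimizer']['max_lr'] = 0.0003
config['optimizer']['total_steps'] = config['trainer']['iters']
config['optimizer'].pop('init_lr', None)
config['optimizer'].pop('n_warmup_steps', None)
trainer = Trainer(config, pretrained=True)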
Hello,
First of all, thank you for VietOCR; it is genuinely useful for the machine learning community and for me in particular. While studying the library, I have one small question: I haven't found how input text-line images of different widths are handled so they can be trained in batches. Could you share how this works?
Thank you, and I wish you continued success and many more contributions to the Vietnamese AI community.
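For context, one standard answer (a sketch of the general technique, not necessarily VietOCR's exact implementation): resize every line image to a fixed height, group images of similar width into the same batch, and right-pad each image to the batch's maximum width:

import torch
import torch.nn.functional as F

def collate_variable_width(images):
    # images: list of C x H x W tensors, already resized to a common height;
    # pad each on the right to the widest image, then stack into a batch
    max_w = max(img.shape[-1] for img in images)
    padded = [F.pad(img, (0, max_w - img.shape[-1])) for img in images]
    return torch.stack(padded)  # B x C x H x max_w

Sorting the dataset by width before forming batches keeps the padding overhead small.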
Dear Quốc,
Do you have a trained vgg_seq2seq model and instructions for using it? I'm using vgg_transformer and it really is slow at inference.
Hello Quốc,
Thank you for creating an OCR library for Vietnamese. I have three questions:
+ To extract information from ID cards (CMND) with the Transformer, can I scan the whole card in one pass, or do I have to crop it into individual boxes?
+ If I train on ID cards, roughly how many samples do I need before the model is usable?
+ I work in .NET, and when I downloaded the weights file as in the picture, the results came out wrong; did I miss a step? :D
Best wishes.
Hi Quốc, I'm testing the model on several font styles and hit a problem when the crop region changes. For example:
vgg19_bn-seq2seq result: UASORU7250KXXY
vgg19_bn-seq2seq result: UASCRU7250KXY
vgg19_bn-seq2seq result: UASORU7250KXY
When I test with the transformer model, the same issue appears. Is there a way to fix this, or any guidance on cropping nicely so the model reads well?
Hello, when training from scratch, roughly how many images are enough for good results?
Thank you.
When I tried to use your notebook and your dataset to train a new model, I got the error "This DataLoader will create 3 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create". And when I set num_workers = 2 in the config file, a new error occurred: "StopIteration".
Can you help me fix this? Thank you!
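A sketch of lowering the worker count; the 'dataloader' section is an assumption based on the library's base config layout, so verify it against your config file. The later StopIteration typically means the dataset iterator yields no batches, so it is also worth checking that the annotation files actually contain samples:

config['dataloader']['num_workers'] = 2  # assumed config section; verify in your base config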
Hi, how do I get the probability value back after reading an image?
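A sketch, assuming your installed version's Predictor.predict supports the return_prob flag (present in recent releases; verify the signature for yours):

from PIL import Image

img = Image.open('sample.png')  # hypothetical input image
text, prob = detector.predict(img, return_prob=True)  # prob: model confidence score
print(text, prob)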
After cloning I built and installed the package, but prediction raises an error. Could you help me?
I have a small question about how targets are processed in vietocr.
Hi,
I fine-tuned vietocr on my data. Before fine-tuning, the returned prob was almost always 0.93; after fine-tuning it is mostly 0.91. How does vietocr compute this prob, and why are the returned values so similar? Thanks!
Hi!
How do I set things up to continue training a partially trained model? I tried passing my model as the checkpoint but it didn't work.
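A hedged sketch of resuming from a training checkpoint; load_checkpoint is the method name in the repo's trainer.py (treat it as an assumption for your installed version), and it restores more state than a bare weights file:

trainer = Trainer(config, pretrained=False)  # skip loading the pretrained weights
trainer.load_checkpoint('./checkpoint/transformerocr_checkpoint.pth')  # assumed method name
trainer.train()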
Hi Quốc, I'm testing your model with the code below and get this error:
Traceback (most recent call last):
File "/home/duycuong/PycharmProjects/research_py3/text_recognition/classifier/vietocr/vietocr_gettingstart.py", line 13, in <module>
detector = Predictor(config)
File "/home/duycuong/PycharmProjects/research_py3/text_recognition/classifier/vietocr/vietocr/tool/predictor.py", line 19, in __init__
model.load_state_dict(torch.load(weights, map_location=torch.device(device)))
File "/home/duycuong/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for VietOCR:
size mismatch for transformer.embed_tgt.weight: copying a param with shape torch.Size([232, 256]) from checkpoint, the shape in current model is torch.Size([233, 256]).
size mismatch for transformer.fc.weight: copying a param with shape torch.Size([232, 256]) from checkpoint, the shape in current model is torch.Size([233, 256]).
size mismatch for transformer.fc.bias: copying a param with shape torch.Size([232]) from checkpoint, the shape in current model is torch.Size([233]).
Here is the code I ran:
from PIL import Image
from vietocr.tool.predictor import Predictor
from vietocr.tool.config import Cfg
config = Cfg.load_config_from_name('vgg_transformer')
config['device'] = 'cuda:0'
config['predictor']['beamsearch'] = False
detector = Predictor(config)
Quốc, I'm training on two RTX 3060s; what should the "device" parameter be set to?
Thanks.
Hello, I ran the sample training code on my machine with an RTX 2080 GPU and CUDA 10.1, but it hangs at Trainer.train().
However, the getting-started notebook runs fine on Colab.
Debugging a bit, it seems stuck fetching batches, around line 100 of trainer.py in the model directory.
I haven't found a fix yet; have you run into this before?
Hi Quốc,
Thank you for vietocr.
I'm rerunning the training part of vietocr_gettingstart.ipynb with version 0.3.2, but after 20000 iterations the results are low.
I have now run about 180k iterations and get roughly 0.30 full-sequence accuracy and 0.70 per-character accuracy.
For comparison, with the dataset in the notebook, what full-sequence accuracy did you reach, and after how many iterations?
Thanks in advance!
Could you check this error for me? It happens on version 0.1.9. In the code, the failing line is "from vietocr.loader.DataLoader import DataGen".
Traceback (most recent call last):
File "main.py", line 2, in <module>
from vietocr.model.trainer import Trainer
File "/Volumes/DATA/STUDY/AI/Buoi05/DetectHand2/lib/python3.7/site-packages/vietocr/model/trainer.py", line 11, in <module>
from vietocr.loader.DataLoader import DataGen
Hello, I tried training your code with a tiny dataset (only 4 images) as a smoke test. Dataset creation succeeded, but visualize shows nothing and training raises the error in the screenshot. Do I need to adjust the valid/iter parameters to match the number of images in the dataset? Thank you.
Hello, I want to restrict my vocab to just the digits 0-9 for prediction; where can I change this?
When I set config['vocab'] = '0123456789' before initializing the detector, I get a size-mismatch error.
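The mismatch is expected: changing the vocab changes the size of the embedding and output layers, so the pretrained head no longer fits. A sketch of the two usual options (illustrative, not an official recipe):

# Option 1: digits-only vocab, but train from scratch instead of loading pretrained weights
config['vocab'] = '0123456789'
trainer = Trainer(config, pretrained=False)

# Option 2: keep the full vocab and pretrained weights, and post-filter predictions
# text = ''.join(ch for ch in detector.predict(img) if ch.isdigit())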
Hello,
I'm training an OCR model with vietocr. I have trained several times before, but today I hit a strange error; could you help me figure out what it is?
Thank you.
Computing MD5: /tmp/tranformerorc.pth
MD5 matches: /tmp/tranformerorc.pth
transformer.decoder.embedding.weight missmatching shape, required torch.Size([256, 256]) but found torch.Size([233, 256])
transformer.decoder.fc_out.weight missmatching shape, required torch.Size([256, 1024]) but found torch.Size([233, 1024])
transformer.decoder.fc_out.bias missmatching shape, required torch.Size([256]) but found torch.Size([233])
Hello everyone, I have a question I hope someone can answer.
From an article I read about applying transformers to text, the encoder input needs padding so all sentences have equal length.
But here the input images have different widths, and after the CNN the input has shape W x N x C, where W varies.
So do we need padding here to make the inputs equal length? Thank you.
Hello Quốc,
I'm retraining the model, but training fails with this error, and my research hasn't turned up the cause; I'd appreciate your help.
I'm on Colab with torch 1.9, CUDA 10.2, vietocr 0.3.5.
The error is:
----> 1 trainer = Trainer(config, pretrained=True)
8 frames
/usr/local/lib/python3.7/dist-packages/vietocr/model/trainer.py in __init__(self, config, pretrained, augmentor)
30
31 self.config = config
---> 32 self.model, self.vocab = build_model(config)
33
34 self.device = config['device']
/usr/local/lib/python3.7/dist-packages/vietocr/tool/translate.py in build_model(config)
128 config['seq_modeling'])
129
--> 130 model = model.to(device)
131
132 return model, vocab
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in to(self, *args, **kwargs)
850 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
851
--> 852 return self._apply(convert)
853
854 def register_backward_hook(
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
528 def _apply(self, fn):
529 for module in self.children():
--> 530 module._apply(fn)
531
532 def compute_should_use_set_data(tensor, tensor_applied):
[... three more identical _apply frames ...]
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
550 # `with torch.no_grad():`
551 with torch.no_grad():
--> 552 param_applied = fn(param)
553 should_use_set_data = compute_should_use_set_data(param, param_applied)
554 if should_use_set_data:
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in convert(t)
848 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
849 non_blocking, memory_format=convert_to_format)
--> 850 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
851
852 return self._apply(convert)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Thank you.
I'm using Tesseract to detect text and extract text regions. Those images have different sizes.
How could I use TransformerOCR as a batch predictor for those images?
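A simple sketch, assuming your Predictor version exposes only single-image predict (as in the snippets above): loop over the Tesseract crops and let the predictor handle per-image resizing; a genuinely batched forward pass would need width bucketing like the collate sketch earlier in this thread. region_paths is a hypothetical list of your crop files.

from PIL import Image

crops = [Image.open(p) for p in region_paths]  # region_paths: hypothetical list of crop paths
texts = [detector.predict(img) for img in crops]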