
vietocr's Introduction

Hey! 👋

I'm a data scientist interested in AI and machine learning. I like to build things; you can find everything I build here on my GitHub account.

The best way to contact me is usually through Facebook or Email.

vietocr's People

Contributors

pbcquoc


vietocr's Issues

Using large images

Dear Quốc,

At the moment VietOCR handles small images well. What should I do to read an A3-sized image?

How to convert the transformer OCR model to ONNX?

Hi,
I retrained the transformer OCR model on my own data, and now I want to deploy it with better performance, so I plan to convert the model to ONNX.
The problem is that going from the VietOCR model's raw output to the final text involves several processing steps, and those steps do not inherit from nn.Module.
So I would like to ask whether the model or the library supports this, or whether you have any suggestions for converting the model to ONNX.
Thank you.
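Not an official export path, just a minimal sketch under the assumption that the predictor exposes the CNN backbone as detector.model.cnn (as in the current vietocr source): export the convolutional part to ONNX and keep the autoregressive decoding loop, which is not a single nn.Module, in Python.

import torch
from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

config = Cfg.load_config_from_name('vgg_transformer')
config['device'] = 'cpu'
detector = Predictor(config)

# Dummy input: N x C x H x W, with batch and width marked as dynamic axes.
dummy = torch.randn(1, 3, 32, 475)
torch.onnx.export(
    detector.model.cnn, dummy, 'cnn.onnx',
    input_names=['image'], output_names=['features'],
    dynamic_axes={'image': {0: 'batch', 3: 'width'}},
    opset_version=12,
)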

About the pretrained model

@pbcquoc Hello,
May I ask how many steps you trained for on your 10M-sample dataset to obtain that pretrained model?
Thank you!

Performance when deploying to production

Hello @pbcquoc,
When deploying to production, specifically a website, and using the Seq2Seq architecture for prediction, I find that prediction takes quite long, about 3-4 s. Is there any way to improve this?
Thank you.
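Two low-risk settings that usually help latency, both grounded in the snippets that appear in other issues below (greedy decoding instead of beam search, and making sure inference actually runs on the GPU); treat this as a starting point rather than a guaranteed fix:

config['predictor']['beamsearch'] = False   # greedy decoding is much faster than beam search
config['device'] = 'cuda:0'                 # run on GPU; use 'cpu' only when no GPU is available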

Predicted accuracy is 0; some keys of the config file were wrong

Hello, I ran the notebook and hit a few errors:
[screenshot]
I worked around them by switching to different config keys:

config["optimizer"]["max_lr"] = 0.1
config["optimizer"]["total_steps"] = 4000

del config["optimizer"]["init_lr"]
del config["optimizer"]["n_warmup_steps"]

Apart from that, I also see mismatching shapes when loading the model.
When I run training, the predicted accuracy is 0%. Is this simply because the model has not been trained long enough?
[screenshot]
Thank you!

Pretrained file for the seq2seq backbone

Could you share the pretrained file for seq2seq? I could only find the transformer pretrained weights. Thank you.

Training data

Hello, first of all thank you for this OCR library; it is very useful for me and for the community. I have a question I hope you can answer. When generating data to train the attention OCR model, should the text lines be meaningful sentences? If I also need to OCR digits and special characters, how should I create the data? I am currently generating random words with digits and symbols mixed in, for example "Đồng bào con gà 12389 * &^^", and training gives results that are not as expected. Thank you.

Some questions about the CNN

May I ask a few theoretical questions?

  • The VGG-19 network you use produces, before flattening, an output of 1xCx2x32, i.e. BxCxHxW (for a 128x32 input image). This differs from the original CRNN-CTC code, whose output is 1xCx1x33, i.e. 33 timesteps. The idea of CRNN-CTC is that each timestep corresponds to a width x 32 rectangle of the original image; in your network the feature-map height is not 1, so you flatten to 1xCx64, i.e. 64 timesteps. As far as I can tell, each of those 64 timesteps then only covers a width x 16 region of the original image. Why did you change it this way?
  • The second question: do you think using AvgPool2d instead of MaxPool2d would bring any benefit?
    Thank you.
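For reference, a small illustration (not the library's exact code) of the flattening described above: a B x C x H x W feature map with H = 2 and W = 32 becomes a 64-step sequence once the height and width axes are merged.

import torch

feat = torch.randn(1, 256, 2, 32)                    # B x C x H x W from the CNN
b, c, h, w = feat.shape
seq = feat.permute(0, 3, 2, 1).reshape(b, h * w, c)  # B x (H*W) x C, i.e. 64 timesteps
print(seq.shape)                                     # torch.Size([1, 64, 256])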

Training with grayscale images

Hello, how do I need to configure things to train on grayscale images? I tried convert('L') inside process_image() in translate.py, but I am now hitting an error in the image augmentor.
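A hedged workaround sketch, assuming the augmentor and the pretrained CNN both expect three channels: drop the colour information but keep the image 3-channel so the rest of the pipeline is unchanged.

from PIL import Image

def to_gray_rgb(img: Image.Image) -> Image.Image:
    # Collapse to grayscale, then replicate the single channel back to RGB
    # so the augmentor and the VGG backbone still receive 3-channel input.
    return img.convert('L').convert('RGB')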

Error with the '/' character

Hello, I am using vietocr to retrain on my own dataset. I notice the model runs very well on most kinds of text, but it often fails on the '/' character, for example dates with '/' between the fields. Is there any way to improve this? Thank you!

Attention OCR model

Hi, I cannot find the attention OCR model in the code or among the published models, only transformer OCR. Could you give me a download link?
Thank you!

RuntimeError: Error(s) in loading state_dict for VietOCR

Hi,
I downloaded the vgg-transformer weights and then got this error from load_state_dict:

size mismatch for transformer.embed_tgt.weight: copying a param with shape torch.Size([232, 256]) from checkpoint, the shape in current model is torch.Size([233, 256]).
size mismatch for transformer.fc.weight: copying a param with shape torch.Size([232, 256]) from checkpoint, the shape in current model is torch.Size([233, 256]).
size mismatch for transformer.fc.bias: copying a param with shape torch.Size([232]) from checkpoint, the shape in current model is torch.Size([233]).

Here is my config file:

vocab: 'aAàÀảẢãÃáÁạẠăĂằẰẳẲẵẴắẮặẶâÂầẦẩẨẫẪấẤậẬbBcCdDđĐeEèÈẻẺẽẼéÉẹẸêÊềỀểỂễỄếẾệỆfFgGhHiIìÌỉỈĩĨíÍịỊjJkKlLmMnNoOòÒỏỎõÕóÓọỌôÔồỒổỔỗỖốỐộỘơƠờỜởỞỡỠớỚợỢpPqQrRsStTuUùÙủỦũŨúÚụỤưƯừỪửỬữỮứỨựỰvVwWxXyYỳỲỷỶỹỸýÝỵỴzZ0123456789!"#$%&''()*+,-./:;<=>?@[\]^_`{|}~ '
device: cuda
weights: weights/transformerocr.pth
backbone: vgg19_bn
cnn:
    # pooling stride size
    ss:
        - [2, 2]
        - [2, 2]
        - [2, 1]
        - [2, 1]
        - [1, 1]         
    # pooling kernel size 
    ks:
        - [2, 2]
        - [2, 2]
        - [2, 1]
        - [2, 1]
        - [1, 1]
transformer:  
    d_model: 256
    nhead: 8
    num_encoder_layers: 6
    num_decoder_layers: 6
    dim_feedforward: 2048
    max_seq_length: 1024
    pos_dropout: 0.1
    trans_dropout: 0.1
seq_modeling: 'transformer'
beamsearch: False
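The 232-vs-233 mismatch suggests the vocab string used to build the current model is one character longer than the one the checkpoint was trained with (the classifier size tracks the vocab length plus a few special tokens). A quick, hedged sanity check using only the public Cfg API:

from vietocr.tool.config import Cfg

cfg = Cfg.load_config_from_name('vgg_transformer')
print(len(cfg['vocab']))   # vocab length expected by the released checkpoint
# Compare this with len() of the vocab string in the YAML above; a difference
# of exactly one character (for example a doubled quote introduced by YAML
# escaping) would explain 232 vs 233.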

Error: Size mismatch for resnet_transformer

I am trying the resnet_transformer model and I am also getting a size-mismatch error (my version is 0.1.9):

Here is the code I ran:

config = Cfg.load_config_from_name('resnet_transformer')
config['device'] = 'cuda:0'
config['predictor']['beamsearch'] = False
detector = Predictor(config)

Here is the error message:


File "/home/duycuong/PycharmProjects/research_py3/text_recognition/classifier/vietocr/vietocr/eval.py", line 24, in <module>
    detector = Predictor(config)
  File "/home/duycuong/PycharmProjects/research_py3/text_recognition/classifier/vietocr/vietocr/tool/predictor.py", line 19, in __init__
    model.load_state_dict(torch.load(weights, map_location=torch.device(device)))
  File "/home/duycuong/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for VietOCR:
	Missing key(s) in state_dict: "cnn.model.conv0_1.weight", "cnn.model.bn0_1.weight", "cnn.model.bn0_1.bias", "cnn.model.bn0_1.running_mean", "cnn.model.bn0_1.running_var", "cnn.model.conv0_2.weight", "cnn.model.bn0_2.weight", "cnn.model.bn0_2.bias", "cnn.model.bn0_2.running_mean", "cnn.model.bn0_2.running_var", "cnn.model.layer1.0.conv1.weight", "cnn.model.layer1.0.bn1.weight", "cnn.model.layer1.0.bn1.bias", "cnn.model.layer1.0.bn1.running_mean", "cnn.model.layer1.0.bn1.running_var", "cnn.model.layer1.0.conv2.weight", "cnn.model.layer1.0.bn2.weight", "cnn.model.layer1.0.bn2.bias", "cnn.model.layer1.0.bn2.running_mean", "cnn.model.layer1.0.bn2.running_var", "cnn.model.layer1.0.downsample.0.weight", "cnn.model.layer1.0.downsample.1.weight", "cnn.model.layer1.0.downsample.1.bias", "cnn.model.layer1.0.downsample.1.running_mean", "cnn.model.layer1.0.downsample.1.running_var", "cnn.model.conv1.weight", "cnn.model.bn1.weight", "cnn.model.bn1.bias", "cnn.model.bn1.running_mean", "cnn.model.bn1.running_var", "cnn.model.layer2.0.conv1.weight", "cnn.model.layer2.0.bn1.weight", "cnn.model.layer2.0.bn1.bias", "cnn.model.layer2.0.bn1.running_mean", "cnn.model.layer2.0.bn1.running_var", "cnn.model.layer2.0.conv2.weight", "cnn.model.layer2.0.bn2.weight", "cnn.model.layer2.0.bn2.bias", "cnn.model.layer2.0.bn2.running_mean", "cnn.model.layer2.0.bn2.running_var", "cnn.model.layer2.0.downsample.0.weight", "cnn.model.layer2.0.downsample.1.weight", "cnn.model.layer2.0.downsample.1.bias", "cnn.model.layer2.0.downsample.1.running_mean", "cnn.model.layer2.0.downsample.1.running_var", "cnn.model.layer2.1.conv1.weight", "cnn.model.layer2.1.bn1.weight", "cnn.model.layer2.1.bn1.bias", "cnn.model.layer2.1.bn1.running_mean", "cnn.model.layer2.1.bn1.running_var", "cnn.model.layer2.1.conv2.weight", "cnn.model.layer2.1.bn2.weight", "cnn.model.layer2.1.bn2.bias", "cnn.model.layer2.1.bn2.running_mean", "cnn.model.layer2.1.bn2.running_var", "cnn.model.conv2.weight", "cnn.model.bn2.weight", "cnn.model.bn2.bias", "cnn.model.bn2.running_mean", "cnn.model.bn2.running_var", "cnn.model.layer3.0.conv1.weight", "cnn.model.layer3.0.bn1.weight", "cnn.model.layer3.0.bn1.bias", "cnn.model.layer3.0.bn1.running_mean", "cnn.model.layer3.0.bn1.running_var", "cnn.model.layer3.0.conv2.weight", "cnn.model.layer3.0.bn2.weight", "cnn.model.layer3.0.bn2.bias", "cnn.model.layer3.0.bn2.running_mean", "cnn.model.layer3.0.bn2.running_var", "cnn.model.layer3.0.downsample.0.weight", "cnn.model.layer3.0.downsample.1.weight", "cnn.model.layer3.0.downsample.1.bias", "cnn.model.layer3.0.downsample.1.running_mean", "cnn.model.layer3.0.downsample.1.running_var", "cnn.model.layer3.1.conv1.weight", "cnn.model.layer3.1.bn1.weight", "cnn.model.layer3.1.bn1.bias", "cnn.model.layer3.1.bn1.running_mean", "cnn.model.layer3.1.bn1.running_var", "cnn.model.layer3.1.conv2.weight", "cnn.model.layer3.1.bn2.weight", "cnn.model.layer3.1.bn2.bias", "cnn.model.layer3.1.bn2.running_mean", "cnn.model.layer3.1.bn2.running_var", "cnn.model.layer3.2.conv1.weight", "cnn.model.layer3.2.bn1.weight", "cnn.model.layer3.2.bn1.bias", "cnn.model.layer3.2.bn1.running_mean", "cnn.model.layer3.2.bn1.running_var", "cnn.model.layer3.2.conv2.weight", "cnn.model.layer3.2.bn2.weight", "cnn.model.layer3.2.bn2.bias", "cnn.model.layer3.2.bn2.running_mean", "cnn.model.layer3.2.bn2.running_var", "cnn.model.layer3.3.conv1.weight", "cnn.model.layer3.3.bn1.weight", "cnn.model.layer3.3.bn1.bias", "cnn.model.layer3.3.bn1.running_mean", "cnn.model.layer3.3.bn1.running_var", "cnn.model.layer3.3.conv2.weight", 
"cnn.model.layer3.3.bn2.weight", "cnn.model.layer3.3.bn2.bias", "cnn.model.layer3.3.bn2.running_mean", "cnn.model.layer3.3.bn2.running_var", "cnn.model.layer3.4.conv1.weight", "cnn.model.layer3.4.bn1.weight", "cnn.model.layer3.4.bn1.bias", "cnn.model.layer3.4.bn1.running_mean", "cnn.model.layer3.4.bn1.running_var", "cnn.model.layer3.4.conv2.weight", "cnn.model.layer3.4.bn2.weight", "cnn.model.layer3.4.bn2.bias", "cnn.model.layer3.4.bn2.running_mean", "cnn.model.layer3.4.bn2.running_var", "cnn.model.conv3.weight", "cnn.model.bn3.weight", "cnn.model.bn3.bias", "cnn.model.bn3.running_mean", "cnn.model.bn3.running_var", "cnn.model.layer4.0.conv1.weight", "cnn.model.layer4.0.bn1.weight", "cnn.model.layer4.0.bn1.bias", "cnn.model.layer4.0.bn1.running_mean", "cnn.model.layer4.0.bn1.running_var", "cnn.model.layer4.0.conv2.weight", "cnn.model.layer4.0.bn2.weight", "cnn.model.layer4.0.bn2.bias", "cnn.model.layer4.0.bn2.running_mean", "cnn.model.layer4.0.bn2.running_var", "cnn.model.layer4.1.conv1.weight", "cnn.model.layer4.1.bn1.weight", "cnn.model.layer4.1.bn1.bias", "cnn.model.layer4.1.bn1.running_mean", "cnn.model.layer4.1.bn1.running_var", "cnn.model.layer4.1.conv2.weight", "cnn.model.layer4.1.bn2.weight", "cnn.model.layer4.1.bn2.bias", "cnn.model.layer4.1.bn2.running_mean", "cnn.model.layer4.1.bn2.running_var", "cnn.model.layer4.2.conv1.weight", "cnn.model.layer4.2.bn1.weight", "cnn.model.layer4.2.bn1.bias", "cnn.model.layer4.2.bn1.running_mean", "cnn.model.layer4.2.bn1.running_var", "cnn.model.layer4.2.conv2.weight", "cnn.model.layer4.2.bn2.weight", "cnn.model.layer4.2.bn2.bias", "cnn.model.layer4.2.bn2.running_mean", "cnn.model.layer4.2.bn2.running_var", "cnn.model.conv4_1.weight", "cnn.model.bn4_1.weight", "cnn.model.bn4_1.bias", "cnn.model.bn4_1.running_mean", "cnn.model.bn4_1.running_var", "cnn.model.conv4_2.weight", "cnn.model.bn4_2.weight", "cnn.model.bn4_2.bias", "cnn.model.bn4_2.running_mean", "cnn.model.bn4_2.running_var". 
	Unexpected key(s) in state_dict: "cnn.cnn.features.0.weight", "cnn.cnn.features.0.bias", "cnn.cnn.features.1.weight", "cnn.cnn.features.1.bias", "cnn.cnn.features.1.running_mean", "cnn.cnn.features.1.running_var", "cnn.cnn.features.1.num_batches_tracked", "cnn.cnn.features.3.weight", "cnn.cnn.features.3.bias", "cnn.cnn.features.4.weight", "cnn.cnn.features.4.bias", "cnn.cnn.features.4.running_mean", "cnn.cnn.features.4.running_var", "cnn.cnn.features.4.num_batches_tracked", "cnn.cnn.features.7.weight", "cnn.cnn.features.7.bias", "cnn.cnn.features.8.weight", "cnn.cnn.features.8.bias", "cnn.cnn.features.8.running_mean", "cnn.cnn.features.8.running_var", "cnn.cnn.features.8.num_batches_tracked", "cnn.cnn.features.10.weight", "cnn.cnn.features.10.bias", "cnn.cnn.features.11.weight", "cnn.cnn.features.11.bias", "cnn.cnn.features.11.running_mean", "cnn.cnn.features.11.running_var", "cnn.cnn.features.11.num_batches_tracked", "cnn.cnn.features.14.weight", "cnn.cnn.features.14.bias", "cnn.cnn.features.15.weight", "cnn.cnn.features.15.bias", "cnn.cnn.features.15.running_mean", "cnn.cnn.features.15.running_var", "cnn.cnn.features.15.num_batches_tracked", "cnn.cnn.features.17.weight", "cnn.cnn.features.17.bias", "cnn.cnn.features.18.weight", "cnn.cnn.features.18.bias", "cnn.cnn.features.18.running_mean", "cnn.cnn.features.18.running_var", "cnn.cnn.features.18.num_batches_tracked", "cnn.cnn.features.20.weight", "cnn.cnn.features.20.bias", "cnn.cnn.features.21.weight", "cnn.cnn.features.21.bias", "cnn.cnn.features.21.running_mean", "cnn.cnn.features.21.running_var", "cnn.cnn.features.21.num_batches_tracked", "cnn.cnn.features.23.weight", "cnn.cnn.features.23.bias", "cnn.cnn.features.24.weight", "cnn.cnn.features.24.bias", "cnn.cnn.features.24.running_mean", "cnn.cnn.features.24.running_var", "cnn.cnn.features.24.num_batches_tracked", "cnn.cnn.features.27.weight", "cnn.cnn.features.27.bias", "cnn.cnn.features.28.weight", "cnn.cnn.features.28.bias", "cnn.cnn.features.28.running_mean", "cnn.cnn.features.28.running_var", "cnn.cnn.features.28.num_batches_tracked", "cnn.cnn.features.30.weight", "cnn.cnn.features.30.bias", "cnn.cnn.features.31.weight", "cnn.cnn.features.31.bias", "cnn.cnn.features.31.running_mean", "cnn.cnn.features.31.running_var", "cnn.cnn.features.31.num_batches_tracked", "cnn.cnn.features.33.weight", "cnn.cnn.features.33.bias", "cnn.cnn.features.34.weight", "cnn.cnn.features.34.bias", "cnn.cnn.features.34.running_mean", "cnn.cnn.features.34.running_var", "cnn.cnn.features.34.num_batches_tracked", "cnn.cnn.features.36.weight", "cnn.cnn.features.36.bias", "cnn.cnn.features.37.weight", "cnn.cnn.features.37.bias", "cnn.cnn.features.37.running_mean", "cnn.cnn.features.37.running_var", "cnn.cnn.features.37.num_batches_tracked", "cnn.cnn.features.40.weight", "cnn.cnn.features.40.bias", "cnn.cnn.features.41.weight", "cnn.cnn.features.41.bias", "cnn.cnn.features.41.running_mean", "cnn.cnn.features.41.running_var", "cnn.cnn.features.41.num_batches_tracked", "cnn.cnn.features.43.weight", "cnn.cnn.features.43.bias", "cnn.cnn.features.44.weight", "cnn.cnn.features.44.bias", "cnn.cnn.features.44.running_mean", "cnn.cnn.features.44.running_var", "cnn.cnn.features.44.num_batches_tracked", "cnn.cnn.features.46.weight", "cnn.cnn.features.46.bias", "cnn.cnn.features.47.weight", "cnn.cnn.features.47.bias", "cnn.cnn.features.47.running_mean", "cnn.cnn.features.47.running_var", "cnn.cnn.features.47.num_batches_tracked", "cnn.cnn.features.49.weight", "cnn.cnn.features.49.bias", "cnn.cnn.features.50.weight", 
"cnn.cnn.features.50.bias", "cnn.cnn.features.50.running_mean", "cnn.cnn.features.50.running_var", "cnn.cnn.features.50.num_batches_tracked", "cnn.cnn.classifier.0.weight", "cnn.cnn.classifier.0.bias", "cnn.cnn.classifier.3.weight", "cnn.cnn.classifier.3.bias", "cnn.cnn.classifier.6.weight", "cnn.cnn.classifier.6.bias". 
	size mismatch for transformer.embed_tgt.weight: copying a param with shape torch.Size([232, 512]) from checkpoint, the shape in current model is torch.Size([233, 256]).
	size mismatch for transformer.pos_enc.pe: copying a param with shape torch.Size([10000, 1, 512]) from checkpoint, the shape in current model is torch.Size([1024, 1, 256]).
	size mismatch for transformer.transformer.encoder.layers.0.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.encoder.layers.0.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.encoder.layers.0.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.encoder.layers.0.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.0.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.encoder.layers.0.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.encoder.layers.0.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.0.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.0.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.0.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.0.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.1.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.encoder.layers.1.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.encoder.layers.1.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.encoder.layers.1.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.1.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.encoder.layers.1.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.encoder.layers.1.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.1.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.1.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.1.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.1.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.2.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.encoder.layers.2.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.encoder.layers.2.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.encoder.layers.2.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.2.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.encoder.layers.2.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.encoder.layers.2.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.2.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.2.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.2.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.2.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.3.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.encoder.layers.3.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.encoder.layers.3.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.encoder.layers.3.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.3.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.encoder.layers.3.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.encoder.layers.3.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.3.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.3.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.3.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.3.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.4.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.encoder.layers.4.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.encoder.layers.4.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.encoder.layers.4.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.4.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.encoder.layers.4.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.encoder.layers.4.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.4.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.4.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.4.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.4.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.5.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.encoder.layers.5.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.encoder.layers.5.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.encoder.layers.5.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.5.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.encoder.layers.5.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.encoder.layers.5.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.5.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.5.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.5.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.layers.5.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.encoder.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.0.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.0.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.0.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.0.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.0.multihead_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.0.multihead_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.0.multihead_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.0.multihead_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.0.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.decoder.layers.0.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.decoder.layers.0.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.0.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.0.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.0.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.0.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.0.norm3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.0.norm3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.1.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.1.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.1.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.1.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.1.multihead_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.1.multihead_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.1.multihead_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.1.multihead_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.1.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.decoder.layers.1.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.decoder.layers.1.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.1.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.1.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.1.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.1.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.1.norm3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.1.norm3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.2.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.2.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.2.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.2.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.2.multihead_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.2.multihead_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.2.multihead_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.2.multihead_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.2.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.decoder.layers.2.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.decoder.layers.2.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.2.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.2.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.2.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.2.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.2.norm3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.2.norm3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.3.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.3.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.3.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.3.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.3.multihead_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.3.multihead_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.3.multihead_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.3.multihead_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.3.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.decoder.layers.3.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.decoder.layers.3.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.3.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.3.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.3.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.3.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.3.norm3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.3.norm3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.4.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.4.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.4.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.4.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.4.multihead_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.4.multihead_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.4.multihead_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.4.multihead_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.4.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.decoder.layers.4.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.decoder.layers.4.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.4.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.4.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.4.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.4.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.4.norm3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.4.norm3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.5.self_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.5.self_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.5.self_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.5.self_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.5.multihead_attn.in_proj_weight: copying a param with shape torch.Size([1536, 512]) from checkpoint, the shape in current model is torch.Size([768, 256]).
	size mismatch for transformer.transformer.decoder.layers.5.multihead_attn.in_proj_bias: copying a param with shape torch.Size([1536]) from checkpoint, the shape in current model is torch.Size([768]).
	size mismatch for transformer.transformer.decoder.layers.5.multihead_attn.out_proj.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([256, 256]).
	size mismatch for transformer.transformer.decoder.layers.5.multihead_attn.out_proj.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.5.linear1.weight: copying a param with shape torch.Size([2048, 512]) from checkpoint, the shape in current model is torch.Size([2048, 256]).
	size mismatch for transformer.transformer.decoder.layers.5.linear2.weight: copying a param with shape torch.Size([512, 2048]) from checkpoint, the shape in current model is torch.Size([256, 2048]).
	size mismatch for transformer.transformer.decoder.layers.5.linear2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.5.norm1.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.5.norm1.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.5.norm2.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.5.norm2.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.5.norm3.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.layers.5.norm3.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.norm.weight: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.transformer.decoder.norm.bias: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
	size mismatch for transformer.fc.weight: copying a param with shape torch.Size([232, 512]) from checkpoint, the shape in current model is torch.Size([233, 256]).
	size mismatch for transformer.fc.bias: copying a param with shape torch.Size([232]) from checkpoint, the shape in current model is torch.Size([233]).

TypeError: __init__() got an unexpected keyword argument 'n_warmup_steps'

I got this message while training. Could you please show me how to fix it?

Computing MD5: /tmp/tranformerorc.pth
MD5 matches: /tmp/tranformerorc.pth
transformer.embed_tgt.weight missmatching shape
transformer.fc.weight missmatching shape
transformer.fc.bias missmatching shape
Traceback (most recent call last):
  File "main.py", line 26, in <module>
    trainer = Trainer(config, pretrained=True)
  File "/Volumes/DATA/DRAF/Deep Learning/DetectHand2/vietocr/model/trainer.py", line 62, in __init__
    self.scheduler = OneCycleLR(self.optimizer, **config['optimizer'])
TypeError: __init__() got an unexpected keyword argument 'n_warmup_steps'

Here is my code:

from vietocr.tool.config import Cfg
from vietocr.model.trainer import Trainer


if __name__ == '__main__':
    config = Cfg.load_config_from_name('vgg_transformer')
    dataset_params = {
        'name': 'hw',
        'data_root': './data/',
        'train_annotation': 'train_annotation.txt',
        'valid_annotation': 'test_annotation.txt'
    }

    params = {
        'print_every': 200,
        'valid_every': 15 * 200,
        'iters': 20000,
        'checkpoint': './checkpoint/transformerocr_checkpoint.pth',
        'export': './weights/transformerocr.pth',
        'metrics': 10000
    }

    config['trainer'].update(params)
    config['dataset'].update(dataset_params)

    trainer = Trainer(config, pretrained=True)
    trainer.visualize_dataset()
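The traceback above shows the scheduler being built as OneCycleLR(self.optimizer, **config['optimizer']), so leftover keys such as n_warmup_steps are rejected. A hedged sketch (the values are examples, not recommendations) that would replace config['optimizer'] before the Trainer(config, pretrained=True) call:

config['optimizer'] = {
    'max_lr': 0.0003,      # peak learning rate for OneCycleLR
    'total_steps': 20000,  # keep in sync with config['trainer']['iters']
    'pct_start': 0.1,      # fraction of the run spent warming up
}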

Error when running the demo

[screenshot]
I followed the tutorial file and get errors on two different versions. How can I fix this?

Error when loading the model

Hello,
I am using version 0.3.5 and following your guide. At the "detector = Predictor(config)" step I get this error:
[screenshot]
I tried torch.jit.load() instead, but that did not work either. Could you help me?

How should input images be processed so training can run in batches?

Hello.

First of all, thank you: VietOCR is genuinely useful for the machine-learning community in general and for me in particular. While studying the library, I have one small question: I could not find how text-line images of different widths are handled so that training can run batch by batch. Could you share this, if possible?

Thank you. I wish you continued success and many more contributions to the Vietnamese AI community.
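One common approach, sketched here as an assumption rather than a description of vietocr's internals: resize every crop to a fixed height, then right-pad each image in a batch to the widest sample so the batch can be stacked into a single tensor (often after bucketing samples of similar width together).

import torch

def collate_variable_width(images):
    # images: list of C x H x W float tensors with equal H but different W
    c, h = images[0].shape[0], images[0].shape[1]
    max_w = max(img.shape[-1] for img in images)
    batch = torch.zeros(len(images), c, h, max_w)
    for i, img in enumerate(images):
        batch[i, :, :, : img.shape[-1]] = img  # right-pad the narrower images with zeros
    return batch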

OCR on ID cards (CMND)

Hello Quốc,
Thank you for building an OCR library for Vietnamese. I have three questions:
+ To extract information from an ID card with the Transformer model, can I scan the whole card in one pass, or do I have to cut it into individual boxes?
+ If I train on ID cards, roughly how many samples do I need before it becomes usable?
+ I work in .NET, so I tried downloading the weights file as in the screenshot, but the results are wrong; did I miss a step somewhere? :D
[screenshot]

  • Input
    [input image]
  • Result
    [result screenshot]
    => The result is not what I expected. Could you suggest how to improve the accuracy?

Best wishes.

Different bounding boxes give different results

Hi Quốc, I am testing the model on a few kinds of text and ran into a problem when the crop region changes. For example:

vietnam1_3
vgg19_bn-seq2seq result: UASORU7250KXXY

vietnam1_4
vgg19_bn-seq2seq result: UASCRU7250KXY

vietnam_1_crop2
vgg19_bn-seq2seq result: UASORU7250KXY

The transformer model shows the same problem. Is there any way to address this, or any guidance on cropping so that the model reads well?
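One mitigation that is often tried (an assumption on my part, not something the library prescribes): pad the detection box by a small fixed margin before cropping, so the recognizer always sees a similar amount of context around the text.

def pad_box(x1, y1, x2, y2, img_w, img_h, ratio=0.05):
    # Expand the box by `ratio` of its size, clamped to the image borders.
    pad_x = int((x2 - x1) * ratio)
    pad_y = int((y2 - y1) * ratio)
    return (max(0, x1 - pad_x), max(0, y1 - pad_y),
            min(img_w, x2 + pad_x), min(img_h, y2 + pad_y))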

Amount of training data

Hello, roughly how many images are needed to train from scratch and still get good results?
Thank you.

Error when starting a new training run

I want to continue training with the dataset you provided, but training fails with the error below.
I kept the vi_00, vi_01, ... folders as they are and created new train_annotation.txt and test_annotation.txt files, so those files look like:
vi_00 abc
vi_01 aksk
....
I don't know how to fix it.
[screenshot]
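For reference, a hedged sketch of the annotation format used in the getting-started notebook (assumption: one sample per line, with the relative image path and its label separated by a tab rather than a space); the paths and labels below are made-up examples:

samples = [
    ("vi_00/00000.jpg", "example label one"),
    ("vi_01/00001.jpg", "example label two"),
]
with open("train_annotation.txt", "w", encoding="utf-8") as f:
    for path, label in samples:
        f.write(f"{path}\t{label}\n")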

Missing keys when predicting with attention seq2seq

Hello, I tried predicting with attention seq2seq but got the error below. I think it is because your weight file was trained with the transformer, which is why it fails. If I want to use attention seq2seq, how should I configure things?
Please advise, thank you.
[screenshots]
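If the goal is simply to run the seq2seq variant with matching published weights, a hedged sketch using the named configs that appear elsewhere in these issues:

from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

# 'vgg_seq2seq' pairs the seq2seq architecture with its own weights, so the
# transformer checkpoint is never loaded into a mismatched model.
config = Cfg.load_config_from_name('vgg_seq2seq')
config['device'] = 'cpu'
detector = Predictor(config)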

Error while training for new dataset?

When I tried to use your notebook and your dataset to train a new model, I got the error "This DataLoader will create 3 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create". When I then set num_workers = 2 in the config file, a new error occurred: "StopIteration".
Can you help me fix this? Thank you!
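For the first warning, a hedged sketch assuming the loader settings live under config['dataloader'] as in the base config shipped with the library (the later StopIteration is likely a separate problem; compare the small-dataset report at the end of this list):

config['dataloader'].update({
    'num_workers': 2,    # match the machine's suggested maximum
    'pin_memory': True,
})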

Running a quick test on CPU

Hi, my machine currently has no GPU and I get this error:

[screenshot]

Where do I need to change the config so I can run a quick test on CPU? Thank you!
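A minimal sketch, mirroring the snippets used elsewhere in these issues but pointing the device at the CPU:

from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

config = Cfg.load_config_from_name('vgg_transformer')
config['device'] = 'cpu'   # instead of 'cuda:0'
detector = Predictor(config)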

vocab in vietocr

I have a few questions about how targets are handled in vietocr:

  1. vocab.py has a variable self.mask_token; what is it for? Can I treat it as the vocab's unknown token?
  2. Are the targets fed into the transformer padded to a fixed length?
  3. If they are fixed-length, shouldn't the value returned by the len function be max_length + 2 rather than + 4?

The prob score in vietocr

Hi,
I have been using and fine-tuning vietocr. While using it I see that the returned prob is almost always 0.93; after fine-tuning it is mostly 0.91. How is the prob computed, and why are the returned values so similar? Thanks!

Continuing to train a model

Hi!
How do I set things up to continue training a half-finished model? I tried passing the model into the checkpoint setting but it did not work.
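A hedged sketch, assuming the Trainer exposes load_checkpoint() as in the repository's trainer.py and that the earlier run was saving to config['trainer']['checkpoint']:

trainer = Trainer(config, pretrained=False)
trainer.load_checkpoint('./checkpoint/transformerocr_checkpoint.pth')  # the path from your config
trainer.train()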

Error: size mismatch for transformer.embed_tgt.weight: copying a param with shape torch.Size([232, 256]) from checkpoint, the shape in current model is torch.Size([233, 256])

Hi Quốc, I am testing your model with the code below and get the following error:

Traceback (most recent call last):
  File "/home/duycuong/PycharmProjects/research_py3/text_recognition/classifier/vietocr/vietocr_gettingstart.py", line 13, in <module>
    detector = Predictor(config)
  File "/home/duycuong/PycharmProjects/research_py3/text_recognition/classifier/vietocr/vietocr/tool/predictor.py", line 19, in __init__
    model.load_state_dict(torch.load(weights, map_location=torch.device(device)))
  File "/home/duycuong/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 839, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for VietOCR:
	size mismatch for transformer.embed_tgt.weight: copying a param with shape torch.Size([232, 256]) from checkpoint, the shape in current model is torch.Size([233, 256]).
	size mismatch for transformer.fc.weight: copying a param with shape torch.Size([232, 256]) from checkpoint, the shape in current model is torch.Size([233, 256]).
	size mismatch for transformer.fc.bias: copying a param with shape torch.Size([232]) from checkpoint, the shape in current model is torch.Size([233]).

Here is the code I ran:

from PIL import Image
from vietocr.tool.predictor import Predictor
from vietocr.tool.config import Cfg

config = Cfg.load_config_from_name('vgg_transformer')

config['device'] = 'cuda:0'
config['predictor']['beamsearch']=False
detector = Predictor(config)
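
This 232-vs-233 mismatch is a vocab-length difference: the output layer has one unit per vocab character plus a handful of special tokens, so a checkpoint trained with a slightly different character set will not load. Upgrading the library, or pinning weights that match the installed version's default vocab, usually resolves it. A quick hedged check, assuming the model is built from the vocab string in the loaded config:

from vietocr.tool.config import Cfg

config = Cfg.load_config_from_name('vgg_transformer')
# Rough check: the output size should line up with the checkpoint's embedding size
# (vocab length plus the special tokens).
print(len(config['vocab']))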

Training with dual GPUs

Quốc, I'm training on two RTX 3060 cards; what should the "device" parameter be set to?

Thanks.
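
As far as the config goes, 'device' takes a single torch device string, so the simplest setup is to train on one of the two cards; multi-GPU training is not something the stock Trainer advertises, so the DataParallel wrapping below is a hypothetical, untested workaround rather than a supported path:

import torch
from vietocr.model.trainer import Trainer
from vietocr.tool.config import Cfg

config = Cfg.load_config_from_name('vgg_transformer')
config['device'] = 'cuda:0'        # pick one card; 'cuda:1' for the other

trainer = Trainer(config, pretrained=True)
# Hypothetical workaround for using both cards:
# trainer.model = torch.nn.DataParallel(trainer.model, device_ids=[0, 1])
trainer.train()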

Training hangs on an RTX 2080

Hello, I ran the sample training code on my machine with an RTX 2080 GPU and CUDA 10.1, but it hangs at the Trainer.train() step.

Selection_009

However, when running the getting-started notebook on Colab, it works fine.

Selection_010

I tried debugging, and the problem seems to be in fetching each batch, at line 100 of trainer.py in the model directory.
Selection_011

I haven't found a way to fix this error. Have you ever run into this issue?

Reproducing the results of vietocr_gettingstart.ipynb

Hi Quốc,

Thanks for vietocr.
I'm currently re-running the training part of vietocr_gettingstart.ipynb with version 0.3.2, but after 20000 iterations the results are low.
I have now run about 180k iterations, and the accuracy is around 0.30 for full sequences and 0.70 for characters.
For comparison, with the dataset in the notebook, what full-sequence accuracy did you reach, and how many iterations did you train for?

Thanks in advance!

ModuleNotFoundError: No module named 'vietocr.loader.DataLoader'

Could you check this error for me? I get it when running version 0.1.9. Looking at the code, it fails on this line: "from vietocr.loader.DataLoader import DataGen"

Traceback (most recent call last):
  File "main.py", line 2, in <module>
    from vietocr.model.trainer import Trainer
  File "/Volumes/DATA/STUDY/AI/Buoi05/DetectHand2/lib/python3.7/site-packages/vietocr/model/trainer.py", line 11, in <module>
    from vietocr.loader.DataLoader import DataGen

Error in the decode step during training

Hello, when I fine-tune the seq2seq model I get the error below. Could you explain what is going on?

image

Error when loading the vgg_seq2seq model

Hello,

When using your code, I tried loading the vgg_seq2seq model and got the error below (I have updated vietocr to the latest version).

Capture

I'm not sure whether I set any of this model's parameters incorrectly, since the vgg_transformer model still works fine for me.

StopIteration: During handling of the above exception, another exception occurred:

Hello, I tried training your code with a tiny dataset (only 4 images) as an experiment. Creating the dataset succeeded, but visualizing it shows nothing, and training throws the error in the screenshot. Do I need to adjust the valid or iter parameters to match the number of images in the dataset? Thank you.

Using a custom vocab at prediction time

Hi, I want to restrict my vocab to just the digits 0-9 when predicting. Where can I change this?

When I set config['vocab'] = '0123456789' in init_detector, I get a size mismatch error.
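
The mismatch happens because the vocab determines the size of the embedding and output layers, so the released weights only fit the full default character set. Two sketched options follow, with hypothetical paths: fine-tune with the digit-only vocab and load those weights, or keep the released weights and filter the prediction afterwards.

from PIL import Image
from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

# Option 1: digit-only vocab, which requires weights fine-tuned with that vocab.
digit_config = Cfg.load_config_from_name('vgg_transformer')
digit_config['vocab'] = '0123456789'
digit_config['weights'] = './weights/digits_only.pth'    # hypothetical fine-tuned weights
digit_config['device'] = 'cpu'

# Option 2: keep the default vocab/weights and post-filter the output.
config = Cfg.load_config_from_name('vgg_transformer')
config['device'] = 'cpu'
detector = Predictor(config)
text = detector.predict(Image.open('number_crop.png'))   # hypothetical crop
digits = ''.join(ch for ch in text if ch.isdigit())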

Size mismatch error

Hello,
I'm training an OCR model with the vietocr library. I've trained several times before, but today I ran into a strange error. Could you help me figure out what it is?
Thank you.

Computing MD5: /tmp/tranformerorc.pth
MD5 matches: /tmp/tranformerorc.pth
transformer.decoder.embedding.weight missmatching shape, required torch.Size([256, 256]) but found torch.Size([233, 256])
transformer.decoder.fc_out.weight missmatching shape, required torch.Size([256, 1024]) but found torch.Size([233, 1024])
transformer.decoder.fc_out.bias missmatching shape, required torch.Size([256]) but found torch.Size([233])

Transformer input size

Hello everyone, I have a question I hope someone can answer.
From what I've read about applying transformers to text, the encoder inputs need padding so that all sequences have the same length.
But here the input images have different widths, and after the CNN layers the input has shape W x N x C, where W differs between images.
So could you tell me whether padding is needed here to make the inputs the same length? Thank you.

RuntimeError: CUDA error: device-side assert triggered CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Hello Quốc,
I'm retraining the model, but training fails with this error, and after researching it I still don't understand what's wrong. Hoping you can help.
I'm using Colab, torch 1.9, CUDA 10.2, and vietocr 0.3.5.
The error is as follows:

----> 1 trainer = Trainer(config, pretrained=True)

8 frames
/usr/local/lib/python3.7/dist-packages/vietocr/model/trainer.py in __init__(self, config, pretrained, augmentor)
     30 
     31         self.config = config
---> 32         self.model, self.vocab = build_model(config)
     33 
     34         self.device = config['device']

/usr/local/lib/python3.7/dist-packages/vietocr/tool/translate.py in build_model(config)
    128             config['seq_modeling'])
    129 
--> 130     model = model.to(device)
    131 
    132     return model, vocab

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in to(self, *args, **kwargs)
    850             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    851 
--> 852         return self._apply(convert)
    853 
    854     def register_backward_hook(

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    528     def _apply(self, fn):
    529         for module in self.children():
--> 530             module._apply(fn)
    531 
    532         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    528     def _apply(self, fn):
    529         for module in self.children():
--> 530             module._apply(fn)
    531 
    532         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    528     def _apply(self, fn):
    529         for module in self.children():
--> 530             module._apply(fn)
    531 
    532         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    528     def _apply(self, fn):
    529         for module in self.children():
--> 530             module._apply(fn)
    531 
    532         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    550                 # `with torch.no_grad():`
    551                 with torch.no_grad():
--> 552                     param_applied = fn(param)
    553                 should_use_set_data = compute_should_use_set_data(param, param_applied)
    554                 if should_use_set_data:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in convert(t)
    848                 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
    849                             non_blocking, memory_format=convert_to_format)
--> 850             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
    851 
    852         return self._apply(convert)
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Thank you.

predict text with batch images

I'm using Tesseract to detect text and extract the text regions. Those images have different sizes.

How can I use transformerOCR as a batch predictor for those images?
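
A straightforward approach is to run the predictor over each crop in a loop; a minimal sketch with hypothetical crop paths is below. Some releases also expose a batch helper (predict_batch) which, if present in your version, handles the variable widths internally.

from PIL import Image
from vietocr.tool.config import Cfg
from vietocr.tool.predictor import Predictor

config = Cfg.load_config_from_name('vgg_transformer')
config['device'] = 'cuda:0'
detector = Predictor(config)

crops = [Image.open(p) for p in ['box_0.png', 'box_1.png']]   # hypothetical Tesseract crops
texts = [detector.predict(img) for img in crops]

# If the installed version provides it:
# texts = detector.predict_batch(crops)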
