wenmuzhou / pan.pytorch

An unofficial PyTorch implementation of PAN (PSENet2): Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

License: Apache License 2.0

Python 19.45% Makefile 0.05% C++ 80.49% Objective-C 0.02%

pan.pytorch's Introduction

Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network

Requirements

  • pytorch 1.1+
  • torchvision 0.3+
  • pyclipper
  • opencv3
  • gcc 4.9+

Download

PAN_resnet18_FPEM_FFM and PAN_resnet18_FPEM_FFM on icdar2015:

the updated model (resnet18: 78.8, shufflenetv2: 72.4, lr: 1e-3) is not the best model

google drive

Data Preparation

train: prepare a text file in the following format, one sample per line, using '\t' as the separator

/path/to/img.jpg path/to/label.txt
...
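
A list file in this format can be generated with a few lines of Python. The following is a minimal sketch, not part of the repository: the helper name write_train_list and the sample paths are hypothetical, and it only assumes the "image path, tab, label path" layout described above.

```python
import tempfile
from pathlib import Path


def write_train_list(pairs, list_path):
    """Write one 'image<TAB>label' line per training sample."""
    lines = ["{}\t{}".format(img, label) for img, label in pairs]
    Path(list_path).write_text("\n".join(lines) + "\n", encoding="utf-8")


# Hypothetical sample paths, for illustration only.
pairs = [
    ("/path/to/img_1.jpg", "/path/to/gt_img_1.txt"),
    ("/path/to/img_2.jpg", "/path/to/gt_img_2.txt"),
]

# Write into a temporary directory so the sketch leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    list_path = Path(tmp) / "train.txt"
    write_train_list(pairs, list_path)
    written = list_path.read_text(encoding="utf-8").splitlines()

for line in written:
    print(line.split("\t"))
```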

val: use a folder with the following layout

img/ stores the images
gt/ stores the ground-truth files

Train

  1. set train_data_path and val_data_path in config.json
  2. run the following script
python3 train.py

Test

eval.py is used to evaluate a model on the test dataset

  1. set model_path, img_path, gt_path and save_path in eval.py
  2. run the following script to test
python3 eval.py

Predict

predict.py is used to run inference on a single image

  1. set model_path and img_path in predict.py
  2. run the following script to predict
python3 predict.py

The project is still under development.

Performance

Trained only on the ICDAR2015 dataset.

| Method | image size (short side) | learning rate | Precision (%) | Recall (%) | F-measure (%) | FPS |
|---|---|---|---|---|---|---|
| paper (resnet18) | 736 | x | x | x | 80.4 | 26.1 |
| my (ShuffleNetV2 + FPEM_FFM + PSE expansion) | 736 | 1e-3 | 81.72 | 66.73 | 73.47 | 24.71 (P100) |
| my (resnet18 + FPEM_FFM + PSE expansion) | 736 | 1e-3 | 84.93 | 74.09 | 79.14 | 21.31 (P100) |
| my (resnet50 + FPEM_FFM + PSE expansion) | 736 | 1e-3 | 84.23 | 76.12 | 79.96 | 14.22 (P100) |
| my (ShuffleNetV2 + FPEM_FFM + PSE expansion) | 736 | 1e-4 | 75.14 | 57.34 | 65.04 | 24.71 (P100) |
| my (resnet18 + FPEM_FFM + PSE expansion) | 736 | 1e-4 | 83.89 | 69.23 | 75.86 | 21.31 (P100) |
| my (resnet50 + FPEM_FFM + PSE expansion) | 736 | 1e-4 | 85.29 | 75.1 | 79.87 | 14.22 (P100) |
| my (resnet18 + FPN + PSE expansion) | 736 | 1e-3 | 76.50 | 74.70 | 75.59 | 14.47 (P100) |
| my (resnet50 + FPN + PSE expansion) | 736 | 1e-3 | 71.82 | 75.73 | 73.72 | 10.67 (P100) |
| my (resnet18 + FPN + PSE expansion) | 736 | 1e-4 | 74.19 | 72.34 | 73.25 | 14.47 (P100) |
| my (resnet50 + FPN + PSE expansion) | 736 | 1e-4 | 78.96 | 76.27 | 77.59 | 10.67 (P100) |
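
The F-measure column is consistent with the usual harmonic mean of precision and recall, F = 2PR / (P + R). As a quick check, the sketch below recomputes two rows of the table; the function name f_measure is only illustrative and is not taken from the repository's evaluation code.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the table (lr = 1e-3 rows).
shufflenet_f = round(f_measure(81.72, 66.73), 2)  # table reports 73.47
resnet18_f = round(f_measure(84.93, 74.09), 2)    # table reports 79.14

print(shufflenet_f, resnet18_f)
```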

examples

todo

  • MobileNet backbone

  • ShuffleNet backbone

reference

  1. https://arxiv.org/pdf/1908.05900.pdf
  2. https://github.com/WenmuZhou/PSENet.pytorch

If this repository helps you, please star it. Thanks.

pan.pytorch's People

Contributors

guochengzhen, wenmuzhou

pan.pytorch's Issues

The order of label coordinates in different datasets

I noticed that the author reorders the label coordinates of the ICDAR2015 dataset in util.py with the "order_points_clockwise" function. Could you kindly explain the reason for this?
I know the ICDAR2015 dataset already has clockwise labels, so why apply this function?
I read the Total-Text dataset paper and could not find any description of clockwise labels. Do we need to make the same kind of change for the Total-Text dataset?
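
For readers unfamiliar with the idea, one common way to put polygon vertices into a consistent clockwise order is to sort them by angle around their centroid. The sketch below is an assumption-laden illustration, not the repository's order_points_clockwise: the name sort_clockwise is hypothetical, and the approach is only reliable for convex shapes such as the quadrilaterals in ICDAR2015. Applying such a step makes the vertex order independent of how a given annotation file happens to list the points.

```python
import math


def sort_clockwise(points):
    """Return points in clockwise order as seen on screen, starting top-left-most.

    Hypothetical helper: sorts vertices by angle around the centroid, which is
    only guaranteed to work for convex polygons.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    # In image coordinates y grows downwards, so an increasing atan2 angle
    # corresponds to clockwise rotation on screen.
    ordered = sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))
    # Rotate the list so it starts at the vertex with the smallest x + y.
    start = min(range(len(ordered)), key=lambda i: ordered[i][0] + ordered[i][1])
    return ordered[start:] + ordered[:start]


# A box listed counter-clockwise: top-left, bottom-left, bottom-right, top-right.
quad_ccw = [(0, 0), (0, 10), (20, 10), (20, 0)]
ordered = sort_clockwise(quad_ccw)
print(ordered)
```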

Finetune checkpoint

There is a parameter "finetune_checkpoint". I would like to know how it works here. Does it freeze any initial layers, and if so, how can that be controlled?

num_samples=0

@WenmuZhou Thank you for your hard work.

I am trying to train on ICDAR2015. When running train.py I get an error.
My config.json
My training & testing list, and file tree.

The error message:

(final) home@home-desktop:~/p2/PAN.pytorch-master$ python train.py
Traceback (most recent call last):
  File "train.py", line 33, in <module>
    main(config)
  File "train.py", line 18, in main
    train_loader, eval_loader = get_dataloader(config['data_loader']['type'], config['data_loader']['args'])
  File "/home/home/p2/PAN.pytorch-master/data_loader/__init__.py", line 98, in get_dataloader
    num_workers=module_args['loader']['num_workers'])
  File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 213, in __init__
    sampler = RandomSampler(dataset)
  File "/home/home/anaconda3/envs/final/lib/python3.6/site-packages/torch/utils/data/sampler.py", line 94, in __init__
    "value, but got num_samples={}".format(self.num_samples))
ValueError: num_samples should be a positive integer value, but got num_samples=0
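
The ValueError is raised by PyTorch's RandomSampler when the dataset it is given has length 0. One plausible cause, which this thread does not confirm, is that no line of the training list produced a usable sample, for example because the separator is not a tab or the listed files do not exist. The sketch below is a hypothetical pre-flight check under that assumption; count_usable_samples is not a function from the repository.

```python
import os
import tempfile


def count_usable_samples(list_path):
    """Count lines of the form 'image<TAB>label' where both files exist."""
    usable, skipped = 0, []
    with open(list_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.rstrip("\n")
            if not line.strip():
                continue  # ignore blank lines
            parts = line.split("\t")
            if len(parts) != 2:
                skipped.append((line_no, "expected 2 tab-separated fields"))
                continue
            img, label = (p.strip() for p in parts)
            if os.path.isfile(img) and os.path.isfile(label):
                usable += 1
            else:
                skipped.append((line_no, "missing file"))
    return usable, skipped


# Demonstration with temporary files (all names are made up).
with tempfile.TemporaryDirectory() as tmp:
    img = os.path.join(tmp, "img_1.jpg")
    gt = os.path.join(tmp, "gt_img_1.txt")
    for path in (img, gt):
        open(path, "w").close()
    list_path = os.path.join(tmp, "train.txt")
    with open(list_path, "w", encoding="utf-8") as f:
        f.write("{}\t{}\n".format(img, gt))            # valid line
        f.write("{} {}\n".format(img, gt))             # space instead of tab
        f.write("{}\t{}\n".format(img, gt + ".nope"))  # label file missing
    usable, skipped = count_usable_samples(list_path)

print(usable, skipped)
```

If usable is 0 for the real list file, the data loader would have nothing to sample from, which matches the error above.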

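Did the author notice poor boundaries when running this?

As the title says: the boundaries in my final results are fairly poor and shrink inward a little. I am not sure whether I simply trained for too few iterations, but 600 epochs is too long. At the moment I usually train for only about 50 epochs; 600 would probably take about a week to produce a result.

Question about the accuracy and parameters in Performance

The "res + FPN + PSE expansion" rows in Performance should correspond to PSENet, right? But the accuracy gap is still fairly large: the F1 scores are 77 and a bit over 80 respectively. Are there differences in parameter details, or does the implementation add some trick that only suits PAN?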

Weights?

Thanks for sharing the code, is there any training weights available for this work?

Segmentation fault (core dumped)

When I run predict.py:

[shakey@xiaoi-778 PAN]$ python predict.py
make: Entering directory `/opt/shakey/deep-learning/PAN/post_processing'
make: `pse.so' is up to date.
make: Leaving directory `/opt/shakey/deep-learning/PAN/post_processing'
self。gpu 1
ininstance True
torch.cuda.is True
self。gpu 1
ininstance True
torch.cuda.is True
device: cuda:0
Segmentation fault (core dumped)

Why can't I load the model when I want to predict pictures?

make: Entering directory '/home/git_repo/PAN.pytorch/post_processing'
make: 'pse.so' is up to date.
make: Leaving directory '/home/git_repo/PAN.pytorch/post_processing'
Backend Qt5Agg is interactive backend. Turning interactive mode on.
QXcbConnection: XCB error: 145 (Unknown), sequence: 171, resource id: 0, major code: 139 (Unknown), minor code: 20
device: cuda:0

Then it stops and does not return anything.

The loss does not decrease

Great work. Before your update, I tried to train PAN, but the loss was still high at the end of training. Does the current version support effective training and inference?

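About the PAN loss

When computing agg_dis_loss:
text_num = gt_text_i.max().item()+1
What does this line mean? It is meant to count the number of text instances, right?
But isn't gt_text_i made up of pixels with value 0 or 1? In that case, wouldn't this line always give 2?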
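
Whether gt_text_i is a binary mask or an instance-labelled map is not stated in this thread. The sketch below only illustrates the arithmetic of "max + 1" in both cases, using plain nested lists in place of tensors; the maps and the helper name count_labels are made up for illustration.

```python
def count_labels(label_map):
    """Mirror of `label_map.max() + 1`: number of label values 0..max."""
    return max(max(row) for row in label_map) + 1


# Case 1: binary mask (0 = background, 1 = text). The result is always 2.
binary_map = [
    [0, 1, 1],
    [0, 0, 1],
]

# Case 2: instance-labelled map (0 = background, 1..N = text instances).
# The result is N + 1, i.e. the instances plus the background label.
instance_map = [
    [0, 1, 1, 0],
    [0, 0, 0, 2],
    [3, 3, 0, 2],
]

binary_count = count_labels(binary_map)
instance_count = count_labels(instance_map)
print(binary_count, instance_count)
```

So the expression counts instances only if each text instance carries its own integer label; with a 0/1 mask it would indeed always return 2, as the question suggests.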

Segmentation fault when using the recompiled PSE.so

/usr/lib64/python3.6/site-packages/torch/nn/functional.py:2479: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
Segmentation fault

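Some problems with the C++ pse

On some images the C++ pse step hangs: there is no error, it just stays stuck. Switching to pypse runs normally. There is also a bug in the pypse code: line 25 should be changed to "for i in range(label_values):". It used to be "for i in label_values:", but label_values is an int.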
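
The pypse bug described here is easy to reproduce in isolation: iterating directly over an int raises a TypeError, while range() yields the intended indices. This is a minimal standalone sketch, not the repository's pypse code; the value 3 is arbitrary.

```python
label_values = 3  # an int, as described in the issue

# Buggy form: an int is not iterable, so the loop fails immediately.
error_message = None
try:
    for i in label_values:
        pass
except TypeError as exc:
    error_message = str(exc)

# Fixed form proposed in the issue: iterate over range(label_values).
visited = [i for i in range(label_values)]

print(error_message)
print(visited)
```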

Exception: ZIP entry not valid

/utils/cal_recall/rrc_evaluation_funcs.py", line 102, in load_folder_file
raise Exception('ZIP entry not valid: %s' % name)
Exception: ZIP entry not valid: res_100104531.txt

This error is raised during both validation and eval. What does it mean, and how should I fix it?
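
The message suggests that the evaluation code rejected a result file because of its name. A guess, not confirmed in this thread, is that file names are matched against a pattern in the ICDAR style such as res_img_<number>.txt, and that res_100104531.txt does not match. The sketch below shows how such a check could be reproduced; the pattern and the helper name find_invalid_names are assumptions and should be compared against the pattern actually used in rrc_evaluation_funcs.py.

```python
import re

# ASSUMED pattern, for illustration only; check the evaluation script for the real one.
ASSUMED_PATTERN = r"res_img_([0-9]+)\.txt"


def find_invalid_names(file_names, pattern):
    """Return the file names that do not fully match the given pattern."""
    regex = re.compile(pattern)
    return [name for name in file_names if regex.fullmatch(name) is None]


names = ["res_img_1.txt", "res_img_23.txt", "res_100104531.txt"]
invalid = find_invalid_names(names, ASSUMED_PATTERN)
print(invalid)
```

Under this assumption, either the result files would need to be renamed to fit the pattern or the pattern adapted to the dataset's naming scheme.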

Some problems with pse.cpp

Thanks for sharing. I have a question: when testing with predict (using the resnet18 model, on macOS), I get the following error. Could you give some advice? Best wishes!

pse.cpp:49:29: error: variable-sized object may not be initialized
        float kernel_vector[label_num][5] = {0};
                            ^~~~~~~~~
1 error generated.
make: *** [pse.so] Error 1
Traceback (most recent call last):
  File "/Users/abelleon/Documents/project/PAN.pytorch-master/predict.py", line 13, in <module>
    from post_processing import decode
  File "/Users/abelleon/Documents/project/PAN.pytorch-master/post_processing/__init__.py", line 17, in <module>
    raise RuntimeError('Cannot compile pse: {}'.format(BASE_DIR))
RuntimeError: Cannot compile pse: /Users/abelleon/Documents/project/PAN.pytorch-master/post_processing

Pretrained model

Hi,
The given checkpoint does not perform very well. Can you share the best checkpoints or help reproduce the results?

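Importing the C++ version of pse

Why is "from .pse import pse_cpp, get_points, get_num" imported locally inside the decode function rather than globally? In another PSENet project I saw it imported globally. I do not know C++ very well.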

subprocess.call in post_processing

File "/Users/liubowen/Downloads/PAN.pytorch-master/post_processing/__init__.py", line 17, in <module>
raise RuntimeError('Cannot compile pse: {}'.format(BASE_DIR))
RuntimeError: Cannot compile pse: /Users/liubowen/Downloads/PAN.pytorch-master/post_processing

Is there a solution? Please!
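
The tracebacks on this page show post_processing/__init__.py raising this RuntimeError, and the logs in other issues show make running when the package is imported. A compile-on-import step of that kind can be sketched as follows. This is an illustration of the general pattern, not the repository's code: ensure_pse_built and its runner parameter are hypothetical, and the stand-in runners exist only so the sketch runs without a compiler.

```python
import subprocess


def ensure_pse_built(build_dir, runner=subprocess.call):
    """Run `make -C build_dir` and raise if it exits with a non-zero status."""
    exit_code = runner(["make", "-C", build_dir])
    if exit_code != 0:
        raise RuntimeError("Cannot compile pse: {}".format(build_dir))
    return exit_code


# Stand-in runners replace the real `make` call for demonstration purposes.
failure = None
try:
    ensure_pse_built("/path/to/post_processing", runner=lambda cmd: 2)
except RuntimeError as exc:
    failure = str(exc)

ok = ensure_pse_built("/path/to/post_processing", runner=lambda cmd: 0)
print(failure, ok)
```

With this pattern the RuntimeError only says that the build failed. The actual cause is in the compiler output printed by make just before it, so running make by hand in the post_processing directory is usually the quickest way to see it.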

import error

from .pse import pse_cpp, get_points, get_num

ModuleNotFoundError: No module named 'post_processing.pse'

How can I fix this problem?
Thanks!

About the output of segmentation_head

I would like to ask why the output of the FPEM_FFM layer has 6 channels:
self.out_conv = nn.Conv2d(in_channels=conv_out * 4, out_channels=6, kernel_size=1)

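[Question] FPEM module

Hello. In your implementation of the FPEM module there are self.add_up and self.add_down.
These two parts should amount to up-sampling and down-sampling feature maps of different sizes, adding them, and then fusing the feature maps. The fusion module instance should be different for each fusion, rather than the same module instance being reused every time. For example, in Up-scale Enhancement, the add_up applied after c5 and c4 are down-sampled and added and the add_up applied after c4 and c3 are down-sampled and added should be two different module instances. Although the two add_up blocks have the same structure, their parameters cannot be shared, and the paper does not seem to say that these parameters are shared. So my understanding is that each add_up part should create its own nn.Sequential, and likewise for down-sample. I am not sure whether I am right, or whether my understanding is off.
Sorry, I am not good at putting things into words, and I find it hard to express clearly what I mean.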

Test precision and recall are both 0

I ran into the following problem during training:
test: recall: 0.000000, precision: 0.000000, f1: 0.000000
How can I solve this? Thanks!
