lvpengyuan / masktextspotter.caffe2



The code of "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes"

License: Apache License 2.0

CMake 1.82% Makefile 0.06% Python 52.89% MATLAB 0.13% C++ 44.95% Cuda 0.12% Dockerfile 0.04%


masktextspotter.caffe2's Issues

have no dataset_catalog

The datasets package has no dataset_catalog module.
Could you add it?
Thanks.

Negative areas found when training

Hi,
When I try to train on icdar2013, I get this error after a few images:

  File "tools/train_net.py", line 370, in <module>
    main()
  File "tools/train_net.py", line 197, in main
    checkpoints = train_model()
  File "tools/train_net.py", line 219, in train_model
    workspace.RunNet(model.net.Proto().name)
  File "/home/archimedes/anaconda2/lib/python2.7/site-packages/caffe2/python/workspace.py", line 237, in RunNet
    StringifyNetName(name), num_iter, allow_fail,
  File "/home/archimedes/anaconda2/lib/python2.7/site-packages/caffe2/python/workspace.py", line 198, in CallWithExceptionIntercept
    return func(*args, **kwargs)
RuntimeError: [enforce fail at pybind_state.h:425] . Exception encountered running PythonOp function: AssertionError: Negative areas founds

At:
  /home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/utils/boxes.py(62): boxes_area
  /home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/modeling/FPN.py(449): map_rois_to_fpn_levels
  /home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/roi_data/fast_rcnn.py(390): _distribute_rois_over_fpn_levels
  /home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/roi_data/fast_rcnn.py(398): _add_multilevel_rois
  /home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/roi_data/fast_rcnn.py(150): add_fast_rcnn_blobs_rec
  /home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/ops/collect_and_distribute_fpn_rpn_proposals_rec.py(60): forward

Any idea why this is happening?

Thanks in advance.
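One possible cause of this assertion is a box whose corners are out of order (x2 < x1 or y2 < y1) in the annotations or proposals. The sketch below is only a diagnostic idea, not code from this repository: it assumes boxes are stored as (x1, y1, x2, y2) and that the area is computed with the inclusive-pixel convention (x2 - x1 + 1) * (y2 - y1 + 1). The function names are hypothetical.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box, inclusive-pixel convention (assumed)."""
    x1, y1, x2, y2 = box
    return (x2 - x1 + 1) * (y2 - y1 + 1)


def find_suspect_boxes(boxes):
    """Return indices of boxes with swapped corners or a negative area."""
    suspect = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        if x2 < x1 or y2 < y1 or box_area((x1, y1, x2, y2)) < 0:
            suspect.append(i)
    return suspect


def reorder_corners(box):
    """Put the smaller coordinate first on each axis."""
    x1, y1, x2, y2 = box
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


if __name__ == "__main__":
    # Hypothetical boxes; the second one has x2 < x1.
    boxes = [(10, 10, 50, 30), (60, 20, 40, 35), (5, 5, 5, 5)]
    print(find_suspect_boxes(boxes))
    print([reorder_corners(b) for b in boxes])
```

Running a check like this over the ground-truth boxes before training would show whether the problem comes from the annotation files or from later processing.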

Some changes have been made to Caffe2 and Detectron that prevent this Dockerfile from building. Besides replacing the parent image and the Detectron workdir, there is another problem I can't fix, which occurs during make op:

CMake Error at CMakeLists.txt:8 (find_package):
By not providing "FindCaffe2.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "Caffe2", but
CMake did not find one.

Could not find a package configuration file provided by "Caffe2" with any
of the following names:

Caffe2Config.cmake
caffe2-config.cmake

Add the installation prefix of "Caffe2" to CMAKE_PREFIX_PATH or set
"Caffe2_DIR" to a directory containing one of the above files. If "Caffe2"
provides a separate development package or SDK, be sure it has been
installed.

Could you fix that? Thank you in advance.
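The CMake message asks for a directory that contains Caffe2Config.cmake or caffe2-config.cmake. A minimal sketch that searches a directory tree for either file, so that a matching directory can be tried as Caffe2_DIR. The helper name and the search root are illustrative assumptions; whether a given Caffe2 installation ships these files at all depends on how it was built.

```python
import os

# File names taken from the CMake error message above.
CONFIG_NAMES = ("Caffe2Config.cmake", "caffe2-config.cmake")


def find_caffe2_config_dirs(search_root):
    """Return sorted directories under search_root that contain a Caffe2 CMake config file."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(search_root):
        if any(name in filenames for name in CONFIG_NAMES):
            hits.append(dirpath)
    return sorted(hits)
```

If the search returns a directory, it can be passed to CMake as -DCaffe2_DIR=<that directory>, which is one of the two remedies the error message itself suggests.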

AttributeError: Method AffineChannel is not a registered operator.

WARNING cnn.py: 40: [====DEPRECATE WARNING====]: you are creating an object from CNNModelHelper class which will be deprecated soon. Please use ModelHelper object with brew module. For more information, please refer to caffe2.ai and python/brew.py, python/brew_test.py for more information.
Traceback (most recent call last):
  File "tools/test_net.py", line 157, in <module>
    main(ind_range=args.range, multi_gpu_testing=args.multi_gpu_testing, vis=vis)
  File "tools/test_net.py", line 127, in main
    parent_func(multi_gpu=multi_gpu_testing, vis=vis)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/core/test_engine.py", line 64, in test_net_on_dataset
    test_net(vis=vis)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/core/test_engine.py", line 126, in test_net
    model = initialize_model_from_cfg()
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/core/test_engine.py", line 158, in initialize_model_from_cfg
    model = model_builder.create(cfg.MODEL.TYPE, train=False)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/model_builder.py", line 119, in create
    return get_func(model_type_func)(model)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/model_builder.py", line 91, in generalized_rcnn
    freeze_conv_body=cfg.TRAIN.FREEZE_CONV_BODY
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/model_builder.py", line 224, in build_generic_detection_model
    optim.build_data_parallel_model(model, _single_gpu_build_func)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/optimizer.py", line 51, in build_data_parallel_model
    single_gpu_build_func(model)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/model_builder.py", line 164, in _single_gpu_build_func
    blob_conv, dim_conv, spatial_scale_conv = add_conv_body_func(model)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/FPN.py", line 47, in add_fpn_ResNet50_conv5_body
    model, ResNet.add_ResNet50_conv5_body, fpn_level_info_ResNet50_conv5
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/FPN.py", line 103, in add_fpn_onto_conv_body
    conv_body_func(model)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/ResNet.py", line 39, in add_ResNet50_conv5_body
    return add_ResNet_convX_body(model, (3, 4, 6, 3), use_deformable=use_deformable)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/ResNet.py", line 98, in add_ResNet_convX_body
    p = model.AffineChannel(p, 'res_conv1_bn', inplace=True)
  File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/detector.py", line 104, in AffineChannel
    return self.net.AffineChannel([blob_in, scale, bias], blob_in)
  File "/usr/local/lib/python2.7/dist-packages/caffe2/python/core.py", line 2040, in __getattr__
    ",".join(workspace.C.nearby_opnames(op_type)) + ']'
AttributeError: Method AffineChannel is not a registered operator. Did you mean: []

I wonder whether this problem results from an incorrect installation of the Detectron package. By the way, when I installed Detectron by following the Dockerfile that ships with it, no problems were reported.

Could you share the pre-trained model?

Thank you for releasing the code.

Could you share the pre-trained model?

Looking forward to your reply.

have no cityscapes_json_dataset_evaluator

There is no cityscapes_json_dataset_evaluator module. @lvpengyuan

RuntimeError: Cannot compile lanms:

I use Anaconda3 with Python 2.7.
My gcc version is 4.7.3.

How to display recognized fonts?

I have successfully run the test on the icdar2015 test set. Next, how do I draw the recognized text on the bounding boxes?

Hello, how can I display the test results as images?

ImportError: cannot import name test_retinanet

Traceback (most recent call last):
  File "tools/test_net.py", line 42, in <module>
    from core.test_retinanet import test_retinanet
ImportError: cannot import name test_retinanet
Hello, I could not find a test_retinanet function in test_retinanet.py under Detectron's core directory.

What does MODEL.NUM_CLASSES in mask_textspotter.yaml mean?

Why is it set to 2 instead of 36? Does it stand for text/non-text?

out of memory

@lvpengyuan

Thank you very much for your work. When I was training the model, the program reported an out-of-memory error. Could you suggest a solution?

About res_img_x_x.mat.npy

@lvpengyuan
Is it the score of each character, or the matrix data of the image?

Please help me, thank you.

Hello, how well does the current model recognize Chinese text?

Training on Icdar2015

Hello
Thanks for sharing your work!

I was wondering how I could train on icdar2015.
Also, could you kindly share the trained synth-text model?

Thanks in advance.

Questions about datasets and testsets.

I followed your instructions and configured the masktextspotter environment. Now I am trying it with the dataset and test set you provided.

python tools/test_net.py --cfg configs/text/mask_textspotter.yaml

It displays the following output, and I don't know how long it is supposed to run:

(caffe2_env) zhoujianwen@zhoujianwen-System:~/masktextspotter.caffe2$ python tools/test_net.py --cfg configs/text/mask_textspotter.yaml

make: Entering directory '/home/zhoujianwen/masktextspotter.caffe2/lanms'
make: 'adaptor.so' is up to date.
make: Leaving directory '/home/zhoujianwen/masktextspotter.caffe2/lanms'
INFO test_net.py: 141: Called with args:
INFO test_net.py: 142: Namespace(cfg_file='configs/text/mask_textspotter.yaml', multi_gpu_testing=False, opts=[], range=None, vis=False, wait=True)
/home/zhoujianwen/masktextspotter.caffe2/lib/core/config.py:1094: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  yaml_cfg = AttrDict(yaml.load(f))
INFO test_net.py: 148: Testing with config:
INFO test_net.py: 149: {'BBOX_XFORM_CLIP': 4.135166556742356,
 'CLUSTER': {'ON_CLUSTER': False},
 'DATA_LOADER': {'NUM_THREADS': 4},
 'DEDUP_BOXES': 0.0625,
 'DOWNLOAD_CACHE': '/tmp/detectron-download-cache',
 'EPS': 1e-14,
 'EXPECTED_RESULTS': [],
 'EXPECTED_RESULTS_ATOL': 0.005,
 'EXPECTED_RESULTS_EMAIL': '',
 'EXPECTED_RESULTS_RTOL': 0.1,
 'FAST_RCNN': {'MLP_HEAD_DIM': 1024,
               'ROI_BOX_HEAD': 'fast_rcnn_heads.add_roi_2mlp_head',
               'ROI_XFORM_METHOD': 'RoIAlign',
               'ROI_XFORM_RESOLUTION': 7,
               'ROI_XFORM_SAMPLING_RATIO': 2},
 'FPN': {'COARSEST_STRIDE': 32,
         'DIM': 256,
         'EXTRA_CONV_LEVELS': False,
         'FPN_ON': True,
         'MULTILEVEL_ROIS': True,
         'MULTILEVEL_RPN': True,
         'ROI_CANONICAL_LEVEL': 4,
         'ROI_CANONICAL_SCALE': 224,
         'ROI_MAX_LEVEL': 5,
         'ROI_MIN_LEVEL': 2,
         'RPN_ANCHOR_START_SIZE': 32,
         'RPN_ASPECT_RATIOS': (0.5, 1, 2),
         'RPN_MAX_LEVEL': 6,
         'RPN_MIN_LEVEL': 2,
         'USE_DEFORMABLE': False,
         'ZERO_INIT_LATERAL': False},
 'IMAGE': {'aug': False,
           'brightness_delta': 32,
           'brightness_prob': 0.5,
           'contrast_lower': 0.5,
           'contrast_prob': 0.5,
           'contrast_upper': 1.5,
           'hue_delta': 18,
           'hue_prob': 0.5,
           'lighting_noise_prob': 0.5,
           'rotate_delta': 15,
           'rotate_prob': 0.5,
           'saturation_lower': 0.5,
           'saturation_prob': 0.5,
           'saturation_upper': 1.5},
 'KRCNN': {'CONV_HEAD_DIM': 256,
           'CONV_HEAD_KERNEL': 3,
           'CONV_INIT': 'GaussianFill',
           'DECONV_DIM': 256,
           'DECONV_KERNEL': 4,
           'DILATION': 1,
           'HEATMAP_SIZE': -1,
           'INFERENCE_MIN_SIZE': 0,
           'KEYPOINT_CONFIDENCE': 'bbox',
           'LOSS_WEIGHT': 1.0,
           'MIN_KEYPOINT_COUNT_FOR_VALID_MINIBATCH': 20,
           'NMS_OKS': False,
           'NORMALIZE_BY_VISIBLE_KEYPOINTS': True,
           'NUM_KEYPOINTS': -1,
           'NUM_STACKED_CONVS': 8,
           'ROI_KEYPOINTS_HEAD': '',
           'ROI_XFORM_METHOD': 'RoIAlign',
           'ROI_XFORM_RESOLUTION': 7,
           'ROI_XFORM_SAMPLING_RATIO': 0,
           'UP_SCALE': -1,
           'USE_DECONV': False,
           'USE_DECONV_OUTPUT': False},
 'MATLAB': 'matlab',
 'MEMONGER': True,
 'MEMONGER_SHARE_ACTIVATIONS': False,
 'MODEL': {'BBOX_REG_WEIGHTS': (10.0, 10.0, 5.0, 5.0),
           'CLS_AGNOSTIC_BBOX_REG': False,
           'CONV_BODY': 'FPN.add_fpn_ResNet50_conv5_body',
           'EXECUTION_TYPE': 'dag',
           'FASTER_RCNN': True,
           'KEYPOINTS_ON': False,
           'MASK_ON': True,
           'NAME': 'shrink++',
           'NUM_CLASSES': 2,
           'RPN_ONLY': False,
           'TYPE': 'generalized_rcnn'},
 'MRCNN': {'CLS_SPECIFIC_MASK': True,
           'CONV_INIT': 'MSRAFill',
           'DILATION': 1,
           'DIM_REDUCED': 256,
           'IS_E2E': True,
           'MASK_BATCH_SIZE_PER_IM': 16,
           'RESOLUTION': 28,
           'RESOLUTION_H': 32,
           'RESOLUTION_W': 128,
           'ROI_MASK_HEAD': 'text_mask_rcnn_heads.mask_rcnn_fcn_head_v1up4convs',
           'ROI_XFORM_METHOD': 'RoIAlign',
           'ROI_XFORM_RESOLUTION': 14,
           'ROI_XFORM_RESOLUTION_H': 16,
           'ROI_XFORM_RESOLUTION_W': 64,
           'ROI_XFORM_SAMPLING_RATIO': 2,
           'THRESH_BINARIZE': 0.5,
           'UPSAMPLE_RATIO': 1,
           'USE_FC_OUTPUT': False,
           'WEIGHT_LOSS_CHAR_BOX': 1.0,
           'WEIGHT_LOSS_MASK': 1.0,
           'WEIGHT_WH': True},
 'NUM_GPUS': 1,
 'OUTPUT_DIR': '.',
 'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]),
 'RESNETS': {'NUM_GROUPS': 1,
             'RES5_DILATION': 1,
             'STRIDE_1X1': True,
             'TRANS_FUNC': 'bottleneck_transformation',
             'WIDTH_PER_GROUP': 64},
 'RETINANET': {'ANCHOR_SCALE': 4,
               'ASPECT_RATIOS': (0.25, 0.5, 1.0, 2.0, 4.0),
               'BBOX_REG_BETA': 0.11,
               'BBOX_REG_WEIGHT': 1.0,
               'CLASS_SPECIFIC_BBOX': False,
               'INFERENCE_TH': 0.05,
               'LOSS_ALPHA': 0.25,
               'LOSS_GAMMA': 2.0,
               'NEGATIVE_OVERLAP': 0.4,
               'NUM_CONVS': 4,
               'POSITIVE_OVERLAP': 0.5,
               'PRE_NMS_TOP_N': 1000,
               'PRIOR_PROB': 0.01,
               'RETINANET_ON': False,
               'SCALES_PER_OCTAVE': 3,
               'SHARE_CLS_BBOX_TOWER': False,
               'SOFTMAX': False},
 'RFCN': {'PS_GRID_SIZE': 3},
 'RNG_SEED': 3,
 'ROOT_DIR': '/home/zhoujianwen/masktextspotter.caffe2',
 'RPN': {'ASPECT_RATIOS': (0.5, 1, 2),
         'RPN_ON': True,
         'SIZES': (64, 128, 256, 512),
         'STRIDE': 16},
 'SOLVER': {'BASE_LR': 0.005,
            'GAMMA': 0.1,
            'LOG_LR_CHANGE_THRESHOLD': 1.1,
            'LRS': [],
            'LR_POLICY': 'steps_with_decay',
            'MAX_ITER': 200000,
            'MOMENTUM': 0.9,
            'SCALE_MOMENTUM': True,
            'SCALE_MOMENTUM_THRESHOLD': 1.1,
            'STEPS': [0, 120000],
            'STEP_SIZE': 30000,
            'WARM_UP_FACTOR': 0.3333333333333333,
            'WARM_UP_ITERS': 500,
            'WARM_UP_METHOD': u'linear',
            'WEIGHT_DECAY': 0.0001},
 'TEST': {'BBOX_AUG': {'AREA_TH_HI': 32400,
                       'AREA_TH_LO': 2500,
                       'ASPECT_RATIOS': (),
                       'ASPECT_RATIO_H_FLIP': False,
                       'COORD_HEUR': 'UNION',
                       'ENABLED': False,
                       'H_FLIP': False,
                       'MAX_SIZE': 2000,
                       'SCALES': (800,),
                       'SCALE_H_FLIP': False,
                       'SCALE_SIZE_DEP': False,
                       'SCORE_HEUR': 'UNION'},
          'BBOX_REG': True,
          'BBOX_VOTE': {'ENABLED': True,
                        'SCORING_METHOD': 'ID',
                        'SCORING_METHOD_BETA': 1.0,
                        'VOTE_TH': 0.9},
          'COMPETITION_MODE': True,
          'DATASET': '',
          'DATASETS': ('icdar2015_test',),
          'DETECTIONS_PER_IM': 100,
          'FORCE_JSON_DATASET_EVAL': False,
          'KPS_AUG': {'AREA_TH': 32400,
                      'ASPECT_RATIOS': (),
                      'ASPECT_RATIO_H_FLIP': False,
                      'ENABLED': False,
                      'HEUR': 'HM_AVG',
                      'H_FLIP': False,
                      'MAX_SIZE': 4000,
                      'SCALES': (),
                      'SCALE_H_FLIP': False,
                      'SCALE_SIZE_DEP': False},
          'MASK_AUG': {'AREA_TH': 32400,
                       'ASPECT_RATIOS': (),
                       'ASPECT_RATIO_H_FLIP': False,
                       'ENABLED': False,
                       'HEUR': 'SOFT_AVG',
                       'H_FLIP': False,
                       'MAX_SIZE': 3333,
                       'SCALES': (1600,),
                       'SCALE_H_FLIP': False,
                       'SCALE_SIZE_DEP': False},
          'MAX_SIZE': 3333,
          'NMS': 0.5,
          'NUM_TEST_IMAGES': 5000,
          'OUTPUT_POLYGON': False,
          'PRECOMPUTED_PROPOSALS': False,
          'PROPOSAL_FILE': '',
          'PROPOSAL_FILES': (),
          'PROPOSAL_LIMIT': 2000,
          'RPN_MIN_SIZE': 0,
          'RPN_NMS_THRESH': 0.7,
          'RPN_POST_NMS_TOP_N': 1000,
          'RPN_PRE_NMS_TOP_N': 1000,
          'SCALES': (1000,),
          'SCORE_THRESH': 0.2,
          'SOFT_NMS': {'ENABLED': False, 'METHOD': 'linear', 'SIGMA': 0.5},
          'VIS': False,
          'WEIGHTS': '/home/zhoujianwen/masktextspotter.caffe2/models/model_iter79999.pkl'},
 'TRAIN': {'ASPECT_GROUPING': True,
           'AUTO_RESUME': True,
           'BATCH_SIZE_PER_IM': 512,
           'BBOX_THRESH': 0.5,
           'BG_THRESH_HI': 0.5,
           'BG_THRESH_LO': 0.0,
           'CROWD_FILTER_THRESH': 0.7,
           'DATASETS': ('icdar2015_train',),
           'FG_FRACTION': 0.25,
           'FG_THRESH': 0.5,
           'FREEZE_CONV_BODY': False,
           'GT_MIN_AREA': -1,
           'IMS_PER_BATCH': 2,
           'MAX_SIZE': 1333,
           'MIX_RATIOS': [0.5, 0.25, 0.25],
           'MIX_TRAIN': False,
           'PROPOSAL_FILES': (),
           'RPN_BATCH_SIZE_PER_IM': 256,
           'RPN_FG_FRACTION': 0.5,
           'RPN_MIN_SIZE': 0,
           'RPN_NEGATIVE_OVERLAP': 0.3,
           'RPN_NMS_THRESH': 0.7,
           'RPN_POSITIVE_OVERLAP': 0.7,
           'RPN_POST_NMS_TOP_N': 2000,
           'RPN_PRE_NMS_TOP_N': 2000,
           'RPN_STRADDLE_THRESH': 0,
           'SCALES': (800,),
           'SNAPSHOT_ITERS': 10000,
           'USE_CHARANNS': [True],
           'USE_FLIPPED': False,
           'WEIGHTS': u'/tmp/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl'},
 'USE_NCCL': False,
 'VIS': False,
 'VIS_TH': 0.9}
WARNING cnn.py:  25: [====DEPRECATE WARNING====]: you are creating an object from CNNModelHelper class which will be deprecated soon. Please use ModelHelper object with brew module. For more information, please refer to caffe2.ai and python/brew.py, python/brew_test.py for more information.
INFO net.py:  54: Loading from: /home/zhoujianwen/masktextspotter.caffe2/models/model_iter79999.pkl
/home/zhoujianwen/masktextspotter.caffe2/lib/utils/net.py:59: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  saved_cfg = yaml.load(src_blobs['cfg'])
Traceback (most recent call last):
  File "tools/test_net.py", line 159, in <module>
    main(ind_range=args.range, multi_gpu_testing=args.multi_gpu_testing, vis=vis)
  File "tools/test_net.py", line 127, in main
    parent_func(multi_gpu=multi_gpu_testing, vis=vis)
  File "/home/zhoujianwen/masktextspotter.caffe2/lib/core/test_engine.py", line 64, in test_net_on_dataset
    test_net(vis=vis)
  File "/home/zhoujianwen/masktextspotter.caffe2/lib/core/test_engine.py", line 126, in test_net
    model = initialize_model_from_cfg()
  File "/home/zhoujianwen/masktextspotter.caffe2/lib/core/test_engine.py", line 160, in initialize_model_from_cfg
    model, cfg.TEST.WEIGHTS, broadcast=False
  File "/home/zhoujianwen/masktextspotter.caffe2/lib/utils/net.py", line 45, in initialize_from_weights_file
    initialize_gpu_0_from_weights_file(model, weights_file)
  File "/home/zhoujianwen/masktextspotter.caffe2/lib/utils/net.py", line 59, in initialize_gpu_0_from_weights_file
    saved_cfg = yaml.load(src_blobs['cfg'])
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/__init__.py", line 114, in load
    return loader.get_single_data()
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 45, in get_single_data
    return self.construct_document(node)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 49, in construct_document
    data = self.construct_object(node)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 96, in construct_object
    data = constructor(self, tag_suffix, node)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 628, in construct_python_object_new
    return self.construct_python_object_apply(suffix, node, newobj=True)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 611, in construct_python_object_apply
    value = self.construct_mapping(node, deep=True)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 214, in construct_mapping
    return BaseConstructor.construct_mapping(self, node, deep=deep)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 139, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 101, in construct_object
    for dummy in generator:
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 404, in construct_yaml_map
    value = self.construct_mapping(node)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 214, in construct_mapping
    return BaseConstructor.construct_mapping(self, node, deep=deep)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 139, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 96, in construct_object
    data = constructor(self, tag_suffix, node)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 617, in construct_python_object_apply
    instance = self.make_python_instance(suffix, node, args, kwds, newobj)
  File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 558, in make_python_instance
    node.start_mark)
yaml.constructor.ConstructorError: while constructing a Python instance
expected a class, but found <type 'builtin_function_or_method'>
  in "<string>", line 3, column 20:
      BBOX_XFORM_CLIP: !!python/object/apply:numpy.core ...

How to display results?

The test results are in /masktextspotter.caffe2/models/icdar2015_test/model_iter79999.pkl_results/,
with files like res_img_100.txt and res_img_100_0.mat.npy.
How can I turn them into images?

Continue training model

Hello @lvpengyuan, thank you very much for your excellent work. Can I continue training with new data, starting from the model you have created?

No module named 'core'

I ran test_net.py and got this error:
no module named 'core'

I searched on Google but could not find a Python module named core.

Thanks for any help!

ImportError: No module named lanms

I tried python tools/test_net.py --cfg configs/text/mask_textspotter.yaml
but I cannot import lanms.
I have already run make in lib and make in masktextspotter.caffe2/lanms.

I use Anaconda3 with Python 2.7.
My gcc and g++ version is 5.4.

Error while running the train script

Hello lvpengyuan,
Thanks for sharing the code. I can run the test script after setting up the environment, but I ran into problems
while running the train script. The error information is as follows:

Error1:
[E pybind_state.h:424] Exception encountered running PythonOp function: IndexError: index 0 is out of bounds for axis 0 with size 0

At:
/MaskTextSpotter/MTS_libbase/lib/roi_data/mask_rcnn.py(210): add_charmask_rcnn_blobs
/MaskTextSpotter/MTS_libbase/lib/roi_data/fast_rcnn.py(325): _sample_rois_rec
/MaskTextSpotter/MTS_libbase/lib/roi_data/fast_rcnn.py(141): add_fast_rcnn_blobs_rec
/MaskTextSpotter/MTS_libbase/lib/ops/collect_and_distribute_fpn_rpn_proposals_rec.py(60): forward

Error2:
[E pybind_state.h:424] Exception encountered running PythonOp function: UnboundLocalError: local variable 'char_boxes' referenced before assignment

At:
/MaskTextSpotter/MTS_libbase/lib/roi_data/mask_rcnn.py(277): add_charmask_rcnn_blobs
/MaskTextSpotter/MTS_libbase/lib/roi_data/fast_rcnn.py(325): _sample_rois_rec
/MaskTextSpotter/MTS_libbase/lib/roi_data/fast_rcnn.py(141): add_fast_rcnn_blobs_rec
/MaskTextSpotter/MTS_libbase/lib/ops/collect_and_distribute_fpn_rpn_proposals_rec.py(60): forward

Both errors are raised in the function add_charmask_rcnn_blobs in lib/roi_data/mask_rcnn.py. Reading it, I found that both occur in this branch:

if is_e2e:
    ...
    if fg_inds.shape[0] > 0:
        ...
    # the else branch doesn't assign the variable 'char_boxes'
    else:  # If there are no fg masks (it does happen)
        print('is_e2e true: fg_inds.shape[0] > 0: ' + str(fg_inds.shape[0] > 0))
        # The network cannot handle empty blobs, so we must provide a mask
        # We simply take the first bg roi, given it an all -1's mask (ignore
        # label), and label it with class zero (bg).
        bg_inds = np.where(blobs['labels_int32'] == 0)[0]
        # rois_fg is actually one background roi, but that's ok because ...
        rois_fg = sampled_boxes[bg_inds[0]].reshape((1, -1))
        # We give it an -1's blob (ignore label)
        masks = -blob_utils.ones((1, 2, M_HEIGHT * M_WIDTH), int32=True)
        mask_weights = -blob_utils.ones((1, 2, M_HEIGHT * M_WIDTH), int32=True)
        # changed lzn
        # char_boxes_inside_weight = np.zeros(1, M_HEIGHT*M_WIDTH, 4, dtype=np.float32)
        char_boxes_inside_weight = np.zeros((1, M_HEIGHT * M_WIDTH, 4), dtype=np.float32)
        # We label it with class = 0 (background)
        mask_class_labels = blob_utils.zeros((1, ))
        # Mark that the first roi has a mask
        # this code helps me avoid Error1
        # if len(roi_has_mask) == 0:
        #     pass
        # else:
        roi_has_mask[0] = 1

Can you help me?
Thanks a lot.

Excuse me, where do I need to modify for text recognition in the vertical direction?

Excuse me, where do I need to modify for text recognition in the vertical direction?
My text is vertically oriented and is recognized incorrectly, while horizontal text is recognized correctly. Where do I need to make changes? In test.py? How should I modify it?

import lanms

When I import lanms, it prints this error:
Traceback (most recent call last):
  File "tools/test_net.py", line 40, in <module>
    from core.test_engine import test_net, test_net_on_dataset
  File "/media/chenjun/ed/31_ocr_own/masktextspotter.caffe2/lib/core/test_engine.py", line 37, in <module>
    from core.test import im_detect_all
  File "/media/chenjun/ed/31_ocr_own/masktextspotter.caffe2/lib/core/test.py", line 49, in <module>
    import lanms
  File "/home/chenjun/anaconda2/lib/python2.7/site-packages/lanms/__init__.py", line 2, in <module>
    from .adaptor import merge_quadrangle_n9 as nms_impl
ImportError: /home/chenjun/anaconda2/lib/python2.7/site-packages/lanms/adaptor.so: undefined symbol: PyInstanceMethod_Type
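PyInstanceMethod_Type is a symbol from the Python 3 C API, so this error suggests that adaptor.so was compiled against Python 3 headers and then imported by Python 2.7. That is an inference from the symbol name, not something confirmed in this thread. The sketch below prints the running interpreter's version and the header directory it expects, which can be compared with the include path that the lanms build actually used. The function name is hypothetical.

```python
import sys
import sysconfig


def describe_interpreter():
    """Collect the facts needed to compare an interpreter with a compiled extension."""
    return {
        "executable": sys.executable,
        "version": "%d.%d" % sys.version_info[:2],
        # Directory holding Python.h for this interpreter.
        "include_dir": sysconfig.get_paths()["include"],
    }


if __name__ == "__main__":
    info = describe_interpreter()
    for key in sorted(info):
        print("%s = %s" % (key, info[key]))
```

If the include directory printed by the Python 2.7 interpreter differs from the one on the compiler command line when building lanms, rebuilding the extension with matching headers would be the first thing to try.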

has no task_evaluation

from datasets import task_evaluation
ImportError: cannot import name task_evaluation

The datasets package has no task_evaluation module.

Could you help me?

thank you very much

The model cannot be downloaded

Did you make any changes based on Detectron?

I noticed that your lib folder looks like the one in Detectron. However, even files and functions with the same names differ slightly from Detectron's.
When I tried the code, the core module in your code appeared to conflict with the core module in Detectron.
If you made any changes to Detectron, could you provide your version of it?

I guess there is something wrong with _get_dataset_inds(self) in mix_loader.py, line 146

When I set MIX_RATIOS = [0.4, 0.4, 0.2] in the .yaml, training stops at assert(len(self._dataset_inds) == self._num_gpus*cfg.TRAIN.IMS_PER_BATCH), with self._dataset_inds = [].
Then I set MIX_RATIOS = [2.0, 2.0, 1.0] in the .yaml and commented out the assert; self._dataset_inds is then [0, 0, 1, 1, 2], and it seems to work as expected.
Do you have any idea why? Or could you please explain the assert?
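The assert requires one dataset index per image in the batch. The following is a hypothetical sketch, not the repository's _get_dataset_inds, of one way to turn mixing ratios into exactly that many indices using a largest-remainder split. The function name and its parameters are assumptions made for illustration.

```python
from fractions import Fraction


def allocate_dataset_inds(mix_ratios, batch_size):
    """Split batch_size slots across datasets in proportion to mix_ratios."""
    # Exact arithmetic avoids float surprises such as 0.4 * 5 landing just below 2.
    ratios = [Fraction(str(r)) for r in mix_ratios]
    if any(r < 0 for r in ratios) or sum(ratios) <= 0:
        raise ValueError("ratios must be non-negative with a positive sum")
    total = sum(ratios)
    quotas = [r * batch_size / total for r in ratios]
    counts = [int(q) for q in quotas]  # floor, since quotas are non-negative
    leftover = batch_size - sum(counts)
    # Give the remaining slots to the largest fractional remainders.
    order = sorted(range(len(quotas)),
                   key=lambda i: quotas[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    inds = []
    for dataset_idx, count in enumerate(counts):
        inds.extend([dataset_idx] * count)
    return inds


if __name__ == "__main__":
    print(allocate_dataset_inds([0.4, 0.4, 0.2], 5))
    print(allocate_dataset_inds([0.5, 0.25, 0.25], 2))
```

With this kind of split, the length of the result always equals the requested batch size, so ratios that sum to 1 and ratios given as raw counts behave the same way.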

have no json_dataset

ImportError: cannot import name json_dataset
@lvpengyuan

Question about test result

Hello, I would like to ask about the test results I am getting, which look like this: "819,283,871,299,819,283,870,283,870,298,819,298,yourselli,0.9986137,0.6931232114632925,/data/home/zjw/pythonFile/masktextspotter.caffe2/data/icdar2015_test/model_iter79999.pkl_results/res_img_2_0.mat"
What do the first 12 numbers mean, and how do they represent the bounding box? And what are the two numbers before the mask path?
Thank you.

How to do inference?

Hello, how to do inference on new images using pretrained model?

How get character level annotation ?

For training on ICDAR2013, a ground-truth line for a single word looks like this:

158.0,128.0,411.0,128.0,411.0,181.0,158.0,181.0,Footpath,158.0,131.0,187.0,131.0,187.0,172.0,158.0,172.0,F,189.0,139.0,219.0,139.0,219.0,171.0,189.0,171.0,o,226.0,139.0,255.0,139.0,255.0,171.0,226.0,171.0,o,261.0,129.0,282.0,129.0,282.0,171.0,261.0,171.0,t,290.0,140.0,319.0,140.0,319.0,181.0,290.0,181.0,p,324.0,139.0,351.0,139.0,351.0,170.0,324.0,170.0,a,357.0,128.0, 377.0,128.0,377.0,170.0,357.0,170.0,t,385.0,129.0,411.0,129.0,411.0,170.0,385.0,170.0,h

As you can see, this also has character-level annotation.

But for the ICDAR 2015 training set, the annotation looks like this:

377,117,463,117,465,130,378,130,Genaxis Theatre
493,115,519,115,519,131,493,131,[06]
374,155,409,155,409,170,374,170,###
492,151,551,151,551,170,492,170,62-03
376,198,422,198,422,212,376,212,Carpark
494,190,539,189,539,205,494,206,###
374,1,494,0,492,85,372,86,###

As you can see, this does not have character-level annotation.
Can anyone guide me on how to get character-level annotation from this?

Thanks in advance.
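To make the difference between the two formats concrete, here is a small parsing sketch based only on the sample lines quoted above: eight word-polygon coordinates, the word, then zero or more groups of eight character-polygon coordinates followed by the character. It assumes that neither the word nor any character contains a comma, and the function name is hypothetical. It does not create character boxes for ICDAR 2015; it only shows that those lines carry none.

```python
def parse_annotation_line(line):
    """Parse 'x1,y1,...,x4,y4,word[,cx1,cy1,...,cx4,cy4,char]*' into its parts."""
    tokens = [t.strip() for t in line.strip().split(",")]
    if len(tokens) < 9:
        raise ValueError("expected at least 8 coordinates and a word")
    word_polygon = [float(t) for t in tokens[:8]]
    word = tokens[8]
    rest = tokens[9:]
    if len(rest) % 9 != 0:
        raise ValueError("character fields must come in groups of 9")
    chars = []
    for i in range(0, len(rest), 9):
        char_polygon = [float(t) for t in rest[i:i + 8]]
        chars.append((rest[i + 8], char_polygon))
    return word, word_polygon, chars


if __name__ == "__main__":
    # Word-level only, as in the ICDAR 2015 sample above.
    word, polygon, chars = parse_annotation_line(
        "376,198,422,198,422,212,376,212,Carpark")
    print(word, polygon, len(chars))
```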

Are specific versions of CUDA and cuDNN required?

Could you please tell me? Thank you.

train from scratch

I am wondering how to train from scratch.
When I removed the weights file from the config, I got this error: RuntimeError: [enforce fail at operator.cc:58] blob != nullptr. op AffineChannel: Encountered a non-existing input blob: gpt_0/res_conv1_bn_s
Could you help with that?

How can I test my own images?

How could I solve the problem 'no module named lanms'?

cannot import name test_retinanet

Hi @lvpengyuan,

I am getting the following error while running python tools/test_net.py --cfg configs/text/mask_textspotter.yaml:
Traceback (most recent call last):
  File "tools/test_net.py", line 42, in <module>
    from core.test_retinanet import test_retinanet
ImportError: cannot import name test_retinanet

workspace.RunNet(model.net.Proto().name)

How to test on ICDAR2015 training set

Hi,
If I want to output detection results on the ICDAR2015 training set, what should I do?
I changed "icdar2015_test" to "icdar2015_train" in the config yaml file, but I got this error:

Why did this happen, and what should I do?
Thanks :)

How can I get 'scut-eng-char_train'

AttributeError: Method DeformConv is not a registered operator. Did you mean: []

I ran the following command:
python tools/test_net.py --cfg configs/text/mask_textspotter.yaml
and got this error:

  File "/home/brooklyn/anaconda3/envs/conda_py2/lib/python2.7/site-packages/caffe2/python/core.py", line 2205, in __getattr__
    ",".join(workspace.C.nearby_opnames(op_type)) + ']'
AttributeError: Method DeformConv is not a registered operator. Did you mean: []

Why does this happen?
Thanks for your help!

ValueError: Type mismatch (<type 'tuple'> vs. <type 'str'>) with values (() vs. icdar2015_test) for config key: TEST.DATASETS

I tried python tools/test_net.py --cfg configs/text/mask_textspotter.yaml
but I cannot import datasets.

The output is as follows:
make: Entering directory '/home/xie/MaskTextSpotter/lanms'
make: 'adaptor.so' is up to date.
make: Leaving directory '/home/xie/MaskTextSpotter/lanms'
[E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
INFO test_net.py: 142: Called with args:
INFO test_net.py: 143: Namespace(cfg_file='configs/text/mask_textspotter.yaml', multi_gpu_testing=False, opts=[], range=None, vis=False, wait=True)
Traceback (most recent call last):
File "tools/test_net.py", line 145, in <module>
merge_cfg_from_file(args.cfg_file)
File "/home/MaskTextSpotter/lib/core/config.py", line 1095, in merge_cfg_from_file
_merge_a_into_b(yaml_cfg, __C)
File "/home/MaskTextSpotter/lib/core/config.py", line 1153, in _merge_a_into_b
_merge_a_into_b(v, b[k], stack=stack_push)
File "/home/MaskTextSpotter/lib/core/config.py", line 1147, in _merge_a_into_b
v = _check_and_coerce_cfg_value_type(v, b[k], k, full_key)
File "/home/MaskTextSpotter/lib/core/config.py", line 1242, in _check_and_coerce_cfg_value_type
'key: {}'.format(type_b, type_a, value_b, value_a, full_key)
ValueError: Type mismatch (<type 'tuple'> vs. <type 'str'>) with values (() vs. icdar2015_test) for config key: TEST.DATASETS
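
The error message reports that the default for TEST.DATASETS is an empty tuple, while the YAML file supplies the plain string icdar2015_test. The sketch below reproduces that kind of check in isolation. Note that check_value_type is a hypothetical, simplified stand-in written for illustration, not the repository's _check_and_coerce_cfg_value_type, and the tuple spelling is an assumption inferred from the error message.

```python
import ast


def check_value_type(new_value, default_value, key):
    """Hypothetical, simplified type check: the new value must have the default's type."""
    if type(new_value) is not type(default_value):
        raise ValueError(
            "Type mismatch ({} vs. {}) for config key: {}".format(
                type(default_value).__name__, type(new_value).__name__, key
            )
        )
    return new_value


default = ()  # the default value reported in the error message

# A bare string reproduces the mismatch.
try:
    check_value_type("icdar2015_test", default, "TEST.DATASETS")
    mismatch = False
except ValueError:
    mismatch = True

# A tuple literal written as text can be decoded into a real tuple first.
decoded = ast.literal_eval("('icdar2015_test',)")
accepted = check_value_type(decoded, default, "TEST.DATASETS")
```

If the config loader decodes string values as Python literals (an assumption about this code base), writing the entry in the YAML file as DATASETS: ('icdar2015_test',) with the trailing comma may avoid the mismatch.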

format of results

I don't understand the format of the results.
618,140,662,159,618,140,660,140,660,157,618,157,ahead,0.99701583,0.9119489312171936,./train/shrink++_finetune/icdar2015_test/model_iter79999.pkl_results/res_img_1_0.mat
This is one line of the result text file.

There are 12 numbers, then a word, and then two more numbers. What do the 12 numbers mean?
I checked the ICDAR data and it has 8 numbers.

Also, why are there two confidence scores?

Any suggestions would be really helpful.

Thanks in advance.
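
One way to inspect such a line is to split it into fields, as in the minimal sketch below. The field names are guesses, not documented behaviour: in the sample, the first four numbers enclose the remaining eight, so they look like an axis-aligned box followed by four polygon corners. Which of the two scores belongs to detection and which to recognition is not stated in the issue, so they are kept as an unlabeled pair. parse_result_line is a hypothetical helper and assumes the word itself contains no comma.

```python
def parse_result_line(line):
    """Split one result line into its parts (field names are assumptions)."""
    # Limit the split so a comma inside the trailing path would not break parsing.
    fields = line.strip().split(",", 15)
    numbers = [int(value) for value in fields[:12]]
    return {
        "first_four": numbers[:4],  # possibly x_min, y_min, x_max, y_max
        "polygon": list(zip(numbers[4::2], numbers[5::2])),  # four (x, y) pairs
        "word": fields[12],
        "scores": [float(fields[13]), float(fields[14])],
        "source": fields[15],
    }


sample = (
    "618,140,662,159,618,140,660,140,660,157,618,157,ahead,"
    "0.99701583,0.9119489312171936,"
    "./train/shrink++_finetune/icdar2015_test/model_iter79999.pkl_results/res_img_1_0.mat"
)
parsed = parse_result_line(sample)
```

Under this reading, the eight polygon values correspond to the eight coordinates used in the ICDAR annotation files.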

use anaconda

If I use Anaconda, should I change anything when setting up the Python modules?
Thanks.

How should the ICDAR2015 dataset be prepared so that the model can run? I keep getting an error saying the dataset name cannot be found.

ValueError: need more than 2 values to unpack

When testing infer.py with the ICDAR2015 dataset, it prints the following error:

Traceback (most recent call last):
  File "tools/test_net.py", line 169, in <module>
    main(ind_range=args.range, multi_gpu_testing=args.multi_gpu_testing, vis=vis)
  File "tools/test_net.py", line 139, in main
    parent_func(multi_gpu=multi_gpu_testing, vis=vis)
  File "./lib/core/test_engine.py", line 64, in test_net_on_dataset
    test_net(vis=vis)
  File "./lib/core/test_engine.py", line 150, in test_net
    model, im, image_name, box_proposals, timers, vis=vis
  File "./lib/core/test.py", line 150, in im_detect_all
    text, rec_score, rec_char_scores = getstr_grid(char_masks[index,:,:,:].copy(), box_w, box_h)
  File "./lib/core/test.py", line 1107, in getstr_grid
    string, score, rec_scores = seg2text(pos, mask_index, seg)
  File "./lib/core/test.py", line 1216, in seg2text
    im2, contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
ValueError: need more than 2 values to unpack
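
The unpacking fails because cv2.findContours returns three values (image, contours, hierarchy) in OpenCV 3.x but only two (contours, hierarchy) in OpenCV 2.x and 4.x. A minimal sketch of a version-tolerant unpacking step follows. unpack_find_contours is a hypothetical helper, not part of the repository, and it assumes the returned image (im2) is not needed afterwards.

```python
def unpack_find_contours(result):
    """Return (contours, hierarchy) from whatever cv2.findContours returned."""
    if len(result) == 3:
        # OpenCV 3.x: (image, contours, hierarchy)
        _, contours, hierarchy = result
    elif len(result) == 2:
        # OpenCV 2.x and 4.x: (contours, hierarchy)
        contours, hierarchy = result
    else:
        raise ValueError(
            "unexpected findContours result of length {}".format(len(result))
        )
    return contours, hierarchy
```

The failing line in seg2text could then be written as contours, hierarchy = unpack_find_contours(cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)). Alternatively, installing an OpenCV 3.x build should make the original three-value unpacking work.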

The model cannot be downloaded

AssertionError: Detectron ops lib not found at

Is it because I didn't install Detectron?

Could you share the fine-tuned model?

Firstly, thank you for releasing the code.
Could you share the fine-tuned model?
I hope you can reply.

Error when training on ICDAR2015

Traceback (most recent call last):
File "tools/train_net.py", line 370, in <module>
main()
File "tools/train_net.py", line 197, in main
checkpoints = train_model()
File "tools/train_net.py", line 212, in train_model
setup_model_for_training(model, output_dir)
File "tools/train_net.py", line 318, in setup_model_for_training
workspace.CreateNet(model.net)
File "/opt/conda3/lib/python3.6/site-packages/caffe2/python/workspace.py", line 171, in CreateNet
StringifyProto(net), overwrite,
File "/opt/conda3/lib/python3.6/site-packages/caffe2/python/workspace.py", line 197, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at operator.cc:46] blob != nullptr. op AffineChannel: Encountered a non-existing input blob: gpu_0/res_conv1_bn_s
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7fa03a922fe1 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libc10.so)
frame #1: c10::ThrowEnforceNotMet(char const*, int, char const*, std::string const&, void const*) + 0x49 (0x7fa03a922c29 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libc10.so)
frame #2: caffe2::OperatorBase::OperatorBase(caffe2::OperatorDef const&, caffe2::Workspace*) + 0x4ec (0x7f9fe910d71c in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #3: + 0x277bd8c (0x7f9f7eb9cd8c in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #4: + 0x277c42e (0x7f9f7eb9d42e in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #5: std::_Function_handler<std::unique_ptr<caffe2::OperatorBase, std::default_delete<caffe2::OperatorBase> > (caffe2::OperatorDef const&, caffe2::Workspace*), std::unique_ptr<caffe2::OperatorBase, std::default_delete<caffe2::OperatorBase> > (*)(caffe2::OperatorDef const&, caffe2::Workspace*)>::_M_invoke(std::_Any_data const&, caffe2::OperatorDef const&, caffe2::Workspace*) + 0xf (0x7fa03ab8cb7f in /opt/conda3/lib/python3.6/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-36m-x86_64-linux-gnu.so)
frame #6: + 0x14058c7 (0x7f9fe910b8c7 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #7: + 0x1407eee (0x7f9fe910deee in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #8: caffe2::CreateOperator(caffe2::OperatorDef const&, caffe2::Workspace*, int) + 0x340 (0x7f9fe910e4a0 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #9: caffe2::dag_utils::prepareOperatorNodes(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x14fc (0x7f9fe90fb63c in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #10: caffe2::AsyncNetBase::AsyncNetBase(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x287 (0x7f9fe90e6657 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #11: caffe2::AsyncSchedulingNet::AsyncSchedulingNet(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x9 (0x7f9fe90ebef9 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #12: + 0x13e76be (0x7f9fe90ed6be in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #13: + 0x13e756f (0x7f9fe90ed56f in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #14: caffe2::CreateNet(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x6b0 (0x7f9fe90df8b0 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #15: caffe2::Workspace::CreateNet(std::shared_ptr<caffe2::NetDef const> const&, bool) + 0x102 (0x7f9fe913a662 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #16: caffe2::Workspace::CreateNet(caffe2::NetDef const&, bool) + 0x8e (0x7f9fe913abee in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #17: + 0x54767 (0x7fa03ab84767 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-36m-x86_64-linux-gnu.so)
frame #18: + 0x8bb5e (0x7fa03abbbb5e in /opt/conda3/lib/python3.6/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-36m-x86_64-linux-gnu.so)

frame #45: __libc_start_main + 0xf0 (0x7fa046810830 in /lib/x86_64-linux-gnu/libc.so.6)

Is it because I didn't use model_iter79999.pkl to initialize the model?

Missing datasets.dummy_datasets

When I run python2 tools/infer.py --im 1.jpg --rpn-pkl models/model_iter79999.pkl --rpn-cfg configs/text/make_textspotter.yaml --output-dir restuls
the import datasets.dummy_datasets as dummy_datasets line fails with:
ImportError: No module named dummy_datasets
Can you help me? Thanks.
