lvpengyuan / masktextspotter.caffe2
The code of "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes"
License: Apache License 2.0
The datasets directory has no dataset_catalog.
Could you add it?
Thanks.
Hi
When trying to train on ICDAR2013, I get this error after a few images:
File "tools/train_net.py", line 370, in <module>
main()
File "tools/train_net.py", line 197, in main
checkpoints = train_model()
File "tools/train_net.py", line 219, in train_model
workspace.RunNet(model.net.Proto().name)
File "/home/archimedes/anaconda2/lib/python2.7/site-packages/caffe2/python/workspace.py", line 237, in RunNet
StringifyNetName(name), num_iter, allow_fail,
File "/home/archimedes/anaconda2/lib/python2.7/site-packages/caffe2/python/workspace.py", line 198, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at pybind_state.h:425] . Exception encountered running PythonOp function: AssertionError: Negative areas founds
At:
/home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/utils/boxes.py(62): boxes_area
/home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/modeling/FPN.py(449): map_rois_to_fpn_levels
/home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/roi_data/fast_rcnn.py(390): _distribute_rois_over_fpn_levels
/home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/roi_data/fast_rcnn.py(398): _add_multilevel_rois
/home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/roi_data/fast_rcnn.py(150): add_fast_rcnn_blobs_rec
/home/archimedes/ai/Ryan/masktextspotter.caffe2/lib/ops/collect_and_distribute_fpn_rpn_proposals_rec.py(60): forward
Any idea why this is happening?
Thanks in advance.
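The assertion is raised in boxes_area, which the traceback shows is reached from map_rois_to_fpn_levels. One possible cause (an assumption, not confirmed anywhere in this thread) is a box whose corners are out of order, so that x2 < x1 or y2 < y1. A minimal sketch for scanning a list of boxes for that condition, assuming (x1, y1, x2, y2) boxes and an inclusive-pixel "+ 1" width and height; the function names here are hypothetical and not part of the repository:

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box, assuming the inclusive-pixel (+1) convention."""
    x1, y1, x2, y2 = box
    return (x2 - x1 + 1) * (y2 - y1 + 1)


def find_negative_area_boxes(boxes):
    """Return the indices of boxes whose computed area is negative."""
    return [i for i, box in enumerate(boxes) if box_area(box) < 0]


boxes = [
    (10, 10, 50, 30),  # well-formed box
    (60, 40, 20, 80),  # x2 < x1, so the width is negative
    (5, 5, 5, 5),      # single pixel, area 1
]
print(find_negative_area_boxes(boxes))
```

Running such a check over the ground-truth boxes before training would show whether the annotations are the source of the problem. A box with both corners swapped yields a positive product, so this check alone does not catch every malformed box.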
Changes made to the Caffe2 and Detectron projects mean this Dockerfile can no longer be run. Besides replacing the parent image and the Detectron workdir, there is another problem that I can't fix during make op:
"CMake Error at CMakeLists.txt:8 (find_package):
By not providing "FindCaffe2.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "Caffe2", but
CMake did not find one.
Could not find a package configuration file provided by "Caffe2" with any
of the following names:
Caffe2Config.cmake
caffe2-config.cmake
Add the installation prefix of "Caffe2" to CMAKE_PREFIX_PATH or set
"Caffe2_DIR" to a directory containing one of the above files. If "Caffe2"
provides a separate development package or SDK, be sure it has been
installed."
Could you fix that? Thank you in advance.
WARNING cnn.py: 40: [====DEPRECATE WARNING====]: you are creating an object from CNNModelHelper class which will be deprecated soon. Please use ModelHelper object with brew module. For more information, please refer to caffe2.ai and python/brew.py, python/brew_test.py for more information.
Traceback (most recent call last):
File "tools/test_net.py", line 157, in <module>
main(ind_range=args.range, multi_gpu_testing=args.multi_gpu_testing, vis=vis)
File "tools/test_net.py", line 127, in main
parent_func(multi_gpu=multi_gpu_testing, vis=vis)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/core/test_engine.py", line 64, in test_net_on_dataset
test_net(vis=vis)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/core/test_engine.py", line 126, in test_net
model = initialize_model_from_cfg()
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/core/test_engine.py", line 158, in initialize_model_from_cfg
model = model_builder.create(cfg.MODEL.TYPE, train=False)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/model_builder.py", line 119, in create
return get_func(model_type_func)(model)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/model_builder.py", line 91, in generalized_rcnn
freeze_conv_body=cfg.TRAIN.FREEZE_CONV_BODY
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/model_builder.py", line 224, in build_generic_detection_model
optim.build_data_parallel_model(model, _single_gpu_build_func)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/optimizer.py", line 51, in build_data_parallel_model
single_gpu_build_func(model)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/model_builder.py", line 164, in _single_gpu_build_func
blob_conv, dim_conv, spatial_scale_conv = add_conv_body_func(model)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/FPN.py", line 47, in add_fpn_ResNet50_conv5_body
model, ResNet.add_ResNet50_conv5_body, fpn_level_info_ResNet50_conv5
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/FPN.py", line 103, in add_fpn_onto_conv_body
conv_body_func(model)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/ResNet.py", line 39, in add_ResNet50_conv5_body
return add_ResNet_convX_body(model, (3, 4, 6, 3), use_deformable=use_deformable)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/ResNet.py", line 98, in add_ResNet_convX_body
p = model.AffineChannel(p, 'res_conv1_bn', inplace=True)
File "/root/fsy_SceneTextRec/masktextspotter.caffe2-master/lib/modeling/detector.py", line 104, in AffineChannel
return self.net.AffineChannel([blob_in, scale, bias], blob_in)
File "/usr/local/lib/python2.7/dist-packages/caffe2/python/core.py", line 2040, in __getattr__
",".join(workspace.C.nearby_opnames(op_type)) + ']'
AttributeError: Method AffineChannel is not a registered operator. Did you mean: []
I wonder whether this problem is caused by an incorrect installation of the Detectron package. However, when I installed Detectron by following the Dockerfile included in it, no problems were reported.
Thank you for releasing the code.
Could you share the pre-trained model?
Looking forward to your reply.
There is no cityscapes_json_dataset_evaluator. @lvpengyuan
I use anaconda3 and python2.7
my gcc is 4.7.3
Traceback (most recent call last):
File "tools/test_net.py", line 42, in <module>
from core.test_retinanet import test_retinanet
ImportError: cannot import name test_retinanet
Hello, I could not find the test_retinanet function in test_retinanet.py under Detectron's core directory.
Why is it set to 2 instead of 36? Does it mean text/non-text?
Thank you very much for your work. When I was training the model, the program reported that it was out of memory. Could you tell me how to solve this?
@lvpengyuan
Is it the score of each character, or the matrix data of the image?
Please help me, thank you.
Hello, how well does the current model recognize Chinese text?
Hello
Thanks for sharing your work!
I was wondering how I could train on ICDAR2015.
Also, could you kindly share the trained SynthText model?
Thanks in advance.
I followed your instructions and configured the masktextspotter environment. I am now running the test with the data set and test set you provided:
python tools/test_net.py --cfg configs/text/mask_textspotter.yaml
It prints the following output, and I don't know how long it is supposed to run:
(caffe2_env) zhoujianwen@zhoujianwen-System:~/masktextspotter.caffe2$ python tools/test_net.py --cfg configs/text/mask_textspotter.yaml
make: Entering directory '/home/zhoujianwen/masktextspotter.caffe2/lanms'
make: 'adaptor.so' is up to date.
make: Leaving directory '/home/zhoujianwen/masktextspotter.caffe2/lanms'
INFO test_net.py: 141: Called with args:
INFO test_net.py: 142: Namespace(cfg_file='configs/text/mask_textspotter.yaml', multi_gpu_testing=False, opts=[], range=None, vis=False, wait=True)
/home/zhoujianwen/masktextspotter.caffe2/lib/core/config.py:1094: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
yaml_cfg = AttrDict(yaml.load(f))
INFO test_net.py: 148: Testing with config:
INFO test_net.py: 149: {'BBOX_XFORM_CLIP': 4.135166556742356,
'CLUSTER': {'ON_CLUSTER': False},
'DATA_LOADER': {'NUM_THREADS': 4},
'DEDUP_BOXES': 0.0625,
'DOWNLOAD_CACHE': '/tmp/detectron-download-cache',
'EPS': 1e-14,
'EXPECTED_RESULTS': [],
'EXPECTED_RESULTS_ATOL': 0.005,
'EXPECTED_RESULTS_EMAIL': '',
'EXPECTED_RESULTS_RTOL': 0.1,
'FAST_RCNN': {'MLP_HEAD_DIM': 1024,
'ROI_BOX_HEAD': 'fast_rcnn_heads.add_roi_2mlp_head',
'ROI_XFORM_METHOD': 'RoIAlign',
'ROI_XFORM_RESOLUTION': 7,
'ROI_XFORM_SAMPLING_RATIO': 2},
'FPN': {'COARSEST_STRIDE': 32,
'DIM': 256,
'EXTRA_CONV_LEVELS': False,
'FPN_ON': True,
'MULTILEVEL_ROIS': True,
'MULTILEVEL_RPN': True,
'ROI_CANONICAL_LEVEL': 4,
'ROI_CANONICAL_SCALE': 224,
'ROI_MAX_LEVEL': 5,
'ROI_MIN_LEVEL': 2,
'RPN_ANCHOR_START_SIZE': 32,
'RPN_ASPECT_RATIOS': (0.5, 1, 2),
'RPN_MAX_LEVEL': 6,
'RPN_MIN_LEVEL': 2,
'USE_DEFORMABLE': False,
'ZERO_INIT_LATERAL': False},
'IMAGE': {'aug': False,
'brightness_delta': 32,
'brightness_prob': 0.5,
'contrast_lower': 0.5,
'contrast_prob': 0.5,
'contrast_upper': 1.5,
'hue_delta': 18,
'hue_prob': 0.5,
'lighting_noise_prob': 0.5,
'rotate_delta': 15,
'rotate_prob': 0.5,
'saturation_lower': 0.5,
'saturation_prob': 0.5,
'saturation_upper': 1.5},
'KRCNN': {'CONV_HEAD_DIM': 256,
'CONV_HEAD_KERNEL': 3,
'CONV_INIT': 'GaussianFill',
'DECONV_DIM': 256,
'DECONV_KERNEL': 4,
'DILATION': 1,
'HEATMAP_SIZE': -1,
'INFERENCE_MIN_SIZE': 0,
'KEYPOINT_CONFIDENCE': 'bbox',
'LOSS_WEIGHT': 1.0,
'MIN_KEYPOINT_COUNT_FOR_VALID_MINIBATCH': 20,
'NMS_OKS': False,
'NORMALIZE_BY_VISIBLE_KEYPOINTS': True,
'NUM_KEYPOINTS': -1,
'NUM_STACKED_CONVS': 8,
'ROI_KEYPOINTS_HEAD': '',
'ROI_XFORM_METHOD': 'RoIAlign',
'ROI_XFORM_RESOLUTION': 7,
'ROI_XFORM_SAMPLING_RATIO': 0,
'UP_SCALE': -1,
'USE_DECONV': False,
'USE_DECONV_OUTPUT': False},
'MATLAB': 'matlab',
'MEMONGER': True,
'MEMONGER_SHARE_ACTIVATIONS': False,
'MODEL': {'BBOX_REG_WEIGHTS': (10.0, 10.0, 5.0, 5.0),
'CLS_AGNOSTIC_BBOX_REG': False,
'CONV_BODY': 'FPN.add_fpn_ResNet50_conv5_body',
'EXECUTION_TYPE': 'dag',
'FASTER_RCNN': True,
'KEYPOINTS_ON': False,
'MASK_ON': True,
'NAME': 'shrink++',
'NUM_CLASSES': 2,
'RPN_ONLY': False,
'TYPE': 'generalized_rcnn'},
'MRCNN': {'CLS_SPECIFIC_MASK': True,
'CONV_INIT': 'MSRAFill',
'DILATION': 1,
'DIM_REDUCED': 256,
'IS_E2E': True,
'MASK_BATCH_SIZE_PER_IM': 16,
'RESOLUTION': 28,
'RESOLUTION_H': 32,
'RESOLUTION_W': 128,
'ROI_MASK_HEAD': 'text_mask_rcnn_heads.mask_rcnn_fcn_head_v1up4convs',
'ROI_XFORM_METHOD': 'RoIAlign',
'ROI_XFORM_RESOLUTION': 14,
'ROI_XFORM_RESOLUTION_H': 16,
'ROI_XFORM_RESOLUTION_W': 64,
'ROI_XFORM_SAMPLING_RATIO': 2,
'THRESH_BINARIZE': 0.5,
'UPSAMPLE_RATIO': 1,
'USE_FC_OUTPUT': False,
'WEIGHT_LOSS_CHAR_BOX': 1.0,
'WEIGHT_LOSS_MASK': 1.0,
'WEIGHT_WH': True},
'NUM_GPUS': 1,
'OUTPUT_DIR': '.',
'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]),
'RESNETS': {'NUM_GROUPS': 1,
'RES5_DILATION': 1,
'STRIDE_1X1': True,
'TRANS_FUNC': 'bottleneck_transformation',
'WIDTH_PER_GROUP': 64},
'RETINANET': {'ANCHOR_SCALE': 4,
'ASPECT_RATIOS': (0.25, 0.5, 1.0, 2.0, 4.0),
'BBOX_REG_BETA': 0.11,
'BBOX_REG_WEIGHT': 1.0,
'CLASS_SPECIFIC_BBOX': False,
'INFERENCE_TH': 0.05,
'LOSS_ALPHA': 0.25,
'LOSS_GAMMA': 2.0,
'NEGATIVE_OVERLAP': 0.4,
'NUM_CONVS': 4,
'POSITIVE_OVERLAP': 0.5,
'PRE_NMS_TOP_N': 1000,
'PRIOR_PROB': 0.01,
'RETINANET_ON': False,
'SCALES_PER_OCTAVE': 3,
'SHARE_CLS_BBOX_TOWER': False,
'SOFTMAX': False},
'RFCN': {'PS_GRID_SIZE': 3},
'RNG_SEED': 3,
'ROOT_DIR': '/home/zhoujianwen/masktextspotter.caffe2',
'RPN': {'ASPECT_RATIOS': (0.5, 1, 2),
'RPN_ON': True,
'SIZES': (64, 128, 256, 512),
'STRIDE': 16},
'SOLVER': {'BASE_LR': 0.005,
'GAMMA': 0.1,
'LOG_LR_CHANGE_THRESHOLD': 1.1,
'LRS': [],
'LR_POLICY': 'steps_with_decay',
'MAX_ITER': 200000,
'MOMENTUM': 0.9,
'SCALE_MOMENTUM': True,
'SCALE_MOMENTUM_THRESHOLD': 1.1,
'STEPS': [0, 120000],
'STEP_SIZE': 30000,
'WARM_UP_FACTOR': 0.3333333333333333,
'WARM_UP_ITERS': 500,
'WARM_UP_METHOD': u'linear',
'WEIGHT_DECAY': 0.0001},
'TEST': {'BBOX_AUG': {'AREA_TH_HI': 32400,
'AREA_TH_LO': 2500,
'ASPECT_RATIOS': (),
'ASPECT_RATIO_H_FLIP': False,
'COORD_HEUR': 'UNION',
'ENABLED': False,
'H_FLIP': False,
'MAX_SIZE': 2000,
'SCALES': (800,),
'SCALE_H_FLIP': False,
'SCALE_SIZE_DEP': False,
'SCORE_HEUR': 'UNION'},
'BBOX_REG': True,
'BBOX_VOTE': {'ENABLED': True,
'SCORING_METHOD': 'ID',
'SCORING_METHOD_BETA': 1.0,
'VOTE_TH': 0.9},
'COMPETITION_MODE': True,
'DATASET': '',
'DATASETS': ('icdar2015_test',),
'DETECTIONS_PER_IM': 100,
'FORCE_JSON_DATASET_EVAL': False,
'KPS_AUG': {'AREA_TH': 32400,
'ASPECT_RATIOS': (),
'ASPECT_RATIO_H_FLIP': False,
'ENABLED': False,
'HEUR': 'HM_AVG',
'H_FLIP': False,
'MAX_SIZE': 4000,
'SCALES': (),
'SCALE_H_FLIP': False,
'SCALE_SIZE_DEP': False},
'MASK_AUG': {'AREA_TH': 32400,
'ASPECT_RATIOS': (),
'ASPECT_RATIO_H_FLIP': False,
'ENABLED': False,
'HEUR': 'SOFT_AVG',
'H_FLIP': False,
'MAX_SIZE': 3333,
'SCALES': (1600,),
'SCALE_H_FLIP': False,
'SCALE_SIZE_DEP': False},
'MAX_SIZE': 3333,
'NMS': 0.5,
'NUM_TEST_IMAGES': 5000,
'OUTPUT_POLYGON': False,
'PRECOMPUTED_PROPOSALS': False,
'PROPOSAL_FILE': '',
'PROPOSAL_FILES': (),
'PROPOSAL_LIMIT': 2000,
'RPN_MIN_SIZE': 0,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 1000,
'RPN_PRE_NMS_TOP_N': 1000,
'SCALES': (1000,),
'SCORE_THRESH': 0.2,
'SOFT_NMS': {'ENABLED': False, 'METHOD': 'linear', 'SIGMA': 0.5},
'VIS': False,
'WEIGHTS': '/home/zhoujianwen/masktextspotter.caffe2/models/model_iter79999.pkl'},
'TRAIN': {'ASPECT_GROUPING': True,
'AUTO_RESUME': True,
'BATCH_SIZE_PER_IM': 512,
'BBOX_THRESH': 0.5,
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.0,
'CROWD_FILTER_THRESH': 0.7,
'DATASETS': ('icdar2015_train',),
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'FREEZE_CONV_BODY': False,
'GT_MIN_AREA': -1,
'IMS_PER_BATCH': 2,
'MAX_SIZE': 1333,
'MIX_RATIOS': [0.5, 0.25, 0.25],
'MIX_TRAIN': False,
'PROPOSAL_FILES': (),
'RPN_BATCH_SIZE_PER_IM': 256,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 0,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POST_NMS_TOP_N': 2000,
'RPN_PRE_NMS_TOP_N': 2000,
'RPN_STRADDLE_THRESH': 0,
'SCALES': (800,),
'SNAPSHOT_ITERS': 10000,
'USE_CHARANNS': [True],
'USE_FLIPPED': False,
'WEIGHTS': u'/tmp/detectron-download-cache/ImageNetPretrained/MSRA/R-50.pkl'},
'USE_NCCL': False,
'VIS': False,
'VIS_TH': 0.9}
WARNING cnn.py: 25: [====DEPRECATE WARNING====]: you are creating an object from CNNModelHelper class which will be deprecated soon. Please use ModelHelper object with brew module. For more information, please refer to caffe2.ai and python/brew.py, python/brew_test.py for more information.
INFO net.py: 54: Loading from: /home/zhoujianwen/masktextspotter.caffe2/models/model_iter79999.pkl
/home/zhoujianwen/masktextspotter.caffe2/lib/utils/net.py:59: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
saved_cfg = yaml.load(src_blobs['cfg'])
Traceback (most recent call last):
File "tools/test_net.py", line 159, in <module>
main(ind_range=args.range, multi_gpu_testing=args.multi_gpu_testing, vis=vis)
File "tools/test_net.py", line 127, in main
parent_func(multi_gpu=multi_gpu_testing, vis=vis)
File "/home/zhoujianwen/masktextspotter.caffe2/lib/core/test_engine.py", line 64, in test_net_on_dataset
test_net(vis=vis)
File "/home/zhoujianwen/masktextspotter.caffe2/lib/core/test_engine.py", line 126, in test_net
model = initialize_model_from_cfg()
File "/home/zhoujianwen/masktextspotter.caffe2/lib/core/test_engine.py", line 160, in initialize_model_from_cfg
model, cfg.TEST.WEIGHTS, broadcast=False
File "/home/zhoujianwen/masktextspotter.caffe2/lib/utils/net.py", line 45, in initialize_from_weights_file
initialize_gpu_0_from_weights_file(model, weights_file)
File "/home/zhoujianwen/masktextspotter.caffe2/lib/utils/net.py", line 59, in initialize_gpu_0_from_weights_file
saved_cfg = yaml.load(src_blobs['cfg'])
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/__init__.py", line 114, in load
return loader.get_single_data()
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 45, in get_single_data
return self.construct_document(node)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 49, in construct_document
data = self.construct_object(node)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 96, in construct_object
data = constructor(self, tag_suffix, node)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 628, in construct_python_object_new
return self.construct_python_object_apply(suffix, node, newobj=True)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 611, in construct_python_object_apply
value = self.construct_mapping(node, deep=True)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 214, in construct_mapping
return BaseConstructor.construct_mapping(self, node, deep=deep)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 139, in construct_mapping
value = self.construct_object(value_node, deep=deep)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 101, in construct_object
for dummy in generator:
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 404, in construct_yaml_map
value = self.construct_mapping(node)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 214, in construct_mapping
return BaseConstructor.construct_mapping(self, node, deep=deep)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 139, in construct_mapping
value = self.construct_object(value_node, deep=deep)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 96, in construct_object
data = constructor(self, tag_suffix, node)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 617, in construct_python_object_apply
instance = self.make_python_instance(suffix, node, args, kwds, newobj)
File "/home/zhoujianwen/.local/lib/python2.7/site-packages/yaml/constructor.py", line 558, in make_python_instance
node.start_mark)
yaml.constructor.ConstructorError: while constructing a Python instance
expected a class, but found <type 'builtin_function_or_method'>
in "<string>", line 3, column 20:
BBOX_XFORM_CLIP: !!python/object/apply:numpy.core ...
The test results are written to /masktextspotter.caffe2/models/icdar2015_test/model_iter79999.pkl_results/
as files such as res_img_100.txt and res_img_100_0.mat.npy.
How can I convert them to pictures?
Hello @lvpengyuan, thank you very much for your excellent work. Can I continue training with new data, starting from the model you have created?
I ran test_net.py and got this error:
no module named 'core'
I can't find a Python module named core by searching Google.
Thanks for any help!
I tried python tools/test_net.py --cfg configs/text/mask_textspotter.yaml
but I cannot import lanms.
I have already run make in lib and in masktextspotter.caffe2/lanms.
I use anaconda3 with python2.7.
My gcc and g++ version is 5.4.
Hello, lvpengyuan:
Thanks for sharing the code. I can run the test script after installing the environment, but I ran into some problems
while running the training script. The error information is as follows:
Error1:
[E pybind_state.h:424] Exception encountered running PythonOp function: IndexError: index 0 is out of bounds for axis 0 with size 0
At:
/MaskTextSpotter/MTS_libbase/lib/roi_data/mask_rcnn.py(210): add_charmask_rcnn_blobs
/MaskTextSpotter/MTS_libbase/lib/roi_data/fast_rcnn.py(325): _sample_rois_rec
/MaskTextSpotter/MTS_libbase/lib/roi_data/fast_rcnn.py(141): add_fast_rcnn_blobs_rec
/MaskTextSpotter/MTS_libbase/lib/ops/collect_and_distribute_fpn_rpn_proposals_rec.py(60): forward
Error2:
[E pybind_state.h:424] Exception encountered running PythonOp function: UnboundLocalError: local variable 'char_boxes' referenced before assignment
At:
/MaskTextSpotter/MTS_libbase/lib/roi_data/mask_rcnn.py(277): add_charmask_rcnn_blobs
/MaskTextSpotter/MTS_libbase/lib/roi_data/fast_rcnn.py(325): _sample_rois_rec
/MaskTextSpotter/MTS_libbase/lib/roi_data/fast_rcnn.py(141): add_fast_rcnn_blobs_rec
/MaskTextSpotter/MTS_libbase/lib/ops/collect_and_distribute_fpn_rpn_proposals_rec.py(60): forward
Both errors are raised in the function add_charmask_rcnn_blobs in lib/roi_data/mask_rcnn.py. After reading it, I found that both occur in this branch:
if is_e2e:
    ...
    if fg_inds.shape[0] > 0:
        ...
    # the else branch doesn't assign the variable 'char_boxes'
    else:  # If there are no fg masks (it does happen)
        print('is_e2e true: fg_inds.shape[0] > 0: ' + str(fg_inds.shape[0] > 0))
        # The network cannot handle empty blobs, so we must provide a mask
        # We simply take the first bg roi, given it an all -1's mask (ignore
        # label), and label it with class zero (bg).
        bg_inds = np.where(blobs['labels_int32'] == 0)[0]
        # rois_fg is actually one background roi, but that's ok because ...
        rois_fg = sampled_boxes[bg_inds[0]].reshape((1, -1))
        # We give it an -1's blob (ignore label)
        masks = -blob_utils.ones((1, 2, M_HEIGHT * M_WIDTH), int32=True)
        mask_weights = -blob_utils.ones((1, 2, M_HEIGHT * M_WIDTH), int32=True)
        # changed lzn
        # char_boxes_inside_weight = np.zeros(1, M_HEIGHT*M_WIDTH, 4, dtype=np.float32)
        char_boxes_inside_weight = np.zeros((1, M_HEIGHT * M_WIDTH, 4), dtype=np.float32)
        # We label it with class = 0 (background)
        mask_class_labels = blob_utils.zeros((1, ))
        # Mark that the first roi has a mask
        # the commented-out check below helps me avoid Error1
        # if len(roi_has_mask) == 0:
        #     pass
        # else:
        roi_has_mask[0] = 1
Can you help me?
Thanks a lot.
Excuse me, what do I need to modify to recognize text in the vertical direction?
My text is vertically oriented and is recognized incorrectly, while horizontal text is recognized fine. Where do I need to make the change? In test.py? How should I modify it?
When I import lanms, it prints this error:
Traceback (most recent call last):
File "tools/test_net.py", line 40, in <module>
from core.test_engine import test_net, test_net_on_dataset
File "/media/chenjun/ed/31_ocr_own/masktextspotter.caffe2/lib/core/test_engine.py", line 37, in <module>
from core.test import im_detect_all
File "/media/chenjun/ed/31_ocr_own/masktextspotter.caffe2/lib/core/test.py", line 49, in <module>
import lanms
File "/home/chenjun/anaconda2/lib/python2.7/site-packages/lanms/__init__.py", line 2, in <module>
from .adaptor import merge_quadrangle_n9 as nms_impl
ImportError: /home/chenjun/anaconda2/lib/python2.7/site-packages/lanms/adaptor.so: undefined symbol: PyInstanceMethod_Type
from datasets import task_evaluation
ImportError: cannot import name task_evaluation
The datasets directory has no task_evaluation.
Could you help me?
Thank you very much.
I noticed that you have a lib folder that looks like the one in Detectron. However, even functions and files with the same names differ slightly from Detectron's.
When I tried the code, the core module in your code seemed to conflict with the core module in Detectron.
If you made any changes to Detectron, could you provide your version of it?
When I set MIX_RATIOS = [0.4, 0.4, 0.2] in the .yaml file, training stops at assert(len(self._dataset_inds) == self._num_gpus*cfg.TRAIN.IMS_PER_BATCH), and self._dataset_inds = [].
When I instead set MIX_RATIOS = [2.0, 2.0, 1.0] and comment out the assert, self._dataset_inds = [0, 0, 1, 1, 2] and training seems to work as expected.
Do you have any idea why? Could you please explain the assert?
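Both reported values would be consistent with the index list being built by truncating each ratio to an integer repeat count. That rule is only a guess that happens to reproduce the two observations; it is not taken from the repository's code, and build_dataset_inds is a hypothetical name used for illustration:

```python
def build_dataset_inds(mix_ratios):
    """Hypothetical rule: repeat dataset index i int(ratio_i) times."""
    inds = []
    for i, ratio in enumerate(mix_ratios):
        # int() truncates toward zero, so any ratio below 1.0 contributes nothing.
        inds.extend([i] * int(ratio))
    return inds


print(build_dataset_inds([0.4, 0.4, 0.2]))
print(build_dataset_inds([2.0, 2.0, 1.0]))
```

If the real code behaves this way, fractional ratios would always produce an empty list, and the assert would only pass when the integer counts sum to the number of images per iteration.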
ImportError: cannot import name json_dataset
@lvpengyuan
Hello, I would like to ask about my test results, which look like this: "819,283,871,299,819,283,870,283,870,298,819,298,yourselli,0.9986137,0.6931232114632925,/data/home/zjw/pythonFile/masktextspotter.caffe2/data/icdar2015_test/model_iter79999.pkl_results/res_img_2_0.mat"
What do the first 12 numbers mean, and how do they represent the bounding box? And what are the two numbers before the mask path?
Thank you.
Hello, how can I run inference on new images using the pretrained model?
Hi
For training on ICDAR2013, the ground truth for a single word looks like this:
158.0,128.0,411.0,128.0,411.0,181.0,158.0,181.0,Footpath,158.0,131.0,187.0,131.0,187.0,172.0,158.0,172.0,F,189.0,139.0,219.0,139.0,219.0,171.0,189.0,171.0,o,226.0,139.0,255.0,139.0,255.0,171.0,226.0,171.0,o,261.0,129.0,282.0,129.0,282.0,171.0,261.0,171.0,t,290.0,140.0,319.0,140.0,319.0,181.0,290.0,181.0,p,324.0,139.0,351.0,139.0,351.0,170.0,324.0,170.0,a,357.0,128.0, 377.0,128.0,377.0,170.0,357.0,170.0,t,385.0,129.0,411.0,129.0,411.0,170.0,385.0,170.0,h
As you can see, this also has character-level annotation.
But for the ICDAR2015 training set, the annotation looks like this:
377,117,463,117,465,130,378,130,Genaxis Theatre
493,115,519,115,519,131,493,131,[06]
374,155,409,155,409,170,374,170,###
492,151,551,151,551,170,492,170,62-03
376,198,422,198,422,212,376,212,Carpark
494,190,539,189,539,205,494,206,###
374,1,494,0,492,85,372,86,###
As you can see, this does not have character-level annotation.
Can anyone guide me on how to get character-level annotation from this?
Thanks in advance.
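The two layouts quoted above can be read with one small parser. This is a minimal sketch that assumes each line holds 8 word-box coordinates, the word, and then zero or more groups of 8 character-box coordinates followed by the character; parse_gt_line is a hypothetical helper, not a function from the repository. It does not create character boxes for ICDAR2015, it only shows that such lines carry none:

```python
def parse_gt_line(line):
    """Split one ground-truth line into (word, word_box, chars).

    Assumed layout: 8 word-box coordinates, the word, then zero or more
    groups of 8 character-box coordinates followed by the character.
    """
    fields = [f.strip() for f in line.split(",")]
    word_box = [float(v) for v in fields[:8]]
    word = fields[8]
    rest = fields[9:]
    chars = []
    # Each character annotation occupies 9 fields: 8 coordinates + 1 character.
    for start in range(0, len(rest) - 8, 9):
        box = [float(v) for v in rest[start:start + 8]]
        chars.append((rest[start + 8], box))
    return word, word_box, chars


icdar2015_line = "377,117,463,117,465,130,378,130,Genaxis Theatre"
icdar2013_line = (
    "158.0,128.0,411.0,128.0,411.0,181.0,158.0,181.0,Footpath,"
    "158.0,131.0,187.0,131.0,187.0,172.0,158.0,172.0,F"
)
print(parse_gt_line(icdar2015_line))
print(parse_gt_line(icdar2013_line))
```

Splitting on commas breaks for transcriptions that themselves contain a comma, so a real loader would need to handle that case separately.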
Could you please tell me, thank you.
I am wondering how to train from scratch.
When I removed the weights file from the config, I got this error: RuntimeError: [enforce fail at operator.cc:58] blob != nullptr. op AffineChannel: Encountered a non-existing input blob: gpu_0/res_conv1_bn_s
Could you help with that?
Hi @lvpengyuan,
I am getting the following error while running python tools/test_net.py --cfg configs/text/mask_textspotter.yaml
Traceback (most recent call last):
File "tools/test_net.py", line 42, in <module>
from core.test_retinanet import test_retinanet
ImportError: cannot import name test_retinanet
I ran the following command:
python tools/test_net.py --cfg configs/text/mask_textspotter.yaml
and it produced this error:
File "/home/brooklyn/anaconda3/envs/conda_py2/lib/python2.7/site-packages/caffe2/python/core.py", line 2205, in __getattr__
",".join(workspace.C.nearby_opnames(op_type)) + ']'
AttributeError: Method DeformConv is not a registered operator. Did you mean: []
How come?
Thanks for help!
I tried python tools/test_net.py --cfg configs/text/mask_textspotter.yaml
but I cannot import datasets.
The output is as follows:
make: Entering directory '/home/xie/MaskTextSpotter/lanms'
make: 'adaptor.so' is up to date.
make: Leaving directory '/home/xie/MaskTextSpotter/lanms'
[E init_intrinsics_check.cc:43] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
[E init_intrinsics_check.cc:43] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
INFO test_net.py: 142: Called with args:
INFO test_net.py: 143: Namespace(cfg_file='configs/text/mask_textspotter.yaml', multi_gpu_testing=False, opts=[], range=None, vis=False, wait=True)
Traceback (most recent call last):
File "tools/test_net.py", line 145, in <module>
merge_cfg_from_file(args.cfg_file)
File "/home/MaskTextSpotter/lib/core/config.py", line 1095, in merge_cfg_from_file
_merge_a_into_b(yaml_cfg, __C)
File "/home/MaskTextSpotter/lib/core/config.py", line 1153, in _merge_a_into_b
_merge_a_into_b(v, b[k], stack=stack_push)
File "/home/MaskTextSpotter/lib/core/config.py", line 1147, in _merge_a_into_b
v = _check_and_coerce_cfg_value_type(v, b[k], k, full_key)
File "/home/MaskTextSpotter/lib/core/config.py", line 1242, in _check_and_coerce_cfg_value_type
'key: {}'.format(type_b, type_a, value_b, value_a, full_key)
ValueError: Type mismatch (<type 'tuple'> vs. <type 'str'>) with values (() vs. icdar2015_test) for config key: TEST.DATASETS
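The message says the default for TEST.DATASETS is a tuple while the value read from the YAML file is a plain string. A minimal sketch of how that can happen, assuming the config loader tries to read each YAML string as a Python literal and keeps the raw string when that fails (decode_cfg_value is a hypothetical stand-in, not the repository's function):

```python
import ast


def decode_cfg_value(value):
    """Try to read a YAML string as a Python literal; fall back to the raw string."""
    if not isinstance(value, str):
        return value
    try:
        return ast.literal_eval(value)
    except (ValueError, SyntaxError):
        return value


# A bare name is not a Python literal, so it stays a str.
print(type(decode_cfg_value("icdar2015_test")).__name__)
# Tuple syntax, including the trailing comma, decodes to a tuple.
print(type(decode_cfg_value("('icdar2015_test',)")).__name__)
```

Under that assumption, writing the value in the YAML file with tuple syntax, as in the form ('icdar2015_test',) shown in the config dump earlier on this page, would give the type the check expects.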
Hi
I don't understand the format of the results. This is one line of the result text file:
618,140,662,159,618,140,660,140,660,157,618,157,ahead,0.99701583,0.9119489312171936,./train/shrink++_finetune/icdar2015_test/model_iter79999.pkl_results/res_img_1_0.mat
There are 12 numbers, then a word, then two more numbers. What do the 12 numbers mean?
I checked the ICDAR data, and it has only 8 numbers.
Also, why are there two confidence scores?
Any suggestions would be really helpful.
Thanks in advance.
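One plausible reading of such a line, inferred only from the numbers themselves and not confirmed by the author, is that the first four values are an axis-aligned box (x_min, y_min, x_max, y_max) and the next eight are the four corners of a quadrilateral. The meaning of the two scores is not stated anywhere in this thread, so the sketch below keeps them unnamed. parse_result_line and the dictionary keys are hypothetical:

```python
def parse_result_line(line):
    """Split one result line into its parts, under the assumed field layout."""
    fields = line.strip().split(",")
    numbers = [int(v) for v in fields[:12]]
    return {
        "box": numbers[:4],       # assumed: x_min, y_min, x_max, y_max
        "polygon": numbers[4:],   # assumed: four (x, y) corner points
        "text": fields[12],
        "scores": (float(fields[13]), float(fields[14])),
        "mask_path": ",".join(fields[15:]),
    }


line = (
    "618,140,662,159,618,140,660,140,660,157,618,157,ahead,"
    "0.99701583,0.9119489312171936,"
    "./train/shrink++_finetune/icdar2015_test/model_iter79999.pkl_results/res_img_1_0.mat"
)
result = parse_result_line(line)
print(result["box"], result["polygon"], result["text"])
```

The split assumes the recognized text contains no comma; a line whose text does contain one would need different handling.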
If I use Anaconda, should I change anything when I set up the Python modules?
Thanks.
When I test infer.py with the ICDAR2015 dataset, it prints this error:
Traceback (most recent call last):
File "tools/test_net.py", line 169, in <module>
main(ind_range=args.range, multi_gpu_testing=args.multi_gpu_testing, vis=vis)
File "tools/test_net.py", line 139, in main
parent_func(multi_gpu=multi_gpu_testing, vis=vis)
File "./lib/core/test_engine.py", line 64, in test_net_on_dataset
test_net(vis=vis)
File "./lib/core/test_engine.py", line 150, in test_net
model, im, image_name, box_proposals, timers, vis=vis
File "./lib/core/test.py", line 150, in im_detect_all
text, rec_score, rec_char_scores = getstr_grid(char_masks[index,:,:,:].copy(), box_w, box_h)
File "./lib/core/test.py", line 1107, in getstr_grid
string, score, rec_scores = seg2text(pos, mask_index, seg)
File "./lib/core/test.py", line 1216, in seg2text
im2, contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
ValueError: need more than 2 values to unpack
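The ValueError means the call returned two values where the code unpacks three. cv2.findContours returns (image, contours, hierarchy) in OpenCV 3.x and (contours, hierarchy) in OpenCV 2.x and 4.x, so the line in seg2text only works with 3.x. A minimal sketch of a version-independent unpacking; unpack_contours is a hypothetical helper, and plain tuples stand in for real OpenCV results so the example runs without cv2 installed:

```python
def unpack_contours(result):
    """Return (contours, hierarchy) from a findContours result of length 2 or 3.

    The contours and hierarchy are always the last two items, whichever
    OpenCV version produced the tuple.
    """
    return result[-2], result[-1]


# Stand-ins for the two return shapes.
opencv3_style = ("image", ["contour_a"], "hierarchy")
opencv4_style = (["contour_a"], "hierarchy")

print(unpack_contours(opencv3_style))
print(unpack_contours(opencv4_style))
```

In the real code this would wrap the existing call, for example contours, hierarchy = unpack_contours(cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)).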
The model cannot be downloaded.
Is it because I haven't installed Detectron?
First, thank you for releasing the code.
Could you share the fine-tuned model?
I hope to hear from you.
Traceback (most recent call last):
File "tools/train_net.py", line 370, in <module>
main()
File "tools/train_net.py", line 197, in main
checkpoints = train_model()
File "tools/train_net.py", line 212, in train_model
setup_model_for_training(model, output_dir)
File "tools/train_net.py", line 318, in setup_model_for_training
workspace.CreateNet(model.net)
File "/opt/conda3/lib/python3.6/site-packages/caffe2/python/workspace.py", line 171, in CreateNet
StringifyProto(net), overwrite,
File "/opt/conda3/lib/python3.6/site-packages/caffe2/python/workspace.py", line 197, in CallWithExceptionIntercept
return func(*args, **kwargs)
RuntimeError: [enforce fail at operator.cc:46] blob != nullptr. op AffineChannel: Encountered a non-existing input blob: gpu_0/res_conv1_bn_s
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7fa03a922fe1 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libc10.so)
frame #1: c10::ThrowEnforceNotMet(char const*, int, char const*, std::string const&, void const*) + 0x49 (0x7fa03a922c29 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libc10.so)
frame #2: caffe2::OperatorBase::OperatorBase(caffe2::OperatorDef const&, caffe2::Workspace*) + 0x4ec (0x7f9fe910d71c in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #3: + 0x277bd8c (0x7f9f7eb9cd8c in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #4: + 0x277c42e (0x7f9f7eb9d42e in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2_gpu.so)
frame #5: std::_Function_handler<std::unique_ptr<caffe2::OperatorBase, std::default_delete<caffe2::OperatorBase> > (caffe2::OperatorDef const&, caffe2::Workspace*), std::unique_ptr<caffe2::OperatorBase, std::default_delete<caffe2::OperatorBase> > (*)(caffe2::OperatorDef const&, caffe2::Workspace*)>::_M_invoke(std::_Any_data const&, caffe2::OperatorDef const&, caffe2::Workspace*) + 0xf (0x7fa03ab8cb7f in /opt/conda3/lib/python3.6/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-36m-x86_64-linux-gnu.so)
frame #6: + 0x14058c7 (0x7f9fe910b8c7 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #7: + 0x1407eee (0x7f9fe910deee in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #8: caffe2::CreateOperator(caffe2::OperatorDef const&, caffe2::Workspace*, int) + 0x340 (0x7f9fe910e4a0 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #9: caffe2::dag_utils::prepareOperatorNodes(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x14fc (0x7f9fe90fb63c in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #10: caffe2::AsyncNetBase::AsyncNetBase(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x287 (0x7f9fe90e6657 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #11: caffe2::AsyncSchedulingNet::AsyncSchedulingNet(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x9 (0x7f9fe90ebef9 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #12: + 0x13e76be (0x7f9fe90ed6be in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #13: + 0x13e756f (0x7f9fe90ed56f in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #14: caffe2::CreateNet(std::shared_ptr<caffe2::NetDef const> const&, caffe2::Workspace*) + 0x6b0 (0x7f9fe90df8b0 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #15: caffe2::Workspace::CreateNet(std::shared_ptr<caffe2::NetDef const> const&, bool) + 0x102 (0x7f9fe913a662 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #16: caffe2::Workspace::CreateNet(caffe2::NetDef const&, bool) + 0x8e (0x7f9fe913abee in /opt/conda3/lib/python3.6/site-packages/caffe2/python/../../torch/lib/libcaffe2.so)
frame #17: + 0x54767 (0x7fa03ab84767 in /opt/conda3/lib/python3.6/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-36m-x86_64-linux-gnu.so)
frame #18: + 0x8bb5e (0x7fa03abbbb5e in /opt/conda3/lib/python3.6/site-packages/caffe2/python/caffe2_pybind11_state_gpu.cpython-36m-x86_64-linux-gnu.so)
frame #45: __libc_start_main + 0xf0 (0x7fa046810830 in /lib/x86_64-linux-gnu/libc.so.6)
Is it because I don't use model_iter79999.pkl to initialize the model?
When I run python2 tools/infer.py --im 1.jpg --rpn-pkl models/model_iter79999.pkl --rpn-cfg configs/text/make_textspotter.yaml --output-dir restuls
I get this error:
import datasets.dummy_datasets as dummy_datasets
ImportError: No module named dummy_datasets
Can you help me? Thanks.