canjie-luo / text-image-augmentation

Geometric Augmentation for Text Image

License: MIT License

CMake 1.78% Python 1.66% C++ 96.56%

opencv recognition scene-text detection image-transformations

text-image-augmentation's Issues

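Hello author, how should this project be modified to support Python 3? (Installing Boost myself has been very difficult, and importing the .so dynamic module fails under Python 3.)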

Floating point exception (core dumped) when processing images of different sizes

If I resize all the images to the same size before transforming, there is no error.
By the way, all the images have height = 70 and width in (200, 400), so none of them is very small.
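
One possible cause, not confirmed anywhere in this thread: on x86 Linux an integer division or modulo by zero is reported as "floating point exception (core dumped)". The Augment.cpp excerpt quoted in a later issue on this page uses `rand()%threshold`; how `threshold` is computed is not shown there, so the idea that it can reach zero for some image sizes is an assumption. The helper below, `safeJitter`, is hypothetical and not part of the repository. It only sketches how such a modulo could be guarded.

```cpp
#include <cstdlib>

// Hypothetical helper, not part of the repository.
// Returns a random offset in [0, threshold), or 0 when threshold is not
// positive. Evaluating rand() % 0 is undefined behaviour in C++ and
// typically raises SIGFPE on x86 Linux, which the shell reports as
// "floating point exception".
int safeJitter(int threshold) {
    if (threshold <= 0) {
        return 0;  // no room to jitter, keep the point where it is
    }
    return std::rand() % threshold;
}
```

Printing `threshold` and the image size just before the crash would confirm or rule out this explanation.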

About the agent updating and initialization

I have two questions about the nice paper "Learn to Augment: Joint Data Augmentation and Network Optimization
for Text Recognition":

1. In Line 9 of Algorithm 1, why is the agent network updated towards -S'? I don't understand why -S' is a harder moving state.
2. As for the agent initialization, what is the initial direction of the 2*(N+1) fiducial points?

cmake problem

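My test environment was Ubuntu 16.04. I did not install Boost with Anaconda as the instructions describe, yet the build succeeded and produced the Argument.so file.

On the server, which runs CentOS 7.4, the same approach fails. I suspect that either the Boost 1.67 installation went wrong or this is a CMake error.

Boost installation steps:

Download boost_1_67_0.tar.gz
Extract the archive and cd into it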
./bootstrap.sh --with-libraries=all --with-python=/home/kongtianning/anaconda3/envs/python2712/bin/python --with-python-version=2.7 --with-python-root=/home/kongtianning/anaconda3/envs/python2712 --prefix=/home/kongtianning/myboost
./b2
./b2 install

Next I followed your instructions. On Ubuntu, with the system Python 2, there was no problem, but on CentOS it fails.
mkdir build
cd build
cmake -D CUDA_USE_STATIC_CUDA_RUNTIME=OFF ..

On CentOS I ran cmake like this:

cmake -DPYTHON_INCLUDE_DIR=/home/kongtianning/anaconda3/envs/python2712/include/python2.7 -DPYTHON_LIBRARY=/home/kongtianning/anaconda3/envs/python2712/lib/ -DPYTHON_EXECUTABLE=/home/kongtianning/anaconda3/envs/python2712/bin/python -D CUDA_USE_STATIC_CUDA_RUNTIME=OFF ..

The result is that boost_python cannot be found:

CMake Error at /usr/share/cmake/Modules/FindBoost.cmake:1138 (message):
Unable to find the requested Boost libraries.

Boost version: 1.67.0

Boost include path: /usr/local/include

Could not find the following Boost libraries:

      boost_python

No Boost libraries were found. You may need to set BOOST_LIBRARYDIR to the
directory containing Boost libraries or BOOST_ROOT to the location of
Boost.
Call Stack (most recent call first):
CMakeLists.txt:18 (find_package)

-- Configuring incomplete, errors occurred!
See also "/home/kongtianning/PycharmProjects/HanWangProJectPython/imageAugment/build/CMakeFiles/CMakeOutput.log".

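Hello author, could you provide the code for the joint training part? Are the recognition network and the agent network trained simultaneously?

I would like to ask how the recognition network is trained.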

Is this project still in development?

I see some promising features listed at the top of the README, but there has been no update since last year.

running into problems during make

Cloned the repo and tried building it.
Used the following command
cmake .. -D CUDA_USE_STATIC_CUDA_RUNTIME=OFF -DPYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") -DPYTHON_LIBRARY=$(python -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))")

During make, getting this error log
In file included from /usr/include/python2.7/numpy/ndarraytypes.h:1809:0,
from /usr/include/python2.7/numpy/ndarrayobject.h:18,
from /y/x/Text-Image-Augmentation/include/conversion.h:8,
from /y/x/Text-Image-Augmentation/src/conversion.cpp:1:
/usr/include/python2.7/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by "
^~~~~~~
/y/x/Text-Image-Augmentation/src/conversion.cpp:119:16: error: cannot declare variable 'g_numpyAllocator' to be of abstract type 'NumpyAllocator'
NumpyAllocator g_numpyAllocator;
^~~~~~~~~~~~~~~~
/y/x/Text-Image-Augmentation/src/conversion.cpp:64:7: note: because the following virtual functions are pure within 'NumpyAllocator':
class NumpyAllocator : public MatAllocator
^~~~~~~~~~~~~~
In file included from /usr/include/opencv2/core.hpp:59:0,
from /usr/include/opencv2/imgproc.hpp:46,
from /usr/include/opencv2/imgproc/imgproc.hpp:48,
from /y/x/Text-Image-Augmentation/include/conversion.h:5,
from /y/x/Text-Image-Augmentation/src/conversion.cpp:1:
/usr/include/opencv2/core/mat.hpp:417:23: note: virtual cv::UMatData* cv::MatAllocator::allocate(int, const int*, int, void*, size_t*, int, cv::UMatUsageFlags) const
virtual UMatData* allocate(int dims, const int* sizes, int type,
^~~~~~~~
/usr/include/opencv2/core/mat.hpp:419:18: note: virtual bool cv::MatAllocator::allocate(cv::UMatData*, int, cv::UMatUsageFlags) const
virtual bool allocate(UMatData* data, int accessflags, UMatUsageFlags usageFlags) const = 0;
^~~~~~~~
/usr/include/opencv2/core/mat.hpp:420:18: note: virtual void cv::MatAllocator::deallocate(cv::UMatData*) const
virtual void deallocate(UMatData* data) const = 0;
^~~~~~~~~~
/y/x/Text-Image-Augmentation/src/conversion.cpp: In member function 'cv::Mat NDArrayConverter::toMat(const PyObject*)':
/y/x/Text-Image-Augmentation/src/conversion.cpp:202:11: error: 'class cv::Mat' has no member named 'refcount'
m.refcount = refcountFromPyObject(o);
^~~~~~~~
/y/x/Text-Image-Augmentation/src/conversion.cpp: In member function 'PyObject* NDArrayConverter::toNDArray(const cv::Mat&)':
/y/x/Text-Image-Augmentation/src/conversion.cpp:223:12: error: 'class cv::Mat' has no member named 'refcount'
if(!p->refcount || p->allocator != &g_numpyAllocator)
^~~~~~~~
/y/x/Text-Image-Augmentation/src/conversion.cpp:230:36: error: 'class cv::Mat' has no member named 'refcount'
return pyObjectFromRefcount(p->refcount);
^~~~~~~~
CMakeFiles/Augment.dir/build.make:75: recipe for target 'CMakeFiles/Augment.dir/src/conversion.cpp.o' failed
make[2]: *** [CMakeFiles/Augment.dir/src/conversion.cpp.o] Error 1
CMakeFiles/Makefile2:72: recipe for target 'CMakeFiles/Augment.dir/all' failed
make[1]: *** [CMakeFiles/Augment.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

python version - 2.7.17
opencv - 3.3.0
numpy - 1.13.3

CMake fail

Thanks for your code. When I compiled it according to the readme.md, I got the following error:

Could NOT find PythonLibs (missing: PYTHON_LIBRARIES PYTHON_INCLUDE_DIRS)

I ran into trouble with 'make'

[ 12%] Building CXX object CMakeFiles/Augment.dir/src/conversion.cpp.o
In file included from /home/fbas/下载/Scene-Text-Image-Transformer-master/src/conversion.cpp:1:0:
/home/fbas/下载/Scene-Text-Image-Transformer-master/include/conversion.h:8:33: fatal error: numpy/ndarrayobject.h: No such file or directory
compilation terminated.
CMakeFiles/Augment.dir/build.make:62: recipe for target 'CMakeFiles/Augment.dir/src/conversion.cpp.o' failed
make[2]: *** [CMakeFiles/Augment.dir/src/conversion.cpp.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/Augment.dir/all' failed
make[1]: *** [CMakeFiles/Augment.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Why is oldDotL set by setDstPoints?

Could you help explain the following contradiction?

According to the paper, w_k is defined with respect to the fiducial point (control point) p_k, and hence oldDotL should represent the fiducial points here:

Text-Image-Augmentation/src/imgwarp_mls_similarity.cpp

Lines 59 to 60 in ab8e37a

    w[k] = 1 / ((i - oldDotL[k].x) * (i - oldDotL[k].x) +
                (j - oldDotL[k].y) * (j - oldDotL[k].y));

But instead oldDotL is set to the deformed positions:

Text-Image-Augmentation/src/imgwarp_mls.cpp

Lines 92 to 98 in ab8e37a

    void ImgWarp_MLS::setDstPoints(const vector<Point_<int> > &qdst) {
        nPoint = qdst.size();
        oldDotL.clear();
        oldDotL.reserve(nPoint);
        for (size_t i = 0; i < qdst.size(); i++) oldDotL.push_back(qdst[i]);
    }

Text-Image-Augmentation/src/Augment.cpp

Lines 41 to 53 in ab8e37a

    qdst.push_back(Point(rand()%threshold, rand()%threshold));
    qdst.push_back(Point(img_input.cols-rand()%threshold, rand()%threshold));
    qdst.push_back(Point(img_input.cols-rand()%threshold, img_input.rows-rand()%threshold));
    qdst.push_back(Point(rand()%threshold, img_input.rows-rand()%threshold));
    for (int i = 1; i < segment; i++){
        qsrc.push_back(Point(cut*i, 0));
        qsrc.push_back(Point(cut*i, img_input.rows));
        qdst.push_back(Point(cut*i+rand()%threshold-0.5*threshold, rand()%threshold-0.5*threshold));
        qdst.push_back(Point(cut*i+rand()%threshold-0.5*threshold, img_input.rows+rand()%threshold-0.5*threshold));
    }
    cv::Mat result = trans1.setAllAndGenerate(img_input, qsrc, qdst, img_input.cols, img_input.rows);

About Joint Training

I got 'float point exception: core dumped' transforming pics of short cols.

undefined symbol: _ZN2cv6formatB5cxx11EPKcz
