hciilab / derpn

A novel region proposal network for more general object detection ( including scene text detection ).

Shell 0.48% Makefile 0.27% MATLAB 0.41% Python 62.63% C++ 32.09% Cuda 2.62% C 0.11% CMake 1.13% Dockerfile 0.03% HTML 0.08% CSS 0.10% Batchfile 0.06% PowerShell 0.01%
object-detection scene-text-detection region-proposals object-proposals rpn scene-text detection text-detection

derpn's People

Contributors

hciilab, lele-xie

derpn's Issues

Code optimization

The paper is very interesting, and thank you very much for sharing the code.
I got a NaN loss when training with more than one image per batch. After checking, I found that, besides the assertion code, the following code in generate_derpn_labels_targets_layer.py also needs to be modified.
Old
#############
for i in range(cls_num_ofchannel.shape[1]):
    cls_weights[0, i, :, :].fill(cls_num_ofchannel[0, i])
for i in range(reg_num_ofchannel.shape[1]):
    reg_weights[0, i, :, :].fill(reg_num_ofchannel[0, i])
#############
New
#############
for ith_im in range(num_images):
    for i in range(cls_num_ofchannel.shape[1]):
        cls_weights[ith_im, i, :, :].fill(cls_num_ofchannel[ith_im, i])
    for i in range(reg_num_ofchannel.shape[1]):
        reg_weights[ith_im, i, :, :].fill(reg_num_ofchannel[ith_im, i])
#############
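
For reference, the per-image loop in the "New" version can also be written with NumPy broadcasting. This is only a minimal sketch: the helper names `fill_weights_loop` and `fill_weights_broadcast` are hypothetical, and the shapes are assumptions inferred from the indexing in the snippet, namely (num_images, num_channels) for the per-channel counts and (num_images, num_channels, height, width) for the weights.

```python
import numpy as np


def fill_weights_loop(num_ofchannel, height, width):
    """Loop version, mirroring the 'New' snippet (shapes are assumed)."""
    num_images, num_channels = num_ofchannel.shape
    weights = np.zeros((num_images, num_channels, height, width), dtype=np.float32)
    for ith_im in range(num_images):
        for i in range(num_channels):
            # Every spatial position of channel i in image ith_im gets the same value.
            weights[ith_im, i, :, :].fill(num_ofchannel[ith_im, i])
    return weights


def fill_weights_broadcast(num_ofchannel, height, width):
    """Same result without Python loops, using broadcasting."""
    num_images, num_channels = num_ofchannel.shape
    # Add two trailing axes so each (image, channel) value spreads over (height, width).
    expanded = num_ofchannel[:, :, None, None]
    target_shape = (num_images, num_channels, height, width)
    return np.broadcast_to(expanded, target_shape).astype(np.float32)
```

Both helpers fill every image in the batch, whereas the "Old" snippet only writes to index 0 and leaves the weights of the other images at their initial values.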

Add code

If you wouldn't mind, please close this issue only after the code has been fully added. This issue is for tracking purposes.

About some code in tools/ron/generate_derpn_labels_targets_layer.py: self.eval_obs_region(gt_boxes)

def eval_obs_region(self, gt_boxes):
    height, width = self.height, self.width
    observe_region = np.zeros((height, width), np.int32)
    # Scale the ground-truth boxes from image coordinates to feature-map coordinates.
    scaled_gt = gt_boxes[:, :4].copy() / float(self._feat_stride)
    gt_w = scaled_gt[:, 2] - scaled_gt[:, 0]
    gt_h = scaled_gt[:, 3] - scaled_gt[:, 1]
    gt_cx = (scaled_gt[:, 2] + scaled_gt[:, 0]) * 0.5
    gt_cy = (scaled_gt[:, 3] + scaled_gt[:, 1]) * 0.5

    # Scale each box around its center by extend_ratio, then clip to the feature map.
    start_x = np.maximum((gt_cx - gt_w * self.extend_ratio * 0.5).astype(np.int32), 0)
    end_x = np.minimum((gt_cx + gt_w * self.extend_ratio * 0.5).astype(np.int32), width - 1)
    start_y = np.maximum((gt_cy - gt_h * self.extend_ratio * 0.5).astype(np.int32), 0)
    end_y = np.minimum((gt_cy + gt_h * self.extend_ratio * 0.5).astype(np.int32), height - 1)
    # Mark every feature-map cell covered by an extended box.
    for ith in range(gt_boxes.shape[0]):
        observe_region[start_y[ith]:end_y[ith] + 1, start_x[ith]:end_x[ith] + 1] = 1
    return observe_region, gt_cx, gt_cy

As I understand it, the code above maps the ground-truth boxes onto the feature map, and start_x, end_x, start_y, and end_y are the corresponding coordinates on the feature map.
But what does self.extend_ratio mean, and why is it set to 1.2?
Maybe I have misunderstood what the code above actually does.
Thanks a lot.
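
To make the numerical effect of extend_ratio concrete, here is a minimal standalone sketch of the same computation. It is not the repository's code: the function takes `height`, `width`, `feat_stride`, and `extend_ratio` as plain parameters instead of reading them from `self`, and the feature-map size, stride, and box are made-up example values. It only shows what the parameter does to the marked region, not why the value 1.2 was chosen.

```python
import numpy as np


def eval_obs_region(gt_boxes, height, width, feat_stride, extend_ratio):
    """Standalone variant of the method quoted above (parameters are assumptions)."""
    observe_region = np.zeros((height, width), np.int32)
    # Scale boxes (x1, y1, x2, y2) from image coordinates to feature-map coordinates.
    scaled_gt = gt_boxes[:, :4].astype(np.float64) / float(feat_stride)
    gt_w = scaled_gt[:, 2] - scaled_gt[:, 0]
    gt_h = scaled_gt[:, 3] - scaled_gt[:, 1]
    gt_cx = (scaled_gt[:, 2] + scaled_gt[:, 0]) * 0.5
    gt_cy = (scaled_gt[:, 3] + scaled_gt[:, 1]) * 0.5

    # Scale each box around its center by extend_ratio, then clip to the map.
    start_x = np.maximum((gt_cx - gt_w * extend_ratio * 0.5).astype(np.int32), 0)
    end_x = np.minimum((gt_cx + gt_w * extend_ratio * 0.5).astype(np.int32), width - 1)
    start_y = np.maximum((gt_cy - gt_h * extend_ratio * 0.5).astype(np.int32), 0)
    end_y = np.minimum((gt_cy + gt_h * extend_ratio * 0.5).astype(np.int32), height - 1)
    for ith in range(gt_boxes.shape[0]):
        observe_region[start_y[ith]:end_y[ith] + 1, start_x[ith]:end_x[ith] + 1] = 1
    return observe_region


# Hypothetical example: one box on a 10x10 feature map with stride 16.
gt_boxes = np.array([[32, 32, 96, 64]], dtype=np.float32)
tight = eval_obs_region(gt_boxes, 10, 10, feat_stride=16, extend_ratio=1.0)
extended = eval_obs_region(gt_boxes, 10, 10, feat_stride=16, extend_ratio=1.2)
print(tight.sum(), extended.sum())  # prints "15 24"
```

With a ratio of 1.0 the marked region is exactly the scaled box (columns 2 to 6, rows 2 to 4). With 1.2 the box grows by 20% around its center, so after truncation to integers the region also covers column 1 and row 1.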
