
penny4860 / svhn-deep-digit-detector


Deep digit detector (and recognizer) for natural scenes. The digit detection framework is implemented in Keras with the TensorFlow backend.

License: MIT License

Python 100.00%
svhn keras detection tensorflow

svhn-deep-digit-detector's Issues

package setup

Let's set up the package using setup.py.

No module named 'crop'

I am trying to get this digit detector up and running, but have not gotten very far. Running 1_sample_loader.py depends on extractor.py, which imports region_proposal.py.

region_proposal.py imports 'crop' and 'show'. Where do these come from? No modules with these names appear in digit_detector.yml, which the project describes as "A list of all the packages needed to run this project".

positive samples

  • Recognition is currently not accurate.
  • Try adding region proposals whose overlap is 75% or more to the positive samples.
  • Add a small margin (padding) around the ground truth.
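
A minimal sketch of the two ideas above, assuming that "overlap" means intersection over union (IoU) and that boxes are (x1, y1, x2, y2) tuples. The function names and the margin ratio are hypothetical and do not come from the repository.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_positives(proposals, ground_truths, threshold=0.75):
    """Keep proposals that overlap some ground-truth box by at least `threshold`."""
    return [p for p in proposals
            if any(iou(p, g) >= threshold for g in ground_truths)]


def add_margin(box, ratio, img_w, img_h):
    """Grow a box by `ratio` of its size on each side, clamped to the image."""
    x1, y1, x2, y2 = box
    mx, my = (x2 - x1) * ratio, (y2 - y1) * ratio
    return (max(0, x1 - mx), max(0, y1 - my),
            min(img_w, x2 + mx), min(img_h, y2 + my))
```

If the project measures overlap differently (for example, intersection divided by the ground-truth area), only `iou` would need to change.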

MSER

At which stage of the pipeline is the MSER algorithm applied?

train detector

  • 32x32x1
  • Positive Samples
    • svhn matlab file load
  • Negative Samples
    • random cropping in natural scene
        1. remove digit region in SVHN natural scene
        2. random crop
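
One way to approximate the negative-sample step is sketched below. Instead of erasing digit regions from the image first, it uses rejection sampling: random 32x32 crops are drawn and any crop that touches a digit box is discarded. This is a swapped-in simplification, and the function names, box format (x1, y1, x2, y2), and `max_tries` limit are assumptions.

```python
import random


def intersects(a, b):
    """True if boxes a and b, given as (x1, y1, x2, y2), share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def sample_negative_boxes(img_w, img_h, digit_boxes, n, size=32,
                          max_tries=1000, rng=None):
    """Draw up to `n` random size x size crops that avoid every digit box."""
    if img_w < size or img_h < size:
        return []  # image too small to hold a single crop
    rng = rng or random.Random()
    negatives = []
    for _ in range(max_tries):
        if len(negatives) == n:
            break
        x = rng.randint(0, img_w - size)  # randint is inclusive on both ends
        y = rng.randint(0, img_h - size)
        crop = (x, y, x + size, y + size)
        if not any(intersects(crop, d) for d in digit_boxes):
            negatives.append(crop)
    return negatives
```

The function may return fewer than `n` boxes when digits cover most of the image, so callers should check the length of the result.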

Format annotation file

I tried to load the data with "1_sample_loader.py", which requires a digitStruct.json file. The train.tar.gz archive contains only a digitStruct.mat file. Did you convert it from .mat to .json in MATLAB?

data augmentation

  • Translation
  • Rotation
    • -15 ~ +15 degree
  • Noise

Use the Keras image preprocessing library: https://keras.io/preprocessing/image/
https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
http://pastebin.com/0QHtPGzJ
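
To make the three augmentations concrete, here is a small NumPy-only sketch rather than the Keras preprocessing API linked above. Only the -15 to +15 degree rotation range comes from this issue; the shift range, noise level, nearest-neighbour sampling, and the function name `augment` are assumptions for illustration.

```python
import numpy as np


def augment(patch, rng, max_angle_deg=15.0, max_shift=2, noise_std=0.02):
    """Randomly rotate, translate, and add Gaussian noise to a 2-D patch in [0, 1]."""
    h, w = patch.shape
    angle = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg))
    dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)

    # Inverse mapping: for every output pixel, look up the source pixel.
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    x0, y0 = xs - cx - dx, ys - cy - dy
    src_x = np.cos(angle) * x0 + np.sin(angle) * y0 + cx
    src_y = -np.sin(angle) * x0 + np.cos(angle) * y0 + cy

    # Nearest-neighbour sampling; pixels that fall outside the patch stay 0.
    sx = np.rint(src_x).astype(int)
    sy = np.rint(src_y).astype(int)
    inside = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros((h, w), dtype=np.float64)
    out[inside] = patch[sy[inside], sx[inside]]

    out += rng.normal(0.0, noise_std, size=out.shape)
    return np.clip(out, 0.0, 1.0)
```

A typical call would be `augment(patch, np.random.default_rng(0))` on a 32x32 grayscale patch scaled to [0, 1].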

evaluate detector's performance

  • Performance of the region proposer

    • recall value : 0.630
    • precision value : 0.045
    • f1_score : 0.084
  • Performance of the detector

    • recall value : 0.487
    • precision value : 0.656
    • f1_score : 0.559
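
As a quick consistency check, the reported f1_score values follow from the usual harmonic-mean definition, F1 = 2PR / (P + R). The helper name below is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(round(f1_score(0.045, 0.630), 3))  # region proposer -> 0.084
print(round(f1_score(0.656, 0.487), 3))  # detector        -> 0.559
```

The region proposer trades very low precision for higher recall, which is what one would expect from a stage whose job is to avoid missing digits before the classifier filters the candidates.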

pruning

Prune region proposals whose width is greater than their height.
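
A possible one-line filter for this rule, assuming proposals are (x1, y1, x2, y2) tuples. The function name is hypothetical, and keeping square boxes (width equal to height) is an assumption.

```python
def prune_wide_boxes(boxes):
    """Keep only boxes (x1, y1, x2, y2) that are not wider than they are tall."""
    return [b for b in boxes if (b[2] - b[0]) <= (b[3] - b[1])]
```

The rationale is that a single digit is usually taller than it is wide, so wide proposals are unlikely to contain exactly one digit.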

Improving Idea

  • Initialization
    • Transfer learning from a pre-trained network
  • Sample the validation data only from the original samples
  • Crop with a margin around the bounding box
    • Option: use the 32x32 mat file as-is
    • Option: add a margin proportional to the bounding box size
  • Apply data augmentation to the negative samples as well
    • At run time, during training
  • Add more hard-negative samples

Pickle

The trained model and the get_preds function work fine, but when I export the trained model as a pickle file and then try to use it, it doesn't work. Maybe it is a fastai issue. After I updated fastai to the latest version, none of the previous functions run. Does anyone have a pkl, hdf5, or any other trained model I can use?

evaluation script

Modify the code so that mAP can be computed when a single image has multiple ground truths.

  • Implement evaluation for multiple objects
  • Set up the script so that mAP can be computed on the test data
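
The sketch below shows one common way to compute average precision when an image can hold several ground-truth boxes. It follows the PASCAL VOC style of matching (detections sorted by score, each ground truth matched at most once, all-point interpolation), which is an assumption and not necessarily what the repository's script does. For a single "digit" class, mAP equals this AP; with several classes, the per-class values would be averaged. All names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """detections: list of (image_id, score, box).
    ground_truths: dict mapping image_id to a list of boxes."""
    n_gt = sum(len(boxes) for boxes in ground_truths.values())
    if n_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in ground_truths.items()}

    hits = []  # 1 for a true positive, 0 for a false positive, in score order
    for image_id, _score, box in sorted(detections, key=lambda d: d[1], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truths.get(image_id, [])):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= iou_threshold and not matched[image_id][best_idx]:
            matched[image_id][best_idx] = True  # each ground truth counts once
            hits.append(1)
        else:
            hits.append(0)

    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / n_gt)

    # Make precision non-increasing from right to left, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

Tracking a `matched` flag per ground-truth box is what makes the multi-object case work: a second detection on an already-matched digit is counted as a false positive instead of a second hit.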

Resizing Candidate Proposals

Implement code that runs the trained model (32x32x1 => digit or not) on natural images.

  • Input / output
    • Input : Natural Image
    • Output : Candidate Regions whose shape is (N, 32, 32, 1)
  • To implement
    • Convert the input image to grayscale
    • Find candidate regions with MSER
    • Resize each candidate region to 32x32x1
      • (w >= h) : rescale to 32x32
      • (w < h) : crop so that w = h, then rescale to 32x32
        • How should regions at the edges of the natural image be handled?
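
The box geometry for the two resize cases can be sketched as follows, assuming integer (x1, y1, x2, y2) boxes. The issue leaves edge handling as an open question; shifting the widened box back inside the image is one possible answer and is an assumption here, as is the function name.

```python
def square_crop_box(box, img_w, img_h):
    """Return the region to crop from the image before rescaling to 32x32.

    Boxes are integer (x1, y1, x2, y2) tuples inside an img_w x img_h image.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    if w >= h:
        # Wide or square region: rescaled directly to 32x32.
        return box
    # Tall region: widen it around its centre so that width equals height.
    side = min(h, img_w)  # cannot be wider than the image itself
    cx = (x1 + x2) / 2.0
    new_x1 = int(round(cx - side / 2.0))
    # Edge handling (assumption): shift the box back inside the image.
    new_x1 = max(0, min(new_x1, img_w - side))
    return (new_x1, y1, new_x1 + side, y2)
```

The returned region would then be cropped from the grayscale image and rescaled to 32x32, and the N patches stacked into an array of shape (N, 32, 32, 1). If the region is taller than the image is wide, the result is not square and the rescale will distort it.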

imshow() bug

In 5_run.py, if there are more than 2 test images, the 2nd image is not displayed.

The error message is as follows:

init done 
opengl support available 
