License: Apache License 2.0

Semantic-Aware-Video-Text-Detection

Introduction

This is a PyTorch implementation of the CVPR 2021 paper Semantic-Aware-Video-Text-Detection.

Installation

The code is based on the MMDetection (2.11.0) framework.

Requirements:

  • Python 3.6+
  • PyTorch 1.3+ and a torchvision version that matches the PyTorch installation
  • CUDA 9.2+
  • GCC 5+
  • MMCV

# install the mmcv
pip install mmcv-full==1.3.9
# clone our model
git clone https://github.com/zjb-1/Semantic-Aware-Video-Text-Detection.git
# install the cocoapi
cd Semantic-Aware-Video-Text-Detection/cocoapi/PythonAPI
python setup.py build_ext install
# install our model
cd ../../
pip install -r requirements.txt
pip install -v -e .
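
After installation, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch, not part of the repository: the helper name installed_versions is hypothetical, the distribution names torch, torchvision and mmcv-full are taken from the requirements and the pip command above, and importlib.metadata assumes Python 3.8 or newer (later than the 3.6 minimum listed above).

```python
from importlib import metadata  # in the standard library from Python 3.8


def installed_versions(packages=("torch", "torchvision", "mmcv-full")):
    """Map each distribution name to its installed version, or None if absent.

    Hypothetical helper: it only reads package metadata and does not import
    the packages, so it cannot tell whether CUDA is usable.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If mmcv-full is reported with a version other than 1.3.9, it does not match the version pinned in the commands above.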

Models

If you need a pre-trained or trained model, please contact the author.

Datasets

  • The video dataset format is as follows:

    dataset
    ├─Video1
    │    ├─1.jpg
    │    ├─1.txt
    │    ├─2.jpg
    │    ├─2.txt
    │    └─...
    ├─Video2
    │    ├─1.jpg
    │    ├─1.txt
    │    ├─2.jpg
    │    ├─2.txt
    │    └─...
    ├─ ...    
    

    Each txt file uses the following format (corner points arranged clockwise, then the text, then the id):

    x1,y1,x2,y2,x3,y3,x4,y4 text id

  • Then run train_label_gen.py / test_label_gen.py to generate the label files. Remember to update the file paths inside each script first.
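
The dataset layout and annotation format described above can be read with a few lines of Python. This is a minimal sketch under stated assumptions, not the logic of train_label_gen.py: the names TextInstance, parse_annotation_line and list_frame_pairs are hypothetical, the first space in a line is assumed to end the coordinates, the last space is assumed to start the id (so the text itself may contain spaces), and ids are assumed to be integers.

```python
from pathlib import Path
from typing import List, NamedTuple, Tuple


class TextInstance(NamedTuple):
    """One annotated text region in a frame (hypothetical container)."""
    points: List[Tuple[float, float]]  # four corners, clockwise
    text: str
    track_id: int


def parse_annotation_line(line: str) -> TextInstance:
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4 text id' into a TextInstance."""
    # Assumption: the first space ends the coordinates, the last starts the id.
    coords_part, rest = line.strip().split(" ", 1)
    text, id_part = rest.rsplit(" ", 1)
    values = [float(v) for v in coords_part.split(",")]
    if len(values) != 8:
        raise ValueError(f"expected 8 coordinates, got {len(values)}")
    # Pair up x and y values: (x1, y1), (x2, y2), (x3, y3), (x4, y4).
    points = list(zip(values[0::2], values[1::2]))
    return TextInstance(points, text, int(id_part))


def list_frame_pairs(video_dir) -> List[Tuple[Path, Path]]:
    """Return (image, label) path pairs of one video, sorted by frame number."""
    pairs = []
    for image in Path(video_dir).glob("*.jpg"):
        label = image.with_suffix(".txt")
        # Keep only numerically named frames that have a matching txt file.
        if image.stem.isdigit() and label.exists():
            pairs.append((image, label))
    return sorted(pairs, key=lambda pair: int(pair[0].stem))
```

Sorting by the integer frame number rather than by file name keeps 10.jpg after 2.jpg, which matters when frame order is used for tracking.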

Training

Before training, you need to modify the config file (mask_track_rcnn_r50_fpn.py) and the shell script (train.sh).

# training
bash train.sh

Evaluation

Before evaluation, you need to modify the test shell script (test.sh).

# test
bash test.sh

Running the test script produces visualized results.

semantic-aware-video-text-detection's Issues

Need a trained model

Dear author,
I’m Liu Hongen, a postgraduate student at Tianjin University. I am doing research on video text tracking, which is closely related to your paper “Semantic-Aware-Video-Text-Detection”, published at CVPR 2021.
Your paper is very interesting and has given me great inspiration. I am now trying to reproduce your proposed method for the ICDAR 2023 Video Text Reading Competition for Dense and Small Text (DS Text). However, due to limited computing resources, the results of training from scratch are less than satisfactory. Would you please share your trained model with me, for research purposes only, by sending it to my email [email protected]?
Your kind help will be highly appreciated.