Angelina Braille Images Dataset

This dataset consists of labeled photos of Braille texts.

It includes 212 pages of two-sided printed Braille books and 28 handwritten student works. In addition, 44 non-Braille photos of various documents found on the Internet are included as negative examples. Each group of files is split into train and validation sets. The image lists are stored in train_*.txt and val_*.txt files in the respective directories:

group         train   val   total
books           169    43     212
handwritten      22     6      28
not braille      44     -      44
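The list files can be gathered with a few lines of Python. This is a minimal sketch, not the repository's own loader: the helper names `read_image_list` and `collect_split` are hypothetical, and it assumes each list file holds one image path per line, relative to the directory that contains the list file.

```python
from pathlib import Path


def read_image_list(list_file):
    """Return the non-empty lines of an image list file as Path objects."""
    list_path = Path(list_file)
    with list_path.open(encoding="utf-8") as f:
        names = [line.strip() for line in f if line.strip()]
    # Assumption: entries are relative to the directory holding the list file.
    return [list_path.parent / name for name in names]


def collect_split(group_dir, split):
    """Gather all images listed in <split>_*.txt files (split: 'train' or 'val')."""
    images = []
    for list_file in sorted(Path(group_dir).glob(f"{split}_*.txt")):
        images.extend(read_image_list(list_file))
    return images
```

Calling `collect_split(some_directory, "train")` would then return every image named in that directory's train_*.txt files.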
  

Label files are in LabelMe JSON format. For two-sided pages, only the front side is labeled.
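A LabelMe file is plain JSON with a "shapes" list, where each shape carries a "label" and a list of "points". The sketch below reads one file with the standard library and reduces every shape to its bounding box. The function name `read_labelme_rects` is hypothetical and is not the read_LabelMe_annotation function from the Angelina Braille Reader repository.

```python
import json


def read_labelme_rects(json_path):
    """Return (label, left, top, right, bottom) for each shape in a LabelMe file."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    boxes = []
    for shape in data.get("shapes", []):
        # Each point is an [x, y] pair; the bounding box covers all of them.
        xs = [point[0] for point in shape["points"]]
        ys = [point[1] for point in shape["points"]]
        boxes.append((shape["label"], min(xs), min(ys), max(xs), max(ys)))
    return boxes
```

Taking the bounding box of the points works for both rectangle and polygon shapes, so the sketch does not need to know which shape type the annotator used.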

Each Braille symbol is labeled in one of the following ways:

  • the corresponding plain text letter or symbol (mainly Russian letters)
  • '~number' or '~number~', where number lists the raised dots of the Braille symbol (for example, '~3456' for the number sign)
  • special marks such as '##' for the number sign
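The '~number' form can be decoded mechanically, since each digit names one raised dot. The following sketch is an illustration only: the helper names are hypothetical, and it assumes digits in the range 1 to 8, which the dataset description does not state. The mapping to Unicode uses the Braille Patterns block, where dot n sets bit n-1 above U+2800.

```python
import re

# '~number' or '~number~', where each digit names one raised dot.
# Assumption: only digits 1-8 occur.
_DOTS_LABEL = re.compile(r"~([1-8]+)~?")


def dots_from_label(label):
    """Return sorted dot numbers for a '~number' label, or None for other labels."""
    match = _DOTS_LABEL.fullmatch(label)
    if match is None:
        return None
    return sorted({int(digit) for digit in match.group(1)})


def dots_to_unicode(dots):
    """Map dot numbers to the matching Unicode Braille pattern character."""
    mask = 0
    for dot in dots:
        mask |= 1 << (dot - 1)  # dot n sets bit n-1
    return chr(0x2800 + mask)
```

For example, `dots_from_label('~3456')` gives `[3, 4, 5, 6]`, and `dots_to_unicode([3, 4, 5, 6])` gives U+283C, the Braille number sign.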

Tools for handling this dataset can be found in the Angelina Braille Reader repository.

The correspondence between Braille symbols and the corresponding letters is defined in the letters.py file. Some special symbols can be labeled in several ways; see the labeling_synonyms dict. Various tools for handling labels can be found in the label_tools.py file. A reading function for this dataset is implemented as read_LabelMe_annotation. An implementation of a PyTorch Dataset is also provided.
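Because one symbol can carry several labels, it can help to map every label to a single spelling before training or evaluation. The sketch below shows the idea with a stand-in table. It does not reproduce the contents or structure of labeling_synonyms; `EXAMPLE_SYNONYMS` and `canonical_label` are hypothetical names, and the only pair used is the one mentioned in the labeling rules ('##' and '~3456'). Treating '~number~' as equal to '~number' is also an assumption.

```python
# Hypothetical stand-in table; NOT the contents of labeling_synonyms.
# The single entry comes from the labeling rules: '##' and '~3456' mark the same sign.
EXAMPLE_SYNONYMS = {
    "##": "~3456",
}


def canonical_label(label, synonyms=EXAMPLE_SYNONYMS):
    """Map a label to one canonical spelling; unknown labels pass through unchanged."""
    label = label.strip()
    # Assumption: '~number~' and '~number' denote the same symbol.
    if label.startswith("~") and label.endswith("~") and len(label) > 2:
        label = label[:-1]
    return synonyms.get(label, label)
```

With this table, `canonical_label('##')` and `canonical_label('~3456~')` both return `'~3456'`, while ordinary letters are returned as they are.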

Contributors

  • ilyaovodov
