ENGRAFO_MANUSCRIPT-DIGITIZER

Overview:

The Manuscript Digitization Project aims to digitally preserve handwritten manuscripts of significant scientific, historical, or aesthetic value that are at least 75 years old. Manuscripts are primary sources of historical information and represent an ancient and rich cultural heritage, particularly in India, where the largest collection of manuscripts exists. These manuscripts are being lost over time to natural decay and other factors. To counteract this, the project digitizes them through a web application. Users capture images of handwritten manuscripts with a camera or upload them from their gallery. The text in the images is then detected and extracted, and the content is converted to PDF format for preservation.

Key Features:

  • Digitize handwritten manuscripts through a user-friendly web application.

  • Upload multiple manuscript images at once.

  • Detect and extract text from the manuscript images.

  • Extract text in six languages: Sanskrit, Telugu, Marathi, Bengali, Arabic, and Tamil.

  • Convert the extracted text to PDF format for preservation.

  • Help preserve the scientific, historical, and aesthetic knowledge contained in these manuscripts.
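The text extraction step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes OCR is done with EasyOCR (which is pinned in the requirements), and the function names `detections_to_lines` and `extract_text`, the confidence threshold, the line tolerance, and the `("ta", "en")` language pair are all hypothetical choices. The grouping logic assumes a left-to-right script; Arabic text would need the reading order within each line reversed.

```python
from typing import List, Sequence, Tuple

# One OCR detection in the shape EasyOCR's readtext() returns by default:
# (bounding box as four [x, y] corners starting top-left, text, confidence).
Detection = Tuple[Sequence[Sequence[float]], str, float]


def _top(det: Detection) -> float:
    """Y coordinate of the top-left corner of the bounding box."""
    return det[0][0][1]


def _left(det: Detection) -> float:
    """X coordinate of the top-left corner of the bounding box."""
    return det[0][0][0]


def detections_to_lines(
    detections: Sequence[Detection],
    min_confidence: float = 0.3,   # assumed threshold, tune per manuscript
    line_tolerance: float = 15.0,  # assumed max vertical gap (pixels) within a line
) -> List[str]:
    """Drop low-confidence detections and group the rest into text lines,
    ordered top to bottom and, within a line, left to right."""
    kept = sorted(
        (d for d in detections if d[2] >= min_confidence),
        key=_top,
    )
    rows: List[List[Detection]] = []
    for det in kept:
        # Join the current row if this box starts near the row's first box.
        if rows and abs(_top(det) - _top(rows[-1][0])) <= line_tolerance:
            rows[-1].append(det)
        else:
            rows.append([det])
    return [" ".join(d[1] for d in sorted(row, key=_left)) for row in rows]


def extract_text(image_path: str, languages: Sequence[str] = ("ta", "en")) -> str:
    """Run EasyOCR on one image and return the recognized text.

    The import is deferred because creating a Reader loads the OCR models
    (and may download them on first use).
    """
    import easyocr

    reader = easyocr.Reader(list(languages), gpu=False)
    return "\n".join(detections_to_lines(reader.readtext(image_path)))
```

Keeping the grouping logic separate from the OCR call means it can be checked with hand-written detections, without loading any models. For several uploaded images, `extract_text` would be called once per file and the results concatenated before the PDF step.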

Motivation:

India possesses an extensive collection of manuscripts written in various languages and scripts, including Sanskrit. These manuscripts hold invaluable knowledge in medicine, science, mathematics, literature, art, architecture, theology, philosophy, music, dance, and more. Preserving them is vital not only for safeguarding the information they contain but also for understanding and appreciating the history and culture of the nation. Traditional methods of preservation are still in use, but modern techniques of digital preservation are becoming increasingly important. The government, along with institutions such as the National Library, the National Informatics Centre, and the National Archives, is actively involved in preserving cultural heritage through strategies and policies. However, the manuscripts are scattered among different organizations and individuals, so digitizing and preserving them with modern technologies requires collaborative effort.

Requirements:

-f https://download.pytorch.org/whl/torch_stable.html

Flask==1.1.2

ipython==7.8.0

ipython-genutils==0.2.0

easyocr==1.2.2

numpy==1.16.5

numpydoc==0.9.1

gunicorn==19.9.0

torch==1.8.1+cpu

torchvision==0.9.1+cpu

opencv-python-headless==4.2.0.32

openpyxl==3.0.0

Website pages:

The site has five pages: Home, Help, Digitizer, Archive, and Feedback.

Contributors:

sachan99
