
anish-m-code / pdftotext


A simple PDF-to-text conversion tool for Windows 8.1/10/11 and Fedora-, Ubuntu-, Debian-, and Arch-based Linux distributions, built on poppler-utils and Google's tesseract-ocr.

License: MIT License

Python 100.00%
tesseract-ocr poppler-utils pdf-documents pdf pdftotext pdftools ocr ocr-recognition ocr-python ocr-text-reader hacktoberfest hacktoberfest-accepted hacktoberfest2022

pdftotext's Introduction

PDF TO TEXT CONVERTER

A simple Python script to convert PDF documents to text files.

Primary Supported Platforms

  • Debian / Debian Based Linux Distros
  • Ubuntu / Ubuntu Based Linux Distros
  • Fedora / Fedora Based Linux Distros
  • Arch Linux / Arch Linux Based Distros
  • Void Linux / Void Linux Based Distros
  • Windows 10 and later

Quick Installation

To install from PyPI:

Run the following command in a Linux terminal, Windows PowerShell, or Command Prompt:

pip install pdftotext3

Then run the following command inside the folder containing the PDF files to start converting them to text:

pdftotext

On Windows, additional software is required for this program to work properly; see the project's Windows Requirements. To run the program from a direct GitHub download instead, see the project's Instructions.
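Since the tool relies on external software, it can help to confirm that the required executables are on PATH before converting anything. The following is a minimal sketch, not the project's own check: the executable names `pdftoppm` (from poppler-utils) and `tesseract` (from tesseract-ocr) are assumptions about what the script calls, and `missing_tools` is a hypothetical helper.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names: pdftoppm ships with poppler-utils and
# tesseract with tesseract-ocr. The real script may call other tools.
REQUIRED_TOOLS = ("pdftoppm", "tesseract")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All external tools were found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` works the same way on Linux and Windows.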

NOTE: THIS TOOL IS MEANT TO CONVERT PDF DOCUMENTS THAT ARE NOT EASILY CONVERTIBLE TO OTHER FORMATS. CURRENTLY THIS TOOL SUPPORTS ENGLISH ONLY.
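For PDFs that are hard to convert, such as scans, a common approach with poppler-utils and tesseract-ocr is to render each page to an image and run OCR on it. The sketch below illustrates that approach under stated assumptions and is not taken from the project's source: it assumes the `pdftoppm` and `tesseract` executables are installed, and every function name here is hypothetical.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List


def pdftoppm_command(pdf: Path, image_prefix: Path, dpi: int = 300) -> List[str]:
    """Build the poppler-utils command that renders each page to a PNG."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf), str(image_prefix)]


def tesseract_command(image: Path, lang: str = "eng") -> List[str]:
    """Build the tesseract command that prints recognised text to stdout."""
    return ["tesseract", str(image), "stdout", "-l", lang]


def page_number(image: Path) -> int:
    """Read the page number from a name such as 'page-07.png'."""
    return int(image.stem.rsplit("-", 1)[1])


def pdf_to_text(pdf: Path, lang: str = "eng") -> str:
    """Render the pages of one PDF to images and OCR them in page order."""
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp) / "page"
        subprocess.run(pdftoppm_command(pdf, prefix), check=True)
        pages = []
        # Sort numerically so page 10 does not come before page 2.
        for image in sorted(Path(tmp).glob("page-*.png"), key=page_number):
            result = subprocess.run(
                tesseract_command(image, lang),
                check=True, capture_output=True, encoding="utf-8",
            )
            pages.append(result.stdout)
    return "\n".join(pages)


def convert_folder(folder: Path) -> None:
    """Write a .txt file next to every PDF in the folder."""
    for pdf in sorted(folder.glob("*.pdf")):
        pdf.with_suffix(".txt").write_text(pdf_to_text(pdf), encoding="utf-8")
```

Calling `convert_folder(Path.cwd())` would mirror the documented usage of running the tool inside the folder that holds the PDF files. The default `lang="eng"` reflects the English-only note above.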

pdftotext's People

Contributors

anish-m-code, raulgilabert, sayampradhan


pdftotext's Issues

Add support for non-English languages.

Currently pdftotext supports English only. Potential contributors may try to add support for non-English languages, simplify the installation and uninstallation of additional language packs, and make these features work on both Linux and Windows.
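One possible starting point for this issue is sketched below. Tesseract selects recognition languages through its `-l` option, with several codes joined by `+`, and `tesseract --list-langs` reports the installed language data. The helper names are hypothetical, and the header line format of the `--list-langs` output is an assumption that may differ between Tesseract versions.

```python
from typing import Iterable, List


def parse_list_langs(output: str) -> List[str]:
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Assumption: the first non-empty line is a header such as
    # "List of available languages (2):" and the rest are language codes.
    return lines[1:]


def language_argument(requested: Iterable[str], installed: Iterable[str]) -> str:
    """Join the requested codes with '+' for `-l`, rejecting missing ones."""
    requested = list(requested)
    if not requested:
        raise ValueError("At least one language code is required")
    available = set(installed)
    missing = [code for code in requested if code not in available]
    if missing:
        raise ValueError("Language data not installed: " + ", ".join(missing))
    return "+".join(requested)
```

Validating the requested codes first lets the tool report which language pack is missing instead of failing partway through a conversion. How the packs get installed differs per platform and is left open here.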
