A Node.js project that lets a client post an image to the API and returns the text found in the image.
- nodejs (>= 6.9.0) installed on your server
- npm installed on your server
- Homebrew installed and functional (only if you intend to follow the steps below to install tesseract-ocr). Note: if you don't use a Mac, you will need to substitute the equivalent commands for your package manager.
- Download and unzip the source from GitHub
- cd into the download directory
- % npm install
- Edit config/default.json (set images path, listening port, etc)
- % brew install leptonica --with-libtiff
- % brew install imagemagick
- % brew install tesseract --with-all-languages
Open the config/default.json file and set the values for your environment (images path, listening port, etc).
Start the application like any other Node application by running npm start from the application home directory. Alternatively, you may use a process manager. Note: I have been unable to get this working with PM2 because of a conflict with node-config. I recommend StrongLoop, as I had no problems running the application under it, but that's just personal preference.
Testing the webservice...
- Use a simple image containing the text you want to OCR (eng_bw.png from the tesseractjs website is my go-to)
- Start the restful-ocr server
- Using curl post the image:
curl -F filedata=@/path/to/image/eng_bw.png "http://localhost:3000/api/upload"
If everything is working properly, you should see your text returned. Note that this may take some time, especially when using tesseractjs.
If you wish to package the application for easy deployment to a server, this can be done easily using gulp.
- % npm install (This should already be done, but saying it again to be sure)
- % node ./node_modules/gulp/bin/gulp.js
That's it. There should now be a zip file in the dist directory. You can unzip it on any server with Node.js installed and follow the Operation steps to run the webservice.