node-red-contrib-pdf-hummus

A Node-RED node that extracts text from a PDF file using hummusjs.

Install

Run the following command in the root directory of your Node-RED install:

    npm install node-red-contrib-pdf-hummus

Usage

This node extracts text from a PDF document using the npm module hummusjs and the text extraction sample.

This early release is a minimal, get-it-working Node-RED wrapper around the text extraction sample code, which does more than this node actually needs. As a result, the node uses more memory than necessary.

Input

The node needs a filename and a PDF buffer as input. The filename set in the node's configuration can be overridden by setting msg.filename.

The PDF document to be processed should be passed in as a data buffer in msg.payload.
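As a rough illustration of the message shape described above, the sketch below reads a PDF from disk and builds an object with filename and payload set. The helper name buildPdfMessage is hypothetical and not part of this node; it only shows one way an upstream function might prepare the message.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper (not part of this node): builds a message in the
// shape the Input section describes.
function buildPdfMessage(pdfPath) {
  return {
    // msg.filename overrides the filename configured in the node
    filename: path.basename(pdfPath),
    // readFileSync without an encoding returns a Buffer
    payload: fs.readFileSync(pdfPath),
  };
}
```

Calling buildPdfMessage with the path to a PDF returns an object whose payload is a Buffer, which is the form the node expects in msg.payload.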

HTTP Input

The node can also be driven by an HTTP input node, where a PDF file is POSTed to the flow. The PDF file buffer and name are then taken from the request field of the msg. To use this approach, open the HTTP input node's properties and ensure that "Method" is set to "POST" and "Accept file uploads?" is ticked.
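The sketch below shows where the uploaded file is assumed to sit on the message. It assumes that, with file uploads enabled, the HTTP input node places uploads in an array at msg.req.files, each entry carrying originalname and buffer fields; this layout is an assumption about Node-RED's upload handling, and firstUpload is a hypothetical helper rather than code from this node.

```javascript
// Hypothetical helper: pulls the first uploaded file off the request.
// Assumes uploads arrive as an array on msg.req.files, each entry with
// `originalname` and `buffer` properties.
function firstUpload(msg) {
  const files = (msg.req && msg.req.files) || [];
  if (files.length === 0) {
    return null; // nothing was uploaded
  }
  const file = files[0];
  return { filename: file.originalname, payload: file.buffer };
}
```

The node performs this lookup itself when driven by an HTTP input node, so a helper like this is only useful for inspecting or debugging what was uploaded.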

Output

The output is a JSON object on msg.payload. If the split option is selected, a separate message is sent for each page.

Sample flow

File Inject Implementation

[{"id":"75540143.de239","type":"pdf-hummus","z":"434de041.4e4f4","name":"","filename":"myfile.txt","split":true,"mode":{"value":"asBuffer"},"x":270.5,"y":65,"wires":[["cd04c7ce.3b70c8","a84e0874.451318"]]},{"id":"38aade4f.ab0c12","type":"fileinject","z":"434de041.4e4f4","name":"","x":103,"y":62,"wires":[["75540143.de239"]]},{"id":"cd04c7ce.3b70c8","type":"debug","z":"434de041.4e4f4","name":"","active":true,"console":"false","complete":"false","x":449.5,"y":65,"wires":[]},{"id":"a84e0874.451318","type":"watson-discovery-v1-document-loader","z":"434de041.4e4f4","name":"","environment_id":"","collection_id":"","default-endpoint":true,"service-endpoint":"https://gateway.watsonplatform.net/discovery/api","x":411,"y":133,"wires":[["2e9ac940.1cd4c6"]]},{"id":"2e9ac940.1cd4c6","type":"debug","z":"434de041.4e4f4","name":"","active":true,"console":"false","complete":"true","x":610.5,"y":131,"wires":[]}]

HTTP POST Implementation

[ { "id": "1f7ebf39.38b309", "type": "pdf-hummus", "z": "639c38eb.3b18c8", "name": "", "filename": "", "split": false, "mode": { "value": "asBuffer" }, "x": 487, "y": 227, "wires": [ [ "757b38a5.04059" ] ] }, { "id": "e8e6b15b.868638", "type": "http in", "z": "639c38eb.3b18c8", "name": "", "url": "/pdfin", "method": "post", "upload": true, "swaggerDoc": "", "x": 204, "y": 228, "wires": [ [ "1f7ebf39.38b309" ] ] }, { "id": "757b38a5.04059", "type": "http response", "z": "639c38eb.3b18c8", "name": "", "statusCode": "200", "headers": {}, "x": 792, "y": 230, "wires": [] } ]

Deploy the sample flow, then send an HTTP POST containing the PDF file to the flow's /pdfin endpoint.
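One possible way to send such a request from Node.js (version 18 or later, which provides global fetch, FormData and Blob) is sketched below. The form field name "file", the helper names, and the default URL http://localhost:1880/pdfin are assumptions: the URL presumes a local Node-RED instance on its default port with the sample flow deployed.

```javascript
// Build a multipart form holding one PDF.
// The field name "file" is an assumption, not something this node requires.
function buildUploadForm(pdfBuffer, filename) {
  const form = new FormData();
  const blob = new Blob([pdfBuffer], { type: "application/pdf" });
  form.append("file", blob, filename);
  return form;
}

// POST the form to the sample flow and return the response body as text.
// The default URL assumes a local Node-RED instance on port 1880.
async function postPdf(pdfBuffer, filename, url = "http://localhost:1880/pdfin") {
  const response = await fetch(url, {
    method: "POST",
    body: buildUploadForm(pdfBuffer, filename),
  });
  return response.text();
}
```

With the sample flow running, calling postPdf with the contents of a PDF file and its name should resolve to whatever the HTTP response node returns for that upload.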

Contributing

For simple typos and fixes, please just raise an issue pointing out our mistakes. If you need to raise a pull request, please read our contribution guidelines before doing so.

Copyright and license

Copyright 2017 IBM Corp. under the Apache 2.0 license.
