zrg1993 / spacy-nlp

This project is forked from kengz/spacy-nlp.

Expose spaCy NLP text parsing to Node.js (and other languages) via socketIO.

License: MIT License

Shell 8.21% JavaScript 34.84% Python 56.95%

spacy-nlp

Installation

# install the python3 dependencies: the socketIO client, spacy, and the English model
python3 -m pip install -U socketIO-client
python3 -m pip install -U spacy
python3 -m spacy.en.download

# install this npm package
npm i --save spacy-nlp

Usage

const spacyNLP = require('spacy-nlp')
// default port 6466
// start the server with the python client that exposes spacyIO (or use an existing socketIO server at IOPORT)
var serverPromise = spacyNLP.server({ port: process.env.IOPORT })
// Loading spacy may take up to 15s

Note that python3 is preferred. If you use python2, set the environment variable USE_PY2=true on every run.

You'll see log output like:

[Sun Oct 09 2016 16:53:33 GMT-0400 (EDT)] INFO Starting poly-socketio server on port: 6466, expecting 1 IO clients
[Sun Oct 09 2016 16:53:33 GMT-0400 (EDT)] INFO Starting socketIO client for python3 at 6466
[Sun Oct 09 2016 16:53:44 GMT-0400 (EDT)] DEBUG cgkb-py mXjDqupv852zUeMPAAAA joined, 0 remains
[Sun Oct 09 2016 16:53:44 GMT-0400 (EDT)] INFO All 1 IO clients have joined

Because it uses poly-socketio, there is only one IO server and one global.client (internal to this module) per process, no matter how many times poly-socketio is called. This avoids conflicts when several projects in the same process use it.

E.g. AIVA uses poly-socketio to start a server for its internal cross-language communication, and uses spacy-nlp too. spacy-nlp will automatically use the IO server and the global.client from AIVA.

Once the server is ready, you can use the Node.js client nlp to parse text:

const spacyNLP = require('spacy-nlp')
const nlp = spacyNLP.nlp

// Note: you can pass multiple sentences concatenated in one string.
nlp.parse('Bob Brought the pizza to Alice.')
  .then((output) => {
    console.log(output)
    console.log(JSON.stringify(output[0].parse_tree, null, 2))
  })
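
If parsing is triggered right after startup, it may help to wait for the server first. The following is a minimal sketch, not part of the package: parseWhenReady is a hypothetical helper, and it assumes that the promise returned by spacyNLP.server() settles once the Python client has joined and that nlp.parse() returns a promise, as in the example above.

```javascript
// Hypothetical helper (not part of spacy-nlp): wait for the server, then parse.
// Assumption: `serverPromise` settles once the Python IO client has joined.
// Assumption: `nlp.parse(text)` returns a promise, as in the usage example.
async function parseWhenReady(serverPromise, nlp, text) {
  await serverPromise
  return nlp.parse(text)
}

// Possible usage with the real module (commented out so the sketch
// stays runnable without spacy-nlp installed):
// const spacyNLP = require('spacy-nlp')
// const serverPromise = spacyNLP.server({ port: process.env.IOPORT })
// parseWhenReady(serverPromise, spacyNLP.nlp, 'Bob Brought the pizza to Alice.')
//   .then((output) => console.log(output))
```

Passing the promise and the client in as arguments keeps the helper independent of how the server was started, for example when another project already runs the IO server.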

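The output is the syntactic parse tree with POS tags. The first block below is the result of console.log(output); the second is the JSON-serialized parse_tree. In parse_tree, NE is the named-entity label from NER, and arc is the dependency arc incident on that word. An arc points from the head word to the modifier word. See the explanation on Tensorflow/syntaxnet.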

[ { text: 'Bob Brought the pizza to Alice.',
    len: 7,
    tokens: [ 'Bob', 'Brought', 'the', 'pizza', 'to', 'Alice', '.' ],
    noun_phrases: [ 'Bob', 'the pizza', 'Alice' ],
    parse_tree: [ [Object] ],
    parse_list: 
     [ [Object],
       [Object],
       [Object],
       [Object],
       [Object],
       [Object],
       [Object] ] } ]
[
  {
    "word": "Brought",
    "lemma": "bring",
    "NE": "",
    "POS_fine": "VBD",
    "POS_coarse": "VERB",
    "arc": "ROOT",
    "modifiers": [
      {
        "word": "Bob",
        "lemma": "Bob",
        "NE": "PERSON",
        "POS_fine": "NNP",
        "POS_coarse": "PROPN",
        "arc": "nsubj",
        "modifiers": []
      },
      {
        "word": "pizza",
        "lemma": "pizza",
        "NE": "",
        "POS_fine": "NN",
        "POS_coarse": "NOUN",
        "arc": "dobj",
        "modifiers": [
          {
            "word": "the",
            "lemma": "the",
            "NE": "",
            "POS_fine": "DT",
            "POS_coarse": "DET",
            "arc": "det",
            "modifiers": []
          }
        ]
      },
      {
        "word": "to",
        "lemma": "to",
        "NE": "",
        "POS_fine": "IN",
        "POS_coarse": "ADP",
        "arc": "prep",
        "modifiers": [
          {
            "word": "Alice",
            "lemma": "Alice",
            "NE": "PERSON",
            "POS_fine": "NNP",
            "POS_coarse": "PROPN",
            "arc": "pobj",
            "modifiers": []
          }
        ]
      },
      {
        "word": ".",
        "lemma": ".",
        "NE": "",
        "POS_fine": ".",
        "POS_coarse": "PUNCT",
        "arc": "punct",
        "modifiers": []
      }
    ]
  }
]
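
Since each node nests its dependents under modifiers, the tree can be walked recursively. The sketch below is an illustration only: collectArcs and collectEntities are hypothetical helpers, not part of spacy-nlp, and sampleTree is a trimmed copy of the output above (the punctuation node and some fields are omitted).

```javascript
// Hypothetical helper: list every dependency arc depth-first as
// { head, arc, word }, where `head` is null for the ROOT node.
function collectArcs(nodes, head = null, out = []) {
  for (const node of nodes) {
    out.push({ head: head, arc: node.arc, word: node.word })
    collectArcs(node.modifiers || [], node.word, out)
  }
  return out
}

// Hypothetical helper: list every word that carries a non-empty NE label.
function collectEntities(nodes, out = []) {
  for (const node of nodes) {
    if (node.NE) out.push({ word: node.word, type: node.NE })
    collectEntities(node.modifiers || [], out)
  }
  return out
}

// Trimmed copy of the parse_tree shown above.
const sampleTree = [{
  word: 'Brought', NE: '', arc: 'ROOT',
  modifiers: [
    { word: 'Bob', NE: 'PERSON', arc: 'nsubj', modifiers: [] },
    { word: 'pizza', NE: '', arc: 'dobj', modifiers: [
      { word: 'the', NE: '', arc: 'det', modifiers: [] }
    ] },
    { word: 'to', NE: '', arc: 'prep', modifiers: [
      { word: 'Alice', NE: 'PERSON', arc: 'pobj', modifiers: [] }
    ] }
  ]
}]

console.log(collectArcs(sampleTree))
console.log(collectEntities(sampleTree))
```

With real output, the same helpers could be called on output[0].parse_tree.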
