
gnip's Introduction

NodeJS Gnip module

Connect to the Gnip streaming API and manage rules. You must have a Gnip account with a data source available, such as Twitter PowerTrack.

Currently, this module only supports the JSON activity stream format, so you must enable data normalization in your admin panel.

Gnip.Stream

This class is an EventEmitter and allows you to connect to the stream and start receiving data.

Constructor options

  • timeout Read timeout for the client, as requested in the Gnip docs (http://support.gnip.com/apis/powertrack/api_reference.html). The recommended value is >= 30 seconds, so the constructor throws an error if a smaller timeout is provided. The default value is 35 seconds.
  • backfillMinutes Number of minutes to backfill after connecting to the stream. Optional. Value should be 0 - 5.
  • partition Partition of the Firehose stream you want to connect to. Only required for Firehose streams.
  • parser Parser library for incoming JSON data. Optional; defaults to the excellent json-bigint library. See the sketch after this list.
    Matching rule IDs are sent as big integers, which can't be reliably parsed by the native JSON implementation in Node.js. More info on this issue can be found on StackOverflow.
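
For example, a minimal sketch of constructing a stream with an explicit parser. The URL and credentials are placeholders, and it assumes the parser option accepts any object exposing a parse method, such as json-bigint's default export (the library the module already defaults to):

const Gnip = require('gnip');
const JSONbig = require('json-bigint');

const stream = new Gnip.Stream({
  url : 'https://gnip-stream.twitter.com/stream/powertrack/accounts/xxx/publishers/twitter/prod.json',
  user : 'xxx',
  password : 'xxx',
  backfillMinutes : 5, // optional, 0-5
  parser : JSONbig     // preserves big-integer rule IDs
});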

API methods

  • start() Connect to the stream and start receiving data. At this point you should have registered at least one event listener for any of these events: 'data', 'object' or 'tweet'.

  • end() Terminates the connection.

Events

  • ready Emitted when the connection has been successfully established
  • data Emitted for each data chunk (decompressed)
  • error Emitted when any type of error occurs. An error is emitted if the response status code is not 20x; {error: String} objects received in the stream are also reported through this event.
  • object Emitted for each JSON object.
  • tweet Emitted for each tweet.
  • delete Emitted for each deleted tweet.
  • end Emitted when the connection is terminated. This event is always emitted when an error occurs and the connection is closed.
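
A minimal sketch of wiring up listeners before calling start(), assuming a stream instance created as in the constructor example above:

stream.on('ready', function() {
  console.log('connected, waiting for activities...');
});
stream.on('object', function(obj) {
  // every JSON object received, including system messages
});
stream.on('delete', function(del) {
  // deleted tweets
});
stream.on('error', function(err) {
  console.error(err);
});
stream.on('end', function() {
  console.log('connection terminated');
});
stream.start();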

Gnip.Rules

This class allows you to manage an unlimited number of tracking rules.

Constructor options

  • user GNIP account username.
  • password GNIP account password.
  • url GNIP Rules endpoint url e.g. https://gnip-api.twitter.com/rules/${streamType}/accounts/${account}/publishers/twitter/${label}.json
  • batchSize The batch size used when adding/deleting rules in bulk. (Defaults to 5000)
  • parser Like the parser option in the Gnip.Stream constructor, a custom parser library for incoming JSON data. Optional; defaults to the json-bigint library (see the notes on matching rule IDs above).
  • cacheFile Gnip.Rules internally uses a file to cache the current state of the rules configuration; by default this file lives inside the package directory. This optional setting lets you change the path, which is useful when node_modules sits on a read-only filesystem (e.g. AWS Lambda). See the example after this list.
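
For example, a minimal sketch of a Gnip.Rules instance with a writable cache path (the URL, credentials and /tmp path are placeholders):

const Gnip = require('gnip');

const rules = new Gnip.Rules({
  url : 'https://gnip-api.twitter.com/rules/powertrack/accounts/xxx/publishers/twitter/prod.json',
  user : 'xxx',
  password : 'xxx',
  batchSize : 5000,                        // optional, this is the default
  cacheFile : '/tmp/gnip-rules-cache.json' // writable location, e.g. on AWS Lambda
});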

API methods

  • getAll(callback) Get cached rules.

  • update(rules: Array, callback) Creates or replaces the live tracking rules.
    Rules are sent in batches of options.batchSize, so you can pass an unlimited number of rules.
    The current tracking rules are stored in a local JSON file so you can update the existing rules efficiently without having to remove them all first. The callback receives an object as the 2nd argument containing the number of added and deleted rules (see the sketch after this list).

  • clearCache(callback) Clears cached rules.
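
A minimal sketch of updating and then reading the cached rules, assuming the rules instance constructed above and Node-style (err, result) callbacks; the exact shape of the result object is not documented here, so the console.log calls are only illustrative:

rules.update(['#hashtag', {value: 'keyword', tag: 'my-tag'}], function(err, result) {
  if (err) throw err;
  // result summarizes the sync, e.g. how many rules were added and deleted
  console.log(result);

  rules.getAll(function(err, cached) {
    if (err) throw err;
    console.log('cached rules:', cached);
  });
});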

The following methods use the Gnip API directly and ignore the local cache. Avoid them if you are working with a very large number of rules! See the sketch after the list below.

  • live.update(rules: Array, callback)
  • live.add(rules: Array, callback)
  • live.remove(rules: Array, callback)
  • live.getAll(callback)
  • live.getByIds(ids: Array, callback)
  • live.removeAll(callback)
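
A minimal usage sketch for the live methods, again assuming Node-style (err, result) callbacks and the rules instance from above:

rules.live.getAll(function(err, liveRules) {
  if (err) throw err;
  console.log('rules currently active on Gnip:', liveRules);

  // remove every rule currently registered upstream
  rules.live.removeAll(function(err) {
    if (err) throw err;
    console.log('all live rules removed');
  });
});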

Gnip.Search

This class is an EventEmitter and allows you to connect to either the 30 day or full archive search API and start receiving data.

Constructor options

  • user GNIP account username.
  • password GNIP account password.
  • url GNIP Search endpoint url e.g. https://gnip-api.twitter.com/search/30day/accounts/{ACCOUNT_NAME}/{LABEL}.json
  • query Rule to match tweets.
  • fromDate The oldest date from which tweets will be gathered. Date given in the format 'YYYYMMDDHHMM'. Optional.
  • toDate The most recent date to which tweets will be gathered. Date given in the format 'YYYYMMDDHHMM'. Optional.
  • maxResults The maximum number of search results to be returned by a request. A number between 10 and 500. Optional.
  • tag Used to segregate rules and their matching data into different logical groups. Optional.
  • bucket The unit of time for which count data will be provided. Options: "day", "hour", "minute". Optional, for /counts calls.
  • rateLimiter A limiter object used to control the rate of collection. Optional. If unspecified, a rate limit of 30 requests per minute is shared between Search streams. If you have a non-standard rate limit, pass this parameter, e.g.:
const RateLimiter = require('limiter').RateLimiter;
// Allow 60 requests per minute
const limiter = new RateLimiter(60, 'minute');
const stream = new Gnip.Search({
  rateLimiter : limiter,
  ...
});
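
A fuller constructor sketch combining the options above (the URL, credentials, query and dates are placeholders; limiter is the instance created in the previous snippet):

const search = new Gnip.Search({
  url : 'https://gnip-api.twitter.com/search/30day/accounts/xxx/prod.json',
  user : 'xxx',
  password : 'xxx',
  query : 'keyword',
  fromDate : '201801010000', // YYYYMMDDHHMM
  toDate : '201801020000',
  maxResults : 100,          // 10-500
  tag : 'my-tag',
  rateLimiter : limiter      // optional
});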

API methods

  • start() Start receiving data. At this point you should have registered at least one event listener for 'object' or 'tweet'.

  • end() Terminates the connection.

Events

  • ready Emitted when tweets have started to be collected.
  • error Emitted when a recoverable (non-fatal) error occurs.
  • object Emitted for each JSON object.
  • tweet Emitted for each tweet.
  • end Emitted when the connection is terminated. If the stream has ended due to a fatal error, the error object will be passed.
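
A minimal sketch of consuming results with these events, assuming the search instance constructed above (the fields read off the tweet object depend on the activity format):

search.on('ready', function() {
  console.log('collection started');
});
search.on('tweet', function(tweet) {
  console.log(tweet.id, tweet.body);
});
search.on('error', function(err) {
  console.error('recoverable error:', err);
});
search.on('end', function(err) {
  if (err) console.error('search ended with a fatal error:', err);
});
search.start();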

Gnip.Usage

This class allows you to track activity consumption across Gnip products.

Constructor options

const usage = new Gnip.Usage({
	url : 'https://gnip-api.twitter.com/metrics/usage/accounts/{ACCOUNT_NAME}.json',
	user : 'xxx',
	password : 'xxx'
});

API methods

usage.get({ bucket : 'day', fromDate : '201612010000', toDate : '201612100000' }, function(err, body) {
  ...
});
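
A minimal sketch of handling the callback, assuming the Node-style (err, body) signature shown above; the exact shape of body is not documented here:

usage.get({ bucket : 'day', fromDate : '201612010000', toDate : '201612100000' }, function(err, body) {
  if (err) return console.error('usage request failed:', err);
  console.log(body); // activity consumption for the requested period, per product
});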

Installation

npm install gnip

Example Usage

const Gnip = require('gnip');

const stream = new Gnip.Stream({
  url : 'https://gnip-stream.twitter.com/stream/powertrack/accounts/xxx/publishers/twitter/prod.json',
  user : 'xxx',
  password : 'xxx',
  backfillMinutes: 5 // optional
});
stream.on('ready', function() {
  console.log('Stream ready!');
});
stream.on('tweet', function(tweet) {
  console.log(tweet);
});
stream.on('error', function(err) {
  console.error(err);
});

const rules = new Gnip.Rules({
  url : 'https://gnip-api.twitter.com/rules/powertrack/accounts/xxx/publishers/twitter/prod.json',
  user : 'xxx',
  password : 'xxx',
  batchSize: 1234 // not required, defaults to 5000
});

const newRules = [
  '#hashtag', 
  'keyword', 
  '@user',
  {value: 'keyword as object'},
  {value: '@demianr85', tag: 'rule tag'}
];

rules.update(newRules, function(err) {
  if (err) throw err;
  stream.start();
});

const search = new Gnip.Search({
  url : 'https://gnip-api.twitter.com/search/30day/accounts/xxx/prod.json',
  user : 'xxx',
  password : 'xxx',
  query : '@user'
});

search.on('tweet', function(tweet) {
  console.log(tweet);
});

search.on('error', function(err) {
  console.error(err);
});

search.on('end', function(err) {
  if( err ) 
    console.error(err);
});

// search counts usage
const counts = new Gnip.Search({
  url : 'https://gnip-api.twitter.com/search/30day/accounts/xxx/prod/counts.json',
  user : 'xxx',
  password : 'xxx',
  query : '@user',
  bucket: 'day'
});

counts.on('object', function(object) {
  console.log(object.results);
  counts.end();
});

More details and tests soon...

gnip's People

Contributors

aniham, demian85, dependabot[bot], dirkbonhomme, ezegolub, greginator, jamesfrost, jvrbaena, klr, snyk-community, tzellman, williamcoates

gnip's Issues

Gnip 2.0 replay error is not passed

When I get a 406 HTTP error from the Replay stream, I only get the HTTP error object, while the Replay docs specify that I'm expected to get:

Will contain a JSON message indicating the issue -- e.g. "This connection requires compression. To enable compression, send an 'Accept-Encoding: gzip' header in your request and be ready to uncompress the stream as it is read on the client end." or "Invalid date for query parameter 'toDate'. Can't ask for tweets from within the past 30 minutes."

I'll be happy to help with a PR if you can point me in the right direction

Tweet matching_rules IDs are being rounded

I'm testing PowerTrack 2.0 and noticed that in each tweet the matching_rules id is being rounded. All the rule IDs end in 00. Here is an example from one of the tweets. Any thoughts?

"gnip" : { "matching_rules" : [{ "tag" : "1956", "id" : 781250306873012200 } ]

Socket hangs up while writing large number of rules

I'm getting this when I try to write over 2K rules:

Error: socket hang up
at createHangUpError (http.js:1476:15)
at CleartextStream.socketCloseListener (http.js:1526:23)
at CleartextStream.emit (events.js:117:20)
at tls.js:693:10
at process._tickCallback (node.js:419:13)

Gnip Errors not being parsed from stream

When trying to make multiple connections to a single GNIP stream, GNIP sends an error in the stream alongside tweets:

{"error":{"message":"This stream is currently at the maximum allowed connection limit","sent":"2018-02-16T04:59:50+00:00","transactionId":"00ad405500190622"}}
{"id":"tag:search.twitter.com,2005:964363458044076032","objectType":"activity","verb":"post","postedTime":"2018-02-16T04:59:00.000Z","generator":{"displayName":"Twitter for Android","link":"http:\/\/twitter.com\/download\/android"},"provider":{"objectType":"service","displayName":"Twitter","link":"http:\/\/www.twitter.com"},"link":"http:\/\/twitter.com\/MfMats\/statuses\/964363458044076032","body":"I vote #LandRoverFanClub for the #WesBankCOTY2018 win, who\u2019s your vote? @LandRoverZA @AutoTraderSA @W

However, tweets have \r\n as the separator, while errors only have \n. I noticed that your JSONParser defaults to \r\n as its separator, and this causes events like the one in the example above to throw the 'Error parsing JSON' error.
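
A hedged sketch of one way to tolerate both separators when splitting buffered chunks; this is an illustration, not the module's actual JSONParser:

// Split buffered stream data on either \r\n or \n, keeping the last
// (possibly incomplete) fragment for the next chunk.
let buffer = '';

function receive(chunk) {
  buffer += chunk;
  const parts = buffer.split(/\r?\n/);
  buffer = parts.pop(); // trailing partial message
  for (const part of parts) {
    if (!part.trim()) continue; // skip keep-alive blank lines
    const obj = JSON.parse(part);
    if (obj.error) {
      console.error('stream error:', obj.error.message);
    } else {
      // handle activity object
    }
  }
}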

Not working with Replay v2.0

Just tested it with Replay v2.0 and it doesn't work. In index.js on line 89 it has this path: streamUrl.path + '?' + qs. If qs is null it appends the ? to the end of the URL, which Gnip then rejects as invalid.

Try testing Replay with only the url, user, and password options.
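
A hedged sketch of the kind of fix being suggested, using the variable names from the snippet quoted above rather than the module's full source:

// Only append the query string when it is non-empty,
// so the request path never ends in a bare '?'.
const path = qs ? streamUrl.path + '?' + qs : streamUrl.path;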

Rule errors are not returned during bulk add

It seems that when adding rules in bulk, errors are not carried on the body.error property, but rather in an array in body.detail.

I think adding a clause for this possibility to append more information to the errStr is appropriate, something like this:

if (body && body.detail) {
    errStr = body.detail.filter(r => r.message).reduce((msg, r) =>
        msg + '\n' + r.message
    , errStr);
}

This would result in an error string like

Error: Unable to add rules. Request Failed with status code: 422
Rule 'something' has some error

https://github.com/demian85/gnip/blob/master/lib/rules.js#L49

From the GNIP docs:

422 Unprocessable Entity Generally occurs due to an invalid rule, based on the PowerTrack rule restrictions. Requests fail or succeed as a batch. For these errors, each invalid rule and the reason for rejection is included in a JSON message in the response. Catch the associated exception to expose this message.

zlib error whenever stream ends

When the stream ends (on purpose or not) I get:

Error: unexpected end of file
File "zlib.js", line 154, in Zlib.zlibOnError [as onerror]
on the error event
This doesn't happen every time, but quite frequently.
This doesn't seem to be a "real" error but some kind of stream noise. Is there a way to silence/solve this error?

JSON parse error

When I have a stream open and I remove all the rules I start getting tons of parse errors in the console:
Error: Error parsing JSON: Cannot set property 'tweet' of undefined.
followed by the tweet object JSON and then:

    at Parser.receive (/Users/[directory_redacted]/node_modules/gnip/lib/JSONParser.js:30:28)
    at Gunzip.<anonymous> (/Users/[directory_redacted]/node_modules/gnip/lib/index.js:109:17)
    at Gunzip.emit (events.js:223:5)
    at addChunk (_stream_readable.js:309:12)
    at readableAddChunk (_stream_readable.js:290:11)
    at Gunzip.Readable.push (_stream_readable.js:224:10)
    at Gunzip.Transform.push (_stream_transform.js:150:32)
    at Zlib.processCallback (zlib.js:525:10)

Todo: prepare 0.3.0 release

With one open PR remaining it's time to work towards a 0.3.0 release and have it pushed to the npm repository.

Todo:

  • Wait for #7 to be ready and merged to master
  • Create 0.3.0 tag
  • Publish tag to npm

Request collaborator access

Hi @demian85

There hasn't been much activity on this project for a while even though some PRs and issue reports are very useful. I think it would be sad if this project got scattered throughout many independent forks.

I'm willing to maintain this project if you allow me collaborator access. Or someone else could, if there's any interest?

Let me know what you think!

Thanks
