Information Tracer API Python Library

This GitHub repo provides Python scripts to interact with the Information Tracer API. Information Tracer is a system to collect social media posts and generate intelligence.

Prerequisites

  • Python 3
  • You must have a valid token. If not, contact us to get a token.

Overview of the API endpoints

  1. Use the Submit API to submit a query and get a unique identifier called id_hash256
  2. Use the Status API to check the status of a running query, based on id_hash256
  3. Use the Download endpoint (Result API) to get the result of a query, based on id_hash256
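
The three steps above can be chained into a single loop. The following is a minimal sketch, not code from this repository: `run_query` and its three callable parameters are hypothetical names, and treating a `status_percentage` of 100 as "finished" is an assumption, since the terminal status values are not listed in this document.

```python
import time


def run_query(submit, check_status, download, poll_seconds=5, max_polls=60):
    """Chain submit -> status -> download.

    submit, check_status and download are caller-supplied callables
    (hypothetical; for example thin wrappers around requests calls).
    """
    id_hash256 = submit()                      # step 1: get the identifier
    for _ in range(max_polls):                 # step 2: poll the status
        status = check_status(id_hash256)
        # Assumption: a percentage of 100 means the collection is complete.
        if int(status.get('status_percentage', 0)) >= 100:
            return download(id_hash256)        # step 3: fetch the result
        time.sleep(poll_seconds)
    raise TimeoutError('query {} did not finish in time'.format(id_hash256))
```

Passing the three steps in as callables keeps the loop independent of how each request is made, so it can be tried with stub functions before a real token is used.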

Quick Start

  1. Clone this repository
  2. pip install requests pandas
  3. Update parameters in example.py, including query, start_date, end_date, token
  4. informationtracer_token=XXX python example.py (or add informationtracer_token in bash_profile)
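
Step 4 passes the token through the `informationtracer_token` environment variable. How example.py reads it is not shown here; the sketch below is one plausible way a script could do so, and `load_token` is a hypothetical helper name.

```python
import os


def load_token(env_var='informationtracer_token'):
    """Return the API token from the environment, or raise a clear error."""
    token = os.environ.get(env_var, '').strip()
    if not token:
        raise RuntimeError('Set the {} environment variable first'.format(env_var))
    return token
```

Reading the token from the environment keeps it out of the source file, which matters if the script is committed to version control.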

How to build a search query

Rule 1: AND, OR, NOT must be in all caps. Otherwise they are treated as normal English words.

Rule 2: Use parentheses to group multiple words with AND. For example, (Word1 AND Word2).

Rule 3: The query limit is 512 characters. Sending a query above the limit might return empty results.

Example: (Ukraine AND NATO) OR (Ukraine AND EU)
Meaning: Any post that contains "Ukraine" and "NATO", or "Ukraine" and "EU".

Example: (Ukraine AND NATO) NOT Putin
Meaning: Any post that contains "Ukraine" and "NATO" but not the word "Putin".

Example: from:elonmusk
Meaning: Collect tweets created by user @elonmusk.

Example: from:elonmusk Tesla
Meaning: Collect tweets created by user @elonmusk that contain the word "Tesla".
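
The rules above can be checked locally before a query is submitted. This is a minimal sketch under stated assumptions: `group_and` and `check_query` are hypothetical helpers, not part of this library, and the lowercase-operator check only flags a possible mistake, since "and", "or" and "not" may also be intended as ordinary search words.

```python
MAX_QUERY_LENGTH = 512  # limit stated in Rule 3


def group_and(*words):
    """Join words with AND inside parentheses, as in Rule 2."""
    return '(' + ' AND '.join(words) + ')'


def check_query(query):
    """Return a list of possible problems in a query string (empty if none)."""
    problems = []
    if len(query) > MAX_QUERY_LENGTH:
        problems.append('query exceeds {} characters'.format(MAX_QUERY_LENGTH))
    if query.count('(') != query.count(')'):
        problems.append('unbalanced parentheses')
    # Operators that are not all caps are treated as ordinary words (Rule 1).
    for word in query.replace('(', ' ').replace(')', ' ').split():
        if word.upper() in ('AND', 'OR', 'NOT') and word != word.upper():
            problems.append('possible lowercase operator: ' + word)
    return problems
```

For example, `group_and('Ukraine', 'NATO') + ' OR ' + group_and('Ukraine', 'EU')` builds the first example query shown above.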

Details about Submit API

Input: query, token, start_date, end_date

Optional input: twitter_sort_by: 'time' or 'engagement' (default is 'engagement', if not specified)

  • set twitter_sort_by to 'time' to collect tweets in reverse chronological order (newest to oldest)
  • set twitter_sort_by to 'engagement' to collect tweets in descending like_count order

Output: id_hash256 (a unique string identifier for this search)

Example:

import requests

SUBMIT_URL = 'https://informationtracer.com/submit'

query = 'nvidia AND stock'
token = 'YOUR_TOKEN'
start_date = '2023-11-03'
end_date = '2023-11-08'

# Submit the query; on success the response JSON contains id_hash256
response = requests.post(
    SUBMIT_URL,
    timeout=10,
    json={
        'query': query,
        'token': token,
        'start_date': start_date,
        'end_date': end_date,
        'twitter_sort_by': 'engagement',
    },
)
data = response.json()
if 'id_hash256' in data:
    id_hash256 = data['id_hash256']
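
It can help to validate the inputs before sending the request. The sketch below builds the same JSON body as the example above and fails early on bad values. `build_submit_payload` is a hypothetical helper, and the assumption that dates must use the YYYY-MM-DD form is inferred only from the example values, not from a documented rule.

```python
from datetime import date

VALID_SORT = ('time', 'engagement')


def build_submit_payload(query, token, start_date, end_date,
                         twitter_sort_by='engagement'):
    """Build the JSON body for the Submit API, failing early on bad input."""
    if twitter_sort_by not in VALID_SORT:
        raise ValueError("twitter_sort_by must be 'time' or 'engagement'")
    # Assumption: dates use the YYYY-MM-DD form shown in the example above.
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError('start_date must not be after end_date')
    return {'query': query, 'token': token,
            'start_date': start_date, 'end_date': end_date,
            'twitter_sort_by': twitter_sort_by}
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.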

Details about Status API

Input: id_hash256, token

Output: json (detail below)

Example:

import requests

STATUS_URL = 'https://informationtracer.com/status'

# token and id_hash256 come from the Submit API example above
url = "{}?token={}&id_hash256={}".format(STATUS_URL, token, id_hash256)
results = requests.get(url, timeout=10).json()

Format of output

Each collection can take 30-60 seconds. To send partial results to users as soon as possible, the response includes a field called tweet_preview. This field is initially empty. Once the system has collected 10 tweets, tweet_preview contains a list of dictionaries. Please check the Result API v1 details for a detailed explanation of each key-value pair (d, i, l, ...).

{'status': 'started',
 'status_percentage': '10',
 'status_text': 'Collecting cross-platform posts...',
 'tweet_preview': [{'d': '@Apple Unless you buy a MacBook circa 2010',
                    'i': 0,
                    'l': 'https://twitter.com/heathdollars/status/1721998289388896312',
                    'n': 'heathdollars',
                    'p': 'https://pbs.twimg.com/profile_images/1641987731181142018/tECQ8Xy1_normal.jpg',
                    't': '2023-11-07T21:09:20',
                    'u_d': 'join your union\n\nhttps://t.co/4sxV02E2aI',
                    'u_id': '1000720137106866176',
                    'u_t': '2018-05-27T12:47:27'
                    },
                   {...},
                   {...},
                   ...
                  ]
}
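
A status response like the one above can be reduced to the few values a caller usually needs. This is a minimal sketch: `summarize_status` is a hypothetical helper, and it relies only on what the sample shows, namely that `status_percentage` is a numeric string and that `l` holds a post URL.

```python
def summarize_status(results):
    """Pull progress and preview links out of a Status API response dict."""
    # tweet_preview is initially empty, so fall back to an empty list.
    preview = results.get('tweet_preview') or []
    return {
        'status': results.get('status'),
        'percent': int(results.get('status_percentage', 0)),
        # 'l' holds the post URL in the sample response above.
        'links': [item['l'] for item in preview
                  if isinstance(item, dict) and 'l' in item],
    }
```

Calling `summarize_status(results)` on each poll gives a compact progress record without depending on the remaining preview keys.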

Result API (new, by platform)

Input (required): source, id_hash256, token

source is the data source, one of 'twitter', 'youtube', 'reddit', or 'all'

Format of output

Output: a pandas DataFrame, which can be converted to CSV, JSON, etc.

  • The columns should be self-explanatory
  • Note that some columns (those with prefix country_, sentiment_, account_type_) are only available to premium users.

import pandas as pd

source = 'twitter'  # or 'youtube', 'reddit', 'all'
# token and id_hash256 come from the Submit API example above
url = 'https://informationtracer.com/download?source={}&type=csv&id={}&token={}'.format(source, id_hash256, token)
df = pd.read_csv(url)
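
The download URL can also be assembled with URL encoding and a check on `source`. The sketch below uses only the standard library. `build_download_url` and `premium_columns` are hypothetical helpers, while the URL, the parameter names, and the column prefixes are taken from the text above.

```python
from urllib.parse import urlencode

DOWNLOAD_URL = 'https://informationtracer.com/download'
SOURCES = ('twitter', 'youtube', 'reddit', 'all')
PREMIUM_PREFIXES = ('country_', 'sentiment_', 'account_type_')


def build_download_url(source, id_hash256, token):
    """Return the CSV download URL for one data source."""
    if source not in SOURCES:
        raise ValueError('unknown source: {}'.format(source))
    params = {'source': source, 'type': 'csv', 'id': id_hash256, 'token': token}
    return DOWNLOAD_URL + '?' + urlencode(params)


def premium_columns(columns):
    """List the column names that carry a premium-only prefix."""
    return [c for c in columns if c.startswith(PREMIUM_PREFIXES)]
```

The result of `build_download_url` can be passed to `pd.read_csv`, and `premium_columns(df.columns)` then shows which premium columns, if any, are present in the returned data.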

Result API (v1, deprecated)

Result API v1 details.

Web Interface

  • To help people visualize the information, we provide a web interface available at https://informationtracer.com.
  • To visualize a query you searched recently, you can visit https://informationtracer.com/?result={id_hash256}.
  • Login is required. Please contact us and we will help you register an account.

Contact / Bug Report

For bug reports or other inquiries, please contact Zhouhan Chen [email protected]
