The informationtracer's intro from zhouhanc

Information Tracer API Python Library

This GitHub repo provides Python scripts for interacting with the Information Tracer API. Information Tracer is a system that collects social media posts and generates intelligence from them.

Prerequisites

Python 3
You must have a valid token. If not, contact us to get a token.

Overview of our API endpoints

Use the Submit API to submit a query and get a unique identifier called id_hash256
Use the Status API to check the status of a running query, based on id_hash256
Use the Result API (download endpoint) to get the results of a query, based on id_hash256

Quick Start

Clone this repository
pip install requests pandas
Update parameters in example.py, including query, start_date, end_date, token
informationtracer_token=XXX python example.py (or add informationtracer_token to your .bash_profile)
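
The last step passes the token through the informationtracer_token environment variable. A minimal sketch of how a script might read it is shown here; the helper name load_token is hypothetical and example.py may handle this differently.

```python
import os


def load_token(env_var="informationtracer_token"):
    """Return the API token from the environment, or raise a clear error.

    load_token is a hypothetical helper, not part of this repository.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Information Tracer token."
        )
    return token
```

Failing early with a readable message avoids sending requests with an empty token.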

How to build a search query

Rule 1: AND, OR, NOT must be all caps. Otherwise they are treated as normal English words.
Rule 2: Use parentheses to group multiple words with AND. For example, (Word1 AND Word2).
Rule 3: The query limit is 512 characters. Sending a query above the limit might return empty results.

Example: (Ukraine AND NATO) OR (Ukraine AND EU)
Meaning: Any post that contains both "Ukraine" and "NATO", or both "Ukraine" and "EU".

Example: (Ukraine AND NATO) NOT Putin
Meaning: Any post that contains both "Ukraine" and "NATO" but not the word "Putin".

Example: from:elonmusk
Meaning: Collect tweets created by user @elonmusk.

Example: from:elonmusk Tesla
Meaning: Collect tweets created by user @elonmusk that contain the word "tesla".
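
The three rules above can be checked locally before a query is submitted. The following is only a sketch: check_query is a hypothetical helper, not part of the API, and it cannot tell whether a lowercase "and" was meant as an operator or as an ordinary word, so it only returns warnings.

```python
import re

MAX_QUERY_LENGTH = 512  # character limit stated in Rule 3


def check_query(query):
    """Return a list of warnings for a search query (empty list if none)."""
    warnings = []
    # Rule 3: queries above the limit might return empty results.
    if len(query) > MAX_QUERY_LENGTH:
        warnings.append(
            f"query is {len(query)} characters; the limit is {MAX_QUERY_LENGTH}"
        )
    # Rule 1: operators that are not all caps are treated as normal words.
    for word in re.findall(r"\b(?:and|or|not)\b", query, flags=re.IGNORECASE):
        if word != word.upper():
            warnings.append(f"'{word}' is not all caps, so it is treated as a normal word")
    # Rule 2: groups need matching parentheses.
    depth = 0
    for ch in query:
        depth += (ch == "(") - (ch == ")")
        if depth < 0:  # a ")" appeared before its "("
            break
    if depth != 0:
        warnings.append("unbalanced parentheses")
    return warnings
```

For example, check_query("(Ukraine and NATO") reports both the lowercase operator and the missing closing parenthesis, while the four example queries above produce no warnings.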

Details about Submit API

Input: query, token, start_date, end_date

Optional input: twitter_sort_by: 'time' or 'engagement' (default is 'engagement' if not specified)

Set twitter_sort_by to 'time' to collect tweets in reverse chronological order (newest to oldest)
Set twitter_sort_by to 'engagement' to collect tweets in descending like_count order

Output: id_hash256 (a unique string identifier for this search)

Example:

import requests
SUBMIT_URL = 'https://informationtracer.com/submit'

query = 'nvidia AND stock'
token = 'YOUR_TOKEN'
start_date = '2023-11-03'
end_date = '2023-11-08'

response = requests.post(
    SUBMIT_URL,
    timeout=10,
    json={
        'query': query,
        'token': token,
        'start_date': start_date,
        'end_date': end_date,
        'twitter_sort_by': 'engagement',
    },
)
result = response.json()
# id_hash256 is only present when the submission was accepted
if 'id_hash256' in result:
    id_hash256 = result['id_hash256']
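
Before sending the request, the inputs listed above can be validated locally. This is a sketch under assumptions: build_submit_payload is a hypothetical helper, and the YYYY-MM-DD date format is inferred from the example values rather than from a documented requirement.

```python
from datetime import date

ALLOWED_SORT = ("time", "engagement")  # values listed under "Optional input"


def build_submit_payload(query, token, start_date, end_date,
                         twitter_sort_by="engagement"):
    """Build the JSON body for the Submit API, validating inputs first."""
    # Assumption: dates are ISO strings such as '2023-11-03', as in the example.
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if start > end:
        raise ValueError("start_date must not be after end_date")
    if twitter_sort_by not in ALLOWED_SORT:
        raise ValueError(f"twitter_sort_by must be one of {ALLOWED_SORT}")
    return {
        "query": query,
        "token": token,
        "start_date": start_date,
        "end_date": end_date,
        "twitter_sort_by": twitter_sort_by,
    }
```

The returned dictionary can be passed as the json argument of requests.post in the example above.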

Details about Status API

Input: id_hash256, token

Output: json (detail below)

Example:

import requests
STATUS_URL = 'https://informationtracer.com/status'

# token and id_hash256 come from the Submit API example above
params = {'token': token, 'id_hash256': id_hash256}
results = requests.get(STATUS_URL, params=params, timeout=10).json()

Format of output

Because each collection can take 30-60 seconds, the response includes a field called tweet_preview so that partial results reach users as soon as possible. This field is initially empty. Once the system has collected 10 tweets, tweet_preview contains a list of dictionaries. Please check the Result API v1 details for an explanation of each key-value pair (d, i, l, ...).

{'status': 'started', 
 'status_percentage': '10', 
 'status_text': 'Collecting cross-platform posts...', 
 'tweet_preview': [{'d': '@Apple Unless you buy a MacBook circa 2010',
                    'i': 0, 
                    'l': 'https://twitter.com/heathdollars/status/1721998289388896312',
                    'n': 'heathdollars', 
                    'p': 'https://pbs.twimg.com/profile_images/1641987731181142018/tECQ8Xy1_normal.jpg', 
                    't': '2023-11-07T21:09:20', 
                    'u_d': 'join your union\n\nhttps://t.co/4sxV02E2aI', 
                    'u_id': '1000720137106866176', 
                    'u_t': '2018-05-27T12:47:27'
                    }, 
                    {...}, 
                    {...}, 
                    ...
                   ]
}
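
Since a collection takes a while, a client usually polls the Status API until the query completes. The sketch below shows one way to do that. It rests on an assumption: the source only shows the status value 'started', so the terminal value 'finished' used here is a guess and should be adjusted to whatever the API actually returns. wait_for_query and its parameters are hypothetical names.

```python
import time


def wait_for_query(fetch_status, done_statuses=("finished",), poll_interval=5.0,
                   max_wait=120.0, sleep=time.sleep):
    """Poll fetch_status() until the query is done or max_wait seconds pass.

    fetch_status is any zero-argument callable that returns the Status API
    JSON as a dict. 'finished' is an ASSUMED terminal status value.
    Returns the last response seen.
    """
    waited = 0.0
    while True:
        results = fetch_status()
        preview = results.get("tweet_preview") or []
        print(f"{results.get('status')} ({results.get('status_percentage')}%), "
              f"{len(preview)} preview tweets")
        if results.get("status") in done_statuses or waited >= max_wait:
            return results
        sleep(poll_interval)
        waited += poll_interval
```

With the Status API example above, fetch_status could be a lambda that calls requests.get on STATUS_URL with the token and id_hash256 and returns .json(). Passing the fetch function in keeps the loop easy to test without network access.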

Result API (new, by platform)

Input (required): source, id_hash256, token

source is the data source, which can be 'twitter', 'youtube', 'reddit', or 'all'

Format of output

Output: a pandas DataFrame, which can be converted to CSV, JSON, etc.

The columns should be self-explanatory
Note that some columns (those with the prefix country_, sentiment_, or account_type_) are only available to premium users.

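import pandas as pd

source = 'twitter'  # or 'youtube', 'reddit', 'all'
# token and id_hash256 come from the Submit API example above
url = 'https://informationtracer.com/download?source={}&type=csv&id={}&token={}'.format(source, id_hash256, token)
df = pd.read_csv(url)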
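
Because the premium-only columns are identified by prefix, they can be separated from the rest by name. A small sketch, assuming only the three prefixes mentioned above; split_columns is a hypothetical helper and works on any iterable of column names.

```python
# Prefixes of the premium-only columns named in this section
PREMIUM_PREFIXES = ("country_", "sentiment_", "account_type_")


def split_columns(columns):
    """Split column names into (standard, premium) lists by prefix."""
    standard, premium = [], []
    for name in columns:
        # str.startswith accepts a tuple and matches any of its entries
        if str(name).startswith(PREMIUM_PREFIXES):
            premium.append(name)
        else:
            standard.append(name)
    return standard, premium
```

With the DataFrame from the example above, standard, premium = split_columns(df.columns) gives the two groups, and df[standard].to_json('results.json', orient='records') would export only the standard columns.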

Result API (v1, deprecated)

Result API v1 details.

Web Interface

To help people visualize the information, we provide a web interface at https://informationtracer.com.
To visualize a query you searched recently, visit https://informationtracer.com/?result={id_hash256}.
Login is required. Please contact us and we will help you register an account.

Contact / Bug Report

For bug reports or other inquiries, please contact Zhouhan Chen [email protected]
