hoseinlook / spark-road-traffic-graph

Divides a city into a set of points that serve as the nodes of a traffic graph, processes location data from Kafka, and calculates the edges between nodes with Spark aggregations.

Introduction

We want to estimate the travel time between two points in a city. Our data are car locations that are produced to Kafka every 5 seconds.
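To make the input concrete, here is a minimal sketch of what such a stream of location records could look like. The field names (`car_id`, `lat`, `lon`, `timestamp`) and the `simulate_car` helper are assumptions for illustration only; the actual message schema used by the project is not described here.

```python
import json
import random


def simulate_car(car_id, start_lat, start_lon, steps,
                 start_ts=0, interval=5, seed=0):
    """Yield JSON strings, one location record every `interval` seconds.

    Hypothetical schema: the real messages may use different fields.
    """
    rng = random.Random(seed)
    lat, lon = start_lat, start_lon
    for i in range(steps):
        yield json.dumps({
            "car_id": car_id,
            "lat": round(lat, 6),
            "lon": round(lon, 6),
            "timestamp": start_ts + i * interval,
        })
        # Small random drift to imitate movement between two reports.
        lat += rng.uniform(-0.0005, 0.0005)
        lon += rng.uniform(-0.0005, 0.0005)


messages = list(simulate_car("car-1", 35.7, 51.4, steps=3))
```

Each string in `messages` is the kind of payload a producer could send to a Kafka topic.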

The data volume is huge, so we divide the city into 10,000 points, assign each record to the closest point based on its location, calculate the weight between points with Spark aggregations, and produce the result to Kafka.
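The weight calculation can be illustrated in plain Python, without Spark. This is only a sketch under assumptions: it takes records that have already been assigned to a node, treats the weight of an edge as the mean time a car needs to get from one node to the next, and uses a hypothetical `edge_weights` function. The project performs the equivalent grouping and averaging with Spark aggregations, and its exact weight definition may differ.

```python
from collections import defaultdict
from statistics import mean


def edge_weights(records):
    """records: iterable of (car_id, timestamp_seconds, node_id).

    Returns {(from_node, to_node): mean travel time in seconds}.
    """
    # Group the observations by car, like a groupBy on car_id.
    by_car = defaultdict(list)
    for car_id, ts, node in records:
        by_car[car_id].append((ts, node))

    durations = defaultdict(list)
    for points in by_car.values():
        points.sort()  # order each car's observations by time
        prev_ts, prev_node = points[0]
        for ts, node in points[1:]:
            if node != prev_node:
                # Time from first sighting at prev_node to first sighting at node.
                durations[(prev_node, node)].append(ts - prev_ts)
                prev_ts, prev_node = ts, node

    # Average over all cars that travelled the same edge.
    return {edge: mean(values) for edge, values in durations.items()}


sample = [
    ("a", 0, 1), ("a", 5, 1), ("a", 10, 2), ("a", 15, 3),
    ("b", 0, 1), ("b", 5, 2),
]
weights = edge_weights(sample)
```

In the sample, car `a` needs 10 seconds and car `b` needs 5 seconds to get from node 1 to node 2, so that edge is averaged over both cars.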

Example: Tehran divided into points [image: tehran example]
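One simple way to obtain 10,000 points is a regular 100 x 100 grid over a bounding box, which makes the closest point a matter of rounding. The sketch below assumes such a grid; the `make_grid` helper and the rough Tehran bounding box are illustrative assumptions, and the project may choose its points differently.

```python
def make_grid(lat_min, lat_max, lon_min, lon_max, n=100):
    """Return a function mapping (lat, lon) to the index of the nearest
    point on an n x n regular grid over the bounding box (n * n nodes)."""
    lat_step = (lat_max - lat_min) / (n - 1)
    lon_step = (lon_max - lon_min) / (n - 1)

    def nearest(lat, lon):
        # Round to the closest grid row and column.
        row = round((lat - lat_min) / lat_step)
        col = round((lon - lon_min) / lon_step)
        # Clamp locations outside the box to the border of the grid.
        row = min(max(row, 0), n - 1)
        col = min(max(col, 0), n - 1)
        return row * n + col

    return nearest


# Approximate bounding box, for illustration only.
nearest_node = make_grid(35.55, 35.85, 51.10, 51.60)
node_id = nearest_node(35.70, 51.40)
```

Rounding each axis independently gives the nearest grid point in latitude/longitude terms, which is only an approximation of the nearest point by real-world distance.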

Install

sudo apt update -y
sudo apt install -y git python3 python3-pip python3-venv

git clone https://github.com/hoseinlook/road-traffic-graph.git
cd road-traffic-graph

cp -n .env.example .env
nano .env

python3.8 -m venv venv
source venv/bin/activate
pip install -U pip
pip install -r requirements.txt

Run

To run this project, first provide the infrastructure (Kafka and ZooKeeper) with Docker.

Start Kafka:
docker-compose up
  • Kafka bootstrap host: localhost:9093
  • ZooKeeper server: localhost:2181
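Before starting the pipeline it can help to confirm that the ports listed above accept connections. The following is a small standard-library sketch; `wait_for_port` is a hypothetical helper, not part of the project, and a successful TCP connection only shows that the port is open, not that the broker is fully ready.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds,
    or False if that does not happen within `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage with the endpoints listed above:
# wait_for_port("localhost", 9093)  # Kafka
# wait_for_port("localhost", 2181)  # ZooKeeper
```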
Start the PySpark pipeline:

source venv/bin/activate
python -m pipeline
  • Spark web UI: localhost:4040
Produce example data to Kafka:
source venv/bin/activate
python -m generate_data

Note:

You can inspect the checkpoints, the Kafka data, and the ZooKeeper data in the storage directory.

Optional

Run Spark in Jupyter (batch mode, not streaming) to inspect the results:

jupyter-notebook .
