Lambda Architecture implementation using Apache Storm, Hadoop and HBase to perform real-time image processing and analysis on Twitter data.
The goal of this project is to find the most representative images among those collected from Twitter for a given keyword.
To find these representative images, we use the K-Means clustering algorithm.
Link to the whole paper
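As a rough illustration of the clustering idea (a minimal in-memory sketch with hypothetical names, not the project's Hadoop/MapReduce implementation): each image's feature vector is assigned to its nearest center, and the centers are recomputed until their total movement drops below a threshold.

```java
import java.util.Arrays;

// Minimal in-memory K-Means sketch (hypothetical helper, for illustration only):
// assign each feature vector to its nearest center, recompute the centers,
// and stop once the centers move less than a threshold.
public class KMeansSketch {

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    static int nearestCenter(double[] point, double[][] centers) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centers.length; c++) {
            double dist = euclidean(point, centers[c]);
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // points: one feature vector per image; k: number of centers.
    static double[][] cluster(double[][] points, int k, double threshold) {
        // Initialize centers with the first k points (a simplification).
        double[][] centers = Arrays.copyOfRange(points, 0, k);
        double movement;
        do {
            double[][] sums = new double[k][points[0].length];
            int[] counts = new int[k];
            for (double[] p : points) {
                int c = nearestCenter(p, centers);
                counts[c]++;
                for (int i = 0; i < p.length; i++) sums[c][i] += p[i];
            }
            movement = 0.0;
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue;          // keep empty centers unchanged
                double[] newCenter = new double[sums[c].length];
                for (int i = 0; i < newCenter.length; i++) newCenter[i] = sums[c][i] / counts[c];
                movement += euclidean(centers[c], newCenter);
                centers[c] = newCenter;
            }
        } while (movement > threshold);                // stop once centers stabilize
        return centers;
    }
}
```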
- Apache Hadoop-3.2.1
- Apache HBase-2.2.3
- Apache Storm-2.1.0
- Twitter4j-4.0.4
- Lire ( Lucene Image Retrieval )-8.0.0
- Gradle-6.3
HBase, Storm and Hadoop have to be installed and configured correctly on your pseudo-distributed cluster.
To collect images from Twitter, you have to obtain your personal Twitter Developer credentials and insert them into a .txt file. You will find an example inside the project as FakeCredential.txt
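A minimal sketch of how such a credential file could be read and wired into Twitter4j (the property names and file layout below are assumptions; check FakeCredential.txt for the exact format the project expects):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;

import twitter4j.TwitterStream;
import twitter4j.TwitterStreamFactory;
import twitter4j.conf.ConfigurationBuilder;

// Sketch: read OAuth credentials from a key=value text file and build a
// Twitter4j stream instance. The property names here are assumptions;
// FakeCredential.txt shows the format the project actually uses.
public class CredentialLoader {
    public static TwitterStream fromFile(String path) throws IOException {
        Properties props = new Properties();
        props.load(Files.newBufferedReader(Paths.get(path)));

        ConfigurationBuilder cb = new ConfigurationBuilder()
                .setOAuthConsumerKey(props.getProperty("consumerKey"))
                .setOAuthConsumerSecret(props.getProperty("consumerSecret"))
                .setOAuthAccessToken(props.getProperty("accessToken"))
                .setOAuthAccessTokenSecret(props.getProperty("accessTokenSecret"));

        return new TwitterStreamFactory(cb.build()).getInstance();
    }
}
```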
Start in this order:
- Hadoop
- HBase
- Storm (it starts automatically from Eclipse in my case)
After that, you can execute the following in this order:
TwitterRealTimeImageProcessing.java
- Insert the keyword in the argument list.
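A sketch of how the keyword argument might drive the stream (not the project's actual spout; it only illustrates Twitter4j's filter API and how photo URLs can be pulled from matching tweets):

```java
import twitter4j.FilterQuery;
import twitter4j.MediaEntity;
import twitter4j.StallWarning;
import twitter4j.Status;
import twitter4j.StatusDeletionNotice;
import twitter4j.StatusListener;
import twitter4j.TwitterStream;

// Sketch: track a single keyword and print the URL of every photo attached
// to matching tweets. The real topology would forward these URLs onward
// instead of printing them.
public class KeywordImageListener implements StatusListener {

    @Override
    public void onStatus(Status status) {
        for (MediaEntity media : status.getMediaEntities()) {
            if ("photo".equals(media.getType())) {
                System.out.println(media.getMediaURL());
            }
        }
    }

    @Override public void onDeletionNotice(StatusDeletionNotice n) { }
    @Override public void onTrackLimitationNotice(int numberOfLimitedStatuses) { }
    @Override public void onScrubGeo(long userId, long upToStatusId) { }
    @Override public void onStallWarning(StallWarning warning) { }
    @Override public void onException(Exception ex) { ex.printStackTrace(); }

    public static void start(TwitterStream stream, String keyword) {
        stream.addListener(new KeywordImageListener());
        stream.filter(new FilterQuery().track(keyword));  // keyword taken from the argument list
    }
}
```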
HadoopDriver.java
- Choose appropriate values for the number of centers, the threshold, and the file where the centers will be written.
- With the CEDD descriptor, we obtain a 144-dimensional feature vector.
- If you want to change the descriptor, also change FeatureExtractorCEDD.java and all its references.
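A sketch of the feature-extraction step with LIRE's CEDD descriptor (the class path and methods below match recent LIRE releases; verify them against the version bundled with the project):

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

import javax.imageio.ImageIO;

import net.semanticmetadata.lire.imageanalysis.features.global.CEDD;

// Sketch: extract the 144-dimensional CEDD feature vector for one image.
// Swapping the descriptor would mean replacing the CEDD class here and
// updating FeatureExtractorCEDD.java plus everything that assumes the
// vector length.
public class CeddExample {
    public static double[] extract(File imageFile) throws IOException {
        BufferedImage image = ImageIO.read(imageFile);
        CEDD cedd = new CEDD();
        cedd.extract(image);               // compute the descriptor
        return cedd.getFeatureVector();    // 144 doubles for CEDD
    }
}
```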
At the end of K-Means, an HTML page will show the obtained results.
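One way the "most representative" images could be picked and rendered is sketched below (a hypothetical helper, not the project's actual results page): for each final center, select the image whose feature vector is closest and emit it into a simple HTML gallery.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;
import java.util.Map;

// Sketch: for every final K-Means center pick the image whose feature vector
// lies closest to it, then write a minimal HTML gallery of those images.
// The vectors map (image URL -> CEDD vector) is assumed to come from the
// earlier extraction step.
public class ResultsPage {

    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void write(List<double[]> centers,
                             Map<String, double[]> vectors,
                             String outputPath) throws IOException {
        try (PrintWriter out = new PrintWriter(outputPath)) {
            out.println("<html><body><h1>Representative images</h1>");
            for (double[] center : centers) {
                String bestUrl = null;
                double bestDist = Double.MAX_VALUE;
                for (Map.Entry<String, double[]> e : vectors.entrySet()) {
                    double dist = euclidean(center, e.getValue());
                    if (dist < bestDist) {
                        bestDist = dist;
                        bestUrl = e.getKey();
                    }
                }
                if (bestUrl != null) {
                    out.println("<img src=\"" + bestUrl + "\" width=\"300\">");
                }
            }
            out.println("</body></html>");
        }
    }
}
```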
This whole project was executed on Ubuntu 20.04.2 LTS