Project-forex-pipeline

Description

This is a simple pipeline that reads forex data from a CSV file and sends it to a Kafka server. From Kafka, the data is written to an S3 bucket, where it is stored. A Glue crawler crawls the data in the S3 bucket and creates a table in the Glue Data Catalog. Athena is then used to query the data in the S3 bucket.
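As a rough illustration of the last stage, the sketch below builds a preview query for the crawled table and submits it with the AWS CLI. The database, table, and results-bucket names are placeholders invented for this example (the real names depend on how the crawler was configured), and the function names are hypothetical too.

```shell
# Hypothetical names: replace with your own Glue database, table, and results bucket.
GLUE_DATABASE="forex_db"
GLUE_TABLE="forex_rates"
ATHENA_OUTPUT="s3://my-athena-results/forex/"

# Build a simple preview query for the crawled table (default: 10 rows).
preview_query() {
  printf 'SELECT * FROM "%s"."%s" LIMIT %s\n' "$GLUE_DATABASE" "$GLUE_TABLE" "${1:-10}"
}

# Submit the query with the AWS CLI. Requires configured AWS credentials.
# Defined here but not called, so sourcing this file has no side effects.
run_preview() {
  aws athena start-query-execution \
    --query-string "$(preview_query "${1:-10}")" \
    --query-execution-context "Database=$GLUE_DATABASE" \
    --result-configuration "OutputLocation=$ATHENA_OUTPUT"
}
```

Calling `run_preview 5` would submit a five-row preview; Athena writes the results to the S3 location given in `ATHENA_OUTPUT`.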

I would have used an API to get the data, but I didn't want to pay for one, so I used a CSV file instead.

Note:

  • Each time you stop and start the EC2 instance, you will need to update the IP address in the code, because the instance's public IP address changes
  • You will also have to edit config/server.properties (sudo nano config/server.properties) and set advertised.listeners to the EC2 instance's new IP address
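The second note can be scripted instead of edited by hand. The sketch below is one possible way to do it, assuming a PLAINTEXT listener on port 9092 (the port used in the commands in this README). The function name `set_advertised_listeners` is made up for illustration.

```shell
# Hypothetical helper: point advertised.listeners at the instance's current public IP.
# Assumes a PLAINTEXT listener on port 9092.
# Usage: set_advertised_listeners <path-to-server.properties> <public-ip>
set_advertised_listeners() {
  local file="$1" ip="$2" tmp
  tmp="$(mktemp)" || return 1
  if grep -Eq '^#?advertised\.listeners=' "$file"; then
    # Replace the existing line, whether or not it is commented out.
    sed -E "s|^#?advertised\.listeners=.*|advertised.listeners=PLAINTEXT://${ip}:9092|" "$file" > "$tmp"
  else
    # No such line yet: copy the file and append one.
    cat "$file" > "$tmp"
    printf 'advertised.listeners=PLAINTEXT://%s:9092\n' "$ip" >> "$tmp"
  fi
  # Write back in place so the original file keeps its owner and permissions.
  cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, `set_advertised_listeners config/server.properties 13.212.114.151` run from the Kafka directory. The broker reads this file at startup, so restart the Kafka server afterwards.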

Setup for Kafka

  1. Install Kafka on the EC2 instance (make sure you update the instance's security settings so that the Kafka port, 9092 in the commands below, is reachable)
  2. Open a new terminal, and start the ZooKeeper server
cd kafka_2.12-3.5.1
bin/zookeeper-server-start.sh config/zookeeper.properties
  3. Open a new terminal, and connect to the EC2 instance
  4. Allocate memory to the Kafka server
export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"
  5. Start the Kafka server
cd kafka_2.12-3.5.1
bin/kafka-server-start.sh config/server.properties
  6. Create a topic in another terminal
bin/kafka-topics.sh --create --topic test1 --bootstrap-server 13.212.114.151:9092 --replication-factor 1 --partitions 1
  7. Start a producer
bin/kafka-console-producer.sh --topic test1 --bootstrap-server 13.212.114.151:9092
  8. In a new terminal, start a consumer
bin/kafka-console-consumer.sh --topic test1 --bootstrap-server 13.212.114.151:9092
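Once the topic exists, a quick way to try the pipeline end to end is to feed CSV rows through the console producer, which reads one message per line from standard input. The sketch below is not the project's own producer code; it only shows the idea. The helper names and the file name `forex.csv` are assumptions, as is the choice to skip the header row.

```shell
# Assumption: the broker listens on port 9092, as in the commands above.
KAFKA_PORT=9092

# Print host:port for a given public IP, so the address lives in one place.
bootstrap_server() {
  printf '%s:%s\n' "$1" "$KAFKA_PORT"
}

# Print the data rows of a CSV file, skipping the first (header) line.
csv_rows() {
  tail -n +2 "$1"
}

# Usage, run from the Kafka directory (forex.csv is a placeholder file name):
#   csv_rows forex.csv | bin/kafka-console-producer.sh --topic test1 \
#     --bootstrap-server "$(bootstrap_server 13.212.114.151)"
```

Each CSV row becomes one Kafka message, and the console consumer from the last step should print the rows as they arrive.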
