dataengineering-workshop1's Introduction

Data Engineering Workshop

A one-day workshop on Docker, web scraping, regular expressions, PostgreSQL, and Git.

Prerequisites

Any Linux machine or VM with the following packages installed:

  • Python 3.6 or above
  • docker
  • docker-compose
  • pip3
  • git (any recent version)

GitHub account

  • Create an account on GitHub (only if you do not already have one)
  • Fork this repository, then clone it to your machine
  • Refer to this guide to understand how to fork and clone

Docker

  • To install Docker, go to your cloned repository and run the following command:
  • sudo prerequisites/install_docker.sh

Workshop environment setup

  • Check that Git, Docker, and Docker Compose are installed on the system. Open a terminal and run the following commands:
    Command: $ git --version
    git version 2.25.1
    
    Command: $ docker --version
    Docker version 20.10.17, build 100c701
    
    Command: $ docker-compose --version
    docker-compose version 1.25.0, build 0a186604
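
The same three checks can also be run in one go from Python. The following is a minimal sketch, not part of the workshop repository: the function name `check_tools` and the `REQUIRED_TOOLS` mapping are hypothetical names chosen for illustration, and the script only assumes that each tool answers `--version` as shown in the commands above.

```python
import shutil
import subprocess

# Commands taken from the manual checks; each prints a one-line version string.
# REQUIRED_TOOLS and check_tools are hypothetical names used only in this sketch.
REQUIRED_TOOLS = {
    "git": ["git", "--version"],
    "docker": ["docker", "--version"],
    "docker-compose": ["docker-compose", "--version"],
}


def check_tools(tools=REQUIRED_TOOLS):
    """Return {tool name: version string, or None if the tool is not usable}."""
    results = {}
    for name, command in tools.items():
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(command[0]) is None:
            results[name] = None
            continue
        try:
            completed = subprocess.run(
                command,
                stdout=subprocess.PIPE,
                stderr=subprocess.STDOUT,
                universal_newlines=True,  # text output; works on Python 3.6
                timeout=10,
            )
        except (OSError, subprocess.TimeoutExpired):
            results[name] = None
            continue
        output = completed.stdout.strip()
        # Treat a non-zero exit code or empty output as "not usable".
        results[name] = output if completed.returncode == 0 and output else None
    return results


if __name__ == "__main__":
    for name, version in check_tools().items():
        print("{:<15} {}".format(name, version or "NOT FOUND"))
```

Any tool reported as NOT FOUND should be installed before the workshop starts.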
    
    

What will you learn by the end of this workshop?

  • You will learn how to build a Docker image and how to use it.
  • You will learn how to scrape a website using urllib/requests and Beautiful Soup.
  • You will learn regular expressions and how to work with them.
  • You will learn the key features of PostgreSQL.
  • You will learn how to dockerize your project.
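
To give a feel for the scraping and regular-expression topics, here is a small sketch of the general idea: parse HTML to pull out links, then use a regular expression to extract structured values from the text. It is an illustration only, under these assumptions: the HTML is a made-up sample inlined in the script (no network request is made), and the standard-library `html.parser` module is swapped in for Beautiful Soup so the example runs without installing anything. The names `LinkCollector`, `scrape`, and `PRICE_PATTERN` are hypothetical.

```python
import re
from html.parser import HTMLParser

# Hypothetical page content, inlined so the sketch runs without network access.
SAMPLE_HTML = """
<html><body>
  <h1>Workshop books</h1>
  <ul>
    <li class="book"><a href="/books/1">Learning Docker</a> <span>Price: $29.99</span></li>
    <li class="book"><a href="/books/2">PostgreSQL Basics</a> <span>Price: $35.00</span></li>
  </ul>
</body></html>
"""

# A dollar sign followed by digits, a dot, and exactly two digits; captures the number.
PRICE_PATTERN = re.compile(r"\$(\d+\.\d{2})")


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current_href = None
        self._text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._current_href = dict(attrs).get("href")
            self._text_parts = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a> tag.
        if self._current_href is not None:
            self._text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_href is not None:
            text = "".join(self._text_parts).strip()
            self.links.append((self._current_href, text))
            self._current_href = None


def scrape(html):
    """Return (list of (href, text) links, list of prices as floats)."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    prices = [float(p) for p in PRICE_PATTERN.findall(html)]
    return parser.links, prices


if __name__ == "__main__":
    links, prices = scrape(SAMPLE_HTML)
    for (href, title), price in zip(links, prices):
        print("{:<20} {:<10} {:.2f}".format(title, href, price))
```

With Beautiful Soup, the `LinkCollector` class would typically be replaced by a few lines that select the `a` tags directly; the regular-expression step stays the same.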

Schedule

Time           Topics
09:00 - 11:00  Introduction to Docker
11:00 - 01:00  Introduction to web scraping
01:00 - 02:00  Break
02:00 - 03:00  Dockerizing a project
03:00 - 04:00  Introduction to PostgreSQL
04:00 - 04:30  Introduction to GitHub
04:30 - 04:45  Q & A
04:45 - 05:00  Wrapping up

dataengineering-workshop1's People

Contributors

nasaf-uc, pshenoy-uc, shamiths-uc
