Data Engineering Workshop 3

A one-day workshop on web scraping, extractors, and debugging a Python program.

What will you learn by the end of this workshop?

  • You will learn how to scrape a website using Python.
  • You will learn how to save the scraped data in a database.
  • You will learn how to run date range and incremental extractors.
  • You will learn how to debug a Python program.
  • You will learn time profiling and memory profiling.
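To make the difference between the two extractor types concrete, here is a minimal sketch, not the workshop's actual code. It assumes a date range extractor pulls every record whose date falls inside a fixed window, while an incremental extractor pulls only records newer than a saved checkpoint. The names `RECORDS`, `extract_date_range`, and `extract_incremental` are hypothetical, and the in-memory list stands in for scraped data.

```python
from datetime import date

# Hypothetical stand-in for scraped rows; a real extractor would read
# from the scraped site or from the database.
RECORDS = [
    {"id": 1, "published": date(2024, 1, 1)},
    {"id": 2, "published": date(2024, 1, 2)},
    {"id": 3, "published": date(2024, 1, 3)},
    {"id": 4, "published": date(2024, 1, 4)},
]


def extract_date_range(records, start, end):
    """Return every record with start <= published <= end."""
    return [r for r in records if start <= r["published"] <= end]


def extract_incremental(records, last_seen):
    """Return records newer than last_seen, plus the updated checkpoint.

    last_seen is None on the very first run, so everything is extracted.
    """
    new = [r for r in records if last_seen is None or r["published"] > last_seen]
    # If nothing new arrived, the checkpoint stays where it was.
    checkpoint = max((r["published"] for r in new), default=last_seen)
    return new, checkpoint


# Date range run: a fixed window, repeatable with the same result.
ranged = extract_date_range(RECORDS, date(2024, 1, 2), date(2024, 1, 3))

# Incremental runs: the second run finds nothing new.
first_batch, checkpoint = extract_incremental(RECORDS, None)
second_batch, checkpoint_after = extract_incremental(RECORDS, checkpoint)
```

In a real project the checkpoint would be persisted between runs (for example, in a database table) so that each run resumes from where the previous one stopped.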

Schedule

Time           Topics
09:00 - 10:00  Web scraping using Python
10:00 - 12:00  Storing the scraped data in a Postgres DB
12:00 - 01:00  Creating a Django view to integrate the script
01:00 - 02:00  Break
02:00 - 03:00  Creating date range and incremental extractors
03:00 - 04:30  Python debugging and profiling
04:30 - 04:45  Q & A
04:45 - 05:00  Wrapping up
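As a preview of the profiling topic, the sketch below shows one possible way to measure time and memory using only the standard library (`cProfile`, `pstats`, and `tracemalloc`). The workshop may use different tools. `build_squares`, `profile_time`, and `profile_memory` are hypothetical names chosen for illustration.

```python
import cProfile
import io
import pstats
import tracemalloc


def build_squares(n):
    """Toy workload: build a list of n squares."""
    return [i * i for i in range(n)]


def profile_time(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    result = func(*args)
    profiler.disable()
    stream = io.StringIO()
    # Sort by cumulative time and keep only the top five entries.
    pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


def profile_memory(func, *args):
    """Run func under tracemalloc and return (result, peak bytes allocated)."""
    tracemalloc.start()
    result = func(*args)
    _current, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, peak


squares, time_report = profile_time(build_squares, 10_000)
_, peak_bytes = profile_memory(build_squares, 10_000)
print(time_report)
print(f"peak memory: {peak_bytes} bytes")
```

The time report lists function calls with their call counts and cumulative times. The memory figure is the peak number of bytes allocated by Python while the function ran.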

Things to Note:

  1. Make sure you have completed Workshop 2 and have the complete working project that was built in it.

  2. If the working project is not ready, you may copy the myworld project from the DataEngineering-Workshop2 repository, which you cloned for the previous workshop, and paste it into the directory you are going to work in for this workshop.

  3. You will have to clone the DataEngineering-Workshop3 repository for today's workshop. Make sure to create a new folder outside that repository and work in it, instead of making your changes directly in DataEngineering-Workshop3.

Contributors

ashwathyk-uc, pshenoy-uc
