Crowdfunding_ETL

ETL mini Project 2

Extract, Transform, Load (ETL):

An ETL mini project in which I applied ETL skills to a large dataset. The data was extracted and transformed with Python, then loaded and queried with SQL.

Overview

The Crowdfunding dataset contains information on campaign contributors who made pledges to live projects. This project performs the ETL (extract, transform, and load) process on that dataset.

Technologies Used:

  • pgAdmin 4
  • Visual Studio Code
  • Jupyter Notebook
  • QuickDBD

Extract data from the crowdfunding.xlsx file

Create the Category and SubCategory DataFrames

  • Cleaned the DataFrame by splitting a single column into separate Category and SubCategory columns.

  • Used list comprehensions to create category_id/subcategory_id columns to identify the categories/subcategories.
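
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook code: the source column name "category & sub-category", the "/" separator, the sample rows, and the "cat1"/"subcat1" id format are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample rows; the real data comes from crowdfunding.xlsx.
# The column name and the "/" separator are assumptions.
df = pd.DataFrame({
    "category & sub-category": [
        "food/food trucks",
        "music/rock",
        "technology/web",
        "music/rock",
    ]
})

# Split the combined column into separate category and subcategory columns.
df[["category", "subcategory"]] = df["category & sub-category"].str.split(
    "/", n=1, expand=True
)

# Unique values, in order of first appearance.
categories = df["category"].unique()
subcategories = df["subcategory"].unique()

# List comprehensions that build one id per unique value
# (the "cat"/"subcat" prefixes are an assumed naming scheme).
category_ids = [f"cat{i}" for i in range(1, len(categories) + 1)]
subcategory_ids = [f"subcat{i}" for i in range(1, len(subcategories) + 1)]

category_df = pd.DataFrame({"category_id": category_ids, "category": categories})
subcategory_df = pd.DataFrame(
    {"subcategory_id": subcategory_ids, "subcategory": subcategories}
)
```

Building the ids from the unique values keeps each lookup DataFrame at one row per category or subcategory, so the ids can later serve as keys.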

Campaign DataFrame

  • The "blurb" column, renamed to "description"
  • The "goal" column, converted to the float data type
  • The "pledged" column, converted to the float data type
  • The "launched_at" column, renamed to "launch_date" and with the UTC times converted to the datetime format
  • The "deadline" column, renamed to "end_date" and with the UTC times converted to the datetime format
  • A "category_id" column that contains the unique identification numbers matching those in the "category_id" column of the category DataFrame
  • A "subcategory_id" column that contains the unique identification numbers matching those in the "subcategory_id" column of the subcategory DataFrame
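
The column changes listed above could look something like the sketch below. It assumes that "launched_at" and "deadline" hold Unix timestamps in seconds and that the campaign data already has "category" and "subcategory" text columns to join on. The sample row and the lookup DataFrames are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical one-row sample; the real data comes from crowdfunding.xlsx.
campaign_df = pd.DataFrame({
    "blurb": ["Sample campaign"],
    "goal": [100],
    "pledged": [0],
    "launched_at": [1581573600],  # assumed: Unix seconds, UTC
    "deadline": [1614578400],     # assumed: Unix seconds, UTC
    "category": ["food"],
    "subcategory": ["food trucks"],
})

# Stand-ins for the category and subcategory DataFrames.
category_df = pd.DataFrame({"category_id": ["cat1"], "category": ["food"]})
subcategory_df = pd.DataFrame(
    {"subcategory_id": ["subcat1"], "subcategory": ["food trucks"]}
)

# Rename the columns.
campaign_df = campaign_df.rename(columns={
    "blurb": "description",
    "launched_at": "launch_date",
    "deadline": "end_date",
})

# Convert goal and pledged to the float data type.
campaign_df = campaign_df.astype({"goal": float, "pledged": float})

# Convert the UTC timestamps to datetime.
for col in ("launch_date", "end_date"):
    campaign_df[col] = pd.to_datetime(campaign_df[col], unit="s")

# Attach the matching ids, then drop the text columns they replace.
campaign_df = (
    campaign_df
    .merge(category_df, on="category", how="left")
    .merge(subcategory_df, on="subcategory", how="left")
    .drop(columns=["category", "subcategory"])
)
```

A left merge keeps every campaign row even if a category has no match, which makes missing ids visible as NaN instead of silently dropping rows.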

ERD

[Image: entity relationship diagram (PNG)]
