tvkoushik / dbt-airbnb-analytics

Airbnb dbt Project

Introduction

This dbt project transforms Airbnb data into a series of models for analytics. It is organized into directories and files, each serving a specific purpose in the transformation workflow.

Project Structure

airbnb/
├── dbt_packages/
│   └── dbt_utils/
│       ├── integration_tests/
│       └── ...
├── logs/
├── models/
│   ├── dim/
│   ├── fct/
│   ├── mart/
│   ├── src/
│   └── ...
├── seeds/
├── snapshots/
├── target/
└── tests/

Directory and File Descriptions

  • dbt_packages/dbt_utils/: Contains utility functions and macros used in the project.
  • logs/: Stores log files generated during dbt runs.
  • models/: Contains the SQL models organized into various subdirectories:
    • dim/: Dimension tables.
    • fct/: Fact tables.
    • mart/: Data marts for specific analytical purposes.
    • src/: Source tables loaded from raw data.
  • seeds/: Contains seed files in CSV format.
  • snapshots/: Stores snapshot definitions for slowly changing dimensions.
  • target/: Stores compiled SQL files and run artifacts.
  • tests/: Contains SQL-based tests to validate the data models.

Getting Started

Prerequisites

  • dbt (data build tool) installed
  • A configured dbt profile with connection details to your data warehouse

Installation

  1. Clone the repository:

    git clone <repository_url>
    cd airbnb
  2. Install dependencies:

    dbt deps

Configuration

Ensure that your profiles.yml is configured with the appropriate connection details to your data warehouse.
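
As a rough sketch of what that configuration might look like, the function below writes a sample profile for a PostgreSQL warehouse into a directory of your choice. The profile name `airbnb`, the `dev` target, the host, database, and schema values, and the `DBT_USER` / `DBT_PASSWORD` environment variables are all assumptions for illustration. The profile name has to match the `profile:` entry in the project's `dbt_project.yml`, and other warehouses use a different `type` and different fields.

```shell
# Write a sample dbt profile into the given directory.
# Every value below is a placeholder, not taken from this project.
write_sample_profile() {
  local dir="$1"
  mkdir -p "$dir"
  cat > "$dir/profiles.yml" <<'EOF'
airbnb:                  # assumed profile name
  target: dev
  outputs:
    dev:
      type: postgres     # assumed adapter
      host: localhost
      port: 5432
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      dbname: airbnb
      schema: dev
      threads: 4
EOF
}
```

For example, `write_sample_profile ./sample_profile` followed by `dbt debug --profiles-dir ./sample_profile` would check whether dbt can connect with those settings.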

Running the Project

To run the models and generate the tables in your data warehouse, use:

dbt run

To run tests and validate the data models:

dbt test

Seed Data

To load the seed data into your data warehouse, use:

dbt seed

Snapshots

To create or update snapshots, use:

dbt snapshot
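
The commands above can be chained into a single helper. The sketch below is one plausible ordering, not something prescribed by this project; the `run_pipeline` function name and the `DBT` override variable are hypothetical conveniences.

```shell
# Run the dbt steps in one plausible order, stopping at the first failure.
# DBT can be overridden (e.g. DBT=echo) to preview the steps without running them.
run_pipeline() {
  for step in deps seed snapshot run test; do
    "${DBT:-dbt}" "$step" || return 1
  done
}
```

Calling `run_pipeline` runs each step in turn, while `DBT=echo run_pipeline` only prints the step names. As an alternative, `dbt build` runs seeds, snapshots, models, and tests together in dependency order.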

Models

Source Models

Source models are located in the models/src directory and include:

  • src_hosts.sql
  • src_listings.sql
  • src_reviews.sql

These models select from the raw source tables in the data warehouse and make that data available to the downstream models.

Dimension Models

Dimension models are located in the models/dim directory and include:

  • dim_hosts.sql
  • dim_listings.sql
  • dim_reviews.sql

These models transform the raw data into cleaned and structured dimension tables.

Fact Models

Fact models are located in the models/fct directory and include:

  • fct_reviews.sql

These models aggregate the data into fact tables suitable for analytical purposes.

Data Marts

Data marts are located in the models/mart directory and include:

  • mart_fullmoon_reviews.sql

These models are designed for specific analytical use cases.
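
Because the models are grouped by directory, dbt's node selection can run one layer at a time, or one model together with everything it depends on. The helpers below are a minimal sketch; the function names and the `DBT` override variable are hypothetical, while `--select`, the `path:` method, and the `+` prefix are standard dbt selection syntax.

```shell
# Run only the models under one subdirectory of models/ (src, dim, fct, or mart).
run_layer() {
  "${DBT:-dbt}" run --select "path:models/$1"
}

# Run one model together with everything upstream of it (the leading "+").
run_with_parents() {
  "${DBT:-dbt}" run --select "+$1"
}
```

For instance, `run_layer dim` rebuilds only the dimension models, and `run_with_parents mart_fullmoon_reviews` rebuilds the mart along with the models it selects from.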

Tests

Tests are located in the tests directory and include:

  • consistent_created_at.sql
  • dim_listings_minumum_nights.sql
  • no_nulls_in_dim_listings.sql

These tests ensure the integrity and quality of the data models.
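
SQL files in the tests directory are what dbt calls singular tests: each file is a query that selects the rows violating a rule, and the test passes when the query returns no rows. The sketch below writes a hypothetical example of such a file. The file name and the `minimum_nights` column are assumptions for illustration and are not copied from this project's tests.

```shell
# Write a hypothetical singular test into the given directory (default: tests).
write_example_test() {
  local dir="${1:-tests}"
  mkdir -p "$dir"
  cat > "$dir/example_minimum_nights_positive.sql" <<'EOF'
-- Hypothetical singular test: returns the rows that violate the rule,
-- so the test passes only when the result set is empty.
-- The minimum_nights column is an assumption about dim_listings.
select *
from {{ ref('dim_listings') }}
where minimum_nights < 1
EOF
}
```

After `write_example_test`, running `dbt test --select example_minimum_nights_positive` would execute only that test.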

Snapshots

Snapshots are located in the snapshots directory and include:

  • scd_raw_listings.sql

These snapshots capture changes in the data over time, allowing for historical analysis.
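
To give a sense of what a snapshot definition can look like, the sketch below writes a hypothetical one using dbt's timestamp strategy. On each `dbt snapshot` run, dbt compares the current rows with the stored ones and records changes using `dbt_valid_from` and `dbt_valid_to` columns. The snapshot name, the `dev` target schema, the `id` and `updated_at` columns, and the `airbnb.listings` source are all assumptions, not the contents of `scd_raw_listings.sql`.

```shell
# Write a hypothetical snapshot definition into the given directory (default: snapshots).
write_example_snapshot() {
  local dir="${1:-snapshots}"
  mkdir -p "$dir"
  cat > "$dir/example_scd_listings.sql" <<'EOF'
{% snapshot example_scd_listings %}
{{
  config(
    target_schema='dev',
    unique_key='id',
    strategy='timestamp',
    updated_at='updated_at'
  )
}}
-- Source name, table name, and columns are hypothetical.
select * from {{ source('airbnb', 'listings') }}
{% endsnapshot %}
EOF
}
```

The timestamp strategy relies on a column that changes whenever a row is updated; when no such column exists, dbt's `check` strategy compares a list of columns instead.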

Logs and Targets

  • logs/: Contains logs of the dbt runs.
  • target/: Contains compiled SQL files and run artifacts.
