
udacity_data_engineering_projects's Introduction

UDACITY Data Engineering Project

Docker compose up

The docker-compose.yml file defines the containers for PostgreSQL and its admin interface. From the folder that contains it, run:

docker compose up -d

Docker Load data

A file with the crawled data, data_modelling.sql, is provided. Load studentdb with the following command:

cat data_modelling.sql | docker exec -i pg_container psql -U postgres
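
The same load can be driven from Python, which may help on systems where the shell pipe is inconvenient. This is only a sketch under stated assumptions: the function names build_load_command and load_sql are hypothetical, and the ON_ERROR_STOP setting is an addition that the original command does not use. It makes psql stop at the first SQL error instead of continuing through the rest of the file.

```python
import subprocess


def build_load_command(container="pg_container", user="postgres", stop_on_error=True):
    """Build the `docker exec ... psql` argument list used to load a SQL file."""
    # -i keeps standard input open so the SQL can be streamed into the container.
    cmd = ["docker", "exec", "-i", container, "psql", "-U", user]
    if stop_on_error:
        # Not in the original command: psql exits at the first SQL error
        # instead of carrying on with the remaining statements.
        cmd += ["-v", "ON_ERROR_STOP=1"]
    return cmd


def load_sql(path, **options):
    """Stream the SQL file at `path` to psql's standard input."""
    with open(path, "rb") as sql_file:
        # stdin=sql_file plays the role of `cat file |` in the shell version.
        # check=True raises CalledProcessError if psql exits with an error.
        subprocess.run(build_load_command(**options), stdin=sql_file, check=True)
```

With the containers running, calling load_sql("data_modelling.sql") should behave like the shell command above.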

Recreate the project

1) PIP

You can set up the project with either pip or Poetry. With pip, install the packages listed in the requirements file:

pip install -r requirements.txt

2) Poetry

Activate Virtual Environment

Poetry manages the virtual environment. This project was developed with Python 3.10. To install and switch between Python versions, install pyenv and run:

pyenv install 3.10
pyenv local 3.10  # use Python 3.10 in this folder

With Poetry installed, open a terminal in the folder that contains pyproject.toml. The commands in this section read that file and install the modules pinned in poetry.lock.

After installing Python 3.10, activate the Poetry virtual environment from the folder that contains the files mentioned above:

poetry shell

Once the environment is active, install the packages:

poetry install

Run Project

CREATE TABLES

The first step is to create the tables. Run the following script:

poetry run python create_tables.py

CREATE ETLs

To run the ETL, execute the "etl.py" script:

poetry run python etl.py
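
The order of the two steps matters, since the ETL loads into tables that create_tables.py is expected to have created. A minimal sketch of a runner that enforces that order and stops at the first failure is shown here. The names build_commands and run_pipeline are hypothetical and not part of the project, and the sketch assumes it is started from the project folder inside the Poetry environment.

```python
import subprocess
import sys

# Run order: the tables must exist before the ETL loads data into them.
SCRIPTS = ["create_tables.py", "etl.py"]


def build_commands(scripts=SCRIPTS, python=sys.executable):
    """Return one command (argument list) per script, in run order."""
    return [[python, script] for script in scripts]


def run_pipeline(commands, dry_run=False):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        print("running:", " ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so etl.py never runs if create_tables.py failed.
            subprocess.run(cmd, check=True)
```

Calling run_pipeline(build_commands()) runs both scripts with the current interpreter. Passing dry_run=True only prints the commands, which is a cheap way to check the order before touching the database.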

SANITY TESTS

To run the basic tests that check the results, run the notebook named "test.ipynb".
