Traffy Fondue Clustering Project

Created by

  • 6330440921 Bhuribhat Ratanasanguanvongs
  • 6330301321 Panithi Khamwangyang
  • 6330305921 Pras Pitasawad

Description

This project is the final project for the Data Science and Data Engineering course. The goal is to apply clustering techniques to the Traffy Fondue dataset to uncover interesting insights about Thailand.

We will be using Python and popular data science libraries such as Pandas, Scikit-Learn, Folium, and Matplotlib to preprocess the data, perform the clustering analysis, and visualize the results.

Once we have identified the clusters, we will use visualization techniques to explore the relationships between the clusters. The goal is to uncover interesting patterns and insights about traffic in Thailand that could be used to inform policy decisions.

By the end of this project, we will have gained experience in applying clustering techniques to real-world datasets, using data visualization tools to communicate insights effectively, and working with common data science libraries in Python. We hope that our findings will contribute to a better understanding of low-income areas in Thailand and inspire further research on this topic.

Initializing Environment

If you are using Linux, follow these steps:

>> mkdir ./dags ./logs ./plugins
>> echo -e  "AIRFLOW_UID=$(id -u)\nAIRFLOW_GID=0" > .env

Note: On other operating systems, you may get a warning that AIRFLOW_UID is not set, but you can safely ignore it. To get rid of the warning, you can also manually create a .env file in the same folder as docker-compose.yaml with this content:

AIRFLOW_UID=50000

The directories created above are used as follows:

  • ./dags - you can put your DAG files here.
  • ./logs - contains logs from task execution and scheduler.
  • ./plugins - you can put your custom plugins here.
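
On systems where the shell commands above are not available, the same setup can be sketched in Python. This is a minimal sketch, not part of the project: the function name `init_airflow_env` is hypothetical, and it only writes the `AIRFLOW_UID=50000` line mentioned in the note.

```python
from pathlib import Path


def init_airflow_env(project_dir, uid=50000):
    """Create the Airflow folders and a minimal .env file under project_dir.

    Hypothetical helper: mirrors the mkdir/echo steps for non-Linux systems.
    """
    root = Path(project_dir)
    # Same three folders as `mkdir ./dags ./logs ./plugins`.
    for name in ("dags", "logs", "plugins"):
        (root / name).mkdir(parents=True, exist_ok=True)
    # Write the .env file next to docker-compose.yaml.
    env_file = root / ".env"
    env_file.write_text(f"AIRFLOW_UID={uid}\n", encoding="utf-8")
    return env_file
```

Calling `init_airflow_env(".")` from the folder that contains docker-compose.yaml creates the three folders and the .env file in place.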

Run Airflow

>> docker-compose up airflow-init   # run database and create first user account
>> docker-compose up -d             # run container in background

Run docker ps to check the status of the containers and make sure that none of them is unhealthy:

>> docker ps
CONTAINER ID   IMAGE                  COMMAND                  CREATED          STATUS                    PORTS                              NAMES
247ebe6cf87a   apache/airflow:2.6.0   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes (healthy)    8080/tcp                           compose_airflow-worker_1
ed9b09fc84b1   apache/airflow:2.6.0   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes (healthy)    8080/tcp                           compose_airflow-scheduler_1
7cb1fb603a98   apache/airflow:2.6.0   "/usr/bin/dumb-init …"   3 minutes ago    Up 3 minutes (healthy)    0.0.0.0:8080->8080/tcp             compose_airflow-webserver_1
74f3bbe506eb   postgres:13            "docker-entrypoint.s…"   18 minutes ago   Up 17 minutes (healthy)   5432/tcp                           compose_postgres_1
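
The health check can also be scripted. The sketch below is an illustration rather than part of the project: it assumes `docker` is on the PATH, uses the `--format` option of `docker ps` to print one `name<TAB>status` pair per line, and the helper names are hypothetical.

```python
import subprocess


def unhealthy_containers(ps_output):
    """Return the names of containers whose status contains '(unhealthy)'.

    ps_output: text with one 'name<TAB>status' pair per line.
    """
    bad = []
    for line in ps_output.splitlines():
        if not line.strip():
            continue
        name, _, status = line.partition("\t")
        if "(unhealthy)" in status:
            bad.append(name)
    return bad


def docker_ps_status():
    """Run `docker ps` and return its name/status listing as text."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}\t{{.Status}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

With the containers running, `unhealthy_containers(docker_ps_status())` should return an empty list when every container reports a healthy status.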

Trigger an Airflow DAG run, passing the task parameters in JSON format (Airflow 2.x CLI syntax, matching the apache/airflow:2.6.0 image):

>> airflow dags trigger 'dag_name' -r 'run_id' --conf '{"key":"value"}'

Example: --conf '{"filter":["ถนน","ทางเท้า"]}' (ถนน = road, ทางเท้า = footpath). Alternatively, use "Trigger DAG w/ config" in the Airflow UI to filter the data.
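
Quoting a JSON value with Thai text by hand is error-prone, so the command line can be generated instead. The sketch below assumes the Airflow 2.x CLI form `airflow dags trigger`. The helper name `trigger_command` and the run id `manual_run_1` are hypothetical, and `fondue_dag` is the DAG name used in the REST example in this README.

```python
import json
import shlex


def trigger_command(dag_id, run_id, conf):
    """Build a shell-safe `airflow dags trigger` command line.

    conf is a dict that is serialized to JSON for the --conf option.
    """
    # ensure_ascii=False keeps Thai filter values readable in the command.
    conf_json = json.dumps(conf, ensure_ascii=False)
    args = ["airflow", "dags", "trigger", dag_id, "-r", run_id, "--conf", conf_json]
    # shlex.join quotes each argument for a POSIX shell.
    return shlex.join(args)


command = trigger_command("fondue_dag", "manual_run_1", {"filter": ["ถนน", "ทางเท้า"]})
print(command)
```

The printed command can be pasted into a shell that has access to the Airflow CLI.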

Open Airflow UI

The webserver is available at: http://localhost:8080
The default account has the login airflow and the password airflow.

Close Airflow Docker

>> docker-compose down -v

MLflow UI

  • The webserver is available at: http://localhost:6543
>> mlflow ui --port 6543
  • Create a REST API locally with MLflow serving (optional):
>> mlflow models serve --model-uri runs:/run_id/model --port 1244
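
Once the model is served, it can be queried over HTTP. The sketch below is an assumption-heavy illustration: it assumes MLflow 2.x, whose scoring server accepts a JSON body with a `dataframe_split` key on the `/invocations` endpoint, and the column names `latitude` and `longitude` are hypothetical, since the model's input schema is not described here.

```python
import json
import urllib.request


def build_invocation_request(columns, rows, port=1244):
    """Build a POST request for a locally served MLflow model.

    Assumes the MLflow 2.x `dataframe_split` payload format.
    """
    payload = {"dataframe_split": {"columns": columns, "data": rows}}
    return urllib.request.Request(
        f"http://localhost:{port}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def predict(columns, rows):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_invocation_request(columns, rows)) as resp:
        return json.load(resp)
```

For example, `predict(["latitude", "longitude"], [[13.75, 100.5]])` would send one row to the server started on port 1244, provided the model actually expects those columns.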

Interactive Dashboard

The webserver is available at: http://localhost:8501

>> streamlit run streamlit_app.py

Interesting Insight with Low-Income Heatmap

Send a POST request (for example via Postman) to http://localhost:8080/api/v1/dags/fondue_dag/dagRuns with the following JSON body (the filter value ความสะอาด means "cleanliness"):

{
    "conf":{
        "filter":["ความสะอาด"]
    }
}
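
The same request can be sent from Python instead of Postman. This is a minimal sketch using only the standard library. It assumes that the Airflow REST API has basic authentication enabled and that the default airflow / airflow account is used. The helper names are hypothetical.

```python
import base64
import json
import urllib.request

AIRFLOW_URL = "http://localhost:8080/api/v1/dags/fondue_dag/dagRuns"


def build_dag_run_request(filters, user="airflow", password="airflow"):
    """Build the POST request that triggers fondue_dag with a filter list."""
    body = json.dumps({"conf": {"filter": filters}}).encode("utf-8")
    # Assumption: the API accepts HTTP basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        AIRFLOW_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def trigger_dag_run(filters):
    """Send the request and return Airflow's JSON description of the new run."""
    with urllib.request.urlopen(build_dag_run_request(filters)) as resp:
        return json.load(resp)
```

Calling `trigger_dag_run(["ความสะอาด"])` while the containers are running should start a new DAG run with that filter.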
