Real Time Traffic Prediction for E4 outside Kista

This service is the project of group 46 in the ID2223 Scalable Machine Learning and Deep Learning course at KTH Royal Institute of Technology.

This service is an automatically updated, machine-learning-based traffic prediction service for the E4 motorway entering Kista. It predicts how clear the E4 traffic is based on the time of day and the weather conditions on the road. The metric shown is the free-flow level, the ratio of the actual speed to the free-flow speed, where 0 means total congestion and 1 means completely free flow.
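As a minimal sketch of the metric described above, the free-flow level can be computed as follows. The function name and the clamping to the range [0, 1] are assumptions made for illustration; they are not taken from the project's code.

```python
def free_flow_level(current_speed: float, free_flow_speed: float) -> float:
    """Return the actual speed divided by the free-flow speed, clamped to [0, 1]."""
    if free_flow_speed <= 0:
        raise ValueError("free_flow_speed must be positive")
    ratio = current_speed / free_flow_speed
    # Assumption: clamp so that a measured speed above the free-flow speed maps to 1.
    return max(0.0, min(1.0, ratio))


# Made-up speeds in km/h, for illustration only.
print(free_flow_level(45.0, 90.0))  # 0.5
```

A value of 0.5 here would mean that vehicles move at half of the speed they would reach on an uncongested road.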

The service UI is built as a Hugging Face Space with the inference pipeline built in.

Feature Pipeline

The service captures real-time traffic data from the TomTom traffic API and hourly updated weather data from the nearest weather station through the SMHI API, and stores them as features in a Hugging Face dataset. The selected features are the retrieval timestamp, air temperature, wind speed, precipitation, visibility, the confidence of the traffic data and the computed free-flow level. To run automatically, the feature pipeline is deployed on Modal, where features are captured and stored every 10 minutes.
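The sketch below shows one way a single feature row could be assembled from an already parsed traffic payload and weather reading. It is not the project's actual pipeline code. The traffic keys (`flowSegmentData`, `currentSpeed`, `freeFlowSpeed`, `confidence`) are assumed to follow TomTom's Flow Segment Data response, while the weather keys, the function name and all values are hypothetical placeholders.

```python
from datetime import datetime, timezone


def build_feature_row(traffic: dict, weather: dict, retrieved_at: datetime) -> dict:
    """Combine one traffic reading and one weather reading into a feature row."""
    # Assumption: the traffic payload nests its fields under "flowSegmentData".
    segment = traffic["flowSegmentData"]
    return {
        "timestamp": retrieved_at.isoformat(),
        "air_temperature": weather["air_temperature"],  # hypothetical key
        "wind_speed": weather["wind_speed"],            # hypothetical key
        "precipitation": weather["precipitation"],      # hypothetical key
        "visibility": weather["visibility"],            # hypothetical key
        "confidence": segment["confidence"],
        "free_flow_level": segment["currentSpeed"] / segment["freeFlowSpeed"],
    }


# Made-up values, not real measurements.
traffic = {"flowSegmentData": {"currentSpeed": 72, "freeFlowSpeed": 90, "confidence": 1.0}}
weather = {"air_temperature": -2.5, "wind_speed": 3.1, "precipitation": 0.0, "visibility": 20000.0}
row = build_feature_row(traffic, weather, datetime(2023, 1, 5, 8, 0, tzinfo=timezone.utc))
print(row)
```

A scheduled job would call such a function every 10 minutes and append the resulting row to the stored dataset.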

Feature Correlation Analysis

According to the correlation analysis of the current data, temperature and time of day have the strongest correlation with the congestion level. Congestion occurs most often in the afternoon, which may be caused by commuting. In addition, the current data show a negative correlation between traffic flow and temperature: the higher the temperature, the less smoothly the traffic flows. A possible explanation is that people tend to drive less in cold weather (roads can be icy), which reduces congestion.

We believe that the confidence of the traffic data is an important feature. Since every confidence value obtained so far is 1 (perfect confidence), the current analysis shows no relationship between data confidence and traffic. We expect that more data will change this, so we retain the feature.
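To illustrate why a feature that never varies cannot show any correlation, here is a small hand-written Pearson correlation. It is only a sketch: the function name and the toy values are made up, and the project's own analysis may have been done differently.

```python
import math
from typing import Optional, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> Optional[float]:
    """Pearson correlation of two equal-length series; None if either is constant."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has zero variance, so the correlation is undefined.
        return None
    return cov / math.sqrt(var_x * var_y)


# Made-up values, for illustration only.
free_flow = [1.0, 0.9, 0.7, 0.6]
temperature = [-5.0, -2.0, 1.0, 3.0]
confidence = [1.0, 1.0, 1.0, 1.0]

print(pearson(temperature, free_flow))  # negative for this toy data
print(pearson(confidence, free_flow))   # None, because confidence never changes
```

With a confidence column that is always 1, the variance is zero, so no amount of analysis can relate it to traffic until varied values are collected.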

The feature pipeline can be found here.

Training Pipeline

Model Selection

Several machine learning models were compared, and AdaBoostRegressor was chosen because it showed the best performance in terms of RMSE and R² score, with an R² score of around 0.88.
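For reference, the two metrics used in the comparison follow the standard definitions sketched below. They are written out by hand so the example has no dependencies; scikit-learn offers equivalents in `sklearn.metrics`. The sample values are made up.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error: lower is better, 0 is a perfect fit."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 is perfect, 0 matches predicting the mean."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        # Happens when every target is identical, e.g. a free-flow level that is always 1.
        raise ValueError("R2 is undefined when all true values are identical")
    return 1.0 - ss_res / ss_tot


# Made-up free-flow levels and a predictor that always outputs their mean.
y_true = [1.0, 0.5]
y_pred = [0.75, 0.75]
print(rmse(y_true, y_pred))      # 0.25
print(r2_score(y_true, y_pred))  # 0.0
```

An R² score of 0.88 therefore means the chosen model explains most of the variance that a constant mean prediction would leave unexplained.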

We also experimented with Prophet, a time-series forecasting procedure that offers many optional parameters, such as custom changepoints and seasonality, and might be useful for this task. As it has not yielded good results so far, we did not proceed with it.

Training

The training pipeline retrieves the features stored on Hugging Face and retrains the model on the whole updated dataset, because scikit-learn's AdaBoostRegressor does not support incremental training through partial_fit.

The training pipeline is deployed on Modal and retrains the model automatically every day. The trained model is then uploaded to and stored on Hugging Face, where the Hugging Face Space can load it directly for inference.

Problems Encountered

Several problems were encountered during this project. Most of them stem from trying to collect, in a short time, a large amount of real-life data that generally varies over a period of a year.

Severely Imbalanced Dataset

The dataset is severely imbalanced. For the vast majority of the time, the E4 is far from saturated and the free-flow level stays at 1. This may be because the project was carried out during a rather short and unusual period at the end of the year, when many people are on holiday and traffic is light. We believe that, given a longer period of self-updating, the model will predict the traffic more accurately.
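A quick way to quantify this kind of imbalance is to measure how many samples sit at the maximum free-flow level. The helper below is a hypothetical sketch with made-up values, not part of the project's code.

```python
from typing import Sequence


def full_flow_share(levels: Sequence[float], threshold: float = 1.0) -> float:
    """Fraction of samples whose free-flow level is at or above `threshold`."""
    if not levels:
        raise ValueError("levels must not be empty")
    return sum(1 for level in levels if level >= threshold) / len(levels)


# Made-up free-flow levels, for illustration only.
levels = [1.0, 1.0, 1.0, 0.8, 1.0, 1.0, 0.6, 1.0, 1.0, 1.0]
print(full_flow_share(levels))  # 0.8
```

When this share is close to 1, a model that always predicts 1 already looks accurate, so error metrics should be read with that baseline in mind.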

Unchanged Weather

The weather data also mostly stayed the same at the beginning of the year. Our correlation analysis therefore showed that the weather contributed little to the variation in traffic, which does not seem plausible. We believe that this will also change as more feature data is collected.

Authors

Acknowledgements

Contributors

chenzhou98, tilosmsh