The system will predict the Global Horizontal Irradiance (GHI) by analyzing various physical data. This prediction aims at optimizing solar energy systems.

Home Page: http://15.161.48.19

License: MIT License


language: en
tags: regression, prediction, machine_learning, linear_regression, xgbooster, random_forest, k_nearest_neighbors
datasets: national_solar_radiation_database
metrics: R2, RMSE


GHI PREDICTION

System Description:

The system predicts the Global Horizontal Irradiance (GHI), the amount of solar energy from the sun reaching a specific location.
It does this by analyzing various physical data, including temperature, humidity, and Direct Normal Irradiance (DNI). Information about our models can be found here.
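As a minimal, self-contained sketch of the idea (not the project's code, which trains linear regression, XGBoost, random forest, and k-NN models on NSRDB data), ordinary least squares can already map such physical features to GHI. Here with synthetic data and NumPy only:

```python
import numpy as np

# Toy example only: synthetic physical features (temperature, humidity, DNI)
# standing in for the real NSRDB data used by the project.
rng = np.random.default_rng(0)
n = 200
temperature = rng.uniform(0, 35, n)   # degrees Celsius
humidity = rng.uniform(10, 90, n)     # percent
dni = rng.uniform(0, 900, n)          # W/m^2

# Synthetic target: GHI as a noisy linear combination of the features.
ghi = 0.8 * dni + 2.0 * temperature - 0.5 * humidity + rng.normal(0, 10, n)

# Fit ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), temperature, humidity, dni])
coef, *_ = np.linalg.lstsq(X, ghi, rcond=None)

# R2 on the training data, one of the metrics the project reports.
pred = X @ coef
r2 = 1 - np.sum((ghi - pred) ** 2) / np.sum((ghi - ghi.mean()) ** 2)
print(f"R2 on training data: {r2:.3f}")
```

The real models add proper train/test splitting and hyperparameter tuning; this only illustrates the features-to-GHI regression setup.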

Data card:

Information about our data can be found here.

Contributors:

Belonging to the organization se4ai2324-uniba.

How to

To use the system, we suggest the following steps:

  1. Depending on your OS, create and activate a Python environment:

    python3 -m venv name_of_your_env
    WINDOWS USERS:  call name_of_your_env/Scripts/activate
    MACOS USERS:    source name_of_your_env/bin/activate
    
  2. Install requirements:

    pip install -r requirements.txt
    
  3. Open the mlflow server:

    mlflow ui
    
  4. Start the DVC pipeline.
    To let the system download the data files, you must first request access to this Google Drive repository:

    dvc pull
    dvc repro
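The `dvc repro` command replays the pipeline stages, which roughly correspond to the scripts under `src/`. A hedged sketch of what a `dvc.yaml` for this layout could look like (stage names, `deps`, and `outs` are assumptions, not the project's actual pipeline file):

```yaml
stages:
  make_dataset:
    cmd: python src/data/make_dataset.py
    deps:
      - data/raw
    outs:
      - data/interim
  preprocessing:
    cmd: python src/data/preprocessing.py
    deps:
      - data/interim
    outs:
      - data/processed
  split_dataset:
    cmd: python src/data/split_dataset.py
    deps:
      - data/processed
  train:
    cmd: python src/models/train_model.py
```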
    

Testing

The project has been tested with pytest and Great Expectations.
To run these tools on the project, type in your terminal:

pytest *path_of_the_module_containing_your_testing_functions*
 - and/or - 
python src/data/gx_test.py 
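As an illustration of the pytest style used, here is a hypothetical, self-contained test; the function and assertions are invented for the example, while the project's real tests live under `src/test/`:

```python
# Hypothetical pytest-style test; illustrative only, not taken from the
# project's src/test/ modules.
import math

def normalize(values):
    """Toy stand-in for a preprocessing step: min-max scale to [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def test_normalize_bounds():
    scaled = normalize([3.0, 7.0, 11.0])
    assert math.isclose(min(scaled), 0.0)
    assert math.isclose(max(scaled), 1.0)
```

Running `pytest` on a file like this discovers every `test_*` function automatically.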

Code quality has been assessed with Pylint, achieving an average score of 8.3/10 on the non-autogenerated modules.
You can check the code quality with the command:

pylint *path_of_the_module_or_folder_you_want_to_check* 

APIs (local)

The project also incorporates a module that implements a set of APIs.
To try them out, run the uvicorn server by launching the api.py module with the command:

python src/app/api.py 

The server will be accessible at http://127.0.0.1:8000.
You can also interact with the APIs through the Swagger interface by appending "/docs" to the localhost address: http://127.0.0.1:8000/docs.
Alternatively, you can explore the automatically generated ReDoc documentation by appending "/redoc" instead: http://127.0.0.1:8000/redoc.
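An API like this typically validates request payloads before predicting (the project has a `src/app/schema.py` for this purpose). As a dependency-free sketch of that kind of validation, where the field names and ranges are assumptions rather than the project's actual schema:

```python
# Hypothetical request schema; field names and ranges are assumptions,
# not the project's actual src/app/schema.py.
from dataclasses import dataclass

@dataclass
class PredictionRequest:
    temperature: float   # degrees Celsius
    humidity: float      # percent, must lie in [0, 100]
    dni: float           # W/m^2, must be non-negative

    def __post_init__(self):
        if not 0.0 <= self.humidity <= 100.0:
            raise ValueError("humidity must be a percentage in [0, 100]")
        if self.dni < 0.0:
            raise ValueError("DNI cannot be negative")

# A well-formed request passes validation at construction time.
req = PredictionRequest(temperature=21.5, humidity=40.0, dni=550.0)
print(req)
```

In the real service, a library such as pydantic would perform this validation and return an HTTP 422 response on failure.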

As with the other functions in the project, the APIs have also been tested with pytest and assessed with Pylint (average score of 8.2/10).

Orchestration

Our project includes a Dockerfile and a docker-compose.yaml.
Our configuration involved creating a straightforward Dockerfile, enabling the generation of an image for a Docker container.
A second Docker container was then created to handle the front-end of the application. We also developed a Docker Compose file to serve as a service orchestrator, managing our current containers and any future ones.

To start the orchestrated services, you can use the command:

docker compose up
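As a rough illustration of the two-container setup described above, a docker-compose.yaml might look like the following; the service names, build contexts, and ports are assumptions, not the project's actual file:

```yaml
services:
  backend:
    build: .            # Dockerfile at the repository root
    ports:
      - "8000:8000"     # uvicorn API server
  frontend:
    build: ./ghi-predictor   # frontend application folder
    ports:
      - "80:80"
    depends_on:
      - backend
```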

GitHub Actions

Our project defines four different GitHub Actions, specifically:

  • Pylint Action: checks the non-autogenerated files for code correctness. It is triggered with every push, across all directories.
  • PreProcessing Action: uses a GitHub Secret that contains the credentials for accessing the remote repository (in this case, Google Drive) through DVC. This action replicates every step of our preprocessing pipeline, spanning from creating the dataset and applying preprocessing steps to splitting it for the training phase. It can be triggered only manually.
  • Alibi Action: like the previous one, this action replicates every step of our pipeline and then runs the Alibi module, allowing the user to check whether drift is present in the data. In particular, the output of the Alibi module can be inspected directly in the terminal of the GitHub Action itself. It can be triggered only manually.
  • Tests Action: uses a GitHub Secret that contains the credentials for accessing the remote repository (in this case, Google Drive) through DVC. This action replicates every test of our pipeline. It can be triggered only manually.

Code Carbon

We have integrated Code Carbon to monitor and assess the environmental impact of our model results, providing valuable insights into sustainability. You can access the detailed results in the associated model card.

Deployment & Monitoring

  • AWS deployment (nginx)
    Our app is up and running at 15.161.48.19. It has been deployed using AWS.
    The deployment involves a Linux machine hosted on a European server with 2GB of RAM (extended to 4GB using local storage) and 28GB of storage.
    We utilize nginx for serving the application. It acts as a proxy server between the user and our application.

  • Uptime
    We ensure continuous availability and performance monitoring of our deployed application through Uptime (Better Stack).

  • Prometheus
    Prometheus has been installed locally and run with the command `prometheus --config.file=prometheus.yml`.
    This starts the Prometheus server, which collects the metrics needed for Grafana visualization.

  • Locust
    To simulate web traffic and gather additional data for Grafana, we use Locust.
    The Locust web UI is available at http://localhost:8089/ after starting it with the command `locust`.

  • Grafana
    For comprehensive visualization of data generated by Locust and Prometheus, we use Grafana, which is locally installed. This allows us to customize and explore the metrics dashboards to gain insights into the performance of our application.

  • Alibi
    With Alibi we also keep data drift under control, checking whenever necessary whether the data the model predicts on has drifted.

For more detailed information you can consult the monitoring report in the report folder.
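For intuition about what a drift check does, here is a toy mean-shift heuristic in NumPy. The project itself relies on Alibi's detectors, so this is purely illustrative:

```python
import numpy as np

# Toy illustration of data-drift checking. The project uses Alibi's
# detectors; this simple mean-shift heuristic only conveys the idea.
rng = np.random.default_rng(1)
reference = rng.normal(loc=500.0, scale=50.0, size=1000)  # training-like GHI values
incoming = rng.normal(loc=650.0, scale=50.0, size=200)    # shifted "production" data

def mean_shift_drift(ref, new, threshold=3.0):
    """Flag drift when the new mean deviates from the reference mean
    by more than `threshold` standard errors."""
    se = ref.std(ddof=1) / np.sqrt(len(new))
    z = abs(new.mean() - ref.mean()) / se
    return bool(z > threshold)

print("drift detected:", mean_shift_drift(reference, incoming))
```

Alibi's detectors apply proper statistical tests (e.g. Kolmogorov-Smirnov) over all features rather than a single mean comparison.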

Project Organization

├── LICENSE
├── Makefile           <- Makefile with commands like `make data` or `make train`
├── README.md          <- The top-level README for developers using this project.
├── github             <- folder containing all the github actions
│   ├── data_drift.yaml        
│   ├── tests.yaml 
│   ├── pipeline.yaml
│   └── pylint_module.yaml     
├── data
│   ├── external       <- Data from third-party sources.
│   │   ├── user_requestes.csv
│   │   └── drift_results.txt
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details
│ 
├── ghi-predictor      <- Frontend application
│
├── models             <- Trained and serialized models, model predictions, or model summaries
│
├── notebooks          <- Jupyter notebooks. Naming convention is a number (for ordering),
│                         the creator's initials, and a short `-` delimited description, e.g.
│                         `1.0-jqp-initial-data-exploration`.
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            <- Generated analysis as HTML, PDF, LaTeX, etc.
│   ├── figures        <- Generated graphics and figures to be used in reporting
│   ├── pylint         <- Generated report for code quality assurance
│   ├── monitor_report <- Generated report for monitoring
│   ├── report_locust  <- Generated report for locust
│   └── codcarbon      <- Generated reports for codecarbon
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`
├── requirements_docker.txt   <- The requirements file for reproducing the docker analysis environment
│
├── setup.py           <- makes project pip installable (pip install -e .) so src can be imported
├── src                <- Source code for use in this project.
│   ├── __init__.py    <- Makes src a Python module
│   │
│   ├── app            <- Scripts to generate APIs
│   │   ├── api.py
│   │   ├── schema.py
│   │   └── test.py
│   ├── data           <- Scripts to download or generate data
│   │   ├── gx_test.py
│   │   ├── make_dataset.py
│   │   ├── preprocessing.py
│   │   └── split_dataset.py
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling
│   │   └── build_features.py
│   │
│   ├── models         <- Scripts to train models and then use trained models to make
│   │   │                 predictions
│   │   ├── alibi      <- Script to check data drifting
│   │   │   └── alibi.py
│   │   ├── compare.py
│   │   ├── knr.py
│   │   ├── linear_regressor.py
│   │   ├── random_forest_regressor.py
│   │   ├── train_model.py       
│   │   └── xgbooster.py
│   │ 
│   ├── test          <- Scripts to test modules
│   │   ├── compare_test.py
│   │   ├── preprocessing_test.py  
│   │   └── training_test.py
│   │
│   └── visualization  <- Scripts to create exploratory and results oriented visualizations
│       └── visualize.py
│
└── tox.ini            <- tox file with settings for running tox; see tox.readthedocs.io

Project based on the cookiecutter data science project template. #cookiecutterdatascience

ghiprediction's People

Contributors: fra3005, gianfederico, franciosodonato

Watchers: Filippo Lanubile
