wakegan's Introduction

Wind farm wake modeling using cDCGAN

Predict the velocity field around a wind turbine given an inflow wind profile.

Based on the work by Zhang J. and Zhao X., "Wind farm wake modeling based on deep convolutional conditional generative adversarial network", Energy, 2021. https://doi.org/10.1016/j.energy.2021.121747

Limitations:
  • Only trained with the streamwise velocity component (Ux) for now
  • It generates only horizontal planes at hub height (90 m)

Image comparison test

Installation

  1. Install packages:

    The requirements are intended for an NVIDIA RTX 3070 GPU; check the PyTorch installation instructions for your specific GPU. You may need to change this line in requirements.txt: `--extra-index-url https://download.pytorch.org/whl/cu116`

    pip install -r requirements.txt

  2. Make sure you have the dvc remote data credentials .json file in the root directory of the repository.

  3. Enable use of the gdrive service account:

    dvc remote modify storage gdrive_use_service_account true

  4. Add the gdrive credentials to the dvc remote:

    dvc remote modify storage --local gdrive_service_account_json_file_path <credentials_file_name>.json

  5. Pull the data from gdrive (through dvc, which keeps track of the changes):

    dvc pull

  6. There are two main branches:

    1. main (t_window_1000)
    2. t_window_4000

    main uses data from a temporal window of 1000 steps, and t_window_4000 from one of 4000 steps.

  7. You can git checkout and dvc pull (one after the other) in order to train/test/eval one dataset or the other.

Usage

data

(Optional) Preprocess dataset from raw data (CFD simulations)

Generate preprocessed dataset

  • For now, this is only possible on the medusa16 server, where the raw data, that is, the CFD (caffa3d) simulation outputs, is stored.

Extract images from CFD simulation outputs:

./src/data/make_dataset.py

Split data between training and testing:

./src/data/split_data.py --ratio 0.9 0.1

Train the wakeGAN:

./scripts/train.py
  • You can modify hyperparameters in the config.yaml file
  • You can monitor the training by watching the output in the shell, checking the logs in logs/train.log and checking the figures/monitor directory. This folder contains four figures that update every epoch:
    • images.png: real vs. synthetic generated images for both the training and testing sets
    • metrics.png: RMSE and FID metrics.
    • losses.png: generator and discriminator losses.
    • pixel_diff.png: error (the difference in m/s) between real and synthetic images for the training and testing sets.

Test the wakeGAN:

./scripts/test.py

Check generated files in figures/test:

  • images.png: flow field comparison between real and synthetic images for four random samples.
  • pixel_diff.png: error for those four random samples.
  • profiles.png: wind profiles at different streamwise positions relative to the wind turbine, for both the ground truth and the generated flow.

data versions

We use different branches to track different versions of the dataset. We could also track the versions with git commits, but since these versions aren't an "improvement" over the previous one, just a different way of preprocessing the raw dataset, we maintain the versions in branches.

Currently we are using two types of preprocessed dataset, with velocity means taken over different averaging windows:

  1. temporal window of 1000 time steps - branch: main
  2. temporal window of 4000 time steps - branch: t_window_4000

change between data versions

In order to use a different version, first check that your git working directory is clean, and then check out your target branch:

git checkout t_window_4000

Tell dvc to check out this branch:

dvc checkout

Now you should see the other version of the dataset.

You can go back to the previous state with the same commands:

git checkout main
dvc checkout

data pipeline overview:

  1. CFD simulations of a WF

  2. Horizontal slices at hub height of the mean streamwise velocity ($U_x$)

  3. Crop slices into several images around each WT of the WF.

  4. Save them as image files mapped with a certain $v_{min}$ and $v_{max}$.

    $(v_{min}, v_{max}) \rightarrow (0, 255)$

  5. Load the images, convert them to float32 and rescale them to $[-1, 1]$

  6. Extract first column of pixels for each image (inflow velocity).

  7. Training loop (see the sketch below):

     for each epoch:
         for each minibatch:
             - Generate fake image given inflow
             - Pass real, fake and inflows to discriminator
             - Evaluate loss, backprop on Disc and Gen
    
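To make steps 4–7 concrete, here is a minimal sketch of the rescaling and of one cGAN training step in plain PyTorch/NumPy. The names (pixels_to_velocity, train_step, generator, discriminator, opt_g, opt_d) and the tensor shapes are illustrative assumptions for this README, not the actual functions used in the repository's scripts:

import numpy as np
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

def pixels_to_velocity(img_uint8, v_min, v_max):
    """Undo the (v_min, v_max) -> (0, 255) mapping, then rescale to [-1, 1]."""
    u = img_uint8.astype(np.float32) / 255.0 * (v_max - v_min) + v_min  # pixel -> m/s
    return 2.0 * (u - v_min) / (v_max - v_min) - 1.0                    # m/s -> [-1, 1]

def train_step(generator, discriminator, opt_g, opt_d, real, device="cuda"):
    """One cGAN optimization step on a single minibatch of real Ux images."""
    real = real.to(device)         # (N, 1, H, W), already rescaled to [-1, 1]
    inflow = real[:, :, :, :1]     # first pixel column of each image = inflow profile

    # Generate a fake image conditioned on the inflow
    fake = generator(inflow)

    # Discriminator: push real (given the inflow) towards 1 and fake towards 0
    opt_d.zero_grad()
    d_real = discriminator(real, inflow)
    d_fake = discriminator(fake.detach(), inflow)
    loss_d = criterion(d_real, torch.ones_like(d_real)) + \
             criterion(d_fake, torch.zeros_like(d_fake))
    loss_d.backward()
    opt_d.step()

    # Generator: try to make the discriminator output 1 on the fake image
    opt_g.zero_grad()
    loss_g = criterion(discriminator(fake, inflow), torch.ones_like(d_fake))
    loss_g.backward()
    opt_g.step()
    return loss_d.item(), loss_g.item()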

cDCGAN architecture

Generator

generator

Discriminator

discriminator
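As a rough reference only, a conditional DCGAN of this kind typically uses transposed convolutions in the generator and strided convolutions in the discriminator. The skeleton below is an illustrative assumption (layer counts, channel sizes and shapes are made up for this README and do not reproduce the architecture shown in the figures); see the model definitions in the repository for the actual networks:

import torch
import torch.nn as nn

class Generator(nn.Module):
    """Illustrative generator: upsample a (N, 1, 64, 1) inflow column into a (N, 1, 64, 64) Ux field."""
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(1, channels, kernel_size=(1, 4), stride=(1, 4)),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(channels, channels, kernel_size=(1, 4), stride=(1, 4)),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(channels, 1, kernel_size=(1, 4), stride=(1, 4)),
            nn.Tanh(),  # output in [-1, 1], matching the rescaled images
        )

    def forward(self, inflow):
        return self.net(inflow)

class Discriminator(nn.Module):
    """Illustrative discriminator: score a (N, 1, 64, 64) image conditioned on its inflow column."""
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, channels, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, channels, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, 1, kernel_size=16),  # single logit per sample
        )

    def forward(self, image, inflow):
        cond = inflow.expand_as(image)  # tile the inflow column across the width
        return self.net(torch.cat([image, cond], dim=1))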

Results

error between real and synthetic flow field

Image comparison test

real vs. generated wind profiles at different streamwise positions relative to WT diameter

wind profiles

complete wf real and synthetic flow

wf flow

data

raw

Contains the chaman LES simulation outputs of the WF for different precursors and turns. Each simulation is composed of 18 regions.

Versioning data with DVC (only for non-tracked data)

Let's track our split data using DVC.

First initialize DVC; it behaves similarly to git:

dvc init

Note: Check that the data you want to track isn't in .gitignore

We're going to track the split data (split between train, val and test), which lives in data/preprocessed/tracked

Let's add the data we are going to track to the dvc staging area:

dvc add data/preprocessed/tracked/

With this command, dvc creates a file (*.dvc) in the tracked folder that contains metadata about your tracked data. Let's git add it and ignore the folder that contains the tracked data:

git add data/preprocessed/tracked.dvc data/preprocessed/.gitignore

We can keep track of our data while the actual data is stored in the cloud. We'll use Google Drive for this:

dvc remote add -d storage gdrive:<gdrive_folder_id> 

Note: gdrive_folder_id corresponds to the id shown in the URL when you are in the folder where you would like to store your tracked data.

This configuration lives in the .dvc/config file.

Now, let's push the data to our remote storage:

dvc push

If you make changes to the data, you can track them with

dvc add <path_to_tracked_data>

Then git add the changes to the *.dvc file, and commit:

git add <path_to_tracked_data>/*.dvc
git commit -m 'updating data'

For example, you can recover the last data modification by going back one commit:

git checkout HEAD^1 <path_to_tracked_data>

And go back and forth with:

git stash
git checkout HEAD
dvc checkout

branches to track datasets

Each branch represents a different dataset. In order to have the changes from main in any feature (dataset) branch, we need to git rebase main on each branch whenever we make changes in the code (changes are only allowed in main). However, we don't want to pull in the changes made to tracked.dvc, figures/test/* and config.yaml; we use .gitattributes for this.

In the root .gitattributes file we specify which file patterns to exclude from merging. In this case, we add the following to .gitattributes:

data/preprocessed/tracked.dvc merge=ours
figures/test/* merge=ours
config.yaml merge=ours

After that, modify your global .gitconfig:

git config --global merge.ours.driver true

With these changes we can safely move between branches and keep a certain configuration, dataset and figures. Remember that on each branch we keep track of tracked.dvc, not of the data itself, which is stored in gdrive.

Rebase to update the feature branch with the changes from main.

Reproducibility (for plain pytorch)

In order to make the results reproducible, a random seed has to be set at the beginning of the code:

torch.manual_seed(42)

It is also recommended to force PyTorch to check that all operations are deterministic:

torch.use_deterministic_algorithms(True)

If using a GPU it is necessary to set the CUBLAS_WORKSPACE_CONFIG environment variable to :4096:8 or :16:8 as suggested in the cuBLAS documentation.
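Putting the three steps together, the top of a training script could look like this (a minimal sketch; the environment variable must be set before any CUDA work, and :16:8 is the other value accepted by cuBLAS):

import os

# Must be set before cuBLAS is initialized (i.e. before any CUDA work)
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"  # or ":16:8"

import torch

torch.manual_seed(42)                     # seeds the CPU and CUDA RNGs
torch.use_deterministic_algorithms(True)  # raise an error on non-deterministic ops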

wakegan's People

Contributors

maxibove13, mvanzulli


wakegan's Issues

solve minibatch training issue

When training with a single batch, the training and testing data achieve the same RMSE, whereas the testing error should be much worse and the difference between them larger than when training with all the batches. There must be an issue with the implementation.
