Exploring loss landscapes
Anaconda simplifies dependency management. To set up the environment, execute:
conda env create -f torch-land.yml
conda activate torch-land
export PYTHONPATH=.
All scripts are run from the project's root directory and print their usage when called with the -h flag.
Training a model whose loss landscapes we want to investigate later:
python src/scripts/train.py resnet fashion-mnist
After training a model, compute its loss landscapes (losses over a 2-dimensional parameter subspace) with the grid.py script, e.g.
python src/scripts/grid.py grid9 resnet fashion-mnist --grid_width=9
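To illustrate what "losses over a 2-dimensional parameter subspace" means, here is a minimal, framework-free sketch. It is not the code in grid.py: the function name loss_grid, the radius parameter, and the toy quadratic loss are assumptions made for illustration. The idea is to evaluate the loss at the points theta + a*d1 + b*d2 for a grid of coefficients (a, b).

```python
def loss_grid(loss_fn, theta, d1, d2, grid_width=9, radius=1.0):
    """Evaluate loss_fn on a grid_width x grid_width grid in the plane
    theta + a*d1 + b*d2, with a and b evenly spaced in [-radius, radius].

    `radius` is a hypothetical parameter, not an option of grid.py.
    """
    if grid_width < 2:
        raise ValueError("grid_width must be at least 2")
    # Evenly spaced coefficients; the middle one is 0 when grid_width is odd,
    # so the center of the grid is the unperturbed model.
    step = 2.0 * radius / (grid_width - 1)
    coords = [-radius + i * step for i in range(grid_width)]
    grid = []
    for a in coords:
        row = []
        for b in coords:
            # Perturb every parameter along both directions.
            point = [t + a * u + b * v for t, u, v in zip(theta, d1, d2)]
            row.append(loss_fn(point))
        grid.append(row)
    return coords, grid


def toy_loss(params):
    # Toy stand-in for a network loss: a quadratic bowl with minimum at 0.
    return sum(p * p for p in params)


theta = [0.0, 0.0, 0.0]   # "trained" parameters (toy values)
d1 = [1.0, 0.0, 0.0]      # first perturbation direction
d2 = [0.0, 1.0, 0.0]      # second perturbation direction
coords, grid = loss_grid(toy_loss, theta, d1, d2, grid_width=9)
```

In this sketch, grid[i][j] holds the loss for coefficients (coords[i], coords[j]); in a real run, loss_fn would evaluate the network on a mini-batch.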
Now that the loss values have been computed, visualize the landscapes as 2D heatmaps by calling the visualize.py script with the same parameters:
python src/scripts/visualize.py grid9 resnet fashion-mnist --grid_width=9
The commands to run the experiments are documented in the files experiments_run.sh and experiments_visualize.sh.
The landscapes are computed using a pair of random filter-normalized vectors that perturb the model's parameters. Each loss is computed as in a single training step, i.e. on a single mini-batch.
We use three pairs of perturbation vectors and the training set's first three mini-batches of 256 images.
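As a rough sketch of what "filter-normalized" means: a Gaussian random direction is drawn for every filter and rescaled so that its norm matches the norm of the corresponding filter in the trained weights. The code below is a plain-Python illustration under that assumption, not the repository's implementation. The helper names and the toy layer are hypothetical, and each filter is represented as a flat list of floats.

```python
import math
import random


def l2_norm(values):
    return math.sqrt(sum(v * v for v in values))


def filter_normalized_direction(filters, rng):
    """Return a random direction with the same shape as `filters`.

    `filters` is a list of filters, each a flat list of floats. Every
    random filter is rescaled to the L2 norm of the matching weight filter.
    """
    direction = []
    for w in filters:
        d = [rng.gauss(0.0, 1.0) for _ in w]
        # The small constant guards against division by zero.
        scale = l2_norm(w) / (l2_norm(d) + 1e-10)
        direction.append([scale * x for x in d])
    return direction


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Toy layer with three filters of norm 5, 0.5 and 3 (hypothetical values).
layer = [[3.0, 4.0, 0.0], [0.5, 0.0, 0.0], [1.0, 2.0, 2.0]]

# One pair of perturbation vectors, as used for a single landscape.
d1 = filter_normalized_direction(layer, rng)
d2 = filter_normalized_direction(layer, rng)
```

Rescaling per filter keeps the perturbation proportional to the scale of each filter's weights, which makes landscapes of different layers or models easier to compare.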
For visualization, we can look at either heatmaps or contour plots (using the --contour flag on visualize.py):
First (convolutional) layer:
Last (fully connected) layer:
First filter in first layer:
Activation functions: ReLU, sigmoid, tanh
Before overfitting (after 1 epoch):
With overfitting (after 9 epochs):