This assignment develops applications of permutation-invariant and permutation-equivariant function approximation with deep learning models.
https://www.dropbox.com/sh/sztempv7hymr3ck/AADjIBGNJgKl9Ox7LcpKlgR7a?dl=0
Consider the problem of learning a permutation-invariant function by using a neural network with the sum-decomposition architecture defined in lectures, i.e. f(x_1, ..., x_n) = ρ(Σ_i φ(x_i)).
For the function φ, use a feedforward neural network with one hidden layer of 100 neurons and ReLU activation function, and an output layer with lat_dim output units. lat_dim is a hyperparameter which you will need to set to the values specified below.
For the function ρ, use a feedforward neural network with one hidden layer of 100 neurons and ReLU activation function.
For training, use the stochastic gradient descent algorithm with learning rate 1e-4 and MSE as the loss function. Use a validation dataset formed by taking a 0.1 portion of the training dataset (validation split), and a batch size of 128.
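A minimal NumPy sketch of the sum-decomposition architecture described above may help clarify the structure; the weight shapes and the names phi and rho are illustrative only, and an actual solution would use a deep learning framework (e.g. Keras or PyTorch) for training:

```python
import numpy as np

rng = np.random.default_rng(0)
lat_dim = 5  # hyperparameter; values are specified in the tasks below

# phi: 1 -> 100 -> lat_dim, applied to each set element independently
W1, b1 = rng.normal(size=(1, 100)), np.zeros(100)
W2, b2 = rng.normal(size=(100, lat_dim)), np.zeros(lat_dim)
# rho: lat_dim -> 100 -> 1, applied to the summed latent representation
W3, b3 = rng.normal(size=(lat_dim, 100)), np.zeros(100)
W4, b4 = rng.normal(size=(100, 1)), np.zeros(1)

def relu(z):
    return np.maximum(z, 0.0)

def phi(x):
    # x: (n, 1) -- each of the n elements is mapped independently
    return relu(x @ W1 + b1) @ W2 + b2          # shape (n, lat_dim)

def rho(s):
    # s: (lat_dim,) -- the summed latent representation
    return relu(s @ W3 + b3) @ W4 + b4          # shape (1,)

def f(x):
    # permutation-invariant: f(x) = rho(sum_i phi(x_i))
    return rho(phi(x.reshape(-1, 1)).sum(axis=0))

x = rng.normal(size=10)
out1 = f(x)
out2 = f(rng.permutation(x))  # permuting the inputs leaves the output unchanged
print(np.allclose(out1, out2))  # True
```

Because the summation over elements discards their order, the output is invariant under any permutation of the 10 input features, which is the property the architecture is designed to enforce.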
The training and testing datasets are provided in the files train-1.csv and test-1.csv, respectively. Each row in these files consists of 11 comma-separated values: the first 10 values are the components of the feature vector, and the last value is the response variable.
You need to perform the following tasks:
Implement the neural network specified above. Explain your implementation.
Train the neural network with lat_dim = 5 for 10 epochs. Show the test and validation loss versus the number of epochs, and also report the test MSE value. Repeat this with lat_dim = 100. Discuss the obtained results.
Train the neural network with lat_dim = 100 for 10 epochs, for learning rates of 0.01, 0.1, and 0.5. Discuss the obtained results.
Train the neural network with lat_dim = 100 for 50 epochs, with ReLU activation functions. Show the training and validation loss versus the number of epochs. Repeat the same setting with the ReLU activation functions replaced by sigmoid activation functions. Discuss the results.
In this part you need to evaluate the neural network architecture defined above for values of lat_dim in [1, 2, ..., 10, 20, 30, ..., 100]. For each setting of lat_dim, run 5 independent training runs of 5 epochs each. In each training run, select the best model found with respect to the validation MSE metric, and compute the test MSE value of that best model. For each lat_dim, compute the mean of the test MSE values of the best models over the corresponding training runs. Plot the mean test MSE versus lat_dim. Comment on the results.
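The per-run model-selection step can be sketched as follows; the per-epoch metric values here are placeholder numbers for illustration, not real results:

```python
import numpy as np

# Hypothetical per-epoch metrics recorded during one 5-epoch training run
# (placeholder values, for illustration only).
val_mse = [0.90, 0.55, 0.40, 0.47, 0.52]
test_mse = [0.95, 0.60, 0.43, 0.50, 0.58]

best_epoch = int(np.argmin(val_mse))   # best model w.r.t. validation MSE
best_test_mse = test_mse[best_epoch]   # its test MSE enters the per-lat_dim mean
print(best_epoch, best_test_mse)       # -> 2 0.43
```

For each lat_dim, repeating this over the 5 independent runs and averaging the resulting best_test_mse values gives one point of the mean-test-MSE curve.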
Consider the problem of learning a permutation-equivariant function by using a feedforward neural network with equivariant layers (see lectures for the definition). The function we consider corresponds to a choice function that, for each set of input items, outputs the choice of one of these items, encoded as a one-hot vector.
We consider a feedforward neural network with L equivariant layers. Each of the first L-1 layers consists of an equivariant affine transformation followed by ReLU activation units, with output dimension k. The output layer is an equivariant affine transformation with output dimension 1.
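One common form of an equivariant affine transformation (from the Deep Sets family covered in lectures) combines a per-item linear map with a shared pooled term. A NumPy sketch, with illustrative shapes and parameter names, and a check of the equivariance property:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 6, 4, 10   # n items, per-item input dim d, output dim k (illustrative)

# Parameters of one equivariant affine layer: per-item weights Lam,
# pooled-term weights Gam, and a bias b shared across items.
Lam = rng.normal(size=(d, k))
Gam = rng.normal(size=(d, k))
b = rng.normal(size=k)

def equivariant_affine(X):
    # X: (n, d). The mean-pooled term is identical for every item, so
    # permuting the rows of X simply permutes the rows of the output.
    return X @ Lam + X.mean(axis=0, keepdims=True) @ Gam + b

X = rng.normal(size=(n, d))
perm = rng.permutation(n)
# Equivariance: layer(P X) == P layer(X) for any permutation P
print(np.allclose(equivariant_affine(X[perm]), equivariant_affine(X)[perm]))  # True
```

Stacking such layers with elementwise nonlinearities preserves equivariance, which is why the whole network can represent a choice function over the input items.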
For training, use the Adam optimizer with learning rate 1e-4 and epsilon set to 1e-3. Use MSE as the loss function, a validation split of 0.1, a batch size of 300, and 100 epochs.
The training dataset is given in xtrain-2.csv and ytrain-2.csv. Each row of xtrain-2.csv contains the values of a flattened n x d matrix (rows concatenated), and each row of ytrain-2.csv contains the elements of an n-dimensional output vector. The test dataset is given in xtest-2.csv and ytest-2.csv
and is of the same format. In these datasets, n and d are fixed; their values can be determined from the number of columns in the files.
You need to perform the following tasks:
Implement the feedforward neural network defined above. Explain your implementation.
Train the neural network with (L, k) = (2, 5). Show the training and validation loss versus the number of epochs, and compute the test MSE value. Repeat this for (L, k) set to (2, 10), (2, 100), (2, 200), (3, 5), (3, 10), (3, 100), and (3, 200). Discuss the obtained results.
- Solution deadline: 23 February, 11:59 pm.
- Upload your Python code (including discussion/explanation for each question) and additional files (if any) into GitHub.
- There is no need for uploading training and testing data, as these datasets cannot be changed.
- IMPORTANT: for completing your submission, go to Moodle (Assessment 1 section) and provide a file with a link to your GitHub repository (this must be done by the deadline).