Identification of handwritten digit images using different classification algorithms: Multi-class Logistic Regression, a Single Hidden Layer Neural Network, and a Convolutional Neural Network.
We trained our classification models on the MNIST data using Multi-class Logistic Regression, a Single Hidden Layer Neural Network, and a Convolutional Neural Network, and predicted the labels of the digit images in both the MNIST and USPS digit datasets.
Logistic regression:
- We trained the model and tuned the hyperparameter (learning rate) using our own implementation of logistic regression, achieving an accuracy of 91.56% on MNIST test images and 45.15% on USPS test images at a learning rate of 0.14 and a lambda (regulariser) value of 0.
- Using TensorFlow, we achieved an accuracy of 92.41% on MNIST test images and 48.32% on USPS test images.
Single hidden layer neural network:
- We trained the model and tuned the hyperparameters (learning rate and number of units in the hidden layer), achieving an accuracy of 97.76% on MNIST test images and 64.6% on USPS test images.
Convolutional neural network:
- Training the model using a CNN, we achieved an accuracy of 99.18% on MNIST test images and 75% on USPS test images.
- Accuracies of all models on the USPS data were far lower than on the MNIST dataset. Since all three models were trained only on MNIST data, we conclude that they do not generalize well: each performs well only on data resembling its training set. To perform well on the USPS data, the models would need training knowledge of the USPS data as well.
- MNIST dataset files were downloaded from the website mentioned in main.pdf and read using Python's gzip library, following the directions mentioned in this website.
- Train and test images were then flattened into 2D NumPy arrays of size N x 784.
- Train and test image data were then standardised to zero mean and unit standard deviation.
- The previous two steps were applied in the same way to the USPS test data.
- A weight NumPy array was initialized with values of 1 and dimensions K x D, where K is the number of classes (10) and D is the number of data features plus 1 bias feature (784 + 1).
- A column of 1s was added as the first column of both the train and test sets of the MNIST and USPS data, making the total feature dimension 785.
- One hot vectors of MNIST data labels and USPS data labels were created.
- Separate functions for calculating the Cross Entropy error, Gradient descent, Softmax, and Predicting the image labels were created in the LRlibs.py file.
- Cross Entropy error function was implemented as per the formula.
- The gradient descent function was implemented following the pseudocode in the document provided along with Project 3.
- The softmax function was implemented with the numerically stable formula exp(x − max(x)) / Σ exp(x − max(x)).
- The predict function was implemented by returning, for each row of W.dot(X), the index of the column with the maximum value.
- The model was trained on the training images with an epoch count of 200 and learning rates varying from 0.01 to 0.15. The accuracies and cross-entropy errors are tabulated in tables 1.1 and 1.2 for both the MNIST and USPS datasets.
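Under the steps above, the LRlibs.py functions can be sketched in NumPy as follows. This is a minimal illustration with toy data; the variable names, shapes, learning rate, and the gradient expression are ours, not the project's actual code:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax per row: exp(z - max(z)) / sum(exp(z - max(z))).
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(W, X, T, L2_lambda):
    # Mean cross-entropy between predictions and one-hot targets, plus L2 penalty.
    Y = softmax(X.dot(W.T))
    return -np.mean(np.sum(T * np.log(Y + 1e-12), axis=1)) + L2_lambda * np.sum(W * W) / 2

def sgd_step(W, X, T, L2_lambda, learning_rate):
    # One gradient-descent update: gradient of softmax cross-entropy w.r.t. W.
    Y = softmax(X.dot(W.T))                       # N x K predicted probabilities
    grad = (Y - T).T.dot(X) / X.shape[0] + L2_lambda * W
    return W - learning_rate * grad

def predict(W, X):
    # Per sample, the index of the maximum activation (same as the max of W.dot(X)).
    return np.argmax(X.dot(W.T), axis=1)

# Toy run: 3 classes, 4 features plus a bias column of 1s already prepended.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((30, 1)), rng.normal(size=(30, 4))])
labels = rng.integers(0, 3, size=30)
T = np.eye(3)[labels]                             # one-hot targets
W = np.ones((3, 5))                               # K x D weights initialized to 1s
for _ in range(200):
    W = sgd_step(W, X, T, L2_lambda=0.0, learning_rate=0.1)
```

With all-ones weights the softmax outputs are uniform, so the initial loss is ln 10 on MNIST (ln 3 in this toy run); each gradient step then lowers the cross-entropy, mirroring the loss curves in the logs below.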
Single Hidden Layer Neural Network
- This has been implemented using TensorFlow
- The model consists of one hidden layer and one output layer
- In the hidden layer, we initialized a 784x1024 weight matrix and 1024 biases (one per unit) with random values. The input is multiplied by the weights, the bias is added, and the ReLU activation function is applied to the result.
- The output of the hidden layer is fed to the output layer, which consists of a randomly initialized 1024x10 weight matrix and 10 biases for its 10 units.
- The output of the model is obtained by multiplying the hidden-layer output by these weights and adding the output-layer bias. The output is a score vector over the 10 labels, and the predicted label is the index with the maximum score.
- We trained the above model using AdamOptimizer with number of epochs = 20000 and an input batch size of 50 per epoch, and chose the model with the minimum cross-entropy error.
- We tuned the hyperparameters, namely the number of units in the hidden layer (784, 864, 944, 1024) and the learning rate (0.01 to 0.05), and chose the model with the maximum accuracy on the validation set.
- Then, ran the model on MNIST and USPS test data to get test accuracy.
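The forward pass described above can be sketched in NumPy. The project builds this graph in TensorFlow; the variable names, initialization scale, and random inputs here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n_hidden = 1024

# Randomly initialized parameters, matching the shapes described above.
W1 = rng.normal(scale=0.1, size=(784, n_hidden))   # input -> hidden weights
b1 = rng.normal(scale=0.1, size=n_hidden)          # hidden biases
W2 = rng.normal(scale=0.1, size=(n_hidden, 10))    # hidden -> output weights
b2 = rng.normal(scale=0.1, size=10)                # output biases

def forward(x):
    h = np.maximum(0.0, x.dot(W1) + b1)            # hidden layer with ReLU
    return h.dot(W2) + b2                          # output scores, one per class

x = rng.random((50, 784))                          # a batch of 50 flattened images
logits = forward(x)
pred = np.argmax(logits, axis=1)                   # predicted labels
```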
Convolutional Neural Network
- This is also implemented using TensorFlow.
- The model consists of 4 layers: convolution layer 1 (convolution with 32 features applied to 5x5 patches of the image + 2D max pooling), convolution layer 2 (convolution with 64 features applied to 5x5 patches + 2D max pooling), a fully connected layer with 1024 neurons and ReLU activation, and a logit layer with 10 neurons corresponding to the 10 labels.
- In the fully connected layer, some neuron outputs are dropped out to prevent overfitting. The no_drop_prob placeholder ensures that dropout occurs only during training and not during testing.
- We trained the model using AdamOptimizer with the learning rate set to 1e-4 and number of epochs = 20000 on input batches of size 50, and chose the model with the minimum cross-entropy error.
- Then we ran the model on the MNIST and USPS test data to get the test accuracy; the USPS test set was divided into batches for evaluation.
- No hyperparameter tuning was needed for the CNN.
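As an illustration of the pooling step above, 2x2 max pooling with stride 2 can be written directly in NumPy. This is a sketch of the operation, not the TensorFlow op the project uses, and it assumes even height and width rather than zero padding:

```python
import numpy as np

def maxpool_2x2(x):
    # x: (N, H, W, C) with even H and W; takes the max over each
    # non-overlapping 2x2 window, halving the spatial dimensions.
    n, h, w, c = x.shape
    return x.reshape(n, h // 2, 2, w // 2, 2, c).max(axis=(2, 4))

x = np.arange(16, dtype=float).reshape(1, 4, 4, 1)   # one 4x4 single-channel image
pooled = maxpool_2x2(x)                              # shape (1, 2, 2, 1)
```

For the 4x4 input 0..15 above, the pooled output is [[5, 7], [13, 15]]: the maximum of each 2x2 block.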
- While extracting the USPS image features, we had to make them resemble the MNIST image features as closely as possible so that our trained model could classify the images correctly.
- To do that, we followed the steps below:
- Resized each image to a square shape (width = height = max(width, height)) in such a way that the aspect ratio of the digit was not skewed.
- Converted the image to grayscale.
- Inverted the image pixel values (255 − image), so black became white and vice versa, to follow the same pixel-value convention as the MNIST images.
- Resized each image to 28x28.
- Normalized the image pixels with the ([value − min]/[max − min]) formula.
- Flattened each image into a 1x784 NumPy array.
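The pixel-level steps above (inversion, min-max normalization, flattening) can be sketched in NumPy. The resizing and grayscale conversion are done with an image library in the actual code, and the toy 2x2 patch here is illustrative:

```python
import numpy as np

img = np.array([[0, 128], [64, 255]], dtype=float)   # toy grayscale patch

inverted = 255.0 - img                                # black <-> white, matching MNIST
lo, hi = inverted.min(), inverted.max()
normalized = (inverted - lo) / (hi - lo)              # (value - min) / (max - min)
flat = normalized.reshape(1, -1)                      # 1 x (H*W) row vector
```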
Logistic Regression

Hyperparameter: learning_rate

At epoch count = 200 (MNIST training/validation/test and USPS test accuracies):

| Sr. | Learning Rate | Regulariser | MNIST Training (%) | MNIST Val (%) | MNIST Test (%) | USPS Test (%) |
|---|---|---|---|---|---|---|
1 | 0.01 | 1 | 84.87 | 88.48 | 85.94 | 40.33 |
2 | 0.02 | 1 | 84.925 | 88.54 | 85.98 | 40.39 |
3 | 0.01 | 0 | 87.16 | 90.06 | 87.85 | 41.26 |
4 | 0.02 | 0 | 88.87 | 91.36 | 89.3 | 41.95 |
5 | 0.03 | 0 | 89.47 | 91.96 | 89.94 | 42.56 |
6 | 0.04 | 0 | 89.94 | 92.32 | 90.39 | 42.96 |
7 | 0.05 | 0 | 90.2 | 92.58 | 90.59 | 43.35 |
8 | 0.06 | 0 | 90.49 | 92.68 | 90.82 | 43.6 |
9 | 0.07 | 0 | 90.7 | 92.7 | 90.88 | 43.94 |
10 | 0.08 | 0 | 90.88 | 92.78 | 90.98 | 44.18 |
11 | 0.09 | 0 | 91.02 | 92.86 | 91.12 | 44.4 |
12 | 0.1 | 0 | 91.15 | 92.9 | 91.22 | 44.51 |
13 | 0.11 | 0 | 91.25 | 92.98 | 91.35 | 44.65 |
14 | 0.12 | 0 | 91.37 | 93.1 | 91.4 | 44.84 |
15 | 0.13 | 0 | 91.48 | 93.22 | 91.5 | 45.05 |
16 | 0.14 | 0 | 91.54 | 93.3 | 91.56 | 45.15 |
17 | 0.15 | 0 | 91.62 | 93.38 | 91.55 | 45.32 |
Single Hidden Layer Neural Network
Hyperparameters: number of units in hidden layer, learning_rate

With epoch_count = 20000 and batch_size = 50 (MNIST training/validation/test and USPS test accuracies):

| Sr. | Number of units in hidden layer | Learning Rate | MNIST Training (%) | MNIST Val (%) | MNIST Test (%) | USPS Test (%) |
|---|---|---|---|---|---|---|
1 | 784 | 0.01 | 99.27 | 97.42 | 97.35 | 63.18 |
2 | 784 | 0.02 | 97.93 | 96.58 | 96.01 | 61.63 |
3 | 784 | 0.03 | 95.08 | 94.32 | 93.87 | 56.88 |
4 | 784 | 0.04 | 93.37 | 93.08 | 92.4 | 54.64 |
5 | 784 | 0.05 | 88.87 | 88.42 | 88.37 | 47.09 |
6 | 864 | 0.01 | 99.33 | 97.6 | 97.46 | 65.25 |
7 | 864 | 0.02 | 97.36 | 95.32 | 95.77 | 59.63 |
8 | 864 | 0.03 | 95.72 | 94.82 | 94.09 | 56.85 |
9 | 864 | 0.04 | 92.08 | 91.16 | 90.94 | 52.24 |
10 | 864 | 0.05 | 91.07 | 90.2 | 90.46 | 49.24 |
11 | 944 | 0.01 | 99.13 | 97.74 | 97.47 | 64.2 |
12 | 944 | 0.02 | 97.9 | 96.74 | 96.12 | 61.74 |
13 | 944 | 0.03 | 95 | 94.16 | 93.8 | 56.89 |
14 | 944 | 0.04 | 92.39 | 92.12 | 91.07 | 50.29 |
15 | 944 | 0.05 | 91.67 | 91.3 | 91.09 | 50.27 |
16 | 1024 | 0.01 | 99.25 | 97.52 | 97.81 | 64.6 |
17 | 1024 | 0.02 | 97.6 | 96.2 | 95.57 | 59.59 |
18 | 1024 | 0.03 | 95.46 | 94.2 | 94.16 | 54.95 |
19 | 1024 | 0.04 | 90.61 | 90.38 | 89.97 | 52.33 |
20 | 1024 | 0.05 | 89.56 | 89.08 | 88.83 | 49.46 |
Output for learning rate 0.07:
Current learning rate is 0.070000
iteration 0/200: loss 2.303
iteration 10/200: loss 0.759
iteration 20/200: loss 0.583
iteration 30/200: loss 0.509
iteration 40/200: loss 0.468
iteration 50/200: loss 0.440
iteration 60/200: loss 0.421
iteration 70/200: loss 0.406
iteration 80/200: loss 0.394
iteration 90/200: loss 0.384
iteration 100/200: loss 0.376
iteration 110/200: loss 0.369
iteration 120/200: loss 0.362
iteration 130/200: loss 0.357
iteration 140/200: loss 0.352
iteration 150/200: loss 0.348
iteration 160/200: loss 0.344
iteration 170/200: loss 0.341
iteration 180/200: loss 0.338
iteration 190/200: loss 0.335
training set Accuracy is 0.907055
validation set Accuracy is 0.927000
Test set Accuracy is 0.908800
USPS set Accuracy is 0.439422
Logistic Regression using TensorFlow:
Output for learning rate 0.5, number of epochs: 10000
The accuracy on MNIST test set: 92.41
The accuracy on USPS test set: 48.32
Single Hidden Layer Neural Network:
Output for learning rate 0.01, number of epochs: 20000, number of units in hidden layer: 784
MNIST validation accuracy: 97.42
MNIST test accuracy: 97.35
The accuracy on USPS test set: 63.18
Convolutional Neural Network:
Output for learning rate 1e-4, number of epochs: 20000
MNIST test accuracy: 99.18
The accuracy on USPS test set: 75.13
Report and documentation can be found on this Documentation link
- Report contains a summary report detailing our implementation and results.
- code contains the source code of our machine learning algorithms.
- Materials contains the project-related informative materials.
- Bonus contains the source code of our machine learning algorithm using back-propagation.
- proj3_images contains image data for training, validation and testing
- logistic_main.py: Run this file for execution of the logistic regression model without using TensorFlow
- logistic_tensorflow_main.py: Run this file for execution of the logistic regression model using TensorFlow. It creates the model, trains it, and tests it on the MNIST validation and test sets and the USPS test set
- single_layer_NN_main.py: Run this file for execution of the single hidden layer NN model using TensorFlow. It creates the model, trains it, and tests it on the MNIST validation and test sets and the USPS test set
- cnn_main2.py: Run this file for execution of the CNN model using TensorFlow. It creates the model, trains it, and tests it on the MNIST validation and test sets and the USPS test set
- libs.py:
- read_gz(images,labels): To read MNIST gz data
- view_image(image, label=""): to view a single image from the MNIST data
- yDash(trains_images, W): for performing the W.dot(X)
- softmax(x): for calculating the softmax of each row in W.dot(X)
- sgd(W, train_images, T, L2_lambda, epochNo, learning_rate): gradient descent for optimising the weights
- cross_entropy(W, X, T, L2_lambda): calculating the loss in the model
- predict(W, X): predicting the labels from the output of the model
- single_layer_NN_lib.py:
- create_single_hidden_layer_nn(number_hidden_units): creates an input layer, one hidden layer with the specified number of neurons, and an output layer
- cnn_lib.py:
- weight_init(shape): Initialize weight variables
- bias_init(shape): Initialize bias variables
- convolution(x, W): convolves input with the given weights, stride 1, and zero padding
- maxpool(x): performs max pooling with a window size of 2x2, stride of 2, and zero padding
- USPS_data_extraction.py:
- make_square(im): To make the image square (equal height and width)
- extract_usps_data(): Get USPS test images and labels. Usps_test_images is an Nx784 NumPy array and Usps_test_labels is an Nx10 NumPy array (one-hot representation)
Bonus: This zip folder consists of an implementation of the Single Hidden Layer NN model using back-propagation
- main.py: Run this file for execution of the single hidden layer NN model using back-propagation. It creates the model, trains it, and tests it on the MNIST validation and test sets and the USPS test set
- SNlibs.py:
- read_gz(images,labels): To read MNIST gz data
- view_image(image, label=""): to view a single image from the MNIST data
- softmax(x): for calculating the softmax of each row in W.dot(X)
- calculate_loss(model, X,y, reg_lambda): Helper function to evaluate the total loss on the dataset
- cross_entropy(W, X, T, L2_lambda): calculating the loss in the model
- predict(W, X): predicting the labels from the output of the model
- build_model(nn_hdim, num_passes, X, y, reg_lambda, learning_rate, T): This function learns parameters for the neural network and returns the model
- USPS_data_extraction.py:
- make_square(im): To make the image square (equal height and width)
- extract_usps_data(): Get USPS test images and labels. Usps_test_images is an Nx784 NumPy array and Usps_test_labels is an Nx10 NumPy array (one-hot representation)
- Prof. Sargur N. Srihari
- Jun Chu
- Tianhang Zheng
- Mengdi Huai
- Stackoverflow.com
- Python, Numpy and TensorFlow documentations
- https://cs231n.github.io/convolutional-networks/
- http://yann.lecun.com/exdb/mnist/
- https://martin-thoma.com/classify-mnist-with-pybrain/
This project is open-sourced under the MIT License.