Traffic Sign Recognition


Build a Traffic Sign Recognition Project

The goals / steps of this project are the following:

  • Load the German Traffic Sign Dataset
  • Explore, summarize and visualize the data set
  • Design, train and test a model deep learning architecture
  • Use the model to make predictions on new images
  • Analyze the softmax probabilities of the new images
  • Summarize the results with a written report

This written report covers the rubric points individually and describes how I addressed each point in my implementation. You may want to follow along using the IPython notebook that was used to generate the data in this writeup. If you wish to dig deeper than this writeup, you can follow the instructions in the Setup section at the bottom to access and run the notebook yourself.


Data Set Exploration

Dataset Summary

The code for this step is contained in the 2nd code cell of the IPython notebook.

I used the NumPy library to calculate summary statistics of the traffic signs data set:

  • The size of the training set is 34799 examples.
  • The size of the validation set is 4410 examples.
  • The size of the test set is 12630 examples.
  • The shape of a traffic sign image is (32, 32, 3) or 32x32 pixels in RGB color.
  • The number of unique classes/labels in the data set is 43.
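As an illustrative sketch (not the notebook's exact code), these statistics can be gathered from the loaded arrays with a few NumPy calls:

```python
import numpy as np

def summarize(X_train, y_train, X_valid, X_test):
    """Compute the dataset summary statistics reported above."""
    return {
        "n_train": len(X_train),            # number of training examples
        "n_valid": len(X_valid),            # number of validation examples
        "n_test": len(X_test),              # number of test examples
        "image_shape": X_train[0].shape,    # e.g. (32, 32, 3)
        "n_classes": len(np.unique(y_train)),
    }
```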

Exploratory Visualization

The code for this step is contained in the 3rd and 4th code cells of the IPython notebook.

First is a visualization of a random sample of each of the 43 sign classes. You will notice from the sample pictures that some of the signs are very dark and difficult to read. We'll fix those up later.

Random Samples of Classes

Next is a bar chart showing the number of training samples per class. You'll notice that some sign classes have many sample pictures while others have very few.

Training Classes Chart
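A chart like this can be produced with NumPy and Matplotlib; a minimal sketch (not the notebook's exact code) might look like:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_class_histogram(y_train, n_classes=43):
    """Bar chart of training samples per class; returns the per-class counts."""
    counts = np.bincount(y_train, minlength=n_classes)
    plt.figure(figsize=(12, 4))
    plt.bar(np.arange(n_classes), counts)
    plt.xlabel("Class ID")
    plt.ylabel("Training samples")
    plt.title("Training samples per class")
    return counts
```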

Design and Test a Model Architecture

Preprocessing

The code for this step is contained in the 5th code cell of the IPython notebook.

I experimented with a few preprocessing techniques but settled on three in the end.

First, I converted the images to grayscale. The color information was not useful in my experiments, and there was a per-epoch runtime speedup of about 1.5x using grayscale over color images. More importantly, there was a mild improvement in accuracy on the validation set when using grayscale images.

Second, I equalized the histograms using the CLAHE algorithm built into OpenCV. This fixed the problem of some of the signs being too dark.

Finally, I normalized the image data to the range -1 <= pixel <= 1 rather than 0 <= pixel <= 255. Reducing the range of values should, in theory, help the network train faster, especially with a smaller learning rate.

If I were to go further, I would augment this data set with perturbed images, especially for those classes which were underrepresented in the histogram above. Perturbation could include randomly rotating, shifting, scaling, warping/projecting, blurring, adding noise to, or adjusting the gamma of the image.

Here is a random sample of all 43 classes after the three preprocessing techniques above were applied.

Preprocessed Samples

Model Architecture

The code for my model is located in the 6th cell of the IPython notebook.

I used an almost vanilla LeNet-5 model; the only differences are dropout stages added to the fully connected outputs and a classification stage expanded to 43 outputs.

Here is a picture of the original LeNet-5:

LeNet-5

And here is a tabular description of the model I used:

| Layer | Description |
|:---|:---|
| Input | 32x32x1 grayscale image |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 28x28x6 |
| RELU | |
| Max pooling | 2x2 stride, valid padding, outputs 14x14x6 |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 10x10x16 |
| RELU | |
| Max pooling | 2x2 stride, valid padding, outputs 5x5x16 |
| Flatten | outputs 400 |
| Fully connected | outputs 120 |
| RELU + dropout of 50% | |
| Fully connected | outputs 84 |
| RELU + dropout of 50% | |
| Fully connected | outputs 43 |
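As an illustrative sketch (not the notebook's exact TensorFlow code), the tabled architecture can be expressed with the `tf.keras` API:

```python
import tensorflow as tf

def build_lenet5(n_classes=43, dropout_rate=0.5):
    """LeNet-5 with dropout on the fully connected layers, matching the table above."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(32, 32, 1)),                          # grayscale input
        tf.keras.layers.Conv2D(6, 5, padding="valid", activation="relu"),  # 28x28x6
        tf.keras.layers.MaxPooling2D(pool_size=2),                         # 14x14x6
        tf.keras.layers.Conv2D(16, 5, padding="valid", activation="relu"), # 10x10x16
        tf.keras.layers.MaxPooling2D(pool_size=2),                         # 5x5x16
        tf.keras.layers.Flatten(),                                         # 400
        tf.keras.layers.Dense(120, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(84, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(n_classes),                                  # logits for 43 classes
    ])
```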

Model Training

The code for training the model is located in the 8th and 9th cells of the IPython notebook, with the hyperparameters used in the 7th cell.

To train the model, I used a learning rate of 0.0005, which is higher than the default rate of 0.0001 that was used in the LeNet lab for MNIST recognition. This allowed the model to train much faster without overshooting too badly. In the future, I would consider starting with a larger rate and decaying it over time as the network starts to learn. The batch size was unchanged from MNIST at 128, though slightly smaller values like 100 also worked well. Finally, I increased the number of epochs to 30 since this model has more parameters and needs more time to learn.
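A hedged sketch of this training setup (the notebook's actual code uses TensorFlow's lower-level API; the `tf.keras` rendering below is an assumption):

```python
import tensorflow as tf

LEARNING_RATE = 0.0005   # vs. 0.0001 in the MNIST LeNet lab
BATCH_SIZE = 128
EPOCHS = 30

def compile_and_train(model, X_train, y_train, X_valid, y_valid, epochs=EPOCHS):
    """Compile with Adam and train, validating after each epoch."""
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        # from_logits=True because the final layer emits raw logits
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model.fit(X_train, y_train,
                     batch_size=BATCH_SIZE, epochs=epochs,
                     validation_data=(X_valid, y_valid), verbose=0)
```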

Solution Approach

The code for calculating the accuracy of the model is located in the 11th cell of the IPython notebook. The 10th cell calculates the loss and accuracy data for the training and validation data.

My final model results were:

  • training set accuracy of 99.6%
  • validation set accuracy of 96.3%
  • test set accuracy of 95.1%

This project was achievable with a relatively simple modification of the LeNet-5 network architecture, as mentioned above. I iterated through various combinations of pre-processing to find a combination that worked well and seemed logical. As you can see in the charts below, the loss function and the training and validation accuracy increased nicely without a lot of overfitting. If I were to continue further experiments, I would consider the architecture choices mentioned in the paper by Sermanet & LeCun and other similar research.

Loss & Accuracy

Test a Model on New Images

Acquiring New Images

Here are five German traffic signs that I found on the web:

Test Images

The first three images should be easy to classify, but the 4th and 5th are skewed and may cause difficulty for the classifier, especially since I didn't augment the data with such perturbations. I load and show the images in the 12th cell of the IPython notebook.

Performance on New Images

The code for making predictions with my final model is located in the 13th cell of the IPython notebook.

Here are the results of the prediction:

| Image | Prediction |
|:---|:---|
| Speed limit (20km/h) | Speed limit (20km/h) |
| General caution | General caution |
| Double curve | Double curve |
| Keep right | Keep right |
| Roundabout mandatory | Roundabout mandatory |

Test Images Predictions

The model was able to correctly guess all 5 of the 5 traffic signs, for an accuracy of 100%. However, given that the model was only 95% accurate on the test data set, there is a chance other signs would not fare so well, or even that not all five of these signs would be detected properly if the network were retrained with different random weights.

Model Certainty - Softmax Probabilities

The code for calculating the top 5 probabilities with my final model is located in the 14th cell of the IPython notebook.

The first four images are predicted correctly at 67% or better. For the fifth image, the model is correctly predicting "Roundabout mandatory" but with only 41% confidence. This is very likely because of the extreme angle of the image. I suspect if I augmented the original training data before training using perturbations, this image would have been classified with higher confidence.
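The top-5 analysis amounts to applying softmax to the network's logits and sorting. A minimal NumPy sketch (the `sign_names` lookup is a hypothetical argument, not the notebook's variable):

```python
import numpy as np

def top_k_predictions(logits, sign_names, k=5):
    """Softmax over logits, then the k most probable classes per image."""
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # numerically stable softmax
    probs = exp / exp.sum(axis=1, keepdims=True)
    top = np.argsort(probs, axis=1)[:, ::-1][:, :k]           # indices, most probable first
    return [[(sign_names[c], float(probs[i, c])) for c in row]
            for i, row in enumerate(top)]
```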

Roundabout mandatory


Setup

  1. Clone the project and start the notebook.

     git clone https://github.com/anandman/CarND-Traffic-Sign-Classifier-Project
     cd CarND-Traffic-Sign-Classifier-Project
    
  2. Download the dataset and unzip the files into a directory named traffic-signs-data. This is a pickled dataset in which we've already resized the images to 32x32.

  3. Make sure you have an environment set up that includes Jupyter, Python 3.5+, NumPy, SciPy, Matplotlib, Pandas, OpenCV, and TensorFlow. You can get a complete setup by following these directions.

  4. Launch Jupyter to open the IPython notebook:

     jupyter notebook Traffic_Sign_Classifier.ipynb
    
  5. Open the IPython notebook in your browser using the instructions Jupyter gave you and run all the cells from the Cell menu.
