Traffic Sign Recognition


Build a Traffic Sign Recognition Project

The goals / steps of this project are the following:

  • Load the German Traffic Sign Dataset
  • Explore, summarize and visualize the data set
  • Design, train and test a model deep learning architecture
  • Use the model to make predictions on new images
  • Analyze the softmax probabilities of the new images
  • Summarize the results with a written report

This written report covers the rubric points individually and describes how I addressed each point in my implementation. You may want to follow along using the IPython notebook that was used to generate the data in this writeup. If you wish to dig deeper than this writeup, you can follow the instructions in the Setup section at the bottom to access and run the notebook yourself.


Data Set Exploration

Dataset Summary

The code for this step is contained in the 2nd code cell of the IPython notebook.

I used the NumPy library to calculate summary statistics of the traffic signs data set:

  • The size of the training set is 34799 examples.
  • The size of the validation set is 4410 examples.
  • The size of the test set is 12630 examples.
  • The shape of a traffic sign image is (32, 32, 3) or 32x32 pixels in RGB color.
  • The number of unique classes/labels in the data set is 43.
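As an illustrative sketch (not the notebook's exact code), these statistics can be gathered from the loaded arrays with a few NumPy calls:

```python
import numpy as np

def summarize(X_train, y_train, X_valid, X_test):
    """Compute the dataset summary statistics reported above."""
    return {
        "n_train": len(X_train),            # number of training examples
        "n_valid": len(X_valid),            # number of validation examples
        "n_test": len(X_test),              # number of test examples
        "image_shape": X_train[0].shape,    # e.g. (32, 32, 3)
        "n_classes": len(np.unique(y_train)),
    }
```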

Exploratory Visualization

The code for this step is contained in the 3rd and 4th code cells of the IPython notebook.

First is a visualization of a random sample of each of the 43 sign classes. You will notice from the sample pictures that some of the signs are very dark and difficult to read. We'll fix those up later.

Random Samples of Classes

Next is a bar chart showing the number of training samples per class. You'll notice that some sign classes have many sample pictures while others have very few.

Training Classes Chart
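A chart like this can be produced with NumPy and Matplotlib; a minimal sketch (not the notebook's exact code) might look like:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_class_histogram(y_train, n_classes=43):
    """Bar chart of training samples per class; returns the per-class counts."""
    counts = np.bincount(y_train, minlength=n_classes)
    plt.figure(figsize=(12, 4))
    plt.bar(np.arange(n_classes), counts)
    plt.xlabel("Class ID")
    plt.ylabel("Training samples")
    plt.title("Training samples per class")
    return counts
```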

Design and Test a Model Architecture

Preprocessing

The code for this step is contained in the 5th code cell of the IPython notebook.

I experimented with a few preprocessing techniques but settled on three in the end.

First, I converted the images to grayscale. The color information was not useful in my experiments, and there was a per-epoch runtime speedup of about 1.5x using grayscale over color images. More importantly, there was a mild improvement in accuracy on the validation set when using grayscale images.

Second, I equalized the histograms using the CLAHE algorithm built into OpenCV. This fixed the problem of some of the signs being too dark.

Finally, I normalized the image data to the range -1 <= pixel <= 1 rather than 0 <= pixel <= 255. Reducing the range of values should, in theory, help the network train faster, especially with a smaller learning rate.

If I were to go further, I would augment this data set with perturbed images, especially for those classes which were underrepresented in the histogram above. Perturbation could include randomly rotating, shifting, scaling, warping/projecting, blurring, adding noise to, or adjusting the gamma of the image.

Here is a random sample of all 43 classes after the three preprocessing techniques above were applied.

Preprocessed Samples

Model Architecture

The code for my model is located in the 6th cell of the IPython notebook.

I used an almost vanilla LeNet-5 model; the only differences are dropout stages added to the fully connected outputs and a classification stage expanded to 43 outputs.

Here is a picture of the original LeNet-5:

LeNet-5

And here is a tabular description of the model I used:

| Layer | Description |
|:---|:---|
| Input | 32x32x1 grayscale image |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 28x28x6 |
| RELU | |
| Max pooling | 2x2 stride, valid padding, outputs 14x14x6 |
| Convolution 5x5 | 1x1 stride, valid padding, outputs 10x10x16 |
| RELU | |
| Max pooling | 2x2 stride, valid padding, outputs 5x5x16 |
| Flatten | outputs 400 |
| Fully connected | outputs 120 |
| RELU + dropout of 50% | |
| Fully connected | outputs 84 |
| RELU + dropout of 50% | |
| Fully connected | outputs 43 |
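As an illustrative sketch (not the notebook's exact TensorFlow code), the tabled architecture can be expressed with the `tf.keras` API:

```python
import tensorflow as tf

def build_lenet5(n_classes=43, dropout_rate=0.5):
    """LeNet-5 with dropout on the fully connected layers, matching the table above."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(32, 32, 1)),                          # grayscale input
        tf.keras.layers.Conv2D(6, 5, padding="valid", activation="relu"),  # 28x28x6
        tf.keras.layers.MaxPooling2D(pool_size=2),                         # 14x14x6
        tf.keras.layers.Conv2D(16, 5, padding="valid", activation="relu"), # 10x10x16
        tf.keras.layers.MaxPooling2D(pool_size=2),                         # 5x5x16
        tf.keras.layers.Flatten(),                                         # 400
        tf.keras.layers.Dense(120, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(84, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(n_classes),                                  # logits for 43 classes
    ])
```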

Model Training

The code for training the model is located in the 8th and 9th cells of the IPython notebook, with the hyperparameters used in the 7th cell.

To train the model, I used a learning rate of 0.0005, which is higher than the default rate of 0.0001 that was used in the LeNet lab for MNIST recognition. This allowed the model to train much faster without overshooting too badly. In the future, I would consider starting with a larger rate and decaying it over time as the network starts to learn. The batch size was unchanged from MNIST at 128, though slightly smaller values like 100 also worked well. Finally, I increased the number of epochs to 30 since this model has more parameters and needs more time to learn.
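A hedged sketch of this training setup (the notebook's actual code uses TensorFlow's lower-level API; the `tf.keras` rendering below is an assumption):

```python
import tensorflow as tf

LEARNING_RATE = 0.0005   # vs. 0.0001 in the MNIST LeNet lab
BATCH_SIZE = 128
EPOCHS = 30

def compile_and_train(model, X_train, y_train, X_valid, y_valid, epochs=EPOCHS):
    """Compile with Adam and train, validating after each epoch."""
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE),
        # from_logits=True because the final layer emits raw logits
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model.fit(X_train, y_train,
                     batch_size=BATCH_SIZE, epochs=epochs,
                     validation_data=(X_valid, y_valid), verbose=0)
```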

Solution Approach

The code for calculating the accuracy of the model is located in the 11th cell of the IPython notebook. The 10th cell calculates the loss and accuracy data for the training and validation data.

My final model results were:

  • training set accuracy of 99.6%
  • validation set accuracy of 96.3%
  • test set accuracy of 95.1%

This project was achievable with a relatively simple modification of the LeNet-5 network architecture, as mentioned above. I iterated through various combinations of pre-processing to find a combination that worked well and seemed logical. As you can see in the charts below, the loss function and the training and validation accuracy increased nicely without a lot of overfitting. If I were to continue further experiments, I would consider the architecture choices mentioned in the paper by Sermanet & LeCun and other similar research.

Loss & Accuracy

Test a Model on New Images

Acquiring New Images

Here are five German traffic signs that I found on the web:

Test Images

The first three images should be easy to classify, but the 4th and 5th are skewed and may cause difficulty for the classifier, especially since I didn't augment the data with such perturbations. I load and show the images in the 12th cell of the IPython notebook.

Performance on New Images

The code for making predictions with my final model is located in the 13th cell of the IPython notebook.

Here are the results of the prediction:

| Image | Prediction |
|:---|:---|
| Speed limit (20km/h) | Speed limit (20km/h) |
| General caution | General caution |
| Double curve | Double curve |
| Keep right | Keep right |
| Roundabout mandatory | Roundabout mandatory |

Test Images Predictions

The model was able to correctly guess all 5 of the 5 traffic signs, for an accuracy of 100%. However, given that the model was only 95% accurate on the test data set, there is a chance other signs would not fare so well, or even that not all five of these signs would be detected properly if the network were retrained with different random weights.

Model Certainty - Softmax Probabilities

The code for calculating the top 5 probabilities with my final model is located in the 14th cell of the IPython notebook.

The first four images are predicted correctly at 67% or better. For the fifth image, the model is correctly predicting "Roundabout mandatory" but with only 41% confidence. This is very likely because of the extreme angle of the image. I suspect if I augmented the original training data before training using perturbations, this image would have been classified with higher confidence.
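The top-5 analysis amounts to applying softmax to the network's logits and sorting. A minimal NumPy sketch (the `sign_names` lookup is a hypothetical argument, not the notebook's variable):

```python
import numpy as np

def top_k_predictions(logits, sign_names, k=5):
    """Softmax over logits, then the k most probable classes per image."""
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # numerically stable softmax
    probs = exp / exp.sum(axis=1, keepdims=True)
    top = np.argsort(probs, axis=1)[:, ::-1][:, :k]           # indices, most probable first
    return [[(sign_names[c], float(probs[i, c])) for c in row]
            for i, row in enumerate(top)]
```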

Roundabout mandatory


Setup

  1. Clone the project and start the notebook.

     git clone https://github.com/anandman/CarND-Traffic-Sign-Classifier-Project
     cd CarND-Traffic-Sign-Classifier-Project
    
  2. Download the dataset and unzip the files into a directory named traffic-signs-data. This is a pickled dataset in which we've already resized the images to 32x32.

  3. Make sure you have an environment set up that includes Jupyter, Python 3.5+, NumPy, SciPy, Matplotlib, Pandas, OpenCV, and TensorFlow. You can get a complete setup by following these directions.

  4. Launch Jupyter to open the IPython notebook:

     jupyter notebook Traffic_Sign_Classifier.ipynb
    
  5. Open the IPython notebook in your browser using the instructions Jupyter gave you and run all the cells from the Cell menu.
