Optimizing-Image-Recognition-with-PyTorch-and-AlexNet

This project explores the construction and optimization of neural networks for image classification tasks, focusing on the AlexNet architecture. It includes building a basic neural network, optimizing it, implementing and improving AlexNet, and enhancing model performance with data augmentation.

Overview

The project covers several key areas:

  1. Building a Basic Neural Network: Begins with a foundational approach to binary classification, highlighting data preprocessing, model architecture, and performance metrics.
  2. Optimizing the Neural Network: Investigates the impact of hyperparameter tuning, including dropout rates, activation functions, and optimizers. Techniques such as Early Stopping and K-Fold Cross Validation are employed to enhance model performance.
  3. Implementing & Improving AlexNet: Adapts AlexNet to classify images into categories such as dogs, vehicles, and food, reaching over 90% test accuracy.
  4. Optimizing CNN & Data Augmentation: Utilizes the Street View House Numbers (SVHN) dataset, applying data augmentation to improve the model's generalization and performance.

Folder Structure

Dataset Link:

https://kaggle.com/datasets/c3a1ad7d0ab948e6bf9f2242ae06247ed5ff8adc894215aa1a292992ea9d99bc

├── datasets/                      # Folder containing the datasets used in the project
│   ├── dogs/          
│   ├── foods/     
│   └── vehicles/

Python Code :

AlexNet_Image_Classification_Project/
│
├── basic_NN_construction.ipynb                      # Jupyter notebooks with all coding experiments
├── advanced_NN_optimization_experiments.ipynb
├── alexnet_implementation.ipynb
└── alexnet_optimization_and_data_augmentation.ipynb

Project Structure

  • basic_NN_construction.ipynb: Details the construction and initial experimentation of the basic neural network.
  • advanced_NN_optimization_experiments.ipynb: Explores further experiments and optimizations.
  • alexnet_implementation.ipynb: Demonstrates AlexNet implementations and optimizations.
  • alexnet_optimization_and_data_augmentation.ipynb: Advanced optimization and data augmentation (VGG13).

Getting Started

To replicate our findings or build upon them, ensure you have the following:

  • Python 3.x
  • PyTorch (torch)
  • torchvision
  • NumPy
  • Matplotlib for plotting

pip install torch torchvision numpy matplotlib

Running the Project

  1. Clone this repository.
  2. Ensure the necessary dependencies are installed.
  3. Follow the notebooks sequentially to grasp the flow from data preprocessing to model optimization.

Data Preparation

The project utilizes diverse datasets, including images of dogs, vehicles, food, and street view house numbers. Key steps in data preparation include:

  • Normalization: Pixel values are normalized to have a mean of 0 and standard deviation of 1, facilitating model training and convergence.
  • Augmentation: To enhance model robustness, techniques such as random rotations and resizing are applied, creating a more diverse training set.

Model Architecture

Basic Neural Network

The initial model is a straightforward neural network designed for binary classification, featuring layers with ReLU and sigmoid activation functions.
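A minimal sketch of such a binary classifier in PyTorch, with hidden ReLU layers and a sigmoid output. The layer sizes and input dimension are placeholder assumptions, not the notebook's exact configuration:

```python
import torch
import torch.nn as nn

class BasicBinaryNet(nn.Module):
    """Small feed-forward network for binary classification."""

    def __init__(self, in_features=20, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
            nn.Sigmoid(),   # output: probability of the positive class
        )

    def forward(self, x):
        return self.net(x)

model = BasicBinaryNet()
probs = model(torch.randn(8, 20))   # batch of 8 samples, 20 features each
```

Training such a model would pair the sigmoid output with `nn.BCELoss` (or drop the sigmoid and use `nn.BCEWithLogitsLoss` for better numerical stability).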

AlexNet Adaptation

The architecture mimics the original AlexNet with adjustments for the specific datasets used. It includes convolutional layers, max-pooling, and fully connected layers, employing ReLU activations and dropout for regularization.

Optimizations

  • Early Stopping: Monitors validation loss and halts training once it stops improving, preventing overfitting.
  • K-Fold Cross Validation: Ensures model reliability and generalizability across different data subsets.
  • Learning Rate Scheduler: Adjusts the learning rate dynamically, optimizing the training phase.
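Early stopping and a learning-rate scheduler can be combined around a training loop as sketched below. The model, data, patience, and scheduler settings are placeholders, and the training loss stands in for a proper validation pass:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                      # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)
loss_fn = nn.MSELoss()

x, y = torch.randn(32, 10), torch.randn(32, 1)  # placeholder data
best_loss, patience, bad_epochs = float("inf"), 3, 0

for epoch in range(50):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                  # halve the learning rate every 5 epochs

    val_loss = loss.item()            # stand-in for a real validation pass
    if val_loss < best_loss - 1e-4:   # meaningful improvement
        best_loss, bad_epochs = val_loss, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:    # no improvement for `patience` epochs
            break                     # early stop
```

K-Fold Cross Validation, the remaining technique above, would wrap this whole loop: the dataset is split into K folds, and the loop runs K times with a different fold held out for validation each time.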

VGG13

VGG-13 is a convolutional neural network known for its architectural simplicity: 13 layers with weights (convolutional plus fully connected), developed for large-scale image recognition tasks.

Results

The adapted AlexNet reached a peak test accuracy of 90.18% with a training loss of 0.3100, demonstrating strong generalization to unseen data.
