Collection of Python tutorials on computer vision accompanying the “Bird by Bird Tech” series of articles.

Home Page: https://slipnitskaya.medium.com

Bird by Bird AI Tutorials

Python tutorials on computer vision for classification of bird species

This repository contains materials accompanying the “Bird by Bird Tech” series of articles published on Medium.

Motivation

Here, we tackle fine-grained classification of bird species, an established problem in computer vision. The first part of the tutorials demonstrates how to use CNN models to classify bird images from the Caltech-UCSD Birds-200-2011 (CUB-200-2011) dataset using PyTorch. By the end of these tutorials, you will be able to:

  • Understand the basics of the image classification problem for bird species.
  • Determine a data-driven image pre-processing strategy.
  • Create your own deep learning pipeline for image classification.
  • Build, train and evaluate a ResNet-50 model to predict bird species.
  • Enhance the CNN's performance using different techniques.
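
As a rough illustration of what "data-driven" pre-processing can mean, the sketch below summarizes image sizes before choosing a resize or padding scheme. It is not taken from the notebooks: the size list is made up, and `summarize` and `pad_to_square` are hypothetical helper names used only for this example.

```python
from statistics import median

# Made-up (width, height) pairs standing in for sizes read from image files.
sizes = [(500, 375), (500, 333), (400, 500), (320, 480), (500, 400)]


def summarize(sizes):
    """Collect simple size statistics that can guide the choice of input resolution."""
    widths = [w for w, _ in sizes]
    heights = [h for _, h in sizes]
    ratios = [w / h for w, h in sizes]
    return {
        "median_width": median(widths),
        "median_height": median(heights),
        "min_side": min(min(w, h) for w, h in sizes),
        "median_aspect_ratio": median(ratios),
    }


def pad_to_square(width, height):
    """Return (left, top, right, bottom) padding that makes the image square."""
    side = max(width, height)
    pad_w, pad_h = side - width, side - height
    # Split the padding as evenly as possible between the two opposite edges.
    return (pad_w // 2, pad_h // 2, pad_w - pad_w // 2, pad_h - pad_h // 2)


stats = summarize(sizes)
print(stats)
print(pad_to_square(500, 375))
```

Padding to a square before resizing is one option among several (cropping or aspect-preserving resizing are others); the statistics help decide which one distorts the images least.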

Structure

The tutorials are organized as follows:

  • Part 1: “Advancing CNN model for fine-grained classification of birds” (notebook, article).
  • Part 2: “Finite automata simulation for leveraging AI-assisted systems” (notebook, article, tutorial).
  • Part 3: “Optimizing AI-based systems on object detection using Monte-Carlo” [TBA].
  • Part 4: “Interpretable deep learning for computer vision” [TBA].
  • Part 5: “Multimodal data fusion approach for bird classification” [TBA].

Summary

Part 1 demonstrates how to perform data-driven image pre-processing, build a baseline ResNet-based classifier, and further improve its performance for bird classification using different approaches. The results indicate that the final variant of the ResNet-50 model, advanced with transfer learning, multi-task learning, and an attention module, yields considerably more accurate bird predictions. Part 2 focuses on simulation modelling with finite state machines for AI-assisted computer vision systems, aiming at more efficient bird detection. More information on the experimental design and results can be found in the notebooks and articles.
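
To give a feel for the finite state machine idea behind Part 2, here is a minimal sketch in plain Python. The states, events, and transition table are hypothetical and chosen only for illustration; the actual simulation design is described in the Part 2 notebook and article.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()         # nothing in front of the camera
    DETECTING = auto()    # motion seen, detector is running
    CLASSIFYING = auto()  # a bird was found, classifier is running


# Hypothetical transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.IDLE, "motion"): State.DETECTING,
    (State.DETECTING, "bird_found"): State.CLASSIFYING,
    (State.DETECTING, "nothing_found"): State.IDLE,
    (State.CLASSIFYING, "done"): State.IDLE,
}


def run(events, start=State.IDLE):
    """Feed events to the machine and return the sequence of visited states."""
    state = start
    visited = [state]
    for event in events:
        # Events with no matching transition leave the state unchanged.
        state = TRANSITIONS.get((state, event), state)
        visited.append(state)
    return visited


trace = run(["motion", "nothing_found", "motion", "bird_found", "done"])
print(" -> ".join(s.name for s in trace))
```

Keeping the transitions in a dictionary makes the machine easy to inspect and extend, and a simulation can count how often the costly states are entered for a given stream of events.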

Libraries

Before running the code, make sure to install the project dependencies listed in the requirements file.

License

Except as otherwise noted, the content of this repository is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license, and code samples are licensed under the Apache 2.0 License. All materials can be freely used, distributed and adapted for non-commercial purposes only, given appropriate attribution to the licensor and/or a reference to this repository.

SPDX-License-Identifier: CC-BY-NC-4.0 AND Apache-2.0
