iiit-bangalore-march-april-2019's Introduction

Visual Recognition Course (March April 2019)

Basic Info

  • Where: IIIT-Bangalore
  • When: March/April 2019
  • Who: Anush Sankaran, IBM Research AI (co-instructed with Prof. Dinesh Babu Jayagopi)

Course Overview

Date | Topic | Content | Slides | Notes
22nd March, 2019 (Friday) | Introduction to Image Classification, Neural Networks, and Optimization | What is visual recognition? - Logistic regression - Stochastic Gradient Descent - Multilayer perceptron - Backpropagation - DL + ML Pipeline | slides |
30th March, 2019 (Saturday) | Unsupervised Feature Learning, Autoencoders, Convolutional Neural Networks | Popular applications of DL - Stacked autoencoders - Convolution & Pooling layers - Convolutional autoencoder | slides | Notebook
6th April, 2019 (Saturday) | Hyper-parameter optimization, Training Process | Convolutional neural network - One time model setup - Hyper-parameter optimization | slides | Notebook
10th April, 2019 (Wednesday) | Different CNN Architectures | Data Augmentation - Transfer Learning - Comparison of Different CNN Architectures - Watson Studio Hands-on | slides | Watson Studio: How To
20th April, 2019 (Saturday) | Generative Modelling | Unsupervised learning - Distribution fitting - PixelRNN/CNN - Variational Autoencoder (VAE) - Generative Adversarial Network (GAN) - Open source GAN toolkit | slides | Open source GAN Toolkit
27th April, 2019 (Saturday) | CNN Visualization and Face Recognition | Neuron Visualization - Guided BackProp - Grad-CAM - Face Classification - Face Generation - DeepFake - Model Trust | slides |

Acknowledgement

References (and they have better slides!): with great respect for their slides, hard work, and effort, I acknowledge them. It only makes sense to reuse some parts of their slides.

iiit-bangalore-march-april-2019's Issues

Assignment #2 updates: New model and some code

import keras
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, Flatten
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras.callbacks import ReduceLROnPlateau
from keras.preprocessing.image import ImageDataGenerator
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model

def resnet_layer(inputs,
                 num_filters=16,
                 kernel_size=3,
                 strides=1,
                 activation='relu',
                 batch_normalization=True,
                 conv_first=True):
    """2D Convolution-Batch Normalization-Activation stack builder

    # Arguments
        inputs (tensor): input tensor from input image or previous layer
        num_filters (int): Conv2D number of filters
        kernel_size (int): Conv2D square kernel dimensions
        strides (int): Conv2D square stride dimensions
        activation (string): activation name
        batch_normalization (bool): whether to include batch normalization
        conv_first (bool): conv-bn-activation (True) or
            bn-activation-conv (False)

    # Returns
        x (tensor): tensor as input to the next layer
    """
    conv = Conv2D(num_filters,
                  kernel_size=kernel_size,
                  strides=strides,
                  padding='same',
                  kernel_initializer='he_normal',
                  kernel_regularizer=l2(1e-4))

    x = inputs
    if conv_first:
        x = conv(x)
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
    else:
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
        x = conv(x)
    return x


def resnet_v1(input_shape, depth, num_classes=10):
    """ResNet Version 1 Model builder [a]

    Stacks of 2 x (3 x 3) Conv2D-BN-ReLU
    Last ReLU is after the shortcut connection.
    At the beginning of each stage, the feature map size is halved (downsampled)
    by a convolutional layer with strides=2, while the number of filters is
    doubled. Within each stage, the layers have the same number of filters and
    the same feature map sizes.
    Feature map sizes:
    stage 0: 32x32, 16
    stage 1: 16x16, 32
    stage 2:  8x8,  64
    The Number of parameters is approx the same as Table 6 of [a]:
    ResNet20 0.27M
    ResNet32 0.46M
    ResNet44 0.66M
    ResNet56 0.85M
    ResNet110 1.7M

    # Arguments
        input_shape (tensor): shape of input image tensor
        depth (int): number of core convolutional layers
        num_classes (int): number of classes (CIFAR10 has 10)

    # Returns
        model (Model): Keras model instance
    """
    if (depth - 2) % 6 != 0:
        raise ValueError('depth should be 6n+2 (eg 20, 32, 44 in [a])')
    # Start model definition.
    num_filters = 16
    num_res_blocks = (depth - 2) // 6

    inputs = Input(shape=input_shape)
    x = resnet_layer(inputs=inputs)
    # Instantiate the stack of residual units
    for stack in range(3):
        for res_block in range(num_res_blocks):
            strides = 1
            if stack > 0 and res_block == 0:  # first layer but not first stack
                strides = 2  # downsample
            y = resnet_layer(inputs=x,
                             num_filters=num_filters,
                             strides=strides)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters,
                             activation=None)
            if stack > 0 and res_block == 0:  # first layer but not first stack
                # linear projection residual shortcut connection to match
                # changed dims
                x = resnet_layer(inputs=x,
                                 num_filters=num_filters,
                                 kernel_size=1,
                                 strides=strides,
                                 activation=None,
                                 batch_normalization=False)
            x = keras.layers.add([x, y])
            x = Activation('relu')(x)
        num_filters *= 2

    # Add classifier on top.
    # v1 does not use BN after last shortcut connection-ReLU
    x = AveragePooling2D(pool_size=8)(x)
    y = Flatten()(x)
    outputs = Dense(num_classes,
                    activation='softmax',
                    kernel_initializer='he_normal')(y)

    # Instantiate model.
    model = Model(inputs=inputs, outputs=outputs)
    return model


model = resnet_v1(input_shape=(32,32,3), depth=20)
model.summary()
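
As a quick check on the 6n+2 rule enforced above, the sketch below reproduces the per-stage layout in plain Python without building the Keras model. The helper name `resnet_v1_layout` is hypothetical and not part of the assignment code, and it assumes a 32x32 input as in CIFAR-10.

```python
def resnet_v1_layout(depth, input_size=32, base_filters=16):
    """Return (num_res_blocks, stages) for a ResNet v1 of the given depth.

    Hypothetical helper for illustration only: it mirrors the loop structure
    of resnet_v1 above (3 stacks, resolution halved at the start of stacks
    1 and 2, filters doubled after every stack).
    """
    if (depth - 2) % 6 != 0:
        raise ValueError('depth should be 6n+2 (eg 20, 32, 44)')
    num_res_blocks = (depth - 2) // 6
    stages = []
    size, filters = input_size, base_filters
    for stack in range(3):
        if stack > 0:
            size //= 2      # first block of this stack uses strides=2
        stages.append((size, filters))
        filters *= 2        # filters are doubled after every stack
    return num_res_blocks, stages


blocks, stages = resnet_v1_layout(20)
print(blocks)   # 3
print(stages)   # [(32, 16), (16, 32), (8, 64)]
```

With depth=20 this gives n=3 residual blocks per stack: one initial convolution, 3 stacks x 3 blocks x 2 convolutions, and the final dense layer add up to 20. The last stage is 8x8, which is why the model uses AveragePooling2D(pool_size=8) before the classifier.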

Note: InceptionV3 can be used with CIFAR-10 by resizing the CIFAR-10 images to (139, 139, 3). Instead of doing that, please use the above model for Assignment 2. This model does not have pre-trained ImageNet weights, so you need to train it from scratch. Training from scratch will help you to

Please try exploring the following hyperparameters and training strategies:

  • Layer activation function
  • Data preprocessing, normalisation
  • Dropout (where to apply, what is the strength)
  • BatchNorm (where to apply)
  • Loss function
  • Learning rate and learning strategy
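
For the "learning rate and learning strategy" item, one common option is step decay. The following is a minimal sketch in plain Python; the factory name `make_step_decay` and every default value are illustrative assumptions, not recommended settings for the assignment.

```python
def make_step_decay(base_lr=1e-3, drop=0.1, epochs_per_drop=30):
    """Build a schedule that multiplies the rate by `drop` every
    `epochs_per_drop` epochs. All defaults are illustrative assumptions."""
    def schedule(epoch, lr=None):
        # `lr` (the optimizer's current rate) is accepted but ignored, so the
        # result depends only on the epoch index.
        return base_lr * drop ** (epoch // epochs_per_drop)
    return schedule


schedule = make_step_decay()
for epoch in (0, 29, 30, 60):
    print(epoch, schedule(epoch))
```

A function of this shape should be usable with the `LearningRateScheduler` callback imported in the model code, which calls the schedule with the epoch index.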

Bonus marks are also available for trying:

  • Data augmentation
  • Transfer learning

Report the final set of hyperparameters that gave you the best accuracy.

You will be evaluated on which values you explored for each parameter and in what order. Please explain how you babysit the learning process.
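
One way to keep track of which values were explored, and in what order, is to log every trial. The sketch below assumes a random search over a few of the listed hyperparameters. The function names, search ranges, and `placeholder_evaluate` are all hypothetical, and the placeholder score stands in for a real training run: it is not an accuracy measurement.

```python
import math
import random


def sample_config(rng):
    """Draw one configuration; the ranges are illustrative assumptions."""
    return {
        # Sample the exponent uniformly so the rate is log-uniform in [1e-5, 1e-1].
        'learning_rate': 10 ** rng.uniform(-5, -1),
        'dropout': rng.choice([0.0, 0.25, 0.5]),
        'activation': rng.choice(['relu', 'elu', 'tanh']),
    }


def run_search(evaluate, num_trials=5, seed=0):
    """Evaluate num_trials sampled configurations and keep an ordered log."""
    rng = random.Random(seed)
    log = []
    for trial in range(num_trials):
        config = sample_config(rng)
        log.append({'trial': trial,
                    'config': config,
                    'val_accuracy': evaluate(config)})
    best = max(log, key=lambda entry: entry['val_accuracy'])
    return log, best


def placeholder_evaluate(config):
    """Stand-in score so the sketch runs without training. NOT a real result.

    Replace with a function that trains the model and returns validation
    accuracy for the given configuration.
    """
    return -abs(math.log10(config['learning_rate']) + 3)


log, best = run_search(placeholder_evaluate)
for entry in log:
    print(entry['trial'], entry['config'], round(entry['val_accuracy'], 3))
print('best trial:', best['trial'])
```

The ordered log doubles as the record of the exploration that the report asks for.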

Please note that there is no single correct answer to this question.

Assignment 1 Part B

Hi Anush,

For converting a random noisy image to an MNIST image, is it necessary to start with a (4, 4, 8) image, or can we choose some other starting dimension?

Also, a simple three-layer decoder using a (4, 4, 8) input is not able to reproduce anything, even after 100 epochs of training. Can we use the concepts learnt in the last lecture to modify our decoder into something more powerful?

Request for feedback

As this is my first formal teaching experience, your feedback on the strengths and weaknesses of my teaching would really help me. In particular, it would help when teaching your batch again in the following semester.
Please be critical in your review; no feedback will be taken negatively.

Assignment 3 Question 1

For question 1, I do not understand the binary classification part.
My understanding is that we do face recognition using a CNN and then, at test time, pick two images and classify the pair into one of the two classes, either by comparing the predicted labels or by applying a distance metric with a threshold to the extracted features. Is my understanding right?
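
The distance-and-threshold reading described in this question can be sketched as follows. This is only one interpretation of the question, not a confirmed answer from the instructor. The function names are hypothetical, the feature vectors are made-up toy values standing in for CNN features, and the threshold of 0.5 is an arbitrary assumption that would need tuning on validation pairs.

```python
import math


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def same_person(feat_a, feat_b, threshold=0.5):
    """Binary decision: True if the two feature vectors are close enough.

    The threshold is an arbitrary assumption for illustration.
    """
    return cosine_distance(feat_a, feat_b) < threshold


# Toy vectors standing in for features extracted by a CNN.
print(same_person([1.0, 0.0], [2.0, 0.0]))   # True: same direction, distance 0
print(same_person([1.0, 0.0], [0.0, 1.0]))   # False: orthogonal, distance 1
```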

Assignment 2

I couldn't find an option for GPU-powered notebooks in the Environment tab.
Attaching the screenshot.

Assignment 2 - Pre trained weights

Hi Anush,

Just a small clarification needed.

Are we allowed to use the pre-trained weights for the Inception V3 network (ImageNet weights are available in Keras) and do transfer learning, or should we just use the architecture and train the model from scratch?

Query regarding Assignment 3

I received a question about Assignment 3 by email. I am copying the question and answer here so that it can help the broader group.

Question:
I had a query about generating faces using a GAN. Do we have to train the GAN from scratch using the IIITB face dataset, or use a model pretrained on the CelebA dataset and apply transfer learning on top of it?
