Lectures - Deep Learning for Computer Vision / Aug-Nov 2022

You can download the lectures here. We will try to upload lectures prior to their corresponding classes.

(dl4cv-1) Image Classification
tl;dr: Image classification, a fundamental task of computer vision, and simple algorithms to solve it.
[slides]
(dl4cv-2) Optimization
tl;dr: Process of finding the best parameters.
[slides]
(dl4cv-3) Neural Networks - Perceptron
tl;dr: Basic artificial neurons: the McCulloch-Pitts (MP) neuron and the Perceptron.
[perceptron-code] [slides]
Suggested Readings:
- Perceptron convergence proof
(dl4cv-4) Neural Networks - MLP
tl;dr: Multi-Layered Network of Perceptrons.
[perceptron-code] [slides]
(dl4cv-5) Backpropagation
tl;dr: Elegant technique that efficiently computes the gradients needed to train neural networks with gradient descent.
[Tensor-basics-code] [Autograd-example-code] [slides]
Suggested Readings:
- 5min-Tensor-basics
(dl4cv-6) Building blocks of CNNs
tl;dr: Modules that constitute a Convolutional Neural Network (CNN).
[slides]
Suggested Readings:
- Simple CNN training
(dl4cv-7) CNN Architectures
tl;dr: Evolution of the design principles and the resulting CNN architectures.
[slides]
(dl4cv-8) Training DNNs
tl;dr: Important aspects of training deep neural networks.
[slides]
(dl4cv-9) Training DNNs-II
tl;dr: Some more important aspects of training deep neural networks.
[sgd_update_rules.gif] [slides]
(dl4cv-10) RNNs
tl;dr: Beyond feed-forward neural nets: processing sequential data!
[Sample-sequential-task] [slides]
(dl4cv-11) Attention
tl;dr: Attention in sequence-to-sequence tasks using RNNs.
[slides]
(dl4cv-12) Word Embeddings
tl;dr: Representing words in NLP-related tasks.
[slides]
(dl4cv-13) Visualizing and Understanding CNNs
tl;dr: What do CNNs learn? Why do they predict what they predict?
[slides]
(dl4cv-14) Object Detection
tl;dr: Task of simultaneously locating and classifying multiple objects present in an image using CNNs.
[slides]
(dl4cv-15) Semantic Segmentation
tl;dr: The objective is to assign a class label to each pixel in an image using CNNs.
[slides]
(dl4cv-16) Video Classification
tl;dr: Recognize the actions in videos using CNNs.
[slides]

(dl4cv-17) Generative Models
tl;dr: ML models that learn the underlying distribution of the data and can generate new samples from it.
[slides]
Suggested Readings:
- Pixel RNN, Google DeepMind, ICML 2016
- Pixel CNN, Google DeepMind, NeurIPS 2016
(dl4cv-17a) Autoencoders
tl;dr: Neural Networks that encode unlabeled data into lower dimensional subspaces driven by the reconstruction objective.
[slides]
(dl4cv-17b) Variational Autoencoder
tl;dr: Adding stochastic modules to an Autoencoder makes it a generative model.
[slides]
Suggested Readings:
- Auto-Encoding Variational Bayes, ICLR 2014
- Blogpost on VAE
(dl4cv-17c) Generative Adversarial Networks
tl;dr: Generative models with implicit density modeling for generating samples that resemble real data.
[example-code] [slides]
Suggested Readings:
- GANs, Goodfellow et al., NIPS 2014
- GAN Zoo
(dl4cv-18) Adversarial Images
tl;dr: Inputs that are crafted (by adding carefully designed noise) to fool trained DL systems.
[slides]
(dl4cv-19) Learning Efficient DL Models (Compressing the Models)
tl;dr: Making huge, power-hungry models lightweight.
[slides]
(dl4cv-20) Attention++
tl;dr: Attention is all that we need?!
[slides]
(dl4cv-0) Introduction
tl;dr: Introduction to DL4CV and logistics of this course.
[slides]
Suggested Readings:
- Brief history of CV by Rostyslav Demush