Week 4: Convolutional Neural Networks
Abstract:
This lecture will introduce convolutional neural networks.
from conv_imports import *
%matplotlib notebook
Plan for the Day
 Introduction
 Inductive bias
 Inductive bias in NNs
 Convolutional operator
 Translation invariance in images
 Convolutional Neural Networks
 Advanced convolution variants
 Parameter-free convolution
 Depthwise separability & shuffle
 Deconvolution & super-resolution
 Graph CNNs
Inductive bias
What is inductive bias?
 An assumption that allows the model to make predictions about never-seen data.
 All models carry some form of inductive bias.
 e.g.: linear vs. nonlinear model, the type of nonlinearity...
Why do we need it?
 Data is expensive to gather or generate.
 Data is sparse: usually, we observe only a sample of the true data population.
 Generalization is necessary for both academic and industrial use of (deep) models.
Inductive bias in NNs
A simple universal approximator that makes no assumptions about its input structure.
Inductive bias in NNs
 Common to all NNs
 Layered feature maps
 Nodes within a layer are assumed never to interact
 Inputs are processed on a directed acyclic graph
 Gradient descent
 Preference for models trainable by gradient descent given the chosen initializer (recall week 2: Optimization and Stochastic Gradient Descent)
 Layered feature maps
 Layer-specific inductive biases (relational inductive biases):
 Locality: CNNs prefer models that are invariant to spatial translations
 Sequentiality: RNNs prefer models that are invariant to time translations
 Arbitrary relations: Graph NNs prefer models that are invariant to node & edge permutations
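The locality bias can be checked numerically: shifting a convolution's input shifts its output by the same amount (strictly speaking the operator is translation-equivariant; invariance is usually obtained later via pooling). A minimal numpy sketch, where conv2d_valid is a helper defined here rather than taken from the lecture's imports:

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive 'valid' 2-D cross-correlation, as used in CNN layers."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))

# Shift the input one pixel to the right; the output shifts by one pixel too.
shifted = np.roll(img, shift=1, axis=1)
out = conv2d_valid(img, kernel)
out_shifted = conv2d_valid(shifted, kernel)
# Interior columns match up to the same one-pixel shift.
assert np.allclose(out[:, :-1], out_shifted[:, 1:])
```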
Inductive bias in NNs
Plain English: An inductive bias allows a learning algorithm to prioritize one solution (or interpretation) over another, independent of the observed data.
Bayesian perspective: inductive biases are captured by the model's prior distribution. In this context, the specific choice of an NN architecture can be seen as an extreme choice of a prior.
\[ \large p(\theta \mid X,Y)= \huge \frac{p(Y \mid X,\theta)\,p(\theta)}{p(Y \mid X)}\]
Technical perspective: all NN layers can be seen as special cases of fully connected layers. Inductive biases are expressed by the weightsharing restrictions that define any given layer type.
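The "special case of fully connected layers" claim can be made concrete in a few lines: a 1-D convolution equals multiplication by a weight matrix that is banded (locality) and Toeplitz (weight sharing). A minimal numpy sketch with illustrative values:

```python
import numpy as np

# A 1-D convolution is a dense layer with two restrictions: each row of the
# weight matrix is zero outside a small window (locality), and all rows reuse
# the same kernel values (weight sharing).
w = np.array([1.0, -2.0, 1.0])           # convolution kernel, size 3
x = np.array([4.0, 1.0, 0.0, 2.0, 5.0])  # input signal, length 5

n_out = len(x) - len(w) + 1              # 'valid' convolution -> 3 outputs
W = np.zeros((n_out, len(x)))
for i in range(n_out):
    W[i, i:i + len(w)] = w               # the shared kernel, shifted per row

conv_out = np.array([np.dot(w, x[i:i + len(w)]) for i in range(n_out)])
# The dense-layer view and the sliding-window view agree exactly.
assert np.allclose(W @ x, conv_out)
```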
Plan for the day
 Introduction
 Inductive bias
 Inductive bias in NNs
 Convolutional operator
 Translation invariance in images
 Convolutional Neural Networks
 Advanced convolution variants
 Parameter-free convolution
 Depthwise separability & shuffle
 Deconvolution & super-resolution
 Graph CNNs
Translation invariance in CNNs


Convolutional Neural Networks
\[ \large Out_{i,j}= b + \sum_{x=1}^3 \sum_{y=1}^3 In_{(i+x-1),(j+y-1)}W_{x,y}\]
Image | Kernel | Output




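The formula can be checked directly. The sketch below rewrites it with 0-based indexing (the slide's sums are 1-based) on an assumed toy 5x5 input; since the ramp input has zero discrete Laplacian under this kernel, every output pixel reduces to the bias b:

```python
import numpy as np

In = np.arange(25, dtype=float).reshape(5, 5)   # toy 5x5 input, In[i,j] = 5i + j
W = np.array([[0.,  1., 0.],
              [1., -4., 1.],
              [0.,  1., 0.]])                   # 3x3 Laplacian kernel
b = 0.5

def out_pixel(i, j):
    # The slide's sum, shifted to 0-based indices: x, y run over 0..2.
    return b + sum(In[i + x, j + y] * W[x, y] for x in range(3) for y in range(3))

# Each 3x3 patch reduces to one number, so a 5x5 input gives a 3x3 output.
Out = np.array([[out_pixel(i, j) for j in range(3)] for i in range(3)])
# The ramp input is affine, so the Laplacian term vanishes and Out == b everywhere.
assert np.allclose(Out, 0.5)
```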
Convolutional Neural Networks
\[ \large D_{out} = \huge \frac{(D_{in} - K + 2P)}{2} \large + 1 \]
Kernel size (3x3) | Stride (2) | Padding (1)




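A quick numeric check of the output-size formula, in its general form D_out = (D_in - K + 2P)/S + 1 (the slide's denominator is the stride, here S = 2). Both helpers below are illustrative sketches, not part of the lecture's conv_imports:

```python
import numpy as np

def conv_output_size(d_in, k, p, s):
    # General output-size formula; the slide instantiates it with s = 2.
    return (d_in - k + 2 * p) // s + 1

def strided_conv(img, kernel, stride, pad):
    """Naive square strided convolution with zero padding."""
    img = np.pad(img, pad)
    k = kernel.shape[0]
    d_out = (img.shape[0] - k) // stride + 1
    return np.array([[np.sum(img[i*stride:i*stride+k, j*stride:j*stride+k] * kernel)
                      for j in range(d_out)] for i in range(d_out)])

img = np.ones((32, 32))
kernel = np.ones((3, 3))
out = strided_conv(img, kernel, stride=2, pad=1)
# The slide's setting: K=3, S=2, P=1 halves the spatial size (32 -> 16).
assert out.shape[0] == conv_output_size(32, k=3, p=1, s=2)
```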
Weight sharing in CNNs


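One payoff of weight sharing is easy to show with arithmetic: the convolution's parameter count is independent of the image size. An illustrative comparison (the layer sizes are assumed, not from the lecture): mapping a 32x32 RGB image to 16 feature maps of the same spatial size.

```python
# Fully connected: one weight per (input pixel, output unit) pair.
dense_params = (32 * 32 * 3) * (32 * 32 * 16)

# Convolutional: 16 filters of shape 3x3x3, plus one bias per filter.
conv_params = 3 * 3 * 3 * 16 + 16

print(dense_params)  # 50331648 (~50M weights)
print(conv_params)   # 448
```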
Numerical example of a CNN
Numerical example of a CNN: code
image = show_car(CIFAR10_trainset)
Numerical example of a CNN: code
conv = show_random_kernel()
Numerical example of a CNN: code
fig, num_steps, im, final_fm = show_conv_anim(image, conv)
anim = animation.FuncAnimation(fig, update_image, num_steps,
                               fargs=(fig, im, final_fm), interval=200, blit=True)
CNN as an edge detector
image = show_cute_fox()
CNN as an edge detector
kernel = np.array([[0,  1, 0],
                   [1, -4, 1],
                   [0,  1, 0]])
conv = show_edge_detector(kernel.T)
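This edge-detecting (Laplacian) kernel responds only where intensity changes. A self-contained numpy sketch on a synthetic image with a vertical step edge (the helper here is defined locally, not from the lecture code):

```python
import numpy as np

kernel = np.array([[0.,  1., 0.],
                   [1., -4., 1.],
                   [0.,  1., 0.]])          # discrete Laplacian

img = np.zeros((7, 7))
img[:, 4:] = 1.0                            # dark left half, bright right half

def conv2d_valid(img, k):
    """Naive 'valid' 2-D cross-correlation."""
    kh, kw = k.shape
    return np.array([[np.sum(img[i:i+kh, j:j+kw] * k)
                      for j in range(img.shape[1] - kw + 1)]
                     for i in range(img.shape[0] - kh + 1)])

resp = conv2d_valid(img, kernel)
# Flat regions give exactly zero (the kernel's entries sum to 0);
# only the columns straddling the step respond.
assert np.all(resp[:, [0, 1, 4]] == 0)
assert np.any(resp[:, 2:4] != 0)
```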
CNN as an edge detector
fig, num_steps, im, final_fm = show_edge_detection(image, conv)
anim = animation.FuncAnimation(fig, update_image_no_annotation, range(0, num_steps, 500),
                               fargs=(im, final_fm), interval=100, blit=True)
Learned CNN features: setup
convlayer, bnlayer, relulayer = get_mobilenet_convbnrelu()
blue_fox = show_224px_fox()
Downloading: "https://download.pytorch.org/models/mobilenet_v2-b0353104.pth" to /root/.cache/torch/hub/checkpoints/mobilenet_v2-b0353104.pth
Learned CNN features: weights at layer 3
fig, num_steps, ims = show_conv_weights(convlayer)
anim = animation.FuncAnimation(fig, update_row, num_steps,
                               fargs=(ims, convlayer.weight), interval=1000, blit=True)
Learned CNN features: convolution preactivation at layer 3
fig, num_steps, ims, convoutput = show_layer_output(blue_fox, convlayer)
anim = animation.FuncAnimation(fig, update_row_output, num_steps,
                               fargs=(ims, convoutput), interval=1000, blit=True)