%reload_ext autoreload
%autoreload 2
%matplotlib inline
How convolutions work
from exp.nb_07 import *
from PIL import Image
import numpy as np
Let's grab our MNIST dataset.
x_train, y_train, x_valid, y_valid = get_data()
x_train.shape
Each row of the dataset is a single flattened image:
x_train[0].shape
But to view them properly we'll reshape the dataset for now into a rank 3 tensor:
[images, height, width]
x_train = x_train.view(-1, 28,28)
five = x_train[0]
plt.imshow(five)
Let's try to create a top edge detection kernel from scratch and convolve it over the image.
k = tensor([
[1.,1.,1.],
[-1.,-1.,-1.,],
[0.,0.,0.]
]); k.shape
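PyTorch's F.conv2d
requires an input of shape [batch size, channels, height, width],
so we'll reshape using view.
Then we can pass it into F.conv2d
with our kernel, whose weight must have shape [filters, channels, height, width]
(hence k[None, None]). With stride=1 and no padding
the result will be a rank 4 tensor
`[batch size, filters, height, width]`, which for a 28 by 28 image and a 3 by 3 kernel is 26 by 26 spatially.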
top = F.conv2d(five.view(1,1,28,28), k[None, None])
top.shape
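The spatial size reported above follows from the usual output-size arithmetic for a convolution. Here is a minimal sketch of that formula in plain Python; the helper name `conv_output_size` is made up for illustration, and it assumes a dilation of 1.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length along one spatial dimension (assumes dilation=1)."""
    return (size + 2 * padding - kernel) // stride + 1

# 28 pixel side, 3x3 kernel, stride 1, no padding: the case used in this notebook
print(conv_output_size(28, 3))                        # 26
# padding=1 keeps the spatial size unchanged for a 3x3 kernel
print(conv_output_size(28, 3, padding=1))             # 28
# stride 2 roughly halves it
print(conv_output_size(28, 3, stride=2, padding=1))   # 14
```

So the 28 by 28 digit shrinks by one pixel on each side when no padding is used.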
plt.imshow(top.squeeze())
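To make the arithmetic behind that call visible, here is a from-scratch sketch in plain Python lists. It slides the kernel over the image without flipping it (cross-correlation, which is what F.conv2d computes), with stride 1 and no padding. The function name `cross_correlate2d` and the 5 by 5 `toy` image are made up for illustration; only the kernel values come from the cell above.

```python
def cross_correlate2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            # Multiply the 3x3 patch at (i, j) elementwise with the kernel
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        out.append(row)
    return out

# Same values as the kernel k defined earlier
k = [[1., 1., 1.],
     [-1., -1., -1.],
     [0., 0., 0.]]

# Hypothetical toy image: dark (0) on top, bright (1) below
toy = [[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1],
       [1, 1, 1, 1, 1]]

result = cross_correlate2d(toy, k)
for row in result:
    print(row)
```

Each output value is the sum of the top row of the patch minus the sum of its middle row, so the output is zero in flat regions and non-zero only where two adjacent rows differ. In the toy image that happens in the middle output row, where the response is -3.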
And if we transpose our kernel, it looks like it will detect edges on the left side of an object.
k.t()
left = F.conv2d(five.view(1,1,28,28), k.t()[None, None])
plt.imshow(left.squeeze())
So how do this basic
F.adaptive_avg_pool2d
Applies a 2D adaptive average pooling over an input signal composed of several input planes.
avg_pool_1d = F.adaptive_avg_pool2d(feature_map, 1); avg_pool_1d.shape
feature_map.squeeze().view(-1).mean()
avg_pool_1d.squeeze()
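With an output size of 1, adaptive average pooling reduces each channel to its mean, which is why the two values above agree when the feature map has a single channel. A minimal sketch of that reduction in plain Python follows; `global_avg_pool` and `toy_feature_map` are hypothetical names, and the data is invented for illustration.

```python
def global_avg_pool(feature_map):
    """feature_map: list of channels, each a list of rows. Returns one mean per channel."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled

# Hypothetical 2-channel, 2x2 feature map
toy_feature_map = [
    [[1., 2.],
     [3., 4.]],
    [[0., 0.],
     [0., 8.]],
]

print(global_avg_pool(toy_feature_map))  # [2.5, 2.0]
```

Note that the mean is taken per channel. With more than one channel, flattening everything and taking a single mean would no longer match the pooled output.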
F.adaptive_max_pool2d
Applies a 2D adaptive max pooling over an input signal composed of several input planes.
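Adaptive max pooling splits each plane into regions so that the output has the spatial size we ask for, and returns the maximum of each region. With an output size of 1 it returns the single largest value per channel.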
max_pool = F.adaptive_max_pool2d(feature_map, 1); max_pool.shape
feature_map.squeeze().view(-1).max()
max_pool.squeeze()
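For output sizes larger than 1, the input plane is split into a grid of regions and the maximum of each region is kept. The sketch below works on a single plane in plain Python. The region boundaries (floor of `i * in / out` for the start, ceiling of `(i + 1) * in / out` for the end) are my understanding of how PyTorch picks them, so treat that rule as an assumption. The function name and the 4 by 4 `plane` are made up for illustration.

```python
def adaptive_max_pool_plane(plane, out_size):
    """Max-pool one 2D plane (list of rows) down to out_size x out_size."""
    in_h, in_w = len(plane), len(plane[0])

    def bounds(i, in_size):
        # Assumed region rule: [floor(i*in/out), ceil((i+1)*in/out))
        start = (i * in_size) // out_size
        end = ((i + 1) * in_size + out_size - 1) // out_size
        return start, end

    out = []
    for i in range(out_size):
        r0, r1 = bounds(i, in_h)
        row = []
        for j in range(out_size):
            c0, c1 = bounds(j, in_w)
            row.append(max(plane[r][c] for r in range(r0, r1) for c in range(c0, c1)))
        out.append(row)
    return out

# Hypothetical 4x4 plane
plane = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

print(adaptive_max_pool_plane(plane, 2))  # [[6, 8], [14, 16]]
print(adaptive_max_pool_plane(plane, 1))  # [[16]]
```

With an output size of 1 there is a single region covering the whole plane, which is the case used in the cells above.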
Let's look at the activations of a pretrained model.
from fastai.vision import *
model = models.resnet34(pretrained=True)
To pass an image through the model we need to normalize it, turn it into a mini-batch, and put it on the GPU.
Resources