CSE 559A: Computer Vision

Use Left/Right PgUp/PgDown to navigate slides

Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).
Course Staff: Zhihao Xia, Charlie Wu, Han Liu

Sep 4, 2018

Out of 79 people enrolled and waitlisted:

• 64 (-15) have submitted their public key.
• 59 (-5) have cloned the PSET 0 repo.
• 47 (-12) have finished and submitted problem set 0.

• Submit public key and make sure you can clone the repo ASAP.
• Note that in the command: git clone cse559@euclid.seas.wustl.edu:wustl.key/psetN
• Replace psetN with pset0, pset1, etc.
• Finish and submit pset 0 by end of Thursday.

• A couple of people haven't completed the sign-up form. Do that TODAY.

• COMPLETE THE INFORMATION SECTION
• Do this for every problem set
• Tell us how much time the problem set took
• Let us know who you had discussions with, what external resources you used.
• If you used none, say so. Don't leave it blank or at the default template.
• Read the collaboration and late policies
• Replace the placeholders for your name / WUSTL key in the tex file header.
• Do a "git pull; git log" after every push to read the confirmation commit message.

PSET 1 is out early

• Available to clone. Go ahead and take a look.
• We haven't covered all the material yet.
• By the end of today's class, we will have covered everything you need for the first 3-4 problems.
• Not a bad idea to use the extra time to get started early.
• Lecture slides are posted to the course website after each class
• First two lectures already up

# Convention

RECAP

• An image $$X$$ is an array* of intensities.
• $$X[n]$$ or $$X[n_x,n_y]$$ refers to intensities for a particular pixel at location $$n$$ or $$[n_x,n_y]$$.
• A single index denotes a vector of two integers: $$n = [n_x,n_y]^T$$.
• Each $$X[n]$$ is a scalar for a grayscale image, or a 3-vector for an RGB color image.
(Unless otherwise specified, vector implies column vector)

*Clarification: numpy convention is H x W x C: (vertical, horizontal, channels) or H x W.

Do not think of single-channel images themselves as matrices!
It makes no sense to "matrix multiply" an 80x60 pixel image with a 60x20 pixel image.
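A minimal sketch of the indexing convention (the shapes and values here are illustrative, not from a real image):

```python
import numpy as np

# numpy convention: (H, W) = (vertical, horizontal), channels last.
X = np.zeros((60, 80))          # grayscale: H x W
Xc = np.zeros((60, 80, 3))      # color: H x W x C

# Indexing a single pixel location:
print(X[10, 20].shape)   # () -- a scalar intensity
print(Xc[10, 20].shape)  # (3,) -- a 3-vector of RGB intensities
```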

# Convention: Linear Operations

• But sometimes, we want to interpret operations as linear on all intensities / intensity vectors in an image.
• Stack all pixel locations, in some pre-determined order, as rows. Represent $$X$$ as:
• $$(HW) \times 3$$ matrix: color images
• $$(HW) \times 1$$ vector: grayscale images.

$Y[n] = C~X[n] \Rightarrow Y = ~\color{red}{\mathbf ?}~~~~~$

# Convention: Linear Operations

• But sometimes, we want to interpret operations as linear on all intensities / intensity vectors in an image.
• Stack all pixel locations, in some pre-determined order, as rows. Represent $$X$$ as:
• $$(HW) \times 3$$ matrix: color images
• $$(HW) \times 1$$ vector: grayscale images.

$Y[n] = C~X[n] \Rightarrow Y = X~C^T$

# Begin with X as (H,W,3) array
Xflt = np.reshape(X,(-1,3))    # Flatten X to a (H*W, 3) matrix
Yflt = np.matmul(Xflt,C.T)     # Post-multiply by C
Y = np.reshape(Yflt,X.shape)   # Turn Y back to an image array

# Convolution

$\text{Notation: } Y = X * k$

$Y[n] = \sum_{n'} k[n']~~X[n-n']$

$Y[n_x,n_y] = \sum_{n'_x}~~\sum_{n'_y}~~k\big[n'_x,~n'_y\big]~~X\big[(n_x-n'_x),~(n_y-n'_y)\big]$

• Double summation over the support / size of the kernel $$k$$

• We assume $$k[n] \in \mathbb{R}$$ is scalar valued.
• If $$X[n]$$ is scalar, so is $$Y[n]$$.
• If $$X$$ is a color image, each channel is convolved with $$k$$ independently.

To go from $$m$$ to $$n$$ channels in a "conv layer": each $$k[n']\in\mathbb{R}^{n\times m}$$ is matrix valued, and $$k[n']~X[n-n']$$ is a matrix-vector product.
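The summation above, with the kernel index subtracted (i.e., a flipped kernel), is exactly what `scipy.signal.convolve2d` computes for 2-D arrays. A sketch of scalar-kernel convolution applied channel by channel to a color image (the helper name `conv_color` is ours, not a library function):

```python
import numpy as np
from scipy.signal import convolve2d

def conv_color(X, k, mode='same'):
    """Convolve each channel of an H x W x C image with a scalar kernel k."""
    return np.stack([convolve2d(X[:, :, c], k, mode=mode)
                     for c in range(X.shape[2])], axis=-1)

X = np.random.rand(6, 8, 3)
k = np.ones((3, 3)) / 9.0        # 3x3 box (averaging) kernel
Y = conv_color(X, k)
print(Y.shape)  # (6, 8, 3)
```

With the symmetric box kernel, each interior output pixel is just the mean of its 3x3 neighborhood in that channel.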

# Convolution: Properties

Let $$X *_{\tiny \text{full}} k$$, $$X *_{\tiny \text{val}} k$$, and $$X *_{\tiny \text{same}} k$$ denote full, valid, and same convolution (with zero padding for full and same)

• Linear / Distributive: For scalars $$\alpha, \beta$$;
• If $$Y = X * k$$, then: $$~~X * (\alpha k) = (\alpha X) * k = \alpha Y$$
• If $$Y_1 = X * k_1$$ and $$Y_2 = X * k_2$$, ($$k_1$$, $$k_2$$ same size): $$~~X * (\alpha k_1 + \beta k_2) = \alpha Y_1 + \beta Y_2$$
• If $$Y_1 = X_1 * k$$ and $$Y_2 = X_2 * k$$, ($$X_1$$, $$X_2$$ same size): $$~~(\alpha X_1 + \beta X_2) * k = \alpha Y_1 + \beta Y_2$$

$X * (\alpha k_1 + \beta k_2)~[n] = \sum_{n'} (\alpha k_1[n'] + \beta k_2[n']) X[n-n']$ $= \alpha \sum_{n'} k_1[n'] X[n-n'] + \beta \sum_{n'} k_2[n'] X[n-n']$
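A quick numerical sanity check of the distributive property, sketched with `scipy.signal.convolve2d` (any boundary mode works, since the identity holds pointwise):

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
X = rng.random((8, 8))
k1, k2 = rng.random((3, 3)), rng.random((3, 3))
a, b = 2.0, -0.5

# X * (a k1 + b k2) == a (X * k1) + b (X * k2)
lhs = convolve2d(X, a * k1 + b * k2, mode='valid')
rhs = a * convolve2d(X, k1, mode='valid') + b * convolve2d(X, k2, mode='valid')
print(np.allclose(lhs, rhs))  # True
```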

# Convolution: Properties

Let $$X *_{\tiny \text{full}} k$$, $$X *_{\tiny \text{val}} k$$, and $$X *_{\tiny \text{same}} k$$ denote full, valid, and same convolution (with zero padding for full and same)

• Linear / Distributive: For scalars $$\alpha, \beta$$;
• If $$Y = X * k$$, then: $$~~X * (\alpha k) = (\alpha X) * k = \alpha Y$$
• If $$Y_1 = X * k_1$$ and $$Y_2 = X * k_2$$, ($$k_1$$, $$k_2$$ same size): $$~~X * (\alpha k_1 + \beta k_2) = \alpha Y_1 + \beta Y_2$$
• If $$Y_1 = X_1 * k$$ and $$Y_2 = X_2 * k$$, ($$X_1$$, $$X_2$$ same size): $$~~(\alpha X_1 + \beta X_2) * k = \alpha Y_1 + \beta Y_2$$
• Associative
• $$\big(X *_{\tiny \text{full}} k_1\big) *_{\tiny \text{full}} k_2 = X *_{\tiny \text{full}} \big(k_1 *_{\tiny \text{full}} k_2\big)$$
• $$\big(X *_{\tiny \text{val}} k_1\big) *_{\tiny \text{val}} k_2 = X *_{\tiny \text{val}} \big(k_1 *_{\tiny \text{full}} k_2\big)$$
• $$\big(X *_{\tiny \text{same}} k_1\big) *_{\tiny \text{same}} k_2 \color{red}{\neq} X *_{\tiny \text{same}} \big(k_1 *_{\tiny \text{full}} k_2\big)$$

$\big(X *_{\tiny \text{full}} k_1\big) *_{\tiny \text{full}} k_2~~[n] = \sum_{n'} k_2[n'] (X*k_1)[n-n'] = \sum_{n'} k_2[n'] \sum_{n''} k_1[n''] X[n-n'-n'']$

$= \sum_{m} \left(\sum_{n'}k_2[n']~k_1[m-n']\right) X[n-m]~~~~(\text{substituting } m = n'+n'')$
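The two associativity identities can be checked numerically; a sketch using `scipy.signal.convolve2d`, whose `full` and `valid` modes match the definitions above:

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(1)
X = rng.random((10, 10))
k1, k2 = rng.random((3, 3)), rng.random((5, 5))

# (X *_full k1) *_full k2  ==  X *_full (k1 *_full k2)
full = convolve2d(convolve2d(X, k1, mode='full'), k2, mode='full')
assert np.allclose(full, convolve2d(X, convolve2d(k1, k2, mode='full'), mode='full'))

# (X *_val k1) *_val k2  ==  X *_val (k1 *_full k2)
val = convolve2d(convolve2d(X, k1, mode='valid'), k2, mode='valid')
assert np.allclose(val, convolve2d(X, convolve2d(k1, k2, mode='full'), mode='valid'))
```

Note the output sizes: two full convolutions grow the 10x10 image to 16x16, while two valid convolutions shrink it to 4x4.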

# Convolution: Properties

Let $$X *_{\tiny \text{full}} k$$, $$X *_{\tiny \text{val}} k$$, and $$X *_{\tiny \text{same}} k$$ denote full, valid, and same convolution (with zero padding for full and same)

• Linear / Distributive: For scalars $$\alpha, \beta$$;
• If $$Y = X * k$$, then: $$~~X * (\alpha k) = (\alpha X) * k = \alpha Y$$
• If $$Y_1 = X * k_1$$ and $$Y_2 = X * k_2$$, ($$k_1$$, $$k_2$$ same size): $$~~X * (\alpha k_1 + \beta k_2) = \alpha Y_1 + \beta Y_2$$
• If $$Y_1 = X_1 * k$$ and $$Y_2 = X_2 * k$$, ($$X_1$$, $$X_2$$ same size): $$~~(\alpha X_1 + \beta X_2) * k = \alpha Y_1 + \beta Y_2$$
• Associative
• $$\big(X *_{\tiny \text{full}} k_1\big) *_{\tiny \text{full}} k_2 = X *_{\tiny \text{full}} \big(k_1 *_{\tiny \text{full}} k_2\big)$$
• $$\big(X *_{\tiny \text{val}} k_1\big) *_{\tiny \text{val}} k_2 = X *_{\tiny \text{val}} \big(k_1 *_{\tiny \text{full}} k_2\big)$$
• $$\big(X *_{\tiny \text{same}} k_1\big) *_{\tiny \text{same}} k_2 \color{red}{\neq} X *_{\tiny \text{same}} \big(k_1 *_{\tiny \text{full}} k_2\big)$$
• Commutative: $$~~~k_1 *_{\tiny \text{full}} k_2 = k_2 *_{\tiny \text{full}} k_1$$
• $$\big(X *_{\tiny \text{full}} k_1\big) *_{\tiny \text{full}} k_2 = \big(X *_{\tiny \text{full}} k_2\big) *_{\tiny \text{full}} k_1$$
• $$\big(X *_{\tiny \text{val}} k_1\big) *_{\tiny \text{val}} k_2 = \big(X *_{\tiny \text{val}} k_2\big) *_{\tiny \text{val}} k_1$$
• $$\big(X *_{\tiny \text{same}} k_1\big) *_{\tiny \text{same}} k_2 \color{red}{\neq} \big(X *_{\tiny \text{same}} k_2\big) *_{\tiny \text{same}} k_1$$
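The failure for same-mode convolution is a boundary effect: chaining two zero-padded same convolutions is not the same as one same convolution with the combined kernel, though the two agree away from the border. A sketch (random positive data, so the boundary difference is visible):

```python
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(2)
X = rng.random((8, 8))
k1, k2 = rng.random((3, 3)), rng.random((3, 3))

same2 = convolve2d(convolve2d(X, k1, mode='same'), k2, mode='same')
direct = convolve2d(X, convolve2d(k1, k2, mode='full'), mode='same')

print(np.allclose(same2, direct))                          # False: borders differ
print(np.allclose(same2[2:-2, 2:-2], direct[2:-2, 2:-2]))  # True: interior agrees
```

The interior agrees because pixels far enough from the border never touch the zero padding in either computation; only the boundary rows/columns see the two different padding schemes.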