CSE 559A: Computer Vision


Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).

Course Staff: Zhihao Xia, Charlie Wu, Han Liu

Sep 4, 2018

Out of 79 people enrolled and waitlisted:

- 64 (-15) have submitted their public key.

- 59 (-5) have cloned the PSET 0 repo.

- 47 (-12) have finished and submitted problem set 0.

- Submit your public key and make sure you can clone the repo ASAP.
- Note that in the command `git clone cse559@euclid.seas.wustl.edu:wustl.key/psetN`:
  - Replace wustl.key with your WUSTL key username.
  - Replace psetN with pset0, pset1, etc.
- Finish and submit pset 0 by the end of Thursday.

- A couple of people haven't completed the sign-up form. Do that TODAY.

**Comments about submitted problem sets**

- COMPLETE THE INFORMATION SECTION
  - Do this for every problem set.
  - Tell us how much time the problem set took.
  - Let us know who you had discussions with and what external resources you used.
  - If you used none, say so; don't leave it blank or at the default template.

**Read the collaboration and late policies**

- Replace the placeholders for your name / WUSTL key in the tex file header.

- Do a "
`git pull; git log`" after every push to read confirmation commit message.

**PSET 1 is out early**

- Available to clone. Go ahead and take a look.

- We haven't covered all the material yet.
- By the end of today's class, we will have covered everything you need for the first 3-4 problems.

- Not a bad idea to use the extra time to get started early.

- Lecture slides are being posted after class to the course website
- First two lectures already up

**RECAP**

- An image \(X\) is an **array** of intensities.

- \(X[n]\) or \(X[n_x,n_y]\) refers to intensities for a particular pixel at location \(n\) or \([n_x,n_y]\).
- A single index means \(n = [n_x,n_y]^T\), a vector of two integers.

- Each \(X[n]\) is a scalar for a grayscale image, or a 3-vector for an RGB color image.

(Unless otherwise specified, vector implies column vector)

*Clarification: the numpy convention is H x W x C (vertical, horizontal, channels), or just H x W for grayscale.*

**Do not think of single-channel images themselves as matrices!**

It makes no sense to "matrix multiply" an 80x60 pixel image with a 60x20 pixel image.


- But sometimes, we want to interpret operations as linear on all intensities / intensity vectors in an image.
- Stack all pixel locations, in some pre-determined order, as rows. Represent \(X\) as:
  - a \((HW) \times 3\) matrix for color images, or
  - a \((HW) \times 1\) vector for grayscale images.

\[Y[n] = C~X[n] \Rightarrow Y = X~C^T\]

```
import numpy as np

# Begin with X as an (H, W, 3) array
Xflt = np.reshape(X, (-1, 3))   # Flatten X to an (H*W, 3) matrix
Yflt = np.matmul(Xflt, C.T)     # Apply C to each pixel: post-multiply by C^T
Y = np.reshape(Yflt, X.shape)   # Reshape Y back to an image array
```
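
For instance (a minimal usage sketch; this particular \(C\) is hypothetical and just swaps the R and B channels of a random test image):

```
import numpy as np

X = np.random.rand(4, 6, 3)            # random (H, W, 3) test image
C = np.array([[0., 0., 1.],
              [0., 1., 0.],
              [1., 0., 0.]])           # Y[n] = C X[n] swaps R and B

Xflt = np.reshape(X, (-1, 3))
Y = np.reshape(np.matmul(Xflt, C.T), X.shape)
assert np.allclose(Y, X[..., ::-1])    # same as reversing the channel axis
```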

\[\text{Notation: } Y = X * k \]

\[Y[n] = \sum_{n'} k[n']~~X[n-n']\]

\[Y[n_x,n_y] = \sum_{n'_x}~~\sum_{n'_y}~~k\big[n'_x,~n'_y\big]~~X\big[(n_x-n'_x),~(n_y-n'_y)\big]\]

Double summation over the support / size of the kernel \(k\)

- We assume \(k[n] \in \mathbb{R}\) is scalar valued.
- If \(X[n]\) is scalar, so is \(Y[n]\).
- If \(X\) is a color image, each channel is convolved with \(k\) independently.
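
As a concrete reference, here is a minimal numpy sketch of the double summation for a scalar-valued image and kernel, producing the "valid" output size (the function name is ours; `scipy.signal.convolve2d(X, k, mode='valid')` computes the same thing):

```
import numpy as np

def conv2d_valid(X, k):
    # Y[n] = sum_{n'} k[n'] X[n - n'], for every n where the sum
    # stays inside X ("valid" size). Convolution correlates with
    # the *flipped* kernel.
    H, W = X.shape
    h, w = k.shape
    kf = k[::-1, ::-1]
    Y = np.zeros((H - h + 1, W - w + 1))
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = np.sum(kf * X[i:i + h, j:j + w])
    return Y
```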

To go from \(m\) to \(n\) channels in a "conv layer": \(k[n']\in\mathbb{R}^{n\times m}\) is matrix valued, and \(k[n']~X[n-n']\) is a matrix-vector product.
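
A sketch of this matrix-valued case (the \((h, w, n, m)\) kernel layout here is our own choice; note also that most deep-learning "conv" layers actually implement cross-correlation, i.e., without the kernel flip):

```
import numpy as np

def conv_layer_valid(X, K):
    # X: (H, W, m) image with m channels.
    # K: (h, w, n, m) kernel; K[a, b] is the (n x m) matrix k[n'].
    H, W, m = X.shape
    h, w, n, _ = K.shape
    Y = np.zeros((H - h + 1, W - w + 1, n))
    for a in range(h):
        for b in range(w):
            # Each kernel tap contributes K[a,b] @ X[i+h-1-a, j+w-1-b]
            # at every output pixel (i, j):
            patch = X[h - 1 - a:H - a, w - 1 - b:W - b, :]
            Y += patch @ K[a, b].T
    return Y
```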

Let \(X *_{\tiny \text{full}} k\), \(X *_{\tiny \text{val}} k\), and \(X *_{\tiny \text{same}} k\) denote full, valid, and same convolution (with zero padding for full and same)

**Linear / Distributive**: For scalars \(\alpha, \beta\):
- If \(Y = X * k\), then \(X * (\alpha k) = (\alpha X) * k = \alpha Y\).
- If \(Y_1 = X * k_1\) and \(Y_2 = X * k_2\) (\(k_1\), \(k_2\) same size), then \(X * (\alpha k_1 + \beta k_2) = \alpha Y_1 + \beta Y_2\).
- If \(Y_1 = X_1 * k\) and \(Y_2 = X_2 * k\) (\(X_1\), \(X_2\) same size), then \((\alpha X_1 + \beta X_2) * k = \alpha Y_1 + \beta Y_2\).

\[X * (\alpha k_1 + \beta k_2)~[n] = \sum_{n'} (\alpha k_1[n'] + \beta k_2[n']) X[n-n']\] \[= \alpha \sum_{n'} k_1[n'] X[n-n'] + \beta \sum_{n'} k_2[n'] X[n-n'] = \alpha Y_1[n] + \beta Y_2[n]\]
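
A quick numeric check of this identity (sizes and scalars below are arbitrary; `scipy.signal.convolve2d` is scipy's 2D convolution):

```
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 6))
k1 = rng.standard_normal((3, 3))
k2 = rng.standard_normal((3, 3))
a, b = 2.0, -0.5

lhs = convolve2d(X, a * k1 + b * k2, mode='valid')
rhs = a * convolve2d(X, k1, mode='valid') + b * convolve2d(X, k2, mode='valid')
assert np.allclose(lhs, rhs)
```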


**Associative**:
- \(\big(X *_{\tiny \text{full}} k_1\big) *_{\tiny \text{full}} k_2 = X *_{\tiny \text{full}} \big(k_1 *_{\tiny \text{full}} k_2\big)\)
- \(\big(X *_{\tiny \text{val}} k_1\big) *_{\tiny \text{val}} k_2 = X *_{\tiny \text{val}} \big(k_1 *_{\tiny \text{full}} k_2\big)\)
- \(\big(X *_{\tiny \text{same}} k_1\big) *_{\tiny \text{same}} k_2 \color{red}{\neq} X *_{\tiny \text{same}} \big(k_1 *_{\tiny \text{full}} k_2\big)\)

\[\big(X *_{\tiny \text{full}} k_1\big) *_{\tiny \text{full}} k_2~~[n] = \sum_{n'} k_2[n'] (X*k_1)[n-n'] = \sum_{n'} k_2[n'] \sum_{n''} k_1[n''] X[n-n'-n'']\]

\[= \sum_{m} \left(\sum_{n'}k_2[n']~k_1[m-n']\right) X[n-m] = \Big(X *_{\tiny \text{full}} \big(k_2 *_{\tiny \text{full}} k_1\big)\Big)[n]\]

substituting \(m = n' + n''\); the inner sum is \((k_2 *_{\tiny \text{full}} k_1)[m]\), which equals \((k_1 *_{\tiny \text{full}} k_2)[m]\) by commutativity.
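
These claims are easy to verify numerically (sizes below are arbitrary; the "same"-mode inequality shows up at the image borders, while interiors agree):

```
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 8))
k1 = rng.standard_normal((3, 3))
k2 = rng.standard_normal((3, 3))
k12 = convolve2d(k1, k2, mode='full')

assert np.allclose(convolve2d(convolve2d(X, k1, 'full'), k2, 'full'),
                   convolve2d(X, k12, 'full'))
assert np.allclose(convolve2d(convolve2d(X, k1, 'valid'), k2, 'valid'),
                   convolve2d(X, k12, 'valid'))

# 'same' is not associative in general: borders differ.
lhs = convolve2d(convolve2d(X, k1, 'same'), k2, 'same')
rhs = convolve2d(X, k12, 'same')
print(np.allclose(lhs, rhs))   # typically False
```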


**Commutative**: \(k_1 *_{\tiny \text{full}} k_2 = k_2 *_{\tiny \text{full}} k_1\)
- \(\big(X *_{\tiny \text{full}} k_1\big) *_{\tiny \text{full}} k_2 = \big(X *_{\tiny \text{full}} k_2\big) *_{\tiny \text{full}} k_1\)
- \(\big(X *_{\tiny \text{val}} k_1\big) *_{\tiny \text{val}} k_2 = \big(X *_{\tiny \text{val}} k_2\big) *_{\tiny \text{val}} k_1\)
- \(\big(X *_{\tiny \text{same}} k_1\big) *_{\tiny \text{same}} k_2 \color{red}{\neq} \big(X *_{\tiny \text{same}} k_2\big) *_{\tiny \text{same}} k_1\)
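
And a one-line check of kernel commutativity under full convolution (kernel sizes arbitrary, including unequal sizes):

```
import numpy as np
from scipy.signal import convolve2d

rng = np.random.default_rng(0)
k1 = rng.standard_normal((3, 3))
k2 = rng.standard_normal((5, 5))
assert np.allclose(convolve2d(k1, k2, 'full'), convolve2d(k2, k1, 'full'))
```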