CSE 559A: Computer Vision

Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).

Course Staff: Zhihao Xia, Charlie Wu, Han Liu

Oct 18, 2018

- Project Proposals Deadline Extended to 11:59 PM Sunday Night.

- Questions about Using Deep Networks
- If the contribution of the paper you are considering is mainly the network architecture or how it was trained, you will want to make modifications and train again. This can take a huge amount of time (especially if you don't have access to a GPU).
- Papers where most of the contribution is what you do with the trained model / network output might be fine. There, if you take a pre-trained model, you might still be able to investigate the algorithm.

- Submitting a Project you are already working on. This is fine if:
- The project actually has to do with computer vision.
- You are doing it on your own.
- You are not already getting academic credit for it (in another course or an independent study).
- If submitting this to another course: explain what you are doing for this course and what for the other course, and send me an e-mail CC-ing the instructor of the other course.
- If doing this work with / advised by someone else: ask them to send me an e-mail confirming what part of the project is your own work.

- Deviations from Proposal
- The proposal lets us tell you that the topic is appropriate before you spend a lot of time working on a project.
- It's fine if you have to change some of the directions of the project as you are working on it.
- But let me know (private Piazza post), and get it approved.

- Problem Set 1 Grades Out
- Do a git pull on the original repo to retrieve feedback
- Sometimes, no points were deducted even though the solution has some flaws.
- In any case, even if you got full points for a question, look at the key!

- Problem Set 3 due next Thursday.
- Zhihao's Recitation this Friday

**IMPORTANT Location Change**: Friday Recitations and Office Hours will be in **Lopata 103** from now on.

This is only two-view stereo.

More complicated versions find correspondences across multiple cameras:

**Multi-view Stereo**


Let's try to solve it assuming \(u[x,y],v[x,y]\) are very small. (Very little movement).

Also assume "brightness constancy": \[I[x,y,t] = I[x+u[x,y],y+v[x,y],t+1]\]

Do a Taylor approximation to "linearize" the RHS.

\[I[x,y,t] = I[x+u,y+v,t+1] \approx I[x,y,t+1] + \frac{\partial}{\partial x} I[x,y,t] u + \frac{\partial}{\partial y} I[x,y,t] v\]

\[\frac{\partial}{\partial t}I[x,y,t] + \frac{\partial}{\partial x} I[x,y,t] u + \frac{\partial}{\partial y} I[x,y,t] v \approx 0\]

\[I_t + \langle [I_x,I_y],[u,v] \rangle = 0\]

**Lucas-Kanade Method**

\[I_t + \langle [I_x,I_y],[u,v] \rangle = 0\]

- \(I_t\) is an image-shaped array of time gradients (computed by subtracting: \(I[x,y,t+1] - I[x,y,t]\))
- \(I_x\) and \(I_y\) are x- and y-gradient images (e.g., computed with Sobel filters).
- Often, you want to compute the spatial gradients on the average frame \(\frac{I[x,y,t+1]+I[x,y,t]}{2}\)

This is an equation on \(u[x,y],v[x,y]\) at each pixel location.

But this is one equation in two unknowns: it doesn't constrain the component of the flow vector orthogonal to the image gradient.

**Lucas-Kanade Method**

Solution: Assume \(u,v\) is constant in a region, and get multiple equations.

So for \(u[x,y]=u,v[x,y]=v\), consider a bunch of \(x',y'\) in a window around \(x,y\).

\[ I_x[x',y']~u + I_y[x',y']~v = -I_t[x',y']\]

Multiple equations, two variables: solve in the least squares sense.

\[ u(x,y),v(x,y) = \arg \min_{u,v} \sum_{(x',y')\in\mathcal{N}(x,y)} (I_x[x',y']~u + I_y[x',y']~v + I_t[x',y'])^2\]

\[\left[\begin{array}{cc}\sum I_x^2&\sum I_x I_y\\\sum I_xI_y& \sum I_y^2\end{array}\right] \left[\begin{array}{c}u\\v\end{array}\right] = -\left[\begin{array}{c}\sum I_xI_t\\\sum I_yI_t\end{array}\right]\]

**Lucas-Kanade Method**

\[\left[\begin{array}{cc}\sum I_x^2&\sum I_x I_y\\\sum I_xI_y& \sum I_y^2\end{array}\right] \left[\begin{array}{c}u\\v\end{array}\right] = -\left[\begin{array}{c}\sum I_xI_t\\\sum I_yI_t\end{array}\right]\]

Summations are in a window around \(x,y\).

How would you do this without looping over pixels?

- Compute \(I_x^2,I_xI_y,I_y^2,I_tI_x,\ldots\) point-wise.
- Use convolutions to do local summation.
- Form each element of left matrix and right vector as a separate image.
- Invert by pointwise operations on these images.
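The steps above can be sketched in NumPy/SciPy. This is a minimal sketch: the window size, the central-difference gradients (the slides suggest Sobel filters), and the small diagonal value `eps` are illustrative choices, not prescribed by the slides.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lucas_kanade(I1, I2, win=7, eps=1e-4):
    """Dense single-level Lucas-Kanade flow, fully vectorized.
    I1, I2: float grayscale frames at times t and t+1, same shape."""
    # Spatial gradients on the average frame; temporal gradient by differencing.
    Iy, Ix = np.gradient(0.5 * (I1 + I2))
    It = I2 - I1

    # Window sums of the point-wise products: each entry of the 2x2 matrix
    # and the right-hand side is formed as a separate image via box filtering.
    Sxx = uniform_filter(Ix * Ix, win)
    Sxy = uniform_filter(Ix * Iy, win)
    Syy = uniform_filter(Iy * Iy, win)
    Sxt = uniform_filter(Ix * It, win)
    Syt = uniform_filter(Iy * It, win)

    # Invert the 2x2 system at every pixel with point-wise operations
    # (eps added to the diagonal for stability in smooth regions).
    det = (Sxx + eps) * (Syy + eps) - Sxy * Sxy
    u = (-(Syy + eps) * Sxt + Sxy * Syt) / det
    v = (Sxy * Sxt - (Sxx + eps) * Syt) / det
    return u, v
```

On a synthetic pair where the second frame is a small sub-pixel translation of the first, the recovered flow in the interior should match the translation.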

**Lucas-Kanade Method**

Summations are in a window around \(x,y\).

When will you get good answers and when will you get bad answers?

In other words, when is the matrix invertible?

- Matrix will be all zero in a smooth region.

- Matrix will be rank 1 if all gradients in one direction.

- Good when you have general texture.

For stability, add a small value to the diagonal elements of the matrix.

**Lucas-Kanade Method**

\[\left[\begin{array}{cc}\sum I_x^2+\epsilon&\sum I_x I_y\\\sum I_xI_y& \sum I_y^2+\epsilon\end{array}\right] \left[\begin{array}{c}u\\v\end{array}\right] = -\left[\begin{array}{c}\sum I_xI_t\\\sum I_yI_t\end{array}\right]\]

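These cases (all-zero matrix in a smooth region, rank 1 when all gradients share one direction, full rank for general texture) are easy to check numerically by looking at the eigenvalues of the summed gradient matrix. A small sketch; the patch contents are made up for illustration:

```python
import numpy as np

def gradient_matrix_eigvals(patch):
    # Eigenvalues of [[sum Ix^2, sum IxIy], [sum IxIy, sum Iy^2]],
    # with the sums taken over the whole patch.
    gy, gx = np.gradient(patch.astype(float))
    M = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                  [np.sum(gx * gy), np.sum(gy * gy)]])
    return np.linalg.eigvalsh(M)  # ascending order

n = 16
X, Y = np.meshgrid(np.arange(n), np.arange(n))
flat = np.ones((n, n))                      # smooth region: matrix is all zero
edge = (X > n // 2).astype(float)           # gradients in one direction: rank 1
texture = np.random.default_rng(0).standard_normal((n, n))  # general texture
```

The flat patch yields two zero eigenvalues, the edge patch one zero and one positive eigenvalue (not invertible), and the textured patch two positive eigenvalues (invertible).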

**Lucas-Kanade Method**

Pyramid / Hierarchical Variant to handle large displacements

- First downsample images, solve it at a coarser scale.
- Then, upsample the flow-field, and compute displacement from that flow field.

So basically: warp image \(I_{t+1}\) based on the flow field from the coarser level, then find the differential motion beyond that.
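One possible sketch of this coarse-to-fine scheme, under some assumed choices (pyramid depth, a bilinear warp, and a compact single-level LK solver; a real implementation would anti-alias before downsampling):

```python
import numpy as np
from scipy.ndimage import uniform_filter, zoom, map_coordinates

def lk_level(I1, I2, win=7, eps=1e-4):
    # Single-level Lucas-Kanade: vectorized 2x2 normal equations per pixel.
    Iy, Ix = np.gradient(0.5 * (I1 + I2))
    It = I2 - I1
    S = lambda a: uniform_filter(a, win)   # window sums (up to a constant)
    A, B, C = S(Ix * Ix) + eps, S(Ix * Iy), S(Iy * Iy) + eps
    P, Q = S(Ix * It), S(Iy * It)
    det = A * C - B * B
    return (-C * P + B * Q) / det, (B * P - A * Q) / det

def warp(I, u, v):
    # Bilinear sampling of I at (x + u, y + v).
    Y, X = np.mgrid[0:I.shape[0], 0:I.shape[1]].astype(float)
    return map_coordinates(I, [Y + v, X + u], order=1, mode='nearest')

def lk_pyramid(I1, I2, levels=3, win=7):
    # Build image pyramids (coarsest last).
    pyr = [(I1, I2)]
    for _ in range(levels - 1):
        pyr.append((zoom(pyr[-1][0], 0.5), zoom(pyr[-1][1], 0.5)))
    u = np.zeros_like(pyr[-1][0]); v = np.zeros_like(pyr[-1][0])
    for I1l, I2l in reversed(pyr):
        if u.shape != I1l.shape:
            # Upsample the flow field; vectors double in magnitude at the finer scale.
            u = 2 * zoom(u, np.array(I1l.shape) / u.shape, order=1)
            v = 2 * zoom(v, np.array(I1l.shape) / v.shape, order=1)
        # Warp I_{t+1} by the current flow, then estimate the residual motion.
        du, dv = lk_level(I1l, warp(I2l, u, v), win)
        u, v = u + du, v + dv
    return u, v
```

A shift of several pixels, which breaks the small-motion assumption at a single level, becomes a sub-pixel shift at the coarsest level and can then be refined on the way back up.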

**Horn-Schunck Method**

- Lucas-Kanade assumes the flow field is constant in a window.
- But what if a slanted plane is moving? The flow then changes continuously across the image.

- Better assumption: the flow varies smoothly (its second derivative is 0).

Regularized estimation for some \(\alpha > 0\):

\[ \{u(x,y),v(x,y)\} = \arg \min~\sum_{x,y}~~(I_x[x,y]~u(x,y) + I_y[x,y]~v(x,y) + I_t[x,y])^2 + \alpha (\Delta u(x,y))^2 + \alpha (\Delta v(x,y))^2\]

- Here, \(\Delta u(x,y) = u(x,y) - \bar{u}(x,y)\), where \(\bar{u} = G*u\) is some spatially smoothed version of \(u\).
- \(\Delta\) is called the Laplace operator (remember, the Laplacian pyramid)

- Now all pixels depend on all pixels. General optimization.

**Horn-Schunck Method**

\[ \{u(x,y),v(x,y)\} = \arg \min~\sum_{x,y}~~(I_x[x,y]~u(x,y) + I_y[x,y]~v(x,y) + I_t[x,y])^2 + \alpha (\Delta u(x,y))^2 + \alpha (\Delta v(x,y))^2\]

Iterative Solution

- Initialize flow map to some \(u^0,v^0\).
- At each iteration \(k+1\), solve assuming \(\bar{u},\bar{v}\) from previous iteration:

\[ \{u^{k+1}(x,y),v^{k+1}(x,y)\} = \arg \min~\sum_{x,y}~~(I_x[x,y]~u(x,y) + I_y[x,y]~v(x,y) + I_t[x,y])^2 \] \[ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~+ \alpha (u(x,y)-\bar{u}^k(x,y))^2 + \alpha (v(x,y)-\bar{v}^k(x,y))^2\]

- Can be done for each pixel independently.

- Global smoothness cost means that flow in large smooth regions will be propagated from outside.
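A minimal sketch of this iteration. The averaging kernel \(G\), the value of \(\alpha\), and the iteration count are illustrative choices; each update is the closed-form per-pixel minimizer of the objective above given \(\bar{u}^k, \bar{v}^k\).

```python
import numpy as np
from scipy.ndimage import convolve

def horn_schunck(I1, I2, alpha=0.5, iters=500):
    """Horn-Schunck flow via the fixed-point iteration: smooth the
    previous flow, then solve each pixel's quadratic in closed form."""
    Iy, Ix = np.gradient(0.5 * (I1 + I2))
    It = I2 - I1
    u = np.zeros_like(I1); v = np.zeros_like(I1)
    # Neighbor-averaging kernel G used to form u_bar = G * u (weights sum to 1).
    G = np.array([[1/12, 1/6, 1/12],
                  [1/6,  0.0, 1/6],
                  [1/12, 1/6, 1/12]])
    for _ in range(iters):
        ub = convolve(u, G); vb = convolve(v, G)
        # Setting the per-pixel gradient of the objective to zero gives:
        # u = ub - Ix*(Ix*ub + Iy*vb + It)/(alpha + Ix^2 + Iy^2), same for v.
        P = (Ix * ub + Iy * vb + It) / (alpha + Ix**2 + Iy**2)
        u = ub - Ix * P
        v = vb - Iy * P
    return u, v
```

Because the data term only pulls the flow along the image gradient while smoothing propagates it everywhere, many iterations are needed before flat regions fill in.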

**State-of-the-Art Methods**

- Use energy minimization, complex features, contours, \(\ldots\)

Jerome Revaud, Philippe Weinzaepfel, Zaid Harchaoui and Cordelia Schmid

**EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow**

CVPR 2015.

See http://sintel.is.tue.mpg.de/results for a leaderboard of the best results.