CSE 559A: Computer Vision


Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).
Course Staff: Zhihao Xia, Charlie Wu, Han Liu

Oct 18, 2018

# General

• Project Proposals Deadline Extended to 11:59 PM Sunday Night.
• Questions about Using Deep Networks
• If the contribution of the paper you are considering is mainly the network architecture or how it was trained, you will want to make modifications and train again. This can take a huge amount of time (especially if you don't have access to a GPU).
• Some papers might be fine where most of the contribution is what you do with the trained model / network output. There, if you take a pre-trained model, you might still be able to investigate the algorithm.
• Submitting a Project you are already working on. This is fine if:
• The project actually has to do with computer vision.
• You are doing it on your own.
• You are not getting academic credit for it (another course, independent study)
• If submitting this to another course: explain what you are doing for this course and what for the other course, and send me an e-mail CC-ing the instructor of the other course.
• If doing this work with / advised by someone else: ask them to send me an e-mail confirming what part of the project is your own work.

# General

• Deviations from Proposal
• The proposal lets us tell you that the topic is appropriate before you spend a lot of time working on a project.
• It's fine if you have to change some of the directions of the project as you are working on it.
• But let me know (private Piazza post), and get it approved.
• Problem Set 1 Grades Out
• Do a git pull on the original repo to retrieve feedback
• Sometimes, no points deducted although solution has some flaws
• In any case, even if you got full points for a question, look at the key!
• Problem Set 3 due next Thursday.
• Zhihao's Recitation this Friday

IMPORTANT Location Change: Friday Recitations and Office Hours will be in Lopata 103 from now on.

# Stereo Roundup

• This is only two-view stereo.

• More complicated versions include finding correspondences across multiple cameras: Multi-view Stereo

# Optical Flow

Let's try to solve it assuming $$u[x,y],v[x,y]$$ are very small. (Very little movement).

Also assume "brightness constancy": $I[x,y,t] = I[x+u[x,y],y+v[x,y],t+1]$

Do a Taylor approximation to "linearize" the RHS.

$I[x,y,t] = I[x+u,y+v,t+1] \approx I[x,y,t+1] + \frac{\partial}{\partial x} I[x,y,t] u + \frac{\partial}{\partial y} I[x,y,t] v$

$\frac{\partial}{\partial t}I[x,y,t] + \frac{\partial}{\partial x} I[x,y,t] u + \frac{\partial}{\partial y} I[x,y,t] v \approx 0$

$I_t + \langle [I_x,I_y],[u,v] \rangle = 0$

# Optical Flow

$I_t + \langle [I_x,I_y],[u,v] \rangle = 0$

• $$I_t$$ is an image-shaped array of time gradients (computed as $$I[x,y,t+1] - I[x,y,t]$$).
• $$I_x$$ and $$I_y$$ are x- and y-gradient images (e.g., use Sobel filters).
• Often, you want to compute these spatial gradients on the average of the two frames, $$\frac{I[x,y,t+1]+I[x,y,t]}{2}$$.
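
A minimal sketch of how these three gradient images could be computed, assuming NumPy/SciPy and grayscale frames stored as 2-D arrays indexed `[y, x]`. The function name `flow_gradients` and the division by 8 (which normalizes SciPy's Sobel output to intensity change per pixel) are choices made for this illustration, not something the slides prescribe.

```python
import numpy as np
from scipy import ndimage

def flow_gradients(frame0, frame1):
    """Return (Ix, Iy, It) for two grayscale frames of equal shape."""
    f0 = np.asarray(frame0, dtype=np.float64)
    f1 = np.asarray(frame1, dtype=np.float64)
    It = f1 - f0                    # time gradient: I[x,y,t+1] - I[x,y,t]
    avg = 0.5 * (f0 + f1)           # spatial gradients on the average frame
    # ndimage.sobel is unnormalized (a unit ramp gives 8), hence the / 8.
    Ix = ndimage.sobel(avg, axis=1, mode="nearest") / 8.0   # x is the column index
    Iy = ndimage.sobel(avg, axis=0, mode="nearest") / 8.0   # y is the row index
    return Ix, Iy, It
```

For a horizontal ramp shifted right by one pixel, this gives $$I_x = 1$$, $$I_y = 0$$, $$I_t = -1$$ away from the border, which satisfies the constraint with $$u = 1$$.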

This is an equation on $$u[x,y],v[x,y]$$ at each pixel location.

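But this is one equation for two variables. It only constrains the flow component along the gradient $$[I_x,I_y]$$, and doesn't tell us about the flow vector in the direction perpendicular to it.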
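
To make the ambiguity concrete (a standard consequence of the equation above, with $$u_n$$ and $$u_p$$ introduced here just for illustration): wherever the gradient is non-zero, split the flow into a part along the gradient and a part perpendicular to it.

$[u,v] = u_n \frac{[I_x,I_y]}{\|[I_x,I_y]\|} + u_p \frac{[-I_y,I_x]}{\|[I_x,I_y]\|}$

Substituting into $$I_t + \langle [I_x,I_y],[u,v] \rangle = 0$$, the $$u_p$$ term drops out because $$\langle [I_x,I_y],[-I_y,I_x] \rangle = 0$$, leaving

$u_n = -\frac{I_t}{\|[I_x,I_y]\|}$

So $$u_n$$ is determined at each pixel, while $$u_p$$ can take any value.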

# Optical Flow

Solution: Assume $$u,v$$ is constant in a region, and get multiple equations.

So for $$u[x,y]=u,v[x,y]=v$$, consider a bunch of $$x',y'$$ in a window around $$x,y$$.

$I_x[x',y']~u + I_y[x',y']~v = -I_t[x',y']$

Multiple equations, two variables: solve in the least squares sense.

$u(x,y),v(x,y) = \arg \min_{u,v} \sum_{(x',y')\in\mathcal{N}(x,y)} (I_x[x',y']~u + I_y[x',y']~v + I_t[x',y'])^2$

$\left[\begin{array}{cc}\sum I_x^2&\sum I_x I_y\\\sum I_xI_y& \sum I_y^2\end{array}\right] \left[\begin{array}{c}u\\v\end{array}\right] = -\left[\begin{array}{c}\sum I_xI_t\\\sum I_yI_t\end{array}\right]$

# Optical Flow

$\left[\begin{array}{cc}\sum I_x^2&\sum I_x I_y\\\sum I_xI_y& \sum I_y^2\end{array}\right] \left[\begin{array}{c}u\\v\end{array}\right] = -\left[\begin{array}{c}\sum I_xI_t\\\sum I_yI_t\end{array}\right]$

Summations are in a window around $$x,y$$.

How would you do this without looping over pixels?

• Compute $$I_x^2,I_xI_y,I_y^2,I_tI_x,\ldots$$ point-wise.
• Use convolutions to do local summation.
• Form each element of left matrix and right vector as a separate image.
• Invert by pointwise operations on these images.
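
The four steps above can be sketched as follows, assuming NumPy/SciPy and precomputed gradient images. The name `lucas_kanade`, the box window of size 7, and the small `eps` added to the diagonal (to avoid dividing by zero where the matrix is singular) are assumptions of this sketch.

```python
import numpy as np
from scipy import ndimage

def lucas_kanade(Ix, Iy, It, window=7, eps=1e-6):
    """Windowed least-squares flow at every pixel, with no pixel loops."""
    def local_sum(img):
        # A box filter computes a local mean; scale it back up to a local sum.
        return ndimage.uniform_filter(img, size=window, mode="nearest") * window**2

    # Each matrix / vector element is its own image.
    a = local_sum(Ix * Ix) + eps      # top-left
    b = local_sum(Ix * Iy)            # off-diagonal
    d = local_sum(Iy * Iy) + eps      # bottom-right
    p = -local_sum(Ix * It)           # right-hand side, first row
    q = -local_sum(Iy * It)           # right-hand side, second row

    # Closed-form 2x2 inverse, applied pointwise.
    det = a * d - b * b
    u = (d * p - b * q) / det
    v = (a * q - b * p) / det
    return u, v
```

Usage would be `u, v = lucas_kanade(Ix, Iy, It)`, giving two flow images of the same shape as the inputs.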

# Optical Flow

$\left[\begin{array}{cc}\sum I_x^2&\sum I_x I_y\\\sum I_xI_y& \sum I_y^2\end{array}\right] \left[\begin{array}{c}u\\v\end{array}\right] = -\left[\begin{array}{c}\sum I_xI_t\\\sum I_yI_t\end{array}\right]$

Summations are in a window around $$x,y$$.

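When does this have a unique solution? In other words, when is the matrix invertible?

• Matrix will be all zero in a smooth region (no gradients).
• Matrix will be rank 1 if all gradients are in one direction (e.g., a straight edge).
• Good when you have general texture, with gradients in multiple directions.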

For stability, add a small value to the diagonal elements of the matrix.
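
A small numerical check of the three cases, assuming NumPy. The synthetic 7x7 gradient patches and the rank threshold are made up for this illustration.

```python
import numpy as np

def structure_matrix(Ix, Iy):
    """2x2 matrix of gradient products summed over one window."""
    return np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                     [np.sum(Ix * Iy), np.sum(Iy * Iy)]])

rng = np.random.default_rng(0)
zeros = np.zeros((7, 7))
strength = rng.standard_normal((7, 7))    # hypothetical gradient magnitudes
patches = {
    "smooth": (zeros, zeros),                      # no gradients at all
    "edge": (strength, 2.0 * strength),            # all gradients parallel to [1, 2]
    "texture": (rng.standard_normal((7, 7)),
                rng.standard_normal((7, 7))),      # gradients in many directions
}

ranks = {}
for name, (Ix, Iy) in patches.items():
    eigvals = np.linalg.eigvalsh(structure_matrix(Ix, Iy))   # ascending order
    # Count eigenvalues that are non-negligible relative to the largest one.
    tol = 1e-8 * max(eigvals.max(), 1.0)
    ranks[name] = int(np.sum(eigvals > tol))
    print(name, eigvals, "rank", ranks[name])
```

Only the texture patch gives two non-negligible eigenvalues, so only there is the system solvable without the added diagonal term.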

# Optical Flow

$\left[\begin{array}{cc}\sum I_x^2+\epsilon&\sum I_x I_y\\\sum I_xI_y& \sum I_y^2+\epsilon\end{array}\right] \left[\begin{array}{c}u\\v\end{array}\right] = -\left[\begin{array}{c}\sum I_xI_t\\\sum I_yI_t\end{array}\right]$

# Optical Flow

Pyramid / Hierarchical Variant to handle large displacements

• First downsample images, solve it at a coarser scale.
• Then upsample the flow field (scaling the flow values by the same factor), and estimate the remaining displacement on top of it.

So basically, warp image $$I_{t+1}$$ based on the flow field from the coarser level, and now find the differential motion beyond that.
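
A coarse-to-fine sketch of this idea, assuming NumPy/SciPy. Everything named here (`lk_step`, `warp_back`, `pyramid_flow`, the factor-2 pyramid with Gaussian pre-blur, bilinear warping, one refinement step per level, and the value of `eps`) is a choice for illustration and not a prescribed implementation.

```python
import numpy as np
from scipy import ndimage

def lk_step(f0, f1, window=7, eps=1e-4):
    """Single-scale small-motion estimate (windowed least squares)."""
    avg = 0.5 * (f0 + f1)
    Ix = ndimage.sobel(avg, axis=1, mode="nearest") / 8.0
    Iy = ndimage.sobel(avg, axis=0, mode="nearest") / 8.0
    It = f1 - f0

    def s(img):
        # Local mean; the common 1/window**2 factor only rescales eps.
        return ndimage.uniform_filter(img, size=window, mode="nearest")

    a, b, d = s(Ix * Ix) + eps, s(Ix * Iy), s(Iy * Iy) + eps
    p, q = -s(Ix * It), -s(Iy * It)
    det = a * d - b * b
    return (d * p - b * q) / det, (a * q - b * p) / det

def warp_back(f1, u, v):
    """Sample f1 at (x + u, y + v) so that it lines up with the frame at time t."""
    yy, xx = np.mgrid[0:f1.shape[0], 0:f1.shape[1]].astype(np.float64)
    return ndimage.map_coordinates(f1, [yy + v, xx + u], order=1, mode="nearest")

def pyramid_flow(f0, f1, levels=3):
    """Coarse-to-fine flow, halving the resolution at each pyramid level."""
    pyr = [(np.asarray(f0, dtype=np.float64), np.asarray(f1, dtype=np.float64))]
    for _ in range(levels - 1):
        g0, g1 = (ndimage.gaussian_filter(f, 1.0)[::2, ::2] for f in pyr[-1])
        pyr.append((g0, g1))

    u = np.zeros_like(pyr[-1][0])
    v = np.zeros_like(u)
    for g0, g1 in reversed(pyr):            # coarsest level first
        if u.shape != g0.shape:
            # Upsample the flow field; displacements double with the resolution.
            factors = (g0.shape[0] / u.shape[0], g0.shape[1] / u.shape[1])
            u = 2.0 * ndimage.zoom(u, factors, order=1, mode="nearest")
            v = 2.0 * ndimage.zoom(v, factors, order=1, mode="nearest")
        # Warp frame t+1 by the current flow, then find the remaining motion.
        du, dv = lk_step(g0, warp_back(g1, u, v))
        u, v = u + du, v + dv
    return u, v
```

With three levels, a displacement of a few pixels at full resolution becomes a sub-pixel displacement at the coarsest level, where the small-motion assumption is more reasonable.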

# Optical Flow

Horn-Schunck Method

• Lucas-Kanade assumes flow field is constant in a window.
• But what if you have a slanted plane that is moving? The flow will change continuously.
• Better assumption: assume the second derivative of the flow is 0 (the flow varies smoothly).

Regularized estimation for some $$\alpha > 0$$:

$\{u(x,y),v(x,y)\} = \arg \min~\sum_{x,y}~~(I_x[x,y]~u(x,y) + I_y[x,y]~v(x,y) + I_t[x,y])^2 + \alpha (\Delta u(x,y))^2 + \alpha (\Delta v(x,y))^2$

• Here, $$\Delta u(x,y) = u(x,y) - \bar{u}(x,y)$$, where $$\bar{u} = G*u$$ is some spatially smoothed version of $$u$$.
• $$\Delta$$ is called the Laplace operator (remember the Laplacian pyramid).
• Now all pixels depend on all pixels, so this is a general optimization problem.

# Optical Flow

Horn-Schunck Method

$\{u(x,y),v(x,y)\} = \arg \min~\sum_{x,y}~~(I_x[x,y]~u(x,y) + I_y[x,y]~v(x,y) + I_t[x,y])^2 + \alpha (\Delta u(x,y))^2 + \alpha (\Delta v(x,y))^2$

Iterative Solution

• Initialize flow map to some $$u^0,v^0$$.
• At each iteration $$k+1$$, solve assuming $$\bar{u},\bar{v}$$ from previous iteration:

$\{u^{k+1}(x,y),v^{k+1}(x,y)\} = \arg \min~\sum_{x,y}~~(I_x[x,y]~u(x,y) + I_y[x,y]~v(x,y) + I_t[x,y])^2 + \alpha (u(x,y)-\bar{u}^k(x,y))^2 + \alpha (v(x,y)-\bar{v}^k(x,y))^2$

• Can be done for each pixel independently.
• Global smoothness cost means that flow in large smooth regions will be propagated from outside.
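
With $$\bar{u}^k,\bar{v}^k$$ held fixed, setting the derivatives of the per-pixel cost to zero gives a closed-form update (derived here from the objective above, with $$r$$ as shorthand introduced for this note):

$r = \frac{I_x \bar{u}^k + I_y \bar{v}^k + I_t}{\alpha + I_x^2 + I_y^2}, \qquad u^{k+1} = \bar{u}^k - I_x r, \qquad v^{k+1} = \bar{v}^k - I_y r$

A minimal sketch of the resulting iteration, assuming NumPy/SciPy. The name `horn_schunck`, the Gaussian choice for $$G$$ with `sigma=1`, the zero initialization, and the fixed iteration count are all assumptions of this sketch.

```python
import numpy as np
from scipy import ndimage

def horn_schunck(Ix, Iy, It, alpha=1.0, iters=200, sigma=1.0):
    """Iterate the per-pixel closed-form update with smoothed flow from the last step."""
    u = np.zeros_like(Ix, dtype=np.float64)    # u^0
    v = np.zeros_like(u)                       # v^0
    denom = alpha + Ix * Ix + Iy * Iy
    for _ in range(iters):
        # Smoothed flow from the previous iteration (G is a Gaussian here).
        u_bar = ndimage.gaussian_filter(u, sigma, mode="nearest")
        v_bar = ndimage.gaussian_filter(v, sigma, mode="nearest")
        # Every pixel is updated independently given u_bar, v_bar.
        r = (Ix * u_bar + Iy * v_bar + It) / denom
        u = u_bar - Ix * r
        v = v_bar - Iy * r
    return u, v
```

Where $$I_x = I_y = 0$$, the update reduces to $$u^{k+1} = \bar{u}^k$$, which is how flow gets propagated into smooth regions from their surroundings.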

# Optical Flow

State-of-the-Art Methods

• Use energy minimization, complex features, contours, $$\ldots$$

Jerome Revaud, Philippe Weinzaepfel, Zaid Harchaoui and Cordelia Schmid
EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow
CVPR 2015.

http://sintel.is.tue.mpg.de/results - For a leaderboard of best results.