CSE 559A: Computer Vision


Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).
Course Staff: Zhihao Xia, Charlie Wu, Han Liu

Oct 11, 2018

# General

• Still missing a few problem set 2 submissions
• Make sure you have git push-ed.
• Do a git pull; git log and make sure the latest log message confirms your submission.
• Problem set 3 due two weeks from today.
• No Class Tuesday (Fall Break)

• No office hours tomorrow or Monday. Recitation next Friday.

# Global Optimization

Last Time

• We define a cost volume $$C$$ of size $$W\times H\times D$$
• $$C[x,y,d]$$ measures the dissimilarity between $$(x,y)$$ in the left image
and $$(x-d,y)$$ in the right image
• Simplest Approach: $$d[x,y] = \arg \min_d C[x,y,d]$$ (sketched below)
• Too noisy
• Want to express that the disparity (and therefore depth) of nearby pixels is similar

• Ad-hoc Method: Cost Volume Filtering

• Only encodes that nearby pixel disparities are exactly equal.
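A minimal numpy sketch of the winner-take-all baseline from the first bullet (the shapes and array names here are illustrative assumptions, not from the slides):

```python
import numpy as np

# Assumed layout: cost volume C of shape H x W x D (rows, columns, disparities).
H, W, D = 240, 320, 64
C = np.random.rand(H, W, D).astype(np.float32)  # stand-in for real matching costs

# Winner-take-all: independently pick the lowest-cost disparity at each pixel.
d_wta = np.argmin(C, axis=2)  # shape H x W, entries in {0, ..., D-1}
```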

# Global Optimization

$d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])$

• $$n=[x,y]^T$$ for pixel location.
• $$C$$ is the cost volume as before; it gives us "local evidence".
• $$\mathbf{E}$$ is a set of all pairs of pixels that are "neighbors" / adjacent in some way.
• Can include all unordered pairs of adjacent pixels $$[(x,y),(x-1,y)]$$ and $$[(x,y),(x,y-1)]$$ (four-connected)
• Or diagonal neighbors as well.
• $$S$$ is a function that indicates a preference for $$d[n]$$ and $$d[n']$$ to be the same.

# Global Optimization

$d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])$

• $$S$$ is a function that indicates a preference for $$d[n]$$ and $$d[n']$$ to be the same.

• Choice 1:
• $$0$$ if $$d[n']=d[n]$$, $$1$$ otherwise.
• Choice 2: $$|d[n']-d[n]|$$
• Choice 3:
• $$0$$ if $$d[n']=d[n]$$
• $$T_1$$ if $$|d[n']-d[n]| < \epsilon$$
• $$T_2$$ otherwise.

How do we solve this?

Note that this is a discrete minimization: each $$d[n] \in \{0,1,\ldots,D-1\}$$.
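To make the objective concrete, here is a small sketch that evaluates this cost for a candidate disparity map on the four-connected grid, using Choice 3 for $$S$$ (the values of `T1`, `T2`, `eps`, and `lam` are illustrative assumptions):

```python
import numpy as np

def smoothness(d1, d2, T1=1.0, T2=4.0, eps=3):
    """Choice 3: 0 if equal, T1 for small differences, T2 otherwise."""
    diff = np.abs(d1 - d2)
    return np.where(diff == 0, 0.0, np.where(diff < eps, T1, T2))

def energy(C, d, lam=0.1):
    """Unary matching costs plus smoothness over four-connected edges."""
    H, W, _ = C.shape
    yy, xx = np.mgrid[0:H, 0:W]
    unary = C[yy, xx, d].sum()                      # sum_n C[n, d[n]]
    pair = smoothness(d[:, 1:], d[:, :-1]).sum()    # horizontal neighbors
    pair += smoothness(d[1:, :], d[:-1, :]).sum()   # vertical neighbors
    return unary + lam * pair
```

Comparing this energy for the winner-take-all map against a smoother map shows the trade-off that $$\lambda$$ controls.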

# Global Optimization

$d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])$

One approach: Iterated Conditional Modes

• Begin with $$d_0[n] = \arg \min_d C[n,d]$$
• At each iteration $$t$$, compute $$d_{t+1}$$ from $$d_t$$, by solving
for each pixel in $$d_{t+1}$$ assuming neighbors have values from $$d_t$$.

$d_{t+1}[n] = \arg \min_{d_n} C[n,d_n] + \lambda \sum_{(n,n') \in \mathbf{E_n}} S(d_n,d_{t}[n'])$

• So for each pixel,
• Take matching cost.
• Add smoothness cost from its neighbors, assuming values from previous iteration.
• Minimize.

Does it converge?

No Guarantee: We are changing all pixel assignments simultaneously.
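A minimal sketch of this simultaneous update, using the Potts smoothness (Choice 1); the array conventions follow the earlier snippets and are assumptions:

```python
import numpy as np

def icm_step(C, d, lam=0.1):
    """One simultaneous ICM sweep with Potts smoothness S(a,b) = [a != b]."""
    H, W, D = C.shape
    total = C.astype(float)
    labels = np.arange(D)
    # Pad the current labels with -1 so border pixels see no phantom neighbors.
    dp = np.pad(d, 1, constant_values=-1)
    for dn in (dp[:-2, 1:-1], dp[2:, 1:-1], dp[1:-1, :-2], dp[1:-1, 2:]):
        valid = (dn >= 0)[..., None]
        # Pay lam for every valid neighbor whose label differs from candidate k.
        total += lam * (valid & (dn[..., None] != labels))
    return np.argmin(total, axis=2)

# d = np.argmin(C, axis=2)      # winner-take-all initialization
# for _ in range(10):
#     d = icm_step(C, d)        # no convergence guarantee (see above)
```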

# Global Optimization

$d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])$

Per-pixel Iterated Conditional Modes (slow!)

• Begin with $$d_0[n] = \arg \min_d C[n,d]$$
• At each iteration $$t$$, compute $$d_{t+1}$$ from $$d_t$$, by solving
for one pixel in $$d_{t+1}$$ assuming neighbors have values from $$d_t$$.

$d_{t+1}[n_{t+1}] = \arg \min_{d_n} C[n_{t+1},d_n] + \lambda \sum_{(n_{t+1},n') \in \mathbf{E_{n_{t+1}}}} S(d_n,d_{t}[n'])$

Does it converge?

• Each iteration decreases the cost. So it will converge (but to a local optimum).
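The sequential variant changes only the schedule; a sketch of one raster-scan sweep under the same assumptions as the previous snippet:

```python
import numpy as np

def icm_sweep(C, d, lam=0.1):
    """Raster-scan ICM: update one pixel at a time using the current labels."""
    H, W, D = C.shape
    labels = np.arange(D)
    for y in range(H):
        for x in range(W):
            cost = C[y, x].astype(float)
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < H and 0 <= nx < W:
                    cost += lam * (labels != d[ny, nx])  # Potts smoothness
            d[y, x] = np.argmin(cost)  # this step never increases the cost
    return d
```

Each single-pixel update can only lower (or keep) the total cost, which is why this version converges.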

# Global Optimization

$d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])$

• These kinds of cost functions / optimization problems are quite common in vision.
• The cost can be interpreted as the negative log of a probability distribution:

$p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)$

• Joint distribution over all the $$d[n]$$ values.
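Taking the negative log recovers the cost, up to the (constant) log of the normalizer:

$-\log p(d) = \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n']) + \text{const}$

So minimizing the cost is exactly finding the most probable $$d$$.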

# Global Optimization

$p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)$

• Joint distribution over all the $$d[n]$$ values.
• Graphical Model: Probability Distribution Represented as a "Graph" $$(V,E)$$

$p(\{v \in V\}) = \prod_{v\in V} \Psi_v(v) \prod_{(v_1,v_2)\in E} \Phi_{v_1,v_2}(v_1,v_2)$

• Unary term for each node, pair-wise term for each edge.

(Directed Graphs represent Bayesian Networks)

# Global Optimization

$p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)$

Question: Are $$d[n]$$ and $$d[n']$$ independent if:

• $$(n,n') \in \mathbf{E}$$ -- the pixels are neighbors?

# Global Optimization

$p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)$

Question: Are $$d[n]$$ and $$d[n']$$ independent if:

• $$(n,n') \in \mathbf{E}$$ -- the pixels are neighbors?

Reminder: Two variables are independent if we can express their joint distribution as a product of distributions on each variable.

# Global Optimization

$p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)$

Question: Are $$d[n]$$ and $$d[n']$$ independent if:

• $$(n,n') \in \mathbf{E}$$ -- the pixels are neighbors? No
• $$(n,n') \notin \mathbf{E}$$ -- the pixels are not neighbors?

# Global Optimization

$p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)$

Question: Are $$d[n]$$ and $$d[n']$$ independent if:

• $$(n,n') \in \mathbf{E}$$ -- the pixels are neighbors? No
• $$(n,n') \notin \mathbf{E}$$ -- the pixels are not neighbors? NO.

# Global Optimization

$p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)$

Question: Are $$d[n]$$ and $$d[n']$$ independent if:

• $$(n,n') \notin \mathbf{E}$$ -- the pixels are not neighbors? NO, unless $$n$$ and $$n'$$ are in disconnected components of the graph.

# Global Optimization

$p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)$

Question: Are $$d[n]$$ and $$d[n']$$ independent if:

• $$(n,n') \notin \mathbf{E}$$ -- the pixels are not neighbors? NO, unless $$n$$ and $$n'$$ are in disconnected components of the graph.
• $$(n,n') \notin \mathbf{E}$$, "conditioned" on all the neighbors of $$n$$ being observed: $$p(d[n],d[n'] \mid \{d[n''] : (n,n'') \in \mathbf{E}\})$$?

# Global Optimization

$p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)$

Question: Are $$d[n]$$ and $$d[n']$$ independent if:

• $$(n,n') \notin \mathbf{E}$$, "conditioned" on all the neighbors of $$n$$ being observed: $$p(d[n],d[n'] \mid \{d[n''] : (n,n'') \in \mathbf{E}\})$$?

YES. This is the Markov property. And these kinds of graphical models are called Markov random fields.

Graph structure encodes "conditional independence".
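Written out, the Markov property says that conditioning on the neighbors of $$n$$ makes $$d[n]$$ independent of everything else:

$p(d[n] \mid \{d[m] : m \neq n\}) = p(d[n] \mid \{d[n''] : (n,n'') \in \mathbf{E}\})$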

# Global Optimization

Compute assignment with highest probability

$d = \arg \max_d p(d) = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])$

# Global Optimization

$d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])$

• Iterated Conditional Modes is really slow.
• There is no guaranteed solution for arbitrary graphs.
• But we could solve the problem exactly if our graph were a chain (or, more generally, a tree).

# Global Optimization

$d = \arg \min_d \sum_{x} C[x,d[x]] + \lambda \sum_x S(d[x],d[x+1])$

• Consider the case where we optimize each epipolar line separately.
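On a single scanline the minimization is exact via dynamic programming; a minimal numpy sketch with an $$O(WD^2)$$ forward pass and backtracking (the interface is an assumption, with $$S$$ passed in as a $$D \times D$$ matrix):

```python
import numpy as np

def chain_dp(C, S, lam=0.1):
    """Exactly minimize sum_x C[x,d[x]] + lam * sum_x S(d[x],d[x+1]) on a chain.

    C: W x D unary costs for one scanline.  S: D x D matrix, S[a, b] = S(a, b).
    """
    W, D = C.shape
    Cbar = np.empty((W, D))            # Cbar[x,d]: best cost of a prefix ending at (x,d)
    arg = np.zeros((W, D), dtype=int)  # backpointers
    Cbar[0] = C[0]
    for x in range(1, W):
        trans = Cbar[x - 1][:, None] + lam * S       # trans[d', d]
        arg[x] = np.argmin(trans, axis=0)
        Cbar[x] = C[x] + trans[arg[x], np.arange(D)]
    d = np.empty(W, dtype=int)         # backward pass: follow the backpointers
    d[-1] = np.argmin(Cbar[-1])
    for x in range(W - 2, -1, -1):
        d[x] = arg[x + 1, d[x + 1]]
    return d

# Example with Potts smoothness: S = (np.arange(D)[:, None] != np.arange(D)) * 1.0
```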

# Global Optimization

We could apply this to individual epipolar lines.

We get "streaking" artifacts, because we're smoothing each line independently.

# Global Optimization

• That's why we want to use a full 2D grid.
• But the forward-backward recursion only works on chains (or graphs without cycles).

One flavor of approximate algorithm applies the same idea of forming an aggregated cost $$\bar{C}[x,d]$$:

• TRW-S
• Loopy Belief Propagation
• SGM

# Global Optimization

Semi-Global Matching

$\bar{C}[x,d] = C[x,d] + \min_{d'} \left(\bar{C}[x-1,d'] + \lambda S(d,d')\right)$

This is going left to right in the horizontal direction.

Idea: Compute different $$\bar{C}$$ along different directions ...

and average.

# Global Optimization

Semi-Global Matching

$\bar{C}_{lr}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{rl}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{du}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{ud}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\right)$

$d[n] = \arg \min_d \bar{C}_{lr}[n,d] + \bar{C}_{rl}[n,d]+ \bar{C}_{ud}[n,d]+\bar{C}_{du}[n,d]$
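Combining the four passes is just a sum before the per-pixel argmin; a sketch, assuming a hypothetical `sgm_pass(C)` that computes the left-to-right $$\bar{C}_{lr}$$ (one such pass is sketched later, after the four-candidate trick):

```python
import numpy as np

def sgm(C, sgm_pass):
    """Run a (hypothetical) left-to-right pass in all four directions and combine."""
    lr = sgm_pass(C)
    rl = sgm_pass(C[:, ::-1])[:, ::-1]                      # flip columns
    ud = sgm_pass(C.transpose(1, 0, 2)).transpose(1, 0, 2)  # rows as columns
    du = sgm_pass(C[::-1].transpose(1, 0, 2)).transpose(1, 0, 2)[::-1]
    return np.argmin(lr + rl + ud + du, axis=2)
```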


# Global Optimization

$\bar{C}[x,d] = C[x,d] + \min_{d'} \left(\bar{C}[x-1,d'] + \lambda S(d,d')\right)$

• Consider the case where $$S(d,d')$$ is:
• 0 if $$d=d'$$
• $$P_1$$ if $$|d-d'| = 1$$
• $$P_2$$ otherwise.
• Can we do this efficiently?
• We need to go through each line sequentially.
• But we can go through all lines in parallel.
• But what about $$d$$? Do we need to do the minimization for every $$d$$ independently?

# Global Optimization

$\bar{C}[x,d] = C[x,d] + \min_{d'} \bar{C}[x-1,d'] + \lambda S(d,d')$

• Note: It doesn't matter if we add / subtract per-pixel constants across all $$d$$'s, replacing:
• $$C[x,d]$$ with $$C[x,d] + C_0[x]$$
• $$\bar{C}[x,d]$$ with $$\bar{C}[x,d] + C_0[x]$$

Why not?

• Because the minimization will always be over $$d$$. You are never comparing $$C[x_1,d_1]$$ with $$C[x_2,d_2]$$.
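A quick numeric check of this invariance (the values are arbitrary):

```python
import numpy as np

c = np.array([3.0, 1.0, 4.0])  # costs over d at one pixel
c0 = 10.0                      # an arbitrary per-pixel constant
assert np.argmin(c) == np.argmin(c + c0)  # the argmin over d is unchanged
```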

# Global Optimization

$\bar{C}[x,d] = C[x,d] + \min_{d'} \left(\bar{C}[x-1,d'] + S(d,d')\right)$

$S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.$

# Global Optimization

$\bar{C}[x,d] = C[x,d] + \color{red}{\min_{d'} \left(\bar{C}[x-1,d'] + S(d,d')\right)}$

$S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.$

• Step 1 (Simplify): Replace $$\bar{C}[x-1,d']$$ with $$\tilde{C}[x-1,d'] = \bar{C}[x-1,d'] - \min_{d''} \bar{C}[x-1,d'']$$

# Global Optimization

$\bar{C}[x,d] = C[x,d] + \color{red}{\min_{d'} \left(\tilde{C}[x-1,d'] + S(d,d')\right)}$

$S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.$

• Step 1 (Simplify): Replace $$\bar{C}[x-1,d']$$ with $$\tilde{C}[x-1,d']=\bar{C}[x-1,d'] - \min_{d''} \bar{C}[x-1,d'']$$

What happens then ?

What is the MAXIMUM value of $$\min_{d'} \left(\tilde{C}[x-1,d'] + S(d,d')\right)$$ for any $$d$$?

# Global Optimization

$\bar{C}[x,d] = C[x,d] + \color{red}{\min_{d'} \left(\tilde{C}[x-1,d'] + S(d,d')\right)}$

$S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.$

• Step 1 (Simplify): Replace $$\bar{C}[x-1,d']$$ with $$\tilde{C}[x-1,d']=\bar{C}[x-1,d'] - \min_{d''} \bar{C}[x-1,d'']$$

The MAXIMUM value of $$\min_{d'} \left(\tilde{C}[x-1,d'] + S(d,d')\right)$$ is $$P_2$$: since $$\min_{d''} \tilde{C}[x-1,d''] = 0$$, picking that minimizing $$d'$$ costs at most $$0 + P_2$$.

• Step 2: This means that for every value of $$d$$, we just need to consider four values.

• $$\min_{d'} \left(\tilde{C}[x-1,d'] + S(d,d')\right)$$ is the min of
• $$P_2$$ (for $$d' = \arg \min \tilde{C}[x-1,d']$$)
• $$\tilde{C}[x-1,d-1]+P_1$$ (for $$d' = d-1$$)
• $$\tilde{C}[x-1,d+1]+P_1$$ (for $$d' = d+1$$)
• $$\tilde{C}[x-1,d]$$ (for $$d' = d$$)

# Global Optimization

$\bar{C}[x,d] = C[x,d] + \color{red}{\min_{d'} \left(\tilde{C}[x-1,d'] + S(d,d')\right)}$

$S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.$

• $$\min_{d'} \left(\tilde{C}[x-1,d'] + S(d,d')\right)$$ is the min of
• $$P_2$$ (for $$d' = \arg \min \tilde{C}[x-1,d']$$)
• $$\tilde{C}[x-1,d-1]+P_1$$ (for $$d' = d-1$$)
• $$\tilde{C}[x-1,d+1]+P_1$$ (for $$d' = d+1$$)
• $$\tilde{C}[x-1,d]$$ (for $$d' = d$$)

Can do this in parallel with matrix operations for all $$d$$ and all lines.
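A minimal numpy sketch of this $$O(D)$$ update for the left-to-right pass, vectorized over all scanlines (this is the hypothetical `sgm_pass` assumed in the earlier driver sketch; `P1` and `P2` are the penalties from the slide, everything else is illustrative):

```python
import numpy as np

def sgm_pass(C, P1=1.0, P2=4.0):
    """Left-to-right pass: Cbar[x,d] = C[x,d] + min_d' (Ctilde[x-1,d'] + S(d,d'))."""
    H, W, D = C.shape
    Cbar = np.empty_like(C, dtype=float)
    Cbar[:, 0] = C[:, 0]
    for x in range(1, W):                 # sequential along the line ...
        prev = Cbar[:, x - 1]             # ... but all H scanlines in parallel
        tilde = prev - prev.min(axis=1, keepdims=True)  # Step 1: normalize
        # Step 2: only four candidates matter for each d.
        cand = np.full_like(tilde, P2)                  # d' = global argmin
        cand = np.minimum(cand, tilde)                  # d' = d
        cand[:, 1:] = np.minimum(cand[:, 1:], tilde[:, :-1] + P1)   # d' = d-1
        cand[:, :-1] = np.minimum(cand[:, :-1], tilde[:, 1:] + P1)  # d' = d+1
        Cbar[:, x] = C[:, x] + cand
    return Cbar
```

The per-pixel constants dropped in Step 1 never affect the final per-pixel argmin, as argued above.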

Full algorithm in paper:
Hirschmueller, Stereo Processing by Semi-Global Matching and Mutual Information, PAMI 2008.

# Global Optimization

SGM Algorithm Averages along four directions:

$\bar{C}_{lr}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{rl}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{du}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{ud}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\right)$

$d[n] = \arg \min_d \bar{C}_{lr}[n,d] + \bar{C}_{rl}[n,d]+ \bar{C}_{ud}[n,d]+\bar{C}_{du}[n,d]$

But $$\bar{C}_{lr}$$ is still only smoothing the original cost.

# Global Optimization

SGM Algorithm Averages along four directions:

$\bar{C}_{lr}[n,d] = (C[n,d]+\bar{C}_{rl}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \left(\bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{rl}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{du}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{ud}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\right)$

Wouldn't this be better?

But then ...

# Global Optimization

SGM Algorithm Averages along four directions:

$\bar{C}_{lr}[n,d] = (C[n,d]+\bar{C}_{rl}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \left(\bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{rl}[n,d] = (C[n,d]+\bar{C}_{lr}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \left(\bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{du}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{ud}[n,d] = C[n,d] + \min_{d'} \left(\bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\right)$

Wouldn't this be better?

Why not this ...

# Global Optimization

SGM Algorithm Averages along four directions:

$\bar{C}_{lr}[n,d] = (C[n,d]+\bar{C}_{rl}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \left(\bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{rl}[n,d] = (C[n,d]+\bar{C}_{lr}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \left(\bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{du}[n,d] = (C[n,d]+\bar{C}_{lr}[n,d]+\bar{C}_{rl}[n,d] + \bar{C}_{ud}[n,d]) + \min_{d'} \left(\bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}_{ud}[n,d] = (C[n,d]+\bar{C}_{lr}[n,d]+\bar{C}_{rl}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \left(\bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\right)$

Wouldn't this be better?

Why not this?

Because this is a circular definition.

# Global Optimization

Loopy Belief Propagation (one version)

$\bar{C}^{t+1}_{lr}[n,d] = (C[n,d]+\bar{C}^t_{rl}[n,d]+\bar{C}^t_{ud}[n,d] + \bar{C}^t_{du}[n,d]) + \min_{d'} \left(\bar{C}^{t+1}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}^{t+1}_{rl}[n,d] = (C[n,d]+\bar{C}^t_{lr}[n,d]+\bar{C}^t_{ud}[n,d] + \bar{C}^t_{du}[n,d]) + \min_{d'} \left(\bar{C}^{t+1}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}^{t+1}_{du}[n,d] = (C[n,d]+\bar{C}^t_{lr}[n,d]+\bar{C}^t_{rl}[n,d] + \bar{C}^t_{ud}[n,d]) + \min_{d'} \left(\bar{C}^{t+1}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\right)$
$\bar{C}^{t+1}_{ud}[n,d] = (C[n,d]+\bar{C}^t_{lr}[n,d]+\bar{C}^t_{rl}[n,d] + \bar{C}^t_{du}[n,d]) + \min_{d'} \left(\bar{C}^{t+1}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\right)$

Do this iteratively.

More generally, at time step $$t$$, pass a message from node $$n$$ to $$n'$$, based on all messages $$n$$ has at that time, except for the message from $$n'$$.
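For reference, a compact synchronous min-sum sketch with Potts smoothness; this is one common variant of loopy belief propagation, not necessarily the exact scheduling written above, and all names are illustrative:

```python
import numpy as np

def potts_min(B, lam):
    """For each d': min over d of B[..., d] + lam * [d != d']."""
    return np.minimum(B, B.min(axis=-1, keepdims=True) + lam)

def loopy_bp(C, lam=0.1, iters=10):
    H, W, D = C.shape
    # Messages arriving at each pixel from its left/right/upper/lower neighbor.
    mL = np.zeros((H, W, D)); mR = np.zeros((H, W, D))
    mU = np.zeros((H, W, D)); mD = np.zeros((H, W, D))
    for _ in range(iters):
        base = C + mL + mR + mU + mD

        def send(excl, shift):
            # Exclude the message from the target, minimize, normalize, shift.
            out = potts_min(base - excl, lam)
            out -= out.min(axis=-1, keepdims=True)
            return shift(out)

        mL, mR, mU, mD = (
            send(mR, lambda m: np.pad(m, ((0, 0), (1, 0), (0, 0)))[:, :-1]),
            send(mL, lambda m: np.pad(m, ((0, 0), (0, 1), (0, 0)))[:, 1:]),
            send(mD, lambda m: np.pad(m, ((1, 0), (0, 0), (0, 0)))[:-1]),
            send(mU, lambda m: np.pad(m, ((0, 1), (0, 0), (0, 0)))[1:]),
        )
    return np.argmin(C + mL + mR + mU + mD, axis=2)
```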

• Graph Cuts (with expansions / swaps): generalizes the fact that there is an exact solution when $$d$$ takes only two values.