CSE 559A: Computer Vision



Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).
Course Staff: Zhihao Xia, Charlie Wu, Han Liu

http://www.cse.wustl.edu/~ayan/courses/cse559a/

Oct 11, 2018

General

  • Still missing a few problem set 2 submissions
    • Make sure you have git push-ed.
    • Do a git pull; git log and make sure the latest log message confirms your submission.
  • Problem set 3 due two weeks from today.
  • No Class Tuesday (Fall Break)

  • No office hours tomorrow or Monday. Recitation next Friday.

Global Optimization

Last Time

  • We define a cost volume \(C\) of size \(W\times H\times D\)
    • \(C[x,y,d]\) measures dis-similarity between \((x,y)\) in left image
      and \((x-d,y)\) in right image
  • Simplest Approach: \(d[x,y] = \arg \min_d C[x,y,d]\)
    • Too noisy
  • Want to express that disparity (and therefore depth) of nearby pixels is similar

  • Ad-hoc Method: Cost Volume Filtering

  • Only encodes that nearby pixel disparities are exactly equal.
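
In code, the winner-take-all baseline above is just a per-pixel argmin over the disparity axis. A minimal NumPy sketch (the cost volume \(C\), of shape \(H\times W\times D\), is assumed to have been computed already):

```python
import numpy as np

def winner_take_all(C):
    """Per-pixel argmin over an (H, W, D) cost volume: d[x, y] = argmin_d C[x, y, d].
    Ignores neighbors entirely, which is why the result is noisy."""
    return np.argmin(C, axis=2)   # shape (H, W), values in {0, ..., D-1}
```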

Global Optimization

\[d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])\]

  • \(n=[x,y]^T\) for pixel location.
  • \(C\) is cost-volume as before. Gives us "local evidence"
  • \(\mathbf{E}\) is a set of all pairs of pixels that are "neighbors" / adjacent in some way.
    • Can include all un-ordered pairs of the form \([(x,y),(x-1,y)]\) and \([(x,y),(x,y-1)]\) (four-connected)
    • Or diagonal neighbors as well.
  • \(S\) is a function that indicates a preference for \(d[n]\) and \(d[n']\) to be the same.

Global Optimization

\[d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])\]

  • \(S\) is a function that indicates a preference for \(d[n]\) and \(d[n']\) to be the same.

  • Choice 1:
    • \(0\) if \(d[n']=d[n]\), \(1\) otherwise.
  • Choice 2: \(|d[n']-d[n]|\)
  • Choice 3:
    • \(0\) if \(d[n']=d[n]\)
    • \(T_1\) if \(|d[n']-d[n]| < \epsilon\)
    • \(T_2\) otherwise.
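
All three choices can be tabulated as a \(D\times D\) penalty matrix \(S[d,d']\), which is the form the code sketches later in these notes use. A hedged sketch (the function names are made up here; the thresholds \(\epsilon, T_1, T_2\) follow the bullets above):

```python
import numpy as np

def potts(D):
    """Choice 1: 0 if d == d', 1 otherwise."""
    return 1.0 - np.eye(D)

def absolute_difference(D):
    """Choice 2: |d - d'|."""
    d = np.arange(D)
    return np.abs(d[:, None] - d[None, :]).astype(np.float64)

def truncated(D, eps, T1, T2):
    """Choice 3: 0 if d == d', T1 if |d - d'| < eps, T2 otherwise."""
    diff = np.abs(np.arange(D)[:, None] - np.arange(D)[None, :])
    S = np.where(diff < eps, float(T1), float(T2))
    S[diff == 0] = 0.0
    return S
```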

How do we solve this ?

Note that this is a discrete minimization. Each \(d[n] \in \{0,1,\ldots, D-1\}\).
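
A brute-force search over assignments is hopeless: with \(D\) possible labels at each of the \(W\times H\) pixels, the number of candidate disparity maps is

\[D^{W\times H},\]

so we need algorithms that exploit the structure of the cost.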

Global Optimization

\[d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])\]

One approach: Iterated Conditional Modes

  • Begin with \(d_0[n] = \arg \min_d C[n,d]\)
  • At each iteration \(t\), compute \(d_{t+1}\) from \(d_t\), by solving
    for each pixel in \(d_{t+1}\) assuming neighbors have values from \(d_t\).

\[d_{t+1}[n] = \arg \min_{d_n} C[n,d_n] + \lambda \sum_{(n,n') \in \mathbf{E_n}} S(d_n,d_{t}[n'])\]

  • So for each pixel,
    • Take matching cost.
    • Add smoothness cost from its neighbors, assuming values from previous iteration.
    • Minimize.

Does it converge ?

No Guarantee: We are changing all pixel assignments simultaneously.
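
A minimal NumPy sketch of this simultaneous-update variant, assuming a precomputed cost volume C of shape \(H\times W\times D\) and a \(D\times D\) penalty matrix S (e.g. one of the penalty matrices sketched earlier); borders are handled by wrap-around purely to keep the code short:

```python
import numpy as np

def icm_parallel(C, lam, S, num_iters=10):
    """Parallel ICM: update every pixel simultaneously from the previous iterate
    (hence no convergence guarantee). C: (H, W, D) costs, S: (D, D) penalties."""
    d = np.argmin(C, axis=2)                               # d_0: winner-take-all start
    total = np.empty(C.shape)
    for _ in range(num_iters):
        total[:] = C
        # Add the smoothness cost for every candidate disparity, using each
        # 4-neighbor's label from the previous iterate (np.roll wraps at borders).
        for dy, dx in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
            nbr = np.roll(d, shift=(dy, dx), axis=(0, 1))  # d_t[n'] for that neighbor
            total += lam * S[:, nbr].transpose(1, 2, 0)    # S[d_candidate, d_t[n']]
        d = np.argmin(total, axis=2)                       # per-pixel minimization
    return d
```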

Global Optimization

\[d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])\]

Per-pixel Iterated Conditional Modes (slow!)

  • Begin with \(d_0[n] = \arg \min_d C[n,d]\)
  • At each iteration \(t\), compute \(d_{t+1}\) from \(d_t\), by solving
    for one pixel in \(d_{t+1}\) assuming neighbors have values from \(d_t\).

\[d_{t+1}[n_{t+1}] = \arg \min_{d_n} C[n_{t+1},d_n] + \lambda \sum_{(n_{t+1},n') \in \mathbf{E_{n_{t+1}}}} S(d_n,d_{t}[n'])\]

Does it converge ?

  • Each iteration cannot increase the cost, so it will converge (but possibly only to a local optimum).

Global Optimization

\[d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])\]

  • These kinds of cost functions / optimization problems are quite common in vision.
  • The cost can be interpreted as the negative log of a probability distribution:

\[p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)\]

  • Joint distribution over all the \(d[n]\) values.

Global Optimization

\[p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)\]

  • Joint distribution over all the \(d[n]\) values.
  • Graphical Model: Probability Distribution Represented as a "Graph" \((V,E)\)

\[p(\{v \in V\}) = \prod_{v\in V} \Psi_v(v) \prod_{(v_1,v_2)\in E} \Phi_{v_1,v_2}(v_1,v_2)\]

  • Unary term for each node, pair-wise term for each edge.

(Directed Graphs represent Bayesian Networks)

Global Optimization

\[p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)\]

Question: Are \(d[n]\) and \(d[n']\) independent if:

  • If \((n,n') \in \mathbf{E}\) -- pixels are neighbors ?

Global Optimization

\[p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)\]

Question: Are \(d[n]\) and \(d[n']\) independent if:

  • If \((n,n') \in \mathbf{E}\) -- pixels are neighbors ?

Reminder: Two variables are independent if we can express their joint distribution as a product of distributions on each variable.

Global Optimization

\[p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)\]

Question: Are \(d[n]\) and \(d[n']\) independent if:

  • If \((n,n') \in \mathbf{E}\) -- pixels are neighbors. No
  • If \((n,n') \notin \mathbf{E}\) -- pixels are not neighbors ?

Global Optimization

\[p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)\]

Question: Are \(d[n]\) and \(d[n']\) independent if:

  • If \((n,n') \in \mathbf{E}\) -- pixels are neighbors. No
  • If \((n,n') \notin \mathbf{E}\) -- pixels are not neighbors ? NO.

Global Optimization

\[p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)\]

Question: Are \(d[n]\) and \(d[n']\) independent if:

  • If \((n,n') \notin \mathbf{E}\) -- pixels are not neighbors ? NO, unless \(n\) and \(n'\) lie in disconnected components of the graph.

Global Optimization

\[p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)\]

Question: Are \(d[n]\) and \(d[n']\) independent if:

  • If \((n,n') \notin \mathbf{E}\) -- pixels are not neighbors ? NO, unless \(n\) and \(n'\) lie in disconnected components of the graph.
  • If \((n,n') \notin \mathbf{E}\), "conditioned" on all the neighbors of \(n\) being observed: \(p(d[n],d[n'] \mid \{d[n''] : (n,n'')\in\mathbf{E}\})\) ?

Global Optimization

\[p(d) \propto \prod_{n} \exp\left(-C[n,d[n]]\right) \prod_{(n,n') \in \mathbf{E}} \exp\left(-\lambda S(d[n],d[n'])\right)\]

Question: Are \(d[n]\) and \(d[n']\) independent if:

  • If \((n,n') \notin \mathbf{E}\), "conditioned" on all the neighbors of \(n\) being observed: \(p(d[n],d[n'] \mid \{d[n''] : (n,n'')\in\mathbf{E}\})\) ?

YES. This is the Markov property. And these kinds of graphical models are called Markov random fields.

Graph structure encodes "conditional independence".
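
One way to see this directly from the factorization: when we condition on everything except \(d[n]\), every factor that does not involve \(d[n]\) cancels in the normalization, leaving

\[p(d[n] \mid \{d[m] : m \neq n\}) \propto \exp\left(-C[n,d[n]]\right) \prod_{n'':(n,n'')\in\mathbf{E}} \exp\left(-\lambda S(d[n],d[n''])\right),\]

which depends only on the neighbors of \(n\).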

Global Optimization

Compute assignment with highest probability

\[d = \arg \max_d p(d) = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])\]

Global Optimization

\[d = \arg \min_d \sum_{n} C[n,d[n]] + \lambda \sum_{(n,n') \in \mathbf{E}} S(d[n],d[n'])\]

  • Iterated Conditional Modes is really slow.
  • There is no guaranteed solution for arbitrary graphs.
  • But we could solve it exactly if our graph were a chain (or, more generally, a tree).

Global Optimization

\[d = \arg \min_d \sum_{x} C[x,d[x]] + \lambda \sum_x S(d[x],d[x+1])\]

  • Consider the case where we optimize each epipolar line separately.

Global Optimization

[Figure-only slides (images not preserved): a step-by-step walkthrough of the optimization along a single scanline.]

Global Optimization

We could apply this to individual epipolar lines.

We get "streaking" artifacts, because we're smoothing each line independently.

Global Optimization

  • That's why we want to use a full 2D grid.
  • But forward-backward only works on chains (or graphs without cycles).

One flavor of approximate algorithms applies the same idea of forming an aggregated cost \(\bar{C}[x,d]\):

  • TRW-S
  • Loopy Belief Propagation
  • SGM

Global Optimization

Semi-Global Matching

\[\bar{C}[x,d] = C[x,d] + \min_{d'} \bar{C}[x-1,d'] + \lambda S(d,d')\]

This is going left to right in the horizontal direction.
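
As a sketch, this left-to-right recursion can be implemented directly for one scanline. A hedged NumPy version, assuming a per-row cost array C_line of shape \(W\times D\) and a \(D\times D\) penalty matrix S as above; this is the generic \(O(D^2)\)-per-pixel form, and the slides below derive a much cheaper update for the truncated penalty:

```python
import numpy as np

def aggregate_lr_naive(C_line, lam, S):
    """C_bar[x, d] = C[x, d] + min_{d'} ( C_bar[x-1, d'] + lam * S[d, d'] ),
    computed left to right along one scanline. C_line: (W, D), S: (D, D)."""
    W, D = C_line.shape
    C_bar = np.empty((W, D))
    C_bar[0] = C_line[0]
    for x in range(1, W):
        # Minimize over all D values of d' for every d at once (O(D^2) per pixel).
        C_bar[x] = C_line[x] + np.min(C_bar[x - 1][None, :] + lam * S, axis=1)
    return C_bar
```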

Idea: Compute different \(\bar{C}\) along different directions ...

and average.

Global Optimization

Semi-Global Matching

\[\bar{C}_{lr}[n,d] = C[n,d] + \min_{d'} \bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{rl}[n,d] = C[n,d] + \min_{d'} \bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{du}[n,d] = C[n,d] + \min_{d'} \bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{ud}[n,d] = C[n,d] + \min_{d'} \bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\]

\[d[n] = \arg \min_d \bar{C}_{lr}[n,d] + \bar{C}_{rl}[n,d]+ \bar{C}_{ud}[n,d]+\bar{C}_{du}[n,d]\]

Global Optimization

Semi-Global Matching

Global Optimization

\[\bar{C}[x,d] = C[x,d] + \min_{d'} \bar{C}[x-1,d'] + \lambda S(d,d')\]

  • Consider the case where \(S(d,d')\) is:
    • 0 if \(d=d'\)
    • \(P_1\) if \(|d-d'| = 1\)
    • \(P_2\) otherwise.
  • Can we do this efficiently ?
    • Need to go through each line sequentially.
    • But can go through all lines in parallel.
    • But what about \(d\) ? Do we need to do minimization for every \(d\) independently ?

Global Optimization

\[\bar{C}[x,d] = C[x,d] + \min_{d'} \bar{C}[x-1,d'] + \lambda S(d,d')\]

  • Note: It doesn't matter if we add / subtract a per-\(x\) constant \(C_0[x]\), i.e., replace:
    • \(C[x,d]\) with \(C[x,d] + C_0[x]\)
    • \(\bar{C}[x,d]\) with \(\bar{C}[x,d] + C_0[x]\)

Why doesn't it matter ?

  • Because the minimization will always be over \(d\). You are never comparing \(C[x_1,d_1]\) with \(C[x_2,d_2]\).
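
In symbols, for any per-column constant \(C_0[x]\),

\[\arg \min_d \left(\bar{C}[x,d] + C_0[x]\right) = \arg \min_d \bar{C}[x,d],\]

and the same holds inside the \(\min_{d'}\) of the recursion. This is what licenses the normalization on the next slide.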

Global Optimization

\[\bar{C}[x,d] = C[x,d] + \min_{d'} \bar{C}[x-1,d'] + S(d,d')\]

\[S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.\]

Global Optimization

\[\bar{C}[x,d] = C[x,d] + \color{red}{\min_{d'} \bar{C}[x-1,d'] + S(d,d')}\]

\[S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.\]

  • Step 1 (Simplify): Replace \(\bar{C}[x-1,d']\) with \(\tilde{C}[x-1,d'] = \bar{C}[x-1,d'] - \min_{d''} \bar{C}[x-1,d'']\)

Global Optimization

\[\bar{C}[x,d] = C[x,d] + \color{red}{\min_{d'} \tilde{C}[x-1,d'] + S(d,d')}\]

\[S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.\]

  • Step 1 (Simplify): Replace \(\bar{C}[x-1,d']\) with \(\tilde{C}[x-1,d']=\bar{C}[x-1,d'] - \min_{d''} \bar{C}[x-1,d'']\)

What happens then ?

What is the MAXIMUM value for \(\min_{d'} \tilde{C}[x-1,d'] + S(d,d')\) for any \(d\) ?

Global Optimization

\[\bar{C}[x,d] = C[x,d] + \color{red}{\min_{d'} \tilde{C}[x-1,d'] + S(d,d')}\]

\[S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.\]

  • Step 1 (Simplify): Replace \(\bar{C}[x-1,d']\) with \(\tilde{C}[x-1,d']=\bar{C}[x-1,d'] - \min_{d''} \bar{C}[x-1,d'']\)

The MAXIMUM value for \(\min_{d'} \tilde{C}[x-1,d'] + S(d,d')\) is \(P_2\): choosing \(d'\) to be the minimizer of \(\tilde{C}[x-1,\cdot]\) gives \(\tilde{C}[x-1,d'] = 0\) and \(S(d,d') \le P_2\).

  • Step 2: This means that for every value of \(d\), we just need to consider four values.
     
  • \(\min_{d'} \tilde{C}[x-1,d'] + S(d,d')\) is the min of
    • \(P_2\) (for \(d' = \arg \min_{d''} \tilde{C}[x-1,d'']\))
    • \(\tilde{C}[x-1,d-1]+P_1\) (for \(d' = d-1\))
    • \(\tilde{C}[x-1,d+1]+P_1\) (for \(d' = d+1\))
    • \(\tilde{C}[x-1,d]\) (for \(d' = d\))

Global Optimization

\[\bar{C}[x,d] = C[x,d] + \color{red}{\min_{d'} \tilde{C}[x-1,d'] + S(d,d')}\]

\[S(d,d') = \left\{\begin{array}{l}0~\text{if}~d=d'\\P_1~\text{if}~|d-d'|=1\\P_2~\text{otherwise}\end{array} \right.\]

  • \(\min_{d'} \tilde{C}[x-1,d'] + S(d,d')\) is the min of
    • \(P_2\) (for \(d' = \arg \min_{d''} \tilde{C}[x-1,d'']\))
    • \(\tilde{C}[x-1,d-1]+P_1\) (for \(d' = d-1\))
    • \(\tilde{C}[x-1,d+1]+P_1\) (for \(d' = d+1\))
    • \(\tilde{C}[x-1,d]\) (for \(d' = d\))

Can do this in parallel with matrix operations for all \(d\) and all lines.
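
A hedged NumPy sketch of one such pass, vectorized over all rows and all \(d\) using the four candidates above (the array shapes and the helper that combines the four directions are assumptions; \(P_1, P_2\) absorb \(\lambda\) as on the previous slides):

```python
import numpy as np

def sgm_pass_lr(C, P1, P2):
    """One left-to-right pass over an (H, W, D) cost volume with the truncated
    penalty. Runs along axis 1, so other directions can reuse it via flips."""
    H, W, D = C.shape
    C_bar = np.empty((H, W, D))
    C_bar[:, 0, :] = C[:, 0, :]
    for x in range(1, W):
        prev = C_bar[:, x - 1, :]
        prev = prev - prev.min(axis=1, keepdims=True)   # tilde-C: its min over d is 0
        # The four candidates for min_{d'} tilde-C[x-1, d'] + S(d, d'):
        same = prev                                                          # d' = d
        minus = np.full((H, D), np.inf); minus[:, 1:] = prev[:, :-1] + P1    # d' = d-1
        plus = np.full((H, D), np.inf); plus[:, :-1] = prev[:, 1:] + P1      # d' = d+1
        jump = np.full((H, D), P2)                      # d' = argmin (tilde-C = 0)
        C_bar[:, x, :] = C[:, x, :] + np.minimum(np.minimum(same, minus),
                                                 np.minimum(plus, jump))
    return C_bar

def sgm_disparity(C, P1, P2):
    """Run the pass along four directions via flips/transposes and take the
    per-pixel argmin of the summed costs, as in the four-direction SGM equations."""
    lr = sgm_pass_lr(C, P1, P2)
    rl = sgm_pass_lr(C[:, ::-1, :], P1, P2)[:, ::-1, :]
    ud = sgm_pass_lr(C.transpose(1, 0, 2), P1, P2).transpose(1, 0, 2)
    du = sgm_pass_lr(C[::-1].transpose(1, 0, 2), P1, P2).transpose(1, 0, 2)[::-1]
    return np.argmin(lr + rl + ud + du, axis=2)
```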

Full algorithm in paper:
Hirschmueller, Stereo Processing by Semi-Global Matching and Mutual Information, PAMI 2008.

Global Optimization

SGM Algorithm Averages along four directions:

\[\bar{C}_{lr}[n,d] = C[n,d] + \min_{d'} \bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{rl}[n,d] = C[n,d] + \min_{d'} \bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{du}[n,d] = C[n,d] + \min_{d'} \bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{ud}[n,d] = C[n,d] + \min_{d'} \bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\]

\[d[n] = \arg \min_d \bar{C}_{lr}[n,d] + \bar{C}_{rl}[n,d]+ \bar{C}_{ud}[n,d]+\bar{C}_{du}[n,d]\]

But \(\bar{C}_{lr}\) is still smoothing only the original cost \(C\) (it never sees the other directions' aggregated costs).

Global Optimization

SGM Algorithm Averages along four directions:

\[\bar{C}_{lr}[n,d] = (C[n,d]+\bar{C}_{rl}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{rl}[n,d] = C[n,d] + \min_{d'} \bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{du}[n,d] = C[n,d] + \min_{d'} \bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{ud}[n,d] = C[n,d] + \min_{d'} \bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\]

Wouldn't this be better ?

But then ...

Global Optimization

SGM Algorithm Averages along four directions:

\[\bar{C}_{lr}[n,d] = (C[n,d]+\bar{C}_{rl}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{rl}[n,d] = (C[n,d]+\bar{C}_{lr}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{du}[n,d] = C[n,d] + \min_{d'} \bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{ud}[n,d] = C[n,d] + \min_{d'} \bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\]

Wouldn't this be better ?

Why not this ...

Global Optimization

SGM Algorithm Averages along four directions:

\[\bar{C}_{lr}[n,d] = (C[n,d]+\bar{C}_{rl}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \bar{C}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{rl}[n,d] = (C[n,d]+\bar{C}_{lr}[n,d]+\bar{C}_{ud}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \bar{C}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{du}[n,d] = (C[n,d]+\bar{C}_{lr}[n,d]+\bar{C}_{rl}[n,d] + \bar{C}_{ud}[n,d]) + \min_{d'} \bar{C}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\] \[\bar{C}_{ud}[n,d] = (C[n,d]+\bar{C}_{lr}[n,d]+\bar{C}_{rl}[n,d] + \bar{C}_{du}[n,d]) + \min_{d'} \bar{C}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\]

Wouldn't this be better ?

Why not this ?

Because this is a circular definition.

Global Optimization

Loopy Belief Propagation (one version)

\[\bar{C}^{t+1}_{lr}[n,d] = (C[n,d]+\bar{C}^t_{rl}[n,d]+\bar{C}^t_{ud}[n,d] + \bar{C}^t_{du}[n,d]) + \min_{d'} \bar{C}^{t+1}_{lr}[n-[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}^{t+1}_{rl}[n,d] = (C[n,d]+\bar{C}^t_{lr}[n,d]+\bar{C}^t_{ud}[n,d] + \bar{C}^t_{du}[n,d]) + \min_{d'} \bar{C}^{t+1}_{rl}[n+[1,0]^T,d'] + \lambda S(d,d')\] \[\bar{C}^{t+1}_{du}[n,d] = (C[n,d]+\bar{C}^t_{lr}[n,d]+\bar{C}^t_{rl}[n,d] + \bar{C}^t_{ud}[n,d]) + \min_{d'} \bar{C}^{t+1}_{du}[n-[0,1]^T,d'] + \lambda S(d,d')\] \[\bar{C}^{t+1}_{ud}[n,d] = (C[n,d]+\bar{C}^t_{lr}[n,d]+\bar{C}^t_{rl}[n,d] + \bar{C}^t_{du}[n,d]) + \min_{d'} \bar{C}^{t+1}_{ud}[n+[0,1]^T,d'] + \lambda S(d,d')\]

Do this iteratively

More generally, at time step \(t\), pass a message from node \(n\) to \(n'\), based on all messages \(n\) has at that time, except for the message from \(n'\).
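
A hedged sketch of this general rule: a synchronous min-sum implementation on the 4-connected grid (not the sweep-based equations above), with messages stored per incoming direction. The penalty matrix S, the weight lam, the zero initialization, and the direction names are assumptions, and the \(O(D^2)\) inner minimization is kept naive for clarity:

```python
import numpy as np

def shift(a, dy, dx):
    """Shift an (H, W, D) array by (dy, dx); vacated border entries become 0
    (a zero message carries no information, which handles image borders)."""
    H, W = a.shape[:2]
    out = np.zeros_like(a)
    ys, xs = slice(max(dy, 0), H + min(dy, 0)), slice(max(dx, 0), W + min(dx, 0))
    ys_s, xs_s = slice(max(-dy, 0), H + min(-dy, 0)), slice(max(-dx, 0), W + min(-dx, 0))
    out[ys, xs] = a[ys_s, xs_s]
    return out

def loopy_bp(C, lam, S, num_iters=20):
    """Synchronous min-sum loopy BP on a 4-connected grid.
    msg[k] is the message arriving at each pixel from its neighbor in direction k."""
    dirs = {'L': (0, -1), 'R': (0, 1), 'U': (-1, 0), 'D': (1, 0)}
    opposite = {'L': 'R', 'R': 'L', 'U': 'D', 'D': 'U'}
    msg = {k: np.zeros(C.shape) for k in dirs}
    for _ in range(num_iters):
        new_msg = {}
        for k, (dy, dx) in dirs.items():
            # The message arriving from direction k is computed at that neighbor,
            # from all of *its* incoming messages except the one we sent to it.
            at_sender = C + sum(msg[j] for j in dirs if j != opposite[k])
            m = np.min(at_sender[..., :, None] + lam * S[None, None], axis=2)
            m -= m.min(axis=2, keepdims=True)   # re-normalize to keep values bounded
            new_msg[k] = shift(m, -dy, -dx)     # move it from the sender to the receiver
        msg = new_msg
    return np.argmin(C + sum(msg.values()), axis=2)   # per-pixel belief minimizer
```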

Read more:
- Yedidia, Freeman, Weiss, "Understanding belief propagation and its generalizations," IJCAI 2001 (Distinguished Paper)
- Tappen & Freeman, "Comparison of graph cuts with belief propagation for stereo, using identical MRF parameters", ICCV 2003.

Global Optimization

  • There are other methods for discrete minimization, based on "Graph Cuts".
  • SGM / Loopy BP: generalize the fact that there is an exact solution on a chain.
  • Graph Cuts (with expansions / swaps): generalize the fact that there is an exact solution when \(d\) takes only two values.

D. Scharstein and R. Szeliski. A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. IJCV 2002.

http://vision.middlebury.edu/stereo/