CSE 559A: Computer Vision


Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).
Course Staff: Zhihao Xia, Charlie Wu, Han Liu

http://www.cse.wustl.edu/~ayan/courses/cse559a/

Oct 23, 2018

General

  • Project Proposals Deadline was Sunday
    • We'll be providing you with feedback over the next week.
  • Problem Set 3 Due Thursday
  • Friday Office Hours will be (always) in Lopata 103

Grouping & Segmentation

But what is the basis of this grouping ?

  • Physical
    • Lie on the same surface / plane
    • Made of the same material
    • Moving together rigidly

Grouping & Segmentation

But what is the basis of this grouping ?

  • Semantic
    • Same object
    • Foreground / background
    • Interesting / non-interesting

Semantic segmentation: often humans will disagree on what goes where.

Grouping & Segmentation

Simplest Version: Superpixel Segmentation

  • Partition Image into a large number of segments called superpixels.
  • Many segments, each segment relatively small.
  • Oversegmentation of the image
    • Each object / plane / surface might be broken into multiple segments
    • But (hope) each segment does not cross a boundary.
  • Can be based on appearance alone

  • Simplifies further processing (dealing with \(K\) segments instead of \(N\) pixels)

Grouping & Segmentation

SLIC Superpixels

Achanta et al., 2010. Simple Linear Iterative Clustering.

Formally, given an image \(I[n]\) with \(N\) pixels, you want to group the pixels into \(K \ll N\) superpixels.

You want to determine a label

\[L[n] \in \{1,2,\ldots K\}\]

for every pixel \(n\), based on some metric.

Note that the particular label values in \(L\) don't matter. What matters is that similar pixels share the same label. This is clustering!

The final output we care about is \(K\) sets

\[S_k = \{n: L[n] = k\}\]

Grouping & Segmentation

SLIC Superpixels

We will want to group pixels that appear similar and are close by into the same super-pixel.

Define an "augmented" image \(I'[n]\) where each \(I'[n]\in \mathbb{R}^5\)

  • First 3 dimensions are R,G,B
  • Last two dimensions are the \(x\) and \(y\) co-ordinates.

For grayscale images, \(I'[n] \in \mathbb{R}^3\).

Grouping & Segmentation

SLIC Superpixels

Determine labeling \(L[n]\) to minimize the following cost:

\[L = \arg \min_L \min_{\{\mu_k\}}~~~\sum_{k=1}^K~~~\sum_{n:L[n] = k} \|I'[n] - \mu_k\|^2\]

Here, each \(\mu_k \in \mathbb{R}^5\).

  • This is K-means clustering.
  • Easy to see that \(\mu_k\) will be the mean of the \(I'\) vectors of pixels assigned to label \(k\).
  • We're saying that all pixels assigned the label \(k\) should be
    close to each other in the squared distance sense of their augmented vectors.
  • This augmented vector encodes both appearance and location.
  • So we want pixels that look the same and are close-by to have the same label.
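
To see why \(\mu_k\) is the mean, fix the labeling and look only at the part of the cost that involves a single \(\mu_k\). Setting its gradient to zero gives the result. This is a short worked step; \(J_k\) and \(|S_k|\) (the number of pixels in \(S_k\)) are notation introduced here for illustration.

\[J_k(\mu_k) = \sum_{n \in S_k} \|I'[n] - \mu_k\|^2\]

\[\nabla_{\mu_k} J_k = -2 \sum_{n \in S_k} (I'[n] - \mu_k) = 0 ~~\Rightarrow~~ \mu_k = \frac{1}{|S_k|} \sum_{n \in S_k} I'[n]\]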

Grouping & Segmentation

SLIC Superpixels

\[L = \arg \min_L \min_{\{\mu_k\}}~~~\sum_{k=1}^K~~~\sum_{n:L[n] = k} \|I'[n] - \mu_k\|^2\]

  • Typically, use Lab color space instead of RGB.
  • You can weight the contribution of location vs. appearance by scaling \((x,y)\) in \(I'\) by a factor \(\alpha\): \[I'[n] = [I[n]_R,I[n]_G,I[n]_B,\alpha n_x,\alpha n_y]^T\]
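
A minimal sketch of building the augmented image, assuming NumPy and an image stored as an H x W x C float array. The function name `augment` and its arguments are hypothetical, chosen for illustration. Larger `alpha` makes location count more, which tends to give more compact superpixels.

```python
import numpy as np

def augment(img, alpha=1.0):
    """Stack color channels with alpha-scaled (x, y) coordinates.

    img: H x W x C float array (or H x W for grayscale).
    Returns an (H*W) x (C+2) array, one augmented vector per pixel.
    """
    H, W = img.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W]                  # row (y) and column (x) grids
    coords = alpha * np.stack([xs, ys], axis=-1).astype(float)
    feats = np.concatenate([img.reshape(H, W, -1), coords], axis=-1)
    return feats.reshape(H * W, -1)
```

For a color image this gives 5-dimensional vectors, and for a grayscale image 3-dimensional ones, matching the definition of \(I'[n]\).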

Grouping & Segmentation

\[L = \arg \min_L \min_{\{\mu_k\}}~~~\sum_{k=1}^K~~~\sum_{n:L[n] = k} \|I'[n] - \mu_k\|^2\]

K-Means: Lloyd's algorithm

  • Begin with some initial assignment \(L[n]\) (more later).
  • At each iteration ...

Step 1: For each \(k\), assign

\[\mu_k = \text{Mean} \{I'[n]\}_{L[n] = k}\]

Step 2: For each \(n\), assign

\[L[n] = \arg \min_k \|I'[n]-\mu_k\|^2\]

  • Does this converge ?
  • How do we initialize ?
  • Do we really need to do \(K\times N\) computations of \(\|I'[n]-\mu_k\|^2\) ?
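
The two steps can be sketched as follows, assuming NumPy and that the augmented vectors are stacked as rows of an N x D array. This sketch starts from initial centers rather than an initial labeling, and `lloyd` and `n_iters` are hypothetical names. It computes all \(K\times N\) distances in every iteration, which is the cost the last question is asking about.

```python
import numpy as np

def lloyd(X, mu, n_iters=10):
    """Plain K-means on the rows of X (N x D), starting from centers mu (K x D)."""
    mu = mu.astype(float).copy()
    for _ in range(n_iters):
        # Step 2: assign each point to its nearest center (all K*N distances).
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)   # N x K
        L = d2.argmin(axis=1)
        # Step 1: move each center to the mean of its assigned points.
        for k in range(len(mu)):
            members = X[L == k]
            if len(members) > 0:          # leave empty clusters where they are
                mu[k] = members.mean(axis=0)
    return L, mu
```

Neither step can increase the cost: step 1 picks the best \(\mu_k\) for the current labels, and step 2 picks the best label for the current \(\mu_k\). The result is a local minimum that depends on the initialization.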

Grouping & Segmentation

SLIC: Initialization

  • Actually, begin with an assignment of \(\{\mu_k\}\) (and do a step 2).
  • Given desired number of super-pixels \(K\), choose \(K\) points on a grid.
    • Spaced horizontally and vertically apart by \(S = \sqrt{\frac{HW}{K}}\)
  • Set each \(\mu_k = I'[n_k]\), the augmented vector of one of these points.
  • In step 2, each seed is going to attract pixels in its neighborhood that are most like it.
  • Sometimes this initialization gives you a 'seed' that lies right on an edge.
    • Bad because pixels on either side of the edge will often look nothing like it.
  • Solution: Look in a 3x3 neighborhood of the seed, and choose the pixel with the lowest gradient magnitude.
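
A minimal sketch of this initialization for a grayscale image, assuming NumPy. The name `init_seeds`, the half-spacing offset of the grid, and the use of `np.gradient` for the gradient are illustrative choices, not taken from the slides. A seed only moves if a neighbor has strictly lower gradient magnitude.

```python
import numpy as np

def init_seeds(gray, K):
    """Pick roughly K seed pixels on a grid, nudged away from edges.

    gray: H x W float array.
    Returns (seeds, S): a list of (y, x) positions and the grid spacing S.
    """
    H, W = gray.shape
    S = max(1, int(round(np.sqrt(H * W / K))))   # grid spacing
    gy, gx = np.gradient(gray)                   # finite-difference gradients
    gmag = gy ** 2 + gx ** 2                     # squared gradient magnitude
    seeds = []
    for y in range(S // 2, H, S):
        for x in range(S // 2, W, S):
            best = (y, x)
            # Search the 3x3 neighborhood, clipped at the image border.
            for ny in range(max(y - 1, 0), min(y + 2, H)):
                for nx in range(max(x - 1, 0), min(x + 2, W)):
                    if gmag[ny, nx] < gmag[best]:
                        best = (ny, nx)
            seeds.append(best)
    return seeds, S
```

Each \(\mu_k\) would then be set to the augmented vector at the corresponding seed position.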

Grouping & Segmentation

SLIC: Minimization

At any given iteration, for step 2:

  • Don't consider all possible \(K\) for every \(n\).
  • Instead, say that a pixel \(n\) can only be assigned to a cluster \(k\) if
    \(n\) is within a \(2S \times 2S\) window around the spatial co-ordinates in \(\mu_k\).
  • Note that \(\mu_k\)'s will no longer be on a regular grid.

Grouping & Segmentation

SLIC: Minimization

At any given iteration, for step 2:

  • Initialize min_dist[n] to Infinity for all n
  • Loop through each \(\mu_k\), and consider pixels in a \(2S\times 2S\) window around \(\mu_k\)
    • The pixels in this window form a regular grid.
  • For each pixel in this window, compute distance of \(I'[n]\) to \(\mu_k\),
    compare to min_dist[n], if lower, update min_dist[n] and update L[n].

Do we need to loop over \(K\) ? Can get some parallelism if you're clever about it.
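
The windowed version of step 2 can be sketched as below, assuming NumPy, an H x W x D augmented image whose last two channels are \(\alpha x\) and \(\alpha y\), and an integer spacing `S`. The names `assign_step`, `feats`, and `alpha` are hypothetical. The loop over \(k\) is kept sequential here for clarity.

```python
import numpy as np

def assign_step(feats, mu, alpha, S):
    """One SLIC assignment step with a restricted search window.

    feats: H x W x D augmented image (last two channels: alpha*x, alpha*y).
    mu:    K x D cluster centers in the same augmented space.
    S:     grid spacing; center k only competes inside a 2S x 2S window.
    """
    H, W, _ = feats.shape
    min_dist = np.full((H, W), np.inf)
    L = np.full((H, W), -1)                      # -1 marks "not yet assigned"
    for k in range(len(mu)):
        # Recover the center's pixel position from its scaled coordinates.
        cx, cy = np.round(mu[k, -2:] / alpha).astype(int)
        y0, y1 = max(cy - S, 0), min(cy + S, H)
        x0, x1 = max(cx - S, 0), min(cx + S, W)
        # Squared distance from every pixel in the window to center k.
        d2 = ((feats[y0:y1, x0:x1] - mu[k]) ** 2).sum(axis=-1)
        win_dist = min_dist[y0:y1, x0:x1]        # views into the full arrays
        win_L = L[y0:y1, x0:x1]
        better = d2 < win_dist
        win_dist[better] = d2[better]
        win_L[better] = k
    return L, min_dist
```

Each center now touches about \(4S^2\) pixels, so one pass costs roughly \(4N\) distance computations instead of \(K\times N\).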

Grouping & Segmentation

SLIC: Uses

Given a set of super-pixels \(S_k = \{n: L[n] = k\}\):

  • You can "denoise" your image by smoothing independently within each \(S_k\).
    • Replace all intensities by their mean.
    • Fit intensity to be a linear function of \(n\).
  • You can "denoise" other scene properties
    • Filter your stereo cost volume within each super-pixel.
    • Take your disparities within each super-pixel, and fit them to a plane.
    • Do the aggregation for Lucas-Kanade flow estimation within each super-pixel.
  • Build super-pixels with intensity + other information
    • Get an initial estimate of disparity, add it to your augmented vector \(I'[n]\).
    • Get a super-pixel segmentation. Smooth cost-volume, re-estimate disparities.
    • Repeat segmentation ...
  • Group pixels (instead of super-pixels) into objects or by semantic labels
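
As an illustration of the first use, replacing intensities by their per-superpixel mean, here is a minimal sketch assuming NumPy, a grayscale image, and labels stored as integers in \(0,\ldots,K-1\) (the slides number labels from 1). The name `superpixel_mean` is hypothetical.

```python
import numpy as np

def superpixel_mean(img, L, K):
    """Replace each pixel of a grayscale image by the mean of its superpixel."""
    flat_L = L.ravel()
    sums = np.bincount(flat_L, weights=img.ravel(), minlength=K)
    counts = np.bincount(flat_L, minlength=K)
    means = sums / np.maximum(counts, 1)         # avoid 0/0 for empty labels
    return means[L]
```

The same pattern would apply to other per-pixel quantities, such as one slice of a stereo cost volume.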

Grouping & Segmentation

Formally, let's say our smoothness cost \(S_{n,n'}(l,l') = w_{n,n'} \delta[l \neq l']\), for \(w_{n,n'} \geq 0\).

\[L = \arg \min_{L[n] \in \{0,1\}} \sum_n C[n,L[n]] + \sum_{(n,n')\in\mathbb{E}} w_{n,n'} \delta[L[n] \neq L[n']]\]

  • Build a graph with vertices \(V = \{n\} \cup \{0,1\}\).
  • Place an edge between every \((n,n')\in\mathbb{E}\) with weight \(w_{n,n'}\).
  • Place an edge between every pixel \(n\) and vertex \(0\) with weight \(C[n,1]\) (assuming costs are positive).
  • Place an edge between every pixel \(n\) and vertex \(1\) with weight \(C[n,0]\) (assuming costs are positive).
  • Partition the vertices into sets \(A,B\) such that \(0 \in A, 1 \in B\), to minimize Cut\((A,B)\).
    • The cut is defined as the sum of the weights of the edges between vertices in A and vertices in B.
  • Can be solved in polynomial time as an s-t min-cut (e.g., with a max-flow algorithm such as Edmonds-Karp)
  • Assign all pixels in \(A\) label 0, and all pixels in \(B\) label 1.
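
The construction above can be sketched in plain Python on a tiny graph. This is an illustrative solver using Edmonds-Karp max-flow, assuming finite non-negative costs. The name `min_cut_labels` and the input layout are hypothetical, and practical vision code would normally use a dedicated max-flow library.

```python
from collections import deque, defaultdict

def min_cut_labels(C, edges):
    """Binary labeling by s-t min-cut (Edmonds-Karp max-flow).

    C:     list of (C[n,0], C[n,1]) finite non-negative costs, one per pixel.
    edges: list of (n, n2, w) neighbor pairs with weight w >= 0.
    Returns a list of labels in {0, 1}.
    """
    N = len(C)
    s, t = N, N + 1                      # terminal vertices "0" and "1"
    cap = defaultdict(lambda: defaultdict(float))

    def add(u, v, w):                    # undirected edge: capacity both ways
        cap[u][v] += w
        cap[v][u] += w

    for n, (c0, c1) in enumerate(C):
        add(s, n, c1)                    # cut if n ends up with label 1
        add(n, t, c0)                    # cut if n ends up with label 0
    for n, n2, w in edges:
        add(n, n2, w)

    def bfs():
        """Vertices reachable from s in the residual graph, with parents."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    queue.append(v)
        return parent

    while True:
        parent = bfs()
        if t not in parent:
            break
        # Walk back from t to collect the path, then push the bottleneck flow.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= flow
            cap[v][u] += flow
    # Vertices still reachable from s form A (label 0); the rest get label 1.
    return [0 if n in parent else 1 for n in range(N)]
```

On a three-pixel chain with costs `[(0, 4), (3, 2), (4, 0)]` and both neighbor weights equal to 2, the minimizer is `[0, 1, 1]` with total cost 4. Raising both weights to 10 makes the single-label solution `[1, 1, 1]` cheaper.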

Grouping & Segmentation

  • Polynomial Time for Binary Segmentation
  • NP-hard for multi-label cases. \(L[n] \in \{A,B,C,\ldots\}\)
    • Remember, this is the same as our stereo case.
  • But approximate algorithms available
    • Typically different algorithms work well here than for stereo

Grouping & Segmentation

Multi-label Case: \(L[n] \in \{A,B,C,\ldots\}\)

  • Begin with some initial assignment of \(L[n]\) (perhaps the pixel-wise minimizer of \(C\))
  • Then update \(L\) by making one of two kinds of moves in each iteration
  • \(\alpha\)-Expansion
    • Choose one of the labels (say \(A\))
    • Build a binary segmentation problem where \(1 = A\), \(0=\) everything else
    • Set \(C[n,0]=\infty\) for all pixels \(n\) where the current label is already \(A\)
    • Set \(C[n,0]=\) cost of its current assigned label for every other pixel
    • Set \(C[n,1]=\) cost of \(A\) for every other pixel
    • Do a min-cut. Replace all pixels labeled \(1\) with \(A\).
  • \(\alpha-\beta\) Swap
    • Choose a pair of labels (say \(A\) and \(B\))
    • Now define a new graph, containing only the pixels that currently have label \(A\) or \(B\).
    • Solve the binary segmentation problem on this graph, so each of these pixels ends up with either \(A\) or \(B\).
  • Iterate through these different kinds of moves for different choices of labels.
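
The unary costs for one expansion move can be sketched as below, assuming NumPy, an N x M cost array `C`, and integer labels in place of \(A,B,C,\ldots\). The names `expansion_unaries` and `apply_expansion` are hypothetical. Only the unary part of the binary problem is shown. The smoothness terms between neighbors also have to be rebuilt for each move and are omitted here.

```python
import numpy as np

def expansion_unaries(C, L, a):
    """Unary costs of the binary problem for one expansion move on label a.

    C: N x M array, C[n, l] = cost of giving pixel n label l.
    L: length-N integer array of current labels.
    Returns an N x 2 array B with B[n, 0] = cost of keeping the current label
    and B[n, 1] = cost of taking label a.
    """
    N = len(L)
    B = np.empty((N, 2))
    B[:, 0] = C[np.arange(N), L]     # binary label 0: keep current label
    B[:, 1] = C[:, a]                # binary label 1: take label a
    B[L == a, 0] = np.inf            # pixels already labeled a must stay a
    return B

def apply_expansion(L, binary, a):
    """Replace every pixel whose binary label is 1 with a."""
    return np.where(np.asarray(binary) == 1, a, L)
```

An \(\alpha\)-\(\beta\) swap would be set up the same way, but restricted to the pixels currently labeled with one of the two chosen labels.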

Grouping & Segmentation

References

  • Boykov and Kolmogorov, An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision, PAMI 2004.
  • Delong et al., Fast Approximate Energy Minimization with Label Costs, IJCV 2012.
  • Rother et al., GrabCut: Interactive Foreground Extraction using Iterated Graph Cuts, SIGGRAPH 2004.

Grouping & Segmentation

Next Time

  • Min-cut can often lead to isolated points

  • Avoid with a method called "Normalized Cuts"