CSE 559A: Computer Vision

Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).

Course Staff: Zhihao Xia, Charlie Wu, Han Liu

Oct 2, 2018

- Grace Hopper Attendees
- Last week's lectures were recorded. Check Piazza for link to videos.

- Monday Office Hours in Jolley 217.

- Recitation this Friday, 10:30-Noon in Jolley 309.

- Typo in PSET 2 code comments: ntod should return HxW array (not HxWx3)

- In response to requests, slide numbers to help you take notes
- Note that the posted slide PDFs have a subset of slides (avoid redundant transitions).
- Slide numbers will stay the same, but won't be contiguous in PDFs.
- Posted HTML slides are exactly those presented in class.

The most general form is a homography:

\[p' = H p\]

where \(H\) is a general invertible \(3\times 3\) matrix.

- Defined up to scale. So 8 degrees of freedom.

- Defines the mapping between co-ordinates of corresponding points in two images taken from different views:
- If all corresponding points lie on a plane in the world.
- If only the camera orientation has changed between the two views (the camera center stays in the same place).
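A quick sanity check of the scale ambiguity (a minimal NumPy sketch; the matrix `H` below is made up): \(H\) and any nonzero multiple of \(H\) map points to the same place once the homogeneous scale is divided out.

```python
import numpy as np

def apply_homography(H, xy):
    """Map a 2D point through H using homogeneous coordinates."""
    q = H @ np.array([xy[0], xy[1], 1.0])
    return q[:2] / q[2]                 # divide out the homogeneous scale

# A made-up invertible 3x3 matrix standing in for a homography.
H = np.array([[1.2,   0.1,  5.0],
              [0.0,   0.9, -2.0],
              [0.001, 0.0,  1.0]])

# H and any nonzero multiple of H define the same point mapping:
print(apply_homography(H, (10.0, 20.0)))
print(apply_homography(3.7 * H, (10.0, 20.0)))   # same point, up to rounding
```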


I know a bunch of pairs of points \((p'_i, p_i)\), and want to find \(H\) such that:

\[p'_i \sim Hp_i,~~~\forall i\]

- How many unknowns ? 8 (defined up to scale)
- How many equations for four points? 8 (2 x 4)

But how do we write these equations for equality up to scale ?

\[p'_i \times (Hp_i) = 0\]

Recall: \(u \times v = [(u_yv_z-u_zv_y),(u_zv_x-u_xv_z),(u_xv_y-u_yv_x)]^T\)

\[p'_i \times (Hp_i) = 0 \Rightarrow \color{red}{A_i} h = 0\]

\[Hp_i = \left[\begin{array}{c}h_1p_{ix}+h_2p_{iy}+h_3p_{iz}\\h_4p_{ix}+h_5p_{iy}+h_6p_{iz}\\h_7p_{ix}+h_8p_{iy}+h_9p_{iz}\end{array}\right] = \left[\begin{array}{ccccccccc} p_{ix}&p_{iy}&p_{iz}& 0&0&0 & 0&0&0 \\ 0&0&0 & p_{ix}&p_{iy}&p_{iz}& 0&0&0 \\ 0&0&0 & 0&0&0 & p_{ix}&p_{iy}&p_{iz}\end{array}\right]~h\]

\[u \times v = \left[\begin{array}{c}u_yv_z - u_zv_y\\u_zv_x-u_xv_z\\u_xv_y-u_yv_x\end{array}\right] = \left[\begin{array}{ccc} 0&-u_z&u_y\\ u_z&0&-u_x\\ -u_y&u_x&0 \end{array}\right]~v\]

\[p'_i \times (Hp_i) = \color{red}{\left[\begin{array}{ccc} 0&-p'_{iz}&p'_{iy}\\ p'_{iz}&0&-p'_{ix}\\ -p'_{iy}&p'_{ix}&0 \end{array}\right] \left[\begin{array}{ccccccccc} p_{ix}&p_{iy}&p_{iz}& 0&0&0 & 0&0&0 \\ 0&0&0 & p_{ix}&p_{iy}&p_{iz}& 0&0&0 \\ 0&0&0 & 0&0&0 & p_{ix}&p_{iy}&p_{iz}\end{array}\right]}~h\]

\[p'_i \times (Hp_i) = 0\]

\[A_i h = 0\]

The cross product gives us 3 equations, so \(A_i\) is \(3\times 9\).

But one of the rows of \(A_i\) is a linear combination of the other two (\(A_i\) has rank 2). We can choose to keep only two rows, or all three.
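This construction can be sketched in NumPy (the particular \(p\) and \(H\) below are made-up test values): form \(A_i\) as the product of the cross-product matrix of \(p'_i\) and the block matrix built from \(p_i\), then check that it has rank 2 and annihilates the true \(h\).

```python
import numpy as np

def cross_matrix(u):
    """Skew-symmetric matrix [u]_x so that cross_matrix(u) @ v == np.cross(u, v)."""
    return np.array([[0.0,  -u[2],  u[1]],
                     [u[2],  0.0,  -u[0]],
                     [-u[1], u[0],  0.0]])

def A_i(p, pp):
    """3x9 matrix with A_i(p, pp) @ h == pp x (H p), where h = H.ravel()."""
    # np.kron(I_3, p^T) places p^T along the block diagonal, as on the slide.
    return cross_matrix(pp) @ np.kron(np.eye(3), p.reshape(1, 3))

# Made-up test values: an exact correspondence pp = H p.
H  = np.array([[1.1, 0.2, 3.0], [0.1, 0.9, -1.0], [0.0, 0.001, 1.0]])
p  = np.array([1.0, 2.0, 1.0])
pp = H @ p

A = A_i(p, pp)
print(A.shape)                          # (3, 9)
print(np.linalg.matrix_rank(A))         # 2: one row is redundant
print(np.abs(A @ H.ravel()).max())      # ~0: A_i h = 0 for the true H
```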

Stacking all the \(A_i\) matrices for all different correspondences, we get:

\[Ah = 0\]

\(A\) is a \(2n\times 9\) or \(3n\times 9\) matrix, where \(n\) is the number of correspondences. Rank(\(A\)) is at most \(2n\).

The rank is exactly \(2n\) if no three points are collinear.


So we have \(Ah = 0\) and want to find \(h\) up to scale. \(A\) has rank \(2n\) and \(h\) has 9 elements.

**Case 1**: \(n = 4\) non-collinear points.

- The trivial solution is \(h=0\), which we want to avoid.
- Cast as finding \(Ah = 0\) such that \(\|h\|=1\).
- Since \(A\) has rank exactly 8, such a solution exists and is unique (up to sign).
- Can find it using eigen-decomposition / SVD.
- \(A = U D V^T\) where \(D\) is diagonal with last element \(0\); \(h\) is the last column of \(V\).

**Case 2**: \(n > 4\) non-collinear points.

- Over-determined case. Want to find "best" solution.
- \(h = \arg \min_h \|Ah\|^2,~~~\|h\|=1\)

- Same solution, except that instead of taking the zero singular value, we take the minimum singular value.
- \(\|Ah\|^2 = (Ah)^T(Ah) = h^T (A^TA) h\)
- Minimized by unit vector corresponding to lowest eigenvalue of \(A^TA\), or lowest singular value of \(A\).
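Putting both cases together, a minimal DLT sketch (hypothetical helper names `fit_homography` and `warp`; assumes inhomogeneous 2D input points with the third coordinate set to 1):

```python
import numpy as np

def fit_homography(pts, pts_p):
    """DLT: estimate H (up to scale) from n >= 4 correspondences.
    pts, pts_p are (n, 2) arrays of matched inhomogeneous 2D points."""
    rows = []
    for (x, y), (xp, yp) in zip(pts, pts_p):
        # Two independent rows of p' x (H p) = 0, with p = (x, y, 1).
        rows.append([0, 0, 0, -x, -y, -1, yp * x, yp * y, yp])
        rows.append([x, y, 1, 0, 0, 0, -xp * x, -xp * y, -xp])
    A = np.array(rows)
    # Unit-norm h minimizing ||A h||^2: last right singular vector of A.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3)

def warp(H, pts):
    """Map (n, 2) points through H and divide out the homogeneous scale."""
    q = (H @ np.c_[pts, np.ones(len(pts))].T).T
    return q[:, :2] / q[:, 2:]

# Round trip on noise-free points: recover a known H from 4 correspondences.
H_true = np.array([[1.0, 0.2, 3.0], [0.1, 1.1, -2.0], [0.001, 0.0, 1.0]])
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
H_est = fit_homography(pts, warp(H_true, pts))
H_est = H_est / H_est[2, 2]          # fix the arbitrary scale before comparing
```

For \(n > 4\) noisy correspondences the same code applies: the SVD simply returns the unit \(h\) with the smallest \(\|Ah\|\).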

Estimation from Lines

- How does a homography transform a line:

\[l^Tp = 0 \leftrightarrow l'^Tp'=0\]

\[l^T H^{-1}H p = 0 \Rightarrow (H^{-T}l)^T(Hp) = 0\]

\[l' = H^{-T}l \Rightarrow l=H^Tl'\]

- If we find four pairs of corresponding lines, we can get a similar set of equations for \(l_i = H^Tl'_i\) as for points.
- Get equations from \(l_i \times (H^Tl'_i) = 0\) for elements of \(H\).
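A quick numeric check of this rule (with a made-up \(H\), line, and point): if \(p\) lies on \(l\), then \(Hp\) lies on \(H^{-T}l\).

```python
import numpy as np

# Made-up homography, line, and an incident point (l @ p == 0).
H = np.array([[1.1, 0.0, 2.0], [0.2, 0.9, -1.0], [0.0, 0.001, 1.0]])
l = np.array([1.0, -2.0, 3.0])
p = np.array([1.0, 2.0, 1.0])          # l @ p = 1 - 4 + 3 = 0

lp = np.linalg.inv(H).T @ l            # l' = H^{-T} l
pp = H @ p                             # p' = H p
print(np.isclose(lp @ pp, 0.0))        # True: incidence is preserved
```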

Other approaches:

- Instead of measuring \(\|Ah\|^2\), we might want to measure an explicit geometric distance.
- Minimize distance in the mapped Cartesian co-ordinates (re-projection error).
- This involves division, so it is no longer linear in \(H\): requires iterative methods.
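One way this geometric cost might be written (a minimal sketch; `reprojection_error` is a hypothetical name, and points are (n, 2) inhomogeneous arrays):

```python
import numpy as np

def reprojection_error(H, pts, pts_p):
    """Sum of squared Cartesian distances between the H-mapped points
    and their matched locations (the re-projection error)."""
    q = (H @ np.c_[pts, np.ones(len(pts))].T).T
    mapped = q[:, :2] / q[:, 2:]     # this division makes the cost nonlinear in H
    return float(np.sum((mapped - pts_p) ** 2))
```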

- See "Multiple View Geometry in Computer Vision," Hartley & Zisserman: Section 4.2

(or really, the whole book for a thorough discussion of geometry)



For a bunch of samples denoted by set \(C\):

\[h = \arg \min_h \sum_{i \in C} E_i(h)\]

for some per-sample error function \(E_i\), e.g. \(E_i(h) = \|A_ih\|^2\).

Robust Version: \[h = \arg \min_h \sum_{i \in C} \min(\epsilon,E_i(h))\]

- Limits the extent to which an erroneous sample can hurt
- If a specific \(E_i > \epsilon\), what is its gradient with respect to \(h\) ? 0

So if I knew which \(i\in C\) would have \(E_i > \epsilon\) for the correct \(h\),

\[h = \arg \min_h \sum_{i: E_i \leq \epsilon} E_i(h)\]

Drop those samples, and solve the normal way (SVD, etc.)


Iterative Version:

\[h = \arg \min_h \sum_{i \in C} \min(\epsilon,E_i(h))\]

- Fit the best \(h\) to all samples in the full set \(C\).
- Given the current estimate of \(h\), compute the inlier set \(C'=\{i: E_i(h) \leq \epsilon\}\)
- Update the estimate of \(h\) by minimizing the error over only the inlier set \(C'\)
- Go to step 2

Will this converge ?

Consider the original robust cost \(\min(\epsilon,E_i(h))\). Can step 3 ever increase the cost ?

- Before step 3, \(h\) had some cost over the inlier set, and cost over each outlier sample was \(> \epsilon\).

- Step 3 finds the value of \(h\) with minimum cost over the inlier set. So the error can only decrease over the inlier set.

- Step 3 can increase or decrease error over outlier set. But increased error doesn't hurt us, since it was already \(> \epsilon\) before step 3.

Will this converge ? Yes. Cost will never increase.

Stop when the cost stops decreasing. (It might oscillate between two solutions with the same cost.)
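The steps above might be sketched generically as follows (hypothetical `errors_fn`/`fit_fn` callbacks; the 1-D mean-fitting example at the end is only an illustration, not a homography fit):

```python
import numpy as np

def robust_fit(errors_fn, fit_fn, samples, eps, max_iters=100):
    """Alternate between computing the inlier set {i : E_i(h) <= eps} and
    re-fitting h to the inliers, starting from a fit to all samples."""
    h = fit_fn(samples)
    cost = np.sum(np.minimum(eps, errors_fn(h, samples)))
    for _ in range(max_iters):
        inliers = [s for s, e in zip(samples, errors_fn(h, samples)) if e <= eps]
        if not inliers:                  # no inliers left; keep current estimate
            break
        h_new = fit_fn(inliers)
        cost_new = np.sum(np.minimum(eps, errors_fn(h_new, samples)))
        if cost_new >= cost:             # robust cost never increases; stop
            break
        h, cost = h_new, cost_new
    return h

# Toy 1-D illustration: robustly estimate a mean in the presence of an outlier.
h = robust_fit(lambda h, s: (np.array(s) - h) ** 2, np.mean,
               [0.0, 0.1, -0.1, 0.05, -0.05, 5.0], eps=1.0)
print(h)                                 # 0.0: the outlier 5.0 gets dropped
```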

So method converges to some solution. Is it the global minimum ?

No. It's possible that making one more point an outlier would increase that point's error to \(> \epsilon\), but reduce the other errors by a lot.

This is fundamentally a combinatorial problem: the only way to solve it exactly is to consider all possible subsets of \(C\) as outlier sets.

**Random Sampling and Consensus**

Lots of different variants.

- Randomly select \(k\) points (correspondences) as the inlier set. The choice of \(k\) can vary, but it has to be at least 4 for computing homographies.
- Fit \(h\) to these \(k\) points.
- Store \(h\) and a measure of how good a fit \(h\) is to all points. This measure can either be the thresholded robust cost, or the number of outliers.

Repeat this \(N\) times to get \(N\) different estimates of \(h\) and associated costs.

Choose the \(h\) with the lowest cost, and then refine it using the iterative algorithm.
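One basic variant of this loop, sketched generically (hypothetical `fit_fn`/`error_fn` callbacks, scoring by inlier count; for homographies these would be the DLT fit and a point-transfer error with \(k \geq 4\)):

```python
import numpy as np

def ransac(samples, fit_fn, error_fn, k, eps, n_trials=500, seed=0):
    """Fit a model to k random samples, score it by its inlier count over
    all samples, and keep the best model over n_trials rounds."""
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, -1
    for _ in range(n_trials):
        idx = rng.choice(len(samples), size=k, replace=False)
        model = fit_fn([samples[i] for i in idx])
        inliers = sum(error_fn(model, s) <= eps for s in samples)
        if inliers > best_inliers:
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

# Toy 1-D illustration: estimate a mean from data with two gross outliers.
data = [0.0, 0.1, -0.1, 0.2, -0.2, 10.0, 9.0]
model, n_in = ransac(data, np.mean, lambda m, s: (s - m) ** 2, k=2, eps=1.0)
```

The best model returned here would then be refined with the iterative inlier-set algorithm from the previous slides.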