CSE 559A: Computer Vision


Fall 2018: T-R: 11:30-1pm @ Lopata 101

Instructor: Ayan Chakrabarti (ayan@wustl.edu).

Course Staff: Zhihao Xia, Charlie Wu, Han Liu

Sep 27, 2018

- Almost all PSET 1 submissions are in!
- Try not to use all your late days on the first problem set.

- Reminder: PSET 2 out.
- Get started early !

- Start thinking about final project.
- Will discuss next week.

**Intensity, Distance, and sub-tended angles**

- Does an object appear brighter if we're closer to it ?

- Consider taking an image of an object in a room where everything else is completely black.

- Take an image, and then move your camera closer, and take another image.

- The average intensity of the pixels that are on the object won't increase.

- But now more pixels in your image will correspond to the object. And so average / total intensity across all pixels in the image will increase.

- So your camera is receiving more total light, but the light along an individual ray doesn't change.

**Diffuse Lighting vs Diffuse Surface**

- Diffuse lighting means the incoming light is nearly the same from all directions.

- Diffuse Surface or Diffuse Reflection or Lambertian Reflection: Light reflected by the surface is the same in all directions.

- A Diffuse Surface under a Non-diffuse (e.g., point) light source will have sharp shading variations.

- But the shading will be the same from all viewing directions.

- A non-diffuse (specular / shiny) surface under relatively diffuse light can still have sharp variations in appearance based on viewing direction.

\[\text{3D}~~~(x,y,z) \Rightarrow \left(-f\frac{x}{z}, -f \frac{y}{z}\right)~~~\text{2D}\]

- The division is annoying, makes projection non-linear.
- Can no longer use matrices / linear operations to relate co-ordinates.

- But we like matrix operations !

**Solution**: Homogeneous Co-ordinates

Book-keeping trick !

- 2D Cartesian Co-ordinates: \((x,y)\)
- 2D Homogeneous Co-ordinates: \((\alpha~x,\alpha~y,\alpha)\)

- Cartesian to Homogeneous: \((x,y) \rightarrow (\alpha x, \alpha y, \alpha)\)
- When \(\alpha=1\), this is called "augmented": \((x,y,1)\)

- Homogeneous to Cartesian: \((x',y',\alpha) \rightarrow \left(\frac{x'}{\alpha},\frac{y'}{\alpha}\right)\)

- A whole family of homogeneous co-ordinates map to the same cartesian co-ordinate
- *Over-parameterization* of a 2D point
- Denote this equality by \(\sim:~~(\alpha_1 x, \alpha_1 y, \alpha_1) \sim (\alpha_2 x, \alpha_2 y, \alpha_2)\)
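The conversions above can be sketched in a few lines of plain Python. The function names `to_homogeneous` and `to_cartesian` are illustrative choices, not names from the course material.

```python
def to_homogeneous(x, y, alpha=1.0):
    """Cartesian (x, y) -> homogeneous (alpha*x, alpha*y, alpha)."""
    if alpha == 0:
        raise ValueError("alpha must be non-zero for a finite point")
    return (alpha * x, alpha * y, alpha)


def to_cartesian(p):
    """Homogeneous (x', y', alpha) -> Cartesian (x'/alpha, y'/alpha)."""
    xp, yp, alpha = p
    if alpha == 0:
        raise ValueError("a point at infinity has no Cartesian form")
    return (xp / alpha, yp / alpha)


augmented = to_homogeneous(3.0, 4.0)          # alpha = 1: the "augmented" form
scaled = to_homogeneous(3.0, 4.0, alpha=2.0)  # another member of the same family
```

Both `augmented` and `scaled` map back to the same Cartesian point, which is the over-parameterization described above.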

- Space of 2D Homogeneous co-ordinates denoted as \(\mathbb{P}^2 = \mathbb{R}^3 - (0,0,0)\)

- Note that \((x,y,0)\) is defined. In cartesian co-ordinates, it is the point at infinity along the line joining \((0,0)\) to \((x,y)\).

- 3D Homogeneous Co-ordinates: \((x,y,z) \Rightarrow (\alpha x, \alpha y, \alpha z, \alpha)\)

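- Turned non-linear perspective projection into a linear operation:

\[P_{2d} = \left[\begin{array}{cccc}-f & 0 & 0 & 0\\0 & -f & 0 & 0\\0 & 0 & 1 & 0\end{array}\right] P_{3d}\]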

- Here's a different projection matrix:

\[P_{2d} = \left[\begin{array}{cccc}1 & 0 & 0 & 0\\0 & 1 & 0 & 0\\0 & 0 & 0 & 1\end{array}\right] P_{3d}\]

What does this represent ?

\[(x,y,z) \rightarrow (\color{red}{?}, \color{red}{?})\]

\[(x,y,z) \rightarrow (x,y)\]

**Orthographic Projection**
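A minimal sketch comparing the two projections as matrix products. The perspective matrix `P_persp` is an assumption: it is the matrix implied by the formula \((x,y,z) \Rightarrow (-fx/z, -fy/z)\), and the focal length `f` and the test point are arbitrary illustrative values.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def to_cartesian(p):
    """Divide by the last homogeneous co-ordinate."""
    return tuple(c / p[-1] for c in p[:-1])


f = 2.0  # assumed focal length, for illustration only

# Perspective: (x, y, z, 1) -> (-f x, -f y, z), i.e. (-f x/z, -f y/z)
P_persp = [[-f, 0, 0, 0],
           [0, -f, 0, 0],
           [0, 0, 1, 0]]

# Orthographic: (x, y, z, 1) -> (x, y, 1), i.e. (x, y)
P_ortho = [[1, 0, 0, 0],
           [0, 1, 0, 0],
           [0, 0, 0, 1]]

X = [1.0, 2.0, 4.0, 1.0]  # augmented 3D point (x, y, z) = (1, 2, 4)

persp = to_cartesian(matvec(P_persp, X))
ortho = to_cartesian(matvec(P_ortho, X))
```

The only non-linear step, the division, is deferred to `to_cartesian`; everything before it is a matrix multiplication.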

- Also useful to represent translation, rotation, skew in addition to projection

- Learn to chain together all these operations to:
- Relate points in 3D to points in image
- Verify angles, metric lengths from calibration targets, ...
- Relate points in two images from different cameras

- Useful way to think about 2-D Homogeneous Co-ordinates \(\mathbb{P}^2\)

"Rays" in \(\mathbb{R}^3\)

- Cartesian form is "intersection" with plane \(z=1\).

- \((x,y,0)\) are rays that are parallel to the \(z=1\) plane, so they intersect it only at infinity.

- 3-D Homogeneous Co-ordinates are rays in 4D, intersection with a hyper-plane.

- We assume vectors are column vectors.
- \(p' = [x,y]\) implies a 2-D row vector (of size \(1\times 2\))
- \(p = [x,y]^T = \left[\begin{array}{c}x\\y\end{array}\right]\) implies a 2-D column vector (of size \(2\times 1\))

**Lines**

Equation of a line in 2D:

\[ax+by+c = 0\]

Let \(p=[\alpha~x,\alpha~y,\alpha]^T\) be homogeneous co-ordinates of a point \((x,y)\). Then the equation of the line becomes:

\[l^Tp = 0,~~~l = [a,b,c]^T\]

Interestingly, \(l\) is also defined "up to scale": \(~l' = [\beta a, \beta b, \beta c]^T\) describes the same line as \(l\).
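As a quick check of both scale invariances, here is a small sketch using an arbitrary example line \(x - 2y + 4 = 0\) and the point \((2,3)\) on it (both chosen only for illustration).

```python
def dot(u, v):
    """Inner product l^T p of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


l = [1.0, -2.0, 4.0]  # the line x - 2y + 4 = 0
x, y = 2.0, 3.0       # a point on that line: 2 - 6 + 4 = 0

# l^T p for several scalings alpha of the point and beta of the line
residuals = []
for alpha in (1.0, 2.0, -0.5):
    p = [alpha * x, alpha * y, alpha]
    for beta in (1.0, 3.0):
        l_scaled = [beta * c for c in l]
        residuals.append(dot(l_scaled, p))
```

Every entry of `residuals` is zero, so any non-zero `alpha` or `beta` describes the same incidence relation.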

**Lines**

Since \(l^Tp = 0\) for all points that lie on a line:

**Lines**

Given two points \(p_1\) and \(p_2\), what is the homogeneous vector for the line joining them ?

It has to be an \(l\) such that \(l^Tp_1 = 0\) and \(l^Tp_2 = 0\).

Is that sufficient to determine \(l\) ?

Yes, because we only need \(l\) up to scale.

Solution given by: \(l = p_1 \times p_2\) (**Vector Cross-product**)

Recap: Writing \(u = [u_1,u_2,u_3]^T = u_1 \hat{i} + u_2 \hat{j} + u_3 \hat{k}\), and \(v = [v_1,v_2,v_3]^T = v_1 \hat{i} + v_2 \hat{j} + v_3 \hat{k}\)

\[u \times v = \text{det} \left|\begin{array}{ccc}\hat{i}& \hat{j}& \hat{k}\\u_1 & u_2 & u_3 \\v_1 & v_2 & v_3\end{array} \right| = (u_2v_3-u_3v_2)\hat{i} + (u_3v_1-u_1v_3)\hat{j} + (u_1v_2-u_2v_1)\hat{k}\]

\[= [(u_2v_3-u_3v_2),(u_3v_1-u_1v_3),(u_1v_2-u_2v_1)]^T~~~~~~~~~~\]
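A short sketch of the line through two points, using the cross-product formula above. The two points \((0,1)\) and \((2,3)\) are arbitrary examples; they lie on \(y = x + 1\).

```python
def cross(u, v):
    """Cross product of two 3-vectors, following the formula above."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


p1 = [0, 1, 1]  # augmented co-ordinates of (0, 1)
p2 = [2, 3, 1]  # augmented co-ordinates of (2, 3)

l = cross(p1, p2)  # homogeneous vector of the line joining p1 and p2
```

Here `l` comes out as `[-2, 2, -2]`, i.e. \(-2x + 2y - 2 = 0\), which is \(y = x + 1\) up to scale.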

**Lines**

Given two lines \(l_1\) and \(l_2\), what is the homogeneous co-ordinate vector \(p\) for the point of their intersection ?

Same idea: \(l_1^T p = p^T l_1 = 0\) and \(p^T l_2 = 0\)

\(p = l_1 \times l_2\)

- Cross product between two points gives us the line between them
- Cross product between two lines gives us the point common to both

- What happens if \(l_1\) and \(l_2\) are parallel ?

Answer: Third co-ordinate of \(l_1\times l_2\) is 0. Point at infinity.
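The same cross product gives intersections. A minimal sketch with three example lines (chosen for illustration): \(x - y + 1 = 0\), \(x + y - 3 = 0\), and \(x - y + 5 = 0\), the last being parallel to the first.

```python
def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


l1 = [1, -1, 1]   # x - y + 1 = 0
l2 = [1, 1, -3]   # x + y - 3 = 0
l3 = [1, -1, 5]   # x - y + 5 = 0, parallel to l1

p = cross(l1, l2)                  # finite intersection point
p_cart = (p[0] / p[2], p[1] / p[2])

q = cross(l1, l3)                  # parallel lines: third co-ordinate is 0
```

The first two co-ordinates of `q` are proportional to \((1,1)\), the common direction of the two parallel lines, which is the direction of the point at infinity.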

**Transformations**

- Translation:
- \(x' = x-c_x\), \(y'=y-c_y\)

- Express as, \(p' = T~p\) where \(T\) is a \(3\times 3\) matrix:

\[T = \left[\begin{array}{ccc}1 & 0 & -c_x\\0&1&-c_y\\0&0&1\end{array}\right]\]

- Verify this works for any scaled version of \(T\) above
- Verify this works for \(p=[\alpha x,\alpha y,\alpha]^T\), for any \(\alpha \neq 0\)
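Both verifications can be sketched numerically. The matrix below is the translation by \((-c_x, -c_y)\) written in homogeneous form, and the numbers are arbitrary illustrative values.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def to_cartesian(p):
    return (p[0] / p[2], p[1] / p[2])


cx, cy = 2.0, 3.0
T = [[1, 0, -cx],
     [0, 1, -cy],
     [0, 0, 1]]

x, y = 5.0, 7.0
expected = (x - cx, y - cy)

# Augmented point with the unscaled T
plain = to_cartesian(matvec(T, [x, y, 1.0]))

# Scaled T (by 3) applied to a scaled point (alpha = 2)
T_scaled = [[3 * t for t in row] for row in T]
alpha = 2.0
scaled = to_cartesian(matvec(T_scaled, [alpha * x, alpha * y, alpha]))
```

The scale factors on `T` and on `p` both cancel in the final division.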

**Transformations**

- Rotation Around the Origin
- \(x' = x \cos \theta - y \sin \theta\), \(y' = x \sin \theta + y \cos \theta\)

\[p' = \left[\begin{array}{ccc}\cos \theta & -\sin \theta & 0\\\sin\theta&\cos\theta&0\\0&0&1\end{array}\right] p \]

- Rotation around a different point \(c_x,c_y\) ?

\[p' = \left[\begin{array}{ccc}1 & 0 & c_x\\0&1&c_y\\0&0&1\end{array}\right] \left[\begin{array}{ccc}\cos \theta & -\sin \theta & 0\\\sin\theta&\cos\theta&0\\0&0&1\end{array}\right] \left[\begin{array}{ccc}1 & 0 & -c_x\\0&1&-c_y\\0&0&1\end{array}\right] p \]
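The chain "translate the centre to the origin, rotate, translate back" can be sketched as a product of three matrices. The helper names (`translation`, `rotation`, `rotation_about`) are illustrative, not from the course material.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def rotation_about(theta, cx, cy):
    """Rotate by theta around (cx, cy): T(c) R(theta) T(-c)."""
    return matmul(translation(cx, cy),
                  matmul(rotation(theta), translation(-cx, -cy)))


H = rotation_about(math.pi / 2, 1.0, 1.0)
q = matvec(H, [2.0, 1.0, 1.0])        # (2, 1) rotated 90 degrees about (1, 1)
centre = matvec(H, [1.0, 1.0, 1.0])   # the centre of rotation stays fixed
```

Up to floating-point rounding, `q` is \((1, 2)\) and `centre` is \((1, 1)\). Note the order: the right-most matrix is applied to the point first.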

**Transformations**

- Euclidean Transformation \[p' = \left[\begin{array}{cc}R & t\\0^T & 1\end{array}\right]p\]
- \(R\) is a \(2\times 2\) rotation matrix, \(R^TR=I\)
- \(t\) is a \(2\times 1\) translation vector
- \(0^T\) here represents a \(1\times 2\) row of two zeros
- Preserves orientation, lengths, areas

If \(R^TR = I\), is \(R\) always of the form:

\[R = \left[\begin{array}{cc}\cos \theta & -\sin \theta\\\sin \theta & \cos\theta\end{array}\right]\]

Not always. For example, the following matrix also satisfies \(R^TR = I\), but it has determinant \(-1\) and is a reflection rather than a rotation:

\[R = \left[\begin{array}{cc}-\cos \theta & -\sin \theta\\-\sin \theta & \cos\theta\end{array}\right]\]

- Isometries
- \(R^TR=I\) can also correspond to reflections \[R = \left[\begin{array}{cc}1&0\\0&-1\end{array}\right]\]
- If we allow this in \(R\) above, more general than euclidean
- Preserves lengths, areas, but not orientation.
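One way to tell rotations and reflections apart numerically is the determinant: both satisfy \(R^TR=I\), but a rotation has determinant \(+1\) and a reflection \(-1\). A small sketch, with an arbitrary angle and illustrative helper names:

```python
import math


def is_orthogonal(R, tol=1e-12):
    """Check R^T R = I for a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = R
    # R^T R = [[a*a + c*c, a*b + c*d], [a*b + c*d, b*b + d*d]]
    return (abs(a * a + c * c - 1) < tol
            and abs(b * b + d * d - 1) < tol
            and abs(a * b + c * d) < tol)


def det2(R):
    (a, b), (c, d) = R
    return a * d - b * c


theta = 0.7  # arbitrary angle
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
reflection = [[1, 0],
              [0, -1]]
```

Both matrices pass `is_orthogonal`; only the sign of `det2` distinguishes the orientation-preserving case.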

**Transformations**

What about scaling ?

Allow uniform scaling \(s\) along both co-ordinates: \[p' = \left[\begin{array}{cc}sR & t\\0^T & 1\end{array}\right]p\]

Called a similarity: preserves ratio of lengths, angles.
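A minimal numerical sketch of the "ratio of lengths" claim. The scale, angle, translation, and the three test points are arbitrary illustrative values.

```python
import math


def similarity(s, theta, tx, ty):
    """3x3 matrix [[sR, t], [0^T, 1]] for scale s, rotation theta, translation t."""
    c, sn = math.cos(theta), math.sin(theta)
    return [[s * c, -s * sn, tx],
            [s * sn, s * c, ty],
            [0, 0, 1]]


def apply(H, pt):
    """Apply a 3x3 homogeneous transform to a Cartesian 2D point."""
    x, y = pt
    q = [row[0] * x + row[1] * y + row[2] for row in H]
    return (q[0] / q[2], q[1] / q[2])


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


S = similarity(2.0, 0.3, 1.0, -1.0)
a, b, c = (0.0, 0.0), (3.0, 4.0), (6.0, 0.0)
a2, b2, c2 = apply(S, a), apply(S, b), apply(S, c)

ratio_before = dist(a, b) / dist(a, c)
ratio_after = dist(a2, b2) / dist(a2, c2)
```

Individual lengths are multiplied by `s` (here 2), so their ratio is unchanged.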

**Transformations**

Affine Transformation

\[p' = \left[\begin{array}{cc}A & t\\0^T & 1\end{array}\right]p\]

where \(A\) is a general invertible \(2\times 2\) matrix.

Preserves ratios of areas, parallel lines stay parallel.

Prove that parallel lines stay parallel.

- Consider the homogeneous vector \(q\) for intersection of two lines that are parallel.

- The third co-ordinate of \(q\) is 0, because the lines don't intersect.

- The affine transform doesn't change the third co-ordinate.
- Hence, the lines still intersect at infinity after the transformation.
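The argument can be checked on a concrete case. In this sketch the affine matrix and the two parallel lines (through \((0,0),(1,1)\) and through \((0,1),(1,2)\)) are arbitrary examples; the lines are rebuilt from transformed points using cross products.

```python
def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Affine transform with A = [[2, 1], [0, 3]] and t = (1, -2)
H = [[2, 1, 1],
     [0, 3, -2],
     [0, 0, 1]]

# Two points on each of two parallel lines (both have direction (1, 1))
line1_pts = [[0, 0, 1], [1, 1, 1]]
line2_pts = [[0, 1, 1], [1, 2, 1]]

# Transform the points, then rebuild each line from its transformed points
l1_new = cross(*[matvec(H, p) for p in line1_pts])
l2_new = cross(*[matvec(H, p) for p in line2_pts])

# Intersection of the transformed lines
q_new = cross(l1_new, l2_new)
```

The third co-ordinate of `q_new` is still 0, so the transformed lines still meet only at infinity.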

Most general form:

\[p' = H p\]

where \(H\) is a general invertible \(3\times 3\) matrix.

- Called a projective transform or **homography**.
- All bets are off! Parallel lines can now intersect. Maps quadrilaterals to quadrilaterals.

- Defined up to scale. So 8 degrees of freedom.
- Hierarchy of Transforms
- Translation (2 dof) < Euclidean (3 dof) < Affine (6 dof) < Homography (8 dof)
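A small sketch of the "parallel lines can now intersect" claim. The matrix `H` is an arbitrary invertible example whose last row is not \([0, 0, 1]\), and the two parallel lines are again illustrative.

```python
def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Example homography (determinant 1); the last row mixes x into the third co-ordinate
H = [[1, 0, 0],
     [0, 1, 0],
     [1, 0, 1]]

line1_pts = [[0, 0, 1], [1, 1, 1]]   # on y = x
line2_pts = [[0, 1, 1], [1, 2, 1]]   # on y = x + 1, parallel to the first

q_before = cross(cross(*line1_pts), cross(*line2_pts))

l1_new = cross(*[matvec(H, p) for p in line1_pts])
l2_new = cross(*[matvec(H, p) for p in line2_pts])
q_after = cross(l1_new, l2_new)

# Scaling H changes the homogeneous output but not the Cartesian point
p = matvec(H, [1, 2, 1])
p2 = matvec([[2 * h for h in row] for row in H], [1, 2, 1])
same_point = (p[0] / p[2], p[1] / p[2]) == (p2[0] / p2[2], p2[1] / p2[2])
```

Before the transform the intersection has third co-ordinate 0; afterwards it is a finite point. The last check illustrates why the 9 entries of \(H\) give only 8 degrees of freedom.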

- Defines mapping of co-ordinates of corresponding points in two images taken from different views:
- If all corresponding points lie on a plane in the world.
- If only the camera orientation has changed in two views (center is at the same place)