Epipolar Geometry and
Stereo Vision
Computer Vision
CS 543 / ECE 549
University of Illinois
Derek Hoiem
04/12/11
Many slides adapted from Lana Lazebnik, Silvio Saverese, Steve Seitz, many figures from Hartley & Zisserman
HW 3 is back
Stats
• HW1: mean= 93, quartile= 91, median= 97
• HW2: mean= 89, quartile= 86, median= 96
• HW3: mean= 94, quartile= 89, median= 99
Summary
• Most homeworks were basically correct
• Problem 1b: u2, v2 need to account for scale and orientation
• Problem 1c: main causes of error were perspective and
multiple objects
• Problem 3: some extra credit possible
• Problem 4: sometimes wanted more detail
This class: Two-View Geometry
• Epipolar geometry
– Relates cameras from two views
• Stereo depth estimation
– Recover depth from two images
Depth from Stereo
• Goal: recover depth by finding image coordinate x’
that corresponds to x
f
x
x’
Baseline
B
z
C
C’
X
f
X
x
x'
Depth from Stereo
• Goal: recover depth by finding image coordinate x’ that
corresponds to x
• Problems
– Calibration: How do we recover the relation of the cameras (if
not already known)?
– Correspondence: How do we search for the matching point x’?
f
x
x’
Baseline
B
z
C
C’
X
f
X
x
x'
Correspondence Problem
• We have two images taken from cameras at different
positions
• How do we match a point in the first image to a point in the
second? What constraints do we have?
Potential matches for
x have to lie on the corresponding line
l’.
Potential matches for
x’ have to lie on the corresponding line
l.
Key idea: Epipolar constraint
x
x’
X
x’
X
x’
X
•
Epipolar Plane – plane containing baseline (1D family)
•
Epipoles
= intersections of baseline with image planes
= projections of the other camera center
•
Baseline – line connecting the two camera centers
Epipolar geometry: notation
X
x
x’
•
Epipolar Lines - intersections of epipolar plane with image
planes (always come in corresponding pairs)
Epipolar geometry: notation
X
x
x’
•
Epipolar Plane – plane containing baseline (1D family)
•
Epipoles
= intersections of baseline with image planes
= projections of the other camera center
•
Baseline – line connecting the two camera centers
Example: Converging cameras
Example: Motion parallel to image plane
Example: Forward motion
What would the epipolar lines look like if the
camera moves directly forward?
e
e’
Example: Forward motion
Epipole has same coordinates in both
images.
Points move along lines radiating from e:
“Focus of expansion”
X
x
x’
Epipolar constraint: Calibrated case
Suppose that we know the intrinsic and extrinsic parameters of the
cameras. Then we can…
1. Convert to normalized coordinates by pre-multiplying all points with the
inverse of the calibration matrix
2. Set the first camera’s coordinate system as world coordinates and define
R and t that map from X’ to X
XxKx
1
=
= -
ˆ
XxKx
1
′
=′
′
=′ -
ˆ
Here,
x is in homogeneous
coordinates but
X is in
inhomogeneous coordinates
tXRX
+′
=
X
Epipolar constraint: Calibrated case
R
t
The vectors , , and
are coplanar
xRˆ′
t
xˆ
XxKx
1
=
= -
ˆ
XxKx
1
′
=′
′
=′ -
ˆ
tXRX
+′
=
txR
+′
= ˆ
x
x’
Essential Matrix
(Longuet-Higgins, 1981)
Essential matrix
0)]
ˆ
([
ˆ
=
′
×
∙
xRtx
RtE
xEx
T
×
=
=′
with
0
ˆ
ˆ
X
x
x’
The vectors , , and
are coplanar
xR′
t
xˆ
X
x
x’
Properties of the Essential matrix
•
E x’ is the epipolar line associated with
x’ (
l = E x’)
•
ETx is the epipolar line associated with
x (
l’ = ETx)
•
E e’ = 0 and
ETe = 0
•
E is singular (rank two)
•
E has five degrees of freedom
– (3 for R, 2 for t because it’s up to a scale)
0)]
ˆ
([
ˆ
=
′
×
∙
xRtx
RtE
xEx
T
×
=
=′
with
0
ˆ
ˆ
Drop ^ below to simplify notation
Epipolar constraint: Uncalibrated case
• If we don’t know
K and
K’, then we can
write the epipolar constraint in terms of
unknown normalized coordinates:
X
x
x’
0
ˆ
ˆ
=′
xEx
T
xKxxKx′′
=′
=
ˆ
,ˆ
The Fundamental Matrix
X
x
x’
Fundamental Matrix
(Faugeras and Luong, 1992)
0
ˆ
ˆ
=′
xEx
T
xKx
xKx
′′
=′
=
ˆ
ˆ
1
with
0
-
-
′
=
=′
KEKF
xFx
T
T
Properties of the Fundamental matrix
1
with
0
-
-
′
=
=′
KEKF
xFx
T
T
•
F x’ is the epipolar line associated with
x’ (
l = F x’)
•
FTx is the epipolar line associated with
x (
l’ = FTx)
•
F e’ = 0 and
FTe = 0
•
F is singular (rank two): det(F)=0
•
F has seven degrees of freedom
X
x
x’
Estimating the Fundamental Matrix
• 8-point algorithm
– Least squares solution using SVD on equations from 8 pairs of
correspondences
– Enforce det(F)=0 constraint using SVD on F
• 7-point algorithm
– Use least squares to solve for null space (two vectors) using SVD
and 7 pairs of correspondences
– Solve for linear combination of null space vectors that satisfies
det(F)=0
• Minimize reprojection error
– Non-linear least squares
8-point algorithm
1. Solve a system of homogeneous linear
equations
a. Write down the system of equations
0
=′
xx
F
T
8-point algorithm
1. Solve a system of homogeneous linear
equations
a. Write down the system of equations
b. Solve
f from A
f=
0 using SVD
Matlab:
[U, S, V] = svd(A);
f = V(:, end);
F = reshape(f, [3 3])’;
Need to enforce singularity constraint
8-point algorithm
1. Solve a system of homogeneous linear
equations
a. Write down the system of equations
b. Solve
f from A
f=
0 using SVD
2. Resolve det(F) = 0 constraint by SVD
Matlab:
[U, S, V] = svd(A);
f = V(:, end);
F = reshape(f, [3 3])’;
Matlab:
[U, S, V] = svd(F);
S(3,3) = 0;
F = U*S*V’;
8-point algorithm
1. Solve a system of homogeneous linear equations
a. Write down the system of equations
b. Solve
f from A
f=
0 using SVD
2. Resolve det(F) = 0 constraint by SVD
Notes:
• Use RANSAC to deal with outliers (sample 8 points)
• Solve in normalized coordinates
– mean=0
– RMS distance = (1,1,1)
– just like with estimating the homography for stitching
Comparison of homography estimation and the
8-point algorithm
Homography (No Translation) Fundamental Matrix (Translation)
Assume we have matched points x x’ with outliers
Homography (No Translation)
• Correspondence Relation
1. Normalize image
coordinates
2. RANSAC with 4 points
3. De-normalize:
0
Hxx
Hx
x
=
×
⇒
=
'
'
Tx
x =
~
xTx′′
=′
~
THTH
~1-′
=
Fundamental Matrix (Translation)
Comparison of homography estimation and the
8-point algorithm
Assume we have matched points x x’ with outliers
Comparison of homography estimation and the
8-point algorithm
Homography (No Translation) Fundamental Matrix (Translation)
• Correspondence Relation
1. Normalize image
coordinates
2. RANSAC with 8 points
3. Enforce
by SVD
4. De-normalize:
• Correspondence Relation
1. Normalize image
coordinates
2. RANSAC with 4 points
3. De-normalize:
Assume we have matched points x x’ with outliers
0
Hxx
Hx
x
=
×
⇒
=
'
'
Tx
x =
~
xTx′′
=′
~
THTH
~1-′
=
Tx
x =
~
xTx′′
=′
~
TFTF
~1
-′
=
( ) 0
~
det
=
F
0
=
′
Fxx
T
7-point algorithm
Faster (need fewer points) and could be more robust (fewer
points), but also need to check for degenerate cases
“Gold standard” algorithm
• Use 8-point algorithm to get initial value of F
• Use F to solve for P and P’ (discussed later)
• Jointly solve for 3d points
X and
F that
minimize the squared re-projection error
X
x
x'
See Algorithm 11.2 and Algorithm 11.3 in HZ (pages 284-285) for details
Comparison of estimation algorithms
8-point
Normalized 8-point
Nonlinear least squares
Av. Dist. 1
2.33 pixels
0.92 pixel
0.86 pixel
Av. Dist. 2
2.18 pixels
0.85 pixel
0.80 pixel
We can get projection matrices P and P’ up
to a projective ambiguity
Code:
function P = vgg_P_from_F(F)
[U,S,V] = svd(F);
e = U(:,3);
P = [-vgg_contreps(e)*F e];
[ ]
0IP|
=
[ ][
]
e|Fe
P
′
′
=′
×
0
=
′
Fe
T
See HZ p. 255-256
K’*translation
K’*rotation
From epipolar geometry to camera
calibration
• Estimating the fundamental matrix is known as “weak
calibration”
• If we know the calibration matrices of the two cameras, we
can estimate the essential matrix:
E = KTFK’
• The essential matrix gives us the relative rotation and
translation between the cameras, or their extrinsic
parameters
Moving on to stereo…
Fuse a calibrated binocular stereo pair to
produce a depth image
image 1
image 2
Dense depth map
Many of these slides adapted from
Steve Seitz and Lana Lazebnik
Basic stereo matching algorithm
• For each pixel in the first image
– Find corresponding epipolar line in the right image
– Examine all pixels on the epipolar line and pick the best match
– Triangulate the matches to get depth information
• Simplest case: epipolar lines are scanlines
– When does this happen?
Simplest Case: Parallel images
• Image planes of cameras are
parallel to each other and to
the baseline
• Camera centers are at same
height
• Focal lengths are the same
Simplest Case: Parallel images
• Image planes of cameras are
parallel to each other and to
the baseline
• Camera centers are at same
height
• Focal lengths are the same
• Then, epipolar lines fall along
the horizontal scan lines of the
images
Simplest Case: Parallel images
RtE
xEx
T
×
=
=′,0
│
│
│
⌋
⌉
│
│
│
⌊
⌈
-
=
×
=
0
0
00
000
T
T
RtE
Epipolar constraint:
(
)
(
)
vTTv
vT
T
vu
v
u
T
T
vu
′
=
=
│
│
│
⎠
⎞
│
│
│
⎝
⎛
′
-
=
│
│
│
⎠
⎞
│
│
│
⎝
⎛
′
′
│
│
│
⌋
⌉
│
│
│
⌊
⌈
-
0
0
1
0
10
0
00
000
1
R = I t = (
T, 0, 0)
The y-coordinates of corresponding points are the same
t
x
x’
Depth from disparity
f
x’
Baseline
B
z
O
O’
X
f
z
fB
xx
disparity
∙
=′
-
=
Disparity is inversely proportional to depth.
x
z
f
OO
xx
=
′
-
′
-
Stereo image rectification
Stereo image rectification
• Reproject image planes
onto a common plane
parallel to the line
between camera centers
• Pixel motion is horizontal
after this transformation
• Two homographies (3x3
transform), one for each
input image reprojection
➢ C. Loop and Z. Zhang. Computing
Rectifying Homographies for Stereo
Vision. IEEE Conf. Computer Vision
and Pattern Recognition, 1999.
Rectification example
Basic stereo matching algorithm
• If necessary, rectify the two stereo images to transform
epipolar lines into scanlines
• For each pixel x in the first image
– Find corresponding epipolar scanline in the right image
– Examine all pixels on the scanline and pick the best match x’
– Compute disparity x-x’ and set depth(x) = fB/(x-x’)
Matching cost
disparity
Left
Right
scanline
Correspondence search
• Slide a window along the right scanline and
compare contents of that window with the
reference window in the left image
• Matching cost: SSD or normalized correlation
Left
Right
scanline
Correspondence search
SSD
Left
Right
scanline
Correspondence search
Norm. corr
Effect of window size
W = 3
W = 20
• Smaller window
+ More detail
– More noise
• Larger window
+ Smoother disparity maps
– Less detail
Failures of correspondence search
Textureless surfaces
Occlusions, repetition
Non-Lambertian surfaces, specularities
Results with window search
Window-based matching
Ground truth
Data
How can we improve window-based
matching?
• So far, matches are independent for each
point
• What constraints or priors can we add?
Stereo constraints/priors
• Uniqueness
– For any point in one image, there should be at
most one matching point in the other image
Stereo constraints/priors
• Uniqueness
– For any point in one image, there should be at most
one matching point in the other image
• Ordering
– Corresponding points should be in the same order in
both views
Stereo constraints/priors
• Uniqueness
– For any point in one image, there should be at most
one matching point in the other image
• Ordering
– Corresponding points should be in the same order in
both views
Ordering constraint doesn’t hold
Priors and constraints
• Uniqueness
– For any point in one image, there should be at most one
matching point in the other image
• Ordering
– Corresponding points should be in the same order in both
views
• Smoothness
– We expect disparity values to change slowly (for the most
part)
Stereo matching as energy minimization
I
1
I
2
D
• Energy functions of this form can be minimized
using
graph cuts
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy Minimization
via Graph Cuts, PAMI 2001
W
1
(i)
W
2
(i+D(i))
D(
i)
)(
),;(
smooth
2
1
data
D
E
IID
EE
β
+
=
2
,
neighbors
smooth
)()(
∑
-
=
ji
jDiD
E
(
)2
2
1
data
))(
(
)(
∑
+
-
=
i
iDiWiW
E
Many of these constraints can be encoded in an energy
function and solved using graph cuts
Graph cuts
Ground truth
For the latest and greatest: http://www.middlebury.edu/stereo/
Y. Boykov, O. Veksler, and R. Zabih, Fast Approximate Energy
Minimization via Graph Cuts, PAMI 2001
Before
Summary
• Epipolar geometry
– Epipoles are intersection of baseline with image planes
– Matching point in second image is on a line passing through its
epipole
– Fundamental matrix maps from a point in one image to a line
(its epipolar line) in the other
– Can solve for F given corresponding points (e.g., interest points)
– Can recover canonical camera matrices from F (with projective
ambiguity)
• Stereo depth estimation
– Estimate disparity by finding corresponding points along
scanlines
– Depth is inverse to disparity
Next class: structure from motion