Homography transformations describe the change in projected position of points on a rigidly moving plane observed between 2D images.
A homography H ∈ ℝ³ˣ³
transforms a point (x y)∈ℝ²
to a new point (x' y')∈ℝ²
by linear transformation in homogeneous coordinates:
/ x'/w \ / h₁₁ h₁₂ h₁₃ \ / xᵢ \
| y'/w | = | h₂₁ h₂₂ h₂₃ | | yᵢ |
\ w / \ h₃₁ h₃₂ h₃₃ / \ 1 /
We can pose the problem of finding the homography H
which matches a set of input points (xᵢ yᵢ)∈ℝ²
to corresponding target points (x'ᵢ y'ᵢ)∈ℝ²
:
‖ / x'ᵢ \ - / (h₃₁ h₃₂ h₃₃ )/ xᵢ \ \⁻¹ /h₁₁ h₁₂ h₁₃ \/ xᵢ \ ‖²
min ∑ᵢ ‖ \ y'ᵢ / | | yᵢ | | \h₂₁ h₂₂ h₂₃ /| yᵢ | ‖
H ‖ \ \ 1 / / \ 1 / ‖
The term inside the square is non-linear in H
. The common adjustment is to multiply this term through so that the result is a linear least-squares problem:
‖ / (h₃₁ h₃₂ h₃₃ )/ xᵢ \ \ / x'ᵢ \ - /h₁₁ h₁₂ h₁₃ \/ xᵢ \ ‖²
min ∑ᵢ ‖ | | yᵢ | | \ y'ᵢ / \h₂₁ h₂₂ h₂₃ /| yᵢ | ‖
H ‖ \ \ 1 / / \ 1 / ‖
This can be rearranged as:
‖ / -xᵢ -yᵢ -1 0 0 0 x'ᵢxᵢ - x'ᵢyᵢ -x'ᵢ \ / h₁₁ \ ‖²
‖ \ 0 0 0 -xᵢ -yᵢ -1 y'ᵢxᵢ - y'ᵢyᵢ -y'ᵢ / | h₁₂ | ‖
‖ ^---------------------.---------------------^ | h₁₃ | ‖
‖ Aᵢ | h₂₁ | ‖
min ∑ᵢ ‖ | h₂₂ | ‖
H ‖ | h₂₃ | ‖
‖ | h₃₁ | ‖
‖ | h₃₂ | ‖
‖ \ h₃₃ / ‖
and further simplified as:
min (h₁₁ h₁₂ h₁₃ h₂₁ h₂₂ h₂₃ h₃₁ h₃₂ h₃₃) (∑ᵢ AᵢᵀAᵢ) / h₁₁ \
H ^----Q---^ | h₁₂ |
| h₁₃ |
| h₂₁ |
| h₂₂ |
| h₂₃ |
| h₃₁ |
| h₃₂ |
\ h₃₃ /
To avoid the trivial solution H=0
, we take the first eigen vector of Q
, which corresponds to the minimizer subject to the constraint H:H = 1
. Other sources suggest constraining h₃₃=1
.
Unfortunately, this solution does not minimize the original least-squares problem. The trouble started when we multiplied the term inside the square. This can be understood instead by introducing a per-point bias wᵢ²
:
‖ / x'ᵢ \ - / (h₃₁ h₃₂ h₃₃ )/ xᵢ \ \⁻¹ /h₁₁ h₁₂ h₁₃ \/ xᵢ \ ‖²
min ∑ᵢ wᵢ² ‖ \ y'ᵢ / | | yᵢ | | \h₂₁ h₂₂ h₂₃ /| yᵢ | ‖
‖ \ \ 1 / / \ 1 / ‖
where
wᵢ = (h₃₁ h₃₂ h₃₃ )/ xᵢ \
| yᵢ |
\ 1 /
Perhaps without noticing it, we've placed more weight on reproducing points with a large homogeneous coordinate. The more perspective distortion in the observation, the stronger this bias will be.
Alternatively, we could simply minimize the original least-squares problem directly using Gauss-Newton method:
‖ / x'ᵢ \ - / (h₃₁ h₃₂ h₃₃ )/ xᵢ \ \⁻¹ /h₁₁ h₁₂ h₁₃ \/ xᵢ \ ‖²
min ∑ᵢ ‖ \ y'ᵢ / | | yᵢ | | \h₂₁ h₂₂ h₂₃ /| yᵢ | ‖
H ‖ \ \ 1 / / \ 1 / ‖
^----------------------------Fᵢ(H)----------------------^
Given some current guess H
we can define the per-point Jacobians Jᵢ
as:
Jᵢ = ∂Fᵢ/∂H = (tᵢ zᵢ) ∈ ℝ²ˣ⁹
tᵢ = / (h₃₁ h₃₂ h₃₃ )/ xᵢ \ \⁻¹ / xᵢ yᵢ 1 0 0 0 \
| | yᵢ | | \ 0 0 0 xᵢ yᵢ 1 /
\ \ 1 / /
zᵢ = / (h₃₁ h₃₂ h₃₃ )/ xᵢ \ \⁻² / xᵢ yᵢ 1 \ ⊙ /h₁₁ h₁₂ h₁₃ \/ xᵢ \
| | yᵢ | | \ xᵢ yᵢ 1 / \h₂₁ h₂₂ h₂₃ /| yᵢ |
\ \ 1 / / \ 1 /
Then the Gauss-Newton update is written as:
H ← H - (∑ᵢ JᵢᵀJᵢ)⁺ (∑ᵢ Jᵢᵀ Fᵢ)
where M⁺
is the pseudo-inverse of M
.
It seems sufficient to initialize H = identity
and take just 3 or 4 Gauss-Newton steps, implying that this should just be a small constant factor slower than non-optimal Eigen-decomposition method above. In many problems, when a better initial guess may be available, a single step may be enough.
It's not immediately clear to me whether the original least squares problem is convex over H
. If not, then there should be some concern of finding a local rather than global minimum.