SDM and its Applications to Face Alignment
Paper notes
Derivation of SDM
Given an image $d \in \mathbb{R}^{m \times 1}$ of $m$ pixels, $d(x) \in \mathbb{R}^{p \times 1}$ indexes $p$ landmarks in the image. $h$ is a non-linear feature extraction function (e.g., SIFT), and $h(d(x)) \in \mathbb{R}^{128p \times 1}$ in the case of SIFT.
During training, we assume that the correct $p$ landmarks (66 in our case) are known, and we refer to them as $x_*$. A face detector is run on the training images to provide an initial configuration of the landmarks, $x_0$, which corresponds to an average shape.
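In this setting, face alignment can be framed as minimizing the following function over $\Delta x$:
$$f(x_0 + \Delta x) = \left\| h(d(x_0 + \Delta x)) - \phi_* \right\|_2^2$$
where $\phi_* = h(d(x_*))$ is the feature vector extracted at the correct landmarks.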
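For derivation purposes, we assume that $h$ is twice differentiable and apply a second-order Taylor expansion:
$$f(x_0 + \Delta x) \approx f(x_0) + J_f(x_0)^T \Delta x + \frac{1}{2} \Delta x^T H(x_0) \Delta x$$
where $J_f(x_0) \in \mathbb{R}^{p \times 1}$ and $H(x_0) \in \mathbb{R}^{p \times p}$ are the Jacobian and Hessian of $f$ evaluated at $x_0$.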
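The step from the Taylor expansion to a linear update in the features can be filled in with the usual Newton argument. The block below is a sketch of that step, not a quotation from these notes; $J_h$ (the Jacobian of $h$ at $x_0$), $\phi_0 = h(d(x_0))$, $R_0$ and $b_0$ are symbols introduced here for illustration.

```latex
% Differentiate the quadratic approximation with respect to \Delta x and set it to zero:
J_f(x_0) + H(x_0)\,\Delta x = 0
\quad\Longrightarrow\quad
\Delta x_1 = -H(x_0)^{-1} J_f(x_0)

% Chain rule on f = \| h(d(x)) - \phi_* \|_2^2, with \phi_0 = h(d(x_0)):
J_f(x_0) = 2\, J_h^{T} (\phi_0 - \phi_*)

% Substituting gives a step that is affine in the current features \phi_0:
\Delta x_1 = -2\, H^{-1} J_h^{T} (\phi_0 - \phi_*)
           = \underbrace{-2\, H^{-1} J_h^{T}}_{R_0}\, \phi_0
           + \underbrace{2\, H^{-1} J_h^{T} \phi_*}_{b_0}
```

Since $\phi_*$ is unknown at test time and $H$ and $J_h$ are expensive to compute, it is natural to estimate $R_0$ and $b_0$ from training data instead of evaluating them analytically.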
SDM learns a sequence of generic descent directions $\{R_k\}$ and bias terms $\{b_k\}$,
$$x_k = x_{k-1} + R_{k-1} \phi_{k-1} + b_{k-1}, \qquad \phi_{k-1} = h(d(x_{k-1})),$$
such that the sequence $x_k$ converges to $x_*$.
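The cascaded update can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the stages have already been learned; `apply_sdm`, `stages` and `extract_features` are hypothetical names, and the toy feature function below stands in for a real descriptor such as SIFT.

```python
import numpy as np

def apply_sdm(x0, stages, extract_features):
    """Apply x_k = x_{k-1} + R_{k-1} phi_{k-1} + b_{k-1} for each learned stage.

    x0               -- initial landmark vector (e.g. the average shape)
    stages           -- list of (R, b) pairs, one per cascade step (assumed given)
    extract_features -- hypothetical stand-in for h(d(x)); maps landmarks to features
    """
    x = np.asarray(x0, dtype=float).copy()
    for R, b in stages:
        phi = extract_features(x)   # phi_{k-1} = h(d(x_{k-1}))
        x = x + R @ phi + b         # x_k
    return x

# Toy check: with features phi = x - x_star and R = -0.5 * I, b = 0,
# every stage halves the remaining error.
x_star = np.array([1.0, 2.0, 3.0, 4.0])
toy_features = lambda x: x - x_star
stages = [(-0.5 * np.eye(4), np.zeros(4))] * 3
x3 = apply_sdm(np.zeros(4), stages, toy_features)
```

In the toy setting the error shrinks by a factor of 8 after three stages; with real image features the contraction depends entirely on the learned $R_k$ and $b_k$.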
Learning for SDM
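For each step $k$, $R_k$ and $b_k$ are obtained by minimizing the regression error over all training images $d^i$ and all current landmark estimates $x_k^i$:
$$\arg\min_{R_k,\, b_k} \sum_{d^i} \sum_{x_k^i} \left\| \Delta x_*^{ki} - R_k \phi_k^i - b_k \right\|^2$$
where $\Delta x_*^{ki} = x_*^i - x_k^i$ and $\phi_k^i$ is the feature vector extracted from image $d^i$ at $x_k^i$.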
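Because the objective is linear in $R_k$ and $b_k$, each stage reduces to an ordinary least-squares problem. The sketch below shows one way to solve it with NumPy, assuming the features and target displacements have already been stacked row by row (one row per image and landmark estimate); `learn_stage`, `Phi` and `dX` are hypothetical names, and the synthetic data only checks that the solver recovers a known linear map.

```python
import numpy as np

def learn_stage(Phi, dX):
    """Solve argmin_{R,b} sum_i || dX[i] - R @ Phi[i] - b ||^2 in closed form.

    Phi -- (n, F) array, row i holds the features phi_k^i
    dX  -- (n, D) array, row i holds the target displacement x_*^i - x_k^i
    Returns R with shape (D, F) and b with shape (D,).
    """
    Phi = np.asarray(Phi, dtype=float)
    dX = np.asarray(dX, dtype=float)
    # Append a constant column so the bias is estimated jointly with R.
    A = np.hstack([Phi, np.ones((Phi.shape[0], 1))])
    W, *_ = np.linalg.lstsq(A, dX, rcond=None)   # W has shape (F + 1, D)
    return W[:-1].T, W[-1]

# Synthetic, noise-free data generated from a known (R_true, b_true).
rng = np.random.default_rng(0)
R_true = rng.normal(size=(4, 6))
b_true = rng.normal(size=4)
Phi = rng.normal(size=(200, 6))
dX = Phi @ R_true.T + b_true
R_hat, b_hat = learn_stage(Phi, dX)
```

To train a full cascade, one would update every $x_k^i$ with the new $(R_k, b_k)$, recompute the features, and repeat. With high-dimensional descriptors a regularized solver may be preferable, which is an assumption beyond what these notes state.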