SDM and its Applications to Face Alignment
Paper Notes
Derivation of SDM
Given an image $d \in \mathbb{R}^{m \times 1}$ of $m$ pixels, $d(x) \in \mathbb{R}^{p \times 1}$ indexes $p$ landmarks in the image. $h$ is a non-linear feature extraction function (e.g., SIFT); with SIFT, $h(d(x)) \in \mathbb{R}^{128p \times 1}$.
During training, we assume that the correct $p$ landmarks (in our case 66) are known, and we refer to them as $x_*$. A face detector is run on the training images to provide an initial configuration of the landmarks $x_0$, which corresponds to an average shape.
In this setting, face alignment can be framed as minimizing the following function over $\Delta x$:
$$f(x_0 + \Delta x) = \left\| h(d(x_0 + \Delta x)) - \phi_* \right\|_2^2$$
where $\phi_* = h(d(x_*))$.
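To make the objective concrete, here is a minimal sketch that evaluates $f(x_0 + \Delta x)$ numerically. The names `alignment_objective` and `toy_features` are hypothetical, and the toy feature function (a `tanh` of the offset from a hidden true shape) is only a stand-in for a real descriptor such as SIFT.

```python
import numpy as np

def alignment_objective(image, x0, dx, phi_star, extract_features):
    """Evaluate f(x0 + dx) = || h(d(x0 + dx)) - phi_star ||_2^2."""
    residual = extract_features(image, x0 + dx) - phi_star
    return float(residual @ residual)

# Toy setup (assumption): the "image" is just the hidden true shape, and the
# feature function is a smooth non-linear function of the landmark offset.
x_true = np.array([0.3, -1.2, 2.0])
toy_image = x_true
toy_features = lambda image, x: np.tanh(x - image)

phi_star = toy_features(toy_image, x_true)   # features at the correct landmarks
x_init = np.zeros(3)                         # stand-in for the average shape x0

f_start = alignment_objective(toy_image, x_init, np.zeros(3), phi_star, toy_features)
f_best = alignment_objective(toy_image, x_init, x_true - x_init, phi_star, toy_features)
```

In this toy setting the objective is zero exactly when $\Delta x = x_* - x_0$, so `f_best` is zero while `f_start` is positive.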
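For derivation purposes, we assume that $h$ is twice differentiable and apply a second-order Taylor expansion:
$$f(x_0 + \Delta x) \approx f(x_0) + J_f(x_0)^T \Delta x + \frac{1}{2} \Delta x^T H(x_0) \Delta x$$
where $J_f(x_0) \in \mathbb{R}^{p \times 1}$ and $H(x_0) \in \mathbb{R}^{p \times p}$ are the Jacobian and Hessian of $f$ evaluated at $x_0$.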
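As a worked step connecting the expansion to a linear update, the Newton step can be sketched as follows. Here $\phi_0 = h(d(x_0))$ and $J_h$ (the Jacobian of $h$ evaluated at $x_0$) are notation introduced for this sketch. Differentiating the right-hand side of the expansion with respect to $\Delta x$ and setting the result to zero gives

$$
\begin{aligned}
J_f(x_0) + H(x_0)\,\Delta x_1 &= 0 \\
\Delta x_1 &= -H(x_0)^{-1} J_f(x_0) \\
&= -2\,H(x_0)^{-1} J_h^T\,(\phi_0 - \phi_*)
\end{aligned}
$$

where the last line uses the chain rule, $J_f(x_0) = 2 J_h^T (\phi_0 - \phi_*)$. Grouping terms, the step has the form $\Delta x_1 = R_0 \phi_0 + b_0$ with $R_0 = -2 H(x_0)^{-1} J_h^T$ and $b_0 = 2 H(x_0)^{-1} J_h^T \phi_*$, that is, a linear function of the features extracted at the current landmark estimate.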
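SDM learns a sequence of generic descent directions $\{R_k\}$ and bias terms $\{b_k\}$,
$$x_k = x_{k-1} + R_{k-1} \phi_{k-1} + b_{k-1}$$
where $\phi_{k-1} = h(d(x_{k-1}))$ is the feature vector extracted at the previous landmark estimate, such that the sequence $x_k$ converges to $x_*$.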
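The update rule can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: `sdm_apply` and `offset_features` are hypothetical names, and the hand-picked regressors below stand in for learned ones.

```python
import numpy as np

def sdm_apply(image, x0, extract_features, regressors):
    """Apply x_k = x_{k-1} + R_{k-1} phi_{k-1} + b_{k-1} for each learned stage."""
    x = np.asarray(x0, dtype=float).copy()
    for R, b in regressors:
        phi = extract_features(image, x)  # phi_{k-1} = h(d(x_{k-1}))
        x = x + R @ phi + b
    return x

# Toy check (assumption): the "image" is the true shape itself and the
# features are the raw offset x - image, so R = -0.5 * I halves the error
# at every stage.
toy_shape = np.array([1.0, -2.0, 3.0])
offset_features = lambda image, x: x - image
toy_regressors = [(-0.5 * np.eye(3), np.zeros(3)) for _ in range(20)]

x_final = sdm_apply(toy_shape, np.zeros(3), offset_features, toy_regressors)
```

With real descriptors the regressors are not known in closed form and have to be learned from training data.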
Learning for SDM
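At each stage $k$, $R_k$ and $b_k$ are learned by minimizing, over all training images $d^i$ and all current landmark estimates $x_k^i$,
$$\arg\min_{R_k, b_k} \sum_{d^i} \sum_{x_k^i} \left\| \Delta x_*^{ki} - R_k \phi_k^i - b_k \right\|^2$$
where $\Delta x_*^{ki} = x_*^i - x_k^i$.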
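Because the objective is linear in $R_k$ and $b_k$, each stage reduces to an ordinary least-squares problem. The sketch below assumes one landmark estimate per training image and a toy feature function. `learn_stage`, `train_sdm`, and `tanh_features` are hypothetical names, and plain `np.linalg.lstsq` is used without any regularization.

```python
import numpy as np

def learn_stage(Phi, dX):
    """Solve min over (R, b) of sum_i || dX_i - R Phi_i - b ||^2.

    Phi: (N, F) features, dX: (N, D) targets x_* - x_k.
    Returns R with shape (D, F) and b with shape (D,).
    """
    A = np.hstack([Phi, np.ones((Phi.shape[0], 1))])  # extra column for the bias
    W, *_ = np.linalg.lstsq(A, dX, rcond=None)
    return W[:-1].T, W[-1]

def train_sdm(images, X0, X_star, extract_features, n_stages=3):
    """Learn a cascade of (R_k, b_k), updating the training shapes after each stage."""
    X = np.array(X0, dtype=float)
    regressors = []
    errors = [np.mean((X_star - X) ** 2)]
    for _ in range(n_stages):
        Phi = np.stack([extract_features(d, x) for d, x in zip(images, X)])
        R, b = learn_stage(Phi, X_star - X)
        regressors.append((R, b))
        X = X + Phi @ R.T + b  # x_{k+1} = x_k + R_k phi_k + b_k
        errors.append(np.mean((X_star - X) ** 2))
    return regressors, errors

# Toy data (assumption): each "image" is its own true shape, and the features
# are a non-linear function of the offset between estimate and truth.
rng = np.random.default_rng(0)
train_shapes = rng.normal(size=(200, 4))
init_shapes = train_shapes + 0.5 * rng.normal(size=(200, 4))
tanh_features = lambda image, x: np.tanh(x - image)

learned, train_errors = train_sdm(train_shapes, init_shapes, train_shapes, tanh_features)
```

Since $R_k = 0$, $b_k = 0$ is always a feasible solution, the mean squared training error in `train_errors` cannot increase from one stage to the next.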