@nrailgun 2015-11-05T21:15:34Z

Newton Method and Quasi Newton Method

Machine Learning


Newton Method

Consider the unconstrained optimization problem

$$\min_{x \in \mathbb{R}^n} f(x),$$

where $x^*$ denotes the minimizer of the objective function.

Assuming $f(x)$ is twice differentiable, and $x^{(k)}$ is the value of $x$ in the $k$-th iteration, the second-order Taylor expansion of $f(x)$ around $x^{(k)}$ is

$$f(x) = f(x^{(k)}) + g_k^T (x - x^{(k)}) + \frac{1}{2} (x - x^{(k)})^T H(x^{(k)}) (x - x^{(k)}),$$

where $g_k = g(x^{(k)}) = \nabla f(x^{(k)})$, and $H(x^{(k)})$ is the Hessian matrix of $f(x)$:

$$H(x) = \left[ \frac{\partial^2 f}{\partial x_i \partial x_j} \right]_{n \times n}.$$

Setting the gradient of this quadratic model to zero gives the Newton step: the direction $p_k$ solves $H_k p_k = -g_k$, and $x^{(k+1)} = x^{(k)} + p_k$.

Algorithm

Input: objective function $f(x)$, gradient $g(x) = \nabla f(x)$, Hessian matrix $H(x)$, precision $\epsilon$;
Output: minimizer $x^*$.

  1. Pick an initial point $x^{(0)}$, set $k = 0$.
  2. Calculate $g_k = g(x^{(k)})$.
  3. Stop if $\| g_k \| < \epsilon$; let $x^* = x^{(k)}$.
  4. Calculate $H_k = H(x^{(k)})$ and $p_k$ from
    $$H_k p_k = -g_k.$$
  5. Set $x^{(k+1)} = x^{(k)} + p_k$.
  6. Set $k = k + 1$, go to step (2).
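
A minimal NumPy sketch of the loop above, assuming the caller supplies callables `f`, `grad`, and `hess` (these names, and the toy quadratic used for the demo, are illustrative choices, not from the original); the step is obtained by solving $H_k p_k = -g_k$ rather than forming $H_k^{-1}$:

```python
import numpy as np

def newton_method(f, grad, hess, x0, eps=1e-8, max_iter=100):
    """Plain Newton iteration: x <- x + p, where H(x) p = -g(x)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < eps:        # step 3: stop when the gradient is small
            break
        p = np.linalg.solve(hess(x), -g)   # step 4: solve H_k p_k = -g_k
        x = x + p                          # step 5: take the full Newton step
    return x

# Toy example: f(x) = x1^2 + 10 x2^2, whose minimizer is the origin.
f    = lambda x: x[0]**2 + 10 * x[1]**2
grad = lambda x: np.array([2 * x[0], 20 * x[1]])
hess = lambda x: np.diag([2.0, 20.0])
print(newton_method(f, grad, hess, [3.0, -4.0]))   # -> [0. 0.]
```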

Quasi Newton Method

Calculating $H^{-1}$ is rather expensive. Quasi-Newton methods approximate $H_k^{-1}$ with $G_k = G(x^{(k)})$.

In Newton's method, we have

$$g_{k+1} - g_k = H_k (x^{(k+1)} - x^{(k)}).$$

Let $y_k = g_{k+1} - g_k$ and $\delta_k = x^{(k+1)} - x^{(k)}$; then

$$y_k = H_k \delta_k.$$

The equation above is called the quasi-Newton condition. Algorithms that pick $G_k$ to approximate $H_k^{-1}$, or $B_k$ to approximate $H_k$, subject to this condition are called quasi-Newton methods.
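
For reference, the condition can be read off from the first-order Taylor expansion of the gradient around $x^{(k)}$:

$$\nabla f(x^{(k+1)}) \approx \nabla f(x^{(k)}) + H(x^{(k)}) \left( x^{(k+1)} - x^{(k)} \right),$$

i.e. $y_k \approx H_k \delta_k$; a quasi-Newton approximation is then required to satisfy it exactly, $B_{k+1} \delta_k = y_k$ (equivalently $\delta_k = G_{k+1} y_k$).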

BFGS

Approximate $H$ with

$$B_{k+1} = B_k + \frac{y_k y_k^T}{y_k^T \delta_k} - \frac{B_k \delta_k \delta_k^T B_k}{\delta_k^T B_k \delta_k},$$

where $y_k = g_{k+1} - g_k$ and $\delta_k = x^{(k+1)} - x^{(k)}$.
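
As a quick sanity check, this rank-two update can be written in a few lines of NumPy; by construction $B_{k+1}$ satisfies the quasi-Newton condition $B_{k+1} \delta_k = y_k$, which the snippet below verifies on toy data (the helper name and the example matrices are illustrative, not from the original):

```python
import numpy as np

def bfgs_update(B, y, delta):
    """One BFGS update of the Hessian approximation B_k -> B_{k+1}."""
    Bd = B @ delta
    return B + np.outer(y, y) / (y @ delta) - np.outer(Bd, Bd) / (delta @ Bd)

# Pretend the true Hessian is H, so y_k = H @ delta_k (hence y^T delta > 0).
H = np.diag([2.0, 5.0, 9.0])
delta = np.array([1.0, -2.0, 0.5])
y = H @ delta
B_next = bfgs_update(B=np.eye(3), y=y, delta=delta)
print(np.allclose(B_next @ delta, y))   # True: B_{k+1} delta_k = y_k
```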

Algorithm

Input: objective function $f(x)$, gradient $g(x) = \nabla f(x)$, precision $\epsilon$;
Output: minimizer $x^*$ of $f(x)$.

  1. Pick an initial point $x^{(0)}$, a positive-definite matrix $B_0$, set $k = 0$.
  2. Calculate $g_k = g(x^{(k)})$; stop if $\| g_k \| < \epsilon$ and let $x^* = x^{(k)}$.
  3. Calculate $p_k$ from $B_k p_k = -g_k$.
  4. 1D search:
    $$f(x^{(k)} + \lambda_k p_k) = \min_{\lambda \ge 0} f(x^{(k)} + \lambda p_k).$$
  5. Let $x^{(k+1)} = x^{(k)} + \lambda_k p_k$.
  6. Calculate $g_{k+1} = g(x^{(k+1)})$, then $B_{k+1}$:
    $$B_{k+1} = B_k + \frac{y_k y_k^T}{y_k^T \delta_k} - \frac{B_k \delta_k \delta_k^T B_k}{\delta_k^T B_k \delta_k}.$$
  7. Set $k = k + 1$, go to step (2).
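
A compact NumPy sketch of the whole procedure. It substitutes a backtracking (Armijo) line search for the exact 1D minimization in step 4 and skips the update when $y_k^T \delta_k$ is not positive, so that $B_k$ stays positive definite; these choices, the helper names, and the Rosenbrock test function are assumptions made here for illustration:

```python
import numpy as np

def bfgs(f, grad, x0, eps=1e-6, max_iter=200):
    """BFGS with B_k approximating the Hessian and a backtracking line search."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                       # step 1: B_0 = I is positive definite
    g = grad(x)
    for k in range(max_iter):
        if np.linalg.norm(g) < eps:          # step 2: stopping test
            break
        p = np.linalg.solve(B, -g)           # step 3: solve B_k p_k = -g_k
        lam = 1.0                            # step 4: backtracking instead of exact search
        while f(x + lam * p) > f(x) + 1e-4 * lam * (g @ p):
            lam *= 0.5
        x_new = x + lam * p                  # step 5: x_{k+1} = x_k + lambda_k p_k
        g_new = grad(x_new)
        delta, y = x_new - x, g_new - g
        if y @ delta > 1e-12:                # step 6: BFGS update (skip on bad curvature)
            Bd = B @ delta
            B = B + np.outer(y, y) / (y @ delta) - np.outer(Bd, Bd) / (delta @ Bd)
        x, g = x_new, g_new                  # step 7: next iteration
    return x

# Usage: minimize the Rosenbrock function, whose minimizer is (1, 1).
rosen      = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
rosen_grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                                 200 * (x[1] - x[0]**2)])
print(bfgs(rosen, rosen_grad, [-1.2, 1.0]))   # -> approximately [1. 1.]
```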