@nrailgun 2016-04-07T15:56:21.000000Z

Caffe Solver

Machine Learning


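The Caffe solvers are Stochastic Gradient Descent (SGD), AdaDelta, Adaptive Gradient (AdaGrad), Adam, Nesterov's Accelerated Gradient (Nesterov), and RMSprop.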

The optimization objective over all |D| data instances is:

$$L(W) = \frac{1}{|D|} \sum_{i}^{|D|} f_W\left(X^{(i)}\right) + \lambda \gamma(W)$$
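
As a minimal sketch of this objective in plain Python: the data term is the per-instance loss averaged over the dataset, and the regularizer is added with weight λ. The squared-error loss, the L2 regularizer, and the names `squared_error`, `l2_penalty`, and `objective` are illustrative assumptions, not something the formula or Caffe prescribes.

```python
def squared_error(w, x, y):
    """Illustrative per-instance loss f_W(X^(i)) for a linear model (an assumption)."""
    prediction = sum(wj * xj for wj, xj in zip(w, x))
    return 0.5 * (prediction - y) ** 2


def l2_penalty(w):
    """Illustrative regularizer gamma(W): half the squared L2 norm (an assumption)."""
    return 0.5 * sum(wj * wj for wj in w)


def objective(w, dataset, lam):
    """L(W) = (1/|D|) * sum_i f_W(X^(i)) + lam * gamma(W)."""
    data_term = sum(squared_error(w, x, y) for x, y in dataset) / len(dataset)
    return data_term + lam * l2_penalty(w)


# Toy dataset of (features, target) pairs, made up for illustration.
dataset = [([1.0, 2.0], 3.0), ([2.0, 0.0], 1.0)]
weights = [0.5, 1.0]
loss = objective(weights, dataset, lam=0.1)
```

Any other per-instance loss or regularizer can be swapped in without changing the structure of `objective`.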

SGD

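SGD updates the weights with a linear combination of the negative gradient $\nabla L(W_t)$ and the previous update $V_t$. The following formulas compute the update value $V_{t+1}$ and the updated weights $W_{t+1}$:

$$V_{t+1} = \mu V_t - \alpha \nabla L(W_t)$$

$$W_{t+1} = W_t + V_{t+1}$$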
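
The update rule can be sketched in a few lines of plain Python, assuming the usual gradient-descent convention in which the velocity accumulates the negative gradient scaled by the learning rate. The function names and the toy objective $L(W) = \frac{1}{2}\lVert W \rVert^2$ are hypothetical and chosen only so the example runs on its own; this is not Caffe's implementation.

```python
def sgd_momentum_step(w, v, grad, lr=0.01, momentum=0.9):
    """One update: V_{t+1} = momentum * V_t - lr * grad; W_{t+1} = W_t + V_{t+1}."""
    v_next = [momentum * vj - lr * gj for vj, gj in zip(v, grad)]
    w_next = [wj + vj for wj, vj in zip(w, v_next)]
    return w_next, v_next


def toy_gradient(w):
    """Gradient of the toy objective L(W) = 0.5 * ||W||^2, which is simply W."""
    return list(w)


w = [1.0, -2.0]   # initial weights W_0
v = [0.0, 0.0]    # initial update V_0
history = []      # objective value after each step
for _ in range(200):
    w, v = sgd_momentum_step(w, v, toy_gradient(w))
    history.append(0.5 * sum(wj * wj for wj in w))
```

On this toy problem the recorded objective values shrink toward zero, since the minimum is at $W = 0$.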

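A reasonable starting point is momentum $\mu = 0.9$ and learning rate $\alpha = 0.01$. After many iterations, momentum effectively multiplies the size of the updates by a factor of $\frac{1}{1-\mu}$, so if you increase $\mu$, it might be a good idea to decrease $\alpha$ accordingly.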
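
One way to see how μ and α interact is to hold the gradient constant and iterate the velocity update until it settles. In the sketch below (the function name and the constant gradient of 1.0 are assumptions for illustration), the velocity approaches $-\alpha g / (1 - \mu)$, so raising μ from 0.9 to 0.99 while cutting α from 0.01 to 0.001 leaves the long-run step size about the same.

```python
def steady_state_velocity(grad, lr, momentum, steps):
    """Iterate V <- momentum * V - lr * grad with a constant gradient."""
    v = 0.0
    for _ in range(steps):
        v = momentum * v - lr * grad
    return v


grad = 1.0  # constant gradient, an assumption made for illustration

# Limit is -0.01 / (1 - 0.9)
v_a = steady_state_velocity(grad, lr=0.01, momentum=0.9, steps=500)

# Limit is -0.001 / (1 - 0.99); higher momentum needs more steps to settle
v_b = steady_state_velocity(grad, lr=0.001, momentum=0.99, steps=5000)
```

Both runs settle near the same velocity, which is why a larger μ usually calls for a proportionally smaller α.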
