@nrailgun 2016-03-08T22:13:53Z

Avoid using Sigmoid

Machine Learning


Sigmoid used to be a popular activation function, but it should be avoided, for three reasons:

  1. Saturated neurons "kill" the gradients.
    For example, the gradient at x = 10 is σ'(10) ≈ 4.5e-5, essentially zero (see the numeric check after this list).
  2. Sigmoid outputs are not zero-centered.
    If the input to a neuron is always positive, then the gradients on w are either all positive or all negative, which forces inefficient zig-zag updates.
  3. exp() is somewhat expensive to compute.
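A quick numeric check of points 1 and 2 (a minimal sketch, assuming the standard definition σ(x) = 1/(1 + e^-x); the helper names `sigmoid` and `sigmoid_grad` are mine):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    # d/dx σ(x) = σ(x) * (1 - σ(x))
    s = sigmoid(x)
    return s * (1.0 - s)

# Point 1: saturation -- the gradient at |x| = 10 is essentially zero.
print(sigmoid_grad(10))   # ~4.54e-05
print(sigmoid_grad(-10))  # ~4.54e-05

# Point 2: not zero-centered -- every output lies in (0, 1), so the
# inputs to the next layer are always positive.
print([round(sigmoid(x), 3) for x in (-2, 0, 2)])  # [0.119, 0.5, 0.881]
```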

tanh squashes numbers to the range [-1, 1]. It is zero-centered (nice), but it still kills gradients when it saturates.
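The same check for tanh (again a sketch, using the standard derivative tanh'(x) = 1 - tanh(x)^2):

```python
import math

# tanh is zero-centered, but its gradient 1 - tanh(x)^2
# still vanishes for large |x|.
def tanh_grad(x):
    t = math.tanh(x)
    return 1.0 - t * t

print(math.tanh(0.0))  # 0.0 -> outputs are centered around zero
print(tanh_grad(10))   # ~8.2e-09, saturated just like sigmoid
```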

ReLU is a great alternative: it does not saturate for positive inputs and is cheap to compute.
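A minimal sketch, assuming the standard ReLU definition max(0, x):

```python
def relu(x):
    return max(0.0, x)

def relu_grad(x):
    # Gradient is 1 for x > 0 and 0 otherwise -- no saturation on
    # the positive side, and no exp() to evaluate.
    return 1.0 if x > 0 else 0.0

print(relu(10.0), relu_grad(10.0))  # 10.0 1.0 -- gradient does not vanish
print(relu(-3.0), relu_grad(-3.0))  # 0.0 0.0
```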
