@nrailgun 2015-10-18T06:08:15.000000Z

CNNVR: Training Neural Network

Machine Learning

Step 1: Process Data

Zero-center the data, then normalize it. In practice, you may also see PCA and whitening.
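A minimal sketch of the zero-center-then-normalize step, using only the Python standard library. The function name `standardize`, the sample data, and the guard for constant features are assumptions of this example, not part of the original note.

```python
import statistics

def standardize(columns):
    """Zero-center each feature, then divide it by its standard deviation."""
    result = []
    for col in columns:
        mean = statistics.fmean(col)
        std = statistics.pstdev(col)
        # Assumption of this sketch: leave constant features unscaled
        # to avoid dividing by zero.
        scale = std if std > 0 else 1.0
        result.append([(x - mean) / scale for x in col])
    return result

# Made-up data: one list per feature.
features = [[1.0, 2.0, 3.0], [10.0, 20.0, 60.0]]
normalized = standardize(features)
```

After this step every feature has mean 0 and standard deviation 1, so no single feature dominates only because of its units.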

Step 2: Choose the Architecture

Select an appropriate architecture. Initialize the weights to small random numbers and the biases to zero. Usually $W \sim N(0, 0.01)$ works; if not, normalize by the square root of the fan-in.
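The initialization above could look roughly like this. The sketch assumes that the $0.01$ is the standard deviation of the Gaussian (the note does not say whether it is the standard deviation or the variance), and `init_layer` and its parameters are hypothetical names.

```python
import math
import random

def init_layer(fan_in, fan_out, scale_by_fan_in=False, seed=0):
    """Return (weights, biases) for one fully connected layer."""
    rng = random.Random(seed)
    # Assumption: 0.01 is used as the standard deviation. The fallback
    # divides a unit Gaussian by sqrt(fan_in).
    std = 1.0 / math.sqrt(fan_in) if scale_by_fan_in else 0.01
    weights = [[rng.gauss(0.0, std) for _ in range(fan_out)]
               for _ in range(fan_in)]
    biases = [0.0] * fan_out  # biases start at zero
    return weights, biases

W, b = init_layer(fan_in=4, fan_out=3)
W_scaled, b_scaled = init_layer(fan_in=4, fan_out=3, scale_by_fan_in=True)
```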

Tips:
1. Make sure you can overfit a very small portion of the data.
2. Loss not going down: the learning rate is probably too low.
3. Loss exploding: the learning rate is probably too high.
4. Run cross-validation in stages, from coarse to fine.
5. Visualization.
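The coarse-to-fine idea from the tips can be sketched as a two-stage random search over the learning rate. Sampling in log space, the trial counts, and the toy loss function are all assumptions of this example; in practice `toy_validation_loss` would be replaced by a short training run that returns a validation loss.

```python
import math
import random

def sample_log_uniform(low, high, rng):
    """Draw a value whose base-10 logarithm is uniform in [log10(low), log10(high)]."""
    return 10 ** rng.uniform(math.log10(low), math.log10(high))

def search(evaluate, low, high, trials, rng):
    """Return the sampled candidate with the lowest loss."""
    candidates = [sample_log_uniform(low, high, rng) for _ in range(trials)]
    return min(candidates, key=evaluate)

def toy_validation_loss(lr):
    # Hypothetical stand-in for a real training run; minimized at lr = 1e-3.
    return (math.log10(lr) + 3.0) ** 2

rng = random.Random(0)
# Coarse stage: a wide range.
coarse_best = search(toy_validation_loss, 1e-6, 1e0, 20, rng)
# Fine stage: one decade either side of the coarse winner, keeping the incumbent.
fine_candidate = search(toy_validation_loss, coarse_best / 10, coarse_best * 10, 20, rng)
fine_best = min([coarse_best, fine_candidate], key=toy_validation_loss)
```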

Regularization knobs

L2: $\frac{1}{2} \lambda w^2$
L1: $\lambda |w|$
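As a small illustration, the two penalties can be summed over a weight vector like this. The function names are hypothetical. The factor $\frac{1}{2}$ in the L2 term makes its gradient simply $\lambda w$.

```python
def l2_penalty(weights, lam):
    """0.5 * lambda * sum of squared weights."""
    return 0.5 * lam * sum(w * w for w in weights)

def l1_penalty(weights, lam):
    """lambda * sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_gradient(weights, lam):
    """Gradient of the L2 penalty with respect to each weight: lambda * w."""
    return [lam * w for w in weights]

example_weights = [1.0, -2.0]  # made-up values
l2_value = l2_penalty(example_weights, 0.1)
l1_value = l1_penalty(example_weights, 0.1)
l2_grad = l2_gradient(example_weights, 0.1)
```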

Dropout: at each iteration, drop each neuron with probability $p$.
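A minimal sketch of dropout on a list of activations. It uses the "inverted" variant, which rescales the surviving activations by $1 / (1 - p)$ during training so that nothing needs to change at test time. That rescaling is an assumption of this example and is not stated in the note; `dropout` and its parameters are hypothetical names.

```python
import random

def dropout(activations, p, rng, train=True):
    """Zero each activation with probability p (requires 0 <= p < 1)."""
    if not train:
        # Test time: inverted dropout leaves activations unchanged.
        return list(activations)
    keep = 1.0 - p
    # rng.random() is uniform in [0, 1), so a unit is dropped with probability p.
    return [a / keep if rng.random() >= p else 0.0 for a in activations]

rng = random.Random(0)
dropped = dropout([1.0] * 1000, p=0.5, rng=rng)
unchanged = dropout([1.0, 2.0], p=0.5, rng=rng, train=False)
```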

Learning Rate

$\mathrm{SGD} + \mathrm{Momentum} \gt \mathrm{SGD}$. A momentum of $0.9$ usually works well.

$\nu = \gamma \nu + \alpha \nabla_\theta J(\theta, x^{(i)}, y^{(i)})$

$\theta = \theta - \nu$
where

$\nu$ is the current velocity vector,

$\alpha$ is the learning rate,

$\gamma \in [0, 1)$ is the momentum coefficient.

Decrease the learning rate over time.
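The update rule can be tried on a toy one-dimensional problem. This sketch assumes $J(\theta) = \frac{1}{2} \theta^2$, whose gradient is $\theta$, and a simple multiplicative learning-rate decay per step; the decay schedule, step count, and function name are assumptions, not something the note specifies.

```python
def sgd_momentum(grad, theta, alpha=0.1, gamma=0.9, steps=200, decay=0.99):
    """Minimize a scalar function given its gradient, using SGD with momentum."""
    velocity = 0.0
    for _ in range(steps):
        # nu = gamma * nu + alpha * gradient
        velocity = gamma * velocity + alpha * grad(theta)
        # theta = theta - nu
        theta = theta - velocity
        # Assumed schedule: shrink the learning rate a little every step.
        alpha *= decay
    return theta

# Toy objective J(theta) = 0.5 * theta^2, so the gradient is theta itself.
theta_final = sgd_momentum(lambda t: t, theta=5.0)
```

Starting from $\theta = 5$, the iterate spirals in toward the minimum at $0$; setting `gamma=0.0` recovers plain SGD for comparison.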