bayer_on fast dropout and its applicability to recurrent networks_2013 applies marginalized (fast) dropout to RNNs; argues that recurrence amplifies the dropout noise, so dropout fits only the feed-forward connections and is unsuited to the recurrent connections (sketched below).
pham_dropout improves rnn for handwriting recognition_2013 applies dropout only to the non-recurrent (feed-forward) connections of a deep LSTM network.
pachitariu_regularization and nonlinearities for neural language models: when are they needed?_2013 applies dropout to an LSTM; finds that adding noise to the recurrent connections causes instability, so noise is injected only in the decoding part.
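The common thread across these papers is to keep the recurrent path noise-free and drop units only on the feed-forward path. A minimal PyTorch sketch of that idea (my own illustration, not code from any of the papers; the class name, shapes, and tanh cell are assumptions):

```python
import torch
import torch.nn as nn

class FeedForwardDropoutRNN(nn.Module):
    """Vanilla RNN where dropout is applied only to the feed-forward
    (input-to-hidden) path; the recurrent (hidden-to-hidden) path is
    left noise-free so the perturbation is not amplified over time."""

    def __init__(self, input_size, hidden_size, p_drop=0.5):
        super().__init__()
        self.W_ih = nn.Linear(input_size, hidden_size)   # feed-forward
        self.W_hh = nn.Linear(hidden_size, hidden_size)  # recurrent
        self.drop = nn.Dropout(p_drop)

    def forward(self, x):
        # x: (seq_len, batch, input_size)
        h = x.new_zeros(x.size(1), self.W_hh.in_features)
        outputs = []
        for x_t in x:
            # dropout on the input projection only, never on h
            h = torch.tanh(self.W_ih(self.drop(x_t)) + self.W_hh(h))
            outputs.append(h)
        return torch.stack(outputs), h
```

The design choice this encodes: a dropped unit on the input path perturbs only one timestep, whereas dropping inside the recurrence would re-inject noise at every step and compound it through the hidden state, which is the instability Bayer and Pachitariu describe.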