[[1, 5, 6], [3,4], [3]] -- [[1, 5], [3,4], [3, None]]   (truncate or pad each sequence to a fixed length)
[[1, 5, 6], [3,4], [3]] := [[1,3,3], [5,4,EOS], [6,EOS,None]]   (pad with EOS/None, then transpose to time-major batches)
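The second transformation above can be sketched in a few lines: a minimal example, assuming `EOS` is a sentinel token and `None` fills positions past the end of an already-finished sequence (both names are illustrative).

```python
EOS = "EOS"  # illustrative sentinel token marking end-of-sequence

def to_time_major(batch):
    """Pad each sequence with one EOS, then None, up to the batch maximum,
    and transpose so each inner list holds one timestep across the batch."""
    max_len = max(len(seq) for seq in batch)
    padded = [
        seq + ([EOS] + [None] * (max_len - len(seq) - 1) if len(seq) < max_len else [])
        for seq in batch
    ]
    return [[seq[t] for seq in padded] for t in range(max_len)]

print(to_time_major([[1, 5, 6], [3, 4], [3]]))
# [[1, 3, 3], [5, 4, 'EOS'], [6, 'EOS', None]]
```

Time-major layout lets an RNN process one list per step, with every sequence in the batch advancing in lockstep.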
Douglas and Martin, 1989
... physically mapped the synapses on the dendritic trees (...) in layer 4 of the cat primary visual cortex and found that only 5% of the excitatory synapses arose from the lateral geniculate nucleus (LGN)
Binzegger et al. 2004
Sussillo, 2014
Working Memory:
O'Reilly and Frank, 2006
Human brain areas for working memory of face identity and location
Murray et al. 2017
https://colah.github.io/posts/2015-08-Understanding-LSTMs/
Examples: depending on how inputs and outputs are structured, recurrent networks can be applied to different kinds of problems:
Elman, Finding Structure in Time, 1990
$$ h_t = \tanh(W_{ih} x_t + W_{hh} h_{t-1}) $$
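The Elman update above can be unrolled directly: a minimal numpy sketch, with bias terms omitted to match the equation and the layer sizes chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 3  # illustrative sizes
W_ih = rng.normal(scale=0.1, size=(hidden_size, input_size))  # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden-to-hidden weights

def step(x_t, h_prev):
    """One Elman step: h_t = tanh(W_ih x_t + W_hh h_{t-1})."""
    return np.tanh(W_ih @ x_t + W_hh @ h_prev)

# Unroll over a short random input sequence, starting from a zero hidden state.
h = np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):
    h = step(x_t, h)
print(h.shape)  # (3,)
```

The same hidden state `h` is fed back at every step, which is what lets the network carry information forward in time.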
Williams and Zipser, 1995
The temporal separation between targets and inputs makes training difficult; this is known as the temporal credit assignment problem.
Bengio et al. 1994
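The difficulty Bengio et al. describe can be made concrete: backpropagating through $T$ tanh steps multiplies the gradient by $\operatorname{diag}(1-h_t^2)\,W_{hh}^\top$ at every step, so for small recurrent weights its norm decays roughly exponentially in $T$. A minimal numpy sketch, with sizes, the weight scale, and the seed all chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
hidden_size, T = 8, 50  # illustrative sizes
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # small recurrent weights

# Forward pass: run T tanh steps with random inputs, storing the hidden states.
hs = []
h = np.zeros(hidden_size)
for _ in range(T):
    h = np.tanh(W_hh @ h + rng.normal(size=hidden_size))
    hs.append(h)

# Backward pass: repeatedly apply the Jacobian diag(1 - h_t^2) W_hh^T.
grad = np.ones(hidden_size)  # arbitrary starting gradient dL/dh_T
norms = []
for h_t in reversed(hs):
    grad = W_hh.T @ ((1 - h_t**2) * grad)
    norms.append(np.linalg.norm(grad))

print(f"gradient norm after 1 step:  {norms[0]:.3e}")
print(f"gradient norm after {T} steps: {norms[-1]:.3e}")
```

With these weight scales the final norm is many orders of magnitude below the first, so early inputs receive essentially no credit; this is the failure mode that gated architectures such as the LSTM were designed to avoid.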
Their Residual Net or ResNet is a special case of our Highway Net of May 2015, the first very deep feedforward networks with hundreds of layers. Highway nets are essentially feedforward versions of recurrent Long Short-Term Memory (LSTM) networks