In the 1980s, the second wave of neural network research emerged in great part via a movement called connectionism or parallel distributed processing (Rumelhart et al., 1986c; McClelland et al., 1995).
Connectionism arose in the context of cognitive science.
Cognitive science is an interdisciplinary approach to understanding the mind, combining multiple levels of analysis.
During the early 1980s, most cognitive scientists studied models of symbolic reasoning.
Despite their popularity, symbolic models were difficult to explain in terms of how the brain could actually implement them using neurons.
The connectionists began to study models of cognition that could actually be grounded in neural implementations (Touretzky and Minton, 1985), reviving many ideas dating back to the work of psychologist Donald Hebb in the 1940s (Hebb, 1949).
The central idea in connectionism is that a large number of simple computational units can achieve intelligent behavior when networked together.
This insight applies equally to neurons in biological nervous systems and to hidden units in computational models.
Several key concepts arose during the connectionism movement of the 1980s that remain central to today’s deep learning.
Distributed representation
One of these concepts is that of distributed representation (Hinton et al., 1986): the idea that each input to a system should be represented by many features, and that each feature should be involved in the representation of many possible inputs.
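As an informal illustration (the object types, colors, and array layout below are hypothetical choices for this sketch, not taken from the original text), a purely local "one-hot" code dedicates one unit to each possible input, while a distributed code reuses a small set of shared features across many inputs:

```python
import numpy as np

# Hypothetical toy setting: a system must represent 3 object types in 3 colors,
# giving 9 possible inputs.

# Local ("one-hot") representation: one unit per (object, color) combination,
# 9 units in total, and no unit is shared between any two inputs.
local_codes = np.eye(9)

# Distributed representation: 3 object-identity features + 3 color features,
# only 6 units, and every feature helps represent many different inputs.
objects = {"car": 0, "truck": 1, "bird": 2}
colors = {"red": 0, "green": 1, "blue": 2}

def distributed_code(obj, color):
    code = np.zeros(6)
    code[objects[obj]] = 1.0          # shared object-identity feature
    code[3 + colors[color]] = 1.0     # shared color feature
    return code

print(distributed_code("car", "red"))    # [1. 0. 0. 1. 0. 0.]
print(distributed_code("bird", "red"))   # shares the "red" feature with the red car
```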
Backpropagation
Another major accomplishment of the connectionist movement was the successful use of back-propagation to train deep neural networks with internal representations and the popularization of the back-propagation algorithm (Rumelhart et al., 1986a; LeCun, 1987).
This algorithm has waxed and waned in popularity, but as of this writing it is the dominant approach to training deep models.
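As a minimal sketch of the idea, assuming a tiny two-layer sigmoid network trained by gradient descent (the data, layer sizes, and learning rate below are arbitrary illustrations, not the original 1986 setup), back-propagation amounts to a forward pass followed by repeated application of the chain rule backward through the layers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 examples, 3 input features, binary targets (all values made up).
X = rng.normal(size=(4, 3))
y = rng.integers(0, 2, size=(4, 1)).astype(float)

# One hidden layer of 5 sigmoid units feeding a sigmoid output unit.
W1 = rng.normal(scale=0.1, size=(3, 5)); b1 = np.zeros(5)
W2 = rng.normal(scale=0.1, size=(5, 1)); b2 = np.zeros(1)

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for step in range(1000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)          # hidden (internal) representation
    p = sigmoid(h @ W2 + b2)          # predicted probability
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

    # Backward pass: propagate the loss gradient layer by layer via the chain rule.
    dz2 = (p - y) / len(X)            # gradient w.r.t. output pre-activation
    dW2 = h.T @ dz2; db2 = dz2.sum(0)
    dh  = dz2 @ W2.T
    dz1 = dh * h * (1 - h)            # sigmoid derivative
    dW1 = X.T @ dz1; db1 = dz1.sum(0)

    # Gradient descent update.
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(f"final training loss: {loss:.3f}")
```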
Sequence modeling
During the 1990s, researchers made important advances in modeling sequences with neural networks.
Hochreiter (1991) and Bengio et al. (1994) identified some of the fundamental mathematical difficulties in modeling long sequences.
Hochreiter and Schmidhuber (1997) introduced the long short-term memory or LSTM network to resolve some of these difficulties.
Today, the LSTM is widely used for sequence modeling, including many natural language processing tasks at Google.
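As a rough sketch of why the LSTM helps with long sequences (the stacked weight layout and gate ordering below follow one common convention and are illustrative assumptions, not the exact 1997 formulation), a single LSTM step updates its cell state additively, giving gradients a path through time that is not repeatedly squashed by a nonlinearity:

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step with gates stacked as (forget, input, output, candidate)."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b            # all gate pre-activations at once
    f = sigmoid(z[:n])                    # forget gate: what to keep from c_prev
    i = sigmoid(z[n:2 * n])               # input gate: how much new content to write
    o = sigmoid(z[2 * n:3 * n])           # output gate: what to expose as h
    g = np.tanh(z[3 * n:])                # candidate cell update

    c = f * c_prev + i * g                # additive cell-state path: the "memory"
    h = o * np.tanh(c)                    # hidden state passed to the next step
    return h, c

# Tiny usage example with arbitrary sizes and random parameters.
rng = np.random.default_rng(0)
d, n = 4, 3
W = rng.normal(size=(4 * n, d))
U = rng.normal(size=(4 * n, n))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(size=(10, d)):        # a sequence of 10 input vectors
    h, c = lstm_step(x, h, c, W, U, b)
```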