12.1 Learning to be Fast

In many real-world situations we must react rapidly to the earliest signs that could warn us about harmful stimuli. If an obstacle blocks our way, we want to avoid it before a painful contact occurs. If we ride a bicycle, we should make corrective steering movements already at small inclination angles, well before we fall. Spike-time dependent learning rules with a temporally asymmetric learning window provide a hint of how a simple form of predictive coding could be implemented on the neuronal level.

**Figure 12.1:** A. A postsynaptic neuron receives inputs from twenty presynaptic cells at intervals of 3ms. All synapses have the same weight. The neuron emits two output spikes at about 31 and 58ms after stimulus onset. B. After 5 repetitions of the same stimulus the neuron fires after 10ms (dashed line); after a total of 100 repetitions the neuron fires already after about 5ms (solid line). C. The reason is that synapses that have been active slightly before the postsynaptic spikes are strengthened while others are depressed. To illustrate this point, the learning window W(t_j^(f) - t_i^(f)) is shown twice, each time centered at the postsynaptic firing time t_i^(f) of the first trial (shown in part A). D. The sequence of presynaptic spikes could be generated by a stimulus that moves from top to bottom.

Let us consider a single neuron that receives inputs from, say, twenty presynaptic cells which are stimulated one after the other; cf. Fig. 12.1. Initially, all synapses w_ij have the same weight w₀. The postsynaptic neuron fires two spikes; cf. Fig. 12.1A. All synapses that have been activated shortly before a postsynaptic spike are strengthened while synapses that have been activated immediately afterwards are depressed; cf. Fig. 12.1C. In subsequent trials the threshold is therefore reached earlier; cf. Fig. 12.1B. After many trials, those presynaptic neurons that fire first have developed strong connections while other connections are depressed. Thus a temporally asymmetric learning rule favors connections that can serve as 'earliest predictors' of other spike events (Song et al., 2000; Mehta et al., 2000).
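The mechanism can be illustrated with a minimal sketch. Everything in it beyond the twenty inputs at 3 ms intervals is an assumption made for illustration: the event-driven leaky integrator with reset, the exponential shape of the learning window, and all parameter values (`theta`, `tau_m`, `a_plus`, `a_minus`, `tau`, `w_max`) are hypothetical and are not the model used to generate Fig. 12.1.

```python
import math

def run_trial(weights, pre_times, theta=3.63, tau_m=10.0):
    """Event-driven leaky integrator (assumed model).
    Returns the list of postsynaptic spike times in ms."""
    u, t_prev, post_times = 0.0, 0.0, []
    for w, t in zip(weights, pre_times):
        # Passive decay since the last input, then an instantaneous jump by w.
        u = u * math.exp(-(t - t_prev) / tau_m) + w
        t_prev = t
        if u >= theta:
            post_times.append(t)
            u = 0.0  # reset after a spike
    return post_times

def learning_window(s, a_plus=0.05, a_minus=0.05, tau=10.0):
    """Asymmetric window W(s) with s = t_pre - t_post (assumed exponential).
    Pre before (or at) post gives potentiation, pre after post depression."""
    if s <= 0.0:
        return a_plus * math.exp(s / tau)
    return -a_minus * math.exp(-s / tau)

def stdp_update(weights, pre_times, post_times, w_max=2.0):
    """Sum the window over all pre/post spike pairs; clip to [0, w_max]."""
    updated = []
    for w, t_pre in zip(weights, pre_times):
        dw = sum(learning_window(t_pre - t_post) for t_post in post_times)
        updated.append(min(w_max, max(0.0, w + dw)))
    return updated

def simulate(n_trials=100):
    pre_times = [3.0 * (j + 1) for j in range(20)]  # one spike every 3 ms
    weights = [1.0] * 20                            # identical initial weights
    latencies = []
    for _ in range(n_trials):
        post = run_trial(weights, pre_times)
        latencies.append(post[0] if post else None)
        weights = stdp_update(weights, pre_times, post)
    return latencies, weights

latencies, weights = simulate()
print("first-spike latency in trials 1, 6, 100:",
      latencies[0], latencies[5], latencies[-1])
```

In this sketch, synapses that are active before the first postsynaptic spike are never depressed, so the latency of the first spike can only stay the same or decrease from trial to trial. The exact latencies depend on the assumed parameters and should not be read as a quantitative reproduction of the figure.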

12.1.0.1 Example: Hippocampal place fields

**Figure 12.2:** A. The place fields of neurons in region CA3 of hippocampus are indicated as cones along the track that extends from S to T. The place field of the neuron in CA1 shown with solid lines has its center at c and extends from l to r. B. After the rat has made several movements from left to right, some connections are increased (thick lines) and others decreased (dotted lines). As a result, the place field center c has moved to the left.

Place cells are neurons in rodent hippocampus that are sensitive to the spatial location of the animal in an environment. The sensitive area is called the place field of the cell. If, for example, a rat runs on a linear track from a starting point S to a target point T, this movement would first activate cells with place fields close to S, then those with a place field in the middle of the track, and finally those with a place field close to T; cf. Fig. 12.2. In a simple feedforward model of the hippocampus (Mehta et al., 2000), a first set of place cells is identified with neurons in region CA3 of rat hippocampus. A cell further down the processing stream (i.e., a cell in hippocampal region CA1) receives input from several cells in CA3. If we assume that initially all connections have the same weight, the place field of a CA1 cell is therefore broader than that of a CA3 cell.

During the experiment, the rat moves repeatedly from left to right. During each movement, the same sequence of CA3 cells is activated. This has consequences for the connections from CA3 cells to CA1 cells. Hebbian plasticity with an asymmetric learning window strengthens those connections where the presynaptic neuron fires early in the sequence. Connections from neurons that fire later in the sequence are weakened. As a result the center of the place field of a cell in CA1 is shifted to the left; cf. Fig. 12.2B. The shift of place fields predicted by asymmetric Hebbian learning has been confirmed experimentally (Mehta et al., 2000).
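The direction of the shift can be checked with a strongly simplified sketch. The assumptions are ours and not part of the model of Mehta et al. (2000): each CA3 cell emits a single spike when the rat passes the center of its place field, the CA1 cell emits a single spike when the rat passes the current center c of its own field, and c is taken as the weight-averaged position of the CA3 field centers. The track geometry, running speed, and window parameters are hypothetical.

```python
import math

def window(s, a_plus=0.02, a_minus=0.02, tau=20.0):
    """Asymmetric learning window W(s), s = t_pre - t_post in ms (assumed shape)."""
    if s <= 0.0:
        return a_plus * math.exp(s / tau)
    return -a_minus * math.exp(-s / tau)

def field_center(weights, centers):
    """CA1 place field center as the weight-averaged CA3 field center."""
    return sum(w * x for w, x in zip(weights, centers)) / sum(weights)

def run_laps(n_laps=50, speed=0.05, w_max=2.0):
    """Rat runs left to right at `speed` (cm/ms); returns the field center
    before learning and after every lap, plus the final weights."""
    ca3_centers = [40.0 + k for k in range(21)]  # CA3 fields at 40..60 cm
    weights = [1.0] * len(ca3_centers)           # identical initial weights
    history = [field_center(weights, ca3_centers)]
    for _ in range(n_laps):
        t_post = history[-1] / speed             # CA1 fires at its field center
        weights = [
            min(w_max, max(0.0, w + window(x / speed - t_post)))
            for w, x in zip(weights, ca3_centers)
        ]
        history.append(field_center(weights, ca3_centers))
    return history, weights

history, weights = run_laps()
print("field center before and after learning (cm):", history[0], history[-1])
```

Connections from CA3 cells with fields to the left of c are potentiated and those to the right are depressed, so the weighted center moves towards the starting point S, in the direction sketched in Fig. 12.2B.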

12.1.0.2 Example: Conditioning

The shift of responses towards early predictors plays a central role in conditioning. The basic idea is best explained by the paradigm of Pavlovian conditioning (Pavlov, 1927). Tasting or smelling food (stimulus s2) evokes an immediate response r. During the conditioning experiment, a bell (stimulus s1) always rings at a fixed time interval $\Delta T$ before the food stimulus. After several repetitions of the experiment, it is found that the response now occurs already after the first stimulus (s1). Thus the reaction has moved from stimulus s2 to stimulus s1, which reliably predicts s2.

Spike-time dependent plasticity with an asymmetric learning window allows us to replicate this result, provided that the time difference $\Delta T$ between the two stimuli is less than the width of the learning window. The mechanism is identical to that of the previous example, with the only difference that the input spikes are now clustered into two groups corresponding to the stimuli s1 and s2; cf. Fig. 12.3.
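The role of the window width can be made concrete with a small sketch. It is an illustration under our own assumptions, not the simulation behind Fig. 12.3: the two stimuli are represented by two groups of ten synchronous input spikes, the response neuron is a leaky integrator with reset, the learning window is exponential with a time constant of 20 ms, and all weights and amplitudes are hypothetical values.

```python
import math

def window(s, a_plus=0.01, a_minus=0.01, tau=20.0):
    """Asymmetric learning window W(s), s = t_pre - t_post in ms (assumed shape)."""
    if s <= 0.0:
        return a_plus * math.exp(s / tau)
    return -a_minus * math.exp(-s / tau)

def response_time_after_learning(delta_t, n_trials=100, t1=10.0,
                                 n_per_group=10, theta=1.0, tau_m=10.0,
                                 w_max=0.3):
    """Time (ms) of the first response spike in the final trial, for stimuli
    s1 at t1 and s2 at t1 + delta_t. Returns None if the neuron stays silent."""
    t2 = t1 + delta_t
    w1, w2 = 0.05, 0.2  # per-synapse weights: s1 subthreshold, s2 suprathreshold
    first = None
    for _ in range(n_trials):
        post = []
        u = n_per_group * w1                      # group s1 arrives at t1
        if u >= theta:
            post.append(t1)
            u = 0.0                               # reset after a spike
        u = u * math.exp(-delta_t / tau_m) + n_per_group * w2  # group s2 at t2
        if u >= theta:
            post.append(t2)
        first = post[0] if post else None
        # All synapses of a group share the same spike time, hence the same update.
        dw1 = sum(window(t1 - t_post) for t_post in post)
        dw2 = sum(window(t2 - t_post) for t_post in post)
        w1 = min(w_max, max(0.0, w1 + dw1))
        w2 = min(w_max, max(0.0, w2 + dw2))
    return first

short_interval = response_time_after_learning(delta_t=40.0)
long_interval = response_time_after_learning(delta_t=400.0)
print("response time for delta_t = 40 ms:", short_interval)
print("response time for delta_t = 400 ms:", long_interval)
```

With an interval of 40 ms the s1 synapses still fall inside the potentiating part of the assumed window, so the response moves to s1. With an interval of 400 ms the weight change per trial is negligible and the response stays locked to s2.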

**Figure 12.3:** A. Conditioning paradigm. A response neuron r can receive input from two neuronal populations, representing the stimuli s1 and s2. B. Membrane potential of the postsynaptic neuron. Before learning, stimulation of the presynaptic population s1, which occurs at about t = 10 ms, leads to subthreshold excitation of the postsynaptic neuron whereas stimulation of group s2 40 milliseconds later evokes postsynaptic firing. C. After learning, postsynaptic firing is already triggered by the stimulus s1.

In behavioral experiments with monkeys, conditioning is possible with time intervals that span several seconds (Schultz et al., 1997) whereas typical learning windows extend over 50-100 milliseconds (Zhang et al., 1998; Markram et al., 1997; Bi and Poo, 1998,1999; Debanne et al., 1998; Magee and Johnston, 1997). In order to explain conditioning with time windows longer than 100 milliseconds, additional assumptions regarding neuronal architecture and dynamics have to be made; see, e.g., Brown et al. (1999); Suri and Schultz (2001); Fiala et al. (1996). A potential solution could be provided by delayed reverberating loops; cf. Chapter 8.3. As an aside we note that, traditionally, conditioning experiments have been discussed on the level of rate coding. For slowly changing firing rates, spike-time dependent learning rules with an asymmetric learning window yield a differential Hebbian term [cf. Eq. (11.64)] that is proportional to the derivative of the postsynaptic rate, which is the starting point of models of conditioning (Schultz et al., 1997; Montague et al., 1995).