While the relevance of spike timing in the millisecond range in cortical areas is a topic of intense debate, there are a few specific systems where temporal coding is generally accepted. One of the most prominent examples is the auditory system of the barn owl (Carr and Konishi, 1988,1990; Konishi, 1986; Carr, 1993,1995), and this is the system we will focus on in this section. Owls hunt at night. From behavioral experiments it is known that owls can locate sound sources even in complete darkness with a remarkable precision. To do so, the signal processing in the auditory pathway must achieve a temporal precision in the microsecond range with elements that are noisy, unreliable and rather slow. In this section, we use the results of previous chapters and show that spiking neurons that are driven in the sub-threshold regime are sensitive to temporal structure in the presynaptic input, in particular to synchronous spike arrival; cf. Sections 5.8 and 7.3. On the other hand, synchronous spike arrival is only possible if presynaptic transmission delays are appropriately tuned. A spike-time dependent Hebbian learning rule can play the role of delay tuning or delay selection (Eurich et al., 1999; Gerstner et al., 1996a,1997; Kempter et al., 1999; Senn et al., 2001a; Hüning et al., 1998).
We start this section with an outline of the problem of sound source localization and a rough sketch of the barn owl auditory pathway. We then turn to the problem of coincidence detection and the idea of delay selection by a spike-time dependent learning rule.
Barn owls use interaural time differences (ITD) for sound source localization (Carr and Konishi, 1990; Moiseff and Konishi, 1981; Jeffress, 1948). Behavioral experiments show that barn owls can locate a sound source in the horizontal plane with a precision of about 1-2 degrees of angle (Knudsen et al., 1979). A simple calculation shows that this corresponds to a temporal difference of a few microseconds (< 5 μs) between the sound waves at the left and right ear. These small temporal differences must be detected and evaluated by the owl's auditory system; see Fig. 12.12.
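The "simple calculation" can be made explicit under a far-field approximation in which the extra path to the far ear is d sin(θ), so that ITD ≈ (d/c) sin(θ). The following is a minimal sketch; the effective ear separation d = 5 cm and the speed of sound c = 343 m/s are illustrative assumptions and are not taken from the text.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air (assumed)
EAR_SEPARATION = 0.05    # m, assumed effective distance between the two ears

def itd_seconds(azimuth_deg, d=EAR_SEPARATION, c=SPEED_OF_SOUND):
    """Far-field estimate of the interaural time difference: (d / c) * sin(azimuth)."""
    return d / c * math.sin(math.radians(azimuth_deg))

# Angular resolution of 1-2 degrees quoted in the text
for angle in (1.0, 2.0):
    print(f"{angle:.0f} deg -> ITD = {itd_seconds(angle) * 1e6:.1f} microseconds")
```

With these assumed values, an angular resolution of one to two degrees translates into time differences of a few microseconds, which is the order of magnitude quoted above.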
The basic principle of how such a time-difference detector could be set up was discussed by Jeffress (1948) more than 50 years ago. It consists of delay lines and an array of coincidence detectors. If the sound source is on the right-hand side of the auditory space, the sound wave arrives first at the right ear and then at the left ear. The signals propagate from both ears along transmission lines towards the set of coincidence detectors. A signal originating from a source located to the right of the owl's head stimulates a coincidence detector on the left-hand side of the array. If the location of the signal source is shifted, a different coincidence detector responds. The `place' of a coincidence detector is therefore a signature for the location of the external sound source; cf. Fig. 12.12. Such a representation has been called `place' coding (Carr and Konishi, 1990; Konishi, 1986).
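The Jeffress scheme can be illustrated with a toy array of coincidence detectors fed by two counter-running delay lines. This is only a sketch of the principle: the number of detectors, the total delay of 160 μs, and the function name `best_detector` are hypothetical choices made for the example.

```python
import numpy as np

N_DETECTORS = 9
MAX_DELAY_US = 160.0  # assumed total conduction time along one delay line

# Index 0 is the left end of the array. The line from the left ear enters on the
# left, so its delay grows with the index; the line from the right ear enters on
# the right, so its delay shrinks with the index.
delay_left = np.linspace(0.0, MAX_DELAY_US, N_DETECTORS)
delay_right = delay_left[::-1]

def best_detector(itd_us):
    """Index of the detector whose two inputs arrive most nearly simultaneously.

    itd_us > 0 means the sound reaches the right ear first (source on the right).
    """
    arrival_left = itd_us + delay_left   # left-ear signal starts itd_us later
    arrival_right = delay_right
    return int(np.argmin(np.abs(arrival_left - arrival_right)))

for itd in (-80.0, 0.0, 80.0):
    print(f"ITD = {itd:+.0f} us -> detector {best_detector(itd)}")
```

A source on the right (positive ITD) activates a detector towards the left end of the array, a source straight ahead activates the central detector, and shifting the source shifts the active detector, which is the essence of `place' coding.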
Remarkably enough, such a coincidence detector circuit was found four decades later by Carr and Konishi (1990) in the nucleus laminaris of the barn owl. The existence of the circuit confirms the general idea of temporal difference detection by delayed coincidence measurement. It gives, however, no indication of how the precision of a few microseconds is finally achieved.
In order to better understand how precise spike timing arises, we have to study signal processing in the auditory pathway. Three aspects are important: frequency separation, phase locking, and phase-correct averaging.
The first few processing steps along the auditory localization pathway are sketched in Fig. 12.12B. The figure represents, of course, a simplified picture of auditory information processing, but it captures some essential ingredients. At both ears the sound wave is separated into its frequency components. Signals then pass an intermediate processing area called nucleus magnocellularis (NM) and meet at the nucleus laminaris (NL). Neurons there are found to be sensitive to the interaural time difference (ITD). Due to the periodicity of a sinusoidal wave, the ITD of a single frequency channel is really a phase difference and leaves some ambiguities. In the next processing step further up in the auditory pathway, information on phase differences from different frequency channels is combined to retrieve the temporal difference and hence the location of the sound source in the horizontal plane. Reviews of the basic principles of auditory processing in the owl can be found in Konishi (1993,1986).
Let us now discuss the first few processing steps in more detail. After cochlear filtering, different frequencies are processed by different neurons and stay separated up to the nucleus laminaris. In the following we may therefore focus on a single frequency channel and consider a neuron which responds best to a frequency of, say, 5kHz.
If the ear is stimulated with a 5kHz tone, neurons in the 5kHz channel are activated and fire action potentials. At first sight, the spike train looks noisy. A closer look, however, reveals that the pulses are phase locked to the stimulating tone: Spikes occur preferentially around some phase with respect to the periodic stimulus. Phase locking is, of course, not perfect, but subject to two types of noise; cf. Fig. 12.13. First, spikes do not occur at every cycle of the 5kHz tone. Often the neuron misses several cycles before it fires again. Second, spikes occur with a temporal jitter of about $\sigma = 40$ μs around the preferred phase (Sullivan and Konishi, 1984).
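For the sake of simplicity we describe the spike train by a Poisson process with a periodically modulated rate

$$\nu_j(t) = p \sum_{m=-\infty}^{\infty} G_\sigma(t - m\,T - \Delta_j) \qquad (12.13)$$

where $T$ is the period of the tone, $G_\sigma$ is a normalized Gaussian of width $\sigma$ (the temporal jitter), $\Delta_j$ is the delay that determines the preferred phase $\varphi_j = \Delta_j\,(2\pi/T)$, and the amplitude $p$ sets the probability of firing in one period.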
Phase locking can be observed in the auditory nerve connecting the cochlea and the nucleus magnocellularis, in the nucleus magnocellularis, and also in the nucleus laminaris. The phase jitter even decreases from one processing step to the next so that the temporal precision of phase locking increases from around 40 μs in the nucleus magnocellularis to about 25 μs in the nucleus laminaris. The precision of phase locking is the topic of the following subsection.
We focus on a single neuron i in the nucleus laminaris (NL). The neuron receives input from neurons in the nucleus magnocellularis through about 150 synapses. All input lines belong to the same frequency channel. The probability of spike arrival at one of the synapses is given by Eq. (12.13) where j labels the synapses and T = 0.2ms is the period of the signal.
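The input statistics can be sketched in a few lines. As a simplifying assumption (not the Poisson process itself), each cycle of the tone carries at most one spike, emitted with probability p and displaced from the preferred time by Gaussian jitter; for small p this approximates a periodically modulated Poisson rate. The function name `phase_locked_spikes` and the example delay of 0.05 ms are hypothetical.

```python
import numpy as np

T = 0.2        # ms, period of a 5 kHz tone
SIGMA = 0.04   # ms, temporal jitter (40 microseconds)
P = 0.2        # assumed probability of a spike in any given cycle

def phase_locked_spikes(n_cycles, delay, rng, p=P, sigma=SIGMA, period=T):
    """Spike times (ms) on one input line: each cycle fires with probability p,
    at the preferred time m * period + delay plus Gaussian jitter."""
    cycles = np.nonzero(rng.random(n_cycles) < p)[0]
    times = cycles * period + delay + sigma * rng.standard_normal(cycles.size)
    return np.sort(times)

rng = np.random.default_rng(0)
spikes = phase_locked_spikes(n_cycles=5000, delay=0.05, rng=rng)

# Preferred phase = direction of the mean resultant of the spike phases
phases = 2 * np.pi * (spikes % T) / T
preferred = float(np.angle(np.mean(np.exp(1j * phases))) % (2 * np.pi))
print(f"{spikes.size} spikes in 5000 cycles, preferred phase {preferred:.2f} rad")
```

Most cycles are skipped and individual spikes are jittered, yet the preferred phase recovered from the spike train lies close to the phase set by the delay, here a quarter of a period.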
As a neuron model for i we take an integrate-and-fire unit with membrane time constant $\tau_m$ and synaptic time constant $\tau_s$. From experiments on chickens it is known that the duration of an EPSP in the NL is remarkably short (< 1ms) (Reyes et al., 1994,1996). Neurons of an auditory specialist like the barn owl may be even faster. In our model equations, we have set $\tau_m = \tau_s = 0.1$ ms. These values correspond to an EPSP with a duration of about 0.25ms.
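The quoted EPSP duration can be checked numerically. The sketch below assumes that the EPSP has the alpha-function shape obtained when an exponentially decaying synaptic current is filtered by a leaky membrane with the same time constant (0.1 ms for both), and it takes "duration" to mean the full width at half maximum; both are assumptions made for illustration.

```python
import numpy as np

TAU = 0.1  # ms, common value of the membrane and synaptic time constants

def epsp(t, tau=TAU):
    """Alpha-shaped EPSP, normalised to a peak of 1 at t = tau (zero for t <= 0)."""
    t = np.asarray(t, dtype=float)
    return np.where(t > 0, (t / tau) * np.exp(1.0 - t / tau), 0.0)

t = np.linspace(0.0, 1.0, 100_001)        # ms, 10 ns resolution
values = epsp(t)
peak_time = float(t[np.argmax(values)])
above_half = t[values >= 0.5]
width = float(above_half[-1] - above_half[0])   # full width at half maximum
print(f"time to peak = {peak_time:.3f} ms, half-max width = {width:.3f} ms")
```

Under these assumptions the EPSP peaks after 0.1 ms and its half-maximum width is roughly a quarter of a millisecond, in line with the duration stated above.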
The short duration of EPSPs in neurons in the NL and NM is due to an outward rectifying current which sets in when the membrane potential exceeds the resting potential (Manis and Marx, 1991; Oertel, 1983). The purely passive membrane time constant is in the range of 2ms (Reyes et al., 1994), but the outward rectifying current reduces the effective membrane resistance whenever the voltage is above the resting potential. In a conductance-based neuron model (cf. Chapter 2), all membrane currents would be described explicitly. In our integrate-and-fire model, the main effect of the outward rectifying current is taken into account by working with a short effective membrane time constant $\tau_m = 0.1$ ms. A membrane time constant of 0.1ms is much shorter than that found in cortical neurons where 10-50ms seem to be typical values; see, e.g., Bernander et al. (1991). Note, however, that for temporal coding in the barn owl auditory system, $\tau_m = 0.1$ ms is quite long as compared to the precision of phase locking of 25 μs found in auditory neurons and necessary for successful sound source localization.
To get an intuitive understanding of how phase locking arises, let us study an idealized situation and take perfectly coherent spikes as input to our model neuron; cf. Fig. 12.14. Specifically, let us consider a situation where 100 input lines converge on the model neuron. On each line, spike arrival is given by Eq. (12.13) with $\sigma \to 0$ and p = 0.2. If the delays are the same for all transmission lines ($\Delta_j = \Delta_0$), then in each cycle a volley of 20±5 synchronized spikes arrives. The EPSPs evoked by those spikes are added as shown schematically in Fig. 12.14. The output spike occurs when the membrane potential crosses the threshold $\vartheta$. Note that the threshold must be reached from below. It follows that the output spike must always occur during the rise time of the EPSPs generated by the last volley of spikes before firing.
Since the input spikes are phase-locked to the stimulus, the output spike will also be phase-locked to the acoustic waveform. The preferred phase of the output spike will, of course, be slightly delayed with respect to the input phase $\varphi_0 = \Delta_0\,(2\pi/T)$. The typical delay will be less than the rise time $\tau_{\rm rise}$ of an EPSP. Thus, $\varphi_{\rm out} = (\Delta_0 + 0.5\,\tau_{\rm rise})\,(2\pi/T)$ will be a reasonable estimate of the preferred output phase.
Can we transfer the above qualitative arguments to a more realistic scenario? We have simulated a neuron with 154 input lines. At each synapse spikes arrive at a time-dependent rate as in Eq. (12.13). The temporal jitter has been set to $\sigma = 40$ μs. The delays $\Delta_j$ (and hence the preferred phases) have a jitter of 35 μs around some mean value $\Delta_0$. As before, p = 0.2 for all inputs.
A short interval taken from a longer simulation run with these input parameters is shown in Fig. 12.15. Part A shows the membrane potential u(t) as a function of time; Fig. 12.15B and C show the distribution of spike arrival times. Even though spike arrival is rather noisy, the trajectory of the membrane potential exhibits characteristic periodic modulations. Hence, following the same arguments as in Fig. 12.14 we expect the output spike to be phase-locked. Fig. 12.16A confirms our expectations: the distribution of output phases exhibits a pronounced peak. The width of the distribution corresponds to a temporal precision of $\sigma_{\rm out} = 25$ μs, a significant increase in precision compared to the input jitter $\sigma = 40$ μs.
So far we have assumed that the delays $\Delta_j$ have a small variation of 35 μs only. Hence the preferred phases $\varphi_j = \Delta_j\,(2\pi/T)$ are nearly identical for all input lines. If the preferred phases are drawn stochastically from a uniform distribution over $[0, 2\pi]$, then spike arrival at the neuron is effectively incoherent, even though the spikes on each input line exhibit phase-locking. If input spikes arrive incoherently, the temporal precision is lost and the output spikes have a flat phase distribution; see Fig. 12.16B.
We conclude that spiking neurons are capable of transmitting phase information if input spikes arrive with a high degree of coherence. If input spikes arrive incoherently, the temporal information is lost. As we will see in the following subsection, this observation implies that the reliable transmission of temporal codes requires a mechanism for delay-line tuning.
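The contrast between coherent and incoherent spike arrival can be made concrete at the level of the summed postsynaptic potential. The sketch below is not the simulation behind Figs. 12.15 and 12.16: it ignores threshold and reset, assumes an alpha-shaped EPSP with a time constant of 0.1 ms, lets each input line fire at most once per cycle with probability p = 0.2, and measures only how strongly the summed EPSPs are modulated at the stimulus frequency. All function names are hypothetical.

```python
import numpy as np

T, SIGMA, P, TAU = 0.2, 0.04, 0.2, 0.1    # ms; P is the spike probability per cycle
N_LINES, N_CYCLES, DT = 154, 2000, 0.005  # DT in ms (40 bins per period)

def arrival_times(delays, rng):
    """Pooled spike arrival times (ms) over all input lines."""
    fired = rng.random((delays.size, N_CYCLES)) < P
    line, cycle = np.nonzero(fired)
    return cycle * T + delays[line] + SIGMA * rng.standard_normal(line.size)

def relative_modulation(times):
    """Amplitude of the stimulus-frequency component of the summed EPSPs,
    divided by the mean potential."""
    n_bins = int(round(N_CYCLES * T / DT))
    counts, _ = np.histogram(times, bins=n_bins, range=(0.0, N_CYCLES * T))
    s = np.arange(0.0, 1.0, DT)
    kernel = (s / TAU) * np.exp(1.0 - s / TAU)      # alpha-shaped EPSP
    u = np.convolve(counts, kernel)[:n_bins]        # causal sum of EPSPs
    keep = slice(int(round(5.0 / DT)), n_bins)      # drop the start-up transient
    t = (np.arange(n_bins) * DT)[keep]              # spans a whole number of periods
    fourier = np.mean(u[keep] * np.exp(-2j * np.pi * t / T))
    return float(2 * np.abs(fourier) / np.mean(u[keep]))

rng = np.random.default_rng(1)
coherent = 2.5 + 0.035 * rng.standard_normal(N_LINES)   # 35 us delay jitter
incoherent = 2.5 + T * rng.random(N_LINES)              # preferred phases uniform over one period
mod_coherent = relative_modulation(arrival_times(coherent, rng))
mod_incoherent = relative_modulation(arrival_times(incoherent, rng))
print(f"coherent: {mod_coherent:.3f}, incoherent: {mod_incoherent:.3f}")
```

With tuned delays the potential retains a clear periodic component on which a threshold can act; with uniformly distributed preferred phases that component largely averages out, even though every single input line is as well phase-locked as before.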
Each neuron in the nucleus laminaris (NL) of the barn owl receives input from about 150 presynaptic neurons (Carr and Konishi, 1990; Carr, 1993). The high degree of convergence enables the neuron to increase the signal-to-noise ratio by averaging over many (noisy) transmission lines. As we have seen in the preceding section, the temporal precision of phase locking is indeed increased from 40 μs in the input lines to 25 μs in the output of our model neuron in the NL.
Such an averaging scheme, however, can work only if the preferred phases of all input lines are (nearly) the same. Otherwise the temporal precision is decreased or even lost completely as shown in Fig. 12.16B. To improve the signal-to-noise ratio, `phase-correct' averaging is needed. The question arises of how a neuron in the NL can perform correct averaging.
The total delay from the ear to the NL has been estimated to be in the range of 2-3ms (Carr and Konishi, 1990). Even if the transmission delays vary by only 0.1-0.2ms between one transmission line and the next, the phase information of a 5kHz signal is completely lost when the signals arrive at the NL. Therefore the delays must be precisely tuned so as to allow the neurons to perform phase-correct averaging.
Precise wiring of the auditory connections could be set up genetically. This is, however, rather unlikely since the owl's head grows considerably during development. Moreover, while neurons in the nucleus laminaris of the adult owl are sensitive to the interaural phase difference, no such sensitivity was found for young owls (Carr, 1995). This indicates that delay tuning arises only later during development. It is clear that there can be no external supervisor or controller that selects the appropriate delays. What the owl needs is an adaptive mechanism which can be implemented locally and which achieves a tuning of appropriate delays.
Tuning can be achieved either by selection of appropriate delay lines (Gerstner et al., 1996a) or by changes in axonal parameters that influence the transmission delay along the axon (Eurich et al., 1999; Senn et al., 2001a; Hüning et al., 1998). We focus on learning by delay selection; cf. Fig. 12.17. Immediately after birth a large number of connections are formed. We suppose that during an early period of post-natal development a tuning process takes place which selectively reinforces transmission lines with similar preferred phase and eliminates others (Gerstner et al., 1996a). The selection process can be implemented by a spike-time dependent Hebbian learning rule as introduced in Chapter 10.
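In order to illustrate delay selection in a model study, we assume that both ears are stimulated by a pure 5kHz tone with interaural time difference ITD=0. The effect of stimulation is that spikes arrive at the synapses with a periodically modulated rate $\nu_j(t)$ as given by Eq. (12.13). During learning, synaptic weights are modified according to a spike-time dependent Hebbian learning rule

$$\frac{d}{dt} w_{ij}(t) = S_j(t) \left[ a_1^{\rm pre} + \int_0^\infty W(s)\, S_i(t-s)\, ds \right] + S_i(t) \int_0^\infty W(-s)\, S_j(t-s)\, ds \qquad (12.14)$$

where $S_j(t) = \sum_f \delta(t - t_j^{(f)})$ is the presynaptic spike train, $S_i(t) = \sum_f \delta(t - t_i^{(f)})$ is the postsynaptic spike train, and $W(s)$ is the learning window as a function of the time difference $s = t_j^{(f)} - t_i^{(f)}$ between presynaptic spike arrival and postsynaptic firing.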
The non-Hebbian term $a_1^{\rm pre}$ in Eq. (12.14) is taken as small but positive. The learning window W(s) is the one shown in Fig. 12.18. It has a negative integral $\bar{W} = \int W(s)\, ds < 0$ and a maximum at s* = -0.05ms. The choice $s^* = -\tau_{\rm rise}/2$ guarantees stable learning (Gerstner et al., 1996a). As we have seen in Section 11.2.3, the combination of a learning window with negative integral and a positive non-Hebbian term $a_1^{\rm pre}$ leads to a stabilization of the postsynaptic firing rate. Thus the postsynaptic neuron remains in the subthreshold regime during learning, where it is most sensitive to temporal coding; cf. Section 5.8. The rate stabilization induces in turn an effective competition between different synapses. Thus, we expect that some synapses grow at the expense of others that must decay.
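How such a rule separates well-timed from badly timed input lines can be illustrated with a pair-based batch update. This is a sketch under stated assumptions, not the simulation of the text: the window shape (a narrow positive peak at s* on top of a broader negative trough) is a hypothetical stand-in for the window of Fig. 12.18, chosen only because it has a negative integral and a maximum at s* = -0.05 ms, and the values of `A1_PRE` and `ETA` are arbitrary.

```python
import numpy as np

S_STAR = -0.05    # ms, position of the window maximum (from the text)
W_MAX = 3.0       # upper bound on the synaptic efficacy (from the text)
A1_PRE = 0.01     # assumed small positive non-Hebbian term
ETA = 0.05        # assumed learning rate

def window(s):
    """Hypothetical learning window W(s), with s = t_pre - t_post in ms."""
    x = np.asarray(s, dtype=float) - S_STAR
    return np.exp(-x**2 / (2 * 0.03**2)) - 0.6 * np.exp(-x**2 / (2 * 0.06**2))

def updated_weight(w, pre_times, post_times):
    """Add A1_PRE once per presynaptic spike and W(s) once per pre/post pair,
    scale by ETA, then clip the weight to [0, W_MAX]."""
    pre = np.asarray(pre_times, dtype=float)
    post = np.asarray(post_times, dtype=float)
    s = pre[:, None] - post[None, :]
    dw = ETA * (A1_PRE * pre.size + window(s).sum())
    return float(np.clip(w + dw, 0.0, W_MAX))

grid = np.arange(-2.0, 2.0, 0.001)
window_integral = float(window(grid).sum() * 0.001)   # negative by construction
window_peak = float(grid[np.argmax(window(grid))])    # close to S_STAR

post = np.arange(1, 50) * 0.2                         # output spikes locked to a 5 kHz tone
w_in_phase = updated_weight(1.0, post + S_STAR, post)            # input 50 us before each output spike
w_out_of_phase = updated_weight(1.0, post + S_STAR + 0.1, post)  # input shifted by half a period
print(window_integral, window_peak, w_in_phase, w_out_of_phase)
```

In this toy setting the line whose spikes arrive shortly before the postsynaptic spikes is strengthened, while the line shifted by half a period is weakened, which is the kind of competition between synapses described above.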
The results of a simulation run are shown in Fig. 12.19. Before learning the neuron receives input over about 600 synapses from presynaptic neurons. Half of the input lines originate from the left, the other half from the right ear. The total transmission delays are different between one line and the next and vary between 2 and 3ms. At the beginning of learning all synaptic efficacies have the same strength $w_{ij} = 1$ for all j. The homogeneous weight distribution becomes unstable during learning (Fig. 12.19, Middle). The instability can be confirmed analytically using the methods developed in Chapter 11 (Kempter, 1997). After learning the synaptic efficacies have either approached the upper bound $w^{\max} = 3$ or decayed to zero. The transmission lines which remain after learning have either very similar delays, or delays differing by a full period (Fig. 12.19, Bottom). Thus, the remaining delays form a consistent pattern that guarantees that spikes arrive with a high degree of coherence.
The sensitivity of the output firing rate to the interaural time difference (ITD) and the degree of phase locking were tested before, during, and after learning (right column in Fig. 12.19). Before learning, the neuron shows no sensitivity to the ITD. This means that the neuron is not a useful coincidence detector for the sound source localization task. During learning ITD sensitivity develops similar to that found in experiments (Carr, 1995). After learning the output rate is significantly modulated as a function of ITD. The response is maximal for ITD=0, the ITD used during learning. The form of the ITD tuning curves corresponds to experimental measurements.
To test the degree of phase locking in the output we have plotted the vector strength, vs, as a function of ITD. By definition the vector strength is proportional to the first Fourier component of the histogram of phase distributions; cf. Fig. 12.16. It is therefore a suitable measure of phase-locking. The vector strength is normalized so that vs = 1 indicates perfect phase locking (infinite temporal precision or $\sigma_{\rm out} = 0$). Let us focus on the value of vs in the case of optimal stimulation (ITD=0). Before learning $vs \approx 0.1$, which indicates that there is no significant phase locking. The value of $vs \approx 0.8$ found after learning confirms that after the tuning of the synapses, phase locking is very pronounced.
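The vector strength is easy to compute from a list of spike times. The sketch below uses the standard definition as the length of the mean resultant vector of the spike phases, together with the relation vs = exp(-(2πσ/T)²/2) that holds when spike times are scattered around the preferred phase with Gaussian jitter σ; the surrogate spike trains are generated only for the example.

```python
import numpy as np

def vector_strength(spike_times, period):
    """|<exp(i * phase)>|: 1 for perfect phase locking, near 0 for a flat phase histogram."""
    phases = 2 * np.pi * np.asarray(spike_times, dtype=float) / period
    return float(np.abs(np.mean(np.exp(1j * phases))))

def vs_gaussian_jitter(sigma, period):
    """Expected vector strength for Gaussian spike-time jitter of width sigma."""
    return float(np.exp(-0.5 * (2 * np.pi * sigma / period) ** 2))

T = 0.2  # ms, period of a 5 kHz tone
rng = np.random.default_rng(2)
locked = np.arange(1000) * T + 0.025 * rng.standard_normal(1000)  # 25 us jitter
flat = rng.uniform(0.0, 1000 * T, size=1000)                      # no phase preference
vs_locked = vector_strength(locked, T)
vs_flat = vector_strength(flat, T)
print(f"locked: {vs_locked:.2f} (theory {vs_gaussian_jitter(0.025, T):.2f}), flat: {vs_flat:.2f}")
```

Under the Gaussian assumption, a jitter of 25 μs at 5kHz corresponds to a vector strength of roughly 0.7, whereas a jitter of 40 μs gives a value below 0.5, so the vector strength and the temporal precision quoted in the text are two views of the same quantity.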
To summarize, spike-time dependent Hebbian synaptic plasticity selects delay lines so that spikes arrive with maximal coherence. After learning the postsynaptic neuron is sensitive to the interaural time difference, as it should be for neurons that are used for sound source localization. A temporal resolution in the range of a few microseconds can be achieved even though the membrane time constant and synaptic time constant are in the range of 100 microseconds.
© Cambridge University Press
This book is in copyright. No reproduction of any part
of it may take place without the written permission
of Cambridge University Press.