![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image28.png]]
### The Physics of Sound
Sound is a vibration that propagates as an acoustic wave through a transmission medium such as air. Audition is the ability of our evolved sensory organs to perceive these changes in air pressure, which are then translated into our sensation of sound.
**Amplitude** and **Wavelength** are the parameters of the waves generated by the vibrating body.
**Measuring Sound Intensity**
Sound intensity is related to the amplitude of the pressure waves; decibels provide a perceptually meaningful (logarithmic) measure of it.
We are sensitive to an enormous range of intensities, so a logarithmic scale works well: intensity in dB $= 20 \log_{10}(P_1/P_2)$, where $P_2 = 20\,\mu\text{Pa}$ is the **reference air pressure**, i.e., the minimal air pressure difference that humans can detect.
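As a quick illustration (a minimal sketch; the function name and the example pressure values are my own), dB SPL can be computed directly from this formula:

```python
import math

P_REF = 20e-6  # reference pressure: 20 micropascals, the threshold of human hearing

def db_spl(pressure_pa: float) -> float:
    """Convert a sound pressure in pascals to decibels SPL."""
    return 20 * math.log10(pressure_pa / P_REF)

print(db_spl(P_REF))  # 0 dB SPL: the threshold of hearing itself
print(db_spl(0.02))   # ~60 dB SPL: roughly the level of normal conversation
```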
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image29.png]]
From the graph above, it can be seen that, at a given frequency, cats can hear sounds at lower intensities than humans. In addition, the frequencies that require the lowest intensity to be heard by humans are those of human speech, to which our ear is well tuned.
**Fourier Analysis**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image30.png]]
"Any complex waveform can be represented as the sum of a series of sine waves of different frequencies and amplitudes". This is very important to understand how the cochlea works because it seems to be performing a type of Fourier analysis, by decomposing the sound waves."
### The Human Ear
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image31.png]]
Key parts:
- **External Ear** (pinna): collects sound waves and channels them into the ear canal, where the sound is amplified.
- **Tympanic Membrane** (eardrum): vibrates in response to incoming sound waves.
- **Ossicles**: the middle-ear chamber is filled with air. The ossicles are the malleus, incus, and stapes; these three bones connect the tympanic membrane to the inner ear, allowing transmission of sound waves.
- **Eustachian tube**: equalizes air pressure between the middle ear and the environment.
- **Cochlea**: filled with fluid; it transforms changes in pressure into electrical signals.
**The Middle Ear**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image32.png]]
The whole job of the ossicles is to transmit the vibration of the tympanic membrane to the cochlea. The middle ear matches the impedance difference that arises from the change from the air environment (tympanic membrane & ossicles) to the fluid environment of the cochlea. The muscles of the middle ear limit the range of motion of the ossicles to protect against high-intensity stimuli. That is why your hearing feels dulled after a loud concert: these muscles contract, stiffening the ossicular chain and reducing your sensitivity. In particular, one of the three ossicles, the stapes, presses onto the oval window of the cochlea to transmit the changes in air pressure.
**The Inner Ear**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image33.png]]
The key element of the inner ear is the **Organ of Corti**, contained in the middle canal (scala media). It can be thought of as the auditory analogue of the retina in the visual system: it contains the receptor cells that translate changes in pressure into electrical signals.
**Tympanic Membrane & Ossicular System**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image34.png]]
Sound stimuli pass through the pinna and external auditory canal to strike the **Tympanic Membrane (TM)**, causing it to vibrate. The **Ossicular System** conducts sound from the TM through the middle ear to the cochlea. The faceplate of the stapes pushes forward on the cochlear fluid (at the oval window) every time the TM and malleus move inward. The ossicular system provides **impedance matching** between sound waves in air and sound vibrations in the cochlear fluid (fluid has a much greater inertia than air). Most of the amplification occurs because the area of the TM is about 17x greater than the surface area of the stapes/oval window.
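A back-of-the-envelope version of this amplification (using common textbook values; the ~1.3x lever ratio of the ossicular chain is an assumption not stated above):

$$
\frac{P_{\text{oval window}}}{P_{\text{TM}}} \approx \frac{A_{\text{TM}}}{A_{\text{stapes}}} \times \text{lever ratio} \approx 17 \times 1.3 \approx 22
$$

so the pressure delivered to the cochlear fluid is roughly 20x that at the tympanic membrane.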
**The Mechanics of the Basilar Membrane**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image35.png]]
The basilar membrane lies inside the cochlea and vibrates in response to sound waves; it vibrates differentially along its length depending on the frequency of the stimulus. It is broader and more flexible at the far end (apex), and narrower and stiffer at the beginning (base). Low frequencies produce maximal vibration at the apex, while high frequencies do so at the base. Hair cells at different positions therefore respond to different frequencies, purely because of their location (a mechanical Fourier analysis).
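A compact way to see this place coding is Greenwood's place-frequency function for the human cochlea (a standard empirical fit from the literature, not from the lecture; the parameter values are Greenwood's published ones):

```python
def greenwood_frequency(x: float) -> float:
    """Greenwood place-frequency function for the human cochlea.

    x: relative position along the basilar membrane,
       0.0 = apex (far end), 1.0 = base (near the oval window).
    Returns the characteristic frequency in Hz at that position.
    """
    return 165.4 * (10 ** (2.1 * x) - 0.88)

for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"position {x:.2f} -> ~{greenwood_frequency(x):.0f} Hz")
# apex -> ~20 Hz (low frequencies), base -> ~20,000 Hz (high frequencies)
```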
The **Organ of Corti** is where vibrations are transduced into electrical signals. It sits on top of the basilar membrane and contains the **hair (auditory receptor) cells**, which generate **nerve impulses** in response to vibration of the basilar membrane. When the basilar membrane vibrates, the tectorial membrane changes position, which makes the hair cells move back and forth.
- **Inner Hair Cells**: single row, provide fine **auditory discrimination**. 90% of auditory nerve fibers innervate these cells.
- **Outer Hair Cells**: three rows, detect the **presence of sound**. (Less important for audition).
The hair cells contain **stereocilia**, which protrude into the overlying **tectorial membrane**.
**Auditory Transduction**
The **up-and-down** motion of the basilar membrane causes the Organ of Corti to vibrate up-and-down, which, in turn causes the stereocilia to bend **back-and-forth**.
**Polarization of the Stereocilia**
- (B) When the Organ of Corti moves **upward**, the stereocilia bend **away** from the limbus and the hair cells **depolarize**.
- (C) When the Organ of Corti moves **downward**, the stereocilia bend **toward** the limbus and the hair cells **hyperpolarize**.
**Transduction at Hair Cells**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image37.png]]
- High concentration of K+ outside the cell (in the endolymph).
- Bending of the stereocilia opens the channel.
- The resulting K+ influx causes a change in the membrane potential.
**Receptor Potential**
The hair cells are depolarized by the movement of K+ ions into the cell:
- The **endolymph** has a high K+ concentration and is **electrically positive**. The **hair cells** also have a high internal K+ concentration, but their interior is **electrically negative** (Na+/K+ pumps), and their K+ concentration is slightly lower than outside. Hence there is a driving force for **K+ into the cells** (see the worked calculation after this list).
- When the **stereocilia bend away** from the limbus, they cause K channels to **open**. K+ then flows into the cell and the **hair cell depolarizes**.
- When the **stereocilia bend towards** the limbus, they cause K channels to **close** and the **hair cell hyperpolarizes**.
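To see why K+ flows *into* the cell (unusual for neurons), consider the electrical driving force, using common textbook values (the specific voltages below are assumptions, not from the lecture). The endolymph sits at about +80 mV (the endocochlear potential) and the hair-cell interior at about −45 mV; since K+ concentration is high on both sides of the apical membrane, its equilibrium potential there is near 0 mV, so nearly the full potential difference drives K+ inward:

$$
V_{\text{driving}} \approx V_{\text{endolymph}} - V_{\text{hair cell}} \approx (+80\ \text{mV}) - (-45\ \text{mV}) = 125\ \text{mV}
$$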
**Release of Synaptic Transmitter**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image38.png]]
- When the hair cell depolarizes, a Ca channel opens, allowing **calcium** to enter the cell. Calcium initiates the release of synaptic transmitter, which stimulates the auditory nerve fiber.
- The cell bodies of the auditory nerve fibers are located within the **spiral ganglion**. Their axons join those from the vestibular apparatus to form the **vestibulocochlear nerve**.
The tuning curves show the minimal sound intensity required for single units in the cochlear nerve to respond. Individual fibers show clear frequency preferences.
### Encoding of Auditory Information
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image39.png]]
How does what we have seen so far translate into the encoding of auditory information? Neighboring cells in the cortex tend to prefer the same or neighboring frequencies.
- **Place principle of f determination**: the frequency (f) of a sound that activates a particular hair cell depends on the location of that hair cell along the basilar membrane. This spatial organization is maintained all the way to the cerebral cortex: in the auditory cortex, specific neurons are activated by specific sound frequencies (**tonotopic organization**). The brain knows which frequencies compose a sound from which hair cells in the cochlea are moving.
- **Volley principle of f determination**: low frequencies are discriminated by firing of the auditory nerve fibers at the same frequency as the sound wave; the firing rate of the cell corresponds to the frequency of the sound. If the frequency of the sound is too high, i.e., a single neuron cannot fire that many APs per second, two cells can coordinate and fire alternately (see the sketch after this list).
- **Loudness**. As the amplitude of vibration increases, a larger proportion of the basilar membrane vibrates, causing more and more of the hair cells to move. This leads to **spatial summation** of impulses and transmission through a greater number of nerve fibers.
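A toy sketch of the volley principle (purely illustrative; all parameter values are invented): two fibers, each limited to about 500 spikes/s, lock onto alternate cycles of a 1000 Hz tone, so their pooled spike train still marks every cycle of the stimulus.

```python
# Toy model of the volley principle: each fiber fires on every *other* cycle
# (max ~500 Hz), but the two fibers together encode the full 1000 Hz tone.
freq = 1000.0          # stimulus frequency (Hz)
period = 1.0 / freq    # one cycle = 1 ms
n_cycles = 10

fiber_a = [i * period for i in range(0, n_cycles, 2)]  # spikes on even cycles
fiber_b = [i * period for i in range(1, n_cycles, 2)]  # spikes on odd cycles

pooled = sorted(fiber_a + fiber_b)                     # combined spike train
interval = pooled[1] - pooled[0]                       # pooled inter-spike interval

print(f"each fiber fires at {1 / (2 * period):.0f} Hz")      # 500 Hz per fiber
print(f"pooled train encodes a {1 / interval:.0f} Hz tone")  # 1000 Hz overall
```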
Signals from both ears are transmitted to both sides of the brain, with a preponderance of the contralateral pathway. Many collateral fibers project to the reticular activating system (RAS) of the brain stem (arousal by loud sounds). **Tonotopic organization** is maintained from the cochlea to the auditory cortex: high-frequency sounds excite neurons at one end, whereas low-frequency sounds excite neurons at the opposite end. The primary auditory cortex is excited by the MGN, whereas the auditory association areas are excited secondarily by impulses from the primary auditory cortex.
The circuits are quite complex because we use these circuits for sound localization.
**Tonotopic Organization**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image40.png]]
Different frequencies are represented in a regular spatial order; this is the tonotopic organization.
**Discrimination of "Sound Patterns" by the Primary and Secondary Auditory Cortex**
Destruction of both (but not just one) primary auditory cortices greatly reduces one's hearing sensitivity. Interpretation of the meaning and sequence of tones in auditory signals is a function of the secondary auditory cortex.
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image41.png]]
There is a close association between the areas involved in the processing of auditory information (**Auditory Cortex**) and those involved in processing language (**Wernicke's Area**).
**Physiology and Psychophysics**
There is a complicated relation between the physics of the signals and what we perceive.
- Cochlea performs mechanical spectral analysis of sound signal.
- Pure tone induces traveling wave in basilar membrane, maximum mechanical displacement along membrane is function of frequency (place coding).
- Displacement of basilar membrane changes with compression and rarefaction (frequency coding).
- Pitch is to frequency as loudness is to intensity.
**Perception of Pitch**
- Along the basilar membrane, hair cell response is tuned to frequency
- Each neuron in the auditory nerve responds to acoustic energy near its preferred frequency.
- Preferred frequency is place-coded along the cochlea; frequency (temporal) coding is believed to play a role at lower frequencies.
- Higher auditory centers maintain frequency selectivity and are "tonotopically mapped".
- Pitch is related to frequency for pure tones.
- For periodic or quasi-periodic sounds, the pitch typically corresponds to the inverse of the period.
- Some sounds have no perceptible pitch (e.g., clicks, noise).
- Sounds can have the same pitch but different spectral content and temporal envelope: this is *timbre*. Timbre is what allows us to hear the difference between the same note played by different instruments.
**Perception of Loudness**
- Intensity is measured on a logarithmic scale in decibels.
- Range from threshold to pain is about 120 dB-SPL (Sound Pressure Level).
- Loudness is related to intensity but also depends on many other factors (attention, frequency, harmonics, ...). Attention matters because you can isolate particular frequencies and/or locations (e.g., a trumpet in a band).
### Spatial Hearing
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image42.png]]
- Auditory events can be perceived in all directions from observer.
- Auditory events can be localized internally or externally at various distances.
- Audition also supports motion perception:
    - Change in direction.
    - Doppler shift: a source coming toward you is perceived as having a higher pitch than a source moving away from you (see the formula below).
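For a source moving at speed $v_s$ straight toward a stationary listener, the standard physics formula (with $c \approx 343$ m/s, the speed of sound in air) gives the perceived frequency:

$$
f' = f \, \frac{c}{c - v_s}
$$

so $f' > f$ (higher pitch) for an approaching source; with $c + v_s$ in the denominator, $f' < f$ for a receding one.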
**Cocktail Party Effect**
- In environments with many sound sources it is easier to process auditory streams if they are separated spatially.
- Spatial sound techniques can help in sound discrimination, detection and speech comprehension in busy immersive environments.
**Spatial Auditory Cues**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image43.png]]
Two basic types of head-centric direction cues:
- Binaural cues.
- Spectral cues: changes in the frequency spectrum of a sound that depend on where it is coming from.
**Binaural Directional Cues**
- When a source is located eccentrically, it is closer to one ear than the other.
- The sound arrives later at the farther ear: **Interaural Time Difference** (**ITD**).
- The head "shadow" also weakens the sound arriving at the farther ear: **Interaural Intensity Difference** (**IID**).
- Binaural cues are robust but ambiguous.
**Interaural Time Differences (ITD)**
- ITD increases with directional deviation from the median plane; it is about 600 μs for a source located directly to one side (see the sketch after this list).
- Humans are sensitive to ITDs as small as 10 μs. Sensitivity decreases as ITD increases.
- For a given ITD, phase difference is a linear function of frequency.
- For pure tones, phase-based ITD is ambiguous (imagine listening to a sinusoidal wave with an ITD between the two ears: it is difficult to "match the peaks", because a given peak at one ear could correspond to either an earlier or a later peak at the other).
- At low to moderate frequencies phase difference can be detected. At high frequencies we can use ITD in signal envelope.
- ITD cues appear to be integrated over a window of 100-200ms (binaural sluggishness).
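A common first-order model of how ITD grows with azimuth is Woodworth's spherical-head formula (a classic approximation, not from the lecture; the head radius below is a typical assumed value):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, in air at room temperature
HEAD_RADIUS = 0.0875    # m, typical assumed human head radius

def itd_woodworth(azimuth_deg: float) -> float:
    """Woodworth's spherical-head approximation of the ITD, in seconds.

    azimuth_deg: source direction; 0 = straight ahead, 90 = directly to one side.
    """
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

for az in (0, 30, 60, 90):
    print(f"azimuth {az:>2} deg -> ITD ~ {itd_woodworth(az) * 1e6:.0f} us")
# azimuth 90 deg gives ~655 us, consistent with the ~600 us figure above
```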
**Interaural Intensity Differences (IID)**
- With lateral sources head shadow reduces intensity at opposite ear.
- Effect of head shadow is most pronounced for high frequencies.
- IID cues are most effective above about 2000 Hz.
- IIDs of less than 1 dB are detectable. At 4000 Hz, a source located at 90 degrees gives about a 30 dB IID.
**Ambiguity and Lateralization**
![[ETH/ETH - Systems Neuroscience/Images - ETH Systems Neuroscience/image44.png]]
- These binaural cues are ambiguous. The same ITD/IID can arise from sources anywhere along a "cone of confusion".
- Spectral cues and changes in ITD/IID with observer/object motion can help disambiguate.
- When directional cues are used in headphone systems, sounds are *lateralized* left versus right but seem to emanate from inside the head (not localized).
- For near sources (less than 1 m) there is also significant IID, due to the difference in distance to each ear, even at lower frequencies.
- The intersection of these "near field" IID curves with the cones of confusion constrains possible source locations to toroids of confusion.
**Spectral Cues**
- The pinnae (outer ears) and the head shadow each ear, creating a frequency-dependent attenuation of sounds that depends on the direction of the source.
- Because the pinnae are relatively small, spectral cues are effective predominantly at higher frequencies (above about 6000 Hz).
- Direction estimation requires separation of spectrum of sound source from spectral shaping by the pinnae.
- The shape of the pinna shows large individual differences, which are reflected in differences in spectral cues.