Biology of Auditory Perception

Anatomy and Neurology

The cerebrum is divided into two hemispheres, each comprising four lobes: the frontal, parietal, occipital, and temporal lobes. The frontal lobe manages planning, organises perception, and governs spatial apprehension. The parietal lobe is involved in sensation and perception of various signals from the skin, ears and eyes. The occipital lobe specialises in visual processing and spatial apprehension in conjunction with the frontal lobe. The oldest part of the brain is the cerebellum, which governs primal instincts and is located beneath the occipital and temporal lobes, close to the brain stem. The cerebellum controls motor and coordination functions, as well as emotional responses (BALL, Phillip, 2011). The primary auditory cortex resides in the temporal lobe, where sound stimuli and speech semantics are first received from the ear. The temporal lobe also houses the hippocampus (long-term memory) and the amygdala (the emotional centre).

The Anatomy of the Brain: major sound and music computational centres of the brain. [Right] Side view: the front of the brain is to the left. [Left] Cross section: same orientation. The primary auditory cortex is the first area to receive and decode pitch. It is tonotopically mapped. The cerebellum is the oldest part of the brain. It is responsible for the primal response to sound (startle and movement). Diagram adapted from original illustrations by Mark Tramo 2001 (LEVITIN, Daniel J, 2007) and (BALL, Phillip, 2011).

Sound is apprehended as air vibrations. When an object is excited (e.g., struck or driven by an oscillating electromagnetic field), it vibrates at its fundamental frequency. This vibration induces oscillations (successive compressions and rarefactions) in the immediately surrounding air. These particle vibrations propagate from the object to the outer ear (BALL, Phillip, 2011). The funnel-like form of the outer ear channels the waves through the ear canal toward the eardrum.

Simplified longitudinal anatomical cross section of the human ear. The ear transforms air vibrations into electrochemical brain signals through a series of mechanical and hydrodynamic systems. The diagram is an adaptation of the Swiss National Sound Archives’ original (FONOTECA NAZIONALE SVIZZERA).

The small Eustachian tube connects the middle ear to the back of the throat and equalises the atmospheric pressure on both sides of the eardrum. This equilibrium creates a state in which the membrane fluctuates only in response to the minute changes in pressure caused by sound waves. The eardrum is connected to the oval window by a system of small bones in the middle ear (the malleus, incus and stapes, collectively known as the ossicles). The oval window separates the air-filled middle ear from the liquid-filled inner ear. The ossicles act as a mechanical lever system that amplifies the force of eardrum deflections and passes the motion on to the oval window. The deflection of the oval window transfers the mechanical vibrations to the hydromechanical system of the inner ear (TURNER, John and Pretlove, A.J., 1991).
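The amplification described above can be illustrated with a rough worked calculation. The sketch below assumes the usual textbook simplification that the pressure gain of the middle ear is the product of two ratios: the area of the eardrum over the area of the stapes footplate, and the lever ratio of the ossicles. The three numerical values are illustrative assumptions and do not come from the sources cited in this section; the function names are hypothetical.

```python
import math

# Illustrative values only (assumptions, not taken from the cited sources).
EARDRUM_AREA_MM2 = 55.0           # effective vibrating area of the eardrum
STAPES_FOOTPLATE_AREA_MM2 = 3.2   # area of the stapes footplate at the oval window
OSSICLE_LEVER_RATIO = 1.3         # mechanical advantage of the ossicular lever


def middle_ear_pressure_gain(eardrum_area, footplate_area, lever_ratio):
    """Return the eardrum-to-oval-window pressure gain as a plain ratio."""
    if min(eardrum_area, footplate_area, lever_ratio) <= 0:
        raise ValueError("all inputs must be positive")
    # The same force concentrated on a smaller area gives a higher pressure;
    # the lever action of the ossicles multiplies the force further.
    return (eardrum_area / footplate_area) * lever_ratio


def to_decibels(pressure_ratio):
    """Express a pressure ratio in decibels (20 * log10 of the ratio)."""
    return 20.0 * math.log10(pressure_ratio)


gain = middle_ear_pressure_gain(
    EARDRUM_AREA_MM2, STAPES_FOOTPLATE_AREA_MM2, OSSICLE_LEVER_RATIO
)
print(f"pressure gain: {gain:.1f}x, or {to_decibels(gain):.1f} dB")
```

With these assumed values, most of the gain comes from the area ratio rather than from the lever itself. Different anatomical measurements would shift the result, so the figures should be read as an order-of-magnitude illustration.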

Aside from governing balance, the inner ear conducts an electrochemical process that translates acoustic vibrations into nerve signals and transmits them to the primary auditory cortex. The cochlea, a small, spiral, liquid-filled chamber lined with the basilar membrane and overlaid with sound-sensitive hair cells, manages this process. In response to vibrations, the cochlear hair cells oscillate within the enclosed fluid. This oscillating motion opens pores (ion channels) in the cell membranes, through which electrically charged metal atoms (ions) flow. The resulting change in electric state produces neural signals that travel through the cochlear nerve fibres to the brain. These cells are tonotopically mapped (spatially organised by frequency); that is, different hair cells respond to different frequencies. This configuration progresses in order along the basilar membrane: low frequencies resonate at one end (the apex) and high frequencies at the other end (the base, next to the oval window) (BALL, Phillip, 2011).
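The tonotopic layout can be made concrete with a place-to-frequency map. The sketch below uses the Greenwood function, a commonly used empirical fit that is not part of the sources cited in this section. The constants are the values usually quoted for the human cochlea and should be treated as assumptions; the function names are hypothetical.

```python
import math

# Greenwood-style place-frequency map for the human cochlea.
# The constants are commonly quoted human values (assumptions here).
A = 165.4     # scaling constant in Hz
ALPHA = 2.1   # slope, with position given as a fraction of membrane length
K = 0.88      # correction term that shapes the low-frequency end


def place_to_frequency(x):
    """Characteristic frequency in Hz at relative position x.

    x = 0.0 is the apex (low-frequency end) and x = 1.0 is the base
    (high-frequency end, next to the oval window).
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie between 0 and 1")
    return A * (10 ** (ALPHA * x) - K)


def frequency_to_place(f):
    """Inverse map: relative position (0 = apex, 1 = base) for f in Hz."""
    if f <= 0:
        raise ValueError("frequency must be positive")
    return math.log10(f / A + K) / ALPHA


# Print the characteristic frequency at five evenly spaced positions.
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"position {x:.2f} -> {place_to_frequency(x):8.1f} Hz")
```

Under these assumed constants the map runs from roughly 20 Hz at the apex to a little over 20 kHz at the base. Equal steps along the membrane correspond to roughly equal frequency ratios over most of its length, not to equal differences in hertz.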

Longitudinal section of an unrolled cochlea and an accompanying graph showing the response maxima: the sensitivity of different parts of the basilar membrane to a number of pure tonal frequencies. The basilar membrane is tonotopically mapped. Diagram adapted from the original by Bruel & Kjaer Ltd (TURNER, John and Pretlove, A.J., 1991).

Auditory cognition occurs when an audio signal progresses through mechanical and hydrodynamic states and ends as an electrochemical signal at the auditory cortex. The diaphragms, levers, and sensitive hairs enable the ear to cope with a frequency range of 20 Hz to 16-20 kHz (TURNER, John and Pretlove, A.J., 1991). Perception of sound depends on the decoding processes of the brain. To interpret the pitch of a sound instantly, each pitch-selective neuron in the primary auditory cortex is directly connected to, and dedicated exclusively to, a segment of the basilar membrane. This one-to-one neural mapping between perceptual stimulus and neuron has no equivalent in any other sense.

Some preliminary pitch and speech processing is carried out on the raw data in the brain stem. Electrochemical signals travel from the cochlea to the primary auditory cortex via the brain stem, and the primitive, subcortical brain is triggered immediately upon stimulus detection. The cerebellum decodes the rhythm, and the thalamus assesses the signal, ready to trigger a subconscious survival reflex. The thalamus then signals the amygdala to generate an emotional response (BALL, Phillip, 2011). This low-level decoding occurs before any complex cognitive processes, as a primary startle response of alertness.

Once the primal evaluation is complete, the high-level processes commence and continuously register neural projections from sensory receptors and low-level processing regions. This is termed bottom-up processing, in which properties of the collected signal are separable and can change independently. Different neural circuits manage the different information carried by the stimulus. Through a top-down process, high-level centres steadily update the input data, read only the overall cognitive information, and influence low-level modules. This two-way exchange integrates the signal attributes into a perceptual whole (LEVITIN, Daniel J, 2007).

The prefrontal cortex manages high-level processes such as awareness and expectations, which are the result of associations that the hippocampus creates by correlating the received signal with retained memories. The language centre (Broca's area) of the prefrontal cortex evaluates syntactic aspects of sound (e.g., speech or music) by transcribing pitch into language. The principal attribute in aural cognition is pitch analysis. Pitch concatenations encode an entire spectrum of sound dimensions, and the brain commissions separate modules to process each dimension. Pitch dissection leads to melody processing, harmonic structuring, and distinct voice and timbre identification. Pitch and rhythm are also the general audible characteristics required to decipher speech and ambient sound. Rhythm and event duration provide clues for pitch qualities and trigger motor responses (BALL, Phillip, 2011). This motor response explains the motivation to dance when listening to intricate rhythmic songs, and the emergent behaviour of falling into step on a crowded street (WITEK, Maria A. G. et al., 2014).