Musicology Concepts and Spatial Awareness

As neither the enjoyment nor the capacity of producing musical notes are faculties of the least direct use to man in reference to his ordinary habits of life, they must be ranked amongst the most mysterious with which he is endowed. They are present, though in a very rude and as it appears latent condition, in men of all races, even the most savage.
— Charles Darwin~1871

A graphic by Iannis Xenakis for Pithoprakta (1955-56), formalising musical note pitch and duration into vectorial algorithms. Image Reference: (SOLOMOS, Makis, 2013) (MATOSSIAN, Nouritza, 2005).

Typically, sound research is linked closely to other sciences such as musicology, the science of music. Although a sentimental notion exists that music is a product of numinous inspiration, scientists refer to mathematics and biology to understand the human response to music. Many musicology paradigms align with the hypothesis of aural design discourse. Consequently, these theories can be extrapolated for the intent of this study, which is to design ‘organised sound’ in urban space [1]. This section briefly discusses the evolution of music and its relation to human cognitive and social behaviours and considers research on the effect of music on biological and physiological responses.

While no dispute exists over the evolutionary purpose of the auditory sense and language, there is much controversy on whether the existence of music (and music making) has an evolutionary explanation. Scientists on both sides of the argument refer to the phrase ‘survival of the fittest’, which Darwin adopted from Herbert Spencer, and to Darwin’s two-volume publication The Descent of Man to make their opposing points. Cosmologist John Barrow states that music plays no role in the survival of a species. Psychologist Dan Sperber goes so far as to call music “an evolutionary parasite.” Sperber believes that music evolved in preparation for language.

Cognitive scientist and evolutionary psychologist, Steven Pinker, agrees with Sperber, and states that music is merely an evolutionary by-product of language. Pinker asserts that listening to music is a pleasure-seeking behaviour that circumvents the original evolutionary purpose and directly taps into a reward system (LEVITIN, Daniel J, 2007). Ball, a science writer, takes a neutral position and explains that the brain dedicates several regions to handle music, which implies that music serves an adaptive function (BALL, Phillip, 2011).

Levitin cites Darwin’s sexual selection concept and declares that music has an evolutionary purpose. Like other species, humans create organised sound as an exhibitionist behaviour. This determining reproductive phenomenon is associated with dexterity, coordination, determination, and good hearing genes. Levitin employs Sperber’s rationale by which the cognitive ability to process complex sound patterns that vary in pitch and duration evolved in pre-linguistic humans and prepared humans for speech communication (LEVITIN, Daniel J, 2007).

As social animals, humans use music to encourage feelings of unity and synchrony and to develop social habits of coordination and cooperation (LEVITIN, Daniel J, 2007). According to Levitin, this idea aligns with Blesser and Salter’s statement that people inhabiting the same area and subjected to the same sonic event (e.g., church bells) develop a sense of citizenship to that region. They introduce this concept as soundmarks (BLESSER, Barry and Salter, Linda-Ruth, 2006).

Sound Physiological Response

Grouping Coordination and Association

Following the description of biological and neurological processes, musicology research is discussed to clarify the indicated synaptic processes and sound associations that formulate programmable music and chart spatial characteristics. Psychological and physiological sound nomenclature and perceptual influences are presented here. The acquired facts are extrapolated to inform the hypothesis of aural urban design. The focus here is the musicology subdomain of perceptual cognition, which covers the physiological effects of sound and the communicative function of speech. The highly subjective matter of emotional response is not a concern of this study.

The terms defined here (pitch, rhythm and loudness) are more subjective than their mathematical counterparts. Physiological, psychological and psychophysical studies use these terms to investigate the neurological and biological effects of sound. Psychophysicists reveal that these properties are separable parameters; one may vary while the others remain constant. What distinguishes music from ambient sound is how these properties are administered and how they correlate with one another (LEVITIN, Daniel J, 2007).


A Sample Range of mean frequency of various objects. Musical instruments, human vocals, urban sounds, and the corresponding notes on a standard 88 note piano. Diagram set on a frequency logarithmic scale. (Note: The infrasonic and ultrasonic notations are reversed. Please return to the post for updated graphic soon)

While the expression ‘pitch’ may be confused with frequency, pitch is a purely psychological, subjective construct that denotes the result of a sequence of cerebral functions in response to a frequency. Unlike any other sound attribute, the mind directly identifies pitch through the tonotopically mapped basilar membrane and primary auditory cortex. This frequency-selective configuration implies that pitch interpretation is the most significant perceptual factor (LEVITIN, Daniel J, 2007). Studies indicate that the neural response to pitch is isomorphic. Magnetic resonance imaging (MRI) scans of subjects listening to music reveal that brain waves and the apprehended music are approximately identical (LEVITIN, Daniel J, 2013).

Pitch is the perception of the frequency of a particular tone [2] in relation to its position on the musical scale [3], on which Middle C (C4) is conventionally taken as the mid-point (LEVITIN, Daniel J, 2007). The perceived pitch increases as the frequency of the transmitted wave (e.g., from a violin string) increases. Pitch is a mental image that results from a series of mechanical and neurochemical events triggered by sound vibrations that reach and oscillate the eardrum.

While a sound wave has a particular frequency, it has a pitch only when perceived (LEVITIN, Daniel J, 2007). A typical hearing adult can detect a spectrum of pitches that corresponds to a frequency range of 20 Hz to 20 kHz [4]. Tones below the audible threshold are known as infrasonic sounds, which provoke unsettling responses in humans owing to an evolutionary response associated with natural disasters. Ultrasonic sounds are those frequencies above the audible threshold. Unlike infrasonic signals, ultrasonic events cannot be detected by humans (BALL, Phillip, 2011).

Pitches at the lower end of the audible spectrum can be labelled as rumbles; for example, the sound of a passing vehicle. High-frequency pitches can be described as shrill (BALL, Phillip, 2011). While the tuning of musical instruments varies in different cultures, note names repeat at specific intervals. If the frequency values of two tones have a 1:2 ratio (or any equivalent increment), they are perceived as distinguishably related pitches [5], and the interval between them is called an octave. Sound is perceived on a logarithmic, not linear, scale. Each octave begins at a frequency that corresponds to half that of the next higher one, which leads to the repetitive or circulatory musical notation (LEVITIN, Daniel J, 2007). Because the mind cannot perceive a gradual linear increase in frequency, the listener experiences the shift abruptly and recognises an interval based on preconditioned expectations. For musicians, these intervals are established around musical notes. Some individuals identify the intervals at exact notes and are said to have absolute pitch; they demonstrate an augmentation in the brain region associated with speech (BALL, Phillip, 2011).
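The octave relation and the logarithmic scale described above can be illustrated with a short Python sketch. It assumes twelve-tone equal temperament with A4 tuned to 440 Hz, a convention that is not stated in the text and that differs between cultures; the function names are purely illustrative.

```python
import math

A4_HZ = 440.0  # assumed reference tuning, not taken from the text

def octaves_between(f1_hz, f2_hz):
    """Number of octaves from f1 to f2; one octave is one doubling of frequency."""
    return math.log2(f2_hz / f1_hz)

def semitones_from_a4(freq_hz):
    """Signed distance from A4 in equal-tempered semitones (12 per octave)."""
    return 12 * math.log2(freq_hz / A4_HZ)

print(octaves_between(220.0, 440.0))    # a 1:2 ratio is exactly one octave
print(octaves_between(20.0, 20000.0))   # the audible range spans roughly ten octaves
print(semitones_from_a4(880.0))         # one octave above A4
```

Equal steps on this logarithmic scale correspond to equal frequency ratios, not equal frequency differences, which is why the step from 220 Hz to 440 Hz and the step from 440 Hz to 880 Hz are both a single octave.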

The connection between pitch and the size of the speech-processing region is significant for perception. Citizens of cultures that have tonal languages (i.e., various verbal pitch cues) show similar augmentation in language-processing regions (BALL, Phillip, 2011). In speech, pitch is used as a prosodic cue. Each community has its own inherent music and speech tonal grammar. For example, in the English language, ending a sentence on a high-pitched syllable changes a statement into a question. In addition to learned tonal grammar, some pitches of natural events have enabled the evolution of the startle reflex. Whether learned or evolutionary, pitch associations invoke physiological and emotional responses; musicians and sound artists use these responses to convey subliminal narratives (LEVITIN, Daniel J, 2007).


Certain purposeful violations of the beat are often exceptionally beautiful
— C. P. E. Bach

Acoustic regularity is a common component not only in a musical arrangement, but also in ambient sound. Rhythm is a subsidiary aspect of tonal duration and a designation for the pattern and duration of a sequence of notes (LEVITIN, Daniel J, 2013). The colloquial music expression ‘beat’ is recognised when one pulse is discernible among other rhythmic pulsations and loudness serves to create emphasis. Even if there is no emphasising technique, the mind invariably attempts to structure periodic signals into a rhythm and tries to anticipate the composition. This rhythm-imposing phenomenon seems intrinsic; even infants detect a rhythm where none exists.

A series of images from an experiment exploring the neurobiology of rhythm and beat perception. [Top] Schematic depictions of the auditory stimuli used in the experiments. [Left to Right] Auditory waveform | Standard musical notation | Mean rate of observed beats. [Bottom] Statistical parametric mapping (SPM) analyses: the beat versus non-beat contrasts overlaid on a template brain. The experiment was conducted on two groups of subjects: musicians and non-musicians. Both groups show significant bilateral activity in the putamen (part of the basal ganglia) for this contrast. The main function of the putamen is to regulate movements and influence speech learnability. Image Reference: (GRAHN, Jessica A and Rowe, James B, 2009)

The ability to identify a repetitious pulse preconditions the mind to comprehend rhythm. Individuals from different cultures group pulses differently based on their respective languages and speech intonations (BALL, Phillip, 2011). Musicians vary rhythms to surprise or satisfy expectations (LEVITIN, Daniel J, 2007). When the expectation is met, a sense of gratification occurs; if it is not met, a state of awareness or apprehension occurs. Yet not all music has rhythm. The architect and composer Iannis Xenakis does not use rhythm; his music is perceived as a constant tonal stream (BALL, Phillip, 2011).

Organisms such as humans or single-celled amoebas have an innate capacity to discern auditory regularity and synchronously respond to structure. Living organisms have inherent biochemical oscillators with rhythmic responses that resemble the mathematical phenomenon of ‘linked oscillations’ [6] (BALL, Phillip, 2011). These biochemical oscillators are responsible for emergent behaviours such as fireflies synchronising or strangers falling into step on a congested street (JOHNSON, Steven, 2012).
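The idea of linked oscillators falling into step can be sketched numerically. The example below is only an illustration and is not drawn from the cited sources: it assumes a two-oscillator Kuramoto-style phase model, made-up natural frequencies of 1.0 Hz and 1.1 Hz, and simple Euler integration. The function name and all parameter values are hypothetical.

```python
import math

def phase_gap_after(coupling, steps=20000, dt=0.001):
    """Integrate two coupled phase oscillators and return their final phase gap.

    Assumed model (Kuramoto-style): each oscillator advances at its own
    natural frequency and is pulled towards the phase of the other.
    """
    w1 = 2 * math.pi * 1.0   # natural frequency of oscillator 1 (rad/s), assumed
    w2 = 2 * math.pi * 1.1   # natural frequency of oscillator 2 (rad/s), assumed
    p1, p2 = 0.0, 2.0        # arbitrary starting phases (rad)
    for _ in range(steps):
        d1 = w1 + coupling * math.sin(p2 - p1)
        d2 = w2 + coupling * math.sin(p1 - p2)
        p1 += d1 * dt
        p2 += d2 * dt
    # Wrap the gap into the interval (-pi, pi]
    return math.atan2(math.sin(p2 - p1), math.cos(p2 - p1))

print(phase_gap_after(coupling=2.0, steps=20000))
print(phase_gap_after(coupling=2.0, steps=40000))
```

With the coupling switched on, the phase gap settles to the same small constant value however long the simulation runs, which is the signature of synchronisation. With `coupling=0.0` the two phases simply drift apart at the difference between their natural frequencies.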

The Mapping of Mental Choreography. The brain regions contribute to dance in ways that go beyond simply carrying out motion. Diagram adapted from (BROWN, Steven and Parsons, Lawrence M, 2008)

Infants are not born with a response to rhythm, and toddlers can sustain a steady rhythm only sporadically. Although the motor skills needed to respond are not sufficiently developed during early childhood, scans show that young children can discern a beat (BALL, Phillip, 2011). In adults, music with a powerful beat and tempo mediates the involvement of motor and beat perception within cerebral areas [7] (LEVITIN, Daniel J, 2013). A neural link is established between the regions that perceive rhythm and those that produce it (GRAHN, Jessica A and Rowe, James B, 2009).

Gaps in the rhythmic structure, gaps in the sort of underlying beat of the music—that sort of provides us with an opportunity to physically inhabit those gaps and fill in those gaps with our own bodies
— Neuroscientist Maria Witek ~2014

Involuntary movement (foot tap or body sway) in response to a rhythm is governed by the subcortical auditory relay station (medial geniculate nucleus) (BROWN, Steven and Parsons, Lawrence M, 2008). This synchronous motion is defined as a biological and behavioural unconscious event because of the absence of communication to cortical structures. MRI scans of subjects listening to music reveal activation in the regions of the brain that normally organise motor movements (LEVITIN, Daniel J et al., 2003). When subjects tap their feet to rhythmic patterns of various difficulties, the scans show activation of the dorsal premotor cortex (CHEN, Joyce L et al., 2008). It is as though movement is impossible to suppress (LEVITIN, Daniel J, 2013).

Like the pitch ratio of the octave, the rhythmic ratio of 2:1 appears universal. In music, variation in rhythm urges individuals to move (dance). Doucleff (2014) reports that a rhythmic balance between predictability and complexity is favourable in music. Composers use this movement-compelling phenomenon to their advantage and challenge their audiences’ expectations by shifting the emphasis of the beat, a technique known as syncopation. Successful music imposes syncopated patterns over an underlay of regular beats to ensure that a mental shift does not occur. This duality generates an unbalanced cognitive experience and invokes apprehension and tension (in some cases irritation). Rhythmic irregularity is not necessarily foreign to humans. Increasingly accelerated rhythms of falling and rebounding objects are habitually perceived patterns (BALL, Phillip, 2011).


A modified loudness graph: auditory experience and musical ranges charted at their corresponding locations on a chart of the Fletcher-Munson equal-loudness contours for pure tones. The Fletcher-Munson chart is a subjective graph created by asking subjects with normal hearing to adjust the loudness of pure tones of continuously increasing frequency. Figure adapted from (TURNER, John and Pretlove, A.J., 1991).

Loudness is a subjective psychological construct with a nonlinear relation to generated sound energy. The perceived loudness of a sound depends principally on sound pressure level (SPL) and frequency content. The Fletcher-Munson curves shown in the figure above are the result of a large number of psychoacoustical experiments in which individuals are subjected to binaural pure tone events. The curves represent averages of the data obtained by directing a large number of subjects (18 to 25 year olds with regular hearing) to regulate the perceived loudness of incrementally increasing frequencies (TURNER, John and Pretlove, A.J., 1991). The reference pressure level is 2 × 10⁻⁵ N/m², which corresponds to the zero point on the loudness scale at 0 dB (INTERNATIONAL STANDARDS ORGANISATION, 1975).
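The relation between the reference pressure and the 0 dB point can be made concrete with the standard definition of sound pressure level, 20 times the base-10 logarithm of the ratio of the measured pressure to the reference pressure. The sketch below uses the reference value quoted above; the function name is illustrative only.

```python
import math

P_REF = 2e-5  # reference pressure in N/m^2 (pascals), as quoted in the text

def spl_db(pressure_pa):
    """Sound pressure level in decibels relative to P_REF."""
    return 20 * math.log10(pressure_pa / P_REF)

print(spl_db(2e-5))  # the reference pressure itself sits at 0 dB
print(spl_db(2e-4))  # ten times the pressure adds about 20 dB
print(spl_db(2.0))   # 100,000 times the pressure is about 100 dB
```

Each tenfold increase in pressure adds the same 20 dB step, which is what makes the decibel scale logarithmic. The sketch describes physical level only; perceived loudness also depends on frequency, as the equal-loudness contours show.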

The brain decodes loudness as a cue to the distance from the emitting object. The displacement distance of an oscillating object determines its perceived loudness. The rate of energy used to strike (or pluck) an object determines the rate of the resulting sound energy. Considering this definition, loudness is the mental construct of the linear physical aspect of sound energy (i.e., amplitude or maximum relative intensity). However, the brain interprets amplitude into loudness logarithmically, not linearly (LEVITIN, Daniel J, 2007).

In audiometry, the measuring unit of loudness is the phon (TURNER, John and Pretlove, A.J., 1991). In musicology and acoustics, the measuring unit is the decibel (dB). These correlated units are measured on a logarithmic scale that follows the human perception of amplitude. The audible range of loudness is called the dynamic range (0-140 dB); above it, permanent ear damage can occur. Neurons fire at a maximum rate when an individual is subjected to sound of 115 dB. The excessive neural activity demonstrated at these levels explains the related state of sentience that concert audiences report. The slightest deviation in music loudness also communicates and triggers emotional responses (LEVITIN, Daniel J, 2007). The metaphorical techniques that musicians use rely on group coordination, a procedure that habitually employs loudness, pitch and rhythm variations.

Group Coordination | Spatial Location and Metaphors

A diagram illustrating the aggregation of harmonic frequencies

Through evolution, humans have acquired simplification as a survival technique, enabling the identification of edible food (vegetation and prey), disease smells (decay), and danger sounds (predator). The human brain endeavours to comprehend its surroundings holistically by developing associations between groups of stimuli. Visually, the brain groups objects by commonality in colour, form, position, continuity, or trajectory. This stimulus parsing occurs unconsciously and immediately. Similarly, the mind groups acoustical signals to map and simplify the surrounding soundscape (BALL, Phillip, 2011). Hermann von Helmholtz, the nineteenth-century acoustics pioneer, called this association process ‘unconscious inference’ (LEVITIN, Daniel J, 2007). It is a continuous process of logical deductions induced from continual stimulus input, in an attempt to balance separation and integration techniques (BALL, Phillip, 2011).

A diagram illustrating a harmonic wave in a string moving in the positive and negative direction.

Pure tones are seldom encountered in ambient sound or music. An object (or instrument) emits a collection of frequencies (pitches) related to its natural oscillation frequency, also known as its fundamental frequency. The fundamental frequency is a direct result of the shape, materiality and size of the object. The brain groups each set of frequencies cognitively as one acoustical entity associated with a mental image, known as timbre. For example, when the members of a quartet play the same note, the brain classifies the sound of the violin, in contrast to that of the cello, as a separate unified sound (LEVITIN, Daniel J, 2007).
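A small sketch may help to show how one fundamental frequency can carry different timbres. It assumes an idealised source whose partials are exact integer multiples of the fundamental, which holds only approximately for real instruments. The two amplitude lists are invented for illustration and are not measurements of a violin or a cello.

```python
import math

def harmonic_series(fundamental_hz, count):
    """Frequencies of the first `count` partials of an idealised tone."""
    return [fundamental_hz * n for n in range(1, count + 1)]

def complex_tone(fundamental_hz, amplitudes, t):
    """Value at time t (seconds) of a sum of sinusoidal partials."""
    return sum(a * math.sin(2 * math.pi * fundamental_hz * n * t)
               for n, a in enumerate(amplitudes, start=1))

# Hypothetical partial amplitudes: same note, different spectral balance
bright_profile = [1.0, 0.5, 0.33, 0.25]
dark_profile = [1.0, 0.2, 0.6, 0.1]

print(harmonic_series(110.0, 4))
print(complex_tone(110.0, bright_profile, 0.001))
print(complex_tone(110.0, dark_profile, 0.001))
```

Both profiles share the same set of partial frequencies and therefore the same fundamental, yet their waveforms differ because the partials are weighted differently. In this simplified picture, that difference in weighting is what the brain groups into a distinct timbre.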

Pitch is a notable fundamental grouping factor that musicians use during solo performances by deviating in pitch or time slightly from the background ensemble. With this shift, the audience perceives the notes of the solo instrument as a group that is separate from the accompanying instruments. Pitch proximity grouping (i.e., related frequencies) is recognised as a melodic structure, and is significant in the soundscape of ambient sounds. Specifically, if a single sound is temporarily obscured, the brain does not interpret the emitting pitch as individual signal bursts separated by silence or other sounds. Rather, the sound is perceived as a continuous tone that increases and decreases in loudness (BALL, Phillip, 2011).

The planum temporale is a region of the brain that governs spatial mapping and handles pitch intervals and melody. This region provides a cerebral link between spatial perception and pitch, rhythm, loudness and their grouping associations (BALL, Phillip, 2011). Spatial location is another grouping principle that results from the ability to detect sound binaurally. Binaural perception uses proximity cues to regulate sound groupings. If one ear receives more than one signal at approximately the same time, the signalling objects are perceived in unity and their positions are mentally mapped as such. Humans are more sensitive to changes in position along the horizontal plane than along the vertical one because the brain is less sensitive to distance moderation but is fine-tuned to time discrepancies, which it interprets as relative distances (LEVITIN, Daniel J, 2007).

Loudness is also a spatial cue. If all other parameters are constant, louder sounds are mapped as closer events. If the received sounds are apprehended by one ear and all equal in loudness except for one, the loudest sound is apprehended as the closest. Musicians employ all of these grouping coordinations to design audible narratives or programmable music (BALL, Phillip, 2011).

Auditory Illusion and Programmable Music

In the absence of structure, the brain strives to assert the necessary grouping procedures (BALL, Phillip, 2011). Neurologically, top-down and bottom-up processes continuously inform each other to comprehend collected signals. The constructed inferences can build an inaccurate conclusion owing to partial or indistinct stimuli input. In this case, an emerging illusion holds the mind in a cerebral loop, even if the principle is comprehended (LEVITIN, Daniel J, 2007). Musicians use these illusions to derive their compositions. For example, the structuring and speed of the consecutive notes in Sinding’s ‘Rustle of Spring’ create an illusory melody [8]. In Vivaldi’s ‘Four Seasons-Spring’ concerto, the intermittent high-pitch notes mimic ambient sounds to create a specific narrative (VAN CAMPEN, Cretien, 2007). The ability to comprehend these structural techniques and produce emotion is based on learned and inherited experiences as well as the neural structures of the brain (LEVITIN, Daniel J, 2007).

Through the process of ‘unconscious inference,’ a sequence of mental events creates a cerebral image. For example, the piano is physically unable to project the lowest note on the keyboard. The audience hears the note through a mental event known as the ‘filling-in’ phenomenon or ‘virtual pitch’ (MCRAINEY, Megan, 2009). Similarly, the brain perceives fragmented ambient sound as a continuous signal. This cognitive continuation is a mental process that consults other senses to complete the acquired facts, such as tactile or visual stimuli. The volume of the surrounding space is mentally mapped through a spectrum of aural cues, such as reverberations and echoes. The average person cannot specify the dimensional and material properties of a room, but he or she can successfully navigate and interact in it through acoustic cues. Recording engineers use this phenomenon to create ‘hyper-realities’ in music to mimic virtual sensory experiences because artificial reverberation actively changes the perceived proximity and location of the sound source (LEVITIN, Daniel J, 2007).
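The ‘virtual pitch’ effect can be hinted at with a deliberately simplified sketch: when only the upper partials of a tone are present, the pitch that listeners tend to report corresponds to the fundamental those partials share. The code below assumes exact integer frequencies in hertz and finds that shared fundamental as a greatest common divisor. This is an arithmetic illustration only, not a model of how the auditory system performs the inference, and the function name is hypothetical.

```python
from functools import reduce
from math import gcd

def implied_fundamental(partials_hz):
    """Largest frequency (Hz) of which every given integer partial is a multiple."""
    return reduce(gcd, partials_hz)

# Partials at 200, 300 and 400 Hz share a fundamental that is not itself present
print(implied_fundamental([200, 300, 400]))
print(implied_fundamental([220, 330, 440]))
```

In both calls the returned value is absent from the input list, mirroring the situation in which a listener hears a low note that the instrument does not physically project.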

The psychological sound terminology outlined here is used customarily in several sciences (e.g., musicology and audiometry) that study human sound perception, a highly subjective field. It is imperative to understand how sound affects users in architectural spaces. The presented facts of neurological pitch mapping and the associated low-level processes controlled by the cerebellum cement the fundamental paradigm used in this research blog. The brain uses a series of associations and grouping coordinations to comprehend and map the physical surroundings by interpreting the accompanying soundscape. Thus, aural design should not be a subsidiary design method.

[1] “Music is organised sound.” – Phillip Ball (2011)

[2] The terms note and tone refer to the same abstract entity. Tone is what is heard. Note is the written symbol for tone (LEVITIN, Daniel J, 2007).

[3] Successive tones create a melody, and simultaneously played tones produce a harmony (BALL, Phillip, 2011).

[4] Infrasonic sounds are associated with supernatural experiences, and musicians and cinematographers exploit this phenomenon (BALL, Phillip, 2011).

[5] In a male-female duet, although the female singer’s vocal cords have a higher fundamental frequency than those of the male singer, they sound similar when they sing in unison (BALL, Phillip, 2011).

[6] Two oscillating pendulums attached to the same rod will eventually synchronise.

[7] The areas are: the premotor cortex, inferior frontal gyrus, superior temporal gyrus, and inferior parietal lobe (LEVITIN, Daniel J, 2013).

[8] Rustle of Spring by Christian Sinding and Chopin’s Impromptu No. 4 in C-sharp minor, Op. 66. The notes go so quickly that an illusory melody emerges.