ADDRESSING VOCAL REGISTER DISCREPANCIES:
AN ALTERNATIVE, SCIENCE-BASED THEORY
OF REGISTER PHENOMENA
Leon Thurman, Ed.D.
Graham Welch, Ph.D.
Axel Theimer, D.M.A.
Carol Klitzke, M.S., CCC/SLP
Second International Conference
The Physiology and Acoustics of Singing
6 – 9 October 2004
Vocal registers are controversial in the pedagogical, clinical, and scientific domains of vocology. A well-known general definition of vocal registers is “...perceptually distinct regions of vocal quality that can be maintained over some ranges of pitch and loudness” (Titze, 2000, p. 282). For centuries, however, concepts and practices related to vocal register phenomena, including their linguistic labels, have been varied and often contradictory.
Within both the voice science and the voice education communities of the early 21st century, considerable discrepancies remain in the conceptual frameworks, terminologies, and practices that are related to vocal registers. To people who are not familiar with the jargon of the voice-related professions, these discrepancies are puzzling and confusing, and can even call into question the credibility of voice profession members.
In this paper, a reconciliation of the varied and conflicting register concepts, terminologies, and practices will be presented. It will: (1) present a brief historical context of vocal registers, (2) propose a documented science-based theory that accounts for all vocal register phenomena from perceptual, physiological, and acoustical perspectives, (3) propose criteria for selection of categorical word labels for register phenomena and suggest terms that meet them, and (4) suggest how the theory can be beneficially applied to the learning of efficient, skilled singing and speaking when guided by music educators, choral conductors, singing teachers, speech teachers, and theatre directors, and when applied in therapeutic clinical settings.
A BRIEF HISTORY OF VOCAL REGISTERS
Perhaps for millennia, singers and singing teachers have aurally identified changes in voice quality when singing the consecutive fundamental frequencies (F0s) of a two-octave (or more) musical scale. When transitions from one voice quality to another occur, most singers report some sort of non-specific, kinesthetically sensed, neuromuscular coordination adjustment in the larynx. Among experienced or trained singers, the transitions are perceived to be blended and smooth, but the transitions are more commonly abrupt among inexperienced singers.
The writings of the Greek physician Galen (c. 129-200 AD) were the “bible” of medical anatomy and practice for almost 1500 years after his death, but his observations contained quite a number of inaccuracies. Detailed knowledge of human anatomy and physiology began to be assembled in the mid-16th century. A prominent center for such study was the
Scientific findings about vocal anatomy, physiology, and the nature of vocal sound only began to be widely distributed in about the middle of the 19th century. With limited science-based knowledge of vocal anatomy, physiology, and acoustics, singers and singing teachers had to base their vocal concepts, terminologies, and practices substantially on logical assumptions, personal perceptions, and metaphoric communications about the nature of voices.
The term register was borrowed from the terminology of keyboard organs (Merkel, 1863), and has been used in vocal terminology since at least the 13th century (Duey, 1951). The earliest known writings about voice date from that century. They were written in Latin by two monks, Jerome of Moravia (c. 1250) and John of Garland (c. 1193 - c. 1270) (Large, 1973, p. 10; Mori, 1970; Timberlake, 1990). They wrote about the then-current conceptual categories and linguistic labels for the various “voices” in which singers can sing.
When singers sang in their upper pitch range, for instance, they presumably felt a prominence of vibration sensations in the front, sides, and/or top of their heads. We may presume that they interpreted those sensations as meaning that their voices were “coming from” their heads, so logically, they would call that way of singing head voice (Latin: vox capitis = voice from the head). When they perceived their head voice as producing subtle voice quality differences, they are likely to have experienced differences in the vibration sensations in their heads and believed they could “place” their head voices in different areas of the head.
When they sang in their lower pitch range, they are likely to have felt a prominence of vibrations in their chest. We may presume, therefore, that they interpreted those sensations as meaning that their voices were “coming from” their chests and would call that way of singing their chest voice (Latin: vox pectoris = voice from the breast or chest). When they sang in their middle pitch range, they no longer felt prominent vibration sensations in the head or chest, but presumably felt a prominence of such sensations in their throats because they called that way of singing their throat voice (Latin: vox gutturis = voice in the throat). When they changed from one voice to another, they physically felt the transition from one “place” in the body to another, and they heard the sound quality of their voices change at the same time. We may presume that this is how a language of “voice placement” evolved.
In the 17th century, Caccini wrote of voce piena (full voice) and voce finta (feigned voice). In 1627, Monteverdi wrote vocale della gola (the voice of the throat) and la vocale
In the 19th century, Garcia and others wrote of three registers in ascending order of pitch range: chest, falsetto, and head (Garcia, 1855; also reported in Large, 1973, p. 10; Timberlake, 1990). Merkel (1873) wrote of chest and falsetto registers in men and low, middle, and high registers in women. In 1875, John Curwen suggested thick, thin, and small as names for vocal registers (cited in Mackworth-Young, 1953). Browne and Behnke (1884) wrote about five registers: lower thick, upper thick, lower thin, upper thin, and small. In the early 20th century, Fröschels (1920) maintained that a “natural” voice had no registers, so that each voice consisted of only one register. Wilcox (1935) suggested the terms heavy mechanism and light mechanism for two vocal registers.
When singing in their middle pitch range, some singers noticed sensations and sound qualities that were different from those that they observed when singing in their uppermost and lower pitch ranges. This way of singing seemed to be a mixture of the sounds and sensations of chest and head or falsetto, so it has come to be called a medium, or middle, or mixed register (Italian: voce mista; French: voix mixte). When singing in middle or mixed voice, singers observed transitions to other registers at both the top and bottom of the mixed voice (Mori, 1970; Timberlake, 1990). Its pitch range was said to be between the chest and the head or falsetto voices. According to Miller's description of the Italian vocal pedagogy tradition (1977; 1986, pp. 115-149), the primo passaggio was the passage from chest register to middle register and the secondo passaggio was the passage from middle to head register. Men and women of various voice classifications experienced the passages at slightly different pitch ranges.
Currently, the terms head register and falsetto register are used by various vocal pedagogues and voice scientists as labels for the different sound qualities produced in middle and higher pitch ranges. For some singing teachers, however, head register is immediately above chest, and falsetto is above head. To others, falsetto refers to all sound qualities above chest in both males and females, and sometimes the traditional high-range male falsetto is termed pure falsetto. Among speakers of English, the common, colloquial use of the term falsetto refers only to the female-like voice quality that males can make.
In some register concepts, there are two “auxiliary registers”, one above and one below the more commonly used head and chest registers. Some prepubescent children, changing-voice males, and changed-voice male and female adults can have a register, variously labeled whistle, flute, or flageolet register, that enables pitches that are quite high (Cooksey, 2000; Large, 1973; Miller, 1986, pp. 147-148; Mori, 1970; Timberlake, 1990). Some changed-voice males and females are able to produce unusually low pitches below the more commonly used registers. This lower auxiliary register has been referred to as pulse or Strohbass register.
In the late 1960s, the research team of Minoru Hirano, William Vennard, and John Ohala (1969, 1970; Vennard, et al., 1970a,b) heightened interest in the use of the scientific method to resolve many controversies in the vocal pedagogy tradition, including landmark studies on fundamental frequency, intensity, and register phenomena. They suggested Wilcox’s label heavy mechanism as a substitute for chest register and his term light mechanism as a substitute for registers above heavy mechanism. In 1967, both Ralph Appelman and William Vennard published landmark, science-based vocal pedagogy books. In 1971, the late Wilbur James Gould, M.D., founded the Voice Foundation in
In 1974, Harry Hollien, an internationally prominent speech-voice scientist, presented four new register terminologies for use in the speech science community, based on his wide research experience:
1. Pulse register—a pulsated quality that can be produced in a very low pitch range below modal, a sound quality that also is called vocal fry.
2. Modal register—a heavier or thicker voice quality that is produced in a lower pitch range. The label was a reference to the most common "mode" of voice function, i.e., speech. It was the speech equivalent of chest register in singing pedagogy.
3. Loft register—a voice quality that is higher in pitch and lighter or thinner in quality, compared to modal register. It was the speech equivalent of head or falsetto register in singing pedagogy.
4. Flute register—an even thinner quality that can be produced in a very high pitch range above loft. In singing it is called falsetto in males and in females it is sometimes referred to as whistle register.
With increased sophistication of instruments that are capable of documenting various vocal phenomena came a curiosity about the actual anatomic, physiologic, and acoustic realities of vocal registers and their transitions in both speaking and singing. In the late 1970s, an international medical organization, the Collegium Medicorum Theatri (CoMeT), formed an international Committee on Vocal Registers with Dr. Hollien as Chair. The committee included prominent otolaryngologists, speech and voice scientists, and singing teachers. Their charge was to attempt a definition of vocal registers perceptually, physiologically, and acoustically.
After their early meetings, they agreed that at least four different vocal registers existed, but that a definitive physiological and acoustic definition of all register phenomena was not possible at that time (Hollien, 1985). The committee agreed that registers:
1. involve a series of consecutive fundamental frequencies that have the same perceived timbre;
2. can be detected perceptually, and thus should be recorded in the spectra of the different timbre groups;
3. are initiated by changes of laryngeal physiology involving at least the internal muscles of the larynx.
The committee's report questioned the scientific usefulness of the older names for registers such as head and chest. The historic terms attribute the identification of registers to areas of vibratory sensation in singers. While vibratory sensations definitely occur, they were not considered to be defining characteristics of registers. Defining characteristics are the physical and acoustic events that give rise to the sensations.
While the committee agreed that the sensations may be helpful in the education of singers and speakers, they could not accept them as scientific evidence for defining registers. The sensations themselves and their perceived intensity vary widely between human beings, and replication of vibratory sensations in groups of singers was nearly impossible to measure and study with precision. Differences among perceived vibratory sensations are the result of differences in: (1) anatomic structure and dimension among people, (2) the nature of the physical coordinations used, (3) acoustic consequences in body tissues, and (4) sensory perception abilities. The ability of the interoceptive and proprioceptive sensory networks to bring vibratory sensations to conscious awareness is variable. Interpretations and verbal descriptions of their causes are subjective, and therefore, may be inconsistent across human beings.
In order to begin the process of register identification and definition, the committee report identified four registers based on research at that time. In an attempt to reduce semantic confusion, numbers were used to refer to the registers. They were designated as #1, #2, #3 and #4 (Hollien, 1985). Based on information from the tradition of singing pedagogy, a possible middle register between #2 and #3 was added and designated #2A (e.g., Hollien & Schoenhard, 1983a,b). No scientific evidence for the existence of this register was found by the committee at that time. The committee agreed that registers #2, #2A, and #3 are the most frequently used in singing.
As voice science and voice medicine became more prevalent, the National Institute on Deafness and Other Communication Disorders (NIDCD), a component of the
As a result of these developments, research into the phenomena of vocal registers has increased over the past 25 years. In addition to Harry Hollien, four voice scientists have consistently allied themselves with various scientific and pedagogical colleagues to collaborate in the study of vocal registers and associated voice qualities: Jo Estill, Donald Miller, Johan Sundberg, and Ingo Titze (see references).
Within the vocal pedagogy tradition, and among voice scientists, many questions have been raised by: (1) the wide variety of register concepts, terminologies, and practices, and (2) the variability of register transition areas in the same voice (Mörner, et al., 1964; Timberlake, 1990). Eight explicit or implicit assumptions are embedded in the current jargon of vocal registers that are likely to be confusing to people who are voice terminology novices:
1. There are speaking-voice registers and singing-voice registers. An implicit assumption is that all human beings have two voices, one for the “speaking voice”, and one for the “singing voice”, and each “voice” has categorically different vocal registers.
2. Chest register is associated with lower singing pitch range and a comparatively “thicker” voice quality. An implicit assumption is that it is activated by neuromuscular coordinations, or other phenomena, that occur within the chest and thus produces perceivable vibration sensations therein.
3. Head register is associated with higher singing pitch range and a comparatively “thinner” voice quality. An implicit assumption is that it is activated by neuromuscular coordinations, or other phenomena, that occur within the head and thus produces perceivable vibration sensations therein.
4. Falsetto register is associated with highest singing pitch range, or with all pitches produced above chest register, and a comparatively “thinnest” (or “thinner”) voice quality. This concept has confusing implications. In Western cultures, it is strongly associated with a female-like voice quality produced by males, but is a “false” or “fake” voice that is of little or no practical use except in comedy. Vocal jargon novices may ask, “Do females have a falsetto register?” and, “If ‘falsetto’ refers to the voice quality that occurs in all pitches above chest register, how is it that at least two categorically different voice qualities can be produced above chest register? Does that not violate the basic definition of a vocal register?”
5. When voices change from one register to another, unskilled vocalists typically experience register breaks (abrupt transitions), but skilled vocalists typically experience blended register transitions. Vocal jargon novices may ask, “What vocal anatomy and physiology creates a ‘voice break’ in one person but not in another person?”
6. Middle register is associated with a middle singing pitch range and a voice quality that is a “mixture” of chest and head (or falsetto) registers. This concept also has confusing implications. Vocal jargon novices may ask, “How does one ‘mix’ two categorically different voice qualities that, presumably, are produced by unique physiological coordinations?” and, “If ‘falsetto’ refers to the voice quality that occurs in all pitches above chest register, then how does this concept make sense?”
7. A lower and an upper passaggio pitch area are in all voices and they define the lower and upper pitch range compass of middle register. Vocal jargon novices may ask, “How does this concept make sense in the context of items 1 – 5 above?”
8. Each register can be performed throughout the entire capable pitch range of all singers, from lowest capable pitch to highest. Vocal jargon novices may ask, “How does this concept make sense in the context of all the above items?”
What are vocal registers, really? What anatomy and physiology produce their acoustic phenomena? How many registers are there? What are the most accurate and helpful word labels for vocal registers? What happens physically and acoustically when register events occur? Can the pitch areas where register transitions occur be changed, or do they indicate unchangeable, genetically inherited vocal characteristics? How is it that there can be so many different register patterns in voices (e.g., strong lower-pitch-range registers and weak/breathy upper-pitch-range registers, or vice versa, and register transitions that occur around several different pitches)? How does “belting out a song” relate to registers, and are there voice health issues involved in the use of belted singing? How do registers and their sound qualities relate to the musical styles of the world's cultures and sub-cultures?
CONTEXT FOR A SCIENCE-BASED THEORY OF VOCAL REGISTERS
An important conceptual understanding that underlies this paper is that scientific investigation is carried out by human beings and is, therefore, imperfect. Originally, the scientific method was invented as a means of determining objective reality and thus overcoming subjective human bias, so that valid and reliable knowledge could be gathered. Totally “objective” scientific investigation implies that the human beings who engage in scientific investigation are entirely free from implicit assumptions and that they can disconnect the parts of their brains that process feelings-emotions (biases) from those brain parts that process perception and analytical conceptualization.
For instance, in scientific investigations there is a possibility that human investigators—inside or outside their conscious awareness—may orient research procedures and findings so as to conform with previously held, emotionally nuanced points of view. In addition, technological instrumentation that is used to gather data for analysis may not be sensitive enough to detect all of the phenomena that are relevant to a given investigation, and sometimes, the instrumentation that could gather needed information may not yet have been invented.
In spite of these realities, the methods of science are still the best means we human beings have yet devised to minimize the influence of human bias. The “saving grace” of scientific investigation is that, over time, human scientists will question and reinvestigate previous findings and, based on a preponderance of evidence, reconfigure theoretical explanations of “the nature of the world”.
The authors of this paper do not claim status as “voice scientists”. We do claim to do the best we can to be current with findings in the voice sciences, and to integrate that information with our experience in helping people who are learning to sing and speak with increasing skill and expressiveness. We believe that such a background gives us credible grounds upon which to propose a science-based theory of vocal register phenomena that we believe can resolve historic conceptual, terminology, and practice discrepancies. In doing so, we hope to decrease doubts about the credibility of the voice professions that occur among some of the people who expect a high degree of consensus in voice-function knowledge among voice scientists, speech pathologists, singing and speech teachers, choral conductors, music educators, and theatre directors.
Some readers of this paper are quite familiar with the scientific language of vocal anatomy, physiology, and acoustics. Other readers may be less familiar or unfamiliar with that language. This paper will attempt to present the theory with both reader groups in mind. It also will assume that during the register phenomena that we describe, (1) vocal anatomy and physiology are in a state of health, and (2) the neuromuscular coordinations that enact basic breathflow, phonation, and resonation are reasonably efficient. The authors acknowledge that some aspects of this theory of vocal registers are not yet fully substantiated by scientific research. If some of its provisions are shown eventually to be inaccurate, that will be a learning moment, and that learning will be celebrated because learning is what human beings do.
Two commonly used linguistic nominalizations are: speaking voice and singing voice. The terms denote concrete categorical differences between two “voices” in human beings, but of course, human beings have one voice (one larynx and vocal tract), and its neuromuscular coordinations produce all vocal phenomena, including speaking and singing (details in Endnote 1). For that reason, in this paper, there are no references to a so-called speaking voice and singing voice. Vocal register phenomena, therefore, occur in all vocal sound-making, speaking, and singing.
Using the metaphor of a theatrical production, here is a review of the anatomy, physiology, and acoustic processes that we regard as relevant to producing the vocal phenomena that are referred to as vocal registers.
The Producer is the genetic and epigenetic expression that forms and maintains vocal anatomy and the neuropsychobiological processes that activate and modulate vocal physiology.
The Playwright and the Technical and Performance Director is the central nervous system (CNS; brain and spinal cord). The CNS contains the vast neural networks, and networks of networks ad infinitum, that plan and enact the complex neuromuscular coordinations that produce all overt and covert physical movements, including vocal phenomena such as vocal registers. Learning new vocal abilities, or altering already learned abilities, can only occur if relevant neural networks are added to, or altered (for general reviews, see Fuster, 1997, 2003; Holstege, et al., 1996; Huttenlocher, 1994; Thurman & Welch, 2000, Book I, Chapters 3-9; Verdolini & Titze, in preparation).
The endocrine and immune systems are Assistant Directors that are interfaced with and modulate nearly all physical functions, including those of the CNS. Together, the Director and Assistant Directors coordinate all human neuropsychobiological processing, including self-expression through symbolic systems (languages, mathematics) and symbolic modes (music, dance, theatre, painting, sculpture, architecture, and the like).
The Stage Crew is the peripheral nervous system (PNS). It is made up of a somatic division (cranial and spinal nerves) and an autonomic division (sympathetic, parasympathetic, and enteric subdivisions). The PNS has both sensory and motor nerves that are the interface between the CNS and the external world, and between the CNS and internal bodily processes. The motor functions of the PNS are activated by integrative processing between sensory perception and executive functions of the CNS.
The playwright/director and crew have collaborated in a whole series of hit vocal register dramatic and musical plays, such as:
A Streetcar Named Registers
A Funny Thing Happened on the Way to the Registers
The Register Menagerie
Paint Your Registers
Joseph and the Amazing Technicolor Registers
The Secret Register
How to Succeed in Registers without Really Trying
The Leading Actors in these productions are the primary laryngeal muscles that induce the primary acoustic phenomena of vocal registers. The role names of the leading actors refer to the primary functions of the “leading actor” muscles. (see Table 1)
1. Each of the paired thyroarytenoid muscles (TA) has two parts, a vocalis part (thyrovocalis) and a muscularis part (thyromuscularis). The thyrovocalis parts extend for most of the length of the vocal folds and form their body or core. The primary role of the TA muscles is to have a vocal fold shortening influence within the synergistic functioning of all the internal larynx muscles (especially in their interactions with the cricothyroid muscles). The thyrovocalis parts appear to perform most of the shortening influence and both parts appear to have a secondary adductory influence on the folds. For people who might feel averse toward the use of anatomic terminology, or are too young or inexperienced to use it, the vocal fold shortener muscles can be a colloquial English term for the primary role of the thyroarytenoid muscles.
2. Each of the paired cricothyroid muscles (CT) has two parts, a more upright part (pars recta) and a more oblique part (pars obliqua). The most anterior ends of the CT muscles are attached to the front of the cricoid cartilage and the posterior ends are attached inside the lower lateral walls of the thyroid cartilage. The primary role of the CT muscles is to have a vocal fold lengthening influence within the synergistic functioning of all the internal larynx muscles (especially in their interactions with the thyroarytenoid muscles). Their agonist-antagonist action with the thyroarytenoid muscles creates a complex kind of “rocking” motion between the cricoid and thyroid cartilages that alters the length of the folds. For people who might feel averse toward the use of anatomic terminology, or are too young or inexperienced to use it, the vocal fold lengthener muscles can be a colloquial English term for the primary role of the cricothyroid muscles.
3. The spatial location and configuration of the cover tissues of the two vocal folds (official term: lamina propria) is altered by actions of all the internal larynx muscles. The synergistic influence on the folds’ cover tissues by the vocal fold shortener and lengthener muscles is of greatest relevance to a science-based theory of vocal registers. The more the vocal folds are shortened, the more thickened and lax the cover tissues become, and the more the vocal folds are lengthened, the thinner and more taut the cover tissues become. These changes in the configuration of the oscillating vocal folds influence the modes of their oscillation and they change the characteristics of the voice source spectrum.
The Major Supporting Actors are the laryngeal muscles that induce a secondary influence on the acoustic phenomena of vocal registers. The role names of the major supporting actors refer to the secondary functions of these “supporting actor” muscles. (see Table 1)
1. The paired posterior cricoarytenoid muscles (PCA) are located at the rear area of the larynx on the right and left sides. They are attached to the cricoid and arytenoid cartilages in such a way that they participate in abducting the arytenoid cartilages and thus opening the vocal folds. Their primary role, therefore, is to have a vocal fold opening influence within the synergistic functioning of all the internal larynx muscles, especially in their interactions with the lateral cricoarytenoid and the interarytenoid muscles. Voice terminology novices may be more comfortable referring to these muscles as the primary vocal fold opener muscles.
2. The paired lateral cricoarytenoid muscles (LCA) are located on the right and left sides of the larynx. They are attached to the cricoid and arytenoid cartilages in such a way that they participate in adducting the vocal processes of the arytenoid cartilages, and thus closing the vocal folds. Their primary role, therefore, is to have a vocal fold closing influence within the synergistic functioning of all the internal larynx muscles, especially in their interactions with the posterior cricoarytenoid and the interarytenoid muscles. Voice terminology novices may be more comfortable referring to these muscles as one of the primary vocal fold closer muscles.
3. The single interarytenoid muscle (IA) is located at the posterior areas of the two arytenoid cartilages. It is attached to the arytenoid cartilages in such a way that it participates in adducting the rear areas of the arytenoid cartilages, and thus the cartilaginous portion of the vocal folds. Its primary role, therefore, is to have a vocal fold closing influence within the synergistic functioning of all the internal larynx muscles, especially in its interactions with the posterior and the lateral cricoarytenoid muscles. Voice terminology novices may be more comfortable referring to this muscle as one of the primary vocal fold closer muscles.
Table 1. Summary of Internal Laryngeal Muscles that Can Interact to Produce Voice Source Spectra Variations that Can Be Perceived as Basic Voice Quality Variations
Functions and Influences on Voice Source Spectra
Interarytenoid (IA)
· A primary adductor of the cartilaginous portion of the vocal folds
Lateral cricoarytenoids (LCA)
· A primary adductor of the membranous portion of the vocal folds
· Agonist-antagonist interaction with IA and PCA to stabilize vocal folds in many specific adductory positions
Posterior cricoarytenoids (PCA)
· Primary abductor of vocal folds
· Agonist-antagonist interaction with IA and LCA to stabilize vocal folds in many specific adductory positions
Thyroarytenoids (TA)
· Primary shortener and thickener of the vocal folds
· Secondary adductor of the vocal folds
· Agonist-antagonist interaction with CT to stabilize the length of the vocal folds in many non-specific and specific “settings” to produce a wide range of F0s
· Primary lengthener and shortener of the vocal folds when not opposed by action of the CT
Cricothyroids (CT)
· Primary lengthener and thinner of the vocal folds
· Agonist-antagonist interaction with TA to stabilize the length of the vocal folds in many non-specific and specific “settings” to produce a wide range of F0s
· Primary lengthener and shortener of the vocal folds when not opposed by action of the TA
PLOT SETTING 1—Auditory Perception and Memory of Acoustic Phenomena. When bodymind auditory systems perceive a series of sound events that share the same (or nearly the same) acoustic characteristics, then those characteristics are correlated within a number of interconnected auditory neural networks to become a larger neural network that extends into both the parietal and frontal brain areas (see Fuster, 2003). As a result, a distinct perceptual category is instantiated within the neural networks—referred to as memory.
For instance, when a range of different fundamental frequencies (F0s) is produced but prominent spectral characteristics remain nearly the same, then a perceptual category of perceived voice quality is formed even though the F0s have changed. But when a range of F0s occurs and a different set of spectral characteristics is produced, then a slightly different combination of neural networks processes those events and another distinct perceptual category is formed in memory. Language labels are not needed in order for such percepts to occur, but typically, such labels are assigned by human beings.
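The grouping idea in this plot setting can be loosely illustrated with a small computational sketch. It is only an analogy, not a model of auditory neural processing. It assumes that each sung tone can be summarized by the levels of its first five harmonics relative to the first harmonic, that similarity can be measured with a simple Euclidean distance, and that a fixed threshold separates categories. The tone data, the names spectral_distance and group_by_quality, and the 3 dB threshold are all hypothetical and chosen for illustration; none of them come from measured voices.

```python
import math

# Hypothetical, made-up harmonic levels (dB relative to the first harmonic)
# for six sung tones. They are illustrative only, not measured data.
TONES = [
    # (F0 in Hz, levels of harmonics 1-5 in dB)
    (110.0, [0.0, -3.0, -6.0, -9.0, -12.0]),
    (147.0, [0.0, -3.5, -6.5, -9.0, -12.5]),
    (196.0, [0.0, -3.0, -7.0, -9.5, -12.0]),
    (262.0, [0.0, -9.0, -18.0, -27.0, -36.0]),
    (349.0, [0.0, -9.5, -18.0, -26.5, -36.0]),
    (440.0, [0.0, -9.0, -17.5, -27.0, -35.5]),
]


def spectral_distance(a, b):
    """Euclidean distance between two harmonic-level profiles (in dB)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def group_by_quality(tones, threshold_db=3.0):
    """Group tones whose spectral profiles are close, regardless of F0.

    Each category keeps the profile of its first member as a prototype.
    The threshold is an arbitrary assumption, not a perceptual constant.
    """
    categories = []  # each entry: (prototype_profile, [member F0s])
    for f0, profile in tones:
        for prototype, members in categories:
            if spectral_distance(profile, prototype) <= threshold_db:
                members.append(f0)
                break
        else:
            # No existing category is close enough, so start a new one.
            categories.append((profile, [f0]))
    return [members for _, members in categories]


if __name__ == "__main__":
    for index, f0s in enumerate(group_by_quality(TONES), start=1):
        print(f"category {index}: F0s = {f0s}")
```

With these invented values, the three lower tones share one gently sloping profile and the three higher tones share a steeper one, so two categories emerge even though every tone has a different F0. No labels are needed for the grouping to occur; names for the categories would be assigned afterwards.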
PLOT SETTING 2—Vocal Fold Oscillation and Its Two Primary Modes. The most widely known theory of how mucosal waves are initiated and sustained is called the myoelastic-aerodynamic theory of vocal fold vibration (Van Den Berg, 1958). The theory proposes that vocal fold vibration (complex mucosal waving) occurs when:
1. the vocal fold surfaces are sufficiently compliant and elastic;
2. the vocal folds are adducted enough to create a sufficiently narrow glottis; and
3. the pressure-induced airflow force is great enough.
During each oscillation cycle of the adducted vocal folds, the folds are in alternate closed and open phases. During the closed phase of each vocal fold cycle, subglottal air pressure very rapidly builds up enough to displace the surface layers of vocal fold tissue and trigger an open phase. At one time, the Bernoulli effect was thought to bring the vocal folds back together for the next closed phase, but in 1988, scientific reservations were expressed (Titze, 1988; see also Titze, 2000, pp. 109-111). While the Bernoulli effect is incidentally present, its influence has been considerably overstated in prior versions of the myoelastic-aerodynamic theory of vocal fold vibration. The greater influence over the return of the vocal folds from open phase to closed is:
1. the constraining elastic properties of the folds themselves which reverse their opening motion back toward closure, and
2. “...the synchrony between the driving (subglottal) pressure and (alterations in) tissue velocity...” during mucosal waving cycles (Titze, 2000, p. 110; parenthetical expressions added for clarity).
Finally, there are two primary modes of vocal fold vibration. One mode can be described as a repeated medial-to-lateral-to-medial-to-lateral (and so on) oscillatory motion. The other mode can be described as a repeated bottom-to-top waving oscillatory motion that mostly occurs in the more surface tissues of the vocal folds. In most speaking and singing, the two modes occur simultaneously.
PLOT SETTING 3—Contributions to Basic Voice Qualities by Internal Larynx Muscles and the Vocal Tract. The vocal fold closer and opener muscle groups primarily (not exclusively) contribute one range of basic voice qualities (see Figure 1), and the vocal fold shortener and lengthener muscles contribute to another range of basic voice qualities (registers, see later). When vocal folds are incompletely adducted to some degree, they oscillate to create vocal sound waves, but pressurized air molecules also are flowing through the opening between the folds, thus producing air turbulence noise. A combination of vocal tone and air turbulence noise produces a breathy family of voice qualities.
[Figure 1: Vocal Fold Closer Muscles’ Contributions to Basic Voice Qualities. Recovered label: Clear & Richer Family.]
When vocal folds are adducted with a range of higher forces, and lung-air pressure is correspondingly high, a pressed-edgy family of voice qualities is produced. Other terms used for this range of voice qualities are tense, tight, strained, strident, constricted, and harsh.
When vocal folds are sufficiently adducted and “balanced” with appropriate lung-air pressure, a clear and richer family of voice qualities can be produced. Because degrees of adductory force are a major source of vocal fold amplitude/intensity during speaking and singing, this family of voice qualities changes with perceived vocal volume, but none of these voice qualities have breathy or pressed-edgy characteristics. At softer volume levels, a firm (not breathy) and flutier voice quality can be perceived. At “middle-loud” volume levels, a richer (more upper partials) but mellow-warm voice quality can be perceived. At “loud” volume levels, a richest (even more upper partials) and brassier voice quality can be perceived.
When the vocal tract changes its length and/or circumference dimensions, it has differential effects on the pressures within the voice source sound spectra that are passing through it (see Figure 2). Very generally, when vocal tract dimensions are enlarged toward an extreme, lower partials are amplified and upper partials tend to be damped, thus influencing an over-full or over-dark family of voice qualities. Conversely, when vocal tract dimensions are diminished toward an extreme, higher partials are amplified and lower partials are significantly damped, thus influencing an over-bright family of voice qualities. When vocal tract dimensions are optimally configured, a relatively balanced complement of higher and lower partials passes through and out of the vocal tract, thus influencing a balanced resonance family of voice qualities that can range between fuller and brighter.
[Figure 2: Contributions to Basic Voice Qualities. Recovered labels: Balanced Resonance Family; Throaty or “Back”.]
PLOT SETTING 4—Resonance Effects on Vocal Fold Tissues and Internal Larynx Muscles. Above the larynx and its vocal folds, there is a “tube” of sorts. Its two ends, vocal folds and lips, can be open or closed. This tube is officially labeled as the vocal tract because it is the tract through which vocal sound waves pass, and it can vary its dimensions in numerous ways. Very generally, two vocal tract cavities can open or narrow: (1) the pharyngeal cavity (“throat part”) and (2) the oral cavity (“mouth part”). During speaking and singing, the vocal tract provides an open end (the lips or teeth) from which vocal sound waves can radiate.
The trachea (“windpipe”) is a rounded tube that extends downward from the larynx and its vocal folds. Its general dimensions, especially length, increase in children as they grow to their adult size. Dimensions vary between adult male and female human beings, and its length and circumference can be altered physiologically to a degree. In ordinary life circumstances, most of these dimensional differences are comparatively small. Just like all tubes, of course, the trachea has a resonance frequency.
Vocal sound pressure waves are created by the oscillating vocal folds and then radiate upward through the vocal tract. Simultaneously, sound pressure waves also are created that radiate downward into, and “bounce around” within, the trachea. Those sound pressure waves are “trapped” inside the trachea, so during voicing, they are continually impacting on the underside of the vocal folds. As the fundamental frequency of those sound waves approaches and then matches the resonance frequency of the trachea, the pressures within the reverberating tracheal sound waves are increased (reinforced or amplified). In other words, those sound waves have greater intensity, and the force of their impact on the underside of the vocal folds is that much greater.
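The idea that a tube has a resonance frequency, and that reinforcement grows as F0 approaches it, can be illustrated with a rough sketch. The code below treats the trachea as an idealized uniform tube that is closed at one end and open at the other (a quarter-wave resonator). That model, the function names, the effective length of 0.175 m, and the speed of sound of roughly 350 m/s in warm, humid air are all assumptions made for illustration; the real subglottal airway branches and has yielding walls, so actual resonance frequencies will differ.

```python
def quarter_wave_resonance_hz(tube_length_m, speed_of_sound_m_s=350.0):
    """First resonance of an idealized tube closed at one end, open at the other.

    Assumption: the trachea is approximated as a uniform, rigid-walled tube.
    This is a simplification and not a measured property of any larynx.
    """
    if tube_length_m <= 0:
        raise ValueError("tube length must be positive")
    return speed_of_sound_m_s / (4.0 * tube_length_m)


def relative_distance_to_resonance(f0_hz, resonance_hz):
    """Fractional distance between an F0 and the resonance (0.0 means a match)."""
    if f0_hz <= 0 or resonance_hz <= 0:
        raise ValueError("frequencies must be positive")
    return abs(f0_hz - resonance_hz) / resonance_hz


# Hypothetical effective tube length, chosen only for illustration.
resonance = quarter_wave_resonance_hz(0.175)  # about 500 Hz for these values

# As F0 rises toward the resonance, the relative distance shrinks.
for f0 in (250.0, 400.0, 480.0):
    print(f0, round(relative_distance_to_resonance(f0, resonance), 3))
```

In this sketch, a smaller relative distance stands in for stronger reinforcement of the tracheal sound pressure waves; it does not quantify the force on the vocal folds.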
The increased pressures can produce an interference with vocal fold oscillations where they are initiated, on the underside (trachea side) of the vocal folds. The interference has been referred to as acoustic loading of the vocal folds. Acoustic loading has been suggested as a significant influence on the passaggio effects that are related to vocal registers and their transitions (Austin, 1992; Austin & Titze, 1997; Titze, 1983, 1984, 1988, 2000).
Even when acoustic loading is not present, acoustic overloading of the vocal folds can occur. When the throat or mouth areas of the vocal tract, or both, are sufficiently narrowed, the sound pressure waves within the vocal tract are deflected and begin to reverberate inside the vocal tract. These sound pressure waves can interfere with vocal fold oscillations as they are passing over the topside (throat side) of the vocal folds. Those topside sound pressure wave impacts, combined with the always-occurring tracheal sound pressure wave impacts, then produce an acoustic overloading effect.
PLOT SETTING 5—Activation of the Larynx by Neuromotor and Neurosensory Processing. Two general types of motor functions occur in human beings: (1) Reflexive or involuntary functions, and (2) learned or voluntary functions (see Endnote 2 for background). Reflexive vocal-motor functions (e.g., vocal startle sounds) are initiated from the paired nucleus ambiguus areas of the brainstem's medulla oblongata and pass through the peripheral nervous system's paired tenth cranial (vagus) nerves to the relevant muscles (Hollien & Gould, 1990; Webster, 1995, pp. 282-298).
Deliberate or learned vocal-motor functions are initiated in areas within the frontal lobes of the two cerebral hemispheres, using sensory and/or memory input for guidance (see Fuster, 1997, 2003). Eventually, habitual learned vocal-motor functions are triggered within the motor areas of the frontal lobes, but nearly all of the actual motor coordinations are enacted by subcortical neural networks within the basal ganglia, cerebellum, and brainstem (see Endnote 3 for learning background).
The left and right superior laryngeal nerves (SLN) branch away from the paired vagus nerves to innervate the left and right sides of the larynx (see Figure 4). The external branches of the SLN only supply motor innervation to the paired cricothyroid muscles, which are the primary vocal fold lengtheners. The internal branches of the SLN supply sensory reception only for the laryngeal mucosa that is immediately above vocal fold level, and for some muscles of the larynx.
The left and right recurrent laryngeal nerves (RLN) (see Figure 3) supply motor innervation for all internal laryngeal muscles except the cricothyroids. That includes the thyroarytenoid muscles—the primary vocal fold shorteners—and the adductory and abductory muscles that close and open the vocal folds, respectively. They also supply sensory reception for the laryngeal mucosa that is immediately below the vocal fold level, and for some muscles of the larynx.
Figure 3: Illustration of key speech-voice motor areas in the central nervous system and their connection to cranial nerve X (vagus) in the peripheral nervous system. [From Professional Voice: The Science and Art of Clinical Care, 2nd edition, by R.T. Sataloff (Ed.) ©1997. Reprinted with permission of Delmar, a division of Thomson Learning. FAX 800 730-2215.]
Figure 4: Illustration of the peripheral motor and sensory innervation of the larynx. [From Aronson, A. (1980). Clinical Voice Disorders (2nd Ed.).]
To appreciate the capabilities of the larynx for speed and precision of its neuromuscular coordinations, an understanding of its muscle fiber types and related motor unit types is necessary (see Table 2 and Endnote 2 for background). Normal genetic endowment provides a greater percentage of Type S motor units and their Type I muscle fibers in most of the external laryngeal muscles. The internal laryngeal muscles have all categories of the motor unit and muscle fiber types, including a significant number of the S motor units and Type I muscle fibers, but Type FR and FInt motor units and Type IIA fibers are predominant (see Table 2 and Endnote 1; also Bendiksen, et al., 1981; Claasen & Werner, 1992; Cooper, et al., 1993; Titze, 2000a). The small internal laryngeal muscles are estimated to have about 100 motor units per muscle. The capability, therefore, for high variability in the motor unit recruitment patterns and action potential frequencies is present in laryngeal muscles.
That means that laryngeal muscles are capable of a fairly wide range of slow-to-fast contractile speeds and have the capability for: (1) extensive and vigorous use, (2) high agility, (3) subtle, intricate adjustments, and (4) considerable resistance to fatigue when they are activated with optimum efficiency and are well conditioned. In fact, muscles of the larynx are regarded as having the second fastest contraction capability in the whole body (Mårtensson & Skogland, 1964; eye muscles are fastest). That capability is related to survival functions such as: (1) preservation and facilitation of breathing, (2) high-speed closing of the airway to protect the lungs, and (3) making loud sounds quickly to frighten predators (vocal startle response).
Table 2: Muscle fiber types combined with motor unit types (see Gordon & Patullo, 1993).
Muscle Fiber Type I: Slow-speed oxidative fibers (SO) that are highly resistant to fatigue, are smaller, and are unable to generate as much force as Type IIB fibers.
Muscle Fiber Type IIA: Fast-speed oxidative and glycolytic fibers (FOG) that are moderately resistant to fatigue and intermediate in size and force generation.
Muscle Fiber Type IIB: Fast-speed glycolytic fibers (FG) that have low fatigue resistance, and tend to be larger and capable of generating greater contractile force.
Motor Unit Type S: Slow and fatigue resistant (oxidative muscle fibers).
Motor Unit Type FR: Fast and fatigue resistant (oxidative and glycolytic muscle fibers).
Motor Unit Type FInt: Fast and fatigue intermediate (more glycolytic than oxidative fibers).
Motor Unit Type FF: Fast and fatigable (glycolytic muscle fibers).
The realization of optimal high-speed response and fatigue resistance in laryngeal muscles depends on the nature of their neural input. Consistent with research in neuromuscular physiology, a reasonable assumption can be made that when laryngeal muscles are activated with reasonable frequency, but not very strenuously over longer time periods, then presumably, Type S motor units will become larger and their metabolic capacity changes so that the number of neural impulses (action potentials) that they can generate increases, to some degree. Presumably, then, more protein will be added to the Type I muscle fibers, thus increasing their size. More capillaries will be grown around those fibers to supply bloodflow-delivered oxygen and nutrients in larger amounts.
When laryngeal muscles are engaged for shorter bursts of strong, vigorous activity (such as shouting or sung pitches that are high and loud), they develop fast speed capability by adding protein to the Type II muscle fibers, thus increasing their size (bulk) and their capacity for responding to more rapid motor unit neural impulses. Cellular changes also occur that enable increased metabolic activity. When speakers or singers must activate their laryngeal muscles with strength over longer time periods, then protein is added to the more extensive Type IIA fibers and cellular changes occur to increase both the speed of motor unit response and resistance to fatigue. Well conditioned laryngeal muscles are a fundamental requisite for skilled, expressive speaking and singing (more below; see Saxon & Schneider, 1995; Thurman & Welch, Book II, Chapter 15).
In the larynx, a voluntary control system has been identified that initiates vocal fold closing and opening and lengthening and shortening, and monitors their continuation so that adjustments can be made to match desired vocal intentions (Larson, 1988; Strong & Vaughan, 1981; Webster, 1995; Wyke, 1983a). Neuromuscular motor networks are linked with sensory receptor networks to form feedback loops to guide vocal coordinations toward fulfilling the bodymind's intentions. Motor networks signal selected muscles to contract in particular sequences, speeds, and intensities (for additional information on laryngeal capabilities, see Endnote 4).
Laryngeal motor networks are modified by innervation from the sympathetic and parasympathetic divisions of the autonomic nervous system (ANS) (Basterra, et al., 1989). The ANS is prominently influenced by the brain's limbic system, and together they sometimes are referred to as the emotional motor system. Feeling or emotional states, therefore, affect vocal function (Graney & Flint, 1993; Holstege, et al., 1996; Thurman & Welch, 2000, Book I, Chapters 7 and 8).
PLOT SETTING 6—Larynx Muscle and Vocal Fold Tissue Conditioning. If we just consider the physical state of larynx anatomy and physiology, then there are two conditioning categories that need attention: (1) larynx muscles and connective tissues, and (2) the vocal fold cover tissues.
A body’s skeletal muscles move a body’s skeleton. Larynx muscles are skeletal muscles that move the skeletal parts of the larynx which, in turn, move and “shape” its internal soft, non-muscle tissues. The muscles of the larynx are made of the same types of anatomic material as limb and torso muscles, and they respond to use in nearly the same ways. When the extent and vigor of neuromuscular activation is increased or decreased, four characteristics change:
1. Strength. With increased use, the capacity for muscle contraction intensity and force generation increases (sign of conditioning). With decreased use, the capacity for contraction intensity and force generation decreases (sign of deconditioning).
2. Endurance. With increased use, higher contraction intensity can be sustained for longer and longer periods of time before fatigue sets in (sign of conditioning). With decreased use, fatigue sets in earlier and earlier (sign of deconditioning).
3. Precision, speed, and “smoothness” of neuromuscular coordinations. With increased use, the precision, speed, and smoothness of neuromuscular coordinations are increased (sign of conditioning). With decreased use, the precision, speed, and smoothness of neuromuscular coordinations are diminished (sign of deconditioning).
4. Bulk. With increased use over time, genes are activated to produce additional constitutive protein within each of a muscle’s fibers (to increase contractile properties), and thus, the size of the whole muscle increases (hypertrophy; a sign of conditioning). With decreased use over time, an atrophic process occurs and constitutive protein in each muscle fiber is gradually reduced, and thus, the size of the whole muscle and its contractile properties decrease (sign of deconditioning).
The generic term for ligaments, tendons, and the like, is connective tissue. Typically, tendon tissue attaches muscles to skeletal parts, and ligament tissue attaches skeletal parts to each other. Actually, connective tissue also is diffused through the muscle to form its membranes, and so forth. The core function of connective tissue is to hold a body’s skeleton together. Its nature, therefore, is to shrink. Range of skeletal motion at joints is diminished if connective tissue is continually slackened, because it shrinks and then cannot allow the actual capable range of motion to occur. Appropriately stretching the connective tissues in legs and arms, for example, increases the range of motion in skeletal joints, and that increases limb movement capabilities for motor skills.
When epithelial tissue endures relatively forceful impact and/or shear stresses over time, the tissue adapts to the encountered circumstances. Micro-level changes occur within the tissues that increase restoration processes and an adaptation to “demand” on the tissues. The immune system is likely to produce inflammation in the affected tissues, and genes in those tissues will activate to produce tissue changes that will make the tissues “tougher” or more resilient. A good analogy would be the reactions of ungloved “soft” hands when their surfaces handle heavier and rougher-surfaced materials over time.
A CURRENT, SCIENCE-BASED THEORY OF VOCAL REGISTERS:
Titze (2000, p. 282) describes vocal registers as “...perceptually distinct regions of vocal quality that can be maintained over some ranges of pitch and loudness” (italics added). Hollien (1974) described a vocal register as “...a totally laryngeal event; it consists of a series or a range of consecutive voice frequencies which can be produced with nearly identical phonatory quality...” (italics added). He further stated that “...the operational definition of a register must depend on supporting perceptual, acoustic, physiologic, and aerodynamic evidence.”
During speaking and singing, the primary agonist-antagonist functions of the thyroarytenoid and cricothyroid muscles result in changes of vocal fold fundamental frequencies (F0s), and they do so by lengthening-shortening, thinning-thickening, and tautening-laxing the folds. When these vocal fold configuration changes occur, they alter the voice source spectra that are introduced into the vocal tract. Voice source spectra are then modified by the changing dimensions of the vocal tract, but the vocal tract can only modify what the larynx introduces into it. Radiated spectra, therefore, will retain various spectral characteristics that were introduced into the vocal tract when the larynx and respiratory system created the original (voice source) spectra. Listening brains then are capable of perceiving voice quality changes that are originated at the larynx level, and distinguishing them from voice quality changes that are induced by changing vocal tract dimensions.
This theory of vocal registers subscribes to the perspective that the voice qualities that are referred to as vocal registers are brought into acoustic existence by varied states of the oscillating vocal fold tissues. Those varied states are preponderantly altered by variable coordinations of the internal larynx muscles, but most predominantly by the thyroarytenoid (shortener) and cricothyroid (lengthener) muscles.
We propose five categories of shortener-lengthener muscle adjustments that produce five perceivable categories of vocal register voice quality that we will describe in anatomic, physiologic, and acoustic terms. The voice quality categories are correlated with changes of vocal fold length-thickness-tautness and thus with fundamental frequency (F0) and voice source spectra that are perceived by listeners as pitch and voice quality or timbre.
We have selected five word labels for the five vocal register voice quality categories according to the following criteria. The word labels must:
1. convey direct referential correlation with at least one universal, measurable, and perceivable parameter of vocal acoustics;
2. be “relatable” to vocal anatomy and function as defined within the anatomical, physiological, and voice sciences; and
3. be easy to assimilate into the colloquial English of people who are not familiar with the technical jargon of the voice professions.
The voice register labels that we have selected are:
1. pulse register
2. lower register
3. upper register
4. falsetto register for men, flute register for women
5. whistle register
Pulse register is produced when the cricothyroid muscles (lengtheners) are uncontracted so that vocal fold length is determined solely by increases and decreases in the contraction of the thyroarytenoids (shorteners). The vocal fold mucosa, therefore, is quite short, thick, and lax. There is a comparatively minimal range of subglottal air pressures and minimal adductory force, resulting in a minimal aerodynamic flow between the vocal folds. Pulse register can be produced in both speaking and singing.
One defining sound characteristic of this register is a series of sound bursts with audible gaps between the bursts. The recorded waveforms show a series of “wave packets” with a temporal gap in between (see Figure 5). Some vocalists can intentionally shorten and lengthen the temporal gaps, mostly by subtle increases and decreases of subglottal air pressure and aerodynamic flow and slight alterations in the vocal fold shortener and closer muscles. When a vocalist's vocal folds are thick enough, by genetic endowment or by sufficient swelling, the vocalist is capable of increasing the subglottal pressure and vocal fold adduction just enough to shorten the temporal gaps to produce a range of very low-frequency sustained tones. At the present time, speech-voice professionals label the audible gap version of this register as vocal fry or fry. Presumably, this perceived sound quality reminded some people of the sound of slow-frying food. The CoMeT committee referred to pulse register as Register #1.
The more a pulsed F0 falls below about 70 Hz, the more experienced voice judges identify the continuing sound as a series of pulses with gaps (see Figure 5). The more a pulsed F0 rises above about 70 Hz, the more experienced voice judges identify continuing sound as vocal tone rather than bursts and gaps. The 70 Hz mark is the average crossover frequency between the perception of pulses and the perception of sustained sound within the pulse register, but the crossover can occur anywhere between 60 Hz and 80 Hz (Hollien & Michel, 1968; Hollien, 1974, 1985; Titze, 2000, pp. 283-288). Different vocal tract vowel shapes can produce acoustic overloading of the vocal folds and thus interrupt the continuation of pulsed sound (presented later and in Titze, 2000, pp. 286, 287). For instance, the vocal tract opening and neck-throat ease of an /uh/ vowel is more conducive to continuation of pulsed sound, whereas the tongue and lip narrowing of the vocal tract on an /oo/ vowel is more likely to produce a degree of overloading. Modification of vocal tract vowel shape to avoid the overloading, then, will be necessary for continuation of vocal sound.
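The perceptual crossover described above can be sketched as a simple decision rule. The function name and the hard threshold are illustrative assumptions: the judges' responses shift gradually around the crossover, so a single cutoff is only a coarse stand-in for that graded perception. The default of 70 Hz and the allowed 60 Hz to 80 Hz band are taken from the text.

```python
def perceived_pulse_quality(f0_hz, crossover_hz=70.0):
    """Coarse label for how a pulse-register F0 is likely to be heard.

    Assumption: a hard threshold replaces the gradual shift in listener
    judgments; crossover_hz may be set anywhere in the reported 60-80 Hz band.
    """
    if f0_hz <= 0:
        raise ValueError("F0 must be positive")
    if not 60.0 <= crossover_hz <= 80.0:
        raise ValueError("crossover is reported to lie between 60 Hz and 80 Hz")
    if f0_hz > crossover_hz:
        return "sustained tone"
    return "pulses with gaps"


print(perceived_pulse_quality(50.0))  # below the crossover
print(perceived_pulse_quality(90.0))  # above the crossover
```

A listener-specific crossover can be supplied through the second argument, which reflects the reported spread between individual judges.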
Figure 5: (a) is a graph that shows the percentage of experienced voice judges that perceptually discriminated F0s that were (1) continuous, sustained sound, versus (2) a series of sound pulses with audible gaps in between. (b) shows recorded waveforms of temporal gap pulses (vocal fry “wave packets”). [From I.R. Titze, Principles of Voice Production. Copyright © 2000,
Pulse register may be developed by some singers into an unusually low singing range. Fry also can be used as an initial pathfinder (stepping-stone) to help some singers-in-training begin to develop their lower register with physical efficiency. This register is easier to produce when the vocal folds are swollen, so singers with a history of fairly frequent tobacco smoking and alcohol drinking have a much greater chance of developing their pulse register coordination.
Some Russian and Eastern European male classical music singers are well known for developing this register and have become contrabass singers (German: Strohbass = straw bass, having a voice quality that suggests the sounds that are made when straw is crushed). In the choral singing of those cultures, contrabasses sometimes sing the bass part one octave lower than the written notation, contributing to a characteristically thick and dark tonal quality.
Members of some Asian cultures use pulse register in chanting—Tibetan monks, for instance. Some cultures have developed highly skilled “mouth” or “throat singing” that uses a sustained, very firm low-pitched, pulse register drone to produce an array of overtones. The singers then shape their vocal tracts in special ways to amplify harmonic regions so prominently that melodic contours and other acoustic effects can be produced.
Lower register voice qualities are produced when both the thyroarytenoid and the cricothyroid muscles are simultaneously contracted (primary shorteners and lengtheners, respectively), but the thyroarytenoids are more prominently contracted than the cricothyroids (Hirano, et al., 1970; Vennard, et al., 1970a,b; Titze, 2000; Titze, et al., 1989). Various agonist-antagonist contractions of the two muscles result in a range of stabilizations in vocal fold length, thickness, and tautness. The prominence of contraction by the thyroarytenoid muscles results in generally shorter, thicker, more lax vocal fold cover tissues and a lower range of F0s.
When compared to the essential quality of upper register, the essential quality of lower register can be described as thicker and more full-bodied. That voice quality would be reflected in its voice source spectra, with the lower partials having greater intensity when compared to the lower partials of upper register voice source spectra. At the present time, various voice professionals have labeled this register as chest register, modal register, or heavy mechanism. The CoMeT committee referred to this register as Register #2. We recommend the term lower register to reflect the pitch-dependent nature of this register's laryngeal coordinations, and to eliminate the implicit assumption that its activation occurs in the chest.
The increased intensity in the lower partials of the voice source spectra is produced when a greater mass of vocal fold tissue is involved in vocal fold oscillation (as compared to thinner vocal fold tissue mass when upper register qualities are produced). Greater tissue mass is produced when the vocal folds are shorter and thicker, and thus, adduction of both the superior and inferior areas of the vocal folds occurs (see Figure 6a; Hirano, et al., 1970; Vennard, et al., 1970a,b; Titze, et al., 1989; Titze, 2000; Vilkman, et al., 1995). The thyroarytenoid muscle, including the muscularis portion, bulges the portion of the vocal folds that is below the level of the arytenoid cartilage's vocal processes.
The vocal ligaments also are more lax and can participate in vocal fold oscillation, and the vocalis portion of the TA muscle undulates as well (Titze, 2000). The vocal fold ligament and the thyrovocalis muscle tissues vibrate with much less amplitude, however, than the outer, superficial layer of the cover because of their greater structural stiffness. Typically, that means that during vocal fold oscillation there is:
1. a larger bottom-to-top contact area of the oscillating surface tissues; and
2. greater depth of tissue movement.
These characteristics of vocal fold function result in longer closed phase times, that is, the closed quotient (CQ) of each vocal fold oscillation is nearly always above 50% of the total of each single oscillation cycle (0.5). These functions are observable in electroglottographic (EGG) recordings. The EGG recording in Figure 6b shows the greater CQ. The EGG for lower register shows a broader peak than the one for upper register. The “knee” in the lower register EGG waveform reflects the greater contact time that is produced by the bulging of the vocal folds below the level of the arytenoid vocal processes (Alipour & Scherer, 2000; Titze, 1990, 2000).
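The closed quotient can be sketched as the fraction of one oscillation cycle during which the folds are judged to be in contact. The code below uses a criterion-level approach on one cycle of an EGG-like signal. The function name, the 50% criterion level, the assumption that larger sample values mean more vocal fold contact, and the synthetic sample data are all illustrative; real EGG analysis needs cycle segmentation and a justified criterion.

```python
def closed_quotient(cycle_samples, criterion_fraction=0.5):
    """Estimate CQ for one cycle of evenly sampled, EGG-like contact values.

    Assumptions: higher values mean more vocal fold contact, and the folds
    count as closed whenever the signal exceeds a criterion level set at
    criterion_fraction of the cycle's peak-to-peak amplitude.
    """
    lowest, highest = min(cycle_samples), max(cycle_samples)
    if highest == lowest:
        raise ValueError("cycle has no amplitude variation")
    level = lowest + criterion_fraction * (highest - lowest)
    closed_count = sum(1 for sample in cycle_samples if sample > level)
    return closed_count / len(cycle_samples)


# Synthetic cycle: 6 of 10 samples show contact, so CQ is 0.6 (above 0.5).
synthetic_cycle = [1.0] * 6 + [0.0] * 4
cq = closed_quotient(synthetic_cycle)
oq = 1.0 - cq  # open quotient is the remainder of the cycle
print(cq, oq)
```

Under these assumptions, a cycle with CQ above 0.5 is consistent with the longer closed phase described for lower register, and a cycle with OQ above 0.5 is consistent with a longer open phase.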
Upper register voice qualities are produced when both the thyroarytenoid (primary shortener) and the cricothyroid (primary lengthener) muscles are simultaneously contracted, but the cricothyroids are more prominently contracted than the thyroarytenoids (Titze, 2000). Various agonist-antagonist contractions of the two muscles result in a range of stabilizations in vocal fold length, thickness, and tautness. The prominence of contraction by the cricothyroid muscles results in generally longer, thinner, more taut vocal fold cover tissues and a higher range of F0s (Hirano, Ohala, & Vennard, 1970; Shipp & McGlone, 1971; Titze, 2000).
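The muscle-dominance descriptions of the pulse, lower, and upper registers can be summarized in a short sketch. The function name and the use of normalized activation values between 0 and 1 are hypothetical conveniences, not measured quantities, and the sketch covers only the three registers whose shortener-lengthener relationships are described up to this point.

```python
def register_from_activation(ta_activation, ct_activation):
    """Label a register from relative TA (shortener) and CT (lengthener) activity.

    Assumptions: activations are hypothetical normalized values in [0, 1].
    Cases that the descriptions do not cover return "undetermined".
    """
    for value in (ta_activation, ct_activation):
        if not 0.0 <= value <= 1.0:
            raise ValueError("activation values must lie between 0 and 1")
    if ct_activation == 0.0 and ta_activation > 0.0:
        return "pulse register"  # CT uncontracted, length set by TA alone
    if ta_activation > ct_activation > 0.0:
        return "lower register"  # both contracted, TA more prominent
    if ct_activation > ta_activation > 0.0:
        return "upper register"  # both contracted, CT more prominent
    return "undetermined"


print(register_from_activation(0.3, 0.0))
print(register_from_activation(0.7, 0.4))
print(register_from_activation(0.3, 0.8))
```

Equal activation levels, or an inactive thyroarytenoid, are left undetermined here because the descriptions above do not assign them to a register.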
Figure 6: (A) is a drawing that compares a cross-section of a right vocal fold configured for upper register (left side) and for lower register (right side).
The essential quality of this register, when compared to the essential quality of lower register, can be described as thinner and lighter. The greater intensity that is present in the lower partials of the lower register would no longer be present, and upper register voice source spectra would reflect that loss. Typically, all of the partials in upper register voice source spectra would have less overall intensity when compared to lower register voice source spectra. Proportionately, the partials nearest the F0 are the most intense. These partial intensity differences create the distinct perceptual category described above.
The decreased intensity in the voice source partials is produced when comparatively thinner mass of vocal fold tissue is involved in vocal fold oscillation (as compared to the thicker vocal fold tissue mass when lower register qualities are produced). Thinner tissue mass is produced when the vocal folds are lengthened and thinned, and thus, adduction only of the superior portion of the vocal folds occurs (see Figure 6A). The vocal ligaments bear the vocal folds' passive stretch tension so that only the lengthened and thinned epithelium and superficial layer of the lamina propria can participate in oscillatory motion (Titze, 2000; Vilkman, et al., 1995). Typically, that means that during vocal fold oscillation there is:
1. a smaller bottom-to-top contact area of the oscillating surface tissues; and
2. more shallow depth of tissue movement.
These characteristics of vocal fold function result in longer open phase times, that is, the open quotient (OQ) of each vocal fold oscillation is nearly always above 0.5. Some trained singers are able to produce this register with a closed quotient (CQ) that is slightly above 0.5 (Howard, et al., 1995), presumably because they have the ability to strongly adduct their vocal folds. These functions are observable in electroglottographic (EGG) recordings. The EGG for upper register shows a narrower peak than the one for lower register, reflecting the fact that only the more superior area of the bottom-to-top vocal fold mucosa is in contact (see Figure 6A and B).
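The relationship between the open and closed quotients can be illustrated with a short calculation. The following is a minimal sketch, not the measurement method of the cited studies: it assumes a single, already-segmented EGG cycle in which larger values mean more vocal fold contact, and it uses a simple amplitude-threshold criterion (50% of the cycle's peak-to-peak range) to decide when the folds count as "closed". The function name, the threshold, and the sample values are all hypothetical.

```python
def contact_quotients(egg_cycle, threshold_ratio=0.5):
    """Return (CQ, OQ) for one EGG cycle.

    egg_cycle: equally spaced samples covering exactly one oscillation;
               larger values are assumed to mean more vocal fold contact.
    threshold_ratio: fraction of the peak-to-peak range above which a
               sample is counted as "closed" (an assumed convention).
    """
    lowest, highest = min(egg_cycle), max(egg_cycle)
    if highest == lowest:
        raise ValueError("cycle has no amplitude variation")
    level = lowest + threshold_ratio * (highest - lowest)
    closed_samples = sum(1 for sample in egg_cycle if sample > level)
    cq = closed_samples / len(egg_cycle)
    return cq, 1.0 - cq


# Hypothetical cycles of 10 samples each (illustrative shapes only).
narrow_peak = [0, 0, 0, 1, 3, 1, 0, 0, 0, 0]  # brief contact
broad_peak = [0, 2, 3, 3, 3, 3, 3, 2, 0, 0]   # prolonged contact

for label, cycle in (("narrow peak", narrow_peak), ("broad peak", broad_peak)):
    cq, oq = contact_quotients(cycle)
    print(f"{label}: CQ = {cq:.2f}, OQ = {oq:.2f}")
```

Because OQ and CQ sum to 1 under this definition, a narrower contact peak corresponds directly to an OQ above 0.5, and a CQ slightly above 0.5 corresponds to an OQ slightly below 0.5.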
At the present time, various voice professionals label this register as head, falsetto, loft, or light mechanism. The CoMeT committee referred to this register as Register #3. In pre-scientific vocal pedagogy, this register is labeled head by some, and falsetto by others. Still others use falsetto for register #3 and head for register #4. These varied uses have produced considerable semantic confusion, especially among English-speaking people. We recommend the term upper register to reflect the pitch-dependent nature of this register's laryngeal coordinations, and to eliminate the implicit assumption that its activation occurs in the head.
Falsetto Register (males) and Flute Register (females)
Among speakers of colloquial English, there is no confusion about the meaning of the term falsetto voice. It refers to a voice quality that adult males can produce within the female pitch range and is female-like in quality. Because of this near-universal identification, labeling this register with any other term would be confusing to a quite large majority of English speakers. Likewise, to use the same term for the same laryngeal function in females also would be confusing. The term flute was selected as the term for this register in females because its essential quality resembles the tone quality of the flute instrument. It is borrowed from Hollien (1974).
Although there is great variability among individual singers, and there are significant differences of anatomical dimension between male and female vocal structures, we propose that the basic biomechanics of falsetto/flute register are essentially the same in both males and females. The thyroarytenoid muscles (primary shortening influence) release completely so that vocal fold length is determined entirely by action of the cricothyroids (primary lengtheners) (Ardran & Wulstan, 1967; Titze, 2000; Welch, et al., 1988). In addition, the CT muscles are assisted by some of the external larynx muscles at the highest and lowest F0s (Vilkman, et al., 1995). Zero contraction of the thyroarytenoids removes all of their shortening, thickening, and laxing influences on the vocal fold cover tissues. In male falsetto, for instance, a typical lower pitch range appears to be about E3 to C4. In that pitch range, the vocal folds are as short and thick and lax as they can be without activation of the thyroarytenoid muscles. Also, without the adductory gesture that is provided by thyroarytenoid contraction, the membranous portions of the vocal folds are likely to be separated (“bowed” configuration) and thus a breathy quality is likely to be perceived.
As F0 is increased in the falsetto/flute register coordination, optimal action by the primary adductor muscles (closers), combined with the “stretching” (lengthening) action of the cricothyroids, results in complete adduction and a fairly wide intensity range. In this register's higher pitch range, the vocal fold cover tissues are in their longest, thinnest, and most taut range of configurations. In this state, the ligament layers of the lamina propria bear even more of the passive stretch tension than they bore in upper register, and vocal fold oscillation occurs only in the epithelium and the superficial layer of the lamina propria. Typically, that means that there is:
1. a thin vertical tissue mass that creates a thin bottom-to-top contact area for the surface tissues;
2. minimal depth of oscillation in the vocal fold cover tissues.
At the present time, various voice professionals refer to this register as flute, whistle, or flageolet in females and falsetto or pure falsetto in males. The CoMeT committee referred to the falsetto/flute register as Register #4. The essential quality of falsetto/flute register, when compared to the essential qualities of the lower and upper registers, can be described as lightest and thinnest, although an optimally longer vocal tract and expanded pharyngeal cavity can add “fullness” or a “darker color” to the quality that is contributed by the vocal fold voice source (Shipp, et al., 1988). The voice source's contribution to perceived voice quality is reflected in its voice source spectra. Typically, this register coordination produces the fewest number of partials compared to the other registers. The F0 is the most prominent partial, and the overtones that are present have comparatively minimal intensity (Titze, 2000; Walker, 1988). These voice source spectrum characteristics resemble those produced when flutes are played in the same F0 range, and result in a distinct perceptual category that identifies this register.
When falsetto/flute register coordination is used in its lowest F0 range, or when the laryngeal adductor muscles are underconditioned, then the longest open phase times are produced compared to upper and lower registers. The OQ of each vocal fold oscillation is nearly always above about 0.7. These functions are observable in electroglottographic (EGG) recordings. The EGG for falsetto/flute register shows a narrower peak compared to the one for upper and lower registers, reflecting the fact that a small amount of the superior area of the vocal fold mucosa is waving. On the other hand, when falsetto/flute register coordination is used in its middle to highest F0 range, or when the laryngeal adductor muscles are well conditioned, the OQ approaches 0.5. In some professional falsettists and countertenors, the OQ falls below 0.5 (Shipp, et al., 1988; Welch, et al., 1988, 1989).
Male falsetto quality has been referred to with the value-laden term “effeminate” (Fuchs, 1963; Miller, 1977) and with such terms as “unnatural”, “artificial”, and a “trick voice” that can only be performed at the pianissimo dynamic (Allen, 1935; Emile-Behnke, 1945). Its longer-term use has even been associated with impaired vocal health (Miller, 1986, p. 122; Procter, 1980, p. 129), although no evidence has been produced to verify such a claim. Videostroboscopic and electroglottographic studies of professional male falsettists and countertenors have invalidated all of the previous presumptions (Lindestad & Södersten, 1988; Welch, et al., 1988, 1989). Many professional countertenors use a form of upper register when they include a minimal degree of thyroarytenoid contraction along with the degree of cricothyroid contraction, a very fine-tuned neuromotor skill. The perceived quality is sometimes described as upper register with a high percentage of falsetto quality “mixed in”, or as falsetto register with a small amount of upper register “mixed in” (see Howard, et al., 2001).
The biomechanical details of whistle register coordinations are the least documented of all the register coordinations (Miller, et al., 2001). The biomechanical coordination of the larynx that produces flute/falsetto register is a prerequisite for the induction of whistle register, that is, absence of thyroarytenoid contraction and near maximum cricothyroid contraction. One visual observation of whistle register production, based on laryngeal videoendoscopic images, suggests that an as yet undocumented biomechanical action suppresses vocal fold oscillation in the posterior “halves” of the vocal folds' membranous portions (Personal communication, Robert Bastian, M.D., Loyola Voice Institute, Chicago, Illinois, 1999). Only the front “halves” vibrate, therefore, and produce F0s that are quite high (in the E6/F6 to C7 range; one octave lower in males) in what may be described as a tiny voice quality as opposed to a thinnest voice quality in flute/falsetto.
These conditions appear to prevent increases and decreases of vibratory amplitude, thus changes of vocal volume at the respiratory/vocal fold level do not appear to be possible. Increased conditioning of the biomechanical coordination and the vocal fold surface tissues and ligaments might possibly enhance the “full-bodiedness” of this tiny voice quality. Presumably, perceived volume can be enhanced by vocal tract adjustments, but they are likely to be minimal due to the necessity for a quite wide jaw-mouth opening and a very small pharynx. Current observations are that only some people can produce this register. Some people with vocal fold nodules are able to produce it, whereas they are not able to do so with normal vocal fold tissues (Personal communication, Robert Bastian, M.D.).
When people speak expressively within a relatively wide range of F0s, most of their F0s will be produced in their lower register larynx coordinations (thyroarytenoid prominence), but some of their F0s will be produced in the lower F0s of their upper register coordinations (cricothyroid prominence). That means that they are transitioning between the two register coordinations. If there are no audible, abrupt changes in their voice quality when they transition from one register to the other—only a subtle, blended change—then that is evidence that their habitual neural networks have enacted a blended or “melted” transition between the two coordinations. If their voices do produce abrupt changes in voice quality—often called voice cracks or breaks—then that is evidence that they do not have habitual neural networks that enact a blended transition. The likelihood of voice cracks or breaks is greater when young adolescents are experiencing voice transformation or when people have inflamed and swollen or stiffened vocal folds.
Figure 7: Graphic description of an abrupt vocal [...] (I.R. Titze, Principles of Voice Production. Copyright © 2000)
Transitions between pulse and lower registers. When transitioning from pulse to lower register, the cricothyroid muscles engage to assist in the stabilization of the shorter vocal fold lengths, and thus they participate in the production of intended F0s. When the cricothyroids are engaged, the intensity of contraction by the thyroarytenoids increases and that action moves the vocal fold cover tissues slightly closer to each other. That results in a slight increase of vocal fold adduction and necessitates a complementary increase in subglottal air pressure. Aerodynamic flow and mucosal waving then become more periodic, the temporal gaps in pulse register no longer occur, and a more sustained vocal sound is perceived. When transitioning from lower to pulse register, the cricothyroids disengage and subsequent shortening and lengthening of the vocal folds is carried out exclusively by increases and decreases in the contraction of the thyroarytenoids (see Titze, 2000; Titze, et al., 1989). When in lower register (shortener prominent), both of the two primary vibrational modes of the vocal folds are present due to the thickness and laxness of the vocal fold tissues.
If vocal fry pulses are perceived as a result of the pulse register coordination, then with the transition to lower register, mucosal waving converts from pulsed “wave packets” with temporal gaps to the periodic waving that produces sustained vocal sound. If the increases of subglottal pressure, aerodynamic flow, and engagement of the thyroarytenoids are abrupt, the transition will be perceived as an abrupt crossover from one quality to another at a particular F0. If the increases of subglottal pressure, aerodynamic flow, and engagement of the thyroarytenoids are evenly parceled in very small increments over several F0s, the transition will be perceived as a blended change from one quality to another. If the pulse register coordination produces perceived sustained tones in singing, then the perceived transition to lower register coordination is much less obvious; if it is skilled, the likelihood of audible or even kinesthetic detection is reduced.
Transitions between lower and upper registers. In lower register the thyroarytenoid muscles (shorteners) are more prominently contracted than the cricothyroid muscles (lengtheners), and the vocal fold cover tissues are in a comparatively shorter, thicker, and more lax state. When vocally unskilled singers lengthen their folds to raise the fundamental frequency (ascend in pitch), their lengthener muscles gradually increase their contraction intensity. If they do not have vocal neural networks that enable their shortener muscles to complementarily reduce the intensity of their contracting, then the agonist-antagonist “tug of war” between the two will become increasingly intense. Eventually, an impasse will be reached and novice singers will either stop singing or an abrupt readjustment of contraction intensities between the shortener and lengthener muscles will occur. The abrupt adjustment will be to a lengthener prominent state and an abrupt change in the voice source spectra will occur to produce an audible voice quality change—to the lighter-thinner quality that is associated with upper register—along with an abrupt reduction of vocal volume (as the closer-opener muscles synergistically adapt).
When vocally skilled singers lengthen their folds to raise the fundamental frequency, their lengthener muscles also gradually increase their contraction intensity. But they will have vocal neural networks that enable their shortener muscles to complementarily reduce the intensity of their contracting. The resulting reduction of bulk in the thyroarytenoids and the lengthening action of the cricothyroids create a gradual thinning and tautening of the vocal fold cover tissues. As those changes occur, the vibrational modes of the vocal fold cover tissues also undergo gradual changes and very subtle changes of voice source spectra and perceived voice quality occur correspondingly. The bottom-to-top waving vibrational mode of the vocal folds gradually diminishes with vocal fold lengthening and thinning, and voice quality gradually and subtly changes from thicker-more full-bodied toward thinner-lighter even within the overall shortener prominent state. That means that within the shortener prominent state of lower register, there can be a voice quality range that can be described as thicker-thinner, more and less full-bodiedness, or more and less lightness.
Thus, a subtle, intricate, agonist-antagonist “give-and-take” between the shortener and lengthener muscles will enable a gradual crossover transition, over several pitches, to a lengthener prominent state, and the medial-to-lateral-to-medial-to-lateral vibrational mode becomes more prominent. With those changes, the voice source spectra are altered and the lighter-thinner quality that is associated with upper register is audible. As lengthening continues, the intermediate and deep layers of the lamina propria become increasingly stretched and taut and, eventually, only the superficial layer will be involved in vocal fold tissue oscillations (Titze, 2000) and only the medial-to-lateral-to-medial-to-lateral vibrational mode will be present. When the closer-opener muscles produce increased adductory intensity, the top-to-bottom contact area of the vocal fold cover tissues is increased, and thus voice source spectral changes occur that produce a perceived voice quality that can be described as having more full-bodiedness even within the lengthener prominent state. That means that within the lengthener prominent state of upper register, there can be a voice quality range that can be described as thicker-thinner, more and less full-bodiedness, or more and less lightness.
When transitioning from upper register to lower register, a reverse process occurs in skilled singers, and in approximately the same pitch area.
When transitioning from upper register to lower register, the unskilled singer is not likely to have neural networks that will enable a blended transition from the lengthener prominent state to the shortener prominent state. Either (1) the shorteners will abruptly engage to create an audible crossover “break” in voice quality with an abrupt increase in vocal volume, or (2) the lengthener prominent state will continue into lower and lower pitches. In the absence, then, of sufficient contraction of the shortener muscles (and their adductory gesture), the vocal fold margins typically recede away from each other and a breathy voice quality becomes audible, and singers’ lowest pitches will not be produced as their voices “fade out”.
Transitions between upper and falsetto/flute registers. When transitioning from upper register to falsetto or flute register, the thyroarytenoid muscles gradually reduce their contraction to zero so that the cricothyroid muscles assume total influence over the lengthening and shortening of the vocal folds. The resulting elimination of thyroarytenoid bulk and the lengthening and thinning action of the cricothyroids create a thinnest, longest, and most taut range of vocal fold cover tissue states. As F0 increases, the intermediate and deep layers of the lamina propria become stretched and taut even more extensively than in the upper register condition. That means that the only vibrational mode of the vocal folds is the medial-to-lateral-to-medial-to-lateral mode, and the bottom-to-top waving vibrational mode of the vocal folds may no longer be present. Quite high F0s are possible in this register, as a result.
In the falsetto/flute register coordination, the vocal fold cover tissues are so thin that the top-to-bottom contact area is generally smallest and the depth of tissue oscillation is generally shallowest. The result is reduced production of harmonics and a proportional predominance of the F0 among the partials of the voice source spectra (Walker, 1988; Titze, 2000). The perceived voice quality can be described as thinnest and flutiest.
When transitioning from falsetto or flute register to upper register, the thyroarytenoid muscles re-engage but the cricothyroids are predominant. The neuromuscular capabilities of the larynx (see Endnotes 2 and 3) are such that very subtle engagement of the shorteners can occur in this register transition. The increased shortening and the addition of some thyroarytenoid bulk not only help to lower the F0, they also reintroduce the spectral characteristics of the upper register voice source and its contributions to perceived voice quality.
If the removal or re-engagement of the thyroarytenoids is abrupt, the register transition will be perceived as a sudden crossover from one quality to another at a particular F0. If the changes are evenly parceled in very small increments over several F0s, the transition will be perceived as a blended change from one quality to the other, over a region of crossover frequencies, and the likelihood of audible or kinesthetic detection is minimized.
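The contrast between an abrupt and an evenly parceled change can be pictured numerically. The sketch below is a toy illustration only, not a physiological model: it assumes a hypothetical "relative thyroarytenoid level" between 0 and 1 assigned to each of several consecutive F0s, and it compares a single-step change with a change spread in equal increments. All names and values are invented for illustration.

```python
def thyroarytenoid_levels(start, end, steps, blended=True):
    """Hypothetical relative TA level at each of `steps` consecutive F0s.

    blended=True  -> the change is parceled in equal increments.
    blended=False -> the whole change happens at one F0 (mid-sequence).
    """
    if steps < 2:
        raise ValueError("need at least two F0s to describe a transition")
    if blended:
        increment = (end - start) / (steps - 1)
        return [start + i * increment for i in range(steps)]
    midpoint = steps // 2
    return [start if i < midpoint else end for i in range(steps)]


def largest_step(levels):
    """Largest change between neighbouring F0s (a crude 'abruptness' measure)."""
    return max(abs(b - a) for a, b in zip(levels, levels[1:]))


# Upper register toward falsetto/flute: TA level reduced from 0.2 to 0.0
# across five consecutive F0s (values are arbitrary).
abrupt = thyroarytenoid_levels(0.2, 0.0, steps=5, blended=False)
blended = thyroarytenoid_levels(0.2, 0.0, steps=5, blended=True)

print("abrupt :", abrupt, "largest step:", largest_step(abrupt))
print("blended:", blended, "largest step:", largest_step(blended))
```

In this toy picture the total change is the same in both cases; only its distribution over the F0s differs, which is the distinction the text draws between a sudden crossover and a blended transition.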
Prepubescent males and females share the same general register characteristics, although there is wide variety in the coordinations that have been learned. Histories of swollen vocal folds from voice abuse, upper respiratory infections, or other disease states are but a few of the sources of variation in learned register transitions and pitch range. During pubescent female voice transformation, as the vocal folds lengthen and thicken, and the trachea increases its dimensions, the basic register coordinations continue to be present, but neural networks that coordinate register transitions appear to go through a period of adjustment (Gackle, 2000).
During pubescent male voice transformation, as the vocal folds lengthen and thicken, and the trachea increases its dimensions, the basic upper and lower register coordinations continue to be present IF those boys have had appropriate pre-adolescent singing experiences. Typically, those experiences instantiate in vocal neural networks a “template coordination” for blended register transitions that only require modifications during the laryngeal growth spurts of adolescent voice transformation. Boys who have not had appropriate pre-adolescent singing experiences are much more likely to experience register transition “breaks” and “flip-flops”. Cooksey, et al., (1985; published only in Cooksey, 2000) produced evidence that male falsetto register first appears in the high mutation stage of adolescent voice transformation (Midvoice II in the Cooksey Voice Classification Guidelines).
Transitions between falsetto/flute and whistle registers. When transitioning from falsetto or flute register to whistle register, an undocumented biomechanical action is engaged to suppress mucosal waving in the posterior “halves” of the vocal folds' membranous portions. As described earlier, only the front “halves” vibrate and produce F0s that are quite high and even thinner in quality than flute-falsetto. The medial-to-lateral-to-medial-to-lateral vibrational mode that is present in falsetto/flute register continues into whistle register. When singers first produce this register, the transition from flute-falsetto commonly is “unstable” and may be abrupt. With vocal slides from flute-falsetto to whistle and back again, and with an imagined model of melting the two together, the transitions may become blended. The coordinations for sung pitches can then be learned and the two registers can become melted in those laryngeal coordinations.
Effects of Vocal Acoustics on Laryngeal Register Adjustments:
At the time of the 1982 CoMeT Voice Registers Committee report (previously described), no measurable evidence for a middle register had been found (Register #2A) and the committee did not support its certain existence. Singing teacher committee members strongly argued that a middle register exists. They indicated that middle register (#2A) is thought to result from “mixtures” of the laryngeal coordinations that are related to chest (#2 or lower) and head (#3 or upper) registers. The centuries-old concepts of voce mista (voix mixte) and the zoni di passaggii were cited as evidence for the existence of this register.
In 1983, 1984, and 1988, however, Titze reported studies showing that reverberating subglottal sound waves could influence reactive, involuntary adjustments in laryngeal muscle coordinations. Those adjustments, in turn, resulted in acoustic changes in supraglottic sound waves. When this phenomenon occurs in human singers, they and other listeners perceive a sound quality difference and label it as a register transition. These register transitions have been recorded in the F0 ranges that are associated with the zoni di passaggii of the Italian vocal pedagogy tradition (Miller, 1977). Austin (1992) and Austin and Titze (1997) have extended those findings, and they appear to explain the middle register in singing and perhaps other perceived phenomena that are related to vocal registers, such as “lift points”.
How do reactive or involuntary register transitions occur?
When vocal sound is produced, radiating sound pressure waves are created in two opposite directions. They travel:
1. upward through the supraglottic vocal tract and into the surrounding air; and
2. downward into the subglottic trachea.
During upright stance when singing, the trachea is a tube with a relatively stable length of about 14-cm to 15-cm (Titze, opening presentation, this conference). Its open circumference dimension also is relatively stable during voicing because of the nearly rigid cartilage rings that encircle it. Between human beings of the same age range, tracheal dimensions vary a relatively small amount, including between males and females. There is only about a 10% to 20% difference of tracheal length between the longest in adult males and the shortest in adult females.
Because the dimension of each person's trachea is relatively stable when standing well, its resonance frequency also is relatively stable. According to measurements by Ishizaka, et al. (1976) and Cranen & Boves (1987), the resonance frequency of average-sized adult tracheas can range from about 500-Hz (near C5) to about 600-Hz (near D5) (see also Titze, 2000). When the vocal folds are closed and oscillating, there is no opening from which subglottic sound waves may radiate out and away. During voicing, then, sound pressure waves in the trachea repeatedly impact on the underside of the waving vocal folds.
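The link between tracheal length and resonance frequency can be checked with a rough calculation. The sketch below is an idealization that the cited measurements do not rely on: it assumes the trachea behaves like a uniform tube closed at one end and open at the other (a quarter-wavelength resonator) and takes the speed of sound in warm, humid air as roughly 350 m/s. Both assumptions, and the function names, are introduced here for illustration. A small helper converts note names to equal-tempered frequencies (A4 = 440 Hz) for comparison.

```python
A4_HZ = 440.0
# Semitone distance of each natural note from A within the same octave number.
SEMITONES_FROM_A = {"C": -9, "D": -7, "E": -5, "F": -4, "G": -2, "A": 0, "B": 2}


def note_to_hz(name):
    """Equal-tempered frequency for names such as 'C5', 'Db4', or 'F#3'."""
    letter, rest = name[0], name[1:]
    accidental = 0
    if rest[0] == "#":
        accidental, rest = 1, rest[1:]
    elif rest[0] == "b":
        accidental, rest = -1, rest[1:]
    octave = int(rest)
    semitones = SEMITONES_FROM_A[letter] + accidental + 12 * (octave - 4)
    return A4_HZ * 2 ** (semitones / 12)


def quarter_wave_hz(length_m, speed_of_sound=350.0):
    """First resonance of an idealized tube closed at one end: f = c / (4 L)."""
    return speed_of_sound / (4.0 * length_m)


for length_cm in (14, 15):
    print(f"{length_cm} cm tube: {quarter_wave_hz(length_cm / 100):.0f} Hz")
for note in ("C5", "D5"):
    print(f"{note}: {note_to_hz(note):.1f} Hz")
```

Under these simplifying assumptions, tube lengths of 14 cm to 15 cm give first resonances of roughly 580 to 625 Hz, which is close to the upper part of the measured range; real tracheas are neither uniform nor ideally terminated, so the measured values remain the reference.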
When the F0 of the waving folds approaches the resonance frequency of the trachea, the dimensions of the trachea will effect an increase in the SPL of the subglottic sound waves. Due to that intensity gain, the impact of the sound pressure waves on the underside of the vocal folds is increased. Those repeated impacts produce interference with vocal fold cover tissue oscillations. That interference has been referred to as acoustic loading or acoustic impedance of vocal fold oscillations (Rothenberg, 1981a,b; Titze, 1983, 1984, 1988, 2000).
When motor areas within the cerebral cortex have set in motion the singing of a learned F0 pattern, but one or more of the F0s approach or match the resonance frequency of the trachea, then the pressure-sensitive mechanoreceptors in the vocal folds will detect acoustic loading of the continuously oscillating vocal folds. The interference will be reported to the brainstem in milliseconds of time. High-speed involuntary or reflexive motor commands will then be enacted and sent to the laryngeal muscles to make compensatory coordination adjustments so that voicing can continue.
In both male and female adult bodies, the involuntary, reactive adjustments take place in the F0 areas that match the frequencies of the zoni di primo and secondo passaggii, or the frequencies of their prominent harmonics. The acoustic loading phenomenon, therefore, can explain the shortener-lengthener muscle adjustments and the auditory and kinesthetic perception of a middle register. Some laryngeal muscle adjustments can be subtle and produce a mildly abrupt voice quality change, voluntarily or involuntarily; such changes are sometimes referred to as lift points.
Above the vocal folds, the vocal tract can provide an open end from which vocal sound waves can radiate. Unlike the trachea, the vocal tract can vary its dimensions in numerous ways. Very broadly speaking, two vocal tract cavities can open or narrow: (1) the pharyngeal cavity (its “throat part”) and (2) the oral cavity (its “mouth part”).
When the adjustable vocal tract becomes more narrow, then more and more of the radiating sound wave activity within it will be reflected onto the topside of the oscillating vocal fold tissues. Under those conditions, the vocal folds are receiving increasingly intense pressurized impacts from both the subglottic and supraglottic sound waves. When this “double-dose” of acoustic loading occurs, it can be referred to as acoustic overloading of the vocal folds (see Thurman & Welch, 2000, p. 446).
During speaking or singing, the general spatial dimensions of the pharyngeal and oral cavities can be enlarged “too much”, restricted “too much”, or optimally “opened”. Inexperienced, unskilled singers will use the only vocal tract adjustments that they know—the adjustments that are appropriate for conversational speech. When speaking or singing in greater-than-speech F0 and intensity ranges, the conversational speech vocal tract adjustments will result in acoustic overloading by the subglottal and supraglottal sound pressure waves. Under those conditions, one of two reactive laryngeal adjustments can take place:
1. over-compensation, that is, increased contraction intensity in the internal larynx muscles (and typically some of the external larynx muscles) and increased lung-air pressure, so that the acoustic interference can be overpowered and continuation of vocal sound can be preserved; or
2. under-compensation, that is, an abrupt adjustment of internal laryngeal musculature that produces an abrupt voice quality change that is referred to as a register “break” or, if the abrupt adjustment is relatively minimal, a register “lift”.
When experienced, skilled singers sing many pitches and create many vowel formations of the vocal tract, there is a constant “tuning” and “re-tuning” between the laryngeal muscle coordinations and the shaping of the vocal tract in order to maintain sustained vocal sound, vowel intelligibility, and desired voice qualities (Colton, 1994; Sundberg, 1987; Titze, 2000).
In both male and female adult bodies, the involuntary, reactive adjustments of the internal larynx muscles typically take place in two F0 areas that are approximately one octave apart. In adult males, they most frequently can occur between Db3 to F#3, and Db4 to F#4. In adult females, they most frequently can occur between Db4 to F#4, and Db5 to F#5. These F0 ranges match the frequencies of the traditional zona di primo passaggio and the zona di secondo passaggio, or the frequencies of their prominent harmonics. The acoustic loading phenomenon, therefore, can explain the auditory and kinesthetic perception of a lower, a middle, and an upper register. Some laryngeal muscle adjustments can be more subtle and produce a mildly abrupt voice quality change—voluntarily or involuntarily—and they could be what is often referred to as lift points.
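The remark about "prominent harmonics" can be made concrete by asking which harmonics of a sung F0 fall inside the tracheal resonance range reported earlier (about 500 to 600 Hz). The following is an illustrative sketch only: it assumes equal temperament with A4 = 440 Hz, treats the 500 to 600 Hz band as a hard-edged window, and arbitrarily considers only the first six harmonics. The function names are hypothetical.

```python
A4_HZ = 440.0
SEMITONES_FROM_A = {"C": -9, "D": -7, "E": -5, "F": -4, "G": -2, "A": 0, "B": 2}


def note_to_hz(name):
    """Equal-tempered frequency for names such as 'Db3' or 'F#4'."""
    letter, rest = name[0], name[1:]
    accidental = 0
    if rest[0] == "#":
        accidental, rest = 1, rest[1:]
    elif rest[0] == "b":
        accidental, rest = -1, rest[1:]
    semitones = SEMITONES_FROM_A[letter] + accidental + 12 * (int(rest) - 4)
    return A4_HZ * 2 ** (semitones / 12)


def harmonics_in_band(f0_hz, band=(500.0, 600.0), max_harmonic=6):
    """Harmonic numbers n (1 = the F0 itself) with n * F0 inside the band."""
    low, high = band
    return [n for n in range(1, max_harmonic + 1) if low <= n * f0_hz <= high]


# Boundary pitches of the transition areas named in the text.
ZONES = {
    "adult male, lower area": ("Db3", "F#3"),
    "adult male, upper area / adult female, lower area": ("Db4", "F#4"),
    "adult female, upper area": ("Db5", "F#5"),
}

for label, notes in ZONES.items():
    for note in notes:
        f0 = note_to_hz(note)
        print(f"{label}: {note} = {f0:.1f} Hz, harmonics in band: "
              f"{harmonics_in_band(f0)}")
```

Under these assumptions, Db3, Db4, and Db5 all place a partial near 554 Hz (the 4th harmonic, the 2nd harmonic, and the F0 itself, respectively), which is consistent with transition areas that recur about one octave apart.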
Skilled adjustments of the pharyngeal and oral cavities, and the internal laryngeal muscles, occur in singers whose brains have learned to blend the associated voice quality transitions (Titze, 2000). If, however, the vocal tract's pharyngeal and oral dimensions are appropriately adjusted during performance of the F0s that approach and match the tracheal resonance frequency, then tone continuity will be preserved and a subtle, smoothly blended change of perceived voice quality is much more likely. When these adjustments have been brought into conscious awareness, they eventually can become learned, habitual, or automatic motor patterns. A common method of helping people adjust the “mouth part” of the vocal tract is called vowel modification.
Voluntary, learned neuromuscular coordination changes in the larynx create the voice source spectra changes that we perceive and refer to as register transitions. Voluntary processing is initiated by various areas in the brain's motor cortex. The muscles that are needed to produce chosen pitches, loudness levels, and voice qualities are recruited and sequenced by cortical and subcortical motor networks of the nervous system, and those networks can induce register transitions both inside and outside conscious awareness.
Human nervous systems can “assemble” automatic, habitual neural networks (learning) that induce register “breaks”, and they can assemble automatic networks that induce “melted” or “blended” register transitions, and induce them in predictable pitch areas. Human nervous systems also are capable of inducing deliberate register transitions in a variety of pitch areas, and of changing the habitual transitions to new habitual pitch areas. In other words, voluntary register transitions can be produced in conscious awareness or outside conscious awareness, and many register coordination patterns can be learned.
SOME APPLICATIONS OF THIS THEORY OF VOCAL REGISTERS
TO VOICE EDUCATION AND TO CLINICAL SETTINGS
Human auditory processing for spoken and sung language begins during the third trimester of womb life (Eisenberg, 1969; Lecanuet, 1996; Panneton, 1985; Pujol, 1993). During infancy and the early childhood years, human beings hear, see, and sense language models and their expressive-interactive, imitative, and exploratory-discovery capability-ability clusters activate to learn spoken and sung language (DeCasper, et al., 1994; Meltzoff, 1988a,b; Siegler, 1996).
In Western cultures, once language is learned, all human beings do nearly all of their speaking in the lower register (shortener prominent) larynx coordinations. The lower register family of voice qualities, therefore, is the most common one in speech. That means that, by far, the most neuromuscular “practice” in nearly everyone has been the shortener prominent one. The nervous system, therefore, has an extensively elaborated array of neural networks to “run” talking in that coordination, but a fairly undeveloped array of neural networks to “run” speaking or singing in the lengthener prominent coordination (upper register).
In most people, therefore, the varied shortener-lengthener stabilizations that are necessary for reasonable pitch accuracy in singing are most likely to show up first in the shortener-prominent, lower pitch range area. But when people attempt to sing gradually higher pitches starting from there, they may not have an array of neural networks that would enable them to transition “smoothly” from shortener prominent to lengthener prominent coordinations (upper register). The first pitch area in which to begin learning pitch accuracy would therefore be the lower register, using simple limited pitch-range songs (for pre-adolescent children and adult females: a pitch range compass of about a perfect 5th or a 6th, e.g., “Go Tell Aunt Rhody” or the chorus of “Jingle Bells” in the key of Ab, with the melody beginning on C4).
While having lower register singing experiences, less skilled singers can be led through vocal sound-making and language-making that involve lengthener prominent coordinations. For example, vocal pitch slides and glides (a “yooooo hoooooooooooo” call, modeled by an adult) and siren sounds and short word phrases can be experienced in a game or story setting or “just for the fun of it”. Doing those vocal sound-makings over time results in development of some foundational neural networks that eventually can be used for singing pitch patterns in upper register. After experiencing the “yoo hoo” call enough times, that way of making sounds can be referred to as the “yoo hoo part of your voice”.
When relatively accurate pitches are sung in lower register coordinations, and some experience has been gained with upper register coordinations, then a teacher might say, “I wonder what would happen if we sang this song in the “yoo hoo” part of our voices?” There is a high probability that learners then will be able to accurately sing the same song in the same key, but pitched one octave higher (melody beginning on C5 in upper register).
If those songs are sung in the key of C with the melody beginning on E4, however, inexperienced singers, typically, will sing inaccurate pitches or in another key right away. That pitch area is in the transition area between the shortener prominent and the lengthener prominent neuromuscular coordinations. The clearly shortener prominent and clearly lengthener prominent coordinations are relatively gross motor skills. Transitioning between the two requires neural networks that can deal with a large array of subtle intensities of neuromuscular contraction in the vocal fold shortener and lengthener muscles, and inexperienced singers are not likely to have developed those fine motor skill neural networks yet.
A way to help inexperienced singers begin to master the register transition skill: After singing the song in the upper octave key (upper register), repeat the song in keys that descend from there in whole or half steps until they pass through the transition area (usually between F4 and D4). Starting out in the lengthener prominent coordination optimizes the chances that the vocal fold shortener muscles will not “overpower” the lengthener muscles and take the pitch “off target”. Their chances of singing with reasonable pitch accuracy through the transition area will be optimized as a result. If inexperienced singers start in the lower-octave key and the keys are gradually raised by half or whole steps, they most likely will continue to engage the shortener prominent register coordinations into the higher keys, and will start singing below the accurate pitches, and eventually will go “off tune”.
Belting or belt quality is a term that was coined in the American musical theatre and was popularized by the singing of Ethel Merman in the 1940s and 1950s. That style of singing is a staple of musical theatre in Western civilization. But for thousands of years, children, adolescents, and adults of nearly all the world's cultures have sung their folk and popular musics in a strong “belted way”. Current popular and religious musical styles that have roots in the African-American experience preponderantly use belted singing (spirituals, blues, jazz, gospel, rock, and so forth).
With the trend toward multicultural music education and multicultural choral singing comes the necessity for stylistic authenticity. The vocal qualities that sounded when “folk” created a culture's sung music are integral to its expressive style. Change the vocal qualities and you change the very core of its human expressiveness. It is no longer that culture's music.
Any form of strong, high intensity (loud) singing involves quite strenuous laryngeal muscle use and high impact and shear stresses on the vocal folds. So-called belted singing and Western opera style singing are two forms of strong, high intensity singing (Estill, 1988). There are inefficient, overly strenuous and fatiguing ways to produce strong singing, including belt quality, and there are efficient, optimally vigorous ways.
Based on this paper’s theory of vocal registers, strong belted singing involves maintaining variations of a shortener and closer prominent coordination of internal larynx muscles (with higher lung-air pressures) into higher and higher pitches. Thus, the “tug-of-war” tension between the shortener and lengthener and the closer-opener muscles becomes increasingly intense as pitches rise. Subtle, intricate variations in those laryngeal muscle adjustments can produce a variety of subtle “thickness-thinness” qualities. In order to avoid acoustic overloading of the vocal folds, the mouth part of the vocal tract must gradually widen as pitches rise, becoming quite wide open even in the middle of singers’ capable pitch range. In addition, subtle, intricate variations in vocal tract adjustments can produce a variety of subtle “fuller-brighter” qualities. Lifetime vocal health is possible when:
1. all fundamental vocal skills have been learned with optimum efficiency (especially including development and conditioning of the upper and falsetto/flute register coordinations);
2. the laryngeal muscles, connective tissues, and vocal fold cover tissues are well conditioned;
3. singers know how to protect their voices (recovery time, hydration, and the like; see below).
Applications in Clinical Voice Therapy
Register neuromuscular coordination tasks (examples described below) can be used to differentially distinguish between: (1) the presence of a vocal pathology, (2) deconditioning of the larynx muscles (thyroarytenoids, in particular), and (3) underdevelopment of vocal abilities.
Presence of vocal pathologies. Register neuromuscular coordination tasks can be used to assess the functional viability of the superior and recurrent laryngeal motor nerves. These assessments relate to conditions such as laryngeal paresis or paralysis and deconditioning. Register coordination tasks also can be used to assess the severity of inflammatory swelling of vocal fold cover tissues and other organic pathological disorders of the vocal folds (e.g., nodules, polyp, cyst, hemorrhage).
In the healthy, well conditioned larynges of skilled singers, the absence of finely coordinated register transitions may indicate early neurological disease. For example, a well trained, well conditioned singer was seen at
Neuromuscular instabilities were observed visually and aurally during her register transitions, and that was the only abnormality that was observed during her voice evaluation. This condition persisted despite several weeks of voice therapy and could not be explained by any voice function diagnosis. She was referred for neurological examination, and was diagnosed with early stages of multiple sclerosis.
Register neuromuscular coordination tasks that progressively lengthen and thin the vocal fold cover tissues (upper, falsetto, and flute register coordinations) can reveal the degrees of severity in vocal fold cover tissue changes such as swelling and/or the presence of organic lesions (e.g., nodules, polyp, cyst, hemorrhage, sulcus). When these conditions are present, the following common consequences are likely:
1. generally increased air leakage (breathy quality)
2. voice onset delays and aphonic episodes
3. generally increased laryngeal and respiratory effort
4. abrupt or unstable “flip-flop” voice quality characteristics at register transitions
Among singers and actors, a common contributor to the development of swelling and other pathological voice disorders is inappropriate use of voice register coordinations that produce high impact and shear stresses on the vocal fold mucosa. Using such coordinations with a larynx that is underconditioned for such higher levels of “stress-demand” can also contribute to these disorders.
Deconditioning/underconditioning. When extensive and/or vigorous voice use has been reduced over time, or rarely undertaken, atrophic processes occur in the laryngeal nerves and muscles, and a “softening” occurs in vocal fold tissues (described in Plot Setting 6). Reduction in the thickness of the thyroarytenoid muscles (shorteners) results in a recession of the vocal fold cover tissues away from a complete adduction at the midline. Laryngeal videoendoscopy reveals a gap between the vocal folds during softer voicing and irregularities in mucosal waving patterns. These tissue changes result in reduction of vocal abilities, including register transition instabilities (e.g., “flip-flops”), as well as decrements in vocal pitch accuracy, vocal volume, and voice quality. These decrements in vocal ability are not pathological, per se, but can lead experienced vocalists to assume that they have a disordered voice that requires vocal rest. In the case of otherwise healthy larynx muscle and vocal fold tissues, gradual increases in the extent and vigor of voice use are needed for a return to normal voicing, along with attention to blended register transition coordinations.
Underdeveloped vocal abilities. When a patient’s history does not include voice education experiences, or includes relatively minimal voice education, then vocal register coordinations are likely to be “all or nothing”. For example, the lower register coordinations that are used to produce conversational speech will be strongest and also will be more likely to be used to produce higher-range pitches, especially pitches that have greater vocal volume. When voice is used that way frequently, the risk of vocal fold swelling and other disorders becomes greater. Also in less skilled vocalists, the fine motor skills that are required to produce blended or smooth register transitions are less likely to be present. Correlating a complete vocal health history, diagnostic vocal tasks, and laryngeal videoendoscopy can help to distinguish differentially among underdeveloped vocal abilities, deconditioning, and vocal pathology conditions.
Basic register neuromuscular coordination tasks: Patients can be asked to imitate vocal sounds or word phrases as modeled by clinicians in their lower register (e.g., “Whoooooo are youuuuuu?” or “say the word ‘Hello’ in three different and interesting ways”). The phrases would be presented in two or three prosodic variations that necessitate changes of pitch, volume, timing, and voice quality. This task especially evaluates basic neuromuscular viability of the recurrent laryngeal motor nerves that innervate the vocal fold opener-closer-shortener functions (shorteners more prominently contracted than lengtheners).
Then patients can be asked to imitate the clinician’s upper register word phrases (e.g., the call, “Yoo hoo”) in two or three prosodic variations. Patients can then be asked, “How close can you come to starting in the ‘yoo-hoo’ part of your voice and slide downward continuously into the ‘who-are-you’ part of your voice?” Then: “How close can you come to starting in the ‘who-are-you’ part of your voice and slide upward continuously into the ‘yoo-hoo’ part of your voice?” This task especially evaluates basic neuromuscular viability of the superior laryngeal motor nerves that innervate the vocal fold lengthening functions in upper register (lengtheners more prominently contracted than shorteners).
Finally, patients can be asked to imitate the sound of a newly born, tiny puppy, as modeled by the clinician, and afterward, to create sound spirals that ascend “wherever they might want to go”. Puppy-cry sounds are intended to elicit falsetto register in males and flute register in females. This task especially evaluates basic neuromuscular viability of the superior laryngeal motor nerves that innervate the vocal fold lengthening functions in falsetto and flute registers (lengtheners only).
Patients who sing can be asked to sing 5-4-3-2-1 musical scale pitch patterns that begin in lower register and descend to near the lowest producible pitch. The beginning pitch can be about F3 for males and F4 for females. Each succeeding scale pattern can begin on a pitch that is a semitone or whole tone below the previous starting pitch.
Singers can then be asked to sing 5-4-3-2-1 musical scale pitch patterns that begin in upper register and descend until lower register is clearly sounding. The beginning pitch can be about C4 for males and D5 for females. Each succeeding scale pattern can begin on a pitch that is a semitone below the previous starting pitch.
Singers can then be asked to sing 5-4-3-2-1 musical scale pitch patterns that begin in upper register and ascend to their highest producible pitch. The beginning pitch can be about C4 for males and D5 for females. Each succeeding scale pattern can begin on a pitch that is a semitone above the previous starting pitch.
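The three sung tasks above differ only in their starting pitch and in the direction in which successive starting pitches move. The following is a minimal sketch of how a clinician might list the patterns for a session. It assumes equal-tempered semitones, MIDI note numbering with C4 = 60, and a major-scale reading of "5-4-3-2-1"; the function names and the flat-based note spelling are illustrative choices, not part of the described protocol.

```python
# Flat-based pitch-class names; MIDI numbering with C4 = 60 is assumed.
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

# Semitone offsets of scale degrees 5-4-3-2-1 from the starting pitch,
# assuming a major scale (the text does not specify the mode).
PATTERN_OFFSETS = [0, -2, -3, -5, -7]


def name_to_midi(name):
    """Convert a name such as 'F4' or 'Db5' to a MIDI note number."""
    pitch_class, octave = name[:-1], int(name[-1])
    return NOTE_NAMES.index(pitch_class) + 12 * (octave + 1)


def midi_to_name(midi):
    """Convert a MIDI note number back to a flat-based note name."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def scale_pattern(start_name):
    """Return the five pitches of one 5-4-3-2-1 pattern."""
    start = name_to_midi(start_name)
    return [midi_to_name(start + offset) for offset in PATTERN_OFFSETS]


def task_patterns(start_name, semitone_step, repetitions):
    """Return successive patterns whose starting pitch moves by semitone_step."""
    start = name_to_midi(start_name)
    return [
        scale_pattern(midi_to_name(start + semitone_step * i))
        for i in range(repetitions)
    ]


if __name__ == "__main__":
    # Upper-register task for females: start near D5, lower by one semitone each time.
    for pattern in task_patterns("D5", -1, 3):
        print(" ".join(pattern))
```

Passing a step of -1 or -2 gives the descending tasks, and a step of +1 gives the ascending task; the number of repetitions would in practice be set by the singer's producible range rather than fixed in advance.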
Bastian, Keidar, and Verdolini-Marston (1990) validated two pitch pattern tasks that can be repeated several times in progressively higher keys (vocal fold thinning) to evaluate the severity of vocal fold swelling. One of the patterns is the first phrase only of “Happy Birthday” and it must be sung very softly and thinly (pianissimo or pp; and tiny-sounding, thus adding a greater degree of vocal fold thinness). Changed-voice males must sing in falsetto register and the first pitch would be C4, and the key of each repetition would be higher by one semitone, until the highest pitch is C5. Females and unchanged males would begin on C5 in upper register and the key of each repetition would be higher by one semitone, until the highest pitch is C6. Clinicians and patients can use this task to track the progress of swollen vocal folds toward normal dimensions, including post-surgical healing. For patients who are uncomfortable with singing, the puppy-cry spirals described earlier can be substituted.
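The semitone-by-semitone key progression of the “Happy Birthday” task can be sketched as follows. This is only an illustration under stated assumptions: equal temperament with A4 = 440 Hz, MIDI note numbering (C4 = 60), the familiar first phrase written as semitone offsets from its first pitch, and a reading of “until the highest pitch is C5” as referring to the highest note of the phrase rather than to its first note. None of the names below come from Bastian et al.

```python
A4_HZ = 440.0   # assumed tuning reference
A4_MIDI = 69    # MIDI number of A4 (C4 = 60)

# First phrase of "Happy Birthday" as semitone offsets from its first pitch
# (scale degrees 5 5 6 5 1 7); an assumption used only for illustration.
PHRASE_OFFSETS = [0, 0, 2, 0, 5, 4]


def midi_to_hz(midi):
    """Equal-tempered fundamental frequency for a MIDI note number."""
    return A4_HZ * 2 ** ((midi - A4_MIDI) / 12)


def swelling_test_keys(first_midi, top_midi):
    """List (first pitch, highest pitch) pairs, one per repetition.

    The key rises one semitone per repetition and stops once the highest
    note of the phrase reaches top_midi.
    """
    span = max(PHRASE_OFFSETS)
    keys = []
    start = first_midi
    while start + span <= top_midi:
        keys.append((start, start + span))
        start += 1
    return keys


if __name__ == "__main__":
    # Changed-voice males in falsetto: first pitch C4 (60), top pitch C5 (72).
    for first, highest in swelling_test_keys(60, 72):
        print(f"first {midi_to_hz(first):6.1f} Hz, highest {midi_to_hz(highest):6.1f} Hz")
```

Under the other reading, in which the first pitch itself rises to C5 or C6, the loop condition would compare `start` with `top_midi` instead.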
1 A wide variety of neuromuscular coordinations occur to produce all vocal phenomena. Are speaking and singing so categorically different, really? To accomplish both the so-called speaking voice and the so-called singing voice, human beings must inhale air into the lungs, close the vocal folds, “squeeze” on the lungs to pressurize the air therein to create a breathflow that causes the vocal folds to oscillate, configure the vocal folds and “shape” the vocal tract in various ways to create a variety of verbal and nonverbal expressions. The only differences appear to be: (1) In speech, vocal fold length is nearly always in variable flux patterns that produce “sliding” vocal fundamental frequencies. In singing, a variety of stabilized vocal fold lengths produce consecutive sustained fundamental frequencies. (2) Most of the time in singing, a greater range of fundamental frequencies (pitches), sound pressure levels (volumes), and radiating spectra (voice qualities) are produced, compared to speech.
Presumably, this categorical distinction is instantiated within somewhat differentiated neural networks that one highly respected neuroscientist refers to as cognits (Fuster, 2003). Among English speaking people, the speaking voice and singing voice cognits are sometimes so differentiated that they have unfortunate consequences for human beings.
Examples: (1) Some music teachers wonder why their diagnosed nodules have affected their speaking voices (hoarseness) but not their singing voices. (2) A choir sings a selection that includes both singing and speaking sections. Their singing sounds vocally efficient and is expressive, but when the somewhat loud speaking section is performed, the predominant voice quality is pressed, edgy, and harsh. The conductor explains that the singers use their “natural” speaking voices for that section, and reveals that he only knows how to teach the singing voice, not the speaking voice. (3) Some otolaryngologists have used the diagnostic term singer’s nodules, even though the way singers use their voices for speech may be the most significant factor in the formation of nodules. (4) People who are not familiar with the jargon of the voice professions have arrived for appointments with voice teachers, expecting to improve their voices for greater effectiveness as business executives or salespersons. They were puzzled and dismayed when they were told that the teacher did not teach the speaking voice, only the singing voice.
This type of distinction is not used in any other area of human neuromuscular activity. We do not say that we have walking legs and running legs, or pushing arms and pulling arms. Do we have cognitive brains, affective brains, and sensorimotor brains? More particularly, do we have speaking registers and singing registers that are categorically different from each other and require categorically distinct nomenclature?
2 Reflexive motor functions are initiated by very high-speed sensory input and the circuits are short. Peripheral sensory reception is delivered: (1) to the spinal cord, which immediately triggers spinal motor nerves with no other central nervous system processing, or (2) to brainstem nuclei, which immediately trigger cranial motor nerves with no other central nervous system processing.
Both spinal and cranial motor neurons extend their myelinated, largest-diameter axons outward to their target muscles and form their part of the peripheral nervous system. At the surface of a target muscle, each axon divides into multiple terminal branches. Each single terminal branch is attached to one of the many muscle fibers that make up that whole muscle. The multiple terminal branches that extend from one axon, and all of the muscle fibers that they innervate, are referred to as a motor unit (Burke, 1981; Vander, et al., 1994, pp. 315-317). The anatomical point at which an axon terminal branch and a muscle fiber interface is referred to as a neuromuscular junction. The muscle fibers that are innervated by the terminal branches of one neuron are not located adjacent to each other, but are distributed throughout the target muscle. When a single motor neuron “fires”, all of the muscle fibers to which it is attached will contract, so that the number of motor units that are activated relates to the contractile properties of the whole muscle (more later). The four motor unit types are described in the table.
Each muscle also has the two types of sensory receptors (affectors). One type detects degrees of contraction intensity (Golgi tendon organs), and the other detects degrees of stretch or lengthening (muscle spindle stretch receptors). The peripheral end of each sensory neuron's axon is divided into multiple terminals that receive stimulation from muscle or ligament fibers. Each sensory neuron's cell body is located astride its axon, and its other end is connected to the spinal or brainstem parts of the CNS.
There are three types of muscle contractile properties: (1) degree of generated force (related to muscle strength), (2) degree of contractile speed (related to quickness of response), and (3) degree of endurance (related to central nervous system or peripheral neuromuscular fatigability). The degree of force that is generated by a single motor unit depends on (1) the size of the neuron and (2) the number of impulses (action potentials) that are generated per second. Usually, larger neurons generate more force and smaller neurons generate less force. More impulses per second generate more force.
When a motor unit has higher numbers of terminal branches, and therefore innervates more muscle fibers, it is regarded as a large motor unit. Motor units with small numbers of terminal branches and innervated muscle fibers are regarded as small motor units. Muscles that are involved in relatively coarse motor coordinations (back and legs, for example) are generally large and have relatively few motor units, and each motor unit may control hundreds to even thousands of muscle fibers. On the other hand, muscles that are capable of participating in intricate, fine, subtle, or delicate motor coordinations (hands, eyes, and larynx, for example) tend to be small, and they have numerous motor units that may control only one to a few muscle fibers.
The central nervous system can very gradually increase the contractile force of small muscles that have many motor units by increasing the number of activated motor units in very small increments. This motor unit activation process is called recruitment of motor units. Another means by which the CNS can alter the contractile forces of muscles is by increasing or decreasing the frequency with which action potentials course through a muscle's motor units. These variations of motor unit recruitment and action-potential frequency are capable of producing highly intricate coordination patterns in multiple agonist-antagonist muscle pairs.
When muscles are contracted toward their maximum intensity, the CNS recruits motor units in a specific order (Gordon & Pattullo, 1993; Williams, et al., 1987). The slow and fatigue resistant motor unit types (S) are recruited first, followed by the fast and fatigue resistant types (FR), then the fast and fatigue intermediate types (FInt), and finally the fast and fatigable types (FF). When muscles are reducing the intensity of their contracting, the CNS reduces the number of activated motor units by deactivating them in the reverse order from FF to FInt, to FR, to S. These processes can be accomplished in very small increments within smaller muscles that have a high ratio of small motor units in them. As higher intensity and finer-tuned use is repeated, the metabolic processing of fast-fatigable glycolytic motor units is converted more and more to fast and fatigue resistant oxidative processing.
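The ordered recruitment and reverse-order deactivation described above can be pictured with a small sketch. It is a deliberately simplified illustration, not a physiological model: the pool sizes, the class and function names, and the idea of activating a whole number of units at a time are all assumptions made for the example.

```python
from dataclasses import dataclass

# Recruitment order by motor unit type, as described in the text.
RECRUITMENT_ORDER = ["S", "FR", "FInt", "FF"]


@dataclass(frozen=True)
class MotorUnit:
    unit_id: int
    unit_type: str  # one of RECRUITMENT_ORDER


def build_pool(counts):
    """Build a pool of motor units sorted in recruitment order.

    counts maps a unit type to the (hypothetical) number of units of that type.
    """
    pool = []
    for unit_type in RECRUITMENT_ORDER:
        for _ in range(counts.get(unit_type, 0)):
            pool.append(MotorUnit(len(pool), unit_type))
    return pool


def recruit(pool, n_active):
    """Return the units that are active when n_active units are recruited."""
    n_active = max(0, min(n_active, len(pool)))
    return pool[:n_active]


def derecruit(pool, n_from, n_to):
    """Return units in the order they deactivate when activity drops."""
    return list(reversed(pool[n_to:n_from]))


if __name__ == "__main__":
    pool = build_pool({"S": 2, "FR": 1, "FInt": 1, "FF": 1})
    print([u.unit_type for u in recruit(pool, 3)])       # rising intensity
    print([u.unit_type for u in derecruit(pool, 5, 3)])  # falling intensity
```

A muscle with many small motor units corresponds to a long pool in this sketch, so each added or removed unit changes the total by a small fraction, which is the fine gradation the text attributes to small muscles such as those of the larynx.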
Initially, physiologists categorized skeletal muscle fibers into two types, according to their speed of contraction and the extent of their resistance to fatigue (see Table 2; Vander, et al., 1994, p. 327). Slow-twitch muscle fibers (Type I) contract at slow speeds and also are resistant to fatigue. They can continue to contract for long periods of time and, therefore, require a rich capillary blood supply to deliver their primary fuel, oxygen. They have been referred to as red muscle fibers (the dark meat in chicken, for instance). Fast-twitch muscle fibers (Type II) contract at high speeds and do not use oxygen as their primary fuel. They have a comparatively minimal blood supply and have been referred to as white muscle fibers. Their primary energy source is glucose, used in the form of glycogen. Glycogen is stored in the body through metabolic processes and can be depleted more rapidly (O2 is delivered by the respirocardiovascular system and is continually renewable). White, glycolytic muscle fibers, therefore, fatigue faster. More recently, two versions of Type II muscle fiber were labeled as Types IIa and IIb (see table). Nearly all muscles, including the internal and external laryngeal muscles, contain all fiber types and all four related motor unit types (Vander, et al., 1994, p. 327).
3 During learning, the organizing phase of vocal motor functions occurs in various areas of the frontal lobes such as Broca's area in the premotor cortex, and supplementary motor areas, which are inter-looped with the basal ganglia, cerebellum, thalamus, the limbic system, and the sensory networks. The execution phase of those functions begins in the vocalization areas of the two primary motor cortices and extends downward through axons that form the corona radiata, and eventually through the periaqueductal gray (PAG) area of the midbrain and brainstem to the nucleus ambiguus (located within the medulla oblongata), and finally through the right and left vagus nerves to the relevant muscles (see Figure 3 and Holstege, 1996). Thus, the reflexive vocal neural networks of the nucleus ambiguus are entrained by higher brain areas to complete the enactment of learned vocal coordination patterns. When learned functions have been repeated a sufficient number of times, frontal motor areas decrease their activation as the subcortical areas become more elaborated and sensitized.
When the two vagus nerves extend from the brainstem, they are actually made up of a few hundred thousand long axons that are sheathed together. The axons of laryngeal motor neurons (effectors) extend outward from their cell bodies in the medulla's nucleus ambiguus. They eventually branch off from the main vagus nerve trunk and extend to their target muscles (see Figure 3). Each muscle of the larynx also has the two types of sensory receptor fibers (Golgi tendon organs and muscle spindles). Laryngeal sensory neurons extend from the target muscles, join the vagus nerves that extend into the brainstem where they synapse with central nervous system sensory neurons in the nucleus ambiguus. From there, sensory signaling is distributed to a variety of processing areas within the brain.
4 Sensory reception networks for one’s own voice receive feedback about a “running” series of learned vocal coordinations, and report that feedback to “interested” brain areas for “interpretation.” For example, “status reports” are sent to various brain areas about the stretching force exerted on muscles, degrees of subglottic pressure, and relative positioning of the laryngeal cartilages (Larson, 1988; Wyke, 1983b). This feedback may be used to adjust elements of ongoing coordination sequences, or to change subsequent sequences so that they may more closely approximate a target intention. Auditory feedback and kinesthetic feedback participate significantly in the motor adjustments of trained singers and speakers, but much less so in less-trained singers or speakers (see, e.g., Thurman & Welch, 2000, Book I, Chapter 6 for a brief review, and Ward & Burns, 1978). Most auditory and nearly all kinesthetic feedback is processed outside conscious awareness (implicit perception, see Thurman & Welch, 2000, Book I, Chapter 7 for a brief review).
High-speed laryngeal capability is entrained and refined by singers who learn how to sing very rapid and wider-interval pitch patterns--the melismas of Baroque, jazz, and African-American gospel musics, for instance. The true extent of that capability is realized only when nearby unnecessary muscles are released from interfering contraction. When unnecessary muscles interfere, they force a slowing of higher-speed laryngeal muscle movement.
Capabilities for laryngeal muscle speed and fatigue resistance change with:
1. morphological age (compare early childhood and adolescent voice transformation capabilities with mature adult capabilities);
2. extent and manner of use (compare vocally healthy and well trained professional speakers and singers with untrained, quiet conversationalists); and
3. neuromuscular impairment (neuromuscular disease or injury).
REFERENCES AND SELECTED BIBLIOGRAPHY
Alipour, F., & Scherer, R.C. (2000). Vocal fold bulging effects on phonation using a biophysical computer model. Journal of Voice, 14(4), 470-483.
(1935). The Technique of Modern Singing.
(1967). The Science of Vocal Pedagogy.
Ardran, G.M., & Wulstan, D. (1967). The alto or countertenor voice. Music and Letters,
R.J. (1991). An overview of laryngeal function for voice production. In R.T. Sataloff (Ed.), Professional Voice: The Science and Art of Clinical Care (pp. 19-47).
Basterra, J., Dilly, P.N., & Martorell, M.A. (1989). The autonomic innervation of the human vocal cord: Neuropeptides. Laryngoscope, 99, 293-296.
Bastian, R.W., Keidar, A., & Verdolini-Marston, K. (1990). Simple vocal tasks for detecting vocal fold swelling. Journal of Voice, 4(2), 172-183.
Klitzke, C., & Thurman, L. (2000). Vocal fold and laryngeal surgery. In L. Thurman & G. Welch (Eds.), Bodymind and Voice: Foundations of Voice Education (Rev. Ed., pp. 620-631).
F.S., Dahl, H.A., & Teig, E. (1981). Innervation types of muscle fibers in the human thyroarytenoid muscle. Acta Otolaryngology (
M., & Chen, Y. (1998). Acoustic, aerodynamic, physiologic and perceptual properties of modal and vocal fry registers. Journal of the Acoustical Society of
Broad, D.J. (1973). Phonation. In F.D. Minifee, T.J. Hixon & F. Williams (Eds.), Normal Aspects of Speech, Hearing and Language (pp. 127-167).
& Behnke, E. (1884). Voice, Song, Speech.
Burke, R.E. (1981). Motor units: Anatomy, physiology, and functional organization. In J.M. Brookhart & V.B. Mountcastle (Section Eds.),
Roubeau, B., & Valette, C. (1985). Study on the acoustical phenomena characteristic of the transition between ++++++. In Askenfelt Proceedings of the
Claassen, H., & Werner, J.A. (1992). Fiber differentiation of the human laryngeal muscles using the inhibition reactivation myofibrillar ATPase technique. Anatomy and Embryology, 186, 341-346.
(2000). Voice transformation in male adolescents. In L. Thurman & G. Welch (Eds.), Bodymind and Voice: Foundations of Voice Education (Rev. Ed., pp. 718-738).
Beckett, R.L., & Wiseman, R. (1985). A longitudinal investigation of selected vocal, physiological, and acoustical factors associated with voice maturation in the junior high school male adolescent. Unpublished research
Partridge, L.D., & Alipour-Haghighi, F. (1993). Muscle energetics, vocal efficiency, and laryngeal biomechanics. In I.R. Titze (Ed.), Vocal Fold Physiology: Frontiers in Basic Science (pp. 37-92).
& Boves, L. (1987). On subglottal formant analysis. Journal of the Acoustical Society of
DeCasper, A.J., Lecanuet, J.-P., Busnel, M.-C., Granier-Deferre, C., & Maugeais, R. (1994). Fetal reaction to recurrent maternal speech. Infant Behavior and Development, 17, 159-164.
(1951). Bel Canto in Its Golden Age.
Elder, G.C.B., Bradbury, D., & Roberts, R. (1982). Variability of fiber type distributions within human muscles. Journal of Applied Physiology, 53(6), 1473-1480.
(1945). The Technique of Singing.
(1982). The control of voice quality. In V.L. Lawrence (Ed.), Transcripts of the Eleventh Symposium: Care of the Professional Voice (pp.
Estill, J. (1988). Belting and classic voice quality: Some physiological differences. Medical Problems of Performing Artists, 3, 37-43.
Estill, J., Baer, T., Harris, K.S., & Honda, K. (1985). Supralaryngeal activity in a study of six voice qualities. In A. Askenfelt, S. Felicetti, E. Jansson, & J. Sundberg (Eds.), Proceedings of the Stockholm Music Acoustics Conference (pp. 157-174).
Baer, T., Honda, K., & Harris, K.S. (1983). The control of pitch and voice quality: An EMG study of supralaryngeal muscles. In V.L. Lawrence (Ed.), Transcripts of the Twelfth Symposium: Care of the Professional Voice (pp. 86-91).
Baer, T., Honda, K., & Harris, K.S. (1984). The control of pitch and voice quality: An EMG study of infrahyoid muscles. In V.L. Lawrence (Ed.), Transcripts of the Thirteenth Symposium: Care of the Professional Voice (pp.
Fröschels, E. (1920). Singen und Sprechen. Leipzig: Deuticke.
Fuster, J.M. (1997). The Prefrontal Cortex: Anatomy, Physiology, and Neuropsychology of the Frontal Lobe
(2003). Cortex and Mind: Unifying Cognition.
(2000). Understanding voice transformation in female adolescents. In L. Thurman & G. Welch (Eds.), Bodymind and Voice: Foundations of Voice Education (Rev. Ed., pp. 739-744).
(1854-1855). Observations on the human
Proceedings of the Royal Society of
Gordon, T., & Pattullo, M.C. (1993). Plasticity of muscle fiber and motor unit types. Exercise and Sports Sciences Reviews, 21, 331-362.
& Flint, P.W. (1993). Anatomy. In C.W. Cummings & J.M. Fredrickson (Eds.), Otolaryngology--Head and Neck Surgery, Vol. 3: Larynx/Hypopharynx (2nd Ed., pp. 1693-1703).
(1984). Hints on Singing.
Garrett, J.D., & Carson, C.R. (1991). Neurology of the laryngeal system. In C.N. Ford & D.M. Bless (Eds.), Phonosurgery: Assessment and Surgical Management of Voice Disorders.
Herzel, H., & Reuter, R. (1997). Whistle register and biphonation in a child’s voice. Folia Phoniatrica et Logopaedica, 49(5), 216-224.
Hess, M.M., & Ludwigs, M. (2000). Strobophotoglottographic transillumination as a method for the analysis of vocal fold vibration patterns. Journal of Voice, 14(2), 255-271.
Hirano, M. (1988). Vocal mechanisms in singing: Laryngological and phoniatric aspects. Journal of Voice, 2(1), 51-69.
Hirano, M., Ohala, J., & Vennard, W. (1969). The function of the laryngeal muscles in regulating fundamental frequency and intensity of phonation. Journal of Speech and Hearing Research, 12, 616-628.
Hirano, M., Vennard, W., & Ohala, J. (1970). Regulation of register, pitch, and intensity of voice: An electromyographic investigation of intrinsic laryngeal muscles. Folia Phoniatrica, 22, 1-20.
Hollien, H. (1974). On vocal registers. Journal of Phonetics, 2, 125-143.
(1985). Report on vocal registers. Proceedings of the
Hollien, H., & Michel, J.F. (1968). Vocal fry as a phonational register. Journal of Speech and Hearing Research, 11, 600-604.
& Schoenhard, C. (1983a). The riddle of the middle register. In
& Schoenhard, C. (1983b). A review of vocal registers. In V.L. Lawrence (Ed.), Transcripts of the Twelfth Symposium: Care of the Professional Voice (pp. 1-6).
Bandler, R., & Saper, C.B. (Eds.) (1996). Progress in Brain Research, No. 107: The Emotional Motor System.
Howard, D.M. (1995). Variation of electrolaryngographically derived closed quotient for trained and untrained adult female singers. Journal of Voice, 9(2), 163-172.
Howard, D.M., Lindsey, G.A., & Allen, B. (1990). Towards the quantification of vocal efficiency. Journal of Voice, 4(3), 205-212 [See also Errata, (1991), Journal of Voice, 5(1), 93-95.]
Howard, D.M., & Rossiter, D. (1992). Results from a pilot longitudinal study of derived closed quotient for adult male singers in training. Proceedings
Welch, G.F., & Penrose, T. (2001). Case study acoustic and voice source evidence for the existence of sub-registers in the countertenor voice. In T. Murao, Y. Minami, & M. Shinzanoh (Eds.), Proceedings of the 3rd Asia-Pacific Symposium on Music Education Research and International Symposium on ‘Uragoe’ and Gender (pp.
P.R. (1994). Synaptogenesis in human cerebral cortex. In G. Dawson & K.W. Fischer (Eds.), Human Behavior and the Developing Brain (pp. 137-152).
Matsudaira, M., & Kaneko, T. (1976). Input acoustic impedance measurement of the subglottal system. Journal of the Acoustical Society of
(1986). Vocal register change: An investigation of perceptual and acoustic isomorphism. Unpublished Ph.D. dissertation,
Hurtig, R., & Titze,
(1981). Sub- and supra-glottal pressure variation during phonation. In K.N. Stevens & M. Hirano (Eds.), Vocal Fold Physiology (pp. 181-191).
(Ed.) (1973). Vocal Registers in Singing.
Larson, C. (1988). Brain mechanisms involved in the control of vocalization. Journal of Voice, 2(4), 301-311.
(1996). Prenatal auditory
Lecanuet, J.-P., Granier-Deferre, C., & Busnel, M.-C. (1989). Differential fetal auditory reactiveness as a function of stimulus characteristics and state. Seminars in Perinatology, 13, 421-429.
Krasnegor, N., Fifer, W.P., & Smotherman, W.P. (Eds.) (1995). Fetal Development: A Psychobiological Perspective.
Lindestad, P., & Södersten, M. (1988). Laryngeal and pharyngeal behavior in countertenor and baritone singing--a videofiberscopic study. Journal of Voice, 2(2), 132-139.
Fritzell, B., & Persson, A. (1990). Evaluation of laryngeal muscle function by quantitative analysis of the EMG interference pattern. Acta Otolaryngologica (
G. (1953). What Happens in Singing.
Mårtensson, A., & Skoglund, C.R. (1964). Contraction properties of intrinsic laryngeal muscles. Acta Physiologica Scandinavica, 60, 318-336.
Martin, F., Thumfart, W.F., Jolk, A., & Klingholz, F. (1990). The electromyographic activity of the posterior cricoarytenoid muscle during singing. Journal of Voice, 4, 25-29.
(1997). Acoustic, Perceptual and Physiological Studies of Ten-Year-Old
Meltzoff, A.N. (1988a). Infant imitation and memory: Nine-month-olds in immediate and deferred tests. Child Development, 59, 217-225.
Meltzoff, A.N. (1988b). Infant imitation after a one-week delay: Long-term memory for novel acts and multiple stimuli. Developmental Psychology, 24, 470-476.
(1863). Anatomie und Physiologie des menschlichen Stimm- und Sprachorgans.
Miller, D.G., Schutte, H.K., & Hess, M.M. (2001). Physical definition of the “flageolet register”. Journal of Voice, 7(3), 206-212.
(1977). English, French, German and Italian Techniques of Singing.
(1986). The Structure of Singing: System and Art in Vocal Technique.
(1970). Coscienza della Voce nella Scuola Italiana di Canto (Awareness of Voice in the
Mörner, M., Fransesson, N., & Fant, G. (1964). Voice register terminology and standard pitch. Speech Transmission Laboratory Quarterly Status Progress Report, 4, 12-15.
Murry, T., Xu, J.J., & Woodson, G.E. (1998). Glottal configuration associated with fundamental frequency and vocal register. Journal of Voice, 12(1), 44-49.
Orlikoff, R.F. (1991). Assessment of the dynamics of vocal fold contact from the electroglottogram. Journal of Speech and Hearing Research, 34, 1066-1072.
(1985). Prenatal Auditory Experiences with Melodies: Effects on Postnatal Auditory Preferences in Human Newborns. Unpublished Ph.D.
Pujol, R. (1993). Développement et plasticité du système auditif de l'enfant (Development and plasticity in the auditory system of the infant). Communiquer, 111, 13-16.
Pujol, R., & Uziel, A. (1986). Auditory development: Peripheral aspects. In P.F. Timeras & E. Meisami (Eds.), Handbook of Human Biological Development.
Frentzen, B., Gerhardt, K.J.,
(1981a). Acoustic interaction between the glottal source and the vocal tract. In K.N. Stevens & M. Hirano (Eds.), Vocal Fold Physiology (pp. 305-323).
(1981b). The voice source in singing. In Research Aspects on Singing (Publication #33).
(1988). Acoustic reinforcement of vocal fold vibratory behavior in singing. In Vocal Fold Physiology: Voice Production, Mechanisms and Functions.
Roubeau, B., Chevrie-Muller, C., & Arabia-Guidet, C. (1987). Electro-glottographic study of the changes of voice registers. Folia Phoniatrica, 39, 280-289.
& Schneider, C.M. (1995). Vocal Exercise Physiology.
J., Zauner, B., Weikert, M.,
Shahidullah, S., & Hepper, P.G. (1994). Frequency discrimination by the fetus. Early Human Development, 36, 13-26.
Shiotani, A., Fukuda, H., Kawaida, M., & Kanzaki, J. (1996). Vocal fold vibration in simulated head voice phonation in excised canine larynges. European Archives of Otolaryngology, 253(6), 356-363.
Shipp, T. (1975). Vertical laryngeal position during continuous and discrete vocal frequency change. Journal of Speech and Hearing Research, 18(4), 707-718.
Lindestad, P.-Å., MacCurtain, F.,
Shipp, T., & McGlone, R. (1971). Laryngeal dynamics associated with voice frequency change. Journal of Speech and Hearing Research, 14, 761-768.
(1999). The Developing Mind: Toward a Neurobiology of Interpersonal Experience.
W.P., & Robinson, S.R. (1995). Tracing developmental trajectories into the prenatal period. In J.-P. Lecanuet, W.P. Fifer, N.A. Krasnegor, & W.P. Smotherman (Eds.), Fetal Development: A Psychobiological Perspective.
& Vaughan, C.W. (1981). The morphology of the phonatory organs and their neural control. In K.N. Stevens & M. Hirano (Eds.), Vocal Fold Physiology (pp. 13-22).
(1987). The voice source. In J. Sundberg, The Science of the Singing Voice (pp. 49-92).
Sundberg, J., & Kullberg, A. (1998). Voice source studies of register differences in untrained female voices. Royal Institute of Technology - Speech, Music and Hearing Quarterly Progress, (TMH-QPSR), 1-2, 9-17.
Sundberg, J., & Högset, C. (1999). Voice source differences between falsetto and modal registers in countertenors, tenors, and baritones. Royal Institute of Technology - Speech, Music and Hearing Quarterly Progress, (TMH-QPSR), 3-4, 65-74.
Tanaka, K., Kitajima, K., & Kataoka, H. (1997). Effects of transglottal pressure change on fundamental frequency of phonation: Preliminary evaluation of the effect of intraoral pressure change. Folia Phoniatrica et Logopaedica, 49(6), 300-307.
(1988). Parental singing during pregnancy and infancy can assist in cultivating positive bonding and later development. In P.G. Fedor-Freybergh & M.L.V. Vogel (Eds.), Prenatal and Perinatal Psychology and Medicine: Encounter with the Unborn (pp. 273-282).
Thurman, L., Chase, M., & Langness, A.P. (1987). Reaching the young child through music: Is pre-natal and infant music education possible? International Music Education, 14, 21-28.
& Welch, G. (2000). Bodymind and Voice: Foundations of Voice Education (Rev. Ed.).
Timberlake, C. (1990). Practicae musicae: Terminological turmoil--the naming of registers. The National Association of Teachers of Singing Journal, 47(1), 14-26.
(1983). The importance of vocal tract loading in maintaining vocal fold oscillation. Proceedings from the
(1984). Influences of subglottal resonance on the primary register transition. In V.L. Lawrence (Ed.), Transcripts of the Thirteenth Symposium: Care of the Professional Voice
Titze, I.R. (1988). A framework for the study of vocal registers. Journal of Voice, 2(3), 183-194.
Titze, I.R. (1990). Interpretation of the electroglottographic signal. Journal of Voice, 4(1), 1-9.
I.R. (2000c). Generation and propagation of sound. In I.R. Titze, Principles of Voice Production (pp.
(2000d). Vocal registers. In Principles of Voice Production (pp. 281-310).
Titze, I.R., Luschei, E., & Hirano, M. (1989). Role of the thyroarytenoid muscle in regulation of fundamental frequency. Journal of Voice, 3, 213-224.
Tom, K., Titze, I.R., Hoffman, E.A., & Story, B.H. (2001). Three-dimensional vocal tract imaging and formant structure: Varying vocal register, pitch, and loudness. Journal of the Acoustical Society of
Van den Berg, J.W. (1958). Myoelastic-aerodynamic theory of voice production. Journal of Speech and Hearing Research, 1, 227-244.
van Diense, J. (1982). On vocal registers. Journal of Research in Singing, 5(2), 33-39.
Vander, A.J., Sherman, J.H., & Luciano, D.S. (1994). Human Physiology: The Mechanisms of Body Functions
(1967). Singing: The Mechanism and Technic.
Vennard, W., Hirano, M., & Ohala, J. (1970a). Chest, head and falsetto. The National Association of Teachers of Singing Bulletin, 27, 30-36.
Vennard, W., Hirano, M., & Ohala, J. (1970b). Laryngeal synergy in singing. The National Association of Teachers of Singing Bulletin, 27(1), 16-21.
Vilkman, E., Alku, P., & Laukkanen, A.-M. (1995). Vocal-fold collision mass as a differentiator between registers in the low-pitch range. Journal of Voice, 9(1), 66-73.
Gordon, T., & Jones, R. (1994). Nerve-Muscle Interaction.
(1995). Neuroscience of Communication.
Welch, G.F., Sergeant, D.C., & MacCurtain, F. (1988). Some physical characteristics of the male falsetto voice. Journal of Voice, 2(2), 151-163.
Welch, G.F., Sergeant, D.C., & MacCurtain, F. (1989). Xeroradiographic-electrolaryngographic analysis of male vocal registers. Journal of Voice, 3(3), 244-256.
(1935). The Living Voice.
Williams, R.S., Garcia-Moll, M., Mellor, J., Salmons, S., & Harlan, W. (1987). Adaptation of skeletal muscle to increased contractile activity: Expression of nuclear genes encoding mitochondrial proteins. Journal of Biological Chemistry, 262, 2764-2767.
Wurgler, P. (1990). A perceptual study of vocal registers in the singing voices of children. Unpublished Ph.D. dissertation, The Ohio State University.
(1983a). Neuromuscular control systems in voice production. In D.M. Bless & J.H. Abbs (Eds.), Vocal Fold Physiology: Contemporary Research and Clinical Issues.
(1983b). Reflexogenic contributions to vocal fold control systems. In I.R. Titze & R.C. Scherer (Eds.), Vocal Fold Physiology: Biomechanics, Acoustics and Phonatory Control (pp. 138-141).
Leon Thurman is Specialist Voice Educator at Fairview Voice Center, Rehabilitation Services, Fairview-University Medical Center in Minneapolis, Minnesota, USA, [email: email@example.com], and founder and Development Director of The VoiceCare Network [www.voicecarenetwork.org].
Graham Welch is Professor of Music Education and Head of the
Axel Theimer is Professor of Voice and Choral Music at
Carol Klitzke is Speech Pathologist/Voice