|“||And who in time knows whither we may vent / The treasure of our tongue? To what strange shores / This gain of our best glory shall be sent, / T' enrich unknowing nations with our stores? / What worlds in th' yet unformed Occident / May come refin'd with th' accents that are ours?||”|
|— Samuel Daniel, Musophilus, final lines, 1599|
Articulatory phonetics is the study of the production of human speech sounds. It concerns the manipulation of the shape of the oral tract to change the shape of resulting sound waves, creating human speech. Human beings have evolved a very precise oral mechanism that allows the production of an amazing number of sounds that are then combined into meaningful words and phrases.
The very smallest piece of meaningful sound produced for speech is the phoneme. By changing one phoneme, the meaning of a word is also changed. The study of meaningful sounds in a language is known as phonology. Phonetics is also concerned with small differences within phonemes, known as allophonic variation. For example, in English, consider the words "leaf" and "feel". Although the first sound in leaf and the last sound in feel would both be described by a native English speaker as an "l" sound, they are formed slightly differently, and have different acoustic characteristics. In a narrow phonetic transcription, the first would be transcribed as [l], and is known as "clear l", while the second would be transcribed as [ɫ], and is called "dark l". The two versions of l are known as allophones, and in English, are not perceived as meaningfully different sounds. Leaf could be produced with a dark l, and would still be perceived as leaf, albeit oddly pronounced; similarly, feel could be pronounced with a clear l, and still be heard as an oddly accented version of the word.
While phonology is only concerned with sounds that produce meaningful changes in a language, phonetics considers any allophonic variation of the phonemes. Articulatory phonetics looks at the production of these sounds.
The International Phonetic Association was founded in 1886, with the goal of creating a comprehensive yet simple-to-use collection of symbols representing the sounds created in languages. The IPA has established conventions for both the phonological representation of the world's phonemes and diacritical marks to further transcribe allophonic variances. The phonetic alphabet chart is separated into two main sections, consonants and vowels, with further sections for non-pulmonic consonants and diacritics. The chart is broken down in the following sections.
Vowels in the IPA
The vowels are represented by the IPA in the following chart:
- Where vowels are paired, the one on the left is unrounded, while the one on the right is rounded.
- Vowels that exist in Canadian English have been bolded.
The IPA chart for vowels represents an abstraction of the human oral cavity, from the teeth at the left, out to the back of the mouth at the right, and from the roof of the mouth (or at least as high as the tongue will reach for a vowel) at the top, down to the lowest point for the tongue at the bottom. The chart divides the mouth into three main horizontal areas: front, centre, and back, with minor divisions also labeled as near front and near back. The height of the tongue is divided into three sections as well: high, or close; mid; and low, or open. The minor divisions between are also marked. Where vowels are paired, convention puts the unrounded vowel, that is, the vowel spoken with the lips relaxed or stretched, to the left, and the rounded vowel, produced with pursed lips, to the right. The IPA considers all vowels to be voiced, unless marked otherwise with a special diacritic, which will be discussed in the section on diacritics.
Consonants in the IPA
The consonants recognized by the IPA are shown in the following chart, with phones of English in bold.
|Tap or Flap||ɾ||ɽ||ʡ̯|
- Shaded areas indicate articulations judged impossible.
- Glottal Stops are not phonemic in English, and either occur between words, such as in "ice cream" : /ajsʔkrim/ to distinguish it from "I scream" : /ajʔskrim/, or as a reduced consonant in rapid speech, such as some forms of bottle : /bɔʔəl/.
- The voiced Alveolar Flap, /ɾ/, is not phonemic in English, but occurs as a reduced "t" or "d" in some rapid speech, such as "butter" /bʌɾəɹ/ or "writer" /ɹajɾəɹ/.
- The velar approximant is actually co-articulated, with a pursing of the lips as the tongue approaches the velum.
The IPA chart separates consonants along three dimensions: manner of articulation, place of articulation, and voicing. The manners of articulation are the rows of the table and the places of articulation are the columns of the table. Voicing is indicated by pairing of the symbols at a particular place and manner of articulation. If the consonants are paired, then the example on the left is voiceless, and the one on the right is voiced. The chart is organised such that the place of articulation moves farther back in the oral tract as the consonant moves to the right on the chart. Also, as the consonant falls lower in the chart, it generally (with a few exceptions, such as trills and laterals) has less obstruction than the manner of articulation above it. Thus, stops provide the most obstruction, followed by fricatives, and so on.
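The three dimensions described above can be treated as a lookup key. The following Python sketch is illustrative only: the table covers just a handful of English consonants, and the names `Consonant`, `CONSONANTS`, `describe`, and `voicing_pair` are hypothetical, not part of any IPA tooling.

```python
from typing import NamedTuple, Optional


class Consonant(NamedTuple):
    voicing: str  # "voiceless" or "voiced"
    place: str    # column of the chart
    manner: str   # row of the chart


# A small hand-picked subset of English consonants, not the full chart.
CONSONANTS = {
    "p": Consonant("voiceless", "bilabial", "stop"),
    "b": Consonant("voiced", "bilabial", "stop"),
    "t": Consonant("voiceless", "alveolar", "stop"),
    "d": Consonant("voiced", "alveolar", "stop"),
    "s": Consonant("voiceless", "alveolar", "fricative"),
    "z": Consonant("voiced", "alveolar", "fricative"),
    "m": Consonant("voiced", "bilabial", "nasal"),
    "n": Consonant("voiced", "alveolar", "nasal"),
}


def describe(symbol: str) -> str:
    """Return the conventional three-part label for a consonant."""
    c = CONSONANTS[symbol]
    return f"{c.voicing} {c.place} {c.manner}"


def voicing_pair(symbol: str) -> Optional[str]:
    """Return the symbol sharing place and manner but differing in voicing."""
    target = CONSONANTS[symbol]
    for other, c in CONSONANTS.items():
        same_cell = c.place == target.place and c.manner == target.manner
        if same_cell and c.voicing != target.voicing:
            return other
    return None  # no partner in this subset (e.g. the nasals)
```

For example, `describe("t")` yields "voiceless alveolar stop", and `voicing_pair("s")` returns "z", mirroring the left and right members of a pair in one cell of the chart.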
Not present on the chart, but also usually viewed as a single co-articulated sound as opposed to two separate phones are the affricates. Affricates are sounds that begin as stops, and release into fricatives, and their symbols show this relation. The symbol for an affricate contains the symbols for the two constituent articulations, with or without a super-imposed bar connecting the two. English has two common affricates, one voiced, and one unvoiced: /tʃ/, such as the sound found at the beginning and end of "church", and /dʒ/, the sound found at the beginning and end of "judge". Although it can be seen that the place of articulation is different for the stop and the fricative, the affricate is labeled as occurring at the place of the fricative. Thus, the two English affricates are post-alveolar affricates.
Non-Pulmonic consonants in the IPA
Although the majority of consonants are produced as egressive pulmonics - that is, by a supply of air starting in the lungs and exiting at the lips or nostrils - there are three types of consonants that are articulated completely differently: the ejectives, the implosives, and the clicks. The IPA recognizes all three, and transcribes them according to a separate section of the IPA chart, seen below.
|ǀ||Laminal alveolar ("dental")||ɗ||Alveolar||pʼ||Bilabial|
|ǃ||Apical (post-) alveolar ("retroflex")||ʄ||Palatal||tʼ||Alveolar|
|ǂ||Laminal postalveolar ("palatal")||ɠ||Velar||kʼ||Velar|
|ǁ||Lateral coronal ("lateral")||ʛ||Uvular||sʼ||Alveolar fricative|
The implosives are created by blocking the airflow as in a stop consonant, and then lowering the glottis, allowing the air to expand in what is now a larger vocal cavity. The stop is then released, creating a sound that, unlike the egressive stops, has no burst of air upon release. Although it would seem that the glottis need be closed for the production of an implosive, they can be and often are produced with a slight opening of the glottis, producing vibration of the vocal folds. The symbols for the implosives resemble their voiced egressive counterparts, albeit with a "hook" extending from the top right of the symbol.
The ejectives are created in a way opposite to the implosives. The stop is created, as in an implosive, but then the glottis is raised, making a smaller vocal tract and compressing the air between the glottis and the lips. The stop is then released, resulting in an even harder burst of air than would be present in a fully released egressive pulmonic stop. Due to the full closing of the glottis required for the production of ejectives, they are all voiceless. There are no special symbols in the IPA for ejectives. Rather, the symbol for the voiceless egressive pulmonic stop is used, with the ʼ diacritic.
Click consonants are doubly articulated consonants, whereby a pocket of air is trapped between two articulated stops. The tongue is then drawn down or back, enlarging the pocket and rarefying the trapped air. The foremost stop is then released, and air rushes into the pocket, creating the click sound.
Diacritics of the IPA
The IPA strives to represent any possible producible speech sound, but there are far too many variations, even within an individual's speech, for a system with a separate symbol for each to be useful. Rather than create symbols for every single variant of every single sound, the IPA employs a system of diacritical marks that can be applied if a variance is noted. If a speaker is producing his /t/ with his tongue against the teeth, for example, as opposed to behind the teeth, the sound can be noted as "dental", and transcribed as [t̪]. Likewise, if a speaker is devoicing a sound that should be voiced, /d/ for example, it can be marked as "voiceless", and transcribed as [d̥]. Note that [d̥] is different from [t], even though the only difference between /t/ and /d/ is voicing. /t/ is a different phoneme from /d/ in English, while [d̥] instead of [d] would not change the meaning of the word, but would lead to communication issues; the speaker means to use [d], but it would be heard as [t]. It is also worth noting that slashes, / /, are used to denote phonemes, those sounds that carry meaning in a language, while brackets, [ ], are used for phonetic transcription, and are able to discern differences at the phonetic, rather than phonemic, level. Not all diacritics are used to label speech errors; some, such as aspiration, marked with a superscript h, are useful in determining if certain phones only occur in certain phonetic environments. The most common diacritics are shown in the following table.
|◌̩||ɹ̩ n̩||Syllabic||◌̯||e̯ ʊ̯||Non-syllabic|
|◌ʰ||tʰ||Aspirated||◌̚||d̚||No audible release|
|◌ⁿ||dⁿ||Nasal release||◌ˡ||dˡ||Lateral release|
|◌̥||n̥ d̥||Voiceless||◌̬||s̬ t̬||Voiced|
|◌̤||b̤ a̤||Breathy voiced||◌̰||b̰ a̰||Creaky voiced|
|◌̪||t̪ d̪||Dental||◌̼||t̼ d̼||Linguolabial|
|◌̺||t̺ d̺||Apical||◌̻||t̻ d̻||Laminal|
|◌̟||u̟ t̟||Advanced Articulation||◌̠||i̠ t̠||Retracted articulation|
|◌̈||ë ä||Centralised articulation||◌̽||e̽ ɯ̽||Mid-centralised articulation|
|◌̝||e̝ ɹ̝||Raised (ɹ̝ = voiced alveolar fricative)||◌̞||e̞ β̞||Lowered (β̞ = voiced bilabial approximant)|
|◌̹||ɔ̹ x̹||More rounded||◌̜||ɔ̜ x̜ʷ||Less rounded|
|◌ʷ||tʷ dʷ||Labialized or labio-velarized||◌ʲ||tʲ dʲ||Palatalized|
|◌ˠ||tˠ dˠ||Velarized||◌ˤ||tˤ aˤ||Pharyngealized|
|◌ᶣ||tᶣ dᶣ||Labio-palatalized||◌̴||ɫ z̴||Velarized or pharyngealized|
|◌̘||e̘ o̘||Advanced tongue root||◌̙||e̙ o̙||Retracted tongue root|
|◌̃||ẽ z̃||Nasalized||◌˞||ɚ ɝ||Rhotacized|
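In digital text, most of the diacritics in the table above are Unicode combining characters appended to a base symbol. The following Python sketch shows this for a few of them; the `DIACRITICS` mapping and the helper names `mark` and `spell_out` are hypothetical, chosen only for illustration, while `unicodedata.name` is the standard-library call.

```python
import unicodedata

# Unicode code points for a few diacritics from the table.
DIACRITICS = {
    "dental": "\u032A",     # combining bridge below
    "voiceless": "\u0325",  # combining ring below
    "nasalized": "\u0303",  # combining tilde
    "aspirated": "\u02B0",  # modifier letter small h (a spacing character)
}


def mark(base: str, feature: str) -> str:
    """Append the diacritic for `feature` to a base symbol."""
    return base + DIACRITICS[feature]


def spell_out(segment: str) -> list:
    """List the Unicode name of every code point in a transcribed segment."""
    return [unicodedata.name(ch) for ch in segment]


dental_t = mark("t", "dental")        # renders as t̪
devoiced_d = mark("d", "voiceless")   # renders as d̥
```

Here `spell_out(devoiced_d)` gives "LATIN SMALL LETTER D" followed by "COMBINING RING BELOW", which makes explicit that [d̥] remains a marked [d] rather than becoming [t].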
The state of the glottis can be finely transcribed with diacritics. A series of alveolar plosives ranging from an open to a closed glottis phonation are:
|[d̤]||breathy voice, also called murmured|
|Vibrating Vocal Folds||[d]||modal voice|
|Closed glottis||[ʔ͡t]||glottal closure|
A sample transcription
The following are two very broad transcriptions of the first four and a half lines of the soliloquy from Act 3, Scene 1 of Hamlet. The first contains the lines as they would be spoken in careful, stage standard English. The second contains the same lines, but as they might appear in rapid, conversational speech.
|“||To be, or not to be, that is the question / Whether 'tis nobler in the mind to suffer / the slings and arrows of outrageous fortune / Or to take arms against a sea of troubles / And by opposing, end them?||”|
|— William Shakespeare, Hamlet, Act III, Scene i, lines 56-60, ~1599|
/tu bi ɔɹ nat tu bi ðæt ɪz ðə kwɛstʃiən wɛðəɹ tɪz nobləɹ ɪn ðə majnd tu sʌfəɹ ðə slɨŋz ænd æɹoz ov owtɹejdʒəs fɔɹtjun ɔɹ tu tejk aɹmz əgenst ə si ov trʌbəlz ænd baj əpoziŋ ɛnd ðɛm/
/tə bi əɹ nat tə bi ðæts ðə kwɛstʃən wɛðəɹts nobləɹ ɪn ðə majnd tə sʌfəɹ ðə slɨŋz ənd æɹoz əv owtɹejdʒəs fɔɹtjən əɹ tə tejk aɹmz əgenstə si əv trʌbəlz ənd baj əpoziŋ ɛnd ðəm/
There are a few small differences between the two: many of the unstressed vowels reduce to schwa (/ə/), and it becomes more difficult to separate words. Even in the stage transcription, the words are separated based upon a knowledge of the language, not upon any heard segmentations (although pauses are more likely on the stage). If the transcription were true to real speech, there would be very few word separations.
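The two differences noted here can be checked mechanically. The following Python sketch uses only the first line of each transcription as given above; the helper names `schwa_count` and `word_count` are hypothetical.

```python
SCHWA = "\u0259"  # the symbol ə

# First line of each transcription, copied from the sample above.
careful = "tu bi ɔɹ nat tu bi ðæt ɪz ðə kwɛstʃiən"
casual = "tə bi əɹ nat tə bi ðæts ðə kwɛstʃən"


def schwa_count(transcription: str) -> int:
    """Count occurrences of schwa in a transcription."""
    return transcription.count(SCHWA)


def word_count(transcription: str) -> int:
    """Count the space-separated units in a transcription."""
    return len(transcription.split())


print(schwa_count(careful), schwa_count(casual))  # 2 5
print(word_count(careful), word_count(casual))    # 10 9
```

The casual version has more schwas and fewer separable units, because "that is" has merged into a single unit.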
Why use the IPA?
The above transcription shows some of the strengths of using a universal alphabet. For one thing, it can represent in one symbol sounds that often take several to write in English (double consonants are represented by a single phoneme, such as in "suffer"). Furthermore, there is no ambiguity in spelling. Some languages employ a one-to-one letter to sound correspondence, but for the many that don't, or for those that have never been written down, the IPA provides a consistent guideline for pronunciation. Whereas the sound /f/ might be spelled four or five different ways in English, it is always reproduced as /f/ in the IPA. Conversely, the IPA is versatile enough to capture differences in pronunciation. The English /ɹ/, Spanish /r/, and French /ʁ/ are all written as "r" in their respective languages, but are produced (and sound) quite differently. Furthermore, small variances in pronunciation within a language can be marked. The diacritic system allows linguists, as well as other professionals such as speech pathologists, to mark noted peculiarities of speech that can then be easily recognized by other specialists the world over. Doctors can share information regarding patients, and researchers can readily understand descriptions of languages, without having to hear the afflicted speech firsthand.
Consonants and Vowels
The main distinction to be made when considering human speech sounds is that between vowels and consonants. Generally speaking, vowels have very little restriction of the air-flow, and are differentiated chiefly by the placement of the tongue upon their production. Consonants, on the other hand, rely on partial or complete restriction of airflow at any or all of a number of points of articulation.
For the most part, vowels are pronounced with very little obstruction of the airflow, and are only distinguished by the shape of the mouth as the air is expressed. Vowels can be distinguished by the height and forward position of the tongue, and whether air is allowed to resonate through the nasal cavity as well as the oral cavity. A third dimension, rounding, is also used to distinguish vowels, and is a description of whether or not the lips are rounded or open when the vowel is produced. It should also be noted that often, vowels in English are not as clear as their European counterparts, and are realized with a diphthongal glide, especially at the ends of syllables.
It is also worth noting that vowels are some of the first sounds produced when children are learning to speak. This is not surprising; as infants are learning to manipulate their tongues, it is not unreasonable to expect the first sounds to be those that require no obstruction, i.e. very little motor control over the tongue.
Using the chart above, we see that certain vowels occur along the edges of the range of vowels; these vowels are known as cardinal vowels. These vowels are produced when the tongue is in an extreme position, either at its highest point, lowest point, most forward point, or most back point. These vowels are used as reference points, to note the range of the tongue in vowel production. Thus, the [i] is the highest, most forward vowel that can be produced before the tongue moves into a consonantal position, [ɑ] is the lowest, most posterior vowel that can be produced, with the tongue lowered back as far as it can go, [u] is produced with the tongue as high and far back as it can go, and [a] is produced with the tongue as low as possible, and as forward as possible, without creating a consonant. All of the other vowels fall between these four "corners" of the mouth.
Vowels are described according to the height of the tongue, relative to its position at rest. If the tongue is higher than it would be at rest, the vowel is said to be high; if the tongue is lower than rest position, then the vowel is described as low; if neither is true, then the vowel is described as mid. Note that even with high vowels, the tongue is nowhere near as high as it would be for a stop or fricative consonant. The effect of the changed tongue position is to change the quality of the sound without hindering the airflow at all. One can see the difference by comparing the vowels [i], such as in "meat", and [æ], such as in "mat". The tongue drops upon the production of the second; so much so, in fact, that the jaw drops with it. Furthermore, vowels can be distinguished as tense or lax. Practically speaking, tense vowels are slightly higher than their lax equivalents. For example, the sound [i], found in the word "beat", is the tense equivalent of the vowel [ɪ], found in "bit". These two words form what is known as a minimal pair, in that they are only distinguished in one sound. In this case, the sounds [i] and [ɪ] are distinguished only by the fact that [i] is tense, and [ɪ] is lax; they are both high, front, unrounded vowels.
Vowels are also defined by how far forward in the mouth the tongue is positioned. If the tongue is forward in the mouth, relative to its rest position, the vowel is a front vowel; if the tongue is further back than it would be at rest, it is a back vowel, and if it is neither front nor back, the vowel is a central vowel. One can feel the tongue moving by articulating the vowels in quick succession, and the changing tongue position can be viewed by using a mirror. If one starts with the [i] sound in "feed", and moves towards the [u] sound in "food", one can both feel and see the tongue moving backwards. Likewise, if one moves from the [ɪ] sound in "pit" to the [ʊ] sound in "put", one can see the tongue sliding back to produce the second.
Whereas consonants have pairs dictating the state of the vocal cords during their production, vowels are, in the great majority of cases, voiced, and thus this distinction does not apply. Vowels, however, do have a series of pairings that reflect the state of the lips during their production. Vowels can be produced either with the lips spread, or with the lips pursed together into an "o" shape, as when whistling. This condition is known as rounding. Although English makes no distinction based on rounding alone (if a rounded vowel exists in English, the unrounded equivalent does not, and vice versa), this is not the case across all languages. French, for example, contains both the high front unrounded [i], and the high front rounded equivalent [y] as separate phonemes.
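The dimensions used so far (height, frontness, rounding, and tenseness) can be modelled as a small feature bundle, which makes it easy to see what separates two vowels. This Python sketch is illustrative only: the `Vowel` class, the `VOWELS` table, and `differing_features` are hypothetical names, the table holds just five vowels, and the tense or lax values for [y], [u], and [ʊ] are assumptions not stated in the text.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Vowel:
    height: str     # "high", "mid", or "low"
    backness: str   # "front", "central", or "back"
    rounded: bool
    tense: bool


# A tiny illustrative inventory; tenseness of y, u and ʊ is assumed.
VOWELS = {
    "i": Vowel("high", "front", rounded=False, tense=True),
    "ɪ": Vowel("high", "front", rounded=False, tense=False),
    "y": Vowel("high", "front", rounded=True, tense=True),
    "u": Vowel("high", "back", rounded=True, tense=True),
    "ʊ": Vowel("high", "back", rounded=True, tense=False),
}


def differing_features(a: str, b: str) -> list:
    """Return the names of the features on which two vowels differ."""
    va, vb = VOWELS[a], VOWELS[b]
    return [
        f.name for f in fields(Vowel)
        if getattr(va, f.name) != getattr(vb, f.name)
    ]
```

Under these assumptions, `differing_features("i", "ɪ")` returns only `["tense"]`, matching the "beat" and "bit" minimal pair, while `differing_features("i", "y")` returns only `["rounded"]`, the contrast that French uses and English does not.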
Vowels can also be distinguished by the position of the velum during their production. Although the velum itself produces no obstruction of the airflow, a lowered velum allows the air to resonate through the nasal as well as the oral cavity, giving the vowel a distinctly different sound. Nasality can be phonemic, as is the case in French, where a vowel is produced with a lowered velum and no distinct nasal stop. It can also be the result of a vowel occurring in the vicinity of a nasal consonant, producing a nasalized vowel complete with its accompanying nasal consonant, as often happens in English.
Diphthongs can be considered the vowel equivalent of either affricates or consonant clusters. A diphthong starts with the tongue in the position of one vowel, and then, as the vowel is being produced, the tongue moves to the position of a second vowel, or possibly a glide (as is discussed in the section concerning approximant consonants, the distinction is often minimal). Because there is no interruption in the airstream, rather than creating two separate sounds, a new sound is created. For example, the diphthong [aj], as pronounced in the English word "ride", begins with the low vowel [a], and finishes with the glide [j], which is considerably higher in the mouth, with the tongue much farther forward than in [a]. To illustrate, if one were to pronounce a clear [a], and slowly change to [j], which is the first sound in the English word "yes", one ends up producing the diphthong [aj]. One peculiarity of Canadian English is the tendency to raise the first vowel in a diphthong before voiceless consonants: the notorious Canadian Raising. Thus, the diphthongs [aj] and [aw], common to American and British accents, and realised in such words as "might," "light," "mouse" and "louse," become [əj] and [əw] in those words in Canadian English. The diphthongs show no change in such words as "loud" or "ride," however, due to the subsequent voiced consonant. English also occasionally makes use of the diphthongs [ɔj], as in "boy," [ej], as in "late," and [ju], as in some pronunciations of "news."
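Canadian Raising as described here is a conditioned rule: the output depends on the diphthong and on the voicing of the following consonant. The Python sketch below encodes only that description; the `VOICELESS` set is a partial list of English voiceless consonants, and the names `RAISED` and `canadian_raising` are hypothetical.

```python
from typing import Optional

# Voiceless consonants that can follow the diphthong (a partial list).
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}

# Raised counterparts, using the symbols from the paragraph above.
RAISED = {"aj": "əj", "aw": "əw"}


def canadian_raising(diphthong: str, following: Optional[str]) -> str:
    """Raise [aj] and [aw] before a voiceless consonant; otherwise no change.

    `following` is the next consonant, or None at the end of a word.
    """
    if diphthong in RAISED and following in VOICELESS:
        return RAISED[diphthong]
    return diphthong


print(canadian_raising("aj", "t"))  # "might" -> əj
print(canadian_raising("aw", "s"))  # "mouse" -> əw
print(canadian_raising("aw", "d"))  # "loud"  -> aw (unchanged)
```

Diphthongs outside the rule, such as [ej] in "late", pass through unchanged even before a voiceless consonant.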
Unlike vowels, consonants are the result of a varying amount of obstruction of the airflow along the vocal tract. Consonants, like vowels, are defined by three common features of their production: the status of the vocal folds during their production, the place of obstruction of the air, and the manner of the obstruction. Generally, pulmonary consonants only have one point of articulation, but can have secondary articulators further forward in the mouth. For this article, the descriptions of places of articulation will be constrained to a single place for each consonant, although it is recognized that the quality of the sound can be slightly altered with the addition of further points of articulation. The major points of articulation are listed in the image to the right.
Although this article is concerned with Articulatory Phonetics in general, it is acknowledged that the readers of this article will be familiar with the English language, and thus, an inventory of the various phones of English is listed here. Even when considering a single language, we must be restrictive (or all-inclusive). Various accents of English produce different phones, and even among the "same" sounds, there is some variation that is not generally noticeable. However, there is a subset of the phones that are able to be produced that do exist in most English accents.
After vowels, it might be assumed that infants progress to approximants or fricatives, as they do not require as much obstruction as stops. This, however, is not the case. As the infant continues to explore his vocal tract, the stops appear next, starting with the velar stops (/k/ and /g/), and moving towards the front of the mouth as the child moves into the babbling stage (/p/, /b/, /t/, /d/). This may seem counter-intuitive, but the /p/ and /b/ sounds are relatively simple to make, requiring only the ability to press the lips together, and open them at the correct time. Likewise, the other stops merely require that the tongue be held against the passive articulator, being released at the correct time. The approximants and fricatives require the tongue to be held a specific distance from the articulator, which requires greater motor control.
After the stops are acquired, the infant often gains the front nasals; as he is playing with the oral stops, he invariably discovers that the velum can be lowered, creating the same sounds, but nasally. These sounds are usually acquired during the canonical babbling stage. Sometime during this stage, the child also acquires the voiceless fricatives /s/ and /h/ and the approximants /w/ and /j/, which require some finer control of the tongue and larynx. The child will often begin speaking his first words with this repertoire, only gaining sounds such as the affricates ("ch" and "j") and lateral /l/, or the inter-dental (the "th" sound) or labio-dental (/f/ and /v/) sounds, after he has been speaking for a while. This is only a general guideline of course; although it seems to be fairly consistent, even across languages (stops before fricatives, fricatives before affricates, etc.), each child will have some variation.
Places of Articulation
This section is concerned with the pulmonary egressive consonants. These consonants form a subset of the entire group of consonants, and are created by pushing air out from the lungs, past the vocal folds, and out through either the oral or nasal cavity. All vowels, and a majority of the consonants are produced in this manner.
The tongue can be considered the major articulator involved in human speech. A majority of consonants are created by the coordination of the tongue and at least one other articulator, and all vowels are defined by the position of the tongue in the mouth. There are sounds that are created irrespective of the position of the tongue, where some other articulator serves as the primary articulator, but the tongue is never used as a passive articulator. That is, if the tongue is used, other articulators do not approach the tongue while it remains static; it is always the other way around. The tongue is divided into four areas: the tip, at the very front of the tongue; the blade, which is the flexible part at the front of the tongue; the back, or dorsum, which is the large section towards the back of the tongue, and the root, which is the posterior section of the tongue. Sounds made with the tip of the tongue are known as apical; with the blade, laminal; with the back, dorsal, and with the root, radical.
The larynx, commonly called the "voice box," is one of the chief elements in creating speech sounds. For common, pulmonary speech sounds - that is, those originating in the lungs - all air must pass through the vocal folds, which are contained within the larynx. Depending upon the amount of separation between the vocal folds, air flow can be cut off completely; cause the vocal folds to vibrate, as in voiced consonants and vowels; or pass freely through the folds, as in voiceless consonants.
Sounds that are hindered in the vocal folds are called glottal, after the glottis, which is the space between the vocal folds. The airflow of glottal consonants is obstructed by briefly closing the folds either partially, or in the case of the glottal stop, all the way. The sound may be further manipulated in the oral cavity, but its shape is defined by the obstruction in the glottis.
Unlike the more forward points of articulation, where various manners of articulation are possible, sounds manipulated at the glottis can only be realized as stops or fricatives. A glottal stop is performed by completely closing the vocal folds, resulting in a silent sound that is only recognizable in contrast to surrounding syllables. The glottal fricative is created by opening the vocal folds slightly, creating turbulence in the airstream.
English contains two glottal phones: /h/, the voiceless glottal fricative, and /ʔ/, the glottal stop (which must be voiceless, as if the vocal folds are closed, they cannot be vibrating). As its symbol suggests, the glottal fricative is realized most often in English by the letter h, such as the initial sound in "house"; in English, it never occurs in the middle or at the end of a syllable. The glottal stop is not phonemic, but does help define word boundaries, and can occur as a reduced form of some stop consonants in some dialects.
Between the vocal folds and the back of the mouth is a passage known as the pharynx. Although it is not a point of articulation in English, some languages in the world do contain pharyngeal consonants, where the root of the tongue approaches the back wall of the throat. The IPA does not currently recognize any pharyngeal consonants save for a voiced and a voiceless pharyngeal fricative. These sounds are created by moving the root of the tongue towards the back of the pharynx, without actually touching the pharyngeal wall with the tongue. The anatomy of the voiceless version of this sound is shown in the image on the right.
The uvula hangs from the roof of the mouth at the very back of the soft palate, and can be used as a secondary articulator in the production of some speech sounds. Uvular consonants can be created by extending the root of the tongue back towards the uvula, either touching it to complete a stop, as in the IPA consonants /q/ and /ɢ/, or creating a fricative, as in the IPA /χ/. Furthermore, the uvula can be used as a primary articulator. Airflow can be directed towards the uvula, causing it to vibrate, which in turn alters the quality of the sound being produced. English does not employ uvular sounds.
The velum, or soft palate, is the fleshy part at the back of the roof of the mouth. The velum has a double role in speech articulation: it can be used as a passive articulator, allowing the tongue to create consonants by approaching it, and as an active articulator in the creation of nasal vowels and consonants. English uses the velum as a passive articulator in the production of the sounds /g/ as in "gate", and /k/ as in "Kate". The velum can be raised or lowered to varying heights, although in practice, articulatory phonetics is only concerned with two distinctions: raised, also called closed, or lowered, also called open. When the velum is raised, air can only pass through the oral cavity, and exit through the lips. When the velum is lowered, however, air is redirected into the nasal cavity, and exits through the nostrils. English also makes use of a lowered velum, with two nasal sounds, namely /m/, the first sound in "mine", and /n/, the first sound in "nine". English also makes use of a sound that uses the velum as both active and passive articulator, namely /ŋ/, which is the last sound in "thing". For this sound, the tongue touches the velum, and the velum is lowered, as shown on the right. In English, this sound cannot occur at the beginning of words.
The Hard Palate
Directly in front of the velum, the roof of the mouth is harder (an area known as the hard palate, or simply the palate). The palate cannot be manipulated as the velum, uvula, and glottis can: its position and form are constant. Rather, it acts as a passive articulator, as the tongue either touches the hard palate, or rests close to it. English largely jumps over the hard palate: its only consonant at this point of articulation is the approximant /j/, the first sound in "yes". Although the tip of the tongue can be reached back to touch the front of the palate, generally, palatal sounds are created with the middle or back of the tongue, as opposed to the tip. If the tip or the blade of the tongue touches the hard palate, the sounds are generally called retroflex, instead of palatal.
The area on the roof of the mouth between the hard palate and the alveolar ridge is used for various consonants, and is generally divided into two areas. Consonants that are formed at the front of this area are called post-alveolar, while consonants formed at the back of this area, more toward the hard palate, are called retroflex. Retroflex consonants are peculiar in that although the area of the roof of the mouth that is used is above the middle or back of the tongue, retroflex consonants are generally made with the blade or the tip of the tongue, rarely even using the bottom of the tip, as the tongue is curled over.
The Alveolar Ridge
Directly behind the teeth is a hard projection known as the alveolar ridge. The ridge is easily accessed by the tongue, and is the only place of articulation that allows for every manner of articulation. That is, with the alveolar ridge as a passive articulator, the tongue can be manipulated to form the widest variety of consonants at that place in the mouth. The alveolar ridge is never used as an active articulator. An example of the voiceless alveolar stop, /t/, is shown to the left. This is the sound heard at the beginning of the English word "tip"; its voiced counterpart, /d/, is articulated in the same manner, but with the vocal cords vibrating, and is heard at the beginning of the word "dip". /s/ and /z/, the initial sounds of "Sue" and "zoo", respectively, are also articulated at the alveolar ridge. The alveolar ridge also defines an area behind it, but in front of the area that known as the post-alveolar region. There is no physical signifier of this region; it can only be described in terms of the alveolar ridge and the palate. English produces two sounds in the this region: /ʃ/ and /ʒ/. The first is found in words such as "shift", at the beginning. The second, the voiced equivalent, is found only in certain pronunciations of French loan words such as "beige" and "garage," in both cases at the end, although it can also be found as part of its affricate form /dʒ/ in words such as "bridge," again at the end.
The teeth are another passive articulator, and are used in a variety of sounds across various languages. Sounds created at the teeth are known as dental. When referring to the teeth, it is the four front teeth in the middle of the upper jaw, the maxillary incisors, that are of concern. Unlike at some of the other places of articulation, the various sounds produced at the teeth are often produced in very different locations. Dental stops are made by pressing the tongue up against the incisors at the top of the mouth. The IPA does not recognize these stops as being significantly different from alveolar stops, and transcribes them with the alveolar symbols /t/ and /d/ plus the dental diacritic: /t̪/ and /d̪/. Other consonants classified as dental include the dental fricatives /θ/ and /ð/, which are often called "inter-dentals", as they are produced by placing the tongue between the upper and lower incisors. These sounds are found syllable-initially in words such as "thing" and "this", and syllable-finally in words such as "with" and "lithe". Finally, there is a class of dental sounds that are created not with the tongue as the active articulator, but rather with the lips. These labio-dental sounds are created by pressing the lower lip against the upper teeth. An example of the voiced version of this sound, /v/ in the IPA, is shown to the right. This is the sound found at the end of the English word "of". Its voiceless equivalent is /f/, found at the beginning of the word "friend".
The lips are the final articulator before air exits the mouth, and are usually used as a primary articulator, with sounds created at the lips known as labial sounds. Although linguo-labial sounds can be produced, they are generally treated as variations of labial sounds. Apart from forming the labio-dentals described in the previous section, the lips often work together, creating bilabial sounds such as /b/, /p/, and /m/, which are some of the most common sounds across languages. An example of the bilabial nasal, /m/, is shown to the left. This is the sound at the beginning and end of "mom". Similarly, /b/ and /p/ can be found at both the beginning and end of syllables such as "bob" and "pop". Note that for /m/, the tongue is in rest position, and that the air thus flows through the oral cavity until it is stopped by the lips.
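The places of articulation described so far can be collected into a small lookup table. The following Python sketch is only an illustration: it covers just the English consonants named in the sections above, and the names `PLACE_OF_ARTICULATION` and `sounds_at` are hypothetical, chosen for this example rather than taken from any standard resource.

```python
# Place of articulation for the English consonants named in the text above.
# The table is deliberately incomplete: it lists only the sounds discussed.
PLACE_OF_ARTICULATION = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labio-dental", "v": "labio-dental",
    "θ": "dental", "ð": "dental",
    "t": "alveolar", "d": "alveolar", "s": "alveolar", "z": "alveolar",
    "ʃ": "post-alveolar", "ʒ": "post-alveolar",
}


def sounds_at(place):
    """Return the listed consonants that are articulated at the given place."""
    return sorted(
        symbol
        for symbol, symbol_place in PLACE_OF_ARTICULATION.items()
        if symbol_place == place
    )


if __name__ == "__main__":
    # Walk from the lips back toward the post-alveolar region.
    for place in ("bilabial", "labio-dental", "dental", "alveolar", "post-alveolar"):
        print(place, sounds_at(place))
```

Keying the table by symbol keeps each sound at exactly one place, which matches the way the text assigns every consonant a single primary place of articulation.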
Manner of Articulation
Just as important as the place of articulation is the manner of articulation. Generally, consonants are sub-divided into two groups: obstruents and sonorants. The obstruents have some sort of obstruction of the airflow before the sound leaves the oral cavity, while the sonorants allow the air to pass, although not to the same extent as vowels: airflow is redirected but its exit from the mouth is not hindered.
Obstruents are a sub-class of consonants, and can be described as those consonants which are formed by blocking the airflow so that it has difficulty exiting the oral cavity. Egressive obstruents include stops, fricatives, and affricates, which have varying degrees of obstruction and can be either voiced or voiceless. The obstruents also include three special classes of consonants: namely, the clicks, the ejectives, and the implosives.
Stops, also known as plosives, are the result of a complete blocking of the air flow. One articulator touches another, resulting in a brief stoppage of airflow. The articulators then move apart, and the air is released. For example, the voiced velar stop /g/ is formed by pressing the dorsum of the tongue up against the velum. This traps the air, which is trying to exit, at this point. When the tongue is removed from the velum, the air is allowed to progress, creating the sound. One key feature of stops is the manner of their release. In isolation, the release is noticeable, and the air escapes in a rush through the mouth. However, in regular speech, a stop can either be released fully or release its air not out of the mouth, but into the next sound. If the air is completely released, the sound is said to be aspirated for voiceless stops, or breathy for voiced stops. Stops are among the most common sounds across languages, particularly stops occurring at the lips, alveolar ridge, and velum.
Fricatives allow more air to pass than stops, and can be held indefinitely. Whereas stops involve a touching of two articulators, fricatives leave a small passage between the articulators, allowing air to pass between them. As the air passes between the articulators, it creates a certain amount of turbulence, giving the fricative its sound quality. Note the difference between the voiceless alveolar stop /t/, on the left, and the voiceless alveolar fricative /s/, on the right. Since there is no release, as there is in a stop, fricatives can be held as long as the speaker has breath, as the air continues to flow through the passage created by the articulators.
Affricates are a special class of obstruents, and operate similarly to diphthongs, in that they begin as one consonant and end as another. Affricates begin with the complete closure of the oral tract through the creation of a stop consonant. However, whereas stops are released either into a vowel or another stop, affricates release into a fricative at the same place of articulation as the stop. Thus, instead of being completely released, the active articulator is simply moved a little bit away from the passive articulator. This results in a turbulent release of the air trapped by the stop. Interestingly, although the IPA recognizes stops at every place of articulation except the pharynx, and fricatives at every place of articulation, it does not recognize any known affricates that occur behind the hard palate.
The sonorants include all of the remaining consonants as well as all vowels, and are created by changing the shape of the vocal tract, rather than restricting the airflow. Sonorant consonants are therefore very similar to vowels, although the consonants still tend to have more airway restriction than vowels. For example, lateral consonants redirect air from the centre to the sides of the mouth, and nasal consonants redirect the airflow through the nasal cavity. Sonorant consonants can be sub-classified into nasals, laterals, trills, taps, and approximants.
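The division of manners into obstruents and sonorants can be written down as a simple classification. The sketch below is a minimal illustration that follows only the groupings given in this section; the function name `major_class` and the lower-case manner labels are assumptions made for the example.

```python
# Manners of articulation grouped as described in this section.
# The special classes (clicks, ejectives, implosives) count as obstruents.
OBSTRUENT_MANNERS = {
    "stop", "fricative", "affricate", "click", "ejective", "implosive",
}
SONORANT_MANNERS = {"nasal", "lateral", "trill", "tap", "approximant"}


def major_class(manner):
    """Classify a manner of articulation as 'obstruent' or 'sonorant'."""
    manner = manner.strip().lower()
    if manner in OBSTRUENT_MANNERS:
        return "obstruent"
    # Vowels are grouped with the sonorants, as stated in the text.
    if manner in SONORANT_MANNERS or manner == "vowel":
        return "sonorant"
    raise ValueError(f"unknown manner of articulation: {manner!r}")


if __name__ == "__main__":
    for manner in ("stop", "affricate", "nasal", "approximant", "vowel"):
        print(manner, "->", major_class(manner))
```

Raising an error for an unrecognized label, rather than guessing, keeps the two sets as the single place where the classification is defined.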
The nasal consonants are articulated as stops. However, unlike their oral-stop equivalents, nasal consonants are articulated with a lowered velum, allowing air to pass through the nose, rather than through the lips. Although most nasals are voiced, they do not have to be. Every nasal consonant has an oral equivalent across languages, if not necessarily within languages. Native speakers generally raise or lower the velum subconsciously, although the difference between a nasal and oral version of the same stop can be consciously noticed. One way to feel the lowering of the velum is to prepare the lips for the sound [b], and before releasing the air, change the sound to [m], which is the nasal equivalent of [b]. The lowering of the velum can be felt as the sound shifts from [b] to [m]. Another feature of nasals that does not necessarily apply to many other consonants is that they can be syllabic. In unstressed syllables consisting of a schwa and a nasal, often the schwa disappears completely, leaving the nasal consonant as the nucleus of the syllable.
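The relationship between a nasal and its oral equivalent can be modelled as a pair of sounds that differ only in the position of the velum. The Python sketch below is illustrative: the [m]/[b] pair comes from the paragraph above, while the [n]/[d] and [ŋ]/[g] pairs are added here as further voiced examples and are an assumption of this example, as are the names `NASAL_TO_ORAL` and `toggle_velum`.

```python
# Voiced nasal stops paired with the oral stop made at the same place.
# Only [m]/[b] is taken from the text; the other two pairs are added
# for illustration.
NASAL_TO_ORAL = {"m": "b", "n": "d", "ŋ": "g"}
ORAL_TO_NASAL = {oral: nasal for nasal, oral in NASAL_TO_ORAL.items()}


def toggle_velum(symbol):
    """Return the counterpart produced by lowering or raising the velum.

    A nasal maps to its oral stop (velum raised), and an oral stop maps
    to its nasal (velum lowered). Unlisted symbols raise KeyError.
    """
    if symbol in NASAL_TO_ORAL:
        return NASAL_TO_ORAL[symbol]
    return ORAL_TO_NASAL[symbol]


if __name__ == "__main__":
    # Mirrors the exercise in the text: prepare [b], then lower the velum.
    print("b ->", toggle_velum("b"))
```

Applying `toggle_velum` twice returns the original symbol, which reflects the point that the two members of each pair share a place of articulation and differ only in whether air passes through the nose.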
Lateral consonants are similar to stops or fricatives in that the tongue approaches, and sometimes touches, the roof of the mouth. However, the sides of the tongue are drawn in, so that the air, rather than being blocked, flows around the sides of the central closure and passes out through the lips without any turbulence. Like nasals, laterals can become syllabic in the presence of a schwa. English has only one lateral phoneme, /l/, although there are two common variants in spoken English: [l] and [ɫ]. The first, the so-called clear l, is heard in such words as "light," "leaf," and "let." The second, the dark l, is co-articulated with the tongue raised towards the velum in addition to the primary articulation at the alveolar ridge, and is heard in words such as "full," "feel," and "pale."
Trills are formed through the rapid vibration of an articulator. Trills can be divided into three sorts, based upon the articulator that is vibrating. Bilabial trills are created through a rapid vibration of the lips, which strike each other repeatedly as air passes between them. Uvular trills are created by a vibration of the uvula, which strikes rapidly against the back of the tongue. Coronal trills involve the vibration of the tip or blade of the tongue against the roof of the mouth, usually in the alveolar or post-alveolar area. Theoretically, velar or epiglottal trills are possible, although the IPA does not recognize them as separate phonemes in any known language; that is, they are only produced allophonically in the languages in which they occur.
Taps and Flaps
Taps and flaps can be considered a sub-variation of the trill. Unlike a trill, which repeats for several periods of vibration, a tap or flap is a single, quick touch of the tongue against the passive articulator. The terms "tap" and "flap" are generally used interchangeably in the literature.
Approximants are sometimes also called glides or semi-vowels, due to their similarity to vowels. Like vowels, approximants have very little obstruction in the oral cavity. In this way, they can be considered to fall between vowels and fricatives: they are slightly more constricted than vowels, but not constricted enough to create turbulence in the airflow. Like other consonants, their pronunciation is shorter than that of vowels. Glides are sometimes distinguished as being non-syllabic while maintaining all other features of vowels, whereas other approximants are consonants that can be syllabic. English makes use of four approximants. The alveolar approximant [ɹ] is most often written as the letter "r", and is the rhotic "r" heard in most North American English accents. It is heard at the end of most North American pronunciations of words such as "far", "near", and "war". The lateral approximant [l] is the sound most often written with an "l", and is further discussed in the section on lateral consonants. The velar approximant [w] is found in words such as "word," "walk," and "wheel." It is co-articulated with the lips, as the lips purse while the back of the tongue moves towards the velum. It has a somewhat rarer voiceless analogue, [ʍ], which is sometimes heard in pronunciations of "wh" in words such as "wheel," "whether", and "what." The palatal approximant [j] is the sound heard at the beginning of "yes," "yoke," and "yawn," and is most often written as "y." As well as being separate consonants, the velar and palatal approximants often surface in English as the second sound in a diphthong, such as the [aw] in "now", and the [ej] in "day."
For my learning exercise, I have built a fully functional Java applet that tests the user's knowledge of the various manners and places of articulation, as well as the IPA chart. The applet can be found at http://myweb.dal.ca/gr599577/ArticulatoryPhonetics.html. In order to run the applet, you will need the Java Runtime Environment (JRE), available at http://www.java.com/en/download/index.jsp. Many computers already have the JRE installed, so this step may be unnecessary. Once you go to the webpage, you will be asked to accept the security certificate for the applet. Click OK to begin the applet. Please note that on some computers, the applet may be slow to load; please be patient.
Once the applet has been loaded, your adventure begins! You will be guided in your quest towards phonetic mastery by the mysterious "hint", found at the top of the applet. This hint must be answered correctly by choosing the correct option for each phonetic distinguisher. Choose carefully, lest your phonetic ignorance be revealed! The puzzle is solved in the mid-section of the applet. Make your choices in each category, and choose the correct phonetic symbol. As the solution is constructed, a literary representation of your decision will present itself at the bottom of the applet. When you are certain that you have made the correct choices, click "Submit Answer". Should this riddle confound your reasoning, click "Next Question", and an alternative will be presented to you.
Occasionally, the hint may try to deceive you by seeming to make no reference to the classifiers available. Fret not, Phonetic Adventurer! A hint of such a nefarious nature may be protecting the identity of a vowel, rather than a consonant. If you were to set the manner of articulation to "Vowel", a whole host of new options will magically present itself for your consideration. Should you decide that the solution is a consonant after all, simply change the manner of articulation from "Vowel" back to a different manner, and the spell will be reversed. Should you master the quests present in this applet, you will be well on your way to mastery of the field of Articulatory Phonetics. Should you fail, worry not; all of the required information to succeed is available on this page. Enjoy!