A West Virginia University study of American English and Spanish speakers’ pronunciation of certain consonants could change linguists’ understanding of how people learn to speak.
In prior research, Jonah Katz, associate professor in the Eberly College of Arts and Sciences, had observed unusual patterns for consonants between vowels across “language after language.” Katz’s observations led him to question what most linguists believe: that these aspects of speech are learned by internalizing abstract rules about how to deal with, say, a “t” sound when it is between vowels within a word, as opposed to when it starts or ends a word.
Now Katz wants to know if speakers pronounce certain sounds differently in various contexts not because of unconscious adherence to a phonological rulebook but because they’re following very broad patterns of timing and grouping that apply beyond language to fields like music.
Those patterns play out phonetically, in concrete physical features of speech such as duration (how long a sound takes to pronounce) and intensity (how much acoustic energy a sound carries).
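To make those two measures concrete, here is a minimal Python sketch, not drawn from the study itself, showing how duration and root-mean-square intensity are typically computed for a short speech segment. The 16 kHz sample rate and the synthetic signal are placeholders:

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumed sample rate; real recordings may differ

# Synthesize a 60 ms vowel-like tone as a stand-in for a speech segment.
t = np.arange(int(0.060 * SAMPLE_RATE)) / SAMPLE_RATE
segment = 0.3 * np.sin(2 * np.pi * 220 * t)

# Duration: how long the sound lasts.
duration_ms = 1000 * len(segment) / SAMPLE_RATE

# Intensity: root-mean-square amplitude, in dB relative to full scale.
rms = np.sqrt(np.mean(segment ** 2))
intensity_db = 20 * np.log10(rms)

print(f"duration: {duration_ms:.1f} ms, intensity: {intensity_db:.1f} dBFS")
```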
With the support of $249,794 from the National Science Foundation, Katz will analyze recordings of volunteers reading sample phrases to ascertain what factors determine the pronunciation of certain consonants.
“Linguistics is the science of how humans come to know and speak their languages,” he explained. “Many details of linguistic sound structure must be learned in infancy and childhood because they differ between languages. For instance, in American English, the ‘t’ at the beginning of a word like ‘tough’ involves the tongue tip pressing on the area behind your teeth. But the ‘t’ in between two vowels, as in ‘metal,’ just involves a quick flick or ‘tap’ of the tongue tip in the direction of that surface.”
“Tapping” the tongue against the roof of the mouth rather than using it to completely close the vocal tract speeds up speech, producing a “t” that’s less crisp and distinct and more open and “vowel-like” due to increased low-frequency acoustic energy. Tapping only happens in certain situations, such as when the “t” occurs in the middle of a word, flanked by vowels.
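One way to quantify that “low-frequency acoustic energy” is a band-energy ratio. The sketch below is purely illustrative; the 500 Hz cutoff and the measure itself are assumptions for demonstration, not the study’s published method:

```python
import numpy as np

def low_frequency_energy_ratio(segment, sample_rate, cutoff_hz=500.0):
    """Fraction of spectral energy below cutoff_hz (illustrative measure).

    In the spirit of the article: a tapped 't', being more open and
    vowel-like, should show relatively more low-frequency energy than
    a 't' with a full closure. Both the cutoff and the measure are
    assumptions for demonstration.
    """
    spectrum = np.abs(np.fft.rfft(segment)) ** 2
    freqs = np.fft.rfftfreq(len(segment), d=1.0 / sample_rate)
    total = spectrum.sum()
    return float(spectrum[freqs < cutoff_hz].sum() / total) if total > 0 else 0.0

# Quick demo on white noise (a stand-in for a 100 ms segment at 16 kHz).
rng = np.random.default_rng(1)
print(low_frequency_energy_ratio(rng.normal(size=1600), 16_000))
```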
Children must infer dozens or hundreds of speech patterns from the language they hear around them very early in life, which Katz called “a tough learning challenge.” But his idea is that speakers don’t only learn specific patterns like tapping. Instead, “they figure out larger patterns that govern, for instance, the amount of time between two sounds within a word as opposed to two sounds across two consecutive words.”
For linguists, that concept, which reduces hard learning processes to easier ones, represents a radical reimagining of many sound patterns in many languages, Katz acknowledged.
His two-year study, “The causal structure of intervocalic lenition,” will test the extent to which variability in intensity is predictable from duration, or vice versa.
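In statistical terms, that question resembles a regression problem. The following Python sketch uses fabricated placeholder data rather than the study’s measurements, and shows one simple way to ask how much of the variability in intensity duration explains:

```python
import numpy as np

# Fabricated placeholder data: per-token durations and intensities for a
# set of consonants. The study's actual measures and models may differ.
rng = np.random.default_rng(0)
duration_ms = rng.uniform(20, 120, size=200)
intensity_db = 60 - 0.15 * duration_ms + rng.normal(0, 2, size=200)

# Ordinary least squares fit: intensity ~ duration.
slope, intercept = np.polyfit(duration_ms, intensity_db, deg=1)
predicted = slope * duration_ms + intercept

# R^2: the share of intensity variability predictable from duration.
r_squared = 1 - np.sum((intensity_db - predicted) ** 2) / np.sum(
    (intensity_db - intensity_db.mean()) ** 2
)
print(f"R^2 (intensity from duration): {r_squared:.2f}")
```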
“If we’re right, these patterns are about shortening a sound and making it more similar to surrounding sounds when it’s inside a unit like a word, as well as lengthening a sound and making it less similar to adjacent sounds when it’s at the beginning or end of a unit,” Katz said.
“Our focus is linguistics, but the broader issues behind these connections are important to speech science, music theory and cognitive psychology. If we can clarify how common sound patterns are rooted in proximity and similarity principles, the impact should reach beyond our field.”
Katz said an easy way to imagine how something like duration could follow similar patterns across disciplines is to consider how words are set to music.
“Take the song ‘Happy Birthday to You.’ If you hum the birthday song without any words, you’ll find certain notes are longer than the surrounding notes: ‘Do-do-do-do-do-doooooo, do-do-do-do-do-doooooo.’
“Auditory principles say that when you hear that song, you’ll break it into musical units that end with those long notes. And when you put words to that song, you’ll match musical units to linguistic phrases. The phrase ‘Happy birthday to you’ will end on that longer note marking the musical boundary. One of the keys to separating both that large linguistic unit and that large musical unit from what follows is how long the final note is.”
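A toy version of that grouping principle can be written in a few lines. The sketch below is a construction for illustration rather than anything from the study: it posits a boundary after any note markedly longer than the average, mirroring the lengthened phrase-final notes in the song.

```python
# Approximate note durations (in beats) for two phrases of the song:
# "Happy birthday to you" sung twice, each ending on a long note.
durations = [0.75, 0.25, 1.0, 1.0, 1.0, 2.0,
             0.75, 0.25, 1.0, 1.0, 1.0, 2.0]

# Posit a group boundary after any note at least 1.5x the mean duration.
mean = sum(durations) / len(durations)
boundaries = [i for i, d in enumerate(durations) if d >= 1.5 * mean]
print(boundaries)  # the long, phrase-final notes: [5, 11]
```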
In addition to collecting data on tapping in American English, Katz and Associate Professor Sergio Robles-Puente will research a similar phenomenon, “spirantization,” which applies to pronouncing “b,” “d” and “g” sounds in Spanish.
Robles-Puente has already recorded native speakers of North Central Peninsular Spanish reading phrases that present these sounds in a variety of contexts to test the factors driving spirantization. He and Katz have also trained WVU student assistants Elena Invernon García, a master’s degree student from Spain, and Taya Sullivan, an undergraduate from Fairmont, in acoustic analysis and how to annotate software visualizations of waveforms and energy distribution in the recordings.
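Praat-style tooling is a standard choice for this kind of waveform and intensity annotation; whether the team uses the Parselmouth interface shown below is an assumption on the reader’s part, and the file path is hypothetical:

```python
import parselmouth  # Python interface to Praat; an assumed choice of tooling

# Hypothetical path; the study's recordings and file layout are not public.
snd = parselmouth.Sound("recordings/speaker01_phrase03.wav")

# Extract the intensity contour (dB over time), the kind of "energy
# distribution" the student assistants annotate in visualizations.
intensity = snd.to_intensity()
times = intensity.xs()
values = intensity.values[0]

for t, v in zip(times[:5], values[:5]):
    print(f"{t:.3f} s  {v:.1f} dB")
```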
Over the fall, as they broaden the recording sessions to include English as well as Spanish, Katz and Robles-Puente will provide the same training to additional student research assistants, including Khadijat Abdulrazaq, a master’s student from Nigeria, and Breann Tennyson, an undergraduate from Huntington. All student members of the research team will develop competencies ranging from computational modeling to phonetic description.
Although Spanish and English don’t share all the same sounds or rules — for example, tapping doesn’t happen to “t” sounds in Spanish — Robles-Puente said similar patterns should emerge from the English and Spanish recordings.
“Are these findings something that we can generalize across different languages?” Robles-Puente asked. “That is one of the questions we’re trying to answer.”