Neural Correlates of Speech Segregation Based on Formant Frequencies of Adjacent Vowels


The neural substrates by which speech sounds are perceptually segregated into distinct streams are poorly understood. Here, we recorded high-density scalp event-related potentials (ERPs) while participants were presented with a cyclic pattern of three vowel sounds (/ee/-/ae/-/ee/). Each trial consisted of an adaptation sequence, which could have a small, intermediate, or large difference in first formant (Δf1), followed by a test sequence, in which Δf1 was always intermediate. For the adaptation sequence, participants tended to hear two streams ("streaming") when Δf1 was intermediate or large compared to when it was small. For the test sequence, in which Δf1 was always intermediate, the pattern was usually reversed: participants were more likely to hear a single stream as Δf1 in the preceding adaptation sequence increased. During the adaptation sequence, Δf1-related brain activity was found between 100 and 250 ms after the /ae/ vowel over fronto-central and left temporal areas, consistent with generation in auditory cortex. For the test sequence, the prior stimulus modulated ERP amplitude between 20 and 150 ms over the left fronto-central scalp region. Our results demonstrate that the proximity of formants between adjacent vowels is an important factor in the perceptual organization of speech, and reveal a widely distributed neural network supporting perceptual grouping of speech sounds.

Publication Title

Scientific Reports