Electronic Theses and Dissertations

Date

2021

Document Type

Dissertation

Degree Name

Doctor of Philosophy

Department

Electrical & Computer Engineering

Committee Chair

Mohammed Yeasin

Committee Member

Mohammed Yeasin

Committee Member

Gavin Bidelman

Committee Member

Eugene Eckstein

Committee Member

Madhusudhanan Balasubramanian

Abstract

Categorical perception (CP) of audio is critical to understanding how the human brain perceives speech sounds despite widespread variability in acoustic properties. Most studies that examine cognitive speech processing have applied methodological approaches that focus on a specific brain region or set of regions. Departing from hypothesis-driven approaches, in this dissertation we proposed multivariate data-driven approaches to identify the spatiotemporal (i.e., when in time and where in the brain) and spectral (i.e., frequency-band power levels) characteristics of auditory neural activity that reflects CP for speech (i.e., differentiates phonetic prototypes from ambiguous speech sounds). We recorded 64-channel EEG as listeners rapidly classified vowel sounds along an acoustic-phonetic continuum. We used parameter-optimized support vector machine (SVM) and k-nearest neighbors (KNN) classifiers, together with stability selection, to determine from source-level neural activity the spatiotemporal and spectral characteristics that best decode CP. Using event-related potentials (ERPs), we found that early (120 ms) whole-brain data decoded speech categories (i.e., prototypical vs. ambiguous speech tokens) with 95.16% accuracy [area under the curve (AUC) 95.14%; F1-score 95.00%]. Separate analyses on the left hemisphere (LH) and right hemisphere (RH) responses showed that LH decoding was more accurate and earlier than RH (89.03% vs. 86.45% accuracy; 140 ms vs. 200 ms). Stability (feature) selection identified 13 regions of interest (ROIs) out of 68 brain regions (including auditory cortex, supramarginal gyrus, and inferior frontal gyrus (IFG)) that showed categorical representation during stimulus encoding (0-260 ms). In contrast, 15 ROIs (including fronto-parietal regions, IFG, motor cortex) were necessary to describe later decision stages (300-800 ms) of categorization, but these areas were highly associated with the strength of listeners' categorical hearing (i.e., slope of behavioral identification functions).

Moreover, our induced vs. evoked mode analysis shows that whole-brain evoked -band activity decoded prototypical from ambiguous speech sounds with ~70% accuracy. However, induced -band oscillations showed better decoding of speech categories with ~95% accuracy compared to evoked -band activity (~70% accuracy). Induced high frequency (-band) oscillations dominated CP decoding in the LH, whereas lower frequency (-band) dominated decoding in the RH. In addition, feature selection identified 14 brain regions carrying induced activity and 22 regions of evoked activity that were most salient in describing category-level speech representations. Among the areas and neural regimes explored, we found that induced -band modulations were most strongly associated with listeners' behavioral CP.

In sum, our data-driven multivariate models demonstrate that abstract categories emerge surprisingly early (~120 ms) in the time course of speech processing and are dominated by engagement of a relatively compact fronto-temporal-parietal brain network. In addition, the category-level organization of speech is dominated by relatively high frequency induced brain rhythms.
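
For readers unfamiliar with this kind of decoding analysis, the following is a minimal sketch of cross-validated KNN classification of two stimulus classes from per-ROI features. It is not the dissertation's pipeline: the data are synthetic, the classifier is a plain-Python KNN rather than a tuned library model, and every name and parameter (make_trials, n_informative, shift, the fold count, k) is a hypothetical choice made for illustration. Only the 68-ROI feature count and the prototypical vs. ambiguous labels are taken from the abstract.

```python
import math
import random


def make_trials(n_per_class=40, n_rois=68, n_informative=10, shift=3.0, seed=0):
    """Synthetic stand-in for source-level features (one value per ROI).

    Assumption: only the first n_informative ROIs differ between classes.
    """
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):  # 0 = ambiguous token, 1 = prototypical token
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_rois)]
            if label == 1:
                for j in range(n_informative):
                    row[j] += shift
            X.append(row)
            y.append(label)
    return X, y


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training trials (Euclidean distance)."""
    dists = sorted((math.dist(row, x), lab) for row, lab in zip(train_X, train_y))
    votes_for_one = sum(lab for _, lab in dists[:k])
    return 1 if votes_for_one * 2 > k else 0


def cross_validate(X, y, k=5, n_folds=5, seed=0):
    """Return one held-out prediction per trial using n_folds-fold CV."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    preds = [None] * len(X)
    for fold in folds:
        held_out = set(fold)
        train_X = [X[i] for i in idx if i not in held_out]
        train_y = [y[i] for i in idx if i not in held_out]
        for i in fold:
            preds[i] = knn_predict(train_X, train_y, X[i], k)
    return preds


def accuracy_and_f1(y_true, y_pred):
    """Accuracy and F1-score, treating class 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return correct / len(pairs), f1


X, y = make_trials()
preds = cross_validate(X, y)
acc, f1 = accuracy_and_f1(y, preds)
print(f"accuracy={acc:.3f}  F1={f1:.3f}")
```

The scores printed by this sketch depend entirely on the synthetic class separation chosen above and say nothing about the EEG results reported in the abstract.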

Comments

Data is provided by the student.

Library Comment

Dissertation or thesis originally submitted to ProQuest

Notes

Open Access
