Words matter: Automatic detection of teacher questions in live classroom discourse using linguistics, acoustics, and context


We investigate automatic detection of teacher questions from audio recordings collected in live classrooms with the goal of providing automated feedback to teachers. Using a dataset of audio recordings from 11 teachers across 37 class sessions, we automatically segment the audio into individual teacher utterances and code each as containing a question or not. We train supervised machine learning models to detect the human-coded questions using high-level linguistic features extracted from automatic speech recognition (ASR) transcripts, acoustic and prosodic features from the audio recordings, and context features such as timing and turn-taking dynamics. Models are trained and validated independently of the teacher to ensure generalization to new teachers. We distinguish questions from non-questions with a weighted F1 score of 0.69. A comparison of the three feature sets indicates that a model using linguistic features outperforms those using acoustic-prosodic and context features for question detection, but the combination of features yields a 5% improvement in overall accuracy compared to linguistic features alone. We discuss applications for pedagogical research, teacher formative assessment, and teacher professional development.
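The teacher-independent validation and the weighted F1 score mentioned above can be illustrated with a minimal sketch. It assumes a leave-one-teacher-out split and the common definition of weighted F1 (per-class F1 averaged by class support); the abstract does not specify either detail, and the function names `leave_one_teacher_out` and `weighted_f1` are hypothetical, chosen only for illustration.

```python
def leave_one_teacher_out(teacher_ids):
    """Yield (held_out, train_idx, test_idx) so that no teacher's
    utterances appear in both the training and the test set.
    Assumption: one fold per teacher; the paper may group folds differently."""
    for held_out in sorted(set(teacher_ids)):
        train = [i for i, t in enumerate(teacher_ids) if t != held_out]
        test = [i for i, t in enumerate(teacher_ids) if t == held_out]
        yield held_out, train, test


def f1_for_class(y_true, y_pred, cls):
    """F1 for a single class, treating `cls` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's share
    of the true labels (assumed definition of 'weighted F1')."""
    n = len(y_true)
    return sum(
        f1_for_class(y_true, y_pred, c) * y_true.count(c) / n
        for c in sorted(set(y_true))
    )


if __name__ == "__main__":
    # Toy data: one teacher id per utterance, 1 = question, 0 = non-question.
    teachers = ["t1", "t1", "t2", "t2", "t3", "t3"]
    for held_out, train, test in leave_one_teacher_out(teachers):
        print(held_out, "train:", train, "test:", test)

    y_true = [1, 1, 0, 0]
    y_pred = [1, 0, 0, 0]
    print("weighted F1:", round(weighted_f1(y_true, y_pred), 3))
```

Splitting by teacher rather than by utterance keeps a model from scoring well simply by memorizing an individual teacher's voice or phrasing, which is the generalization concern the abstract raises.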

Publication Title

ACM International Conference Proceeding Series