rConverse: Moment by Moment Conversation Detection Using a Mobile Respiration Sensor


Monitoring of in-person conversations has largely been done using acoustic sensors. In this paper, we propose a new method to detect moment-by-moment conversation episodes by analyzing breathing patterns captured by a mobile respiration sensor. Since breathing is affected by physical and cognitive activities, we develop a comprehensive method for cleaning, screening, and analyzing noisy respiration data captured in the field environment at individual breath cycle level. Using training data collected from a speech dynamics lab study with 12 participants, we show that our algorithm can identify each respiration cycle with 96.34% accuracy even in presence of walking. We present a Conditional Random Field, Context-Free Grammar (CRF-CFG) based conversation model, called , to classify respiration cycles into speech or non-speech, and subsequently infer conversation episodes. Our model achieves 82.7% accuracy for speech/non-speech classification and it identifies conversation episodes with 95.9% accuracy on lab data using a leave-one-subject-out cross-validation. Finally, the system is validated against audio ground-truth in a field study with 32 participants. rConverse identifies conversation episodes with 71.7% accuracy on 254 hours of field data. For comparison, the accuracy from a high-quality audio-recorder on the same data is 71.9%.

Publication Title

Proceedings of the ACM on interactive, mobile, wearable and ubiquitous technologies