Prosody based co-analysis for continuous recognition of coverbal gestures
Abstract
Although recognition of natural speech and gestures has been studied extensively, previous attempts at combining them in a unified framework to boost classification were mostly semantically motivated, e.g., keyword-gesture co-occurrence. Such formulations inherit the complexity of natural language processing. This paper presents a Bayesian formulation that exploits a phenomenon of gesture and speech articulation to improve the accuracy of automatic recognition of continuous coverbal gestures. Prosodic features from the speech signal were co-analyzed with the visual signal to learn the prior probability of co-occurrence of prominent spoken segments with particular kinematical phases of gestures. It was found that this co-analysis helps in detecting and disambiguating small hand movements, which subsequently improves the rate of continuous gesture recognition. The efficacy of the proposed approach was demonstrated on a large database collected from the Weather Channel broadcast. This formulation opens new avenues for bottom-up frameworks of multimodal integration.
Publication Title
Proceedings - 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002
Recommended Citation
Kettebekov, S., Yeasin, M., & Sharma, R. (2002). Prosody based co-analysis for continuous recognition of coverbal gestures. Proceedings - 4th IEEE International Conference on Multimodal Interfaces, ICMI 2002, 161-166. https://doi.org/10.1109/ICMI.2002.1166986