Tracking the human arm using constraint fusion and multiple-cue localization


The use of hand gestures provides an attractive means of interacting naturally with a computer-generated display. Using one or more video cameras, hand movements can potentially be interpreted as meaningful gestures. One key problem in building such an interface without a restricted setup is localizing and tracking the human arm robustly in video sequences. This paper proposes a multiple-cue localization scheme combined with a tracking framework to reliably track the dynamics of the human arm in unconstrained environments. The localization scheme integrates the cues of motion, shape, and color to locate a set of key image features. Using constraint fusion, these features are tracked by a modified extended Kalman filter that exploits the articulated structure of the human arm. Moreover, an interaction scheme between tracking and localization improves the estimation process while reducing the computational requirements. The performance of the localization/tracking framework is validated through extensive experiments and simulations, including tracking with a calibrated stereo camera and with uncalibrated broadcast video.
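As an illustration of the kind of constraint exploited here (not the paper's actual implementation), the following sketch runs an extended Kalman filter on a planar two-link arm. The articulated structure enters through the measurement model: the observed wrist position is the forward kinematics of the joint angles, so the fixed link lengths act as hard constraints on the estimate. All quantities (link lengths, noise covariances, frame rate) are assumed values chosen for the example.

```python
import numpy as np

# Hypothetical example: EKF tracking of a planar two-link arm.
# The state is [theta1, theta2, dtheta1, dtheta2] (joint angles and rates);
# the measurement is the 2-D wrist position from an image feature detector.

L1, L2 = 0.3, 0.25          # assumed upper-arm / forearm lengths (m)
dt = 1.0 / 30.0             # assumed video frame rate

def fk(theta):
    """Wrist position from the joint angles (forward kinematics)."""
    t1, t2 = theta
    return np.array([L1 * np.cos(t1) + L2 * np.cos(t1 + t2),
                     L1 * np.sin(t1) + L2 * np.sin(t1 + t2)])

def fk_jacobian(theta):
    """Jacobian of the wrist position w.r.t. the joint angles."""
    t1, t2 = theta
    s1, s12 = np.sin(t1), np.sin(t1 + t2)
    c1, c12 = np.cos(t1), np.cos(t1 + t2)
    return np.array([[-L1 * s1 - L2 * s12, -L2 * s12],
                     [ L1 * c1 + L2 * c12,  L2 * c12]])

# Constant-velocity joint dynamics (linear), nonlinear measurement model.
F = np.eye(4); F[0, 2] = F[1, 3] = dt
Q = 1e-4 * np.eye(4)        # assumed process noise
R = 1e-3 * np.eye(2)        # assumed measurement noise

def ekf_step(x, P, z):
    # Predict with the linear arm dynamics.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the nonlinear forward-kinematics measurement.
    H = np.zeros((2, 4)); H[:, :2] = fk_jacobian(x[:2])
    y = z - fk(x[:2])                  # innovation
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P
```

Iterating `ekf_step` on noisy wrist measurements drives the joint-angle estimate toward a configuration consistent with the fixed link lengths, which is the sense in which the articulated structure constrains the tracker.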

Publication Title

Machine Vision and Applications