A Unified Framework for Dividing and Predicting a Large Set of Action Units


This paper presents a unified framework for robust, real-time recognition of a full set of Action Units (AUs), or a majority of them. The key idea is to systematically divide the AUs into two subsets: strongly connected AUs (SAUs) and weakly connected AUs (WAUs). A probabilistic scoring function divides the AUs into SAUs and WAUs based on the strength of their spatial relations. A spatio-temporal model then predicts WAUs in real time without using any computer vision techniques. A number of empirical analyses were performed to validate the proposed framework. The systematic division of AUs is found to be consistent across various datasets, and the spatio-temporal model is robust in predicting WAUs both within and across datasets. For example, the proposed framework is compared with a state-of-the-art technique, the Computer Expression Recognition Toolbox (CERT), on posed, FERA 2011, and spontaneous expression databases. The average two-alternative forced choice (2AFC) scores of WAUs on these databases are 0.931, 0.637, and 0.654 for the unified framework, compared with 0.795, 0.434, and 0.414 for CERT; the 2AFC scores thus improve by 17.11, 46.77, and 57.97 percent, respectively. Most importantly, the proposed approach significantly outperforms CERT on spontaneous facial expressions. The unified framework is also effective in minimizing errors introduced by the simultaneous display of emotion and speech: it achieves a 75.87 percent improvement over CERT in identifying WAUs on the speaking part of the FERA 2011 dataset. Additionally, the proposed framework achieves a better 2AFC score than another state-of-the-art AU detector, LAUD 2010.
Finally, the proposed approach also contributes a significant reduction in run time and improves robustness in predicting WAUs across various datasets.
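The relative improvements quoted above follow directly from the reported 2AFC scores. As a minimal sanity check (not part of the paper; the percentage is simply the relative gain of the framework's score over CERT's), the arithmetic can be verified as:

```python
# Reported average 2AFC scores for WAU prediction on the
# posed, FERA 2011, and spontaneous expression databases.
framework = [0.931, 0.637, 0.654]  # proposed unified framework
cert = [0.795, 0.434, 0.414]       # CERT baseline

# Relative improvement over CERT, in percent: (ours - cert) / cert * 100.
improvements = [round((f - c) / c * 100, 2) for f, c in zip(framework, cert)]
print(improvements)  # [17.11, 46.77, 57.97], matching the reported figures
```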

Publication Title

IEEE Transactions on Affective Computing