Abstract
This paper explores the use of local parametrized models of image motion for recovering and recognizing the non-rigid and articulated motion of human faces. Parametric flow models (for example, affine) are popular for estimating motion in rigid scenes. We observe that, within local regions in space and time, such models not only accurately model non-rigid facial motions but also provide a concise description of the motion in terms of a small number of parameters. These parameters are intuitively related to the motion of facial features during facial expressions, and we show how expressions such as anger, happiness, surprise, fear, disgust, and sadness can be recognized from the local parametric motions in the presence of significant head motion. The motion tracking and expression recognition approach performed with high accuracy in extensive laboratory experiments involving 40 subjects as well as in television and movie sequences.
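To make the phrase "a small number of parameters" concrete, the following is a minimal sketch of the standard six-parameter affine flow model within one local region. The abstract gives no equations, so the parameter names `a0`..`a5`, the class `AffineFlow`, and the helper functions are illustrative assumptions, not the authors' notation or implementation. Divergence, curl, and deformation are the usual combinations of the linear terms and illustrate how such parameters can relate to expansion, rotation, and stretching of a facial feature.

```python
from typing import NamedTuple, Tuple


class AffineFlow(NamedTuple):
    """Six-parameter affine image-motion model (hypothetical names)."""
    a0: float  # horizontal translation
    a1: float  # du/dx
    a2: float  # du/dy
    a3: float  # vertical translation
    a4: float  # dv/dx
    a5: float  # dv/dy


def flow_at(m: AffineFlow, x: float, y: float) -> Tuple[float, float]:
    """Flow vector (u, v) at (x, y), with coordinates taken
    relative to the centre of the local region (an assumption)."""
    u = m.a0 + m.a1 * x + m.a2 * y
    v = m.a3 + m.a4 * x + m.a5 * y
    return u, v


def divergence(m: AffineFlow) -> float:
    """Isotropic expansion (positive) or contraction (negative)."""
    return m.a1 + m.a5


def curl(m: AffineFlow) -> float:
    """Rotation of the region about the viewing direction."""
    return m.a4 - m.a2


def deformation(m: AffineFlow) -> float:
    """Stretching along one image axis and squashing along the other."""
    return m.a1 - m.a5


# A region that expands uniformly about its centre, with no translation.
expanding = AffineFlow(a0=0.0, a1=0.5, a2=0.0, a3=0.0, a4=0.0, a5=0.5)
u, v = flow_at(expanding, 4.0, -2.0)
```

In this example the flow points away from the region centre, and the model has positive divergence with zero curl and zero deformation, which is the kind of compact signature a recognizer could associate with, say, an opening mouth region.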
Cite this article
Black, M.J., Yacoob, Y. Recognizing Facial Expressions in Image Sequences Using Local Parameterized Models of Image Motion. International Journal of Computer Vision 25, 23–48 (1997). https://doi.org/10.1023/A:1007977618277