Implicit Surfaces for Body Tracking

Articulated Implicit Surfaces For Human Body Modeling From Video Sequences

In recent years, because cameras have become inexpensive and ever more prevalent, there has been increasing interest in modeling human shape and motion from image data. Such an ability has many applications, such as electronic publishing, entertainment, sports medicine and athletic training. This, however, is an inherently difficult task, both because the body is very complex and because the data that can be extracted from images is often incomplete, noisy and ambiguous.

To overcome these difficulties we start from sophisticated 3-D animation models and reformulate them so that they can be used for data analysis. We use them, not only to represent faces and bodies in motion, but also to guide the interpretation of the image data, thereby substantially improving performance. Our framework retains the articulated skeleton of most existing methods but replaces the usual simple geometric primitives by soft objects. Each primitive defines a field function and the skin is taken to be a level set of the sum of these fields. This implicit surface formulation has the following advantages :

  • Effective use of stereo and silhouette data: Defining surfaces implicitly allows us to define a distance function of data points to models that is both differentiable and computable without search.
  • Accurate shape description by a small number of parameters: Varying a few dimensions yields models that can match different body shapes and allow both shape and motion recovery.
  • Explicit modeling of 3–D geometry: Geometry can be taken into account to predict the expected location of image features and occluded areas, thereby making the extraction algorithm more robust.
  • At the root of our approach is a sophisticated human-body model that was originally developed solely for animation purposes and that we reformulated for data analysis purposes.


Click on images for MPEG1 video. Upper left corner: One of three original video sequences, Upper right corner: Recovered model in motion, Lower half: Blend of the original sequence and of the recovered body. Note that the shaded model is overlaid almost exactly on the real body.


R. Plankers; P. Fua : Articulated Soft Objects for Multi-View Shape and Motion Capture; IEEE Transactions on Pattern Analysis and Machine Intelligence. 2003. DOI : 10.1109/TPAMI.2003.1227995.