Figure 1

From: A multi-cue spatio-temporal framework for automatic frontal face clustering in video sequences

First, faces are extracted from each video frame by a standard face detector. Then appearance features and spatio-temporal features are built to compute different dissimilarity matrices. Finally, these dissimilarity matrices are combined into a single matrix, and clustering is performed by solving a maximum a posteriori (MAP) estimation problem.
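
The last two steps of the caption, fusing several dissimilarity matrices and then clustering, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the caption does not state the fusion rule, so a normalized weighted average is used, and the MAP estimation step is replaced by a much simpler stand-in (connected components of a thresholded graph). The function names `combine_dissimilarities` and `threshold_clusters`, the weights, the threshold, and the toy matrices are all hypothetical and do not come from the article.

```python
import numpy as np


def combine_dissimilarities(matrices, weights):
    """Fuse same-shaped dissimilarity matrices by a normalized weighted average.

    Assumption: the article's actual combination rule is not given in the
    caption; a weighted average is used here purely as an illustration.
    """
    stacked = np.stack([np.asarray(m, dtype=float) for m in matrices])
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1
    # Contract the weight vector with the first axis of the (k, n, n) stack.
    return np.tensordot(w, stacked, axes=1)


def threshold_clusters(dissimilarity, threshold):
    """Label connected components of the graph linking pairs below `threshold`.

    This is NOT MAP estimation; it is a simple stand-in clustering step.
    """
    n = dissimilarity.shape[0]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to a cluster
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dissimilarity[i, j] < threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy data for four detected faces (hypothetical values).
appearance = np.array([
    [0.0, 0.2, 0.9, 0.8],
    [0.2, 0.0, 0.7, 0.9],
    [0.9, 0.7, 0.0, 0.1],
    [0.8, 0.9, 0.1, 0.0],
])
spatio_temporal = np.array([
    [0.0, 0.1, 1.0, 1.0],
    [0.1, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.3],
    [1.0, 1.0, 0.3, 0.0],
])

combined = combine_dissimilarities([appearance, spatio_temporal], [0.5, 0.5])
labels = threshold_clusters(combined, threshold=0.5)
print(labels)  # [0, 0, 1, 1]
```

In this toy case, faces 0 and 1 end up in one cluster and faces 2 and 3 in another, because both cues agree that those pairs are close. A real pipeline would take the matrices from actual appearance and spatio-temporal features and would use the article's MAP formulation in place of the thresholding step.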