Table 8. Statistics for the COGNIMUSE database (Hollywood movies and GWW) annotated with audio-visual events, per event category: subcategories with numerous instances but small total duration

From: COGNIMUSE: a multimodal video database annotated with saliency, events, semantics and emotion with application to summarization

Frequent audio and visual events with small duration

| Category/subcategory | Instances | Dur. (min) |
|---|---|---|
| Voice: shouting | 204 | 8 |
| Voice: crying | 101 | 7 |
| Voice: breathing | 102 | 18.55 |
| Movement: footsteps | 165 | 11.51 |
| Social signals: door opening closing | 114 | 2.18 |
| General facial actions: smile | 115 | 3.72 |
| General body mov.: running | 109 | 5.56 |
| General body mov.: turn | 216 | 4.79 |
| Gestures: wave hands | 116 | 3.97 |