
Table 3 Action classification experiment

From: Weakly supervised spatial–temporal attention network driven by tracking and consistency loss for action detection

| Method | Mode | J-HMDB-21 | UCF101-24 |
|---|---|---|---|
| T-CNN [12] | Full | 61.3 | 41.4 |
| ACT [20] | Full | 65.7 | 69.5 |
| STEP [46] | Full | - | 75.0 |
| P3D-CTN [15] | Full | 71.1 | - |
| I3D [47] | Full | 73.3 | 77.7 |
| ACRN [24] | Full | 77.9 | 80.4 |
| YOWO+LFB [24] | Full | 75.7 | 87.3 |
| 3C-Net [34] | Weak | 77.9 | 86.4 |
| HAM-Net [39] | Weak | 88.1 | 92.1 |
| Ours | Weak | 90.2 | 94.8 |

  1. The table reports Frame-mAP (IoU = 0.5, 16-frame clips) for the proposed method and for recent fully and weakly supervised methods. Note that the proposed method is an attention network trained with classification labels but without object-location annotations.
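
As a minimal sketch of what the IoU = 0.5 criterion means for a single frame, the snippet below computes the intersection over union of two boxes and checks a detection against a ground-truth box. The function names, the `(x1, y1, x2, y2)` box format, and the use of `>=` at the threshold are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_label, gt_box, gt_label, iou_threshold=0.5):
    """Hypothetical helper: a frame-level detection counts as correct when the
    class matches and the boxes overlap by at least iou_threshold (assumed >=)."""
    return pred_label == gt_label and box_iou(pred_box, gt_box) >= iou_threshold


if __name__ == "__main__":
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
    print(is_true_positive((0, 0, 10, 10), "run", (0, 0, 10, 8), "run"))
```

Frame-mAP then averages, over action classes, the average precision obtained from such per-frame true-positive decisions ranked by detection score.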