Table 3 Action classification experiment

From: Weakly supervised spatial–temporal attention network driven by tracking and consistency loss for action detection

Method Mode J-HMDB-21 UCF101-24
T-CNN [12] Full 61.3 41.4
ACT [20] Full 65.7 69.5
STEP [46] Full 75.0
P3D-CTN [15] Full 71.1
I3D [47] Full 73.3 77.7
ACRN [24] Full 77.9 80.4
YOWO+LFB [24] Full 75.7 87.3
3C-Net [34] Weak 77.9 86.4
HAM-Net [39] Weak 88.1 92.1
Ours Weak 90.2 94.8
  1. The table compares Frame-mAP (IoU = 0.5, 16-frame clips) against recent fully and weakly supervised methods. Note that the proposed method is an attention network trained with classification supervision but without object-location supervision.