Method | Mode | J-HMDB-21 | UCF101-24 |
---|---|---|---|
T-CNN [12] | Full | 61.3 | 41.4 |
ACT [20] | Full | 65.7 | 69.5 |
STEP [46] | Full | – | 75.0 |
P3D-CTN [15] | Full | 71.1 | – |
I3D [47] | Full | 73.3 | 77.7 |
ACRN [24] | Full | 77.9 | 80.4 |
YOWO+LFB [24] | Full | 75.7 | 87.3 |
3C-Net [34] | Weak | 77.9 | 86.4 |
HAM-Net [39] | Weak | 88.1 | 92.1 |
Ours | Weak | 90.2 | 94.8 |
- The table reports Frame-mAP (IoU = 0.5, 16-frame clips) in comparison with recent fully and weakly supervised methods. Note that the proposed method is an attention network trained with classification supervision only, without object-location supervision.
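For readers unfamiliar with the IoU = 0.5 criterion behind Frame-mAP, the following is a minimal sketch of how a single per-frame detection might be matched to a ground-truth box. The function names `box_iou` and `is_true_positive`, the `(x1, y1, x2, y2)` box layout, and the use of `>=` at the threshold are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
def box_iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_label, gt_box, gt_label, thresh=0.5):
    """Assumed matching rule: same action class and IoU at or above thresh."""
    return pred_label == gt_label and box_iou(pred_box, gt_box) >= thresh
```

Under this sketch, Frame-mAP would be obtained by ranking detections by confidence, marking each one as a true or false positive with a rule like the one above, computing average precision per action class, and averaging over classes.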