From: Weakly supervised spatial–temporal attention network driven by tracking and consistency loss for action detection
| Method   | Speed (fps) | Frame-mAP |
|----------|-------------|-----------|
| P3D-CTN  | 28          | –         |
| I3D      | 30          | 77.7      |
| 3C-Net   | 45          | 84.4      |
| HAM-Net  | 29          | 92.1      |
| YOWO+LFB | 38          | 86.4      |
| Ours     | 31          | 94.8      |