From: Improved two-stream model for human action recognition
| Model | Top-1 (%) | Top-5 (%) |
|---|---|---|
| Spatial Stream ConvNet [9] | 72.7 | - |
| VGG16 | 32.1 | 51.3 |
| Inception V3 [16] | 54.55 | 79.92 |
| VGG16+LSTM (bidirectional) | 88.1 | 96.72 |
| VGG16+LSTM (single directional) | 90.81 | 98.61 |
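
The table reports Top-1 and Top-5 accuracy without defining them. As a minimal sketch, assuming the standard definition (a sample counts as correct when its true class is among the k highest-scoring predictions), the metric could be computed as follows. The function name `top_k_accuracy` and the toy scores are hypothetical illustrations, not taken from the paper or its data.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int,
) -> float:
    """Return the percentage of samples whose true label is among
    the k classes with the highest predicted scores.

    Assumes the standard top-k definition; hypothetical helper,
    not the evaluation code used for the table.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")
    if k < 1:
        raise ValueError("k must be at least 1")

    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


# Toy scores for 4 samples over 3 classes (illustrative values only).
scores = [
    [0.1, 0.7, 0.2],  # true class 1 is ranked first
    [0.5, 0.3, 0.2],  # true class 1 is ranked second
    [0.2, 0.3, 0.5],  # true class 0 is ranked third
    [0.6, 0.3, 0.1],  # true class 0 is ranked first
]
labels = [1, 1, 0, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

Under this definition Top-5 can never be lower than Top-1 for the same model, which matches every row of the table that reports both values.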