Table 11. HMDB-51 (split 1)

From: Gated spatio and temporal convolutional neural network for activity recognition: towards gated multimodal deep learning

| Method | Accuracy |
| --- | --- |
| Spatial streams (three-channel RGB) | 36% |
| Motion streams (three flow fields) | 43% |
| Averaging (model A) | 47.5% |
| Gating network (model C) | 48% |
| Temporal segment network (averaging) [23] | 69.93% |
| Our gating network (model C) + expert network of temporal segment network [23] | 70% |