Table 1 We used 80% of the UCF101-24 dataset for training and 20% for validation

From: Weakly supervised spatial–temporal attention network driven by tracking and consistency loss for action detection

Domain  Mode  20%   30%   50%   70%   100%
Source  Full  80.7  86.8  95.4  96.5  96.7
Target  Weak  93.3  94.9  96.1  96.3  -
  1. The table reports Frame-mAP. The column headers (20–100%) give the share of the training data used for fully supervised learning with our model in the source domain. In the target domain, we started from that pre-trained model and trained the network on the remaining data, which has only classification annotations.
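
The split described in the note can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `split_by_supervision`, the video IDs, the random shuffling, and the seed are all hypothetical, and the source does not say how the fully annotated subset was selected.

```python
import random


def split_by_supervision(video_ids, full_fraction, seed=0):
    """Split training videos into a fully annotated subset (source domain)
    and the remainder, treated as having class labels only (target domain).

    Assumption: the subset is drawn by a seeded random shuffle.
    """
    ids = list(video_ids)
    random.Random(seed).shuffle(ids)
    n_full = round(len(ids) * full_fraction)
    return ids[:n_full], ids[n_full:]


# Hypothetical training pool standing in for the 80% training portion.
train_ids = [f"video_{i:04d}" for i in range(1000)]

for fraction in (0.2, 0.3, 0.5, 0.7, 1.0):
    full, weak = split_by_supervision(train_ids, fraction)
    print(f"{fraction:.0%} full: {len(full)} videos, weak: {len(weak)} videos")
```

When the fraction is 100%, no data remains for the weakly supervised stage, which would be consistent with the "-" entry in the Target row.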