Table 1 Backbone architecture. Details of each building block are reported in square brackets

From: Learning attention for object tracking with adversarial learning network

| Block | Output size | Backbone |
| --- | --- | --- |
| Conv1 | 125 × 125 | 7 × 7, 64, stride 2 |
| Conv2_x | 63 × 63 | \( \left[\begin{array}{l}1\times 1, 64\\ 3\times 3, 64\\ 1\times 1, 256\end{array}\right]\times 3 \) |
| Conv3_x | 31 × 31 | \( \left[\begin{array}{l}1\times 1, 128\\ 3\times 3, 128\\ 1\times 1, 512\end{array}\right]\times 4 \) |
| Conv4_x | 31 × 31 | \( \left[\begin{array}{l}1\times 1, 256\\ 3\times 3, 256\\ 1\times 1, 1024\end{array}\right]\times 6 \) |
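
The output sizes in the table can be checked with the standard convolution output-size formula. The sketch below is illustrative only and rests on assumptions the table does not state: a 255 × 255 input, no padding in Conv1, a 3 × 3 max pool with stride 2 and padding 1 before Conv2_x, an unpadded stride-2 3 × 3 convolution at the start of Conv3_x, and a stride-1 3 × 3 convolution with dilation 2 in Conv4_x (which would explain why the size stays at 31 × 31). The function names `conv_out` and `backbone_sizes` are hypothetical.

```python
def conv_out(size, kernel, stride, padding=0, dilation=1):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def backbone_sizes(input_size=255):
    """Trace the spatial size through the four blocks listed in the table.

    Every padding, pooling, and dilation value here is an assumption chosen
    to be consistent with the reported sizes; the table gives only kernel
    sizes, channel widths, and the stride of Conv1.
    """
    sizes = {}

    # Conv1: 7 x 7, 64 channels, stride 2 (from the table); padding 0 assumed.
    x = conv_out(input_size, kernel=7, stride=2, padding=0)
    sizes["Conv1"] = x

    # Assumed 3 x 3 max pool, stride 2, padding 1, ahead of Conv2_x.
    x = conv_out(x, kernel=3, stride=2, padding=1)
    sizes["Conv2_x"] = x

    # Conv3_x: assumed stride-2 3 x 3 convolution without padding.
    x = conv_out(x, kernel=3, stride=2, padding=0)
    sizes["Conv3_x"] = x

    # Conv4_x: assumed stride 1 with dilation 2 and padding 2, so the
    # resolution is preserved.
    x = conv_out(x, kernel=3, stride=1, padding=2, dilation=2)
    sizes["Conv4_x"] = x

    return sizes


if __name__ == "__main__":
    for block, size in backbone_sizes().items():
        print(f"{block}: {size} x {size}")
```

Under these assumptions the traced sizes match the Output size column; a different input resolution or padding scheme would produce different values.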