Scaling parameter m | Layer parameter n | Network depth | Parameter count | Test set error (%) | Training outcome |
---|---|---|---|---|---|
1 | n = 20 | 122 | 7.6M | 4.89 | Can be trained |
1 | n = 24 | 146 | 8.6M |  | Unable to train |
2 | n = 8 | 50 | 11.7M | 4.21 | Can be trained |
2 | n = 10 | 62 | 14.6M |  | Unable to train |
3 | n = 6 | 38 | 17.1M | 4.23 | Can be trained |
3 | n = 7 | 44 | 22.1M |  | Unable to train |
4 | n = 4 | 26 | 22.6M | 4.1 | Can be trained |
4 | n = 5 | 32 | 26.1M |  | Unable to train |
5 | n = 3 | 20 | 29.1M | 4.06 | Can be trained |
5 | n = 4 | 26 | 35M |  | Unable to train |
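
Every depth in the table is consistent with the relation depth = 6n + 2. That rule is inferred from the rows and is not stated in the source, so treat it as an assumption. The short sketch below copies the table rows, checks them against the rule, and extracts the deepest network reported as trainable for each m. The names `ROWS`, `depth_from_n`, `all_depths_match`, and `deepest_trainable` are hypothetical helpers introduced only for illustration.

```python
# Rows copied from the table: (m, n, depth, trainable)
ROWS = [
    (1, 20, 122, True),
    (1, 24, 146, False),
    (2, 8, 50, True),
    (2, 10, 62, False),
    (3, 6, 38, True),
    (3, 7, 44, False),
    (4, 4, 26, True),
    (4, 5, 32, False),
    (5, 3, 20, True),
    (5, 4, 26, False),
]


def depth_from_n(n: int) -> int:
    """Depth rule inferred from the table rows (an assumption, not from the source)."""
    return 6 * n + 2


def all_depths_match(rows=ROWS) -> bool:
    """Return True if every listed depth equals 6n + 2."""
    return all(depth_from_n(n) == depth for _, n, depth, _ in rows)


def deepest_trainable(rows=ROWS) -> dict:
    """Map each scaling parameter m to the largest depth reported as trainable."""
    best = {}
    for m, _, depth, trainable in rows:
        if trainable:
            best[m] = max(best.get(m, 0), depth)
    return best


if __name__ == "__main__":
    print(all_depths_match())
    print(deepest_trainable())
```

Read this way, the table shows the deepest trainable network shrinking as m grows, from depth 122 at m = 1 to depth 20 at m = 5.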