Fig. 1 | EURASIP Journal on Image and Video Processing

From: Reversible designs for extreme memory cost reduction of CNN training

Illustration of the ResNet-18 architecture and its memory requirements. Modules that contribute to the peak memory consumption are shown in red; these modules add to the memory cost by storing their input in memory. The green annotation marks the extra memory cost of storing a gradient. Peak memory consumption occurs during the backward pass through the last convolution, so that layer is annotated with an additional gradient memory cost. At this step of the computation, all earlier parameterized layers still hold their inputs in memory, which constitutes the memory bottleneck.
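The bottleneck the caption describes can be made concrete with a back-of-envelope calculation: each parameterized layer keeps its input tensor resident until its backward pass runs, so at the last convolution's backward step all of those inputs coexist with one gradient buffer. The sketch below is purely illustrative (the stage shapes, batch size of 32, 224×224 input, and float32 precision are assumptions, a simplified view rather than the paper's exact accounting):

```python
# Back-of-envelope estimate of activation-storage cost during CNN training.
# Assumptions (not from the paper): batch 32, 224x224 float32 input, and a
# coarse five-stage view of ResNet-18's parameterized layers.

BATCH, BYTES_PER_ELEM = 32, 4  # float32

# (channels, spatial size) of the INPUT each stage stores for backward.
stage_inputs = [
    (3, 224),   # stem 7x7 conv
    (64, 56),   # residual stage 1
    (64, 56),   # residual stage 2 input
    (128, 28),  # residual stage 3 input
    (256, 14),  # residual stage 4 input
]

def tensor_mib(channels, hw):
    """Size in MiB of a (BATCH, channels, hw, hw) float32 tensor."""
    return BATCH * channels * hw * hw * BYTES_PER_ELEM / 2**20

stored = [tensor_mib(c, s) for c, s in stage_inputs]

# Peak occurs while backpropagating through the last convolution: every
# stored input is still resident, plus one gradient buffer the size of
# that layer's input (the green annotation in the figure).
peak_mib = sum(stored) + stored[-1]
print(f"stored inputs: {sum(stored):.1f} MiB, peak: {peak_mib:.1f} MiB")
```

Reversible designs attack exactly the `sum(stored)` term: if a layer's input can be recomputed from its output, it need not stay resident, leaving only a near-constant working set.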
