Fig. 2

The structure of the encoder network. The entire network is composed of three blocks: three downsampling layers, nine residual layers, and three upsampling layers. In the downsampling layers, the number of channels gradually increases from 6 (the input image) to 256. In the residual layers, the feature maps keep 256 channels and a spatial size of 64. In the upsampling layers, the number of channels gradually decreases from 256 to 3.
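
The channel and size progression in the caption can be checked with a small shape trace. This is only a sketch under assumptions the caption does not state: a 256 × 256 input, intermediate widths of 64 and 128 channels, and kernel, stride, and padding values borrowed from common residual generator designs. The function names `conv_out`, `deconv_out`, and `trace_shapes` are hypothetical helpers, not part of the described network.

```python
# Shape trace for the three blocks named in the caption.
# ASSUMPTIONS (not in the caption): 256x256 input, intermediate widths of
# 64 and 128 channels, and the kernel/stride/padding values used below.

def conv_out(size, kernel, stride, padding):
    """Spatial output size of a standard convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel, stride, padding, output_padding):
    """Spatial output size of a transposed convolution."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def trace_shapes(in_size=256):
    """Return (layer name, channels, spatial size) for every layer."""
    shapes = [("input", 6, in_size)]
    size = in_size

    # Three downsampling layers: 6 -> 64 -> 128 -> 256 channels (assumed widths).
    for name, channels, kernel, stride, padding in [
        ("down1", 64, 7, 1, 3),
        ("down2", 128, 3, 2, 1),
        ("down3", 256, 3, 2, 1),
    ]:
        size = conv_out(size, kernel, stride, padding)
        shapes.append((name, channels, size))

    # Nine residual layers: channels and spatial size stay unchanged.
    for i in range(9):
        size = conv_out(size, 3, 1, 1)
        shapes.append((f"res{i + 1}", 256, size))

    # Three upsampling layers: 256 -> 128 -> 64 -> 3 channels (assumed widths).
    for name, channels in [("up1", 128), ("up2", 64)]:
        size = deconv_out(size, 3, 2, 1, 1)
        shapes.append((name, channels, size))
    size = conv_out(size, 7, 1, 3)
    shapes.append(("up3", 3, size))
    return shapes


if __name__ == "__main__":
    for name, channels, size in trace_shapes():
        print(f"{name:>6}: {channels:>3} channels, {size} x {size}")
```

Under these assumptions, the residual layers operate on 256-channel feature maps of size 64, matching the caption; a different input resolution or stride choice would change the traced sizes.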