Cascaded reconstruction network for compressive image sensing
EURASIP Journal on Image and Video Processing volume 2018, Article number: 77 (2018)
Abstract
The theory of compressed sensing (CS) has been successfully applied to image compression in the past few years, but traditional iterative reconstruction algorithms are time-consuming. Fortunately, deep learning-based CS reconstruction algorithms have been reported to greatly reduce the computational complexity. In this paper, we propose two efficient cascaded reconstruction network structures corresponding to two different sampling methods in the CS process. The first is a compatibly sampling reconstruction network (CSRNet), which recovers an image from its compressively sensed measurement sampled by a traditional random matrix. In CSRNet, a deep reconstruction network module obtains an initial image of acceptable quality, which is further improved by a residual reconstruction network module based on a convolutional neural network. The second is an adaptively sampling reconstruction network (ASRNet), which matches an automatically sampling module with a corresponding residual reconstruction module. Experimental results show that the two proposed reconstruction networks outperform several state-of-the-art compressive sensing reconstruction algorithms. Moreover, the proposed ASRNet achieves a gain of more than 1 dB over CSRNet.
1 Introduction
In traditional Nyquist sampling theory, the sampling rate must be at least twice the signal bandwidth in order to reconstruct the original signal losslessly. In contrast, compressive sensing (CS) theory is a signal acquisition paradigm that samples a signal at sub-Nyquist rates while still achieving high-quality recovery [1]. Later, Gan et al. proposed block compressed sensing, which avoids applying CS directly to large images and thereby reduces the computational complexity of the algorithm [2]. Owing to its excellent sampling performance, CS has been widely used in many fields, such as communication and signal processing.
In the past decades, CS theory has advanced considerably, especially in the development of reconstruction algorithms [3–10]. Compressive sensing reconstruction aims to recover the original signal x∈R^{n×1} from the compressive sensing measurement y∈R^{m×1} (m≪n). The CS measurement is obtained by y=Φx, where Φ∈R^{m×n} is a CS measurement matrix. The reconstruction process is highly ill-posed, because more than one solution x∈R^{n×1} can generate the same CS measurement y. To solve this problem, early reconstruction algorithms typically assume that the original image signal has the property of l_{p}-norm sparsity. Based on this assumption, several iterative reconstruction algorithms have been explored, such as orthogonal matching pursuit (OMP) [3] and approximate message passing (AMP) [4]. An extension of AMP, denoising-based AMP (D-AMP) [5], employs denoising algorithms for CS recovery and achieves high performance on natural images. Furthermore, many works incorporate prior knowledge of the original image signals, such as the total variation sparsity prior [6] and K-SVD [7], into the CS recovery framework, which can improve the CS reconstruction performance. In particular, TVAL3 [8] combines the augmented Lagrangian method with total variation regularization and is also an effective CS image reconstruction algorithm. However, almost all of these reconstruction algorithms need to solve an optimization problem. Most of them require hundreds of iterations, which inevitably leads to high computational cost and becomes an obstacle to the application of CS.
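The ill-posedness described above can be illustrated with a small NumPy sketch. The sizes, the Gaussian matrix, and the random signal are illustrative assumptions, not values from the paper: any vector in the null space of Φ can be added to x without changing the measurement y.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 64, 16                       # signal length and number of measurements (m << n), toy sizes
phi = rng.standard_normal((m, n))   # illustrative Gaussian measurement matrix
x = rng.standard_normal(n)          # stand-in for a vectorized image block
y = phi @ x                         # CS measurement y = Phi x

# With full_matrices=True (the default), vt is (n, n); its last n - m rows
# span the null space of phi because phi has only m non-zero singular values.
_, _, vt = np.linalg.svd(phi)
null_vec = vt[-1]

# A different signal that produces the same measurement.
x_alt = x + 5.0 * null_vec
print(np.allclose(phi @ x_alt, y), np.linalg.norm(x_alt - x))
```

This is why a prior (sparsity, total variation, or a learned mapping) is needed to select one solution among the many that are consistent with y.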
In recent years, deep learning-based methods have been introduced into low-level vision problems and have achieved excellent performance, for example in image super-resolution [11, 12], image artifact removal [13], and CS image reconstruction [14–17]. Several deep network-based algorithms for CS image reconstruction have been proposed. ReconNet, proposed in [14], takes the CS measurement of an image patch as input and outputs the corresponding reconstructed patch. For patch-based CS measurements in particular, ReconNet, inspired by SRCNN [11], can retain rich semantic content at low measurement rates compared with traditional methods. In [15], a framework is proposed to recover images from CS measurements without dividing them into small blocks, but its reconstruction performance has no competitive advantage over other algorithms. In [16, 17], residual convolutional neural networks are introduced into image reconstruction for compressive sensing; they preserve some information from previous layers, improve the convergence rate, and accelerate the training process. Unlike optimization-based CS recovery methods, neural network-based methods often directly learn the inverse mapping from the CS measurement domain to the original image domain. As a result, they effectively avoid expensive computation and achieve promising image reconstruction performance.
In this paper, two different cascaded reconstruction networks are proposed for different sampling methods. Firstly, we propose a compatibly sampling reconstruction network (CSRNet), which reconstructs high-quality images from compressively sensed measurements sampled by a random sampling matrix. In CSRNet, a deep reconstruction network module obtains an initial image of acceptable quality, which is further improved by a residual network module based on a convolutional neural network. Secondly, in order to improve the sampling efficiency of CS, an automatically sampling module is designed, which uses a fully connected layer to learn a sampling matrix automatically. In addition, a residual reconstruction module is presented that matches the sampling module. The sampling module and its matching residual reconstruction module together form a complete compressive sensing image reconstruction network, named ASRNet. Compared with CSRNet, ASRNet achieves a gain of more than 1 dB. The experimental results demonstrate that the proposed networks outperform several state-of-the-art iterative reconstruction algorithms and deep learning-based approaches in both objective and subjective quality.
The rest of this paper is organized as follows. In Section 2, two novel networks are proposed for different sampling methods. In Section 3, the performance of the proposed networks is examined. We conclude the paper in Section 4.
2 The methods of proposed networks
In this section, we describe the two proposed networks, CSRNet (Fig. 1) and ASRNet (Fig. 4). The first network, CSRNet, is designed to reconstruct an image from the CS measurement sampled by a random matrix. The second, ASRNet, is a complete compressive sensing image reconstruction network consisting of both a sampling module and a reconstruction module. Here, our sampling module contains only one fully connected (FC) layer, which is well suited to imitating the traditional Block-CS sampling process.
2.1 CSRNet
Our proposed CSRNet consists of three modules: an initial reconstruction module, a deep reconstruction module, and a residual reconstruction module. The initial reconstruction module takes the CS measurement y as input and outputs a B×B-sized preliminary reconstructed image. As shown in Fig. 1, the deep reconstruction module takes the preliminary reconstructed image as input and outputs an image of the same size. The deep reconstruction module contains three convolutional layers, as shown in Fig. 2. The first layer generates 64 feature maps with an 11×11 kernel. The second layer uses a 1×1 kernel and generates 32 feature maps. The third layer produces one feature map with a 7×7 kernel, which is the output of this module. All convolutional layers have a stride of 1, no pooling operation is used, and appropriate zero padding keeps the feature map size constant across all layers. Each convolutional layer except the last is followed by a ReLU layer. Here, the deep reconstruction module obtains an initial image of acceptable quality, which suits the subsequent residual network module better than the cascaded residual network modules used in [16]. The residual reconstruction network has an architecture similar to that of the deep reconstruction network, as shown in Fig. 3, and learns the residual information between the input data and the ground truth. In our model, we set B=32.
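The three-layer structure described above can be sketched as a plain NumPy forward pass. This is a minimal illustration under stated assumptions, not the authors' Caffe model: the weights are random and untrained, the initialization scale is arbitrary, the helper names are hypothetical, and the residual module is assumed to add its predicted residual to its input through a skip connection.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_same(x, w, b):
    """Stride-1 convolution with zero padding. x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    k = w.shape[-1]
    p = k // 2                                            # keeps H and W unchanged for odd k
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))              # zero padding
    win = sliding_window_view(xp, (k, k), axis=(1, 2))    # (C_in, H, W, k, k)
    return np.einsum("chwij,ocij->ohw", win, w) + b[:, None, None]

def init_params(rng):
    # 64 maps with 11x11 kernels, 32 maps with 1x1 kernels, 1 map with a 7x7 kernel.
    shapes = [(64, 1, 11, 11), (32, 64, 1, 1), (1, 32, 7, 7)]
    return [(0.01 * rng.standard_normal(s), np.zeros(s[0])) for s in shapes]

def three_layer_cnn(x, params):
    h = x
    for idx, (w, b) in enumerate(params):
        h = conv2d_same(h, w, b)
        if idx < len(params) - 1:                         # ReLU after every layer except the last
            h = np.maximum(h, 0.0)
    return h

def residual_reconstruction(x, params):
    # Assumption: the module predicts a residual that is added back to its input.
    return x + three_layer_cnn(x, params)

rng = np.random.default_rng(0)
params = init_params(rng)
prelim = rng.random((1, 32, 32))                          # stand-in for a B x B preliminary image, B = 32
out = residual_reconstruction(three_layer_cnn(prelim, params), init_params(rng))
n_params = sum(w.size + b.size for w, b in params)
print(out.shape, n_params)
```

With these layer sizes, one such module has 11,457 parameters (11,360 weights and 97 biases), and the output keeps the 32×32 size of the input block.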
In order to train our CSRNet, we need CS measurements corresponding to each of the extracted patches. For a given measurement rate, we construct a measurement matrix, Φ_{B}, by first generating a random Gaussian matrix of appropriate size and then orthonormalizing its rows. Then, we apply y_{i}=Φ_{B}×x_{ver−i} to obtain the set of CS measurements, where x_{ver−i} is the vectorized version of an image patch x_{i}. Thus, the input-label pairs in the training set can be represented as \(\{y_{i},x_{i}\}^{N}_{i=1}\). The loss function is the average reconstruction error over all the training image blocks, given by
\[ L(\{W_{1},W_{2},W_{3}\})=\frac{1}{N}\sum_{i=1}^{N}\left\| f_{3}(f_{2}(f_{1}(y_{i},W_{1}),W_{2}),W_{3})-x_{i}\right\|^{2}_{2} \]
where N is the total number of image patches in the training dataset, x_{i} is the i-th patch, and y_{i} is the corresponding CS measurement. The initial reconstruction mapping, the deep reconstruction mapping, and the residual reconstruction mapping are represented as f_{1}, f_{2}, and f_{3}, respectively. In addition, {W_{1},W_{2},W_{3}} are the network parameters, which are obtained through training.
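The construction of the training pairs and the loss can be sketched as follows. This is a minimal NumPy illustration with hypothetical helper names; orthonormalizing the rows through a QR factorization and rounding the number of measurements to the nearest integer are assumptions, and the reconstruction used in the example is only a linear placeholder (Φ_{B}^{T}y), not the network.

```python
import numpy as np

def make_measurement_matrix(block_size, rate, rng):
    n = block_size * block_size
    m = int(round(rate * n))               # rounding rule is an assumption
    gauss = rng.standard_normal((n, m))    # transpose of an m x n Gaussian matrix
    q, _ = np.linalg.qr(gauss)             # reduced QR: q is (n, m) with orthonormal columns
    return q.T                             # (m, n) matrix with orthonormal rows

def measure(phi, patches):
    x_vec = patches.reshape(len(patches), -1)   # vectorized patches, one per row
    return x_vec @ phi.T                        # y_i = Phi_B x_i for every patch

def average_reconstruction_error(recon, patches):
    diff = (recon - patches).reshape(len(patches), -1)
    return np.mean(np.sum(diff ** 2, axis=1))   # mean squared l2 error over the N patches

rng = np.random.default_rng(0)
phi = make_measurement_matrix(32, 0.25, rng)
patches = rng.random((4, 32, 32))               # stand-ins for training patches
y = measure(phi, patches)

# Placeholder reconstruction (back-projection), used only to exercise the loss.
recon = (y @ phi).reshape(patches.shape)
loss = average_reconstruction_error(recon, patches)
print(phi.shape, y.shape, loss)
```

In the actual training, `recon` would be the output of the cascaded mappings f_{1}, f_{2}, and f_{3} applied to each y_{i}.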
2.2 ASRNet
Our proposed ASRNet contains three modules: a sampling module, an initial reconstruction module, and a residual reconstruction module, as shown in Fig. 4. In the sampling module, we use a fully connected layer to imitate the traditional compressed sampling process, which is expressed as y_{i}=Φ_{B}x_{i} in traditional Block-CS. If the image is divided into B×B blocks, the input of the fully connected layer is a B^{2}×1 vector. For the sampling ratio α, we obtain n_{B}=B^{2}×α sampling measurements.
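As a quick check of the sizes involved, the following sketch computes n_{B} for B=32 at the four measurement rates used later in the paper and applies a bias-free fully connected layer to one vectorized block. Rounding n_{B} to the nearest integer, omitting the bias, and the random weights are assumptions made for illustration.

```python
import numpy as np

def num_measurements(block_size, alpha):
    # n_B = B^2 * alpha, rounded to an integer (rounding rule is an assumption).
    return int(round(block_size * block_size * alpha))

B = 32
rates = (0.25, 0.10, 0.04, 0.01)
counts = [num_measurements(B, a) for a in rates]
print(counts)

# A fully connected sampling layer is a learnable n_B x B^2 matrix.
rng = np.random.default_rng(0)
n_b = counts[1]                                   # measurement rate 0.10
w_sampling = 0.01 * rng.standard_normal((n_b, B * B))
block = rng.random((B, B))                        # stand-in for one image block
y = w_sampling @ block.reshape(-1)                # B^2 x 1 input, n_B measurements out
print(w_sampling.shape, y.shape)
```

Under this rounding rule, the four rates give 256, 102, 41, and 10 measurements per 32×32 block.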
The initial reconstruction module and the residual reconstruction module are matched to the sampling module. The initial reconstruction module takes the sampling measurements as input and outputs a B×B-sized preliminary reconstructed image. As in the sampling module, we use a fully connected layer to imitate the traditional initial reconstruction process, which can be written as \(\tilde {x_{j}}=\tilde {\phi _{B}}\times y_{j}\). In our design, \(\tilde {\phi _{B}}\) is learned automatically instead of being computed by the complicated MMSE linear estimation. The residual reconstruction module is similar to the residual reconstruction module in CSRNet, shown in Fig. 3. The output of the residual reconstruction module is the final output of the network.
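For comparison, the hand-computed alternative that the learned layer replaces can be sketched in NumPy. The sketch uses the standard linear MMSE form R_{xx}Φ_{B}^{T}(Φ_{B}R_{xx}Φ_{B}^{T})^{-1}; the separable AR(1) pixel correlation model with ρ=0.95, the toy block size, and the function names are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def ar1_block_correlation(block_size, rho):
    idx = np.arange(block_size)
    t = rho ** np.abs(idx[:, None] - idx[None, :])   # 1-D AR(1) correlation matrix
    return np.kron(t, t)                              # separable 2-D model for row-major blocks

def mmse_linear_estimator(phi, r_xx):
    # Returns R_xx Phi^T (Phi R_xx Phi^T)^{-1}, computed with a linear solve.
    a = phi @ r_xx @ phi.T
    return np.linalg.solve(a, phi @ r_xx).T           # valid because r_xx and a are symmetric

rng = np.random.default_rng(0)
B, m = 8, 16                                          # toy sizes, smaller than the paper's B = 32
phi = rng.standard_normal((m, B * B))
r_xx = ar1_block_correlation(B, 0.95)
phi_tilde = mmse_linear_estimator(phi, r_xx)          # (B^2, m) initial reconstruction matrix

x = rng.random(B * B)
x_tilde = phi_tilde @ (phi @ x)                       # preliminary reconstruction of one block
print(phi_tilde.shape, np.allclose(phi @ phi_tilde, np.eye(m), atol=1e-6))
```

A fully connected layer of the same (B^{2}, n_{B}) shape can represent any such matrix, which is why it can be trained to take the place of the explicit estimator.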
Given the original image x_{i}, our goal is to obtain the highly compressed measurement y_{i} with the compressed sampling module and then accurately recover the original image x_{i} from it with the reconstruction modules. Since the sampling module, the initial reconstruction module, and the residual reconstruction module form an end-to-end network, they can be trained together, and the training does not need to be concerned with what the compressed measurement is. Therefore, both the input and the label for training our ASRNet are the image itself. Following most deep learning-based image restoration methods, the mean square error is adopted as the cost function of our network. The optimization objective is represented as
\[ L(\{W_{4},W_{5},W_{6}\})=\frac{1}{N}\sum_{i=1}^{N}\left\| f_{6}(f_{5}(f_{4}(x_{i},W_{4}),W_{5}),W_{6})-x_{i}\right\|^{2}_{2} \]
where {W_{4},W_{5},W_{6}} are the network parameters to be trained, f_{4} is the sampling mapping, and f_{5} and f_{6} correspond to the initial reconstruction mapping and the residual reconstruction mapping, respectively. It should be noted that we train the compressed sampling network and the reconstruction network together, but they can be used independently.
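The idea of training the sampling and reconstruction layers together, with the image itself as both input and label, can be illustrated with a toy NumPy sketch. It covers only the two fully connected layers (f_{4} and f_{5}) and omits the residual module f_{6}; the synthetic low-rank data, the small sizes, plain gradient descent, and the learning rate are all assumptions made to keep the example short, and the paper itself trains the full network in Caffe.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, r, n_samples = 64, 16, 8, 512       # toy sizes, not the paper's B = 32

# Synthetic low-rank "patches" so that m < d measurements can capture them.
x = rng.standard_normal((n_samples, r)) @ rng.standard_normal((r, d)) / np.sqrt(r)

w4 = 0.1 * rng.standard_normal((m, d))    # sampling FC layer (f_4), bias-free
w5 = 0.1 * rng.standard_normal((d, m))    # initial reconstruction FC layer (f_5)

def mse_loss(w4, w5, x):
    recon = (x @ w4.T) @ w5.T             # sample, then reconstruct
    return np.sum((recon - x) ** 2) / len(x)

lr = 1e-3
losses = [mse_loss(w4, w5, x)]
for _ in range(300):
    y = x @ w4.T                          # measurements, never supervised directly
    g = 2.0 * (y @ w5.T - x) / len(x)     # gradient of the loss w.r.t. the reconstruction
    grad_w5 = g.T @ y
    grad_w4 = (g @ w5).T @ x
    w4 -= lr * grad_w4                    # both layers are updated from the same loss
    w5 -= lr * grad_w5
    losses.append(mse_loss(w4, w5, x))

print(losses[0], losses[-1])

# After training, the two layers can be used independently.
measurement = w4 @ x[0]                   # encoder side only
reconstruction = w5 @ measurement         # decoder side only
```

Because the loss depends only on the input and the final reconstruction, no target is ever specified for the measurements; the sampling matrix is shaped entirely by what the reconstruction layers need.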
3 Results and discussion
In this section, we evaluate the performance of the proposed methods for CS reconstruction. We first introduce the details of our training and testing. Then, we present quantitative and qualitative comparisons with four state-of-the-art methods.
3.1 Training
The dataset used in our training is the set of 91 images used in [14]. Set5 from [14] serves as our validation set. We only use the luminance component of the images. We uniformly extract patches of size 32×32 from these images, with a stride of 14 for training and 21 for validation, which yields a training dataset of 22,144 patches and a validation dataset of 1112 patches. Both CSRNet and ASRNet use the same dataset. We train the proposed networks at measurement rates (MR) of 0.25, 0.10, 0.04, and 0.01. Caffe is used to train the proposed models.
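The uniform patch extraction can be sketched as follows. This is a minimal NumPy illustration with a hypothetical helper name; it assumes that patches which would extend past the image border are simply skipped, which the paper does not specify.

```python
import numpy as np

def extract_patches(image, size=32, stride=14):
    """Return all size x size patches of a 2-D luminance image on a regular grid."""
    h, w = image.shape
    patches = [image[r:r + size, c:c + size]
               for r in range(0, h - size + 1, stride)
               for c in range(0, w - size + 1, stride)]
    return np.stack(patches)

image = np.zeros((256, 256))                       # stand-in for one luminance image
train_patches = extract_patches(image, 32, 14)     # training stride
val_patches = extract_patches(image, 32, 21)       # validation stride
print(train_patches.shape, val_patches.shape)
```

For a 256×256 image, this grid gives 17×17 = 289 patches at stride 14 and 11×11 = 121 patches at stride 21; the totals reported above depend on the actual sizes of the dataset images.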
3.2 Comparison with existing methods
3.2.1 Objective quality comparisons
Our proposed algorithms are compared with four representative CS recovery methods: TVAL3 [8], D-AMP [5], ReconNet [14], and DR^{2}-Net [16]. The first two are traditional optimization-based methods, while the last two are recent network-based methods. For the simulated data in our experiments, we evaluate the proposed methods on the same test images as in [14], which consist of 11 grayscale images. Nine of these images have a size of 256×256, and two are 512×512. We compute the PSNR values for all 11 images, and the results are shown in Table 1. We use BM3D [18] as the denoiser to remove the artifacts caused by patch-wise processing. ASRNet clearly achieves a gain of more than 1 dB over CSRNet. We also compare the SSIM of our proposed networks with that of the network-based methods, ReconNet and DR^{2}-Net, as shown in Table 2. Tables 1 and 2 show that our proposed CSRNet and ASRNet outperform the other algorithms at every measurement rate, especially at 0.04 and 0.01. In addition, as the measurement rate decreases, the performance of our methods degrades more slowly than that of the other algorithms.
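For reference, PSNR between a ground-truth image and its reconstruction can be computed as in the sketch below. It assumes 8-bit images with a peak value of 255, which the paper does not state explicitly, and the function name is hypothetical.

```python
import numpy as np

def psnr(reference, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak = 255 assumes 8-bit images."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0.0:
        return float("inf")                # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

reference = np.zeros((256, 256))
reconstruction = reference + 1.0           # every pixel off by one gray level, so MSE = 1
value = psnr(reference, reconstruction)
print(round(value, 2))                     # 20 * log10(255), about 48.13 dB
```

Because PSNR is logarithmic in the mean squared error, a gain of 1 dB corresponds to reducing the MSE by roughly 21%.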
3.2.2 Time complexity
Time complexity is a key factor for image compressive sensing. During reconstruction, network-based algorithms are much faster than traditional iterative reconstruction methods, so we only compare the time complexity with ReconNet and DR^{2}-Net. Table 3 shows the average time these network-based methods take to reconstruct the nine 256×256 images.
From Tables 1, 2, and 3, we can observe that the proposed CSRNet and ASRNet outperform ReconNet and DR^{2}-Net in terms of PSNR, SSIM, and time complexity. ASRNet obtains the best performance in all objective quality assessments. Notably, ASRNet also runs fastest, which is very important for real-time applications.
3.2.3 Visual quality comparisons
Our proposed algorithms are also compared visually with the four representative CS recovery methods, TVAL3, D-AMP, ReconNet, and DR^{2}-Net. Figures 5 and 6 show the visual comparisons of Parrots at a measurement rate of 0.1 with and without BM3D, respectively. The proposed CSRNet and ASRNet reconstruct more details with greater sharpness and offer better visual reconstruction results than the other network-based algorithms. The other three groups are shown in Figs. 7, 8, 9, 10, 11, and 12. The test images at different rates with or without BM3D are shown in Figs. 13, 14, 15, and 16. We can see that our proposed CSRNet and ASRNet outperform ReconNet and DR^{2}-Net at each measurement rate.
3.3 Evaluation on our proposed architectures
In order to verify the innovation and rationality of our network architectures in more detail, we compare the intermediate outputs and the final outputs of our methods in terms of objective and subjective quality. In addition to the above four MRs, we train CSRNet and ASRNet models at MR=0.2 and 0.15. We calculate the mean PSNR and SSIM values of all 11 test images at each measurement rate, as shown in Table 4. CSRNet-i denotes the intermediate output of the deep reconstruction module in CSRNet, and ASRNet-i denotes the intermediate output of the initial reconstruction module in ASRNet. The results in Table 4 show that the final outputs of CSRNet and ASRNet both perform better than their intermediate outputs at each measurement rate. Figures 17 and 18 show the intermediate and final CSRNet results for subjective evaluation on Cameraman, House, Monarch, and Parrots at MR=0.20 and 0.10, respectively. Figures 19 and 20 show the intermediate and final ASRNet results for subjective evaluation on Boats, Cameraman, Peppers, and Monarch at MR=0.04 and 0.10. Compared with the intermediate results, our final results show more natural textures and details. All final results are substantially better than the corresponding intermediate results in terms of image quality. These results support the conclusion that our proposed networks are reasonable, stable, and reliable.
4 Conclusion
In this paper, two cascaded reconstruction networks are proposed for different CS sampling methods. In most previous works, the sampling matrix in the CS process is a random matrix. The first network is a compatibly sampling reconstruction network (CSRNet), which reconstructs a high-quality image from its compressively sensed measurement sampled by a traditional random matrix. The second network is an adaptively sampling reconstruction network (ASRNet), which matches an automatically sampling module with a corresponding residual reconstruction module. The sampling module addresses the problem of sampling efficiency in compressive sensing. Experimental results show that the proposed networks, CSRNet and ASRNet, achieve significant improvements in reconstruction results over traditional and neural network-based CS reconstruction algorithms in terms of both quality and time complexity. Furthermore, ASRNet achieves a gain of more than 1 dB over CSRNet.
Abbreviations
ASRNet: Adaptively sampling reconstruction network
CS: Compressive sensing
CSRNet: Compatibly sampling reconstruction network
References
1. D.L. Donoho, Compressed sensing. IEEE Trans. Inf. Theory. 52(4), 1289–1306 (2006).
2. L. Gan, Block compressed sensing of natural images. Int. Conf. Digit. Signal Process, 403–406 (2007).
3. J.A. Tropp, Greed is good: algorithmic results for sparse approximation. IEEE Trans. Inf. Theory. 50(10), 2231–2242 (2004).
4. D.L. Donoho, A. Maleki, A. Montanari, Message-passing algorithms for compressed sensing. Proc. Natl. Acad. Sci. U. S. A. 106(45), 18914 (2009).
5. C.A. Metzler, A. Maleki, R.G. Baraniuk, From denoising to compressed sensing. IEEE Trans. Inf. Theory. 62(9), 5117–5144 (2016).
6. Y. Xiao, J. Yang, Alternating algorithms for total variation image reconstruction from random projections. Inverse Probl. Imaging. 6(3), 547–563 (2017).
7. M. Elad, M. Aharon, Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Process. 15(12), 3736–3745 (2006).
8. C. Li, W. Yin, H. Jiang, Y. Zhang, An efficient augmented Lagrangian method with applications to total variation minimization. Comput. Optim. Appl. 56(3), 507–530 (2013).
9. L. Ma, H. Bai, M. Zhang, Y. Zhao, Edge-based adaptive sampling for image block compressive sensing. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E99.A(11), 2095–2098 (2016).
10. H. Bai, M. Zhang, M. Liu, A. Wang, Y. Zhao, Depth image coding using entropy-based adaptive measurement allocation. Entropy. 16(12), 6590–6601 (2014).
11. C. Dong, C.C. Loy, K. He, X. Tang, Image super-resolution using deep convolutional networks. IEEE Trans. Pattern. Anal. Mach. Intell. 38(2), 295–307 (2016).
12. L. Zhao, J. Liang, H. Bai, A. Wang, Y. Zhao, Simultaneously color-depth super-resolution with conditional generative adversarial network. arXiv preprint arXiv:1708.09105. (2017).
13. L. Zhao, J. Liang, H. Bai, A. Wang, Y. Zhao, in IEEE International Conference on Image Processing. Convolutional neural network-based depth image artifact removal (IEEE International Conference on Image Processing, 2017).
14. K. Kulkarni, S. Lohit, P. Turaga, R. Kerviche, A. Ashok, ReconNet: non-iterative reconstruction of images from compressively sensed random measurements. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016).
15. A. Mousavi, R.G. Baraniuk, Learning to invert: signal recovery via deep convolutional networks. IEEE International Conference on Acoustics, Speech and Signal Processing (2017).
16. H. Yao, F. Dai, D. Zhang, Y. Ma, S. Zhang, DR^{2}-Net: deep residual reconstruction network for image compressive sensing. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017).
17. G. Nie, Y. Fu, Y. Zheng, H. Huang, Image restoration from patch-based compressed sensing measurement. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017).
18. K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007).
Acknowledgements
The authors would like to thank the editors and anonymous reviewers for their valuable comments.
Funding
This work was supported in part by Key Innovation Team of Shanxi 1331 Project (KITSX1331) and Fundamental Research Funds for the Central Universities (2018JBZ001).
Author information
Contributions
YW completed most of the text and implementation of the algorithm. HB and LZ directed this work and contributed to the revisions. YZ reviewed and revised the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing interests.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Wang, Y., Bai, H., Zhao, L. et al. Cascaded reconstruction network for compressive image sensing. J Image Video Proc. 2018, 77 (2018). https://doi.org/10.1186/s13640-018-0315-5
DOI: https://doi.org/10.1186/s13640-018-0315-5