Cascaded reconstruction network for compressive image sensing

Wang, Yahan; Bai, Huihui; Zhao, Lijun; Zhao, Yao

doi:10.1186/s13640-018-0315-5

Research
Open access
Published: 28 August 2018


Yahan Wang¹, Huihui Bai¹ (ORCID: orcid.org/0000-0002-3879-8957), Lijun Zhao¹ & Yao Zhao¹

EURASIP Journal on Image and Video Processing volume 2018, Article number: 77 (2018)


Abstract

The theory of compressed sensing (CS) has been successfully applied to image compression in the past few years, but traditional iterative reconstruction algorithms are time-consuming. Fortunately, deep learning-based CS reconstruction algorithms have been reported to greatly reduce the computational complexity. In this paper, we propose two efficient cascaded reconstruction network structures corresponding to two different sampling methods in the CS process. The first is a compatibly sampling reconstruction network (CSRNet), which recovers an image from its compressively sensed measurement sampled by a traditional random matrix. In CSRNet, a deep reconstruction network module obtains an initial image of acceptable quality, which is further improved by a residual reconstruction network module based on a convolutional neural network. The second is an adaptively sampling reconstruction network (ASRNet), which pairs an automatic sampling module with a matching residual reconstruction module. Experimental results show that the two proposed reconstruction networks outperform several state-of-the-art compressive sensing reconstruction algorithms. Moreover, the proposed ASRNet achieves a gain of more than 1 dB over CSRNet.

1 Introduction

In traditional Nyquist sampling theory, the sampling rate must be at least twice the signal bandwidth in order to reconstruct the original signal losslessly. In contrast, compressive sensing (CS) theory is a signal acquisition paradigm that samples a signal at sub-Nyquist rates while still achieving high-quality recovery [1]. Later, Gan proposed block compressed sensing, which reduces the computational complexity by avoiding the direct application of CS to large images [2]. Owing to its excellent sampling performance, CS has been widely used in many fields, such as communication and signal processing.

In the past decades, CS theory has advanced considerably, especially in the development of reconstruction algorithms [3–10]. Compressive sensing reconstruction aims to recover the original signal x∈R^n×1 from the compressive sensing measurement y∈R^m×1 (m≪n). The CS measurement is obtained by y=Φx, where Φ∈R^m×n is a CS measurement matrix. The reconstruction process is highly ill-posed, because more than one solution x∈R^n×1 can generate the same CS measurement y. To address this problem, early reconstruction algorithms assume that the original image signal is sparse in the sense of the l_p-norm. Based on this assumption, several iterative reconstruction algorithms have been explored, such as orthogonal matching pursuit (OMP) [3] and approximate message passing (AMP) [4]. An extension of AMP, denoising-based AMP (D-AMP) [5], employs denoising algorithms for CS recovery and achieves high performance on natural images. Furthermore, many works incorporate prior knowledge of the original image signals, such as the total variation sparsity prior [6] and K-SVD [7], into the CS recovery framework, which improves the CS reconstruction performance. In particular, TVAL3 [8] combines the augmented Lagrangian method with total variation regularization and is also an effective CS image reconstruction algorithm. However, almost all of these reconstruction algorithms need to solve an optimization problem. Most of them require hundreds of iterations, which inevitably leads to a high computational cost and becomes an obstacle to the application of CS.
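The ill-posedness of y=Φx can be illustrated with a small numerical sketch. The sizes, the Gaussian matrix, and all variable names below are assumptions chosen for illustration only; the point is that adding any vector from the null space of Φ to x leaves the measurement unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 16, 4                        # toy sizes with m << n (assumed for illustration)
Phi = rng.standard_normal((m, n))   # hypothetical measurement matrix
x = rng.standard_normal(n)          # "original" signal
y = Phi @ x                         # CS measurement y = Phi x

# With the full SVD, the last n - m rows of Vt span the null space of Phi
# (a Gaussian Phi has full row rank with probability 1).
_, _, Vt = np.linalg.svd(Phi)
null_vec = Vt[-1]                   # Phi @ null_vec is numerically zero

x_alt = x + 3.0 * null_vec          # a clearly different signal ...
y_alt = Phi @ x_alt                 # ... that produces the same measurement

print(np.allclose(y, y_alt), np.linalg.norm(x - x_alt))
```

Because both signals explain the measurement equally well, a reconstruction method has to bring in extra information, such as a sparsity prior or a learned mapping, to choose between them.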

In recent years, deep learning-based methods have been introduced into low-level vision problems and have achieved excellent performance, for example in image super-resolution [11, 12], image artifact removal [13], and CS image reconstruction [14–17]. Several deep network-based algorithms for CS image reconstruction have been proposed. ReconNet [14] takes the CS measurement of an image patch as input and outputs the corresponding reconstructed patch. For patch-based CS measurement in particular, ReconNet, inspired by SRCNN [11], retains rich semantic content at low measurement rates compared with traditional methods. In [15], a framework is proposed to recover images from CS measurements without dividing images into small blocks, but its reconstruction performance offers no competitive advantage over other algorithms. In [16, 17], a residual convolutional neural network is introduced for compressive sensing image reconstruction, which preserves information from previous layers, improves the convergence rate, and accelerates training. Unlike optimization-based CS recovery methods, neural network-based methods often directly learn the inverse mapping from the CS measurement domain to the original image domain. As a result, they avoid expensive computation and achieve promising image reconstruction performance.

In this paper, two different cascaded reconstruction networks are proposed for different sampling methods. Firstly, we propose a compatibly sampling reconstruction network (CSRNet), which reconstructs high-quality images from compressively sensed measurements sampled by a random sampling matrix. In CSRNet, a deep reconstruction network module obtains an initial image of acceptable quality, which is further improved by a residual network module based on a convolutional neural network. Secondly, in order to improve the sampling efficiency of CS, we design an automatic sampling module, which uses a fully connected layer to learn a sampling matrix automatically. In addition, we present a residual reconstruction module that matches the sampling module. The sampling module and its matching residual reconstruction module together form a complete compressive sensing image reconstruction network, named ASRNet. Compared with CSRNet, ASRNet achieves a gain of more than 1 dB. The experimental results demonstrate that the proposed networks outperform several state-of-the-art iterative reconstruction algorithms and deep learning-based approaches in both objective and subjective quality.

The rest of this paper is organized as follows. In Section 2, two novel networks are proposed for different sampling methods. In Section 3, the performance of the proposed networks is examined. We conclude the paper in Section 4.

2 The methods of proposed networks

In this section, we describe the two proposed networks, CSRNet (Fig. 1) and ASRNet (Fig. 4). The first network, CSRNet, is designed to reconstruct an image from the CS measurement sampled by a random matrix. The second, ASRNet, is a complete compressive sensing image reconstruction network consisting of both sampling and reconstruction modules. Here, our sampling module contains only one fully connected (FC) layer, which is sufficient to imitate the traditional Block-CS sampling process.

2.1 CSRNet

Our proposed CSRNet consists of three modules: an initial reconstruction module, a deep reconstruction module, and a residual reconstruction module. The initial reconstruction module takes the CS measurement y as input and outputs a B×B-sized preliminary reconstructed image. As shown in Fig. 1, the deep reconstruction module takes the preliminary reconstructed image as input and outputs an image of the same size. The deep reconstruction module contains three convolutional layers, shown in Fig. 2. The first layer generates 64 feature maps with an 11×11 kernel. The second layer uses a 1×1 kernel and generates 32 feature maps. The third layer produces one feature map with a 7×7 kernel, which is the output of this module. All convolutional layers have a stride of 1 and no pooling operation, and appropriate zero padding is used to keep the feature map size constant across all layers. Each convolutional layer except the last is followed by a ReLU layer. Here, the deep reconstruction network module obtains an initial image of acceptable quality, which makes it a more suitable input to the residual network module than the cascaded residual network modules of [16]. The residual reconstruction network has an architecture similar to that of the deep reconstruction network, shown in Fig. 3, and learns the residual information between the input data and the ground truth. In our model, we set B=32.
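The layer description above can be checked with a few lines of arithmetic. The following sketch derives the zero padding that keeps a 32×32 block at constant size and counts the weights of the three convolutional layers. The single input channel (luminance only) and the presence of bias terms are assumptions, and the helper names are hypothetical; the paper's Caffe definition may differ.

```python
# (kernel size, input channels, output channels) for the three layers in the
# text: 11x11 -> 64 maps, 1x1 -> 32 maps, 7x7 -> 1 map.
# One input channel is assumed because only the luminance component is used.
LAYERS = [(11, 1, 64), (1, 64, 32), (7, 32, 1)]


def same_padding(kernel: int) -> int:
    """Zero padding per side that preserves size for stride 1 and an odd kernel."""
    return (kernel - 1) // 2


def output_size(size: int, kernel: int, padding: int, stride: int = 1) -> int:
    """Standard convolution output-size formula."""
    return (size + 2 * padding - kernel) // stride + 1


def param_count(kernel: int, c_in: int, c_out: int, bias: bool = True) -> int:
    """Number of weights (plus optional biases, an assumption) in one layer."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def summarize(block_size: int = 32):
    """Return (kernel, padding, out_channels, out_size, params) per layer."""
    rows = []
    size = block_size
    for kernel, c_in, c_out in LAYERS:
        pad = same_padding(kernel)
        size = output_size(size, kernel, pad)
        rows.append((kernel, pad, c_out, size, param_count(kernel, c_in, c_out)))
    return rows


for row in summarize():
    print(row)
```

Under these assumptions the paddings are 5, 0, and 3 pixels per side, and the module is small (on the order of ten thousand parameters), which is consistent with the fast reconstruction times reported later.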

In order to train our CSRNet, we need the CS measurements corresponding to each of the extracted patches. For a given measurement rate, we construct a measurement matrix, Φ_B, by first generating a random Gaussian matrix of appropriate size and then orthonormalizing its rows. Then, we apply y_i=Φ_B×x_ver−i to obtain the set of CS measurements, where x_ver−i is the vectorized version of an image patch x_i. Thus, the input-label pairs in the training set can be represented as $\{y_{i},x_{i}\}_{i=1}^{N}$. The loss function is the average reconstruction error over all the training image blocks, given by
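The construction of Φ_B described above can be sketched in NumPy as follows. The function names, the rounding of the number of measurements, the use of QR factorization for the orthonormalization, and the row-major vectorization order are assumptions; the text only specifies a random Gaussian matrix with orthonormalized rows.

```python
import numpy as np


def make_measurement_matrix(block_size: int, measurement_rate: float, seed: int = 0):
    """Random Gaussian matrix with orthonormal rows, shape (m, block_size**2)."""
    n = block_size * block_size
    m = int(round(measurement_rate * n))   # rounding rule is an assumption
    rng = np.random.default_rng(seed)
    gaussian = rng.standard_normal((m, n))
    # Reduced QR of the transpose gives an (n, m) matrix with orthonormal
    # columns; transposing back yields m orthonormal rows.
    q, _ = np.linalg.qr(gaussian.T)
    return q.T


def measure(phi: np.ndarray, patch: np.ndarray) -> np.ndarray:
    """Compute y = Phi_B x for one patch (row-major vectorization assumed)."""
    return phi @ patch.reshape(-1)


phi = make_measurement_matrix(block_size=32, measurement_rate=0.25)
y = measure(phi, np.ones((32, 32)))
print(phi.shape, y.shape)
```

With orthonormal rows, Φ_B Φ_Bᵀ is the identity, so the measurements are uncorrelated projections of the patch.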

$$L(\{W_{1},W_{2},W_{3}\})\,=\,\frac{1}{N}\sum_{i=1}^{N}\!\|f_{3}(f_{2}(f_{1}(y_{i},W_{1}),W_{2}),W_{3})-x_{i}\|^{2} $$

where N is the total number of image patches in the training dataset, x_i is the ith patch, and y_i is the corresponding CS measurement. The initial reconstruction mapping, the deep reconstruction mapping, and the residual reconstruction mapping are represented as f₁, f₂, and f₃, respectively. In addition, {W₁,W₂,W₃} are the network parameters, which are learned during training.
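As a minimal sketch of this loss, the function below averages the squared L2 error over N patches. It takes the network outputs as given; the name `reconstruction_loss` and the array layout (patch index on the first axis) are assumptions for illustration.

```python
import numpy as np


def reconstruction_loss(reconstructed, targets) -> float:
    """Mean over N patches of the squared L2 norm of (reconstruction - target)."""
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    targets = np.asarray(targets, dtype=np.float64)
    n_patches = targets.shape[0]
    # Flatten each patch so the norm is taken over all of its pixels.
    diff = (reconstructed - targets).reshape(n_patches, -1)
    return float(np.mean(np.sum(diff ** 2, axis=1)))


# Two 2x2 patches, each pixel off by 1: squared norm 4 per patch, mean 4.
print(reconstruction_loss(np.ones((2, 2, 2)), np.zeros((2, 2, 2))))
```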

2.2 ASRNet

Our proposed ASRNet contains three modules: a sampling module, an initial reconstruction module, and a residual reconstruction module, as shown in Fig. 4. In the sampling module, we use a fully connected layer to imitate the traditional compressed sampling process, which is expressed as y_i=Φ_Bx_i in traditional Block-CS. If the image is divided into B×B blocks, the input of the fully connected layer is a B²×1 vector. For the sampling ratio α, we obtain n_B=B²×α sampling measurements per block.
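To make the sizes concrete, the sketch below evaluates n_B=B²×α for B=32 at the measurement rates used in the experiments and applies a sampling layer as a plain matrix product. Rounding n_B to the nearest integer and omitting a bias term are assumptions, and the function names are hypothetical.

```python
import numpy as np


def num_measurements(block_size: int, sampling_ratio: float) -> int:
    """n_B = B^2 * alpha; rounding to the nearest integer is an assumption."""
    return int(round(block_size * block_size * sampling_ratio))


def sample_block(weights: np.ndarray, block: np.ndarray) -> np.ndarray:
    """Fully connected sampling layer without bias: y = W x, with x the vectorized block."""
    return weights @ block.reshape(-1)


for ratio in (0.25, 0.10, 0.04, 0.01):
    print(ratio, num_measurements(32, ratio))

# Randomly initialized weights stand in for a learned sampling matrix here.
rng = np.random.default_rng(0)
weights = rng.standard_normal((num_measurements(32, 0.10), 32 * 32))
print(sample_block(weights, rng.standard_normal((32, 32))).shape)
```

In ASRNet the weight matrix would be learned jointly with the reconstruction modules instead of being drawn at random.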

The initial reconstruction module and the residual reconstruction module are matched to the sampling module. The initial reconstruction module takes the sampling measurements as input and outputs a B×B-sized preliminary reconstructed image. As in the sampling module, we use a fully connected layer to imitate the traditional initial reconstruction process, which can be expressed as $\tilde{x}_{i}=\tilde{\phi}_{B}\times y_{i}$. In our design, $\tilde{\phi}_{B}$ is learned automatically instead of being computed by the complicated MMSE linear estimation. The residual reconstruction module is similar to the residual reconstruction module in CSRNet, shown in Fig. 3. The output of the residual reconstruction module is the final output of the network.

Given the original image x_i, our goal is to obtain the highly compressed measurement y_i with the compressed sampling module and then accurately recover the original image x_i from it with the reconstruction modules. Since the sampling module, the initial reconstruction module, and the residual reconstruction module form an end-to-end network, they can be trained together, and the compressed measurements need not be handled explicitly during training. Therefore, both the input and the label are the image itself when training our ASRNet. Following most deep learning-based image restoration methods, the mean square error is adopted as the cost function of our network. The optimization objective is represented as

$$L(\{W_{4},W_{5},W_{6}\})\,=\,\frac{1}{N}\sum_{i=1}^{N}\!\|f_{6}(f_{5}(f_{4}(x_{i},W_{4}),W_{5}),W_{6})-x_{i}\|^{2} $$

where {W₄,W₅,W₆} are the network parameters to be trained, f₄ is the sampling mapping, and f₅ and f₆ correspond to the initial reconstruction mapping and the residual reconstruction mapping, respectively. It should be noted that we train the compressed sampling network and the reconstruction network together, but they can be used independently.
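The idea of training the sampling and reconstruction mappings jointly, with the image as both input and label, can be illustrated with a deliberately simplified toy. The sketch below keeps only the two fully connected stages (a linear sampling matrix and a linear initial reconstruction matrix), omits the convolutional residual module f₆ entirely, and uses synthetic low-rank data with plain gradient descent. All names, sizes, and hyperparameters are assumptions, not the paper's training setup.

```python
import numpy as np


def make_toy_blocks(n_samples=200, n=16, rank=4, seed=1):
    """Synthetic 'blocks' lying in a rank-dimensional subspace (illustration only)."""
    rng = np.random.default_rng(seed)
    basis, _ = np.linalg.qr(rng.standard_normal((n, rank)))  # (n, rank), orthonormal columns
    coeffs = rng.standard_normal((n_samples, rank))
    return coeffs @ basis.T                                   # (n_samples, n)


def train_linear_sampler(x, m, steps=1000, lr=0.02, seed=0):
    """Jointly fit a sampling matrix w_s (m x n) and a reconstruction matrix w_r (n x m)."""
    rng = np.random.default_rng(seed)
    n_samples, n = x.shape
    w_s = 0.1 * rng.standard_normal((m, n))
    w_r = 0.1 * rng.standard_normal((n, m))
    losses = []
    for _ in range(steps):
        y = x @ w_s.T                 # measurements, shape (N, m)
        x_hat = y @ w_r.T             # reconstructions, shape (N, n)
        err = x_hat - x               # the label is the input itself
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Gradients of the mean squared-error loss with respect to each matrix.
        grad_r = (2.0 / n_samples) * err.T @ y
        grad_s = (2.0 / n_samples) * (err @ w_r).T @ x
        w_r -= lr * grad_r
        w_s -= lr * grad_s
    return w_s, w_r, losses


blocks = make_toy_blocks()
w_s, w_r, losses = train_linear_sampler(blocks, m=4)
print(losses[0], losses[-1])
```

After training, `w_s` can be applied on its own to produce measurements and `w_r` on its own to reconstruct from them, which mirrors the remark that the sampling and reconstruction networks are trained together but can be used independently.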

3 Results and discussion

In this section, we evaluate the performance of the proposed methods for CS reconstruction. We first introduce the details of our training and testing. Then, we present quantitative and qualitative comparisons with four state-of-the-art methods.

3.1 Training

The dataset used in our training is the set of 91 images used in [14]. Set5 from [14] serves as our validation set. We use only the luminance component of the images. We uniformly extract patches of size 32×32 from these images with a stride of 14 for training and 21 for validation, which yields a training dataset of 22,144 patches and a validation dataset of 1112 patches. Both CSRNet and ASRNet use the same dataset. We train the proposed networks at measurement rates (MR) of 0.25, 0.10, 0.04, and 0.01. The models are trained with Caffe.
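The uniform patch extraction can be sketched as a sliding window over the luminance image. The function name and the handling of image borders (windows that would extend past the edge are dropped) are assumptions; the text specifies only the patch size and the strides.

```python
import numpy as np


def extract_patches(image: np.ndarray, patch_size: int = 32, stride: int = 14) -> np.ndarray:
    """Slide a patch_size x patch_size window over a 2-D image with the given stride.

    Windows that would cross the image border are skipped (an assumption).
    """
    height, width = image.shape
    patches = [
        image[r:r + patch_size, c:c + patch_size]
        for r in range(0, height - patch_size + 1, stride)
        for c in range(0, width - patch_size + 1, stride)
    ]
    if not patches:
        return np.empty((0, patch_size, patch_size), dtype=image.dtype)
    return np.stack(patches)


toy = np.arange(64 * 64).reshape(64, 64)
print(extract_patches(toy, stride=14).shape)   # training stride
print(extract_patches(toy, stride=21).shape)   # validation stride
```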

3.2 Comparison with existing methods

3.2.1 Objective quality comparisons

Our proposed algorithms are compared with four representative CS recovery methods: TVAL3 [8], D-AMP [5], ReconNet [14], and DR²-Net [16]. The first two are traditional optimization-based methods, while the last two are recent network-based methods. For the simulated data in our experiments, we evaluate the proposed methods on the same test images as in [14], which consist of 11 grayscale images. Nine of the images have a size of 256×256 and two are 512×512. We compute the PSNR values for all 11 images, and the results are shown in Table 1. We use BM3D [18] as the denoiser to remove the artifacts caused by patch-wise processing. ASRNet clearly achieves a gain of more than 1 dB over CSRNet. We also add an SSIM comparison between our proposed networks and the network-based methods, ReconNet and DR²-Net, as shown in Table 2. Tables 1 and 2 show that our proposed CSRNet and ASRNet outperform the other algorithms at every measurement rate, especially at 0.04 and 0.01. In addition, the performance of our methods degrades more slowly than that of the other algorithms as the measurement rate decreases.
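For reference, PSNR in dB can be computed as in the sketch below. The peak value of 255 assumes 8-bit grayscale images, and the function name is hypothetical; the text does not state the exact evaluation script used.

```python
import numpy as np


def psnr(reference, reconstructed, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit images."""
    reference = np.asarray(reference, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((reference - reconstructed) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))


reference = np.zeros((4, 4))
print(psnr(reference, reference + 1.0))   # every pixel off by one grey level
```

Since PSNR is logarithmic in the mean squared error, a 1 dB gain corresponds to roughly a 21% reduction in mean squared error.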

Table 1 PSNR values in dB for the test images by different algorithms at MR=0.25, 0.1, 0.04, and 0.01

Table 2 SSIM values for the test images by different algorithms at MR=0.25, 0.1, 0.04, and 0.01

3.2.2 Time complexity

Time complexity is a key factor for image compressive sensing. In the reconstruction process, the network-based algorithms are much faster than traditional iterative reconstruction methods, so we only compare the time complexity with ReconNet and DR²-Net. Table 3 shows the average time these network-based methods take to reconstruct the nine 256×256 images.

Table 3 Time (in seconds) for reconstructing a single 256×256 image

From Tables 1, 2, and 3, we can observe that the proposed CSRNet and ASRNet outperform ReconNet and DR²-Net in terms of PSNR, SSIM, and time complexity. ASRNet obtains the best performance in all objective quality assessments. Notably, ASRNet also runs fastest, which is very important for real-time applications.

3.2.3 Visual quality comparisons

Our proposed algorithms are compared with four representative CS recovery methods, TVAL3, D-AMP, ReconNet, and DR²-Net, in terms of visual quality. Figures 5 and 6 show the visual comparisons for Parrots at a measurement rate of 0.1 with and without BM3D, respectively. The proposed CSRNet and ASRNet reconstruct more detailed and sharper images, offering better visual results than the other network-based algorithms. The other three groups are shown in Figs. 7, 8, 9, 10, 11, and 12. The test images under different rates with or without BM3D are shown in Figs. 13, 14, 15, and 16. We can see that our proposed CSRNet and ASRNet outperform ReconNet and DR²-Net at each measurement rate.

3.3 Evaluation on our proposed architectures

In order to examine the design of our network architectures in more detail, we compare the intermediate outputs with the final outputs of our methods in objective and subjective quality. In addition to the above four MRs, we train CSRNet and ASRNet models at MR=0.2 and 0.15. We calculate the mean PSNR and SSIM values of all 11 test images at each measurement rate, as shown in Table 4. CSRNet-i denotes the intermediate output of the deep reconstruction module in CSRNet, and ASRNet-i denotes the intermediate output of the initial reconstruction module in ASRNet. As Table 4 shows, the final outputs of CSRNet and ASRNet both perform better than their intermediate outputs at each measurement rate. Figures 17 and 18 give the intermediate and final CSRNet results for subjective evaluation on Cameraman, House, Monarch, and Parrots at MR=0.20 and 0.10. Figures 19 and 20 show the intermediate and final ASRNet results for Boats, Cameraman, Peppers, and Monarch at MR=0.04 and 0.10. Compared with the intermediate results, our final results exhibit more natural textures and details, and they are substantially better in terms of image quality. These comparisons support the claim that the proposed cascaded architectures are reasonable and reliable.

Table 4 The mean PSNR/SSIM values of 11 test images without applying BM3D


4 Conclusion

In this paper, two cascaded reconstruction networks are proposed for different CS sampling methods. In most previous works, the sampling matrix in the CS process is a random matrix. The first network is a compatibly sampling reconstruction network (CSRNet), which reconstructs a high-quality image from its compressively sensed measurement sampled by a traditional random matrix. The second network is an adaptively sampling reconstruction network (ASRNet), which pairs an automatic sampling module with a matching residual reconstruction module. The learned sampling module addresses the problem of sampling efficiency in compressive sensing. Experimental results show that the proposed networks, CSRNet and ASRNet, achieve significant improvements in reconstruction results over traditional and neural network-based CS reconstruction algorithms in terms of both quality and time complexity. Furthermore, ASRNet achieves a gain of more than 1 dB over CSRNet.

Abbreviations

ASRNet: Adaptively sampling reconstruction network
CS: Compressive sensing
CSRNet: Compatibly sampling reconstruction network

References

[1] D.L. Donoho, Compressed sensing. IEEE Trans. Inf. Theory. 52(4), 1289–1306 (2006).
[2] L. Gan, Block compressed sensing of natural images. Int. Conf. Digit. Signal Process., 403–406 (2007).
[3] J.A. Tropp, Greed is good: algorithmic results for sparse approximation. IEEE Trans. Inf. Theory. 50(10), 2231–2242 (2004).
[4] D.L. Donoho, A. Maleki, A. Montanari, Message-passing algorithms for compressed sensing. Proc. Natl. Acad. Sci. U. S. A. 106(45), 18914 (2009).
[5] C.A. Metzler, A. Maleki, R.G. Baraniuk, From denoising to compressed sensing. IEEE Trans. Inf. Theory. 62(9), 5117–5144 (2016).
[6] Y. Xiao, J. Yang, Alternating algorithms for total variation image reconstruction from random projections. Inverse Probl. Imaging. 6(3), 547–563 (2017).
[7] M. Elad, M. Aharon, Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Process. 15(12), 3736–3745 (2006).
[8] C. Li, W. Yin, H. Jiang, Y. Zhang, An efficient augmented Lagrangian method with applications to total variation minimization. Comput. Optim. Appl. 56(3), 507–530 (2013).
[9] L. Ma, H. Bai, M. Zhang, Y. Zhao, Edge-based adaptive sampling for image block compressive sensing. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. E99.A(11), 2095–2098 (2016).
[10] H. Bai, M. Zhang, M. Liu, A. Wang, Y. Zhao, Depth image coding using entropy-based adaptive measurement allocation. Entropy. 16(12), 6590–6601 (2014).
[11] C. Dong, C.C. Loy, K. He, X. Tang, Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 38(2), 295–307 (2016).
[12] L. Zhao, J. Liang, H. Bai, A. Wang, Y. Zhao, Simultaneously color-depth super-resolution with conditional generative adversarial network. arXiv preprint arXiv:1708.09105 (2017).
[13] L. Zhao, J. Liang, H. Bai, A. Wang, Y. Zhao, Convolutional neural network-based depth image artifact removal. IEEE International Conference on Image Processing (2017).
[14] K. Kulkarni, S. Lohit, P. Turaga, R. Kerviche, A. Ashok, ReconNet: non-iterative reconstruction of images from compressively sensed random measurements. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016).
[15] A. Mousavi, R.G. Baraniuk, Learning to invert: signal recovery via deep convolutional networks. IEEE International Conference on Acoustics, Speech and Signal Processing (2017).
[16] H. Yao, F. Dai, D. Zhang, Y. Ma, S. Zhang, DR²-Net: deep residual reconstruction network for image compressive sensing. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017).
[17] G. Nie, Y. Fu, Y. Zheng, H. Huang, Image restoration from patch-based compressed sensing measurement. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017).
[18] K. Dabov, A. Foi, V. Katkovnik, K. Egiazarian, Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans. Image Process. 16(8), 2080–2095 (2007).

Acknowledgements

The authors would like to thank the editors and anonymous reviewers for their valuable comments.

Funding

This work was supported in part by Key Innovation Team of Shanxi 1331 Project (KITSX1331) and Fundamental Research Funds for the Central Universities (2018JBZ001).

Author information

Authors and Affiliations

Institute of Information Science, Beijing Jiaotong University, Beijing, 100044, China
Yahan Wang, Huihui Bai, Lijun Zhao & Yao Zhao


Contributions

YW completed most of the text and implementation of the algorithm. HB and LZ directed this work and contributed to the revisions. YZ reviewed and revised the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Huihui Bai.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.


About this article

Cite this article

Wang, Y., Bai, H., Zhao, L. et al. Cascaded reconstruction network for compressive image sensing. J Image Video Proc. 2018, 77 (2018). https://doi.org/10.1186/s13640-018-0315-5


Received: 02 January 2018
Accepted: 06 August 2018
Published: 28 August 2018
DOI: https://doi.org/10.1186/s13640-018-0315-5
