Single image super-resolution by directionally structured coupled dictionary learning
- Junaid Ahmed^{1} and
- Madad Ali Shah^{1}
https://doi.org/10.1186/s13640-016-0141-6
© The Author(s) 2016
Received: 18 April 2016
Accepted: 7 October 2016
Published: 21 October 2016
Abstract
In this paper, a new algorithm based on coupled dictionary learning with a mapping function is proposed for the problem of single-image super-resolution. Dictionaries are designed for a set of clustered data. The data is classified into directional clusters by a correlation criterion: the training data is structured into nine clusters based on the correlation between the data patches and previously developed directional templates. The invariance of the sparse representations is assumed for the task of super-resolution. For each cluster, a pair of high-resolution and low-resolution dictionaries is designed along with their mapping functions. This coupled dictionary learning with a mapping function helps strengthen the invariance of the sparse representation coefficients across resolution levels. During the reconstruction phase, a given low-resolution patch is coded with the set of directional clustered dictionaries, and the cluster that gives the least sparse representation error is selected. The pair of dictionaries and the mapping function of that cluster are then used to approximate the high-resolution patch. The proposed algorithm is compared with earlier work, including the currently top-ranked super-resolution algorithm. With the proposed mechanism, the recovery of directional fine features becomes prominent.
1 Introduction
Super-resolution (SR) has been an active area of research for some years owing to the demand for high-resolution (HR) images in many applications. HR images can, of course, be captured directly with a high-definition (HD) camera. In some applications, however, installing such a camera is not yet practical (e.g., because of the limited capacity of the data channel) or simply not cost-efficient, as in particular settings of computer vision, medical imaging, or satellite imaging.
Recently proposed image representation approaches sometimes use sparse representation models for storage or transmission reasons. According to the so-called Sparseland model [1], a set of signals called the dictionary is created for linearly representing the signals of interest. These dictionaries are designed by selecting image patches from a set of natural images and iteratively minimizing the representation error. Sparsity is used as the regularizing technique (for achieving SR image representations) by enforcing the concept that low-resolution (LR) projections are preserved in linear relations of their HR counterparts [2].
Earlier dictionary learning algorithms for super-resolution focused on learning separate HR and LR dictionaries. In [3], the authors propose a joint dictionary learning mechanism that learns the HR and LR dictionaries in a joint feature space, thus enforcing the similarity between HR and LR sparse coefficients. At the image reconstruction stage, the authors assume the invariance of sparse coefficients for HR and LR patches. In [4], the authors propose a multi-scale dictionary learning approach in which wavelets are used for analysis of the LR images and dictionaries are learned at different resolution levels. By doing so, the authors design compact dictionaries at different resolution levels and achieve a reduced computational cost. In [5], the authors propose multi-scale dictionary learning by introducing local and non-local priors for the task of single-image super-resolution. These priors are used to recover SR images by suppressing artifacts and estimating the required HR image pixels. More recent work, however, classifies the training data based on scale-invariant features and learns class-dependent dictionaries instead of a single universal dictionary or multi-scale dictionaries.
Related work includes Dong et al. [6], who divide the training data into clusters with the k-means algorithm and then apply dictionary learning to each cluster to obtain compact dictionaries. In [7], Feng et al. use k-subspace clustering to divide the data into different subspaces and then learn dictionaries from those subspaces in a shared-bases manner. More recently, Yu et al. [8] consider the design of structural dictionaries: they use orthogonal bases for the dictionary atoms and design structured dictionaries from those orthogonal bases. Yang et al. [9] propose the use of multiple patch-based clustered dictionaries instead of a single universal one. In this mechanism, the authors study the geometric properties of the image patches. Patches are grouped into clusters depending on their geometric properties, and dictionaries are obtained by training on the image patches from these clusters. In [10], the authors design nine LR directional dictionaries for solving the single-image super-resolution (SISR) problem. Here, the LR dictionaries are learned by the K-SVD algorithm [1] and the HR dictionaries are obtained by solving a pseudo-inverse problem. It is important to note that, despite clustering the data with the directional templates, the dictionary learning process in [10] involves no coupling between the HR and LR sparse representation coefficients, even though the SISR problem depends on the invariance of the sparse coefficients. The idea of single dictionary learning with no coupling between the sparse coefficients has already been superseded by [11, 12].
In [12], the authors propose a coupled dictionary learning mechanism for training the HR and LR dictionaries. In this setup, the sparse coefficients of the HR and LR patches are used alternately: in each iteration, one set of sparse coefficients, either HR or LR, is chosen and used to update both the HR and LR dictionaries. In doing so, the authors achieve a slight improvement in forcing the HR and LR sparse coefficients to be the same and thus produce results on par with the state-of-the-art algorithm published in [11].
In this paper, for the task of SISR, the basic idea of Yang et al. [11] is adopted, namely that the HR and LR patches have the same sparse coefficients. Instead of using a single pair of dictionaries as in [11] and [12], multiple directional dictionaries are proposed as in [10]. The training data is divided into eight directional clusters and one non-directional cluster. The training data is clustered by correlating the training patches with previously developed directional templates. These templates have directional structure, and it is shown that they are helpful in creating compact and directional dictionaries.
For each cluster, a pair of directional and compact dictionaries is designed along with their mapping functions. In the image recovery stage, each patch at hand is recovered with each cluster dictionary by calculating the sparse representations and using the already designed HR dictionary and mapping functions. Then, based on the sparse representation error, a proper dictionary pair is selected along with a mapping matrix. Sparse coefficients are calculated for the LR patch using the selected LR dictionary and mapping matrix. Then, HR patches are reconstructed by using the sparse representation along with the corresponding HR dictionary and mapping matrix. This clustering mechanism, along with the mapping function paradigm, allows us to super-resolve patches with high-frequency components. Experimental results show that the proposed algorithm is on par with the existing state-of-the-art algorithms and shows improvement in recovering images with directional fine features.
The rest of the paper is structured as follows. Section 2 presents the super-resolution via sparse representations. Section 3 describes the proposed algorithm. Section 4 reports simulations. Section 5 concludes. Section 5.1 gives future recommendations.
2 Image super-resolution
SISR is an ill-posed problem, and researchers have tried to regularize the solution process. Recently, sparsity has been proposed as a very effective regularizer. Sparse representations have the useful property of being invariant (to some extent) to the scale change caused by resolution blur [11]. Using sparsity as a regularizer, one can therefore recover HR images from LR images by exploiting the scale invariance of the sparse coefficients.
Let \(x_H\) be the HR signal vector obtained by extracting a 2-D patch from an HR image and vectorizing it into column form. Let \(D_H\) be the corresponding HR dictionary, whose columns are the atoms. The signal vector \(x_H\) can be represented sparsely as \(x_H \approx D_H \alpha_H\), where \(\alpha_H\) is the sparse coefficient vector of the HR signal, with only a few non-zero elements.
Let \(x_L\) be the corresponding LR signal vector, extracted in the same manner after blurring and down-sampling the HR images. The sparse representation of this LR signal vector is \(x_L \approx D_L \alpha_L\), where \(D_L\) is the dictionary for the LR signal vector and \(\alpha_L\) is its sparse coefficient vector.
From Eq. 3, it is concluded that \(\alpha_H \approx \alpha_L\).
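The reasoning behind this invariance can be sketched as follows, assuming the usual degradation model in which the LR patch is a blurred and down-sampled version of the HR patch. The operators \(B\) (blur) and \(S\) (down-sampling), and the assumption that the LR dictionary behaves like a degraded copy of the HR dictionary, are notation introduced only for this sketch:

\[
x_L = S B\, x_H \approx S B\, D_H \alpha_H, \qquad x_L \approx D_L \alpha_L .
\]

\[
\text{If } D_L \approx S B\, D_H, \text{ then } D_L \alpha_L \approx D_L \alpha_H, \text{ which is satisfied by } \alpha_L \approx \alpha_H .
\]

The last step relies on the sparse code being essentially unique for a sufficiently sparse representation; otherwise the two coefficient vectors could differ while still reproducing the same LR patch.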
3 The proposed method
3.1 The proposed dictionary learning algorithm
The patches are extracted and assigned to the directional template clusters according to the correlation between a given patch and each template. Decisions are made using thresholds chosen empirically from a histogram of the correlation values. After evaluating results for different patch sizes and numbers of training samples, a threshold value of 0.69 was selected for the best performance of the algorithm. Next, a coupled dictionary learning problem is formulated and solved to obtain the clustered dictionary pairs and their mapping functions.
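The clustering step can be illustrated with a minimal sketch. The templates used by the authors are not reproduced in the text, so `directional_templates` below builds hypothetical oriented ramps as stand-ins. The rule that a patch falls into the non-directional cluster when its best correlation is below the threshold is also an assumption about how the 0.69 threshold is applied.

```python
import numpy as np

def directional_templates(size=6, n_dirs=8):
    """Hypothetical templates: one linear ramp per orientation (not the authors' templates)."""
    ys, xs = np.mgrid[0:size, 0:size] - (size - 1) / 2.0
    angles = np.pi * np.arange(n_dirs) / n_dirs
    return np.array([(xs * np.cos(t) + ys * np.sin(t)).ravel() for t in angles])

def normalized_correlation(a, b):
    """Correlation coefficient of two vectors; 0 if either one is constant."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(a @ b / denom)

def assign_cluster(patch, templates, threshold=0.69):
    """Return the index of the best-matching template, or len(templates)
    (the non-directional cluster) if no correlation reaches the threshold."""
    scores = [abs(normalized_correlation(patch.ravel(), t)) for t in templates]
    best = int(np.argmax(scores))
    return best if scores[best] >= threshold else len(templates)
```

With eight templates, cluster indices 0 to 7 are directional and index 8 is the non-directional cluster, which gives the nine clusters described in the text.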
In the coupled dictionary learning formulation, \(E_{\text{data}}(\cdot,\cdot)\) is the data fidelity term, \(E_{\text{map}}(\cdot,\cdot)\) is the mapping fidelity term, and \(E_{\text{reg}}\) is the regularizer. The sparse coefficients of the HR and LR data over their dictionaries are coupled through the mapping function \(f(\cdot)\). The HR and LR dictionaries are optimized concurrently with the mapping function.
Here, \(\gamma, {\lambda ^{y}_{H}}, {\lambda ^{y}_{m}}\), and \( {\lambda ^{y}_{L}}\) are the regularization parameters, tuned for the best performance, and \({D}^{y}_{H,i}\) and \({D}^{y}_{L,i}\) are the atoms of \({D}^{y}_{H}\) and \({D}^{y}_{L}\), respectively.
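Taken together, the terms described above are consistent with a per-cluster objective of the following form. This is only a sketch in the style of semi-coupled dictionary learning [13]: the training matrices \(X^{y}_{H}\) and \(X^{y}_{L}\), the direction of the mapping (from LR to HR coefficients), and the unit-norm constraint on the atoms are assumptions, not details confirmed by the text.

\[
\min_{D^{y}_{H},\, D^{y}_{L},\, M^{y}} \;
\|X^{y}_{H} - D^{y}_{H}\alpha^{y}_{H}\|_F^2
+ \|X^{y}_{L} - D^{y}_{L}\alpha^{y}_{L}\|_F^2
+ \gamma\,\|\alpha^{y}_{H} - M^{y}\alpha^{y}_{L}\|_F^2
+ \lambda^{y}_{H}\|\alpha^{y}_{H}\|_1
+ \lambda^{y}_{L}\|\alpha^{y}_{L}\|_1
+ \lambda^{y}_{m}\|M^{y}\|_F^2
\]

\[
\text{s.t. } \|D^{y}_{H,i}\|_2 \le 1, \quad \|D^{y}_{L,i}\|_2 \le 1 \quad \forall i .
\]

In this reading, the first two terms play the role of \(E_{\text{data}}\), the third that of \(E_{\text{map}}\), and the remaining three that of \(E_{\text{reg}}\).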
The problem formulated by Eq. (6) can be solved by optimizing one parameter at a time while considering the others as being constant. As the mapping function (matrix) \(M^{y}\) is linear, bi-directional transforms are learned from \({\alpha ^{y}_{H}}\) to \({\alpha ^{y}_{L}}\) and vice versa.
The problem in Eq. 7 can easily be solved by applying an \(\ell_1\)-norm minimization algorithm such as least-angle regression (LARS) [14].
Here, \(I\) represents the identity matrix. By this strategy, a set of directional dictionaries is developed along with their mapping function (matrix). The proposed training algorithm is summarized in Algorithm 1.
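An identity matrix of this kind typically appears in a ridge-regularized least-squares update of a linear mapping. The sketch below shows such an update under that assumption; the function name `update_mapping` and the single `ridge` weight (standing in for the ratio of the mapping regularization weight to the coupling weight) are hypothetical and not taken from the paper.

```python
import numpy as np

def update_mapping(alpha_src, alpha_dst, ridge):
    """Closed-form ridge solution of min_M ||alpha_dst - M alpha_src||_F^2 + ridge ||M||_F^2.

    alpha_src, alpha_dst: (n_atoms, n_patches) sparse coefficient matrices.
    Returns M = alpha_dst alpha_src^T (alpha_src alpha_src^T + ridge I)^{-1}.
    """
    k = alpha_src.shape[0]
    gram = alpha_src @ alpha_src.T + ridge * np.eye(k)
    # gram is symmetric, so solving gram X = alpha_src alpha_dst^T gives X = M^T.
    return np.linalg.solve(gram, alpha_src @ alpha_dst.T).T
```

Calling the function once with (LR, HR) coefficients and once with (HR, LR) coefficients would give the two directions of a bi-directional transform.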
3.2 The proposed image reconstruction algorithm
This corresponds to a model selection scenario: we need to find which dictionary pair among the nine clusters gives the least sparse representation error and hence the best HR patch recovery. This is done by recovering the HR patch from the LR patch at hand using each directional dictionary pair and its mapping function. For patch-based sparse recovery, the sparse coefficients of the LR patch are first calculated with LARS [14] using the LR patch and the LR dictionary. The HR dictionary is then used along with the mapping function to recover the HR patch, assuming the invariance property of the sparse coefficients. The dictionary and mapping pair that gives the least sparse representation error is chosen for the HR patch estimation. Only a very basic approach is presented here, to show the need for and effect of directional clustering. Using all dictionaries for HR patch recovery serves as a perfect model selection (PMS), which can be used as a reference when designing different cluster selection models. In this case, the results show peak signal-to-noise ratio (PSNR) improvements of about 1 dB over the baseline algorithms.
Finally, the approximated HR patch vectors are reshaped into two-dimensional form. Because the patches were extracted with full overlap, the overlap-add method of [11] is employed at the end to obtain the approximate HR image. The reconstruction process is summarized in Algorithm 2.
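The selection loop described above can be sketched as follows. Two simplifications are assumptions of this example: orthogonal matching pursuit (`omp`) is used as a simple stand-in for LARS, and the sparse representation error is measured on the LR patch. The function names and the tuple layout `(D_H, D_L, M)` are hypothetical.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Orthogonal matching pursuit (a stand-in here for LARS)."""
    support, sol = [], np.zeros(0)
    residual = x.astype(float)
    for _ in range(n_nonzero):
        idx = int(np.argmax(np.abs(D.T @ residual)))
        if idx in support:
            break
        support.append(idx)
        # Least-squares fit of x on the atoms selected so far.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
        if np.linalg.norm(residual) < 1e-10:
            break
    alpha = np.zeros(D.shape[1])
    alpha[support] = sol
    return alpha

def reconstruct_patch(x_lr, clusters, n_nonzero=3):
    """Try every cluster (D_H, D_L, M); keep the one with the least LR coding error.

    Returns (cluster index, HR patch estimate D_H @ M @ alpha_L).
    """
    best = None
    for y, (D_H, D_L, M) in enumerate(clusters):
        alpha_L = omp(D_L, x_lr, n_nonzero)
        err = np.linalg.norm(x_lr - D_L @ alpha_L)
        if best is None or err < best[0]:
            best = (err, y, D_H @ (M @ alpha_L))
    return best[1], best[2]
```

Because every cluster is tried for every patch, the cost of this loop grows linearly with the number of clusters.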
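A minimal sketch of recombining fully overlapping patches is given below. It assumes that overlapping pixels are simply averaged, which may differ in detail from the procedure of [11]; the helper names are hypothetical.

```python
import numpy as np

def extract_patches(image, p):
    """All p x p patches with full overlap (stride 1), vectorized in row-major order."""
    H, W = image.shape
    return [image[i:i + p, j:j + p].ravel()
            for i in range(H - p + 1) for j in range(W - p + 1)]

def overlap_add(patches, shape, p):
    """Place each patch back at its position and average the overlapping pixels."""
    H, W = shape
    acc = np.zeros(shape)
    weight = np.zeros(shape)
    it = iter(patches)
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            acc[i:i + p, j:j + p] += next(it).reshape(p, p)
            weight[i:i + p, j:j + p] += 1
    return acc / weight
```

If the patches are passed through unchanged, the averaging returns the original image, which is a convenient sanity check before plugging in reconstructed HR patches.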
4 Results and discussion
The proposed algorithm is compared with the algorithm of Yang et al. [11], the algorithm of Xu et al. [12], and the bicubic technique (Bic.).
Table 1 Scale factor 2: PSNR (dB), SSIM, sharpness error, and contrast error for the bicubic (Bic.) technique, the algorithm of Yang et al. [11], the algorithm of Xu et al. [12], and the proposed algorithm with perfect model selection (PMS) and without it
Images | Metrics (left, right) | Bic. | | [11] | | [12] | | Proposed (PMS) | | Proposed | |
---|---|---|---|---|---|---|---|---|---|---|---|
AnnieYukiTim | PSNR, sharpness | 31.42 | 0.5709 | 32.79 | 0.3824 | 32.76 | 0.3848 | 33.83 | 0.3659 | 32.86 | 0.3789 |
AnnieYukiTim | SSIM, contrast | 0.9064 | 0.3280 | 0.9375 | 0.1879 | 0.9376 | 0.1865 | 0.9338 | 0.2038 | 0.9221 | 0.1961 |
Barbara | PSNR, sharpness | 25.34 | 0.7400 | 25.34 | 0.5531 | 25.81 | 0.5555 | 26.46 | 0.5439 | 25.91 | 0.5438 |
Barbara | SSIM, contrast | 0.7929 | 0.5890 | 0.8356 | 0.4199 | 0.8330 | 0.4207 | 0.8563 | 0.4274 | 0.8360 | 0.4249 |
BooksCIMAT | PSNR, sharpness | 24.89 | 0.6176 | 26.02 | 0.3940 | 27.17 | 0.4075 | 26.62 | 0.3916 | 26.21 | 0.3987 |
BooksCIMAT | SSIM, contrast | 0.8271 | 0.4075 | 0.8826 | 0.2047 | 0.9039 | 0.2176 | 0.8941 | 0.2272 | 0.8829 | 0.2151 |
Butterfly | PSNR, sharpness | 27.45 | 0.4541 | 30.07 | 0.2174 | 29.85 | 0.2182 | 31.77 | 0.1961 | 30.53 | 0.1965 |
Butterfly | SSIM, contrast | 0.8984 | 0.1633 | 0.9445 | 0.0145 | 0.9430 | 0.0101 | 0.9508 | 0.0569 | 0.9460 | 0.0356 |
Fence | PSNR, sharpness | 25.04 | 0.6432 | 26.32 | 0.4563 | 27.85 | 0.4693 | 27.31 | 0.4414 | 26.36 | 0.4483 |
Fence | SSIM, contrast | 0.7448 | 0.5281 | 0.8158 | 0.3271 | 0.8496 | 0.3395 | 0.8332 | 0.3379 | 0.8210 | 0.3289 |
ForbiddenCity | PSNR, sharpness | 24.06 | 0.7143 | 24.66 | 0.5559 | 25.90 | 0.5718 | 25.58 | 0.5465 | 24.75 | 0.5569 |
ForbiddenCity | SSIM, contrast | 0.6767 | 0.5759 | 0.7549 | 0.3974 | 0.7925 | 0.4078 | 0.7930 | 0.4062 | 0.7899 | 0.4062 |
HowMany | PSNR, sharpness | 27.98 | 0.5739 | 29.19 | 0.3735 | 29.16 | 0.3750 | 30.04 | 0.3586 | 29.17 | 0.3599 |
HowMany | SSIM, contrast | 0.8686 | 0.3038 | 0.9126 | 0.1571 | 0.9120 | 0.1545 | 0.9000 | 0.1708 | 0.8843 | 0.1637 |
Kodak-05 | PSNR, sharpness | 23.97 | 0.7055 | 24.78 | 0.5372 | 25.89 | 0.5483 | 25.60 | 0.5256 | 25.56 | 0.5256 |
Kodak-05 | SSIM, contrast | 0.7235 | 0.5015 | 0.7898 | 0.3365 | 0.8239 | 0.3444 | 0.8154 | 0.3510 | 0.8101 | 0.3449 |
Kodak-08 | PSNR, sharpness | 22.12 | 0.7182 | 22.81 | 0.5638 | 23.48 | 0.5723 | 23.83 | 0.5461 | 22.90 | 0.5539 |
Kodak-08 | SSIM, contrast | 0.6995 | 0.5183 | 0.7672 | 0.3569 | 0.7950 | 0.3592 | 0.8027 | 0.3633 | 0.7570 | 0.3602 |
Michoacan | PSNR, sharpness | 22.19 | 0.6614 | 23.85 | 0.4851 | 24.96 | 0.4992 | 24.76 | 0.4702 | 24.32 | 0.4729 |
Michoacan | SSIM, contrast | 0.7877 | 0.4506 | 0.8388 | 0.2879 | 0.8627 | 0.2979 | 0.8508 | 0.3000 | 0.8346 | 0.2943 |
MissionBay | PSNR, sharpness | 26.67 | 0.6174 | 27.90 | 0.4294 | 27.86 | 0.4295 | 29.27 | 0.4003 | 28.89 | 0.4092 |
MissionBay | SSIM, contrast | 0.8459 | 0.3475 | 0.8883 | 0.1928 | 0.8868 | 0.1899 | 0.9040 | 0.2129 | 0.9010 | 0.2047 |
NuRegions | PSNR, sharpness | 19.81 | 0.5326 | 21.30 | 0.3413 | 22.09 | 0.3536 | 22.14 | 0.3255 | 21.35 | 0.3266 |
NuRegions | SSIM, contrast | 0.8469 | 0.2749 | 0.9047 | 0.0780 | 0.9182 | 0.0875 | 0.9207 | 0.0898 | 0.9046 | 0.0848 |
Peppers | PSNR, sharpness | 29.95 | 0.5459 | 31.22 | 0.3545 | 31.94 | 0.3781 | 31.91 | 0.3253 | 31.56 | 0.3261 |
Peppers | SSIM, contrast | 0.9045 | 0.2395 | 0.9422 | 0.1113 | 0.9583 | 0.1270 | 0.9608 | 0.1382 | 0.9596 | 0.1248 |
Rocio | PSNR, sharpness | 36.63 | 0.4019 | 39.19 | 0.1816 | 39.01 | 0.1821 | 40.38 | 0.1516 | 38.98 | 0.1674 |
Rocio | SSIM, contrast | 0.9612 | 0.1640 | 0.9778 | 0.0515 | 0.9773 | 0.0455 | 0.9745 | 0.0660 | 0.9799 | 0.0530 |
Starfish | PSNR, sharpness | 30.22 | 0.5101 | 32.12 | 0.2880 | 32.04 | 0.2895 | 33.09 | 0.2765 | 32.89 | 0.2846 |
Starfish | SSIM, contrast | 0.8923 | 0.2779 | 0.9358 | 0.1239 | 0.9354 | 0.1208 | 0.9365 | 0.1380 | 0.9332 | 0.1331 |
Yan | PSNR, sharpness | 26.96 | 0.6346 | 28.01 | 0.4579 | 27.94 | 0.4590 | 29.16 | 0.4346 | 28.15 | 0.4411 |
Yan | SSIM, contrast | 0.8276 | 0.3878 | 0.8743 | 0.2268 | 0.8729 | 0.2227 | 0.8795 | 0.2447 | 0.8898 | 0.2357 |
Average | PSNR, sharpness | 26.54 | 0.6026 | 27.85 | 0.4107 | 28.36 | 0.4184 | 28.86 | 0.3937 | 28.15 | 0.3994 |
Average | SSIM, contrast | 0.8253 | 0.3786 | 0.8752 | 0.2171 | 0.8876 | 0.2207 | 0.8879 | 0.2334 | 0.8783 | 0.2254 |
Table 2 Scale factors 3 and 4: PSNR (dB), SSIM, sharpness error, and contrast error for the bicubic (Bic.) technique, the algorithm of Yang et al. [11], the algorithm of Xu et al. [12], and the proposed algorithm (PMS)
Images | Scale | Metrics (left, right) | Bic. | | [11] | | [12] | | Proposed (PMS) | |
---|---|---|---|---|---|---|---|---|---|---|
AnnieYukiTim | 3 | PSNR, SSIM | 26.04 | 0.7870 | 30.66 | 0.8957 | 30.64 | 0.8950 | 31.3 | 0.8990 |
AnnieYukiTim | 3 | sharpness, contrast | 0.6800 | 0.3932 | 0.5118 | 0.2493 | 0.5118 | 0.2463 | 0.4868 | 0.2533 |
AnnieYukiTim | 4 | PSNR, SSIM | 26.94 | 0.7600 | 27.62 | 0.8174 | 27.62 | 0.8169 | 28.10 | 0.8211 |
AnnieYukiTim | 4 | sharpness, contrast | 0.8059 | 0.5246 | 0.6938 | 0.3900 | 0.6935 | 0.3891 | 0.6719 | 0.3904 |
Barbara | 3 | PSNR, SSIM | 24.52 | 0.7111 | 25.24 | 0.7490 | 24.60 | 0.7254 | 25.62 | 0.7727 |
Barbara | 3 | sharpness, contrast | 0.8606 | 0.6644 | 0.7807 | 0.5842 | 0.7791 | 0.5820 | 0.7629 | 0.5834 |
Barbara | 4 | PSNR, SSIM | 23.60 | 0.6570 | 23.82 | 0.6744 | 23.82 | 0.6743 | 24.15 | 0.7018 |
Barbara | 4 | sharpness, contrast | 0.9102 | 0.7360 | 0.8631 | 0.6765 | 0.8628 | 0.6756 | 0.8429 | 0.6732 |
BooksCIMAT | 3 | PSNR, SSIM | 22.20 | 0.6360 | 24.72 | 0.8102 | 24.69 | 0.8087 | 24.77 | 0.8210 |
BooksCIMAT | 3 | sharpness, contrast | 0.7883 | 0.5292 | 0.6059 | 0.3423 | 0.6078 | 0.3417 | 0.5867 | 0.3353 |
BooksCIMAT | 4 | PSNR, SSIM | 21.62 | 0.5863 | 22.94 | 0.7007 | 22.94 | 0.7007 | 23.30 | 0.7104 |
BooksCIMAT | 4 | sharpness, contrast | 0.8585 | 0.5998 | 0.7459 | 0.4505 | 0.7462 | 0.4502 | 0.7310 | 0.4517 |
Butterfly | 3 | PSNR, SSIM | 20.99 | 0.7329 | 25.89 | 0.8627 | 25.83 | 0.8600 | 26.65 | 0.8652 |
Butterfly | 3 | sharpness, contrast | 0.6085 | 0.2283 | 0.4148 | 0.0736 | 0.4119 | 0.0679 | 0.3824 | 0.0876 |
Butterfly | 4 | PSNR, SSIM | 22.14 | 0.6990 | 23.00 | 0.7564 | 23.00 | 0.7550 | 23.75 | 0.7655 |
Butterfly | 4 | sharpness, contrast | 0.7362 | 0.3279 | 0.5921 | 0.1543 | 0.5874 | 0.1520 | 0.5567 | 0.1616 |
Fence | 3 | PSNR, SSIM | 22.20 | 0.5741 | 22.70 | 0.6494 | 22.69 | 0.6488 | 23.84 | 0.6894 |
Fence | 3 | sharpness, contrast | 0.8202 | 0.6847 | 0.6854 | 0.5192 | 0.6793 | 0.5234 | 0.6740 | 0.5378 |
Fence | 4 | PSNR, SSIM | 21.49 | 0.4885 | 21.87 | 0.5547 | 21.89 | 0.5545 | 22.13 | 0.5692 |
Fence | 4 | sharpness, contrast | 0.9082 | 0.7965 | 0.8420 | 0.7037 | 0.8425 | 0.7033 | 0.8243 | 0.7014 |
ForbiddenCity | 3 | PSNR, SSIM | 22.38 | 0.4415 | 24.02 | 0.6123 | 24.06 | 0.6140 | 24.20 | 0.6294 |
ForbiddenCity | 3 | sharpness, contrast | 0.8500 | 0.7079 | 0.7470 | 0.5722 | 0.7482 | 0.5717 | 0.7185 | 0.5572 |
ForbiddenCity | 4 | PSNR, SSIM | 22.06 | 0.3498 | 22.73 | 0.4615 | 22.74 | 0.4625 | 23.23 | 0.4976 |
ForbiddenCity | 4 | sharpness, contrast | 0.9151 | 0.7982 | 0.8411 | 0.6791 | 0.8426 | 0.6794 | 0.8204 | 0.6733 |
HowMany | 3 | PSNR, SSIM | 23.56 | 0.6774 | 27.15 | 0.8396 | 27.12 | 0.8384 | 27.71 | 0.8432 |
HowMany | 3 | sharpness, contrast | 0.6947 | 0.3788 | 0.5326 | 0.2429 | 0.5338 | 0.2413 | 0.5068 | 0.2460 |
HowMany | 4 | PSNR, SSIM | 22.96 | 0.6048 | 24.85 | 0.7303 | 24.85 | 0.7295 | 25.30 | 0.7995 |
HowMany | 4 | sharpness, contrast | 0.7996 | 0.4872 | 0.6855 | 0.3499 | 0.6842 | 0.3493 | 0.6611 | 0.3505 |
Kodak-05 | 3 | PSNR, SSIM | 23.05 | 0.6121 | 24.28 | 0.7147 | 24.25 | 0.7136 | 24.81 | 0.7381 |
Kodak-05 | 3 | sharpness, contrast | 0.8003 | 0.5698 | 0.6708 | 0.4227 | 0.6716 | 0.4212 | 0.6503 | 0.4251 |
Kodak-05 | 4 | PSNR, SSIM | 20.80 | 0.4341 | 22.60 | 0.5945 | 22.58 | 0.5936 | 23.95 | 0.6320 |
Kodak-05 | 4 | sharpness, contrast | 0.8862 | 0.6848 | 0.8120 | 0.5702 | 0.8114 | 0.5691 | 0.7886 | 0.5623 |
Kodak-08 | 3 | PSNR, SSIM | 20.72 | 0.5695 | 21.84 | 0.6723 | 21.84 | 0.6724 | 22.50 | 0.7070 |
Kodak-08 | 3 | sharpness, contrast | 0.8180 | 0.5988 | 0.7058 | 0.4691 | 0.7065 | 0.4673 | 0.6818 | 0.4688 |
Kodak-08 | 4 | PSNR, SSIM | 18.70 | 0.3914 | 20.61 | 0.5493 | 20.60 | 0.5497 | 21.71 | 0.6603 |
Kodak-08 | 4 | sharpness, contrast | 0.8971 | 0.7025 | 0.8349 | 0.6098 | 0.8357 | 0.6080 | 0.8072 | 0.6035 |
Michoacan | 3 | PSNR, SSIM | 21.22 | 0.6522 | 22.64 | 0.7446 | 22.61 | 0.7429 | 23.23 | 0.7605 |
Michoacan | 3 | sharpness, contrast | 0.7766 | 0.5345 | 0.6365 | 0.3887 | 0.6357 | 0.3870 | 0.6112 | 0.3900 |
Michoacan | 4 | PSNR, SSIM | 18.76 | 0.4721 | 20.84 | 0.6253 | 20.81 | 0.6235 | 21.10 | 0.6758 |
Michoacan | 4 | sharpness, contrast | 0.8715 | 0.6503 | 0.7878 | 0.5359 | 0.7886 | 0.5346 | 0.7630 | 0.5290 |
MissionBay | 3 | PSNR, SSIM | 24.13 | 0.7399 | 25.52 | 0.8019 | 25.50 | 0.8021 | 26.35 | 0.8167 |
MissionBay | 3 | sharpness, contrast | 0.7566 | 0.4472 | 0.6129 | 0.3053 | 0.6139 | 0.3056 | 0.5816 | 0.3122 |
MissionBay | 4 | PSNR, SSIM | 21.77 | 0.6162 | 24.05 | 0.7309 | 24.06 | 0.7305 | 24.93 | 0.8129 |
MissionBay | 4 | sharpness, contrast | 0.8149 | 0.5062 | 0.6985 | 0.3702 | 0.6980 | 0.3704 | 0.6697 | 0.3707 |
NuRegions | 3 | PSNR, SSIM | 15.08 | 0.5079 | 18.22 | 0.7765 | 18.24 | 0.7765 | 18.84 | 0.7840 |
NuRegions | 3 | sharpness, contrast | 0.7198 | 0.4805 | 0.5369 | 0.2410 | 0.5419 | 0.2402 | 0.5176 | 0.2395 |
NuRegions | 4 | PSNR, SSIM | 13.39 | 0.2408 | 15.70 | 0.5647 | 15.72 | 0.5657 | 16.21 | 0.6836 |
NuRegions | 4 | sharpness, contrast | 0.8503 | 0.6714 | 0.7069 | 0.4420 | 0.7091 | 0.4416 | 0.6907 | 0.4369 |
Peppers | 3 | PSNR, SSIM | 24.00 | 0.7678 | 28.58 | 0.9013 | 28.52 | 0.8993 | 28.99 | 0.9810 |
Peppers | 3 | sharpness, contrast | 0.6997 | 0.3360 | 0.5512 | 0.2168 | 0.5502 | 0.2153 | 0.5100 | 0.2160 |
Peppers | 4 | PSNR, SSIM | 25.34 | 0.7786 | 26.63 | 0.8377 | 26.56 | 0.8355 | 27.30 | 0.9130 |
Peppers | 4 | sharpness, contrast | 0.7447 | 0.3731 | 0.6059 | 0.2406 | 0.6061 | 0.2382 | 0.5710 | 0.2425 |
Rocio | 3 | PSNR, SSIM | 30.27 | 0.8704 | 33.77 | 0.9376 | 33.70 | 0.9327 | 34.42 | 0.9390 |
Rocio | 3 | sharpness, contrast | 0.5753 | 0.2492 | 0.3732 | 0.1271 | 0.3739 | 0.1237 | 0.3374 | 0.1335 |
Rocio | 4 | PSNR, SSIM | 29.27 | 0.8407 | 31.22 | 0.8926 | 31.22 | 0.8917 | 32.77 | 0.9010 |
Rocio | 4 | sharpness, contrast | 0.6694 | 0.3241 | 0.5114 | 0.1975 | 0.5110 | 0.1964 | 0.4725 | 0.1991 |
Starfish | 3 | PSNR, SSIM | 24.38 | 0.7056 | 28.67 | 0.8643 | 28.61 | 0.8625 | 29.24 | 0.8720 |
Starfish | 3 | sharpness, contrast | 0.6675 | 0.3748 | 0.4815 | 0.2229 | 0.4835 | 0.2220 | 0.4580 | 0.2238 |
Starfish | 4 | PSNR, SSIM | 25.18 | 0.6960 | 25.79 | 0.7595 | 25.78 | 0.7584 | 26.34 | 0.8020 |
Starfish | 4 | sharpness, contrast | 0.7960 | 0.5033 | 0.6746 | 0.3647 | 0.6734 | 0.3634 | 0.6517 | 0.3616 |
Yan | 3 | PSNR, SSIM | 24.27 | 0.6847 | 26.42 | 0.8061 | 26.41 | 0.8057 | 27.30 | 0.8121 |
Yan | 3 | sharpness, contrast | 0.7470 | 0.4719 | 0.5922 | 0.3134 | 0.5933 | 0.3123 | 0.5640 | 0.3188 |
Yan | 4 | PSNR, SSIM | 22.74 | 0.6047 | 24.12 | 0.6907 | 24.14 | 0.6910 | 25.71 | 0.7019 |
Yan | 4 | sharpness, contrast | 0.8496 | 0.5864 | 0.7534 | 0.4532 | 0.7527 | 0.4529 | 0.7246 | 0.4526 |
Average | 3 | PSNR, SSIM | 23.06 | 0.6669 | 25.65 | 0.7899 | 25.58 | 0.7874 | 26.24 | 0.8081 |
Average | 3 | sharpness, contrast | 0.7414 | 0.4781 | 0.5900 | 0.3307 | 0.5902 | 0.3293 | 0.5644 | 0.3330 |
Average | 4 | PSNR, SSIM | 22.30 | 0.5763 | 23.65 | 0.6838 | 23.65 | 0.6833 | 24.37 | 0.7280 |
Average | 4 | sharpness, contrast | 0.8321 | 0.5795 | 0.7281 | 0.4493 | 0.7278 | 0.4483 | 0.7030 | 0.4475 |
4.1 Quantitative experimentation
LR images are reconstructed by the three algorithms and bicubic technique to their original sizes. The PSNR and SSIM as given in [11] and [17] are used along with sharpness and contrast measures used by Liu et al. [18] for the quantitative performance evaluation.
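For reference, the PSNR used in such comparisons can be computed as in the sketch below. The default peak value of 255 assumes 8-bit images, which is an assumption of this example; the text does not state the pixel range.

```python
import numpy as np

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and an estimate."""
    diff = np.asarray(reference, dtype=float) - np.asarray(estimate, dtype=float)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))
```

A difference of 1 dB in PSNR corresponds to a reduction of the mean squared error by a factor of about 1.26.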
The SSIM [17] is used as a perceptual quality metric, which is more compatible with human image quality perception than the PSNR measure. The sharpness and contrast measures, as introduced by Liu et al. [18], are at first calculated as s(i,j) and c(i,j), respectively, for each pixel position (i,j) and then averaged for the whole image.
where c(i,j) is the contrast value of image I at (i,j).
The sharpness and contrast measures are used to compare the reconstructed images with the original images. The tables report relative errors (i.e., the absolute difference in contrast or sharpness from the original value, divided by the original value). Smaller values indicate less deviation from the true contrast and sharpness.
Tables 1 and 2 indicate that images reconstructed by the proposed algorithm deviate less from the original in terms of sharpness. This corresponds to the observation that the proposed algorithm recovers high-frequency components better than the other algorithms. However, the deviation from the original contrast value is slightly larger than for the other algorithms.
Tables 1 and 2 list the comparison results for the proposed algorithm, the spatial-domain state-of-the-art algorithm of [11], and the bicubic technique. The proposed algorithm produces better results than that of Yang et al. [11] owing to the directional clustered dictionary learning. It gives an average PSNR gain of 1.01, 0.59, and 0.72 dB for scale parameters 2, 3, and 4, respectively, over the state-of-the-art algorithm of Yang et al. [11], with SSIM improvements of 0.0127, 0.0182, and 0.0442, when tested on the data sets of [15, 16] and some other benchmark images. The improvements over the coupled K-SVD algorithm of Xu et al. [12] are 0.5, 0.66, and 0.72 dB in terms of PSNR and 0.0002, 0.0207, and 0.0447 in terms of SSIM for scale parameters 2, 3, and 4, respectively. The improvements over the bicubic technique on this set of test images are 2.32, 3.18, and 2.07 dB in terms of PSNR and 0.0623, 0.1412, and 0.1017 in terms of SSIM for scale parameters 2, 3, and 4, respectively. This supports the claim that directional clustered dictionaries better recover some of the high-frequency components of the LR image.
From Table 1, one can see that the average PSNR and SSIM results of the proposed algorithm are lower than those of the algorithm of [12] for scale parameter 2. This is because the algorithm of [12] uses a coupled K-SVD approach for the dictionary update stage and, after recovering the HR patches, applies a geometric mean algorithm to obtain the HR image estimate, which serves as additional post-processing. However, the proposed (PMS) variant clearly outperforms the compared algorithms for all scale parameters.
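The error computation described here can be sketched as follows. The per-pixel measures themselves are defined in [18] and are not reproduced, so the function takes already computed per-pixel maps as input; the function names are hypothetical.

```python
import numpy as np

def image_level_measure(per_pixel_map):
    """Average a per-pixel sharpness or contrast map over the whole image."""
    return float(np.mean(per_pixel_map))

def relative_error(reconstructed_map, original_map):
    """Absolute difference of the image-level measures, divided by the original value."""
    original = image_level_measure(original_map)
    return abs(image_level_measure(reconstructed_map) - original) / abs(original)
```

For example, an image-level sharpness of 1.5 for a reconstruction against 2.0 for the original gives a relative error of 0.25.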
Images | Metric | Bic. | [11] | [12] | Proposed |
---|---|---|---|---|---|
Baboon | PSNR | 24.66 | 25.28 | 25.30 | 26.05 |
Baboon | SSIM | 0.6359 | 0.7594 | 0.7602 | 0.7746 |
Boat | PSNR | 32.35 | 33.71 | 33.66 | 35.03 |
Boat | SSIM | 0.8989 | 0.9292 | 0.9291 | 0.9375 |
Bridge | PSNR | 26.49 | 27.46 | 27.46 | 28.36 |
Bridge | SSIM | 0.7922 | 0.8445 | 0.8446 | 0.8725 |
Cameraman | PSNR | 26.32 | 27.63 | 27.61 | 28.73 |
Cameraman | SSIM | 0.8629 | 0.8918 | 0.8912 | 0.9132 |
Coala | PSNR | 33.40 | 36.26 | 36.26 | 37.83 |
Coala | SSIM | 0.8958 | 0.9513 | 0.9513 | 0.9697 |
Coastguard | PSNR | 29.13 | 30.47 | 30.47 | 31.61 |
Coastguard | SSIM | 0.7725 | 0.8495 | 0.8501 | 0.8731 |
Comic | PSNR | 26.05 | 28.33 | 28.28 | 28.78 |
Comic | SSIM | 0.8419 | 0.9105 | 0.9092 | 0.9255 |
Elaine | PSNR | 31.04 | 31.31 | 31.32 | 31.83 |
Elaine | SSIM | 0.6531 | 0.7123 | 0.7131 | 0.7214 |
Face | PSNR | 34.74 | 36.53 | 36.53 | 36.90 |
Face | SSIM | 0.8041 | 0.9095 | 0.9097 | 0.9432 |
Fingerprint | PSNR | 31.92 | 34.43 | 34.43 | 35.14 |
Fingerprint | SSIM | 0.9513 | 0.9729 | 0.9730 | 0.9765 |
Flowers | PSNR | 30.41 | 32.19 | 32.10 | 33.47 |
Flowers | SSIM | 0.8828 | 0.9270 | 0.9264 | 0.9496 |
Foreman | PSNR | 35.35 | 37.39 | 37.20 | 38.76 |
Foreman | SSIM | 0.8928 | 0.9594 | 0.9586 | 0.9686 |
House | PSNR | 32.76 | 34.25 | 34.13 | 35.72 |
House | SSIM | 0.8928 | 0.9099 | 0.9092 | 0.9279 |
Lena | PSNR | 34.71 | 36.21 | 36.18 | 37.14 |
Lena | SSIM | 0.8507 | 0.9259 | 0.9260 | 0.9737 |
Man | PSNR | 29.25 | 30.38 | 30.33 | 31.40 |
Man | SSIM | 0.8314 | 0.8782 | 0.8779 | 0.8905 |
Parrot | PSNR | 26.91 | 28.57 | 28.63 | 29.75 |
Parrot | SSIM | 0.8931 | 0.9185 | 0.9186 | 0.9340 |
Average | PSNR | 30.34 | 31.90 | 31.87 | 32.91 |
Average | SSIM | 0.8345 | 0.8906 | 0.8905 | 0.9095 |
It is noted here that the computational cost of the proposed algorithm is nine times that of the algorithms of [11] and [12]. It is well known that the most expensive stage in the dictionary learning process is the sparse representation stage, which is a vector selection process. Using each directional dictionary along with its mapping to recover the HR patch increases the computational cost, given that the proposed algorithm uses the same number of dictionary atoms and the same patch size. However, in some applications, one can accept a higher number of computations provided that the improvement in quality is considerable.
We also tested other dictionary model selection approaches that can reduce the computational cost. One approach used during the testing phase selects the cluster solely by the correlation of the LR patch at hand with each directional template and then uses the dictionary pair of that cluster for HR patch reconstruction. With this very simple approach, on the same test images and for scale parameter 2, the average PSNR improvement over the algorithm of [11] is 0.3 dB and the SSIM improvement is 0.0031. These results are given in the last column of Table 1. In this case, the computational cost is the same as that of the baseline algorithms, the only extra cost being the correlation computation for the cluster decision. In the same way, one can use different probabilistic models for deciding which cluster to use during the reconstruction phase, given that the clustering is carried out by correlation with designed templates. One can also exploit hidden Markov trees (HMT) between the HR and LR training data and develop suitable models.
4.2 Qualitative experimentation
5 Conclusions
Directional clustering with coupled dictionary learning is proposed for the problem of SISR. Nine pairs of directional dictionaries are designed. The proposed algorithm uses a patch size of 6×6 with 216 dictionary atoms to keep the computational cost manageable. The proposed algorithm outperforms the spatial-domain baseline algorithm of Yang et al. [11]. It also performs well when compared with the algorithm of Xu et al. [12], owing to clustering and coupled dictionary learning with mapping functions.
From the results, it can be seen that the proposed idea of clustering-based coupled dictionary learning and mapping functions can produce better results when compared with the state-of-the-art algorithms.
For scale parameter 2, the proposed algorithm gives a 2.32 dB improvement over bicubic interpolation, as tested over the set of benchmark images. It provides a 1.01 dB improvement over the baseline algorithm of Yang et al. [11] and a 0.5 dB improvement over the algorithm of Xu et al. [12], as tested over the image data sets [15, 16]. The visual results also confirm these quantitative results.
5.1 Future recommendations
As a possible extension of this work, the model selection from LR to HR in the dictionary design process could be carried out by learning hidden Markov models [19]. Moreover, to generate the LR images, the blur filter is assumed to be the bicubic filter. This work can be extended to include and compare accurate camera blur models as in [20].
Declarations
Authors’ contributions
Both the authors have contributed equally to the text, while JA has implemented the algorithms and performed most of the tests. Both authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- M Elad, M Aharon, Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Process. 15, 3736–3745 (2006).
- DL Donoho, Compressed sensing. IEEE Trans. Inf. Theory. 52, 1289–1306 (2006).
- J Yang, J Wright, T Huang, Y Ma, Image super-resolution via sparse representation. IEEE Trans. Image Process. 19(11), 2861–2873 (2010).
- B Ophir, M Lustig, M Elad, Multiscale dictionary learning using wavelets. IEEE J. Sel. Topics in Signal Process. 5(5), 1014–1024 (2011).
- K Zhang, X Gao, D Tao, X Li, in Proceedings of International Conference on Computer Vision and Pattern Recognition (CVPR’12). Multi-scale dictionary for single image super-resolution (IEEE, Providence, RI, USA, 2012), pp. 1114–1121.
- W Dong, L Zhang, G Shi, X Wu, Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization. IEEE Trans. Image Process. 20, 1838–1857 (2011).
- J Feng, L Song, X Yang, W Zhang, in Proceedings of the IEEE International Conference on Image Processing (ICIP’11). Learning dictionaries via subspace segmentation for sparse representation (IEEE, Brussels, Belgium, 2011), pp. 1245–1248.
- G Yu, G Sapiro, S Mallat, in Proceedings of IEEE International Conference on Image Processing (ICIP’10). Image modelling and enhancement via structured sparse model selection (IEEE, Hong Kong, 2010), pp. 1641–1644.
- S Yang, M Wang, Y Chen, Y Sun, Single image super-resolution reconstruction via learned geometric dictionaries and clustered sparse coding. IEEE Trans. Image Process. 21, 4016–4028 (2012).
- F Farhadifard, E Abar, M Nazzal, H Ozkaramanli, in Proceedings IEEE Signal Processing Communication Applications Conference (SIU’2014). Single image super-resolution based on sparse representation via directionally structured dictionaries (IEEE, Trabzon, Turkey, 2014), pp. 1718–1721.
- J Yang, Z Wang, Z Lin, S Cohen, T Huang, Coupled dictionary training for image super-resolution. IEEE Trans. Image Process. 21, 3467–3478 (2012).
- J Xu, C Qi, Z Chang, in Proceedings of IEEE International Conference on Image Processing (ICIP’14). Coupled K-SVD dictionary training for super-resolution (IEEE, Paris, France, 2014), pp. 3910–3914.
- S Wang, L Zhang, Y Liang, Q Pan, in Proceedings of International Conference on Computer Vision and Pattern Recognition (CVPR’12). Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis (IEEE, Providence, RI, USA, 2012), pp. 2216–2223.
- B Efron, T Hastie, I Johnstone, R Tibshirani, Least angle regression. Ann. Stat. 32, 407–499 (2004).
- R Franzen, Kodak lossless true color image suite (2014). online: r0k.us/graphics/kodak/index.html. accessed 20 January 2016.
- R Klette, Concise computer vision (Springer, London, 2014). Single images. online: ccv.wordpress.fos.auckland.ac.nz/data/single-images/. accessed 20 Jan 2016.
- Z Wang, AC Bovik, HR Sheikh, EP Simoncelli, Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004).
- D Liu, R Klette, in Proceedings of International Conference on Image Vision Computing New Zealand (IVCNZ’15). Sharpness and contrast measures on videos (IEEE, Auckland, New Zealand, 2015). IEEE online.
- RK Lama, MR Choi, GR Kwon, Image interpolation for high-resolution display based on the complex dual tree wavelet transform and hidden markov. Multimedia Tools Appl. online, 1–12 (2016).
- N Efrat, D Glasner, A Apartsin, B Nadler, A Levin, in Proceedings of IEEE International Conference on Computer Vision (ICCV’13). Accurate blur models vs. image priors in single image super-resolution (IEEE, Sydney, Australia, 2013), pp. 2832–2839.