Reversible data hiding using least square predictor via the LASSO
© The Author(s). 2016
Received: 1 June 2016
Accepted: 10 November 2016
Published: 7 December 2016
Reversible watermarking is a form of digital watermarking that allows the original image to be recovered exactly after the hidden message has been extracted. Many algorithms aim for lower image distortion at higher embedding capacity. In reversible data hiding, the role of efficient predictors is crucial. Recently, adaptive predictors using the least square approach have been proposed to overcome the limitations of fixed predictors. This paper proposes a novel reversible data hiding algorithm using a least square predictor via the least absolute shrinkage and selection operator (LASSO). This predictor is dynamic in nature rather than fixed. Experimental results show that the proposed method outperforms previous methods, including some algorithms based on least square predictors.
A reversible data hiding technique embeds data into a host signal such as text, image, audio, or video in such a way that the original signal can be recovered and the hidden data extracted. It can be used in applications such as military or medical image processing, which require the integrity of the original image.
Difference expansion, invented by Tian, is a fundamental technique for reversible data hiding that expands the difference value of a pair of pixels to hide one bit per pair. Alattar proposed an embedding method that uses the difference values among a triplet of pixels to hide two bits per triplet. In addition, he showed that three bits can be hidden in a quad.
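As a minimal sketch of the pair-based difference expansion described above, the following Python code embeds one bit into a pixel pair and recovers both the bit and the original pair. The function names are illustrative, and overflow and underflow handling is deliberately omitted.

```python
def de_embed(x, y, bit):
    """Embed one bit into the pixel pair (x, y) by expanding their difference."""
    avg = (x + y) // 2           # integer average, preserved by the embedding
    diff = x - y                 # difference of the pair
    expanded = 2 * diff + bit    # expanded difference carrying the bit
    return avg + (expanded + 1) // 2, avg - expanded // 2


def de_extract(x_marked, y_marked):
    """Recover the original pair and the hidden bit from a marked pair."""
    avg = (x_marked + y_marked) // 2
    expanded = x_marked - y_marked
    bit = expanded % 2           # least significant bit of the expanded difference
    diff = expanded // 2         # floor division undoes 2 * diff + bit
    return (avg + (diff + 1) // 2, avg - diff // 2), bit


marked = de_embed(206, 201, 1)
restored, bit = de_extract(*marked)
print(marked, restored, bit)
```

The integer average of the pair is unchanged by the embedding, which is what allows the decoder to invert the expansion without side information for that pair.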
After that, prediction error expansion (PEE) was proposed by Thodi and Rodriguez as a generalized form of difference expansion. The prediction error, i.e., the difference between the original pixel and the predicted pixel, is expanded for reversible data hiding. The distribution of the prediction errors is sharper and narrower than that of simple pixel differences, which suits reversible data hiding better because small distortion with large embedding capacity is the desired property. Thodi and Rodriguez also used as a predictor the median edge detector (MED), which was introduced for the lossless image compression standard JPEG-LS.
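The MED predictor and the basic expansion of a prediction error can be sketched as follows. This is an illustrative sketch only: the neighbor naming (a = left, b = above, c = upper-left) follows the usual JPEG-LS convention, the function names are hypothetical, and no histogram shifting or overflow handling is included.

```python
def med_predict(a, b, c):
    """Median edge detector: a = left, b = above, c = upper-left neighbor."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def pee_embed(x, pred, bit):
    """Expand the prediction error e = x - pred and append one bit."""
    e = x - pred
    return pred + 2 * e + bit


def pee_extract(x_marked, pred):
    """Return (original pixel, bit), given the same prediction as the encoder."""
    expanded = x_marked - pred
    return pred + expanded // 2, expanded % 2


pred = med_predict(100, 110, 105)
marked = pee_embed(107, pred, 1)
print(pred, marked, pee_extract(marked, pred))
```

The decoder can only invert the expansion if it reproduces exactly the same prediction, which is why the predictor may use only pixels that are already restored at decoding time.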
Chen et al. compared the performance of many predictors, such as MED, the 4th-order gradient-adjusted predictor (GAP) employed in context-based adaptive lossless image compression (CALIC), and full context prediction using the average of the four closest neighboring pixels. Full context prediction using a rhombus pattern and a sorting method was also proposed by Sachnev et al. These are all classified as fixed predictors in .
The full context rhombus predictor has the best performance among all fixed predictors, which is why many papers implement embedding algorithms based on it [10–13].
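A minimal sketch of the rhombus predictor and of the checkerboard split it relies on is given below. Which parity is called the "cross" set and the rounding of the average are assumptions made for illustration.

```python
def is_cross(i, j):
    """Assumed convention: pixels with even (i + j) form the cross set."""
    return (i + j) % 2 == 0


def rhombus_predict(img, i, j):
    """Rounded-down average of the up, down, left and right neighbors."""
    return (img[i - 1][j] + img[i + 1][j] + img[i][j - 1] + img[i][j + 1]) // 4


img = [
    [52, 55, 61],
    [62, 59, 55],
    [63, 65, 66],
]
# The four neighbors of a cross pixel all belong to the dot set, so a cross
# pixel can be predicted from pixels that are untouched during the cross stage.
print(is_cross(1, 1), rhombus_predict(img, 1, 1))
```

Because each set is predicted only from the other set, the decoder can rebuild the same predictions stage by stage.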
On the other hand, various papers [14–16] focused on improving PSNR at small embedding capacity. Dragoi and Coltuc utilized the rhombus pattern even at small embedding capacity and obtained good results. However, these methods are limited to small embedding capacity. An optimization scheme such as the least square approach is essential for achieving high embedding capacity together with small image distortion.
Adaptive predictors using the least square approach have also been introduced in many papers [17, 18] and applied to reversible data hiding [19, 20]. Edge-directed prediction (EDP) is a least square predictor that optimizes the prediction coefficients locally inside a training set. Kau and Lin proposed the edge-look-ahead (ELA) scheme, which uses least square prediction with an efficient edge detector to maximize the edge-directed characteristics. Wu et al. improved the least square predictor by adaptively determining the order of the predictor and the support pixels.
The performance of these predictors was compared in several papers [17, 21]. However, none of these adaptive predictors outperformed the simple rhombus predictor, because they could use only the pixels preceding the target pixel, while the rhombus predictor uses the four neighboring pixels.
Dragoi and Dinu and Lee et al. improved the least square predictor by modifying the traditional training set, which consists only of pixels preceding the target pixel. Dragoi and Dinu utilized a training set of pixels in a square block surrounding the target pixel. Only half of the pixels within the block are original pixels; the other half have already been modified by data embedding. The least square predictor in  includes the four neighboring pixels as well as a subset of the previous pixels for the training set. Their predictor divides an image into cross and dot sets. When embedding data in the cross set, the predictor uses a training set consisting of original pixels, while in the dot set it uses a half-modified training set. As a result, both techniques clearly outperform the previous least square predictor and the rhombus-pattern fixed predictor.
The least square approach is one of the most advanced types of adaptive predictor in reversible data hiding. However, it is well known in statistics that a penalized regression approach with efficient variable selection can find a smaller set of more relevant supports, which improves prediction accuracy.
In this paper, we propose a reversible data hiding technique using the least square predictor via a penalized regression method called the least absolute shrinkage and selection operator (LASSO) to overcome the weaknesses of the existing prediction methods.
In addition to the difference expansion method, the histogram shifting (HS) method has played an important role in the reversible data hiding community. It introduces less distortion than the difference expansion method. However, in most cases, the two methods are combined in a single algorithm. One of the mainstream approaches in reversible data hiding is a combination of histogram shifting and prediction error expansion (PEE + HS) with good predictors. A comprehensive explanation of the various algorithms and their applications is available in .
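The combination of prediction error expansion and histogram shifting can be sketched as a mapping on the prediction error. The thresholds tn (negative) and tp (non-negative) and the function names are assumptions for illustration: errors inside [tn, tp] are expanded to carry a bit, and errors outside are shifted so that the two ranges never overlap.

```python
def pee_hs_embed(e, bit, tn, tp):
    """Map a prediction error to its marked value (tn < 0 <= tp assumed)."""
    if tn <= e <= tp:
        return 2 * e + bit       # expandable error: carries one bit
    if e > tp:
        return e + tp + 1        # shift right, no bit embedded
    return e + tn                # shift left, no bit embedded


def pee_hs_extract(e_marked, tn, tp):
    """Return (original error, bit); bit is None for shifted errors."""
    if 2 * tn <= e_marked <= 2 * tp + 1:
        return e_marked // 2, e_marked % 2
    if e_marked > 2 * tp + 1:
        return e_marked - tp - 1, None
    return e_marked - tn, None


for e in (-3, -1, 0, 1, 4):
    marked = pee_hs_embed(e, 1, -1, 1)
    print(e, marked, pee_hs_extract(marked, -1, 1))
```

Shifted errors change by at most max(tp + 1, |tn|), which is where the distortion saving over plain expansion of large errors comes from.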
The organization of this paper is as follows: Section 2 explains the related works on which the proposed method is based, Section 3 presents the proposed algorithm, Section 4 presents experimental results showing that the proposed algorithm is superior to other methods, and Section 5 presents the conclusion.
2 Related works
2.1 Two-stage embedding scheme using rhombus pattern
2.2 Linear prediction
2.2.1 Least square approach
In linear prediction, the coefficients for the support pixels are computed adaptively by the least square (LS) method. It is one of the most advanced types of adaptive predictor and normally provides better performance than fixed predictors [6, 9]. A fixed predictor uses fixed coefficients, whereas an adaptive predictor computes the coefficients dynamically according to the context.
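As a rough sketch of how such coefficients can be obtained, the code below solves the normal equations (XᵀX)β = XᵀY with plain Gauss-Jordan elimination and then predicts a pixel as a weighted sum of its support pixels. The helper names and the toy data are hypothetical, and a practical implementation would normally rely on a numerical library and guard against a singular XᵀX.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(X, y):
    """Coefficients minimizing the squared error of X @ beta against y."""
    n = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]
    return solve(xtx, xty)


def predict(beta, support):
    """Weighted sum of the support pixels."""
    return sum(b * s for b, s in zip(beta, support))


# Toy training set: each row holds the support pixels of one training pixel.
X = [[10, 200, 30], [50, 60, 220], [120, 20, 90], [80, 150, 40], [200, 100, 10]]
y = [0.5 * a + 0.25 * b + 0.25 * c for a, b, c in X]
beta = least_squares(X, y)
print([round(b, 4) for b in beta], round(predict(beta, [100, 104, 96])))
```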
2.2.2 Penalized regression using LASSO
Penalized regression methods aim at simultaneous variable selection and coefficient estimation. In practice, even if the sample size is small, a large number of support pixels are typically included to mitigate modeling biases. With such a large number of support pixels, multicollinearity may arise among the explanatory variables X. Thus, selecting a subset of support pixels of appropriate size is desirable, and penalized regression can be an effective tool for such a selection.
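To illustrate how the LASSO performs this selection, the sketch below minimizes the standard penalized objective ½‖Y − Xβ‖² + λ‖β‖₁ by cyclic coordinate descent with soft thresholding. The scaling of the objective, the parameter name lam, and the fixed iteration count are assumptions made for illustration and are not taken from the paper.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso(X, y, lam, n_iter=200):
    """Coordinate descent for 0.5 * ||y - X beta||^2 + lam * ||beta||_1."""
    m, n = len(X), len(X[0])
    beta = [0.0] * n
    z = [sum(X[i][j] ** 2 for i in range(m)) for j in range(n)]
    resid = list(y)                       # equals y - X @ beta for beta = 0
    for _ in range(n_iter):
        for j in range(n):
            if z[j] == 0:
                continue                  # all-zero column: coefficient stays 0
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(m)) + z[j] * beta[j]
            new = soft_threshold(rho, lam) / z[j]
            delta = new - beta[j]
            if delta:
                for i in range(m):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


# With orthonormal columns the solution is the soft-thresholded LS estimate.
print(lasso([[1, 0], [0, 1]], [3.0, 0.5], lam=1.0))
```

Coefficients whose correlation with the residual stays below lam are set exactly to zero, which is how weakly related support pixels drop out of the predictor.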
3 Proposed algorithm
Applying a least square predictor, which obtains an adaptive weight for each support pixel.
Applying LASSO penalized regression to the least square predictor in order to select the number and location of the support pixels adaptively.
3.1 Least square predictor based on rhombus scheme
Due to the nature of the two-stage embedding scheme, some pixels must be excluded from the training set. Suppose that we embed a bit in the dot set first. In Fig. 4 (for N = 9), all pixels of the cross set that precede the target pixel can be included in the training set, whereas the dot set pixels E1, E2, E3, E4, E5, and E6 must be excluded because they would break reversibility. In other words, each of those pixels uses at least one support pixel located at or after the target pixel.
Suppose that we have M training set pixels, after excluding the above improper pixels, according to the size T. Each pixel has N support pixels. Together they form an M × N matrix X as shown in Eq. (3). However, the proposed method applies one more idea so that more suitable support pixels are used, which improves the accuracy of the LS-based predictor.
3.2 Applying penalized regression via LASSO
The LS-based prediction method, an adaptive predictor, can be improved by using penalized regression. In the proposed method, the LASSO is used for the penalized regression. The LS predictor provides adaptive coefficient values, and penalized regression makes the LS method even more adaptive. With the proposed method, we can penalize and remove support pixels that have little influence on the target pixel. In other words, we can estimate the locations of the most influential support pixels as well as their prediction coefficients.
Table 1 Sample prediction coefficients for the support pixels x(n − i)
Table 1 shows that the coefficients of 124, 98, and 73 are smaller in magnitude than the others. Thus, LASSO sets them to zero and removes those pixels from the set of support pixels.
The LS-based approach predicts x_p as 74, whereas the LASSO-penalized approach predicts 81 according to Eq. (1). Since the actual value of the target pixel is 83, LASSO gives the more accurate estimate.
3.3 Encoder and decoder
This section describes the main steps of the encoding and decoding processes. The proposed idea is explained step by step through a description of the full process.
1. Compute the local variance for all pixels. Find the threshold of the local variance that meets the embedding capacity.
2. Determine which pixels have a local variance smaller than the threshold. Only these pixels are available for embedding.
3. Compute x_p(n) using only a rhombus predictor for the border pixels, since training is not possible along the border. Compute x_p(n) using the proposed algorithm for the other pixels:
   - Decide the training set of size L centered on y(n) as shown in Fig. 4.
   - Create X and Y from the pixel values of the training set.
   - Run the LASSO estimator and obtain the prediction coefficient β for each support pixel.
   - Compute x_p(n) using Eq. (1).
4. Compute the prediction error e(n) = x(n) − x_p(n).
5. Embed a bit into the prediction error using the prediction error expansion and histogram shifting method.
6. Handle overflow and underflow by using location map bits, as in Sachnev et al.'s method.
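The first two encoder steps can be sketched as follows. The exact definition of the local variance is not given in the text, so this sketch assumes the variance of the four rhombus neighbors, and it selects pixels at or below the threshold rather than strictly below it. It also ignores that only expandable errors actually carry a bit. All names are hypothetical.

```python
def local_variance(img, i, j):
    """Assumed definition: population variance of the four rhombus neighbors."""
    nb = [img[i - 1][j], img[i + 1][j], img[i][j - 1], img[i][j + 1]]
    mean = sum(nb) / 4
    return sum((v - mean) ** 2 for v in nb) / 4


def variance_threshold(variances, capacity):
    """Smallest threshold that leaves at least `capacity` pixels at or below it."""
    ranked = sorted(variances)
    if not 1 <= capacity <= len(ranked):
        raise ValueError("capacity must be between 1 and the number of pixels")
    return ranked[capacity - 1]


img = [
    [0, 10, 0],
    [20, 0, 40],
    [0, 30, 0],
]
print(local_variance(img, 1, 1))

variances = [4.0, 0.5, 9.0, 1.0, 2.5]
threshold = variance_threshold(variances, 3)
selected = [k for k, v in enumerate(variances) if v <= threshold]
print(threshold, selected)
```

Because the neighbors of a pixel belong to the other set, the decoder can recompute the same variances before it restores the current set.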
The pixels of the cross set are modified by embedding the associated bits as described in the steps above. The dot set embedding procedure follows the same process; its training set necessarily includes the modified pixels of the cross set.
The watermarked image is divided into the cross set and the dot set. The decoding procedure proceeds in the inverse order of the embedding procedure. In other words, the dot set is decoded first and the cross set second.
1. Obtain the threshold of the local variance, the embedding capacity, and so on, from the side information.
2. Determine which pixels have a local variance smaller than the threshold. Those pixels carry the embedded bits.
3. For those pixels, compute x_p(n) using a rhombus predictor for the border pixels, and compute x_p(n) using the proposed algorithm for the other pixels.
4. Compute the modified prediction error e(n) = x(n) − x_p(n).
5. Extract a bit from the modified prediction error using the prediction error expansion and histogram shifting method. The original value of the target pixel is recovered.
6. Handle overflow and underflow by using location map bits, as in Sachnev et al.'s method.
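The overflow and underflow check behind the location map can be illustrated with a simple test of whether the marked pixel could leave the valid range. This is a simplified stand-in rather than Sachnev et al.'s exact procedure; the thresholds tn and tp, the 8-bit range, and the function name are assumptions.

```python
def needs_location_map(pred, e, tn, tp, lo=0, hi=255):
    """True if embedding could push the pixel pred + e' outside [lo, hi]."""
    if tn <= e <= tp:
        # Expandable error: either bit value may be embedded.
        candidates = (pred + 2 * e, pred + 2 * e + 1)
    elif e > tp:
        candidates = (pred + e + tp + 1,)     # right-shifted error
    else:
        candidates = (pred + e + tn,)         # left-shifted error
    return any(v < lo or v > hi for v in candidates)


print(needs_location_map(254, 1, -1, 1))   # expansion would exceed 255
print(needs_location_map(128, 0, -1, 1))   # safe in mid-range
```

Pixels flagged in this way would be left unmodified and recorded in the location map, so that the decoder knows to skip them.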
4 Experimental results
Table 2 Comparison in terms of average PSNR (dB) for low embedding capacities (lower than 0.5 bpp). Compared methods: Sachnev et al., Lee et al., Dragoi and Coltuc.
Table 3 Comparison in terms of average PSNR (dB) for high embedding capacities (higher than 0.5 bpp). Compared methods: Sachnev et al., Lee et al., Dragoi and Coltuc.
We embed the watermark message and side information as binary data in the images as a payload.
4.1 Effect of training set size, L
For all of the above test images, a training set size of 13 or 17 is a good compromise for the best results. This means that the LASSO-based LS method needs a sufficiently large training set to be most effective.
4.2 Effect of the number of support pixel, N
4.3 Comparison with other state-of-the-art schemes
LS predictor via LASSO with well-compromised size of L and N
Two-stage embedding scheme with histogram shifting method 
To further verify the superiority of the proposed method, experimental results for low and high embedding capacities are listed in Tables 2 and 3. The average PSNR for low embedding capacities in Table 2 is computed using payloads of 40,000, 70,000, 100,000, and 130,000 bits, which are lower than 0.5 bpp. On all test images of Table 2, the proposed method outperforms the others with an average gain in PSNR of 0.982 dB over , 0.344 dB over , and 0.226 dB over .
The average PSNR for high embedding capacities in Table 3 is computed using payloads of 160,000, 190,000, and 220,000 bits, which are higher than 0.5 bpp. The results for high embedding capacities make the superiority of the proposed method clearer. The proposed method outperforms the others with an average gain in PSNR of 1.625 dB over , 0.354 dB over , and 0.508 dB over .
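For reference, the two quantities used in this comparison can be computed as sketched below. The 8-bit peak value and the 512 × 512 image size are assumptions. The image size is not stated in the text, but it is consistent with 130,000 bits falling below 0.5 bpp and 160,000 bits falling above it.

```python
import math


def psnr(original, marked, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    count = 0
    squared_error = 0
    for row_o, row_m in zip(original, marked):
        for a, b in zip(row_o, row_m):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return math.inf                  # identical images
    return 10 * math.log10(peak ** 2 * count / squared_error)


def bits_per_pixel(payload_bits, height, width):
    """Embedding rate in bits per pixel."""
    return payload_bits / (height * width)


print(round(bits_per_pixel(130_000, 512, 512), 3))
print(round(bits_per_pixel(160_000, 512, 512), 3))
print(round(psnr([[0, 0], [0, 0]], [[1, 1], [1, 1]]), 2))
```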
First, the proposed method improves the state-of-the-art LS predictors via LASSO optimization. Dragoi and Dinu's method and Lee et al.'s method use LS predictors with different shapes of training set and support pixels, whereas the proposed method applies LASSO optimization to improve on those LS predictors. For most images, N = 26 support pixels give the best prediction performance in the proposed method, while N = 4 and N = 6 are used in the other LS predictors. In the proposed method, LASSO optimization selects the support pixels to use and removes the others. In other words, the proposed method can pick the most suitable support pixels out of many candidates, which increases the accuracy of the LS computation.
5 Conclusion
In this paper, we proposed an enhanced predictor that applies the LASSO approach to the ordinary LS predictor within a rhombus-shaped two-stage embedding scheme. It finds both the shape of the support region around the target pixel and the proper weight coefficients. In other words, applying LASSO to the LS approach makes it possible to find a reasonable number and location of support pixels. This is why pixels located in regions of high variation are predicted more effectively by the proposed scheme than by other LS predictors. Due to this property, the number of large prediction errors decreases. Thus, the proposed method tends to show its most significant improvement at high embedding capacity, especially for images with high variation. Experimental results demonstrate that the proposed method performs better than other state-of-the-art methods.
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (NRF-2015R1A2A2A0104587).
HH invented the proposed idea by combining previous statistical theory with reversible watermarking prediction scheme, drafted the manuscript, and performed the statistical analysis. SK participated in the statistical theory to back up and support the proposed idea. HK participated in the design and coordination of paper and helped draft and finish manuscript of the paper.
The authors declare that they have no competing interests.
About the authors
Hee Joon Hwang
He received a B.S. degree from the Electric and Electronic Engineering department in 2008 and an M.S. degree from the Graduate School of Information Management and Security, Korea University, Seoul, Korea, in 2010. He joined the Graduate School of Information Security, Korea University, Seoul, Korea, in 2010, where he is currently pursuing a Ph.D. His research interests include multimedia security, reversible and robust watermarking, and steganography.
He received a B.S. degree from the Education department in 2007 and an M.S. degree in Statistics from Korea University, Seoul, Korea, in 2010. He received a Ph.D. in Biostatistics from the University of Pittsburgh, Pittsburgh, in 2015. His research interests include methodological development for statistical machine learning methods, image processing, and optimization.
Hyoung Joong Kim
He is currently with the Graduate School of Information Security, Korea University, Korea. He received his B.S., M.S., and Ph.D. from Seoul National University, Korea, in 1978, 1986, and 1989, respectively. He was a professor at Kangwon National University, Korea, from 1989 to 2006. He was a visiting scholar at University of Southern California, Los Angeles, USA, from 1992 to 1993. His research interests include data hiding such as reversible watermarking and steganography.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- J Tian, Reversible data embedding using a difference expansion. IEEE Trans. Circuits Syst. Video Technol. 13(8), 890–896 (2003)
- AM Alattar, Reversible watermark using difference expansion of triplets, in Proc. IEEE International Conference on Image Processing, Catalonia, Spain, 2003, vol. 1, pp. 501–504
- AM Alattar, Reversible watermark using the difference expansion of a generalized integer transform. IEEE Trans. Image Process. 13, 1147–1156 (2004)
- DM Thodi, JJ Rodriguez, Expansion embedding techniques for reversible watermarking. IEEE Trans. Image Process. 16(3), 721–730 (2007)
- MJ Weinberger, G Seroussi, G Sapiro, The LOCO-I lossless image compression algorithm: principles and standardization into JPEG-LS. IEEE Trans. Image Process. 9(8), 1309–1324 (2000)
- M Chen, Z Chen, X Zeng, Z Xiong, Model order selection in reversible image watermarking. IEEE J. Sel. Top. Signal Process. 4(3), 592–604 (2010)
- X Wu, N Memon, Context-based, adaptive, lossless image coding. IEEE Trans. Commun. 45(4), 437–444 (1997)
- V Sachnev, HJ Kim, J Nam, S Suresh, YQ Shi, Reversible watermarking algorithm using sorting and prediction. IEEE Trans. Circuits Syst. Video Technol. 19(7), 989–999 (2009)
- IC Dragoi, D Coltuc, Local-prediction-based difference expansion reversible watermarking. IEEE Trans. Image Process. 23(4), 1779–1790 (2014)
- HJ Hwang, HJ Kim, V Sachnev, SH Joo, Reversible watermarking method using optimal histogram pair shifting based on prediction and sorting. KSII Trans. Internet Inform. Syst. 4(4), 655–670 (2010)
- SU Kang, HJ Hwang, HJ Kim, Reversible watermark using an accurate predictor and sorter based on payload balancing. ETRI J. 34(3), 410–420 (2012)
- G Feng, Z Qian, N Dai, Reversible watermarking via extreme learning machine prediction. Neurocomputing 82(1), 62–68 (2012)
- L Luo, Z Chen, M Chen, X Zeng, Z Xiong, Reversible image watermarking using interpolation technique. IEEE Trans. Inf. Forensics Secur. 5(1), 187–193 (2010). 5010-5021
- B Ou, X Li, Y Zhao, R Ni, YQ Shi, Pairwise prediction-error expansion for efficient reversible data hiding. IEEE Trans. Image Process. 22(12), 36–42 (2013)
- S Weng, JS Pan, Reversible watermarking based on two embedding schemes. Multimedia Tools Appl. 75(12), 7129–7157 (2016)
- IC Dragoi, D Coltuc, Adaptive pairing reversible watermarking. IEEE Trans. Image Process. 25(5), 2420–2422 (2016)
- L-J Kau, Y-P Lin, Adaptive lossless image coding using least squares optimization with edge-look-ahead. IEEE Trans. Circuits Syst. 52(11), 751–755 (2005)
- X Wu, G Zhai, X Yang, W Zhang, Adaptive sequential prediction of multidimensional signals with applications to lossless image coding. IEEE Trans. Image Process. 20(1), 36–42 (2011)
- J Wen, L Jinli, W Yi, Adaptive reversible data hiding through autoregression, in Proceedings of 2012 IEEE International Conference on Information Science and Technology, ICIST, 2012, pp. 831–838
- BY Lee, HJ Hwang, HJ Kim, Reversible data hiding using piecewise autoregressive predictor based on two-stage embedding. J. Elect. Eng. Tech. 11(4), 974–986 (2016)
- X Li, MT Orchard, Edge-directed prediction for lossless compression of natural images. IEEE Trans. Image Process. 10(6), 813–817 (2001)
- Z Ni, Y-Q Shi, N Ansari, W Su, Reversible data hiding. IEEE Trans. Circuits Syst. 16(3), 354–365 (2006)
- Y-Q Shi, X Li, X Zhang, H-T Wu, B Ma, Reversible data hiding: advances in the past two decades. IEEE Access 4, 3210–3237 (2016)
- R Tibshirani, Regression shrinkage and selection via the lasso. J. Royal Stat. Soc., Series B 58(1), 267–288 (1996)
- G Schwarz, Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978)