

A hybrid NSCT domain image watermarking scheme



At present, dealing with the piracy and tampering of images has become a notable challenge due to the ubiquity of smart mobile gadgets. In this paper, we propose a novel watermarking algorithm based on the non-subsampled contourlet transform (NSCT) for improving the security of such images. The fusion of feature searching approaches with watermarking methods has gained prominence in recent years. The scale-invariant feature transform (SIFT) is a computer vision technique for detecting and describing local features in images; it can extract feature points with high invariance that are resilient to distortions such as rotation, compression, and scaling. In the proposed scheme, the extracted feature points are embedded with the watermark using the NSCT approach, and the tree split, voting, rotation searching, and morphology techniques are employed to improve robustness against noise. The proposed watermarking algorithm offers superior capacity, better capture quality, and tampering resistance when compared with existing watermarking approaches.


In the current scenario, digital watermarking approaches have gained a lot of significance due to the rapid progress in technology. Additionally, the exchange and sharing of images have become easier and faster with the advent of smart phones and gadgets. Digital watermarking has been considered a vital approach for safeguarding the copyright and intellectual property of images from severe privacy and security issues. Nevertheless, when the image resolution gets higher and the volume turns bulkier, handling such images is always a challenging task. It then becomes essential to compress the images, and while compressing, a lot of secret information can get distorted. Hence, capturing the identifiable information from the image turns out to be a difficult task. Consequently, present watermarking approaches need to be strong enough to deal with compression issues. The digital watermarking approaches should also achieve transparency, robustness, and sufficient capacity to represent their unique identities. Data fusion can also be considered an important aspect of image watermarking in video recognition systems [1, 2]. Deep learning can also enhance image-based watermarking and can be used to formulate ranking algorithms for image recognition [3, 4]. Several watermarking techniques based on the discrete cosine transform (DCT) [5–10] and the discrete wavelet transform (DWT) [11–17] have already been established. Patra [7] established a watermarking scheme based on the Chinese remainder theorem (CRT), deployed in the DCT domain, which was effective against JPEG compression attacks. Ababneh [12–14] established a compensated signature embedding (CSE) framework that could resist attacks by JPEG 2000 compression. Nevertheless, the abovementioned techniques were unable to solve the problems of rotation and scaling.
Huynh-The and team [18] established a digital image watermarking scheme based on a coefficient quantization method that intuitively encodes the owner's data for each color channel to enhance the imperceptibility and robustness of the concealed data. Wang and colleagues [19] established a strong color image watermarking model based on local quaternion exponent moments that resisted desynchronization attacks. Abdelhakim and team [20] established a scheme in which the embedding strength parameters for per-block image watermarking in the DCT domain are optimized; the Bees algorithm was chosen as the optimization model, and the fitness function was designed to exactly fit the optimization problem. Choi and Pun [21] presented a strong reversible watermarking model in which bit plane manipulation was deployed to conceal watermark bits in bit planes that are resistant to attacks. Moosazadeh and Andalib [22] established a digital image watermarking approach in the YCbCr color space and DCT domain; their scheme employed coefficient exchange for embedding the watermark bits, and a genetic algorithm was used for choosing the target Y component coefficients of the host image. Kadu and team [23] presented a proficient approach for copyright protection based on a modest and competent embedding method for DWT-based video watermarking, which they utilized in indoor video watermarking applications. Presently, the notion of feature searching has been widely employed in digital watermarking approaches for enhancing their robustness. Feature searching is also an important aspect of modern-day sign-board reading systems; detection of low-resolution text from weakly labeled street images can be done using an efficient learning and recognition system such as the one developed by Tsai et al. [24]. The scale-invariant feature transform (SIFT) is one of the most widely used feature searching techniques [25].
Furthermore, SIFT can determine some feature points even under distinct vicious distortions, and the watermark is embedded in these feature regions. Therefore, when the image gets distorted, it is still feasible to determine the feature regions with embedded information. Apart from selecting the feature region, identifying a suitable domain for embedding is also an essential task. The non-subsampled contourlet transform (NSCT) is an emerging approach that can be utilized for watermarking [26]. When compared with DCT, DWT, and alternate transforms, the NSCT has superior capacity and offers a large number of coefficients for the watermarking process. Li [27] established a scheme that amalgamated the SIFT and NSCT approaches and employed the notion of quantization to embed the watermark, thereby achieving a greater capacity. Nevertheless, when the NSCT approach is deployed to embed the information, the region surrounding the texture portion gets distorted by high-frequency information. Consequently, the resultant watermark turns out to be ruined by high-frequency noises, thus compromising the capture quality. In this work, we propose a novel method amalgamating the SIFT and NSCT approaches with tree split, voting, rotation searching, and morphology, thereby providing an efficacious model with high capture quality. In Section 2, the SIFT and NSCT approaches are illustrated. Section 3 presents the in-depth details of the proposed method. The experimental results and conclusion are discussed in Sections 4 and 5, respectively. Overall, the main contributions of this paper can be summarized as follows:

  • Design, implementation, and evaluation of a novel robust image watermarking scheme.

  • The proposed watermarking scheme first performs a quadtree decomposition on the lowpass subband of the NSCT domain to avoid the relatively high-frequency texture. Second, the proposed method employs a max-pooling technique to retrieve the fused watermark from each subregion to enhance the capture quality and tampering resistance. Third, a circulation procedure is proposed to offer rotation-tamper-proof ability. Finally, a morphology step is included to refine the extracted watermark.

  • The proposed watermarking algorithm offers superior capacity, better capture quality, and tampering resistance when compared with existing watermarking approaches.

Materials and methods

Lowe established the SIFT method, in which the notion is to capture the feature points not ruined by image processing, despite the image being at a different scale (either zoomed or shrunk) [25]. As soon as the images are processed by means of a Gaussian function, the blurred versions of the image best characterize the scaling space. Primarily, to capture the feature points fruitfully, the difference of Gaussians and the pyramid depiction are exploited for simulating the scaling space. Furthermore, regional extrema are utilized as feature candidate points, and the stability of these points is computed from their neighboring pixels. Subsequently, the points with low stability are discarded, and finally the orientation of the feature points is determined. Every feature point offers information about its coordinate, scale, and orientation after the SIFT computation [25]. The non-subsampled contourlet transform includes two major steps: (1) the non-subsampled pyramid (NSP) and (2) the non-subsampled directional filter bank (NSDFB). This approach is analogous to the Laplacian pyramid, with a sub-band decomposition of L stages as shown in Fig. 1 a, and there is no requirement for downsampling. The NSP process results in the decomposition of one lowpass sub-band and L highpass sub-bands; further, the NSDFB is applied to the highpass sub-bands, as shown in Fig. 1 b. Only the lowpass sub-bands are used in this work [26]. When compared with the low frequencies of other transforms, the NSCT lowpass sub-bands offer many coefficients that can be employed for watermarking. Nevertheless, the lowpass sub-bands are usually blurred images processed by a filter, containing smooth low-frequency information in addition to a small amount of high-frequency information. Hence, when the watermarking is performed in that region, the information gets easily dispersed by high-frequency noises during the process of computing and capturing it.
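The scale-space step described above can be sketched in a few lines. The following is a minimal NumPy illustration of the difference-of-Gaussians computation that SIFT uses to find candidate feature points; the function names and parameter values are ours, not the paper's:

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """1-D Gaussian kernel, normalized to sum to 1."""
    if radius is None:
        radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur via two 1-D convolutions."""
    k = gaussian_kernel(sigma)
    # Convolve rows, then columns ('same' keeps the image size).
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)

def difference_of_gaussians(img, sigma, k=1.6):
    """DoG layer: difference of two blurs at neighboring scales."""
    return gaussian_blur(img, k * sigma) - gaussian_blur(img, sigma)

rng = np.random.default_rng(0)
img = rng.random((64, 64))
dog = difference_of_gaussians(img, sigma=1.0)
# Candidate feature points are the local extrema of the DoG response.
```

Local extrema of such DoG layers across neighboring scales are the feature candidates that the stability check then filters.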

Fig. 1

Non-subsampled contourlet transform. a Three-stage pyramid decomposition. b Non-subsampled directional filter bank

Proposed model

After obtaining the feature details from the SIFT process, the coordinate of each feature point is chosen to be the center, thereby creating a square region for the watermark embedding. The size of each region is determined by its scale s. If the region is very small, the capacity for the watermark embedding is insufficient; on the other hand, if the region is very large, the image will undergo severe tampering. Therefore, the length and width of the feature region are fixed as 4s+1 (using the feature point as the center and 2s as the radius). Nevertheless, the SIFT approach computes several feature points, and not all of them are appropriate for information embedding; hence, the feature points need to be filtered. The following particulars are considered the filtering constraints: (1) the feature regions may overlap with neighboring regions if the value of s is very large; (2) the capacity is insufficient if the value of s is very small; (3) since the value of s varies with the resolution, choosing a fixed range of s for the feature region results in missing feature points during image scaling. Considering the entire scenario, a threshold D is first defined (it varies with the image resolution), and the feature points are sorted by their s values. The feature points with s greater than D are discarded, and N (the user can choose the value of N) feature regions are obtained in decreasing order of s. The image resolution is m×n, where m represents the longer side and n the shorter side of the image. Based on our examination, there will be an overlap when the side length of the radius is greater than \(\frac {2m}{15}\); therefore, the value of D is fixed at \(\frac {m}{15}\). If there is still an overlap, the feature region with the larger scale is chosen.
After fixing the feature regions, each is processed separately using the NSCT approach, and the lowpass sub-band, whose resolution is identical to that of the matching region, is used for embedding.
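The filtering rules above can be sketched as follows. This is a hedged illustration that assumes SIFT keypoints are available as (x, y, s) tuples; the helper name and the Chebyshev-distance overlap test for the square regions are our assumptions, not the paper's exact procedure:

```python
import numpy as np

def select_feature_regions(keypoints, m, N=5):
    """Filter SIFT keypoints into at most N disjoint feature regions.

    keypoints: list of (x, y, s) tuples (s = SIFT scale).
    m: longest side of the image; the threshold D is fixed at m/15.
    Each kept region is a square centered on (x, y) with radius 2s.
    """
    D = m / 15.0
    # Discard scales above the threshold, then sort by decreasing scale.
    candidates = sorted((kp for kp in keypoints if kp[2] <= D),
                        key=lambda kp: kp[2], reverse=True)
    chosen = []
    for x, y, s in candidates:
        # Square regions overlap when centers are closer (Chebyshev
        # distance) than the sum of the two radii; keep only disjoint ones.
        if all(max(abs(x - cx), abs(y - cy)) > 2 * s + 2 * cs
               for cx, cy, cs in chosen):
            chosen.append((x, y, s))
        if len(chosen) == N:
            break
    return chosen
```

For a 512×512 image, D is about 34, so very large-scale keypoints are dropped before the greedy disjoint selection.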

Embed phase

Orientation identification

The SIFT outcomes provide the orientation information of every feature point; however, according to Lowe [25], 15% of the feature points might have more than one orientation. In this work, only a single orientation is required for positioning; therefore, the other orientations produced by the SIFT approach are discarded. In order to obtain the orientation information of a feature region, formula (1) is employed for computing the gradients \(t_1\) and \(t_2\) of the feature region along the x and y axes, where \(f_0\) is the post-NSCT lowpass sub-band, \(f_0(x,y)\) is the intensity value at the matching position, and \((\bar{x},\bar{y})\) is the center of that region. Subsequently, the value of θ is obtained by deploying formula (2). For a result \((t_1,t_2)\), the angle is identical to that of \((-t_1,-t_2)\); however, the orientation is 180° opposite, as in Fig. 2. To obtain a single orientation (θ on the first/fourth quadrant), when \(t_1 < 0\), θ + π is taken as the unique orientation ϕ; the embedded binary image is then rotated clockwise towards this orientation, and this operation is referred to as discrete rotation.

$$ \begin{array}{clcr} &t_{1} = {\sum\limits_{(x,y)\in A}f_{0}(x,y)/(x-\bar{x})} \\ &t_{2} = {\sum\limits_{(x,y)\in A}f_{0}(x,y)/(y-\bar{y})} \\ &A = \lbrace(x,y) \mid \sqrt{(x-\bar{x})^{2} + (y-\bar{y})^{2}} \leq 2s\rbrace \end{array} $$
Fig. 2

Orientation. Identification of rotation for embedded binary image

$$ \begin{array}{clcr} &\theta = \arctan(t_{2}/t_{1})\\ &\phi= \left\lbrace \begin{array}{clcr} \theta + \pi &, \text{if}~ t_{1} < 0\\ \theta &, \text{otherwise}\\ \end{array} \right. \end{array} $$
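The orientation computation of formulas (1)–(2) can be sketched as below. Note one assumption: the printed formula (1) divides the intensity by the offset from the center, which would be singular at the center pixel, so this sketch uses the common centroid-style moment (intensity times offset) instead; the function name and the region-radius choice are ours:

```python
import numpy as np

def region_orientation(f0):
    """Single orientation of a (post-NSCT lowpass) feature region.

    Sketch of formulas (1)-(2): sums intensity-weighted offsets over a
    disc of radius 2s around the center, then folds the angle so that
    each region gets exactly one orientation.
    """
    h, w = f0.shape
    yy, xx = np.mgrid[0:h, 0:w]
    xc, yc = (w - 1) / 2.0, (h - 1) / 2.0
    s = min(h, w) // 4                      # region radius is 2s
    mask = np.hypot(xx - xc, yy - yc) <= 2 * s
    t1 = np.sum(f0[mask] * (xx[mask] - xc))
    t2 = np.sum(f0[mask] * (yy[mask] - yc))
    # arctan2 already applies the theta + pi correction when t1 < 0,
    # which is formula (2)'s case split, up to a 2*pi wrap.
    theta = np.arctan2(t2, t1)
    return theta % (2 * np.pi)
```

A region whose bright content lies to the right of the center, for instance, yields an orientation near 0.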

Embedding the watermark

As stated earlier, in the NSCT approach, the captured image has blurred texture caused by the relatively high-frequency components. Therefore, the tree split approach is employed to prevent the watermark from being embedded into the high-frequency region, and voting is used to capture the watermark bits with better accuracy. Primarily, edge detection is performed on the lowpass sub-band of the feature region to determine the location of the high-frequency details, employing the Canny edge detector [28]. If the region has no high-frequency details, the region is not segregated. On the other hand, if the region contains high-frequency information, it is segregated into 2×2 blocks and all blocks are analyzed recursively; the recursion terminates when the segregated block size is less than half the size of the smallest feature region, and this process is referred to as “tree split.” Once the segregation process is over, the watermark image is resized to the same size as each block by means of bicubic interpolation, and all bits are embedded into the analogous coefficients of \(f_0\). The block size is adjusted to prevent the watermark from being extensively distorted; hence, the block size should be identical to half the size of the smallest feature region. The quantization process is similar to the technique used by Li [27]. The coefficients of the lowpass sub-band range between 0 and 255, and Δ is the quantization step; each coefficient is divided by Δ, and when the resultant quotient k is an odd number, the coefficient carries the bit 1, while an even number carries the bit 0, as in formula (3), where \(f_0(x,y)\) represents the coefficient value.

$$ {}Q(x,y) = \left\lbrace \begin{array}{clcr} 0, &\text{if}~ k\Delta\leq f_{0}(x,y) < (k+1)\Delta~ \text{for}~k = 0, 2,... \\ 1, &\text{if}~ k\Delta\leq f_{0}(x,y) < (k+1)\Delta~ \text{for}~ k = 1, 3,...\\ \end{array} \right. $$

Using the embedded information, the original coefficient is mapped to the corresponding number, as depicted in formula (3). The coefficient then has to be set to the center of the corresponding quantization interval; hence, the quantization noise r is computed first as the deviation in formula (4).

$$ r(x,y) = f_{0}(x,y) - \lfloor f_{0}(x,y)/\Delta\rfloor * \Delta $$

Furthermore, the outcomes of formulas (3) and (4) are employed to compute formula (5), where \(w_i\) indicates the current bit of the watermark. The final value of the coefficient is given by formula (6). Figure 3 illustrates the embedding diagram.

$$ {}u(x,y) = \left\lbrace\!\! \begin{array}{clcr} -r(x,y)+0.5\Delta,& \text{if}~ Q(x,y) = w_{i}\\ -r(x,y)+1.5\Delta,& \text{if}~ Q(x,y) \neq w_{i}, r(x,y) > 0.5\Delta\\ -r(x,y)-0.5\Delta,& \text{if}~ Q(x,y) \neq w_{i}, r(x,y) \leq 0.5\Delta\\ \end{array} \right. $$
Fig. 3

Embed diagram. The detailed diagram for the embedded binary image

$$ \grave{f_{0}}(x,y) = f_{0}(x,y) + u(x,y) $$
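Formulas (3)–(6) describe an odd/even quantization index modulation. The steps can be sketched directly in NumPy; this is a minimal illustration (function names are ours) that assumes the adjusted coefficients stay within the valid range:

```python
import numpy as np

def quantization_bit(f0, delta):
    """Formula (3): bit carried by each coefficient (parity of k)."""
    return np.floor(f0 / delta).astype(int) % 2

def embed_bits(f0, w, delta):
    """Embed watermark bits w into lowpass coefficients f0.

    Formulas (4)-(6): r is the quantization noise; u moves each
    coefficient to the center of an interval whose parity matches w.
    """
    r = f0 - np.floor(f0 / delta) * delta            # formula (4)
    q = quantization_bit(f0, delta)
    u = np.where(q == w, -r + 0.5 * delta,           # parity already correct
        np.where(r > 0.5 * delta, -r + 1.5 * delta,  # move up one interval
                 -r - 0.5 * delta))                  # move down one interval
    return f0 + u                                    # formula (6)
```

Because every embedded coefficient lands at the center of an interval with the right parity, re-applying formula (3) to the embedded coefficients recovers the watermark bits exactly in the attack-free case.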

Extraction phase

Analogous to the embedding process, the orientation of the feature region is first computed, and then the tree split approach is applied to the post-NSCT region. In the tree split process, assuming that the region is segregated into n blocks, formula (3) is applied to every block to obtain the corresponding watermark bit information; this process results in the capture of n watermark images. The segregated blocks containing no high-frequency information vote for the matching bit, and the block corresponding to the matching bit is found. The majority of votes determines the captured bit: if 0 receives the vote, the matching bit is 0; otherwise, it is 1. If every block contains high-frequency information, the vote for those bits is based on the blocks without high-frequency information at the matching position. Once a watermark image is obtained, it is scaled to four times its size, and the voting and scaling are repeated as aforementioned until the top of the tree is reached, after which the watermark is obtained. Further, the watermark is rotated anticlockwise, depending on the orientation of that region, which is termed discrete rotation.
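The voting step can be sketched as a per-bit majority over the captured copies. The function names, the boolean reliability mask, and the tie-breaking rule are our assumptions; the paper only specifies that blocks without high-frequency detail vote:

```python
import numpy as np

def vote_watermark(captures, reliable):
    """Majority-vote fusion of watermark copies from split blocks.

    captures: (n, H, W) binary arrays, one extracted copy per block.
    reliable: (n,) bool mask, True for blocks with no high-frequency
    detail (only those blocks vote, as in the tree-split scheme).
    Falls back to all blocks when none is marked reliable.
    """
    captures = np.asarray(captures)
    reliable = np.asarray(reliable)
    if not np.any(reliable):
        reliable = np.ones(len(captures), dtype=bool)
    votes = captures[reliable].mean(axis=0)
    return (votes >= 0.5).astype(int)   # ties resolve to 1 here
```

Applying this at each level of the tree, then upscaling and repeating, reproduces the bottom-up fusion described above.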

Nevertheless, when the image is tampered by rotation, the obtained square region will not be the region assimilated during the embedding process, and this results in capture distortion. Figure 4 b depicts the variation in the embedded information when the image is rotated by 50°. It is apparent from the figure that the captured image of each block differs from the segregated portions of the embedding process. In order to resolve this issue, rotation searching is used to ascertain the best angle, with each search unit set to 10°. Figure 4 c, d represents the captured position at each rotated angle. It can be noticed that, after rotation, the capture region and the embedded region look dissimilar. Moreover, when comparing two captured blocks at a time, the consistency should be greater than τ to proceed with the voting process; τ is determined by a trial-and-error method on a training image.

Fig. 4

The example of rotation search. a is the feature region without rotation and the location of the embedded watermark. b is the captured feature region with 50° rotation, in comparison with the feature region without rotation; they do not match each other. c is the captured feature region after being re-rotated by 20°, in comparison with the feature region without rotation; they do not match each other. d is the captured feature region after being re-rotated by 40°, in comparison with the feature region without rotation; they match well

When the voting process is accomplished, the watermark images are compared once again; if the angle is found to be perfect, the result is collected from the diverse regions. The extraction flow diagram presenting the procedure to obtain the final watermark is shown in Fig. 5. The error angle is segregated as shown in Fig. 6 b, e: the black portions indicate matching parts and the white portions signify unmatched parts. Based on these characteristics, morphology closing is applied to discard the segregated black portions that are matched. (Closing indicates the process of dilation (Eq. 7) followed by erosion (Eq. 8).) A represents the region for dilation, and B signifies the structuring element. Dilation sweeps B over the pixels around A as a continuum, making A larger; erosion deducts B from A, making A smaller, as illustrated in Fig. 6. Figure 6 c, d, f, g depicts the result of performing the closing operation at two different angles. The biggest black region is the connected component, and the angle with the biggest connected component is the accuracy angle. The region is searched based on the accuracy angle, and the subsequent step searches around the accuracy angle with a deviation of ±6° to obtain the best rotated angle. If the deviation from the confirmed angle is under 1°, the angle within ±6° is fixed as the accuracy angle.

$$ D(A,B) = {\bigcup_{b\in B}A+b} $$
Fig. 5

Extraction flow diagram. The flow chart for the extraction procedure to obtain the final watermarked image

Fig. 6

The examples of using morphology to determine the maximum connected component. a is the morphology diagram. b, c, and d are not tampered by rotation and are re-rotated at 0°. They are the result after doing morphology closing on the perfect angle of watermark. e, f, and g are not tampered by rotation but are re-rotated at 10°. They are the result after doing morphology closing on the error angle of watermark. The black part is the connected component. b Exclusive-or. c dilation. d erosion. e Exclusive-or. f dilation. g erosion

$$ E(A,B) = {\bigcap_{b\in -B}A+b} $$
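Equations (7) and (8) can be sketched with set-translation semantics in NumPy. This is a minimal binary-morphology illustration (function names are ours); note that np.roll wraps at the image borders, so a padded implementation would be needed for exact border handling:

```python
import numpy as np

def dilate(A, B):
    """Binary dilation, Eq. (7): union of A translated by each b in B."""
    out = np.zeros_like(A)
    for dy, dx in B:
        # np.roll translates A by the offset (wraps at borders).
        out |= np.roll(np.roll(A, dy, axis=0), dx, axis=1)
    return out

def erode(A, B):
    """Binary erosion, Eq. (8): intersection of A translated by each -b in B."""
    out = np.ones_like(A)
    for dy, dx in B:
        out &= np.roll(np.roll(A, -dy, axis=0), -dx, axis=1)
    return out

def closing(A, B):
    """Closing = dilation followed by erosion; fills small gaps in A."""
    return erode(dilate(A, B), B)

# 3x3 square structuring element as a list of (dy, dx) offsets.
B = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
```

Closing a match map with such a structuring element merges nearby black (matching) fragments, so the largest connected component, and hence the accuracy angle, stands out.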

Experimental results and discussion

In our experiments, nine images from the USC-SIPI database are deployed (Fig. 7). These images have a 512×512 resolution, and the watermark embedded in them is a 64×64 binary image, as shown in Fig. 7 j. The N feature regions for embedding are obtained once the SIFT process is over. In our experiments, the value of N is set to five, determined by a trial-and-error method on a training image; therefore, five feature regions are essential to proceed with the further process. In case of an overlap between these regions, the process is continued until five disjoint feature regions are obtained. The value of N is fixed based on the requirements of the model. If the value of N is larger, more feature regions would be available for information embedding, consequently leading to a superior chance of capturing watermarks with enhanced quality. Nevertheless, the tampering effect on the original image would be more intense if there are more regions with embedded information, and the computed peak signal-to-noise ratio (PSNR) values would be lower.

Fig. 7

Nine test images and a watermark binary image. Test images with a 512×512 resolution and the watermark embedded as a 64×64 binary image. a Lenna. b aerial. c airfield. d lake. e Goldhill. f Barbara. g Baboon. h elaine. i peppers. j ntrust

The proposed method is compared with Li [27] and Patra [7]. The performance of the proposed algorithm is evaluated on three major criteria: (1) the fidelity of the watermark under diverse tampering conditions (i.e., the robustness); (2) the distortion introduced by the information embedded in the image (PSNR); (3) the amount of capacity offered by an image. After performing the robustness experiments, the normalized Hamming similarity (NHS) is computed. The NHS formula is depicted in (9), where w(i) and \(\overline {w}(i)\) respectively denote the original watermark and the captured watermark, M represents the total number of bits in the watermark image, and ⊕ signifies the exclusive-or operation.

$$ \text{NHS} = 1-\frac{1}{M}\sum\limits_{i=0}^{M-1}\left[ w(i)\oplus\overline{w}(i)\right] $$

In order to tamper the images, the following values are set: JPEG quality factor from 10 to 90%, rotation angle from 10° to 90°, scaling from ×0.65 to ×1.75, median filter from a 3×3 mask to a 9×9 mask, and shearing of X and Y from 1 to 10%. For capturing the watermark from the N feature regions and accomplishing the NHS computation, the largest NHS value is selected as W, as illustrated in (10).

$$ W = \max(\text{NHS}_{1}, \text{NHS}_{2},..., \text{NHS}_{N}) $$
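Formulas (9) and (10) can be sketched directly, assuming binary integer watermarks (function names are ours):

```python
import numpy as np

def nhs(w, w_bar):
    """Normalized Hamming similarity, formula (9): 1 minus the fraction
    of differing bits between the original and captured watermark."""
    w, w_bar = np.asarray(w).ravel(), np.asarray(w_bar).ravel()
    M = w.size
    return 1.0 - np.sum(w ^ w_bar) / M

def best_capture(original, captures):
    """Formula (10): W is the largest NHS over the N feature regions."""
    return max(nhs(original, c) for c in captures)
```

An NHS of 1.0 means a perfect capture; a reported W above 0.885 therefore means at most 11.5% of the bits were flipped in the best region.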

For instance, considering Li's [27] approach, the lowpass sub-bands of the NSCT are embedded directly; hence, the captured watermark produces blurred texture caused by the relatively high-frequency components of the diverse texture details. In Fig. 8, the subplots a and b portray the extracted watermark of the Lenna image for Li's [27] approach and our proposed method, respectively. In our proposed method, the tree split algorithm is employed to find the regions with high-frequency information, and such regions are not embedded. Additionally, the voting mechanism is deployed in order to preserve the fidelity of the watermark, thereby resolving the high-frequency problem.

Fig. 8

The results of watermarking. The results of watermark extracted by a Li’s [27] method (Lenna). b Proposed method (Lenna). c Li’s [27] method (peppers). d Proposed method (peppers)

For resisting image tampering such as rotation and deformation, the SIFT-produced orientations are not unique. The proposed method provides a unique orientation depending on the region content, so the watermark can be positioned exactly; moreover, the notion of rotation searching and morphology is employed to determine the exact capture angle. From the computations done so far, it is apparent that our method is superior in capture quality to Li's [27] approach and can resist all the tested forms of tampering. In Fig. 9, the blue line specifies our W under diverse tampering, the red line signifies the W of Li's [27] approach, and the black line depicts the W of Patra's [7] model. For tampering with diverse JPEG quality factors, the proposed method provides superior results to those of Li [27], and the maximum W is greater than 0.885. When compared with Patra's [7] model, the proposed method underperforms while processing JPEG with a quality factor greater than 50; on the other hand, when the quality factor is less than 50, the proposed method is comparatively superior to Patra's [7] model. Further, since the notion of SIFT feature searching is deployed in the proposed method, it resists rotation and deformation better than Patra's [7].

Fig. 9

The maximum NHS comparison among various attacks for the Lenna image. Results for JPEG compression, rotation attacks, scaling attacks, median filter, and Y and X shearing percentages of the proposed approach in comparison with Li [27], Patra [7], and Duman [11]. a JPEG compression. b rotation attacks. c scaling attacks. d median filter. e shearing Y %. f shearing X %

Considering the rotation experiment, even though the proposed method is lower than Li's [27] approach at certain angles, the average W over all angles is fairly superior to that of Li's [27] model. Regarding the scaling experiment, as the feature scale of Li's [27] model is chosen from fixed regions, when the scaling factor is zoomed, it is unlikely that the feature regions obtained from the embedding are captured. In the median filter experiment, due to such watermark capture failures, the W of the proposed method is superior to Li's [27] model for diverse masks. Comparing our proposed method with Patra's [7] model on the Lenna experimental map, the proposed method yields better results. Moreover, apart from the value of W, the capacity and the embedding distortion differ: Patra [7] segregates the image into blocks, with one block as a unit for embedding the watermark; the size of the block was 8×8, so Patra [7] was able to embed 4096 bits into a 512×512 image. However, more modification of the image happens in Patra's [7] approach. In our proposed method, an amalgamation of the SIFT and NSCT approaches is utilized, in which most of the image details are unchanged and only certain regions of the image are modified; furthermore, all coefficients in a region are permitted to carry embedded bits. Additionally, as portrayed in Table 1, after the information embedding, the proposed method's watermarked image has almost 1.8 dB greater PSNR than Patra's [7], and as displayed in Table 2, the proposed method's capacity can reach 17,689 bits (in the maximum feature region, with a size of 133×133). The detailed results of the other experiments are illustrated in Tables 3, 4, 5, 6, 7, and 8.

Table 1 Image quality evaluation based on PSNR
Table 2 Image capacity evaluation
Table 3 Normalized Hamming similarity results for JPEG attack
Table 4 Normalized Hamming similarity results for rotation attack
Table 5 Normalized Hamming similarity results for scaling attack
Table 6 Normalized Hamming similarity results for median filter attack
Table 7 Normalized Hamming similarity results for Y shearing % attack
Table 8 Normalized Hamming similarity results for X shearing % attack

It can be seen from the JPEG experiment in Table 3 that the value of W for the peppers image with JPEG quality 40 is lower than in Li's [27] approach. Nevertheless, there is no disruption from high-frequency noises, and the watermark captured by the proposed method is more pleasing to human vision. As displayed in Fig. 8 c, d, these are the captured watermark images of the JPEG quality 40 tampered peppers, obtained with Li's [27] method and the proposed method, respectively. The rotation experiment in Table 4 shows that all output images of the proposed method have a better average W over angles than the other approaches. The scaling experiment in Table 5 shows that, with a progressive increase in image scaling, the W computed by Li's [27] model drops down steadily, whereas the proposed method is very robust to scaling. In the case of the image Elaine, even though the W average of the proposed method is mediocre under diverse scaling conditions, the W average of Li's [27] declines slowly from ×1.5 scaling, and the deterioration is faster for larger scaling values. Hence, in general, the proposed method is comparatively superior to Li's [27] model in the scaling experiment. The median experiment in Table 6 shows that, although the W values are inferior for the Goldhill, Barbara, and peppers images, the results are superior for the other six images. Moreover, the shearing results of the proposed method are comparable to the other approaches. Generally, for non-deforming tampering processes such as compression and blurring, the SIFT approach offers better accuracy in capturing the feature points, and the proposed model provides superior capture quality.
Even though a few feature points are lost under deforming tampering effects, the watermark can still be captured from the other feature regions that remain stable. Finally, for the majority of image tampering processes, the proposed approach preserves a strong robust capability.


Earlier, deformation was a major issue for watermarking technologies; this issue was addressed by assimilating feature searching into watermarking models. However, feature searching alone is inadequate to sustain deformation; enhancing the capacity of the watermarking scheme also requires consideration. The NSCT approach offers greater capacity for further progress in watermarking technologies, and the experimental results have shown that it yields superior capacity in comparison with DCT-based watermarking schemes. The amalgamation of both SIFT and NSCT was explored by Li [27], which offers high capacity and robust results; however, regarding the quality of the captured information, challenges remain, such as blurred texture caused by the relatively high-frequency components. Therefore, in this work, the proposed method includes the concepts of tree split, voting, rotation searching, and morphology, which resolves the issues caused by high-frequency noises in NSCT computing, thereby greatly improving the captured image quality.


  1. J Sanchez-Riera, K-L Hua, Y-S Hsiao, T Lim, SC Hidayati, W-H Cheng, A comparative study of data fusion for RGB-D based visual recognition. Pattern Recogn. Lett. 73, 1–6 (2016).

  2. W-H Cheng, C-W Wang, J-L Wu, Video adaptation for small display based on content recomposition. IEEE Trans. Circ. Syst. Video Technol. 17(1), 43–58 (2007).

  3. T Lim, K-L Hua, H-C Wang, K-W Zhao, M-C Hu, W-H Cheng, in Multimedia Signal Processing (MMSP), 2015 IEEE 17th International Workshop on. VRank: voting system on ranking model for human age estimation (IEEE, 2015), pp. 1–6.

  4. K-L Hua, C-H Hsu, SC Hidayati, W-H Cheng, Y-J Chen, Computer-aided classification of lung nodules on computed tomography images via deep learning technique. OncoTargets Ther. 8, 2015–2022 (2015).

  5. GC Langelaar, RL Lagendijk, Optimal differential energy watermarking of DCT encoded images and video. IEEE Trans. Image Process. 10(1), 148–158 (2001).

  6. TK Das, S Maitra, J Mitra, Cryptanalysis of optimal differential energy watermarking (DEW) and a modified robust scheme. IEEE Trans. Signal Process. 53(2), 768–775 (2005).

  7. JC Patra, JE Phua, D Rajan, DCT domain watermarking scheme using Chinese remainder theorem for image authentication. IEEE Int. Conf. Multimed. Expo, 111–116 (2010).

  8. H Ling, Z Lu, F Zou, Improved differential energy watermarking (IDEW) algorithm for DCT encoded image and video. IEEE Int. Conf. Signal Process. 3, 2326–2329 (2004).

  9. Z Wei, KN Ngan, Spatial just noticeable distortion profile for image in DCT domain. IEEE Int. Conf. Multimed. Expo, 925–928 (2008).

  10. Y Niu, J Liu, S Krishnan, Q Zhang, Combined just noticeable difference model guided image watermarking. IEEE Int. Conf. Multimed. Expo, 1679–1684 (2010).

  11. O Duman, O Akay, A new method of wavelet domain watermark embedding and extraction using fractional Fourier transform. IEEE Int. Conf. Electr. Electron. Eng. (ELECO), 187–191 (2011).

  12. S Ababneh, R Ansari, A Khokhar, Compressed-image authentication using compensated signature embedding. IEEE Int. Conf. Electro/Information Tech., 476–481 (2007).

  13. S Ababneh, A Khokhar, R Ansari, Compensated signature embedding based multimedia content authentication system. IEEE Int. Conf. Image Process., 393–396 (2007).

  14. S Ababneh, R Ansari, A Khokhar, Improved image authentication using closed-form compensation and spread-spectrum watermarking. IEEE Int. Conf. Acoust. Speech Signal Process., 1781–1784 (2008).

  15. H Yuan, X-P Zhang, A multiscale fragile watermark based on the Gaussian mixture model in the wavelet domain. IEEE Int. Conf. Acoust. Speech Signal Process., 413–416 (2004).

  16. H Yuan, X-P Zhang, A secret key based multiscale fragile watermark in the wavelet domain. IEEE Int. Conf. Multimed. Expo, 1333–1336 (2006).

  17. X Wang, J Wang, H Peng, A semi-fragile image watermarking resisting to JPEG compression. IEEE Int. Conf. Manag. e-Commerce e-Government, 498–502 (2009).

  18. T Huynh-The, O Banos, S Lee, Y Yoon, T Le-Tien, Improving digital image watermarking by means of optimal channel selection. Expert Syst. Appl. 62, 177–189 (2016).

  19. X-y Wang, P-p Niu, H-y Yang, C-p Wang, A-l Wang, A new robust color image watermarking using local quaternion exponent moments. Inf. Sci. 277, 731–754 (2014).

  20. AM Abdelhakim, HI Saleh, AM Nassar, Quality metric-based fitness function for robust watermarking optimisation with bees algorithm. IET Image Process. 10(3), 247–252 (2016).

  21. KC Choi, CM Pun, in 2016 13th International Conference on Computer Graphics, Imaging and Visualization (CGiV). Difference expansion based robust reversible watermarking with region filtering (IEEE, Morocco, 2016), pp. 278–282.

  22. M Moosazadeh, A Andalib, in 2016 Second International Conference on Web Research (ICWR). A new robust color digital image watermarking algorithm in DCT domain using genetic algorithm and coefficients exchange approach (University of Science and Culture, Tehran, 2016), pp. 19–24.

  23. S Kadu, C Naveen, VR Satpute, AG Keskar, in 2016 IEEE Students' Conference on Electrical, Electronics and Computer Science (SCEECS). A blind video watermarking technique for indoor video content protection using discrete wavelet transform (Bhopal, 2016), pp. 1–6.

  24. T-H Tsai, W-H Cheng, C-W You, M-C Hu, AW Tsui, H-Y Chi, Learning and recognition of on-premise signs from weakly labeled street view images. IEEE Trans. Image Process. 23(3), 1047–1059 (2014).

  25. DG Lowe, Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 60(2), 91–110 (2004).

  26. AL Cunha, J Zhou, MN Do, The nonsubsampled contourlet transform: theory, design, and application. IEEE Trans. Image Process. 15(10), 3089–3101 (2006).

  27. L Li, J Qian, J-S Pan, High capacity watermark embedding based on local invariant features. IEEE Int. Conf. Multimed. Expo, 1311–1314 (2010).

  28. J Canny, A computational approach to edge detection. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-8(6), 679–698 (1986).



This work is partly supported by the Ministry of Science and Technology of Taiwan under Grants MOST103-2221-E-011-105 and MOST104-2221-E-011-091-MY2.

Authors’ contributions

KH and KS proposed the framework of this work and drafted the manuscript; BD and YH carried out the algorithm studies, participated in the simulation, and helped draft the manuscript. VS participated in the discussion, corrected the English errors, and helped polish the manuscript. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Author information

Correspondence to Kathiravan Srinivasan.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.




  • Watermarking algorithm
  • SIFT
  • NSCT
  • Morphology
  • Mobile images