Adaptive and separable multiary reversible data hiding in encryption domain

Abstract

To keep digital information secure in cloud storage, images must be stored and processed in the encryption domain: the user does not allow the cloud provider (CP) to access the image contents, yet the CP must embed additional information in the image. Reversible data hiding in the encryption domain (RDHEI) has emerged to meet this need. This paper presents an adaptive and separable multiary RDHEI method. First, the image is divided into two parts that consist of reference pixels and non-reference pixels. Next, we compute the prediction errors of the non-reference pixels and replace the original non-reference values with the computed prediction errors. Then, the entire image is encrypted with a specially designed stream cipher strategy so that the CP can embed additional information by modifying the prediction errors without knowing the original image. In addition, although our method vacates space before encryption, the CP can adaptively control the reserved space, unlike traditional RDHEI schemes that also vacate space before encryption. Because our method is separable, the embedded message can be extracted accurately from both the encryption and plaintext domains. The experimental results demonstrate the efficacy of the proposed method.

1 Introduction

Data hiding technology can achieve the purposes of secret transmission and content authentication by embedding secret information into the host [1,2,3,4]. As a significant branch of data hiding, reversible data hiding (RDH) embeds additional data into a host signal, such as an image, video, or audio signal, and recovers both the additional data and the host without any loss. Traditional RDH can be roughly divided into four categories: (1) lossless compression (LC) [5], in which the space for data hiding is generated by LC of the image, (2) histogram shifting (HS) [6, 7], in which the embedding space is generated by shifting the gray-scale histogram of the image, (3) difference expansion (DE) [8, 9], in which the differences between nearby pixels are expanded to reserve space, and (4) prediction error expansion (PEE) [10,11,12,13,14], in which the space for embedding is generated by shifting the prediction error histogram (PEH).

All the methods mentioned above were applied in the plaintext domain. However, with the rapid development of Internet technology and cloud storage applications, a new requirement for RDH has emerged. On the one hand, many images must be uploaded to a cloud server, followed by message embedding and other post-processing procedures; on the other hand, the content of the images should not be accessible to the cloud provider (CP), especially in the case of critical and private images, such as private medical images and military images that relate to national security. In order to deal with this paradox, RDH has inevitably emerged in the encryption domain. While original images must be protected and recovered losslessly, we expect to achieve the highest embedding capacity. In order to obtain an optimal balance of security and embedding capacity, several studies have been carried out. The existing RDH in encrypted images (RDHEI) can be mainly divided into two categories: methods based on reserving room before encryption (RRBE), and methods based on vacating room after encryption (VRAE).

Among the RRBE-based methods, Ma et al. [15] proposed an RDHEI method that reserved room for hiding by embedding some pixels in the original image using a traditional RDH algorithm. Zhang et al. [16] proposed calculating the prediction error to replace the image pixels, followed by encryption with a traditional encryption algorithm so that the data hider can embed the additional data by modifying the prediction error. Cao et al. [17] provided a method that implemented a patch-level sparse representation technique to create vacated space for data hiding. Xu and Wang [18] proposed a scheme to enable a traditional PEE strategy in the plaintext domain to be effective in the encryption domain. Yi and Zhou [19] provided a binary-block embedding strategy to improve embedding capacity and visual quality in marked decrypted images. Li et al. [20] proposed a separable RDH scheme, in which encryption quality was improved by combining the permutation and stream cipher during the encryption process, and the embedding rate (ER) was increased by replacing pixels with their corresponding prediction errors.

For a VRAE-based method, Qian et al. [21] proposed a method based on progressive recovery that consisted of three agents, including the content owner, the data hider, and the recipient. In [22], the use of HS with a public-key cryptosystem was implemented to realize RDHEI. Agrawal and Kumar [23] presented a method based on additive modulo 256 and utilized a property of the mean to hide data. Singh and Raman [24] proposed an RDHEI scheme based on a Chinese remainder theorem (CRT)-based sharing scheme to transmit media information over cloud architecture and prove its original ownership. Xiao et al. [25] proposed a separable RDHEI method by utilizing pixel value ordering to embed secret data in each block in an additive homomorphic encrypted image. Liu and Pun [26] presented a redundant space transfer (RST) scheme to create redundant space for data hiding in which traditional RDH technology could be applied. Qin et al. [27] presented an RDHEI method in which owners used an analog stream cipher and block permutation to encrypt blocks of original images that did not overlap, and the data hider could classify blocks and use data hiding keys when embedding. Khelifi et al. [28] used a special encryption method for original images, and then compressed the encryption image to a bit series to create space for data hiding. Wu et al. [29] presented a method applied in homomorphic encrypted images so that some of the hidden data could be extracted in the encryption domain, and the rest could be extracted in the plaintext domain. Fu et al. [30] encrypted a host image via block scrambling and a stream cipher, followed by an adaptive compression of the most significant bit (MSB) layer to vacate space for data hiding. Qin et al. [31] scrambled an image with three different levels and vacated the embedding space by using sparse matrix coding to compress the least significant bit (LSB) of the encrypted image.

From the research mentioned above, RDHEI has developed to a certain degree. However, unlike traditional RDH in the plaintext domain, there are still many lines of research to follow. Motivated by Xu and Wang’s work [18], which was devoted to applying the traditional PEE method in the encryption domain, in this work, we propose a method for improving encryption quality and the ER. Specifically, compared with [18], the proposed method offers the following three improvements: (1) in [18], the vacated space for data hiding was determined by the image owner; in our work, the image owner needs only to encrypt the image, while the work of creating space is conducted by the data hider, thereby reducing the owner’s workload. (2) A new encryption process is proposed so that the encrypted image is less noticeable and better able to resist attacks from hackers. (3) A new method is presented to increase the ER by shifting the PEH based on the unique characteristics of the encryption domain.

This article is organized as follows: Section 2 describes the proposed method in detail, Section 3 presents the experimental results of the proposed method, and finally, Section 4 concludes this paper.

2 Methods/experimental

In this section, we propose a separable RDHEI scheme, which mainly consists of prediction and replacement, encryption, data hiding, data extraction, and image recovery. Specifically, the image owner uses our method to encrypt the image and vacate space for data hiding during a subsequent procedure, which is conducted by the CP. The CP then uses the reserved room to embed data. Without the encryption keys, the CP can hide data but cannot access the original content, so the original owner can keep the images safe and the data hider can protect its rights by embedding additional information. A flow diagram of encryption and data hiding is shown in Fig. 1.

Fig. 1 Flow diagram of encryption and data hiding

2.1 Prediction and replacement

Suppose that the original image X is an 8-bit gray-scale image with a size of M × N, and its pixels are denoted as X(i, j), 1 ≤ i ≤ M, 1 ≤ j ≤ N. First, we divide all the pixels using a checkerboard pattern into two sets, defined as the reference and non-reference pixels. In detail, the pixels in odd rows and odd columns are selected as reference pixels, and the other pixels are non-reference pixels. Figure 2 shows the division between reference and non-reference pixels, which are denoted in gray and white, respectively. To predict one non-reference pixel, we tend to use the four adjacent reference pixels for the sake of accuracy. Depending on whether there are enough reference pixels around the non-reference pixels, all non-reference pixels are then further divided into two sets. For some non-reference pixels, there is a sufficient number of reference pixels around them, and we denote them as Δ. For other non-reference pixels, due to the lack of adjacent reference pixels, we use the prediction results of their adjacent non-reference pixels as references, and we denote these pixels as Ω.

Fig. 2 Division of pixels for prediction

The prediction process is executed following an order that predicts the pixels belonging to Δ first, followed by the prediction of pixels belonging to Ω. Their corresponding prediction values P(i, j) can be computed as

$$ P\left(i,j\right)=\mathrm{round}\left(\left[X\left(i-1,j-1\right)+X\left(i+1,j-1\right)+X\left(i+1,j+1\right)+X\left(i-1,j+1\right)\right]/4\right),\mathrm{if}\ X\left(i,j\right)\in \Delta, $$
(1)

and

$$ P\left(i,j\right)=\mathrm{round}\left(\left[X\left(i-1,j\right)+X\left(i+1,j\right)+P\left(i,j-1\right)+P\left(i,j+1\right)\right]/4\right),\mathrm{if}\ X\left(i,j\right)\in \Omega, $$
(2)

Then, we compute the prediction error E(i, j) as

$$ E\left(i,j\right)=X\left(i,j\right)-P\left(i,j\right),\mathrm{if}\ X\left(i,j\right)\in \left(\Delta \cup \Omega \right). $$
(3)
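The prediction order of Eqs. (1)–(3) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses 0-based indices (so reference pixels sit at even row and column indices), assumes that "round" rounds halves up, and assumes that a missing neighbor at the image border is simply left out of the average, since the text does not specify border handling. For Ω pixels whose reference pixels lie to the left and right rather than above and below, the sketch applies the transposed form of Eq. (2), which is also an assumption. The names `predict` and `round_half_up` are hypothetical.

```python
import math

def round_half_up(x):
    # Python's built-in round() rounds halves to even; the paper's "round"
    # is assumed here to round halves up.
    return math.floor(x + 0.5)

def predict(X):
    """Return (P, E) for an image X given as a list of rows (0-based).

    P and E hold None at reference pixels (even row and even column).
    """
    M, N = len(X), len(X[0])
    P = [[None] * N for _ in range(M)]

    def mean(vals):
        return round_half_up(sum(vals) / len(vals))

    # Delta set (both indices odd): average the diagonal reference pixels, Eq. (1).
    for i in range(1, M, 2):
        for j in range(1, N, 2):
            diag = ((i - 1, j - 1), (i + 1, j - 1), (i + 1, j + 1), (i - 1, j + 1))
            P[i][j] = mean([X[a][b] for a, b in diag if 0 <= a < M and 0 <= b < N])

    # Omega set (exactly one index odd): mix reference pixels and Delta predictions, Eq. (2).
    for i in range(M):
        for j in range(N):
            if i % 2 == j % 2:
                continue  # reference pixel or Delta pixel, already handled
            vals = []
            for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if 0 <= a < M and 0 <= b < N:
                    is_reference = a % 2 == 0 and b % 2 == 0
                    vals.append(X[a][b] if is_reference else P[a][b])
            P[i][j] = mean(vals)

    # Prediction error, Eq. (3).
    E = [[None if P[i][j] is None else X[i][j] - P[i][j] for j in range(N)]
         for i in range(M)]
    return P, E
```

On a constant image every prediction error is zero, which is the behavior that makes the error histogram sharply peaked for smooth images.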

Note that in theory, the value range of E(i, j) is from − 255 to 255, which requires a nine-bit number to represent. However, according to our observation, to correctly show an image, E(i, j) generally ranges from − 127 to 127. Thus, in our work, we continue to use 8 bits to represent the prediction error. Here, we only use the least significant 7 bits to show the absolute value of errors, and the MSB to indicate the sign, i.e., “0” and “1” indicate positive and negative signs, respectively. Because the prediction error represented in our work only varies from − 127 to 127, prediction errors out of range of the set are modified as

$$ {E}^{\prime}\left(i,j\right)=\left\{\begin{array}{c}E\left(i,j\right)-127,\kern0.75em \mathrm{if}\ E\left(i,j\right)>127,\\ {}E\left(i,j\right)+127,\kern0.75em \mathrm{if}\ E\left(i,j\right)<-127,\\ {}E\left(i,j\right),\kern0.75em \mathrm{otherwise}.\end{array}\right. $$
(4)

Note that we reserve the code \( {(10000000)}_2 \) and use \( {(00000000)}_2 \) to represent the particular case E'(i, j) = 0. We record the coordinates of the prediction errors that are initially outside the range [− 127, 127] and save them as a location map, denoted as O. Finally, we replace all the non-reference pixels with their corresponding modified prediction errors E'(i, j) to generate a new image, denoted as I. It is worth pointing out that the prediction criterion in our work is not limited to Eqs. (1) and (2); any more precise prediction strategy can be used to further improve the overall performance.
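The folding of Eq. (4) and the 8-bit sign-magnitude code described above can be sketched as follows. This is a hedged illustration with hypothetical function names, not the authors' code. One observation follows directly from Eq. (4): an error of exactly ±255 folds to a magnitude of 128, which does not fit in 7 bits; the text does not discuss that corner case, so the sketch rejects it.

```python
def wrap_error(e):
    """Eq. (4): fold an error outside [-127, 127]; return (e_prime, map_bit).

    map_bit is the entry recorded in the location map O.
    """
    if e > 127:
        return e - 127, 1
    if e < -127:
        return e + 127, 1
    return e, 0

def to_sign_magnitude(e_prime):
    """Pack a folded error into 8 bits: MSB = sign (1 is negative), low 7 bits = |e|."""
    if not -127 <= e_prime <= 127:
        raise ValueError("magnitude does not fit in 7 bits")
    sign = 0x80 if e_prime < 0 else 0x00
    return sign | abs(e_prime)

def from_sign_magnitude(code):
    """Unpack an 8-bit sign-magnitude code into a signed integer."""
    magnitude = code & 0x7F
    return -magnitude if code & 0x80 else magnitude
```

For example, `wrap_error(130)` gives `(3, 1)`, and the folded value 3 is stored as the code `0x03` with its position flagged in O.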

2.2 Image encryption

First, we only shuffle E'(i, j) in I via an encryption key KEY1 to convert E'(i, j) to E''(i, j). For an easier description, we denote the reference pixels as Y(i, j). We use a stream cipher to encrypt Y(i, j) and E''(i, j) separately. Specifically, in our work, we use another encryption key, KEY2, to generate a pseudorandom matrix R of size M × N, in which each element is a 7-bit number and denoted as R(i, j). We only encrypt the lowest 7-bit plane of Y(i, j) as

$$ {Y}^{\prime}\left(i,j\right)=Y\left(i,j\right)\bigoplus R\left(i,j\right), $$
(5)

where Y'(i, j) represents the encrypted value of Y(i, j), and \( \oplus \) indicates the exclusive or (XOR) operation. Meanwhile, to encrypt E''(i, j), we need to find two peak points, Tp and Tn, in advance. E''(i, j) is then encrypted as

$$ \overset{\sim }{E}\left(i,j\right)=\left\{\begin{array}{c}{E}^{\prime \prime}\left(i,j\right),\mathrm{if}\ {E}^{\prime \prime}\left(i,j\right)={T}_{\mathrm{p}}\ \mathrm{or}\ {T}_{\mathrm{n}},\\ {}{E}^{\prime \prime}\left(i,j\right)\bigoplus R\left(i,j\right),\kern0.75em \mathrm{otherwise},\end{array}\right. $$
(6)

where \( \overset{\sim }{E}\left(i,j\right) \) represents the encrypted value of E''(i, j). It must be pointed out that if \( \overset{\sim }{E}\left(i,j\right) \) equals Tp or Tn after the XOR operation, we need to add one or subtract one to avoid confusing the original pixels with E''(i, j) = Tp or Tn, and we use a new location map O1 to record those \( \overset{\sim }{E}\left(i,j\right) \).
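The peak-preserving stream cipher of Eq. (6) and its collision map O1 can be sketched in Python as below. Several details are assumptions, because the text leaves them open: values are handled as 8-bit sign-magnitude codes, a ciphertext that collides with Tp is moved to Tp + 1 and one that collides with Tn is moved to Tn − 1, Tn < Tp, and both peaks stay within [−126, 126]. The key stream uses Python's `random` module purely for illustration; it is not a cryptographically secure generator. The KEY1 shuffle is omitted, and all function names are hypothetical.

```python
import random

def encode(v):
    """Signed value in [-127, 127] -> 8-bit sign-magnitude code."""
    return (0x80 if v < 0 else 0x00) | abs(v)

def decode(code):
    """8-bit sign-magnitude code -> signed value."""
    magnitude = code & 0x7F
    return -magnitude if code & 0x80 else magnitude

def keystream(key2, n):
    """n pseudorandom 7-bit numbers derived from key2 (illustrative, not secure)."""
    rng = random.Random(key2)
    return [rng.getrandbits(7) for _ in range(n)]

def encrypt_error(code, r, tp, tn):
    """Eq. (6): leave peak values alone, XOR everything else; return (cipher, o1_bit)."""
    peak_p, peak_n = encode(tp), encode(tn)
    if code in (peak_p, peak_n):
        return code, 0
    c = code ^ r
    if c == peak_p:                  # collision with Tp: move up by one, flag in O1
        return encode(tp + 1), 1
    if c == peak_n:                  # collision with Tn: move down by one, flag in O1
        return encode(tn - 1), 1
    return c, 0

def decrypt_error(c, r, o1_bit, tp, tn):
    """Inverse of encrypt_error, following the case split of Eq. (12)."""
    if o1_bit:
        v = decode(c)
        c = encode(v - 1 if v > tp else v + 1)   # undo the +/-1 adjustment
        return c ^ r
    if c in (encode(tp), encode(tn)):
        return c                                 # a genuine, unencrypted peak value
    return c ^ r
```

Because only non-peak values are XORed and collisions are pushed off the peaks, every value the data hider sees at Tp or Tn is a genuine peak, which is what allows histogram shifting on the encrypted errors.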

Next, we extract the MSB plane of Y'(i, j). Both the MSB plane and the location maps O and O1 are next compressed via LC coding. Here, arithmetic coding is used in our work because all the elements for compression are made up of “0”s and “1”s. We then replace the MSB of Y'(i, j) with the following parts: compression code of the original MSB plane of Y'(i, j) and its length, and compression codes of O and O1 and their lengths. We denote the replacement version of Y'(i, j) as Y''(i, j).

Finally, we combine \( \overset{\sim }{E}\left(i,j\right) \) and Y''(i, j) to generate the final encrypted image, denoted as I', and send it to the cloud server. Figure 3a, b shows the PEHs before and after the stream cipher, respectively. As can be seen in Fig. 3b, the values approximately obey a uniform distribution after the stream cipher, except for the reserved space.

Fig. 3 PEH before and after stream cipher

To explain the security of our encryption method, suppose that an image of size 512 × 512 runs through the entire encryption process. In the first part of the encryption, there are \( 512\times 512\times \frac{1}{4}=\mathrm{65,536} \) reference pixels that remain unchanged during the shuffling, and \( 512\times 512\times \frac{3}{4}=\mathrm{196,608} \) non-reference pixels that are rearranged. Therefore, there are \( {P}_{196608}^{196608} \) possible arrangements in total, which is far too large a number for exhaustive analysis. In the second part, the stream cipher, we suppose that 25% of the pixels serve as payload, and the remaining 75% of the pixels are XORed with a random 7-bit sequence produced by KEY2. As a result, there are \( 512\times 512\times 0.75\times {2}^7=\mathrm{25,165,824} \) possibilities after the stream cipher. After running through the entire encryption process, there are \( {P}_{196608}^{196608}\times \mathrm{25,165,824} \) possibilities in total that are different from the original image. It is difficult or even impossible to pick out the one original image from such a multitude of possibilities. In addition, Khelifi et al. [28] presented a mathematical security analysis of shuffling followed by a stream cipher.
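The figures quoted in this paragraph can be reproduced with a few lines of Python. The sketch only restates the arithmetic as given in the text; the variable names are hypothetical, and the size of the permutation count is reported as a number of decimal digits because the factorial itself is too large to print usefully.

```python
import math

M = N = 512
reference = M * N // 4                 # pixels in odd rows and odd columns
non_reference = M * N * 3 // 4         # pixels that are shuffled by KEY1
stream_count = (M * N * 3 // 4) * 2**7 # the stream-cipher count as stated in the text

# P_n^n = n!, so log10(n!) = lgamma(n + 1) / ln(10) gives its number of decimal digits.
permutation_digits = math.lgamma(non_reference + 1) / math.log(10)

print(reference, non_reference, stream_count)
print(f"196608! has roughly {permutation_digits:,.0f} decimal digits")
```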

2.3 Data hiding

After receiving I', the CP cannot access the original image because it lacks KEY1 and KEY2. However, it can embed data in \( \overset{\sim }{E}\left(i,j\right) \) without any knowledge of the original image. In this paper, owing to the elaborately designed encryption method, applying PEE in the encryption domain becomes feasible. The three steps of our data hiding strategy are listed in detail as follows.

Step 1: The data hider selects the multiary parameter Q, which is suggested to be set to 2, 3, or 4 according to the actual embedding requirement. The reason for this recommendation is that the histogram must be shifted during the data embedding process, which inevitably causes underflow/overflow issues. In our method, we use a location map to deal with these issues. If Q is set higher than 4, the size of the location map increases correspondingly, and the embedding capacity might be limited by the space available in the bit plane of the reference pixels. Before embedding, to increase the security of the proposed method, a new key KEY3 is used to shuffle the to-be-embedded message and convert it to the corresponding Q-ary format. Note that our method belongs to RRBE, and if we encrypt the message before the embedding phase, all we need to do is extract the encrypted message and then decrypt it in a reverse manner. The peak points Tp and Tn are then located in the histogram of \( \overset{\sim }{E}\left(i,j\right) \).

Step 2: Shift the histogram of \( \overset{\sim }{E}\left(i,j\right) \) to vacate space for data hiding as

$$ {\overset{\sim }{E}}^{\prime}\left(i,j\right)=\left\{\begin{array}{c}\overset{\sim }{E}\left(i,j\right)+Q-1,\mathrm{if}\ \overset{\sim }{E}\left(i,j\right)>{T}_{\mathrm{p}},\\ {}\overset{\sim }{E}\left(i,j\right)-Q+1,\kern0.75em \mathrm{if}\ \overset{\sim }{E}\left(i,j\right)<{T}_{\mathrm{n}},\\ {}\overset{\sim }{E}\left(i,j\right),\kern0.75em \mathrm{otherwise}.\end{array}\right. $$
(7)

where \( {\overset{\sim }{E}}^{\prime}\left(i,j\right) \) stands for the shifted value for data hiding. Note that a new location map O2 is needed to record \( {\overset{\sim }{E}}^{\prime}\left(i,j\right) \) with underflow/overflow issues. O2 can then be compressed and embedded into Y''(i, j) via an MSB replacement, similar to O and O1.

Step 3: Embed the message as

$$ {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)=\left\{\begin{array}{c}{\overset{\sim }{E}}^{\prime}\left(i,j\right)+W,\kern0.75em \mathrm{if}\ {\overset{\sim }{E}}^{\prime}\left(i,j\right)={T}_{\mathrm{p}},\\ {}{\overset{\sim }{E}}^{\prime}\left(i,j\right)-W,\kern0.75em \mathrm{if}\ {\overset{\sim }{E}}^{\prime}\left(i,j\right)={T}_{\mathrm{n}},\\ {}{\overset{\sim }{E}}^{\prime}\left(i,j\right),\kern0.75em \mathrm{otherwise},\end{array}\right. $$
(8)

where \( {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right) \) denotes the final value after data hiding, and W stands for the Q-ary message symbol. Finally, the marked encrypted image is composed of Y''(i, j) and \( {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right) \).
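Steps 2 and 3 can be condensed into a short Python sketch that applies Eq. (7) and Eq. (8) in one pass over a list of encrypted prediction errors. It is a simplified illustration under stated assumptions: values are plain signed integers, Tn < Tp, overflow/underflow handling via O2 is omitted, and peak positions left over after the message ends are padded with the symbol 0. The name `shift_and_embed` is hypothetical.

```python
def shift_and_embed(errors, symbols, q, tp, tn):
    """Shift the histogram (Eq. 7) and embed Q-ary symbols at the peaks (Eq. 8)."""
    pending = iter(symbols)
    marked = []
    for e in errors:
        if e > tp:
            marked.append(e + q - 1)            # shift right to vacate Tp+1 .. Tp+Q-1
        elif e < tn:
            marked.append(e - q + 1)            # shift left to vacate Tn-Q+1 .. Tn-1
        elif e == tp:
            marked.append(e + next(pending, 0)) # embed one symbol W in [0, Q-1]
        elif e == tn:
            marked.append(e - next(pending, 0)) # embed one symbol W in [0, Q-1]
        else:
            marked.append(e)                    # strictly between Tn and Tp
    return marked

# Example with Q = 3 and peaks Tp = 1, Tn = 0:
print(shift_and_embed([0, 1, 2, -1, 1, 0, 3], [2, 1, 0, 2], q=3, tp=1, tn=0))
# -> [-2, 2, 4, -3, 1, -2, 5]
```

Testing the unshifted value against Tp and Tn is equivalent to the test in Eq. (8), because after Eq. (7) only values that were originally at a peak can still equal Tp or Tn.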

For ease of understanding, Fig. 4a, b shows two simple examples of the encryption and data embedding processes. In Fig. 4a, we take a 3 × 3 block that has gone through the prediction process, in which the pixels valued “1” and “0” are chosen to carry additional data. We first apply KEY1 to shuffle the prediction errors, and then XOR the block with a random matrix generated by KEY2. Finally, we obtain an encrypted version of this block. Figure 4b illustrates the data hiding process when Q is set to 2, 3, or 4. Pixels with values other than 1 and 0 are shifted by (Q − 1) to reserve space for data hiding, as denoted by the dashed lines in Fig. 4b. Moreover, the pixels valued “1” and “0” are modified to carry additional Q-ary message data, which have been shuffled by KEY3 in advance to guarantee data security, as shown in Fig. 4b by the solid lines.
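The message preparation mentioned above (shuffling with KEY3 and conversion to Q-ary symbols) can be sketched as follows. The text does not specify either operation in detail, so everything here is an assumption made for illustration: the message is a list of bits, the shuffle is a permutation seeded by KEY3 through Python's `random` module (not a cryptographically secure choice), and the conversion treats the shuffled bits as one large integer written in base Q. The receiver is assumed to know the message length in bits. All function names are hypothetical.

```python
import random

def shuffle(seq, key):
    """Permute seq with a permutation derived from key."""
    order = list(range(len(seq)))
    random.Random(key).shuffle(order)
    return [seq[i] for i in order]

def unshuffle(seq, key):
    """Invert shuffle() by regenerating the same permutation from key."""
    order = list(range(len(seq)))
    random.Random(key).shuffle(order)
    restored = [None] * len(seq)
    for position, source in enumerate(order):
        restored[source] = seq[position]
    return restored

def message_to_symbols(bits, q, key3):
    """Shuffle the bits with KEY3, then rewrite them as base-q symbols."""
    mixed = shuffle(bits, key3)
    n = int("".join(map(str, mixed)), 2) if mixed else 0
    symbols = []
    while n:
        n, digit = divmod(n, q)
        symbols.append(digit)
    return symbols[::-1]

def symbols_to_message(symbols, q, key3, nbits):
    """Inverse of message_to_symbols; nbits is the original message length."""
    n = 0
    for digit in symbols:
        n = n * q + digit
    mixed = [(n >> (nbits - 1 - k)) & 1 for k in range(nbits)]
    return unshuffle(mixed, key3)
```

With Q = 4, a 12-bit message needs at most six symbols, so each peak pixel carries up to two bits.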

Fig. 4 Examples of encryption and data hiding

2.4 Data extraction and image recovery

Because the proposed scheme is separable, the restore process can be divided into two cases, which are described as follows.

Case #1: Data extraction before image recovery

To extract the data correctly, Q, Tp, and Tn are essential, and all these parameters can be observed from the histogram of \( {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right) \). In conjunction with O2, which can be extracted from Y''(i, j), we can remove the pixels with overflow/underflow issues that may influence the data extraction. Then, W can be extracted as

$$ W=\left\{\begin{array}{c}{\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)-{T}_{\mathrm{p}},\mathrm{if}\ {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)\in \left[{T}_{\mathrm{p}},{T}_{\mathrm{p}}+Q-1\right],\\ {}{T}_{\mathrm{n}}-{\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right),\mathrm{if}\ {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)\in \left[{T}_{\mathrm{n}}-Q+1,{T}_{\mathrm{n}}\right].\end{array}\right. $$
(9)

Finally, we use KEY3 to restore the original message data.
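Eq. (9) translates almost directly into code. The sketch below is a hedged illustration with a hypothetical function name; it works on plain signed integers, assumes Tn < Tp so that the two extraction intervals cannot overlap, and assumes that positions flagged in O2 have already been removed from the input.

```python
def extract_symbols(marked, q, tp, tn):
    """Eq. (9): read one Q-ary symbol from every value inside an embedding interval."""
    symbols = []
    for e in marked:
        if tp <= e <= tp + q - 1:
            symbols.append(e - tp)
        elif tn - q + 1 <= e <= tn:
            symbols.append(tn - e)
        # values outside both intervals were only shifted and carry no data
    return symbols

# Marked values produced with Q = 3, Tp = 1, Tn = 0:
print(extract_symbols([-2, 2, 4, -3, 1, -2, 5], q=3, tp=1, tn=0))
# -> [2, 1, 0, 2]
```

Because the extraction needs only Q, Tp, Tn, and O2, it can run in the encryption domain without KEY1 or KEY2.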

After data extraction, the recovery process can be divided into four steps as follows.

Step 1: Recover the histogram to the condition before embedding, as

$$ \overset{\sim }{E}\left(i,j\right)=\left\{\begin{array}{c}{\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right),\kern0.5em \mathrm{if}\ {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)\in \left({T}_{\mathrm{n}},{T}_{\mathrm{p}}\right),\\ {}{T}_{\mathrm{p}},\kern0.5em \mathrm{if}\ {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)\in \left[{T}_{\mathrm{p}},{T}_{\mathrm{p}}+Q-1\right],\\ {}{T}_{\mathrm{n}},\kern0.5em \mathrm{if}\ {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)\in \left[{T}_{\mathrm{n}}-Q+1,{T}_{\mathrm{n}}\right],\\ {}{\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)+Q-1,\kern0.5em \mathrm{if}\ {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)<{T}_{\mathrm{n}}-Q+1,\\ {}{\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right)-Q+1,\kern0.5em \mathrm{otherwise}.\end{array}\right. $$
(10)

Step 2: Extract the compressed codes of O, O1, and the original MSB plane of Y'(i, j) from the MSB plane of Y''(i, j), and recover them by using the corresponding decompression method. Next, we replace the MSB plane of Y''(i, j) with the MSB plane of Y'(i, j) to recover Y'(i, j). The original reference pixels can then be recovered as

$$ Y\left(i,j\right)={Y}^{\prime}\left(i,j\right)\oplus R\left(i,j\right). $$
(11)

Step 3: Increase or decrease \( \overset{\sim }{E}\left(i,j\right) \) by one if its corresponding element in O1 is equal to one. We then process \( \overset{\sim }{E}\left(i,j\right) \) as

$$ {E}^{\prime \prime}\left(i,j\right)=\left\{\begin{array}{c}\overset{\sim }{E}\left(i,j\right),\kern0.75em \mathrm{if}\ \left(\overset{\sim }{E}\left(i,j\right)={T}_{\mathrm{n}}\ \mathrm{or}\ {T}_{\mathrm{p}}\right)\ \mathrm{and}\ \left({O}_1\left(i,j\right)=0\right),\\ {}\overset{\sim }{E}\left(i,j\right)\oplus R\left(i,j\right),\kern0.75em \mathrm{if}\ \left(\overset{\sim }{E}\left(i,j\right)={T}_{\mathrm{n}}\ \mathrm{or}\ {T}_{\mathrm{p}}\right)\ \mathrm{and}\ \left({O}_1\left(i,j\right)=1\right),\\ {}\overset{\sim }{E}\left(i,j\right)\oplus R\left(i,j\right),\kern0.75em \mathrm{otherwise},\end{array}\right. $$
(12)

where O1(i, j) is an element in O1. After reversing the stream cipher, we use KEY1 to undo the shuffling and restore E'(i, j).

Step 4: Modify E'(i, j) according to O as

$$ E\left(i,j\right)=\left\{\begin{array}{c}{E}^{\prime}\left(i,j\right)-127,\kern0.75em \mathrm{if}\ \left({E}^{\prime}\left(i,j\right)<0\right)\ \mathrm{and}\ \left(O\left(i,j\right)=1\right),\\ {}{E}^{\prime}\left(i,j\right)+127,\kern0.75em \mathrm{if}\ \left({E}^{\prime}\left(i,j\right)>0\right)\ \mathrm{and}\ \left(O\left(i,j\right)=1\right),\\ {}{E}^{\prime}\left(i,j\right),\kern0.75em \mathrm{otherwise},\end{array}\right. $$
(13)

where O(i, j) is an element in O. Finally, according to Section 2.1, we compute the prediction values P(i, j) and add them to the corresponding E(i, j) so that we can recover the original image without any error.
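Two of the recovery steps are simple enough to sketch directly: the histogram restoration of Eq. (10) and the unfolding of Eq. (13). The sketch below uses plain signed integers and hypothetical function names, assumes Tn < Tp, and leaves out the decompression of the location maps, the stream cipher reversal of Eq. (12), and the KEY1 unshuffle.

```python
def restore_histogram(marked, q, tp, tn):
    """Eq. (10): undo the embedding and the histogram shift."""
    restored = []
    for e in marked:
        if tn < e < tp:
            restored.append(e)            # never modified
        elif tp <= e <= tp + q - 1:
            restored.append(tp)           # carried a symbol on the positive side
        elif tn - q + 1 <= e <= tn:
            restored.append(tn)           # carried a symbol on the negative side
        elif e < tn - q + 1:
            restored.append(e + q - 1)    # was shifted left
        else:
            restored.append(e - q + 1)    # was shifted right
    return restored

def unwrap_error(e_prime, map_bit):
    """Eq. (13): undo the folding of Eq. (4) where the location map O is set."""
    if map_bit and e_prime < 0:
        return e_prime - 127
    if map_bit and e_prime > 0:
        return e_prime + 127
    return e_prime

print(restore_histogram([-2, 2, 4, -3, 1, -2, 5], q=3, tp=1, tn=0))
# -> [0, 1, 2, -1, 1, 0, 3]
```

Once E(i, j) is recovered, adding it to the prediction P(i, j) computed from the recovered reference pixels yields the original pixel value.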

Case #2: Data extraction after image recovery

First, we implement the method mentioned in Case #1 to create an image that includes the additional message. It is worth noting that we should create a new location map O3 to mark the pixels of \( {\overset{\sim }{E}}^{{\prime\prime}}\left(i,j\right) \) that include the message and keep them unchanged during the entire recovery process. Next, referring to Section 2.1, we compute the prediction errors for the non-reference pixels and use KEY1 to re-shuffle them. Then, we can use O3 to identify which prediction errors carry message data and extract the message W according to Eq. (9). Finally, the correct message data can be restored via KEY3.

3 Results and discussion

In this section, we present a series of experiments implemented to demonstrate the effect of the proposed method. The standard images we used in our experiments were Lena, Couple, Boat, Man, Peppers, and Baboon, which are shown in Fig. 5. The size of all test images was 512×512. All the experiments were implemented on a personal computer with an Intel® Core™ i7-4720HQ, GTX 960M and 12 GB RAM.

Fig. 5 Six standard test images

3.1 Encryption security

Security is an important criterion for evaluating an RDHEI method. Because the owner wants only the intended recipient to perceive the information in the encrypted image, we must guarantee the imperceptibility of the encrypted image. In this section, to demonstrate encryption security, we compare our encryption results with Xu and Wang’s work [18] and Li et al.’s work [20] in the following two aspects.

First, a comparison was carried out via subjective visual effects. Figure 6 shows the encryption results of our work on six test images. It can be observed in Fig. 6 that we perceive almost no useful information from the encrypted images. Figures 7 and 8 show the encryption results of [18, 20], respectively. As can be observed in Fig. 7, there are still some contour lines of the original images in the encryption domain, which is a fatal weakness for [18]. It should be pointed out that although the imperceptibility of the encryption images in [20] is similar to ours in Fig. 8, the shuffling strategy applied in [20] was based on block shuffling, and it is more likely that attackers can perceive the contours via a mathematical analysis of an image encrypted with block shuffling than one encrypted with our proposed method, which uses a shuffling strategy based on pixels.

Fig. 6 Encryption results of our work

Fig. 7 Encryption results of Xu and Wang’s work [18]

Fig. 8 Encryption results of Li et al.’s work [20]

In addition, we used the peak signal-to-noise ratio (PSNR) as an objective evaluation index to demonstrate the imperceptibility of the encryption result; for an encrypted image, a lower PSNR means that it deviates further from the original. Table 1 lists the PSNR results for the three methods. As can be observed from Table 1, the PSNR values from [20] indicate a performance similar to the method from [18]; however, the PSNR results of our work are approximately 75% of those of the other two methods. From comparisons of the above two aspects, our method performs the best in terms of security.
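For reference, the standard PSNR definition for 8-bit images, \( 10{\log}_{10}\left({255}^2/\mathrm{MSE}\right) \), can be written as a few lines of Python. The sketch assumes two gray-scale images of equal size stored as lists of rows; the function name `psnr` is hypothetical, and the paper does not state which implementation was used for Table 1.

```python
import math

def psnr(image_a, image_b, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized gray-scale images."""
    count = len(image_a) * len(image_a[0])
    mse = sum((a - b) ** 2
              for row_a, row_b in zip(image_a, image_b)
              for a, b in zip(row_a, row_b)) / count
    if mse == 0:
        return math.inf          # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

When the function compares an original image with its encrypted version, smaller values indicate stronger visual scrambling; when it compares an original image with a marked decrypted image, larger values indicate less embedding distortion.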

Table 1 PSNR comparisons among Xu and Wang’s work [18], Li et al.’s work [20], and ours

In addition to a comparison of the security level, we also compared the computing complexity among the three methods, and we used operation time as a metric to evaluate the complexity, as is shown in Table 2. From Table 2, we can find that the method from [18] spent approximately 0.3 s on each test image to complete the encryption process, which is nearly half the time of the method from [20] and the proposed method. However, according to the analysis of the security level, the method in [18] leaves the contour lines in the encryption version of all test images. Moreover, the average operation times for the proposed method and [20] are at the same level. To summarize, our method strikes a satisfactory balance between security and computational complexity.

Table 2 Operation time comparisons among Xu and Wang’s work [18], Li et al.’s work [20], and ours. (unit: s)

3.2 Embedding capacity

The embedding capacity is another significant indicator that identifies whether an RDHEI method is efficient. In order to justify the embedding efficiency of the proposed method, we added three more test images, i.e., Barbara, Lake, and Airplane, to our experiments, and we chose three different Q values, i.e., Q = 2, 3, 4, to compute the maximum ER for each image with each Q. Figure 9 shows the ER performance of the proposed method on nine test images.

Fig. 9 ERs for nine standard pictures

It can be observed from Fig. 9 that the capacity of our method mainly depends on the prediction accuracy and the parameter Q. Specifically, with the increase in Q, the ER also increases. Except for Baboon, the ERs of all other images are higher than 0.2 bits per pixel (bpp), which represents a relatively satisfying result for RDHEI, when Q is set to 4. As our method depends on the prediction to reserve space for data hiding, images with a complex texture, such as Baboon, which has neighboring pixels that vary greatly from one another, cannot be predicted precisely with the existing prediction algorithms so that limited space can be reserved.
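The dependence of the ER on Q can be made concrete with a rough estimate: each pixel at a peak point carries one Q-ary symbol, that is, \( {\log}_2Q \) bits. The Python sketch below encodes this estimate; it ignores the overhead of the location maps, and the function name and arguments are hypothetical. As a consistency check, it scales the average ER of 0.140 bpp reported for Q = 2 in Section 3.3 and compares the result with the averages reported there for Q = 3 and Q = 4.

```python
import math

def embedding_rate(num_peak_pixels, q, num_pixels):
    """Estimated ER in bits per pixel: one Q-ary symbol (log2 Q bits) per peak pixel."""
    return num_peak_pixels * math.log2(q) / num_pixels

# With Q = 2 each peak pixel carries 1 bit, so the reported 0.140 bpp corresponds
# to peak pixels covering about 14% of the image (a back-of-envelope reading).
er_q2 = 0.140
for q, reported in ((3, 0.222), (4, 0.280)):
    estimate = er_q2 * math.log2(q)
    print(f"Q = {q}: estimated {estimate:.3f} bpp, reported {reported:.3f} bpp")
```

The estimates agree with the reported averages to three decimal places, which is consistent with capacity growing with \( {\log}_2Q \) for a fixed number of peak pixels.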

For a comprehensive evaluation of the ER of the proposed method, we tested our method on the uncompressed color image database (UCID) [32] with the setting Q = 4. Figure 10 shows the ER for each image, and the average ER for the UCID images is 0.433 bpp, which is satisfactory for most embedding applications.

Fig. 10 ERs for UCID images

3.3 Comparison and discussion

In this section, we compare the ERs and PSNRs of the recovered images that include additional messages among four methods. Note that, in this section, we also consider the method from [33], which is a further development of [18].

First, we compared the PSNRs and maximum ERs among the four methods with different parameter settings for each method. The corresponding results are shown in Table 3, where “n/a” stands for “not available” for the current parameter setting. From Table 3, for the method from [20], the average values of PSNRs and ERs are 50.07 dB and 0.394 bpp, respectively, when T is set to 1. For the method from [33], when β is set to 0, the average PSNR and ER are 29.24 dB and 0.184 bpp, respectively; when β is set to 1, the average PSNR and ER are 28.32 dB and 0.322 bpp, respectively. For the method from [18], when Tn is set to −1 and Tp is set to 0, the average PSNR and ER are 49.87 dB and 0.140 bpp, respectively; when Tn is set to −1 and Tp is set to 1, the average PSNR and ER are 45.97 dB and 0.207 bpp, respectively; when Tn is set to −2 and Tp is set to 1, the average PSNR and ER are 44.62 dB and 0.257 bpp, respectively. For our method, the average values of PSNRs and ERs are 59.76 dB and 0.140 bpp when Q is set to 2, 54.54 dB and 0.222 bpp when Q is set to 3, and 51.32 dB and 0.280 bpp when Q is set to 4. Note that for a fair comparison, five test images, not including Baboon, are used to compute the average PSNRs and ERs with different parameters for all four methods, because the additional message could not be embedded in Baboon in the method from [20].

Comparing the average values between the proposed method and that from [20], the embedding performance of the method from [20] is better than the proposed method when Q is set to 2 or 3, and the embedding performance of the proposed method is close to the method from [20] when Q is set to 4. However, it is worth pointing out that the method proposed in [20] is not entirely separable, and the data can be extracted correctly only from the encryption domain. In contrast, the proposed method is fully separable, and the data can be extracted correctly not only from the encryption domain but also from the plaintext domain. Comparing the average values between the proposed method and the method in [33], we find that the average ER of the method from [33] is slightly higher than the ER of our method. However, as the cost of this higher ER, the average PSNR of [33] is significantly lower than that of the other methods. By comparing the average values between the proposed method and [18], it can be observed that the embedding performances of both methods are similar at a low ER. However, as the parameters increase, the ER of the proposed method grows faster than that of the method from [18] at a similar PSNR level.

Table 3 Comparison among four methods in terms of PSNR and maximum ER
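
For reference, the two metrics reported in Table 3 can be sketched with their conventional definitions. The following minimal Python example assumes 8-bit gray-scale images; the function names `psnr` and `embedding_rate` and the toy 4 × 4 image are illustrative assumptions and are not taken from the original experiments.

```python
import numpy as np


def psnr(original, decrypted, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two same-sized images."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(decrypted, dtype=np.float64)
    mse = np.mean((a - b) ** 2)  # mean squared error over all pixels
    if mse == 0:
        return float("inf")  # identical images: PSNR is unbounded
    return 10.0 * np.log10(peak ** 2 / mse)


def embedding_rate(payload_bits, image_shape):
    """Embedding rate in bits per pixel (bpp)."""
    height, width = image_shape[:2]
    return payload_bits / (height * width)


# Toy data (assumption): a flat 4 x 4 image with one pixel changed by 1.
cover = np.full((4, 4), 100, dtype=np.uint8)
marked = cover.copy()
marked[0, 0] += 1

print("PSNR (dB):", psnr(cover, marked))
print("ER (bpp):", embedding_rate(4, cover.shape))
```

Under these definitions, a higher PSNR indicates less distortion in the directly decrypted image, and a higher ER indicates more embedded bits per pixel, which is the trade-off summarized in Table 3.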

Because PSNR alone is not sufficient to characterize the visual quality of a decrypted image, we also evaluated the methods with another metric, SSIM, which measures the structural similarity between a decrypted image and the corresponding original image. The experimental results are shown in Table 4. For [20], the average SSIM is 0.9999 when T is set to 1. For [33], the average SSIM is 0.9830 when β is set to 0 and 0.9787 when β is set to 1. For [18], the average SSIM is 0.9999 when Tn and Tp are set to −1 and 0, respectively; 0.9998 when Tn and Tp are set to −1 and 1, respectively; and 0.9996 when Tn and Tp are set to −2 and 1, respectively. For the proposed method, the average SSIM is 0.9999 whether Q is set to 2, 3, or 4. As before, for a fair comparison, five test images, excluding Baboon, were used to compute the average values. The results in Table 4 show that the methods from [18] and [20] and the proposed method reach the same level of SSIM and outperform the method from [33].

Table 4 Comparisons among four methods in terms of SSIM
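
To make the SSIM metric concrete, the following Python sketch evaluates the SSIM formula once over the whole image. This is a simplification: SSIM is usually computed over local windows and then averaged, and the text does not state which implementation was used in the experiments. The function name `global_ssim` and the toy data are assumptions; the constants K1 = 0.01 and K2 = 0.03 are the values commonly used with SSIM.

```python
import numpy as np


def global_ssim(x, y, peak=255.0, k1=0.01, k2=0.03):
    """Single-window (global) SSIM between two same-sized images.

    Simplified variant: statistics are taken over the whole image
    instead of over sliding local windows.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    c1 = (k1 * peak) ** 2  # stabilizes the luminance term
    c2 = (k2 * peak) ** 2  # stabilizes the contrast/structure term
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Toy data (assumption): a random image and a copy with small changes.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
noisy = np.clip(image + rng.integers(-2, 3, size=image.shape), 0, 255)

print("SSIM(image, image):", global_ssim(image, image))
print("SSIM(image, noisy):", global_ssim(image, noisy))
```

Values close to 1, such as those reported in Table 4, indicate that the decrypted image is structurally almost indistinguishable from the original.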

We also tested the four methods on UCID; the average values of SSIM, PSNR, and ER for the different methods are shown in Table 5. Note that the parameters T, β, Tn, Tp, and Q were set to 1, 1, −2, 1, and 4, respectively. As can be observed in Table 5, our method achieves the highest average SSIM and PSNR, while the method from [20] achieves the highest average ER.

Table 5 Comparison among four methods in terms of SSIM, PSNR, and ER for UCID

Next, we compared the four methods in terms of ER versus PSNR; the results are illustrated in Fig. 11. It is evident from Fig. 11 that all the PSNR curves of the proposed method lie above those of the other three methods at any ER value. It is worth pointing out that the methods in [18] and [33] and the proposed method can generate a directly decrypted image that retains all the embedded data, whereas the method in [20] cannot keep the embedded data in the directly decrypted image.

Fig. 11 Performance comparisons among the methods from [18], [20], and [33] and the proposed method in terms of ER versus PSNR

In addition to comparing the capacity and visual quality of the four methods, we also compared their computational complexity during the embedding process. We used operation time as the metric for computational complexity and set T, β, Tn, Tp, and Q to 1, 1, −2, 1, and 4, respectively. The results are shown in Table 6. The average operation times for the methods from [18], [20], and [33] and the proposed method are 0.0714 s, 0.4070 s, 0.3220 s, and 0.1226 s, respectively. From Table 6, the method from [20] consumes the most time because its embedding strategy is based on bit operations. Moreover, the operation time of the method from [33] is also higher than those of the method from [18] and the proposed method, because the embedding strategy in the method from [33] is based on modulo operations. Both the method from [18] and the proposed method are efficient, with average embedding times lower than 0.2000 s. Note that our method requires more time than the method from [18] because of the overflow/underflow processing involved in our embedding process.

Table 6 Operation time comparisons among four methods during the embedding process (unit: s)
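
An average operation time of this kind can be measured with a simple wall-clock harness. The sketch below is only an illustration of that measurement under stated assumptions: `average_time` is a hypothetical helper, and `placeholder_embed` is a stand-in routine that is not the embedding procedure of any of the compared methods.

```python
import time


def average_time(func, *args, repeats=10):
    """Average wall-clock time (s) of func(*args) over several runs."""
    start = time.perf_counter()
    for _ in range(repeats):
        func(*args)
    return (time.perf_counter() - start) / repeats


def placeholder_embed(pixels, bits):
    """Stand-in for an embedding routine (assumption, not the paper's)."""
    return [p ^ b for p, b in zip(pixels, bits)]


# Toy data (assumption): 512 * 512 pixel values and one bit per pixel.
pixels = [i % 256 for i in range(512 * 512)]
bits = [i % 2 for i in range(512 * 512)]

print("average time (s):", average_time(placeholder_embed, pixels, bits))
```

Averaging over repeated runs reduces the influence of timer resolution and background load, although absolute times still depend on the hardware and software environment.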

4 Conclusions

In this study, we developed a separable multiary RDHEI method that involves prediction and replacement, image encryption, data hiding, data extraction, and recovery of the original image. A stream cipher and shuffling are used to encrypt the original image so that a high security level can be obtained. The data hider can embed additional data into the image without knowing the original image because space has already been vacated for the data hider. It is worth pointing out that, in our work, the data hider is also capable of adaptively modifying the reserved space according to the actual embedding demand. After acquiring the encrypted image with the additional data, the receiver can extract the entire additional message from either the encryption domain or the plaintext domain because our scheme is entirely separable. It should be noted that our method has two limitations: (1) the amount of overflow/underflow data increases dramatically with an increase in Q, and (2) the ER of our method relies heavily on the accuracy of the predictions. Both issues will be addressed in our future research.

Availability of data and materials

Not applicable.

Abbreviations

LSB: Least significant bit
CP: Cloud provider
RDHEI: Reversible data hiding in encryption domain
RDH: Reversible data hiding
LC: Lossless compression
HS: Histogram shifting
DE: Difference expansion
PEE: Prediction error expansion
PEH: Prediction error histogram
RRBE: Reserving a room before encryption
VRAE: Vacating space after encryption
ER: Embedding rate
CRT: Chinese remainder theorem
RST: Redundant space transfer
RLC: Run-length coding
MSB: Most significant bit
XOR: Exclusive or
PSNR: Peak signal-to-noise ratio
bpp: Bits per pixel

References

  1. S. Li, X. Zhang, Toward construction based data hiding: from secrets to fingerprint images. IEEE Trans on Image Process 28(3), 1482–1497 (2019)

  2. J. Tao, S. Li, X. Zhang, Z. Wang, Robust image steganography. IEEE Trans Circuits Syst Video Technol 29(2), 594–600 (2019)

  3. C. Qin, Q. Zhou, F. Cao, J. Dong, X. Zhang, Flexible lossy compression for selective encrypted image with image inpainting. IEEE Trans Circuits Syst Video Technol 29(11), 3341–3355 (2019)

  4. C. Qin, P. Ji, C.C. Chang, J. Dong, X. Sun, Non-uniform watermark sharing based on optimal iterative BTC for image tampering recovery. IEEE Multimedia 25(3), 36–48 (2018)

  5. M.U. Celik, G. Sharma, A.M. Tekalp, E. Saber, Lossless generalized-LSB data embedding. IEEE Trans Image Process 14(2), 253–266 (2005)

  6. Z.C. Ni, Y.Q. Shi, N. Ansari, W. Su, Reversible data hiding. IEEE Trans Circuits Syst Video Technol 16(3), 354–362 (2006)

  7. C. Qin, C.C. Chang, Y.H. Huang, L.-T. Liao, An inpainting-assisted reversible steganographic scheme using a histogram shifting mechanism. IEEE Trans Circuits Syst Video Technol 23(7), 1109–1118 (2012)

  8. J. Tian, Reversible data embedding using a difference expansion. IEEE Trans Circuits Syst Video Technol 13(8), 890–896 (2003)

  9. A.M. Alattar, Reversible watermark using the difference expansion of a generalized integer transform. IEEE Trans Image Process 13(8), 1147–1156 (2004)

  10. V. Sachnev, H.J. Kim, J. Nam, S. Suresh, Y.Q. Shi, Reversible watermarking algorithm using sorting and prediction. IEEE Trans Circuits Syst Video Technol 19(7), 989–999 (2009)

  11. Y. Hu, H.K. Lee, J. Li, DE-based reversible data hiding with improved overflow location map. IEEE Trans Circuits Syst Video Technol 19(2), 250–260 (2008)

  12. D.M. Thodi, J.J. Rodríguez, Expansion embedding techniques for reversible watermarking. IEEE Trans. Image Process. 16(3), 721–730 (2007)

  13. H. Yao, C. Qin, Z. Tang, Y. Tian, Guided filtering based color image reversible data hiding. J Vis Commun Image R 43, 152–163 (2017)

  14. H. Yao, X. Liu, Z. Tang, Y.C. Hu, C. Qin, An improved image camouflage technique using color difference channel transformation and optimal prediction-error expansion. IEEE Access 6, 40569–40584 (2018)

  15. K. Ma, W. Zhang, X. Zhao, N. Yu, F. Li, Reversible data hiding in encrypted images by reserving room before encryption. IEEE Trans Inf Forensics Security 8(3), 553–562 (2013)

  16. W.M. Zhang, K. Ma, N.H. Yu, Reversibility improved data hiding in encrypted images. Signal Process. 94, 118–127 (2014)

  17. X.C. Cao, L. Du, X.X. Wei, D. Meng, X.J. Guo, High capacity reversible data hiding in encrypted images by patch-level sparse representation. IEEE Trans Cybern 46(5), 1132–1143 (2016)

  18. D.W. Xu, R.D. Wang, Separable and error-free reversible data hiding in encrypted images. Signal Process. 123, 9–21 (2016)

  19. S. Yi, Y.C. Zhou, Binary-block embedding for reversible data hiding in encrypted images. Signal Process. 133, 40–51 (2017)

  20. Q. Li, B. Yan, H. Li, N. Chen, Separable reversible data hiding in encrypted images with improved security and capacity. Multimed. Tools Appl. 77(23), 30749–30768 (2018)

  21. Z.X. Qian, X.P. Zhang, G.R. Feng, Reversible data hiding in encrypted images based on progressive recovery. IEEE Signal Process Lett 23(11), 1672–1676 (2016)

  22. M. Li, Y. Li, Histogram shifting in encrypted images with public key cryptosystem for reversible data hiding. Signal Process. 130, 190–196 (2017)

  23. S. Agrawal, M. Kumar, Mean value based reversible data hiding in encrypted images. Optik 130, 922–934 (2017)

  24. P. Singh, B. Raman, Reversible data hiding for rightful ownership assertion of images in encrypted domain over cloud. AEU-Int J Electron C 76, 18–35 (2017)

  25. D. Xiao, X.P. Xiang, H.Y. Zheng, Y. Wang, Separable reversible data hiding in encrypted image based on pixel value ordering and additive homomorphism. J Vis Commun Image R 45, 1–10 (2017)

  26. Z.L. Liu, C.M. Pun, Reversible data-hiding in encrypted images by redundant space transfer. Inf. Sci. 433, 188–203 (2018)

  27. C. Qin, W. Zhang, F. Cao, X.P. Zhang, C.C. Chang, Separable reversible data hiding in encrypted images via adaptive embedding strategy with block selection. Signal Process. 153, 109–122 (2018)

  28. F. Khelifi, T. Brahimi, J.G. Han, X.L. Li, Secure and privacy-preserving data sharing in the cloud based on lossless image coding. Signal Process. 148, 91–101 (2018)

  29. H.T. Wu, Y.M. Cheung, Z.Y. Yang, S.H. Tang, A high-capacity reversible data hiding method for homomorphic encrypted images. J Vis Commun Image R 62, 87–96 (2019)

  30. Y. Fu, P. Kong, H. Yao, Z. Tang, C. Qin, Effective reversible data hiding in encrypted image with adaptive encoding strategy. Inf. Sci. 494, 21–36 (2019)

  31. C. Qin, X. Qian, W. Hong, X. Zhang, An efficient coding scheme for reversible data hiding in encrypted image with redundancy transfer. Inf. Sci. 487, 176–192 (2019)

  32. G. Schaefer, M. Stich, UCID—an uncompressed colour image database. IS&T/SPIE Electronic Imaging 5307, 472–480 (2004)

  33. D.W. Xu, S.B. Su, Separable reversible data hiding in encrypted images based on difference histogram modification. Secur Commun Netw 2019, 7480147 (2019)

Acknowledgements

The authors would like to thank the anonymous reviewers for their helpful comments.

Funding

This work was supported in part by the National Natural Science Foundation of China (61702332, 61702150), and the Innovation and entrepreneurship training program for college students of USST (XJ2019063).

Author information

Contributions

MY designed the scheme and wrote the manuscript. YL and HS performed the experiments. HY supervised the algorithm design and experiments and revised the manuscript. TQ offered suggestions. All authors read and approved the final manuscript.

Authors’ information

Mingji Yu is currently working toward a B.S. degree at University of Shanghai for Science and Technology. His research interests include reversible data hiding.

Yuchen Liu is currently working toward a B.S. degree at University of Shanghai for Science and Technology. His research interests include reversible data hiding.

Hu Sun is currently working toward a B.S. degree at University of Shanghai for Science and Technology. His research interests include reversible data hiding.

Heng Yao received the B.S. degree from Hefei University of Technology, China, in 2004, the M.S. degree from Shanghai Normal University, China, in 2008, and the Ph.D. degree from Shanghai University, China, in 2012. Since 2012, he has been with School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, China, where he is currently an Associate Professor. His research interests include digital forensics, data hiding, image processing, and pattern recognition. He has contributed more than 30 international peer-reviewed journal papers.

Tong Qiao received the B.S. degree in Electronic and Information Engineering in 2009 from Information Engineering University, Zhengzhou, China, the M.S. degree in Communication and Information System in 2012 from Shanghai University, Shanghai, China, and the Ph.D. degree in System Optimization and Dependability in 2016 from University of Technology of Troyes, Laboratory of System Modeling and Dependability, Troyes, France. His Ph.D. studies were funded by the China Scholarship Council under the UT-INSA project. He is currently an assistant professor at Hangzhou Dianzi University, School of Cyberspace. His current research interests focus on steganalysis and digital image forensics.

Corresponding author

Correspondence to Heng Yao.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

All authors agree to publish this paper in this journal.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yu, M., Liu, Y., Sun, H. et al. Adaptive and separable multiary reversible data hiding in encryption domain. J Image Video Proc. 2020, 16 (2020). https://doi.org/10.1186/s13640-020-00502-w

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s13640-020-00502-w

Keywords