Adaptive and separable multiary reversible data hiding in encryption domain

To ensure the security of digital information, information needs to be stored and processed in the encryption domain during the cloud storage process, during which the user does not allow the cloud provider (CP) to access the image contents, and the CP must embed additional information in the image. Therefore, reversible data hiding in an encryption domain (RDHEI) has emerged. This paper presents an adaptive and separable multiary RDHEI method. First, the image is divided into two parts that consist of reference pixels and non-reference pixels. Next, we compute the prediction errors of the non-reference pixels, followed by a replacement of the original non-reference values with the computed prediction errors. Then, the entire image is encrypted with a specially designed stream cipher strategy so that the CP can embed additional information by modifying the prediction errors without knowing the original image. In addition, although our method vacates space before encryption, the CP can adaptively control the reserved space, unlike traditional RDHEI schemes which also vacate space before encryption. Because our method is separable, the embedded message can be extracted from both encryption and plaintext domains accurately. The experimental results demonstrate the efficacy of the proposed method.


Introduction
Data hiding technology can achieve the purposes of secret transmission and content authentication by embedding secret information into the host [1][2][3][4]. As a significant branch of data hiding, reversible data hiding (RDH) is an algorithm that embeds additional data into a specific payload, such as image, video, or audio signals, and recovers the additional data and the payload without any loss. Traditional RDH can be roughly divided into four categories: (1) lossless compression (LC) [5], in which the space for data hiding is generated by LC of the image, (2) histogram shifting (HS) [6,7], in which the embedding space is generated by shifting the gray-scale histogram of the image, (3) difference expansion (DE) [8] [9], in which the differences between nearby pixels are expanded to reserve space, and (4) prediction error expansion (PEE) [10][11][12][13][14], in which the space for embedding is generated by shifting the prediction error histogram (PEH).
All the methods mentioned above were applied in the plaintext domain. However, with the rapid development of Internet technology and cloud storage applications, a new requirement for RDH has emerged. On the one hand, many images must be uploaded to a cloud server, followed by message embedding and other postprocessing procedures; on the other hand, the content of the images should not be accessible by the cloud provider (CP), especially in the case of critical and private images, such as medical images that are private, and military images that relate to national security. In order to deal with this paradox, RDH has inevitably emerged in the encryption domain. While original images must be protected and recovered losslessly, we expect to achieve the highest embedding capacity. In order to obtain an optimal balance of security and embedding capacity, several studies have been carried out. The existing RDH in encrypted images (RDHEI) can be mainly divided into two categories: methods based on reserving room before encryption (RRBE), and methods based on vacating space after encryption (VRAE).
For an RRBE-based method, Ma et al. [15] proposed an RDHEI method that reserved room for hiding by embedding some pixels in the original image using a traditional RDH algorithm. Zhang et al. [16] proposed calculating the prediction error to replace the image pixels, followed by encryption with a traditional encryption algorithm so that the data hider can embed the additional data by modifying the prediction error. Cao et al. [17] provided a method that implemented a patch-level sparse representation technique to create vacated space for data hiding. Xu and Wang [18] proposed a scheme to enable a traditional PEE strategy in the plaintext domain to be effective in the encryption domain. Yi and Zhou [19] provided a binary-block embedding strategy to improve embedding capacity and visual quality in marked decrypted images. Li et al. [20] proposed a separable RDH scheme, in which encryption quality was improved by combining the permutation and stream cipher during the encryption process, and the embedding rate (ER) was increased by replacing pixels with their corresponding prediction errors.
For a VRAE-based method, Qian et al. [21] proposed a method based on progressive recovery that consisted of three agents, including the content owner, the data hider, and the recipient. In [22], the use of HS with a public-key cryptosystem was implemented to realize RDHEI. Agrawal and Kumar [23] presented a method based on additive modulo 256 and utilized a property of the mean to hide data. Singh and Raman [24] proposed an RDHEI scheme based on a Chinese remainder theorem (CRT)-based sharing scheme to transmit media information over cloud architecture and prove its original ownership. Xiao et al. [25] proposed a separable RDHEI method by utilizing pixel value ordering to embed secret data in each block in an additive homomorphic encrypted image. Liu and Pun [26] presented a redundant space transfer (RST) scheme to create redundant space for data hiding in which traditional RDH technology could be applied. Qin et al. [27] presented an RDHEI method in which owners used an analog stream cipher and block permutation to encrypt blocks of original images that did not overlap, and the data hider could classify blocks and use data hiding keys when embedding. Khelifi et al. [28] used a special encryption method for original images, and then compressed the encryption image to a bit series to create space for data hiding. Wu et al. [29] presented a method applied in homomorphic encrypted images so that some of the hidden data could be extracted in the encryption domain, and the rest could be extracted in the plaintext domain. Fu et al. [30] encrypted a host image via block scrambling and a stream cipher, followed by an adaptive compression of the most significant bit (MSB) layer to vacate space for data hiding. Qin et al. [31] scrambled an image with three different levels and vacated the embedding space by using sparse matrix coding to compress the least significant bit (LSB) of the encrypted image.
From the research reviewed above, RDHEI has developed to a certain degree. However, unlike traditional RDH in the plaintext domain, there are still many lines of research to follow. Motivated by Xu's work [18], which was devoted to applying the traditional PEE method in the encryption domain, in this work, we propose a method for improving encryption quality and the ER. Specifically, compared with [18], the proposed method offers the following three improvements: (1) in [18], the vacated space for data hiding was determined by the image owner; in our work, the image owner needs only to encrypt the image, while the work of creating space is conducted by the data hider, thereby reducing the owner's workload. (2) A new encryption process is proposed so that the encrypted image is less noticeable and better able to resist attacks from hackers. (3) A new method is presented to increase the ER by shifting the PEH based on the unique characteristics of the encryption domain.
This article is organized as follows: Section 2 describes the proposed method in detail, Section 3 presents the experimental results of the proposed method, and finally, Section 4 concludes this paper.

Methods/experimental
In this section, we propose a separable RDHEI scheme, which mainly consists of prediction and replacement, encryption, data hiding, data extraction, and image recovery. Specifically, the image owner uses our method to encrypt the image and vacate space for data hiding during a subsequent procedure, which is conducted by the CP. The CP then uses the reserved room to embed data. If he does not have the encryption keys, he can hide data but he will not access the original content, so the original owner can keep his images safe and the data hider can guarantee his rights by embedding additional information. A flow diagram of encryption and data hiding is briefly shown in Fig. 1.

Prediction and replacement
Suppose that the original image X is an 8-bit gray-scale image with a size of M × N, and its pixels are denoted as X(i, j), 1 ≤ i ≤ M, 1 ≤ j ≤ N. First, we divide all the pixels using a checkerboard pattern into two sets, defined as the reference and non-reference pixels. In detail, the pixels in odd rows and odd columns are selected as reference pixels, and the other pixels are non-reference pixels. Figure 2 shows the division between reference and non-reference pixels, which are denoted in gray and white, respectively. To predict one non-reference pixel, we tend to use the four adjacent reference pixels for the sake of accuracy. Depending on whether there are enough reference pixels around the non-reference pixels, all non-reference pixels are then further divided into two sets. For some non-reference pixels, there is a sufficient number of reference pixels around them, and we denote them as Δ. For other non-reference pixels, due to the lack of adjacent reference pixels, we use the prediction results of their adjacent non-reference pixels as references, and we denote these pixels as Ω.
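The partition described above can be sketched in a few lines. This is only a minimal illustration, assuming NumPy and 0-based array indices, so the odd rows and columns of the text (1-based) map to even indices; the function name `reference_mask` is hypothetical and not part of the proposed method.

```python
import numpy as np

def reference_mask(M: int, N: int) -> np.ndarray:
    """Boolean mask that is True at reference pixels.

    The text indexes pixels from 1 and selects odd rows and odd columns,
    which corresponds to even indices in a 0-based NumPy array.
    """
    mask = np.zeros((M, N), dtype=bool)
    mask[0::2, 0::2] = True
    return mask

mask = reference_mask(512, 512)
num_reference = int(mask.sum())          # pixels kept as references
num_non_reference = int((~mask).sum())   # pixels replaced by prediction errors
```

For a 512 × 512 image this yields one quarter of the pixels as references and three quarters as non-reference pixels.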
The prediction process first predicts the pixels belonging to Δ and then the pixels belonging to Ω. Their corresponding prediction values P(i, j) are computed by Eqs. (1) and (2). Then, we compute the prediction error E(i, j) as

$$E(i, j) = X(i, j) - P(i, j). \tag{3}$$

Note that in theory, the value range of E(i, j) is from −255 to 255, which requires a nine-bit number to represent. However, according to our observations, E(i, j) generally ranges from −127 to 127 in practice. Thus, in our work, we continue to use 8 bits to represent the prediction error. Here, we use the least significant 7 bits to store the absolute value of the error, and the MSB to indicate the sign, i.e., "0" and "1" indicate positive and negative signs, respectively. Because the prediction error represented in our work only varies from −127 to 127, prediction errors outside this range are modified so that they fall within it, giving E'(i, j). Note that we reserve (10000000)_2 and use (00000000)_2 to represent the particular case E'(i, j) = 0. We record the coordinates of the prediction errors that are initially outside the range of [−127, 127] and save them as a location map, denoted as O. Finally, we replace all the non-reference pixels with their corresponding modified prediction errors E'(i, j) to generate a new image, denoted as I. It is worth pointing out that the prediction criterion in our work is not limited to Eqs. (1) and (2), and any precise prediction strategy can be used to further improve the overall performance.
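The 8-bit sign-magnitude representation of the prediction error can be illustrated as follows. This is a hedged sketch: the helper names `encode_error` and `decode_error` are hypothetical, and out-of-range errors are rejected here rather than modified, because the modification rule and the location map O are not modelled.

```python
def encode_error(e: int) -> int:
    """Pack a prediction error in [-127, 127] into one byte.

    Bit 7 (MSB) is the sign (0 = positive, 1 = negative) and bits 0-6
    hold the absolute value. Zero is always encoded as 0b00000000, so
    0b10000000 is never produced and stays reserved.
    """
    if not -127 <= e <= 127:
        raise ValueError("error outside [-127, 127]; handle it via the location map")
    return e if e >= 0 else 0x80 | (-e)

def decode_error(b: int) -> int:
    """Inverse of encode_error."""
    if b == 0x80:
        raise ValueError("0b10000000 is reserved")
    magnitude = b & 0x7F
    return -magnitude if b & 0x80 else magnitude
```

For example, `encode_error(-3)` gives `0b10000011`, and decoding that byte returns −3.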

Image encryption
First, we shuffle only E'(i, j) in I via an encryption key K_EY1 to convert E'(i, j) to E''(i, j). For ease of description, we denote the reference pixels as Y(i, j). We use a stream cipher to encrypt Y(i, j) and E''(i, j) separately. Specifically, we use another encryption key, K_EY2, to generate a pseudorandom matrix R of size M × N, in which each element R(i, j) is a 7-bit number. We encrypt only the lowest seven bit planes of Y(i, j) by XORing them with R(i, j); Y'(i, j) denotes the encrypted value of Y(i, j), and ⨁ denotes the exclusive-or (XOR) operation.
Meanwhile, to encrypt E''(i, j), we need to find two peak points, T_p and T_n, in advance. E''(i, j) is then encrypted with R(i, j) to obtain Ẽ(i, j), the encrypted value of E''(i, j). It must be pointed out that if Ẽ(i, j) equals T_p or T_n after the XOR operation, we need to add or subtract one to avoid confusing it with the original pixels for which E''(i, j) = T_p or T_n, and we use a new location map O_1 to record those Ẽ(i, j).
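The two keyed operations in this paragraph, shuffling and the 7-bit XOR, can be sketched as below. This is an illustration under stated assumptions: the keys are modelled as integer seeds for NumPy's `default_rng`, which is not a cryptographically secure generator and only stands in for the keyed pseudorandom source, and the function names are hypothetical.

```python
import numpy as np

def shuffle(values: np.ndarray, key: int) -> np.ndarray:
    """Permute a 1-D array with a permutation derived from the key."""
    perm = np.random.default_rng(key).permutation(values.size)
    return values[perm]

def unshuffle(shuffled: np.ndarray, key: int) -> np.ndarray:
    """Undo shuffle() by regenerating the same permutation."""
    perm = np.random.default_rng(key).permutation(shuffled.size)
    restored = np.empty_like(shuffled)
    restored[perm] = shuffled
    return restored

def xor_low7(pixels: np.ndarray, key: int) -> np.ndarray:
    """XOR the seven least significant bits with a keyed 7-bit stream.

    The MSB is left untouched. Applying the function twice with the
    same key returns the input, so it also serves as decryption.
    """
    r = np.random.default_rng(key).integers(0, 128, size=pixels.shape, dtype=np.uint8)
    return (pixels & 0x80) | ((pixels & 0x7F) ^ r)
```

Because XOR is its own inverse, the same routine is used on the receiver side, and the permutation is undone by regenerating it from the same key.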
Next, we extract the MSB plane of Y'(i, j). Both the MSB plane and the location maps O and O 1 are next compressed via LC coding. Here, arithmetic coding is used in our work because all the elements for compression are made up of "0"s and "1"s. We then replace the MSB of Y'(i, j) with the following parts: compression code of the original MSB plane of Y'(i, j) and its length, and compression codes of O and O 1 and their lengths.
Finally, we combine Ẽ(i, j) and Y''(i, j), i.e., Y'(i, j) after the MSB replacement, to generate the final encrypted image, denoted as I', and send it to the cloud server. Figure 3a, b shows the PEHs before and after the stream cipher, respectively. As can be seen in Fig. 3b, the values approximately obey a uniform distribution after the stream cipher, except for the reserved space.
To explain the security of our encryption method, suppose that an image of size 512 × 512 runs through the entire encryption process. In the first part of the encryption, there are 512 × 512 × 1/4 = 65,536 reference pixels that remain unchanged during the shuffling, and 512 × 512 × 3/4 = 196,608 non-reference pixels that are rearranged. Therefore, there are $P_{196608}^{196608}$ possibilities in total after rearrangement, which is a rather large number for probability analysis. In the second part, the stream cipher, we suppose that 25% of pixels serve as payload, and the remaining 75% of pixels are XORed with a random 7-bit sequence produced by K_EY2. As a result, there are 512 × 512 × 0.75 × 2^7 = 25,165,824 possibilities after the stream cipher. After running through the entire encryption process, there are $P_{196608}^{196608}$ × 25,165,824 possibilities in total that are different from the original image. It is difficult or even impossible to identify the single original image among such a multitude of possibilities. In addition, Khelifi et al. [28] presented a mathematical security analysis of shuffling followed by a stream cipher.
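The counts quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below only reproduces the figures as stated in the text; the variable names are illustrative, and the number of decimal digits of 196608! is obtained from the log-gamma function rather than by computing the factorial itself.

```python
import math

M = N = 512
reference = M * N // 4             # pixels untouched by the shuffling
non_reference = M * N * 3 // 4     # pixels that are rearranged

# Count quoted in the text for the stream-cipher stage.
stream_cipher_count = non_reference * 2**7

# Number of decimal digits of 196608!, via ln(n!) = lgamma(n + 1).
digits_of_permutations = math.floor(math.lgamma(non_reference + 1) / math.log(10)) + 1
```

The permutation count alone has several hundred thousand decimal digits, which gives a sense of the scale that the text describes.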

Data hiding
After receiving I', the CP cannot access the original image because he lacks K_EY1 and K_EY2. However, he can embed data in Ẽ(i, j) without any knowledge of the original image. In this paper, owing to the elaborately designed encryption method, applying PEE in the encryption domain becomes feasible. The three steps of our data hiding strategy are listed in detail as follows.
Step 1: The data hider selects the multiary parameter Q, which is suggested to be set to 2, 3, or 4 according to the actual embedding requirement. The reason for this recommendation is that the histogram must be shifted during the data embedding process, which inevitably causes underflow/overflow issues. In our method, we use a location map to deal with these issues. If Q is set higher than 4, the size of the location map increases correspondingly, and the embedding capacity might be limited by the space available in the bit plane of the reference pixels. Before embedding, to increase the security of the proposed method, a new key K_EY3 is used to shuffle the to-be-embedded message, which is then converted to the corresponding Q-ary format. Note that our method belongs to RRBE, and if we encrypt the message before the embedding phase, all we need to do is extract the encrypted message and then decrypt it in a reverse manner. The peak points T_p and T_n are then located in the histogram of Ẽ(i, j).
Step 2: Shift the histogram of Ẽ(i, j) to vacate space for data hiding as

$$\tilde{E}'(i,j) = \begin{cases} \tilde{E}(i,j) + Q - 1, & \text{if } \tilde{E}(i,j) > T_p, \\ \tilde{E}(i,j) - Q + 1, & \text{if } \tilde{E}(i,j) < T_n, \\ \tilde{E}(i,j), & \text{otherwise,} \end{cases} \tag{7}$$

where Ẽ'(i, j) stands for the shifted value for data hiding. Note that a new location map O_2 is needed to record the Ẽ'(i, j) with underflow/overflow issues. O_2 can then be compressed and embedded into Y''(i, j) via an MSB replacement, similar to O and O_1.
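Step 3: Embed the message as

$$\tilde{E}''(i,j) = \begin{cases} \tilde{E}'(i,j) + W, & \text{if } \tilde{E}'(i,j) = T_p, \\ \tilde{E}'(i,j) - W, & \text{if } \tilde{E}'(i,j) = T_n, \\ \tilde{E}'(i,j), & \text{otherwise,} \end{cases} \tag{8}$$

where Ẽ''(i, j) denotes the final value after data hiding, and W ∈ {0, 1, ..., Q − 1} stands for one digit of the Q-ary message. Finally, the marked encrypted image is composed of Y''(i, j) and Ẽ''(i, j).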
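Steps 2 and 3 can be sketched as follows. This is a simplified illustration, not the full scheme: the encrypted errors are treated as a 1-D array of signed integers, the overflow/underflow handling via O_2 is omitted, and the function names are hypothetical. The example uses T_p = 1 and T_n = 0, the peak values of the Fig. 4 example.

```python
import numpy as np

def shift_histogram(enc: np.ndarray, t_p: int, t_n: int, q: int) -> np.ndarray:
    """Move values above T_p up and values below T_n down by Q - 1."""
    out = enc.astype(np.int64)      # astype returns a copy
    out[enc > t_p] += q - 1
    out[enc < t_n] -= q - 1
    return out

def embed(shifted: np.ndarray, digits, t_p: int, t_n: int) -> np.ndarray:
    """Add one Q-ary digit to each peak-valued carrier, in scan order."""
    out = shifted.copy()
    carriers = np.flatnonzero((shifted == t_p) | (shifted == t_n))
    if len(digits) > carriers.size:
        raise ValueError("message longer than the available capacity")
    for idx, w in zip(carriers, digits):
        out[idx] += w if shifted[idx] == t_p else -w
    return out

enc = np.array([5, 1, 0, -3, 1, 2, 0])
shifted = shift_histogram(enc, t_p=1, t_n=0, q=4)
marked = embed(shifted, [3, 2, 0, 1], t_p=1, t_n=0)
```

After the shift, the values between T_p + 1 and T_p + Q − 1 and between T_n − Q + 1 and T_n − 1 are empty, so each carrier can move by up to Q − 1 without colliding with a non-carrier value.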
For ease of understanding, Fig. 4a, b shows two simple examples of the encryption and data embedding processes. In Fig. 4a, we consider a 3 × 3 block that has gone through the prediction process, in which the pixels valued "1" and "0" are selected to carry additional data. We first apply K_EY1 to shuffle the prediction errors, and then XOR the block with a random matrix generated by K_EY2. Finally, we obtain an encrypted version of this block. Figure 4b illustrates the data hiding process when Q is set to 2, 3, or 4. Pixels with values other than 1 and 0 are shifted by (Q − 1) to reserve space for data hiding, as denoted by the dashed lines in Fig. 4b. Moreover, the pixels valued "1" and "0" are modified to carry additional Q-ary message data, which have been shuffled by K_EY3 in advance to guarantee data security, as shown by the solid lines in Fig. 4b.
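The conversion of a message to Q-ary digits is not spelled out in the text. One possible approach, given here purely as an assumption, is to interpret the message bytes as a single big-endian integer and expand it in base Q; the keyed shuffling with K_EY3 is left out, and both function names are hypothetical.

```python
def to_qary(message: bytes, q: int) -> list[int]:
    """Expand the message, read as a big-endian integer, in base q."""
    value = int.from_bytes(message, "big")
    digits = []
    while value:
        value, d = divmod(value, q)
        digits.append(d)
    return digits[::-1] or [0]

def from_qary(digits: list[int], q: int, num_bytes: int) -> bytes:
    """Inverse of to_qary; the message length must be known."""
    value = 0
    for d in digits:
        value = value * q + d
    return value.to_bytes(num_bytes, "big")
```

With this choice the receiver needs the message length in bytes to restore leading zero bytes, which would have to be agreed on or transmitted separately.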

Data extraction and image recovery
Because the proposed scheme is separable, the restore process can be divided into two cases, which are described as follows.
Case #1: Data extraction before image recovery. To extract the data correctly, Q, T_p, and T_n are essential, and all these parameters can be observed from the histogram of Ẽ''(i, j). In conjunction with O_2, which can be extracted from Y''(i, j), we can remove the pixels with overflow/underflow issues that may influence the data extraction. Then, W can be extracted as

$$W = \begin{cases} \tilde{E}''(i,j) - T_p, & \text{if } \tilde{E}''(i,j) \in [T_p,\ T_p + Q - 1], \\ T_n - \tilde{E}''(i,j), & \text{if } \tilde{E}''(i,j) \in [T_n - Q + 1,\ T_n]. \end{cases} \tag{9}$$

Eventually, we use K_EY3 to restore the original message data.
After data extraction, the recovery process can be divided into four steps as follows.
Step 1: Recover the histogram to its state before embedding, as

$$\tilde{E}(i,j) = \begin{cases} T_p, & \text{if } \tilde{E}''(i,j) \in [T_p,\ T_p + Q - 1], \\ T_n, & \text{if } \tilde{E}''(i,j) \in [T_n - Q + 1,\ T_n], \\ \tilde{E}''(i,j) + Q - 1, & \text{if } \tilde{E}''(i,j) < T_n - Q + 1, \\ \tilde{E}''(i,j) - Q + 1, & \text{otherwise.} \end{cases} \tag{10}$$

Step 2: Extract the compressed codes of O, O_1, and the original MSB plane of Y'(i, j) from the MSB plane of Y''(i, j). We recover them by using the corresponding decompression method. Next, we replace the MSB plane of Y''(i, j) with the MSB plane of Y'(i, j) to recover Y'(i, j). The original reference pixels Y(i, j) can then be recovered by XORing the lowest seven bit planes of Y'(i, j) with R(i, j) again.
Step 3: Increase or decrease Ẽ(i, j) by one if its corresponding element O_1(i, j) in O_1 is equal to one, and then decrypt Ẽ(i, j) with the stream cipher. After the stream cipher, we use K_EY1 to restore E'(i, j).
Step 4: Modify E'(i, j) according to the corresponding element O(i, j) of O to obtain E(i, j). At last, according to Section 2.1, we compute the prediction values P(i, j) and add them to the corresponding E(i, j) so that we can recover the original image without any error.
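The digit extraction and the histogram recovery of Step 1 can be sketched as below. As before, this is a simplified illustration on a 1-D array of signed integers: the location maps, the stream cipher, and the shuffling are not modelled, and the function names are hypothetical. The sample array is a small marked sequence with T_p = 1, T_n = 0, and Q = 4.

```python
import numpy as np

def _carrier_masks(marked, t_p, t_n, q):
    """Masks of values lying in the two carrier intervals."""
    in_p = (marked >= t_p) & (marked <= t_p + q - 1)
    in_n = (marked >= t_n - q + 1) & (marked <= t_n)
    return in_p, in_n

def extract_digits(marked: np.ndarray, t_p: int, t_n: int, q: int) -> np.ndarray:
    """Read one Q-ary digit from every carrier, in scan order."""
    in_p, in_n = _carrier_masks(marked, t_p, t_n, q)
    digits = np.where(in_p, marked - t_p, t_n - marked)
    return digits[in_p | in_n]

def restore_histogram(marked: np.ndarray, t_p: int, t_n: int, q: int) -> np.ndarray:
    """Undo the embedding and the (Q - 1) shift."""
    in_p, in_n = _carrier_masks(marked, t_p, t_n, q)
    out = marked.copy()
    out[marked > t_p + q - 1] -= q - 1
    out[marked < t_n - q + 1] += q - 1
    out[in_p] = t_p
    out[in_n] = t_n
    return out

marked = np.array([8, 4, -2, -6, 1, 5, -1])
digits = extract_digits(marked, t_p=1, t_n=0, q=4)
restored = restore_histogram(marked, t_p=1, t_n=0, q=4)
```

Note that a carrier that received no payload still yields the digit 0, so the receiver needs the message length, or an agreed terminator, to know where the payload ends.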
Case #2: Data extraction after image recovery. First, we implement the method described in Case #1 to create an image that includes the additional message. It is worth noting that we should create a new location map O_3 to mark the pixels of Ẽ''(i, j) that include the message and keep them unchanged during the entire recovery process. Next, referring to Section 2.1, we compute the prediction errors for the non-reference pixels and use K_EY1 to re-shuffle them. Then, we can use O_3 to identify whether each prediction error carries message data and extract the message W according to Eq. (9). Finally, the correct message data can be restored via K_EY3.

Results and discussion
In this section, we present a series of experiments implemented to demonstrate the effect of the proposed method. The standard images we used in our experiments were Lena, Couple, Boat, Man, Peppers, and Baboon, which are shown in Fig. 5. The size of all test images was 512×512. All the experiments were implemented on a personal computer with an Intel® Core™ i7-4720HQ, GTX 960M and 12 GB RAM.

Encryption security
Security is a significant indicator for evaluating an RDHEI method. As the owner wants only the intended recipient to perceive the information in the encrypted image, we must guarantee the imperceptibility of the encrypted image. In this section, to demonstrate the security of the proposed method, we compare it with the methods from [18] and [20] in the following two aspects. First, a comparison was carried out via subjective visual effects. Figure 6 shows the encryption results of our work on six test images. It can be observed in Fig. 6 that almost no useful information can be perceived from the encrypted images. Figures 7 and 8 show the encryption results of [18] and [20], respectively. As can be observed in Fig. 7, there are still some contour lines of the original images in the encryption domain, which is a fatal weakness for [18]. It should be pointed out that although the imperceptibility of the encrypted images of [20] in Fig. 8 is similar to ours, the shuffling strategy applied in [20] was based on block shuffling, and it is more likely that attackers can perceive the contours via a mathematical analysis of an image encrypted with block shuffling than of one encrypted with our proposed method, which uses a shuffling strategy based on pixels.
In addition, we used the peak signal-to-noise ratio (PSNR) as an objective evaluation index to demonstrate the imperceptibility of the encryption result. Table 1 lists the PSNR results for the three methods. As can be observed from Table 1, the PSNR values from [20] indicate a performance similar to the method from [18]; however, the PSNR results of our work are approximately 75% of the other two methods. From comparisons of the above two aspects, our method performs the best in terms of security.
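For reference, the PSNR between an original and an encrypted (or recovered) 8-bit image follows the standard definition based on the mean squared error. The sketch below assumes NumPy arrays of equal shape; the function name `psnr` is illustrative.

```python
import numpy as np

def psnr(original: np.ndarray, other: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diff = original.astype(np.float64) - other.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")    # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

When the second image is the encrypted one, a lower PSNR means that the encrypted image resembles the original less, which is the sense in which the values in Table 1 are read.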
In addition to a comparison of the security level, we also compared the computing complexity among the three methods, and we used operation time as a metric to evaluate the complexity, as is shown in Table 2. From Table 2, we can find that the method from [18] spent approximately 0.3 s on each test image to complete the encryption process, which is nearly half the time of the method from [20] and the proposed method. However, according to the analysis of the security level, the method in [18] leaves the contour lines in the encryption version of all test images. Moreover, the average operation times for the proposed method and [20] are at the same level. To summarize, our method strikes a satisfactory balance between security and computational complexity.

Embedding capacity
The embedding capacity is another significant indicator that identifies whether an RDHEI method is efficient. In order to justify the embedding efficiency of the proposed method, we added three more test images, i.e., Barbara, Lake, and Airplane, to our experiments, and we chose three different Q values, i.e., Q = 2, 3, 4, to compute the maximum ER for each image with each Q. Figure 9 shows the ER performance of the proposed method on nine test images.
It can be observed from Fig. 9 that the capacity of our method mainly depends on the prediction accuracy and the parameter Q. Specifically, with the increase in Q, the ER also increases. Except for Baboon, the ERs of all other images are higher than 0.2 bits per pixel (bpp), which represents a relatively satisfying result for RDHEI, when Q is set to 4. As our method depends on the prediction to reserve space for data hiding, images with a complex texture, such as Baboon, which has neighboring pixels that vary greatly from one another, cannot be predicted precisely with the existing prediction algorithms so that limited space can be reserved.
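The dependence of the ER on Q can be illustrated with a short calculation. The sketch assumes that every peak-valued pixel carries exactly one Q-ary digit, i.e., log2(Q) bits, and that the ER is the payload in bits divided by the number of pixels; this relation is an assumption and is not stated explicitly in the text, although it agrees with the average ERs reported for Table 3 (0.140, 0.222, and 0.280 bpp for Q = 2, 3, and 4). The function name is hypothetical.

```python
import math

def embedding_rate(num_carriers: int, q: int, M: int, N: int) -> float:
    """Embedding rate in bits per pixel for an M x N image."""
    return num_carriers * math.log2(q) / (M * N)

# Scaling the Q = 2 average by log2(Q) gives the expected rates for Q = 3 and 4.
er_q3 = round(0.140 * math.log2(3), 3)
er_q4 = round(0.140 * math.log2(4), 3)
```

Under this assumption, increasing Q from 2 to 4 doubles the payload for the same number of carrier pixels, at the cost of a larger histogram shift.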
For a comprehensive evaluation of the ER of the proposed method, we tested our method on the uncompressed color image database (UCID) [32] with the setting Q = 4. Figure 10 shows the ER for each image, and the average ER for the UCID images is 0.433 bpp, which is satisfactory for most embedding applications.

Comparison and discussion
In this section, we compare the ERs and PSNRs of the recovered images that included additional messages among the four methods. Note that, in this section, we also consider the method from [33], which is a further development of [18].
First, we compared the PSNRs and maximum ERs among the four methods with different parameter settings for each method. The corresponding results are shown in Table 3, where "n/a" stands for "not available" for the current parameter setting. From Table 3, for the method from [20], the average values of PSNRs and ERs are 50.07 dB and 0.394 bpp, respectively, when T is set to 1. For the method from [33], when β is set to 0, the average PSNR and ER are 29.24 dB and 0.184 bpp, respectively; when β is set to 1, the average PSNR and ER are 28.32 dB and 0.322 bpp, respectively. For the method from [18], when T_n is set to −1 and T_p is set to 0, the average PSNR and ER are 49.87 dB and 0.140 bpp, respectively; when T_n is set to −1 and T_p is set to 1, the average PSNR and ER are 45.97 dB and 0.207 bpp, respectively; when T_n is set to −2 and T_p is set to 1, the average PSNR and ER are 44.62 dB and 0.257 bpp, respectively. For our method, the average values of PSNRs and ERs are 59.76 dB and 0.140 bpp when Q is set to 2, 54.54 dB and 0.222 bpp when Q is set to 3, and 51.32 dB and 0.280 bpp when Q is set to 4. Note that for a fair comparison, five test images, not including Baboon, are used to compute the average PSNRs and ERs with different parameters for all four methods, because the additional message could not be embedded in Baboon in the method from [20]. Comparing the average values between the proposed method and that from [20], the embedding performance of the method from [20] is better than the proposed method when Q is set to 2 or 3, and the embedding performance of the proposed method is close to the method from [20] when Q is set to 4. However, it is worth pointing out that the method proposed in [20] is not entirely separable, and the data can be extracted correctly only from the encryption domain.
In contrast, the proposed method is fully separable, and the data can be extracted correctly not only from the encryption domain but also from the plaintext domain. Comparing the average values between the proposed method and the method in [33], we find that the average ER of the method from [33] is slightly higher than the ER of our method. However, at the cost of the higher ER, the average PSNR of [33] is significantly lower than that of the other methods. By comparing the average values between the proposed method and [18], it can be observed that the embedding performances of both methods are similar at a low ER. However, as the parameters increase, the ER of the proposed method grows faster than that of the method from [18] at a similar PSNR level. Because PSNR is not sufficient to show the visual quality of a decrypted image, we also evaluated our method through another metric, SSIM, which measures the structural similarity between a pair of decrypted and original images. The experimental results are shown in Table 4. From Table 4, for [20], the average value of SSIM is 0.9999 when T is set to 1. For [33], the average values of SSIM are 0.9830 when β is set to 0 and 0.9787 when β is set to 1. For [18], the average values of SSIM are 0.9999 (when T_n and T_p are set to −1 and 0, respectively), 0.9998 (when T_n and T_p are set to −1 and 1, respectively), and 0.9996 (when T_n and T_p are set to −2 and 1, respectively). For the proposed method, the average values of SSIM are 0.9999, whether Q is set to 2, 3, or 4. Note that here, for a fair comparison, five test images, not including Baboon, were used to compute the average values. From the results in Table 4, the methods from [18] and [20] and the proposed method are at the same level in terms of SSIM and achieve a better performance than [33].
In addition, we also tested the four methods on UCID, and the average values of SSIM, PSNR, and ER for the different methods are shown in Table 5. Note that the parameters T, β, T_n, T_p, and Q were set to 1, 1, −2, 1, and 4, respectively. As can be observed in Table 5, our method achieves the highest average values in terms of SSIM and PSNR, while the method from [20] achieves the highest average ER.
Next, we compared performance in terms of ER versus PSNR among the four methods. Figure 11 illustrates the performance comparisons among the four methods in terms of ER versus PSNR. From Fig. 11, it is evident that all the PSNR curves of the proposed method are higher than the other three methods under any ER value. It is worth pointing out that the methods in [18,33], and the proposed method can generate a directly decrypted image while keeping all the embedded data, but [20] cannot keep the embedded data in the directly decrypted image.
In addition to comparisons of the capacity and visual quality of the four methods, we also compared the computing complexity of the four methods during the embedding process. We used operation time as a metric to evaluate the computing complexity, and set T, β, T_n, T_p, and Q to 1, 1, −2, 1, and 4, respectively. The results are shown in Table 6. The average operation times for [18], [20], [33], and the proposed method are 0.0714 s, 0.4070 s, 0.3220 s, and 0.1226 s, respectively. From Table 6, the method from [20] consumes the most time because the embedding strategy in this method is based on bit operations. Moreover, the operation time of the method from [33] is also higher than that of the method from [18] and the proposed method, because the embedding strategy in the method from [33] is based on modulo operations. However, both the method from [18] and the proposed method attain efficient results, with both average embedding times lower than 0.2000 s. Note that our method requires more time than the method from [18] because of the overflow/underflow processing involved in our embedding process.

Conclusions
In this study, a separable multiary RDHEI method that involves prediction and replacement, image encryption, data hiding, data extraction, and recovery of the original image was developed. A stream cipher and shuffling were used to encrypt the original image so that a high security level could be obtained. The data hider can embed additional data into the image without knowing the original image because space has already been vacated for the data hider. It is worth pointing out that, in our work, the data hider is also capable of adaptively modifying the reserved space according to the actual embedding demand. After the receiver acquires the encrypted image with the additional data, the receiver can extract the entire additional message from either the encryption domain or the plaintext domain because our scheme is entirely separable. It should be noted that there are two limitations in our method: (1) the amount of overflow/underflow data increases dramatically with an increase in Q, and (2) the ER of our method relies heavily on the accuracy of the predictions. Both issues will be addressed in our future research.