JPEG image steganography payload location based on optimal estimation of cover co-frequency sub-image

Wang, Jie; Yang, Chunfang; Zhu, Ma; Song, Xiaofeng; Liu, Yuan; Lian, Yuemeng

doi:10.1186/s13640-020-00542-2

Research
Open access
Published: 06 January 2021

JPEG image steganography payload location based on optimal estimation of cover co-frequency sub-image

Jie Wang¹,
Chunfang Yang ORCID: orcid.org/0000-0001-6487-379X¹,
Ma Zhu¹,
Xiaofeng Song²,
Yuan Liu³ &
…
Yuemeng Lian⁴

EURASIP Journal on Image and Video Processing volume 2021, Article number: 1 (2021) Cite this article

4443 Accesses
21 Citations
Metrics details

Abstract

The excellent cover estimation is very important to the payload location of JPEG image steganography. But it is still hard to exactly estimate the quantized DCT coefficients in cover JPEG image. Therefore, this paper proposes a JPEG image steganography payload location method based on optimal estimation of cover co-frequency sub-image, which estimates the cover JPEG image based on the Markov model of co-frequency sub-image. The proposed method combines the coefficients of the same position in each 8 × 8 block in the JPEG image to obtain 64 co-frequency sub-images and then uses the maximum a posterior (MAP) probability algorithm to find the optimal estimations of cover co-frequency sub-images by the Markov model. Then, the residual of each DCT coefficient is obtained by computing the absolute difference between it and the estimated cover version of it, and the average residual over coefficients in the same position of multiple stego images embedded along the same path is used to estimate the stego position. The experimental results show that the proposed payload location method can significantly improve the locating accuracy of the stego positions in low frequencies.

1 Introduction

Digital steganography is the technique that embeds information, known as the payload, into the redundant parts of multimedia data such as digital images, video, audio, and text, termed the cover, to conceal secret communications. In the past decades, a series of steganographic algorithms have been proposed with image, text, audio, or video as cover [1,2,3,4,5,6,7,8]. Correspondingly, many steganalysis algorithms also have been proposed to detect the stego object [9,10,11,12,13,14]. However, in real life, the investigators often not only satisfy with distinguishing the cover objects and the stego objects, but also are eager to extract the hidden information. Compared with the detection of the stego objects, the extraction of hidden information is much more difficult and requires more clues, such as the stego key space, the stego positions, and the selection scheme of stego positions. The technique to identify the stego positions is referred as steganography payload location. In [15, 16], Yang et al. and Liu et al. have reported that when the selection scheme of stego positions is known, if the investigator can locate the steganography payload with the accuracy higher than randomly guessing, he (or she) can extract the hidden information by a collision attack.

Although Quach [17] has proved the locatability of modified pixels in a single stego image, the actual steganography payload algorithms designed for a single stego image can only locate the steganography payload with low accuracy because it is very difficult to precisely estimate the cover of the given stego image and about half of the stego elements are still unchanged [18]. However, for the convenience of communication, many communication participants use the same key in a certain period of time and limit the embedding ratio. At this point, if they use multiple images with the same size to embed a large amount of data, the investigator may possess a number of stego images each containing payload at the same locations. Under such a scenario, in 2008, Ker [19] firstly proposed a payload location algorithm based on weighted stego-image (WS) residuals for least significant bit (LSB) replacement. After that, many payload location algorithms have been proposed for spatial image steganography under this condition. Chiew and Pieprzyk [20] modified Ker’s algorithm to locate the payload of binary image replacement steganography under the same condition. Ker and Lubenko [21] proposed a payload location algorithm for LSB matching, which filters the horizontal, vertical, and diagonal wavelet subbands of stego images by Wiener filter, and locates the stego pixel positions according to the absolute sum of the wavelet residuals in the same positions of multiple images embedded messages into the same positions. Quach [22, 23] proposed several payload location algorithms for LSB replacement and LSB matching, which employ the Viterbi decoding algorithm or Quadratic Pseudo-Binary Optimization (QPBO) algorithm to find the optimal estimate of the cover image, and compute the residuals between the estimated cover images and the stego images to locate the payload. Gui et al. [24] proposed a payload location algorithm for LSB matching steganography by fusing the mean of 4 neighborhood pixels and 8 residuals computed along 8 different directions by the algorithm proposed by Quach [22]. Liu et al. [25] proposed a payload location algorithm for embedding messages into the spatial images subjected to JPEG compression by LSB replacement or LSB matching, which estimates the cover images by JPEG re-compressing the stego images and decompressing the re-compressed versions. Yang et al. [15] proved the properties of the optimal stego subset of the multiple least significant bits (MLSB) steganography, then proposed a payload location algorithm and a stego key recovery algorithm based on the optimal stego subset. Sun et al. [26] proposed a payload location algorithm base on a tailored deep neural network (DNN) equipped with the improved feature named the “mean square of adjacency pixel difference.”

The above algorithms can locate the payload of LSB replacement, LSB matching, and MLSB replacement steganography with high accuracy and even can be used to estimate groups in group parity steganography or extract the hidden message for some special cases. However, they cannot work for the steganography algorithms with JPEG image as cover.

When the messages are embedded into the JPEG images, recently, the authors [27] proposed a payload location method based on co-frequency sub-image filtering for a category of pseudo-random scrambled JPEG image steganography. The accuracy of this payload location method is influenced by the fidelity of the estimated cover images and can be improved if a more precise estimator can be designed.

Activated by the optimal cover estimation method proposed by Quach in [22] for spatial image steganography, this paper proposes a payload location method for JPEG image steganography based on the optimal estimation of cover co-frequency sub-image. Instead of directly applying the maximum a posterior (MAP) probability algorithm to the given stego spatial image to estimate the cover spatial image by the method in [22], the proposed method divides the stego JPEG image into 64 co-frequency sub-images, then applies the MAP algorithm to estimate the optimal cover co-frequency sub-images, and combines them to obtain the optimal cover JPEG image. This makes use of the correlation between the coefficients in the same position of adjacent blocks with a size of 8 × 8.

The structure of this paper is as follows: Section 2 briefly introduces the random JPEG image steganography targeted in this paper. Section 3 proposes the payload location method based on the optimal estimation of cover co-frequency sub-image. Section 4 gives a specific payload location algorithm for F5 steganography. Section 5 presents the experimental results and the discussions. Finally, the paper is summarized in Section 6.

2 Related work—Pseudo-random JPEG image steganography

In order to improve the security of JPEG image steganography, the steganographer often embeds secret messages into the quantized DCT coefficients scrambled pseudo-randomly. And because there are a lot of quantized DCT coefficients with value of 0 in JPEG images, if the steganographer embeds messages into these coefficients, the doubtful artificial clue will be found by steganalyzer. Thus, many JPEG image steganography methods do not embed message bits into these coefficients and do not embed message bits into the coefficients whose values would be changed to be 0. These JPEG image steganography methods can be described as follows.

Input: a cover JPEG image C = c₁c₂…c_N, a secret message bit sequence M = m₁m₂…m_L and a stego key K.

Output: a stego JPEG image.

Steps:

1.
Scramble the quantized DCT coefficients in the cover JPEG image C according to the stego key K, to generate the scrambled coefficient sequence C^′ = Scr(C, K), where $ {C}^{\prime }={c}_1^{\prime }{c}_2^{\prime}\dots {c}_N^{\prime } $ denotes the scrambled coefficient sequence and Scr(C, K) is the scrambling function.
2.
Embed the secret message bit sequence M into the scrambled coefficient sequence C^′.
1. 2.1.
  Assign the initial index of the secret message bit as 1, viz. i = 1, and assign the initial index of the scrambled coefficient as 1, viz. j = 1.
2. 2.2.
  Take the ith message bit m_i from the secret message bit sequence M.
3. 2.3.
  Take the jth coefficient $ {c}_j^{\prime } $ from the scrambled coefficient sequence C^′.
4. 2.4.
  If the value of coefficient $ {c}_j^{\prime } $ cannot carry a message, for example, the value of coefficient $ {c}_j^{\prime } $ is 0, go to step 2.8.
5. 2.5.
  Embed the ith message bit into the jth coefficient $ {c}_j^{\prime } $.
6. 2.6.
  If the embedding changes the value of coefficient $ {c}_j^{\prime } $ to be the value which cannot carry a message, for example, F5 steganography changes the coefficient value 1 to be 0, assign the index of the scrambled coefficient as j + 1, viz. j = j + 1. If j > N, return 0, otherwise go to step 2.3.
7. 2.7.
  Assign the index of the secret message bit as i + 1, viz. i = i + 1. If i > L, go to step 3.
8. 2.8.
  Assign the index of the scrambled coefficient as j + 1, viz. j = j + 1. If j > N, return 0, otherwise go to step 2.2.
3.
Inverse scramble the coefficient sequence after embedding according to the stego key K;
4.
Encode the obtained coefficient sequence to a stego JPEG image, and return the generate stego JPEG image.

3 Methods—Payload location based on optimal estimation of cover co-frequency sub-image

3.1 Principle

When the secret messages are embedded into the pseudo-randomly scrambled coefficients as described in Section 2, if the investigator possesses T stego images S₁, S₂, ⋯, S_T embedded along the same embedding path, then either of the following two cases may happen to the coefficients S₁(i, j), S₂(i, j), …, S_T(i, j) in the same position (i, j) of T stego images:

1)
If the position (i, j) is a stego position, the steganographer will determine whether to embed the message bit into the coefficient in this position according to whether the coefficient is available. Thus, any coefficient of S₁(i, j), S₂(i, j), …, S_T(i, j) is either an unavailable coefficient or a stego coefficient containing a message bit.
2)
If the position (i, j) is a non-stego position, the steganographer will not embed the message bit into the coefficient in this position regardless of whether the coefficient is available. Thus, no coefficients of S₁(i, j), S₂(i, j), …, S_T(i, j) contain a message bit.

Let C₁, C₂, …, C_T denote the corresponding cover images of the stego images S₁, S₂, …, S_T. A residual r_t(i, j) of the coefficient in the position (i, j) of the tth stego image is defined as

$$ {r}_t\left(i,j\right)=\left|{S}_t\left(i,j\right)-{C}_t\left(i,j\right)\right| $$

(1)

Let $ \overline{r}\left(i,j\right) $ denote the mean of all r_t(i, j) over T stego images in the position (i, j).

If the position (i, j) is a non-stego position, $ \overline{r}\left(i,j\right) $ must equal to 0, viz. $ \overline{r}\left(i,j\right)=0 $. If the position (i, j) is a stego position, $ \overline{r}\left(i,j\right) $ must be larger than or equal to 0, viz. $ \overline{r}\left(i,j\right)\ge 0 $, where the equal sign only holds in the case of that all of the coefficients C₁(i, j), C₂(i, j),…, C_T(i, j) are not modified. When one possesses enough stego images, the probability that none of the coefficients C₁(i, j), C₂(i, j),…, C_T(i, j) is modified is small. Thus, the investigator should be able to distinguish the stego positions from the non-stego positions according to the means of residuals if he can obtain the cover images.

However, the investigator often cannot know the cover JPEG images. In this case, if the investigator can estimate the cover images, which are denoted by $ {\hat{C}}_1,{\hat{C}}_2,\dots, {\hat{C}}_T $, he can compute the mean of the estimated residuals in the same position (i, j) of different stego images as follows:

$$ \overset{\sim }{r}\left(i,j\right)=\frac{\sum_{t=1}^T{\hat{r}}_t\left(i,j\right)}{T}=\frac{\sum_{t=1}^T\left|{S}_t\left(i,j\right)-\hat{C_t}\left(i,j\right)\ \right|}{T} $$

(2)

If the investigator possesses enough stego images embedded along the same path and can estimate the covers of them accurately enough, he may also be able to distinguish the stego positions from the non-stego positions with a success rate higher than a random guess based on the averaged estimated residuals as follows:

$$ f\left(i,j\right)=\left\{\begin{array}{cc}1,& \overset{\sim }{r}\left(i,j\right)\ge Thr\\ {}0,& \overset{\sim }{r}\left(i,j\right)< Thr\end{array}\right. $$

(3)

where f(i, j) = 1 denote that the position (i, j) is determined as a stego position, f(i, j) = 0 denote the position (i, j) is determined as a non-stego position, and Thr is a decision threshold.

Certainly, the more accurately the cover JPEG images are estimated, the higher the accuracy of payload location is. Therefore, in the following subsection of this section, a method is proposed to estimate the optimal cover co-frequency sub-images, then combine them to estimate the cover JPEG image.

3.2 Optimal cover JPEG image estimation

In [22], Quach et al. considered the strong correlation between neighboring pixels of spatial image and used the maximum a posterior (MAP) probability algorithm to estimate the optimal cover image corresponding to the stego image of LSB replacement and LSB matching steganography, which was used to locate the hidden information of LSB replacement and LSB matching steganography. In JPEG compression, the DCT transformation of pixel values greatly reduces the correlation between adjacent coefficients. And in order to improve the efficiency of JPEG compression, the DCT transformation is performed on each non-overlapping pixel block with a size of 8 × 8. Since the coefficients in the same position represent the magnitude of energy in the same frequency and the adjacent blocks in an image still have strong similarity, the coefficients in the same position of adjacent blocks still have a strong correlation. According to the property, this section will use the same method in [27] to divide the given JPEG images into 64 co-frequency sub-images, then use the maximum a posterior probability algorithm to estimate the optimal cover co-frequency sub-images, and combine them to get the optimal estimation of cover JPEG image.

3.2.1 Markov model of co-frequency sub-image

Let $ {S}_t^d $ and $ {C}_t^d $ denote the co-frequency sub-images composed of the dth quantized DCT coefficients in all 8 × 8 blocks of the tth stego image and its cover image, d = 1, 2, …, 64. In a statistical sense, the optimal estimation of cover co-frequency sub-images corresponding to $ {S}_t^d $ should be the cover co-frequency sub-image estimation $ {\hat{C}}_t^d $ with the maximum a posterior probability, that is

$$ {\displaystyle \begin{array}{c}{\hat{C}}_t^d=\arg \underset{C_t^d}{\max }p\left({C}_t^d|{S}_t^d\right)\\ {}=\arg \underset{C_t^d}{\max }p\left({S}_t^d|{C}_t^d\right)p\left({C}_t^d\right)\end{array}} $$

(4)

Then, the optimal cover co-frequency sub-image estimation is transformed into a problem of maximum a posterior probability estimation.

Similar to [22], the following two assumptions are set:

$$ p\left({S}_t^d|{C}_t^d\right)={\prod}_ip\left({S}_t^d(i)|{C}_t^d(i)\right) $$

(5)

$$ p\left({C}_t^d\right)={\prod}_ip\left({C}_t^d(i)\right|{C}_t^d\left(i-1\right),{C}_t^d\left(i-2\right),\dots, {C}_t^d\left(i-k\right)\Big) $$

(6)

where k is a given positive integer. Eq. (5) indicates that each quantized DCT coefficient in the stego co-frequency sub-images is only related to the corresponding quantized DCT coefficient in the cover co-frequency sub-images, while Eq. (6) indicates that the cover co-frequency sub-image $ {C}_t^d $ is modeled with a k-order Markov model.

For a given steganography algorithm, one can calculate the probabilities that the quantized DCT coefficient value changes to different possible values under a specific embedding rate α, viz. the transition probability in assumption (5). Besides, the prior probability in (6) can be computed from a large number of cover images.

After dividing all quantized DCT coefficients into 64 co-frequency sub-images, each sub-image is scanned by four modes as shown in Fig. 1 to calculate the co-occurrence matrices of the adjacent elements.

In JPEG image, the distributions of coefficient values in different co-frequency sub-images show obvious differences. As shown in Fig. 2, the absolute values of coefficients in the low frequencies (corresponding to the upper left positions) are usually larger and equal to zero with the lowest probabilities, and most of the absolute values of coefficients in the high frequencies (corresponding to the lower right positions) equal to zero. Figure 3 presents the frequencies of zero coefficient in the different sub-images, where 10,000 images with a size of 512 × 512 in Bossbase 1.01 (http://agents.fel.cvut.cz/stegodata/) are JPEG compressed with a quality factor of 75. The abscissa is the index of the position in the 8 × 8 block from left to right and top to bottom. It can be seen that the relative frequencies of zero coefficient in the sub-images corresponding to the lower right positions are close to 1.

3.2.2 Optimal cover JPEG image estimation based on first-order Markov model

In theory, we should compute the probabilities for all possible covers and search the cover which satisfies Eq. (4). But there are too many possible coefficient values in the cover image to search the whole possible space. Fortunately, the co-frequency sub-image can be modeled by the hidden Markov model, and the Viterbi algorithm is a common method to solve the problem of the hidden Markov model. It has been used in cover image estimation of spatial steganography such as LSB replacement and LSB matching in [22]. Therefore, The Viterbi algorithm will also be adopted to search the optimal cover co-frequency sub-image. The Viterbi algorithm first computes the scores of the possible values of the first cover element as follows:

$$ v\left({c}_{1i}\right)=p\left({s}_{1i}|{c}_{1i}\right)p\left({c}_{1i}\right). $$

(7)

Then, the scores of the possible values of the subsequent cover elements are computed as follows:

$$ v\left({c}_{ki}\right)={}_{c_{k-1,i}}{}^{\mathit{\max}}v\left({c}_{k-1,i}\right)p\left({c}_{ki}|{c}_{k-1,i}\right)p\left({s}_{ki}|{c}_{ki}\right) $$

(8)

where c_{k, i} is possible value of the kth cover element in the ith image.

Take a stego co-frequency sub-image with four quantized DCT coefficients S = (2, 0, −1, 1) of the typical F5 steganography as example, where the embedding ratio is 0.5. According to the embedding rule of F5 steganography, the possible values of the four cover coefficients are c₁ ∈ {2, 3}, c₂ ∈ {−1, 0, 1}, c₃ ∈ {−1, −2}, and c₄ ∈ {1, 2}. Figure 4 shows the trellis for Viterbi algorithm, which takes the possible values of four cover coefficients as nodes. The Viterbi algorithm first computes the scores of nodes in the first column of the trellis, where the value of p(c₁) can be obtained by statistics of a large number of cover JPEG images. For ease of understanding, it is assumed that the values of p(c₁) are as shown in the second column of Table 1. When the embedding ratio of F5 steganography is q, the coefficient value transition probability of F5 steganography is as follows:

$$ p\left({s}_i|{c}_i\right)=\left\{\begin{array}{c}1-\frac{q}{2},{s}_i={c}_i-1\kern0.5em \mathrm{and}\ {s}_i>0\\ {}1-\frac{q}{2},{s}_i={c}_i+1\kern0.5em \mathrm{and}\ {s}_i<0\\ {}\kern2em \frac{q}{2},{s}_i={c}_i\ \mathrm{and}\ {s}_i\ne 0\kern2.25em \\ {}\kern2em 1,{s}_i={c}_i=0\kern5.5em \\ {}\kern1.75em 0,\mathrm{others}.\kern7.25em \end{array}\right. $$

(9)

Table 1 Example of the first-order cover probability model

Full size table

Then the scores of the subsequent nodes are computed in sequence by Eq. (8), and each node is connected with the previous node which maximizes its score. The values of p(c_k| c_k − 1) also can be obtained by statistics of a large number of cover JPEG images. It is assumed that the values of p(c_k| c_k − 1) are as shown in the last column of Table 1.

Finally, take the coefficient values in the path ending at the node with the largest score in the last column as the optimal estimation of the cover coefficients, as shown by the gray node in Fig. 4. It can be seen that when the embedding ratio is 0.5, the optimal estimation of the cover coefficient sequence of S = (2, 0, −1, 1) is $ \hat{\mathrm{c}}=\left(3,-1,-2,2\right) $.

After the optimal estimation of each cover co-frequency sub-image is obtained by the Viterbi algorithm, one can place the coefficients of all estimated cover co-frequency sub-images at the original positions of them to combine the optimal estimation of the cover JPEG image. The whole process is shown in Fig. 5, which is described in Algorithm 1.

In theory, each cover co-frequency sub-image may be estimated more precisely by the first-order Markov model in the corresponding frequency. However, in many frequencies, there are a large number of coefficients with value of 0 which result in that the statistical significance of non-zero coefficient is not significant. Thus, in follows the first-order Markov model merged over different positions is used to estimate the cover co-frequency sub-images.

4 Payload location algorithm for F5 steganography without Matrix Encoding

The F5 steganography algorithm improves F4 by using shuffling. In F5 steganography, the positive odd and negative even represent the bit 1, while the positive even and negative odd represent the bit 0, and the DCT coefficients with value of 0 and DC coefficients do not carry secret information. The coefficient value transition probability of F5 steganography is shown by (9). When T stego JPEG images of F5 steganography are given, we can adopt the existing quantitative steganalysis algorithms to estimate the embedding ratios and then use the proposed Algorithm 1 in Section 3 to estimate the corresponding cover JPEG images. For each given stego JPEG image, we can scan it by 4 different modes as shown in Fig. 1, and then 4 estimated cover JPEG images can be obtained by Algorithm 1.

After that, the residuals between the given stego image and the estimated cover JPEG images are computed as follows:

$$ {r}_t\left(i,j\right)\left\{\begin{array}{c}0,\kern0.75em \mathit{\operatorname{mod}}\left(i,8\right)=0\ and\ \mathit{\operatorname{mod}}\left(j,8\right)=0\\ {}\left|{S}_t\left(i,j\right)-{\hat{C}}_t\left(i,j\right)\right|,\kern0.5em others\end{array}\right. $$

(10)

which is slightly different from the previous residual calculation Eq. (1). For each position, 4T residuals can be computed from the given T stego JPEG images and 4T estimated cover JPEG images by (10), and then be averaged. The averaged value will be used to determine whether this position is a stego position. The detailed steps of the payload location for F5 steganography are given in Algorithm 2.

5 Results and discussion

5.1 Experimental setup

In total, 10,000 PGM images with a size of 512 × 512 were downloaded from the BOSSbase1.01 and converted to cover JPEG images with a quality factor of 75. Nine thousand images were randomly selected from the generated cover JPEG images to count the first-order Markov model of cover co-frequency sub-image. The remaining 1000 images were used to test the performance of the proposed algorithm. A pseudo-random path was generated by scrambling the integer sequence 1, 2,…, 512 × 512. Then along the generated path, the pseudo-random message bits were embedded into the remaining 1000 images by F5 steganography (without matrix encoding) with ratio q = 0.5.

5.2 Markov model selection

From Algorithm 1 and 2, it can be found that the payload location accuracy is highly affected by the adopted first-order Markov model. In Section 3, we suggest to merge the Markov models over different frequencies to estimate the cover co-frequency sub-image more precisely. Thus, we tried to merge proper Markov models.

Firstly, the 64 Markov models m₁…m₆₄ counted from sub-images corresponding to 64 positions in 8 × 8 matrix were applied to estimate the cover JPEG images separately, and the Markov model m_i with the highest payload location accuracy was selected. Then, each of the remaining 63 models was merged to m_i to obtain 63 new merged modes m_i1…m_i63, and the merged Markov model m_ij with the highest payload location accuracy was selected. This operation was repeated until all models were merged. The merged model with the highest payload location accuracy was selected as the final model.

One thousand test stego JPEG images with embedding ratio 0.5 were used to select the proper merged Markov model. Table 2 presents the location correctness of each co-frequency sub-images with the single corresponding Markov model, namely, 64 co-frequency sub-image models are used for the corresponding sub-images respectively. Table 3 shows the results when the optimal merged Markov model was used.

Table 2 Location accuracy for co-frequency sub-images with the individual corresponding first-order Markov model

Full size table

Table 3 Location accuracy for co-frequency sub-images with the optimal merged Markov model

Full size table

In Tables 2 and 3, the correctness in the most upper left is not shown because the DC coefficients are not changed by F5 steganography. Comparing Table 2 with 3, we can see that for most positions, the location accuracy by using the optimal merged Markov model is much higher than that by using the individual model. Especially, the algorithm with the optimal merged Markov model can rightly distinguish the stego positions in low frequencies with accuracy close to 90%, even close to 95%. For the high-frequency positions, because there are very few available coefficients, it is still hard to distinguish the stego positions.

5.3 Performance analysis of location proposed algorithm for F5 steganography

Figure 6 shows the payload location accuracy of MAP-F5 with the optimal merged Markov model for different numbers of stego images when the embedding ratio is 0.5. It can be seen that the more the number of stego images, the higher the accuracy. As the number of images increases, the fluctuation of the residual means becomes smaller, and the residual means are closer to the change caused by information embedding. Therefore, the number of stego images is very important for locating the stego positions.

Figure 7 compares the accuracies of the proposed algorithm and the payload location algorithm based on co-frequency sub-image wavelet filtering (CSW-F5 )[27]. The 1000 stego images are generated with the same embedding path and the embedding ratio of 0.5. In the upper left corner of 8 × 8 block where the number of the 0 coefficient is relatively small, MAP-F5 obtains better results than CSW-F5. In practice, the results of the two payload location algorithms can be further combined.

6 Conclusion

This paper proposes a payload location method based on optimal estimation of cover co-frequency sub-image. The proposed method divides each given stego JPEG image into 64 co-frequency sub-images, then estimates the optimal cover JPEG image by applying the maximum a posterior probability algorithm to the co-frequency sub-images, and finally determines the stego positions according to the averaged residuals between given multiple stego images embedded along the same path and the estimated cover images. The proposed method is applied to the payload location for F5 steganography without matrix encoding and the experimental results show that the proposed algorithm can locate the stego positions with higher accuracy than prior works.

However, the proposed payload location method cannot work for the modern adaptive JPEG image steganography, JUNIWARD, UERD, and GUED. Therefore, in future, we will try to adapted the proposed cover JPEG image estimation method for the modern adaptive JPEG steganography. Besides, we will also try to improve the performance by using unsupervised learning to cluster the image blocks with similar contents [28].

Availability of data and materials

Please contact the author for data requests.

Abbreviations

JPEG:: Joint photographic experts group
LSB:: Least significant bit
QPBO:: Quadratic pseudo-binary optimization
MLSB:: Multiple least significant bits
DNN:: Deep neural network
DCT:: Discrete cosine transform
MAP:: Maximum a posterior
DC:: Direct current
CSW:: Co-frequency sub-image wavelet filtering

References

C. Qin, W. Zhang, F. Cao, X.P. Zhang, C.C. Chang, Separable reversible data hiding in encrypted images via adaptive embedding strategy with block selection. Signal Process. 153, 109–122 (2018)
Article Google Scholar
Y. Zhang, C. Qin, W.M. Zhang, F.L. Liu, X.Y. Luo, On the fault-tolerant performance for a class of robust image steganography. Signal Process. 146, 99–111 (2018)
Article Google Scholar
X. Liao, Y.B. Yu, B. Li, Z.P. Li, Z. Qin, A new payload partition strategy in color image steganography. IEEE Trans. Circuits Syst. Video Technol. (2019) https://doi.org/10.1109/TCSVT.2019.2896270
Y. Zhang, X.Y. Luo, Y.Q. Guo, C. Qin, F.L. Liu, Multiple robustness enhancements for image adaptive steganography in lossy channels. IEEE Trans. Circuits Syst. Video Technol. (2019) https://doi.org/10.1109/TCSVT.2019.2923980
L.Y. Xiang, Y. Li, W. Hao, P. Yang, X.B. Shen, Reversible natural language watermarking using synonym substitution and arithmetic coding. CMC 55(3), 541–559 (2018)
Google Scholar
X. Liao, Z. Qin, L.P. Ding, Data embedding in digital images using critical functions. Signal Process. 58, 146–156 (2017)
Article Google Scholar
B. Li, J. He, J. Huang, Y.Q. Shi, A survey on image steganography and steganalysis. J. Inf. Hiding Multimedia Signal Process. 2(2), 142–172 (2011)
Google Scholar
F.H. Wang, J.S. Pan, L.C. Jain, Innovations in digital watermarking techniques (Springer, Berlin-Heidelberg, 2009)
Book Google Scholar
Y.Y. Ma, X.Y. Luo, X.L. Li, Z.K. Bao, Y. Zhang, Selection of rich model steganalysis features based on decision rough set α-positive region reduction. IEEE Trans. Circuits Syst. Video Technol. 29(2), 336–350 (2019)
Article Google Scholar
C.F. Yang, Y. Zhang, P. Wang, X.Y. Luo, F.L. Liu, J.C. Lu, Steganalysis feature subspace selection based on Fisher criterion. IEEE Int. Conf. Data Sci. Adv. Analytics., 514–521 (2017) https://doi.org/10.1109/DSAA.2017.53
C.F. Yang, F.L. Liu, X.Y. Luo, Y. Zeng, Pixel group trace model-based quantitative steganalysis for multiple least-significant bits steganography. IEEE Trans. Inf. Forensics Secur. 8(1), 216–228 (2013)
Article Google Scholar
Y.H. Kang, F.L. Liu, C.F. Yang, X.Y. Luo, Zhang, T.T. Zhang, Color image steganalysis based on residuals of channel differences. Comput. Mater. Continua 59(1), 315–329 (2019)
Article Google Scholar
X.F. Song, F.L. Liu, L.J. Chen, C.F. Yang, X.Y. Luo, Optimal Gabor filters for steganalysis of content-adaptive JPEG steganography. KSII Trans. Internet Inf. Syst. 11(1), 552–569 (2017)
Google Scholar
L.Y. Xiang, G.Q. Guo, J.M. Yu, V.S. Sheng, P. Yang, A convolutional neural network-based linguistic steganalysis for synonym substitution steganography. Math. Biosci. Eng. 17(2), 1041–1058 (2019)
Article MathSciNet Google Scholar
C.F. Yang, X.Y. Luo, J.C. Lu, F.L. Liu, Extracting hidden messages of MLSB steganography based on optimal stego subset. SCIENCE CHINA Inf. Sci. 61, 119103 (2018) https://doi.org/10.1007/s11432-017-9328-2
Article Google Scholar
J.F. Liu, Y.G. Tian, T. Han, J.C. Wang, X.Y. Luo, Stego key searching for LSB steganography on JPEG decompressed image. SCIENCE CHINA Inf. Sci. 59(3), 32105 (2016)
Article Google Scholar
T.T. Quach, in Media Watermarking, Security, and Forensics. Locatability of modified pixels in steganographic images (2012), p. 83030Q
Google Scholar
C.F. Yang, J. Wang, C.L. Lin, H.Q. Chen, W.J. Wang, Locating steganalysis of LSB matching based on spatial and wavelet filter fusion. Comput. Mater. Continua 60(2), 633–644 (2019)
Article Google Scholar
A.D. Ker, in Proc. 10th Multimedia and Security Workshop. Locating steganographic payload via WS residuals (2008), pp. 27–32
Chapter Google Scholar
K.L. Chiew, P. Josef, Identifying steganographic payload location in binary image. Proc. Pac. Rim Conf. Multimedia Part I. 6297, 590–600 (2010)
Google Scholar
A.D. Ker, I. Lubenko, in Proceedings of SPIE- The International Society for Optical Engineering. Feature reduction and payload location with WAM steganalysis (2009), p. 7254
Google Scholar
T.T. Quach, Optimal cover estimation methods and steganographic payload location. IEEE Trans. Inf. Forensics Secur. 6(4), 1214–1222 (2011)
Article Google Scholar
T.T. Quach, Cover estimation and payload location using Markov random fields. Media Watermarking Secur. Forensics., 90280H (2014)
X.L. Gui, X.L. Li, B. Yang, in Proceedings of the 19th IEEE International Conference on Image Processing. Improved payload location for LSB matching steganography (2012), pp. 1125–1128
Google Scholar
J.F. Liu, Y.G. Tian, T. Han, C.F. Yang, W.B. Liu, LSB steganographic payload location for JPEG decompressed images. Digit Signal Process. 38, 66–76 (2015)
Article Google Scholar
Y. Sun, H. Zhang, T. Zhang, R. Wang, Deep neural networks for efficient steganographic payload location. Real-Time Image Process. 16(3), 635–647 (2019)
Article Google Scholar
J. Wang, C.F. Yang, P. Wang, X.F. Song, J.C. Lu, Payload location for JPEG image steganography based on co-frequency sub-image filtering. Int. J. Distributed Sens. Netw. 16(1) (2020) https://doi.org/10.1177/1550147719899569
L.Y. Xiang, G.H. Zhao, Q. Li, W. Hao, F. Li, TUMK-ELM: a fast unsupervised heterogeneous data learning approach. IEEE Access. 6, 35305–35315 (2018)
Article Google Scholar

Download references

Acknowledgements

Thanks to all those who have suggested and given guidance for this article

Funding

This work is supported by the National Natural Science Foundation of China (Grant Nos. 61872448, 61772549, U1804263) and Natural Science Basic Research Plan in Shanxi Province of China (No. 2018JM6017).

Author information

Authors and Affiliations

Zhengzhou Science and Technology Institute, Zhengzhou, 450001, China
Jie Wang, Chunfang Yang & Ma Zhu
School of Information and Communication, National University of Defense Technology, Xi’an, 710106, China
Xiaofeng Song
Huanghe S & T University, Zhengzhou, 450000, China
Yuan Liu
Henan Huizhi Scientific & Technical Development Co., Ltd, Zhengzhou, 450002, China
Yuemeng Lian

Authors

Jie Wang
View author publications
You can also search for this author in PubMed Google Scholar
Chunfang Yang
View author publications
You can also search for this author in PubMed Google Scholar
Ma Zhu
View author publications
You can also search for this author in PubMed Google Scholar
Xiaofeng Song
View author publications
You can also search for this author in PubMed Google Scholar
Yuan Liu
View author publications
You can also search for this author in PubMed Google Scholar
Yuemeng Lian
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

All authors took part in the discussion of the work described in this paper. The author Jie Wang carried out the experiments of the paper and wrote the paper. The author Chunfang Yang designed the algorithms of this work and revised the paper. The author Ma Zhu helped conduct the experiments. All authors read and approved the final manuscript.

Authors’ information

Chunfang Yang is currently an associate professor of Zhengzhou Science and Technology Institute. He received his MA and PhD degrees in computer science and technology from Zhengzhou Information Science and Technology Institute, Zhengzhou, China, in 2008 and 2012, respectively. His research interest includes image steganography and steganalysis technique.

Jie Wang is currently a master degree candidate of Zhengzhou Science and Technology Institute. His research interest includes image steganography and steganalysis technique.

Ma Zhu is currently an associate professor of Zhengzhou Science and Technology Institute. She received her MA degree in computer science and technology from University of Electronic Science and Technology of China, Chengdu, China, in 2007. Her research interest includes computer network and multimedia security technique.

Xiaofeng Song is currently an associate professor of School of Information and Communication, National University of Defense Technology, Xi’an, China. He received his MA degree in computer science and technology from Xidian University, Xi’an, China, in 2009 and received his PhD degrees in computer science and technology from Zhengzhou Information Science and Technology Institute, Zhengzhou, China, in 2016. His research interest includes image steganography and steganalysis technique.

Yuan Liu is currently an associate professor of Huanghe S & T University, Zhengzhou, China. She received her MA degree in computer science and technology from Harbin Institute of Technology, Harbin, China, in 1992 and received her PhD degrees in computer science and technology from Zhengzhou Information Science and Technology Institute, Zhengzhou, China, in 2005. Her research interest includes information security technique.

Yuemeng Lian is currently an engineer of Henan Huizhi Scientific & Technical Development Co., Ltd. Zhengzhou, China. Her research interest includes information security technique.

Corresponding author

Correspondence to Chunfang Yang.

Ethics declarations

Competing interests

The authors declare that they have no competing interests

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Wang, J., Yang, C., Zhu, M. et al. JPEG image steganography payload location based on optimal estimation of cover co-frequency sub-image. J Image Video Proc. 2021, 1 (2021). https://doi.org/10.1186/s13640-020-00542-2

Download citation

Received: 01 May 2020
Accepted: 16 November 2020
Published: 06 January 2021
DOI: https://doi.org/10.1186/s13640-020-00542-2

JPEG image steganography payload location based on optimal estimation of cover co-frequency sub-image

Abstract

1 Introduction

2 Related work—Pseudo-random JPEG image steganography

3 Methods—Payload location based on optimal estimation of cover co-frequency sub-image

3.1 Principle

3.2 Optimal cover JPEG image estimation

3.2.1 Markov model of co-frequency sub-image

3.2.2 Optimal cover JPEG image estimation based on first-order Markov model

4 Payload location algorithm for F5 steganography without Matrix Encoding

5 Results and discussion

5.1 Experimental setup

5.2 Markov model selection

5.3 Performance analysis of location proposed algorithm for F5 steganography

6 Conclusion

Availability of data and materials

Abbreviations

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Authors’ information

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher’s Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords