Image splicing detection based on inter-scale 2D joint characteristic function moments in wavelet domain
- Tae Hee Park^{1},
- Jong Goo Han^{2},
- Yong Ho Moon^{3} and
- Il Kyu Eom^{2}
https://doi.org/10.1186/s13640-016-0136-3
© The Author(s). 2016
Received: 23 November 2015
Accepted: 21 September 2016
Published: 3 October 2016
Abstract
In this paper, we propose an image splicing detection method using the characteristic function moments of the inter-scale co-occurrence matrix in the wavelet domain. We construct the co-occurrence matrices by using a pair of wavelet difference values across inter-scale wavelet subbands. In this process, we do not adopt the thresholding operation, so as to prevent information loss. We extract the high-order characteristic function moments of the two-dimensional joint density function generated by the inter-scale co-occurrence matrices in order to detect image splicing forgery. Our method can be applied to both color and gray image datasets because it uses only the luminance component of an image. By performing experimental simulations, we demonstrate that the proposed method achieves good performance in splicing detection. Our results show that the detection accuracy was greater than 95 % on average on four well-known splicing detection image datasets.
1 Introduction
In recent years, with the increasing popularity and usage of digital cameras, together with the development of image editing technologies, it has become much easier for people with minimal expertise to edit image data. Image tampering or manipulation is often carried out as simple entertainment or as the initial step of a photomontage, which is popular in the field of image editing. However, the use of manipulated images for malicious purposes can have adverse consequences on human society because it is difficult to detect tampered images with the human eye [1]. Therefore, the development of reliable image forgery detection methods is important to enable us to determine the authenticity of images. Recently, various kinds of image forgery detection approaches have been proposed [2, 3].
Image splicing, which involves combining two or more images into a new image, is one of the most common types of image tampering. The majority of research into splicing detection is based on the fact that the image splicing process can cause discontinuities along edges and corners. These abnormal transitions are an important clue in the verification of an image’s authenticity. Early attempts to detect spliced images focused on changes of the global statistical characteristics caused by abrupt discontinuities in the spliced images [4–7]. However, the statistical moment-based splicing detection methods are limited in that the statistical moments for an entire image do not efficiently reflect the local discontinuities caused by a splicing operation. Splicing detection algorithms that are capable of extracting local transitions caused by a spliced image region have therefore been presented. One of these approaches is the run-length-based splicing detection method [8–10]. Using this method, we can extract abnormal local transitions caused by splicing forgery. Run-length-based splicing detection methods have achieved notable detection performances with a small number of features. However, the detection rates of these algorithms are not ideal because the final features are extracted from the moments of various run-length matrices.
Other promising splicing detection techniques that exploit local transition features are Markov model-based approaches. Markov model-based features are reasonably useful for the detection of forged images that have been spliced. In 2012, He et al. [11] introduced a Markov model in both the discrete cosine transform (DCT) and discrete wavelet transform (DWT) domains, and they detected image splicing according to the cross-domain Markov features. This method achieved a detection rate of 93.55 % on the Columbia gray image dataset [12]. However, this scheme required up to 7290 features. Therefore, a dimension reduction algorithm such as recursive feature elimination (RFE) was necessary. An enhanced Markov state selection method [13] was reported as a means of reducing the number of features. This approach analyzes the distribution characteristic of transform domain coefficients and maps a large number of coefficients with limited states that have coefficients based on various presupposed function models. However, to reduce the number of features, this method sacrificed detection performance. El-Alfy et al. proposed a blind detection method of image splicing using Markov features in both the spatial and DCT domains [14]. They also used principal component analysis (PCA) to select the most relevant features. They achieved a detection rate of 98.82 % with an easier testing condition (they used tenfold cross-validation, while the majority used sixfold cross-validation). In 2015, an image splicing detection technique [15] using a two-dimensional (2D) non-causal Markov model was introduced. In this method, a 2D Markov model was applied in the DCT domain and the discrete Meyer wavelet transform domain, and the cross-domain features were considered as the final discriminative features for classification purposes. This scheme achieved a detection rate of 93.36 % on the Columbia gray image dataset; however, up to 14,240 features were required.
Recently, splicing detection methods applicable to color datasets have been presented [16–19]. An image splicing detection algorithm using the run-length run-number and kernel PCA was presented [16]. This algorithm achieves good detection accuracy with a small number of features. However, splicing detection methods for color images must test all three color channels and then select the channel that gives the best detection accuracy. Muhammad et al. proposed an image forgery detection method based on a steerable pyramid transform and local binary pattern with feature reduction [17]. This method achieved its best detection rate of 97.33 % on the CASIA2 dataset [20]. However, this scheme requires an enormous number of features and additional feature selection techniques to reduce the number of features. An image splicing detection method using multi-scale Weber local descriptors [18] was presented. This algorithm achieves high detection accuracies on three color image datasets for splicing forgery detection. However, this method also requires feature dimension reduction as well as color channel selection. In 2016, an image splicing detection algorithm [19] using inter-scale joint characteristic function moments in the wavelet domain was presented. This algorithm showed that the discriminability for splicing detection increased through the maximization process, and that threshold expansion reduces the information loss caused by the coefficient thresholding that is used to restrict the number of Markov features. To compensate for the increased number of features due to the threshold expansion, this method introduced an even-odd Markov state decomposition algorithm. The detection accuracy of this method was 98.50 % for the CASIA1 and 94.87 % for the CASIA2 image dataset. However, this scheme is not applicable to gray images.
In summary, Markov model-based approaches suffer from information loss because of the thresholding operation required to reduce the number of states. While a large threshold value can reduce the information loss, the number of features becomes high. Furthermore, a larger number of features can result in an over-fitting problem, which degrades detection performance. Therefore, the choice of threshold becomes a trade-off between the detection performance and computational cost. In this paper, we propose an efficient image splicing detection algorithm by using both local and global statistical features in the wavelet domain. First, we construct co-occurrence matrices that can extract local statistical features by using the wavelet coefficient differences across inter-scale wavelet subbands at the same location. In this process, no information is discarded by a thresholding operation. However, the inter-scale co-occurrence matrices then contain far too many entries to be used directly as features for detecting spliced images. Therefore, the high-order characteristic function (CF) moments of the inter-scale co-occurrence matrices, which are global statistical features, are exploited when detecting splicing forgery. The characteristic function is defined as the Fourier transform of the probability density function, and its moments are widely used in steganalysis techniques [21, 22]. We use the high-order CF moments as the global feature for the inter-scale co-occurrence matrices in the wavelet subbands. The number of features used in this paper for splicing detection was 144. The proposed features can be further reduced to 100 by using principal component analysis (PCA) without performance degradation. The proposed algorithm can achieve good detection performance with a small number of features.
This paper is organized as follows. In Section 2, we briefly review the splicing detection methods based on local statistical features. The proposed splicing detection method using the characteristic function moments of the inter-scale co-occurrence matrices is discussed in Section 3. In Section 4, we present the experimental results obtained using the proposed approach, and Section 5 draws conclusions from this paper.
2 Splicing detection methods based on local statistical features
2.1 Local statistical features for splicing detection
In [11], four intra-block Markov transition probability matrices are used to detect spliced images, and four inter-block Markov features are exploited to detect forged images in a similar manner. Consequently, 8(2T + 1)^{2} Markov features in the DCT domain are used in [11]. El-Alfy et al. [14] exploit \( {M}_w^{\to}\left(s,t\right) \), \( {M}_w^{\downarrow}\left(s,t\right) \), \( {M}_w^{\searrow}\left(s,t\right) \), and \( {M}_w^{\nwarrow}\left(s,t\right) \) as Markov features in both the DCT and spatial domains. A 2D non-causal Markov model [15] can also be described by a combination of (4). The majority of the Markov feature selection methods reported in the literature can be obtained using various combinations of \( {M}_w^d\left(s,t\right). \)
2.2 Effect of truncated coefficients by thresholding operation
In general, the difference values in several domains are truncated by the thresholding operation to obtain state transition probabilities as indicated in (4). The threshold T in (3) determines the size of the Markov feature vector. If the value of T is small, the number of features is small. However, this leads to information loss, and the Markov transition probability matrices may be insufficient to distinguish authentic and forged images. A large value of T can reduce information loss; however, the number of features becomes high. Furthermore, a larger number of features can generate an over-fitting problem, which degrades detection performance. Therefore, the choice of T becomes a trade-off between detection performance and computational cost.
Distributions of DCT coefficient difference values according to T (%)
Threshold | \( \%\left\{{F}_{inter}^{\to}\left(u,v\right)\le T\right\} \) | \( \%\left\{{F}_{inter}^{\downarrow}\left(u,v\right)\le T\right\} \) | \( \%\left\{{F}_{inter}^{\searrow}\left(u,v\right)\le T\right\} \) |
---|---|---|---|
T = 3 | 68.8 | 68.3 | 67.5 |
T = 4 | 72.3 | 71.8 | 71.0 |
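The trade-off governed by T can be illustrated with a short sketch. Equations (3) and (4) are not reproduced in this text, so the sketch assumes the usual form of the operation: difference values are clipped to [-T, T], and each transition matrix then has (2T + 1)^{2} entries. The function names and the toy values are hypothetical and serve only as an illustration.

```python
from collections import Counter


def clip(value, T):
    """Truncate a difference value to [-T, T] (assumed form of the thresholding)."""
    return max(-T, min(T, value))


def transition_matrix(diffs, T):
    """Estimate Pr(next = t | current = s) for horizontally adjacent clipped values.

    `diffs` is a list of rows of integer difference values.
    Returns a dict {(s, t): probability}; unobserved pairs are absent.
    """
    pair_counts = Counter()
    state_counts = Counter()
    for row in diffs:
        clipped = [clip(v, T) for v in row]
        for s, t in zip(clipped, clipped[1:]):
            pair_counts[(s, t)] += 1
            state_counts[s] += 1
    return {(s, t): n / state_counts[s] for (s, t), n in pair_counts.items()}


def markov_feature_count(T, num_matrices=8):
    """Feature count when `num_matrices` matrices of size (2T+1)^2 are used, as in 8(2T+1)^2."""
    return num_matrices * (2 * T + 1) ** 2


def fraction_within(diffs, T):
    """Fraction of difference values with magnitude <= T, i.e. values left unchanged by clipping."""
    values = [v for row in diffs for v in row]
    return sum(abs(v) <= T for v in values) / len(values)


# Hypothetical difference values, not taken from any dataset.
toy = [[0, 1, -7, 2, 9], [3, -1, 0, 12, -2]]
for T in (3, 4):
    print(T, markov_feature_count(T), fraction_within(toy, T))
```

Raising T from 3 to 4 increases the feature count from 392 to 648 in this sketch, while every value with magnitude above T is still collapsed onto the boundary states.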
3 Proposed splicing detection method
3.1 Construction of co-occurrence matrices in wavelet domain
\( {C}_o^j\left(p,q\right) \) is related to the Markov transition probability as \( {C}_o^j\left(p,q\right)={M}_o^j\left(p,q\right) \Pr \left(D{W}_o^j\left(x,y\right)=q\right) \), where \( {M}_o^j\left(p,q\right) \) is a Markov transition probability of the inter-scale wavelet subbands. Whereas the Markov transition probability is a conditional probability, that is, the joint probability normalized by \( \Pr \left(D{W}_o^j\left(x,y\right)=q\right) \), each entry of the co-occurrence matrix is a simple 2D joint probability.
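The relation between the co-occurrence matrix and the Markov transition probability can be checked numerically with a minimal sketch. The construction of the inter-scale difference pairs is not reproduced in this text, so the input here is simply a hypothetical list of integer pairs (p, q); all function names are illustrative.

```python
from collections import Counter, defaultdict


def cooccurrence(pairs):
    """Joint probability C(p, q) of integer pairs, with no thresholding of the values."""
    counts = Counter(pairs)
    total = len(pairs)
    return {pq: n / total for pq, n in counts.items()}


def marginal_q(C):
    """Marginal probability Pr(q), obtained by summing C(p, q) over p."""
    marginal = defaultdict(float)
    for (p, q), prob in C.items():
        marginal[q] += prob
    return dict(marginal)


def transition(C):
    """Conditional probability M(p, q) = C(p, q) / Pr(q)."""
    marginal = marginal_q(C)
    return {(p, q): prob / marginal[q] for (p, q), prob in C.items()}


# Hypothetical difference pairs; large values such as 40 are kept, not clipped.
pairs = [(5, -2), (5, -2), (0, -2), (40, 7)]
C = cooccurrence(pairs)
M = transition(C)
pr_q = marginal_q(C)
for (p, q), prob in C.items():
    # C(p, q) equals M(p, q) * Pr(q) up to floating-point rounding.
    print((p, q), prob, M[(p, q)] * pr_q[q])
```

Because no value is clipped, the number of distinct (p, q) entries grows with the dynamic range of the data, which is why the matrix itself is impractical as a feature vector.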
3.2 Feature extraction method using characteristic function moments
Steganography is the art of hiding information within a cover medium without providing any visual indication of its presence [23]. As the counterpart of steganography, steganalysis aims to detect the presence of secret messages within suspicious images [24]. Moment-based features are commonly used in steganalysis, and theoretical studies have investigated the use of statistical moments for this purpose [21, 22]. These studies show that CF moments, which are moments of the Fourier transform of the probability density function (PDF), are more effective than PDF moments for detecting hidden messages.
Both the image splicing operation and the insertion of a secret message cause discontinuities in an image. Steganography inserts a secret message into an image at random locations. On the other hand, image splicing forgery causes abrupt discontinuities in the form of edges. Because the statistical moment-based approach exploits only the global statistical nature of these abrupt discontinuities, the CF moment-based splicing detection approach can be an effective solution to extract features for splicing detection. As shown in (10), there is no information loss when constructing \( {C}_o^j\left(p,q\right) \) because we do not use the threshold operation. However, using \( {C}_o^j\left(p,q\right) \) as a feature vector is practically impossible because the number of probabilities is extremely large. Therefore, we introduce the (k, l)-th characteristic function moments for \( {C}_o^j\left(p,q\right) \) as a feature vector in this paper.
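A minimal sketch of how a (k, l)-th CF moment might be computed from a small joint probability matrix follows. The defining equation is not reproduced in this text, so the sketch assumes one common form from the steganalysis literature: the 2D discrete Fourier transform H(u, v) of the matrix is taken, and the moment is the average of \( f_u^k f_v^l \) weighted by |H(u, v)| over the positive half of each frequency axis, with normalized frequencies \( f_u = u/P \) and \( f_v = v/Q \). The names `dft2` and `cf_moment` are hypothetical, and the definition used in this paper may differ in its frequency range or normalization.

```python
import cmath


def dft2(C):
    """Naive 2D discrete Fourier transform of a small matrix given as a list of rows."""
    P, Q = len(C), len(C[0])
    return [
        [
            sum(
                C[p][q] * cmath.exp(-2j * cmath.pi * (u * p / P + v * q / Q))
                for p in range(P)
                for q in range(Q)
            )
            for v in range(Q)
        ]
        for u in range(P)
    ]


def cf_moment(C, k, l):
    """Assumed (k, l)-th CF moment: |H|-weighted mean of f_u^k * f_v^l over positive frequencies."""
    P, Q = len(C), len(C[0])
    H = dft2(C)
    numerator = 0.0
    denominator = 0.0
    for u in range(1, P // 2 + 1):
        for v in range(1, Q // 2 + 1):
            magnitude = abs(H[u][v])
            numerator += (u / P) ** k * (v / Q) ** l * magnitude
            denominator += magnitude
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical 4 x 4 joint probability matrix (entries sum to 1).
C = [
    [0.40, 0.10, 0.00, 0.00],
    [0.10, 0.20, 0.05, 0.00],
    [0.00, 0.05, 0.05, 0.00],
    [0.00, 0.00, 0.00, 0.05],
]
features = [cf_moment(C, k, l) for k in range(1, 4) for l in range(1, 4)]
print(features)
```

The naive transform is adequate only for tiny matrices; for realistic sizes a fast Fourier transform such as `numpy.fft.fft2` would be the practical choice. The point of the sketch is that a matrix with a very large number of entries is summarized by a handful of moments per (k, l) pair.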
4 Simulation results
4.1 Datasets and classifier
To verify the performance of the proposed splicing detection method, we first used the Columbia Image Splicing Detection Evaluation Data Set (DVMM) [12]. This gray dataset consists of 933 authentic images and 912 spliced images, and it covers a variety of contents such as smooth, textured, arbitrary object boundary, and straight boundary. All of the images in this dataset are in BMP format with a size of 128 × 128. The spliced images are created from the authentic ones in the dataset by a crop-and-paste operation along object boundaries or a crop-and-paste operation of horizontal or vertical strips.
To detect the splicing forgery, we employed a support vector machine (SVM) classifier with a radial basis function (RBF) kernel [25]. The important parameters of the RBF kernel SVM are “complexity” and “shape.” These parameters were set by a grid search. We used sixfold cross-validation to evaluate the SVM model parameters. In sixfold cross-validation, we randomly divided each of the authentic images and the spliced images into six equal groups. In each iteration, we used five groups each from the authentic images and the forged images for training, while the remaining group was used for testing. Therefore, at the end of six iterations, all six groups had been tested. There was no overlap between the training set and the testing set in any iteration. As in other studies, we conducted 50 independent experiments and used the average results to reduce the stochastic impact.
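The sixfold protocol described above can be sketched as follows. This is only an outline of the data split, not of the classifier: SVM training and the grid search are omitted, and the function name, the seed, and the round-robin grouping are assumptions. Since 933 and 912 are not both divisible by six, the groups in this sketch differ in size by at most one image.

```python
import random


def sixfold_splits(n_authentic, n_spliced, folds=6, seed=0):
    """Yield (train, test) lists of (label, index) pairs, split separately per class."""
    rng = random.Random(seed)
    groups = []
    for label, n in (("authentic", n_authentic), ("spliced", n_spliced)):
        indices = list(range(n))
        rng.shuffle(indices)
        # Deal shuffled indices round-robin so group sizes differ by at most one.
        groups.append([[(label, i) for i in indices[f::folds]] for f in range(folds)])
    for f in range(folds):
        # One group per class is held out for testing; the other five are used for training.
        test = [sample for per_class in groups for sample in per_class[f]]
        train = [
            sample
            for per_class in groups
            for g, fold in enumerate(per_class)
            if g != f
            for sample in fold
        ]
        yield train, test


# Class sizes of the Columbia gray dataset as stated in the text.
for train, test in sixfold_splits(933, 912):
    print(len(train), len(test))
```

Repeating the procedure with different seeds and averaging the results corresponds to the repeated independent experiments mentioned in the text.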
Our splicing detection scheme was implemented in MATLAB R2013a. Tests were performed on a desktop running 64-bit Windows 7 with 19.0 GB RAM and an Intel(R) Core(TM) i5-3570 3.40 GHz CPU. The detection time for an image was composed of two parts: feature extraction time and verification time. We randomly selected a 128 × 128 image from the Columbia gray DVMM dataset and performed the detection process 10 times. The average feature extraction time was 0.13 s. The verification time was approximately 0.02 s. In total, splicing detection takes approximately 0.15 s per image.
4.2 Detection performance
Detection accuracies of the proposed method according to the number of features for the Columbia gray dataset (unit: %)
Number of features | TPR | TNR | ACC |
---|---|---|---|
144 | 95.3 | 95.2 | 95.3 |
120 | 94.3 | 97.5 | 95.8 |
100 | 95.5 | 97.0 | 96.2 |
80 | 95.3 | 96.7 | 96.0 |
70 | 94.5 | 94.4 | 94.5 |
50 | 93.6 | 93.5 | 93.5 |
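The abbreviations in the table are not defined in this text; the sketch below assumes the standard meanings: true positive rate (TPR), true negative rate (TNR), and overall accuracy (ACC). Which class counts as positive is also an assumption, and the counts used are hypothetical rather than taken from the experiments.

```python
def detection_rates(tp, fn, tn, fp):
    """Return (TPR, TNR, ACC) in percent from confusion-matrix counts."""
    tpr = 100.0 * tp / (tp + fn)  # share of positive-class images classified correctly
    tnr = 100.0 * tn / (tn + fp)  # share of negative-class images classified correctly
    acc = 100.0 * (tp + tn) / (tp + fn + tn + fp)  # share of all images classified correctly
    return tpr, tnr, acc


# Hypothetical counts for one test fold.
print(detection_rates(tp=90, fn=10, tn=80, fp=20))
```

When the two classes are of similar size, as in the Columbia gray dataset, ACC lies close to the average of TPR and TNR.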
Detection accuracies for the comparison between the proposed approach and other methods for the Columbia gray dataset (unit: %)
4.3 Detection results for color datasets
To evaluate color image splicing detection, we selected three datasets: Columbia color DVMM [12], CASIA1, and CASIA2 [20]. The Columbia color image dataset consists of 183 authentic and 180 spliced images in TIFF format. The image size is 1152 × 768, and no post-processing was applied to the forged images. All the forged images are spliced images. The CASIA1 dataset contains 800 authentic and 921 forged color images. Different geometric transforms such as scaling and rotation have been applied to the forged images. All the images have a size of 384 × 256 pixels in JPEG format. The CASIA2 dataset is an extension of the CASIA1 dataset. This dataset consists of 7491 authentic and 5123 forged color images in JPEG, BMP, and TIFF format, where image sizes vary from 240 × 160 to 900 × 600 pixels.
Detection results on the comparison between the proposed approach and other methods for the color image datasets. (unit: %)
5 Conclusions
In this paper, we described an image splicing detection technique using the 2D joint characteristic function moments of the inter-scale co-occurrence matrices in the wavelet domain. We constructed the inter-scale co-occurrence matrices by using the pair of the wavelet difference values across the inter-scale subbands. As the features for splicing detection, we extracted the first third-order characteristic function moments of the 2D joint probability density function generated by the co-occurrence matrices. By performing experimental simulations, we verified that the proposed method achieves high performance in splicing detection. The best detection accuracy was 96.2 % for the Columbia image splicing detection evaluation dataset. In addition, our algorithm generates reasonable detection performance for color splicing detection datasets.
Declarations
Acknowledgements
This work was supported by BK21PLUS, Creative Human Resource Development Program for IT, and Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (grant number: NRF-2012R1A1A2042034).
Authors’ contributions
THP and JGH proposed the framework of this work, carried out the whole experiments, and drafted the manuscript. YHM offered useful suggestions and helped to modify the manuscript. IKE initiated the main algorithm of this work, supervised the whole work, and wrote the final manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
References
- H Farid, A picture tells a thousand lies. New Sci. 2411, 38–41 (2003)
- H Farid, A survey of image forgery detection. IEEE Signal Process. Mag. 26, 6–25 (2009)
- B Mahdian, S Saic, A bibliography on blind methods for identifying image forgery. Signal Process. Image Commun. 25(6), 389–399 (2010)
- TT Ng, SF Chang, Q Sun, Blind detection of photomontage using higher order statistics. Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS). pp. 688–691 (2004)
- D Fu, YQ Shi, W Su, Detection of image splicing based on Hilbert–Huang transform and moments of characteristic functions with wavelet decomposition. Proceedings of the 5th International Workshop on Digital Watermarking (IWDW). 4283, pp. 177–187 (2006)
- W Chen, YQ Shi, W Su, Image splicing detection using 2-D phase congruency and statistical moments of characteristic function. SPIE Electronic Imaging: Security, Steganography, and Watermarking of Multimedia Contents. pp. 65050R.1–65050R.8 (2007)
- YQ Shi, C Chen, W Chen, A natural image model approach to splicing detection. Proceedings of ACM Multimedia and Security (MM&Sec). pp. 51–62 (2007)
- Z He, W Sun, W Lu, H Lu, Digital image splicing detection based on approximate run length. Pattern Recognit. Lett. 32(12), 1591–1597 (2011)
- Z He, W Lu, W Sun, Improved run length based detection of digital image splicing. Proceedings of the 10th International Workshop on Digital-Forensics and Watermarking (IWDW), pp. 349–360 (2012)
- Z Moghaddasi, HA Jalab, R Noor, Improving RLRN image splicing detection with the use of PCA and kernel PCA. Sci. World J. Article ID 606570, (2014). doi:10.1155/2014/606570
- Z He, W Lu, W Sun, J Huang, Digital image splicing detection based on Markov features in DCT and DWT domain. Pattern Recog. 45(12), 4292–4299 (2012)
- TT Ng, SF Chang, A dataset of authentic and spliced image blocks. Technical Report 203–2004, Columbia University (2004). http://www.ee.columbia.edu/ln/dvmm/downloads/
- B Su, Q Yuan, S Wang, C Zhao, S Li, Enhanced state selection Markov model for image splicing detection. Eurasip. J. Wirel. Comm. 2014(7), 1–10 (2014)
- M El-Alfy, MA Qureshi, Combining spatial and DCT based Markov features for enhanced blind detection of image splicing. Pattern Anal Appl. 18(3), 713–723 (2015)
- X Zhao, S Wang, S Li, J Li, Passive image-splicing detection by a 2-D noncausal Markov model. IEEE Trans. Circuits Syst. Video Technol. 25(2), 185–199 (2015)
- Z Moghaddasi, HA Jalab, R Md Noor, Improving RLRN image splicing detection with the use of PCA and kernel PCA. Sci. World J. Article ID 606570, (2014). doi:10.1155/2014/606570
- G Muhammad, MH Al-Hammadi, M Hussain, G Bebis, Image forgery detection using steerable pyramid transform and local binary pattern. Mach. Vis. Appl. 25(4), 985–995 (2014)
- M Hussain, S Qasem, G Bebis, G Muhammad, H Aboalsamh, H Mathkour, Evaluation of image forgery detection using multi-scale Weber local descriptors. Int. J. Artif. Intell. Tools 24(4), 1540016 (2015). doi:10.1142/s0218213015400163
- JG Han, TH Park, YH Moon, IK Eom, Efficient Markov feature extraction method for image splicing detection using maximization and threshold expansion. J. Electron. Imaging 25(2), 023031 (2016)
- J Dong, W Wang, CASIA tampered image detection evaluation (TIDE) database, v1.0 and v2.0 (2011). http://forensics.idealtest.org/
- Y Wang, P Moulin, Optimized feature extraction for learning-based image steganalysis. IEEE Trans. Inf. Forens. Security 2(1), 31–45 (2007)
- X Luo, F Liu, S Lian, C Yang, S Gritzalis, On the typical statistical features for image blind steganalysis. IEEE J. Sel. Areas Commun. 21(7), 1404–1422 (2011)
- FAP Petitcolas, RJ Anderson, MG Kuhn, Information hiding: a survey. Proc. IEEE 87(7), 1062–1078 (1999)
- A Nissar, AH Mir, Classification of steganalysis techniques: a study. Digital Signal Processing 20, 1758–1770 (2010)
- CC Chang, CJ Lin, LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, (2011). doi:10.1145/1961189.1961199
- N Kambhatla, TK Leen, Dimension reduction by local principal component analysis. Neural Comput. 9(7), 1493–1516 (1997)