Moving shadow detection based on stationary wavelet transform
EURASIP Journal on Image and Video Processing volume 2017, Article number: 49 (2017)
Abstract
Many surveillance and forensic applications have difficulty identifying and removing shadows. Moving shadow points overlap with the moving objects in a video sequence, which leads to misclassification of the actual object. This article presents a novel method for identifying and removing moving shadows using the stationary wavelet transform (SWT), based on a threshold determined from the wavelet coefficients. The multi-resolution property of the SWT decomposes each frame into four different bands without loss of spatial information. The conventional discrete wavelet transform (DWT), which has the same multi-resolution property, is not shift invariant: its decimation operation introduces a shift in the original signal during reconstruction. Since the SWT has no decimation operation, it is shift invariant, which makes it suitable for change detection, pattern recognition and feature extraction, and it also recovers the original signal without loss of phase information. For the detection and removal of shadows, a new threshold in the form of a variant statistical parameter, “skewness”, is proposed. The threshold value is determined from the wavelet coefficients without any supervised learning or manual calibration. Statistical parameters such as the mean, variance and standard deviation normally do not show much variation in complex environments. Skewness shows a more distinctive variation between shadow and non-shadow pixels in various environments than the previously used thresholds, standard deviation and relative standard deviation. The experimental results show that the proposed method performs better than other state-of-the-art methods.
1 Introduction
Automated visualization applications are designed mainly to capture the appearance of moving points in a video sequence using various pattern recognition and machine learning techniques. In particular, people and vehicles are the most important subjects to monitor in typical surveillance and video forensics applications. Such applications usually describe humans by their appearance, size and behaviour, and vehicles by their make, model and type. Similarly, annotations of humans and vehicles help video forensic applications to search for a person or vehicle matching a description given by a witness. The major issue in such surveillance and forensic videos is moving shadows, which prevent the correct identification of suspects in cases of unlawful activity. Serious flaws such as the merging of more than one object, incorrect object shape and deformation of the object contour and cues may occur if moving shadows are not removed from the target object. A properly designed moving object segmentation algorithm helps video surveillance, forensic and object tracking applications to perform satisfactorily. Otherwise, the categorization or estimation of the moving object position will produce erroneous results.
Sanin et al. classified moving shadow detection methods into various categories. The readily available cues that distinguish the target object, its shadow and the background on which the shadow is cast are discussed below to give a clear understanding of the shadow detection concept.
Sanin et al. discussed a few shadow detection algorithms that use the physical and geometrical properties of shadows to identify them; these are categorized as model-based techniques. Many shadow detection algorithms use colour-based [2,3,4,5,6,7,8] and texture-based [9,10,11,12] methods to differentiate shadows from foreground objects; these are categorized as property-based techniques.
The colour-based methods rely on the following facts about the shadow and the background. The red-green-blue (RGB) values, the hue and saturation in hue-saturation-value (HSV) colour space and the grey level of a shadow pixel are lower than those of the background at the corresponding pixel [12, 13]. The difference between these values for shadow and background changes gradually between adjacent pixels.
The shadow and the background have similar texture, and a texture-rich object has a texture-less shadow [12, 14]. The entropy value or the Gaussian filter derivatives are helpful in extracting the texture of image segments. The difference in illumination between shadow and background also helps to differentiate shadow pixels. Shadows have a less pronounced boundary than the background and fewer interior edges than objects, and these properties are used to identify shadow pixels. Even so, shadows remain attached to their objects. Although an object and its shadow move together, their positions separate them. Skewness is a statistical feature that varies between shadow and non-shadow areas and can be used to locate shadows, provided a proper edge detection algorithm is used. The image gradient is a feature that is invariant across the shadow boundary.
For the majority of moving shadow detection methods, chromatic analysis is the first step. The hue-to-intensity ratio is used to categorize shadow and non-shadow pixels. The c1c2c3 colour space is also used to distinguish shadow areas by introducing a variation in the colour space transformation. However, if the RGB values remain the same, the variation in the transformation formula will not show any difference. Colour space-based methods such as HSI, HSV, YIQ and YCbCr analyse the intensity and colour properties of shadows in aerial images and calculate a hue-to-intensity ratio for each pixel. The shadow regions are extracted based on a threshold, which does not clearly classify dark blue and dark green surfaces. To improve the shadow detection results, successive thresholding can be applied. Depending on the lighting conditions, an invariant colour model that identifies the shadow can be used, but this may not work for all types of images. The HSI colour space, together with the colour attenuation relationship, has been analysed to detect shadows in colour aerial images.
The shadow properties discussed so far can be used in image and video forensics to expose the falsification of photos and videos by detecting shadows using their geometric and shading information. Those images and videos can then be properly annotated for future forensic analysis after the removal of fake content. In image and video forensics, the analysis of shadow regions determines whether they were genuinely formed or manipulated. Inconsistency of texture between the shadow and background regions reveals tampered content in images, since it is known that a shadow does not change the texture of the background [12, 14]. Farid described three geometric techniques for detecting traces of digital manipulation: vanishing points, reflections and shadows. The author reported on both real-time image and video forgery by analysing the shadows in them. This supports the need for shadow detection in image and video forensics, which motivated us to propose a moving shadow detection method. All the methods discussed so far perform shadow detection or removal with spatial domain techniques, which suffer from inaccurate segmentation due to residual noise or failure to detect new appearances automatically. Only a few methods are available in the wavelet domain that rectify the problems of spatial domain techniques for better identification of moving cast shadows.
Guan explored the properties of the HSV colour model for shadow detection and removal using a dyadic multi-scale wavelet transform. The standard deviation of the wavelet coefficients of the value component helps to distinguish shadow pixels from foreground pixels. In a similar fashion, Khare et al. used the relative standard deviation of the wavelet coefficients to separate shadows from foreground pixels. Since the wavelet transform decomposes the input into high- and low-frequency values, the threshold values are updated automatically. By combining the saturation component with the value component, the threshold identifies the shadow regions to be removed. The algorithms discussed so far still fall short of accurate shadow detection and removal.
In the proposed approach, we also use the HSV colour space, as it corresponds closely to the human perception of colour and separates chromaticity from luminosity easily. We use the variant statistical parameter “skewness” of the wavelet coefficients to detect and remove shadows.
In the rest of the paper, Section 2 describes the stationary wavelet transform (SWT); Sections 3 and 4 explain the threshold selection and the proposed method, respectively. Section 5 presents the experimental results, and Section 6 compares the proposed method with other existing and state-of-the-art methods [1, 7, 12, 26, 27]. Finally, the discussion and conclusions of the work are given in Sections 7 and 8.
2 Stationary wavelet transform (SWT)
In this section, we briefly discuss the SWT and the reasons for using it to detect and remove shadows from moving objects. Guan and Khare et al. used the discrete wavelet transform (DWT), which lacks phase information and translation invariance and therefore creates problems in the reconstruction of the image. We therefore propose a method that uses the SWT.
The Fourier transform (FT) analyses a signal by decomposing it into constituent sinusoids of different frequencies. Its major drawback is the loss of temporal information. The short-time Fourier transform (STFT), which applies the FT within a predefined window, recovers the temporal information along with the frequencies.
The DWT has an advantage over the Fourier transform in that it localises a signal in the frequency domain as well as in the spatial domain. The DWT can be applied to a discrete signal containing N samples. The signal is decomposed into a low-frequency band (L) using a low-pass filter and a high-frequency band (H) using a high-pass filter, and each band is sub-sampled by a factor of two. In the case of a 2D signal (image), each row of the image is filtered by a low-pass filter l[m] and a high-pass filter h[m], and the columns of the results are then filtered in the same way. However, because of the decimation operation, the DWT of a shifted signal is in general not a shifted version of the DWT of the original signal.
The SWT solves this lack of shift invariance. The SWT differs from the conventional DWT in that it omits decimation and is therefore shift invariant, which makes it suitable for change detection, pattern recognition and feature extraction. In the SWT, the input signal is convolved with the low-pass filter l[m] and the high-pass filter h[m] in the same manner as in the DWT, but no decimation is performed to obtain the wavelet coefficients of the different sub-bands. As there is no decimation in the SWT, the number of coefficients is twice the number of samples in the input signal, which helps to reconstruct the given image better.
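The effect of decimation on a shifted signal can be illustrated with a small one-dimensional sketch. It assumes the Haar filter pair and a made-up step signal; neither is specified in the text, and the function names are hypothetical.

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT of an even-length list: filter, then keep every second sample."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[i + 1]) for i in range(0, len(signal), 2)]
    detail = [s * (signal[i] - signal[i + 1]) for i in range(0, len(signal), 2)]
    return approx, detail

def circular_shift(values, k):
    """Shift a list to the right by k positions with wrap-around."""
    k %= len(values)
    return values[-k:] + values[:-k] if k else list(values)

# Made-up step signal whose edges fall exactly between the sample pairs.
signal = [0.0, 0.0, 4.0, 4.0, 0.0, 0.0, 0.0, 0.0]
shifted = circular_shift(signal, 1)

_, detail = haar_dwt(signal)
_, detail_shifted = haar_dwt(shifted)

# Energy in the high-frequency band before and after a one-sample shift.
detail_energy = sum(d * d for d in detail)
detail_shifted_energy = sum(d * d for d in detail_shifted)
print(detail_energy, detail_shifted_energy)
```

In this example the detail band is empty for the original signal but not for the shifted one, so the two sets of coefficients are not shifted copies of each other.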
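The undecimated behaviour described above can be sketched in the same way. The example assumes the Haar filter pair, periodic extension at the signal boundary and a single decomposition level; these choices and the function names are illustrative assumptions, not details taken from the text.

```python
import math

def haar_swt(signal):
    """One-level undecimated Haar transform with periodic extension (no sub-sampling)."""
    n = len(signal)
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[i] + signal[(i + 1) % n]) for i in range(n)]
    detail = [s * (signal[i] - signal[(i + 1) % n]) for i in range(n)]
    return approx, detail

def circular_shift(values, k):
    """Shift a list to the right by k positions with wrap-around."""
    k %= len(values)
    return values[-k:] + values[:-k] if k else list(values)

signal = [0.0, 0.0, 4.0, 4.0, 0.0, 0.0, 0.0, 0.0]
approx, detail = haar_swt(signal)
approx_s, detail_s = haar_swt(circular_shift(signal, 1))

# Shifting the input by one sample shifts every sub-band by one sample.
shift_equivariant = (
    all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(approx_s, circular_shift(approx, 1)))
    and all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(detail_s, circular_shift(detail, 1)))
)

# Each band keeps N coefficients, so one level yields 2N coefficients in total.
total_coefficients = len(approx) + len(detail)
print(shift_equivariant, total_coefficients)
```

For an 8-sample input the transform returns 16 coefficients, which matches the statement that the coefficient count is twice the number of input samples at one level.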
3 Threshold selection
An optimal threshold has to be determined for accurate shadow detection. Guan used the standard deviation (σ) as the threshold, Khare et al. used the relative standard deviation (σ/μ), and we have chosen skewness as the threshold in the SWT domain. For a sample of n values, the skewness (Skew) parameter is defined as:
where x is one of the sample values, μ is the sample mean and σ is the standard deviation of the n values. Skewness is a measure of the degree of asymmetry of a distribution. The skewness value can be positive, negative or even undefined. A negative skewness indicates that the tail on the left side of the probability density function is longer than that on the right side and the bulk of the values lie to the right of the mean. A positive skewness indicates that the tail on the right side is longer than that on the left side and the bulk of the values lie to the left of the mean, as shown in Fig. 1.
The motivation for selecting skewness as a distinguishing feature for shadows is that the skewness of a shadow always differs from that of the object creating it and of the background on which it is cast. For an image with a homogeneous reflective surface, the skewness of the luminance histogram and of the sub-band histograms is correlated with the shadow regions present in it. In Eq. 1, μ is the average luminance of the object and σ is the standard deviation of the luminance of the object along with its neighbours in the wavelet domain. Raising (x − μ) to the third power enhances the edges around the object present in a scene; the summation and normalization blur the edges of shadows, since they are soft, and speed up the process of shadow detection.
So, the skewness of the wavelet coefficients carries the boundary information of the objects, which helps us to clearly separate them from the shadows. Also, the existing thresholds may not distinguish object from shadow for pixels whose saturation and hue values are undefined. The mean μ and standard deviation σ of such patterns, as shown in Fig. 2, are identical, leading to misclassification. They differ only in the sign of the skewness, because of the cubing and normalization of the luminance value of the object.
Therefore, skewness is selected as a stable threshold in our work for separating moving shadow pixels from the objects. Although the thresholds σ and (σ/μ) are able to detect and remove shadows, their performance degrades when the foreground object and the background share the same colour or are both dark.
In the proposed method, we apply the threshold to the stationary wavelet coefficients of the value component of HSV to detect the moving object together with its shadow. To remove the shadow, we apply a logical AND to the thresholded stationary wavelet coefficients of the value component and those of the saturation component. Instead of manual calibration of the threshold, which needs some predefined parameters, we propose a variant statistical parameter, i.e. skewness, as a new threshold for the detection and removal of shadows in the SWT domain. Guan and Khare et al. applied their thresholds to discrete wavelet transform (DWT) coefficients, which lose the phase information. The loss of phase information in the DWT and the difficulty faced when input frames have similar patterns are rectified by applying the SWT and by using skewness as the threshold, respectively.
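For reference, skewness is commonly written in the standardized third-moment form below, using the symbols defined above. The 1/n normalization is an assumption here; some definitions divide by n − 1 or apply a bias correction.

```latex
\mathrm{Skew} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x_i - \mu}{\sigma} \right)^{3},
\qquad
\mu = \frac{1}{n} \sum_{i=1}^{n} x_i,
\qquad
\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^{2}}
```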
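The claim that two patterns can share a mean and standard deviation yet differ in the sign of skewness can be checked numerically. The sketch below is illustrative only: the two luminance patterns are made-up values, and the population (1/n) form of skewness is assumed.

```python
import math

def mean_and_std(values):
    """Return the mean and the population standard deviation of a list."""
    mu = sum(values) / len(values)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))
    return mu, sigma

def skewness(values):
    """Population skewness: mean of the cubed standardized deviations."""
    mu, sigma = mean_and_std(values)
    if sigma == 0.0:
        raise ValueError("skewness is undefined for a constant sample")
    return sum(((x - mu) / sigma) ** 3 for x in values) / len(values)

# Two hypothetical luminance patterns that are mirror images about their common mean (4.0).
pattern_a = [2.0, 3.0, 3.0, 4.0, 8.0]
pattern_b = [8.0 - x for x in pattern_a]

print(mean_and_std(pattern_a), skewness(pattern_a))
print(mean_and_std(pattern_b), skewness(pattern_b))
```

Both patterns report the same μ and σ, so thresholds based on σ or σ/μ cannot tell them apart, while the skewness is positive for one and negative for the other.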
4 Proposed algorithm
The proposed approach takes a reference frame from the video sequence as the background model, and the consecutive frames are processed one by one to detect the foreground and shadow pixels. The reference frame and the current frame are converted from RGB colour space to HSV colour space. The absolute difference of the two frames is taken with respect to the hue, saturation and value components, represented as ΔH, ΔS and ΔV, respectively. The stationary wavelet transform is applied to the absolute difference components of value (ΔV) and saturation (ΔS) to obtain the wavelet coefficients, denoted WΔV and WΔS. The variant statistical parameter skewness is then calculated for the wavelet coefficients of the value and saturation components, denoted (Skew)WΔV and (Skew)WΔS, respectively. These act as automatic thresholds to classify the moving pixels as foreground or shadow pixels.
Pixels whose value exceeds the automatic threshold, the variant statistical value skewness of the value component, are categorized as moving pixels, which include both foreground and shadow pixels. Equation 2 represents this situation.
To remove the shadow pixels, we also consider the automatic threshold given by the skewness of the saturation component. Equation 3 represents this situation.
Finally, the shadow-detected image and the shadow-removed image are reconstructed by the inverse wavelet transform. Binary morphological closing operations are applied to smooth the reconstructed images.
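The colour conversion and differencing steps can be sketched with the Python standard library. The frames below are tiny made-up examples, the helper names are hypothetical, and the hue difference is taken as a plain absolute difference without handling the circular nature of hue, which is a simplifying assumption.

```python
import colorsys

def to_hsv_planes(frame):
    """Split an RGB frame (rows of (r, g, b) tuples in 0..255) into H, S and V planes in 0..1."""
    h_plane, s_plane, v_plane = [], [], []
    for row in frame:
        hsv_row = [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0) for r, g, b in row]
        h_plane.append([p[0] for p in hsv_row])
        s_plane.append([p[1] for p in hsv_row])
        v_plane.append([p[2] for p in hsv_row])
    return h_plane, s_plane, v_plane

def abs_difference(plane_a, plane_b):
    """Element-wise absolute difference of two equally sized planes."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)] for row_a, row_b in zip(plane_a, plane_b)]

# Hypothetical 1x3 frames: unchanged pixel, darkened (shadow-like) pixel, new coloured object.
reference = [[(120, 120, 120), (200, 180, 160), (200, 180, 160)]]
current = [[(120, 120, 120), (100, 90, 80), (200, 20, 20)]]

ref_h, ref_s, ref_v = to_hsv_planes(reference)
cur_h, cur_s, cur_v = to_hsv_planes(current)

delta_h = abs_difference(ref_h, cur_h)
delta_s = abs_difference(ref_s, cur_s)
delta_v = abs_difference(ref_v, cur_v)
print(delta_s, delta_v)
```

In this example the darkened pixel changes in value but hardly at all in saturation, whereas the coloured object changes in saturation, which is the distinction the value and saturation thresholds rely on.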
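The thresholding and logical AND steps can be sketched on a single row of coefficients. This is not the authors' implementation: the coefficient values are made up, the wavelet transform, inverse transform and morphological closing are omitted, and using the same "greater than skewness" comparison for the saturation band is an assumption, since Equations 2 and 3 are not reproduced here.

```python
import math

def skewness(values):
    """Population skewness of a list of coefficients (1/n normalization assumed)."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / n)
    return sum(((x - mu) / sigma) ** 3 for x in values) / n

def threshold_mask(coefficients):
    """Mark coefficients that exceed the skewness of the whole band."""
    limit = skewness(coefficients)
    return [c > limit for c in coefficients]

# Hypothetical coefficient magnitudes for one row:
# positions 3-4 behave like shadow (value changes, saturation barely changes),
# positions 5-6 behave like an object (both change).
w_dv = [0.0, 0.0, 0.0, 60.0, 55.0, 80.0, 90.0, 0.0]
w_ds = [0.0, 0.0, 0.0, 0.5, 0.5, 70.0, 65.0, 0.0]

moving = threshold_mask(w_dv)              # foreground plus shadow
saturation_changed = threshold_mask(w_ds)

# Logical AND keeps moving pixels whose saturation also changed (shadow removed).
objects = [m and s for m, s in zip(moving, saturation_changed)]
shadows = [m and not s for m, s in zip(moving, saturation_changed)]
print(objects)
print(shadows)
```

Because the threshold is computed from the coefficients themselves, no manually tuned parameter appears anywhere in the sketch.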
5 Experimental results
To evaluate the proposed method systematically, we present extensive results on several well-known benchmarks for which ground truth data are available. The chosen benchmarks consist of indoor and outdoor scenes from the UCSD CVRR Laboratory, the CAVIAR Test Scenarios and the Institute for Infocomm Research (I2R). A song from a regional-language movie of our country, “Sarvam”, is also used to compare the proposed method with the existing methods. Specifically, Intelligent Room, Hallway and CAVIAR are typical indoor environments, while Highway I is an outdoor roadway scenario. In addition, the proposed method is compared qualitatively and quantitatively with several existing methods, including those of Sanin et al., Cucchiara et al. and Guan et al. The results are shown in Fig. 3. Panel (a) displays the original frames and (b) gives the ground truths of the corresponding frames in (a). Panels (c), (d) and (e) present the results of the existing methods [7, 12, 26], and (f) shows the results of the proposed method.
6 Performance measures
The performance of the proposed threshold is compared with that of the previously used thresholds. Figures 4 and 5 show the vertical, horizontal and diagonal wavelet coefficients of the absolute difference between the reference frame and the current frame of the Highway I video sequence (frame no. 100, outdoor) and the Intelligent Room video sequence (frame no. 130, indoor), respectively, plotted against their frequency of occurrence. The shadow regions have lower values than the object regions [12, 13, 29]. In both figures, the original shadow regions are highlighted with the grey line, and the arrow mark indicates the end of the shadow points. The blue square indicates the shadow pixels detected using our proposed method, with skewness as the threshold. The black and red squares indicate the shadow pixels detected using the standard deviation (σ) and the relative standard deviation (σ/μ), respectively. The plots indicate that our proposed method detects the shadow pixels more accurately than the previously used thresholds.
To analyse the proposed method objectively and quantitatively, the shadow detection rate η and the shadow discrimination rate ε are considered. Table 1 gives the shadow detection rate (η) and Table 2 the shadow discrimination rate (ε) of the proposed method alongside the other methods. The computation time in seconds needed to process a single frame by the proposed method and by the other methods is given in Table 3.
The shadow discrimination rate of our method for various videos, compared with the thresholds used by Guan and Khare et al., is shown in Fig. 6. A visual comparison of the proposed method with the state-of-the-art methods is shown in Fig. 7. In all the sub-figures, the first image shows the input frame, the second the corresponding ground truth, the third the results of the proposed method, and the fourth and fifth the results of the state-of-the-art methods [26, 27], respectively. The video sequences used are the Highway sequence, the regional-language movie song (Sarvam) and Lobby Hall. All the performance measures show that the proposed method outperforms the existing and state-of-the-art methods in the literature.
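The text does not spell out how η and ε are computed. The sketch below assumes the definitions commonly used in the comparative shadow detection literature: η is the fraction of ground-truth shadow pixels labelled as shadow, and ε is the fraction of ground-truth foreground pixels not labelled as shadow. The label encoding and sample data are made up.

```python
BACKGROUND, FOREGROUND, SHADOW = 0, 1, 2

def shadow_rates(ground_truth, predicted):
    """Return (detection rate, discrimination rate) from flat per-pixel label sequences."""
    pairs = list(zip(ground_truth, predicted))
    gt_shadow = sum(1 for g, _ in pairs if g == SHADOW)
    gt_foreground = sum(1 for g, _ in pairs if g == FOREGROUND)
    # Shadow pixels correctly labelled as shadow.
    shadow_hits = sum(1 for g, p in pairs if g == SHADOW and p == SHADOW)
    # Foreground pixels that were not mistaken for shadow.
    foreground_kept = sum(1 for g, p in pairs if g == FOREGROUND and p != SHADOW)
    eta = shadow_hits / gt_shadow if gt_shadow else float("nan")
    epsilon = foreground_kept / gt_foreground if gt_foreground else float("nan")
    return eta, epsilon

# Hypothetical labels for ten pixels.
ground_truth = [0, 2, 2, 2, 2, 1, 1, 1, 1, 1]
predicted = [0, 2, 2, 2, 0, 1, 1, 1, 1, 2]

eta, epsilon = shadow_rates(ground_truth, predicted)
print(eta, epsilon)
```

A method that labels everything as shadow would score a perfect η but a poor ε, which is why the two rates are reported together.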
7 Discussion
The existing moving shadow detection methods in the wavelet domain use the DWT, which lacks translation invariance and loses phase information during reconstruction. To address these issues, we have used the stationary wavelet transform. The novelty of our method lies in the use of a variant statistical parameter, skewness, that shows a difference for any kind of pattern. Previously used thresholds such as the standard deviation (σ) and the relative standard deviation (σ/μ) remain the same for undefined values of saturation and hue, leading to misclassification. Whenever the R, G and B values of a pixel are equal, the conversion of that pixel to HSV colour space produces undefined saturation and hue values. The proposed threshold has been tested on various benchmark videos and compared with the existing thresholds to demonstrate its usability. The limitation of the proposed method is the redundancy of the wavelet coefficients, which can be reduced by combining global feature preservation techniques such as principal component analysis (PCA) or linear discriminant analysis (LDA) with the wavelet transform, as in many object and face detection methods.
8 Conclusions
In this article, a stationary wavelet transform (SWT)-based shadow detection method is proposed. A new threshold based on a variant statistical feature, calculated from the wavelet coefficients, is used to classify moving objects and shadows. Compared with the discrete wavelet transform (DWT), the shift invariance and multi-resolution property of the SWT support the reconstruction of the image from the sub-bands without loss of phase information. Also, the threshold determined from the wavelet coefficients helps to discriminate shadows from objects. The promising results of the proposed method highlight the advantages of both the SWT and the variant statistical threshold. The performance of the method degrades when the object moves at high speed or the background is non-stationary; extending this work to non-stationary backgrounds is left for future work. The redundancy of the wavelet coefficients can also be reduced in future by combining PCA or LDA with the wavelet transform.
References
A. Sanin, C. Sanderson, B.C. Lovell, Shadow detection: a survey and comparative evaluation of recent methods. Pattern Recogn. 45(4), 1684–1695 (2012)
A. Prati, R. Cucchiara, I. Mikic, M.M. Trivedi, Analysis and detection of shadows in video streams: a comparative evaluation. Proc. IEEE Int. Conf. on Computer Vision and Pattern Recognition (CVPR) 2, 571–576 (2001)
Chun-Ting Chen, Chung-Yen Su, Wen-Chung Kao, “An enhanced segmentation on vision-based shadow removal for vehicle detection”, Proc. Int. Conf. on Green Circuit and Systems, pp. 679–682, 2010
K. Nagarathinam, R.S. Kathavarayan, A survey of moving cast shadow detection methods. Int. J. Sci. Eng. Res 5(5), 93–98 (2014)
E. Salvador, A. Cavallaro, T. Ebrahimi, Shadow aware object-based video processing. Proc. IET Int. Conf. on Vision, Image and Signal Processing 152(4), 398–406 (2005)
E. Salvador, A. Cavallaro, T. Ebrahimi, Cast shadow segmentation using invariant color features. Comput. Vis. Image Underst. 95(2), 238–259 (2004)
R. Cucchiara, C. Grana, M. Piccardi, A. Prati, Detecting moving objects, ghosts and shadows in video streams. IEEE Trans. Pattern Anal. Mach. Intell. 25(10), 1337–1342 (2003)
A. Prati, R. Cucchiara, I. Mikic, M.M. Trivedi, Detecting moving shadows: algorithms and evaluation. IEEE Trans. Pattern Anal. Mach. Intell. 25(7), 918–923 (2003)
L. Pei, R. Wang, Moving cast shadow detection based on PCA. Proc. Int. Conf. on Natural Computation 2, 581–584 (2009)
A. Leone, C. Distante, Shadow detection for moving objects based on texture analysis. Pattern Recogn. 40(4), 1222–1233 (2007)
Rui Qin, Shengcai Liao, Zhen Lei, Stan Z. Li, “Moving cast shadow removal based on local descriptors”, Proc. Int. Conf. on Pattern Recognition (ICPR), pp. 1377–1380, 2010
Andres Sanin, Conrad Sanderson, Brian C. Lovell, “Improved shadow removal for robust person tracking in surveillance scenario”, Proc. Int. Conf. on Pattern Recognition (ICPR), pp. 141–144, 2010
Zhang, L. and X.M. He, “Fake shadow detection based on SIFT features matching”, Proceedings of the WASE International Conference on Information Engineering, IEEE Xplore Press, Beidaihe, Hebei, pp: 216–220. doi:10.1109/ICIE.2010.58, 2010
S. Kumar, A. Kaur, Algorithm for shadow detection in real-colour images. Int. J. Comput. Sci. Eng. 2, 2444–2446 (2010)
Jiejie Zhu, Kegan G.G. Samuel, Syed Z. Masood and Marshall F. Tappen, “Learning to recognize shadows in monochromatic natural images”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, IEEE Xplore Press, San Francisco, CA., pp: 223–230. doi:10.1109/CVPR.2010.5540209, 2010
J Vinitha Panicker and M. Wilscy, “Detection of moving cast shadows using edge information”, Proceedings of the 2nd International Conference on Computer and Automation Engineering, IEEE Xplore Press, Singapore, pp: 817–821. doi:10.1109/ICCAE.2010.5451878, Feb 2010
F. Cogun, A.E. Cetin, Moving shadow detection in video using cepstrum. Int. J. Adv. Robot. Syst. 10(18), 2013 (2013). doi:10.5772/52942
Bangyu Sun and Shutao Li, “Moving cast shadow detection of vehicle using combined color models”, Proceedings of the Chinese Conference on Pattern Recognition, Oct. 21–23, IEEE Xplore Press, Chongqing, pp: 1–5. doi:10.1109/CCPR.2010.5659321, 2010.
J.D. Tsai Victor, A comparative study on shadow compensation of color aerial images in invariant color models. IEEE Trans. Geosci. Remote Sens. 44, 1661–1671 (2006)
K.-L. Chung, Y.-R. Lin, Y.H. Huang, Efficient shadow detection of color aerial images based on successive thresholding scheme. IEEE Trans. Geosci. Remote Sens. 47, 671–682 (2009)
D Finlayson, C Fredembach, MS Drew, “Detecting illumination in images” in IEEE 11th International Conference on Computer Vision (ICCV), Rio de Janeiro, Brazil, pp. 1–8, 2007
Wenxuan Shi, Jie Li, “Shadow Detection in color aerial images based on HSI space and color attenuation relationship”, EURASIP J. Adv. Signal Process., (1), pp. 141, 2012
Eric Kee, James F O’Brien, and Hany Farid. “Exposing photo manipulation from shading and shadows”, ACM Trans. Graph. 33(5):165:1–165:21, 2014
Yongzhen Ke, Fan Qin, Weidong Min, and Guiling Zhang. “Exposing image forgery by detecting consistency of shadow,” Sci. World J., Article ID 364501, 9 pages, 2014. doi:10.1155/2014/364501
Hany Farid, “How to detect faked photos”, American Scientist, pp. 77-81, March-April issue, 2017
Y.-P. Guan, Spatio-temporal motion-based foreground segmentation and shadow suppression. IET Comput. Vis. 4(1), 50–60 (2010)
M. Khare, R.K. Srivastava, A. Khare, Moving shadow detection and removal—a wavelet transform based approach. IET Comput. Vis. 8(6), 701–717 (2014)
Zhang, Yudong, Shuihua Wang, Yuankai Huo, Lenan Wu, and Aijun Liu, “Feature extraction of brain MRI by stationary wavelet transform and its applications”, J. Biol. Syst.(18), issue spec01 pp. 115–132, 2010
L. Sharan, E.H. Adelson, I. Motoyoshi, S.’y. Nishida, Non-oriented filters are better than oriented filters for skewness detection. Perception 36, 6a (2007)
M. Golchin, F. Khalid, L.N. Abdullah, S.H. Davarpanah, Shadow detection using color and edge information. J. Comput. Sci. 9(11), 1575–1588 (2013)
Kavitha Nagarathinam and Ruba Soundar Kathavarayan, “Moving shadow detection in videos using HSI color space along hue singular pixels”, International Journal of Printing, Packaging & Allied. Sciences (4), No. 2, pp. 1217–1225, 2016
K. Ruba Soundar, K. Murugesan, Preserving global and local features—a combined approach for recognizing face images. Int. J. Pattern Recognit. Artif. Intell. 1(24), 39–53 (2010)
We declare that there is no funding source for this work as of now.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Nagarathinam, K., Kathavarayan, R.S. Moving shadow detection based on stationary wavelet transform. J Image Video Proc. 2017, 49 (2017). https://doi.org/10.1186/s13640-017-0198-x