Shadow removal with background difference method based on shadow position and edges attributes
EURASIP Journal on Image and Video Processing volume 2012, Article number: 22 (2012)
This article presents a shadow removal algorithm that combines a background difference method with shadow position and edge attributes. First, a novel background subtraction method is proposed to obtain moving objects. It has three parts: detecting the moving regions approximately by computing the inter-frame differences of symmetrical frames and counting the static index of each probable moving point; modeling the background from the statistics of brightness information and updating the model with a motion template; and extracting the moving objects and their edges. Second, shadows are first suppressed in the HSV color space. The shadow direction is then determined from the shadow edges and positions together with the horizontal and vertical projections of the edge image, the shadow position is located accurately by a proportion method, and the shadow is finally removed. Experimental results indicate that the proposed method is easy to implement, determines the shadow direction adaptively, and eliminates the shadow and extracts the whole moving object accurately, especially when the chrominance invariance principle is ineffective.
1. Introduction
Video object segmentation is of fundamental importance in many advanced video applications, such as tracking and interpreting human behavior, surveillance, motor traffic analysis, and environmental monitoring [1–4]. Many background subtraction approaches to segmentation have been proposed over the past decades [5–9], including parametric and non-parametric background density estimates and spatial correlation approaches [10]. However, the extracted video objects are usually unsatisfactory when shadows exist in every frame of the video sequence. Moving shadow detection is therefore critical for accurate object detection in video streams, since shadow points are often misclassified as object points, causing errors in segmentation and tracking [11].
Shadows have two important visual properties. First, they differ so much from the background that they may wrongly be extracted as foreground. Second, shadows and moving objects share the same motion features. Because of these two properties, shadows usually distort the geometrical shape of moving objects and sometimes even cause moving objects to be lost or merged. Detecting and removing shadows from object regions is therefore of great practical significance and remains a considerable challenge, which has attracted a great deal of attention recently.
A rather novel method [12] detects shadows and moving objects efficiently on the basis of sound physical models. In [13], four assumptions (such as that the light source causing a cast shadow has a certain extent) are used together to detect image regions changed by moving cast shadows. Because chromaticity is, in some cases, not affected by a change of illumination, a shadow region can be detected by selecting a region that is darker than its neighboring regions but has similar chromaticity. Based on this illumination-invariant property of chromaticity, several efficient methods [14, 15] have been developed to detect shadows in color images. Tian et al. [16] use image information theory to deduce a tricolor attenuation model, employ blackbody irradiance to estimate its parameters, and then detect shadows with the new model. Tsai [17] presents an automatic property-based approach for detecting and compensating shadow regions, with shape information preserved, in complex urban color aerial images, in order to solve problems caused by cast shadows in digital image mapping. Lu et al. [18] first sample shadow pixels according to the estimated shadow direction and then calculate three shadow attributes from the sampled pixels, but this costs significant processing time. In [19], moving shadows are detected from their motion characteristics and the underlying physics on the basis of color segmentation; this method has poor accuracy because it cannot distinguish shadows from black objects. Cucchiara et al. [20] propose an approach that exploits color information for both background subtraction and shadow detection to improve object segmentation and background update.
Overall, existing shadow removal methods fall into two categories: model-based methods and feature-based methods. Model-based methods generally use prior knowledge of the scene, the objects, and the illumination conditions to match the edges, lines, and corners of moving objects in order to detect shadows. Unfortunately, this prior knowledge is usually very difficult to obtain. These methods also have long processing times and are rarely used in practice. Feature-based methods, such as those in the references above, directly use the brightness, color, saturation, texture, and geometric features of shadows to detect them. They are commonly used but also have drawbacks. For example, when the object and its shadow have similar colors, invariant color features no longer apply and the shadow cannot be differentiated.
To avoid the drawbacks of feature-based methods, we propose an approach to shadow detection and elimination based on shadow position and edge attributes, applied after HSV shadow removal. The approach, shown in Figure 1, is simple and easy to implement. It has two main parts, moving object extraction and shadow detection and removal, described in Sections 2 and 3, respectively. Section 2 lays the foundation for the subsequent processing. Section 3 is the core of the approach and contains its main novelty. The proposed method can locate the shadow direction adaptively and suppress the shadow accurately even when invariant color features are not applicable and most color-based removal methods fail.
2. Moving object extraction by background subtraction method
2.1. Generation of change detection mask and motion template
The generation of the change detection mask (CDM) and the motion template is shown in Figure 2. Let f_k(x,y) be the k-th frame of the gray-valued video sequence, and let δ be the symmetrical frame distance (generally one to two frames). The inter-frame difference for the current k-th frame is then defined as follows:
T1 and T2 denote the frame difference thresholds. They should be chosen according to the speed and range of the object motion and the noise distribution, and they are determined experimentally in the proposed algorithm; a value of about 10 is used for both in most experimental video sequences. abs() denotes the absolute value operation. Although the accumulated frame difference reflects the boundary and region of moving objects well, it also increases the noise, so the noise must be filtered out.
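A minimal Python sketch of a symmetrical inter-frame difference is given here for illustration. The combination rule is an assumption: a pixel is marked as changed only when its absolute difference from the frame δ steps earlier exceeds T1 and its absolute difference from the frame δ steps later exceeds T2. The function name and the list-of-lists frame representation are hypothetical and are not taken from the original algorithm.

```python
def symmetric_frame_difference(prev_frame, cur_frame, next_frame, t1=10, t2=10):
    """Return a binary change mask for cur_frame (1 = changed, 0 = static).

    prev_frame and next_frame are the frames delta steps before and after
    cur_frame. All frames are equal-sized 2D lists of gray values.
    Assumed rule: both differences must exceed their thresholds.
    """
    height, width = len(cur_frame), len(cur_frame[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            backward = abs(cur_frame[y][x] - prev_frame[y][x])
            forward = abs(next_frame[y][x] - cur_frame[y][x])
            if backward > t1 and forward > t2:
                mask[y][x] = 1
    return mask
```

Requiring both differences to be large suppresses the uncovered background that a single frame difference would also flag; the resulting mask would still need the noise filtering described above.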
After noise filtering (using morphological processing), the motion template is obtained with a static index, as proposed in this article. The main idea is as follows. First, select some frames of the denoised video sequence (possibly the whole sequence) according to the real-time requirement; let the total number of frames be n_NumFrame and the size of each frame be width × height. Since a background point remains still, d_k(x,y) = 0 holds in most frames of the sequence, and such points can be regarded as static. Accordingly, go through all the frames and count the number of frames in which the pixel point (X,Y) is static; this count is called the static index n_Static(X,Y) of pixel point (X,Y). Finally, if n_Static(X,Y) is larger than 0.93 × n_NumFrame (the factor 0.93 is determined experimentally in the proposed algorithm), the pixel point (X,Y) is a background point; otherwise it is a foreground point. The motion template PMask(X,Y) of the video sequence is generated as follows:
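The counting rule described above can be sketched in Python as follows. The factor 0.93 comes from the text; the encoding of the template (1 for foreground, 0 for background) and the function name are assumptions made for illustration.

```python
def motion_template(diff_masks, static_ratio=0.93):
    """Build a motion template from a list of denoised difference masks.

    diff_masks holds one 2D list per frame, with 0 marking a static pixel.
    A pixel whose static index exceeds static_ratio * number_of_frames is
    treated as background (0); every other pixel is foreground (1).
    """
    n_frames = len(diff_masks)
    height, width = len(diff_masks[0]), len(diff_masks[0][0])
    template = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Static index: number of frames in which this pixel did not change.
            n_static = sum(1 for mask in diff_masks if mask[y][x] == 0)
            if n_static > static_ratio * n_frames:
                template[y][x] = 0  # background point
            else:
                template[y][x] = 1  # foreground point (assumed encoding)
    return template
```

With 10 frames, for example, a pixel must be static in all 10 frames to count as background, because 9 does not exceed 0.93 × 10.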
2.2. Background modeling and updating
Background modeling is simplest when there is no moving object in the scene. However, this requirement is hard to satisfy in real applications, so the background model must be created and updated adaptively while moving objects are present. We use a statistical modeling approach to build the background model. The strategy is as follows: first, let G(x,y) be the set of all possible pixel values of pixel point (X,Y) in the n_NumFrame frames, and count how often every pixel value appears at position (X,Y) in these frames; second, define the pixel value that appears most frequently at position (X,Y) as the background pixel value Bkground(X,Y).
Count() in (5) denotes the operation of counting the number of pixels.
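As an illustration of this most-frequent-value rule, the following Python sketch computes a per-pixel mode over a stack of gray-valued frames. The function name and the list-of-lists frame representation are hypothetical; ties between equally frequent values are resolved here in favor of the value seen first, a choice the text does not specify.

```python
from collections import Counter


def mode_background(frames):
    """Estimate the background as the most frequent gray value per pixel.

    frames is a list of equal-sized 2D lists of gray values.
    """
    height, width = len(frames[0]), len(frames[0][0])
    background = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Frequency of every gray value observed at this pixel position.
            counts = Counter(frame[y][x] for frame in frames)
            background[y][x] = counts.most_common(1)[0][0]
    return background
```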
Using the motion template PMask(X,Y) obtained in Section 2.1, replace the background pixel values in the non-motion region with the corresponding pixel values of the current frame, and replace the background pixel values in the motion region with 0. The background model at (x,y) for the n-th frame is then updated as follows:
2.3. Motion object detection and edge extraction
Based on the above steps, the moving objects can be extracted by background subtraction.
The alpha template pVOPalpha_n(x,y) is then obtained by thresholding. If pVOPalpha_n(x,y) is 255, the pixel point (X,Y) is considered a foreground point; otherwise it is considered a background point.
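The subtraction and thresholding step might look like the following Python sketch. The values 255 and 0 follow the text; the use of an absolute difference, the `threshold` parameter, and the function name are assumptions, since the threshold value is not given here.

```python
def alpha_template(frame, background, threshold):
    """Mark pixels that differ from the background by more than threshold.

    frame and background are equal-sized 2D lists of gray values.
    Returns 255 for foreground pixels and 0 for background pixels.
    """
    return [
        [255 if abs(f - b) > threshold else 0 for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]
```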
Figures 3 and 4c,f show the extraction results of the background subtraction method presented above. The three images in Figure 3a are the segmented results of frames 57, 59, and 64 of the “mother–daughter” video sequence, and the three images in Figure 3b are the segmented results of frames 3, 14, and 47 of the “Akiyo” video sequence. Figure 3 shows clearly that the new method is well suited to video sequences without shadows.
Figure 4 shows the extraction results for the 1st and 3rd frames of the video sequence “Table”. As Figure 4 shows, the presence of shadows seriously affects the accuracy of the extraction, although the moving object itself remains intact. The shadows must therefore be eliminated to ensure accurate extraction results.
From a large number of experiments and observations, we find that, compared with the edges of video objects, shadow edges are much simpler: they are adjacent to object points, and most of them lie on the outer contour. Based on these attributes, we first detect and eliminate the shadow edges in the extracted edge image, which contains both moving object edges and shadow edges; the moving object is then reconstructed from the remaining edges. This method greatly reduces the computational load and also makes shadows easier to detect.
3. Shadow detection and removal
3.1. Shadow suppression in HSV
Existing color-based shadow suppression approaches work mainly in the RGB and HSV color spaces. In RGB color space, computed color differences agree poorly with perceived differences, and the correlation among the three components often makes detection less effective. These shortcomings can be overcome in HSV color space, which separates intensity from color information better than RGB and is more consistent with human color perception. In shadow detection, relative to the corresponding background pixels, the V component of a shadow pixel decreases markedly, which is an important cue for distinguishing shadows from foreground regions; the S component is small, and its difference from the background is negative; the H component hardly changes. We first eliminate shadows in HSV color space according to the rules described in (8).
In (8), α and β are the thresholds on intensity (0 < α < β < 1), and T_S and T_H are the thresholds on saturation and hue, respectively. Figure 6 shows the edge images of the foreground of the 1st and 3rd frames of “Table” after HSV shadow suppression, with α = 0.1, β = 0.9, and T_S = T_H = 0.2.
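For illustration, a per-pixel HSV shadow test in the widely used form of Cucchiara et al. can be sketched in Python as follows. It is assumed here that rule (8) has this form: the intensity ratio between frame and background lies in [α, β], the saturation difference does not exceed T_S, and the hue difference does not exceed T_H. The function name is hypothetical, and the hue scale (`colorsys` returns hue in [0, 1)) is an assumption about how T_H = 0.2 should be interpreted.

```python
import colorsys


def is_shadow_pixel(frame_rgb, background_rgb,
                    alpha=0.1, beta=0.9, t_s=0.2, t_h=0.2):
    """Return True if a pixel looks like shadow cast on the background.

    frame_rgb and background_rgb are (R, G, B) tuples with values 0..255.
    """
    h_i, s_i, v_i = colorsys.rgb_to_hsv(*[c / 255.0 for c in frame_rgb])
    h_b, s_b, v_b = colorsys.rgb_to_hsv(*[c / 255.0 for c in background_rgb])
    if v_b == 0:
        return False  # a black background cannot be darkened any further
    ratio = v_i / v_b
    # Hue is circular, so take the shorter way around the hue circle.
    hue_diff = abs(h_i - h_b)
    hue_diff = min(hue_diff, 1.0 - hue_diff)
    return (alpha <= ratio <= beta
            and (s_i - s_b) <= t_s
            and hue_diff <= t_h)
```

A pixel of the same chromaticity at half the background brightness passes the test, while a pixel as bright as the background does not.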
3.2. Shadow removal based on shadow position
In most instances, shadow suppression in HSV color space is effective. However, it is not reliable when the background brightness is low or when the background has chrominance similar to the foreground. When the background brightness is low, it is very difficult to separate all the shadows from the background, because the brightness changes only slightly where a shadow covers the background, as shown in Figure 6. Meanwhile, some pixel points inside the moving object may be eliminated as shadow points.
To overcome the shortcomings of HSV suppression, we propose an approach that combines shadow position and edge attributes after HSV suppression. It can be divided into five parts, as shown in Figure 1.
As mentioned in Section 2, most shadow regions have few inner edges, and the shadow edges are concentrated on the outer contour when the background has simple texture. After shadow suppression in HSV color space, the outer contour edges of the undetected shadow become much sparser. In addition, many experiments and observations show that the outer contour edges of shadows are usually adjacent to the moving object. These two observations are the shadow edge attribute and the shadow position feature, respectively.
Based on the shadow position feature, the initial shadow position is acquired by the proportion method. The precise positions of the shadow pixels are then determined from the shadow edge attribute, and the remaining shadow is eliminated. The specific steps are as follows.
Step 1: Distribution statistics after HSV suppression
To locate the shadow position, after shadows have been initially eliminated in HSV, each frame of the edge video sequence PVOPalpha_n(j,i) is projected onto the horizontal and vertical directions, as shown in Equation (9). Figure 7 shows the resulting distribution statistics, from which the number of foreground pixels in each row or column can be seen clearly.
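The two projections can be sketched in Python as simple row and column counts of nonzero edge pixels. The orientation is an assumption: the horizontal projection is indexed by row i and the vertical projection by column j. The function name is hypothetical.

```python
def edge_projections(edge_image):
    """Count nonzero edge pixels per row and per column.

    edge_image is a 2D list in which nonzero values mark edge pixels.
    Returns (horizontal, vertical): horizontal[i] is the count in row i,
    vertical[j] is the count in column j.
    """
    horizontal = [sum(1 for value in row if value != 0) for row in edge_image]
    # zip(*edge_image) iterates over the columns of the image.
    vertical = [sum(1 for value in column if value != 0)
                for column in zip(*edge_image)]
    return horizontal, vertical
```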
Step 2: Estimate the approximate shadow position and determine the search direction
After the projection in Step 1, some key positions must be found in order to determine the shadow position and search direction. These key positions are i_{n,min} and i_{n,max}, the smallest and largest i for which Horizontal_n[i] is not zero, and j_{n,min} and j_{n,max}, the smallest and largest j for which Vertical_n[j] is not zero; they are marked in red in Figure 8. The pixel numbers are then counted as follows:
According to the shadow edge attribute, the approximate shadow position can be preliminarily placed in the sparser part. The corresponding relationship between the approximate shadow position and the search direction is shown in Table 1.
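A rough Python sketch of Step 2 for a single projection is given below. Finding the first and last nonzero indices follows the description of the key positions; splitting the extent at its midpoint and comparing the edge-pixel counts of the two halves is an assumption made for illustration, not necessarily the exact counting rule or the mapping of Table 1. Both function names are hypothetical.

```python
def nonzero_extent(projection):
    """Return (min_index, max_index) of the nonzero entries, or None."""
    indices = [k for k, count in enumerate(projection) if count != 0]
    return (indices[0], indices[-1]) if indices else None


def sparser_half(projection):
    """Tell which half of the nonzero extent holds fewer edge pixels.

    Returns "first", "second", or None when the projection is empty or
    the two halves are equally dense.
    """
    extent = nonzero_extent(projection)
    if extent is None:
        return None
    lo, hi = extent
    mid = (lo + hi + 1) // 2  # the second half starts at this index
    first = sum(projection[lo:mid])
    second = sum(projection[mid:hi + 1])
    if first == second:
        return None
    return "first" if first < second else "second"
```

Applied to the vertical projection, "second" would suggest a shadow toward the right of the object, and applied to the horizontal projection it would suggest a shadow toward the bottom.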
Step 3: Search for possible adjacent positions between shadows and moving objects
The shadow position feature states that the outer contour edges of a shadow are usually adjacent to the moving object. Step 2 yields the search direction for each frame, marked by red arrows in Figure 7. The adjacent position can then be searched for with the following strategy:
When a probable adjacent position i or j appears, stop searching and go to Step 4; otherwise continue searching.
Step 4: Remove the false adjacent position
The adjacent positions i or j obtained in Step 3 by Equation (11) may well be cusp points of the moving object, which are false adjacent positions. These false positions must be filtered out. The judgment rule based on the proportion method is as follows.
If j in Equation (12) is a true adjacent position, it is retained; otherwise it is eliminated. This is only an example for the horizontal direction, assuming that the shadow is on the right. The other seven cases are judged by similar rules.
Step 5: Eliminate shadow edge points accurately
Once the shadow position has been located accurately, count the edge pixels in each row and column of the shadow region. If the count for a row or column in the shadow region is smaller than an experimentally determined threshold, remove the whole row or column by setting its values to zero in the edge image obtained in Section 2, from which the shadows are to be eliminated. This is the key to extracting the moving object completely. Figure 9 shows the shadow removal results without and with Step 3.
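The column case of this removal rule can be sketched in Python as follows; rows would be handled in the same way. The column bounds of the shadow region, the `min_count` threshold, and the function name are hypothetical inputs, since the experimentally determined threshold is not given in the text.

```python
def remove_sparse_columns(edge_image, col_start, col_end, min_count):
    """Zero every column in [col_start, col_end] with too few edge pixels.

    edge_image is a 2D list in which nonzero values mark edge pixels.
    Returns a cleaned copy; the input image is left unchanged.
    """
    cleaned = [row[:] for row in edge_image]
    for j in range(col_start, col_end + 1):
        count = sum(1 for row in cleaned if row[j] != 0)
        if count < min_count:
            # Sparse column inside the shadow region: treat it as shadow edge.
            for row in cleaned:
                row[j] = 0
    return cleaned
```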
Finally, fill in the remaining moving object edges, process them with mathematical morphology, and map the alpha plane to the moving object.
4. Experiment results and discussion
To verify the effectiveness of the proposed algorithm, we conduct experiments on two types of video sequences with shadows, listed in Table 2.
Figure 10 compares the shadow elimination results of an existing algorithm that considers only the HSV color space with those of the proposed algorithm for the 1st and 3rd frames of the video sequence “Table”. Figure 11 gives the corresponding comparison for the 10th and 13th frames of the video sequence “Silent”.
In Figure 10, the pixel values of the background and foreground have a high contrast, but the background changes little when the shadow is cast on it. If shadows are suppressed only in HSV color space, some shadow edges are taken for moving object edges.
In Figure 11, the background brightness varies little when the shadow is cast on it; moreover, the background is complex and close to the foreground in color. This makes color invariance ineffective, so that some shadow edges are missed or falsely detected and some moving object edges are wrongly eliminated as shadow edges. The comparisons in Figures 10 and 11 show that the proposed algorithm overcomes these problems and extracts the complete moving object accurately and robustly.
To further validate the adaptability of the proposed algorithm to shadow direction, we also shot two video sequences with shadows in different directions. The experimental results are shown in Figure 12.
The proposed algorithm can also be applied in more complex situations, such as scenes with multiple objects. The experimental results for the video sequence “Men walking” are shown in Figure 13.
To further test the algorithm in cases where shadows connect with other foreground objects and their shadows, we shot another two video sequences, “Lamp man” and “Car men”. Both sequences contain multiple objects and crossed shadows. The results are shown in Figures 14 and 15 and indicate that the proposed method copes effectively with such situations.
To test the ability of the proposed algorithm to deal with shadows in different directions, results for another three video sequences (“Ball man A”, “Ball man B”, and “Ball man C”) are also presented. These three sequences were shot at the same spot at different times of the day. “Ball man A” and “Ball man B” were shot at around 9 and 10 am, respectively. “Ball man C” was shot at noon under an overcast sky, so its shadows are very weak and much smaller than in the other two. The results, shown in Figures 16, 17, and 18, indicate that the proposed method can also be applied in such circumstances.
To evaluate the validity and effectiveness of the proposed algorithm objectively, we use the criterion proposed by Wollborn and Mech [21] to judge the extracted results. The spatial accuracy (SA) of the video object mask of each frame is defined in this criterion as follows:
A_t^ref(x, y) and A_t^est(x, y) denote the video object mask of the reference frame and the VOP alpha mask we obtained, respectively, and ⊕ is the “XOR” operation. SA reflects the similarity between the reference video object mask and the obtained VOP alpha mask for each frame. A higher SA indicates a more accurate segmentation, while a lower SA indicates a poorer result.
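As a hedged illustration, the following Python sketch computes SA in the form commonly attributed to this criterion: one minus the number of pixels at which the two masks disagree (their XOR), divided by the number of object pixels in the reference mask. This normalization is assumed here, and the function name is hypothetical.

```python
def spatial_accuracy(ref_mask, est_mask):
    """Spatial accuracy of an estimated mask against a reference mask.

    Both masks are equal-sized 2D lists; nonzero values mark object pixels.
    Assumed definition: SA = 1 - sum(ref XOR est) / sum(ref).
    """
    mismatches = 0
    ref_area = 0
    for ref_row, est_row in zip(ref_mask, est_mask):
        for ref_value, est_value in zip(ref_row, est_row):
            ref_on = ref_value != 0
            est_on = est_value != 0
            if ref_on != est_on:  # XOR of the two binary masks
                mismatches += 1
            if ref_on:
                ref_area += 1
    if ref_area == 0:
        raise ValueError("reference mask contains no object pixels")
    return 1.0 - mismatches / ref_area
```

Under this definition a perfect mask scores 1.0, and a mask with many false positives, such as one that keeps a large shadow, can score far lower.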
In this article, the reference video object masks are obtained by hand. Figures 19 and 20 display the SA of the first 15 frames of the video sequences “Table” and “Silent” for the two shadow elimination methods.
Figure 19 shows clearly that, when shadows are suppressed only in HSV, the SA ranges from 0.11 to 0.38, much lower than that of the proposed algorithm, which ranges from 0.62 to 0.80. In Figure 20, the SA of the proposed algorithm even stays above 0.90. These results indicate the validity and accuracy of the proposed algorithm.
Tables 3, 4, 5, 6, 7, 8, 9, and 10 list the SA of extracted results by using the proposed method for the first 14 frames of video sequences “Men”, “Wait”, “Men walking”, “Lamp man”, “Car men”, “Ball man A”, “Ball man B”, and “Ball man C”, respectively.
Because the parameters are set heuristically from the test sequences, machine learning techniques could make parameter determination more adaptive. Franek and Jiang [22] address the parameter selection problem in image segmentation and present a novel unsupervised framework for choosing parameters automatically. A supervised learning algorithm for quantum neural networks, based on a novel quantum neuron node implemented as a very simple quantum circuit, is proposed and investigated in [23]. Methods such as PSO, ML, and SVM also work well in overcoming the inadaptability of parameter selection [24–26]. Building on these machine learning references, we will apply a revised SVM method in future work to achieve automatic and adaptive parameter determination.
5. Conclusion
In this article, we propose an effective background difference approach to shadow detection and suppression based on shadow position and edge attributes, applied after HSV shadow removal. Compared with other methods, it can locate shadows in various positions and remove them accurately even when the chrominance invariance principle is ineffective or the color of the background texture is similar to that of the moving objects. The proposed method also preserves the completeness of the extracted moving object in cases where most color-based shadow removal methods fail. The experimental results demonstrate that the proposed method is simple and robust.
References
[1] Jiang H, Ardö H, Öwall V: A hardware architecture for real-time video segmentation utilizing memory reduction techniques. IEEE Trans. Circuits Syst. Video Technol 2009, 19(2):226-235.
[2] Huang SS, Fu LC, Hsiao PY: Region-level motion-based foreground segmentation under a Bayesian network. IEEE Trans. Circuits Syst. Video Technol 2009, 19(4):522-531.
[3] Wang Y: Real-time moving vehicle detection with cast shadow removal in video based on conditional random field. IEEE Trans. Circuits Syst. Video Technol 2009, 19(3):437-441.
[4] Wren CR, Azarbayejani A, Darrell T, Pentland AP: Pfinder: real-time tracking of the human body. IEEE Trans. Pattern Anal. Mach. Intell 1997, 19(7):780-785. 10.1109/34.598236
[5] Yong H, Tian JW, Chu Y, Tang QL, Liu J: Spatiotemporal smooth models for moving object detection. IEEE Signal Process. Lett 2008, 15: 497-500.
[6] McHugh JM, Konrad J, Saligrama V, Jodoin P-M: Foreground-adaptive background subtraction. IEEE Signal Process. Lett 2009, 16(5):390-393.
[7] Chien SY, Ma SY, Chen LG: Efficient moving object segmentation algorithm using background restoration technique. IEEE Trans. Circuits Syst. Video Technol 2002, 12(7):577-586. 10.1109/TCSVT.2002.800516
[8] Haritaoglu I, Harwood D, Davis LS: W4: real-time surveillance of people and their activities. IEEE Trans. Pattern Anal. Mach. Intell 2000, 22(8):809-830. 10.1109/34.868683
[9] Tsai DM, Lai SC: Independent component analysis-based background subtraction for indoor surveillance. IEEE Trans. Image Process 2009, 18(1):158-167.
[10] Piccardi M: Background subtraction techniques: a review. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, vol. 4; 3099-3104. 10-13 Oct 2004
[11] Prati A, Mikic I, Trivedi MM, Cucchiara R: Detecting moving shadows: algorithms and evaluation. IEEE Trans. Pattern Anal. Mach. Intell 2003, 25(7):918-923. 10.1109/TPAMI.2003.1206520
[12] Nadimi S, Bhanu B: Physical models for moving shadow and object detection in video. IEEE Trans. Pattern Anal. Mach. Intell 2004, 26(8):1079-1087. 10.1109/TPAMI.2004.51
[13] Stauder J, Mech R, Ostermann J: Detection of moving cast shadows for object segmentation. IEEE Trans. Multimed 1999, 1(1):65-76. 10.1109/6046.748172
[14] Jodoin P-M, Mignotte M, Konrad J: Statistical background subtraction using spatial cues. IEEE Trans. Circuits Syst. Video Technol 2007, 17(12):1758-1763.
[15] Cucchiara R, Grana C, Piccardi M, Prati A, Sirotti S: Improving shadow suppression in moving object detection with HSV color information. In Proceedings of the IEEE Intelligent Transportation Systems Conference. Oakland, CA; 334-339. 25-29 Aug 2001
[16] Tian JD, Sun J, Tang YD: Tricolor attenuation model for shadow detection. IEEE Trans. Image Process 2009, 18(10):2355-2363.
[17] Tsai VJD: A comparative study on shadow compensation of color aerial images in invariant color models. IEEE Trans. Geosci. Remote Sens 2006, 44(6):1661-1671.
[18] Lu YH, Xin HJ, Kong J, Li BB, Wang Y: Shadow removal based on shadow direction and shadow attributes. In Proceedings of the IEEE International Conference on Computational Intelligence for Modeling Control and Automation. Sydney, NSW; 37-41. 28 Nov-1 Dec 2006
[19] Pan X: Moving shadow detection based on color information and edge features. J. Zhejiang Univ. (Engineering Science) 2004, 38(4):389-391.
[20] Cucchiara R, Grana C, Piccardi M, Prati A: Detecting moving objects, ghosts and shadows in video streams. IEEE Trans. Pattern Anal. Mach. Intell 2003, 25(10):1337-1342. 10.1109/TPAMI.2003.1233909
[21] Wollborn M, Mech R: Refined procedure for objective evaluation of video object segmentation algorithms. 1998.
[22] Franek L, Jiang X: Adaptive parameter selection for image segmentation based on similarity estimation of multiple segmenters. Lecture Notes Comput. Sci 2011, 6493: 697-708. 10.1007/978-3-642-19309-5_54
[23] da Silva AJ, de Oliveira WR, Ludermir TB: Classical and superposed learning for quantum weightless neural networks. Neurocomputing 2012, 75(1):52-60. 10.1016/j.neucom.2011.03.055
[24] de Carvalho AB, Pozo A: Measuring the convergence and diversity of CDAS multi-objective particle swarm optimization algorithms: a study of many-objective problems. Neurocomputing 2012, 75(1):43-51. 10.1016/j.neucom.2011.03.053
[25] Lorena AC, Costa IG, Spolaor N, de Souto MCP: Analysis of complexity indices for classification problems: cancer gene expression data. Neurocomputing 2012, 75(1):33-42. 10.1016/j.neucom.2011.03.054
[26] Gomes TAF, Prudencio RBC, Soares C, Rossi ALD, Carvalho A: Combining meta-learning and search techniques to select parameters for support vector machines. Neurocomputing 2012, 75(1):3-13. 10.1016/j.neucom.2011.07.005
The authors would like to express their appreciation to the anonymous reviewers for their insightful comments, which helped improve this article. The study was supported by the National Natural Science Foundation of China (NSFC) under Grants nos. 61075011 and 60675018, and by the Scientific Research Foundation for the Returned Overseas Chinese Scholars from the State Education Ministry of China.
The authors declare that they have no competing interests.
Cite this article
Zhu, S., Guo, Z. & Ma, L. Shadow removal with background difference method based on shadow position and edges attributes. J Image Video Proc 2012, 22 (2012). https://doi.org/10.1186/1687-5281-2012-22