Probabilistic motion pixel detection for the reduction of ghost artifacts in high dynamic range images from multiple exposures
© An et al.; licensee Springer. 2014
Received: 15 April 2014
Accepted: 7 August 2014
Published: 21 August 2014
This paper presents an algorithm for compositing a high dynamic range (HDR) image from multi-exposure images that accounts for inconsistent pixels in order to reduce ghost artifacts. In HDR images, ghost artifacts may appear when objects move while multiple images are taken with different exposures. To prevent such artifacts, it is important to detect inconsistent pixels caused by moving objects in consecutive frames and then to assign zero weights to the corresponding pixels in the fusion process. This problem is formulated as a binary labeling problem based on a Markov random field (MRF) framework, the solution of which is a binary map for each exposure image that identifies the pixels to be excluded in the fusion process. To obtain the ghost map, the distribution of the zero-mean normalized cross-correlation (ZNCC) of an image with respect to the reference frame is modeled as a mixture of Gaussian functions, and the parameters of this mixture are used to design the energy function. However, this method does not reliably detect faint objects in regions that have low contrast due to over- or under-exposure, because the ZNCC shows little difference in such areas. Hence, we obtain an additional ghost map for the low-contrast regions, based on the intensity relationship between the frames. Specifically, the intensity mapping function (IMF) between the frames is estimated using pixels from high-contrast regions without inconsistent pixels, and pixels outside the tolerance range of the IMF are considered moving pixels in the low-contrast regions. As a result, inconsistent pixels in both the low- and high-contrast areas are detected, and thus HDR images without noticeable ghosts can be obtained.
The dynamic ranges of most commercial image sensors and display devices are narrower than the radiance range of an actual scene, and hence, under- or over-exposure is often inevitable. To overcome such limitations of image sensors and displays, a number of multi-exposure capturing and processing techniques have been proposed, which can be roughly categorized into two approaches: high dynamic range imaging (HDRI) with tone mapping [1–4] and image fusion methods [5–10]. The former generates an image of higher dynamic range (i.e., higher bit depth for each pixel) from multiple images having different exposures. To obtain this image, the camera response function (CRF) must be known or estimated, and a tone mapping process is needed when showing the synthesized HDR image on a low dynamic range (LDR) display. The latter, on the other hand, generates a tone-mapped-like high-quality image by the weighted addition of multiple exposure images and thus does not need CRF estimation, HDR image generation, or a tone-mapping process. Hence, fusion approaches tend to require fewer computations than conventional HDRI, while providing comparable image quality on LDR displays. Of course, HDRI is the more appropriate solution when showing images on HDR devices.
Conventional exposure fusion and HDRI work well for static scenes, where the multi-exposure images are well registered and there are no moving objects. However, ghost artifacts are often observed in HDR images of dynamic scenes, where the images are not aligned and/or some objects are moving. Hence, there have been many efforts to alleviate the ghosting problem in HDRI approaches. Some of the existing algorithms consider misalignment of input frames and moving objects simultaneously, while others assume well-aligned input or pre-registration of misaligned frames and concentrate on the detection of moving objects that cause inconsistency. For example, one study exploits a measure of local entropy differences to identify regions that might contain moving pixels, which are then excluded from the HDRI generation process. In addition, Khan et al. proposed an iterative method that gives larger weights to static and well-exposed pixels, thereby diminishing the weights for pixels that can cause ghosts. Li et al. [13, 14] proposed methods to detect and modify moving pixels based on the intensity mapping function (IMF). There are also patch-based methods, in which patches including moving objects are excluded [16, 17]. To deal with misalignment and moving objects simultaneously, Zimmer et al. proposed an optical flow-based energy minimization method, and Hu et al. used non-rigid dense correspondence and a color transfer function for this task. Recently, low-rank matrix-based algorithms [20, 21] have also been presented, based on the assumption that irradiance maps are linearly related to LDR exposures.
In the case of exposure fusion, there are similar approaches for ghost removal. For example, the median threshold bitmap approach was proposed to detect clusters of inconsistent pixels, which are then excluded when fusing the images. In addition, a gradient domain approach was introduced that gives smaller weights to inconsistent pixels. The IMF is used to exclude regions of inconsistent pixels in the fusion process [24, 25], where the images are over-segmented and the IMF is used to detect the inconsistent regions. In our previous work, we proposed a method to detect inconsistent pixels based on a test of the reciprocity law of exposure and the measure of zero-mean normalized cross-correlation (ZNCC). Note that the ZNCC between a region in an image and its corresponding region in the reference is close to 1 when there is no moving object. Hence, a pixel is considered inconsistent when the region around it shows a ZNCC below a certain threshold; that is, hard thresholding of the ZNCC was used.
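The hard-thresholding idea can be illustrated with a minimal sketch. The function names, the flattened-patch representation, and the threshold value of 0.5 are assumptions made for illustration only; they are not taken from the previous work.

```python
import math

def zncc(patch_a, patch_b, eps=1e-12):
    """Zero-mean normalized cross-correlation of two equal-length patches."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [a - mean_a for a in patch_a]
    db = [b - mean_b for b in patch_b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    # A flat patch has zero variance, so the ZNCC is undefined there;
    # returning 0.0 is a convention chosen for this sketch.
    return num / den if den > eps else 0.0

def is_inconsistent(patch, ref_patch, threshold=0.5):
    """Hard thresholding: flag the patch centre when the ZNCC is low."""
    return zncc(patch, ref_patch) < threshold

ref = [10, 20, 30, 40]       # patch in the reference frame
brighter = [25, 45, 65, 85]  # same content under a longer exposure
moved = [40, 10, 30, 20]     # different content at the same position
print(zncc(brighter, ref), zncc(moved, ref))
```

Because the ZNCC removes the patch mean and normalizes by the patch energy, an affine brightness change between exposures leaves it at 1, whereas changed content lowers it. The flat-patch case hints at why the measure is uninformative in saturated regions.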
In this paper, we propose a probabilistic approach to constructing a ghost map, which is a binary image depicting the pixels to be excluded in the exposure fusion process. We assume that the images are well registered; otherwise, a registration algorithm is applied first. The basic measure is again the ZNCC, but probabilistic soft thresholding is used instead of the hard thresholding used in our previous work. Specifically, the ZNCC histogram is modeled as a Gaussian mixture function, whose parameters are found using an expectation maximization (EM) algorithm. Generating a ghost map is then posed as a binary labeling problem based on a Markov random field (MRF) framework, where the energy to be minimized is designed as a function of the ZNCC distribution parameters. It will be shown that the proposed method provides a less noisy and more accurate binary map than the simple hard thresholding method.
However, as in other feature-based methods, the ZNCC shows meaningful differences only for well-contrasted and highly textured regions. Hence, feature-based methods often give incorrect results in low-contrast regions where the pixel values are about to be saturated due to over- or under-exposure and also in low-textured regions. For these regions, we exploit the IMF between the images, which was successfully used in [13, 14, 24, 25]. In this paper, the IMF is estimated from regions having high ZNCC only, because other regions are saturated or moving object regions that have low credibility in estimating the IMF. Then, the pixels lying outside the IMF tolerance are considered pixels on the faint moving objects. To determine the ghost map in this region, we also develop an optimization technique, which yields less noisy results than conventional IMF-based thresholding methods. Experimental results show that the proposed method constructs plausible ghost maps and hence yields pleasing HDR images without noticeable ghost artifacts.
The rest of this paper is organized as follows: In the second section, we review the conventional weight map generation method. In the third section, we describe the proposed algorithm that excludes the ghost pixels from the weight map. Then, we show some experimental results, and finally, conclusions are given in the last section.
Review of exposure fusion
The core of the ghost reduction algorithm is to find inconsistent pixels that can cause artifacts, so that they can be excluded from the fusion process. For this task, we first select a reference frame among the multi-exposure images, namely the one that has the largest well-contrasted region. Then, in every input frame other than the reference, we find the regions that contain objects that have moved with respect to the reference. More specifically, we construct a ghost map (a binary image) for each input frame except the reference, which indicates whether each pixel is to be included in or excluded from the image fusion process. When a pixel in the ghost map is 1, the corresponding pixel in the input frame is included in the fusion process; when it is 0, the pixel is excluded.
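The role of the ghost map in the fusion can be sketched for a single pixel as follows. This is only an illustration: the function name and the example weights are hypothetical, and the fusion weights themselves are assumed to come from whatever weight map generation method is in use.

```python
def fuse_pixel(values, weights, ghost_bits, eps=1e-12):
    """Weighted average of one pixel across exposures.

    A frame whose ghost-map bit is 0 receives zero weight; the remaining
    weights are renormalized so that they sum to one.
    """
    masked = [w * g for w, g in zip(weights, ghost_bits)]
    total = sum(masked)
    if total <= eps:
        # Assumption of this sketch: the reference frame always has bit 1,
        # so at least one frame should contribute at every pixel.
        raise ValueError("no frame contributes to this pixel")
    return sum(v * m for v, m in zip(values, masked)) / total

# Three exposures; the second one is flagged as a ghost at this pixel.
print(fuse_pixel([100, 200, 50], [0.5, 0.3, 0.2], [1, 0, 1]))
```

Renormalizing after masking keeps the brightness of the fused pixel stable when a frame is dropped.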
The proposed method begins by finding the reference frame that has the largest well-contrasted region (i.e., the smallest saturated region), as in conventional methods [16, 24–26, 29, 30]. Note that our method identifies the inconsistent pixels in high-contrast regions and low-contrast regions separately. For this, we define a saturation map $b$, which is also a binary matrix with the size of the input image. This matrix can be constructed while finding the reference frame, because the contrast of the regions is checked at that time. Precisely, if we denote the element of $b$ at pixel position $p$ as $b(p)$, then $b(p)=1$ when $p$ belongs to a well-contrasted region in the reference frame and $b(p)=0$ when it belongs to a low-contrast region. In summary, for each input frame except the reference, we find the ghost maps for the regions of $b(p)=0$ and $b(p)=1$ separately. The ghost map for the well-contrasted region ($b(p)=1$) is denoted as $g_w$ and the ghost map for the low-contrast region ($b(p)=0$) as $g_l$ in the rest of this paper. After finding these ghost maps for an input frame, the overall ghost map for the frame is constructed as $g = g_w \cup g_l$. In this paper, finding $g_w$ and $g_l$ is posed as a pair of binary labeling problems, i.e., energy minimization problems that are solved by graph cuts.
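The bookkeeping described above can be sketched as follows. Note the simplifying assumption: a crude per-pixel intensity test (the hypothetical thresholds `low` and `high`) stands in for the contrast check, which is not specified in this section. The function names are likewise illustrative.

```python
def saturation_map(frame, low=10, high=245):
    """b(p) = 1 where the pixel is neither under- nor over-exposed (assumed test)."""
    return [[1 if low <= v <= high else 0 for v in row] for row in frame]

def pick_reference(frames, low=10, high=245):
    """Index of the frame with the largest well-exposed area."""
    counts = [sum(map(sum, saturation_map(f, low, high))) for f in frames]
    return max(range(len(frames)), key=counts.__getitem__)

def combine_ghost_maps(b, g_w, g_l):
    """Overall map g: take g_w where b(p) = 1 and g_l where b(p) = 0."""
    return [[gw if bit else gl for bit, gw, gl in zip(rb, rw, rl)]
            for rb, rw, rl in zip(b, g_w, g_l)]

frames = [
    [[0, 5], [250, 255]],     # mostly saturated
    [[20, 100], [200, 250]],  # mostly well exposed
    [[0, 0], [128, 255]],
]
ref_index = pick_reference(frames)
b = saturation_map(frames[ref_index])
print(ref_index, b)
```

Since $g_w$ and $g_l$ are defined on disjoint sets of pixels, selecting each pixel's value from the map that owns it is equivalent to taking their union.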
Construction of $g_w$
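The ghost map $g_w$ is found by minimizing an energy that combines a data cost over the pixels in $P_W$ and a smoothness cost over the neighboring pairs in $N_W$, where $g_w(p)$ is the pixel value (1 or 0) at a pixel $p$, $P_W$ is the set of pixels in the well-contrasted region (all $p$ with $b(p)=1$), $N_W$ is the set of all unordered pairs of neighboring pixels over the areas where $b(p)=1$, and $\gamma_W$ is a weighting factor for balancing the data cost and the smoothness cost.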
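For orientation, a generic binary MRF energy that is consistent with the symbols defined here can be written as below. The names $D_p$ and $V_{p,q}$ for the data and smoothness terms are placeholders assumed for this sketch; the notation of the original equation may differ.

```latex
E(g_w) \;=\; \sum_{p \in P_W} D_p\bigl(g_w(p)\bigr)
\;+\; \gamma_W \sum_{\{p,q\} \in N_W} V_{p,q}\bigl(g_w(p),\, g_w(q)\bigr)
```

A larger $\gamma_W$ favors spatially smooth maps, while a smaller one lets the per-pixel data cost dominate.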
The data cost
The data cost is built on a two-component Gaussian mixture model of the ZNCC values, where $i$ is the state and the input data $x$ are the ZNCC values. Using the EM algorithm, we learn the parameters of the two Gaussian density functions: the mean $\mu_i$, variance $\sigma_i$, and weight $Pr(i)$.
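A minimal EM sketch for a one-dimensional mixture of two Gaussians is given below. The initialization, the iteration count, and the use of the negative log of the weighted component density as a data cost are assumptions of this sketch; they are a common choice rather than the exact definition used in the paper.

```python
import math

def log_gauss(x, mean, var):
    """Log of the Gaussian density N(x; mean, var)."""
    return -0.5 * math.log(2.0 * math.pi * var) - (x - mean) ** 2 / (2.0 * var)

def fit_two_gaussians(xs, iters=100, min_var=1e-6):
    """EM for a 1-D two-component mixture; returns [(weight, mean, var), ...]."""
    # Initial guess (an assumption): one component at each end of the data range.
    params = [(0.5, min(xs), 0.1), (0.5, max(xs), 0.1)]
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each sample,
        # computed in the log domain to avoid underflow.
        resp = []
        for x in xs:
            logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in params]
            top = max(logs)
            likes = [math.exp(l - top) for l in logs]
            total = sum(likes)
            resp.append([l / total for l in likes])
        # M-step: re-estimate weight, mean, and variance of each component.
        new_params = []
        for i, old in enumerate(params):
            r_sum = sum(r[i] for r in resp)
            if r_sum < 1e-12:  # component died out; keep its old parameters
                new_params.append(old)
                continue
            mean = sum(r[i] * x for r, x in zip(resp, xs)) / r_sum
            var = sum(r[i] * (x - mean) ** 2 for r, x in zip(resp, xs)) / r_sum
            new_params.append((r_sum / len(xs), mean, max(var, min_var)))
        params = new_params
    return params

def data_cost(x, params, i):
    """Assumed data cost: negative log of the weighted density of state i."""
    w, m, v = params[i]
    return -(math.log(w) + log_gauss(x, m, v))

# Toy ZNCC samples: a low cluster (moving) and a high cluster (static).
zncc_values = [0.1, 0.15, 0.2, 0.25, 0.3, 0.9, 0.92, 0.95, 0.97, 0.99, 0.93, 0.96]
params = fit_two_gaussians(zncc_values)
print(params)
```

With this initialization, component 0 settles on the low-ZNCC cluster and component 1 on the high-ZNCC cluster, so a ZNCC value near 1 is cheaper to label with state 1 than with state 0.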
The smoothness cost
The smoothness cost uses an edge cue $B(p,q)$, which represents the intensity difference between the neighboring pixels $p$ and $q$. When adjacent pixels lie on opposite sides of an edge, the smoothness cost is diminished by $B(p,q)$.
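One common way to realize such an edge cue is the exponential contrast term used in interactive graph-cut segmentation. The sketch below adopts that form as an assumption; the exact expression for $B(p,q)$ and the value of `sigma` are not given in this section and are chosen here for illustration.

```python
import math

def edge_cue(i_p, i_q, sigma=10.0):
    """Assumed edge cue: close to 1 for similar intensities, close to 0 across an edge."""
    return math.exp(-((i_p - i_q) ** 2) / (2.0 * sigma ** 2))

def smoothness_cost(label_p, label_q, i_p, i_q, sigma=10.0):
    """Penalty for giving neighbouring pixels different labels.

    The penalty is scaled by the edge cue, so a label change that coincides
    with an intensity edge costs almost nothing.
    """
    return edge_cue(i_p, i_q, sigma) if label_p != label_q else 0.0

print(smoothness_cost(0, 1, 100, 100))  # label change inside a flat area
print(smoothness_cost(0, 1, 100, 200))  # label change across a strong edge
```

Under this choice, boundaries of the ghost map are encouraged to follow intensity edges in the image.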
Construction of $g_l$
The ghost map $g_w$ for the well-contrasted region is found from the above procedure, and now we find the ghost map $g_l$ for the low-contrast region (the region with $b(p)=0$). The problem with the low-contrast region is that there is too little texture to apply the feature-based methods (such as median pixel value, gradient, and ZNCC). Hence, we resort to the intensity relationship between the frames for detecting the motion pixels in these areas. The basic idea is that the intensity of a static area changes according to the exposure difference, whereas an area with motion pixels does not follow this relationship. In other words, the luminance of a static area changes according to the IMF, whereas that of a dynamic area does not.
The ghost map $g_l$ is likewise found by minimizing an energy that combines a data cost and a smoothness cost, where $g_l(p)$ is the pixel value (1 or 0) at a pixel $p$, $P_L$ is the set of pixels in the low-contrast region (all $p$ with $b(p)=0$), $N_L$ is the set of all unordered pairs of neighboring pixels over the areas where $b(p)=0$, and $\gamma_L$ is a weighting factor for balancing the data cost and the smoothness cost. The smoothness cost, which prevents noisy results, is defined as in Equations 6a and 6c, except that $g_w$ is replaced by $g_l$.
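To show how the smoothness term suppresses isolated labels, the sketch below minimizes a binary energy of this kind with iterated conditional modes (ICM). ICM is a simple local-search stand-in chosen to keep the example dependency-free; it is not the graph-cut solver used in this paper, and it only reaches a local minimum. The plain Potts smoothness term (without an edge cue) and all names are assumptions of the sketch.

```python
def icm_binary(data_cost, gamma=1.0, sweeps=10):
    """Approximate minimizer of a binary MRF energy on a 4-connected grid.

    data_cost[y][x] is a pair (cost of label 0, cost of label 1). The
    smoothness term charges gamma for every neighbouring pair with
    different labels.
    """
    h, w = len(data_cost), len(data_cost[0])
    # Start from the labelling that minimizes the data cost alone.
    labels = [[0 if c[0] <= c[1] else 1 for c in row] for row in data_cost]
    for _ in range(sweeps):
        changed = False
        for y in range(h):
            for x in range(w):
                nbrs = [labels[ny][nx]
                        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                        if 0 <= ny < h and 0 <= nx < w]
                costs = [data_cost[y][x][l] + gamma * sum(1 for n in nbrs if n != l)
                         for l in (0, 1)]
                best = 0 if costs[0] <= costs[1] else 1
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:
            break
    return labels

# A 3x3 patch: every pixel clearly prefers label 1 except a weak outlier in the centre.
costs = [[(2.0, 0.0)] * 3 for _ in range(3)]
costs[1][1] = (0.0, 0.5)
print(icm_binary(costs, gamma=1.0))
print(icm_binary(costs, gamma=0.0))
```

With the smoothness weight set to zero the outlier survives, which corresponds to plain per-pixel thresholding; with a positive weight it is absorbed by its neighbors.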
The data cost
As stated above, we use the ‘compliance of IMF’ for detecting the moving pixels in the low-contrast region, and thus we have to estimate the IMF. Unlike the existing IMF estimation methods in [24, 25], which use all the pixels without considering pixel quality, we use only the pixels in the ‘high-contrast region without moving objects,’ which corresponds to the region of $g_w(p)=1$ for a given image.
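The idea of estimating the IMF from trusted pixels only, and then testing compliance, can be sketched as follows. The per-level median estimator, the forward filling of unseen levels, the monotonicity constraint, and the tolerance of 10 gray levels are all assumptions made for this illustration; the estimator used in the paper may differ.

```python
from statistics import median

def estimate_imf(ref_vals, frame_vals, trusted, levels=256):
    """Lookup table mapping a reference intensity to the expected intensity
    in the other frame, estimated from trusted pixels only."""
    buckets = {}
    for r, f, ok in zip(ref_vals, frame_vals, trusted):
        if ok:  # e.g. pixels with g_w(p) = 1
            buckets.setdefault(r, []).append(f)
    if not buckets:
        raise ValueError("no trusted pixels available")
    current = median(buckets[min(buckets)])
    imf = []
    for level in range(levels):
        if level in buckets:
            # Keep the table non-decreasing: more light never maps to less.
            current = max(current, median(buckets[level]))
        imf.append(current)  # unseen levels reuse the last known value
    return imf

def violates_imf(ref_val, frame_val, imf, tol=10):
    """True when a pixel pair lies outside the tolerance band around the IMF."""
    return abs(frame_val - imf[ref_val]) > tol

ref_vals = [10, 10, 10, 50, 50, 100, 100, 100]
frame_vals = [20, 22, 200, 100, 102, 200, 198, 202]
trusted = [1, 1, 0, 1, 1, 1, 1, 1]  # the third pair is a known moving pixel
imf = estimate_imf(ref_vals, frame_vals, trusted)
print(imf[10], imf[50], imf[100])
```

Excluding untrusted pairs matters here: had the third pair been included, the estimate at level 10 would have been pulled toward the moving object's intensity.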
We have proposed an HDR image fusion algorithm with reduced ghost artifacts, which detects inconsistent pixels in the high-contrast and low-contrast regions separately. To detect inconsistent pixels in high-contrast areas, a ZNCC measure is used, based on the observation that the ZNCC histogram displays a unimodal distribution in static regions, whereas it has a multimodal shape in dynamic regions. A cost function based on the parameters of these probability distributions is designed, whose minimization yields the ghost map for the high-contrast region. To obtain the ghost map in the low-contrast region, the IMF is first estimated using pixels from the high-contrast regions having no moving objects. Next, a cost function that encodes the IMF compliance of the pixel pairs is designed, whose minimization gives the ghost map for the low-contrast areas. The overall ghost map is defined as the union of these two maps, and the ghost pixels are excluded from the fusion process. Since the proposed algorithm can find faint moving objects in areas where the pixel values are about to be saturated due to over- or under-exposure, it provides satisfactory HDR outputs with no noticeable ghost artifacts. However, the proposed method has limitations in correcting a moving foreground object when it is saturated in the reference frame (Figure 15), because such pixels are simply excluded from the fusion process. In this case, we have to manually select a reference frame that has well-exposed foreground objects, which can degrade the fusion results because that frame has a narrower well-exposed background region than the automatically selected reference. Otherwise, we need to correct the inconsistent pixels instead of simply excluding them, which is a very challenging problem, especially when the moving foreground object is not consistently detected in each frame due to saturation, noise, or non-rigid motion.
This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (2009-0083495).
- Debevec PE, Malik J: Recovering high dynamic range radiance maps from photographs. In Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH’97). Los Angeles; Aug 1997:369-378.
- Reinhard E, Ward G, Pattanaik S, Debevec P: High Dynamic Range Imaging: Acquisition, Display, and Image-Based Lighting (The Morgan Kaufmann Series in Computer Graphics). Morgan Kaufmann, San Francisco; 2005.
- Mann S, Picard RW: On being ‘undigital’ with digital cameras: extending dynamic range by combining differently exposed pictures. In Proceedings of the 48th IS&T’s Annual Conference. Washington, DC; May 1995:442-448.
- Devlin K: A review of tone reproduction techniques. Technical report CSTR-02-005. Department of Computer Science, University of Bristol; 2002.
- Goshtasby AA: Fusion of multi-exposure images. Image Vis. Comput 2005, 23(6):611-618. 10.1016/j.imavis.2005.02.004
- Mertens T, Kautz J, Reeth FV: Exposure fusion: a simple and practical alternative to high dynamic range photography. Comput. Graph Forum 2009, 28(1):161-171. 10.1111/j.1467-8659.2008.01171.x
- Raman S, Chaudhuri S: A matte-less, variational approach to automatic scene compositing. In IEEE International 11th Conference on Computer Vision. Los Alamitos; Oct 2007:1-6.
- Malik MH, Asif S, Gilani M: Wavelet based exposure fusion. In Proceedings of the World Congress on Engineering. London; July 2008:688-693.
- Raman S, Chaudhuri S: Bilateral filter based compositing for variable exposure photography. In Proceeding of Eurographics Short Papers. Munich; Mar 2009:1-4.
- Shen J, Zhao Y, He Y: Detail-preserving exposure fusion using subband architecture. Vis. Comput 2012, 28(5):463-473. 10.1007/s00371-011-0642-3
- Jacobs K, Loscos C, Ward G: Automatic high-dynamic range image generation for dynamic scenes. IEEE Comput. Graph. Appl 2008, 28(2):84-93.
- Khan E, Akyuz A, Reinhard E: Robust generation of high dynamic range images. In Proceedings of the IEEE International Conference on Image Processing. Atlanta; Oct 2006:2005-2008.
- Li Z, Rahardja S, Zhu Z, Xie S, Wu S: Movement detection for the synthesis of high dynamic range images. In Proceedings of the IEEE International Conference on Image Processing. Hong Kong; Sept 2010:3133-3136.
- Wu S, Xie S, Rahardja S, Li Z: A robust and fast anti-ghosting algorithm for high dynamic range imaging. In Proceedings of the IEEE International Conference on Image Processing. Hong Kong; Sept 2010:397-400.
- Grossberg MD, Nayar SK: Determining the camera response from images: what is knowable. IEEE Trans. Pattern Anal. Mach. Intell 2003, 25(11):1455-1467. 10.1109/TPAMI.2003.1240119
- Gallo O, Gelfand N, Chen W, Tico M, Pulli K: Artifact-free high dynamic range imaging. In IEEE International Conference on Computational Photography. San Francisco; Apr 2009:1-7.
- Zheng J, Li Z, Zhu Z, Wu S, Rahardja S: Hybrid patching for a sequence of differently exposed images with moving objects. IEEE Trans. Image Process 2013, 22(12):5190-5201.
- Zimmer H, Bruhn A, Weickert J: Freehand HDR imaging of moving scenes with simultaneous resolution enhancement. Comput. Graph. Forum 2011, 30(2):405-414. 10.1111/j.1467-8659.2011.01870.x
- Hu J, Gallo O, Pulli K: Exposure stacks of live scene with hand-held cameras. In 12th European Conference on Computer Vision. Firenze; Oct 2012:499-512.
- Oh T-H, Lee J-Y, Kweon IS: High dynamic range imaging by a rank-1 constraint. In IEEE International Conference on Image Processing. Melbourne; Sept 2013:790-794.
- Lee C, Li Y, Monga V: Ghost-free high dynamic range imaging via rank minimization. IEEE Signal Process. Lett 2014, 21(9):1045-1049.
- Pece F, Kautz J: Bitmap movement detection: HDR for dynamic scenes. In The 11th European Conference on Visual Media Production. London; Nov 2010:1-8.
- Zhang W, Cham W-K: Gradient-directed composition of multi-exposure images. In IEEE Conference on Computer Vision and Pattern Recognition. San Francisco; June 2010:530-536.
- Raman S, Chaudhuri S: Bottom-up segmentation for ghost-free reconstruction of a dynamic scene from multi-exposure images. In Proceedings of the Seventh Indian Conference on Computer Vision, Graphics and Image Processing. Chennai; Dec 2010:56-63.
- Raman S, Chaudhuri S: Reconstruction of high contrast images for dynamic scenes. Vis. Comput 2011, 27(12):1099-1114. 10.1007/s00371-011-0653-0
- An J, Lee SH, Kuk JG, Cho NI: A multi-exposure image fusion algorithm without ghost effect. In IEEE International Conference on Acoustics, Speech, and Signal Processing. Prague; May 2011:1565-1568.
- Malik J, Perona P: Preattentive texture discrimination with early vision mechanisms. J. Opt. Soc. Am 1990, 7(5):923-932. 10.1364/JOSAA.7.000923
- Burt P, Adelson E: The Laplacian pyramid as a compact image code. IEEE Trans. Comm 1983, 31(4):532-540. 10.1109/TCOM.1983.1095851
- An J, Ha SJ, Kuk JG, Cho NI: Reduction of ghost effect in exposure fusion by detecting the ghost pixels in saturated and non-saturated regions. In IEEE International Conference on Acoustics, Speech, and Signal Processing. Kyoto; Mar 2012:1101-1104.
- Hu J, Gallo O, Pulli K, Sun X: HDR deghosting: how to deal with saturation? In IEEE Conference on Computer Vision and Pattern Recognition. Portland; June 2013:1163-1170.
- Boykov YY, Jolly MP: Interactive graph cuts for optimal boundary & region segmentation of objects in n-d images. In Proceedings of International Conference on Computer Vision. Vancouver; July 2001:105-112.
- Boykov Y, Veksler O, Zabih R: Markov random fields with efficient approximations. In IEEE Conference on Computer Vision and Pattern Recognition. Santa Barbara; June 1998:648-655.
- He K, Sun J, Tang X: Guided image filtering. IEEE Trans. Pattern Anal. Mach. Intell 2013, 35(6):1397-1409.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.