Saliency detection in complex scenes
© Xu et al.; licensee Springer. 2014
Received: 27 July 2013
Accepted: 17 June 2014
Published: 24 June 2014
Detecting multiple salient objects in complex scenes is a challenging task. In this paper, we present a novel method to detect salient objects in images. The proposed method is based on the general ‘center-surround’ visual attention mechanism and the spatial frequency response of the human visual system (HVS). The saliency computation is performed in a statistical way. The method follows three biologically inspired principles and computes saliency from two ‘scatter matrices’, which measure the variability within and between two classes, i.e., the center and surrounding regions, respectively. In order to detect multiple salient objects of different sizes in a scene, the saliency of a pixel is estimated via its saliency support region, which is defined as the most salient region centered at the pixel. Compliance with human perceptual characteristics enables the proposed method to detect salient objects in complex scenes and predict human fixations. Experimental results on three eye tracking datasets verify the effectiveness of the method and show that the proposed method outperforms the state-of-the-art methods on the visual saliency detection task.
1 Introduction
Visual saliency is a state or quality which makes an item, e.g., an object or a person, stand out from its surroundings. Humans, as well as most primates, have a marvelous ability to interpret complex scenes and direct their attention to the salient objects or regions in the visual environment in real time. Two approaches to deploying algorithms based on visual attention have been proposed: bottom-up and top-down.
For much research in physiology, neuropsychology, cognitive science, and computer vision, it is essential to study the mechanisms of human attention. The understanding of visual attention is helpful for object-of-attention image segmentation [5, 6], adaptive coding, image registration, video analysis, and perceptual image/video representation. Most models of attention are bottom-up and biologically inspired. Typically, these models posit that saliency is the impetus for selective vision. Saliency detection can be performed based on center-surround contrast, information theory [11, 12], graph models, common similarity [14–16], or learning methods [17, 18].
1. Appearance distinctness between an object and its surroundings. It is generally considered that the response of neurons comes from the contrast of the center region and the surrounding regions. For a pixel p which is inside an object, the pixel is salient if the object is distinct from its surroundings.
2. Unevenness of appearance within an object and similarity of appearance within its surrounding region. For a pixel p inside a region (the center), the pixel is salient if the variability of the region is not too low, since some intermediate spatial frequency stimuli may evoke a peak response. Meanwhile, the stimuli from the surrounding region should be as weak as possible because the surround is antagonistic to the center, and a stronger response will be evoked without the surrounding stimuli. So, the surrounding region should have low variability.
3. Large object size. According to neuropsychological experiments, the attentive response increases when the stimulus size is large and the object is attended. If multiple objects have the same distinctness with respect to their surroundings, humans may pay attention to the larger object first.
Extending our previous work, we propose a novel method to measure visual saliency based on biologically plausible saliency mechanisms with a reasonable mathematical formulation. We define the saliency of an image region in a statistical way by means of scatter matrices. For a pixel in an image, the center region and the surrounding region are defined as regions centered at the pixel. For a center and a surrounding region, the saliency value is determined by two scatter matrices of the visual features. The first is the ‘within-classes scatter matrix’ SW, which expresses the similarity of the features within the center region and within the surrounding region. The second is the ‘between-classes scatter matrix’ SB, which describes how the feature statistics in the center region diverge from those in the surrounding region. For a pixel, there exist many concentric regions with different radii, which differ in prominence with respect to their surroundings. In order to detect the most salient objects in a scene, the saliency support region of a pixel is explored, which is the most salient center region among all the concentric center regions. In order to make large objects more salient, the saliency value is weighted by the radius of the saliency support region.
The proposed method has two advantages. First, it is based on the computational architecture of human visual attention. The mechanism of the method is consistent with human perceptual characteristics. So, the method has a good performance for human fixation prediction. Secondly, the proposed method searches the potential saliency support regions to measure the saliency of multiple objects at different scales. This mechanism enables the method to explore various salient objects adaptively. It is effective for saliency detection in complex scenes.
The proposed method is evaluated on three eye tracking datasets which comprise natural images in different scenes and the corresponding human fixation data. Compared against 12 state-of-the-art methods on the human fixation data, our method achieves the largest area under the receiver operating characteristic (ROC) curve for the human fixation prediction task.
This paper is organized as follows: The related work is in Section 2. Section 3 describes the proposed method for saliency detection. The experimental results are provided in Section 4 to verify the effectiveness of the method. Finally, the conclusion is drawn in Section 5.
2 Related work
In the past few decades, many bottom-up saliency-driven methods have been proposed in cognitive fields, which can be broadly classified as biologically inspired, purely computational, or an integration of the two.
Many attention models are based on the biologically inspired architecture proposed by Koch and Ullman, which is motivated by Treisman and Gelade’s feature integration theory (FIT). This structure explains human visual search strategies, i.e., the visual input is first divided into several feature types (e.g., intensity, color, or orientation) which are explored concurrently, and then the conspicuities of the features are combined into a saliency or master map, which is a scalar, two-dimensional map providing higher intensities for the most prominent areas.
Following this biologically plausible architecture, a popular bottom-up attention model was proposed by Itti et al. In Itti’s model, three local feature contrasts extracted at multiple resolutions, i.e., luminance, chrominance, and orientation, are mixed to produce a saliency map. Walther and Koch extended Itti’s model to infer proto-object regions from individual contrast maps at different spatial scales. These models obtained good results in applications from computer vision to robotics [19, 32].
In the last decade, many purely computational methods have emerged that model saliency with less biological motivation. Ma and Zhang proposed a fuzzy growing method to extract salient objects based on local contrast analysis. Achanta et al. estimated the center-surround contrast by using a frequency-tuned technique. In order to solve the object scale problem, Achanta et al. extended their work by using a symmetric-surround method to vary the bandwidth of the center-surround filtering near image borders. Hu et al. presented a composite saliency indicator and a dynamic weighting strategy to estimate saliency. Hou and Zhang extracted the saliency map from the spectral residual of the log-spectrum of an image. According to the global rarity principle of saliency, Zhai and Shah and Cheng et al. used histogram-based methods to detect the global contrast of a pixel or region. By filtering color values and position values, Perazzi et al. computed the uniqueness and distribution to detect salient regions. Recently, based on a graph-based manifold ranking method, Yang et al. detected the saliency of image elements by ranking their similarity to background and foreground queries. Li et al. performed saliency detection by integrating the dense and sparse reconstruction errors of image regions. These state-of-the-art methods can extract salient regions effectively.
Other methods model saliency based on both biological and computational models. Harel et al. used Itti’s model to create feature maps, which are integrated into activation maps by using a graph-based approach. Finally, saliency maps are generated by a Markovian algorithm. Bruce and Tsotsos represented the probability distribution of local image patches by using independent component analysis (ICA). They computed the self-information of image regions to implement a neurally plausible circuit that closely corresponds to visual saliency. Wang et al. used learned sparse codes to extract sub-band feature maps, which are represented by a random walk-based graph model to simulate the information transmission among neurons.
Besides Bruce’s work, some other saliency detectors also established models based on information theory. Itti and Baldi presented a Bayesian definition of surprise to describe saliency. Gao and Vasconcelos proposed a discriminant saliency detection model by maximizing the mutual information of the center and surrounding regions in an image. Klein and Frintrop used integral histograms to estimate the distributions of the center and surrounding regions and expressed the saliency by the Kullback-Leibler divergence (KLD) of these distributions.
Statistical theory has also been applied to saliency detection. Zhang et al. computed saliency based on the self-information of local image features using natural image statistics. Also using natural image statistics, Vigo et al. detected salient edges based on ICA.
We implement the computation of saliency based on the statistics of local image regions. Our work is most closely related to the within-classes scatter matrix and between-classes scatter matrix in Fisher’s linear discriminant analysis (LDA) which is commonly used for dimensionality reduction before later classification . We use these two scatter matrices to measure the variability within/between the center and surrounding regions which are defined in Section 3.1. Furthermore, we compute the visual saliency based on principles 1 and 2.
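For reference, the textbook two-class form of the LDA scatter matrices can be written as below. This is the standard definition, given here only as background. The symbols C_c and C_s (the pixel feature vectors of the center and surrounding regions), N_i, and μ_i are notation assumed for this note, and the variant used by the proposed method additionally weights the center-region term of the within-classes matrix by ω_c, so its exact form may differ.

```latex
\begin{align}
\mu_i &= \frac{1}{N_i}\sum_{x \in C_i} x, \qquad
\mu = \frac{1}{N_c + N_s}\sum_{i \in \{c,s\}} \sum_{x \in C_i} x, \\
S_W &= \sum_{i \in \{c,s\}} \sum_{x \in C_i} (x - \mu_i)(x - \mu_i)^{\mathsf{T}}, \\
S_B &= \sum_{i \in \{c,s\}} N_i \, (\mu_i - \mu)(\mu_i - \mu)^{\mathsf{T}}.
\end{align}
```

With these definitions, small eigenvalues of S_W indicate that each region is internally homogeneous, and large eigenvalues of S_B indicate that the two region means lie far apart.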
Some methods detect saliency at a single spatial scale [33, 35], while others combine feature maps at multiple scales to the final saliency map [1, 19, 20]. Without knowing the scale of the object, these methods may not detect the most salient object accurately. The proposed method finds the saliency support region for a local area and computes the saliency of this region with respect to its surroundings to detect multiple salient objects adaptively.
3 Proposed method
In this section, we propose a computational method for saliency detection in images, which is performed in the CIELAB color space. We first define the center region and surrounding region that are used for the center-surround contrast computation. Secondly, a central stimuli sensitivity-based model is proposed to compute the saliency of the center region. Then, the saliency support region of a given pixel is searched to mimic the maximum response of the receptive field in the neurophysiological experiment. Finally, we introduce the visual saliency map generation.
3.1 Center region and surrounding region
3.2 Central stimuli sensitivity-based saliency model
According to principles 1 and 2, the appearance distinctness between the object and its surroundings and the appearance similarity of each of them are key for visual saliency detection. In order to make an object prominent, the stimuli from the center region should make the HVS sensitive while the stimuli from the surrounding region should not. Following the band-pass characteristic of the spatial frequency response of the HVS, we measure the sensitivity in a statistics-theoretic way.
where ω_c is an empirically set parameter that controls the contribution of the similarity of R_c to the computed saliency value. By setting ω_c to 1, a flat center region may produce a large saliency value. If ω_c is decreased, an uneven center region with a higher spatial frequency may generate a large saliency value. We will demonstrate in Section 4.5 that the weight ω_c plays a significant role in the saliency computation.
where μ is the overall mean feature vector of the pixels in the center and surrounding regions taken together. For two regions which are distinct from each other, the eigenvalues of SB are large.
A center region which is distinct from its flat surrounding region has a high saliency value.
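The behaviour described in this section can be illustrated with a small Python sketch. It is not the formulation of the proposed method, whose equations are not reproduced in this text. It assumes that the within-classes scatter is ω_c times the scatter of the center region plus the scatter of the surrounding region, that the between-classes scatter has the standard LDA form, and that the saliency score is the ratio of their traces. The function names and the `eps` regularizer are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = List[List[float]]


def mean_vector(pixels: Sequence[Vector]) -> List[float]:
    """Component-wise mean of a list of feature vectors (e.g., CIELAB triples)."""
    n = len(pixels)
    return [sum(p[k] for p in pixels) / n for k in range(len(pixels[0]))]


def scatter(pixels: Sequence[Vector], ref: Vector) -> Matrix:
    """Sum of outer products (x - ref)(x - ref)^T over all pixels."""
    dim = len(ref)
    s = [[0.0] * dim for _ in range(dim)]
    for p in pixels:
        d = [p[k] - ref[k] for k in range(dim)]
        for i in range(dim):
            for j in range(dim):
                s[i][j] += d[i] * d[j]
    return s


def trace(m: Matrix) -> float:
    return sum(m[i][i] for i in range(len(m)))


def region_saliency(center: Sequence[Vector], surround: Sequence[Vector],
                    omega_c: float = 0.1, eps: float = 1e-6) -> float:
    """Trace-ratio contrast between a center and a surrounding region (assumed form)."""
    mu_c, mu_s = mean_vector(center), mean_vector(surround)
    mu = mean_vector(list(center) + list(surround))
    # Assumption: weighted within-classes scatter, omega_c * S_c + S_s.
    within = (omega_c * trace(scatter(center, mu_c))
              + trace(scatter(surround, mu_s)))
    # Standard LDA between-classes scatter: sum_i N_i (mu_i - mu)(mu_i - mu)^T.
    between = (len(center) * trace(scatter([mu_c], mu))
               + len(surround) * trace(scatter([mu_s], mu)))
    # Assumption: score is the ratio of traces; eps avoids division by zero.
    return between / (within + eps)
```

Under these assumptions, a center that differs from a flat surround scores higher than the same center on a noisy surround, and lowering `omega_c` reduces the penalty on an uneven center region.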
3.3 Saliency support region
As shown in Figure 3, many center regions exist for a pixel. Some of them are salient, such as the region within the blue circle, while some others are not, such as the regions within the red and purple circles. As mentioned in Section 2, some of the previous work preset multiple spatial scales or use a single scale to detect saliency, which may fail to find the salient object.
According to the spatial summation curves in neurophysiological experiments, the neural responses reach their peak when the visual stimuli cover the area of the receptive field center. We believe that for a salient object, there exists a support region that gives rise to its saliency and generates a peak response in the neuron. We therefore search for the center region which generates the strongest saliency with respect to its surrounding region, referred to as the saliency support region.
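The search for a saliency support region can be sketched as a loop over candidate radii that keeps the radius with the highest center-surround score. The sketch below works on a single-channel image stored as nested lists and uses several assumptions that are not taken from the method itself: the center is a disk, the surround is a ring reaching to twice the radius, and the score is a scalar trace-ratio contrast with a center weight `omega_c`. All function names are hypothetical.

```python
def disk_and_ring(image, cy, cx, r, ring_scale=2.0):
    """Collect pixel values in a disk of radius r and in the ring around it."""
    h, w = len(image), len(image[0])
    outer = int(round(r * ring_scale))
    center, ring = [], []
    for y in range(max(0, cy - outer), min(h, cy + outer + 1)):
        for x in range(max(0, cx - outer), min(w, cx + outer + 1)):
            d2 = (y - cy) ** 2 + (x - cx) ** 2
            if d2 <= r * r:
                center.append(image[y][x])
            elif d2 <= outer * outer:
                ring.append(image[y][x])
    return center, ring


def contrast_score(center, ring, omega_c=0.1, eps=1e-6):
    """Scalar between/within scatter ratio (assumed form of the saliency score)."""
    n_c, n_s = len(center), len(ring)
    mu_c = sum(center) / n_c
    mu_s = sum(ring) / n_s
    mu = (sum(center) + sum(ring)) / (n_c + n_s)
    within = (omega_c * sum((v - mu_c) ** 2 for v in center)
              + sum((v - mu_s) ** 2 for v in ring))
    between = n_c * (mu_c - mu) ** 2 + n_s * (mu_s - mu) ** 2
    return between / (within + eps)


def saliency_support_region(image, cy, cx, radii, omega_c=0.1):
    """Return (radius, score) of the most salient concentric center region."""
    best_r, best_score = None, float("-inf")
    for r in radii:
        center, ring = disk_and_ring(image, cy, cx, r)
        if not center or not ring:
            continue  # region falls outside the image or is empty
        score = contrast_score(center, ring, omega_c)
        if score > best_score:
            best_r, best_score = r, score
    return best_r, best_score
```

For a uniform bright disk on a uniform dark background, the radius that matches the disk yields a homogeneous center and a homogeneous ring, so it wins over smaller and larger candidates.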
3.4 Visual saliency map
where p is the center pixel of the saliency support region (SSR), and r(SSR) denotes the radius of SSR.
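A minimal sketch of the radius weighting follows. The exact weighting formula is not reproduced in this text, so the sketch assumes a plain product of the support-region score and its radius, followed by min-max normalization for display. Both choices, and the function name, are assumptions made for illustration.

```python
def weighted_saliency_map(ssr_scores, ssr_radii):
    """Weight each pixel's support-region score by its radius, then scale to [0, 1]."""
    raw = [[s * r for s, r in zip(score_row, radius_row)]
           for score_row, radius_row in zip(ssr_scores, ssr_radii)]
    lo = min(min(row) for row in raw)
    hi = max(max(row) for row in raw)
    if hi == lo:
        # A constant map carries no saliency information.
        return [[0.0 for _ in row] for row in raw]
    return [[(v - lo) / (hi - lo) for v in row] for row in raw]
```

With this weighting, two pixels whose support regions have equal scores are ranked by radius, which mirrors principle 3 (larger objects attract attention first).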
4 Experimental results
In this section, we apply the proposed method to three public eye tracking datasets (two color image datasets and one gray image dataset) to evaluate its performance on human fixation prediction. These datasets comprise natural images, containing different objects and scenes, and the corresponding human fixations. The proposed method is compared with the state-of-the-art bottom-up methods based on a well-known validation approach. The qualitative and quantitative assessments of detection results are reported.
4.1 Parameter setting
There is one parameter in the proposed method: the weight ω_c of the within-classes scatter matrix of the center region in (2). We set ω_c = 0.1 because it yields large areas under the ROC curves in the experiments on the three datasets. The relationship between the parameter and the performance of the method is discussed in Section 4.5.
4.2 Experiments on BRUCE color image dataset
In the first experiment, we perform saliency computations on the popular color image dataset introduced by Bruce and Tsotsos , which consists of 120 images in indoor and outdoor scenes, such as human objects, furniture, phones, fruits, cars, buildings, streets, etc. All the image sizes are 681×511 pixels. In the dataset, 20 subjects’ fixations are recorded for each image. To compare the saliency maps with the human fixations objectively, we use the popular validation approach as in . The area under the ROC curve is used to quantitatively evaluate the performance of visual saliency detection.
Table 1 The state-of-the-art methods
Method | Code
Itti’s model (IT) | Matlab code by Harel
Attention information max. (AIM) | Matlab code by author
Spectral residual (SR) | Matlab code by author
Graph-based visual saliency (GB) | Matlab code by author
Site entropy rate (SER) | Executable code by author
Context aware (CA) | Matlab code by author
Salient region detection (AC) | Executable code by author
Maximum symmetric surround (MSS) | Executable code by author
Region-based contrast (RC) | Executable code by author
Saliency filters (SF) | C code by author
Graph-based manifold ranking (MR) | Matlab code by author
Dense and sparse reconstruction (DSR) | Matlab code by author
Table 2 The ROC areas on three eye tracking datasets
The results listed in Table 2 are different from some of the reported results [11, 13, 41]. In the existing comparison methods, the fixation mask is obtained by setting a quantization threshold, i.e., the threshold classifies the locations in a fixation density map into fixations and non-fixations. So, different quantization thresholds lead to different results. To perform a fair comparison, we use the fixation points provided by the dataset as the ground truth for all the compared methods, i.e., only the points are fixations and the rest are non-fixations. The ROC areas of the compared methods are generated using the Matlab code provided by Harel et al. .
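The evaluation protocol described above, in which fixation points are positives and every other pixel is a negative, can be sketched in Python as follows. This is not the Matlab code by Harel et al. used in the experiments. It is a generic rank-based computation of the area under the ROC curve, and the function name and the (row, column) input format are assumptions.

```python
def roc_area(saliency_map, fixations):
    """Area under the ROC curve with fixated pixels as positives.

    saliency_map: 2-D list of saliency values.
    fixations: iterable of (row, col) fixation points; all other pixels are negatives.
    """
    fixated = set(fixations)
    scored = [(value, (y, x) in fixated)
              for y, row in enumerate(saliency_map)
              for x, value in enumerate(row)]
    n_pos = sum(1 for _, is_fix in scored if is_fix)
    n_neg = len(scored) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one fixated and one non-fixated pixel")

    scored.sort(key=lambda item: item[0])
    rank_sum = 0.0
    i = 0
    while i < len(scored):
        j = i
        while j < len(scored) and scored[j][0] == scored[i][0]:
            j += 1
        # Tied values share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum += avg_rank * sum(1 for k in range(i, j) if scored[k][1])
        i = j
    # Mann-Whitney U statistic normalized by the number of positive/negative pairs.
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)
```

Because no threshold is applied to a fixation density map, the result depends only on the saliency map and the recorded fixation points. A value of 0.5 corresponds to chance level.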
4.3 Experiments on MIT-1003 color image dataset
We perform saliency computations on another color image dataset introduced by Judd et al. . The MIT-1003 dataset contains 1,003 natural images of varying dimensions (the maximal dimension of the width and height is 1,024 pixels), along with human fixation data from 15 subjects. The images in this dataset contain different scenes and objects, as well as many semantic objects, such as faces, people, body parts, and text, which are not modeled by bottom-up saliency .
The ROC areas of the compared methods on this dataset are shown in the third column of Table 2. Among the existing methods, the very recent method DSR shows the best performance on this dataset. The proposed method achieves a slightly higher ROC area than DSR and also outperforms the other state-of-the-art methods on this human fixation dataset. We notice that the improvement of our method on this dataset is not as pronounced as on the BRUCE dataset. The main reason is that the MIT-1003 dataset contains many semantic objects, which pose challenges for bottom-up models. The results detected by our method are based on bottom-up contrast, which may diverge from the fixations of the subjects. Using some high-level features may improve the results.
4.4 Experiments on DOVES gray image dataset
In the third experiment, we test the proposed method on a gray image dataset, DOVES, which is introduced by van der Linde et al. . The DOVES dataset contains 101 natural images and the eye tracking data from 29 subjects. All the image sizes are 1,024×768 pixels. Because the first fixations of each eye movement trace of the subjects are forced at the center of the image , these fixations are removed in the experiments.
The results of ROC area of all the compared methods on the DOVES dataset are presented in the fourth column of Table 2. Method MR shows the highest ROC area (0.7375) on this dataset compared to the previous methods. The main reason is that most of the subjects tend to focus their fixations on the image center if there are no very prominent regions. Compared with MR, the proposed method achieves about 2% improvement of ROC area. It shows that the proposed method outperforms the state-of-the-art methods on fixation prediction for gray images.
4.5 Discussion
In our model, the parameter ω_c determines which spatial frequencies are assigned a high saliency value. If ω_c is set to 1, regions with very low frequency will be assigned high saliency values; if it is set to 0, regions with very high frequency will be. According to the HVS principles, very low and very high frequency regions may weaken the response of the HVS. So, an inappropriate value of ω_c will lead to wrong detections, i.e., the ROC area may be small.
The performance improvement of the proposed method in the fixation prediction experiments verifies the effectiveness of the scatter matrix-based saliency computation and the saliency support region exploration. However, since the processing is pixel-wise and the SSR is searched for every processed pixel, the method is computationally expensive. We therefore adopt subsampling to reduce the cost. The average running time to generate a saliency map on the BRUCE dataset is 60.63 s, measured on an Intel 3.20-GHz CPU with 3-GB RAM using a Matlab implementation. In the future, we will study superpixel-based processing to make the algorithm more efficient.
5 Conclusion
In this paper, we propose a novel method to compute visual saliency in a statistical way. According to three principles of human visual attention, we use the within-classes scatter matrix and the between-classes scatter matrix to measure the similarity and distinctness within and between the center region and the surrounding region, respectively. Furthermore, the saliency of the center region is computed by the two scatter matrices. In order to detect the salient objects with different sizes, the saliency support region is explored and the saliency value of the center pixel of the region is obtained. To make the large object more salient, the saliency value is weighted by the radius of the saliency support region. Experimental results are obtained by applying the proposed method to three eye tracking datasets. The results show that the proposed method outperforms the state-of-the-art methods on saliency detection in complex scenes and human fixation prediction.
This work was partially supported by NSFC (No. 61201274), National High Technology Research and Development Program of China (863 Program, No. 2012AA011503), and Fundamental Research Funds for the Central Universities (No. ZYGX2012J025).
1. Itti L, Koch C, Niebur E: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell 1998, 20(11):1254-1259. doi:10.1109/34.730558
2. Kastner S, Ungerleider LG: Mechanisms of visual attention in the human cortex. Ann. Rev. Neurosci 2000, 23: 315-341. doi:10.1146/annurev.neuro.23.1.315
3. Egeth HE, Yantis S: Visual attention: control, representation, and time course. Ann. Rev. Psychol 1997, 48: 269-297. doi:10.1146/annurev.psych.48.1.269
4. Desimone R, Duncan J: Neural mechanisms of selective visual attention. Ann. Rev. Neurosci 1995, 18: 193-222. doi:10.1146/annurev.ne.18.030195.001205
5. Li H, Ngan KN: Saliency model based face segmentation in head-and-shoulder video sequences. J. Vis. Commun. Image Represen 2008, 19(5):320-333. doi:10.1016/j.jvcir.2008.04.001
6. Li H, Ngan KN: Learning to extract focused objects from low DOF images. IEEE Trans. Circuits Syst. Video Technol 2011, 21(11):1571-1580.
7. Liu KC: Prediction error preprocessing for perceptual color image compression. EURASIP J. Image Video Process 2012, 2012: 3. doi:10.1186/1687-5281-2012-3
8. Mahapatra D, Sun Y: Rigid registration of renal perfusion images using a neurobiology-based visual saliency model. EURASIP J. Image Video Process 2010, 2010: 195640.
9. You J, Liu G: A novel attention model and its application in video analysis. Appl. Math. Comput 2007, 185(2):963-975. doi:10.1016/j.amc.2006.07.023
10. Mancas M, Gosselin B, Macq B: Perceptual image representation. EURASIP J. Image Video Process 2007, 2007: 098181. doi:10.1186/1687-5281-2007-098181
11. Bruce N, Tsotsos JK: Saliency based on information maximization. Adv. Neural Inform. Process. Syst 2006, 18: 155-162.
12. Luo W, Li H, Liu G, Ngan KN: Global salient information maximization for saliency detection. Signal Process.: Image Commun 2012, 27(3):238-248. doi:10.1016/j.image.2011.10.004
13. Harel J, Koch C, Perona P: Graph-based visual saliency. Adv. Neural Inform. Process. Syst 2006, 19: 545-552.
14. Li H, Ngan KN: A co-saliency model of image pairs. IEEE Trans. Image Process 2011, 20(12):3365-3375.
15. Meng F, Li H, Liu G, Ngan KN: Object co-segmentation based on shortest path algorithm and saliency model. IEEE Trans. Multimedia 2012, 14(5):1429-1441.
16. Li H, Meng F, Ngan KN: Co-salient object detection from multiple images. IEEE Trans. Multimedia 2013, 15(8):1896-1909.
17. Liu T, Sun J, Zheng NN, Tang X, Shum HY: Learning to detect a salient object. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Minneapolis; 18–23 June 2007:1-8.
18. Li J, Tian Y, Huang T, Gao W: Multi-task rank learning for visual saliency estimation. IEEE Trans. Circuits Syst. Video Technol 2011, 21(5):623-636.
19. Walther D, Koch C: Modeling attention to salient proto-objects. Neural Netw 2006, 19(9):1395-1407. doi:10.1016/j.neunet.2006.10.001
20. Achanta R, Estrada F, Wils P, Süsstrunk S: Salient region detection and segmentation. In Proceedings of International Conference on Computer Vision Systems (ICVS). Santorini; 12–15 May 2008:66-75.
21. Achanta R, Hemami S, Estrada F, Süsstrunk S: Frequency-tuned salient region detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Miami; 20–25 June 2009:1597-1604.
22. Cheng MM, Zhang GX, Mitra NJ, Huang X, Hu SM: Global contrast based salient region detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Colorado Springs; 20–25 June 2011:409-416.
23. Judd T, Ehinger K, Durand F, Torralba A: Learning to predict where humans look. In Proceedings of IEEE International Conference on Computer Vision (ICCV). Kyoto; 27 Sept–4 Oct 2009:2106-2113.
24. Gao D, Vasconcelos N: Decision-theoretic saliency: computational principles, biological plausibility, and implications for neurophysiology and psychophysics. Neural Comput 2009, 21: 239-271. doi:10.1162/neco.2009.11-06-391
25. Sundberg KA, Mitchell JF, Reynolds JH: Spatial attention modulates center-surround interactions in macaque visual area V4. Neuron 2009, 61(6):952-963. doi:10.1016/j.neuron.2009.02.023
26. Kelly DH: Motion and vision. I. Stabilized images of stationary gratings. J. Opt. Soc. Am 1979, 69: 1266-1274. doi:10.1364/JOSA.69.001266
27. Cavanaugh JR, Bair W, Movshon JA: Nature and interaction of signals from the receptive field center and surround in macaque V1 neurons. J. Neurophysiol 2002, 88(5):2530-2546. doi:10.1152/jn.00692.2001
28. Herrmann K, Montaser-Kouhsari L, Carrasco M, Heeger DJ: When size matters: attention affects performance by contrast or response gain. Nat. Neurosci 2010, 13(12):1554-1559. doi:10.1038/nn.2669
29. Xu L, Li H, Zeng L, Wang Z, Liu G: Saliency detection using a central stimuli sensitivity based model. In Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS). Beijing; 19–23 May 2013:945-949.
30. Koch C, Ullman S: Shifts in selective visual attention: towards the underlying neural circuitry. Hum. Neurobiol 1985, 4: 219-227.
31. Treisman AM, Gelade G: A feature-integration theory of attention. Cogn. Psychol 1980, 12: 97-136. doi:10.1016/0010-0285(80)90005-5
32. Frintrop S, Rome E, Christensen HI: Computational visual attention systems and their cognitive foundation: a survey. ACM Trans. Appl. Percept 2010, 7: 6:1-6:39.
33. Ma YF, Zhang HJ: Contrast-based image attention analysis by using fuzzy growing. In Proceedings of ACM International Conference of Multimedia. Berkeley; 2–8 Nov 2003:374-381.
34. Achanta R, Süsstrunk S: Saliency detection using maximum symmetric surround. In Proceedings of IEEE International Conference on Image Processing (ICIP). Hong Kong; 26–29 Sept 2010:2653-2656.
35. Hu Y, Xie X, Ma W, Chia L, Rajan D: Salient region detection using weighted feature maps based on the human visual attention model. In Proceedings of Fifth Pacific Rim Conference on Multimedia. Tokyo; 30 Nov–3 Dec 2004:993-1000.
36. Hou X, Zhang L: Saliency detection: a spectral residual approach. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Minneapolis; 18–23 June 2007:1-8.
37. Zhai Y, Shah M: Visual attention detection in video sequences using spatiotemporal cues. In Proceedings of ACM International Conference of Multimedia. Santa Barbara; 23–27 Oct 2006:815-824.
38. Perazzi F, Krähenbühl P, Pritch Y, Hornung A: Saliency filters: contrast based filtering for salient region detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Providence; 16–21 June 2012:733-740.
39. Yang C, Zhang L, Lu H, Ruan X, Yang MH: Saliency detection via graph-based manifold ranking. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Portland; 23–28 June 2013:3166-3173.
40. Li X, Lu H, Zhang L, Ruan X, Yang MH: Saliency detection via dense and sparse reconstruction. In Proceedings of IEEE International Conference on Computer Vision (ICCV). Sydney; 1–8 Dec 2013:2976-2983.
41. Wang W, Wang Y, Huang Q, Gao W: Measuring visual saliency by site entropy rate. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). San Francisco; 13–18 June 2010:2368-2375.
42. Itti L, Baldi P: Bayesian surprise attracts human attention. Adv. Neural Inform. Process. Syst 2006, 19: 547-554.
43. Gao D, Vasconcelos N: Bottom-up saliency is a discriminant process. In Proceedings of IEEE International Conference on Computer Vision (ICCV). Rio de Janeiro; 14–20 Oct 2007:1-6.
44. Klein DA, Frintrop S: Center-surround divergence of feature statistics for salient object detection. In Proceedings of IEEE International Conference on Computer Vision (ICCV). Barcelona; 6–13 Nov 2011:2214-2219.
45. Zhang L, Tong MH, Marks TK, Shan H, Cottrell GW: SUN: a Bayesian framework for saliency using natural statistics. J. Vis 2008, 8(7):1-20. doi:10.1167/8.7.1
46. Vigo DR, van de Weijer J, Gevers T: Color edge saliency boosting using natural image statistics. In Proceedings of IS&T’s fifth European Conference on Colour in Graphics, Imaging, and Vision (CGIV). Joensuu; 14–17 June 2010:228-234.
47. Belhumeur PN, Hespanha JP, Kriegman DJ: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell 1997, 19(7):711-720. doi:10.1109/34.598228
48. Sceniak MP, Ringach DL, Hawken MJ, Shapley R: Contrast’s effect on spatial summation by macaque V1 neurons. Nat. Neurosci 1999, 2(8):733-739. doi:10.1038/11197
49. Goferman S, Zelnik-Manor L, Tal A: Context-aware saliency detection. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR). San Francisco; 13–18 June 2010:2376-2383.
50. van der Linde I, Rajashekar U, Bovik AC, Cormack LK: DOVES: a database of visual eye movements. Spat. Vis 2009, 22(2):161-177. doi:10.1163/156856809787465636
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.