3D visual discomfort prediction using low complexity disparity algorithms
EURASIP Journal on Image and Video Processing volume 2016, Article number: 23 (2016)
Abstract
Algorithms that predict the degree of visual discomfort experienced when viewing stereoscopic 3D (S3D) images usually first execute some form of disparity calculation. Following that, features are extracted on these disparity maps to build discomfort prediction models. These features may include, for example, the maximum disparity, disparity range, disparity energy, and other measures of the disparity distribution. Hence, the accuracy of prediction largely depends on the accuracy of disparity calculation. Unfortunately, computing disparity maps is expensive and difficult and most leading assessment models are based on features drawn from the outputs of high complexity disparity calculation algorithms that deliver high quality disparity maps. There is no consensus on the type of stereo matching algorithm that should be used for this type of model. Towards filling this gap, we study the relative performances of discomfort prediction models that use disparity algorithms having different levels of complexity. We also propose a set of new discomfort predictive features with good performance even when using low complexity disparity algorithms.
Introduction
The human consumption of stereoscopic 3D (S3D) movies and images has dramatically increased in recent years. 3D content can better allow the user to understand the visual information being presented, thereby enhancing the viewing experience by providing a more immersive, stereoscopic visualization [1]. However, stereo images that have low-quality content or shooting errors can induce unwanted effects such as fatigue, asthenopia, eye strain, headache, and other phenomena conducive to a bad viewing experience [2]. A large number of studies have focused on finding features (e.g., disparity, spatial frequency, stimulus width, object size, motion [3], and crosstalk effects) that can be reliably extracted from 3D images (stereopairs) towards creating automatic 3D discomfort prediction algorithms to predict and potentially reduce feelings of visual discomfort experienced when viewing 3D images [2, 4].
Several possible factors of visual discomfort have been extensively studied, such as the vergence-accommodation conflict [5, 6], excessive disparities and disparity gradients [7], prolonged viewing, the viewing distance [8], and the amount of defocus blur [9]. Prolonged exposure to conflicts between vergence and accommodation is a main determinant of the degree of experienced visual discomfort and fatigue when viewing S3D content [9–11]. Hence, several predictive models have been built to simulate and predict occurrences of this phenomenon. Commonly, the features used in discomfort prediction models were extracted from disparity maps. These features included the disparity location, disparity gradient, disparity range, maximum angular disparity, and disparity distribution [7, 12–16]. Hence, the predictive powers of these discomfort assessment models strongly depend on the accuracy of disparity calculation.
However, there is no consensus regarding the type of disparity calculation algorithm that should be used for 3D visual discomfort prediction. Early on, some developers used stereo matching algorithms that extract only sparsely distributed disparities (e.g., at luminance edges) to achieve low complexity, fast computation [13, 14]. More recent studies have emphasized the use of high complexity dense stereo matching algorithms that deliver high quality disparity maps, such as the matching algorithm [17] used in [7], dynamic programming [15, 18], the Depth Estimation Reference Software [19] used in [12], and combinations of sparse and dense disparity estimation methods [16].
Although high complexity dense disparity calculation algorithms deliver more accurate disparity results, speed of computation is desirable in many settings, e.g., on real-time 3D videos. However, the literature says little about the performance differences of 3D discomfort prediction models that deploy different disparity algorithms, or about the factors, such as complexity, that cause these differences. Furthermore, little attention has been paid to balancing speed against prediction accuracy by making use of low complexity disparity algorithms. Towards filling these gaps, we begin by studying the performance differences of S3D discomfort prediction models using three nominal disparity algorithms having different levels of complexity. We then introduce two new sets of discomfort predictive features, the uncertainty map and natural scene statistics, which have previously found use in 3D image quality assessment models [20–22]. These features efficiently improve the performance of prediction models that use low complexity disparity calculation methods.
Background
The main difference between viewing natural scenes and viewing a stereoscopic display is that vergence and accommodation normally occur in a synergistic manner in natural viewing but they do not when viewing a display. In a 3D scene viewed on a stereoscopic display, accommodation is fixed by the distance of the dichoptic images from the two eyes but vergence is free to adapt to the disparity-defined depth planes that occur when a fused image is achieved. This perceptual conflict is a main cause of visual discomfort. As the binocular disparity signal is the primary cue in evoking vergence [23], extracting accurate disparity signals from stereoscopic image pairs is the first important step towards making good predictions of the degree of visual discomfort experienced when viewing 3D images.
Stereo matching is the most common method to extract disparity signals from image pairs. The disparity signals (in pixels) which are extracted by stereo matching algorithms can be converted to retinal disparities (in angles) given the viewing parameters and the size of the display [24]. Although this conversion is not linear, most studies prefer to use pixel disparities when conducting visual discomfort modeling to simplify algorithm design [7, 13–16, 25]. We will also use pixel disparity-based features.
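The pixel-to-angle conversion mentioned above can be illustrated with a simple symmetric viewing geometry. The following is only a sketch: the function name, the 65 mm interocular distance, and the sign convention (positive pixel disparity meaning uncrossed disparity, behind the screen) are assumptions made for illustration and are not taken from [24].

```python
import math

def pixel_to_angular_disparity(d_pixels, pixel_pitch_m, viewing_distance_m,
                               interocular_m=0.065):
    """Angular disparity in degrees, relative to the screen plane.

    Assumes the fixated point lies on the midline between the two eyes and
    that positive pixel disparity corresponds to uncrossed disparity.
    """
    parallax_m = d_pixels * pixel_pitch_m
    # Vergence angle when fixating the screen plane.
    screen_vergence = 2.0 * math.atan(interocular_m / (2.0 * viewing_distance_m))
    # Vergence angle when fixating the point rendered with this parallax.
    point_vergence = 2.0 * math.atan(
        (interocular_m - parallax_m) / (2.0 * viewing_distance_m))
    return math.degrees(screen_vergence - point_vergence)

# Example: a 46-in. 16:9 display, 1920 pixels wide, viewed from 1.7 m.
width_m = 46 * 0.0254 * 16 / math.hypot(16, 9)
pitch = width_m / 1920
angles = [pixel_to_angular_disparity(d, pitch, 1.7) for d in (-120, 0, 90)]
```

Because of the arctangent, equal steps in pixel disparity do not map to equal steps in angle, which is the nonlinearity referred to above.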
Research on stereo matching algorithm design has been a topic of intense inquiry for decades. Stereo matching algorithms can be classified into sparse and dense stereo matching. Sparse stereo matching methods do not calculate disparity at every pixel and are deployed for their low complexity or if only sparse data is needed. Dense stereo matching methods calculate disparity at every pixel. Most recent discomfort assessment models are built on dense stereo matching algorithms [26].
All dense stereo matching algorithms use some method of measuring the similarity of pixels between the two image views. Typically, a matching function is computed at each pixel for all disparities under consideration. The simplest matching functions assume that there is little or no luminance difference between corresponding left/right pixels, but more robust methods may allow (explicitly or implicitly) for radiometric changes and/or noise. Common pixel-based matching functions include absolute differences, squared differences, or sampling-insensitive absolute differences [27]. Common window-based matching functions include the sum of absolute or squared differences (SAD, SSD), normalized cross-correlation (NCC), and rank and census transforms [28]. Some matching functions can be implemented efficiently using unweighted and weighted median filters [29, 30]. More complicated similarity measures are possible and have included mutual information or approximate segment-wise mutual information as used in the layered stereo approach of Zitnick [31]. Some methods not only try to employ new combined matching functions but also propose secondary disparity refinement to further remove the remaining outliers [32].
In order to gain insights into the influence of the choice of stereo algorithm on the performance of 3D visual discomfort models, we selected three popular and characteristic dense stereo algorithms, ranging from a computationally expensive, high performance model (e.g., as assessed on the Middlebury database [33]) to a very simple, inexpensive model that delivers reasonable performance.
Researchers have deployed a wide variety of stereo matching algorithms to obtain disparity maps for 3D discomfort prediction models [16–19]. The algorithms previously used are characterized by high computational complexity and generally deliver highly accurate disparity maps. Of the three disparity engines we use, the optical flow software (DFLOW) [17] delivers highly competitive predictions of disparity on the Middlebury Stereo Evaluation dataset [33]. This tool has been utilized in a mature 3D visual discomfort assessment framework which achieves good predictive power [7].
The second comparison algorithm is a window-based stereo matching algorithm based on the SSIM [34] index (DSSIM) [20]. The disparity map of a stereo pair is generated by using SSIM as the matching objective, resolving ties by a minimum disparity criterion. This algorithm was used in a popular 3D QA model [20] but has not yet been utilized in 3D visual discomfort assessment models.
The third algorithm (DSAD) was chosen for its very low complexity. It uses a window-based sum-of-absolute difference (SAD) luminance matching functional without a smoothness constraint. This is a very basic stereo matching algorithm that has only been used in early, simple 3D visual discomfort prediction models.
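To make the low complexity end of this range concrete, the following is a minimal sketch of window-based SAD matching without a smoothness constraint. It is not the DSAD implementation used in the experiments: the function name, the 3×3 window, the border clamping, and the convention that left pixel x is matched to right pixel x − d are all assumptions chosen for illustration.

```python
def sad_disparity(left, right, d_min, d_max, half_win=1):
    """Dense disparity map by exhaustive window-based SAD matching.

    left, right: 2D lists of luminance values with identical shape.
    The candidate match for left[y][x] at disparity d is right[y][x - d].
    Coordinates falling outside the image are clamped to the border.
    """
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best_cost, best_d = float("inf"), 0
            for d in range(d_min, d_max + 1):
                cost = 0.0
                for dy in range(-half_win, half_win + 1):
                    for dx in range(-half_win, half_win + 1):
                        yy = min(max(y + dy, 0), h - 1)
                        xl = min(max(x + dx, 0), w - 1)
                        xr = min(max(x + dx - d, 0), w - 1)
                        cost += abs(left[yy][xl] - right[yy][xr])
                # Winner-takes-all: keep the first disparity with lowest cost.
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp

# Toy stereo pair: the left view is the right view shifted by 2 pixels.
right = [[x * x + 3 * y for x in range(12)] for y in range(5)]
left = [[right[y][max(x - 2, 0)] for x in range(12)] for y in range(5)]
disp = sad_disparity(left, right, d_min=-3, d_max=3)
```

Each pixel is decided independently, which keeps the method cheap but also explains why flat or repetitive regions can produce isolated false disparities.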
Comparison of the effect of disparity estimation on visual discomfort prediction
Figure 1 shows four images (“cup,” “human,” “lawn,” and “stone”) from the IEEE-SA stereo image database [35] and the disparity maps extracted by these three algorithms. Figure 2 shows the corresponding depth distribution histograms computed from the depth maps delivered by these three algorithms. The search range of DSSIM and DSAD was fixed at [−120, 90], which spans the minimum and maximum disparities of images in the IEEE-SA database. In the disparity maps, gray levels from dark to white denote disparities from maximum to minimum.
It is apparent that the disparity maps extracted by DFLOW yield the highest quality of depth detail. The disparity maps delivered by DSSIM are of much lower reliability than those of DFLOW. The disparity maps from DSAD are even worse than those of DSSIM. There are many areas with false disparities. Among the three methods DFLOW, DSSIM, and DSAD, there is a decreasing degree of coherence and segmentability of the computed disparity patterns. Often, disparity errors occur on complex textured regions which the lower complexity stereo algorithms handle less well.
Clearly, the DSSIM and DSAD disparity maps would be difficult to apply in 3D visual discomfort prediction frameworks that require depth segmentation. Hence, we instead only study discomfort prediction frameworks based on analysis of the disparity distribution. Four features are extracted based on the study in [7]. The first two features are the mean values of the positive and negative disparities. These are computed separately since it is known that the sign of disparity can affect experienced visual discomfort [13, 36]:
\[ f_1 = \frac{1}{N_{Pos}} \sum_{D(n) > 0} D(n) \tag{1} \]
\[ f_2 = \frac{1}{N_{Neg}} \sum_{D(n) < 0} D(n) \tag{2} \]
In (1) and (2), \(D(n)\) is the nth smallest value in the disparity map, while \(N_{Pos}\) and \(N_{Neg}\) are the numbers of positive and negative values in the disparity map, respectively. If \(N_{Pos}=0\) or \(N_{Neg}=0\), then \(f_1=0\) or \(f_2=0\), respectively.
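The averages of the lower and upper 5 % of the disparities define the third and fourth features:
\[ f_3 = \frac{1}{N_{5\%}} \sum_{n \le 0.05\, N_{total}} D(n) \tag{3} \]
\[ f_4 = \frac{1}{N_{95\%}} \sum_{n \ge 0.95\, N_{total}} D(n) \tag{4} \]
where \(N_{5\%}\) and \(N_{95\%}\) are the numbers of values that are lower than the 5 % point and higher than the 95 % point of the disparity values, respectively, and \(N_{total}\) is the total number of values in the disparity map.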
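The four disparity distribution features described above can be sketched as follows. This is an illustrative reading of the definitions rather than the authors' code: the function name is hypothetical, and the rule used to count the lowest and highest 5 % of values (integer division, at least one value) is an assumption.

```python
def disparity_features(disparity_values):
    """Return (f1, f2, f3, f4) from a flat list of pixel disparities."""
    d = sorted(disparity_values)        # d[n - 1] is the n-th smallest value
    pos = [v for v in d if v > 0]
    neg = [v for v in d if v < 0]
    # Mean positive and mean negative disparity; 0 when the set is empty.
    f1 = sum(pos) / len(pos) if pos else 0.0
    f2 = sum(neg) / len(neg) if neg else 0.0
    # Number of values in each 5 % tail (assumed rounding rule).
    k = max(1, len(d) // 20)
    f3 = sum(d[:k]) / k                 # mean of the lowest 5 %
    f4 = sum(d[-k:]) / k                # mean of the highest 5 %
    return f1, f2, f3, f4

# Toy example with 20 disparities from -10 to 9.
f1, f2, f3, f4 = disparity_features(list(range(-10, 10)))
```

In practice the input would be the flattened disparity map of one stereo pair, and the four values would form part of the feature vector passed to the regressor.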
We extracted these four basic statistical features from disparity maps calculated by the three above-mentioned stereo depth-finding algorithms on the stereo pairs in the IEEE-SA stereo image database [35]. The IEEE-SA stereo image database contains 800 stereo image pairs of high-definition (HD) resolution (1920×1080 pixels). An integrated twin-lens PANASONIC AG-3DA1 3D camcorder was used to capture the 3D content in the database. The subjective discomfort assessment experiment was conducted in a laboratory environment commensurate with standardized recommendations for subjective evaluation of picture quality [37]. A 46-in. polarized stereoscopic monitor of HD resolution was used to display the test stereo images. Each subject viewed the test stereo images from a distance of about 170 cm, or about three times the height of the monitor. Twenty-four valid subjects participated in the subjective test. Each subject was asked to assign a visual discomfort score to each stereo test image on a Likert-like scale: 5 = very comfortable, 4 = comfortable, 3 = mildly comfortable, 2 = uncomfortable, and 1 = extremely uncomfortable. More information can be found in [25].
The images and their corresponding mean opinion scores (MOS) were divided into training and test subsets. A support vector regression (SVR) was trained on the training set and then applied to the test set. To implement the SVR, we used the LibSVM package [38] with the radial basis function kernel, whose parameters were estimated by cross-validation during the training session. One thousand iterations of the train-test process were applied, where the image database was randomly divided into 80 % training and 20 % test at each iteration. The training and testing subsets did not overlap in content. The performance was measured using Spearman’s rank ordered correlation coefficient (SROCC) and (Pearson’s) linear correlation coefficient (LCC) between the predicted scores and the MOS. Values of SROCC and LCC close to 1 indicate good rank and linear correlation (monotonicity and accuracy) with human judgments, respectively. We obtained the mean, median, and standard deviations of LCC and SROCC of the three models against MOS over all 1000 train-test trials, as tabulated in Table 1. Higher means and medians indicate better performance, while a higher standard deviation implies less stable performance.
From the results, we can see that the predictive power of the four-feature discomfort prediction models is dramatically reduced by the use of a low complexity stereo algorithm instead of a high performing, high complexity algorithm.
There is a significant increase in pixels having large estimated disparity errors in the disparity maps extracted by DSSIM and DSAD. By observing the histograms of the disparity distributions in Fig. 2, it may be seen that the disparities produced by DSSIM and DSAD span nearly the entire disparity range. Hence, it is difficult to obtain accurate values of the mean negative and positive disparities, or of the largest and smallest 5 % of disparities. For example, the four feature values (1)–(4) extracted by DFLOW on the image “human” were [1.69, –12.5, –26.9, 2.5], the values computed using DSSIM were [32.6, –33.5, –107.4, 78.8], and those using DSAD were [45.8, –45.0, –111.4, 85.5]. The largest and smallest 5 % of disparities found by DSAD essentially bracket the entire disparity search range.
Table 2 compares the computation times and estimation accuracies of these three disparity calculation methods. The computation times were recorded in units of hours on the IEEE-SA database. Since IEEE-SA does not provide ground truth maps, the estimation accuracies of these three algorithms were tested on the Middlebury stereo database [33]. The average percentage of bad pixels was recorded for each algorithm. From Table 2, it is apparent that the DSAD disparity algorithm is the fastest of the three but achieves the worst estimation accuracy.
Feature extraction from disparity distributions measured on the DSSIM and DSAD maps will likely be seriously affected by the high percentages of estimated errors, thereby adversely affecting discomfort prediction results. This would seem to advocate the use of only high complexity, high performance stereo modules in S3D visual discomfort prediction models. However, another possibility worth exploring, to improve the usability of disparity maps extracted by low complexity algorithms like DSAD or DSSIM, is to develop additional resilient features on them that can ameliorate the effects of disparity estimation errors.
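The two performance measures can be computed as in the sketch below, which uses only the standard library. The scores in the example are made-up values for illustration and are not data from the experiments; in the protocol above, the two lists would hold the SVR predictions and the MOS of one test split.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient (LCC)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank ordered correlation coefficient (SROCC)."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical predicted scores and MOS for five test images.
predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
mos = [1.2, 1.9, 3.5, 3.4, 4.8]
srocc = spearman(predicted, mos)
lcc = pearson(predicted, mos)
```

Repeating this over many random 80 %/20 % splits and collecting the mean, median, and standard deviation of the two coefficients gives summary statistics of the kind reported in the tables.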
Uncertainty map
A promising approach is to understand the distribution of estimated errors, from which useful features may be developed to improve the performance of discomfort prediction models using low-complexity stereo algorithms.
Pixels associated with disparity errors are often dissimilar to the corresponding disparity-shifted pixels in the other view. The authors of [39] defined a disparity uncertainty map to estimate the uncertainty produced by DSSIM and used it as a feature to improve the task of 3D no-reference distortion assessment. The uncertainty is defined as:
where l is the left-view image and r is the disparity-compensated right-view image of a stereo pair, μ and σ are the local weighted mean and weighted standard deviation computed over a local Gaussian window, and C=0.01 is a constant that ensures stability. An 11×11 Gaussian weighting matrix with a space constant of 3.67 pixels was used to compute μ and σ as in [39]. The uncertainty reflects the degree of similarity between the corresponding pixels of a stereo pair. Hence, the uncertainty distribution of a disparity map can be used to represent the distribution of estimated errors. Figure 3 shows the uncertainty distributions of DFLOW, DSSIM, and DSAD maps computed on the image “human.” It may be observed that the histogram computed on the DFLOW uncertainty map corresponds to a very peaked distribution. The histograms of the DSSIM and DSAD uncertainty maps are less peaked since more large estimation errors occur. This is consistently the case for the distributions of DFLOW, DSSIM, and DSAD maps on the other images in the IEEE-SA database. This phenomenon may be understood by observing that the stereo matching algorithms find good matches (with low uncertainty) at most places, while less common occluded or ambiguous flat or textured areas may cause sparse disparity errors (with high uncertainty). A lognormal distribution can be fit to the histogram of the uncertainty map [39]. The probability density function of a lognormal distribution is:
\[ f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \exp\left( -\frac{(\ln x - \mu)^2}{2\sigma^2} \right), \quad x > 0 \tag{6} \]
where μ is the location parameter and σ is the scale parameter. A simple maximum likelihood method can be used to estimate μ and σ for a given histogram of uncertainties [39].
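For a lognormal model, the maximum likelihood estimates have a closed form: μ is the mean and σ the standard deviation of the logarithms of the samples. The sketch below applies this directly to uncertainty values rather than to a binned histogram, and it discards non-positive values before taking logarithms; both choices, and the function name, are assumptions made for illustration.

```python
import math

def fit_lognormal(samples):
    """Closed-form maximum likelihood fit of lognormal (mu, sigma)."""
    logs = [math.log(u) for u in samples if u > 0]  # ln is undefined for u <= 0
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma

# Toy example: samples whose logarithms are -1, 0, and 1.
mu, sigma = fit_lognormal([math.exp(v) for v in (-1.0, 0.0, 1.0)])
```

A sharply peaked uncertainty histogram, as observed for DFLOW, would be expected to yield a different (μ, σ) pair than the flatter DSSIM and DSAD histograms, which is what makes the two parameters usable as features.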
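To summarize, the features used to describe estimated disparity errors are the best-fit lognormal parameters (μ and σ), and the sample skewness and kurtosis of the uncertainty map, which are calculated as (7) and (8):
\[ \mathrm{skewness} = \frac{1}{N} \sum_{i,j} \frac{\left( U_{(i,j)} - \bar U \right)^3}{\sigma_U^3} \tag{7} \]
\[ \mathrm{kurtosis} = \frac{1}{N} \sum_{i,j} \frac{\left( U_{(i,j)} - \bar U \right)^4}{\sigma_U^4} \tag{8} \]
where \(U_{(i,j)}\) is the uncertainty value at coordinate (i,j), \(\bar U\) is the mean, \(\sigma_U\) is the standard deviation, and N is the number of pixels.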
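The sample skewness and kurtosis of an uncertainty map can be computed as in the following sketch. It assumes population (1/N) normalization of the moments and does not subtract 3 from the kurtosis; whether the original features use these exact conventions is an assumption.

```python
def skewness_kurtosis(u_map):
    """Sample skewness and kurtosis of a 2D map given as a list of rows."""
    values = [u for row in u_map for u in row]
    n = len(values)
    mean = sum(values) / n
    var = sum((u - mean) ** 2 for u in values) / n
    std = var ** 0.5
    # Third and fourth standardized central moments.
    skew = sum((u - mean) ** 3 for u in values) / (n * std ** 3)
    kurt = sum((u - mean) ** 4 for u in values) / (n * std ** 4)
    return skew, kurt

# Toy map with one high-uncertainty pixel: the distribution has a right tail.
skew, kurt = skewness_kurtosis([[0.0, 0.0], [0.0, 1.0]])
```

A map that is constant everywhere has zero standard deviation, so a practical implementation would need to guard against division by zero in that case.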
3D NSS model
Towards ameliorating the weaknesses introduced by the use of low-complexity stereo models, we take a statistical approach towards characterizing the errors introduced by these algorithms. We accomplish this by subjecting the computed disparity maps to a perceptual transform characterized by a bandpass process followed by a nonlinearity. The resulting data are then amenable to analysis under a simple but powerful natural scene model. Research on natural scene statistics (NSS) has clearly demonstrated that images of natural scenes belong to a small set of the space of all possible signals and that they obey predictable statistical laws [40]. Further, the studies of Hibbard [41] and Liu [42] found that the distribution of disparity follows a Laplacian shape. The authors of [39] processed the depth and disparity maps by local mean removal and divisive normalization and found that the histograms of the processed depth and disparity maps take a zero-mean symmetric Gaussian-like shape. One form of this process is [43]:
\[ \hat D(i,j) = \frac{D(i,j) - \mu(i,j)}{\sigma(i,j) + C} \tag{9} \]
where \(D(i,j)\) is the disparity map, i, j are spatial indices, μ and σ are the local weighted mean and weighted standard deviation computed over a local Gaussian window, and C=0.01 is a constant that ensures stability. An 11×11 Gaussian weighting matrix with a space constant of 3.67 pixels is used to compute μ and σ as in [39].
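Local mean removal followed by divisive normalization can be sketched in plain Python as below. The window size, space constant, and C follow the values quoted in the text; the function names, the clamping of coordinates at the image border, and the direct (unoptimized) convolution are assumptions made to keep the example self-contained.

```python
import math

def gaussian_window(size=11, sigma=3.67):
    """Normalized size x size Gaussian weighting matrix."""
    half = size // 2
    w = [[math.exp(-(i * i + j * j) / (2.0 * sigma * sigma))
          for j in range(-half, half + 1)] for i in range(-half, half + 1)]
    total = sum(map(sum, w))
    return [[v / total for v in row] for row in w]

def normalize_map(d_map, c=0.01, size=11, sigma=3.67):
    """Subtract the local weighted mean and divide by (local std + c)."""
    h, w = len(d_map), len(d_map[0])
    win, half = gaussian_window(size, sigma), size // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            mu, second = 0.0, 0.0
            for a in range(-half, half + 1):
                for b in range(-half, half + 1):
                    # Clamp coordinates so border pixels reuse edge values.
                    v = d_map[min(max(i + a, 0), h - 1)][min(max(j + b, 0), w - 1)]
                    wt = win[a + half][b + half]
                    mu += wt * v
                    second += wt * v * v
            sd = math.sqrt(max(second - mu * mu, 0.0))
            out[i][j] = (d_map[i][j] - mu) / (sd + c)
    return out

# Toy maps: a flat disparity map and one with a single outlier in the center.
flat = normalize_map([[5.0] * 5 for _ in range(5)])
spike_in = [[0.0] * 5 for _ in range(5)]
spike_in[2][2] = 10.0
spike = normalize_map(spike_in)
```

A flat map normalizes to approximately zero everywhere, while an isolated disparity outlier stands out as a large positive coefficient, which is why the histogram of normalized coefficients is sensitive to scattered matching errors.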
We applied the identical process (9) to the DSAD, DSSIM, and DFLOW maps. The processed histograms computed on the images “cup,” “human,” “lawn,” and “stone” are shown in Fig. 4 a–c. All of the histograms computed from DFLOW maps take a zero-mean symmetric Gaussian-like shape as elaborated in [39]. Most of the histograms computed on DSSIM maps also take the same shape, but the modes of a few of them are shifted (e.g., “cup”). Other than for the image “lawn,” the histograms computed on DSAD maps fail to take a symmetric Gaussian-like shape. As in [39], when the Gaussian model fails, a generalized Gaussian distribution (GGD) fit may be attempted:
\[ f_X(x; \mu, \sigma^2, \gamma) = a\, e^{-\left[ b \left| x - \mu \right| \right]^{\gamma}} \]
where μ, \(\sigma^2\), and γ are the mean, variance, and shape parameter of the distribution,
\[ a = \frac{b \gamma}{2 \Gamma(1/\gamma)}, \qquad b = \frac{1}{\sigma} \sqrt{\frac{\Gamma(3/\gamma)}{\Gamma(1/\gamma)}} \]
and Γ(·) is the gamma function:
\[ \Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt, \quad x > 0 \]
The parameters (σ and γ) are estimated here using the method used in [44].
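One widely used way to estimate GGD parameters is moment matching: the ratio (E|x|)²/E[x²] depends only on the shape parameter γ, so γ can be found by searching for the value whose theoretical ratio matches the sample ratio. The sketch below uses this estimator with a simple grid search; it is offered as an illustration and may differ from the method of [44]. The function names and the search grid (0.2 to 10 in steps of 0.001) are assumptions.

```python
import math

def ggd_ratio(gamma):
    """Theoretical (E|x|)^2 / E[x^2] of a zero-mean GGD with shape gamma."""
    return math.gamma(2.0 / gamma) ** 2 / (
        math.gamma(1.0 / gamma) * math.gamma(3.0 / gamma))

def fit_ggd(samples):
    """Moment-matching estimate of (mu, sigma, gamma)."""
    n = len(samples)
    mu = sum(samples) / n
    centered = [x - mu for x in samples]
    sigma2 = sum(x * x for x in centered) / n
    mean_abs = sum(abs(x) for x in centered) / n
    rho = mean_abs ** 2 / sigma2
    # Grid search for the shape whose theoretical ratio is closest to rho.
    grid = (0.2 + 0.001 * k for k in range(9801))
    gamma = min(grid, key=lambda g: abs(ggd_ratio(g) - rho))
    return mu, math.sqrt(sigma2), gamma

# Deterministic Laplacian-like sample built from the inverse CDF on a grid.
n = 2000
samples = []
for k in range(n):
    u = (k + 0.5) / n
    v = 2.0 * abs(u - 0.5)
    samples.append(math.copysign(-math.log(1.0 - v), u - 0.5))
mu, sigma, gamma = fit_ggd(samples)
```

A shape parameter near 2 indicates a Gaussian-like histogram and a value near 1 a Laplacian-like one, so γ summarizes how far a processed disparity map departs from the Gaussian case.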
The authors of [39] use the GGD parameters (μ and σ), along with the sample standard deviation, skewness, and kurtosis of these coefficients as 3D features to estimate the quality of 3D images. Here, we deploy the same features to model a perceptually processed disparity distribution. Because the histograms of perceptually processed low quality disparity maps extracted by low complexity stereo algorithms such as DSSIM or DSAD are not fit well by the GGD, the average GGD fitting error is also extracted as a useful feature:
where N is the number of bins in the histogram, H(x) is the number of pixels in bin x, and \(g_x(x)\) is the value of the GGD fit at x.
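A fitting error of this kind can be sketched as the average absolute difference between a normalized histogram and the fitted GGD evaluated at the bin centers. The exact error measure (absolute rather than squared difference), the normalization of the histogram to a density, the number of bins, and all names below are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def ggd_pdf(x, mu, sigma, gamma):
    """Generalized Gaussian density with mean mu, std sigma, shape gamma."""
    b = math.sqrt(math.gamma(3.0 / gamma) / math.gamma(1.0 / gamma)) / sigma
    a = b * gamma / (2.0 * math.gamma(1.0 / gamma))
    return a * math.exp(-((b * abs(x - mu)) ** gamma))

def fitting_error(samples, mu, sigma, gamma, n_bins=50):
    """Mean absolute difference between histogram density and GGD fit."""
    lo, hi = min(samples), max(samples)
    if hi == lo:
        raise ValueError("samples must not all be equal")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        counts[min(int((x - lo) / width), n_bins - 1)] += 1
    err = 0.0
    for k, c in enumerate(counts):
        center = lo + (k + 0.5) * width
        density = c / (len(samples) * width)   # histogram as a density
        err += abs(density - ggd_pdf(center, mu, sigma, gamma))
    return err / n_bins

# Gaussian-like sample: a Gaussian-shaped GGD (gamma = 2) should fit far
# better than a strongly peaked one (gamma = 0.5).
samples = [NormalDist().inv_cdf((k + 0.5) / 2000) for k in range(2000)]
err_good = fitting_error(samples, 0.0, 1.0, 2.0)
err_bad = fitting_error(samples, 0.0, 1.0, 0.5)
```

The larger the error, the less the processed disparity histogram resembles any member of the GGD family, which is the behavior expected from maps with many matching errors.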
Performance evaluation
To summarize our model, we have devised two kinds of features that are designed to improve the prediction performance of 3D visual discomfort models that rely on low complexity disparity calculation algorithms. The first kind are uncertainty map (UM) features, which characterize estimated disparity errors: the best-fit lognormal parameters (μ and σ), skewness, and kurtosis of the uncertainty map. The second kind are 3D NSS features, which serve as a prior constraint on true disparity: the GGD parameters (μ and σ), standard deviation, skewness, and kurtosis of perceptually processed disparity maps, along with the average GGD fitting error.
The testing procedure was the same as that described earlier, but used combinations of these new features. The test was conducted on the IEEE-SA stereo image database [35], SVR was deployed as the regression tool, 1000 iterations of the train-test process were used, and the image database was randomly divided into 80 % training and 20 % test sets. The performance was measured using SROCC and LCC between the predicted scores and the MOS. The operating environment was an Apple MacPro 4.1 computer with an Intel Xeon E5520 CPU at 2.27 GHz and 6 GB of RAM, running Matlab.
Several combinations of the features were selected: UM, NSS, and (UM+NSS), each integrated into the existing prediction framework. The same three disparity calculation algorithms were used.
We obtained the mean, median, and standard deviations of LCC and SROCC of the performance results of these combinations of features against MOS over all 1000 train-test trials, as tabulated in Tables 3, 4, and 5 for DSAD, DSSIM, and DFLOW, respectively. Table 6 shows the performance results of these combinations without considering the features from disparity. The performance results of prior models are also tabulated in Table 5. We tested the models contributed by Park [7], Nojiri [13], Yano [14], Choi [15], and Kim [16].
From Table 6, it may be observed that 3D NSS and the UM are predictive of the degree of visual discomfort induced by 3D images.
Tables 1 and 3 show that both kinds of features improve the performance of the nominal discomfort prediction framework using DSAD. The mean SROCC increases significantly from 0.5873 to 0.6793 using UM, and to 0.6678 using NSS. The combination of these features achieves the best results, with a mean SROCC of 0.7100 and LCC of 0.7314. These results are better than those of Nojiri [13], Yano [14], Choi [15], and Kim [16], and close to that of Park [7]. The stability of the predictive power is also improved: the standard deviation of SROCC decreases from 0.0493 to 0.0366.
A similar result is attained when using the DSSIM algorithm. The combination of features improves the performance of the prediction framework from SROCC 0.6628 to 0.7307 which is better than the result attained on DSAD. The stability is improved too.
The new features also improve the performance of the prediction framework based on the high complexity algorithm DFLOW, as shown in Table 5. Unlike the results on DSAD and DSSIM, here NSS contributes the most to the performance improvement. This may be because the uncertainty map cannot improve the models much if the disparities are already accurately estimated. The contribution of NSS is stable over the visual discomfort models.
Table 7 shows the results of F-tests conducted to assess the statistical significance of the errors between the MOS scores and the model predictions on the IEEE-SA database. \((UM+NSS)_{DF}\) means the model with features of UM, NSS, and disparity using the DFLOW disparity calculation method. The residual error between the predicted score of a model and the corresponding MOS value in the IEEE-SA database can be used to test the statistical efficacy of the model against other models. The residual errors between the model predictions and the MOS values are:
\[ R_i = Q_i - \mathrm{MOS}_i \]
where \(Q_i\) is the ith objective visual discomfort score and \(\mathrm{MOS}_i\) is the corresponding ith MOS score. The F-test was used to compare one objective model against another objective model at the 99.9 % significance level. A symbol value of “1” in Table 7 indicates that the statistical performance of the model in the row is better than that of the model in the column, while “0” indicates the performance in the row is worse than that in the column, and “–” indicates equivalent performance. The results indicate that both UM and NSS features improve the performances of the models with statistical significance.
Compared to the computation time of DSAD (3.51 h), the average computation time of these two features on the IEEE-SA database was much lower (0.78 h). Hence, UM and NSS can efficiently improve visual discomfort models without much extra computation.
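The comparison of two models by an F-test on their residuals can be sketched as follows. The F statistic is taken here as the ratio of the two residual variances; it would then be compared against the critical value of the F distribution at the chosen level, which is not computed in this standard-library sketch. The function names and the toy scores are hypothetical.

```python
def residual_variance(predicted, mos):
    """Unbiased sample variance of the residuals Q_i - MOS_i."""
    res = [q - m for q, m in zip(predicted, mos)]
    mean = sum(res) / len(res)
    return sum((r - mean) ** 2 for r in res) / (len(res) - 1)

def f_statistic(pred_a, pred_b, mos):
    """Ratio of the residual variance of model A to that of model B."""
    return residual_variance(pred_a, mos) / residual_variance(pred_b, mos)

# Toy example: model B tracks the MOS five times more closely than model A.
mos = [1.0, 2.0, 3.0, 4.0, 5.0]
pred_a = [1.5, 1.5, 3.5, 3.5, 5.5]
pred_b = [1.1, 1.9, 3.1, 3.9, 5.1]
f_ab = f_statistic(pred_a, pred_b, mos)
```

A ratio well above the critical value suggests that model B has significantly smaller residual variance than model A; a ratio close to 1 suggests statistically equivalent performance.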
Conclusions
We studied the performance differences of 3D discomfort prediction models that rely on three disparity calculation algorithms having different complexity levels. The experimental results showed that the predictive power of a nominal prediction model is dramatically reduced when using a low complexity algorithm instead of a high complexity algorithm. The performance of models under the low complexity algorithm is also more unstable. Two kinds of new features were introduced to stabilize low-complexity results: features of a disparity uncertainty map (UM) and features of a 3D NSS model. We find that integrating these features significantly elevates the performance of the nominal discomfort model using low complexity stereo algorithms like DSAD or DSSIM. The new features also improve performance when a high complexity disparity estimator is used.
Abbreviations
GGD: Generalized Gaussian distribution
LCC: Linear correlation coefficient
MOS: Mean opinion score
NSS: Natural scene statistics
QA: Quality assessment
SAD: Sum-of-absolute difference
SROCC: Spearman rank order correlation coefficient
SSIM: Structural similarity
SVR: Support vector regression
S3D: Stereoscopic 3D
References
 1
GR Jones, D Lee, NS Holliman, D Ezra, in Photonics West 2001-Electronic Imaging. Controlling perceived depth in stereoscopic images (International Society for Optics and Photonics, San Jose, 2001), pp. 42–53.
 2
M Lambooij, M Fortuin, I Heynderickx, W IJsselsteijn, Visual discomfort and visual fatigue of stereoscopic displays: a review. J. Imaging Sci. Technol.53(3), 30201–1 (2009).
 3
J Li, M Barkowsky, P Le Callet, Visual discomfort of stereoscopic 3d videos: Influence of 3d motion. Displays. 35(1), 49–57 (2014).
 4
M Lambooij, WA IJsselsteijn, I Heynderickx, Visual discomfort of 3d tv: Assessment methods and modeling. Displays. 32(4), 209–218 (2011).
 5
M Urvoy, M Barkowsky, P Le Callet, How visual fatigue and discomfort impact 3dtv quality of experience: a comprehensive review of technological, psychophysical, and psychological factors. Ann. Telecommun.-Ann. Télécommun.68(11–12), 641–655 (2013).
 6
R Patterson, Review paper: Human factors of stereo displays: An update. J. Soc. Inf. Disp.17(12), 987–996 (2009).
 7
J Park, S Lee, AC Bovik, 3d visual discomfort prediction: vergence, foveation, and the physiological optics of accommodation. IEEE J. Sel. Top. Sign. Process.8(3), 415–427 (2014).
 8
R Patterson, Human factors of 3d displays. J. Soc. Inf. Disp.15(11), 861–871 (2007).
 9
FL Kooi, A Toet, Visual comfort of binocular and 3d displays. Displays. 25(2), 99–108 (2004).
 10
M Emoto, T Niida, F Okano, Repeated vergence adaptation causes the decline of visual functions in watching stereoscopic television. Disp. Technol. J.1(2), 328–340 (2005).
 11
T Shibata, J Kim, DM Hoffman, MS Banks, The zone of comfort: Predicting visual discomfort with stereo displays. J. Vis.11(8), 11–11 (2011).
 12
H Sohn, YJ Jung, Si Lee, YM Ro, Predicting visual discomfort using object size and disparity information in stereoscopic images. IEEE Trans. Broadcast.59(1), 28–37 (2013).
 13
Y Nojiri, H Yamanoue, A Hanazato, F Okano, in Electronic Imaging 2003. Measurement of parallax distribution and its application to the analysis of visual comfort for stereoscopic hdtv (International Society for Optics and Photonics, Santa Clara, 2003), pp. 195–205.
 14
S Yano, S Ide, T Mitsuhashi, H Thwaites, A study of visual fatigue and visual comfort for 3d hdtv/hdtv images. Displays. 23(4), 191–201 (2002).
 15
J Choi, D Kim, S Choi, K Sohn, Visual fatigue modeling and analysis for stereoscopic video. Opt. Eng.51(1), 017206–1 (2012).
 16
D Kim, K Sohn, Visual fatigue prediction for stereoscopic image. IEEE Trans. Circ. Syst. Vi. Technol.21(2), 231–236 (2011).
 17
D Sun, S Roth, MJ Black, in Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference On. Secrets of optical flow estimation and their principles (IEEE, San Francisco, 2010), pp. 2432–2439.
 18
D Scharstein, R Szeliski, A taxonomy and evaluation of dense two-frame stereo correspondence algorithms. Int. J. Comput. Vis.47(1–3), 7–42 (2002).
 19
M Tanimoto, T Fujii, K Suzuki, N Fukushima, Y Mori, Depth estimation reference software (DERS) 5.0. ISO/IEC JTC1/SC29/WG11 M16923 (2009).
 20
MJ Chen, CC Su, DK Kwon, LK Cormack, AC Bovik, Full-reference quality assessment of stereopairs accounting for rivalry. Signal Process. Image Commun.28(9), 1143–1155 (2013).
 21
AC Bovik, Automatic prediction of perceptual image and video quality. Proc. IEEE. 101(9), 2008–2024 (2013).
 22
MJ Chen, LK Cormack, AC Bovik, No-reference quality assessment of natural stereopairs. IEEE Trans. Image Process.22(9), 3379–3391 (2013).
 23
AM Horwood, PM Riddell, The use of cues to convergence and accommodation in naïve, uninstructed participants. Vis. Res.48(15), 1613–1624 (2008).
 24
TH Lin, SJ Hu, Perceived depth analysis for view navigation of stereoscopic three-dimensional models. J. Electron. Imaging. 23(4), 043014–043014 (2014).
 25
J Park, H Oh, S Lee, AC Bovik, 3d visual discomfort predictor: Analysis of disparity and neural activity statistics. IEEE Trans. Image Process.24(3), 1101–1114 (2015).
 26
IP Howard, Seeing in Depth, Vol. 1: Basic Mechanisms (University of Toronto Press, Toronto, 2002).
 27
S Birchfield, C Tomasi, A pixel dissimilarity measure that is insensitive to image sampling. IEEE Trans. Pattern. Anal. Mach. Intell. 20(4), 401–406 (1998).
 28
R Zabih, J Woodfill, in Computer Vision–ECCV’94. Non-parametric local transforms for computing visual correspondence (Springer, New York, 1994), pp. 151–158.
 29
K Mühlmann, D Maier, J Hesser, R Männer, Calculating dense disparity maps from color stereo images, an efficient implementation. Int. J. Comput. Vis. 47(1–3), 79–88 (2002).
 30
C Rhemann, A Hosni, M Bleyer, C Rother, M Gelautz, in Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference On. Fast cost-volume filtering for visual correspondence and beyond (IEEE, Colorado Springs, 2011), pp. 3017–3024.
 31
CL Zitnick, SB Kang, M Uyttendaele, S Winder, R Szeliski, in ACM Transactions on Graphics (TOG). High-quality video view interpolation using a layered representation, vol. 23 (ACM, Los Angeles, 2004), pp. 600–608.
 32
J Jiao, R Wang, W Wang, S Dong, Z Wang, W Gao, Local stereo matching with improved matching cost and disparity refinement. IEEE MultiMedia. 21(4), 16–27 (2014).
 33
D Scharstein, R Szeliski, Middlebury stereo evaluation - version 2. vision.middlebury.edu/stereo (2002). http://vision.middlebury.edu/.
 34
Z Wang, AC Bovik, HR Sheikh, EP Simoncelli, Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4), 600–612 (2004).
 35
Standard for the quality assessment of three dimensional (3d) displays, 3D contents and 3D devices based on human factors. IEEE P3333.1 (2012). http://grouper.ieee.org/groups/3dhf.
 36
S Ide, H Yamanoue, M Okui, F Okano, M Bitou, N Terashima, in Electronic Imaging 2002. Parallax distribution for ease of viewing in stereoscopic hdtv (International Society for Optics and Photonics, San Jose, 2002), pp. 38–45.
 37
ITU-R Recommendations, 500.7, methodology for the subjective assessment of the quality of television pictures. Recommendations, ITU-R, Geneva (1995). http://www.itu.int/rec/RRECBT.5007199510S/en.
 38
CC Chang, CJ Lin, Libsvm: A library for support vector machines. ACM Trans. Intell. Syst. Technol. (TIST). 2(3), 27 (2011).
 39
MJ Chen, LK Cormack, AC Bovik, No-reference quality assessment of natural stereopairs. IEEE Trans. Image Process. 22(9), 3379–3391 (2013).
 40
DL Ruderman, The statistics of natural images. Netw. Comput. Neural Syst. 5(4), 517–548 (1994).
 41
PB Hibbard, A statistical model of binocular disparity. Vis. Cogn. 15(2), 149–165 (2007).
 42
Y Liu, AC Bovik, LK Cormack, Disparity statistics in natural scenes. J. Vis. 8(11), 19 (2008).
 43
A Mittal, AK Moorthy, AC Bovik, No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 21(12), 4695–4708 (2012).
 44
K Sharifi, A Leon-Garcia, Estimation of shape parameter for generalized Gaussian distributions in subband decompositions of video. IEEE Trans. Circ. Syst. Video Technol. 5(1), 52–56 (1995).
Acknowledgements
The work for this paper was supported by NSFC under grant 61471234 and by MOST under contracts 2015BAK05B03 and 2013BAH54F04.
Authors’ contributions
JC developed the visual discomfort prediction model and drafted the manuscript. JZ and JS participated in the design of the study and performed the statistical analysis. AB conceived of the study, participated in its design and coordination, and helped to draft the manuscript. All authors read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Keywords
 Visual discomfort
 Low complexity disparity calculation algorithms
 3D NSS
 Uncertainty map