- Research Article
- Open Access
Towards Video Quality Metrics Based on Colour Fractal Geometry
EURASIP Journal on Image and Video Processing volume 2010, Article number: 308035 (2010)
Vision is a complex process that integrates multiple aspects of an image: spatial frequencies, topology and colour. Unfortunately, so far, all these elements have been taken into consideration independently for the development of image and video quality metrics; we therefore propose an approach that blends all of them together. Our approach allows for the analysis of the complexity of colour images in the RGB colour space, based on the probabilistic algorithm for calculating the fractal dimension and lacunarity. Given that all the existing fractal approaches are defined only for gray-scale images, we extend them to the colour domain. We show how these two colour fractal features capture the multiple aspects that characterize the degradation of the video signal, based on the hypothesis that the quality degradation perceived by the user is directly proportional to the modification of the fractal complexity. We claim that the two colour fractal measures can objectively assess the quality of the video signal and can be used as metrics for the user-perceived video quality degradation. We validated them through experimental results obtained for an MPEG-4 video streaming application; finally, the results are compared against the ones given by unanimously-accepted metrics and subjective tests.
1. Video Quality Metrics
There is a plethora of metrics for the assessment of image and video quality [1]. Traditionally, they fall into two classes: (i) full reference or reference based, when both the video sequence at the transmitter and the video sequence at the receiver are available, so that the sequence at the receiver is compared to the original sequence at the transmitter, and (ii) no reference or without reference, when the video sequence at the transmitter is not available and only the video sequence at the receiver is analyzed. Recently a third class of metrics emerged: the so-called "reduced-reference" metrics [2, 3], which are based on the sequence at the receiver and on some features extracted from the original signal at the transmitter. This is the case of the fractal measures we propose.
For the quality assessment of an image or a video sequence, the metrics can also be divided into subjective and objective. During the last decade, several quality measures, both subjective and objective, have been proposed, especially for the assessment of the quality of an image after lossy compression, image rendering on screen or for digital cinema [4]. Most of them use models of the human visual system to express the image perception as a specific pass-band filter (to be more precise, a pass-band filter for the achromatic vision and a low-pass filter for the chromatic one) [5]. In this paper we explore a well-known property of the human visual system, that is, its sensitivity to the visual complexity of the image. We use fractal features, and thus a multiscale approach, to estimate this complexity. In addition, we rely on the hypothesis that fractal geometry is capable of characterizing the image complexity as a whole: the space-frequency complexity and the colour content (thus the complexity of the image reflected in a certain colour space), as well as any of the aspects of the image degradation, like a more spread power spectrum and local discontinuities of the natural correlation of the image.
The most complex metrics are based on models of the human visual system, while others are by now classical signal fidelity metrics, like the signal-to-noise ratio (SNR) and its variant peak SNR (PSNR), the mean-squared error (MSE) and root MSE (RMSE), which are simply distance measures. These simple measures are unable to capture the degradation of the video signal from a user perspective [6]. On the other hand, the subjective video quality measurements are time consuming and must meet complex requirements (see the ITU-T recommendations [7–10]) regarding the conditions of the experiments, such as viewing distance and room lighting. Therefore, the objective metrics are usually preferred, because they can be implemented as algorithms and are free of human error.
The Video Quality Experts Group (VQEG) (http://www.vqeg.org/) is the main organization dealing with the perceptual quality of the video signal, and it has reported on the existing metrics and measurement algorithms [11]. A survey of video-quality metrics based on models of the human vision system can be found in [12], and several no-reference blockiness metrics are studied and compared in [13]. A more recent state of the art of the perceptual criteria for image quality evaluation can be found in [14]. OPTICOM (http://www.opticom.de/) is the author of one metric for video quality evaluation called "Perceptual Evaluation of Video Quality" (PEVQ), which is a reference-based metric used to measure the quality degradation of any video application running in mobile or IP-based networks. The PEVQ Analyzer [15] measures several parameters in order to characterize the degradation: brightness, contrast, PSNR, jerkiness, blur, blockiness, and so forth. Some of the first articles that proposed quality metrics inspired by human perception [16, 17] also drew attention to some of the drawbacks of the MSE and to the importance of subjective tests. Among the unanimously accepted metrics for the quantification of the user-perceived degradation are the ones proposed by Winkler, which use image attributes like sharpness and colourfulness [18–20]. In [21], the authors propose a no-reference quality metric also based on the contrast, but taking into account the human perception, and in [22], the hue feature is exploited. Wang proposes in [23] a metric based on the structural similarity between the original image and the degraded one. The structural similarity (SSIM) unifies in its expression several aspects: the similarity of the local patch luminances, contrast, and structure. This metric was followed by a more complex one, based on wavelets, as an extension of SSIM to the complex wavelet domain, inspired by the pattern recognition capabilities of the human visual system [24].
Together with Wang, Rajashekar is the author of one of the latest image quality metrics, based on an adaptive spatiochromatic signal decomposition [25, 26]. The method constructs a set of spatiochromatic basis functions for the approximation of several distortions due to changes in lighting, imaging, and viewing conditions. Wavelets are also used by Chandler and Hemami to develop a visual signal-to-noise ratio (VSNR) metric [27] based on their recent psychophysical findings [28–30]. Related to the wavelets, a multiresolution model based on the natural scene statistics is used in [31].
Most of the existing metrics for video quality are used to quantify the degradation introduced by the compression algorithm itself, as a consequence of the reduced bit rate. We are interested in objectively assessing the degradation in video quality caused by packet loss at network level [32]. In our experiments, we identified two kinds of degradation: (i) the degradation that affects the sequence, that is, the temporal component of the signal, and (ii) the degradation that affects the frames, that is, the spatial component. Given the way the majority of the video frames are degraded (see Figure 1), the most useful metric would be the blockiness, which objectively quantifies the impairments. To quantify the degradation of a single video frame, one could simply measure the affected area in number of pixels or number of blocks, or use an appropriate perceptual metric, able to quantify the degradation from a human perspective. Apart from blockiness, the degraded frames are "dirty", that is, many blocks contain other information than they should. Therefore, a metric able to quantify the dirtiness would be useful.
The degradation that affects the video frames is in fact a mixture of several impairments, including blockiness and the sudden occurrence of new colours. The modifications of the image content are reflected both in the colour histograms (a larger spread of the histogram due to the presence of new colours) and in the spectral representation of the luminance and chrominance (high frequencies due to blockiness). Given all the above considerations, we believe that metrics like blur, contrast, brightness, and even blockiness lose their meaning and are not able to reflect the degradation; therefore, they cannot be applied to such degraded video frames. Metrics able to capture all the aspects of the degradation that reflect the colour spread, that is, the amount of new colours occurring in the degraded video frames, would be more appropriate. We therefore consider that the approaches based on multiscale analysis and image complexity are better adapted to video-quality assessment. Fractal analysis-based approaches offer the possibility to synthesize into just one measure, adapted to the human visual system, all the features relevant for the quality of an image (e.g., colourfulness and sharpness), instead of analyzing all image characteristics independently and then finding a way to combine the intermediate results. Due to its multiscale nature, the fractal analysis is in accordance with the spirit of all the multiresolution wavelet-based approaches mentioned before, which unfortunately work only for gray-scale images. Therefore, one of the advantages of our approach is that it also takes into account the colour information. In addition, the fractal measures are invariant to any linear transformation like translation and rotation.
Our choice is also justified by the way that humans perceive the fractal complexity. In a study on human perception conducted on fractal pictures [33], the authors conclude that "the hypothesis on the applicability and fulfillment of Weber-Fechner law for the perception of time, complexity and subjective attractiveness was confirmed". Their tests aimed at correlating the human perception of time, complexity, and aesthetic attractiveness with the fractal dimension and the Lyapunov exponent, based on the hypothesis that the perception of fractal objects may reveal insights into the human perceptual process. In [34], the most attractive fractals appeared to be the ones with a fractal dimension between 1.1 and 1.5. According to [35], "the prevalence of fractals in our natural environment has motivated a number of studies to investigate the relationship between a pattern's fractal character and its visual properties", for example, [36, 37]. The authors of  investigate the visual appeal as a function of the fractal dimension, and they establish three intervals: [1.1–1.2] low preference, [1.3–1.5] high preference, and [1.6–1.9] low preference. Pentland finds in his psychophysical studies [38, 39] that for the one-dimensional fractional Brownian motion and the two-dimensional Brodatz textures, the correlation between the fractal dimension and the perceived roughness is more than 0.9.
Last but not least, the very essence of the word "complex", whose Latin etymology means "twisted together" and designates a system composed of closely connected components, emphasizes the presence of multiple components that interact with each other, generating an emergent property [40].
2. Fractal Analysis
The fractal geometry introduced by Mandelbrot in 1983 to describe self-similar sets called fractals [41] is generally used to characterize natural objects that are impossible to describe by using the classical (Euclidean) geometry. The fractal dimension and lacunarity are the two best-known and most widely used fractal analysis tools. The fractal dimension characterizes the complexity of a fractal set by indicating how much space is filled, while the lacunarity is a mass distribution function indicating how the space is occupied [42]. These two fractal properties are successfully used to discriminate between different structures exhibiting a fractal-like appearance [43–45], for classification and segmentation, due to their invariance to scale, rotation, or translation. Fractal geometry has proved to be of great interest for digital image processing and analysis in an extremely wide area of applications, like finance [46], medicine [44, 47, 48], and art [49].
There exist several different mathematical expressions for the fractal dimension, but the box-counting one is the most popular due to its simple algorithmic formulation, compared to the original Hausdorff definition expressed for continuous functions [50]. The box-counting definition of the fractal dimension is $D = -\lim_{\delta \to 0} \log N_\delta / \log \delta$, where $N_\delta$ is the number of boxes of size $\delta$ needed to completely cover the fractal set. The first practical approach belongs to Mandelbrot, but it was followed by the elegant probability measure of Voss [51, 52]. On a parallel research path, Allain and Cloitre  and Plotnick et al.  developed their approach as a version of the basic box-counting algorithm. All the other approaches for the computation of the fractal dimension, like the $\delta$-parallel body method  (a.k.a. covering-blanket approach, Minkowski sausage, or morphological covers) or the fuzzy approach , are more complex from the point of view of implementation and more difficult to extend to a multidimensional colour space. However, we proposed in  a colour extension of the covering-blanket approach based on a probabilistic morphology. On the other hand, despite the large number of algorithmic approaches for the computation of the fractal dimension and lacunarity, only few of them offer the theoretical background that links them to the Hausdorff dimension.
However, such tools were developed a long time ago for small grey-scale images; since then, due to the evolution of the acquisition techniques, the spatial resolution has significantly increased and, in addition, the world of images has become coloured. The very few existing approaches for the computation of fractal measures for colour images are restricted to a marginal colour analysis, or they transform a gray-scale problem into false colour . In the following section, we briefly present our colour extension of the existing probabilistic algorithm by Voss , fully described in , which was validated on synthetic colour fractal images  and used to characterize the colour textures representing psoriatic lesions, in the context of a medical application in dermatology . Then, we show how the colour fractal dimension and lacunarity can be used to characterize the degradation of the video signal for a video streaming application. Without loss of generality, we present the results we obtain in the case of an MPEG-4 video-streaming application.
3. Colour Fractal Dimension and Lacunarity
The existing approaches for the estimation of the fractal dimension, especially the box-counting-like approaches, consider the gray-scale image a set $S$ of points in a Euclidean space of dimension $E$. In the probabilistic algorithm defined by Voss  upon the proposal from Mandelbrot , the spatial arrangement of the set is characterized by the probability matrix $P(m, L)$, the probability of having $m$ points inside a cube of size $L$ (called box), centered in an arbitrary point of the set $S$. In other words, $P(m, L)$ is the probability that the signal "visited" the box of size $L$. The matrix is normalized so that $\sum_{m=1}^{N_p} P(m, L) = 1$ for every $L$, where $N_p$ is the maximum number of pixels that are included in a box of size $L$. Given that the total number of points in the image is $M$, the number of boxes that contain $m$ points is $(M/m)\,P(m, L)$. Thus, the total number of boxes needed to cover the image is
$$N(L) = \sum_{m=1}^{N_p} \frac{M}{m}\, P(m, L) = M \sum_{m=1}^{N_p} \frac{1}{m}\, P(m, L). \quad (1)$$
Consequently, $\sum_{m=1}^{N_p} \frac{1}{m}\, P(m, L)$ is proportional to $L^{-D}$, where $D$ is the fractal dimension to be estimated.
If a gray-scale image is considered to be a discrete surface $z = f(x, y)$, where $z$ is the luminance in every point $(x, y)$ of the space, then a colour image is a hyper-surface in a 3-dimensional colour space, $f(x, y) = (r, g, b)$. Thus, we deal with a 5-dimensional hyper-space where each pixel is a 5-dimensional vector $(x, y, r, g, b)$. We use RGB for the representation of colours due to its cubical organization, even though it is not a uniform Euclidean space. The classical algorithm of Voss uses boxes of variable size $L$ centered in each pixel of the image and counts how many pixels fall inside that box. We generalize this by counting the pixels for which the Minkowski infinity norm distance to the center of the hyper-cube is smaller than a bound set by the box size $L$. Practically, for a certain square of size $L \times L$ in the $(x, y)$ plane, we count the number of pixels that fall inside a 3-dimensional RGB cube of size $L \times L \times L$, centered in the colour $(r_c, g_c, b_c)$ of the current pixel. The theoretical development and validation on synthetic colour fractal images can be found in .
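The counting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the half-width threshold `box_size // 2`, the restriction to windows that fit entirely inside the image, the default box sizes, and all function names are assumptions made here. The image is a list of rows of `(r, g, b)` tuples.

```python
import math
from collections import Counter


def probability_matrix(image, box_size):
    """Estimate P(m, L) for one odd box size L as a dict {m: probability}."""
    height, width = len(image), len(image[0])
    half = box_size // 2  # assumed half-width, in pixels and in colour levels
    counts = Counter()
    centres = 0
    # Assumption: only windows lying entirely inside the image are used.
    for yc in range(half, height - half):
        for xc in range(half, width - half):
            rc, gc, bc = image[yc][xc]
            m = 0
            for y in range(yc - half, yc + half + 1):
                for x in range(xc - half, xc + half + 1):
                    r, g, b = image[y][x]
                    # Infinity-norm test in the RGB cube centred on the
                    # colour of the current pixel.
                    if max(abs(r - rc), abs(g - gc), abs(b - bc)) <= half:
                        m += 1
            counts[m] += 1
            centres += 1
    if centres == 0:
        raise ValueError("image is smaller than the box")
    return {m: c / centres for m, c in counts.items()}


def box_count(prob):
    """N(L) up to the constant factor M: the sum of P(m, L) / m."""
    return sum(p / m for m, p in prob.items())


def colour_fractal_dimension(image, box_sizes=(3, 5, 7, 9)):
    """Negated least-squares slope of log N(L) against log L."""
    xs = [math.log(size) for size in box_sizes]
    ys = [math.log(box_count(probability_matrix(image, size)))
          for size in box_sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope
```

For a frame of constant colour every box is full, so the estimate is 2, the dimension of a flat plane. The pure-Python loops are meant for small test images only.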
Even from the very beginning, when Mandelbrot introduced the fractal geometry, he was aware of the fact that the fractal dimension itself is not sufficient to fully capture the complexity of nondeterministic objects; therefore, he defined the lacunarity as a complementary metric. Later on, Voss expressed it based on the probabilities $P(m, L)$ of finding $m$ points in a box of size $L$, using the first and second order moments of the measure distribution, $M(L) = \sum_{m} m\,P(m, L)$ and $M^2(L) = \sum_{m} m^2\,P(m, L)$:
$$\Lambda(L) = \frac{M^2(L) - \left(M(L)\right)^2}{\left(M(L)\right)^2}. \quad (2)$$
Following the previous considerations, the lacunarity can therefore be defined and computed for colour images as well. See also  for a complete view of the definition and the interpretation of lacunarity for synthetic and natural colour fractal images.
The lacunarity characterizes the topological organisation of a fractal object, an image in our particular case, being a scale-dependent measure of spatial heterogeneity. Images with small lacunarity are more homogeneous with respect to the size distribution and spatial arrangement of gaps. On the other hand, images with larger lacunarity are more heterogeneous. In addition, lacunarity must be taken into consideration after inspecting the fractal dimension: in a similar manner with the Hue-saturation couple in colour image analysis, the lacunarity becomes of greater importance when complexity, that is, the fractal dimension, increases.
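As a small illustration of how a lacunarity value follows from the box-occupancy probabilities, the sketch below computes Voss's ratio (second moment minus squared first moment, divided by the squared first moment) for a single box size. The dictionary representation `{m: P(m, L)}` and the function name are assumptions made for this example.

```python
def lacunarity(prob):
    """Lacunarity for one box size L from a dict {m: P(m, L)}."""
    first = sum(m * p for m, p in prob.items())        # first moment
    second = sum(m * m * p for m, p in prob.items())   # second moment
    return (second - first ** 2) / first ** 2


# Every box holds the same number of points: perfectly homogeneous.
homogeneous = lacunarity({9: 1.0})
# Half of the boxes hold 1 point, half hold 3: heterogeneous occupancy.
heterogeneous = lacunarity({1: 0.5, 3: 0.5})
```

Evaluating the function for a range of box sizes gives a lacunarity curve of the kind discussed in the text; a homogeneous occupancy yields zero, and the value grows with the spread of the occupancy counts.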
4. Approach Argumentation and Validation
In Figure 1, we present two video frames: one from the original video sequence and the corresponding degraded video frame from the sequence at the receiver, along with the pseudoimage representing the absolute difference between the former two. The computed colour fractal dimensions are 3.14, 3.31, and 3.072, respectively. One can see that the larger fractal dimension reflects the increased complexity of the degraded video frame. The increased complexity comes from the blockiness effect, as well as from the dirtiness and the augmented colour content (see also the 3D histograms in Figure 3).
The corresponding lacunarity curves are depicted in Figure 2. One can see that the curve for Figure 1(b) lies well above the curve for Figure 1(a), indicating a more lacunar and heterogeneous image. Surprisingly enough, the difference image in Figure 1(c) has a lacunarity very similar to that of the original image. The difference pseudoimage is more lacunar than the original for small box sizes, indicating that the degradation mainly takes place in blocks of pixels, while for larger box sizes it is less lacunar, that is, more uniform, which is clearly seen and justified by the smaller variations of colours. The complexity revealed by the lacunarity curves is in accordance with the fractal dimension: the original unaffected video frame is a less lacunar image than the degraded one.
Because the lacunarity is a measure of how the space is occupied, we present in Figure 3 the 3D histograms in the RGB colour space, as a visual justification. One can see that the histogram of the degraded video frame is more spread than the one of the original video frame, indicating a richer image from the point of view of its colour content.
For the quantification of the spread of the 3D histograms, we computed the co-occurrence matrices for the three images in Figure 1. This choice is justified by the fact that in the case of a random fractal the fractal dimension is proportional to the variance of the increments . Therefore, we computed the co-occurrence matrices for a neighborhood distance of one pixel, in the horizontal direction. In this way, the computed co-occurrence is a measure of the correlation between pixels. In Figure 4, for the two video frames we show the three overlaid co-occurrence matrices, one for each RGB component. The results indicate that the variance of the values is larger for the degraded video frames, indicating a smaller correlation between the neighbour pixels. The lack of correlation is the natural consequence of the sum of impairments that affect the degraded frame. As shown in , the co-occurrence matrix shape is linked to the fractal dimension of the signal or image. These two points of view, the 3D histograms and the co-occurrence matrices, are a first validity proof and justification for a fractal approach.
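The co-occurrence computation described above (distance of one pixel, horizontal direction, one matrix per colour component) can be sketched as follows. The function names, the list-of-rows representation of a single channel, and the use of the increment variance as a one-number summary of the spread are illustrative assumptions, not details taken from the experiments.

```python
def cooccurrence_horizontal(channel, levels=256):
    """Count pairs (value at x, value at x + 1) along every row."""
    matrix = [[0] * levels for _ in range(levels)]
    for row in channel:
        for left, right in zip(row, row[1:]):
            matrix[left][right] += 1
    return matrix


def increment_variance(channel):
    """Variance of the differences between horizontal neighbours."""
    diffs = [right - left
             for row in channel
             for left, right in zip(row, row[1:])]
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / len(diffs)


# Tiny two-level example: one channel of a 2 x 3 image.
channel = [[0, 0, 1],
           [1, 1, 0]]
matrix = cooccurrence_horizontal(channel, levels=2)
spread = increment_variance(channel)
```

A matrix concentrated on its main diagonal corresponds to strongly correlated neighbours and a small increment variance; mass far from the diagonal corresponds to the weaker correlation reported for the degraded frames.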
For an even further investigation and argumentation, we analyze the video frames from the point of view of their spectral fluctuations. The complexity of a random function or signal can be defined based on its power-density spectrum: for a random fractal signal $v(t)$, the power-density function $S_v(f)$ varies as a power law in $1/f^{\beta}$. So, the Fourier transform $V(f, T)$ computed on $T$ time samples of $v(t)$ allows the spectral density function to be expressed as
$$V(f, T) = \frac{1}{T} \int_0^T v(t)\, e^{2\pi i f t}\, dt, \quad (3)$$
$$S_v(f) \propto T\,\lvert V(f, T) \rvert^2 \quad \text{as } T \to \infty. \quad (4)$$
The link between the power law of $S_v(f)$ and the fractal dimension is defined by the relation from :
$$D = E + 1 - H = E + \frac{3 - \beta}{2}, \quad (5)$$
where $E$ is the dimension of the Euclidean space representing the topological dimension of the signal (e.g., $E = 1$ for a one-dimensional signal and $E = 2$ for an image) and $H$ is the Hurst factor, which indicates the complexity of the fractal object. $H$ is comprised between 0 and 1 and is intimately connected to the fractal dimension. A value of $H$ close to 0 indicates a very complex object, while a value close to 1 indicates a "simpler" object, that is, a smoother signal.
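To make the link between a spectral power law and the fractal dimension concrete, here is a minimal sketch. It assumes the relation takes Voss's standard form, $D = E + 1 - H$ with $\beta = 2H + 1$ for a one-dimensional trace or section; the function names and the log-log least-squares fit are choices made for this example only.

```python
import math


def spectral_exponent(freqs, power):
    """Estimate beta from S(f) ~ 1 / f**beta by a log-log line fit."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(p) for p in power]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope


def fractal_dimension_from_beta(beta, euclidean_dim):
    """Assumed relation: beta = 2H + 1 and D = E + 1 - H."""
    hurst = (beta - 1.0) / 2.0
    return euclidean_dim + 1.0 - hurst


# Synthetic 1/f^2 spectrum, as for ordinary Brownian motion (H = 0.5).
freqs = [1.0, 2.0, 4.0, 8.0, 16.0]
power = [f ** -2 for f in freqs]
beta = spectral_exponent(freqs, power)
dimension = fractal_dimension_from_beta(beta, euclidean_dim=1)
```

A flatter spectrum (smaller beta) gives a smaller H and therefore a larger dimension, which matches the statement that the high frequencies induced by the artifacts increase the measured complexity.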
While it is almost impossible to estimate the impact of the artifacts in the spatial domain without any reference (original video signal), in the frequency domain it is clear that the artifacts induce very high frequencies and a specific modification of the spectrum, which could be close to a complexity induced by a fractal model.
In Figure 5, we show the 2D FFT of the two video frames, for each colour plane, and in Figures 6, 7, and 8 the horizontal and vertical slices of the spectra, corresponding to the spatial frequencies and , respectively. One can clearly note that the marginal analysis (plane by plane) is not able to reflect the entire colour degradation that affects the video signal, whereas the degradation induces a complexity fluctuation that is well captured by the fractal dimension. This is yet another argument that justifies estimating the degradation by means of colour fractal geometry.
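The order of complexity of our approach, for a given image, is proportional to the number of pixels multiplied by the sum of the squared box sizes up to the maximum hypercube size (41 in our case). The sum of the squares of the first $n$ odd natural numbers is $\sum_{k=1}^{n} (2k - 1)^2 = n(2n - 1)(2n + 1)/3$, so this per-pixel cost grows as the cube of the maximum hypercube size.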
In addition, due to the complexity of the colour Fourier transform based on Quaternionic approaches, our approach is the more suitable at this moment for a real-time implementation. For an image of size , the complexity of a parallel implementation of our approach would be , while for a 2D Fast Fourier Transform the best case is of complexity.
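As a worked instance of the sum of squared odd numbers that enters the complexity estimate, and assuming (this is not stated explicitly in the text) that the box sizes run over the odd values up to the maximum size of 41:

```latex
\sum_{k=1}^{n} (2k-1)^2 = \frac{n\,(2n-1)\,(2n+1)}{3},
\qquad
n = 21:\quad 1^2 + 3^2 + \dots + 41^2 = \frac{21 \cdot 41 \cdot 43}{3} = 12341 .
```

Under that assumption, each pixel of the image costs on the order of $10^4$ window visits when all box sizes are evaluated sequentially.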
In Figure 9(a), we depict the block diagram that illustrates the use of the colour fractal dimension and lacunarity as video-quality metrics in a reduced reference scenario. At the source, the two fractal measures are computed for each video frame and sent along with the coded video frames over the network. At destination, the same fractal measures are computed for the received video frames and compared with the references.
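The reduced-reference scheme can be illustrated with a small sketch of the comparison step at the destination. The `FrameFeatures` container, the representation of the lacunarity as one value per box size, and the choice of the largest increase as the lacunarity summary are all hypothetical, introduced here only to show the data flow.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FrameFeatures:
    """Features sent alongside a coded frame (hypothetical container)."""
    fractal_dimension: float
    lacunarity: List[float]  # assumed: one value per box size


def compare_features(reference: FrameFeatures,
                     received: FrameFeatures) -> Tuple[float, float]:
    """Return (change in fractal dimension, largest lacunarity increase)."""
    delta_dimension = received.fractal_dimension - reference.fractal_dimension
    delta_lacunarity = max(
        rec - ref
        for ref, rec in zip(reference.lacunarity, received.lacunarity)
    )
    return delta_dimension, delta_lacunarity


# Values computed at the source travel with the frame; the receiver
# recomputes them on what actually arrived and compares.
at_source = FrameFeatures(3.14, [0.20, 0.10])
at_destination = FrameFeatures(3.31, [0.50, 0.30])
delta_d, delta_l = compare_features(at_source, at_destination)
```

Only a handful of numbers per frame have to be transmitted as side information, which is what distinguishes this scenario from a full-reference comparison.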
5. Experimental Results
From the plethora of IP-based video applications, we chose an MPEG-4 streaming application. Streaming applications usually use RTP (Real-time Transport Protocol) over UDP; therefore, the traffic generated by such an application is inelastic and does not adapt to the network conditions. In addition, neither UDP itself nor the video streaming application implements a retransmission mechanism. Therefore, the video streaming applications are very sensitive to packet loss: any lost packet in the network will cause missing bits of information in the MPEG video stream.
Given that packet loss is the major issue for an MPEG-4 video streaming application, in our experiments the induced packet loss percentage varied from 0% to 1.3%. Above this threshold, the application can no longer function (i.e., the connection established between the client and the server breaks), and tests cannot be performed. The test setup is depicted in Figure 9(b): the MPEG-4 streaming server we used was the Helix streaming server from Real Networks (http://www.realnetworks.com/) and the MPEG-4 client was mpeg4ip (http://mpeg4ip.sourceforge.net/). We modified the source code of the client to record the received video sequence as individual frames in bitmap format. We ran the tests using three widely used video sequences: "football", "female", and "train", MPEG-4 coded. The video sequences were 10 seconds long, with 250 frames, each of size. The average transmission rate was approximately 1 Mb/s, a constraint imposed by the trial version of the MPEG-4 video streaming server; it nevertheless represents a realistic scenario.
The monitoring system we designed and implemented uses two Fast Ethernet network taps to "sniff" the application traffic on the links between two Linux PCs that run the video streaming server and client. The traffic is further recorded as packet descriptors by four programmable Alteon UTP (Unshielded Twisted Pair) NICs (Network Interface Cards), two for each tap, in order to mirror the full-duplex traffic. From each packet, all the information required for the computation of the network quality of service (QoS) parameters is extracted and stored in the local memory as packet descriptors. The host PCs that control the programmable NICs periodically collect this information and store it in descriptor files. These traffic traces are analyzed in order to accurately quantify the quality degradation induced by the network emulator: one-way delay, jitter, and packet loss, as instantaneous or average values, as well as histograms. In parallel, the video signal is recorded for offline processing. Since the two measurements described above are correlated in time, the effects of the measured network degradation on the quality of the video signal can be estimated by the module denoted user-perceived quality (UPQ) meter. More results and details about the experimental setup are to be found in [62–64].
In Figure 10, one may see the three types of degradation that occur in our tests: important or severe degradation (top), less-affected frames (middle), and special or "green" degraded frames (bottom). The difference $\Delta D$ between the colour fractal dimension of the degraded video frame and that of the corresponding original frame will be considerable for the first two images, which exhibit an important degradation: almost the entire image is affected by severe blockiness, and the scene cannot be understood. $\Delta D$ will be small, but still positive, for less affected images (the football players may no longer be identifiable, but the rest of the scene is unchanged). For the "green" images the colour fractal dimension is smaller than that of the corresponding original frames; therefore, $\Delta D$ will be negative.
The corresponding lacunarity curves are depicted in Figure 11: the blue curve for the original video frame, the red curve for the degraded video frame, and the black one for the absolute difference pseudoimage. The largest lacunarity is for the most affected video frames, as expected. From a human perception point of view, the colour lacunarity curves are able to reveal the correct ranking, as is the colour fractal dimension.
In order to analyse the degradation in time, in Figure 12 the evolution of the colour fractal dimension in time is depicted. One can see that the original "football" sequence is characterized by a large variation in the complexity of the image, due to the fact that the scene changes and also due to the high dynamicity. Therefore, the variation of the colour fractal dimension due to degradation is almost insignificant. In addition, due to the lost video frames, the two curves will get more and more desynchronized in time, which makes the analysis more difficult. However, it is possible to create a reference-based metric by using the colour fractal dimension (note the grey zones that indicate a slight increase of the fractal dimension due to quality degradation).
One can note that for the original "football" video sequence the colour lacunarity also varies considerably from frame to frame (see Figure 13), but its values are comprised between 0 and 1.5. For the degraded video sequence (b), we can see that the lacunarity skyrockets up to 3.0 for the interval of video frames affected by important degradation (the first interval marked with grey). The less important degradation (the next greyed intervals) can only be detected if we take as reference the lacunarity of the original video sequence. In order to implement a no-reference metric, a lacunarity above 1.5 can be taken to indicate severe degradation.
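The no-reference rule suggested above (a lacunarity above 1.5 signals severe degradation) can be sketched as a simple scan over per-frame values. The function name, the representation of the sequence as one lacunarity value per frame, and the grouping of flagged frames into intervals are assumptions made for illustration; the threshold of 1.5 is the one quoted for the "football" sequence and may not transfer to other content.

```python
def degraded_intervals(lacunarity_per_frame, threshold=1.5):
    """Return (first, last) frame indices of runs above the threshold."""
    intervals = []
    start = None
    for index, value in enumerate(lacunarity_per_frame):
        if value > threshold:
            if start is None:
                start = index          # a new run begins
        elif start is not None:
            intervals.append((start, index - 1))  # the run just ended
            start = None
    if start is not None:              # run reaching the last frame
        intervals.append((start, len(lacunarity_per_frame) - 1))
    return intervals


# Hypothetical per-frame lacunarity values.
flags = degraded_intervals([0.5, 1.6, 3.0, 1.0, 2.0])
```

Milder degradation that stays below the threshold is not detected by this rule, which is consistent with the remark that it can only be found by comparison with the lacunarity of the original sequence.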
We analyzed two more video sequences: "female" and "train" (Figure 14). The corresponding colour fractal dimensions as a function of time are depicted in Figure 15. The lacunarity curves are presented in Figure 16.
For the "female" and the "train" video sequences, one may note another interesting characteristic of the lacunarity curves, which exhibit a certain periodicity in time (see Figure 17). The explanation is the fact that from time to time the video signal is affected by a not-so-severe blockiness due to the encoding mechanisms only. This is not visible on the "train" video sequence, due to the high-complexity content of the image scene, but it can be easily seen on the "female" video sequence; an example is depicted in Figure 14(b).
In this section, we present a comparison with some of the metrics mentioned in the introduction: SNR, PSNR, MSE, SSIM, and VSNR. For the computation of the SSIM we used the Matlab code (http://www.ece.uwaterloo.ca/~z70wang/research/ssim/) provided by the author of the metric proposed in [23], and for VSNR the Matlab implementation (http://foulard.ece.cornell.edu/dmc27/vsnr/vsnr.html) provided by the authors of [27]. For colour images, the MSE (8), SNR (9), and PSNR (10) metrics are often computed independently for the red, green, and blue (RGB) colour channels and averaged together in order to compute the final distortion. We chose to compute these classical signal fidelity measures in the RGB colour space, despite the very well-known fact that the RGB space is not perceptually uniform, in order to be consistent with the definition of the colour fractal approach, which was developed based on the RGB colour space. We are aware of the fact that metrics like SNR and MSE could perform better in a perceptual colour space (e.g., CIELAB), and in addition we envisage a further development of the colour fractal approach in Lab and HSV. For one colour channel, the three measures are
$$\mathrm{MSE} = \frac{1}{W H} \sum_{x=1}^{W} \sum_{y=1}^{H} \left[ I(x, y) - \hat{I}(x, y) \right]^2, \quad (8)$$
$$\mathrm{SNR} = 10 \log_{10} \frac{\sum_{x=1}^{W} \sum_{y=1}^{H} I(x, y)^2}{\sum_{x=1}^{W} \sum_{y=1}^{H} \left[ I(x, y) - \hat{I}(x, y) \right]^2}, \quad (9)$$
where $I$ is the original image and $\hat{I}$ is the degraded image, both of them of size $W \times H$, and
$$\mathrm{PSNR} = 10 \log_{10} \frac{I_{\max}^2}{\mathrm{MSE}}, \quad (10)$$
where $I_{\max}$ is the maximum intensity level, that is, 255 for an 8-bit image.
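The classical fidelity measures can be computed for RGB frames along the following lines. This is a sketch under stated assumptions: errors and signal energy are pooled over the three channels before the logarithm is taken (which, for equal channel sizes, equals averaging the per-channel MSE), the images are lists of rows of `(r, g, b)` tuples, and the function name is illustrative.

```python
import math


def fidelity_metrics(original, degraded, max_level=255):
    """Return (MSE, SNR in dB, PSNR in dB) pooled over the RGB channels."""
    squared_error = 0.0
    signal_energy = 0.0
    samples = 0
    for row_o, row_d in zip(original, degraded):
        for pixel_o, pixel_d in zip(row_o, row_d):
            for value_o, value_d in zip(pixel_o, pixel_d):
                squared_error += (value_o - value_d) ** 2
                signal_energy += value_o ** 2
                samples += 1
    mse = squared_error / samples
    if mse == 0:
        # Identical images: both ratios are unbounded.
        return 0.0, math.inf, math.inf
    if signal_energy == 0:
        snr = -math.inf  # all-black original with a non-zero error
    else:
        snr = 10 * math.log10(signal_energy / squared_error)
    psnr = 10 * math.log10(max_level ** 2 / mse)
    return mse, snr, psnr


# A 1 x 2 frame with two altered channel values.
original = [[(10, 10, 10), (20, 20, 20)]]
degraded = [[(12, 10, 10), (20, 20, 24)]]
mse, snr, psnr = fidelity_metrics(original, degraded)
```

Because these are plain distance measures, two frames with the same pooled error obtain the same score regardless of where in the frame, or in which colour, the error occurs.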
In Table 1, we show the results we obtain for the images in Figure 10, when we compute the difference $\Delta D$ between the colour fractal dimension of the degraded video frame and the colour fractal dimension of the original video frame, along with the various metrics mentioned above. The values of $\Delta D$ are very well correlated to SNR, PSNR, and MSE, and well correlated to VSNR, but they are not at all correlated to SSIM. However, for the minimum visible degradation (images 10.13 and 10.14, for which $\Delta D$ is small) the SSIM indicates the largest similarity, as does PSNR, and VSNR also has a large value. For the largest visible degradation (images 10.21, 10.22, 10.25, and 10.26) the VSNR captures it well, while SSIM does not reach its minimum values.
We plan to perform a further comparison between the metrics on larger databases of test images. In addition, we have to mention the fact that the SSIM and VSNR were mainly used to assess the quality degradation induced by the image compression algorithms, case in which the image degradation is not as violent as in our experiments. Therefore the right way to compare our method against all the existing approaches is not straightforward and, definitely, not amongst the goals of the current paper.
In addition, in Table 2 we show a comparison of our approach against the SNR, PSNR, MSE, SSIM, and VSNR from the point of view of the required algorithmical complexity. We are assuming an image of size .
The constant for the complexity of SSIM approach is given by the size of the window for computing the local mean and variance——and the circular-symmetric Gaussian weighting function that is, used when computing the map of local SSIM values. The maximum complexity bounds in case of VSNR is clearly given by the complexity of the discrete wavelet transform (DWT) that is, used. It is known that an efficient implementation of DWT is in . The following relationship is evident: ; however, the complexity of a parallel implementation of our approach would be in .
7. Subjective Tests
The original hypothesis was that the perceived quality is directly proportional to the fractal complexity of an image. In order to validate our video quality assessment approach from a subjective point of view, we performed several subjective tests on different video frames from video sequences, in particular sport videos of football matches. The aim of the experiments was to show that the colour fractal complexity of images is in accordance with human perception and that, therefore, tools based on colour fractal analysis are appropriate for the development of video quality metrics.
We ran our experiments on a set of 27 individuals, guided by the general recommendations for subjective quality assessment. In the experiment, we used original and degraded video frames from the standard test "football" video sequence. Pairs of images were presented; thus, the experiments were reference-based. After being shown the minimum and the maximum degradation that may affect the video frames, the individuals were asked to grade the perceived degradation with a score between 0 and 5, according to the levels of degradation presented in Table 3, in accordance with the quality levels specified by the ITU.
For the images in Figure 10, the mean opinion score (MOS) and its standard deviation, computed from the 27 responses, are presented in Table 4, together with the colour fractal dimension (CFD) and its variation.
If we exclude images 10.22 and 10.26, for which the estimated colour fractal dimension variation is negative because of the severe degradation and lack of information, the correlation coefficient between the MOS and the colour fractal dimension variation is 0.8523. Although these results must be extended to a larger image set, the approach creates a new perspective on the perception of colour image complexity. If we take into account the two images, 10.22 and 10.26, the correlation between the mean score and the estimated colour fractal complexity is 0.4857. This result, induced by the negative value of the colour fractal complexity variation, may lead to new developments for colour fractal measures. Clearly, the perceived complexity of those images is lower than that of the others.
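The statistics used in this analysis can be sketched as follows. This is a minimal illustration under stated assumptions: the scores are held in an observers x images array, the standard deviation is the sample estimate (`ddof=1`), and the correlation is the Pearson coefficient; the text does not state which estimator or coefficient was used. The function names are hypothetical.

```python
import numpy as np

def mos_and_std(scores):
    """Mean opinion score and sample standard deviation for each image.

    scores: array of shape (n_observers, n_images) with grades in [0, 5].
    """
    scores = np.asarray(scores, dtype=np.float64)
    return scores.mean(axis=0), scores.std(axis=0, ddof=1)

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

def correlation_excluding_negative(mos, cfd_variation):
    """Correlation computed only over images with a non-negative CFD variation."""
    mos = np.asarray(mos, dtype=np.float64)
    cfd_variation = np.asarray(cfd_variation, dtype=np.float64)
    keep = cfd_variation >= 0
    return pearson(mos[keep], cfd_variation[keep])
```

Comparing `pearson(mos, cfd_variation)` with `correlation_excluding_negative(mos, cfd_variation)` reproduces the kind of comparison made above between the coefficient over all images and the coefficient without the two images that have a negative variation.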
We conclude that the fractal dimension reflects the perceived visual complexity of the degraded images, as long as the degradation is not extreme and the colour fractal dimension variation is not negative. We plan to run more subjective experiments in order to strengthen the statistical significance of the results and to propose a better colour fractal estimator to deal with this minor numerical inconsistency.
8. Conclusions
We conclude that the colour lacunarity itself can be used as a no-reference metric to detect important degradation of the video signal at the receiver. The colour fractal dimension and lacunarity can definitely be used as reference-based metrics, but this is usually impossible in a real environment, where the original signal is not available at the receiver. The colour fractal dimension is not sufficient as a stand-alone metric, but in a reduced-reference scenario the fractal features we propose (the colour fractal dimension and the colour lacunarity) can be used to objectively assess any degradation of the received video signal and, given that they are correlated to human perception, they can be used for the development of quality of experience metrics. An important advantage is the robustness of the fractal measures to modifications of the video signal during the broadcast, such as translation, rotation, mirroring or even cropping (e.g., when the image format is changed).
For the computation of the two metrics we propose a colour extension of the classical probabilistic algorithm designed by Voss. We show that our approach is able to capture the relative complexity of the video frames and the sum of aspects that characterize the degradation of an image; thus, the colour fractal dimension and lacunarity can be used to characterize and objectively assess the degradation of the video signal. To support our approach and conclusions, we also investigated the 3D histograms, the co-occurrence matrices and the power density functions of the original and degraded video frames. In addition, we presented the results of our subjective tests. Given that the fractal features are well correlated to the complexity perceived by the human visual system, they are of great interest as objective metrics in a video quality analysis tool set.
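To make the idea of a probabilistic, Voss-style estimate on colour images more concrete, the following sketch counts, for every interior pixel, how many pixels of the surrounding spatial window have a colour inside an RGB cube centred on that pixel's colour, and derives a dimension and a lacunarity from the resulting statistics. Everything specific here is an assumption made for illustration: the counting rule (maximum per-channel difference), the odd box sizes, the handling of borders (centres whose window does not fit are skipped), the log-log regression, and the function names. It is not the authors' implementation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def box_counts(image, size):
    """For each interior pixel, count the pixels of its size x size window
    whose colour differs by at most size // 2 in every RGB channel."""
    half = size // 2
    img = np.asarray(image, dtype=np.int64)
    # windows has shape (H - size + 1, W - size + 1, 3, size, size).
    windows = sliding_window_view(img, (size, size), axis=(0, 1))
    centres = img[half:img.shape[0] - half, half:img.shape[1] - half]
    distance = np.abs(windows - centres[..., None, None]).max(axis=2)
    return (distance <= half).sum(axis=(-2, -1))

def colour_fractal_features(image, sizes=(3, 5, 7, 9)):
    """Return (dimension estimate, lacunarity for each box size).

    Assumes N(L) = sum_m P(m, L) / m and
    lacunarity(L) = (M2(L) - M(L)^2) / M(L)^2, with M and M2 the first and
    second moments of the count distribution P(m, L).
    """
    log_n, lacunarity = [], []
    for size in sizes:
        m = box_counts(image, size).astype(np.float64).ravel()
        log_n.append(np.log(np.mean(1.0 / m)))
        mean_m, mean_m2 = m.mean(), np.mean(m ** 2)
        lacunarity.append((mean_m2 - mean_m ** 2) / mean_m ** 2)
    # N(L) is assumed proportional to L^(-D), so D is minus the slope.
    slope, _ = np.polyfit(np.log(sizes), log_n, 1)
    return -slope, np.array(lacunarity)
```

For a uniformly coloured frame every window is fully counted, so the estimate equals 2 and the lacunarity is zero, which is a quick sanity check of the sketch. In a reduced-reference setting one would compute the features of the original frame at the transmitter, send them alongside the stream, and compare them with the features computed on the received frame. Because the windows and the colour cube are symmetric, this sketch gives the same values for a mirrored frame.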
Our choice of the RGB colour space suits the probabilistic approach well, and the extension from cubes to hypercubes was natural and intuitive. We are aware that the RGB colour space may not be the best choice when designing an image analysis algorithm from the point of view of the human visual system. Given that a perceptual objective metric is desired, we plan to further develop our colour fractal metrics by using other colour spaces, for example, Lab or HSL, which are capable of better capturing and reflecting the human perception of colours, but at a higher computational cost.
References
Fernandez-Maloigne C: Fundamental study for evaluating image quality. Proceedings of the Annual Meeting of TTLA, December 2008, Taiwan
Yamada T, Miyamoto Y, Serizawa M, Harasaki H: Reduced-reference based video quality metrics using representative-luminance values. Image Communication 2009,24(7):525-547.
Oelbaum T, Diepold K: Building a reduced reference video quality metric with very low overhead using multivariate data analysis. Proceedings of the 4th International Conference on Cybernetics and Information Technologies, Systems and Applications (CITSA '07), 2007
Fernandez-Maloigne C, Larabi MC, Anciaux G: Comparison of subjective assessment protocols for digital cinema applications. Proceedings of the 1st International Workshop on Quality of Multimedia Experience (QoMEX '09), July 2009, San Diego, Calif, USA
Rosselli V, Larabi M-C, Fernandez-Maloigne C: Objective quality measurement based on anisotropic contrast perception. Proceedings of the 4th European Conference on Colour in Graphics, Imaging, and Vision (CGIV '08), June 2008 108-111.
Wang Z, Bovik AC: Mean squared error: love it or leave it? A new look at Signal Fidelity Measures. IEEE Signal Processing Magazine 2009,26(1):98-117.
ITU-R Recommendation BT.500 : Subjective quality assessment methods of television pictures. International Telecommunications Union; 1998.
ITU-T Recommendation P.910 : Subjective video quality assessment methods for multimedia applications. International Telecommunications Union; 1996.
ITU-R Recommendation J.140 : Subjective assessment of picture quality in digital cable television systems. International Telecommunications Union; 1998.
ITU-T Recommendation J.143 : User requirements for objective perceptual video quality measurements in digital cable television. International Telecommunications Union; 2000.
Video Quality Experts Group : The validation of objective models of video quality assessment. Final report, 2004
van den Branden Lambrecht CJ: Survey of image and video quality metrics based on vision models. presentation, August 1997
Winkler S, Sharma A, McNally D: Perceptual video quality and blockiness metrics for multimedia streaming applications. Proceedings of the 4th International Symposium on Wireless Personal Multimedia Communications, September 2001 553-556.
Pappas TN, Safranek RJ, Chen J: Perceptual criteria for image quality evaluation. In Handbook of Image and Video Processing. 2nd edition. Academic Press, San Diego, Calif, USA; 2000:669-686.
OPTICOM GmbH Germany : PEVQ: advanced perceptual evaluation of video quality. white paper, 2005
Teo PC, Heeger DJ: Perceptual image distortion. Proceedings of IEEE International Conference of Image Processing, 1994 982-986.
Karunasekera SA, Kingsbury NG: A distortion measure for blocking artifacts in images based on human visual sensitivity. IEEE Transactions on Image Processing 1995,4(6):713-724. 10.1109/83.388074
Winkler S: Visual fidelity and perceived quality: towards comprehensive metrics. Human Vision and Electronic Imaging, January 2001, Proceedings of SPIE 4299: 114-125.
Winkler S: Issues in vision modeling for perceptual video quality assessment. Signal Processing 1999,78(2):231-252. 10.1016/S0165-1684(99)00062-6
Winkler S: Digital Video Quality: Vision Models and Metrics. John Wiley & Sons, New York, NY, USA; 2005.
Bringier B, Richard N, Larabi MC, Fernandez-Maloigne C: No-reference perceptual quality assessment of colour image. Proceedings of the 14th European Signal Processing Conference (EUSIPCO '06), September 2006, Florence, Italy
Quintard L, Larabi M-C, Fernandez-Maloigne C: No-reference metric based on the color feature: application to quality assessment of displays. Proceedings of the 4th European Conference on Colour in Graphics, Imaging, and Vision (CGIV '08), June 2008 98-103.
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP: Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing 2004,13(4):600-612. 10.1109/TIP.2003.819861
Sampat MP, Wang Z, Gupta S, Bovik AC, Markey MK: Complex wavelet structural similarity: a new image similarity index. IEEE Transactions on Image Processing 2009,18(11):2385-2401.
Rajashekar U, Wang Z, Simoncelli EP: Quantifying color image distortions based on adaptive spatio-chromatic signal decompositions. Proceedings of IEEE International Conference on Image Processing (ICIP '09), November 2009, Cairo, Egypt 2213-2216.
Rajashekar U, Wang Z, Simoncelli EP: Perceptual quality assessment of color images using adaptive signal representation. Human Vision and Electronic Imaging XV, January 2010, San Jose, Calif, USA, Proceedings of SPIE 7527:
Chandler DM, Hemami SS: VSNR: a wavelet-based visual signal-to-noise ratio for natural images. IEEE Transactions on Image Processing 2007,16(9):2284-2298.
Chandler DM, Lim KH, Hemami SS: Effects of spatial correlations and global precedence on the visual fidelity of distorted images. Human Vision and Electronic Imaging XI, January 2006, San Jose, Calif, USA, Proceedings of SPIE 6057:
Chandler DM, Hemami SS: Effects of natural images on the detectability of simple and compound wavelet subband quantization distortions. Journal of the Optical Society of America A 2003,20(7):1164-1180. 10.1364/JOSAA.20.001164
Chandler DM, Hemami SS: Suprathreshold image compression based on contrast allocation and global precedence. Human Vision and Electronic Imaging VIII, January 2003, Santa Clara, Calif, USA, Proceedings of SPIE 5007: 73-86.
Sheikh HR, Bovik AC, Cormack L: No-reference quality assessment using natural scene statistics: JPEG2000. IEEE Transactions on Image Processing 2005,14(11):1918-1927.
Malkowski M, Claßen D: Performance of video telephony services in UMTS using live measurements and network emulation. Wireless Personal Communications 2008,46(1):19-32. 10.1007/s11277-007-9353-5
Mitina OV, Abraham FD: The use of fractals for the study of the psychology of perception: psychophysics and personality factors, a brief report. International Journal of Modern Physics C 2003,14(8):1047-1060. 10.1142/S0129183103005182
Sprott JC: Automatic generation of strange attractors. Computers and Graphics 1993,17(3):325-332. 10.1016/0097-8493(93)90082-K
Taylor RP, Spehar B, Wise JA, Clifford CWG, Newell BR, Hagerhall CM, Purcell T, Martin TP: Perceptual and physiological responses to the visual complexity of fractal patterns. Nonlinear Dynamics, Psychology, and Life Sciences 2005,9(1):89-114.
Knill DC, Field D, Kersten D: Human discrimination of fractal images. Journal of the Optical Society of America A 1990,7(6):1113-1123. 10.1364/JOSAA.7.001113
Cutting JE, Garvin JJ: Fractal curves and complexity. Perception and Psychophysics 1987,42(4):365-370. 10.3758/BF03203093
Pentland AP: Fractal-based description of natural scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence 1984,6(6):661-674.
Pentland AP: On perceiving 3-d shape and texture. Proceedings of the Symposium on Computational Models in Human Vision, 1986, Rochester, NY, USA
Ghosh K, Bhaumik K: Complexity in human perception of brightness: a historical review on the evolution of the philosophy of visual perception. OnLine Journal of Biological Sciences 2010,10(1):17-35. 10.3844/ojbsci.2010.17.35
Mandelbrot BB: The Fractal Geometry of Nature. W.H. Freeman and Co, New York, NY, USA; 1982.
Tolle CR, McJunkin TR, Rohrbaugh DT, LaViolette RA: Lacunarity definition for ramified data sets based on optimal cover. Physica D 2003,179(3-4):129-152. 10.1016/S0167-2789(03)00029-0
Chen W-S, Yuan S-Y, Hsiao H, Hsieh C-M: Algorithms to estimating fractal dimension of textured images. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), May 2001 1541-1544.
Lee W-L, Chen Y-C, Hsieh K-S: Ultrasonic liver tissues classification by fractal feature vector based on M-band wavelet transform. IEEE Transactions on Medical Imaging 2003,22(3):382-392. 10.1109/TMI.2003.809593
Frazer GW, Wulder MA, Niemann KO: Simulation and quantification of the fine-scale spatial pattern and heterogeneity of forest canopy structure: a lacunarity-based method designed for analysis of continuous canopy heights. Forest Ecology and Management 2005,214(1–3):65-90.
Peters EE: Fractal Market Analysis: Applying Chaos Theory to Investment and Economics. John Wiley & Sons, New York, NY, USA; 1994.
Nonnenmacher TF, Losa GA, Weibel ER: Fractals in Biology and Medicine. Birkhäuser, New York, NY, USA; 1994.
Manousaki AG, Manios AG, Tsompanaki EI, Tosca AD: Use of color texture in determining the nature of melanocytic skin lesions—a qualitative and quantitative approach. Computers in Biology and Medicine 2006,36(4):419-427. 10.1016/j.compbiomed.2005.01.004
Taylor RP, Spehar B, Clifford CWG, Newell BR: The visual complexity of Pollock's dripped fractals. Proceedings of the International Conference of Complex Systems, 2002
Falconer K: Fractal Geometry, Mathematical Foundations and Applications. John Wiley & Sons, New York, NY, USA; 1990.
Voss R: Random fractals: characterization and measurement. In Scaling Phenomena in Disordered Systems. Plenum Press, New York, NY, USA; 1985:1-11.
Keller JM, Chen S, Crownover RM: Texture description and segmentation through fractal geometry. Computer Vision, Graphics and Image Processing 1989,45(2):150-166. 10.1016/0734-189X(89)90130-8
Allain C, Cloitre M: Characterizing the lacunarity of random and deterministic fractal sets. Physical Review A 1991,44(6):3552-3558. 10.1103/PhysRevA.44.3552
Plotnick RE, Gardner RH, Hargrove WW, Prestegaard K, Perlmutter M: Lacunarity analysis: a general technique for the analysis of spatial patterns. Physical Review E 1996,53(5):5461-5468. 10.1103/PhysRevE.53.5461
Maragos P, Sun F: Measuring the fractal dimension of signals: morphological covers and iterative optimization. IEEE Transactions on Signal Processing 1993,41(1):108-121. 10.1109/TSP.1993.193131
Pedrycz W, Bargiela A: Fuzzy fractal dimensions and fuzzy modeling. Information Sciences 2003, 153: 199-216.
Ivanovici M, Richard N: Colour covering blanket. Proceedings of the International Conference on Image Processing, Computer Vision and Pattern Recognition, July 2010, Las Vegas, Nev, USA
Ivanovici M, Richard N: Fractal dimension of colour fractal images. IEEE Transactions on Image Processing. In revision
Ivanovici M, Richard N: Colour fractal image generation. Proceedings of the International Conference on Image Processing, Computer Vision and Pattern Recognition, July 2009, Las Vegas, Nev, USA 93-96.
Ivanovici M, Richard N, Decean H: Fractal dimension and lacunarity of psoriatic lesions: a colour approach. Proceedings of the 2nd WSEAS International Conference on Biomedical Electronics and Biomedical Informatics (BEBI '09), August 2009, Moscow, Russia 199-202.
Ivanovici M, Richard N: The lacunarity of colour fractal images. Proceedings of the International Conference on Image Processing (ICIP '09), November 2009, Cairo, Egypt 453-456.
Ivanovici M: Objective performance evaluation for MPEG-4 video streaming applications. Scientific Bulletin of University "POLITEHNICA" Bucharest C 2005,67(3):55-64.
Ivanovici M, Beuran R: User-perceived quality assessment for multimedia applications. Proceedings of the 10th International Conference on Optimization of Electrical and Electronic Equipment (OPTIM '06), May 2006 55-60.
Ivanovici M, Beuran R: Correlating quality of experience and quality of service for network applications. In Quality of Service Architectures for Wireless Networks: Performance Metrics and Management. IGI-Global; 2010:326-351.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Ivanovici, M., Richard, N. & Fernandez-Maloigne, C. Towards Video Quality Metrics Based on Colour Fractal Geometry. J Image Video Proc 2010, 308035 (2010). https://doi.org/10.1155/2010/308035
- Fractal Dimension
- Video Sequence
- Video Frame
- Video Quality
- Human Visual System