Evaluation of noise robustness for local binary pattern descriptors in texture classification


Local binary pattern (LBP) operators have become commonly used texture descriptors in recent years. Several new LBP-based descriptors have been proposed, some of which aim at improving robustness to noise by modifying the thresholding and encoding schemes used in the descriptors. In this article, the robustness to noise of the following eight LBP-based descriptors is evaluated: improved LBP, median binary patterns (MBP), local ternary patterns (LTP), improved LTP (ILTP), local quinary patterns, robust LBP, fuzzy LBP (FLBP), and shift LBP. To put their performance into perspective, they are compared to three well-known reference descriptors: the classic LBP, Gabor filter banks (GF), and standard descriptors derived from gray-level co-occurrence matrices. In addition, a roughly five times faster implementation of the FLBP descriptor is presented, and shift LBP is introduced as a new descriptor that serves as an even faster approximation to the FLBP. The texture descriptors are compared and evaluated on six texture datasets: Brodatz, KTH-TIPS2b, Kylberg, Mondial Marmi, UIUC, and a Virus texture dataset. After optimizing all parameters for each dataset, the descriptors are evaluated under increasing levels of additive Gaussian white noise. The discriminating power of the texture descriptors is assessed using tenfold cross-validation of a nearest neighbor classifier. The results show that several of the descriptors perform well at low levels of noise while they all suffer, to different degrees, from higher levels of introduced noise. In our tests, ILTP and FLBP show an overall good performance on several datasets. The GF are often very noise robust compared to the LBP family under moderate to high levels of noise, but are not necessarily the best descriptor under low levels of added noise. In our tests, MBP is neither a good texture descriptor nor stable to noise.

1 Introduction

The texture of objects in digital images is an important property utilized in many computer vision and image analysis applications such as face recognition, object classification, and segmentation. Despite its frequent use and the many attempts to describe it in general terms, texture lacks a precise definition. This makes the development of new texture descriptors an ill-posed problem [1, 2]. The recent textbook by Pietikäinen et al. [3] provides a good description of texture in stating that “A textured area in an image can be characterized by a non-uniform or varying spatial distribution of intensity or color”.

Local binary patterns (LBPs) emerged in the mid-1990s. At first, they were introduced as a local contrast descriptor [4] and a further development of the texture spectra introduced in [5]. Shortly thereafter, LBP was shown to be an interesting texture descriptor [6]. Many extensions to the classic LBP have since then been proposed. A comprehensive book about the LBP family of texture descriptors was recently published [3]. While some propositions focus on different sampling patterns to effectively capture the characteristics of certain textures, others propose descriptors focusing on improving the robustness to noise by using different encoding or thresholding schemes. The latter group is the focus of this article; considering LBP-based descriptors where the thresholding and encoding schemes are modified to create more noise robust descriptors.

Although several new LBP-based texture descriptors have been published, there is a limited number of comparative studies and evaluations. However, the recent study in [7], and the previous study by the same authors in [8], together cover six datasets from different applications, mainly in the biomedical area. They report results achieved using different sampling patterns and thresholding schemes as well as combinations of LBP-based descriptors with integrated ensembles of support vector machine (SVM) classifiers. The parameter values explored are limited and the focus is on optimizing combinations of LBP-based descriptors that work well for several types of texture datasets. Another recent survey is [9], where a large number of LBP-based descriptors are compared and put into a unifying framework called histograms of equivalent patterns (HEP). These descriptors are evaluated on 11 general texture datasets and are then ranked based on pairwise comparisons of the classification results in pursuit of the overall best descriptor in the HEP framework.

Unlike the previously mentioned surveys, the aim of this article is to evaluate the noise robustness of a number of LBP-based descriptors. The selected descriptors are all designed to be noise robust alternatives to the original LBP by altering the thresholding or encoding scheme. The descriptors are improved LBP (ILBP), median binary patterns (MBP), local ternary patterns (LTP), improved local ternary patterns (ILTP), local quinary patterns (LQP), robust LBP (RLBP), shift LBP (SLBP), and fuzzy/soft LBP (FLBP). The SLBP descriptor is proposed in this article as a fast and simple approximation to FLBP. The discriminating power of the texture descriptors is evaluated by applying them to six different texture datasets followed by a cross-validated classification using a first nearest neighbor classifier (1-NN). Before the noise robustness is assessed, all descriptor parameters are thoroughly optimized, exploring a larger search space than the few combinations of parameter values commonly reported in the literature.

When using LBP, it is quite common to exclude the specificity of the so-called non-uniform patterns and count their occurrences as simply non-uniform [10]. In brief, binary codes with more transitions between ‘0’ and ‘1’ than a specific value (typically two) are called non-uniform. In this way, the number of possible binary codes decreases but at the same time some important information may be lost, see for example [10, 11]. This is why both uniform and non-uniform binary codes are considered in this article.

To put the performance of the LBP-based descriptors into perspective they are compared to the classical LBP, a set of Gabor filters [12] and a set of commonly used descriptors derived from the gray-level co-occurrence matrix (GLCM) introduced by Haralick et al. [1].

2 Material

To evaluate the texture descriptors six publicly available texture image datasets are used. They were chosen to have different characteristics in terms of number of classes, number of samples, class homogeneity with regards to scale, perspective, and illumination. The texture datasets are Brodatz [13], KTH-TIPS2b [14], Kylberg [15], Mondial Marmi [16], UIUC [17], and a Virus texture dataset [11]. Figure 1 shows four samples from four classes in each of the six datasets. The basic properties of the datasets as well as links to websites where they are accessible are listed in Table 1.

Figure 1 Texture examples. For each dataset four texture samples from four classes are shown. For the Virus dataset a dashed circle shows the perimeter of the region wherein the texture descriptors are computed.

Table 1 Properties of the six datasets used; references to the datasets are included

The Brodatz dataset consists of digitized photographs of natural and manmade textures. In the form used here, the dataset has many classes (111) but only a few (9) relatively homogeneous samples per class. The samples are 213 × 213 pixels in size, and there is a considerable overlap between a few of the classes, making them indistinguishable. Some classes also include large structures, so that the nine samples are not equally representative.

The KTH-TIPS2b dataset has 11 classes, some very heterogeneous, with 432 samples each. In each class, four objects have been imaged under varying scale, illumination, and pose conditions. For example, in the class “wool” four different fabrics and knitwear are represented, which makes this class very heterogeneous, and not only because of the varying imaging conditions. Most samples are 200 × 200 pixels in size, but some are smaller due to scale issues. See the documentation in [19] for details. In contrast to [14], where the dataset is used to study recognition of material categories, we use images from all four material samples as examples of the same class when training the classifier.

The Kylberg dataset has 28 classes of 160 samples each with gray-scale images of different natural and manmade textured surfaces. The classes are very homogeneous in terms of perspective, scale, and illumination. The images in the Kylberg dataset are available in different rotations $\Theta \in \{0, \tfrac{1}{6}\pi, \tfrac{2}{6}\pi, \ldots, \tfrac{11}{6}\pi\}$. In this article, one orientation per image is randomly selected. The 576 × 576 pixel images are here divided into four non-overlapping 288 × 288 pixel sub-images, resulting in 640 samples of each class.

The Mondial Marmi dataset is a collection of images of granite surfaces acquired as JPEG color images (with noticeable compression artifacts) under controlled illumination conditions. The dataset was used in [21] to evaluate robustness to rotation for LBP, coordinated clusters representation, and ILBP. While the texture samples are available in nine orientations (both hardware and software rotated), only one orientation (0°) is used here. The 544 × 544 pixel images in the Mondial Marmi dataset are divided into four non-overlapping 272 × 272 pixel sub-images. The samples are converted to gray scale as 0.2989 R + 0.5870 G + 0.1140 B, where R, G, and B are the red, green, and blue intensities, respectively.

The UIUC dataset is based on images of different textured surfaces. The images are provided as JPEG images and appear to have only very minor compression artifacts. Each class contains 40 samples (640 × 480 pixels) of different perspectives and scales of a texture. The classes are more heterogeneous than in the Brodatz, KTH-TIPS2b, Kylberg, and Mondial Marmi datasets, see Figure 1.

The Virus dataset was first used in [11], and is based on transmission electron microscopy images of 15 different virus types. The virus types vary both in size (diameters from 25 to 270 nm) and shape; some are icosahedral while others are elongated. Texture patches are extracted as disk-shaped regions with the same diameter as the viruses, centered in automatically (not always correctly) segmented virus particles; see [11] for more details. The texture samples are then resampled to the same size (41 × 41 pixels) using a Lanczos kernel with a sinc window of a = 2. This disk-shaped region is shown in Figure 1.

3 Methods

In the original description of LBP [6], a window of 3 × 3 pixels is used. The pixels in the window are compared to the value of the center pixel. By coding the outcome of each comparison (≥ or <) as a binary digit, the local binary code is obtained by reading these digits anticlockwise as a sequence, see Figure 2 (left). The histogram of occurring binary codes in a region is the resulting feature vector for that region. Early on, the definition was generalized to consider N sample points evenly distributed on a circle with radius R from the center pixel [25], as illustrated in Figure 2 (right). To make the comparison in this article as fair as possible, the same generalization (using N samples on a radius R) is introduced for the whole LBP family of descriptors. The implementations of all descriptors in the LBP family are based on the original LBP implementation by Heikkilä and Ahonen accessible at [26].

Figure 2 LBP generalization. The eight neighbors in a 3 × 3 neighborhood used in the classic LBP (left). The generalized neighborhood with N samples at radius R (right). The numbers indicate the ordering of samples.

To put the performance of the LBP family of descriptors into perspective, two other well-known texture descriptors are evaluated on the same datasets. The selected reference descriptors are Gabor filter banks (GF) and commonly used descriptors derived from the GLCM, also known as Haralick features. Table 2 lists all the descriptors in the comparison.

Table 2 Evaluated texture descriptors with abbreviations and references

3.1 LBPs

The generalized LBP definition from [25] is used with N sample points evenly distributed on a circle of radius R around a center pixel $p_c$ located at $(x_c, y_c)$. The position $(x_p, y_p)$ of the neighbor point p, where $p \in \{0,\ldots,N-1\}$, is given by

$$(x_p, y_p) = \left(x_c + R\cos(2\pi p/N),\; y_c - R\sin(2\pi p/N)\right). \tag{1}$$

The local binary code for the position $(x_c, y_c)$ is defined as

$$\mathrm{LBP}_{N,R}(x,y) = \sum_{p=0}^{N-1} s(g_p - g_c)\,2^p, \tag{2}$$

$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & \text{otherwise}, \end{cases} \tag{3}$$

where $g_c$ is the gray value of the center pixel and $g_p$ the gray value at neighbor point p. If a point p does not coincide with a pixel center, bilinear interpolation is used to compute the gray value $g_p$. Finally, the histogram of occurring binary codes in a region is the feature vector of this region.
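As an illustration of Equations 1 to 3, the following is a minimal pure-Python sketch of the generalized LBP code and its histogram. It is not the implementation referenced in [26]; the function names, the list-of-rows image layout, and the rounding of sample offsets (used to keep axis-aligned samples exactly on the pixel grid) are assumptions made for this example.

```python
import math

def bilinear(img, x, y):
    """Gray value at real-valued (x, y); img is a list of rows, so img[y][x]."""
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1 = min(x0 + 1, len(img[0]) - 1)
    y1 = min(y0 + 1, len(img) - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
    bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
    return (1 - fy) * top + fy * bottom

def sample_neighbors(img, xc, yc, n=8, r=1.0):
    """Gray values g_0..g_{N-1} at the positions given by Equation 1."""
    values = []
    for p in range(n):
        angle = 2 * math.pi * p / n
        # Assumption: rounding snaps offsets such as cos(pi/2) ~ 6e-17 to 0.
        dx = round(r * math.cos(angle), 12)
        dy = round(r * math.sin(angle), 12)
        values.append(bilinear(img, xc + dx, yc - dy))
    return values

def lbp_code(img, xc, yc, n=8, r=1.0):
    """Equation 2: bit p is set when g_p >= g_c (Equation 3)."""
    gc = img[yc][xc]
    neighbors = sample_neighbors(img, xc, yc, n, r)
    return sum(1 << p for p, gp in enumerate(neighbors) if gp >= gc)

def lbp_histogram(img, n=8, r=1.0):
    """Histogram of codes over all pixels whose neighborhood fits in the image."""
    hist = [0] * (1 << n)
    margin = int(math.ceil(r))
    for yc in range(margin, len(img) - margin):
        for xc in range(margin, len(img[0]) - margin):
            hist[lbp_code(img, xc, yc, n, r)] += 1
    return hist
```

For the 3 × 3 image `[[9, 9, 9], [1, 5, 9], [1, 1, 1]]`, the neighbors p = 0 to 3 (right, upper right, up, upper left) are at least as bright as the center, so `lbp_code(img, 1, 1)` evaluates to 15.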

3.2 ILBPs

ILBP, introduced in [27], is closely related to LBP. The main difference is that the threshold used is the mean value of the whole neighborhood including the center pixel. In addition, $p_c$ is also part of the binary code, making it N+1 bits long. Following [27], ILBP is defined as

$$\mathrm{ILBP}_{N,R}(x,y) = \sum_{p=0}^{N-1} s(g_p - g_{\mathrm{mean}})\,2^p + s(g_c - g_{\mathrm{mean}})\,2^N, \tag{4}$$

$$g_{\mathrm{mean}} = \frac{1}{N+1}\left(\sum_{p=0}^{N-1} g_p + g_c\right), \tag{5}$$

and the function s is defined as in Equation 3.

3.3 MBPs

MBP was introduced in [28]. In analogy to ILBP, the center pixel $p_c$ is included in the neighborhood, but here the median gray value of the neighborhood is used instead, giving the following definition:

$$\mathrm{MBP}_{N,R}(x,y) = \sum_{p=0}^{N-1} s(g_p - g_{\mathrm{med}})\,2^p + s(g_c - g_{\mathrm{med}})\,2^N, \tag{6}$$

$$g_{\mathrm{med}} = \operatorname{median}\{g_0, g_1, \ldots, g_{N-1}, g_c\}, \tag{7}$$

and the function s is defined as in Equation 3.
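The ILBP and MBP definitions differ only in the local threshold. The sketch below shows both for a single neighborhood, assuming the N neighbor gray values have already been sampled into a list `g` and the center gray value is `gc`; these names and the function names are illustrative, not taken from the cited implementations.

```python
import statistics

def s(x):
    """Thresholding function of Equation 3."""
    return 1 if x >= 0 else 0

def ilbp_code(g, gc):
    """ILBP code: threshold at the mean of the N neighbors and the center."""
    n = len(g)
    g_mean = (sum(g) + gc) / (n + 1)
    neighbor_bits = sum(s(gp - g_mean) << p for p, gp in enumerate(g))
    return neighbor_bits + (s(gc - g_mean) << n)

def mbp_code(g, gc):
    """MBP code: threshold at the median of the N neighbors and the center."""
    n = len(g)
    g_med = statistics.median(g + [gc])
    neighbor_bits = sum(s(gp - g_med) << p for p, gp in enumerate(g))
    return neighbor_bits + (s(gc - g_med) << n)
```

With one bright neighbor on a dark background, `g = [10, 0, 0, 0, 0, 0, 0, 0]` and `gc = 1`, the mean threshold (11/9) sets only bit 0, whereas the median threshold (0) is met by every value, so all N+1 bits are set.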

3.4 LTPs

To deal with the noise sensitivity of the LBP descriptor, the magnitude of the intensity difference between the center pixel and neighboring points can be taken into consideration. However, involving the magnitude implies that the complete invariance to intensity scaling is lost. In [29], the LTP descriptor is proposed. Here, the difference between a neighboring value $g_p$ and the center pixel value $g_c$ is encoded as one of three values using one threshold $t_1$:

$$\mathrm{LTP}_{N,R}(x,y) = \sum_{p=0}^{N-1} s_3(g_p, g_c, t_1)\,2^p, \tag{8}$$

$$s_3(g_p, g_c, t_1) = \begin{cases} 1, & g_p \ge g_c + t_1 \\ 0, & g_c - t_1 \le g_p < g_c + t_1 \\ -1, & \text{otherwise}. \end{cases} \tag{9}$$

Instead of using a code with base 3 to encode the three states, LTP uses two binary codes representing the positive and the negative components of the ternary code, i.e., two binary codes coding for the two states $\{-1, 1\}$. These binary codes are collected in two separate histograms and, as a last step, the histograms are concatenated to form the LTP feature vector.
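The split of the ternary code into two binary codes can be sketched as follows. The helper names and the input format (a list of `(g, gc)` pairs, one per pixel position, with `g` holding the N sampled neighbor values) are assumptions for this example, as is the order in which the two histograms are concatenated.

```python
def s3(gp, gc, t1):
    """Three-level thresholding of Equation 9."""
    if gp >= gc + t1:
        return 1
    if gp >= gc - t1:
        return 0
    return -1

def ltp_codes(g, gc, t1):
    """Return (positive code, negative code) for one neighborhood."""
    ternary = [s3(gp, gc, t1) for gp in g]
    positive = sum(1 << p for p, v in enumerate(ternary) if v == 1)
    negative = sum(1 << p for p, v in enumerate(ternary) if v == -1)
    return positive, negative

def ltp_feature(neighborhoods, n, t1):
    """Concatenate the histograms of positive and negative codes."""
    pos_hist = [0] * (1 << n)
    neg_hist = [0] * (1 << n)
    for g, gc in neighborhoods:
        positive, negative = ltp_codes(g, gc, t1)
        pos_hist[positive] += 1
        neg_hist[negative] += 1
    return pos_hist + neg_hist  # assumed order: positive part first
```

For N = 8 the resulting feature vector has 2 × 256 = 512 elements.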

3.5 ILTPs

In analogy with the extension of LBP to ILBP, where the neighborhood mean value is used as the local threshold, LTP can be extended to ILTP. This was done in [30], arriving at the following definition:

$$\mathrm{ILTP}_{N,R}(x,y) = \sum_{p=0}^{N-1} s_3(g_p, g_{\mathrm{mean}}, t_1)\,2^p + s_3(g_c, g_{\mathrm{mean}}, t_1)\,2^N, \tag{10}$$

where the function $s_3$ is defined as in Equation 9 and $g_{\mathrm{mean}}$ as in Equation 5.

3.6 LQP

In [8], LQP is introduced, extending the encoding of the local differences to five values corresponding to two thresholds $t_1$ and $t_2$, resulting in

$$\mathrm{LQP}_{N,R}(x,y) = \sum_{p=0}^{N-1} s_5(g_p, g_c, t_1, t_2)\,2^p, \tag{11}$$

where the two thresholds are used in the $s_5$ function according to

$$s_5(g_p, g_c, t_1, t_2) = \begin{cases} 2, & g_p \ge g_c + t_2 \\ 1, & g_c + t_1 \le g_p < g_c + t_2 \\ 0, & g_c - t_1 \le g_p < g_c + t_1 \\ -1, & g_c - t_2 \le g_p < g_c - t_1 \\ -2, & \text{otherwise}. \end{cases} \tag{12}$$

In analogy to LTP, the quinary code is split into four binary codes, coding for the states $\{-2, -1, 1, 2\}$. Four histograms are computed followed by a concatenation.
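A small worked example of the quinary split, written as a hedged Python sketch. The function names are illustrative, and the comparison is done on the difference $g_p - g_c$, which is equivalent to Equation 12 for integer gray values.

```python
def s5(gp, gc, t1, t2):
    """Five-level thresholding of Equation 12 (assumes 0 <= t1 < t2)."""
    d = gp - gc
    if d >= t2:
        return 2
    if d >= t1:
        return 1
    if d >= -t1:
        return 0
    if d >= -t2:
        return -1
    return -2

def lqp_codes(g, gc, t1, t2):
    """One binary code per non-zero state, keyed by the state value."""
    quinary = [s5(gp, gc, t1, t2) for gp in g]
    return {
        state: sum(1 << p for p, v in enumerate(quinary) if v == state)
        for state in (2, 1, -1, -2)
    }
```

With `g = [9, 7, 5, 3, 1, 5, 5, 5]`, `gc = 5`, `t1 = 1`, and `t2 = 3`, the differences 4, 2, 0, -2, -4 map to the states 2, 1, 0, -1, -2, so each of the four binary codes has exactly one bit set.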

3.7 RLBP

By changing the expression $(g_p - g_c)$ in Equation 2 to $(g_p - g_c - t_1)$, the gray value in point p has to be $t_1$ higher than $g_c$ to produce a 1. This modification is called RLBP and was introduced in [31]. The RLBP descriptor is supposed to improve robustness against small changes in local intensities. Following the description above, RLBP for a position (x,y) and a threshold value $t_1$ is defined as

$$\mathrm{RLBP}_{N,R}(x,y,t_1) = \sum_{p=0}^{N-1} s(g_p - g_c - t_1)\,2^p, \tag{13}$$

where the function s is defined as in Equation 3.

3.8 FLBP

In fuzzy [32]/soft [33] LBP (FLBP), one pixel position may contribute to several bins in the histogram of possible patterns. A membership function for a neighboring point p to a ‘0-class’, $m_0$, and the antonym function $m_1$, expressing belongingness to a ‘1-class’, are defined as

$$m_0(p,f) = \begin{cases} 0, & g_p \ge g_c + f \\ \dfrac{f - g_p + g_c}{2 f}, & g_c - f \le g_p < g_c + f \\ 1, & \text{otherwise}, \end{cases} \tag{14}$$

$$m_1(p,f) = 1 - m_0(p,f), \tag{15}$$

where f governs the interval of fuzzy belongingness. Figure 3 shows a plot of the functions $m_0$ and $m_1$.

Figure 3 Membership functions in FLBP. The two membership functions used in FLBP. The gray value difference $g_p - g_c$ on the x-axis and belongingness on the y-axis.

The contribution from one pixel position (x,y) to a bin i in the histogram H of occurring binary patterns is

$$\mathrm{FLBP}_{N,R}(x,y,i) = \prod_{p=0}^{N-1} \left[ b_p(i)\, m_1(p,f) + \left(1 - b_p(i)\right) m_0(p,f) \right], \tag{16}$$

where $b_p(i) \in \{0,1\}$ is the value of the p th bit of the binary representation of pattern i. By remembering that all considered pixel positions may contribute to bin i in the histogram, it follows that

$$H_{\mathrm{FLBP}}(i) = \sum_{x,y} \mathrm{FLBP}_{N,R}(x,y,i). \tag{17}$$

Analogous to the other LBP-based descriptors, the resulting histogram constitutes the FLBP feature vector.
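The per-pixel contribution to every histogram bin can be sketched directly from the membership functions. This is a naive reference version that loops over all $2^N$ patterns; it is not the faster implementation presented in this article, and the function names and the pre-sampled input list `g` are assumptions. The fuzziness `f` must be positive.

```python
def m0(gp, gc, f):
    """Membership of neighbor value gp in the '0-class' (requires f > 0)."""
    if gp >= gc + f:
        return 0.0
    if gp >= gc - f:
        return (f - gp + gc) / (2 * f)
    return 1.0

def flbp_contributions(g, gc, f):
    """Contribution of one pixel position to each of the 2**N bins."""
    n = len(g)
    zero = [m0(gp, gc, f) for gp in g]
    contributions = []
    for i in range(1 << n):
        c = 1.0
        for p in range(n):
            bit = (i >> p) & 1
            # m1 = 1 - m0 when bit p of pattern i is 1, m0 otherwise
            c *= (1.0 - zero[p]) if bit else zero[p]
        contributions.append(c)
    return contributions
```

For N = 4 with `g = [7, 5, 3, 9]`, `gc = 5`, and `f = 2`, only neighbor 1 is ambiguous ($m_0 = m_1 = 0.5$), so the pixel contributes 0.5 to each of the patterns 9 and 11. The contributions of one pixel position always sum to 1.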

3.9 SLBP

In the classical LBP definition, one pixel position generates one local binary code corresponding to exactly one bin in the histogram of possible codes. In SLBP, a fixed number of local binary codes are generated for each pixel position. In analogy with RLBP, the sign of an expression $(g_p - g_c - k)$ is considered rather than the sign of $(g_p - g_c)$ as in the original LBP (Equation 2). However, in SLBP, k is varied within an interval defined by an intensity limit l. Each time k is changed, a new binary code is created and added to the histogram of occurring binary patterns. SLBP for a position (x,y) and a shift value k is defined as

$$\mathrm{SLBP}_{N,R}(x,y,k) = \sum_{p=0}^{N-1} s(g_p - g_c - k)\,2^p, \tag{18}$$

where the function s is defined as in Equation 3, and k is defined as

$$k \in [-l, l] \cap \mathbb{Z}. \tag{19}$$

The number of generated binary patterns, K, for one pixel position equals the number of different values k assumes. From this and Equation 19 it follows that

$$K = 2 \cdot l + 1. \tag{20}$$

As an example, if l = 3, the parameter k will assume the values $\{-3, -2, \ldots, 3\}$. K will hence be 7, which means that each pixel position contributes 7 binary codes to the histogram. For neighborhoods with high local contrast, the K binary codes may all be the same, while neighborhoods with contrast lower than l will generate a distribution of binary codes picking up some of the fuzzy nature of that neighborhood. The values in the final histogram are divided by K, so that the histogram sums to the number of pixel positions considered (like the rest of the LBP family).
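The SLBP histogram can be sketched in a few lines. The input format (a list of `(g, gc)` pairs with pre-sampled neighbor values) and the function name are assumptions for illustration.

```python
def slbp_histogram(neighborhoods, n, l):
    """SLBP histogram: K = 2*l + 1 shifted codes per pixel, normalized by K."""
    hist = [0.0] * (1 << n)
    K = 2 * l + 1
    for g, gc in neighborhoods:
        for k in range(-l, l + 1):
            code = sum(1 << p for p, gp in enumerate(g) if gp - gc - k >= 0)
            hist[code] += 1
    return [h / K for h in hist]
```

For a single neighborhood with `g = [9, 6, 5, 4]`, `gc = 5`, and `l = 1`, the shifts k = -1, 0, 1 give the codes 15, 7, and 3, so each of these three bins receives 1/3 and the histogram sums to 1, the number of pixel positions.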

3.10 Rotation invariance of the LBP-family

One straightforward way to make LBP rotation invariant is to rotate the binary code, i.e., circularly bit-shift it, to its lowest value [25]. For most LBP-based features, it is trivial to introduce rotation invariance following this scheme. Indeed, in [34], rotation invariance was introduced to FLBP following this approach. ILBP, MBP, RLBP, and SLBP are made rotation invariant in this way. LTP, ILTP, and LQP are somewhat different due to the concatenation of binary codes. For these descriptors, each binary code is therefore made rotation invariant before the histograms are concatenated.
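The circular bit-shift to the lowest value can be written as a short helper. This sketch handles an N-bit neighbor code only; the function name is illustrative.

```python
def rotate_to_minimum(code, n):
    """Smallest value among all circular bit rotations of an n-bit code."""
    mask = (1 << n) - 1
    best = code
    for _ in range(n - 1):
        # rotate right by one position within n bits
        code = ((code >> 1) | (code << (n - 1))) & mask
        best = min(best, code)
    return best

print(rotate_to_minimum(0b11100000, 8))  # 7, i.e. 0b00000111
print(rotate_to_minimum(0b10000001, 8))  # 3, i.e. 0b00000011
```

All rotated versions of a pattern map to the same representative, so the histogram no longer depends on the orientation in which the pattern occurs.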

3.11 Gabor filters

In 1978, Granlund [12] generalized Gabor filters to 2D and applied them to images. In this article, the 2D Gabor filter in the spatial domain, ψ, is defined as in [35]:

$$\psi(x, y, F, \Theta, \gamma, \eta) = \frac{F^2}{\pi\gamma\eta} \exp\left(-F^2\left[(x'/\gamma)^2 + (y'/\eta)^2\right]\right) \times \exp\left(i\,2\pi F x'\right),$$

$$x' = x\cos\Theta + y\sin\Theta,$$

$$y' = -x\sin\Theta + y\cos\Theta.$$

F is the frequency of the wave, and Θ is the angle between the direction of the wave and the x-axis. The Gaussian envelope is defined by the standard deviation parallel to the wave, γ, and the standard deviation perpendicular to the wave, η.
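To make the filter definition concrete, the sketch below samples ψ on a square pixel grid using only the Python standard library. The truncation of the kernel to `half_size` pixels on each side of the origin and the function name are assumptions of this example; the article does not state the kernel support used.

```python
import cmath
import math

def gabor_kernel(F, theta, gamma, eta, half_size):
    """Complex Gabor kernel as rows of values for x, y in [-half_size, half_size]."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    scale = F ** 2 / (math.pi * gamma * eta)
    kernel = []
    for y in range(-half_size, half_size + 1):
        row = []
        for x in range(-half_size, half_size + 1):
            xr = x * cos_t + y * sin_t    # x' in the definition above
            yr = -x * sin_t + y * cos_t   # y' in the definition above
            envelope = math.exp(-F ** 2 * ((xr / gamma) ** 2 + (yr / eta) ** 2))
            row.append(scale * envelope * cmath.exp(2j * math.pi * F * xr))
        kernel.append(row)
    return kernel
```

The value at the origin equals the real normalization factor $F^2/(\pi\gamma\eta)$, and the kernel is conjugate symmetric about the origin, which is a convenient sanity check for any implementation.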

A set of Gabor filters with different orientations and frequencies is commonly called a GF. Bianconi and Fernández [35] show that parameters with a significant impact on the texture classification using GF are the frequency ratio and the standard deviations for the Gaussian envelope. They also conclude that a small change of a reasonable number of orientations, $n_O$, or number of frequencies, $n_F$, in a GF does not significantly influence the discriminating power for the texture datasets they consider. Based on their findings, a GF with a frequency ratio equal to $\sqrt{2}$ is used here. The highest central frequency, $F_M$, is computed according to [35] as

$$F_M = \frac{\gamma}{2\left(\gamma + \left(\sqrt{\log 2}/\pi\right)\right)},$$

where γ is the standard deviation of the Gaussian envelope parallel to the wave. Figure 4 shows an example of four Gabor filter kernels of the orientation Θ = π/7 using γ = 4, η = 4, $F_M \approx 0.53$, and a frequency ratio of $\sqrt{2}$.

Figure 4 Examples of Gabor filters used. The real part of the Gabor filter kernels of one specific orientation (Θ = π/7) and one Gaussian envelope (γ = 4, η = 4) are shown. (a) Highest central frequency computed to $F_M \approx 0.53$. (b–d) The three following lower frequencies with frequency ratio equal to $\sqrt{2}$.

When the GF descriptor is applied to a texture sample, the texture is convolved with the complex conjugate of each of the constructed filters in the filter bank. The mean, μ, and standard deviation, σ, are computed for the magnitude of each filter response, and these values are used as the feature values. This results in a feature vector with $n_O \times n_F \times 2$ elements of the following form:

$$\mathrm{GF} = \{\mu_{0,0}, \sigma_{0,0}, \mu_{0,1}, \sigma_{0,1}, \ldots, \mu_{n_O-1,n_F-1}, \sigma_{n_O-1,n_F-1}\}.$$

Rotation invariance is achieved through the procedure proposed in [36]: for each frequency, the dominant direction is computed as the orientation giving the highest mean filter response among the filters with this frequency in the filter bank. The elements in the GF feature vector are then circularly shifted so that μ and σ of the dominant direction are always found at the same positions in the feature vector. In [36], it is shown that a rotation of an image in the spatial domain corresponds to a circular shift of feature vector elements.
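One possible reading of this circular shift is sketched below. It assumes that the first subscript of μ and σ indexes orientation and the second frequency, that the responses are stored as `mu[o][f]` and `sigma[o][f]`, and that the dominant orientation is moved to index 0; none of these layout details are specified in the text, and the function name is illustrative.

```python
def rotation_normalized_feature(mu, sigma):
    """Shift orientations per frequency so the dominant one comes first."""
    n_o, n_f = len(mu), len(mu[0])
    shifted_mu = [[0.0] * n_f for _ in range(n_o)]
    shifted_sigma = [[0.0] * n_f for _ in range(n_o)]
    for f in range(n_f):
        # dominant direction: orientation with the highest mean response
        dominant = max(range(n_o), key=lambda o: mu[o][f])
        for o in range(n_o):
            src = (o + dominant) % n_o
            shifted_mu[o][f] = mu[src][f]
            shifted_sigma[o][f] = sigma[src][f]
    feature = []
    for o in range(n_o):
        for f in range(n_f):
            feature += [shifted_mu[o][f], shifted_sigma[o][f]]
    return feature
```

If the rows of `mu` and `sigma` are cyclically permuted, which is how a rotation by one orientation step would show up under these assumptions, the returned feature vector is unchanged.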

3.12 Gray-level co-occurrence matrices

Introduced in 1973 by Haralick et al. [1], descriptors derived from gray-level co-occurrence matrices still have a given place among established texture features. A relation operator is defined describing the distance and direction between pixels whose intensities are to be pairwise compared in the region of interest. A relation operator can, e.g., be ‘one pixel to the right’ and the following co-occurrence matrix, M, will then show how often a certain gray value occurs one pixel to the right of another gray value. The gray levels of an image are commonly quantized into a lower number of intensity levels prior to computing the co-occurrence matrix. Quantization into q gray levels is used in this article resulting in a q × q co-occurrence matrix of the gray levels defined as

$$M = \begin{pmatrix} p(1,1) & p(1,2) & \cdots & p(1,q) \\ p(2,1) & p(2,2) & \cdots & p(2,q) \\ \vdots & \vdots & \ddots & \vdots \\ p(q,1) & p(q,2) & \cdots & p(q,q) \end{pmatrix},$$

where p(i,j) is the probability of the co-occurrence of the gray levels i and j given a relation operator. In this article, the four symmetric relation operators proposed by Haralick et al. are used. From the co-occurrence matrices, the contrast, correlation, energy, and homogeneity descriptors are computed as follows:

$$\text{contrast} = \sum_{i,j} |i - j|^2\, p(i,j),$$

$$\text{correlation} = \sum_{i,j} \frac{(i - \mu_i)(j - \mu_j)\, p(i,j)}{\sigma_i \sigma_j},$$

$$\text{energy} = \sum_{i,j} p(i,j)^2,$$

$$\text{homogeneity} = \sum_{i,j} \frac{p(i,j)}{1 + |i - j|},$$

where $\mu_i$ and $\mu_j$ are mean values computed along rows and columns, respectively. In the same way, $\sigma_i$ and $\sigma_j$ are standard deviations computed along rows and columns.

For each of the four descriptors, the average and standard deviation over the four relation operators (directions) are used as feature values. This results in a GLCM feature vector with eight elements. To fully describe the GLCM descriptor, the distance d in the relation operator also needs to be set.
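The GLCM feature vector can be sketched as follows for an image that has already been quantized to integer levels 0 to q-1. Several details are assumptions of this example rather than statements from the article: the four directions are taken as 0°, 45°, 90°, and 135° at distance d, the population standard deviation is used across directions, and the feature order is (mean, standard deviation) per descriptor. The correlation is undefined for a co-occurrence matrix with zero variance, which this sketch does not guard against.

```python
import math
import statistics

def glcm(img, dx, dy, q):
    """Symmetric, normalized q x q co-occurrence matrix for offset (dx, dy)."""
    counts = [[0] * q for _ in range(q)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                i, j = img[y][x], img[y2][x2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both orders: symmetric operator
    total = sum(sum(row) for row in counts)
    return [[c / total for c in row] for row in counts]

def descriptors(P):
    """Contrast, correlation, energy, homogeneity of a normalized GLCM."""
    q = len(P)
    cells = [(i + 1, j + 1, P[i][j]) for i in range(q) for j in range(q)]
    mu_i = sum(i * p for i, j, p in cells)
    mu_j = sum(j * p for i, j, p in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * p for i, j, p in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * p for i, j, p in cells))
    contrast = sum(abs(i - j) ** 2 * p for i, j, p in cells)
    correlation = sum((i - mu_i) * (j - mu_j) * p for i, j, p in cells) / (sd_i * sd_j)
    energy = sum(p * p for i, j, p in cells)
    homogeneity = sum(p / (1 + abs(i - j)) for i, j, p in cells)
    return [contrast, correlation, energy, homogeneity]

def glcm_feature(img, q, d=1):
    """Eight values: mean and std over four directions for each descriptor."""
    directions = [(d, 0), (d, -d), (0, -d), (-d, -d)]  # assumed 0/45/90/135 deg
    per_direction = [descriptors(glcm(img, dx, dy, q)) for dx, dy in directions]
    feature = []
    for values in zip(*per_direction):
        feature += [statistics.mean(values), statistics.pstdev(values)]
    return feature
```

On a 3 × 3 two-level checkerboard, horizontal and vertical neighbors always differ while diagonal neighbors always agree, so the contrast alternates between 1 and 0 over the four directions and the correlation between -1 and 1.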

3.13 Classification method

To get comparable noise robustness results and parameter optimization for the descriptors, a 1-NN classifier with the Euclidean metric is used. Tenfold cross-validation is used to minimize overfitting and to ensure that the validation is performed on independent test sets. The cross-validation is done by randomly assigning each sample a number $n \in \{1, 2, \ldots, 10\}$, creating ten disjoint subsets with an equal (or approximately equal) number of samples. In the first cross-validation fold, samples with n = 1 are the test data and samples with $n \in \{2, 3, \ldots, 10\}$ serve as training data. In the second fold, samples with n = 2 are the test data and the rest are used for training, and so on. This means that each sample is included in the test data exactly once, and a less biased classification accuracy is obtained compared to using the apparent error. The ten results from the folds are combined into a single estimate.

The cross-validation folds are created once for each dataset and are then kept fixed throughout the comparison. The feature values for all descriptors are normalized to [0,1] prior to the cross-validation.
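The evaluation protocol can be sketched with the standard library alone. The function names, the zero-based fold labels, the fixed seed, and the way the fold results are combined (pooling the correct decisions over all folds) are assumptions of this example; the article only states that the ten results are combined into a single estimate.

```python
import math
import random

def assign_folds(n_samples, n_folds=10, seed=0):
    """Random fold label per sample, with (nearly) equal fold sizes."""
    labels = [i % n_folds for i in range(n_samples)]
    random.Random(seed).shuffle(labels)  # fixed seed keeps the folds fixed
    return labels

def min_max_normalize(features):
    """Scale every feature (column) to [0, 1]; constant columns become 0."""
    columns = list(zip(*features))
    lo = [min(c) for c in columns]
    hi = [max(c) for c in columns]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in features
    ]

def cross_validated_accuracy(features, classes, folds):
    """Pooled accuracy of a Euclidean 1-NN classifier over all folds."""
    correct = 0
    for k in sorted(set(folds)):
        train = [i for i, f in enumerate(folds) if f != k]
        test = [i for i, f in enumerate(folds) if f == k]
        for i in test:
            nearest = min(train, key=lambda j: math.dist(features[i], features[j]))
            correct += classes[nearest] == classes[i]
    return correct / len(features)
```

A typical call normalizes the feature matrix once, creates the folds once per dataset, and then reuses the same `folds` list for every descriptor so that all descriptors are evaluated on identical splits.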

3.14 Parameter optimization

The parameters for each texture descriptor are optimized separately for each dataset to make the comparison as fair as possible. The parameters common to all descriptors in the LBP family are the number of samples N and the radius R. Except for ILBP and MBP, all extensions to the classic LBP have additional parameters. The parameters are listed in Table 3 along with the range wherein they are varied. Since several parameters are common to several descriptors, the table also shows for which method each parameter is applicable.

Table 3 Descriptor parameters and the intervals searched during parameter optimization

To restrict the parameter search space, an optimization scheme is designed as follows:

  1. Find the optimal N and R for LBP using a tenfold cross-validated 1-NN classifier.

  2. Use N and R from step 1 and find the optimal:

    (a) fuzziness, f, for FLBP,

    (b) threshold $t_1$ for LTP, ILTP, and RLBP,

    (c) threshold pairs $t_1$ and $t_2$ for LQP, and

    (d) interval limit l for SLBP.

  3. For all texture descriptors, perform a new gradient descent parameter search locally around the previously found best point in the current descriptor’s full parameter space. Repeat until stability.

In other words, an exhaustive search for the best LBP parameters is performed. The LBP parameter values are then used when optimizing all the method-specific parameters. They are next used as a starting guess for an iterative optimization procedure based on gradient descent where all parameters in the descriptors are allowed to vary.
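The local refinement in the last step can be illustrated with a greedy hill-climbing search on a discrete parameter grid. This is a stand-in for the gradient descent search described above, not a reproduction of it: the article does not specify the step rule, and the function name, the `score` callback (for example, cross-validated accuracy), and the example grids are hypothetical.

```python
def local_search(score, start, grids):
    """Greedy hill climbing: move to a better grid neighbor until none exists.

    score: callable taking a dict of parameter values, higher is better.
    start: dict with the starting value of every parameter (must lie on the grid).
    grids: dict mapping each parameter name to its sorted list of allowed values.
    """
    current = dict(start)
    best = score(current)
    improved = True
    while improved:  # "repeat until stability"
        improved = False
        for name, values in grids.items():
            idx = values.index(current[name])
            for j in (idx - 1, idx + 1):
                if 0 <= j < len(values):
                    candidate = dict(current)
                    candidate[name] = values[j]
                    candidate_score = score(candidate)
                    if candidate_score > best:
                        current, best = candidate, candidate_score
                        improved = True
    return current, best
```

Because a move is only accepted when the score strictly improves and the grid is finite, the loop always terminates, although possibly in a local optimum, which is why a good starting point from the first two steps matters.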

The described optimization scheme is applied to each dataset separately. An exhaustive search for each of the parameters is not feasible due to the size of the datasets and total number of parameters across the descriptors.

The parameters of the reference descriptors GF and GLCM are also optimized for each dataset. Table 3 shows the explored set of parameter values for both GLCM and GF. The optimization criterion is the same as for the LBP family of descriptors.

3.15 Introducing noise

When the descriptor parameters have been optimized for each dataset, the influence of noise is investigated. The noise model used is additive white (uncorrelated) Gaussian noise. That is, a sample from a Gaussian distribution is added to the intensity of each pixel. This noise model is well suited for modeling thermal noise in CCD and CMOS sensors, which are the sensors relevant for the microscopy and photography datasets considered here. The σ for the Gaussian distribution is gradually increased. Figure 5 shows one texture sample from each dataset under three different noise levels. The noise is added to the original datasets, and the noisy datasets are then saved. In this way, all the descriptors are applied and evaluated on the exact same noisy texture samples. The 20 noise levels used are σ from $10^{-4}$ to $10^{1}$ with linearly spaced exponents, i.e., the 20 noise levels are equally spaced on a $\log_{10}$ scale.
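The noise levels and the noise model can be sketched as follows. The function names are illustrative, and the sketch neither clips nor rescales the noisy intensities, since the article does not state the intensity range of the images or whether clipping is applied.

```python
import random

def noise_levels(n_levels=20, exp_min=-4.0, exp_max=1.0):
    """Standard deviations with linearly spaced exponents (log10-spaced)."""
    step = (exp_max - exp_min) / (n_levels - 1)
    return [10 ** (exp_min + i * step) for i in range(n_levels)]

def add_gaussian_noise(img, sigma, rng):
    """Add an independent zero-mean Gaussian sample to every pixel."""
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in img]
```

Generating each noisy dataset once with a seeded generator such as `random.Random(1)` and storing the result mirrors the procedure above, in which every descriptor is evaluated on exactly the same noisy samples.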

Figure 5 Examples of noise levels. One texture sample from each one of the six datasets under increasing levels of additive Gaussian white noise. For the Virus example, a dashed circle marks the region wherein the texture descriptors are computed.

4 Results

4.1 Parameter optimization

Table 4 lists the parameter values for each descriptor and dataset after applying the optimization scheme described in Section 3.14. The parameter choice does not only influence the discriminant power of the descriptor but may also, depending on the descriptor, set the number of elements in the feature vector. In the LBP family of descriptors, the feature vector length depends on the number of samples N and whether or not the center pixel is included in the binary code. Table 5 lists the feature vector lengths for the descriptors after the parameter optimization.

Table 4 Parameter settings for each descriptor and dataset after applied optimization scheme
Table 5 Feature vector length for each descriptor and dataset based on the optimized descriptor parameters

4.2 Comparison without added noise

The discriminating power of the descriptors is compared on the datasets without added noise by analyzing the combined classification accuracy of the tenfold cross-validation. The classification accuracy may vary between datasets and descriptors, but also within a dataset for a specific descriptor, i.e., all classes may not be equally easy or difficult to discriminate. To explore this perspective, Figure 6 shows the distribution of mean accuracy per class for each descriptor and dataset.

Figure 6 Descriptor performance without added noise. Distribution of mean accuracy per class for each descriptor and dataset. Circles with dots mark median values. The boxes stretch between the 25th and 75th percentiles, and the lines span all the data points.

Figure 6 shows that almost all descriptors perform well on the Kylberg dataset. LTP and ILTP manage to differentiate almost all classes perfectly in the Kylberg dataset (median very close to 100%, small boxes, and short tails). Most descriptors also perform well on the KTH-TIPS2b dataset. Even for the many classes in the Brodatz dataset all LBP descriptors perform overall well (100% for more than half the classes and boxes starting at >88%) but there are a number of classes no method can discriminate between (lowest class accuracies are between 22 and 44.4%). This is not surprising since there is a considerable overlap between some of the classes in the Brodatz dataset, as mentioned before.

The other three datasets are more problematic with more varied results for the LBP descriptors. The overall low accuracies achieved on the Virus dataset are probably due to the small sample size (only 41 × 41 pixels), as well as the heterogeneous classes originating from the automatic extraction of patches only partly (or sometimes even not at all) containing virus. Across these three datasets, ILTP performs overall well as does FLBP.

GLCM is among the worst performing descriptors for all datasets, except for the Mondial Marmi dataset. Note however that only very few measures on the co-occurrence matrix are extracted.

The GF descriptor performs on the same level as several LBP-based descriptors for several datasets. However, on the Kylberg and UIUC datasets GF is outperformed by most LBP-based descriptors. Comparisons of per-class performance for the different descriptors and datasets (data not shown) show that GF sometimes produces good results for a few specific classes where the LBP family of descriptors does not. This indicates that GF could be a good complementary texture descriptor and that a combination with, for example, ILTP might improve the overall classification accuracy on some of the datasets. However, combining descriptors to produce the best classification result possible is not the purpose of this article, and is not further investigated here.

4.3 Robustness to noise

Figures 7, 8, 9, 10, 11, and 12 show the mean classification accuracies for the texture descriptors on the six datasets under increasing levels of added noise. In all figures, LBP, GF, and GLCM are shown in red, blue, and green, respectively, and one of the other descriptors at a time in black. A horizontal dotted line marks the mean accuracy of a random decision. The curves are interpolated between data points using piecewise cubic interpolation. For increasing noise levels, the performance of all descriptors is expected to level out at the mean accuracy of a random classification, i.e., a mean classification accuracy of 1/(number of classes). This is easily seen in, for example, Figure 9. The data shown in Figures 7, 8, 9, 10, 11, and 12 are also available in tabular form in Tables 6, 7, 8, 9, 10, and 11, but limited to every second noise level. In the tables, the highest mean accuracy for each noise level is highlighted in bold and the lowest in italics.
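The two ingredients of these noise tests, additive Gaussian white noise and the chance-level reference line, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the parameter `sigma` (standard deviation in gray levels), the clipping to [0, 255], and the function names are hypothetical, and the article's noise levels may be parameterized differently.

```python
import random

def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian white noise to a 2-D list of gray values.

    Clipping to the range [0, 255] is an assumption of this sketch.
    """
    rng = random.Random(seed)
    return [[min(255.0, max(0.0, pixel + rng.gauss(0.0, sigma))) for pixel in row]
            for row in image]

def chance_accuracy(n_classes):
    """Mean accuracy of a uniformly random decision among n_classes classes."""
    return 1.0 / n_classes

image = [[10, 20], [30, 40]]
print(add_gaussian_noise(image, sigma=5.0))
print(chance_accuracy(4))
```

Each noise realization is independent per pixel, which is what makes the noise "white"; the dotted line in the figures corresponds to `chance_accuracy` evaluated for the number of classes in the dataset at hand.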

Figure 7

Noise tests on Brodatz. Mean classification accuracy for all descriptors on the Brodatz dataset.

Figure 8

Noise tests on KTH-TIPS2b. Mean classification accuracy for all descriptors on the KTH-TIPS2b dataset.

Figure 9

Noise tests on Kylberg. Mean classification accuracy for all descriptors on the Kylberg dataset.

Figure 10

Noise tests on Mondial Marmi. Mean classification accuracy for all descriptors on the Mondial Marmi dataset.

Figure 11

Noise tests on UIUC. Mean classification accuracy for all descriptors on the UIUC dataset.

Figure 12

Noise tests on Virus. Mean classification accuracy for all descriptors on the Virus dataset.

Table 6 Mean classification accuracy for descriptors computed on the Brodatz dataset
Table 7 Mean classification accuracy for descriptors computed on the KTH-TIPS2b dataset
Table 8 Mean classification accuracy for descriptors computed on the Kylberg dataset
Table 9 Mean classification accuracy for descriptors computed on the Mondial Marmi dataset
Table 10 Mean classification accuracy for descriptors computed on the UIUC dataset
Table 11 Mean classification accuracy for descriptors computed on the Virus dataset

For the Brodatz dataset (Figure 7 and Table 6), GF stands out as the most noise-robust texture descriptor, but it is not necessarily the best descriptor at low noise levels, where ILTP, followed by LTP and SLBP, show good performance. These four descriptors are better than LBP for all noise levels. RLBP, LQP, and especially MBP all perform worse than LBP, and in addition, the performance of LQP and MBP drops quickly with increasing levels of noise.

For low levels of noise in the KTH-TIPS2b dataset, Figure 8, all LBP-based descriptors (except MBP) outperform the original LBP and they perform on the same high level as GF. For medium to high levels of noise all LBP-based methods are outperformed by GF and the bottom two LBP-based descriptors are, again, LQP and MBP.

Most LBP-based descriptors show similar performance on the Kylberg dataset, see Figure 9 and Table 8. ILBP, LTP, and ILTP are generally somewhat better than LBP. LQP drops in performance faster than the rest. The MBP performance drops with increasing but still low levels of noise, but then increases again, and MBP is among the better descriptors for high levels of noise. A closer look at the per-class accuracies (data not shown) reveals that it is mainly the second texture class, see Figure 1, with large homogeneous intensity patches in the pattern, that causes this dip in the mean accuracy curve for MBP.

For the Mondial Marmi dataset (Figure 10 and Table 9), the curves look and behave rather differently. A reason behind this might be the JPEG compression artifacts. This is the only dataset where GLCM performs well for low levels of noise. GF is also found to perform well for low noise levels and is more stable than the other descriptors for increasing noise levels. ILBP, ILTP, and FLBP are generally better than LBP. However, for low levels of noise all the descriptors in the LBP family are similar, MBP and LBP being the exceptions. MBP is the worst performing descriptor as soon as low levels of noise are added, and the performance of LQP drops quickly at higher levels of added noise.

On the UIUC dataset, LTP is the best performing descriptor for low levels of noise, and ILTP and FLBP are in general better than LBP, see Figure 11 and Table 10. GF is not very good for low to moderate noise levels but is robust for high levels of noise. ILBP performs poorly for low levels of noise. MBP is by far the worst performing descriptor, followed by GLCM. Again, LQP drops quickly at moderate levels of noise and is hence less noise robust than the other descriptors in the LBP family.

On the difficult Virus texture dataset, GF, ILTP, and FLBP are the best performing descriptors with FLBP having a slight upper hand at low levels of noise, see Figure 12 and Table 11. On this dataset, the proposed SLBP descriptor falls between these three best performing descriptors and the rest while MBP and LQP are the two worst.

4.4 Computation time

One of the benefits of the classic LBP is that it is very fast to compute. A comparison of computation times for the more complex LBP descriptors is hence interesting. Computation time for some of the descriptors depends on the image content. Therefore, the CPU time required for the different descriptors is here compared on one sample from each class in the Kylberg dataset using the optimized parameters listed in Table 4. Figure 13 shows computation time relative to the computation time of the classic LBP. Hence, if a descriptor takes 10 times longer than LBP to compute, it has the value 10 in the plot in Figure 13.

Figure 13

Computation time relative to classic LBP for descriptors applied to one sample per class in the Kylberg dataset.
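The normalization used in Figure 13 can be sketched as follows. This is a minimal Python illustration, not the measurement code used in the article; `cpu_seconds` and `relative_to_baseline` are hypothetical helper names, and the descriptor functions passed to them are assumed to take one texture sample as their argument.

```python
import time

def cpu_seconds(func, samples):
    """Total CPU time used by calling func once on every sample."""
    start = time.process_time()
    for sample in samples:
        func(sample)
    return time.process_time() - start

def relative_to_baseline(seconds, baseline="LBP"):
    """Divide each descriptor's time by the baseline descriptor's time."""
    base = seconds[baseline]
    return {name: t / base for name, t in seconds.items()}

# Illustrative, made-up timings in seconds (not measurements from the article).
print(relative_to_baseline({"LBP": 2.0, "descriptor_x": 20.0}))
```

Measuring on the same samples for every descriptor matters here because, as noted above, the computation time of some descriptors depends on the image content.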

Furthermore, two FLBP implementations are compared. The version directly based on [32, 33], called ‘naive’ in Figure 13, computes the histogram bin contribution of all bins for every neighborhood (Equation 16). However, gray value differences outside the fuzzy region [-f, f] restrict the possible binary codes that a neighborhood can contribute to. Utilizing this, a modified implementation was developed, denoted ‘fast’ in Figure 13. It restricts the membership computations to the subset of binary codes possible, given the current local neighborhood. Outside the fuzzy region, the bin contributions will be as in the classic LBP. The computed feature vectors from the ‘naive’ and ‘fast’ implementations of FLBP are of course identical.
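The restriction described above can be sketched for a single neighborhood as follows. This is a minimal Python illustration, not the authors' Matlab implementation. The piecewise linear membership function `membership_one` is assumed here as a stand-in for the membership functions behind Equation 16, and the names `flbp_naive` and `flbp_fast` are hypothetical. The input `diffs` holds the gray value differences between each neighbor and the center pixel.

```python
from itertools import product

def membership_one(diff, f):
    """Assumed piecewise linear degree to which a neighbour is thresholded to 1."""
    if diff >= f:
        return 1.0
    if diff <= -f:
        return 0.0
    return 0.5 + 0.5 * diff / f

def flbp_naive(diffs, f):
    """Contribution of one neighbourhood to every one of the 2**P histogram bins."""
    P = len(diffs)
    m1 = [membership_one(d, f) for d in diffs]
    hist = [0.0] * (2 ** P)
    for code in range(2 ** P):
        weight = 1.0
        for p in range(P):
            weight *= m1[p] if (code >> p) & 1 else 1.0 - m1[p]
        hist[code] = weight
    return hist

def flbp_fast(diffs, f):
    """Same result, but only codes reachable from this neighbourhood are visited."""
    P = len(diffs)
    m1 = [membership_one(d, f) for d in diffs]
    fuzzy = [p for p in range(P) if 0.0 < m1[p] < 1.0]    # bits that may be 0 or 1
    base = sum(1 << p for p in range(P) if m1[p] == 1.0)  # bits fixed to 1
    hist = [0.0] * (2 ** P)
    for bits in product((0, 1), repeat=len(fuzzy)):
        code, weight = base, 1.0
        for p, bit in zip(fuzzy, bits):
            code |= bit << p
            weight *= m1[p] if bit else 1.0 - m1[p]
        hist[code] = weight
    return hist

diffs = [12, -3, 40, -25, 2, 0, -60, 7]
naive, fast = flbp_naive(diffs, 5), flbp_fast(diffs, 5)
print(max(abs(a - b) for a, b in zip(naive, fast)))
```

In this example only three of the eight differences fall inside the fuzzy region, so the restricted version evaluates 8 codes instead of 256. When no difference is fuzzy, a single code receives the full weight, as in the classic LBP.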

Even though the ‘fast’ FLBP implementation is roughly five times faster than the ‘naive’ implementation, they are both very slow compared to all other descriptors. FLBP is 922 times slower than the classic LBP. It should also be noted that the computation time for the ‘fast’ FLBP depends not only on the fuzziness parameter (as is the case for the ‘naive’ FLBP), but also on the image content. Figure 13 shows that LQP, RLBP, ILTP, LTP, ILBP, and GLCM have computation times comparable to LBP. SLBP is roughly 11 times slower than LBP, which is expected since SLBP in this test generates 11 binary codes at every position (l = 5, K = 11, see Equation 20). The MBP is relatively slow compared to most of the LBP descriptors, which is also expected since computing median values in this implementation involves sorting the intensity values in each neighborhood. In GF, which is 20 times slower than LBP, each texture sample is convolved with a number of complex filter kernels. This is a more time-consuming task than performing multiple thresholdings in a small neighborhood, the operation performed in most LBP-based descriptors.
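The factor of K for SLBP follows directly from how the codes are generated. The Python sketch below assumes that the K = 2l + 1 codes are obtained by shifting the threshold by every integer k in [-l, l] and that each code contributes 1/K to the histogram. This is our reading of Equation 20 rather than a transcription of it, and the function names are hypothetical.

```python
def lbp_code(center, neighbours, shift=0):
    """Binary code from thresholding the neighbours at center + shift."""
    return sum(1 << p for p, g in enumerate(neighbours) if g - center - shift >= 0)

def slbp_contributions(center, neighbours, l):
    """Return {code: weight} for the K = 2*l + 1 shifted codes of one position."""
    K = 2 * l + 1
    contributions = {}
    for k in range(-l, l + 1):
        code = lbp_code(center, neighbours, shift=k)
        contributions[code] = contributions.get(code, 0.0) + 1.0 / K
    return contributions

neighbours = [112, 97, 140, 75, 102, 100, 40, 107]
print(slbp_contributions(100, neighbours, l=5))
```

With l = 5 the loop runs 11 times per position, one classic LBP thresholding per shift, which matches the roughly 11-fold computation time reported above. With l = 0 the sketch reduces to the classic LBP code.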

5 Conclusions

This article reports on the following:

  • The descriptive performance of eight LBP-based texture descriptors is evaluated and compared on six different datasets under increasing levels of additive Gaussian white noise together with the classic LBP, Haralick descriptors, and GF.

  • A new LBP-based descriptor, SLBP, is introduced as a fast approximation of the computationally heavy FLBP.

  • A roughly five times faster implementation of the FLBP descriptor is described.

The fast implementation of FLBP as well as an implementation of SLBP are available as Matlab code at [37].

The main conclusions that can be drawn regarding the evaluated texture descriptors are

  • ILTP followed by FLBP generally perform well among the LBP-family of descriptors, outperforming the classic LBP in all tests performed.

  • GF is often very robust for moderate to high levels of noise but is often outperformed by several LBP-based descriptors under low noise conditions.

  • FLBP is very slow compared to the rest of the descriptors but the naive implementation can be improved upon by restricting the belongingness computations to the possible subset of binary codes given a specific neighborhood.

  • MBP is very noise sensitive and has a relatively poor performance even for low levels of noise.

  • LQP suffers more from added noise than the majority of the LBP-based descriptors.

  • It is not possible to know in advance which texture descriptor is the best performing one for a given problem. However, a well-performing descriptor can probably be found among a subset of the tested descriptors, after optimizing their parameters. Such a subset of descriptors could be ILBP, LTP, ILTP, and FLBP. Furthermore, SLBP can sometimes be an alternative to the computationally heavy FLBP.

In accordance with the survey in [9], ILTP is found to be superior to LTP, LQP, and ILBP for all the datasets evaluated. In addition, we show that ILTP retains its discrimination advantage under increasing levels of added Gaussian white noise. The results presented here also show that even if MBP and LQP perform relatively well on noise free data, they both suffer greatly from the introduced noise. Furthermore, we find that FLBP has a good overall performance, similar to ILTP.

It seems that it is preferable to use the more stable local mean value of the neighborhood (including the center pixel) as the local threshold in that ILBP often outperforms LBP, and ILTP often outperforms LTP. The two descriptors using ternary patterns, LTP and ILTP, often outperform their counterparts using binary codes, the LBP and ILBP descriptors, suggesting that the use of ternary patterns has its advantage.
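The difference between these thresholding schemes can be made concrete for a single 3 × 3 neighborhood. The Python sketch below is illustrative only: the function names are hypothetical, the tolerance `t` is a free parameter, and two conventions are assumed, namely that the mean-based variants encode all nine pixels (center included) and that a ternary pattern is split into an upper and a lower binary code.

```python
def threshold_code(values, threshold):
    """Binary code with bit p set when values[p] >= threshold."""
    return sum(1 << p for p, g in enumerate(values) if g >= threshold)

def ternary_codes(values, threshold, t):
    """Ternary pattern split into (upper, lower) binary codes with tolerance t."""
    upper = sum(1 << p for p, g in enumerate(values) if g >= threshold + t)
    lower = sum(1 << p for p, g in enumerate(values) if g <= threshold - t)
    return upper, lower

def lbp(center, neighbours):
    """Classic LBP: threshold the neighbours at the center value."""
    return threshold_code(neighbours, center)

def ilbp(center, neighbours):
    """Mean-thresholded binary code over all nine pixels (assumed convention)."""
    values = list(neighbours) + [center]
    return threshold_code(values, sum(values) / len(values))

def ltp(center, neighbours, t):
    """Ternary pattern around the center value."""
    return ternary_codes(neighbours, center, t)

def iltp(center, neighbours, t):
    """Ternary pattern around the local mean over all nine pixels (assumed convention)."""
    values = list(neighbours) + [center]
    return ternary_codes(values, sum(values) / len(values), t)

# A nearly flat patch disturbed by small gray value fluctuations.
center, neighbours = 100, [101, 99, 100, 102, 98, 100, 101, 99]
print(lbp(center, neighbours), ilbp(center, neighbours))
print(ltp(center, neighbours, t=5), iltp(center, neighbours, t=5))
```

On this nearly flat patch the binary codes are determined entirely by fluctuations of one or two gray levels, whereas both ternary variants map the patch to the all-zero pattern as long as the fluctuations stay within the tolerance. This is one way to see why the ternary descriptors can be less sensitive to low levels of noise.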

The two descriptors MBP and LQP are often found among the worst performing descriptors, both regarding overall accuracy and robustness to noise. The poor performance of MBP can be explained by its definition. Using the median value as the local threshold means that half of the gray levels in the neighborhood will be larger and half smaller. This restricts the possible binary codes and, as a consequence, restricts the amount of discriminative information that can be contained in the MBP descriptor.
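The restriction on the MBP codes can be quantified with a short Python sketch. It assumes a 3 × 3 neighborhood of nine pixels, thresholded at its own median, with a bit set when the gray value is greater than or equal to the median; this convention and the function names are assumptions made for illustration. Under it, at least five of the nine bits are always set.

```python
from math import comb
from statistics import median

def mbp_code(patch):
    """Binary code of a flattened 3x3 patch thresholded at its own median."""
    tau = median(patch)
    return sum(1 << p for p, g in enumerate(patch) if g >= tau)

def reachable_code_count(n_pixels=9):
    """Number of n_pixels-bit codes with at least (n_pixels + 1) // 2 bits set."""
    least = (n_pixels + 1) // 2
    return sum(comb(n_pixels, ones) for ones in range(least, n_pixels + 1))

patch = [9, 1, 8, 2, 7, 3, 6, 4, 5]
code = mbp_code(patch)
print(code, bin(code).count("1"))
print(reachable_code_count(), "of", 2 ** 9)
```

Under these assumptions at most 256 of the 512 nine-bit codes can occur, so half of the histogram bins are always empty regardless of the texture.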

GF, which involves convolution with relatively large (between 13 × 13 and 25 × 25 pixels) complex filter kernels and is hence slow in comparison to most of the other descriptors, proves to be a very noise-robust descriptor for all datasets, but it is not always among the best performing descriptors at low noise levels.

Under increasing levels of noise the discriminating power of the descriptors is expected to drop monotonically, or at least close to monotonically. This holds for most tests reported on here except for the results for the Mondial Marmi dataset, which are somewhat odd, see Figure 10. While the mean classification accuracies have a decreasing trend, the curves are far from monotonically decreasing. One possible cause may be the JPEG compression artifacts present in this dataset. The blocking artifacts from the 8 × 8 blocks used in JPEG compression are at a scale comparable to that of the local neighborhoods used in the LBP family. As expected, GF, with its larger considered regions, shows a smoother decline under increasing levels of noise.

A comparison of the per-class performance and confusion matrices for the descriptors at a few noise levels has been done (data not shown). The LBP family of descriptors tends to have difficulties with mostly the same classes (MBP and LQP have additional difficulties). The per-class accuracy for GF and the LBP descriptors is often similar even though the LBP descriptors are more alike among themselves (apart from MBP). This is in line with the findings reported in [38]. The per-class accuracy for GLCM differs from those of the LBP family and GF mainly in that GLCM has additional difficulties discriminating a number of classes. FLBP has a high overall accuracy but with a slightly different pattern in the per-class accuracy compared to the rest of the LBP family on the Brodatz, Kylberg, and Virus datasets. Similarly, GF has a slightly different distribution of per-class accuracy than the LBP family on the Brodatz, KTH-TIPS2b, and Mondial Marmi datasets.

A different distribution of per-class accuracy indicates that the descriptors compared detect different characteristics of the textures. On some datasets used here a combination of ILTP or FLBP and GF could presumably be beneficial for the task of texture classification. However, combining texture descriptors to improve classification accuracy is not within the scope of this article.

In parallel with the 1-NN classifier used in the results reported in this article, SVMs were also investigated on the datasets without added noise using both a linear and a Gaussian kernel with optimized parameters. Similar descriptor parameter values were suggested by the SVM classifiers in the optimization procedure for the texture descriptors. For some dataset–descriptor combinations, the SVMs reached slightly higher classification accuracies. Nevertheless, the 1-NN classifier was used in the tests reported on here in order to compare the descriptors on an equal and fair basis.
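For reference, the evaluation protocol of a 1-NN classifier with cross-validation can be sketched in Python as below. This is a simplified illustration, not the evaluation code behind the reported results: the squared Euclidean distance and the deterministic fold assignment (sample i goes to fold i mod n_folds) are assumptions, and the function names are hypothetical.

```python
def nearest_neighbour_label(query, train_features, train_labels):
    """Label of the training vector closest to query (squared Euclidean distance)."""
    best = min(range(len(train_features)),
               key=lambda i: sum((a - b) ** 2
                                 for a, b in zip(query, train_features[i])))
    return train_labels[best]

def cross_validated_accuracy(features, labels, n_folds=10):
    """Combined 1-NN accuracy over all folds; sample i is tested in fold i % n_folds."""
    correct = 0
    for fold in range(n_folds):
        train = [i for i in range(len(features)) if i % n_folds != fold]
        test = [i for i in range(len(features)) if i % n_folds == fold]
        train_x = [features[i] for i in train]
        train_y = [labels[i] for i in train]
        correct += sum(
            nearest_neighbour_label(features[i], train_x, train_y) == labels[i]
            for i in test)
    return correct / len(features)

# Two well-separated toy classes of ten feature vectors each.
features = [[0.01 * i, 0.0] for i in range(10)] + [[10.0 + 0.01 * i, 0.0] for i in range(10)]
labels = [0] * 10 + [1] * 10
print(cross_validated_accuracy(features, labels))
```

Because 1-NN has no parameters of its own to tune, differences in accuracy in such a protocol are attributable to the feature vectors, which is the motivation given above for preferring it over the SVMs in the comparison.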


References

  1. Haralick RM, Shanmugam K, Dinstein I: Textural features for image classification. IEEE Trans. Syst. Man Cybern 1973, 3(6):610-621.

  2. Jain A: Learning texture discrimination masks. IEEE Trans. Pattern Anal. Mach. Intell 1996, 18(2):195-205. 10.1109/34.481543

  3. Pietikäinen M, Hadid A, Zhao G, Ahonen T: Computer Vision Using Local Binary Patterns, vol. 40 of Computational Imaging and Vision. London: Springer; 2011.

  4. Harwood D, Ojala T, Pietikäinen M, Kelman S, Davis L: Texture classification by center-symmetric auto-correlation, using Kullback discrimination of distributions. Pattern Recognit. Lett 1995, 16: 1-10. 10.1016/0167-8655(94)00061-7

  5. Wang L, He DC: Texture classification using texture spectrum. Pattern Recognit 1990, 23(8):905-910. 10.1016/0031-3203(90)90135-8

  6. Ojala T, Pietikäinen M, Harwood D: A comparative study of texture measures with classification based on featured distributions. Pattern Recognit 1996, 29: 51-59. 10.1016/0031-3203(95)00067-4

  7. Nanni L, Lumini A, Brahnam S: Survey on LBP based texture descriptors for image classification. Expert Syst. Appl 2012, 39(3):3634-3641. 10.1016/j.eswa.2011.09.054

  8. Nanni L, Lumini A, Brahnam S: Local binary patterns variants as texture descriptors for medical image analysis. Artif. Intell. Med 2010, 49(2):117-125. 10.1016/j.artmed.2010.02.006

  9. Fernández A, Álvarez M, Bianconi F: Texture description through histograms of equivalent patterns. J. Math. Imag. Vision 2013, 45: 76-102. 10.1007/s10851-012-0349-8

  10. Mäenpää T, Ojala T, Pietikäinen M, Soriano M: Robust texture classification by subsets of local binary patterns. In Proceedings 15th International Conference on Pattern Recognition, ICPR 2000. Barcelona, Spain; 2000:935-938.

  11. Kylberg G, Uppström M, Sintorn IM: Virus texture analysis using local binary patterns and radial density profiles. In Proceedings of the 16th Iberoamerican Congress on Pattern Recognition, CIARP 2011, vol. 7042 of Lecture Notes in Computer Science. Pucón, Chile; 2011:573-580.

  12. Granlund GH: In search of a general picture processing operator. Comput. Graph. Image Process 1978, 8(2):155-173. 10.1016/0146-664X(78)90047-3

  13. Brodatz P: Textures: A Photographic Album for Artists and Designers. New York: Dover Publications; 1966.

  14. Caputo B, Hayman E, Mallikarjuna P: Class-specific material categorisation. In Proceedings of the 10th IEEE International Conference on Computer Vision, ICCV 2005. China: Beijing; 2005:1597-1604.

  15. Kylberg G: The Kylberg Texture Dataset v. 1.0. External report (Blue series) 35, Centre for Image Analysis, Swedish University of Agricultural Sciences and Uppsala University. Uppsala, Sweden; 2011.

  16. Fernández A, Álvarez MX, Bianconi F: Image classification with binary gradient contours. Opt. Lasers Eng 2011, 49(9–10):1177-1184.

  17. Lazebnik S, Schmid C, Ponce J: A sparse texture representation using local affine regions. IEEE Trans. Pattern Anal. Mach. Intell 2005, 27(8):1265-1278.

  18. Randen T: Brodatz textures at Trygve Randen’s website. 2011.

  19. KTH-TIPS 2b 2012.

  20. Kylberg G: Kylberg Texture Dataset v.1.0. 2012.

  21. Fernández A, Ghita O, González E, Bianconi F, Whelan PF: Evaluation of robustness against rotation of LBP, CCR and ILBP features in granite texture classification. Mach. Vis. Appl 2010, 22(6):913-926.

  22. Mondial Marmi Texture Dataset v. 1.1 2012.

  23. UIUC Texture Database 2012.

  24. Kylberg G: Virus Texture Dataset v. 1.0. 2012.

  25. Ojala T, Mäenpää T: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell 2002, 24(7):971-987. 10.1109/TPAMI.2002.1017623

  26. Heikkilä M, Ahonen T 2012.

  27. Jin H, Liu Q, Lu H, Tong X: Face detection using improved LBP under Bayesian framework. In Proceedings of the 3rd International Conference on Image and Graphics, ICIG 2004. China: Hong Kong; 2004:306-309.

  28. Hafiane A, Seetharaman G, Zavidovique B: Median binary pattern for textures classification. In Proceedings of the 4th International Conference, ICIAR 2007, vol. 4633 of Lecture Notes in Computer Science. Montreal, Canada; 2007:387-398.

  29. Tan X, Triggs B: Enhanced local texture feature sets for face recognition under difficult lighting conditions. IEEE Trans. Image Process 2007, 19(6):1635-1650.

  30. Nanni L, Brahnam S, Lumini A: A local approach based on a local binary patterns variant texture descriptor for classifying pain states. Expert Syst. Appl 2010, 37(12):7888-7894. 10.1016/j.eswa.2010.04.048

  31. Heikkilä M, Pietikäinen M: A texture-based method for modeling the background and detecting moving objects. IEEE Trans. Pattern Anal. Mach. Intell 2006, 28(4):657-662.

  32. Iakovidis DK, Keramidas EG, Maroulis D: Fuzzy local binary patterns for ultrasound texture characterization. In Proceedings of the 5th International Conference on Image Analysis and Recognition, ICIAR 2008, vol. 5112 of Lecture Notes in Computer Science. Portugal: Póvoa de Varzim; 2008:750-759.

  33. Ahonen T, Pietikäinen M: Soft histograms for local binary patterns. In Proceedings of the Finnish Signal Processing Symposium, FINSIG 2007. Oulu, Finland; 2007:1-4.

  34. Herve N, Servais A, Thervet E, Olivo-Marin JC, Meas-Yedid V: Statistical color texture descriptors for histological images analysis. In Proceedings of the IEEE International Symposium on Biomedical Imaging: From Nano to Macro, ISBI 2011. Chicago, USA; 2011:724-727.

  35. Bianconi F, Fernández A: Evaluation of the effects of Gabor filter parameters on texture classification. Pattern Recognit 2007, 40(12):3325-3335. 10.1016/j.patcog.2007.04.023

  36. Zhang D, Wong A, Indrawan M, Lu G: Content-based image retrieval using Gabor texture features. In Proceedings of the First IEEE Pacific-Rim Conference on Multimedia, PCM 2000. Sydney, Australia; 2000:1139-1142.

  37. Kylberg G: FLBP and SLBP implementations for Matlab. 2013.

  38. Ghita O, Ilea D, Fernandez A, Whelan P: Local binary patterns versus signal processing texture analysis: a study from a performance evaluation perspective. Sensor Rev 2012, 32(2):149-162. 10.1108/02602281211209446


Acknowledgements

The authors would like to thank Vladimir Ćurić for his input on notations. Some of the computations were performed on resources provided by SNIC through Uppsala Multidisciplinary Center for Advanced Computational Science (UPPMAX) under Project p2012012. This study is part of the MiniTEM E!6143 project funded by EU and EUREKA through the Eurostars Programme.

Author information


Corresponding author

Correspondence to Ida-Maria Sintorn.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Kylberg, G., Sintorn, IM. Evaluation of noise robustness for local binary pattern descriptors in texture classification. J Image Video Proc 2013, 17 (2013).
