Multiscale texture retrieval based on low-dimensional and rotation-invariant features of curvelet transform
- Bulent Cavusoglu^{1}
https://doi.org/10.1186/1687-5281-2014-22
© Cavusoglu; licensee Springer. 2014
Received: 17 August 2013
Accepted: 1 April 2014
Published: 11 April 2014
Abstract
Multiscale-based texture retrieval algorithms generally use low-dimensional feature sets, but their retrieval performance falls short of the state-of-the-art techniques in the literature. The main motivation of this study is to use low-dimensional multiscale features whose retrieval performance is comparable with that of the state-of-the-art techniques. The proposed features are low-dimensional, robust against rotation, and perform better than both the earlier multiresolution-based algorithms and the state-of-the-art techniques with low-dimensional feature sets. They are obtained through the curvelet transform. Rotation invariance is provided by a novel principal orientation alignment based on the cross energies of adjacent curvelet blocks. The curvelet block pair with the highest cross energy is marked as the principal orientation, and the rest of the blocks are cycle-shifted around it. Two separate rotation-invariant feature vectors are proposed and evaluated. The first has 84 elements and contains the mean and standard deviation of the curvelet blocks at each angle, together with a weighting factor based on the spatial support of the curvelet coefficients. The second has 840 elements and contains the kernel density estimation (KDE) of the curvelet blocks at each angle. The first and second feature vectors are used to classify textures with a nearest neighbor algorithm under the Euclidean and Kullback-Leibler distance measures, respectively. The proposed method is evaluated on well-known databases: Brodatz; TC10, TC12-t184, and TC12-horizon of Outex; UIUCTex; and KTH-TIPS. The best performance is obtained with the kernel density feature vector. The mean and standard deviation feature vector provides similar performance at lower complexity owing to its smaller dimension. The results are reported as both precision-recall curves and classification rates and are compared with existing state-of-the-art texture retrieval techniques. Several experiments show that the proposed rotation-invariant feature vectors outperform earlier multiresolution-based ones and perform comparably with the rest of the literature despite their considerably smaller dimensions.
Keywords
Texture retrieval; Low dimension; Multiresolution; Curvelet transform; Rotation invariance; Principal orientation
1. Introduction
Texture classification and retrieval have been investigated by many researchers. Recognizing textures is essential in content-based image retrieval (CBIR) applications since images are composed of many texture combinations. Unfortunately, textures rarely exist in a fixed orientation and scale. Hence, defining rotation-invariant features is important, and rotation invariance has been an active research topic since the 1980s. In one of the early works [1], rotation-invariant matched filters are used for rotation-invariant pattern recognition. The authors of [2] applied a model-based approach in which they used statistical features of textures for classification. Using the statistics of spatial features as in [1, 2] may provide good results; however, it may include great interclass variations depending on the recording conditions of the textures, such as contrast and illumination. Hence, multiscale techniques, which can represent a feature at one or more resolutions with less sensitivity to these recording conditions, have been used since the 1990s. The main idea behind multiscale analysis in image processing is to provide views of the same image at different resolutions in order to enhance features that are more apparent at a specific resolution. In this way, it is easier to analyze or classify the image based on certain properties and certain scales. Nonstationary structures such as images require their multiscale transforms to be well localized in both time and frequency. However, according to Heisenberg's uncertainty principle, a signal cannot be localized arbitrarily well in time and frequency simultaneously. In other words, one cannot assign a particular frequency to a single point in time. Hence, frequency localization requires the signal to be observed over a particular time window. It is also important that these localizations can be performed over orthogonal bases or tight frames. Wavelets [3] can address all these requirements. 
They are generated from one mother wavelet through translations and scalings. In one of the earliest works [4], the authors used the statistics of Gabor wavelets as features for multiscale texture retrieval on the Brodatz database. However, the effects of rotation are not considered in that work. Another drawback is that the wavelet transform captures singularities only around a point, so textures with curvature-like structures may not be represented well. Other transforms, such as ridgelets [5], which extend wavelets to capture singularities along a line, and curvelets [6, 7], which can capture singularities along a curve, are proposed to overcome such issues. One promising property of curvelets is that they can represent an edge along a curve with very few coefficients. This creates new opportunities in image processing. Curvelets are also used in texture retrieval [8]; however, rotation invariance is not considered in [8]. Rotation invariance in the multiscale framework was first investigated in [9] for Gabor wavelet features. In a similar work [10], the authors used Gaussianized steerable pyramids to provide rotation-invariant features. Wavelet-based rotation invariance is introduced in [11] using rotated complex wavelet filters and in [12] using wavelet-based hidden Markov trees. These works show the effectiveness of their methods in terms of average performance. The details of their results also reveal that textures with curvature-like structures perform worse than other textures. Hence, curvelets are a good alternative for overcoming such issues. However, the authors of [8] realized that the curvelet transform is very orientation-dependent and sensitive to rotation. 
Then, they provided rotation-invariant curvelet features in [13, 14] by comparing the energies of the curvelet coefficients and realigning the curvelet blocks by cycle-shifting them with reference to the highest-energy curvelet block. They showed that this scheme is greatly advantageous compared with rotation-variant curvelet features. They also showed that their features provide better results than wavelets and rotation-invariant Gabor filters. However, the authors of [15] indicated that the method of [13, 14] does not work for all images, and they proposed another method based on modeling the curvelet coefficients as generalized Gaussian distributions (GGD) and then measuring distance with the Kullback-Leibler divergence between the statistical parameters of the curvelets. It should be noted that they also use the highest-energy curvelet block for circular shifting, except that they use only one reference point instead of a different reference point for each scale. This approach may provide good fits for higher scales of curvelet coefficients; however, the lower-level curvelet coefficients tend not to behave as Gaussian. In this study, we investigate the distributions of the curvelet coefficients and use kernel density estimation (KDE), which provides better fits for lower scales as well. Although the complexity increases with density estimation, better results are obtained. There are also recent, comprehensive works in texture retrieval that address both scale invariance and rotation invariance. For instance, in [16], the Harris-Laplace detector [17] is used for salient region detection, the scale-invariant feature transform (SIFT) [18] is used to provide scale invariance, and the rotation-invariant feature transform (RIFT) [19] is used for rotation invariance. 
The results are good; however, the feature vectors are considerably large: 5,120 (40 × 128) for the SIFT descriptor with earth mover's distance (EMD). In [20], local binary pattern (LBP) variance is used for rotation invariance: two principal orientations are found, and local binary pattern variances are used for texture retrieval. The feature dimensions of [20] with feature reduction are on the order of thousands. In [21], scale and rotation invariance are considered together using LBP, which again provides promising results with feature sizes similar to those of other LBP variants.
The main motivation of this study is to provide good retrieval performance with low-dimensional feature sets. The multiresolution methods in the literature have low-dimensional feature sets but do not reach the desired range of performance. In this study, we provide low-dimensional, rotation-invariant multiresolution features with good retrieval performance by using the curvelet transform. First, a novel method based on the cross energy principle is introduced for obtaining rotation-invariant curvelet features. Second, the low-dimensional feature set based on the mean and standard deviation of curvelet coefficients, used in the literature [13, 14], is modified to reflect its support region. The size of this feature vector is 84, and the performance gain from this modification is also shown. Third, we use kernel density estimation, a nonparametric density estimation technique, to estimate the densities of the curvelet coefficients and use the symmetric Kullback-Leibler distance as the distance measure. Although this feature set has a higher dimension, 840, it provides better results and still remains in the low-complexity region compared with the other methods in the literature. It is shown through experiments that the results of the proposed feature sets are better than those of the state-of-the-art techniques in low dimensions and comparable in medium-dimension feature sets.
The organization of the paper is as follows. Multiresolution transforms are introduced in Section 2. Section 3 explains the proposed texture retrieval scheme. The proposed rotation invariance method is provided in Section 4, and classification is explained in Section 5. The experimental results are presented in Section 6. Section 7 includes discussions and comparisons with state-of-the-art texture retrieval techniques. Finally, Section 8 includes conclusions.
2. Background
Multiscale transforms are widely used in CBIR and texture retrieval. Hence, in order to better appreciate and understand the multiscale transforms, especially the curvelet transform, we briefly define wavelets, ridgelets, and curvelet transforms in this section.
2.1. Wavelets
2.2. Ridgelets
2.3. Curvelets
3. Proposed texture retrieval scheme
3.1. Feature extraction
3.2. Mean standard deviation feature vector F_{ μσ }
The images we use are either 128 × 128 or converted to 128 × 128 in the preprocessing stage during our work, and the feature vector used in this study has five scales. Considering 8 angles at 2nd, 16 angles at 3rd and 4th, and 1 for 1st and 5th scales, the size of the feature vector is (1 + 8 + 16 + 16 + 1) × 2 = 84.
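The dimension count above can be checked with a short Python sketch. The names `ANGLES_PER_SCALE`, `feature_dimension`, and `mean_std_features` are hypothetical and introduced only for illustration; the second helper computes an unweighted mean and standard deviation per block and deliberately omits the spatial-support weighting, so it is not the exact feature of this study.

```python
import numpy as np

# Number of orientation blocks at scales 1 to 5, as stated in the text.
ANGLES_PER_SCALE = [1, 8, 16, 16, 1]


def feature_dimension(angles_per_scale, values_per_block):
    """Total feature length when each curvelet block contributes a fixed number of values."""
    return sum(angles_per_scale) * values_per_block


def mean_std_features(blocks):
    """Unweighted (mean, std) pair for every block, flattened into one vector.

    Assumption: real-valued coefficient blocks; the weighting factor used in
    the study is not applied here.
    """
    return np.array([[np.mean(b), np.std(b)] for b in blocks]).ravel()
```

With two values per block, `feature_dimension(ANGLES_PER_SCALE, 2)` reproduces the 84 elements of the mean and standard deviation vector.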
3.3. Kernel density feature vector F_{ KDE }
where K is the kernel function and h is the smoothing parameter called the bandwidth. The kernel function used in this study is the normal kernel with zero mean and unit variance. A kernel is placed on each data point, and the sum is normalized over the data to obtain the kernel estimate. A more in-depth analysis of KDE is given in [23]. The histogram of the curvelet coefficients, the corresponding Gaussian fit, and the KDE are shown in Figure 6. As can be seen from the figure, KDE provides a much better fit than the Gaussian. The non-Gaussian structure of the curvelet coefficients can be observed for the second-level coefficients of the sample image in Figure 6. We evaluate the kernel density at 20 bins, resulting in a feature vector dimension of 840 (42 × 20).
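A minimal Python sketch of a Gaussian kernel density estimate evaluated at 20 points is given below. It assumes real-valued, non-constant coefficients. The bandwidth rule (Silverman's rule of thumb) and the equally spaced evaluation grid between the minimum and maximum coefficient are assumptions made for illustration, since the text does not specify them; `gaussian_kde_bins` is a hypothetical name.

```python
import numpy as np


def gaussian_kde_bins(coeffs, n_bins=20, bandwidth=None):
    """Evaluate a normal-kernel density estimate at n_bins points.

    Assumptions: Silverman's rule of thumb when no bandwidth is given, and an
    equally spaced grid spanning the range of the data.
    """
    x = np.asarray(coeffs, dtype=float).ravel()
    n = x.size
    if bandwidth is None:
        bandwidth = 1.06 * x.std(ddof=1) * n ** (-1 / 5)
    grid = np.linspace(x.min(), x.max(), n_bins)
    # One standard normal kernel centered on each data point.
    u = (grid[:, None] - x[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    density = kernel.sum(axis=1) / (n * bandwidth)
    return grid, density
```

Applying such a function to each of the 42 curvelet blocks and concatenating the 20 density values per block would give a vector of 840 elements.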
4. Rotation invariance
4.1. Effect of rotation on curvelet transform
This nonuniformity can be observed in Figure 8c,d. To overcome this issue, we first propose to find the area of the image that is most robust against rotation based on the curvelet transform and mark it as the principal orientation; we then perform an alignment by cycle-shifting the feature vector with reference to the principal orientation. To find the least affected rotation angle, we check the cross-correlation of the curvelet coefficients at two adjacent angles at each scale.
4.2. Principal orientation detection
In order to minimize the effect of rotation in the texture, it is necessary to find a reference point, namely, the principal orientation, so that all feature vectors can be synchronized by reordering the features. The rotation dependence is expected to be eliminated after the synchronization. The authors of [13, 14] suggest a synchronization routine based on the curvelet block with the maximum energy. We propose to use the cross energy of adjacent curvelet blocks for principal orientation detection, and the procedure is explained in the following subsection.
4.3. Cross-correlation and cross energy of curvelet coefficients at adjacent angles
The cross-correlation function reflects the cross energies at different lags. In obtaining the latter curvelet coefficient on the right-hand side of Equation 17, only a rotation is applied to the curvelet operator while the image stands still. Also, as can be seen from Equation 9, this rotation operator is not supposed to cause a lag in the latter coefficient. Hence, the maximum value of the cross-correlation function is expected at the 0th lag, that is, R_{s,ℓ}(0, 0). As a result, Equation 17 can be used to detect the highest cross-energy blocks. Another view is as follows: by analyzing the adjacent blocks of the curvelet transform in terms of their cross-correlation, one may find, for each scale, the orientation that is least affected by rotation. In other words, a high correlation between two adjacent blocks means that the directional change has little effect on the curvelet coefficients for those two orientations. In short, if the curvelet coefficients of two adjacent blocks of an image at a specific orientation give the highest correlation, the corresponding blocks will also give the highest correlation for a rotated version of the original texture. The proposed method is built on this approach. Since rotating the curvelet operator and rotating the image have the same effect, the observed angle between the curvelet operator and the image for the highest correlation value remains fixed. Based on this principle, we determine the fixed angle by searching for the highest cross-correlation, take the first of the two highest cross-energy (correlated) blocks as the principal block (orientation), and then cycle-shift all the coefficients with reference to the principal orientation. Hence, this operation provides an alignment based on the highest cross-energy principle. 
The cross-correlation functions are obtained for all scales except the coarsest and the finest, which have only one coefficient matrix each. The curvelet coefficients are then aligned with reference to the highest 0th-lag value of the cross-correlations in each scale. Coefficient matrices of adjacent orientations generally differ in dimension. If there are not enough coefficients to match the larger coefficient block, the smaller block is padded with zeros. This zero-filling resolves the dimension mismatch and does not affect the cross energy.
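The alignment procedure for a single scale can be sketched in Python as follows. This is an illustration under stated assumptions, not the exact formulation of Equation 17: the cross energy is taken as the zero-lag sum of elementwise products of real-valued blocks, the smaller block is zero-padded at its bottom and right edges, and the last block is paired with the first one circularly. All function names are hypothetical.

```python
import numpy as np


def zero_pad_to(block, shape):
    """Place a block in the top-left corner of a zero matrix (assumed padding layout)."""
    out = np.zeros(shape, dtype=float)
    out[:block.shape[0], :block.shape[1]] = block
    return out


def cross_energy(a, b):
    """Zero-lag cross-correlation of two blocks after zero-padding to a common size."""
    shape = (max(a.shape[0], b.shape[0]), max(a.shape[1], b.shape[1]))
    return float(np.sum(zero_pad_to(a, shape) * zero_pad_to(b, shape)))


def principal_orientation(blocks):
    """Index of the first block in the adjacent pair with the highest cross energy."""
    n = len(blocks)
    energies = [cross_energy(blocks[i], blocks[(i + 1) % n]) for i in range(n)]
    return int(np.argmax(energies))


def align(features, po):
    """Cycle-shift per-block features so that the principal orientation comes first."""
    return np.roll(features, -po, axis=0)
```

If the block order of one scale is cyclically shifted, as a rotation of the texture would ideally do, the detected index shifts by the same amount, so the aligned feature order is unchanged.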
4.4. Closer look at principal orientation alignment based on cross energy
It should also be noted that the curvelet coefficients of the unrotated and rotated images show some differences even after principal orientation alignment. This can be observed by comparing the first and second columns of Figure 12. The curvelet coefficients of these images may be similar, but they are unlikely to be identical. Hence, the purpose of the alignment is to make the curvelet coefficients of the two images as comparable as possible.
4.5. PO-aligned feature vectors
The mean standard deviation, F_{ μσ }, and kernel density, F_{KDE}, feature vectors are aligned according to the principal orientation once it has been detected. The aligned feature vectors are cycle-shifted versions of the initial ones. The PO-aligned mean-standard deviation feature vector and kernel density feature vector are denoted as ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ and ${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$, respectively. The rotation-invariant mean-standard deviation feature vector without scaling, ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathbf{\prime},\mathrm{\text{PO}}}$, is also used in our simulations for comparison purposes. The proposed PO-aligned feature vectors are used in the classification process in this study.
5. Classification
Classification is performed with a nearest neighbor (NN) classifier. In NN, the query image is compared against the training images of all classes and is assigned to the class of the training image at minimum distance. A separate distance measure is used for each proposed feature vector: the Euclidean distance with the mean and standard deviation feature vector and the Kullback-Leibler distance with the kernel density feature vector.
5.1. Distance measures
Euclidean distance
where ${\mathbf{F}}_{\mathit{i},\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ and ${\mathbf{F}}_{\mathit{j},\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ are the feature vectors of the query image (the ith image of the database) and the training image (the jth image of the database), respectively.
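Nearest neighbor classification with the Euclidean distance can be sketched in a few lines of Python. The function name and the array layout (one training feature vector per row) are assumptions made for illustration.

```python
import numpy as np


def nearest_neighbor_label(query, train_features, train_labels):
    """Return the label of the training vector closest to the query (Euclidean distance).

    train_features: array of shape (n_train, n_features), one row per training image.
    """
    distances = np.linalg.norm(np.asarray(train_features) - np.asarray(query), axis=1)
    return train_labels[int(np.argmin(distances))]
```

For the mean and standard deviation vector, each row would hold the 84 PO-aligned elements of one training image.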
Symmetric Kullback-Leibler distance
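As a hedged illustration, the standard symmetric form of the Kullback-Leibler divergence between two discretized densities p and q, the sum of KL(p‖q) and KL(q‖p), can be computed as below. The normalization of the inputs and the small constant `eps` that guards against zero bins are assumptions of this sketch, and the exact expression used in the study may differ.

```python
import numpy as np


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric Kullback-Leibler distance KL(p||q) + KL(q||p) for discrete densities.

    Assumptions: inputs are nonnegative; they are renormalized to sum to one,
    and eps is added to avoid division by zero and log(0).
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    # sum p*log(p/q) + sum q*log(q/p) simplifies to sum (p - q)*log(p/q).
    return float(np.sum((p - q) * np.log(p / q)))
```

The measure is zero for identical densities and, unlike the one-sided divergence, does not depend on the order of its arguments.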
6. Experimental results
- (a) Training images: They are selected randomly from each class of each database. The number of training images is varied from 10 to 70 in increments of 10. The results are reported separately for each number of training images.
- (b) Query images: Training images are excluded from the database, and the remaining images are used as queries. The average classification and precision-recall results are reported.
- (c) Brodatz database: The database is proposed in [24] and includes 112 classes, each with 192 images. To create a large enough database with translations and rotations, nonrotated test images are first created by dividing each original 512 × 512 image into 16 nonoverlapping 128 × 128 regions; then, 12 rotated test images are obtained from each region at multiples of 30°. The 30° step is chosen to obtain results comparable with [24], which uses the same database with the same setup. A database of 21,504 images (112 × 16 × 12) is constructed in this way.
- (d) Outex TC10 database: The database is proposed in [25] and includes 24 classes, each with 180 images. The images are recorded under incandescent (inca) illumination. Each class consists of 20 nonoverlapping portions of the same texture at 9 different orientations (0°, 5°, 10°, 15°, 30°, 45°, 60°, 75°, and 90°). The database includes a total of 4,320 images (24 × 20 × 9).
- (e) Outex TC12-horizon database: The database is proposed in [25] and includes 24 classes, each with 180 images. The setup is the same as that of the Outex TC10 database except that the images are recorded under horizon (horizon sunlight) illumination.
- (f) Outex TC12-t184 database: The database is proposed in [25] and includes 24 classes, each with 180 images. The setup is the same as that of the Outex TC10 database except that the images are recorded under t184 (fluorescent 184) illumination.
- (g) KTH-TIPS database: The database is proposed in [26] and includes 10 classes, each with 81 images. The images are recorded under varying illumination, pose, and scale. The database includes a total of 810 images (10 × 81).
- (h) UIUCTex database: The database is proposed in [19] and includes 25 classes, each with 40 images. The images include significant scale and viewpoint variations as well as rotations. The database includes a total of 1,000 images (25 × 40).
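The tiling step of the Brodatz setup and the quoted database sizes can be checked with a short Python sketch. The helper `tile_image` is hypothetical and covers only the division into nonoverlapping regions; the rotation step is not reproduced because the text does not specify how the rotated regions are generated.

```python
import numpy as np


def tile_image(image, tile=128):
    """Split a 2-D image into nonoverlapping tile x tile regions, row by row."""
    height, width = image.shape
    return [
        image[r:r + tile, c:c + tile]
        for r in range(0, height - tile + 1, tile)
        for c in range(0, width - tile + 1, tile)
    ]


# Database sizes stated in the list above: classes x regions x orientations.
brodatz_total = 112 * 16 * 12
outex_tc10_total = 24 * 20 * 9
```

A 512 × 512 image yields 16 regions of 128 × 128, and the two products evaluate to 21,504 and 4,320, matching the totals given for Brodatz and Outex TC10.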
The experimental results are reported under two main performance measures: precision-recall curves and classification accuracies. Studies in the literature make use of both. To make our work easily comparable with future works as well as with the literature, we provide our results under both categories. To isolate the effect of principal orientation alignment and the performance of the two feature vectors of this study, the results of the proposed methods are generally compared with only one reference from the literature. The results of [13] are used for this general comparison since the authors of [13] also use curvelet features. We make a broader comparison with the literature in the discussion section.
6.1. Precision-recall curves
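For reference, precision and recall for a single query can be computed from a ranked retrieval list as in the Python sketch below. It uses the standard definitions (precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that are retrieved); the function name is hypothetical, and the averaging over queries used for the reported curves is not specified here.

```python
import numpy as np


def precision_recall(ranked_labels, query_label, n_relevant):
    """Precision and recall after each position of a ranked retrieval list.

    ranked_labels: class labels of the retrieved images, best match first.
    n_relevant: number of database images that share the query's class.
    """
    hits = np.cumsum(np.asarray(ranked_labels) == query_label)
    retrieved = np.arange(1, len(ranked_labels) + 1)
    return hits / retrieved, hits / n_relevant
```

A curve is then obtained by plotting precision against recall as the number of retrieved images grows.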
6.2. Classification rates
Classification rates (%)
Training images | Feature vector | TC10 | TC12-horizon | TC12-t184 | KTH-TIPS | Brodatz | UIUCTex |
---|---|---|---|---|---|---|---|
Train 10 | ${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 91.94 | 92.06 | 90.24 | 80.70 | 95.39 | 65.63 |
Train 10 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 91.15 | 91.48 | 88.37 | 75.41 | 92.73 | 65.21 |
Train 10 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathbf{\prime},\mathrm{\text{PO}}}$ | 89.47 | 89.39 | 85.78 | 69.21 | 90.86 | 52.10 |
Train 10 | F ^{[13]} | 82.97 | 82.48 | 78.23 | 68.11 | 82.27 | 46.82 |
Train 20 | ${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 94.87 | 95.66 | 93.59 | 86.69 | 97.24 | 71.41 |
Train 20 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 94.80 | 95.15 | 93.32 | 82.56 | 95.81 | 70.36 |
Train 20 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathbf{\prime},\mathrm{\text{PO}}}$ | 92.83 | 92.81 | 90.22 | 76.36 | 94.23 | 57.95 |
Train 20 | F ^{[13]} | 88.72 | 87.94 | 84.78 | 76.20 | 88.69 | 51.73 |
Train 30 | ${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 96.60 | 96.64 | 95.25 | 88.31 | 98.02 | 74.08 |
Train 30 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 96.12 | 96.50 | 95.06 | 87.92 | 96.88 | 73.79 |
Train 30 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathbf{\prime},\mathrm{\text{PO}}}$ | 94.62 | 94.46 | 91.91 | 79.57 | 95.72 | 61.81 |
Train 30 | F ^{[13]} | 90.77 | 90.99 | 87.39 | 77.37 | 91.08 | 53.95 |
Train 40 | ${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 97.53 | 97.53 | 95.94 | 90.83 | 98.47 | - |
Train 40 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 96.86 | 97.49 | 95.86 | 89.42 | 97.70 | - |
Train 40 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathbf{\prime},\mathrm{\text{PO}}}$ | 95.24 | 95.41 | 92.85 | 83.27 | 96.71 | - |
Train 40 | F ^{[13]} | 92.58 | 92.36 | 88.89 | 82.15 | 93.04 | - |
Train 50 | ${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 97.82 | 98.05 | 96.62 | 92.19 | 98.91 | - |
Train 50 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 97.60 | 97.82 | 96.46 | 91.10 | 98.05 | - |
Train 50 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathbf{\prime},\mathrm{\text{PO}}}$ | 95.90 | 95.99 | 93.74 | 84.32 | 97.53 | - |
Train 50 | F ^{[13]} | 93.30 | 93.20 | 90.23 | 82.77 | 94.01 | - |
Train 60 | ${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 98.00 | 98.40 | 97.20 | 92.84 | 99.04 | - |
Train 60 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 97.73 | 98.22 | 96.92 | 91.52 | 98.37 | - |
Train 60 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathbf{\prime},\mathrm{\text{PO}}}$ | 96.16 | 96.40 | 94.54 | 84.00 | 97.64 | - |
Train 60 | F ^{[13]} | 94.10 | 93.74 | 91.03 | 83.14 | 94.76 | - |
Train 70 | ${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 98.33 | 98.46 | 97.44 | 93.09 | 99.18 | - |
Train 70 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 98.22 | 98.29 | 97.28 | 92.36 | 98.72 | - |
Train 70 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathbf{\prime},\mathrm{\text{PO}}}$ | 96.52 | 96.44 | 94.86 | 86.00 | 98.10 | - |
Train 70 | F ^{[13]} | 94.40 | 94.63 | 91.79 | 85.64 | 95.34 | - |
7. Discussion
Comparison of classification rates with KLD of [15] for 60 and 70 training images
Classification | KTH-TIPS (Train 60) | KTH-TIPS (Train 70) |
---|---|---|
${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 92.84 | 98.40 |
${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 91.52 | 98.22 |
KLD of [15] | 83.60 | 86.90 |
Comparison of classification rates with LBP variants of [20] for 20 training images
Classification | Feature dimension | TC10 | TC12-horizon | TC12-t184 |
---|---|---|---|---|
${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 840 | 94.87 | 95.66 | 93.59 |
${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 84 | 94.80 | 95.15 | 93.32 |
${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ out of class | 840 | 94.87 | 87.35 | 86.25 |
${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ out of class | 84 | 94.80 | 86.73 | 85.51 |
${\mathrm{\text{LBP}}}_{24,3}^{\mathit{\text{riu}}2}/{\mathrm{\text{VAR}}}_{24,3}$[20] | 416 | 98.15 | 87.03 | 87.15 |
${\mathrm{\text{LBP}}}_{8,1}^{\mathit{\text{riu}}2}/{\mathrm{\text{VAR}}}_{8,1}$[20] | 160 | 96.66 | 77.98 | 79.25 |
${\mathrm{\text{LBP}}}_{24,3}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{\text{ES}}}$[20] | 13251 | 97.76 | 95.57 | 95.39 |
${\mathrm{\text{LBP}}}_{8,1}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{\text{ES}}}$[20] | 451 | 73.64 | 76.57 | 72.47 |
${\mathrm{\text{LBPV}}}_{24,3}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{PD}2}$[20] | 2211 | 97.55 | 94.18 | 94.23 |
${\mathrm{\text{LBPV}}}_{8,1}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{PD}2}$[20] | 227 | 72.99 | 76.15 | 72.19 |
Comparison of classification rates with [16] for indicated training numbers
Classification | Feature size | UIUCTex | KTH-TIPS | Brodatz |
---|---|---|---|---|
${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$ | 840 | 71.41 | 90.83 | 92.00 |
${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ | 84 | 70.36 | 89.42 | 90.78 |
HSR + LSR SIFT [16] | 40 × 128 = 5,120 | 98.00 | 92.70 | 94.00 |
HSR + LSR RIFT [16] | 40 × 100 = 4,000 | 96.00 | 86.70 | 89.60 |
Performance comparison of the proposed algorithms with the literature
No. | Algorithm | Feature size | Comput. complexity of feature | Class. method | Better than the proposed algorithms | Comparable with the proposed algorithms | Worse than the proposed algorithms |
---|---|---|---|---|---|---|---|
1 | ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathbf{\prime}}$ | 84 | Low | NN | - | - | a,b,c,d,e,f |
2 | F ^{[13]} | 84 | Low | NN | - | - | a,b,c,d,e,f |
3 | ${\mathrm{\text{LBP}}}_{8,1}^{\mathit{\text{riu}}2}/{\mathrm{\text{VAR}}}_{8,1}$[20] | 160 | Medium | NN | b | - | c,d |
4 | ${\mathrm{\text{LBPV}}}_{8,1}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{PD}2}$[20] | 227 | Medium | NN | - | - | b,c,d |
5 | ${\mathrm{\text{LBP}}}_{24,3}^{\mathit{\text{riu}}2}/{\mathrm{\text{VAR}}}_{24,3}$[20] | 416 | Medium | NN | b | c,d | - |
6 | ${\mathrm{\text{LBP}}}_{8,1}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{\text{ES}}}$[20] | 451 | Medium | NN | - | - | b,c,d |
7 | KLD [15] | 840 | Medium | NN | - | - | a,e |
8 | ${\mathrm{\text{LBPV}}}_{24,3}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{PD}2}$[20] | 2,211 | High | NN | b,c,d | - | - |
9 | ${\mathrm{\text{LBP}}}_{24,3}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{\text{ES}}}$[20] | 13,251 | High | NN | b,c,d | - | - |
10 | HSR + LSR [16] RIFT | 4,000 | High | SVM | f | - | a,e |
11 | HSR + LSR [16] SIFT | 5,120 | High | SVM | a,e,f | - | - |
Table 5 compares the proposed algorithms with the rest of the literature in terms of performance, dimension, and complexity on the related databases. The algorithms in rows 1, 2, and 7 are based on the curvelet transform, which has been shown to outperform earlier multiscale-based texture classification methods. It is clear from the table that although they have feature sizes and complexities similar to those of the proposed algorithms, the proposed algorithms outperform all of them. The variants of LBP proposed in [20] are given in the table as well. For a dimension of 160, the ${\mathrm{\text{LBP}}}_{8,1}^{\mathit{\text{riu}}2}/{\mathrm{\text{VAR}}}_{8,1}$ algorithm provides worse results than the proposed algorithms on the TC12-horizon and TC12-t184 databases but a better result on the TC10 database. The performance of the proposed algorithms is better than that of ${\mathrm{\text{LBPV}}}_{8,1}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{PD}2}$, whose dimension is 227, on all of the compared databases. ${\mathrm{\text{LBPV}}}_{24,3}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{PD}2}$ and ${\mathrm{\text{LBP}}}_{24,3}^{\mathit{u}2}{\mathrm{\text{GM}}}_{\mathrm{\text{ES}}}$, whose dimensions are 2,211 and 13,251, respectively, provide better results than the proposed algorithms at the high cost of increased feature size. The algorithms of [16] are given in rows 10 and 11. It should be noted that their classification is based on SVM, an algorithm with higher computational complexity. Moreover, their feature vectors have higher dimensions than those of the proposed algorithms. Even though their algorithm provides better results for HSR + LSR + SIFT, the proposed algorithms outperform HSR + LSR + RIFT on the Brodatz and KTH-TIPS databases. This result is suspected to arise from the reduced feature size and the less satisfactory performance of RIFT on a scale-variant database (KTH-TIPS). 
In general, the proposed algorithms of this study outperform all of the multiscale-based texture classification algorithms. They also outperform LBP variants of [20] with small dimensions. The performance of the algorithm of [20] with high dimensions is better as expected. Proposed algorithms also outperform the algorithms of [16] in smaller dimensions especially in rotation-variant databases.
Finally, we mention one of the latest works in the rotation and scale-invariant texture retrieval published by Li et al. [21]. They provide scale invariance by finding optimal scale of each pixel. They modify Outex and Brodatz databases to include enough scale and rotation variations and report their results on these databases. For scale and rotation invariance feature, they report average precision rates around 69% for Brodatz and 60% for Outex database. Since they use a modified database, including this database will extend the scope of this study considerably, and we are leaving the scale invariance and the comparison with their database as our future work.
8. Conclusions
In this study, low-dimensional and rotation-invariant curvelet features for multiscale texture retrieval are proposed through two feature vectors. To the best of our knowledge, the study provides the best results reported in the literature for multiscale texture retrieval. Moreover, the results are comparable with those of the state-of-the-art techniques in the low and medium feature dimension ranges. Rotation invariance is provided by using the cross energies of curvelet blocks at adjacent orientations. The pair of adjacent orientations with the maximum cross energy is defined as the principal orientation of an image, which is the location least affected by rotation. This location is selected as the reference point for the image, and the feature vector is cycle-shifted based on this reference point. The feature vector ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ has 84 elements. The other proposed feature vector, ${\mathbf{F}}_{\mathrm{\text{KDE}}}^{\mathrm{\text{PO}}}$, uses KDE and has 840 elements. It provides better results than ${\mathbf{F}}_{\mathit{\text{\mu \sigma}}}^{\mathrm{\text{PO}}}$ at the expense of increased complexity. The texture retrieval results of the proposed method are better than those of earlier works that use other rotation-invariant curvelet features. In summary, this study contributes a novel rotation invariance method for curvelets and two separate feature vectors for texture retrieval. The results suggest that the proposed features have high discriminative power for texture retrieval, and the comparisons with the literature show that they achieve good performance with low complexity. Adding scale invariance to the curvelet features may provide better results. Thus, we plan to extend this study to scale-invariant features of the curvelet transform as our future work.
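The principal orientation alignment summarized above can be illustrated with a minimal Python sketch. It is not the implementation used in this study. It assumes that the curvelet blocks of one scale have already been computed and resampled to a common shape, and it takes the cross energy of two adjacent blocks to be the sum of the element-wise products of their coefficient magnitudes. That definition, the function names `principal_orientation` and `aligned_mean_std`, and the omission of the spatial-support weighting factor are all assumptions made for illustration.

```python
import numpy as np


def principal_orientation(blocks):
    """Return k such that blocks k and k+1 (cyclically) have the highest cross energy.

    `blocks` is a list of equally shaped arrays, one per orientation.
    The cross-energy definition below is an assumption made for this sketch.
    """
    n = len(blocks)
    cross = [
        np.sum(np.abs(blocks[k]) * np.abs(blocks[(k + 1) % n]))
        for k in range(n)
    ]
    return int(np.argmax(cross))


def aligned_mean_std(blocks):
    """Mean and standard deviation per orientation, cycle-shifted so that
    the principal orientation comes first."""
    k = principal_orientation(blocks)
    feats = np.array(
        [[np.mean(np.abs(b)), np.std(np.abs(b))] for b in blocks]
    )
    # Shift the rows so that row k becomes row 0, then flatten.
    return np.roll(feats, -k, axis=0).ravel()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    blocks = [rng.standard_normal((8, 8)) for _ in range(16)]
    # Model an image rotation as a cyclic shift of the orientation index.
    rotated = blocks[5:] + blocks[:5]
    same = np.allclose(aligned_mean_std(blocks), aligned_mean_std(rotated))
    print("features unchanged under cyclic shift:", same)
```

In this simplified model, a rotation of the image only permutes the orientation index cyclically, so aligning to the principal orientation removes the shift and the two feature vectors coincide. With real curvelet coefficients the match is approximate rather than exact.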
References
- Arsenault HH, Hsu YN, Chalasinska-Macukow K: Rotation-invariant pattern-recognition. Opt. Eng. 1984, 23: 705-709.
- Kashyap RL, Khotanzad A: A model-based method for rotation invariant texture classification. IEEE T. Pattern Anal. 1986, 8: 472-481.
- Mallat SG: A theory for multiresolution signal decomposition - the wavelet representation. IEEE T. Pattern Anal. 1989, 11: 674-693. 10.1109/34.192463
- Manjunath BS, Ma WY: Texture features for browsing and retrieval of image data. IEEE T. Pattern Anal. 1996, 18: 837-842. 10.1109/34.531803
- Do MN, Vetterli M: The finite ridgelet transform for image representation. IEEE T. Image Process. 2003, 12: 16-28. 10.1109/TIP.2002.806252
- Candes EJ, Donoho DL: Curvelets, multiresolution representation, and scaling laws. Wavelet Appl. Signal Image Process. VIII, Pts 1 and 2 2000, 4119: 1-12.
- Candes EJ, Donoho DL: New tight frames of curvelets and optimal representations of objects with piecewise C-2 singularities. Commun. Pur. Appl. Math. 2004, 57: 219-266. 10.1002/cpa.10116
- Sumana IJ, Islam M, Zhang DS, Lu GJ: Content based image retrieval using curvelet transform, vol. 1 and 2 (2008 IEEE 10th Workshop on Multimedia Signal Processing, 2008). Queensland, Australia; 2008:11-16.
- Haley GM, Manjunath BS: Rotation-invariant texture classification using a complete space-frequency model. IEEE T. Image Process. 1999, 8: 255-269. 10.1109/83.743859
- Tzagkarakis G, Beferull-Lozano B, Tsakalides P: Rotation-invariant texture retrieval with Gaussianized steerable pyramids. IEEE T. Image Process. 2006, 15: 2702-2718.
- Kokare M, Biswas PK, Chatterji BN: Rotation-invariant texture image retrieval using rotated complex wavelet filters. IEEE T. Syst. Man. Cy. B 2006, 36: 1273-1282.
- Rallabandi VR, Rallabandi VPS: Rotation-invariant texture retrieval using wavelet-based hidden Markov trees. Signal Process. 2008, 88: 2593-2598. 10.1016/j.sigpro.2008.04.019
- Zhang DS, Islam MM, Lu GJ, Sumana IJ: Rotation invariant curvelet features for region based image retrieval. Int. J. Comput. Vision 2012, 98: 187-201. 10.1007/s11263-011-0503-6
- Islam MM, Zhang DS, Lu GJ: Rotation invariant curvelet features for texture image retrieval, presented at ICME, vol. 1–3 (2009 IEEE International Conference on Multimedia and Expo). New York, NY, USA; 2009.
- Gomez F, Romero E: Rotation invariant texture characterization using a curvelet based descriptor. Pattern Recogn. Lett. 2011, 32: 2178-2186.
- Zhang J, Marszalek M, Lazebnik S, Schmid C: Local features and kernels for classification of texture and object categories: a comprehensive study. Int. J. Comput. Vision 2007, 73: 213-238. 10.1007/s11263-006-9794-4
- Mikolajczyk K, Schmid C: Scale & affine invariant interest point detectors. Int. J. Comput. Vision 2004, 60: 63-86.
- Lowe D: Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vision 2004, 60: 91-110.
- Lazebnik S, Schmid C, Ponce J: A sparse texture representation using local affine regions. IEEE T. Pattern Anal. Mach. Intel. 2005, 27: 1265-1278.
- Guo ZH, Zhang L, Zhang D: Rotation invariant texture classification using LBP variance (LBPV) with global matching. Pattern Recogn. 2010, 43: 706-719. 10.1016/j.patcog.2009.08.017
- Li Z, Liu GZ, Yang Y, You JY: Scale- and rotation-invariant local binary pattern using scale-adaptive texton and subuniform-based circular shift. IEEE T. Image Process. 2012, 21: 2130-2140.
- Lazebnik S, Schmid C, Ponce J: Beyond bags of features: spatial pyramid matching for recognizing natural scene categories, in Computer Vision and Pattern Recognition (IEEE Computer Society Conference on, 2006). New York, NY, USA; 2006:2169-2178.
- Silverman B: Density Estimation for Statistics and Data Analysis. London: Chapman & Hall; 1986.
- Brodatz P: Textures: A Photographic Album for Artists and Designers. New York: Dover; 1966.
- Ojala T, Maenpaa T, Pietikainen M, Viertola J, Kyllonen J, Huovinen S: Outex - new framework for empirical evaluation of texture analysis algorithms, in Pattern Recognition, 2002, vol. 1 (Proceedings. 16th International Conference on, 2002). Quebec City, QC, Canada; 2002:701-706.
- Hayman E, Caputo B, Fritz M, Eklundh J-O: On the significance of real-world conditions for material classification. In Computer Vision. Edited by: Pajdla T, Matas J. Berlin Heidelberg: Springer; 2004:253-266.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.