Concave-convex local binary features for automatic target recognition in infrared imagery

Sun, Junding; Fan, Guoliang; Yu, Liangjiang; Wu, Xiaosheng

doi:10.1186/1687-5281-2014-23

Research
Open access
Published: 21 April 2014

Concave-convex local binary features for automatic target recognition in infrared imagery

Junding Sun¹,
Guoliang Fan²,
Liangjiang Yu² &
…
Xiaosheng Wu¹

EURASIP Journal on Image and Video Processing volume 2014, Article number: 23 (2014) Cite this article

6650 Accesses
16 Citations
Metrics details

Abstract

This paper presents a novel feature extraction algorithm based on the local binary features for automatic target recognition (ATR) in infrared imagery. Since the inception of the local binary pattern (LBP) and local ternary pattern (LTP) features, many extensions have been proposed to improve their robustness and performance in a variety of applications. However, most attentions were paid to improve local feature extraction with little consideration on the incorporation of global or regional information. In this work, we propose a new concave-convex partition (CCP) strategy to improve LBP and LTP by dividing local features into two distinct groups, i.e., concave and convex, according to the contrast between local and global intensities. Then two separate histograms built from the two categories are concatenated together to form a new LBP/LTP code that is expected to better reflect both global and local information. Experimental results on standard texture images demonstrate the improved discriminability of the proposed features and those on infrared imagery further show that the proposed features can achieve competitive ATR results compared with state-of-the-art methods.

Introduction

Automatic target recognition (ATR) is an important and challenging problem for a wide range of military and civilian applications. Many ATR algorithms have been proposed for forward-looking infrared (FLIR) imagery which can be roughly classified into two groups, i.e., learning-based and model-based[1]. For the learning-based methods, a classifier or a subspace representation which is learned for a set of labeled training data is used for target recognition and classification [2, 3]. On the other hand, the model-based approaches involve a set of target templates or feature maps created from CAD models or a model database and then match them with observed features to fulfill the ATR task [4–8]. Li et al. [1] gave an excellent survey of traditional ATR algorithms, including convolution neural network (CNN), principal component analysis (PCA), linear discriminant analysis (LDA), learning vector quantization (LVQ), modular neural networks (MNN), and two model-based algorithms using the Hausdorff metric and geometric hashing. In [9], Patel et al. introduced an interesting ATR algorithm that was motivated by sparse representation-based classification (SRC) [10], which outperforms the traditional ones with promising results. Recently, there are also many hybrid vision approaches that combine learning-based and model-based ideas for object tracking and recognition in visible-band images [11–13].

As one of the learning approaches, the ATR task has also been cast as a texture analysis problem due to rich texture characteristics in most infrared imagery. Various texture-like features, including geometric, topological, and spectral features, were proposed for appearance-based ATR as reviewed in [14]. Moreover, wavelet-based PCA and independent component analysis (ICA) methods were developed in [15]. Both wavelet and Gabor features were used for target detection in [16]. A texture feature coding method (TFCM) was proposed for synthetic aperture radar (SAR)-based ATR in [17]. In this work, we are interested in the texture-based ATR approach. Particularly, we are focused on how to extract effective yet simple local binary pattern (LBP) operators for infrared ATR. This research is motivated by the rapid growth of the development of various LBP features and their promising results in several ATR applications, such as maritime target detection and recognition in [18], infrared building recognition in [19], and ISAR-based ATR in [20].

The LBP operator was proposed by Ojala et al. in [21], which is based on the so-called texture unit (TU) introduced by Wang and He in [22]. LBP has been proved a robust and computationally simple approach to describe local structures and has been extensively exploited in many applications, such as texture analysis and classification [23–30], face recognition [31–36], motion analysis [37–39], and medical image analysis [40–42]. Since Ojala’s original work, the LBP methodology has been developed with large number of extensions for improved performance. For example, Ojala et al. [43] extended the basic LBP to multiresolution gray scale, uniform, and rotation invariant patterns in order to achieve rotation invariance, optional neighborhoods, and strong discriminative capability. Tan and Triggs [32] proposed the local ternary pattern (LTP) to quantize the intensity difference between a pixel and its neighbors into three levels to further enhance the robustness to noise. LTP has been proven effectively in face recognition. Ahonen and Pietikäinen [44] introduced soft LBP (SLBP) to enhance the feature robustness in the sense that a small change in an input image causes only a small change in the feature extraction output by employing fuzzy theory. Guo et al. [45] proposed a complete LBP (CLBP), which fuses the signs (the same as the basic LBP) with the absolute value of the local intensity difference and the central gray level to improve the discriminative capability. Liao et al. [46] proposed a multiblock LBP (MB-LBP), which, instead of comparing pixels, compares local average intensities of neighboring sub-regions to capture not only the microstructures but also the macrostructures. Wolf et al. [47] proposed a similar scheme as [46], called three-patch LBP (TP-LBP) and four-patch LBP (FP-LBP), which involves a unique distance function to compare local blocks (patches), instead of a single pixel as in [21] or average intensity as in [46].

In this work, we are interested in the applicability of LBP/LTP for infrared ATR. We also take a different perspective to improve LBP/LTP features in this new context. Almost all the extensions of the basic LBP features mentioned above focus on local feature extraction in a small neighborhood. However, the local feature may not be sufficient for rotation invariant texture classification and other applications that require some information in a larger area. We will give an example shortly, where it is shown that the two local patches with entirely different gray levels may have the exact same LBP code. Therefore, we propose a concave-convex partition (CCP) scheme to involve global information (i.e., the global mean) for LBP/LTP encoding. We also introduce a simple yet effective blocking technique to further improve the feature discriminability for infrared ATR. We evaluate the newly proposed CCP technique in two standard texture databases, and we compare also our CCP-based local binary features (LBP/LTP) with the latest sparsity-based ATR algorithm proposed in [9] where we have used four implementations from SparseLab [48] and SPGL1 [49].

The rest of the paper is organized as follows. In Section ‘Brief review of LBP-based methods’, we briefly discussed the basic LBP and LTP. Section ‘Concave-convex local binary features’ discusses the limitation of traditional LBP features and presents the new CCP scheme in detail. Section ‘Experiments and discussions’ reports the experimental results on two texture databases and a recent ATR database. Finally, we conclude our study in Section ‘Conclusions’.

Brief review of LBP-based methods

Since its origination, there are many extensions of LBP. In this section, we only give a brief introduction of the basic LBP and one of its extensions, LTP.

Local binary pattern

The basic LBP operator is first introduced in [21] for texture analysis. It works by thresholding a neighborhood with the gray level of the central pixel. The LBP code is produced by multiplying the thresholded values by weights given by powers of 2 and adding the results in a clockwise way. It was extended to achieve rotation invariance, optional neighborhoods, and stronger discriminative capability in [43]. The basic LBP is commonly referred as LBP_P,R, which is written as

{LBP}_{P, R} = \sum_{i = 0}^{P - 1} s (p_{i} - p_{c}) \times 2^{i},

(1)

where $s (x) = \{\begin{array}{l} 1 & if x \geq 0 \\ 0 & otherwise, \end{array}$ , P is the number of sampling pixels on the circle, R is the radius of the circle, p_c corresponds to the gray value of the central pixel, and p_i corresponds to the gray value of each sampling pixels on the circle. Figure 1 gives an example of calculating LBP code with P = 8,R = 1. In Figure 2, three different neighborhoods with a varying number of samples P and different neighborhood radii R are shown. It should be noted that the gray value of neighboring pixels which do not fall exactly on the grid is calculated by linear interpolation of the neighboring pixels.

The classical description of the rotation invariant LBP operator [43] can be defined as below:

{LBP}_{P, R}^{riu2} = \{\begin{array}{l} \sum_{i = 0}^{P - 1} s (p_{i} - p_{c}) & if U ({LBP}_{P, R}) \leq 2 \\ P + 1 & otherwise \end{array},

(2)

where the superscript riu2 refers to the use of rotation invariant uniform patterns that have a U value (U ≤ 2). The uniformity measure U corresponds to the number of transitions from 0 to 1 or 1 to 0 between successive bits in the circular representation of the binary code LBP_P,R, which is defined as

\begin{array}{l} U ({LBP}_{P, R}) = & | s (p_{P - 1} - p_{c}) - s (p_{0} - p_{c}) | \\ + \sum_{i = 1}^{P - 1} | s (p_{i} - p_{c}) - s (p_{i - 1} - p_{c}) | . \end{array}

(3)

All nonuniform patterns are classified as one pattern for ${LBP}_{P, R}^{riu2}$ . The mapping from LBP_P,R to ${LBP}_{P, R}^{riu2}$ , which has P + 2 distinct output values, can be implemented with a lookup table.

Local ternary pattern

Because the gray value of the central pixel is directly used as a threshold, the LBP is sensitive to noise, especially in a smooth image region. To address this problem, Tan and Triggs [32] extended the original LBP to a version with three-value codes, which is called local ternary pattern. In LTP, the indicator s(x) as in Equation 1 is given as

s (p_{i} - p_{c} - τ) = \{\begin{array}{l} 1 & p_{i} - p_{c} \geq τ \\ 0 & | p_{i} - p_{c} | < τ \\ - 1 & p_{i} - p_{c} < - τ, \end{array}

(4)

where τ is a user-set threshold. In order to reduce the dimension of LTP, a coding scheme is also represented by Tan and Triggs [32] by splitting each ternary pattern into two parts: the positive part and the negative part, as illustrated in Figure 3. Though the LTP codes are more resistant to noise, it is no longer strictly invariant to gray-level transformations.

Concave-convex local binary features

In this section, we first discuss the limitation of the traditional LBP-based operators, and then we propose a new method called CCP to improve LBP/LTP features.

Limitation of traditional LBP-based methods

Until now, most LBP extensions concentrate on how to improve the discernibility of the feature extracted from the local neighborhoods. As a result, the emphasis of such LBP-based methods is on the local features. However, is local information sufficient for rotation invariant texture classification and other texture analysis applications? Figure 4 shed some light on this issue. Figure 4a shows two 8 neighborhoods in an texture image (160 × 160). The neighborhoods (a1) and (a2) in Figure 4a have the exact same LBP code (i.e., 219), positive LTP code (145), and negative LTP code (32), although they have entirely different visual perception. That is to say, the operators LBP and LTP mentioned above cannot draw any distinction between (a1) and (a2) in Figure 4a.

For further explanation, we refer to the average gray value of the neighborhood because it is clear that the average gray value of (a1) and (a2) in Figure 4a is very different. Figure 4b gives the distribution of the average gray value for the neighborhoods (P = 8 and R = 1) which have the same LBP code (LBP_8,1 = 4). There are totally 649 neighborhoods and the maximum and minimum of the average gray value are 228.1 and 54.7, respectively. Obviously, many neighborhoods with the same LBP code have different visual perception because their average gray values differ greatly.

Most of other extensions of LBP also have the same limitation as LBP and LTP. The main reason is that such operators concentrate only on local features and the global feature of the image is neglected. Therefore, the basic LBP and most of its extensions cannot distinguish the patterns with different global intensities and with the same local features. There is little effort to address this problem directly so far. The recently proposed CLBP feature in [45]implicitly addresses this problem by involving an additional binary code that encodes the comparison between the central pixel and the global average. In this paper, we study this problem explicitly by dividing the local features into two groups to reflect the relative contrast between the local and global intensities.

Concave-convex partition

Since the neighborhoods with different visual perception may have the same binary code by the LBP-based operators, we can differentiate them from each other before computing the local binary codes. For example, we can use the bit-plane decomposition [50] to compute the local binary codes on each 1-b plane, and then integrate them together. In that case, the global information can be carried by the binary codes. The price paid by this method is the doubled feature dimensionality. Of course, there could be other methods which are able to achieve a similar effect. However, to keep the simplicity and effectiveness of the basic LBP, we propose a simple yet effective method in this paper to address the weakness. The new method is called concave-convex partition or CCP. It is known that the average gray level is a widely accepted statistical parameter for texture analysis. We also chose it to denote the gray-level variation of a neighborhood. Let μ_i,j denote the average gray level of a neighborhood (i,j), which is given as follows:

μ_{i, j} = \frac{1}{P + 1} (p_{i, j} + \sum_{k = 1}^{P} p_{k}) .

(5)

Here, we further manifest the limitations of traditional LBP-based methods on an infrared chip. Figure 5a shows an infrared chip with a target in the center (40 × 75), and Figure 5b presents the distribution of μ_i,j for all neighborhoods (P = 8 and R = 1) which have the same rotation invariant and uniform LBP code (LBP8,1riu2 = 3). There are totally 347 neighborhoods in the image and the maximum and minimum of μ_i,j are 580.37 and 474.57, respectively. It is also obvious that there are many local neighborhoods with different visual perception but having the same code. In order to address this problem, we introduce the CCP technique based on μ_i,j as follows.

Definition of CCP

Let ω(i,j) be a neighborhood centered at pixel (i,j) in an image and μ_i,j be its average gray value. Given α as a CCP threshold, if μ_i,j < α, the central pixel p_i,j is regarded as a concave pixel and the neighborhood ω(i,j) as a concave neighborhood. Otherwise, p_i,j is regarded as a convex pixel and ω(i,j) is as a convex neighborhood. The set of all concave neighborhoods are denoted as Ω₁ = {(i,j)|μ_i,j < α}), and all convex neighborhoods are denoted as Ω₂ = {(i,j)|μ_i,j ≥ μ}. Now the key of CCP is to choose the optimal parameter α which should provide the least square error approximation to all local means:

\hat{α} = arg {min}_{α} \sum_{i} \sum_{j} {(μ_{i, j} - α)}^{2} .

(6)

Hence, it is straightforward to get $\hat{α}$ as the mean of all the local means μ_i,j as

\hat{α} = \frac{1}{N_{I} \times N_{J}} \sum_{i} \sum_{j} μ_{i, j},

(7)

where N_I × N_J is the number of local neighborhoods in the image. To speed up the computation, we further use the global mean μ to approximate $\hat{α}$ .

The proposed CCP technique can be applied to nearly all LBP features and their extensions. The CCP-based feature extraction has two steps. Firstly, the local binary code of each neighborhood is computed by a LBP operator. Secondly, the codes are divided into two categories according to CCP. After that, the original feature histogram of an image is decomposed into two parts, which are called the concave histogram (the statistics of the neighborhoods where μ_i,j < μ) and the convex histogram (the statistics of the neighborhoods where μ_i,j ≥ μ).

It is expected that CCP-based LBP features will be enriched by incorporating the global mean. For example, Figure 4c gives the CCP-based LBP distribution of the texture image in Figure 4a, the LBP histogram (P = 8, R = 1) is decomposed into a concave one and a convex one. Figure 5c presents the CCP-based LBP distribution of the infrared chip shown in Figure 5a; the LBP histogram that is rotation invariant and uniform (represented by ${LBP}_{8, 1}^{riu2}$ ) is decomposed into a concave one and a convex one. It is worth mentioning that we have tried CCP with more than two feature groups without further improvement, showing that the binary CCP is the most cost-effective one.

Dissimilarity measure

Various metrics have been presented to evaluate the dissimilarity between two histograms. As most LBP-based algorithms, we chose the chi square distance as the dissimilarity measure, which is defined as

d (H, B) = \sum_{i = 1}^{K} \frac{{(h_{i} - b_{i})}^{2}}{h_{i} + b_{i}},

(8)

where H = {h_i} and B = {b_i}(i = 1,2,…,K) denote the two feature histograms, and K is the number of bins in the histogram.

Experiments and discussions

In this section, we first evaluate and compare LBP [43], LTP [32], SLBP [44], and CLBP [45] along with their CCP-enhanced versions which are called CCP-based LBP (CCLBP), CCP-based LTP (CCLTP), CCP-based SLBP (CCSLBP), and CCP-based CLBP (CCCLBP) for texture classification to demonstrate the usefulness of the proposed CCP technique. Then, we focus on LTP and CCLTP to examine their effectiveness for infrared ATR where four SRC-based methods are involved for performance evaluation. The experiments were conducted on a PC with the AMD Phenom(tm)II Processor (3.41GHz) (Advanced Micro Devices, Inc., Sunnyvale, CA, USA), 8 GB RAM, and the 64-b Win7 Operating System.

Three different multiresolutions are used for the eight features tested in this work, P = 8 and R = 1, P = 16 and R = 2, and P = 24 and R = 3. To simplify the feature computation, LTP is split into two LBPs as mentioned in [32], i.e., positive LBPs and negative LBPs. Then two histograms are built and concatenated into one histogram (shown in Figure 3) as the image feature, which is denoted as ${LTP}_{P, R}^{riu2}$ . For CLBP, we choose CLBP $_S_{P, R}^{riu2} / M_{P, R}^{riu2}$ operator in the paper. We compare all eight LBP features in terms of their dimension in Table 1 where it is shown that both CLBP and CCCLBP have a significantly higher dimension than others.

Table 1 Dimension comparison of each LBP feature

Full size table

Experiments on texture classification

For texture classification, we chose two commonly used textures databases, i.e., the Outex database [51] and the Columbia-Utrecht Reflection and Texture (CUReT) database [52].

Experimental results on Outex database

For the Outex database [51], we chose Outex_TC_0010 (TC10) and Outex_TC_0012 (TC12), where TC10 and TC12 contain the same 24 classes of textures collected under three different illuminants (‘horizon,’ ‘inca,’ and ‘t184’) and nine different rotation angles (0°, 5°, 10°, 15°, 30°, 45°, 60°, 75°, and 90°). There are 20 nonoverlapping 128×128 texture samples for each class under each situation. For TC10, samples of illuminant inca and angle 0° in each class were used for classifier training and the other eight rotation angles with the same illuminant were used for testing. Hence, there are 480 (24×20) models and 3,840 (24×8×20) validation samples. For TC12, all the 24×20×9 samples captured under illumination tl84 or horizon were used as the test data. Table 2 gives the experimental results of different LBP features, where t represents the test setup of illuminant t184 and h represents horizon. It is clear that CCP can improve the LBP, LTP, SLBP, and CLBP greatly. For P = 8 and R = 1, P = 16 and R = 2, and P = 24 and R = 3, CC ${LBP}_{P, R}^{riu2}$ , ${CCLTP}_{P, R}^{riu2}$ , CCS ${LBP}_{P, R}^{riu2}$ , and CCCLBP $_S_{P, R}^{riu2} / M_{P, R}^{riu2}$ have an averaged accuracy improvement of 12.5%, 5.2%, 7.7%, and 2.9% over their original versions, respectively.

Table 2 Classification accuracy (%) on TC10 and TC12 texture sets using different LBP operators

Full size table

Experimental results on CUReT database

The CUReT database includes 61 classes of textures captured at different viewpoints and illumination orientations [52], and 92 images in each class are selected from the images shot from a viewing angle of less than 60°. To get statistically significant experimental results, N training images (N = 46,23,12, and 6) were randomly chosen from each class while the remaining (92-N) images were used as the test set. The average accuracy over 23 randomly splits are listed in Table 3. Similar conclusions observed from TC10 and TC12 can also be drawn for the experimental results on the CUReT dataset. CC ${LBP}_{P, R}^{riu2}$ , ${CCLTP}_{P, R}^{riu2}$ , CCS ${LBP}_{P, R}^{riu2}$ and CCCLBP $_S_{P, R}^{riu2} / M_{P, R}^{riu2}$ get an averaged accuracy improvement of 8.1%, 3.6%, 5.1%, and 1.6% over their original versions, respectively.

Table 3 Classification accuracy (%) on CuRet database using different LBP operators

Full size table

The results on the two texture databases show the effectiveness of the proposed CCP technique to improve various LBP features. It is worth mentioning that without CCP enhancement, CLBP shows the best performance and LTP is the second. However, the dimension of CLBP is more than 10 times of that LTP. With CCP enhancement, CCCLBP is still moderately better than CCLTP. However, the high dimension of CCCLBP may limit its potential in real applications where multiple local features may be needed to deal with nonstationarity in an image. CCLTP has a more balanced trade-off between complexity and performance.

Experiments for infrared ATR

This section reports the evaluation of the eight LBP features in the context of infrared ATR. The LBP features of three different scales (P = 8 and R = 1, P = 16 and R = 2, and P = 24 and R = 3) are concatenated together as an image feature which are denoted as ${LBP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ (LBP), CC ${LBP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ (CLBP), ${LTP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ (LTP), ${CCLTP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ (CCLTP), S ${LBP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ (SLBP), CCS ${LBP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ (CCSLBP), ${{CLBP}_{S} / M}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ (CLBP), ${{CCCLBP}_{S} / M}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ (CCSLBP), respectively. The Comanche (Boeing-Sikorsky, USA) FLIR dataset is used here, similar to [9]. There are 10 different military targets denoted as T 1,T 2,…,T 10. For each target, there are 72 orientations, corresponding to aspect angles of 0°, 5°, …, 355°. The dataset contains 437 to 759 images (40 × 75) for each target type, totaly 6,930 infrared chips. In Figure 6, we show some infrared chips for 10 targets under 10 different views.

Comparison of LBP, CCLBP, LTP, CCLTP, SLBP, CCSLBP, CLBP, and CCCLBP

In this experiment, we randomly chose about 10% (718 chips), 20% (1,436 chips), 30% (2,154 chips), 40% (2872 chips), and 50% (3590 chips) target chips of each target class as training data. The remaining 90%, 80%, 70%, 60%, and 50% images in the dataset are set as testing data, respectively. The mean and variance of the recognition accuracy averaged by 10 trials are given in Figure 7. It can be seen from the experimental results that

With CCP enhancement, the accuracy improvements averaged over five training datasets of CCLBP, CCLTP, CCLBP, and CCCLBP are 23.25%, 9.22%, 11.43%, and 2.92%, respectively, compared with their original versions. A much less improvement of CCCLBP over CLBP is likely due to its high-dimension nature that may reduce the benefit of CCP.

The experimental result also shows that eight LBP features, i.e., LBP, LTP, SLBP, CLBP, CCLBP, CCLTP, CCSLBP, and CCCLBP, are robust for infrared ATR, because they are fairly stable in 10 random trials for each case.
${{CCCLBP}_{S} / M}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ gives the best performance among the eight operators, followed by ${CCLTP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ . However, the dimension of ${{CCCLBP}_{S} / M}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ is 2,200 which is over 10 times as the dimension of ${CCLTP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ (216). On the other hand, the recognition accuracy of ${CCLTP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ and ${{CCCLBP}_{S} / M}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ is getting closer as training data increase. Therefore, we will further make some detailed comparisons between LTP/CCLTP and other recent sparsity-based ATR methods in the following experiments.

Recognition rates of different blocking methods

The analysis reported in [53] shows that the holistic LBP histogram is appropriate for standard texture analysis but not suitable for facial recognition. The reason is that the texture images have uniform or homogeneous property, however, the face images have large spatial variations due to distinct facial features. Therefore, a facial image is divided into some blocks. Then a LBP histogram is extracted from each block and all LBP histograms are concatenated together as a feature representation for face recognition. Similarly, the targets in infrared chips have strong spatial variations. A holistic LBP or LTP histogram is not suitable for ATR. Therefore, we divide an infrared chip into different blocks and extract a LTP/CCLTP histogram from each block.

We studied six different blocking methods, as illustrated in Figure 8. Seg-1 denotes the whole chip without blocking. Seg-2 divides a chip into the left-right blocks. Seg-3 segments a chip into the top-bottom blocks. Seg-4 divides a chip into four quadrants that are slightly overlapped. Seg-5 segments a chip into six blocks. Seg-6 partitions a chip into nine blocks. The operator, ${CCLTP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ is chosen as the feature. A histogram ${CCLTP}_{8, 1 + 16, 2 + 24, 3}^{riu2}$ is extracted from each block and all block-wise histograms are concatenated together as a target representation. The average recognition accuracy averaged by 10 trials is shown in Figure 9. The dimensionality and the average feature extraction time cost of each chip are given in Table 3. The average training and recognition time cost for each segmentation method is shown in Table 4. As seen from Figure 9, Tables 4 and 5, Seg-6 yields the best ATR recognition performance if we choose 10% and 20% chips as training data, and Seg-4 provides the best performance in other cases which is only slightly worse than ‘Seg-6’ in the 10% and 20% cases. However, the computational complexity of Seg-6 is much heavier than that of ‘Seg-4’. Therefore, Seg-4 is considered the optimal blocking methods to apply the LTP/CCLTP for the infrared ATR task in this work.

Table 4 Feature dimensionality and average feature extraction time per chip for different blocking schemes shown in Figure 8

Full size table

Table 5 Time consumed for feature extraction, training, and recognition under different training datasets

Full size table

Comparison of LTP/CCLTP and SRC methods

In this section, we compare the performance of LTP and CCLTP with SRC-based methods that achieved the state-of-the-art results in infrared ATR [9]. We have downloaded two SRC software packages which are SparseLab [48] and spectral projected gradient (SPGL1) [49]. These two packages support different sparse optimization algorithms. For SparseLab, we chose three sparse solutions: ‘lasso’, ‘nnlasso’, and ‘OMP’, which are referred to as Sparselab-lasso, Sparselab-nnlasso, and Sparselab-OMP in the following. For SPGL1, we chose one solution lasso, which is called SPG-lasso in this work. The reason we chose these four specific algorithms is mainly due to their relatively low computational load which is acceptable in the ATR experiments. To reduce the dimensionality of infrared chips, several dimensionality reduction techniques were studied in [9] where a simple 16×16 downsampling was shown comparable with others, including Haar wavelet, PCA, and random projection. Thus, the dimensionality of each chip is downsampled from 40×75 to 16×16 as mentioned in [9].

In addition, we randomly selected 10% (718 chips), 20% (1436 chips), 30% (2154 chips), 40% (2,872 chips), 50% (3,590 chips), 60% (4,308 chips), 70% (4,958 chips), and 80% (5,607 chips) target chips of each target class as training data. The remaining 90%, 80%, 70%, 60%, 50%, 40%, 30%, and 20% images in the dataset are set as testing data, respectively. We augment the Seg-4 blocking method with LTP/CCLTP which leads to two new feature representations denoted by sLTP and sCCLTP. The recognition accuracy of sLTP, sCCLTP, and the four sparse-based methods that is averaged by 10 trials is given in Table 6 where we also include the leave-one-out result for each method. The average training and recognition time of six methods is shown in Table 7. As seen from Table 6, sCCLTP is showing improved performance as the training size grows. Especially for the leave-one-out experiment, sCCLTP provides the best ATR performance. The confusion matrices of six methods corresponding to the leave-one-out experiment are shown in Figure 10. It is shown that the sCCLTP result has only three nondiagonal entries greater than 1% (Figure 10a), while sLTP (Figure 10b), Sparselab-lasso (Figure 10c), Sparselab-nnlasso (Figure 10d), Sparselab-OMP (Figure 10e), and SPG-lasso (Figure 10f) have 12, 9, 12, 23, and 5 nondiagonal entries greater than 1%, showing the best robustness of sCCLTP.

Table 6 Accuracy of infrared ATR for (%) six methods under different training datasets

Full size table

Table 7 Averaged time consumed (s) for training and testing (sparsity-based methods do not have training stage)

Full size table

In order to further show the advantages of the proposed CCP methods, Figure 11 presents the curves of pose recognition accuracies vs. acceptable angle errors (up to 60°) for the leave-one-out experiment. As can be seen from Figure 11, sCCLTP again provides the best performance. Specifically, sCCLTP can get 66% pose recognition accuracy with the angle error less than 5°, 85% with the angle error less than 10°, and more than 90% with the angle error less than 15°. On the other hand, sLTP has similar pose recognition performance as Sparselab-lasso and Sparselab-nnlasso.

Conclusions

In this paper, a new LBP-based ATR algorithm is proposed using concave-convex partition local ternary pattern (CCLTP). Based on the analysis of the limitation of traditional LBP-based methods, CCP is presented to group local features into two categories in order to reflect the contrast between local and the global intensities. The improvement of the proposed CCLBP/CCLTP methods over LBP and LTP as well as two improved LBP features (SLBP and CLBP) is first demonstrated on two standard texture databases, and then CCLTP is further enhanced by augmenting a simple yet effective blocking scheme for infrared imagery, leading to sCCLTP which is evaluated against sLTP (block-based LTP) and four SRC-based methods on an infrared ATR database. Experimental results show that sCCLTP outperforms SRC-based methods and sLTP in terms of both recognition accuracy and pose estimation. It is worth mentioning that the proposed CCP technique can be applied to nearly all existing LBP features to improve feature discriminantability.

References

Li B, Chellappa R, Zheng Q, Der S, Nasrabadi N, Chan L, Wang L: Experimental evaluation of forward-looking ir data set automatic target recognition approaches—a comparative study. Comput. Vis. Image Understand 2001, 84(1):5-24. 10.1006/cviu.2001.0938
Article Google Scholar
Chan LA, Nasrabadi NM, Mirelli V: Multi-stage target recognition using modular vector quantizers and multilayer perceptrons. In Proceedings of IEEE Computer Society Conference on Computer Vision Pattern Recognition. San Francisco, CA; 18–20 June 1996:114-119.
Google Scholar
Wang LC, Der SZ, Nasrabadi NM: A committee of networks classifier with multi-resolution feature extraction for automatic target recognition. In Proceedings of IEEE International Conference on Neural Networks. Houston, TX; 9–12 June 1997:1596-1601.
Google Scholar
Lamdan Y, Wolfson H: Geometric hashing: a general and efficient model-based recognition scheme. In Proceedings of the 2nd International Conference on Computer Vision. Tampa, FL; 5–8 December 1988:238-249.
Google Scholar
Olson CF, Huttenlocher DP: Automatic target recognition by matching oriented edge pixels. IEEE Trans. Image Process 1997, 6(1):103-113. 10.1109/83.552100
Article Google Scholar
Grenander U, Miller MI, Srivastava A: Hilbert-Schmidt lower bounds for estimators on matrix lie groups for ATR. IEEE Trans. Pattern Anal. Mach. Intell 1998, 20(8):790-802. 10.1109/34.709572
Article Google Scholar
Venkataraman V, Fan G, Yu L, Zhang X, Liu W, Havlicek JP: Automated target tracking and recognition using coupled view and identity manifolds for shape representation. EURASIP J. Adv. Signal Process 2011, 124: 1-17.
Google Scholar
Gong J, Fan G, Yu L, Havlicek JP, Chen D, Fan N: Joint view-identity manifold for infrared target tracking and recognition. Comput. Vis. Image Understand 2014, 118(1):211-224.
Article Google Scholar
Patel VM, Nasrabadi NM, Chellappa R: Sparsity-motivated automatic target recognition. Appl. Opt 2011, 50(10):1425-1433. 10.1364/AO.50.001425
Article Google Scholar
Wright J, Yang AY, Ganesh A, Sastry SS, Ma Y: Robust face recognition via sparse representation. IEEE Trans. Pattern Anal. Mach. Intell 2009, 31(2):210-227.
Article Google Scholar
Liebelt J, Schmid C, Schertler K: Viewpoint-independent object class detection using 3D feature maps. In Proceedings of IEEE Conference on CVPR. Anchorage, AK; 23–28 June 2008:1-8.
Google Scholar
Khan SM, Cheng H, Matthies D, Sawhney H: 3D model based vehicle classification in aerial imagery. In IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, CA; 13–18 June 2010:1681-1687.
Google Scholar
Toshev A, Makadia A, Daniilidis K: Shape-based object recognition in videos using 3D synthetic object models. In IEEE Conference on Computer Vision and Pattern Recognition. Miami, FL; 20–25 June 2009:288-295.
Google Scholar
Bhanu B: Automatic target recognition: state of the art survey. IEEE Trans. Aero. Electron. Syst 1986, AES-22(4):364-379.
Article Google Scholar
Messer K, de Ridder D, Kittler J: Adaptive texture representation methods for automatic target recognition. In Proceedings of British Machine Vision Conference BMVC99. Nottingham; 13–16 September 1999:1-10.
Google Scholar
Casasent D, Smokelin Y: Wavelet and Gabor transforms for target detection. Opt. Eng 1992, 31(9):1893-1898. 10.1117/12.59913
Article Google Scholar
Jeong C, Cha M, Kim H-M: Texture feature coding method for SAR automatic target recognition with adaptive boosting. In Proceedings of the 2nd Asian-Pacific Conference on Synthetic Aperture Radar. Xian; 26–30 October 2009:473-476.
Google Scholar
Rahmani N, Behrad A: Automatic marine targets detection using features based on local Gabor binary pattern histogram sequence. In Proceedings of the 1st International Conference on Computer and Knowledge Engineering. Mashhad; 13–14 October 2011:195-201.
Google Scholar
Qin Y, Cao Z, Fang Z: A study on the difficulty prediction for infrared target recognition. In Proceedings of SPIE. Wuhan; 26 October 2013. doi:10.1117/12.2031100
Google Scholar
Wang F, Sheng W, Ma X, Wang H: Target automatic recognition based on ISAR image with wavelet transform and MBLBP. In 2010 International Symposium on Signals Systems and Electronics. Nanjing; 17–20 September 2010:1-4.
Chapter Google Scholar
Ojala T, Pietikäinen M, Harwood D: A comparative study of texture measures with classification based on featured distributions. Pattern Recogn 1996, 29(1):51-59. 10.1016/0031-3203(95)00067-4
Article Google Scholar
Wang L, He D-C: Texture classification using texture spectrum. Pattern Recogn 1990, 23(8):905-910. 10.1016/0031-3203(90)90135-8
Article Google Scholar
Pietikäinen M, Ojala T, Xu Z: Rotation-invariant texture classification using feature distributions. Pattern Recogn 2000, 33(1):43-52. 10.1016/S0031-3203(99)00032-1
Article Google Scholar
Topi M, Matti P, Timo O: Texture classification by multi-predicate local binary pattern operators. In Proceedings of the International Conference on Pattern Recognition. Barcelona; 3–7 September 2000:939-942.
Google Scholar
Ojala T, Valkealahti K, Oja E, Pietikäinen M: Texture discrimination with multidimensional distributions of signed gray-level differences. Pattern Recogn 2001, 34(3):727-739. 10.1016/S0031-3203(00)00010-8
Article Google Scholar
Zhao G, Pietikäinen M: Improving rotation invariance of the volume local binary pattern operator. In Proceedings of IAPR Conference on Machine Vision Applications. Tokyo; 16–18 May 2007:327-330.
Google Scholar
Zhou H, Wang R, Wang C: A novel extended local-binary-pattern operator for texture analysis. Inform. Sci 2008, 178(22):4314-4325. 10.1016/j.ins.2008.07.015
Article Google Scholar
Guo Z, Zhang L, Zhang D: Rotation invariant texture classification using LBP variance (LBPV) with global matching. Pattern Recogn 2010, 43(3):706-719. 10.1016/j.patcog.2009.08.017
Article Google Scholar
Guo Y, Zhao G, PietikäInen M: Discriminative features for texture description. Pattern Recogn 2012, 45(10):3834-3843. 10.1016/j.patcog.2012.04.003
Article Google Scholar
Subrahmanyam M, Maheshwari R, Balasubramanian R: Local maximum edge binary patterns: a new descriptor for image retrieval and object tracking. Signal Process 2012, 92(6):1467-1479. 10.1016/j.sigpro.2011.12.005
Article Google Scholar
Shan C, Gong S, McOwan PW: Facial expression recognition based on local binary patterns: a comprehensive study. Image Vis. Comput 2009, 27(6):803-816. 10.1016/j.imavis.2008.08.005
Article Google Scholar
Tan X, Triggs B: Enhanced local texture feature sets for face recognition under difficult lighting conditions. IEEE Trans. Image Process 2010, 19(6):1635-1650.
Article MathSciNet Google Scholar
Zhang B, Gao Y, Zhao S, Liu J: Local derivative pattern versus local binary pattern: face recognition with high-order local pattern descriptor. IEEE Trans. Image Process 2010, 19(2):533-544.
Article MathSciNet Google Scholar
Huang D, Ardabilian M, Wang Y, Chen L: A novel geometric facial representation based on multi-scale extended local binary patterns. In Proceedings of the IEEE International Conference on Automatic Face & Gesture Recognition and Workshops. Santa Barbara; 21–25 March 2011:1-7.
Google Scholar
Huang D, Shan C, Ardabilian M, Wang Y, Chen L: Local binary patterns and its application to facial image analysis: a survey. IEEE Trans. Syst. Man Cybern. C Appl. Rev 2011, 41(6):765-781.
Article Google Scholar
Nanni L, Lumini A, Brahnam S: Survey on LBP based texture descriptors for image classification. Exp. Syst. Appl 2012, 39(3):3634-3641. 10.1016/j.eswa.2011.09.054
Article Google Scholar
Heikkila M, Pietikainen M: A texture-based method for modeling the background and detecting moving objects. IEEE Trans. Pattern Anal. Mach. Intell 2006, 28(4):657-662.
Article Google Scholar
Zhang S, Yao H, Liu S: Dynamic background modeling and subtraction using spatio-temporal local binary patterns. In Proceedings of the IEEE International Conference on Image Processing. San Diego, CA; 12–15 October 2008:1556-1559.
Google Scholar
Liao S, Zhao G, Kellokumpu V, Pietikainen M, Li SZ: Modeling pixel process with scale invariant local patterns for background subtraction in complex scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. San Francisco, CA; 13–18 June 2010:1301-1306.
Google Scholar
Nanni L, Lumini A, Brahnam S: Local binary patterns variants as texture descriptors for medical image analysis. Artif. Intell. Med 2010, 49(2):117-125. 10.1016/j.artmed.2010.02.006
Article Google Scholar
Srensen L, Shaker SB, De Bruijne M: Quantitative analysis of pulmonary emphysema using local binary patterns. IEEE Trans. Med. Imaging 2010, 29(2):559-569.
Article Google Scholar
Häfner M, Liedlgruber M, Uhl A, Vécsei A, Wrba F: Color treatment in endoscopic image classification using multi-scale local color vector patterns. Med. Image Anal 2012, 16(1):75-86. 10.1016/j.media.2011.05.006
Article Google Scholar
Ojala T, Pietikainen M, Maenpaa T: Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell 2002, 24(7):971-987. 10.1109/TPAMI.2002.1017623
Article Google Scholar
Ahonen T, Pietikäinen M: Soft histograms for local binary patterns. In Proceedings of the Finnish Signal Processing Symposium. Oulu; 30 October 2007:1-4.
Google Scholar
Guo Z, Zhang L, Zhang D: A completed modeling of local binary pattern operator for texture classification. IEEE Trans. Image Process 2010, 19(6):1657-1663.
Article MathSciNet Google Scholar
Liao S, Zhu X, Lei Z, Zhang L, Li S: Learning multi-scale block local binary patterns for face recognition. Adv. Biometrics 2007, 828-837.
Chapter Google Scholar
Wolf L, Hassner T, Taigman Y: Descriptor based methods in the wild. In Faces in Real-Life Images Workshop at the European Conference on Computer Vision. Marseille; 17–18 October 2008:1-14.
Google Scholar
SparseLab , website. Accessed 24 September 2013 http://sparselab.stanford.edu
SPGL1 , website. Accessed 24 September 2013 http://www.cs.ubc.ca/~mpf/spgl1
Ko S-J, Lee S-H, Lee K-H: Digital image stabilizing algorithms based on bit-plane matching. IEEE Trans. Consum. Electron 1998, 44(3):617-622. 10.1109/30.713172
Article MathSciNet Google Scholar
Ojala T, Maenpaa T, Pietikainen M, Viertola J, Kyllonen J, Huovinen S: Outex-new framework for empirical evaluation of texture analysis algorithms. In Proceeding of the IEEE International Conference on Pattern Recognition. Quebec; 11–15 August 2002:701-706.
Google Scholar
Dana KJ, Van Ginneken B, Nayar SK, Koenderink JJ: Reflectance and texture of real-world surfaces. ACM Trans. Graph. (TOG) 1999, 18(1):1-34. 10.1145/300776.300778
Article Google Scholar
Yang B, Chen S: A comparative study on local binary pattern (LBP) based face recognition LBP histogram versus LBP image. Neurocomputing 2013, 120: 365-379.
Article Google Scholar

Download references

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions that improved this paper.

This work was supported by the Backbone Teacher Grant of Henan Province (2010GGJS-059), the International Cooperation Project of Henan Province (134300510057), the Doctor and Backbone Teacher Grant of Henan Polytechnic University. The authors would like to thank MVG, Sparselab and SPGL1 for sharing the source codes of LBP and sparse-based methods.

Author information

Authors and Affiliations

School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, 454000, China
Junding Sun & Xiaosheng Wu
School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, 74078, USA
Guoliang Fan & Liangjiang Yu

Authors

Junding Sun
View author publications
You can also search for this author in PubMed Google Scholar
Guoliang Fan
View author publications
You can also search for this author in PubMed Google Scholar
Liangjiang Yu
View author publications
You can also search for this author in PubMed Google Scholar
Xiaosheng Wu
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Guoliang Fan.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Authors’ original file for figure 5

Authors’ original file for figure 6

Authors’ original file for figure 7

Authors’ original file for figure 8

Authors’ original file for figure 9

Authors’ original file for figure 10

Authors’ original file for figure 11

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Sun, J., Fan, G., Yu, L. et al. Concave-convex local binary features for automatic target recognition in infrared imagery. J Image Video Proc 2014, 23 (2014). https://doi.org/10.1186/1687-5281-2014-23

Download citation

Received: 10 October 2013
Accepted: 27 March 2014
Published: 21 April 2014
DOI: https://doi.org/10.1186/1687-5281-2014-23

Concave-convex local binary features for automatic target recognition in infrared imagery

Abstract

Abstract

Introduction

Brief review of LBP-based methods

Local binary pattern

Local ternary pattern

Concave-convex local binary features

Limitation of traditional LBP-based methods

Concave-convex partition

Definition of CCP

Dissimilarity measure

Experiments and discussions

Experiments on texture classification

Experimental results on Outex database

Experimental results on CUReT database

Experiments for infrared ATR

Comparison of LBP, CCLBP, LTP, CCLTP, SLBP, CCSLBP, CLBP, and CCCLBP

Recognition rates of different blocking methods

Comparison of LTP/CCLTP and SRC methods

Conclusions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Competing interests

Authors’ original submitted files for images

Rights and permissions

About this article

Cite this article

Share this article

Keywords