
Automated classification for HEp-2 cells based on linear local distance coding framework


The occurrence of antinuclear antibodies (ANAs) in patient serum is closely related to specific autoimmune diseases. Indirect immunofluorescence (IIF) on human epithelial type 2 (HEp-2) cells is the recommended methodology for detecting ANAs in clinical practice. However, the currently practiced manual detection suffers from serious problems due to subjective evaluation. In this paper, we present an automated system for HEp-2 cell classification. We adopt a bag-of-words (BoW) framework, which has shown impressive performance in image classification tasks because it yields discriminative and effective image representations. However, information loss is inevitable in the coding process. Therefore, we propose a linear local distance coding (LLDC) method to capture more discriminative information. The LLDC method transforms each original local feature into a more discriminative local distance vector by searching for the few nearest neighbors of the local feature in the class-specific manifolds. The obtained local distance vectors are then encoded and pooled to obtain a salient image representation. The LLDC method is combined with traditional coding methods to achieve higher classification accuracy. Combined with a linear support vector machine classifier, the proposed method demonstrates its effectiveness on two public datasets, namely the International Conference on Pattern Recognition (ICPR) 2012 dataset and the International Conference on Image Processing (ICIP) 2013 training dataset. Experimental results show that the LLDC framework achieves superior performance to state-of-the-art coding methods for staining pattern classification of HEp-2 cells.


Indirect immunofluorescence (IIF) image analysis has become an active research topic in recent years. IIF on human epithelial type 2 (HEp-2) cells is the hallmark protocol for detecting antinuclear antibodies (ANAs) in patient serum, which are associated with autoimmune diseases such as rheumatoid arthritis, systemic lupus erythematosus, and multiple sclerosis [1]. If ANAs are present in the patient serum, they bind to the nuclei of the HEp-2 cells, forming a molecular complex. Unbound antibodies are washed off, and a fluorescein-conjugated anti-human immunoglobulin is retained. Washing after the second incubation removes any unbound secondary immunoglobulin. The ANAs are finally revealed as fluorescent cells under the fluorescence microscope. The fluorescence intensity and the positive staining patterns of each slide image are identified by highly qualified and skilled physicians.

Owing to the effectiveness and high quality of IIF image analysis, there is a growing demand for diagnostic tests for systemic autoimmune diseases using the IIF strategy. However, manual evaluation is subjective, labor intensive, and time consuming [2]. Hence, computer-aided diagnostic (CAD) systems that aim to determine the presence of ANAs in IIF images offer a way to overcome these limitations and lead to more reliable test results. The typical flow consists of six main techniques, namely automated preparation of slides with robotic devices [3], image acquisition [4,5], image segmentation [6], mitotic cell recognition [7], fluorescence intensity classification [8], and staining pattern recognition [2,9]. While all components of a CAD system contribute to the automation of the IIF procedure in one way or another, staining pattern classification is the most challenging task in the research community. Classifying images into meaningful categories is a challenging and important task [10]. Moreover, compared with the visual signal in general object classification, HEp-2 cells do not contain abundant structural information. In addition, the features of different HEp-2 cells are much more similar to each other than those of different objects or natural scene images. Therefore, in this study, we investigate feature extraction and machine learning methods for automatic staining pattern classification of HEp-2 cells.

The most frequent staining patterns of HEp-2 cells in clinical practice are as follows [11]:

  • Centromere: characterized by 40 to 60 discrete speckles distributed throughout the interphase nuclei and characteristically found in the condensed nuclear chromatin during mitosis as a bar of closely associated speckles;

  • Nucleolar: characterized by large coarse speckles within the nucleoli of interphase cells, with less than six speckles per cell;

  • Homogeneous: characterized by a uniform diffuse fluorescence of the entire interphase nuclei and fluorescence of the chromatin of mitotic cells;

  • Fine speckled: characterized by a fine granular nuclear staining of interphase cell nuclei in a uniform distribution;

  • Coarse speckled: characterized by dense, intermediate-sized particles in interphase nuclei together with large speckles;

  • Cytoplasmic: characterized by staining of the cytoplasm exclusive of the nucleus.

In addition, two staining patterns that occur less frequently in clinical practice are also considered in this paper:

  • Nuclear membrane: a smooth homogeneous ring-like fluorescence of the nuclear membrane in interphase cells;

  • Golgi: staining of a polar organelle adjacent to and partly surrounding the nucleus, composed of irregular large granules.

Examples of specimen images with the most frequent staining patterns are shown in Figure 1.

Figure 1

Typical HEp-2 cells with different staining patterns.

The most popular image classification framework consists of two major modules: bag-of-words (BoW) and spatial pyramid matching (SPM). In the framework, an image representation is generated via the following steps. Firstly, local descriptors are extracted from the image. Then, a pre-defined codebook is applied to encode the local descriptors into codes accordingly. Next, the image is divided into increasingly finer subregions. Multiple codes from each subregion are pooled together. Finally, the final image representation is generated by concatenating the histograms from all subregions together.

The framework of SPM based on BoW has been successfully applied to image classification [12,13], and in recent years, it has been improved for HEp-2 cell classification [14,15]. It seems to be suitable for the HEp-2 cell classification task. Within the framework, how to encode each local feature has significant impact on the final classification performance. The traditional and the simplest coding method is vector quantization (VQ) [16], which assigns a local feature to the closest visual word in the codebook, introducing unrecoverable discriminative information loss. The soft assignment (SA) coding method [17-19] is proposed to reduce information loss by assigning a local feature to different visual words according to its memberships to multiple visual words. Apart from information loss, traditional SPM based on VQ has to use a classifier with nonlinear Mercer kernels, resulting in additional computational complexity and reducing scalability for real application. To alleviate these limitations, sparse-coding-based SPM (ScSPM) [12], local coordinate coding (LCC) [20], and locality-constrained linear coding (LLC) [13] aim at obtaining a nonlinear feature representation which works better with linear classifiers.
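The difference between hard and soft assignment can be illustrated with a minimal sketch. It assumes a toy codebook stored row-wise (one visual word per row), squared Euclidean distances, and an exponential membership function with a smoothing factor `beta`; the function names and toy values are hypothetical and are not taken from the cited works.

```python
import numpy as np

def vq_code(x, codebook):
    """Hard assignment (VQ): one-hot code at the nearest visual word."""
    d2 = np.sum((codebook - x) ** 2, axis=1)   # squared distances to all words
    code = np.zeros(len(codebook))
    code[np.argmin(d2)] = 1.0
    return code

def soft_assignment_code(x, codebook, beta=1.0):
    """Soft assignment: memberships decay exponentially with squared distance."""
    d2 = np.sum((codebook - x) ** 2, axis=1)
    w = np.exp(-beta * (d2 - d2.min()))        # shift by the minimum for stability
    return w / w.sum()

# Toy example: a feature lying almost halfway between words 0 and 1.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x = np.array([0.45, 0.0])
hard = vq_code(x, codebook)
soft = soft_assignment_code(x, codebook, beta=5.0)
```

In this toy case the hard code keeps only word 0, while the soft code still assigns a substantial membership to word 1, which is the information that VQ discards.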

All the improved methods represent images more accurately and achieve impressive image classification performance. However, information loss in feature quantization is still inevitable and limits the classification performance. To avoid the information loss caused by coding, the naive Bayes nearest neighbor (NBNN) method [21] retains all of the feature descriptors. It shows competitive classification performance with coding-based methods as it alleviates information loss and preserves the discriminative power of the input features. However, NBNN is sensitive to noisy features and is easily dominated by outlier features. To inherit the advantages of both the BoW framework and the NBNN method, the linear distance coding (LDC) method [22] has recently been proposed to utilize the discriminative information lost by the traditional coding methods. It transforms each local feature into a distance vector by calculating neighbors in every class-specific manifold.

In this paper, we propose a novel linear local distance coding (LLDC) method to increase the accuracy of staining pattern classification. The LLDC method adopts a feature extraction-coding-pooling framework based on the local distance vector, which is a modification of the distance vector. The local distance vector is generated by using only the local neighbors in a merged feature dataset instead of calculating neighbors in every class-specific feature dataset. Therefore, it can ignore disturbance from isolated classes. Using the image-to-class distance makes it more class specific, as desired for classification. Meanwhile, the distance vector in the LDC method is obtained by using a linear coding scheme, which aggravates the information loss. The local distance vector is based only on the Euclidean distance and avoids the coding process in the distance vector transformation; therefore, it is more discriminative. In addition, it has been shown that image representations obtained by coding distance patterns are complementary to the ones from the original coding methods [22]. Therefore, we directly concatenate the image representations based on the local distance vector and on the local features to achieve superior performance.

In summary, the main contributions of this study are threefold: (i) We propose a novel local distance vector based on the image-to-class distance. It is more class specific than the original local feature. Unlike the distance vector, it eliminates the need to calculate the distance for each class; therefore, it speeds up the calculation and achieves better classification performance by ignoring the disturbance from distant classes. (ii) We propose an LLDC method based on the transformed local distance vector. It takes advantage of both the BoW framework and the NBNN method, reducing the information loss caused by traditional coding methods while capturing salient features. (iii) The image representations produced by the LLDC method are complementary to the ones from the original coding methods. Their combination yields superior performance compared with using a single representation. Experiments on two public HEp-2 cell datasets consistently show that the image representation produced by the LLDC framework achieves better performance than state-of-the-art coding methods.

The rest of the paper is organized as follows. ‘Related work’ section introduces some related publications on HEp-2 cells classification. ‘Distance vector’ section proposes the linear local distance coding framework. Experimental results are reported in the ‘Experiments and analyses’ section. Finally, ‘Conclusions’ section concludes this work.

Related work

To overcome the limitations of manual inspection, feature extraction and machine learning methods have recently been used for automatic staining pattern classification of HEp-2 cells. Perner et al. [23] use automatic thresholding via Otsu’s algorithm to segment the individual cells, then extract a set of textural features and apply a decision tree classifier. Soda et al. [24] utilize a multiple expert system based on a set of specific features related to statistical and spectral components to assign the pattern of a single cell. Cordelli and Soda [25] experimentally compare four different methods for converting a color image into a grayscale one, namely weighted conversion, green channel, intensity channel, and Helmholtz-Kohlrausch (HK) conversion. Considering a heterogeneous set of features, e.g., statistical descriptors, spectral measures, and morphological descriptors, an AdaBoost classifier based on the intensity channel achieves the best average performance. To extract features of the slide images without segmentation, our group verified the effectiveness of the scale-invariant feature transform (SIFT) for representing HEp-2 slide images [26]. Wiliem et al. [27] propose a dual-region codebook-based descriptor combined with the nearest convex hull classifier.

The aforementioned works are based on private datasets and/or different experimental protocols. While each of them addresses one or two aspects of technical advancement, it is difficult to reproduce their procedures and to compare their performance.

Owing to the great impact of staining pattern classification of HEp-2 cells on clinical practice, and to enable comparisons among different approaches, the first edition of the HEp-2 Cells Classification Contest was hosted by the International Conference on Pattern Recognition (ICPR) 2012, and a publicly available HEp-2 cell database (ICPR 2012 dataset) was released. By applying the same database and experimental protocol, different systems can be compared and evaluated against a common benchmark. Nosaka et al. [28] utilize an extension of the local binary pattern (LBP) descriptor, named co-occurrence of adjacent local binary patterns (CoALBP), to extract textural features. Using a linear support vector machine (SVM), their method won the first prize in the HEp-2 cell classification contest with a classification accuracy of around 69%. Xiangfei et al. [11] extract statistical intensity features. After a normalization step, a global texton dictionary is built via K-means clustering and is used to encode the images’ features into frequency histograms. They utilize the kNN classifier with the \(\chi^{2}\) distance. Li et al. [29] extract LBP, Gabor, discrete cosine transform (DCT), and some global appearance-based statistical features for image representation. Then, a combination of SVMs using a modified AdaBoost.M1 is used to improve classification performance. Liu and Wang [11] adopt a deep learning scheme to automatically learn discriminative features from dense image patches. Following a BoW pipeline, a linear SVM is learned based on the BoW representations. Ghosh and Chaudhary [30] test the performance of various features, such as a BoW representation based on speeded-up robust features (SURF), region-of-interest (ROI)-based features, texture-based features, and normalized histogram of oriented gradients (HOG) features. Experimental results show that the combination of HOG, texture-, and ROI-based features using an SVM classifier achieves the best classification performance. All the participating methods are reported in [11] for reference.

Inspired by the first edition of the contest for HEp-2 cell classification, a growing body of research has sought to improve HEp-2 cell classification performance. Based on the same dataset (ICPR 2012 dataset) and experimental protocols, researchers can evaluate their work in a more convincing way. For example, Di Cataldo et al. [31] propose a classification approach based on subclass discriminant analysis (SDA) using the integration of morphological, global, and local texture features. It obtains an accuracy of 72.2%. Liu and Wang [32] utilize linear projections of the image pixels as descriptors and propose a multi-projection-multi-codebook strategy to generate multiple pooled vectors for the image. The pooled vectors are concatenated to obtain the final image representation, which achieves an overall classification accuracy of 66.6%.


In this section, we present the details of the proposed coding method called the LLDC method based on local distance vector. Our proposed method maintains superior discriminative capability and effectiveness of the traditional coding-based methods. It provides better generalization capability by using the distance between local feature and certain class to estimate image membership. Meanwhile, it preserves more discriminative information by avoiding coding process while obtaining image-to-class distance. Furthermore, the LLDC method avoids poor estimates from isolated classes by eliminating the need to calculate distance vector for each class. Hence, the LLDC method can achieve superior image classification performance compared with the other coding schemes.

Distance vector

The essential idea of LDC method [22] is to calculate the distance vector which is an alternative discriminative pattern of local feature in the class-specific manifold coordinate system.

Let \(\mathbf{X}=\{\mathbf{x}_{1}, \mathbf{x}_{2},\ldots, \mathbf{x}_{N}\}\in \mathbb{R}^{D\times N}\) be a set of D-dimensional local features extracted from an image. It is assumed that the local features of each class are sampled from a class-specific manifold \(\mathbf{M}^{c} =\left[\mathbf{m}_{1}^{c}, \mathbf{m}_{2}^{c},\ldots, \mathbf{m}_{n_{c}}^{c}\right]\), which is determined by clustering local features of the training images from the corresponding class c. Then, the distance vector, which denotes the distance between a local feature \(\mathbf{x}_{i}\) and class c, is computed by:

$$ \mathit{d}\left(\mathbf{x}_{i}, c\right) = \left\| \mathbf{x}_{i} - \mathbf{x}_{i}^{c} \right\|_{\ell_{2}}^{2}, $$

where \(\mathbf{x}_{i}^{c}\) denotes the mapped point of \(\mathbf{x}_{i}\) in class c. It can be computed as a linear combination of its neighboring features in the manifold \(\mathbf{M}^{c}\). The LDC method calculates \(\mathbf{x}_{i}^{c}\) as follows:

$$\begin{array}{@{}rcl@{}} &&\min_{\mathbf{u}_{i}^{c}} \left\| \mathbf{x}_{i} - \mathbf{M}^{c}\mathbf{u}_{i}^{c}\right\|_{\ell_{2}}^{2}, \\ &&\text{s.t.}\quad u_{ij}^{c} = 0 \quad\text{if} \quad \mathbf{m}_{j}^{c} \not\in \mathcal{N}_{i}^{k}, \\ &&\qquad \mathbf{1}^{T} \mathbf{u}_{i}^{c}= 1, \quad \forall i, \end{array} $$

where \(\mathbf{u}_{i}^{c} =\left[u_{i1}^{c}, u_{i2}^{c},\ldots, u_{i n_{c}}^{c}\right]\) is the vector of linear coefficients of \(\mathbf{x}_{i}\) on the manifold \(\mathbf{M}^{c}\) and \(\mathcal{N}_{i}^{k}\) denotes the set of k nearest neighbors of \(\mathbf{x}_{i}\) on \(\mathbf{M}^{c}\). Then, the distance vector can be redefined as:

$$ d_{i}^{c} = d\left(\mathbf{x}_{i}, c\right) = \left\| \mathbf{x}_{i} - \mathbf{x}_{i}^{c} \right\|_{\ell_{2}}^{2} = \left\| \mathbf{x}_{i} - \mathbf{M}^{c}\mathbf{u}_{i}^{c}\right\|_{\ell_{2}}^{2}. $$

Each local feature of an image is transformed to its distance vector \(\mathbf {d}_{i} =\left [{d_{i}^{1}}, {d_{i}^{2}},\ldots, {d_{i}^{C}}\right ] \), where C is the class number.

By generating the image representation based on the distance vector, the LDC method captures discriminative information and avoids the case where the discriminative features are dominated by outlier or noisy features. Therefore, using a linear SVM, the LDC method shows impressive image classification performance. However, the distance vector is obtained by utilizing the approximate fast solution of the LLC coding method, which inherently induces information loss. Meanwhile, the distance vector treats every class equally because it is produced by calculating the distance from the local feature to each class. Such an operation easily brings in irrelevant information from classes that are far from the query local feature and consequently introduces unnecessary interference. Therefore, the distance vector can be improved further to perform better in image classification tasks.
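The distance vector defined above can be sketched as follows. This is a minimal illustration, assuming that each class manifold is given as a row-wise array of anchor points and that the constrained least-squares problem is solved on the k nearest anchors through the usual Gram-matrix system with a small ridge term for numerical stability. The ridge term, the function names, and the toy data are assumptions and are not part of the LDC method as published.

```python
import numpy as np

def distance_to_class(x, anchors, k=3, reg=1e-6):
    """Squared distance from x to its sum-to-one reconstruction on one class.

    anchors: (n_c, D) array whose rows are the anchor points m_j^c of class c.
    """
    d2 = np.sum((anchors - x) ** 2, axis=1)
    nn = anchors[np.argsort(d2)[:k]]           # k nearest anchors of this class
    m = len(nn)
    z = nn - x                                 # shift so that x is the origin
    gram = z @ z.T
    gram = gram + reg * (np.trace(gram) + 1e-12) * np.eye(m)  # ridge (assumption)
    u = np.linalg.solve(gram, np.ones(m))
    u /= u.sum()                               # enforce 1^T u = 1
    x_c = u @ nn                               # mapped point of x in class c
    return float(np.sum((x - x_c) ** 2))

def distance_vector(x, class_anchors, k=3):
    """Distance vector d_i = [d_i^1, ..., d_i^C], one entry per class."""
    return np.array([distance_to_class(x, a, k) for a in class_anchors])

# Toy example: class 0 lies on the line y = 0, class 1 on the line y = 5.
class_anchors = [
    np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]),
    np.array([[0.0, 5.0], [1.0, 5.0], [2.0, 5.0]]),
]
x = np.array([0.5, 0.0])
dv = distance_vector(x, class_anchors, k=2)
```

In this toy case the query lies on the first line, so its distance to class 0 is close to zero, while its distance to class 1 is close to 25. Note that every class contributes an entry, however far it is from the query.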

Local distance vector

It has been verified that using the distance between a local feature and the classes (i.e., the image-to-class distance) can provide better generalization capability. We propose a novel distance pattern, called the local distance vector, to define the distance from a local feature to a specific class. The local distance vector eliminates the need to search for the few nearest neighbors in every class-specific manifold to generate distance vectors. Instead, it merges all the class-specific manifolds together to form a single dataset, i.e., \(\mathbf{M}= \left[\mathbf{M}^{1}, \mathbf{M}^{2},\ldots, \mathbf{M}^{C}\right] = \left\{\mathbf{m}_{i}\right\}_{i=1}^{n}\), where the \(\mathbf{m}_{i}\) are called ‘anchor points’ [33] and n is the total number of points. To obtain the class-specific distance, we search for the k nearest neighbors of a local feature \(\mathbf{x}_{i}\) in \(\mathbf{M}\), denoted as \(NN(\mathbf{x}_{i},k)=\{\mathbf{p}_{1},\mathbf{p}_{2},\ldots,\mathbf{p}_{k}\}\subset \mathbf{M}\). Each neighbor \(\mathbf{p}_{j}\) has a label \(\text{Class}(\mathbf{p}_{j})\) identifying the class to which it belongs. We define the distance from \(\mathbf{x}_{i}\) to each class c found among the k nearest neighbors as follows:

$$ \hat{\mathit{d}}_{i}^{c} = \text{min}_{\{\mathbf{p}_{j}\mid \text{Class}(\mathbf{p}_{j})=c\}}\parallel \mathbf{x}_{i} - \mathbf{p}_{j} \parallel_{\ell_{2}}^{2}. $$

The difference between the distance vector and the local distance vector is shown in Figure 2. The proposed local distance vector is less influenced by isolated classes since it only calculates distances for the classes that are close to the query feature. In contrast, the distance vector has to calculate the distance between the local feature and each class, which inevitably brings in some irrelevant information from distant classes.

Figure 2

Distance vector vs. local distance vector. \(\mathbf{x}_{i}\) is a query local feature. The distance vector searches for the mapping point \(\mathbf{x}_{i}^{c}\), which is determined by the few nearest neighbors in each manifold \(\mathbf{M}^{c}\). The local distance vector retrieves only the local neighborhood in \(\mathbf{M}=\left[\mathbf{M}^{1},\mathbf{M}^{2},\ldots,\mathbf{M}^{C}\right]\).

For those classes that are not found among the k nearest neighbors, we use the distance to the (k+1)-th nearest neighbor of \(\mathbf{x}_{i}\) to estimate the class-specific distance. The local distance vector of the local feature \(\mathbf{x}_{i}\) is denoted as \(\hat{\mathbf{d}}_{i} = \left[\hat{d}_{i}^{1}, \hat{d}_{i}^{2},\ldots, \hat{d}_{i}^{C}\right]\). The computation of the local distance vectors of an image is described in Algorithm 1.
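The transformation described above can be sketched for a single local feature as follows. This is only an illustrative sketch based on the description in the text, not a reproduction of Algorithm 1. It assumes a brute-force nearest-neighbor search, squared Euclidean distances throughout (including for the fallback value), and at least k+1 anchor points; the function name and toy data are hypothetical.

```python
import numpy as np

def local_distance_vector(x, anchors, labels, n_classes, k=5):
    """Local distance vector of one local feature x.

    anchors: (n, D) merged anchor points M = [M^1, ..., M^C], one per row.
    labels:  (n,) integer class index of each anchor point.
    """
    d2 = np.sum((anchors - x) ** 2, axis=1)
    order = np.argsort(d2)                     # anchors sorted by distance to x
    # Classes absent from the k nearest neighbors fall back to the distance
    # of the (k+1)-th nearest neighbor.
    fallback = d2[order[k]]
    ldv = np.full(n_classes, fallback)
    for j in order[:k]:
        c = labels[j]
        ldv[c] = min(ldv[c], d2[j])            # nearest neighbor of class c
    return ldv

# Toy example with three classes; class 2 is far away from the query.
anchors = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 0.0], [10.0, 10.0]])
labels = np.array([0, 0, 1, 1, 2])
x = np.array([0.0, 0.5])
ldv = local_distance_vector(x, anchors, labels, n_classes=3, k=3)
```

With k = 3, only classes 0 and 1 appear among the neighbors, so the entry of the distant class 2 is filled with the fallback value instead of its true (much larger) distance. A single search in the merged set replaces one search per class.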

Unlike the original local features, the local distance vector is more class specific, as desired for classification. Such a class-specific distance captures the underlying manifold structure of the local features [22]. Meanwhile, it is obtained by using only the few nearest neighbors, which avoids the coding process and ignores irrelevant classes far from the local feature. Thus, it gains stronger discriminative capability and more robustness to noise and outlier features. The local distance vector inherits another advantage from the distance pattern: all local distance vectors within the same class are more similar in the distance feature space owing to their class-specific characteristic. Therefore, they cooperate better with the subsequent pooling procedure. Furthermore, the calculation of the local distance vector is significantly faster than that of the distance vector because it is produced by searching for nearest neighbors within a single merged reference dataset.

Linear local distance coding framework

The proposed LLDC method utilizes the local distance vector to generate discriminative and effective image features, then adopts the coding-pooling framework to obtain a robust image representation. To verify the effectiveness and generalization of the proposed local distance transformation, we apply two different linear coding methods, i.e., locality-constrained linear coding (LLC) [13] and local soft-assignment coding (LSC) [19], to encode the local distance vectors, owing to their high efficiency and prominent performance.

Let \( \mathbf {X}=\left \{\mathbf {x}_{1}, \mathbf {x}_{2},\ldots, \mathbf {x}_{N}\right \}\in \mathbb {R}^{D\times N}\) be a set of D-dimensional local features extracted from an image. Given a codebook learned beforehand with M entities, i.e., \( \mathbf {B}=\left \{\mathbf {b}_{1}, \mathbf {b}_{2},\ldots, \mathbf {b}_{M}\right \}\in \mathbb {R}^{D\times M}\), the codes of an image \( \mathbf {Y}=\left \{\mathbf {y}_{1}, \mathbf {y}_{2},\ldots, \mathbf {y}_{N}\right \}\in \mathbb {R}^{M\times N}\) can be generated by using various coding schemes. The LLC method transforms each input feature into a linear combination of the basis in a given codebook utilizing the locality constraint. The resulting LLC codes of an image can be calculated via following criteria:

$$\begin{array}{@{}rcl@{}} &&\min_{\mathbf{Y}} \sum_{i=1}^{N} \left\| \mathbf{x}_{i} - \mathbf{B}\mathbf{y}_{i}\right\|^{2}_{\ell_{2}} + \lambda \left\| \mathbf{e}_{i}\odot \mathbf{y}_{i} \right\|^{2}_{\ell_{2}}, \\ &&\text{s.t.}\quad \mathbf{1}^{T} \mathbf{y}_{i}= 1, \quad \forall i, \end{array} $$

where \(\odot\) denotes element-wise multiplication and \(\mathbf{e}_{i}\in \mathbb{R}^{M}\) is the locality adaptor that measures the similarity between the input descriptor \(\mathbf{x}_{i}\) and the codebook entries \(\mathbf{b}_{j}\). It is defined as:

$$ {e}_{ij}= \text{exp} \left(\frac{\parallel \mathbf{x}_{i} - \mathbf{b}_{j} \parallel_{\ell_{2}}}{\sigma}\right), $$

where σ is used to adjust weight-decay speed for locality adaptor.
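For a single feature, the criterion above has a closed-form minimizer, which the following sketch computes. Under the constraint \(\mathbf{1}^{T}\mathbf{y}_{i}=1\), the reconstruction error equals \(\mathbf{y}_{i}^{T}\mathbf{Z}\mathbf{Z}^{T}\mathbf{y}_{i}\) with the rows of \(\mathbf{Z}\) given by \(\mathbf{b}_{j}-\mathbf{x}_{i}\), so the solution is proportional to \((\mathbf{Z}\mathbf{Z}^{T}+\lambda\,\text{diag}(\mathbf{e}_{i})^{2})^{-1}\mathbf{1}\). The sketch assumes a row-wise toy codebook and arbitrary values of λ and σ. It is not the approximate fast solution used in practice, and for real descriptors the adaptor may need rescaling to avoid overflow in the exponential.

```python
import numpy as np

def llc_code(x, codebook, lam=1e-2, sigma=1.0):
    """Exact minimizer of the LLC criterion for one feature x.

    codebook: (M, D) array whose rows are the visual words b_j.
    """
    dist = np.linalg.norm(codebook - x, axis=1)
    e = np.exp(dist / sigma)                   # locality adaptor e_ij
    z = codebook - x                           # shift so that x is the origin
    a = z @ z.T + lam * np.diag(e ** 2)        # quadratic form under 1^T y = 1
    y = np.linalg.solve(a, np.ones(len(codebook)))
    return y / y.sum()                         # rescale to satisfy 1^T y = 1

# Toy example: three nearby words and one distant word.
codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
x = np.array([0.2, 0.1])
y = llc_code(x, codebook)
```

Because the adaptor grows exponentially with distance, the distant fourth word receives a coefficient that is practically zero, which is the locality effect the penalty term is designed to produce.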

With respect to the LSC method, it assigns a local feature to the k nearest visual words of the codebook as follows:

$$\begin{array}{@{}rcl@{}} &&y_{ij} = \frac{\exp\left(-\beta \tilde{d}\left(\mathbf{x}_{i},\mathbf{b}_{j}\right)\right)}{\sum_{l=1}^{M} \exp\left(-\beta \tilde{d}\left(\mathbf{x}_{i},\mathbf{b}_{l}\right)\right)}, \\ && \tilde{d}\left(\mathbf{x}_{i},\mathbf{b}_{l}\right) = \left\{ \begin{array}{ll} d\left(\mathbf{x}_{i},\mathbf{b}_{l}\right), & \text{if} \quad \mathbf{b}_{l} \in \mathcal{N}_{k}\left(\mathbf{x}_{i}\right),\\ \infty, & \text{otherwise,} \end{array}\right. \end{array} $$

where \(\tilde{d}(\mathbf{x}_{i},\mathbf{b}_{j})\) is the local version of \(d(\mathbf{x}_{i},\mathbf{b}_{j})\), the original distance between \(\mathbf{x}_{i}\) and \(\mathbf{b}_{j}\), and \(\mathcal{N}_{k}(\mathbf{x}_{i})\) denotes the k nearest neighbors of \(\mathbf{x}_{i}\) in the codebook. Visual words outside the neighborhood therefore receive a zero coefficient.
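A minimal sketch of LSC coding for one feature is given below. It assumes that d is the squared Euclidean distance and that visual words outside the k-nearest neighborhood receive a zero coefficient (equivalently, an infinite local distance), as in the original LSC formulation [19]. The row-wise toy codebook and the function name are hypothetical; β = 10 follows the experimental settings of the paper.

```python
import numpy as np

def lsc_code(x, codebook, k=2, beta=10.0):
    """Local soft-assignment code of one feature x.

    codebook: (M, D) array whose rows are the visual words b_j.
    """
    d2 = np.sum((codebook - x) ** 2, axis=1)   # squared Euclidean (assumption)
    nn = np.argsort(d2)[:k]                    # indices of the k nearest words
    w = np.exp(-beta * (d2[nn] - d2[nn].min()))  # shifted for numerical stability
    code = np.zeros(len(codebook))
    code[nn] = w / w.sum()                     # softmax restricted to neighbors
    return code

codebook = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
x = np.array([0.2, 0.1])
code = lsc_code(x, codebook, k=2, beta=10.0)
```

Only the two nearest words carry non-zero coefficients, and these sum to one.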

Within the proposed LLDC framework, the local distance vectors are transformed from the local features; then the local distance vectors and the original local features are separately encoded and pooled to generate two image representations. It has been verified that the image representations produced by coding methods based on distance patterns are complementary to the ones from the original coding methods [22]. Consequently, we directly concatenate them to obtain a more discriminative and descriptive image representation.

An overview of the LLDC framework is shown in Figure 3.

Figure 3

Overview of the LLDC framework.

The framework includes the following steps:

  1. The local features, \(\mathbf{X}=\{\mathbf{x}_{i}\}_{i=1}^{N}\), are extracted from every image;

  2. The local distance vectors, \(\hat{\mathbf{d}} = \left\{\hat{\mathbf{d}}_{i}\right\}_{i=1}^{N}\), are transformed from the local features one by one following Algorithm 1;

  3. The local distance vectors are encoded by using the LSC or LLC coding scheme based on a pre-trained codebook \(\mathbf{B} = \{\mathbf{b}_{i}\}_{i=1}^{M}\);

  4. A max-pooling strategy is performed on the codes within each spatial subregion \(\mathcal{I}^{\ell}\) as follows:

    $$ \mathbf{\mathcal{V}}_{\hat{d}}^{\ell} = \max \left(\mathbf{y}_{k} \mid \mathbf{y}_{k} \in \mathcal{I}^{\ell}\right), $$

    where max is performed element-wise over the vectors in each subregion and \(\ell = 1,2,\ldots,\mathcal{L}\) is the index of the subregions;

  5. The image representation based on the local distance vector is generated by concatenating all the pooled features from every subregion, i.e., \(\mathbf{\mathcal{V}}_{\hat{d}} = \left[\mathbf{\mathcal{V}}_{\hat{d}}^{1}; \mathbf{\mathcal{V}}_{\hat{d}}^{2};\ldots; \mathbf{\mathcal{V}}_{\hat{d}}^{\mathcal{L}}\right]\). The representation is then normalized by:

    $$ \mathbf{\mathcal{V}}_{\hat{d}} = \mathbf{\mathcal{V}}_{\hat{d}} / \left\| \mathbf{\mathcal{V}}_{\hat{d}} \right\|_{\ell_{2}}; $$

  6. The original local features are also aggregated under the coding-pooling framework through steps 3 to 5 to obtain a second image representation; and

  7. The final image representation, obtained by concatenating \(\mathbf{\mathcal{V}}_{\hat{d}}\) with the representation of the original local features, is fed into a linear SVM classifier to classify the staining patterns of HEp-2 cells.
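The pooling, normalization, and concatenation steps above can be sketched as follows. The sketch assumes that the codes of both feature types are already available as row-wise arrays, that each code is tied to the pixel coordinates of its patch center, and that the pyramid uses 1×1, 2×2, and 4×4 grids as in the experimental settings of the paper. Treating empty subregions as zero vectors, as well as the function names and toy data, are assumptions.

```python
import numpy as np

def spm_max_pool(codes, xy, width, height, levels=(1, 2, 4)):
    """Max-pool codes over a spatial pyramid and L2-normalize the result.

    codes: (N, M) codes of the N local features of one image.
    xy:    (N, 2) patch-center coordinates (x, y) in pixels.
    """
    pooled = []
    for g in levels:                           # g x g grid at this level
        col = np.minimum((xy[:, 0] * g / width).astype(int), g - 1)
        row = np.minimum((xy[:, 1] * g / height).astype(int), g - 1)
        for cell in range(g * g):
            mask = (row * g + col) == cell
            # An empty subregion contributes a zero vector (assumption).
            pooled.append(codes[mask].max(axis=0) if mask.any()
                          else np.zeros(codes.shape[1]))
    v = np.concatenate(pooled)
    return v / (np.linalg.norm(v) + 1e-12)

def final_representation(codes_ldv, codes_sift, xy, width, height):
    """Concatenate the two normalized pooled representations (step 7)."""
    return np.concatenate([spm_max_pool(codes_ldv, xy, width, height),
                           spm_max_pool(codes_sift, xy, width, height)])

# Toy example: four features with 3-dimensional codes in an 8 x 8 image.
codes = np.array([[1.0, 0, 0], [0, 2.0, 0], [0, 0, 3.0], [1.0, 1.0, 1.0]])
xy = np.array([[1, 1], [6, 1], [1, 6], [6, 6]])
v = spm_max_pool(codes, xy, 8, 8)
rep = final_representation(codes, codes, xy, 8, 8)
```

With a codebook of M words, each pooled representation has M × (1 + 4 + 16) = 21M dimensions, so the concatenated vector fed to the classifier has 42M dimensions.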

Experiments and analyses

In this section, we verify the effectiveness and improvement of our proposed LLDC framework for HEp-2 cells classification.


In order to evaluate the performance of the proposed LLDC method, we use two HEp-2 cells datasets: the ICPR 2012 HEp-2 cell classification contest dataset (ICPR 2012 dataset) and International Conference on Image Processing (ICIP) 2013 Competition on cells classification by fluorescent image analysis training dataset (ICIP 2013 training dataset). Some examples of the datasets are shown in Figure 4.

Figure 4

Samples of the ICPR 2012 dataset and the ICIP 2013 training dataset with different staining patterns of HEp-2 cells.

The ICPR 2012 dataset consists of 1,455 HEp-2 cells segmented from 28 slide images, which were acquired by means of a fluorescence microscope (40-fold magnification) coupled with a 50-W mercury vapor lamp and a digital camera utilizing a CCD with square pixels of 6.45 μm. The images have a resolution of 1,388×1,038 pixels and a color depth of 24 bits. Each image can be categorized into one of six staining patterns, namely centromere (ce), coarse speckled (cs), cytoplasmic (cy), fine speckled (fs), homogeneous (ho), and nucleolar (nu). In addition, a fluorescence intensity, i.e., positive or intermediate, is assigned to each image. The cells in the images were manually segmented and annotated by specialists. According to the experimental protocol of the ICPR 2012 contest, the ICPR 2012 dataset is divided into a training set with 721 cells from half of the slide images and a test set with 734 cells from the rest of the slide images. The composition of the dataset is reported in Table 1.

Table 1 Composition of the ICPR 2012 dataset

The HEp-2 cell images of the ICIP 2013 dataset were obtained by using a monochrome high dynamic range cooled microscopy camera fitted on a microscope with a plan-apochromat 20×/0.8 objective lens and an LED illumination source. So far, only the training dataset is available. The ICIP 2013 training dataset contains 13,596 cells, which are categorized into six classes: homogeneous (ho), speckled (sp), nucleolar (nu), centromere (ce), nuclear membrane (nm), and golgi (go). The dataset includes two patterns that occur less frequently in clinical practice, namely the nuclear membrane pattern and the golgi pattern. Thus, it offers a more realistic evaluation of automatic classification algorithms. We partition the ICIP 2013 training dataset into a training set consisting of 6,842 cells from 42 slide images and a test set consisting of 6,754 cells from 41 slide images. See Table 2 for detailed information about the dataset.

Table 2 Composition of the ICIP 2013 training dataset

Experimental settings

We firstly extract dense SIFT features as the local feature. SIFT features are invariant to scaling and rotation and partially invariant to illumination change, viewpoint change, and noise. These properties are advantageous in HEp-2 cells classification as cell images are unaligned and have high within-class variabilities. In our experiments, SIFT features are extracted at single scale from densely located patches of gray-level images. The patches are centered at every 6 pixels and with a fixed size of 18×18 pixels.

To obtain local distance vectors, the number of anchor points \(\{\mathbf{m}_{i}^{c}\}\) for each class manifold \(\mathbf{M}^{c}\) is fixed to 1,024, so the size of the merged \(\mathbf{M}\) for the proposed local distance vector transformation is 6,144×128. For the original SIFT features and the corresponding local distance vectors, all the codebooks in the coding process contain 1,024 visual words learned from training samples by using the k-means clustering method. One of the most important parameters of the proposed LLDC method is \(k_{\text{LDV}}\), which defines the neighborhood of a local feature in the local distance vector transformation. In the subsequent coding process, the number of neighbors in the LLC method (i.e., \(k_{\text{LLC}}\)) is another parameter that can influence the classification performance. We also utilize the LSC method to encode the local distance vectors; therefore, the impact of the neighbor size (i.e., \(k_{\text{LSC}}\)) is discussed as well, while the smoothing factor β is fixed to 10. We study the influence of these parameters on HEp-2 cell classification in the ‘Discussion’ section.

After the coding process, SPM is applied by partitioning each image into three increasingly finer levels of subregions, i.e., 1×1, 2×2, and 4×4. We apply the max-pooling strategy to pool the codes within each spatial subregion. The pooled features of all the subregions are concatenated, and the final image representation is fed into a linear SVM classifier in the training and testing phases. The classifier is implemented with the LIBLINEAR package [34], chosen for its efficiency. The linear SVM is trained on the training set using a tenfold cross-validation strategy and tested on the test set.
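The pooling step above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `spm_max_pool`, the way patch centers are mapped to grid cells, and the zero fill for empty cells are all assumptions.

```python
def spm_max_pool(codes, positions, height, width, levels=(1, 2, 4)):
    """Max-pool per-feature codes over a spatial pyramid and concatenate.

    codes:     list of code vectors (one per local feature), each of length K
    positions: list of (row, col) patch centers matching `codes`
    Returns a flat list of length K * sum(n * n for n in levels).
    """
    k = len(codes[0])
    pooled = []
    for n in levels:
        cells = [None] * (n * n)  # one slot per cell of the n x n grid
        for code, (r, c) in zip(codes, positions):
            i = min(r * n // height, n - 1)  # grid row of this feature
            j = min(c * n // width, n - 1)   # grid column of this feature
            idx = i * n + j
            if cells[idx] is None:
                cells[idx] = list(code)
            else:
                cells[idx] = [max(a, b) for a, b in zip(cells[idx], code)]
        for cell in cells:
            # Assumption: a cell that contains no feature contributes zeros
            pooled.extend(cell if cell is not None else [0.0] * k)
    return pooled


if __name__ == "__main__":
    toy_codes = [[0.2, 0.0], [0.1, 0.9]]
    toy_positions = [(0, 0), (3, 3)]
    print(len(spm_max_pool(toy_codes, toy_positions, height=4, width=4)))
```

With the three levels used here there are 1 + 4 + 16 = 21 subregions, so a codebook of 1,024 visual words yields a pooled vector of 21 × 1,024 = 21,504 dimensions for a single representation.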

The experimental results are reported at both the cell level and the image level. At the cell level, let \(tp_{i}\), \(tn_{i}\), \(fp_{i}\), and \(fn_{i}\) respectively denote the true positives, true negatives, false positives, and false negatives for an individual staining pattern class \(c_{i}\). In our experiments, we use accuracy and recall as the cell-level performance measures, which are formulated as:

$$\text{accuracy} = \sum_{i=1}^{C} \frac{tp_{i}}{tp_{i}+tn_{i}+fp_{i}+fn_{i}},$$
$$\text{recall} = \frac{\sum_{i=1}^{C} \frac{tp_{i}}{tp_{i}+fn_{i}}}{C},$$

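where C is the number of cell classes. Because the sum of true positives, true negatives, false positives, and false negatives equals the total number of test cells for every class, accuracy is the overall fraction of correctly classified cells, whereas recall is the mean of the per-class recalls.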
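For concreteness, the two measures can be computed from a confusion matrix of raw counts as in the sketch below. The function name `cell_level_metrics` and the toy counts are illustrative only, and the sketch assumes every class has at least one test cell.

```python
def cell_level_metrics(confusion):
    """Compute (accuracy, recall) from a square confusion matrix of counts.

    confusion[i][j] = number of cells of true class i predicted as class j.
    For class i, tp_i is the diagonal entry and tp_i + fn_i is the row sum.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Overall accuracy: sum of per-class true positives over all test cells
    accuracy = sum(confusion[i][i] for i in range(n_classes)) / total
    # Mean of per-class recalls
    recalls = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    recall = sum(recalls) / n_classes
    return accuracy, recall


if __name__ == "__main__":
    toy = [[3, 1],
           [2, 14]]
    print(cell_level_metrics(toy))
```

On unbalanced data the two values differ: in the toy matrix, the small first class lowers the mean recall more than it lowers the overall accuracy.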

At the image level, the predicted staining pattern of each image is the pattern most frequently assigned to the cells within that image. The image-level classification accuracy is the number of correctly classified images divided by the total number of images.
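The image-level decision rule amounts to a majority vote over the cell-level predictions, which can be sketched as below. The function names and the toy labels are hypothetical. Tie-breaking is not specified in the text; this sketch keeps the label that appears first among the tied ones.

```python
from collections import Counter


def image_level_prediction(cell_predictions):
    """Return the pattern most frequently assigned to the cells of one image."""
    return Counter(cell_predictions).most_common(1)[0][0]


def image_level_accuracy(cells_by_image, true_labels):
    """Fraction of images whose majority-vote pattern matches the true pattern.

    cells_by_image: dict mapping image id -> list of predicted cell labels
    true_labels:    dict mapping image id -> true staining pattern
    """
    correct = sum(
        image_level_prediction(cells_by_image[image_id]) == label
        for image_id, label in true_labels.items()
    )
    return correct / len(true_labels)


if __name__ == "__main__":
    cells = {"img1": ["ho", "sp", "ho"], "img2": ["nu", "go", "go"]}
    truth = {"img1": "ho", "img2": "nu"}
    print(image_level_accuracy(cells, truth))
```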

Experimental results on the ICPR 2012 dataset

We first test the performance of the proposed LLDC method on the ICPR 2012 dataset, following the experimental protocol of the HEp-2 cells classification contest, in which the cell images are divided into a training set and a test set. The subdivision is performed while maintaining approximately the same image pattern distribution over the two sets [11]. To assess the performance of our method, we compare four different image representations: the original SIFT-based BoW image representation (LLC/LSC-sift), the distance-vector-based image representation (LLC/LSC-(sift+dv)), our proposed image representation using the local distance vector (LLC/LSC-ldv), and the proposed concatenated image representation (LLC/LSC-(sift+ldv)). Table 3 compares the cell-level and image-level classification performance. The proposed LLDC method outperforms all the other methods. Notably, it outperforms CoALBP [28], the winner of the contest with a cell-level classification accuracy of 70.4% and a recall of 68.4%. Furthermore, LLC/LSC-(sift+ldv) performs better than both LLC/LSC-sift and LLC/LSC-ldv. In particular, LSC-(sift+ldv) achieves better classification performance than LLC-(sift+ldv).

Table 3 Classification performance on the ICPR 2012 dataset

Table 4 shows the cell-level confusion matrix of the proposed LLDC method using the LSC strategy on the concatenated image representation. The entry in row i and column j of the confusion matrix represents the percentage of cells from class i assigned to class j. The cytoplasmic, centromere, and homogeneous patterns are classified more accurately than the others. In particular, the cytoplasmic pattern achieves 100% classification accuracy. Compared with the cytoplasmic pattern, which has a distinguishable shape, and the centromere pattern, which has clear fluorescent dots, the speckled and homogeneous patterns have similar characteristics and lack discriminative features, which makes them hard to separate.

Table 4 The cell-level classification confusion matrix of LSC-(sift+ldv) using the ICPR 2012 dataset

To evaluate the classification performance at the image level, we report the corresponding confusion matrix in Table 5. Similarly, the table gives the percentage of images of class i assigned to class j with respect to the total number of images in the test set. Our proposed LLDC method obtains an image-level classification accuracy of 85.7%, i.e., 12 of the 14 images in the test set are correctly classified. The centromere, cytoplasmic, homogeneous, and nucleolar patterns achieve 100% classification accuracy. The most frequent mistake is confusion between the fine speckled and homogeneous patterns, which is also a common mistake at the cell level.

Table 5 The image-level classification confusion matrix of LSC-(sift+ldv) method using the ICPR 2012 dataset

Experimental results on the ICIP 2013 training dataset

On the ICIP 2013 training dataset, the cell-level and image-level classification performance of the different algorithms is shown in Table 6. Our proposed LLDC method achieves the best performance. On this dataset, LLC-(sift+ldv) achieves better classification performance than LSC-(sift+ldv). Table 7 shows the cell-level confusion matrix of the proposed LLDC method using the LLC strategy on the concatenated image representation. The nuclear membrane pattern obtains the highest classification accuracy, followed by the homogeneous pattern, as both have distinctive characteristics compared with the other patterns. The golgi pattern is often mistaken for the nucleolar pattern, because some golgi cells have large speckles within the nucleoli while others have only several clusters of irregular granules, which resembles the nucleolar pattern. Table 8 shows the image-level confusion matrix. The proposed LLDC method obtains an image-level classification accuracy of 90.2%, i.e., 37 of the 41 images in the test set are correctly identified. The nucleolar and nuclear membrane patterns obtain 100% image-level accuracy. The golgi pattern is misclassified as nucleolar, an error that is also very common at the cell level.

Table 6 Classification performance on the ICIP 2013 training dataset
Table 7 The cell-level classification confusion matrix of LLC-(sift+ldv) using the ICIP 2013 training dataset
Table 8 The image-level classification confusion matrix of LLC-(sift+ldv) using the ICIP 2013 training dataset

Discussion


To provide a more comprehensive analysis of the proposed LLDC method, we further evaluate its performance with respect to the number of nearest neighbors used in the local distance vector calculation and in the coding process. Note that the classification performance evaluated in this section is the cell-level classification accuracy.

Neighbor number \(k_{\text{LDV}}\) for calculating the local distance vector: In our proposed method, we first introduce a merged manifold M for all the classes. Second, we transform the original local features into local distance vectors by searching for the \(k_{\text{LDV}}\) nearest neighbors of the local feature, disregarding classes that are isolated from the local feature. Figures 5 and 6 show the classification accuracy under various values of \(k_{\text{LDV}}\) for the ICPR 2012 dataset and the ICIP 2013 training dataset, respectively. For the ICPR 2012 dataset, the proposed LLDC method achieves the best classification performance when \(k_{\text{LDV}}=35\) with the LSC coding scheme. For the ICIP 2013 training dataset, \(k_{\text{LDV}}=50\) is the best choice with the LLC coding scheme.
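The neighborhood search described above can be sketched in a strongly simplified form. This is not the authors' exact formulation and rests on two loudly stated assumptions: the image-to-class distance is replaced by the plain Euclidean distance to the nearest anchor point of a class, and classes that do not appear among the nearest neighbors receive a background value equal to the largest distance within the neighborhood. The function name and the toy anchor points are hypothetical.

```python
import math


def local_distance_vector(x, anchors, labels, n_classes, k_ldv):
    """Simplified local distance vector of feature x (one entry per class).

    anchors: merged list of anchor points from all class manifolds
    labels:  labels[i] is the class index of anchors[i]
    """
    # Distances from x to every anchor in the merged manifold, nearest first
    dists = sorted((math.dist(x, a), c) for a, c in zip(anchors, labels))
    neighbours = dists[:k_ldv]
    # Assumption: isolated classes get the largest distance in the neighborhood
    background = neighbours[-1][0]
    ldv = [background] * n_classes
    for d, c in neighbours:
        # Assumption: nearest-anchor distance stands in for the class distance
        if d < ldv[c]:
            ldv[c] = d
    return ldv


if __name__ == "__main__":
    toy_anchors = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [9.0, 9.0]]
    toy_labels = [0, 0, 1, 2]
    print(local_distance_vector([0.0, 0.0], toy_anchors, toy_labels, 3, k_ldv=3))
```

In the toy call, class 2 has no anchor among the three nearest neighbors, so its entry is not adjusted individually. This mirrors the idea that only classes found in the local neighborhood influence the vector.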

Figure 5 Classification accuracy of the LLDC method under various \(k_{\text{LDV}}\) on the ICPR 2012 dataset.

Figure 6 Classification accuracy of the LLDC method under various \(k_{\text{LDV}}\) on the ICIP 2013 training dataset.

Neighbor number \(k_{\text{LLC}}\) in the LLC method: We investigate the effect of the neighbor number \(k_{\text{LLC}}\) in the approximated LLC coding scheme on the classification performance. Figure 7 shows the performance for \(k_{\text{LLC}} \in \{2,5,10,20,30,\ldots,70\}\). The best classification accuracy is achieved when \(k_{\text{LLC}}=5\) for the ICPR 2012 dataset and \(k_{\text{LLC}}=60\) for the ICIP 2013 training dataset.

Figure 7 Classification accuracy of the LLDC method using the LLC strategy under various \(k_{\text{LLC}}\).

Neighbor number \(k_{\text{LSC}}\) in the LSC method: In the LSC coding strategy, only the \(k_{\text{LSC}}\) nearest neighbors of a local feature are considered in the coding procedure. We discuss the impact of different values of \(k_{\text{LSC}}\) on the staining pattern classification performance. Figure 8 shows the classification accuracy for \(k_{\text{LSC}} \in \{2,5,10,15,20,\ldots,40\}\). The best choice is \(k_{\text{LSC}}=10\) for the ICPR 2012 dataset and \(k_{\text{LSC}}=30\) for the ICIP 2013 training dataset.

Figure 8 Classification accuracy of the LLDC method using the LSC strategy under various \(k_{\text{LSC}}\).
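To make the role of the neighbor number and the smoothing factor β in the LSC strategy concrete, the sketch below follows the usual localized soft-assignment formulation: exponential weights on the squared Euclidean distance, restricted to the nearest visual words and normalized to sum to one. The function name `lsc_code` and the toy codebook are illustrative, and the distance shift is a numerical-stability detail added here rather than something taken from the text.

```python
import math


def lsc_code(x, codebook, k_lsc, beta=10.0):
    """Localized soft-assignment code of feature x over a codebook.

    Only the k_lsc nearest visual words receive non-zero weight.
    """
    sq = [sum((xi - bi) ** 2 for xi, bi in zip(x, b)) for b in codebook]
    nearest = sorted(range(len(codebook)), key=lambda j: sq[j])[:k_lsc]
    # Shift by the smallest distance so the exponentials cannot all underflow;
    # the shift cancels in the normalization below
    d_min = sq[nearest[0]]
    weights = {j: math.exp(-beta * (sq[j] - d_min)) for j in nearest}
    total = sum(weights.values())
    code = [0.0] * len(codebook)
    for j, w in weights.items():
        code[j] = w / total
    return code


if __name__ == "__main__":
    toy_codebook = [[0.0], [1.0], [3.0]]
    print(lsc_code([0.0], toy_codebook, k_lsc=2, beta=1.0))
```

A larger neighbor number spreads the weight over more visual words, while a larger β concentrates it on the closest one. This is one way to read the dataset-dependent optimum reported above.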

Finally, we evaluate the running time of each method, as shown in Table 9. Our proposed framework combines two kinds of image representations and therefore requires more computational time.

Table 9 Running time(s) for each method


In this study, we have presented a promising framework, LLDC, for automatic staining pattern classification of HEp-2 cells to support the diagnosis of specific autoimmune diseases. The LLDC framework extracts more discriminative information and consequently gives better HEp-2 cell classification performance than many existing coding methods. The LLDC method is based on the local distance vector, which captures discriminative information via the image-to-class distance. Furthermore, the local distance vector improves the classification performance by making adjustments only to the classes found among the \(k_{\text{LDV}}\) nearest neighbors of the local feature, which avoids disturbance from isolated classes. Additionally, the distance patterns and the original local features are shown to be complementary to each other; therefore, concatenating the two image representations achieves higher classification accuracy. Experimental results on the ICPR 2012 dataset and the ICIP 2013 training dataset validate that the proposed LLDC framework provides superior performance for HEp-2 cell classification compared with several improved coding methods.

Compared with traditional coding methods, the LLDC framework is time-consuming because it needs to transform the original local features into local distance vectors one by one and because it integrates two kinds of image representations. In the future, we plan to design a new model that reduces the algorithm’s complexity [35] while improving the accuracy.


  1. PL Meroni, PH Schur, ANA screening: an old test with new recommendations. Ann. Rheum. Dis. 69(8), 1420–1422 (2010).

  2. R Hiemann, T Büttner, T Krieger, D Roggenbuck, U Sack, K Conrad, Challenges of automated screening and differentiation of non-organ specific autoantibodies on HEp-2 cells. J. Autoimmun. Rev. 9(1), 17–22 (2009).

  3. P Soda, G Iannello, in IEEE Int. Symp. on Computer-Based Med. Syst. A multi-expert system to classify fluorescent intensity in antinuclear autoantibodies testing (Salt Lake City, Utah, USA, 2006), pp. 219–224.

  4. R Hiemann, N Hilger, U Sack, M Weigert, Objective quality evaluation of fluorescence images to optimize automatic image acquisition. J. Cytom. Part A. 69(3), 182–184 (2006).

  5. LS Cheong, F Lin, HS Seah, K Qian, F Zhao, PS Thong, KC Soo, M Olivo, S-Y Kung, Embedded computing for fluorescence confocal endomicroscopy imaging. J. Signal Process. Syst. 55(1–3), 217–228 (2009).

  6. G Percannella, P Soda, M Vento, in IEEE Int. Symp. on Computer-Based Med. Syst. A classification-based approach to segment HEp-2 cells (Roma, Italy, 2012), pp. 1–5.

  7. P Foggia, G Percannella, P Soda, M Vento, in IEEE Int. Symp. on Computer-Based Med. Syst. Early experiences in mitotic cells recognition on HEp-2 slides (Perth, Australia, 2010), pp. 38–43.

  8. P Soda, G Iannello, M Vento, A multiple expert system for classifying fluorescent intensity in antinuclear autoantibodies analysis. Pattern Anal. Appl. 12(3), 215–226 (2009).

  9. R Hiemann, N Hilger, J Michel, J Nitschke, A Boehm, U Anderer, M Weigert, U Sack, Automatic analysis of immunofluorescence patterns of HEp-2 cells. Ann. N. Y. Acad. Sci. 1109(1), 358–371 (2007).

  10. J Yu, F Lin, H-S Seah, C Li, Z Lin, Image classification by multimodal subspace learning. Pattern Recognit. Lett. 33(9), 1196–1204 (2012).

  11. P Foggia, G Percannella, P Soda, M Vento, Benchmarking HEp-2 cells classification methods. IEEE Trans. Med. Imaging. 32(10), 1878–1889 (2013).

  12. J Yang, K Yu, Y Gong, T Huang, in Proc. CVPR. Linear spatial pyramid matching using sparse coding for image classification (Miami, Florida, USA, 2009), pp. 1794–1801.

  13. J Wang, J Yang, K Yu, F Lv, T Huang, Y Gong, in Proc. CVPR. Locality-constrained linear coding for image classification (San Francisco, California, USA, 2010), pp. 3360–3367.

  14. A Wiliem, C Sanderson, Y Wong, P Hobson, RF Minchin, BC Lovell, Automatic classification of human epithelial type 2 cell indirect immunofluorescence images using cell pyramid matching. Pattern Recogn. 47(7), 2315–2324 (2014).

  15. L Shen, J Lin, S Wu, S Yu, HEp-2 image classification using intensity order pooling based features and bag of words. Pattern Recogn. 47(7), 2419–2427 (2014).

  16. S Lazebnik, C Schmid, J Ponce, in Proc. CVPR, 2. Beyond bags of features: spatial pyramid matching for recognizing natural scene categories (New York, New York, USA, 2006), pp. 2169–2178.

  17. JC van Gemert, J-M Geusebroek, CJ Veenman, AW Smeulders, in Proc. ECCV. Kernel codebooks for scene categorization (Marseille, France, 2008), pp. 696–709.

  18. JC van Gemert, CJ Veenman, AW Smeulders, JM Geusebroek, Visual word ambiguity. IEEE Trans. Pattern Anal. Mach. Intell. 32(7), 1271–1283 (2010).

  19. L Liu, L Wang, X Liu, in Proc. ICCV. In defense of soft-assignment coding (Barcelona, Spain, 2011), pp. 2486–2493.

  20. K Yu, T Zhang, Y Gong, in Proc. NIPS. Nonlinear learning using local coordinate coding (Vancouver, British Columbia, Canada, 2009), pp. 2223–2231.

  21. O Boiman, E Shechtman, M Irani, in Proc. CVPR. In defense of nearest-neighbor based image classification (Anchorage, Alaska, USA, 2008), pp. 1–8.

  22. Z Wang, J Feng, S Yan, H Xi, Linear distance coding for image classification. IEEE Trans. Image Process. 22, 537–548 (2013).

  23. P Perner, H Perner, B Müller, Mining knowledge for HEp-2 cell image classification. J. Artif. Intell. Med. 26(1), 161–173 (2002).

  24. P Soda, G Iannello, Aggregation of classifiers for staining pattern recognition in antinuclear autoantibodies analysis. IEEE Trans. Inf. Technol. Biomed. 13(3), 322–329 (2009).

  25. E Cordelli, P Soda, in IEEE Int. Symp. on Computer-Based Med. Syst. Color to grayscale staining pattern representation in IIF (Bristol, United Kingdom, 2011), pp. 1–6.

  26. X Xu, F Lin, C Ng, KP Leong, Staining pattern classification of ANA-IIF based on sift features. J. Med. Imaging Health Inform. 2(4), 419–424 (2012).

  27. A Wiliem, Y Wong, C Sanderson, P Hobson, S Chen, BC Lovell, in IEEE Workshop on Applications of Computer Vision (WACV). Classification of human epithelial type 2 cell indirect immunofluoresence images via codebook based descriptors (Clearwater Beach, FL, USA, 2013), pp. 95–102.

  28. R Nosaka, Y Ohkawa, K Fukui, in Pac. Rim Symp. Advances in Image and Video Technol. Feature extraction based on co-occurrence of adjacent local binary patterns, (2012), pp. 82–91.

  29. K Li, J Yin, Z Lu, X Kong, R Zhang, W Liu, in Pattern Recognition (ICPR), 2012 21st International Conference On. Multiclass boosting SVM using different texture features in HEp-2 cell staining pattern classification (Tsukuba Science City, Japan, 2012), pp. 170–173.

  30. S Ghosh, V Chaudhary, in Proc. ICPR. Feature analysis for automatic classification of HEp-2 florescence patterns: computer-aided diagnosis of auto-immune diseases (Tsukuba Science City, Japan, 2012), pp. 174–177.

  31. S Di Cataldo, A Bottino, I Ul Islam, T Figueiredo Vieira, E Ficarra, Subclass discriminant analysis of morphological and textural features for HEp-2 staining pattern classification. Pattern Recogn. 47(7), 2389–2399 (2014).

  32. L Liu, L Wang, HEp-2 cell image classification with multiple linear descriptors. Pattern Recogn. 47(7), 2400–2408 (2014).

  33. K Yu, T Zhang, in Proc. ICML. Improved local coordinate coding using local tangents (Haifa, Israel, 2010), pp. 1215–1222.

  34. R-E Fan, K-W Chang, C-J Hsieh, X-R Wang, C-J Lin, Liblinear: a library for large linear classification. J. Mach. Learn. Res. 9, 1871–1874 (2008).

  35. F Zhao, F Lin, HS Seah, Binary sipper plankton image classification using random subspace. Neurocomputing. 73(10), 1853–1860 (2010).


This work is partially supported by two research grants, MOE2011-T2-2-037 and RG139/14, from the Ministry of Education, Singapore.

Author information

Corresponding author

Correspondence to Feng Lin.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access  This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.

The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this article

Cite this article

Xu, X., Lin, F., Ng, C. et al. Automated classification for HEp-2 cells based on linear local distance coding framework. J Image Video Proc. 2015, 13 (2015).

  • Indirect immunofluorescence
  • HEp-2 cells classification
  • Linear local distance coding
  • Local distance vector