A probabilistic segmentation and entropy-rank correlation-based feature selection approach for the recognition of fruit diseases

Khan, Muhammad Attique; Akram, Tallha; Sharif, Muhammad; Alhaisoni, Majed; Saba, Tanzila; Nawaz, Nadia

doi:10.1186/s13640-021-00558-2

Research
Open access
Published: 10 May 2021


Muhammad Attique Khan ORCID: orcid.org/0000-0002-6347-4890¹,
Tallha Akram²,
Muhammad Sharif³,
Majed Alhaisoni⁴,
Tanzila Saba⁵ &
…
Nadia Nawaz²

EURASIP Journal on Image and Video Processing volume 2021, Article number: 14 (2021)


Abstract

Agriculture plays a critical role in the economy of several countries by providing the main sources of income, employment, and food to their rural populations. In recent years, however, plants and fruits have been widely damaged by different diseases, causing heavy losses to farmers. This loss can be minimized by detecting plant diseases at an early stage using pattern recognition (PR) and machine learning (ML) techniques. In this article, an automated system is proposed for the identification and recognition of fruit diseases. Our approach is distinctive in that it overcomes challenges such as convex edges, inconsistency between colors, irregularity, visibility, scale, and origin. The proposed approach incorporates five primary steps: preprocessing, disease identification through segmentation, feature extraction and fusion, feature selection, and classification. The infection regions are extracted using the proposed adaptive and quartile deviation-based segmentation approach, and the resultant binary images are fused by employing the weighted coefficient of correlation (CoC). The most appropriate features are then selected using a novel framework of entropy and rank-based correlation (EaRbC). Finally, the selected features are classified using a multi-class support vector machine (MC-SVM). The PlantVillage dataset is used to evaluate the proposed system, which achieves an average segmentation and classification accuracy of 93.74% and 97.7%, respectively. Based on a set of statistical measures, we believe that the proposed method outperforms existing methods with greater accuracy.

1 Introduction

Plant diseases affect both the quality and quantity of agricultural products by interfering with a set of processes including plant growth, flower and fruit development, and absorbent capacity, to name but a few [1]. Early detection and classification of plant diseases therefore play a vital role in agriculture. Two options are available: manual inspection and computer vision techniques. The former is difficult and requires considerable effort and time [2], while the latter is more widely adopted because of its improved performance [3]. Plants show a range of symptoms from their early to final stages, which can be observed on fruits, leaves, and stems with the naked eye. These symptoms can therefore be categorized using computer vision (CV) and machine learning (ML) methods [4].

A great effort has been made in the field of CV to process visual features extracted from fruit images for the recognition of multiple diseases [5]. Several existing methods work well but do not consider a number of constraints, specifically those related to image quality [6–11], training/testing samples, number of labels, and disease complexity, to name but a few [12]. In this article, two fruits are selected and four diseases are initially considered: apple scab, apple rust, grapes rot leaves, and grapes leaf blight. Most existing methods follow a typical architecture, which includes (a) a preprocessing block, (b) a segmentation block, (c) a feature extraction block, and (d) a classification block. Scholars working in this domain have employed several detection methods, including clustering, thresholding, color-, shape-, and texture-based methods, and adaptive approaches. All these methods are somewhat problem dependent and follow the same trend: they address one kind of problem while keeping the parameters of other problems fixed. Therefore, no universal mechanism exists that deals efficiently with all kinds of problems.

In this article, we are primarily focusing on the classification of aforementioned diseases by following fundamental steps. Our primary contributions are enumerated below:

1.1 Major contributions

In this article, we introduced a new automated method for the identification and recognition of apple and grape diseases. The proposed method consists of five major steps: (a) contrast stretching; (b) identification of disease part by a fusion of novel adaptive and quartile deviation (QD)-based segmentation, which efficiently performs at the change in scale, origin, and irregularity of infection regions; (c) feature extraction and fusion; (d) an integrated framework of entropy and rank correlation is implemented for feature selection; and (e) classification. Our major contributions are listed below.

1. A contrast stretching technique based on global min and max values is proposed, which defines a contrast range to determine lower and upper threshold values.
2. An adaptive thresholding method following the trapezoidal rule is proposed, which works in two steps: (1) location of infected regions and (2) computation of a threshold based on the maxima and minima calculated after taking the second derivative.
3. A parallel feature fusion methodology is adopted, which jointly takes advantage of three sets of features (color, texture, and shape) to select the most discriminant values.
4. To overcome the curse of dimensionality, a feature selection methodology is proposed, which efficiently assigns ranks to the set of features based on entropy.

2 Literature review

Several methods exist in literature, which accurately classify fruit diseases using computer vision methods [13–18]. Specifically, for the identification of apple and grape diseases, various methods are proposed, which somehow manage to classify set of diseases with acceptable accuracy and sensitivity [19–22]. In unsupervised methods, range of algorithms are proposed including K-means clustering [23], global thresholding with morphological operations [24], graph cut methods [25], color segmentation [26], CLPSO-based fuzzy color segmentation [27], and adaptive approaches [28], to name but a few.

Bhivini et al. [2] introduced a framework to classify infected regions in apples. In the first stage of segmentation, they used K-means clustering to extract the infected region and then extracted color and texture features from the segmented part. Feature fusion was then performed by simple concatenation prior to classification with a random forest. Similarly, Shiv et al. [5] introduced a method to classify apple diseases based on color, texture, and shape features. The method comprises three fundamental steps: segmentation using K-means; extraction of color, texture, and shape features; and classification using a multi-class SVM. Following the same trend, Shiv et al. [28] introduced an adaptive approach to detect infectious regions, including apple scab, rot, and blotch, achieving a classification accuracy of 93%. That method also incorporates three primary steps: segmentation using K-means, feature extraction, and classification using a multi-class SVM.

Zhang et al. [29] followed a novel machine learning method for detecting apple diseases. They made use of HSI, YUV, and gray color spaces for the removal of background via thresholding. The infectious regions are extricated by a region growing method to calculate shape, color, and texture features for each region. Finally, the most prominent features are classified using SVM, which are selected using genetic algorithm (GA) and correlation-based feature selection (CFS) method. Similarly, Soni et al. [30] identified plant diseases by following two fundamental steps of segmentation and classification. In the first step, ring-based segmentation is performed to identify infectious regions, followed by the feature extraction step. A probabilistic neural network is used for the final classification of diseases from randomly selected images acquired from the web. Lee et al. [31] implemented a swarm optimization-based method for the identification of apple diseases. Stochastic PSO algorithm finds out 10 spectral features based on pair of bands to return distinctiveness between each pair of classes. The selected features are later utilized by SVM to achieve improved performance. Harshal et al. [32] introduced a framework for the identification and classification of grape diseases. They implemented a background subtraction method for segmentation and later analyze the regions after passing through a high-pass filter. Thereafter, unique fractal-based texture features are extracted and finally classified through a multi-class SVM. They selected downy mildew and black rot diseases for evaluation and achieved classification accuracy of 96.6%.

Pranjali et al. [33] introduced an approach of fused classifiers for efficient classification of grape diseases. Initially, both SVM and ANN are used independently, and then a new ensemble classifier is constructed for final classification. Similarly, Awate et al. [34] used K-means for segmentation. Texture, color, morphological, and structural features are then calculated and passed to an ANN classifier for final classification. A general comparison with recent methods is provided in Table 1 in terms of segmentation technique, type of features, feature selection, classification method, disease type, and classification accuracy.

Table 1 Comparison of proposed technique with existing methods


From the recent studies, it is clear that methods including fuzzy techniques, thresholding, and K-means are mostly used for the identification of infectious regions. Recently, saliency- and CNN-based techniques have shown improved performance in this domain [38]. Moreover, color and texture features are mostly used for final classification, but the “curse of dimensionality” is largely ignored. In this article, we focus primarily on contrast stretching, infectious region segmentation, and feature selection to avoid the aforementioned problem. The contrast stretching technique improves the visual characteristics of an input image, which helps in the segmentation phase. The proposed feature selection algorithm aids in improving the overall classification accuracy.

3 Proposed method

In this section, the proposed method is explained. It incorporates a series of steps: preprocessing, image segmentation and fusion, feature extraction, fusion and selection, and a final classification step. Figure 1 shows the working framework of the proposed method and the sequence of these steps.

3.1 Contrast stretching

Contrast stretching is mostly applied to images whose visual contents need to be enhanced. In this article, a global contrast stretching technique is proposed, which directly affects the infectious regions by making them maximally differentiable from the background. The method first finds the global maximum and minimum of each of the red, green, and blue channels to generate new global minimum and maximum values. These values are then used to find a new range of intensity values for each channel, which in turn yields new low and high threshold values.

Let ψ(i,j,k) be an original three-dimensional RGB image of size 256×256×3, where $\psi _{1}(i, j)=\frac {r}{\sum _{k=1}^{3}r^{k}},\psi _{2}(i, j)=\frac {g}{\sum _{k=1}^{3}g^{k}}$, and $\psi _{3}(i, j)=\frac {b}{\sum _{k=1}^{3}b^{k}}$ represent the modified red, green, and blue channels. Here, the red channel is expressed as the fraction $\frac {red}{red+green+blue}$; the $\sum $ therefore denotes addition over the pixel values of the three channels. The channel histograms are shown in Fig. 2.

Suppose T_L and T_H are the low and high threshold values, initialized to 0.01 and 1, respectively. The global maxima and minima are then calculated using the initial T_L and T_H values as follows:

$$ \phi_{\text{max}}(k)=Max(\psi_{i,j}); \ \phi_{\text{min}}(k)=Min(\psi_{i,j}) $$

(1)

where ϕ_max and ϕ_min are the global maximum and minimum values, and Max and Min are the functions that select the maximum and minimum values from each channel k∈{1,2,3}; the red, green, and blue channels are denoted by ψ₁, ψ₂, and ψ₃, respectively.

The initial values of the global maximum and minimum are 1 and 0. A new global minimum pixel image is then calculated by subtracting ϕ_min from the original image ψ(i,j,k); the effects are shown in Fig. 3b. The subtracted image is stored in a temporary array (T_ar) of size 256×256, and the maximum and minimum pixel values for the entire processed image are found by Eqs. 2 and 3:

$$ \phi_{\text{min}}(\text{new})=\psi(i,j,k)-\phi_{\text{min}}(k) $$

(2)

$$ \phi_{\text{max}}(\text{new})=\psi(i,j,k)-\phi_{\text{max}}(k) $$

(3)

These values are utilized to calculate the range of contrast by Eq. 4.

$$ R_{ctr}= \phi_{\text{max}}(\text{new})\times \phi_{\text{min}}(\text{new}) $$

(4)

where R_ctr denotes the contrast range image of dimension 256×256 as shown in Fig. 3c. To control the variation of contrast stretching, the low threshold (T_L) and high threshold values (T_H) are updated by Eqs. 5 and 6.

$$ T_{L}=T_{L} \times R_{ctr}, \ T_{H}=T_{H} \times R_{ctr} $$

(5)

$$ T_{L}=\phi_{\text{min}}(\text{new})+ T_{L}, \\ T_{H}= R_{ctr}- T_{H} $$

(6)

The low and high threshold values are used in the contrast stretching cost function to concatenate the results of each channel. The cost function produces a new image, which is enhanced compared to the original image. The cost function is defined by Eq. 7:

$$ F_{\text{cost}}(i,j,k)= \left\{\begin{array}{ll} 1 & if \ \frac{ T_{ar}}{T_{H}-T_{L}}\geq R_{ctr}\\ 0 & if \ \left (T_{H}-T_{L} < R_{ctr} \right) \end{array}\right. $$

(7)

where F_cost(i,j,k) is a resultant contrast stretched image and R_ctr is contrast range value which lies between 0 and 1. Equation 7 shows that if $\frac { T_{ar}}{T_{H}-T_{L}}\geq R_{ctr}$, then the diseased region in the image is enhanced; otherwise, it improves the background. Contrast stretching final results are shown in Figs. 3 and 4, which are later processed in segmentation phase.
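For orientation, the conventional global min-max stretch that this step builds on can be sketched as follows. This is not the cost function of Eq. 7: the function name `stretch_channel` and the clipping fractions `low` and `high` are assumptions chosen for illustration, loosely mirroring the initial values T_L = 0.01 and T_H = 1.

```python
def stretch_channel(channel, low=0.01, high=1.0):
    """Linearly rescale one channel (list of rows of floats) to [0, 1].

    `low` and `high` are fractions of the channel's own intensity range
    (assumed low < high); values outside the resulting thresholds are
    clipped before rescaling.
    """
    flat = [v for row in channel for v in row]
    c_min, c_max = min(flat), max(flat)      # global min and max, cf. Eq. 1
    span = c_max - c_min
    if span == 0:                            # flat channel: nothing to stretch
        return [[0.0 for _ in row] for row in channel]
    lo = c_min + low * span                  # lower threshold
    hi = c_min + high * span                 # upper threshold
    return [[(min(max(v, lo), hi) - lo) / (hi - lo) for v in row]
            for row in channel]


if __name__ == "__main__":
    red = [[0.20, 0.25], [0.30, 0.40]]
    for row in stretch_channel(red):
        print(row)
```

Applying the same function to each of the three channels and stacking the results gives a stretched RGB image under these assumptions.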

3.2 Disease identification

In this section, the proposed segmentation method is described; it comprises segmentation and fusion stages. In the former, a trapezoidal-based adaptive thresholding method and a quartile deviation (Q.D)-based segmentation method are employed independently, while in the latter, the binary images are fused using the proposed weighted coefficient of correlation. Figure 1 shows the steps for image segmentation and fusion.

3.2.1 Trapezoidal based adaptive thresholding

Let F_cost(i,j,k) be a contrast stretched image. To identify the infectious regions, a trapezoidal rule is employed [39], which calculates the area of infection by utilizing max and min pixel values.

$$ \Delta =\frac{Max_{\text{old}}-Min_{\text{old}}}{\text{Total}_{n}} $$

(8)

where Total_n denotes the total number of pixels in F_cost(i,j,k). A second derivative of an image is later computed and Eq. 8 is updated to find max and min pixel values. The obtained pixel values are finally embedded into a cost function to extract the infectious regions.

$$ D(i,j)={f}'(F_{\text{cost}}(i,j))\ di \ dj $$

(9)

$$ D^{2}(i, j)=f^{\prime\prime}D(i,j) \ di \ dj $$

(10)

$$ Max_{\text{up}}=Max\left(D^{2}(i, j)\right) $$

(11)

$$ Min_{\text{up}}=Min\left(D^{2}(i,j)\right) $$

(12)

where D(i,j) and D²(i,j) represent the first and second derivatives of an input image, and Max_up and Min_up are the updated max and min pixel values. These updated values are initially compared with the old max and min values, defined in Eq. 8, and later updated to calculate the area of infection.

$$ \int_{\alpha}^{\beta}f(i)di=\frac{Max_{\text{up}}-Min_{\text{up}}}{\text{Total}_{n}}[D^{2}(i,j)] $$

(13)

where $ \int _{\alpha }^{\beta }f(i)di$ represents the area of the infected region, which is further utilized in the threshold function.

$$ T(i,j)=\left\{ \begin{array}{ll} Foreground & if \ \int_{\alpha}^{\beta}f(i)di \geq \xi\\ Background & if \ \int_{\alpha}^{\beta}f(i)di < \xi \end{array}\right. $$

(14)

where ξ denotes pixels which are directly linked to $\int _{\alpha }^{\beta }f(i)di$, and T(i,j) represents an optimized adaptive segmented image; sample results are shown in Fig. 5.
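The two numerical ingredients of this subsection, a discrete second derivative and the trapezoidal rule, can be illustrated on a one-dimensional intensity profile. This is a simplified sketch and not the authors' implementation: the helper names and the threshold value passed as `xi` are hypothetical, and a real image would be processed in two dimensions.

```python
def second_difference(profile):
    """Discrete second derivative of a 1-D intensity profile (cf. Eq. 10)."""
    return [profile[i - 1] - 2 * profile[i] + profile[i + 1]
            for i in range(1, len(profile) - 1)]


def trapezoid_area(values, step=1.0):
    """Composite trapezoidal rule for equally spaced samples."""
    if len(values) < 2:
        return 0.0
    inner = sum(values[1:-1])
    return step * (0.5 * values[0] + inner + 0.5 * values[-1])


def label_region(profile, xi):
    """Foreground if the area under the profile reaches xi (cf. Eq. 14)."""
    return "Foreground" if trapezoid_area(profile) >= xi else "Background"


if __name__ == "__main__":
    profile = [0.0, 1.0, 4.0, 9.0]
    print(second_difference(profile))
    print(trapezoid_area(profile))
    print(label_region(profile, xi=5.0))
```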

3.2.2 Quartile deviation-based segmentation

Quartile deviation-based segmentation is a new segmentation method, which can be mapped directly onto the input image prior to the thresholding step to generate a binary image. This method works on the basis of coupling, depending on the curve changes. The coupling points are used with the normalization function, because Q.D is a property of a normal distribution. Let f(t)∈F_cost(i,j,k) have dimension (256×256×3); the initial function is then defined as:

$$ \int_{\mu - r}^{\mu + r}f(t)dt=\frac{1}{2} $$

(15)

where (μ−r) and (μ+r) represent the points of inflection. Taking L.H.S and putting the normalization function in Eq. 15:

$$ =\frac{1}{\sigma\sqrt{2 \pi}}\int_{\mu - r}^{\mu + r}e^{\frac{-1}{2}(\frac{t-\mu}{\sigma})^{2}} dt $$

(16)

Substituting $\frac {t-\mu }{\sigma }=X$, so that dt=σdX, gives a new equation:

$$ Q.D=\frac{1}{\sqrt{2 \pi}}\int_{\mu - r}^{\mu + r}e^{\frac{-X^{2}}{2}} dX $$

(17)

$$ \frac{1}{\sqrt{2 \pi}}\int_{-\frac{r}{\sigma}}^{\frac{r}{\sigma}}e^{\frac{-X^{2}}{2}} dX=\frac{1}{2} $$

(18)

According to the even property of the normal distribution, it becomes:

$$ \frac{1}{\sqrt{2 \pi}}\int_{0}^{\frac{r}{\sigma}}e^{\frac{-X^{2}}{2}} dX=\frac{1}{4} $$

(19)

$$ \frac{r}{\sigma}=0.6745, \ r=0.6745 \sigma $$

(20)

where r denotes final Q.D value, which is finally utilized in desired cost function for the extraction of infectious regions in fruits and plants. The output of the cost function is in the form of infectious and normal pixels.

$$ F_{\text{out}}(t)=\int \frac{1}{2(1+t)} \times \frac{1}{(1+r)^{3}}dt $$

(21)

$$ F_{\text{out}}(t)=\frac{1}{(1+r)^{3}}\int \frac{1}{2(1+t)}dt $$

(22)

$$ F_{\text{out}}(t)=\frac{1}{2(1+r)^{3}} \ \ln(1+t) $$

(23)

where t∈F_cost(i,j,k) and F_out(t) represents the pixels showing infection, which are set in the threshold function to obtain a binary segmented image.

$$ F_{QD}(i,j)=\left\{ \begin{array}{ll} 1 & if \ F_{\text{out}}(t) \geq t_{i} \\ 0 & if F_{\text{out}}(t)< t_{i} \end{array}\right. $$

(24)

where F_QD(i,j) represents the final Q.D-based segmented image and t_i denotes the current enhanced image pixel. The Q.D segmentation results including their contour, mesh graph, and 3-D contour images are shown in Fig. 6.
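The constant 0.6745 and the thresholding step can be checked numerically with a short sketch. Several points are assumptions: the function names are hypothetical, σ is taken as the population standard deviation of the pixel values, `f_out` is the antiderivative of the integrand in Eq. 21 with the integration constant set to zero, and the comparison value `t_ref` stands in for t_i of Eq. 24, whose exact choice the text leaves open.

```python
import math
import statistics


def central_mass(z):
    """P(-z < X < z) for a standard normal variable X."""
    return math.erf(z / math.sqrt(2.0))


def quartile_deviation(sigma):
    """r = 0.6745 * sigma, as in Eq. 20."""
    return 0.6745 * sigma


def f_out(t, r):
    """Integral of 1 / (2 (1 + t) (1 + r)^3) dt, constant taken as zero."""
    return 0.5 * math.log(1.0 + t) / (1.0 + r) ** 3


def qd_mask(pixels, t_ref):
    """Binary labels in the spirit of Eq. 24 for a flat list of pixels."""
    r = quartile_deviation(statistics.pstdev(pixels))
    return [1 if f_out(t, r) >= t_ref else 0 for t in pixels]


if __name__ == "__main__":
    # Half of the normal mass lies within 0.6745 standard deviations.
    print(round(central_mass(0.6745), 4))
    print(qd_mask([0.1, 0.9], t_ref=0.1))
```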

3.2.3 Image fusion

Image fusion is mostly employed where information from multiple sources (images) is consolidated into fewer images, usually a single one. In this article, a weighted coefficient of correlation (WCoC)-based technique is implemented for pixel-based fusion of the two segmented images. The CoC ranges over [−1, 1], but since we work on binary images, the resultant image is binary. The method looks for strong correlation between the pixels of both images. The most highly correlated pixels are assigned higher weights, while less correlated pixels are treated as background and eliminated. Suppose $\{p_{1}, p_{2}, \dots, p_{n}\}$ are uncorrelated pixels from both segmented images T(i,j) and F_QD(i,j) having the same standard deviation; the correlation coefficient is then defined as:

$$ \gamma_{12}=\frac{\sum(p_{1}-\bar{p_{1}})(p_{2}-\bar{p_{2}})}{\sqrt{\sum(p_{1}-\bar{p_{1}})^{2}\sum(p_{2}-\bar{p_{2}})^{2}}} $$

(25)

where γ₁₂ denotes a correlation between pixels which is initialized as γ₁₂=0.

Let σ²(p₁)=S² and σ²(p₂)=S² so {i,j}=(u₁+u₂). Let (i,j)∈(x,y), then the mathematical formulation is done as:

$$ \gamma_{xy}=\frac{\sum(x-\bar{x})(y-\bar{y})}{\sqrt{\sum(x-\bar{x})^{2}\sum(y-\bar{y})^{2}}} $$

(26)

The weight and bias values, selected to be 0.8 and 2.5, are then assigned.

$$ \gamma_{xy}=W_{xy}\times \frac{\sum(x-\bar{x})(y-\bar{y})}{\sqrt{\sum(x-\bar{x})^{2}\sum(y-\bar{y})^{2}}} \times bias $$

(27)

The above equation is simplified as:

$$ R_{i,j}=0.8 \times \frac{nS^{2}}{2nS^{2}} \times 2.5=1 $$

(28)

where $\sigma ^{2}(p_{1})=\frac {\sum (p_{1}-\bar {p_{1}})^{2}}{n}, S^{2}=\frac {\sum (p_{1}-\bar {p_{1}})^{2}}{n}, nS^{2}=\sum (p_{1}-\bar {p_{1}})^{2}, \sum (x-\bar {x})^{2}=2nS^{2}, \sum (y-\bar {y})^{2}=2nS^{2}, \sum (x-\bar {x})(y-\bar {y})=nS^{2}$, and R_i,j correlates those pixels whose sum is 1. The final fusion results are shown in Fig. 7.
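The correlation coefficient of Eq. 25 and its weighted form in Eq. 27 can be computed for two flattened binary masks as sketched below. The function names are hypothetical, and treating a constant mask as having zero correlation is an assumption made only to avoid division by zero; how the weighted value is mapped back to fused pixels is not shown here.

```python
import math


def pearson(xs, ys):
    """Correlation coefficient of Eq. 25 for two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # constant mask: correlation undefined, treated as 0 here
    return sxy / math.sqrt(sxx * syy)


def weighted_coc(xs, ys, weight=0.8, bias=2.5):
    """Eq. 27: weight * correlation * bias."""
    return weight * pearson(xs, ys) * bias


if __name__ == "__main__":
    adaptive_mask = [0, 1, 1, 0]   # flattened T(i,j), toy values
    qd_mask = [0, 1, 1, 0]         # flattened F_QD(i,j), toy values
    print(pearson(adaptive_mask, qd_mask))
    print(weighted_coc(adaptive_mask, qd_mask))
```

With a correlation of 0.5, the default weight and bias give 0.8 × 0.5 × 2.5 = 1, which is the value stated in Eq. 28.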

3.2.4 Analysis of segmentation results

For the analysis of the proposed segmentation technique on each disease, we selected 400 image samples (100 for each disease: apple scab, apple rust, grapes rot leaves, and grape leaf blight); a few can be seen in Fig. 8. Three measures are used to show the performance of the proposed method, namely accuracy, Jaccard Index, and false negative rate, calculated as follows:

$$ J_{Index}=\frac{\left | R_{i,j}\cap S(i,j) \right |}{\left |R_{i,j} \right |+\left | S(i,j) \right |-\left |R_{i,j}\cap S(i,j) \right |} $$

(29)

$$ Acc=\frac{TP_{l}}{R_{i,j}(l)+S(i,j)(l)-TP_{l}}\times 100 $$

(30)

$$ FNR=100-Acc $$

(31)

where R_i,j is the proposed segmented image, S(i,j) is the ground truth, and TP_l represents the correlated pixels. Tabular results are provided in Table 2, and graphical results along with their ground truths are shown in Figs. 9 and 10. A few other sample segmentation results are provided in Fig. 11. The maximum accuracy achieved on the tested images is 95.63%; the minimum reported false negative rate is 4.37%, the maximum Jaccard Index is 99.26%, the overall average accuracy is 93.74%, the average Jaccard Index is 94.17%, and the average false negative rate is 6.26%. Average results are also plotted in Fig. 12, which shows the range of segmentation accuracy over all selected images.
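A minimal sketch of the three measures on flattened binary masks follows. The function names are hypothetical, and how TP_l and the per-label pixel counts are obtained from an image is left to the caller, since the text does not spell out that counting.

```python
def jaccard_index(pred, truth):
    """Eq. 29 for two flattened binary masks (sequences of 0/1)."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(pred) + sum(truth) - inter
    return inter / union if union else 1.0  # two empty masks agree fully


def accuracy_percent(tp, pred_count, truth_count):
    """Eq. 30: TP / (R + S - TP) * 100, with counts supplied by the caller."""
    return 100.0 * tp / (pred_count + truth_count - tp)


def false_negative_rate(acc):
    """Eq. 31: FNR = 100 - Acc."""
    return 100.0 - acc


if __name__ == "__main__":
    segmented = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print(jaccard_index(segmented, ground_truth))
    acc = accuracy_percent(tp=1, pred_count=2, truth_count=1)
    print(acc, false_negative_rate(acc))
```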

Table 2 Performance results of the proposed identification of infection parts in the fruits and plants


3.3 Feature extraction

Features play a vital role in recognizing the primary contents of images or signals. Therefore, a number of techniques have been proposed in the field of pattern recognition and machine learning [40–45]. On the one hand, an optimal set of features leads to accurate classification, while, on the other hand, irrelevant and redundant features are one of the causes of high misclassification. In this article, we not only use multiple sets of features but also avoid feature redundancy by implementing a suitable feature selection method. We extract three types of features from the segmented images: statistical, color [46], and texture (segmented local binary patterns (SLBP)).

For color features, the RGB, HSV, LAB, and YCbCr color spaces are used, and four measures (mean, standard deviation, entropy, and skewness) are calculated for each channel. From each color space, we obtain a feature vector of size 1×12, which increases to 1×48 for all selected color spaces and N×48 for N images.
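The 1×12 vector per color space can be illustrated with the sketch below, which computes the four measures for each of three channels. The function names, the assumption that intensities lie in [0, 1], and the 256-bin histogram used for the entropy are choices made for illustration; color space conversion is not shown.

```python
import math
from collections import Counter


def channel_stats(values, bins=256):
    """Mean, standard deviation, entropy and skewness of one channel.

    `values` is a flat list of intensities assumed to lie in [0, 1].
    """
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    skew = (sum((v - mean) ** 3 for v in values) / n) / std ** 3 if std else 0.0
    hist = Counter(min(int(v * bins), bins - 1) for v in values)
    entropy = -sum((c / n) * math.log2(c / n) for c in hist.values())
    return [mean, std, entropy, skew]


def color_space_features(channels):
    """Concatenate the 4 statistics of 3 channels into 12 values (1x12)."""
    feats = []
    for ch in channels:
        feats.extend(channel_stats(ch))
    return feats


if __name__ == "__main__":
    rgb = [[0.0, 1.0], [0.5, 0.5], [0.2, 0.4]]  # toy flattened channels
    print(len(color_space_features(rgb)))
```

Repeating this for the four color spaces and concatenating the results gives the 1×48 vector described above.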

For statistical features, the Haralick method [47] is implemented. It originally used 14 features, but we added 8 new features: correlation 2, cluster prominence, cluster shade, dissimilarity, energy, homogeneity 1, homogeneity 2, and max probability. Adding these features improves the overall classification accuracy but also increases the computational time. A complete mathematical description of each feature is provided in Table 3, and the final vector size is 1×88.
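To make the idea of co-occurrence features concrete, the sketch below builds a normalized gray-level co-occurrence matrix for one pixel offset and derives four of the listed quantities from it. The formulas are common textbook forms and may differ in detail from the definitions in Table 3; the function names and the single horizontal offset are assumptions, and the image is assumed to be at least two pixels wide along the offset.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix of an integer image for offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def glcm_features(p):
    """Energy, contrast, homogeneity and max probability of a GLCM."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "max_probability": max(v for _, _, v in cells),
    }


if __name__ == "__main__":
    toy = [[0, 0], [1, 1]]  # two gray levels
    print(glcm_features(glcm(toy, levels=2)))
```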

Table 3 Extraction of twenty-two GLCM features


LBP [48] is a texture feature that captures information about neighboring pixels. In this work, the ‘A’ channel of the LAB color space is used as the input for feature extraction, because it provides more information than the other channels. The proposed segmented local binary pattern features (SLBPF) are based on three steps: (a) calculate the distance between the extracted LBP features, (b) calculate the statistical features of LBP, and (c) calculate the entropy features of their 8 neighborhood features. The extracted features are simply concatenated to form a new feature vector of size 1×72.

$$ \Psi_{LBP}=\sum_{p=0}^{n-1}2^{p} \ S(g_{p}-g_{c}) $$

(32)

where Ψ_LBP is a feature vector, $ S(u)=\left \{ \begin {array}{ll} 1 & if \ \ u\geq 0 \\ 0 & if \ \ u<0 \end {array}\right \}$ is a threshold function, n=8 is the number of neighbors, g_p denotes the gray value of the pth neighbor, and g_c is the gray value at the pivot location [49]. The distance between features is calculated using the relation:

$$ \vec{D}_{ij}=\sum_{i=1}^{M}\sum_{j=1}^{N}\left [ f_{i}^{n}-f_{j}^{n} \right ]^{2}, n\in n^{th} \ features $$

(33)

where $\vec {D}_{ij}$ denotes the distance matrix which is later utilized to compute the mean, variance, skewness, and kurtosis. Later, these metrics are concatenated to generate a new vector having dimension 1×64. The entropy features of each 8 neighboring features are computed as:

$$ Ent_{f}(i)=MK \sum_{i=1}^{8}(f_{i}\ ln f_{i}, a_{x}) \\ $$

(34)

$$ Ent_{f}(j)=NK \sum_{j=1}^{8}(f_{j}\ ln f_{j}, a_{y}) $$

(35)

where a_x and a_y denote the neighboring ith and jth features; 8 entropy features are extracted and concatenated with the previous vector to obtain a new feature vector having size 1×72. Finally, all features are fused [50] to generate a resultant vector of size 1×208. The core architecture of feature extraction and selection is shown in Fig. 13.
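The basic LBP operator underlying these features thresholds each of the 8 neighbors against the center pixel and weights the resulting bits by powers of two. A minimal sketch for a single 3×3 patch is given below; the function name and the clockwise neighbor ordering are assumptions, and the distance, statistical, and entropy steps of SLBPF are not reproduced.

```python
def lbp_code(patch):
    """LBP code of the center pixel of a 3x3 patch (list of 3 rows)."""
    center = patch[1][1]
    # Neighbors visited clockwise from the top-left corner (assumed order).
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for p, (r, c) in enumerate(order):
        if patch[r][c] >= center:  # threshold function S(u) with u >= 0
            code += 2 ** p
    return code


if __name__ == "__main__":
    patch = [[9, 0, 0],
             [0, 5, 0],
             [0, 0, 0]]
    print(lbp_code(patch))
```

Sliding this over every interior pixel and histogramming the codes yields a conventional LBP descriptor.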

3.4 Feature selection

To avoid redundancy, the feature selection step plays a primary role by discarding irrelevant and repeated information, hence selecting the most discriminant information. In this article, we implemented a new method based on the rank correlation and entropy value of the fused vector. The proposed method incorporates three fundamental steps: (a) calculate the correlation between the fused features; (b) find the entropy value of the fused features and multiply it by the rank correlation; and (c) set a threshold function to select those features that do not exceed the entropy-correlation value. The extracted fused features f₁,f₂,...,f_n are ranked from 1 to n, and we need to find the correlation between the ranks of the given features. The rank correlation is defined as:

$$ R_{F}=\frac{n\sum f_{1}f_{2}-\sum(f_{1})\sum(f_{2})}{\sqrt{(n\sum f_{1}^{2}-(\sum f_{1})^{2})(n\sum f_{2}^{2}-(\sum f_{2})^{2})}} $$

(36)

where f₁ and f₂ represent the fused feature vectors. Because the features are ranked from 1 to n, $\sum f_{1}=\sum f_{2}=\frac {n(n+1)}{2}$ and $\sum f_{1}^{2}=\sum f_{2}^{2}=\frac {n(n+1)(2n+1)}{6}$. Let φ=f₁−f₂ denote the difference between features. Squaring both sides, applying $\sum $, and dividing by 2 gives $\sum f_{1}f_{2}=\frac {n(n+1)(2n+1)}{6}-\frac {\sum \varphi ^{2}}{2}$, and hence $n\sum f_{1}f_{2}-\sum (f_{1})\sum (f_{2})=\frac {n^{2}(n^{2}-1)}{12}-\frac {n\sum \varphi ^{2}}{2}$. Similarly, $n\sum f_{1}^{2}-(\sum f_{1})^{2}$ and $n\sum f_{2}^{2}-(\sum f_{2})^{2}$ both equal $\frac {n^{2}(n^{2}-1)}{12}$. Substituting these simplifications into Eq. 36 gives:

$$ R_{F}=\frac{\frac{n(n^{2}-1)}{6}-\sum \varphi^{2}}{\frac{n(n^{2}-1)}{6}} $$

(37)

$$ R_{F}=1-\frac{6\sum \varphi^{2}}{n(n^{2}-1)} $$

(38)

where $\sum \varphi ^{2}=\sum (f_{1})^{2}+ \sum (f_{2})^{2}-2\sum f_{1}f_{2}$. The entropy value of the fused feature vector is then calculated and multiplied by the correlation. The obtained value is compared with each feature of the fused vector, and features are selected based on the final threshold function as follows:

$$ EC(f_{i})=Entropy \ \times (1-\frac{6\sum \varphi^{2}}{n(n^{2}-1)}) $$

(39)

$$ \overrightarrow{F(Vec)}=\left\{ \begin{array}{ll} Remove & if \ f_{i}>EC(f_{i})\\ Select & if \ f_{i}\leq EC(f_{i}) \end{array}\right. $$

(40)

The resultant vector $\overrightarrow {F(Vec)}$ is used for final classification. We ran the simulations several times and found that the selected vector size mostly lies in the range of 180–195. Finally, the multi-class SVM [51] is used as the base classifier for the classification of apple and grape diseases, and its classification results are compared with other well-known classification methods such as ensemble methods and decision trees. Two SVM kernel functions are used in this work: linear and radial basis function (RBF). The linear kernel is used for the binary class problem, with the kernel scale set to automatic, the classification method set to one vs one, and data standardization enabled. For the RBF kernel, the kernel scale is set manually, the box constraint level is 4, the multi-class method is one vs all, and gamma is initialized to 0.3.
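The rank correlation of Eq. 38 and the selection rule of Eqs. 39 and 40 can be sketched as follows. Several details are assumptions rather than the authors' implementation: the function names are hypothetical, ties in ranking are broken by position, the entropy is a Shannon entropy of the non-negative vector normalized to sum to 1, and the two vectors being correlated are supplied by the caller.

```python
import math


def ranks(values):
    """Ranks 1..n of the values (ties broken by position, an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def rank_correlation(a, b):
    """Eq. 38: 1 - 6 * sum(phi^2) / (n (n^2 - 1)), for n >= 2."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def shannon_entropy(values):
    """Entropy (base 2) of a non-negative vector normalized to sum to 1."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


def select_features(features, ec):
    """Eq. 40: keep the features whose value does not exceed EC."""
    return [f for f in features if f <= ec]


if __name__ == "__main__":
    f1 = [0.2, 0.5, 0.9, 0.4]
    f2 = [0.1, 0.6, 0.8, 0.3]
    ec = shannon_entropy(f1) * rank_correlation(f1, f2)  # Eq. 39
    print(select_features(f1, ec))
```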

4 Experimental results and discussion

In this section, the proposed method is validated on a publicly available dataset, PlantVillage [52], which contains diseased and healthy images (Fig. 14). To validate the proposed algorithm, individual features are classified first, and fusion and selection are applied later. A 10-fold cross-validation methodology is adopted, along with a fair comparison with other state-of-the-art classifiers including decision trees (DT), quadratic discriminant analysis (QDA), quadratic SVM (Q-SVM), cubic SVM (C-SVM), fine KNN (F-KNN), weighted KNN (W-KNN), ensemble boosted trees (EBT), and ensemble subspace discriminant analysis (E-SDA). Six statistical measures are considered for the performance comparison of the proposed method: sensitivity (Sen), specificity (Spec), precision (Prec), false positive rate (FPR), false negative rate (FNR), and accuracy. The training/testing ratio is 50:50, with 50% of the images used for training and 50% for testing. All experiments are performed in Matlab 2017b on a personal Intel Core i7 computer.
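For a two-class problem, the six measures follow directly from the confusion matrix counts. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from the paper's tables.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix measures, returned as fractions in [0, 1]."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


if __name__ == "__main__":
    # Illustrative counts only.
    for name, value in binary_metrics(tp=90, fp=5, tn=95, fn=10).items():
        print(f"{name}: {value:.3f}")
```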

4.1 Apple scab disease

In this section, the classification results for apple scab disease are presented. A total of 2275 images, comprising apple scab (630) and healthy apple (1645), are collected from the PlantVillage dataset. The results are obtained in two phases. In the first phase, results are obtained for each extracted set of features, as shown in Table 4, with maximum accuracy on the multi-class SVM of 94.1%, 86.3%, and 72.0% for SLBP, statistical, and color features, respectively. These results are then compared with the proposed entropy-rank correlation-based selection method. Table 5 shows a maximum accuracy of 97.1%, FNR of 2.9%, sensitivity of 96.15%, specificity of 96.2%, FPR of 0.039, and precision of 96.10%. The results are confirmed by the confusion matrix for apple scab given in Table 6. Tables 4 and 5 show that the proposed feature selection method produces better results than the individual sets of features. Moreover, the proposed method is compared with previous state-of-the-art methods in Table 7, which supports the validity of the proposed entropy-rank correlation method.

Table 4 Classification result on each extracted set of features for apple scab disease

Table 5 Proposed feature selection results of apple scab disease

Table 6 Confusion matrix of all experiments for selected diseases

Table 7 Comparison with existing methods

4.2 Apple rust disease

A total of 1920 images are collected from the PlantVillage dataset, comprising apple rust (275) and healthy apple (1645) images. The experiments are performed in two steps. In the first step, classification results are obtained for each extracted feature set (Table 8). The maximum accuracy is achieved with the M-SVM classifier: 93.2%, 90.9%, and 95.8% for SLBP, Haralick, and color features, respectively. In the second step, the selected features are classified using the proposed method, which shows improved performance (Table 9). The classification results are also confirmed by the confusion matrix given in Table 6. Tables 8 and 9 make it clear that the proposed feature selection method improves performance significantly. Additionally, the proposed classification results are compared with the existing methods in Table 7.

Table 8 Classification result on each extracted set of features for apple rust disease

Table 9 Classification results on the proposed feature selection approach for apple rust

4.3 Grape diseases

Two grape diseases, grape black rot and grape leaf blight, are selected in this section for classification. A total of 2679 images are collected from the PlantVillage dataset, comprising grape black rot (1180), grape leaf blight (1076), and healthy (423) images. The same procedure is followed: in the first step, classification results are obtained for each extracted feature set (Table 10). Table 10 reports the results for grape black rot, with accuracies of 93.2%, 90.9%, and 95.8% for SLBP, Haralick, and color features, respectively. The proposed classification results for grape leaf blight are presented in Table 11, with a maximum accuracy of 96.30%, which is also confirmed by the confusion matrix (Table 6). Finally, the proposed results are compared with the existing methods listed in Table 7, which shows that the proposed method performs significantly better than existing methods.

Table 10 Classification result on each extracted set of features for grapes rot leaves disease

Table 11 Proposed classification results for grapes leaf blight

4.4 Final classification

In this section, all selected diseases are classified together, and the proposed method is applied directly to the combined set. The testing results are given in Table 12, with a maximum accuracy of 97.1% on multi-class SVM. These results are confirmed by the confusion matrix given in Table 13, which supports the validity of the proposed method.
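When several diseases are classified together, the overall accuracy and the per-class figures can be read off a multi-class confusion matrix. The sketch below assumes rows are true classes and columns are predicted classes; the 3 x 3 matrix is invented for illustration and does not reproduce Table 13.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total if total else 0.0


def per_class_recall(cm):
    """Sensitivity of each class: diagonal entry divided by its row sum."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(cm)]


def per_class_precision(cm):
    """Precision of each class: diagonal entry divided by its column sum."""
    n = len(cm)
    result = []
    for j in range(n):
        column_total = sum(cm[i][j] for i in range(n))
        result.append(cm[j][j] / column_total if column_total else 0.0)
    return result


# Hypothetical matrix (rows: true class, columns: predicted class).
cm = [
    [48, 1, 1],
    [2, 45, 3],
    [0, 4, 46],
]
accuracy = overall_accuracy(cm)
recall = per_class_recall(cm)
precision = per_class_precision(cm)
```

Reporting per-class recall next to overall accuracy shows whether a high aggregate figure hides a weak class, which is useful when class sizes differ as much as they do in this dataset.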

Table 12 Proposed classification results on all selected diseases

Table 13 Confusion matrix of the final classification

4.5 Discussion

From a broader perspective, two primary domains are covered: (1) infected region segmentation and (2) discriminant feature selection. The proposed segmentation method relies on the fusion of images from two different sources; selected results can be seen in Figs. 7, 9, 10, and 11 and in Table 2, with a maximum accuracy of 95.63% and an average accuracy of 93.45%. In the second phase, feature selection, three types of features are fused using a simple serial-based method, and the final feature set is chosen using the entropy-rank correlation method. Five experiments are performed: apple scab, apple rust, grape rot leaves, grape leaf blight, and the final classification on all diseases, achieving accuracies of 97.1%, 94.70%, 96.60%, 96.30%, and 97.7%, respectively. For validation, classification results are also obtained for each individual feature type, as presented in Tables 4, 8, and 10. The proposed entropy-rank correlation results are presented in Tables 5, 9, 14, 11, and 12 and are confirmed by the confusion matrices given in Tables 6 and 13, which supports the validity of the proposed method. Additionally, 8 new statistical features improve the overall accuracy by adding a set of unique features (Fig. 15). Figure 15 shows that with 14 texture features, the achieved accuracies are 81.9%, 82.7%, 81.8%, and 84.5% for apple scab, apple rust, grape rot, and grape leaf blight, respectively, whereas adding the 8 features increases the accuracies to 86.3%, 87.2%, 90.9%, and 91.7%, respectively.
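As a rough illustration of the fusion-then-selection idea discussed above, the sketch below concatenates per-image feature vectors (serial fusion) and then ranks the fused columns by a histogram-based Shannon entropy. It is not the entropy-rank correlation method itself: the rank-correlation step is omitted, and the equal-width binning, the bin count, the function names, and the toy data are all assumptions.

```python
import math


def serial_fuse(*feature_sets):
    """Concatenate per-image feature vectors from several extractors."""
    n_images = len(feature_sets[0])
    if any(len(fs) != n_images for fs in feature_sets):
        raise ValueError("every feature set needs one row per image")
    return [
        [value for fs in feature_sets for value in fs[i]]
        for i in range(n_images)
    ]


def column_entropy(values, bins=4):
    """Shannon entropy (bits) of one feature after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant feature carries no information
    counts = [0] * bins
    for v in values:
        k = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[k] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)


def top_k_by_entropy(fused, k):
    """Return indices of the k highest-entropy columns, best first."""
    n_cols = len(fused[0])
    scores = [column_entropy([row[j] for row in fused]) for j in range(n_cols)]
    return sorted(range(n_cols), key=lambda j: (-scores[j], j))[:k]


# Toy data: 4 images, three hypothetical extractors with 2, 1, and 1 features.
texture = [[0.0, 5.0], [0.25, 5.0], [0.5, 5.0], [1.0, 5.0]]
stats = [[1.0], [1.0], [2.0], [2.0]]
color = [[0.25], [0.25], [0.25], [0.75]]
fused = serial_fuse(texture, stats, color)
selected = top_k_by_entropy(fused, k=2)
```

Serial fusion grows the feature vector to the sum of the individual lengths, which is why a selection step is needed afterwards to keep classification time manageable.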

Table 14 Classification results on the proposed feature selection approach for grapes rot leaves

Figure 16 reports the F1 score of the proposed feature selection approach for all selected diseases: apple scab, apple rust, grape rot, and grape leaf blight. In terms of sensitivity, precision, F1 score, and accuracy, the proposed feature selection method performs better than the individual feature sets. Finally, the comparison with recent techniques in Table 7 shows that the proposed method performs significantly better than existing methods.
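The F1 score is the harmonic mean of precision and sensitivity. The short sketch below applies the standard formula to the apple scab precision (96.10%) and sensitivity (96.15%) quoted in Sect. 4.1; the result is plain arithmetic on those two numbers, not a value read from Fig. 16, and `f1_score` is a hypothetical helper name.

```python
def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity (recall)."""
    if precision + sensitivity == 0:
        return 0.0
    return 2 * precision * sensitivity / (precision + sensitivity)


# Precision and sensitivity quoted for apple scab in Sect. 4.1, in percent.
apple_scab_f1 = f1_score(96.10, 96.15)
```

Because the two inputs are nearly equal, the harmonic mean lands almost exactly between them, at roughly 96.1%.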

5 Conclusion

Detection and classification of fruit diseases is an important research area in computer vision and pattern recognition. Due to the complexity and irregularity of diseases in apple and grape leaves and fruits, several existing methods are unable to achieve the required classification accuracy. Therefore, in this article, a new technique is implemented for apple and grape disease detection and classification, based on the fusion of a novel adaptive thresholding and QD-based segmentation. A set of different features is then extracted and combined by serial-based fusion. A novel entropy-rank correlation technique is implemented for robust feature selection, which performs well compared with individual feature sets and existing related methods in terms of accuracy, sensitivity, precision, and FPR. The proposed method works efficiently not only on web images but also on publicly available datasets, which pose challenges such as noise and background complexity. From this research, we conclude that combining different feature sets increases the overall accuracy but also increases the computational time and complexity, so a feature selection step is effectively mandatory. The segmentation step also contributes by enabling the extraction of better features, which leads to better classification. As future work, deep features will be used instead of conventional ones and the number of diseases will be increased; the selection step remains necessary even with deep features.

Availability of data and materials

Not applicable

Declarations

Abbreviations

HOG: Histogram of oriented gradients
QD: Quartile deviation
CLPSO: Comprehensive learning particle swarm optimization
SVM: Support vector machine
ISADH: Improved sum and difference of histogram
RGB: Red, green, blue
GA: Genetic algorithm
CFS: Correlation-based feature selection
GCH: Global color histogram
CCV: Color coherence vector
LBP: Local binary patterns
CLBP: Complete local binary patterns
SGDM: Spatial gray-level dependence matrix
GLCM: Gray-level co-occurrence matrix
SLBP: Segmented local binary patterns
PSO: Particle swarm optimization
ANN: Artificial neural network
SLBPF: SLBP features
DT: Decision tree
QDA: Quadratic discriminant analysis
Q-SVM: Quadratic SVM
C-SVM: Cubic SVM
F-KNN: Fine K-nearest neighbor
W-KNN: Weighted KNN
EBT: Ensemble boosted tree
E-SDA: Ensemble subspace discriminant analysis
FPR: False positive rate
FNR: False negative rate

References

X. F. Wang, Z. Wang, S. W. Zhang, Y. Shi, in International Conference on Information Technology and Management Innovation (ICITMI 2015). Monitoring and discrimination of plant disease and insect pests based on agricultural IOT (Atlantis Press, 2015), p. 112115.
B. J. Samajpati, S. D. Degadwala, in 2016 International Conference on Communication and Signal Processing (ICCSP). Hybrid approach for apple fruit diseases detection and classification using random forest classifier (IEEE, 2016), pp. 1015–1019.
M. K. Tripathi, D. D. Maktedar, in 2016 International Conference on Computing Communication Control and automation (ICCUBEA). Recent machine learning based approaches for disease detection and classification of agricultural products (IEEE, 2016), pp. 1–6.
A. Camargo, J. S. Smith, An image-processing based algorithm to automatically identify plant disease visual symptoms. Biosyst. Eng. 102(1), 9–21 (2009).
S. R. Dubey, A. S. Jalal, Apple disease classification using color, texture and shape features from images. Signal Image Video Process. 10(5), 819–826 (2016).
S. Zhang, X. Wu, Z. You, L. Zhang, Leaf image based cucumber disease recognition using sparse representation classification. Comput. Electron. Agric. 134, 135–141 (2017).
M. Sharif, M. Attique Khan, M. Faisal, M. Yasmin, S. L. Fernandes, A framework for offline signature verification system: best features selection approach. Pattern Recogn. Lett. (2018).
M. A. Khan, T. Akram, M. Sharif, M. Y. Javed, N. Muhammad, M. Yasmin, An implementation of optimized framework for action classification using multilayers neural network on selected fused features. Pattern. Anal. Applic., 1–21 (2018).
M. Nasir, M. A. Khan, M. Sharif, I. U. Lali, T. Saba, T. Iqbal, An improved strategy for skin lesion detection and classification using uniform segmentation and feature selection based approach. Microsc. Res. Tech. (2018).
M. A. Khan, M. Sharif, M. Y. Javed, T. Akram, M. Yasmin, T. Saba, License number plate recognition system using entropy-based features selection approach with SVM. IET Image Process. 12(2), 200–209 (2017).
M. Sharif, M. A. Khan, T. Akram, M. Y. Javed, T. Saba, A. Rehman, A framework of human detection and action recognition based on uniform segmentation and combination of Euclidean distance and joint entropy-based features selection. EURASIP J. Image Video Process. 2017(1), 89 (2017).
S. Zhang, Z. Wang, Cucumber disease recognition based on Global-Local Singular value decomposition. Neurocomputing 205, 341–348 (2016).
M. Sharif, M. A. Khan, Z. Iqbal, M. F. Azam, M. I. U. Lali, M. Y. Javed, Detection and classification of citrus diseases in agriculture based on optimized weighted segmentation and feature selection. Comput. Electron. Agric. 150, 220–234 (2018).
S. Zhang, Y. Zhu, Z. You, X. Wu, Fusion of superpixel, expectation maximization and PHOG for recognizing cucumber diseases. Comput. Electron. Agric. 140, 338–347 (2017).
A. Akula, R. Ghosh, S. Kumar, H. K. Sardana, in Proceedings of International Conference on Computer Vision and Image Processing. Local binary pattern and its variants for target recognition in infrared imagery (Springer, Singapore, 2017), pp. 297–307.
U. Solanki, U. K. Jaliya, D. G. Thakore, A survey on detection of disease and fruit grading. Int. J. Innov. Emerg. Res. Eng. 2(2), 109–114 (2015).
G. Pass, R. Zabih, J. Miller, in Proceedings of the fourth ACM international conference on Multimedia. Comparing images using color coherence vectors (ACM, 1997), pp. 65–73.
S. R. Dubey, A. S. Jalal, Fruit disease recognition using improved sum and difference histogram from images. Int. J. Appl. Patt. Recog. 1(2), 199–220 (2014).
G. Amayeh, A. Erol, G. Bebis, M. Nicolescu, in ISVC. Accurate and efficient computation of high order zernike moments (Springer, 2005), pp. 462–469.
S. R. Dubey, A. S. Jalal, Apple disease classification using color, texture and shape features from images. Signal Image Video Process. 10(5), 819–826 (2016).
A. Kadir, L. E. Nugroho, A. Susanto, P. I. Santosa. Neural network application on foliage plant identification, (2013).
H. Al-Hiary, S. Bani-Ahmad, M. Reyalat, M. Braik, Z. ALRahamneh, Fast and accurate detection and classification of plant diseases. Mach Learn. 14(5) (2011).
S. R. Dubey, P. Dixit, N. Singh, J. P. Gupta, Infected fruit part detection using k-means clustering segmentation technique. Ijimai. 2(2), 65–72 (2013).
V. Ashok, D. S. Vinod, in 2014 International Conference on Contemporary Computing and Informatics (IC3I). Automatic quality evaluation of fruits using Probabilistic Neural Network approach (IEEE, 2014), pp. 308–311.
Y. Boykov, Graph cuts and efficient N-D image segmentation. Int. J. Comp. Vis. (IJCV). 70(2), 109–131 (2006).
B. Sowmya, B. Sheelarani, Colour image segmentation using soft computing techniques. Int. J. Soft Comput. Appl. 4, 69–80 (2009).
A. Borji, M. Hamidi, in Fuzzy Information Processing Society, 2007. NAFIPS’07. Annual Meeting of the North American. CLPSO-based fuzzy color image segmentation (IEEE, 2007), pp. 508–513.
S. R. Dubey, A. S. Jalal, Adapted approach for fruit disease identification using images. arXiv preprint arXiv:1405.4930 (2014).
Z. Chuanlei, Z. Shanwen, Y. Jucheng, S. Yancui, C. Jia, Apple leaf disease identification using genetic algorithm and correlation based feature selection method. Int. J. Agric. Biol. Eng. 10(2), 74–83 (2017).
P. Soni, R. Chahar, in IEEE International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES). A segmentation improved robust PNN model for disease identification in different leaf images (IEEE, 2016), pp. 1–5.
M. Shuaibu, W. S. Lee, Y. K. Hong, S. Kim, Detection of apple Marssonina blotch disease using particle swarm optimization. Trans. ASABE. 60(2), 303–312 (2017).
H. Waghmare, R. Kokare, Y. Dandawate, in 2016 3rd International Conference on Signal Processing and Integrated Networks (SPIN). Detection and classification of diseases of grape plant using opposite colour local binary pattern feature and machine learning for automated decision support system (IEEE, 2016), pp. 513–518.
P. B. Padol, S. D. Sawant, in 2016 International Conference on Global Trends in Signal Processing, Information Computing and Communication (ICGTSPICC). Fusion classification technique used to detect downy and powdery mildew grape leaf diseases (IEEE, 2016), pp. 298–301.
A. Awate, D. Deshmankar, G. Amrutkar, U. Bagul, S. Sonavane, in 2015 International Conference on Green Computing and Internet of Things (ICGCIoT). Fruit disease detection using color, texture analysis and ANN (IEEE, 2015), pp. 970–975.
S. R. Dubey, A. S. Jalal, in 2012 Third International Conference on Computer and Communication Technology (ICCCT). Detection and classification of apple fruit diseases using complete local binary patterns (IEEE, 2012), pp. 346–351.
P. K. Kharde, H. H. Kulkarni, An unique technique for grape leaf disease detection (2016).
P. B. Padol, A. A. Yadav, in Conference on Advances in Signal Processing (CASP). SVM classifier based grape leaf disease detection (IEEE, 2016), pp. 175–179.
H. Wang, G. Li, Z. Ma, X. Li, in 2012 5th International Congress on Image and Signal Processing (CISP). Image recognition of plant diseases based on backpropagation networks (IEEE, 2012), pp. 894–900.
Ş. Ozturk, B. Akdemir, Fuzzy logic-based segmentation of manufacturing defects on reflective surfaces. Neural Comput. Appl. 29(8), 107–116 (2018).
M. A. Khan, T. Akram, M. Sharif, A. Shahzad, K. Aurangzeb, M. Alhussein, S. I. Haider, A. Altamrah, An implementation of normal distribution based segmentation and entropy controlled features selection for skin lesion detection and classification. BMC Cancer. 18(1), 638 (2018).
A. Liaqat, M. A. Khan, J. H. Shah, M. Sharif, M. Yasmin, S. L. Fernandes, Automated ulcer and bleeding classification from WCE images using multiple features fusion and selection. J. Mech. Med. Biol., 850038 (2018).
M. Raza, M. Sharif, M. Yasmin, M. A. Khan, T. Saba, S. L. Fernandes, Appearance based pedestrians’ gender recognition by employing stacked auto encoders in deep learning. Futur. Gener. Comput. Syst. 88, 28–39 (2018).
T. Akram, M. A. Khan, M. Sharif, M. Yasmin, Skin lesion segmentation and recognition using multichannel saliency estimation and M-SVM on selected serially fused features. J. Ambient Intell. Humanized Comput., 1–20 (2018).
M. A. Khan, T. Akram, M. Sharif, M. Awais, K. Javed, H. Ali, T. Saba, CCDF: automatic system for segmentation and recognition of fruit crops diseases based on correlation coefficient and deep CNN features. Comput. Electron. Agric. 155, 220–236 (2018).
Z. Iqbal, M. A. Khan, M. Sharif, J. H. Shah, M. Habib ur Rehman, K. Javed, An automated detection and classification of citrus plant diseases using image processing techniques: a review. Comput. Electron. Agric. 153, 12–32 (2018).
J. K. Patil, R. Kumar, Color feature extraction of tomato leaf diseases. Int. J. Eng. Trends Technol. 2(2), 72–74 (2011).
R. M. Haralick, K. Shanmugam, Textural features for image classification. IEEE Trans. Syst. Man Cybern. 6, 610–621 (1973).
T. Ojala, M. Pietikainen, T. Maenpaa, Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern. Anal. Mach. Intell. 24(7), 971–987 (2002).
L. Zhang, R. Chu, S. Xiang, S. Liao, S. Z. Li, in International conference on biometrics. Face detection based on multi-block lbp representation (Springer, Berlin, 2007), pp. 11–18.
J. Yang, J.-Y. Yang, D. Zhang, J.-F. Lu, Feature fusion: parallel strategy vs. serial strategy. Pattern Recognit. 36(6), 1369–1381 (2003).
Y. Liu, Y. F. Zheng, in Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005. IJCNN’05, vol. 2. One-against-all multi-class SVM classification using reliability measures (IEEE, 2005), pp. 849–854.
D. Hughes, M. Salathé, An open access repository of images on plant health to enable the development of mobile disease diagnostics. arXiv preprint arXiv:1511.08060 (2015).
L. G. Nachtigall, R. M. Araujo, G. R. Nachtigall, in 2016 IEEE 28th International Conference on Tools with Artificial Intelligence (ICTAI). Classification of apple tree disorders using convolutional neural networks (IEEE, 2016), pp. 472–476.

Acknowledgements

This research work is partially sponsored by the Deanship of Scientific Research at the University of Hail, Kingdom of Saudi Arabia. The authors are grateful for this financial support. We would also like to thank the PlantVillage community for developing this large dataset, and the VC, HITEC University, Taxila, Pakistan.

Funding

Not applicable

Author information

Authors and Affiliations

Department of Computer Science and Engineering, HITEC University, Museum Road, Taxila, Pakistan
Muhammad Attique Khan
Department of ECE, COMSATS University Islamabad, Wah Cantt Campus, Islamabad, Pakistan
Tallha Akram & Nadia Nawaz
Department of Computer Science, COMSATS University Islamabad, Wah Cantt Campus, Islamabad, Pakistan
Muhammad Sharif
College of Computer Science and Engineering, University of Hail, Hail, 55476, Saudi Arabia
Majed Alhaisoni
College of Computer and Information Sciences, Prince Sultan University, Riyadh, 11586, Saudi Arabia
Tanzila Saba

Contributions

MAK and TA developed the idea and performed the simulations by developing different patches of code and integrating them fully. They are also responsible for the complete write-up, and they finalized and simulated the different accuracy criteria. MS provided technical support throughout the paper. MA gave the article its complete shape, identified several issues, and helped the primary authors to overcome those shortcomings; this author is also responsible for the funding of this manuscript. TS is responsible for the final proofreading and for technical support in the classification step, which falls within her research area. NN provided technical support in different sections, including feature extraction and fusion, and on the issues raised during the development of the feature selection method. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Muhammad Attique Khan.

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Khan, M.A., Akram, T., Sharif, M. et al. A probabilistic segmentation and entropy-rank correlation-based feature selection approach for the recognition of fruit diseases. J Image Video Proc. 2021, 14 (2021). https://doi.org/10.1186/s13640-021-00558-2

Received: 17 March 2018
Accepted: 22 April 2021
Published: 10 May 2021
DOI: https://doi.org/10.1186/s13640-021-00558-2
