At present, most image retrieval systems are based on low-level visual features (such as color and texture), and the retrieval process is computer-centric, so retrieval performance is often unsatisfactory. This is mainly due to the gap between low-level visual features and high-level semantic concepts. In addition, the criteria by which humans and computer retrieval systems judge the similarity between images differ considerably. To address these problems, CBIR methods based on relevance feedback (RF) have been proposed. RF embeds a user model in the retrieval system and establishes a correlation between the low-level features of an image and its high-level semantics through human-computer interaction, thereby reducing the semantic gap and achieving higher retrieval accuracy [14, 15]. However, RF's complex human-computer interaction interface increases the burden on users. In each feedback round, the user must indicate which images are relevant and score the degree of correlation between these images and the query image. In some cases, the user is confused and finds it difficult to give an appropriate judgment [16]. Therefore, some scholars have proposed the pseudo-relevance feedback (PRF) technique, also called local feedback or blind feedback. Its main idea is to improve retrieval performance by expanding the query under the reasonable assumption that some of the top-ranked documents are related to the query document, so that related documents can be selected automatically. The strategy for selecting positive examples and the query expansion method are the key issues in applying PRF [15,16,17]. Based on the above research, this paper proposes a self-feedback image retrieval method based on fuzzy clustering and PRF. The method relies mainly on the similarity measure between image content features; it applies a fuzzy clustering algorithm to automatically expand the query image features without user intervention and thus improve retrieval performance.
Related technologies
Content-based relevance feedback retrieval is a process of gradual refinement. The basic idea is that the user is allowed to evaluate and mark the retrieval results during the search process, indicating which results are related to the query image and which are irrelevant, i.e., "feedback positive examples" and "feedback negative examples." The system receives the user's feedback on the current retrieval results and automatically adjusts the query based on this feedback information. Finally, the optimized query is used to recalculate the retrieval results. In this way, through repeated interaction, the system gradually moves the retrieval results toward the user's expectation until the user's request is finally met.
At present, the vector model is often used in image retrieval; that is, each image is represented as a vector in the feature space, so the essence of image retrieval is to find the images whose feature vectors are closest to the query vector. From the viewpoint of the vector model, relevance feedback techniques can be divided into two categories: query-point movement algorithms and feature weight adjustment algorithms. These relevance feedback methods are based on the premise that the visual features of an image can completely describe its semantics. Under this assumption, the user's retrieval target can be described by a global distribution function; that is, similar images are clustered around a central point in the feature space. Therefore, the retrieval effect can be improved by moving the query point and modifying the distance measurement function.
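As a concrete illustration of the first category, the sketch below shows Rocchio-style query-point movement in NumPy; the weighting constants alpha, beta, and gamma and the function name are illustrative assumptions, not values or interfaces prescribed by the methods discussed here.

```python
import numpy as np

def move_query_point(query, positives, negatives, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio-style query-point movement in feature space (illustrative sketch).

    query     : (D,) original query feature vector
    positives : (P, D) features of images marked as relevant
    negatives : (M, D) features of images marked as irrelevant
    Returns the updated query vector, pulled toward the relevant images
    and pushed away from the irrelevant ones.
    """
    new_query = alpha * query
    if len(positives) > 0:
        new_query += beta * np.mean(positives, axis=0)
    if len(negatives) > 0:
        new_query -= gamma * np.mean(negatives, axis=0)
    return new_query
```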
In practical applications, however, there are significant differences between the low-level features of images in the same category because semantically similar images are diverse. For example, the elephant class in the Corel image library includes elephants with different postures (e.g., standing and lying) and in different environments (e.g., forest, grassland, and water edge), as shown in Fig. 2. In terms of low-level features, the assumption that semantically similar images cluster around a central point in the feature space is not always true. As shown in Fig. 3, where the point Q represents the query image, the solid dots represent the retrieval targets, the hollow dots denote irrelevant images, and L(Q) indicates the distribution of the user's retrieval targets, the distribution clearly does not have an ideal global center and is difficult to describe with a canonical geometric shape. In this case, if the query image is taken as the center in the feature space and the images inside a hypersphere of constant radius are returned as the retrieval result, some related images will obviously be missed, while some unrelated images will be incorrectly returned. However, as shown in Fig. 4, when L(Q) is divided into three sub-clusters, each sub-cluster is evenly distributed around its own center. This model reflects the distribution of the related images in the feature space well; it also overcomes the shortcoming of previous relevance feedback methods, which assume that semantically similar images cluster around a central point in the feature space, so it is helpful in finding more related images [15].
Based on the above ideas, a self-feedback strategy based on fuzzy clustering is proposed in this paper. First, fuzzy clustering is applied to the initial retrieval results to generate multiple substitute queries, which form a new query; to some extent, this compensates for the limited image information in a single query sample. Then, retrieval is performed with these substitute queries, and the results are merged with the initial retrieval results to form the final retrieval results.
Design of the relevance feedback algorithm
Images with the same semantics may differ considerably in their low-level features. Traditional image retrieval methods based on a single query image can usually retrieve only images that are very similar to the query image in their low-level features and inevitably miss other images that have similar semantics but large differences in low-level features. Therefore, if multiple different query images are used instead of a single query image, more relevant images can be retrieved, which narrows the semantic gap [14, 18].
The key to using PRF technology to solve image retrieval problems is how to select positive examples. The commonly used precision-recall curve of an image retrieval algorithm shows a nonlinear inverse relation between precision and recall: a low recall, for example 0.1–0.2, corresponds to a high precision, such as 0.8 or more. This means that in the initial retrieval results, the images ranked at the top are more likely to be related to the query image and are therefore potentially relevant. In CBIR, the image closest to the query image is also the most similar image. Therefore, the proposed method automatically selects a group of images that are close to the query image in the initial retrieval results as positive examples, clusters them according to their low-level features, and uses each cluster center as a substitute query in the feedback step to improve the retrieval performance.
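The selection rule itself is simple; the sketch below, assuming the initial search returns a distance to the query for every database image, picks the top-ranked images as pseudo-positive examples (the cutoff P is an illustrative parameter to be tuned experimentally).

```python
import numpy as np

def select_pseudo_positives(features, distances, P=20):
    """Pick the P database images closest to the query as pseudo-positive examples.

    features  : (N, D) low-level feature vectors of all database images
    distances : (N,) distances of each image to the query from the initial search
    Returns the feature vectors of the P top-ranked images.
    """
    top_idx = np.argsort(distances)[:P]   # smallest distance = highest similarity
    return features[top_idx]
```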
Improved non-parametric fuzzy clustering algorithm
Variable description
The simultaneous clustering and attribute discrimination (SCAD) algorithm is designed to simultaneously search for the optimal cluster centers C and the optimal feature weight set W. Each cluster has its own set of feature weights, Wi = [wi1, wi2, …, wiD] (a code sketch of one full update cycle is given after formula (18)). Its objective function is defined as follows
$$ J\left(C,U,W;x\right)=\sum \limits_{i=1}^C\sum \limits_{j=1}^N{u}_{ij}^m\sum \limits_{k=1}^D{w}_{ik}{\left({x}_{jk}-{c}_{ik}\right)}^2+\sum \limits_{i=1}^C{\delta}_i\sum \limits_{k=1}^D{w}_{ik}^2 $$
(14)
where
$$ \left\{\begin{array}{l}{u}_{ij}\in \left[0,1\right],\forall i,j\\ {}0<{\sum}_{j=1}^N{u}_{ij}<N,\forall i\\ {}{\sum}_{i=1}^C{u}_{ij}=1,\forall j\end{array}\right., $$
and
$$ {w}_{ik}\in \left[0,1\right],\forall i,k;\sum \limits_{k=1}^D{w}_{ik}=1,\forall i. $$
For wik, the definition is as follows
$$ {w}_{ik}=\frac{1}{D}+\frac{1}{2{\delta}_i}\sum \limits_{j=1}^N{\left({u}_{ij}\right)}^m\left[\frac{{\left\Vert {x}_j-{c}_i\right\Vert}^2}{D}-{\left({x}_{jk}-{c}_{ik}\right)}^2\right] $$
(15)
The definition of δi is
$$ {\delta}_i=K\frac{\sum_{j=1}^N{\left({u}_{ij}\right)}^m{\sum}_{k=1}^D{w}_{ik}{\left({x}_{jk}-{c}_{ik}\right)}^2}{\sum_{k=1}^D{\left({w}_{ik}\right)}^2} $$
(16)
The modified membership formula is
$$ {u}_{ij}=\frac{1}{\sum_{k=1}^C{\left(\frac{d_{ij}^2}{d_{kj}^2}\right)}^{\frac{1}{m-1}}} $$
(17)
The cluster centers can be expressed as
$$ {C}_{ik}=\left\{\begin{array}{cc}0& \mathrm{if}\;{w}_{ik}=0\\ {}\frac{\sum_{j=1}^N{\left({u}_{ij}\right)}^m{x}_{jk}}{\sum_{j=1}^N{\left({u}_{ij}\right)}^m}& \mathrm{if}\;{w}_{ik}>0\end{array}\right. $$
(18)
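To make the update cycle concrete, the following NumPy sketch performs one SCAD-style iteration following formulas (15)–(18), with the distances in formula (17) taken as the feature-weighted distances. The vectorised layout, the small eps guard against division by zero, and the clipping and renormalisation of negative weights are implementation assumptions, not part of the published algorithm.

```python
import numpy as np

def scad_step(X, C, W, U, m=2.0, K=1.0, eps=1e-10):
    """One iteration of the SCAD-style updates in Eqs. (15)-(18).

    X : (N, D) data points        C : (c, D) cluster centers
    W : (c, D) feature weights    U : (c, N) fuzzy memberships
    m : fuzzifier,  K : constant of Eq. (16).
    Returns the updated (U, C, W).
    """
    D = X.shape[1]
    diff2 = (X[None, :, :] - C[:, None, :]) ** 2            # (c, N, D) squared coordinate differences
    dw2 = np.einsum('id,ind->in', W, diff2) + eps            # feature-weighted distances d_ij^2, shape (c, N)

    # Eq. (17): membership update
    ratio = (dw2[:, None, :] / dw2[None, :, :]) ** (1.0 / (m - 1.0))
    U = 1.0 / ratio.sum(axis=1)

    # Eq. (18): center update (fuzzy weighted means; coordinates with zero weight are set to 0)
    um = U ** m
    C = (um @ X) / (um.sum(axis=1, keepdims=True) + eps)
    C = np.where(W > 0, C, 0.0)
    diff2 = (X[None, :, :] - C[:, None, :]) ** 2             # distances to the updated centers

    # Eq. (16): per-cluster regularization term delta_i
    delta = K * (um[:, :, None] * W[:, None, :] * diff2).sum(axis=(1, 2)) / ((W ** 2).sum(axis=1) + eps)

    # Eq. (15): feature-weight update, using the plain squared norm ||x_j - c_i||^2 as written
    eu2 = diff2.sum(axis=2)                                  # (c, N)
    W = 1.0 / D + (um[:, :, None] * (eu2[:, :, None] / D - diff2)).sum(axis=1) / (2.0 * delta[:, None] + eps)
    W = np.clip(W, 0.0, None)                                # keep weights non-negative (assumption)
    W = W / (W.sum(axis=1, keepdims=True) + eps)             # renormalize each row to sum to 1

    return U, C, W
```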
Initialization of cluster centers
The SCAD algorithm first needs to initialize the cluster centers and is sensitive to this initialization. If the initialization is not appropriate, the algorithm may converge to a local extremum and fail to find the optimal fuzzy partition of the dataset. In addition, the algorithm is time-consuming when the dataset is large. In this paper, the subtractive clustering algorithm is used to initialize the cluster centers [19, 20].
Let X = {x1, x2, …, xn} ⊂ Rd be the sample point set, where n is the number of samples. The process of initializing the cluster centers by subtractive clustering is as follows (a code sketch is given after formula (21)):
1. For each xi in X, its density index is calculated according to formula (19), and the data point xc1 with the highest density index is selected as the first cluster center:
$$ {D}_i=\sum \limits_{j=1}^n\exp \left[\frac{-{\left\Vert {x}_i-{x}_j\right\Vert}^2}{{\left(0.5{r}_a\right)}^2}\right] $$
(19)
2. Assuming that xck is the k-th selected cluster center and Dck is the corresponding density index, the density index of each data point is modified according to formula (20), and the data point xc(k+1) with the highest density index is selected as the new cluster center:
$$ {D}_i={D}_i-{D}_{ck}\exp \left[\frac{-{\left\Vert {x}_i-{x}_{ck}\right\Vert}^2}{{\left(0.5{r}_b\right)}^2}\right] $$
(20)
3. Calculate Dc(k+1)/Dc1; if the result is less than δ, the algorithm ends; otherwise, go to step 2.
The parameters ra, rb, and δ need to be predetermined. The parameter δ (0.5 ≤ δ < 1) determines the number of initial cluster centers that are ultimately generated: the smaller δ is, the more clusters are generated, and the larger δ is, the fewer clusters are generated. In this paper, δ = 0.5 is used, and ra and rb are set as shown in formula (21).
$$ {r}_a={r}_b=\frac{1}{2}\underset{k}{\min}\left\{\underset{i}{\max}\left\{\left\Vert {x}_i-{x}_k\right\Vert \right\}\right\} $$
(21)
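The initialization procedure above can be sketched as follows in NumPy; the decision to discard the candidate center whose density ratio falls below δ is an assumption, since the text leaves that detail open.

```python
import numpy as np

def subtractive_init(X, delta=0.5):
    """Initialize cluster centers by subtractive clustering, Eqs. (19)-(21).

    X     : (n, D) sample points
    delta : stopping threshold on Dc(k+1)/Dc1, with 0.5 <= delta < 1
    Returns an array of the selected initial centers.
    """
    dist2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)   # pairwise squared distances

    # Eq. (21): neighborhood radii r_a = r_b
    r = 0.5 * np.sqrt(dist2.max(axis=0)).min()

    # Eq. (19): initial density index of every point
    D = np.exp(-dist2 / (0.5 * r) ** 2).sum(axis=1)

    centers, densities = [], []
    while True:
        idx = int(np.argmax(D))
        centers.append(X[idx].copy())
        densities.append(D[idx])
        # stop when the new candidate's density is too small relative to the first center
        if densities[-1] / densities[0] < delta:
            centers.pop()          # drop the candidate that fell below the threshold (assumption)
            break
        # Eq. (20): suppress the density around the newly selected center
        D = D - densities[-1] * np.exp(-dist2[idx] / (0.5 * r) ** 2)
    return np.array(centers)
```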
Algorithm description
The improved non-parametric fuzzy clustering algorithm is described as follows (a code sketch of the selection loop is given after step 3):
1. The initial cluster center C is obtained using the algorithm described in Section 3.3.2; let Cmax = k;
2. For i = Cmax to Cmin:
2.1 Select the first i clustering centers in C as the new initial centers C(0);
2.2 Update U, C, and W using formulas (17), (18), and (15);
2.3 Check for convergence; if not converged, go to 2.2; otherwise, perform 2.4;
2.4 Calculate the validity index value Vd(c) using the following validity index function:
$$ {V}_d(c)=\frac{\sum \limits_{i=1}^c\frac{1}{n_i}\sum \limits_{k=1}^n{u}_{ik}^m\cdot {dist}_{D_i}{\left({x}_k,{c}_i\right)}^2}{\underset{i,j=1,\dots ,c;\ i\ne j}{\min }\;{dist}_{D_i}{\left({c}_i,{c}_j\right)}^2} $$
(22)
3. Compare the validity index values; the number of clusters ci0 whose index value Vd(ci0) is extremal (maximum or minimum, depending on the index) is taken as the optimal number of clusters.
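The outer selection loop described in steps 1–3 can be sketched as below, reusing scad_step from the sketch after formula (18). The fuzzy cardinality used for n_i, the feature-weighted form of dist_Di, the fixed iteration budget in place of a convergence test, and the choice of the minimum Vd as optimal are all assumptions where the text leaves the details open.

```python
import numpy as np
# assumes scad_step from the sketch after Eq. (18) is in scope

def validity_index(X, C, W, U, m=2.0, eps=1e-10):
    """Vd(c) from Eq. (22): fuzzy compactness over minimum center separation."""
    c = C.shape[0]
    n_i = U.sum(axis=1) + eps                           # fuzzy cardinality of each cluster (assumption)
    compact = sum(
        (U[i] ** m * ((W[i] * (X - C[i]) ** 2).sum(axis=1))).sum() / n_i[i]
        for i in range(c)
    )
    sep = min(
        float((W[i] * (C[i] - C[j]) ** 2).sum())        # feature-weighted center separation (assumption)
        for i in range(c) for j in range(c) if i != j
    )
    return compact / (sep + eps)

def choose_cluster_number(X, init_centers, c_min=2, m=2.0, n_iter=50):
    """Try c = Cmax .. Cmin with SCAD-style updates and keep the best Vd."""
    best = None
    c_max = init_centers.shape[0]
    D = X.shape[1]
    for c in range(c_max, c_min - 1, -1):
        C = init_centers[:c].copy()                     # first c subtractive-clustering centers
        W = np.full((c, D), 1.0 / D)                    # uniform feature weights to start (assumption)
        U = np.full((c, X.shape[0]), 1.0 / c)           # uniform memberships to start (assumption)
        for _ in range(n_iter):                         # fixed iteration budget instead of a convergence test
            U, C, W = scad_step(X, C, W, U, m=m)
        v = validity_index(X, C, W, U, m=m)
        if best is None or v < best[0]:                 # smaller Vd assumed better (compactness/separation)
            best = (v, c, C, U, W)
    return best[1], best[2], best[3], best[4]           # optimal c*, centers, memberships, weights
```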
Self-feedback algorithm based on fuzzy clustering
The process of self-feedback based on fuzzy clustering is as follows. First, an initial search is performed. Then, the P images closest to the query image are selected as positive examples from the initial retrieval results and clustered according to their low-level features, and multiple cluster centers are selected as substitute queries for feedback retrieval. Finally, these retrieval results are merged with the initial retrieval results to form the final retrieval results. This method extracts the target features from multiple related images instead of relying solely on a single query image. The algorithm is designed as follows (a code sketch of the complete pipeline is given after step 7):
1. The user submits query image Q;
2. An initial search is performed based on the annular moment hashing method described in Section 2.3, and the retrieval results are output in descending order of similarity;
3. If the user is satisfied with the search result, the query is ended; otherwise, step 4 is followed;
4. The number of images participating in clustering is determined, and the P images with the highest similarity are selected from the above initial retrieval results to form L(Q):
$$ L(Q)=\left\{{I}_i|i=1,2\dots P\right\} $$
(23)
5. The P images are clustered using the improved non-parametric fuzzy clustering algorithm described in Section 3.4. The optimal cluster number k* is the number of classes of related images, and the corresponding cluster centers C = {C1, C2, …, Ck*} are the obtained substitute query vectors.
6. The similarity L(C, Dj) of each image Ij in the image library to C is calculated, where Dj is the feature vector of Ij:
$$ L\left(C,{D}_j\right)=\mathit{\operatorname{MIN}}\left\{L\left({C}_i,{D}_j\right)|i=1,2\dots {k}^{\ast}\right\} $$
(24)
7. The feedback search results L(C, Dj) are combined with the initial search results to form the final search results, which are output; then, step 3 is followed.
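Putting the pieces together, the following sketch wires steps 1–7 into one function, reusing subtractive_init and choose_cluster_number (and, through the latter, scad_step) from the earlier sketches. The plain Euclidean distance stands in for the annular-moment hashing similarity of Section 2.3, and the minimum-distance merge in step 7 is an assumption, since the text does not spell out the merging rule.

```python
import numpy as np
# assumes subtractive_init and choose_cluster_number from the earlier sketches are in scope

def self_feedback_retrieval(query_feat, db_feats, P=20, top_k=50, c_min=2):
    """Fuzzy-clustering self-feedback retrieval sketch (steps 1-7 above).

    query_feat : (D,) feature vector of the query image Q
    db_feats   : (N, D) feature vectors of the image database
    Returns database indices ranked by the merged score (smaller distance = more similar).
    """
    # Steps 1-2: initial search (Euclidean distance stands in for the Section 2.3 similarity)
    init_dist = np.linalg.norm(db_feats - query_feat, axis=1)

    # Step 4: take the P highest-ranked images as pseudo-positive examples, Eq. (23)
    positives = db_feats[np.argsort(init_dist)[:P]]

    # Step 5: cluster the positives; the k* cluster centers become substitute queries
    init_centers = subtractive_init(positives)
    if init_centers.shape[0] >= c_min:
        _, centers, _, _ = choose_cluster_number(positives, init_centers, c_min=c_min)
    else:
        centers = positives.mean(axis=0, keepdims=True)   # fallback when too few centers are found

    # Step 6: Eq. (24), distance of each database image to its nearest substitute query
    fb_dist = np.min(
        np.linalg.norm(db_feats[:, None, :] - centers[None, :, :], axis=2), axis=1
    )

    # Step 7: merge feedback and initial results (simple minimum of the two distances, an assumption)
    merged = np.minimum(init_dist, fb_dist)
    return np.argsort(merged)[:top_k]
```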