Image analysis using modified exponent-Fourier moments

Abstract

Classic exponent-Fourier moments (EFMs) have been widely used for image reconstruction and invariant classification. However, EFMs are not natively invariant to translation and scaling; in addition, they suffer from two drawbacks, namely numerical instability and reconstruction error, which degrade their reconstruction capability and image classification accuracy. This study addresses the challenge of defining modified EFMs (MEFMs), which are based on modified exponent polynomials. In our method, the basis functions of traditional EFMs are appropriately modified, and the modified basis functions replace the original ones. The basis function of the proposed moments is composed of piecewise modified exponent polynomials modulated by an exponential envelope with a variable parameter. Various types of optimal-order moments can be established by slightly adjusting the bandwidth of the modified basis functions. Finally, we extend the rotation-invariant feature of previous works and propose a new method of scaling- and rotation-invariant image recognition using the proposed moments in a log-polar coordinate domain. Translation invariance is then achieved by an image projection operation, which replaces the traditional approach based on the calculation of image geometric moments. The experimental results demonstrate that the MEFMs perform better than traditional EFMs and other classic orthogonal moments, including the latest image moments, in terms of image reconstruction capability and invariant recognition accuracy under noise-free, noisy, and smoothing-filter conditions.

Introduction

Moments and moment invariants are global descriptors for image feature extraction and have become an active topic in image analysis. In recent years, various moments have been widely used in image reconstruction [1, 2], image detection [3, 4], target classification [5], digital watermarking [6, 7], image compression [8], and other applications [9, 10]. The study of moments mainly follows three directions. The first is establishing image moments in different coordinate spaces, such as the Cartesian coordinate space [11, 12], polar coordinate space [13, 14], and Radon transformation space [15, 16], among others. Moments constructed in the Cartesian coordinate space perform better than those in the polar coordinate space and Radon transformation space, and their computational complexity is lower; however, rotation invariance is difficult to achieve. Image moments are natively rotation-invariant in the polar coordinate and Radon transformation spaces, where geometric invariance can easily be achieved. Therefore, most existing image moments are established in polar coordinates. Figure 1 shows the various types of image moments in different coordinate systems. The second direction is studying the descriptive ability of image moments under different basis functions, in order to find basis functions that yield moments with better image reconstruction and numerical stability. The third direction is the construction of invariants: traditional image moments generally do not have inherent geometric invariance; thus, they need to be restructured and designed to satisfy geometric invariance in pattern recognition, and the construction of such invariants (rotation invariants in particular) is becoming a hot topic in the study of image moments.

Fig. 1 Image moments in different coordinate spaces

As mentioned earlier, an image moment is essentially an image transformation based on a set of basis functions. The strengths and weaknesses of the basis functions directly affect the performance of the constructed image moments. Depending on whether the basis set satisfies the orthogonality condition, image moments can be divided into orthogonal and non-orthogonal moments (also known as orthogonal and non-orthogonal transformations, respectively; for example, the discrete cosine transform [17], Fourier transform [18], Haar-wavelet transform [19], and Walsh transform [20] are orthogonal transformations). Non-orthogonal moments such as geometric moments [21], complex moments [22], and rotation moments [23] have achieved some success in moment applications. The basis functions of non-orthogonal moments are relatively simple, but image reconstruction from them is difficult to realize. In addition, non-orthogonal moments generally carry redundant information and are sensitive to noise. Orthogonal moments can overcome these disadvantages and have therefore become a main focus in the field of image moments in recent years.

Orthogonal moments can be defined in different coordinate spaces. The basis functions of orthogonal moments defined in polar coordinates are composed of radial polynomials and Fourier complex exponential factors in the angular variable (also regarded as amplitude and phase coefficients); thus, they are called radial orthogonal moments. The radial orthogonal moments in Fig. 1 mainly include Zernike moments (ZMs) [13], pseudo-Zernike moments (PZMs) [14], orthogonal Fourier–Mellin moments (OFMMs) [6], Jacobi–Fourier moments (JFMs) [24], Tchebichef–Fourier moments (TFMs) [25], radial harmonic-Fourier moments (RHFMs) [26], Bessel–Fourier moments (BFMs) [18, 27], exponent-Fourier moments (EFMs) [7], and radial shifted Legendre moments (RSLMs) [28]. These radial orthogonal moments normally have the basic ability of image reconstruction. Moreover, their significant characteristic is that the radial polynomials satisfy the orthogonality condition in the unit circle and the moments natively possess a rotation-invariant feature. Thus, radial orthogonal moments have become the preferred descriptor for geometric invariant image recognition, especially for rotation-invariant recognition. Orthogonal moments whose basis functions are regular polynomials defined in Cartesian coordinates can be further divided into continuous and discrete orthogonal moments: Legendre moments (LMs) [29] and Gaussian–Hermite moments (GHMs) [2] are continuous orthogonal moments, whereas Tchebichef moments (TMs) [30], Krawtchouk moments (KMs) [31], Hahn moments (HMs) [32], and Racah moments (RMs) [33] are discrete orthogonal moments. Discrete orthogonal moments do not involve any numerical approximation; hence, their basis functions satisfy the orthogonality condition exactly. Consequently, their image reconstruction performance is better than that of traditional continuous orthogonal moments. In addition, other moments can be constructed in other spaces, such as Radon transform invariant moments in the Radon transform space and histogram invariant moments in the histogram space.

Shortcomings still exist in the abovementioned traditional orthogonal moments. On the one hand, the order of the existing orthogonal moments can only take integer values, and this constraint has become a bottleneck in the development of orthogonal moments. To solve this problem, Xiao et al. [34] and Yang et al. [35] proposed fractional orthogonal moments, whose models extend the integer order to a real order (also known as fractional order). Their experimental results showed that fractional-order orthogonal moments were better than traditional integer-order orthogonal moments in image reconstruction, noise robustness, and image recognition. Chen et al. [36, 37] recently extended the ZMs and PZMs to a quaternion and a fractional framework for color image feature extraction, further broadening the application of image moments. On the other hand, for image sets whose members differ markedly, lower-order moments constructed from the basis functions of traditional orthogonal moments classify well. For image sets whose members are hard to discriminate, however, higher-order moments are needed, and numerical instability then occurs. The reason is that the basis functions of traditional orthogonal moments have the same fixed form at lower and higher orders, which can lead to poor classification results in pattern recognition. Wang et al. [38] proposed a circularly semi-orthogonal moment that maintains good numerical stability at higher orders and obtains a better visual effect in image reconstruction. However, this method only applies a simple, fixed modulation to the orthogonal basis functions, so the basis functions of different-order moments are still fixed; hence, the method lacks generality.

Classic orthogonal moments (e.g., EFMs) suffer from numerical instability and poor recognition accuracy in some image classification tasks, especially texture image recognition. A modified exponent-Fourier moment (MEFM) is proposed herein based on the concepts proposed in [34, 38]. Our contributions cover three aspects. First, we study the performance of semi-orthogonal basis functions, which lie between orthogonal and non-orthogonal moments, for image reconstruction and pattern recognition, and establish a general semi-orthogonal moment model suitable for different orders. Second, a new method for the theoretical analysis of image moments in the frequency domain is proposed, namely time–frequency correspondence analysis. Finally, a simple and useful algorithm for rotation, scaling, and translation (RST) invariant image recognition using the proposed moments is introduced.

The remainder of this paper is organized as follows: Section 2 provides some preliminaries about the classic exponent-Fourier moments for 2D images; Section 3 introduces the MEFMs in polar coordinates and discusses some of their properties; Section 4 describes the experiments on the computational complexity of the image moments, image reconstruction, optimal parameter selection, and RST invariant image recognition under noise-free, noisy, and smoothing-filter conditions; and Section 5 presents the conclusions.

Preliminaries

This section briefly reviews the definition of the classic orthogonal exponent-Fourier moments (EFMs) [39] for an image along with some EFM properties.

Exponent-Fourier moments

The EFMs of order n with repetition m for a 2D image function f(r, θ) in the polar coordinates are defined as

$$ {E}_{nm}=\frac{1}{2\pi }{\int}_0^{2\pi }{\int}_0^1f\left(r,\theta \right){R}_n^{\ast }(r){e}^{-\tilde{j} m\theta} rdrd\theta $$
(1)

where f(r, θ) denotes the 2D image function in the polar coordinates; \( \tilde{j}=\sqrt{-1} \); \( n=0,1,2,\dots \) and \( m=0,\pm 1,\pm 2,\dots \) represent the moment orders; and \( {R}_n^{\ast }(r) \) is the conjugate of the radial function Rn(r), defined as

$$ {R}_n^{\ast }(r)=\frac{1}{\sqrt{r}}{e}^{-\tilde{j}2 n\pi r} $$
(2)

Based on orthogonal function theory, a 2D image function can be approximately reconstructed over the unit circle from a truncated series of the orthogonal functions \( {R}_n(r){e}^{\tilde{j} m\theta} \) weighted by the moments Enm:

$$ \overline{f}\left(r,\theta \right)\approx \sum \limits_{n=1}^{n_{\mathrm{max}}}\sum \limits_{m=1}^{m_{\mathrm{max}}}{E}_{nm}{R}_n(r){e}^{\tilde{j} m\theta} $$
(3)
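The orthogonality that Eq. (3) relies on can be checked numerically. The following is a minimal sketch, assuming a midpoint-rule quadrature on [0, 1]; the function names `efm_radial` and `radial_inner_product` and the sample count are illustrative choices, not part of the original work. It estimates the radial inner product \( {\int}_0^1{R}_n(r){R}_m^{\ast }(r) rdr \), which should be close to 1 for n = m and close to 0 otherwise.

```python
import cmath
import math


def efm_radial(n, r):
    """R_n(r) = r**-0.5 * exp(+j*2*pi*n*r), the conjugate of Eq. (2)."""
    return cmath.exp(2j * math.pi * n * r) / math.sqrt(r)


def radial_inner_product(n, m, samples=2000):
    """Midpoint-rule estimate of the integral of R_n * conj(R_m) * r on [0, 1]."""
    acc = 0j
    for k in range(samples):
        r = (k + 0.5) / samples  # midpoints avoid the singular point r = 0
        acc += efm_radial(n, r) * efm_radial(m, r).conjugate() * r
    return acc / samples


print(abs(radial_inner_product(3, 3)))  # close to 1 (same order)
print(abs(radial_inner_product(3, 5)))  # close to 0 (different orders)
```

The weight r in the integrand cancels the two factors of \( 1/\sqrt{r} \), which is why the radial functions reduce to an ordinary Fourier basis on [0, 1].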

Properties of EFMs and other radial orthogonal moments

For the existing radial orthogonal moments, the number of zeros of the orthogonal polynomials plays a significant role in describing the high-spatial-frequency components of an image. The real and imaginary parts of the radial polynomial of EFMs have 2n and 2n + 1 zeros in the interval 0 ≤ r ≤ 1, respectively [39]. Meanwhile, the Bessel polynomials and the orthogonal Fourier–Mellin polynomials have n + 2 and n zeros in the interval 0 ≤ r ≤ 1, respectively [6, 40]. Zernike polynomials have only (n − m)/2 zeros in the interval 0 ≤ r ≤ 1. Therefore, the degree n of EFMs required to represent an image is much lower than that of BFMs, OFMMs, and ZMs, which gives the EFMs a stronger capability for describing an image than these other orthogonal moments in polar coordinates. Additionally, classic EFMs and other radial orthogonal moments are rotation-invariant, which is useful for geometric invariant recognition. The abovementioned properties show that the exponent-Fourier moments are potentially useful as feature descriptors for image analysis.
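As a short worked check of the zero counts quoted for the EFMs, the zeros can be read off directly from Eq. (2) for n ≥ 1. Counting the endpoints r = 0 and r = 1 for the imaginary part is our reading of how the count 2n + 1 arises; at r = 0 the numerator vanishes and the quotient tends to 0 in the limit.

$$ \operatorname{Re}{R}_n^{\ast }(r)=\frac{\cos \left(2 n\pi r\right)}{\sqrt{r}}=0\iff r=\frac{2k+1}{4n},\kern1em k=0,1,\dots, 2n-1\kern1em \left(2n\ \mathrm{zeros}\right) $$

$$ \operatorname{Im}{R}_n^{\ast }(r)=-\frac{\sin \left(2 n\pi r\right)}{\sqrt{r}}=0\iff r=\frac{k}{2n},\kern1em k=0,1,\dots, 2n\kern1em \left(2n+1\ \mathrm{zeros}\right) $$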

Methods

Analysis of the numerical instability involved in classic EFMs

Hu et al. [39] first proposed classic EFMs based on the radial function Rn(r) shown in Eq. (2), which satisfies the orthogonality condition over the interval 0 ≤ r ≤ 1. However, the radial function Rn(r) is numerically unstable, which can cause poor image reconstruction and imprecise image classification in practical applications. This instability has two causes. First, when r is equal to 0, the real component of Rn(r) tends to infinity and the imaginary part is not a number (NaN), neither of which is a legal value in an actual computation. Second, as shown in Fig. 2, the real component of Rn(r) becomes very large as r tends to 0, which makes the computation of the image moments numerically unstable and the computed moment values inaccurate. The first problem can be avoided by setting r = Δr whenever r is equal to 0, where Δr is a small value close to 0 (e.g., Δr = 0.005). However, choosing a value of Δr that suits both lower- and higher-order moments is difficult. Furthermore, the second problem remains in the computation of the EFMs regardless.
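The two failure modes described above can be illustrated with a short sketch. This is a minimal Python example under our own assumptions (the function name `efm_radial_conj` is hypothetical). Plain Python raises an exception on division by zero, whereas array environments with IEEE 754 semantics, such as MATLAB, return Inf and NaN instead.

```python
import cmath
import math


def efm_radial_conj(n, r):
    """R_n^*(r) = r**-0.5 * exp(-j*2*pi*n*r), Eq. (2); undefined at r = 0."""
    return cmath.exp(-2j * math.pi * n * r) / math.sqrt(r)


# Second problem: the magnitude 1/sqrt(r) grows without bound as r -> 0.
for r in (0.5, 0.05, 0.005, 0.0005):
    print(r, abs(efm_radial_conj(3, r)))

# First problem: the function cannot be evaluated at r = 0 at all.
try:
    efm_radial_conj(3, 0.0)
except ZeroDivisionError:
    print("R_n^*(0) is undefined")
```

With the substitute value Δr = 0.005 mentioned in the text, the magnitude is already about 14, so pixels near the origin are weighted far more heavily than the rest of the image.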

Fig. 2 Real component of radial function Rn(r) of EFMs with n = 0, 1, 2, 3, 4

Definition of MEFMs

We improve the EFMs and define their modified version, MEFMs, as follows to avoid the numerical instability of the EFMs:

$$ {M}_{nm}=\frac{1}{2\pi }{\int}_0^{2\pi }{\int}_0^1f\left(r,\theta \right){T}_n\left(\alpha, r\right){e}^{-\tilde{j} m\theta} rdrd\theta $$
(4)

where f(r, θ) is an image function in the polar coordinates; n = 0, 1, 2, … and m = 0, ±1, ±2, … are the moment orders; and Tn(α, r) denotes the radial basis functions of the image moments defined as follows:

$$ {T}_n\left(\alpha, r\right)=\left\{\begin{array}{c}{16}^{-\frac{\alpha_1}{4}r}{e}^{-\tilde{j}2 n\pi r},n\in {N}_{\mathrm{low}}\\ {}{16}^{-\frac{\alpha_2}{4}r}{e}^{-\tilde{j}2 n\pi r},n\in {N}_{\mathrm{high}}\end{array}\right. $$
(5)

where Tn(α1, α2, r) ≡ Tn(α, r), (α1, α2) ∈ R, n = 0, 1, 2, …, and Nlow and Nhigh denote the ranges of lower and higher orders of the image moments, respectively. The radial basis functions Tn(α, r) can be understood as the set of orthogonal exponent functions \( {R}_n^{\ast }(r) \) in Eq. (2) multiplied by the compound envelope factor \( \sqrt{r}\left({16}^{-\frac{\alpha }{4}r}\right) \). The basis function \( {R}_n^{\ast }(r){e}^{-\tilde{j} m\theta} \) is orthogonal over the interior of the unit circle:

$$ {\int}_0^{2\pi }{\int}_0^1\left[{R}_n^{\ast }(r){e}^{-\tilde{j} p\theta}\right]{\left[{R}_m^{\ast }(r){e}^{-\tilde{j} q\theta}\right]}^{\ast } rdrd\theta =2\pi {\delta}_{nm}{\delta}_{pq} $$
(6)

where 2π is the normalization coefficient and δnm and δpq are Kronecker delta functions. Because Tn(α, r) is this orthogonal function multiplied by a non-orthogonal envelope, the MEFMs can also be called semi-orthogonal EFMs.
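The piecewise radial basis function of Eq. (5) can be sketched as follows. This is an illustrative example only: the boundary `n_split` between lower and higher orders is a hypothetical value that the text does not fix, and the default values α1 = 2 and α2 = 0.2 are simply the examples quoted later in the time–frequency analysis. The name `mefm_radial` is also our own.

```python
import cmath
import math


def mefm_radial(n, r, alpha_low=2.0, alpha_high=0.2, n_split=15):
    """T_n(alpha, r) from Eq. (5) with a hypothetical low/high order boundary."""
    # Assumption: orders up to n_split count as "lower order"; the alpha
    # values are example settings, not prescribed constants.
    alpha = alpha_low if n <= n_split else alpha_high
    envelope = 16.0 ** (-alpha * r / 4.0)  # identical to 2 ** (-alpha * r)
    return envelope * cmath.exp(-2j * math.pi * n * r)


# Unlike Eq. (2), the magnitude stays bounded on the whole interval [0, 1].
for n in (5, 50):
    print(n, [round(abs(mefm_radial(n, r)), 4) for r in (0.0, 0.5, 1.0)])
```

Because the \( 1/\sqrt{r} \) factor is absent, the function is finite at r = 0 and its magnitude decays smoothly from 1 to \( {2}^{-\alpha } \) at r = 1.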

Calculation of MEFMs

In the image analysis process, all test images are digital images; thus, Eq. (4) must be replaced by a discrete form. Consider a digital image f(xi, yj) of M × N pixels, 0 ≤ i ≤ M − 1, 0 ≤ j ≤ N − 1. We normalize the M × N pixels onto the region [−1, 1] × [−1, 1]. Eq. (4) can then be rewritten as

$$ {M}_{nm}=\frac{1}{2\pi}\sum \limits_{i=0}^{M-1}\sum \limits_{j=0}^{N-1}f\left({x}_i,{y}_j\right)\times {\int}_{x_i-\frac{\Delta x}{2}}^{x_i+\frac{\Delta x}{2}}{\int}_{y_j-\frac{\Delta y}{2}}^{y_j+\frac{\Delta y}{2}}{T}_n\left(\alpha, \sqrt{x^2+{y}^2}\right){e}^{-\tilde{j}m\left({\tan}^{-1}\left(y/x\right)\right)} dxdy $$
(7)

where \( {x}_i=\frac{2i+1-M}{M},\ {y}_j=\frac{2j+1-N}{N},\ \mathrm{and}\ \Delta x=\Delta y=\frac{2}{\sqrt{M^2+{N}^2}} \). A zero-order approximation (ZOA) method is used to calculate the double integral in Eq. (7), which allows a fair comparison with the classic EFMs in [39] in the following experiments:

$$ {\int}_{x_i-\frac{\Delta x}{2}}^{x_i+\frac{\Delta x}{2}}{\int}_{y_j-\frac{\Delta y}{2}}^{y_j+\frac{\Delta y}{2}}{T}_n\left(\alpha, \sqrt{x^2+{y}^2}\right){e}^{-\tilde{j}m\left({\tan}^{-1}\left(y/x\right)\right)} dxdy\approx \Delta x\Delta y\,{T}_n\left(\alpha, \sqrt{x_i^2+{y}_j^2}\right){e}^{-\tilde{j}m\left({\tan}^{-1}\left({y}_j/{x}_i\right)\right)} $$
(8)

Substituting Eq. (8) into Eq. (7), the modified EFM can be calculated by ZOA as

$$ {\tilde{M}}_{nm}=\frac{2}{\pi \left({M}^2+{N}^2\right)}\sum \limits_{i=0}^{M-1}\sum \limits_{j=0}^{N-1}f\left({x}_i,{y}_j\right){T}_n\left(\alpha, \sqrt{x_i^2+{y}_j^2}\right){e}^{-\tilde{j}m\left({\tan}^{-1}\left({y}_j/{x}_i\right)\right)} $$
(9)
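The zero-order approximation in Eq. (9) translates almost directly into code. The sketch below is a minimal, unoptimized illustration with several assumptions of our own: the image is a list of rows, a single α is passed in rather than the piecewise choice of Eq. (5), `atan2` replaces \( {\tan}^{-1}\left(y/x\right) \) to obtain the angle over the full range, and pixels falling outside the unit disk are skipped. The names `radial` and `mefm_zoa` are hypothetical.

```python
import cmath
import math


def radial(n, r, alpha):
    """T_n(alpha, r) for a single alpha, as in one branch of Eq. (5)."""
    return 16.0 ** (-alpha * r / 4.0) * cmath.exp(-2j * math.pi * n * r)


def mefm_zoa(image, n, m, alpha=2.0):
    """Zero-order approximation of the moment M_nm, following Eq. (9)."""
    rows, cols = len(image), len(image[0])
    total = 0j
    for i in range(rows):
        x = (2 * i + 1 - rows) / rows  # pixel centre mapped into [-1, 1]
        for j in range(cols):
            y = (2 * j + 1 - cols) / cols
            r = math.hypot(x, y)
            if r > 1.0:
                continue  # assumption: ignore pixels outside the unit disk
            theta = math.atan2(y, x)  # assumption: full-range angle
            total += image[i][j] * radial(n, r, alpha) * cmath.exp(-1j * m * theta)
    return 2.0 / (math.pi * (rows * rows + cols * cols)) * total


# Toy 6 x 6 image and a copy rotated by 90 degrees about the image centre.
img = [[(3 * i + 5 * j) % 7 for j in range(6)] for i in range(6)]
rot = [[img[j][5 - i] for j in range(6)] for i in range(6)]
print(abs(mefm_zoa(img, 2, 3)), abs(mefm_zoa(rot, 2, 3)))
```

The two printed magnitudes should agree up to rounding, because a rotation of the image only multiplies the moment by a unit-modulus phase factor; this is the rotation-invariance property that the text attributes to radial moments.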

Similarly, the reconstructed image can be expressed by the following formula:

$$ \overline{f}\left({x}_i,{y}_j\right)\approx \sum \limits_{n=1}^{n_{\mathrm{max}}}\sum \limits_{m=1}^{m_{\mathrm{max}}}{\tilde{M}}_{nm}{T}_n\left(\alpha, \sqrt{x_i^2+{y}_j^2}\right){e}^{\tilde{j}m\left({\tan}^{-1}\left({y}_j/{x}_i\right)\right)} $$
(10)

Computation complexity and stability analysis of MEFMs

All the computations of image moments, including those of the other moments used for comparison, are implemented by the ZOA algorithm proposed in [38], so that the properties of the MEFMs can be compared fairly and verified efficiently without considering accurate or fast algorithms for image moments. Compared with the existing classic orthogonal moments based on higher-order polynomials (e.g., ZMs, LMs, OFMMs, and BFMs), the proposed radial polynomial of the MEFMs is simple (i.e., it is composed only of trigonometric and exponential functions with parameter variables). It does not involve the factorial and accumulative summation operations of classic orthogonal moments; thus, the computational complexity is lower. The radial polynomials of ZMs, OFMMs, and BFMs at lower orders (n = 10) [40] are close to uniformly distributed, and their amplitudes are stable (e.g., the amplitudes of the radial polynomials of ZMs and BFMs lie in the interval [−1, 1], while only a few lower-order OFMMs have amplitudes exceeding 2 and the rest lie in [−2, 2]). However, as the order of the image moments increases, numerical instability appears in the calculation of these classic orthogonal moments (e.g., Fig. 3 shows the numerical distribution curves of the classic orthogonal moments at the higher order n = 50). Figure 3 shows that the radial polynomials of ZMs and OFMMs are close to 0 in the interval [0, 0.8]. Each amplitude gradually increases in the interval (0.8, 1), and the numerical values tend to be unstable (e.g., the amplitude of OFMMs is close to 1.7 × 10^20 when r = 0.95). The radial polynomial of BFMs also tends to decay over the interval [0, 1] (e.g., the amplitude is attenuated to [−0.1, 0.1] in the interval [0.5, 1]). In contrast, the polynomial of MEFMs is almost uniform at order n = 50, and its amplitude is stable.

In addition, the classic EFMs [39] and RHFMs [26] introduce the factor \( \sqrt{1/r} \) into their radial basis functions to satisfy the orthogonality condition; however, the amplitudes of the EFM and RHFM polynomials tend to NaN (not a number) and Inf (infinity), respectively, when r = 0. This results in numerical instability in the constructed image moments. Compared with the other classic orthogonal moments, the proposed image moments avoid this phenomenon and are therefore more stable. Orthogonal moments constructed from orthogonal polynomials are better than non-orthogonal moments in terms of overall performance. However, this does not mean that the orthogonal polynomials are stable at every point of their domain; a proper correction of the unstable orthogonal basis functions can therefore bring the image moments to their best performance. This is the major purpose of the image moments proposed in this paper.

Fig. 3 a–d Radial polynomials of the ZMs, OFMMs, BFMs, and MEFMs at higher orders

Time–frequency analysis of MEFMs

From the time-domain point of view, the ZOA theory can effectively explain the properties of the constructed basis functions of the orthogonal moments (i.e., the locations and the number of zeros of the radial function represent the sampling positions and the sampling frequency of an image, respectively). The more zeros there are and the more evenly they are distributed in a region, the better the reconstructed image. For a given order n and repetition m, the radial polynomials of BFMs and OFMMs have n + 2 and n zeros in the interior of the unit circle, respectively, while the radial polynomial of the ZMs has only (n − m)/2 zeros in the interval 0 ≤ r ≤ 1 [40]. Among radial polynomials (or radial functions) with trigonometric functions as basis functions, the real and imaginary parts of the radial polynomials of EFMs [39] and polar harmonic Fourier moments (PHT) [26] have 2n and 2n + 1 zeros in the interior of the unit circle, respectively. The radial polynomials of the polar sine transforms (PST) [41], polar cosine transforms (PCT) [41], and circularly semi-orthogonal moments (SOMs) similarly have n + 2 zeros. Meanwhile, the real and imaginary parts of the radial polynomial of MEFMs have 2n and 2n + 1 zeros in the interior of the unit circle, respectively. As illustrated in Fig. 4, the curve of the real part of the radial polynomial of the MEFMs is smoother at lower orders. Compared with the classic orthogonal moments (i.e., ZMs, OFMMs, and BFMs) and other orthogonal moments with trigonometric basis functions (i.e., EFMs, PHT, and PCT), it is closer to a uniform distribution and its magnitude is more stable (e.g., the amplitude lies in the interval [−1, 1]).

For image recognition, most algorithms use the lower-order moments as the features for classification. The lower-order moments are robust to noise in pattern recognition; however, the order must be increased to obtain more image feature points as classification features and to classify image sets with high similarity (e.g., texture images) more precisely. Therefore, the higher-order moments need to be studied in depth. The lower-order moments generally correspond to the low-frequency components of an image (e.g., contours or shapes), while the higher-order moments represent its detail components (i.e., high-frequency components). Time-domain analysis can be used for the quantitative analysis of the lower-order moments, but it cannot provide a reasonable description of the high-frequency components of an image (the corresponding higher-order moments). For these reasons, a method of time–frequency correspondence is proposed from the frequency domain perspective. This method can analyze and improve the stability of moments of different orders for image recognition.

The basic idea is to regard the frequency-domain representation of the basis functions of the image moments as a 2D filter. At lower orders, we want the bandwidth corresponding to the basis functions to be wider and the attenuation at the cut-off frequency to be as slow as possible; at higher orders, we want the bandwidth to be narrower and the attenuation at the cut-off frequency to be as fast as possible. In this study, a parameter-modulated MEFM is proposed and used to verify this concept (Fig. 5). The main principle is to change the bandwidth in the frequency domain by adjusting parameter α of the radial function of the MEFMs in the time domain. In the low-frequency region of the image (lower-order moments), we set parameter α (e.g., α = 2 in the experiments) to make the bandwidth as wide as possible, so that more image features can be obtained from the lower-order moments. In the high-frequency region (higher-order moments), the bandwidth is made as narrow as possible by changing parameter α (e.g., α = 0.2 in the experiments), so that more high-frequency components, especially noise interference, are suppressed. Finally, the theoretical results are illustrated and verified by the experimental results of image reconstruction (Section 4.2).
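The link between parameter α and bandwidth can be made concrete with a small numerical experiment. The sketch below is one hypothetical way to quantify it, not the analysis used to produce Fig. 5: it estimates the Fourier coefficients of Tn(α, r) on [0, 1] by a midpoint rule and reports how much of the spectrum leaks into the frequency bin next to the carrier. The names `coeff` and `leakage`, the offset of one bin, and the sample count are all our own assumptions.

```python
import cmath
import math


def coeff(alpha, n, offset, samples=4096):
    """Fourier coefficient of T_n(alpha, r) at `offset` bins from its carrier."""
    acc = 0j
    for k in range(samples):
        r = (k + 0.5) / samples
        t_n = 16.0 ** (-alpha * r / 4.0) * cmath.exp(-2j * math.pi * n * r)
        # T_n oscillates at frequency -n, so analyse at frequency offset - n.
        acc += t_n * cmath.exp(-2j * math.pi * (offset - n) * r)
    return acc / samples


def leakage(alpha, n=10, offset=1):
    """Neighbouring-bin magnitude relative to the carrier-bin magnitude."""
    return abs(coeff(alpha, n, offset)) / abs(coeff(alpha, n, 0))


for alpha in (0.2, 2.0):
    print(alpha, leakage(alpha))
```

Under these assumptions, the relative leakage is larger for α = 2 than for α = 0.2, which is consistent with the statement that a larger α widens the bandwidth of the radial function and a smaller α narrows it.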

Fig. 4 Curve distribution of the MEFMs under different parameters in spatial domain

Fig. 5 Characteristic curves of the MEFMs under different parameters in frequency domain. a In lower-order moments. b In higher-order moments

Results and discussion

In this section, the experimental results are used to validate the theoretical framework developed in the previous sections. This section includes four subsections. In the first subsection, we discuss the computational complexity of MEFMs compared with those of BFMs, ZMs, OFMMs, PST, and PCT. In the second subsection, the question of how well an image can be represented using MEFMs is addressed, and the image reconstruction capability of MEFMs is compared with those of BFMs, ZMs, OFMMs, SOMs, PST, and PCT. In the third subsection, the question of optimal parameter selection for image reconstruction and recognition is discussed. A new method for RST invariant image recognition using the proposed moments, together with an experimental study of the RST recognition accuracy of MEFMs, is provided in the last subsection.

Computational complexities

In this section, we demonstrate, in terms of computation time, how much less complex the computation of the radial polynomial of the MEFMs is than those of BFMs, ZMs, and OFMMs. Table 1 summarizes the computation times of the radial polynomials of MEFMs and the other radial orthogonal moments. In the calculation, the orders of the image moments are 5, 10, 15, …, 30. The test image is the Lena gray-level image (Fig. 6) of size 128 × 128. The computation time averaged over the six orders is taken as the time measurement for each type of image moment. The test computer has a 3.2 GHz Intel(R) Core(TM) i5 CPU and 8 GB of memory. The software is MATLAB R2013a. Table 1 shows that the time consumed by the MEFMs is slightly higher than that of the PST and EFMs but significantly lower than that of the other classical orthogonal moments (i.e., ZMs, OFMMs, and BFMs).

Table 1 Basis functions computation time

Fig. 6 Lena gray-level image

Image reconstruction

In this subsection, the image representation capability of the MEFMs is presented. For convenience in computing the image moments, the number of moments used in the image reconstruction and recognition experiments is limited by nmax = mmax = N, N ∈ Z+. In addition, we use the statistical-normalization image reconstruction error (SNIRE) defined in [34] to measure the performance of the image reconstruction:

$$ \overline{\varepsilon^2}=\frac{\sum \limits_{x=1}^N\sum \limits_{y=1}^N{\left|f\left(x,y\right)-\overline{f}\left(x,y\right)\right|}^2}{\sum \limits_{x=1}^N\sum \limits_{y=1}^N{f}^2\left(x,y\right)} $$
(11)

where f(x, y) is the original image and \( \overline{f}\left(x,y\right) \) is the reconstructed image.
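Eq. (11) is straightforward to compute. The following is a minimal sketch, assuming both images are given as equally sized lists of rows; the function name `snire` is our own choice. Taking the absolute value before squaring lets the same function accept a complex-valued reconstruction.

```python
def snire(original, reconstructed):
    """Statistical-normalization image reconstruction error, Eq. (11)."""
    numerator = sum(
        abs(f - g) ** 2
        for row_f, row_g in zip(original, reconstructed)
        for f, g in zip(row_f, row_g)
    )
    denominator = sum(f ** 2 for row in original for f in row)
    return numerator / denominator


original = [[2.0, 0.0], [0.0, 2.0]]
half = [[1.0, 0.0], [0.0, 1.0]]
print(snire(original, original), snire(original, half))
```

A value of 0 means a perfect reconstruction, and a value of 1 corresponds to an all-zero reconstruction, so smaller values indicate better reconstruction quality.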

Experiment 1

A set of binary images comprising the digits 0 to 9 and the uppercase English letters A to J, and another set of gray-level and color images comprising Lena, cameraman, woman and plane, baboon, and pepper, as shown in Fig. 7, are used as test images. The size of each image is 64 × 64. The proposed MEFMs are computed from the images shown in Fig. 7, and the images are reconstructed using a maximum order of 35 with the parameter α of the radial function of the MEFMs set to 2. The results are given in Fig. 8, which shows that color, gray-level, and binary images can all be reconstructed well using the proposed MEFMs.

Fig. 7 Binary images, gray-level images, and color images used as test images, each of size 64 × 64

Fig. 8 Images reconstructed from the proposed modified exponent-Fourier moments up to order 35

Experiment 2

To demonstrate the validity of the time–frequency correspondence method proposed in Section 3.5, the image reconstruction ability of the proposed moments under different parameters is compared using a binary image of the uppercase English letter “E,” the gray-level image cameraman, and the color image baboon. The images reconstructed with two parameter settings (α = 0 and α = 2) at lower orders (e.g., N = 5, 7, 9, 11, and 13) and higher orders (e.g., N = 55, 60, 65, 70, …, 120) are shown in Figs. 9, 10, and 11. Here, “lower-order” and “higher-order” are relative terms in image reconstruction; e.g., for orders N = 10 and N = 100, N = 10 is considered a lower order and N = 100 a higher order. The comparison of the images reconstructed using the MEFMs under the two parameter settings shows that, at lower orders (N = 5, 7, 9, 11, and 13), the visual quality of the images reconstructed with α = 2 is better than with α = 0. The objective evaluation confirms this: the SNIRE for α = 2 at lower orders is generally less than that for α = 0. However, as the order increases and N exceeds 15, the reconstruction performance of the MEFMs reverses. As shown in Table 2, the SNIRE for α = 0 at higher orders (e.g., when N exceeds 25) is less than that for α = 2; when N = 65, the SNIRE reaches its minimum, and the reconstructed binary image of the uppercase English letter “E” is almost identical to the original image. As can be seen from Figs. 9 and 10, the visual quality of the gray-level and color images reconstructed with α = 0 is better than that with α = 2 at higher orders. These experimental results also support the time–frequency correspondence method of Section 3.5: the bandwidths and cut-off frequencies of the radial function (or polynomial) of the MEFMs in the frequency domain affect the quality and numerical stability of image reconstruction. If lower-order moments are used to describe the image features, the bandwidth of the radial polynomial of the MEFMs can be made slightly larger (e.g., α = 2), whereas to obtain more image features (images reconstructed using higher-order moments), the bandwidth should be made as narrow as possible (e.g., α = 0).

Fig. 9

Gray-level image cameraman reconstructed with parameters α = 0 and α = 2 at different moment orders

Fig. 10

Color image baboon reconstructed with parameters α = 0 and α = 2 at different moment orders

Fig. 11

a, b Uppercase English letter “E” reconstructed with parameters α = 0 and α = 2 using different moments (MEFMs, ZMs, OFMs, and BFMs)

Table 2 Uppercase English letter “E” reconstructed with parameters α = 0 and α = 2 at different moment orders

Experiment 3

Based on the frequency-domain analysis of the radial function of the MEFMs, we proposed an image projection transformation of the original image that uses a piecewise function (or polynomial), depending on whether the image is reconstructed at lower-order or higher-order moments (see Eq. (5) in Section 3.2). To verify the validity of the piecewise function in Eq. (5), the proposed image moments (MEFMs) are compared with the ZMs, OFMs, BFMs, and EFMs by reconstructing the binary image of the uppercase English letter “E.” The results in Fig. 11 show that the proposed image moments, whose basis functions consist of the piecewise polynomials of Section 3.2, are superior to the other classical orthogonal moments at both lower and higher orders. In particular, as the order increases, the images reconstructed by the OFMs become invalid at N = 40, and those reconstructed by the ZMs become invalid at N = 50. The BFMs and EFMs maintain good numerical stability at higher orders, but their reconstructed images are visibly worse than those of the proposed MEFMs.

To further verify the validity of the proposed image moments, we also reconstruct images using image moments whose radial basis functions are trigonometric functions (e.g., SOMs, PST, and PCT) and compare the results with those of the MEFMs. The experimental results show that, at lower orders, the SNIRE of the MEFMs, SOMs, PCT, and PST decreases approximately linearly as the moment order increases, and the quality of the reconstructed images gradually improves. The curves in Fig. 12a show that the image reconstruction ability of the proposed image moments is better than that of the EFMs, PCT, and PST. At higher orders, however, Fig. 12b illustrates that the SNIRE of the image moments with trigonometric radial basis functions no longer decreases linearly, and numerical instability appears during image reconstruction. In contrast, the SNIRE of the proposed MEFMs keeps decreasing gradually as the order increases, which shows that their reconstruction performance at higher orders is better than that of the image moments with trigonometric radial basis functions.

Fig. 12

Comparison of the reconstruction results of MEFMs, SOMs, PCT, and PST. a At lower-order moments. b At higher-order moments

Optimal parameter selection for image reconstruction and recognition

According to the time–frequency correspondence analysis in Section 3.5, the value of the parameter α in Eq. (5) is crucial for the proposed image moments (MEFMs) because it affects both the image reconstruction accuracy and the image recognition rate. Choosing the α that gives the best image description ability is therefore a problem that needs to be solved, and a method for selecting α is required. The selection of α is equivalent to an optimization problem (i.e., \( \min \left\{\overline{\varepsilon^2}\left[f,\overline{f};\alpha, N\right]\right\} \)) in which the two variables α and N are limited to \( {\alpha}_{\min}\le \alpha \le {\alpha}_{\max} \) and \( {N}_{\min}\le N\le {N}_{\max} \). The genetic algorithm (GA) has been a popular and effective method for such optimization problems in recent years, and applying it to the proposed image moments could yield more precise values of α and N. However, considering the complexity of implementing a GA, a simpler algorithm is adopted herein to optimize α. If the order N of the proposed image moments is fixed, the two-variable optimization problem reduces to a single-variable one. The specific implementation process is presented below.

First, to investigate the influence of the parameter α on the performance of the proposed method, we employ 20 gray-level images selected from the Coil-20 database [42] (presented in Fig. 16) and use \( {D}_g\left(\alpha \right) \), defined as follows, to evaluate each choice of α:

$$ {D}_g\left(\alpha \right)=\frac{1}{g}\sum \limits_{n=1}^g\overline{\varepsilon^2}\left[{f}_n,{\overline{f}}_n;\alpha \right] $$
(12)

where g = 20 denotes the number of gray-level images from the Coil-20 database, and \( {f}_n \) and \( {\overline{f}}_n \) represent the nth original and reconstructed images, respectively. A lower value of \( {D}_g \) indicates better performance of the proposed image moments in image reconstruction or recognition.

We first consider the influence of the orthogonality of the basis function of the proposed image moments: when α = 0, the basis function (e.g., \( {T}_n\left(\alpha, r\right)={e}^{-\tilde{j}2 n\pi r} \)) is orthogonal on the interval [0, 1]. The search interval is restricted to \( -\frac{7}{2}\le \alpha \le \frac{7}{2} \) in the experiments (i.e., we empirically take values close to zero, with a step of 0.5). With the order N fixed (N = 10 for lower-order moments and N = 60 for higher-order moments in the experiments), the values of \( {D}_g \) obtained for each value of α are summarized in Table 3. Table 3 clearly shows that {N = 10, α = 2} is optimal at lower orders and {N = 60, α = 0} is optimal at higher orders for the task of image reconstruction or recognition. This experiment can therefore guide the selection of the parameter α in future image reconstruction and classification tasks. The optimal values of α given in Table 3 are also consistent with the conclusion of the time–frequency correspondence method proposed in Section 3.5.
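The single-variable search over α can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: `reconstruct` is a hypothetical callable standing in for MEFM reconstruction at a fixed order N, the normalized squared error is an assumed stand-in for \( \overline{\varepsilon^2} \), and `toy_reconstruct` is an artificial model included only so that the example runs.

```python
import numpy as np

def mean_error(images, reconstruct, alpha):
    """D_g(alpha) as in Eq. (12): reconstruction error averaged over the image set."""
    errors = []
    for f in images:
        f_rec = reconstruct(f, alpha)
        # Assumed stand-in for the paper's error measure: normalized squared error.
        errors.append(np.sum((f - f_rec) ** 2) / np.sum(f ** 2))
    return float(np.mean(errors))

def search_alpha(images, reconstruct, alpha_min=-3.5, alpha_max=3.5, step=0.5):
    """Evaluate D_g on a uniform grid of alpha values and return the minimizer."""
    n_steps = int(round((alpha_max - alpha_min) / step)) + 1
    alphas = alpha_min + step * np.arange(n_steps)
    scores = {float(a): mean_error(images, reconstruct, a) for a in alphas}
    best_alpha = min(scores, key=scores.get)
    return best_alpha, scores

# Hypothetical toy model (NOT the MEFM reconstruction): its error is smallest at alpha = 2.
def toy_reconstruct(f, alpha):
    return f * (1.0 - 0.01 * (alpha - 2.0) ** 2)

rng = np.random.default_rng(0)
images = [rng.random((16, 16)) + 0.1 for _ in range(20)]  # 20 synthetic "images"
best_alpha, scores = search_alpha(images, toy_reconstruct)
print(best_alpha, len(scores))
```

Replacing `toy_reconstruct` with a real reconstruction routine at N = 10 or N = 60 would reproduce the kind of search summarized in Table 3; the grid bounds and step follow the values stated in the text.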

Table 3 Search results of \( {D}_g \) for different values of the parameter α

Rotation, scaling, and translation invariant image recognition

In this section, a new RST invariant system for the MEFMs is proposed, which is implemented in two steps. First, for translation invariance, the proposed image projection approach serves as an alternative to the traditional algorithm, which achieves translation invariance by calculating the image geometric moments [28] and central moments [43]. Second, the basis functions of the MEFMs are extended from the polar coordinate space to the log-polar space, so that the MEFMs are invariant to scaling and rotation at the same time.

Scaling and rotation invariance of MEFMs

Log-polar mapping

In image processing and recognition, the original image is usually acquired in a Cartesian coordinate system. Let \( {f}^{sr}\left(x,y\right) \) be the scaled and rotated version of an image function f(x, y), with scaling factor \( {\sigma}^{-1} \) and rotation angle ϕ in Cartesian coordinates. We then have

$$ {f}^{sr}\left(x,y\right)=f\left({\sigma}^{-1}\left(x\cos \phi +y\sin \phi \right),{\sigma}^{-1}\left(y\cos \phi -x\sin \phi \right)\right) $$
(13)

Using the conversion from the Cartesian coordinate system to the log-polar coordinate space, \( x={e}^{\rho}\cos \theta \), \( y={e}^{\rho}\sin \theta \), with 0 ≤ θ ≤ 2π and ρ ∈ ℝ, we can rewrite Eq. (13) as

$$ {\displaystyle \begin{array}{c}{f}^{sr}\left({e}^{\rho}\cos \theta, {e}^{\rho}\sin \theta \right)=f\left({\sigma}^{-1}\left({e}^{\rho}\cos \theta \cos \phi +{e}^{\rho}\sin \theta \sin \phi \right),{\sigma}^{-1}\left({e}^{\rho}\sin \theta \cos \phi -{e}^{\rho}\cos \theta \sin \phi \right)\right)\\ {}=f\left({\sigma}^{-1}{e}^{\rho}\cos \left(\theta -\phi \right),{\sigma}^{-1}{e}^{\rho}\sin \left(\theta -\phi \right)\right)\\ {}=f\left({e}^{-\ln \sigma }{e}^{\rho}\cos \left(\theta -\phi \right),{e}^{-\ln \sigma }{e}^{\rho}\sin \left(\theta -\phi \right)\right)\\ {}=f\left({e}^{\rho -\ln \sigma}\cos \left(\theta -\phi \right),{e}^{\rho -\ln \sigma}\sin \left(\theta -\phi \right)\right)\end{array}} $$
(14)

The above equation can be simply expressed as

$$ {f}^{sr}\left(\rho, \theta \right)=f\left(\rho -\ln \sigma, \theta -\phi \right) $$
(15)

The Fourier transform (FT) of a 2D image function f(ρ, θ) in the log-polar coordinates can be denoted as follows:

$$ f\left(\rho, \theta \right)\leftrightarrow F\left(u,v\right) $$
(16)

According to the translation property of the 2D Fourier transform, for \( {f}^{sr}\left(\rho, \theta \right) \) we have

$$ f\left(\rho -\ln \sigma, \theta -\phi \right)\leftrightarrow F\left(u,v\right){e}^{-\tilde{j}2\pi \left(u\ln \sigma + v\phi \right)} $$
(17)

Thus, it is straightforward that \( \left|F\left(u,v\right){e}^{-\tilde{j}2\pi \left(u\ln \sigma + v\phi \right)}\right|=\left|F\left(u,v\right)\right| \).

The above equations and Fig. 13 show that scaling and rotating an image in the Cartesian coordinate system is converted into a corresponding translation in the log-polar coordinate space. Applying the 2D Fourier transform to \( {f}^{sr}\left(\rho, \theta \right) \) and taking the magnitude then yields invariance to image scaling and rotation.
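The shift property behind Eq. (17) can be checked numerically. The following is a minimal sketch, assuming the image has already been resampled onto a uniform log-polar grid and that the shift is circular (exact along θ, an idealization along ρ); the array `f_lp` and the shift amounts are synthetic placeholders, not data from the experiments.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic samples f(rho_i, theta_k) on a uniform log-polar grid (rows: rho, columns: theta).
f_lp = rng.random((64, 90))

# Shifts in grid samples, standing in for ln(sigma) along rho and phi along theta.
shift_rho, shift_theta = 5, 12
f_sr = np.roll(f_lp, shift=(shift_rho, shift_theta), axis=(0, 1))

# 2D discrete Fourier transforms of the original and the shifted log-polar images.
F = np.fft.fft2(f_lp)
F_sr = np.fft.fft2(f_sr)

# The shift only changes the phase, so the magnitude spectra coincide.
magnitudes_match = np.allclose(np.abs(F), np.abs(F_sr))
spectra_differ = not np.allclose(F, F_sr)
print(magnitudes_match, spectra_differ)
```

For real images the wrap-around along ρ is only approximate, so the magnitudes of a scaled image would agree up to resampling and boundary effects rather than exactly.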

Fig. 13

The original binary image and the scaled and rotated image in Cartesian and log-polar coordinates, respectively. a Original binary image in Cartesian coordinates. b Original binary image in the log-polar domain. c Scaled (0.5) and rotated (90°) image. d Scaled and rotated image in the log-polar domain

MEFMs invariant computing method in the log-polar coordinate space

Encouraged by the success of the log-polar mapping (LPM) approach and related work [44], we extend the basis functions of the MEFMs from the polar coordinates to the log-polar coordinate space, so that scaling and rotation invariance can be easily achieved for the proposed MEFMs. In light of Eq. (5), we have

$$ {T}_n\left(\alpha, r\right)=\left|{T}_n\left(\alpha, r\right)\right|{e}^{-\tilde{j}2 n\pi r} $$
(18)

We then let

$$ g\left(r,\theta \right)=w\left(\alpha, r\right)f\left(r,\theta \right) $$
(19)

where \( w\left(\alpha, r\right)=\left|{T}_n\left(\alpha, r\right)\right|r \) can be regarded as a weighting function, and g(r, θ) is the weighted image in the polar coordinate system.

Thus, Eq. (4) can be rewritten as follows:

$$ {\displaystyle \begin{array}{c}{M}_{nm}=\frac{1}{2\pi }{\int}_0^{2\pi }{\int}_0^1f\left(r,\theta \right){T}_n\left(\alpha, r\right){e}^{-\tilde{j} m\theta} rdrd\theta \\ {}=\frac{1}{2\pi }{\int}_0^{2\pi }{\int}_0^1f\left(r,\theta \right)\left|{T}_n\left(\alpha, r\right)\right|{e}^{-\tilde{j}2 n\pi r}{e}^{-\tilde{j} m\theta} rdrd\theta \\ {}=\frac{1}{2\pi }{\int}_0^{2\pi }{\int}_0^1g\left(r,\theta \right){e}^{-\tilde{j}2 n\pi r}{e}^{-\tilde{j} m\theta} drd\theta \end{array}} $$
(20)

Similarly, using the conversion from the polar coordinate system to the log-polar coordinate space, ρ = ln r, θ = θ, with 0 ≤ θ ≤ 2π and ρ ∈ (−∞, 0], we can transfer Eq. (20) from the polar coordinate domain to the log-polar domain. The radial basis function of the MEFMs is redefined as

$$ {T}_n\left(\rho \right)={e}^{\tilde{j}2 n\pi \rho} $$
(21)

which satisfies the orthogonal condition:

$$ {\int}_{-1}^0{T}_n\left(\rho \right){T}_m^{\ast}\left(\rho \right) d\rho ={\delta}_{nm} $$
(22)
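As a brief check of Eq. (22), which is stated without derivation, the integral can be evaluated directly for integers n and m, using \( {e}^{-\tilde{j}2\pi \left(n-m\right)}=1 \):

$$ {\int}_{-1}^0{e}^{\tilde{j}2\pi \left(n-m\right)\rho } d\rho =\left\{\begin{array}{ll}1,& n=m\\ {}\frac{1-{e}^{-\tilde{j}2\pi \left(n-m\right)}}{\tilde{j}2\pi \left(n-m\right)}=0,& n\ne m\end{array}\right. $$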

Hence, the basis function of MEFMs in the log-polar domain can be represented as

$$ {P}_{nm}\left(\rho, \theta \right)={T}_n\left(\rho \right){e}^{\tilde{j} m\theta} $$
(23)

\( {P}_{nm} \) also satisfies the following orthogonal condition:

$$ {\int}_0^{2\pi }{\int}_{-1}^0{P}_{nm}\left(\rho, \theta \right){P}_{lk}^{\ast}\left(\rho, \theta \right) d\rho d\theta ={\delta}_{nl}{\delta}_{mk} $$
(24)

In light of these conclusions, the modified version of the MEFMs in the log-polar domain is defined as

$$ {M}_{nm}^{LPM}={\int}_0^{2\pi }{\int}_{-1}^0g\left(\rho, \theta \right){P}_{nm}^{\ast}\left(\rho, \theta \right) d\rho d\theta $$
(25)

Let \( {g}^{sr}\left(\rho, \theta \right) \) denote the scaled and rotated version of an image g(ρ, θ), with scaling factor σ and rotation angle ϕ, in the log-polar coordinates. We then have

$$ {g}^{sr}\left(\rho, \theta \right)=g\left(\ln \left(\sigma r\right),\theta +\phi \right)=g\left(\ln \sigma +\ln r,\theta +\phi \right)=g\left(\rho +\ln \sigma, \theta +\phi \right) $$
(26)

Thus, according to Eqs. (25) and (26), the MEFMs of \( {g}^{sr}\left(\rho, \theta \right) \) are

$$ {\displaystyle \begin{array}{l}{\tilde{M}}_{nm}^{LPM}={\int}_0^{2\pi }{\int}_{-1}^0{g}^{sr}\left(\rho, \theta \right){P}_{nm}^{\ast}\left(\rho, \theta \right) d\rho d\theta ={\int}_0^{2\pi }{\int}_{-1}^0{g}^{sr}\left(\rho, \theta \right){e}^{-\tilde{j}2 n\pi \rho}{e}^{-\tilde{j} m\theta} d\rho d\theta \\ {}={\int}_0^{2\pi }{\int}_{-1}^0g\left(\rho +\ln \sigma, \theta +\phi \right){e}^{-\tilde{j}2 n\pi \rho}{e}^{-\tilde{j} m\theta} d\rho d\theta \end{array}} $$
(27)

Let \( \hat{\rho}=\rho +\ln \sigma \) and \( \hat{\theta}=\theta +\phi \), so that \( \rho =\hat{\rho}-\ln \sigma \) and \( \theta =\hat{\theta}-\phi \). Eq. (27) can then be rewritten as

$$ {\displaystyle \begin{array}{c}{\tilde{M}}_{nm}^{LPM}={\int}_0^{2\pi }{\int}_{-1}^0g\left(\hat{\rho},\hat{\theta}\right){e}^{-\tilde{j}2 n\pi \left(\hat{\rho}-\ln \sigma \right)}{e}^{-\tilde{j}m\left(\hat{\theta}-\phi \right)}d\hat{\rho}d\hat{\theta}\\ {}=\left[{\int}_0^{2\pi }{\int}_{-1}^0g\left(\hat{\rho},\hat{\theta}\right){e}^{-\tilde{j}2 n\pi \hat{\rho}}{e}^{-\tilde{j}m\hat{\theta}}d\hat{\rho}d\hat{\theta}\right]{e}^{\tilde{j}2 n\pi \ln \sigma }{e}^{\tilde{j} m\phi}\\ {}={M}_{nm}^{LPM}{e}^{\tilde{j}2 n\pi \ln \sigma +\tilde{j} m\phi}\end{array}} $$
(28)

and we have

$$ \left|{\tilde{M}}_{nm}^{LPM}\right|=\left|{M}_{nm}^{LPM}{e}^{\tilde{j}2 n\pi \ln \sigma +\tilde{j} m\phi}\right|=\left|{M}_{nm}^{LPM}\right| $$
(29)

Equations (28) and (29) show that scaling an image by a factor σ and rotating it by an angle ϕ shift the image along the ρ-axis and θ-axis of the log-polar domain, respectively, which only multiplies the MEFMs by a phase factor. Consequently, the magnitudes of the MEFMs of the scaled and rotated image are identical to those before scaling and rotation. Thus, the magnitudes \( \left|{M}_{nm}^{LPM}\right| \) of the MEFMs can be taken as scaling and rotation invariant features for image recognition. For the discrete computation of the scaling and rotation invariants of the MEFMs, see the Appendix.
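The phase-factor relation of Eq. (28) can be illustrated with a simple Riemann-sum discretization of Eq. (25). This is a hedged sketch and not the discretization given in the Appendix: the function name `lpm_moments`, the uniform grid on ρ ∈ [−1, 0) and θ ∈ [0, 2π), the synthetic image `g`, and the periodic wrap-around along ρ are all assumptions made for illustration.

```python
import numpy as np

def lpm_moments(g, n_max, m_max):
    """Riemann-sum approximation of Eq. (25) on a uniform (rho, theta) grid."""
    R, T = g.shape
    rho = -1.0 + np.arange(R) / R               # rho samples in [-1, 0)
    theta = 2.0 * np.pi * np.arange(T) / T      # theta samples in [0, 2*pi)
    n = np.arange(-n_max, n_max + 1)
    m = np.arange(-m_max, m_max + 1)
    e_rho = np.exp(-2j * np.pi * np.outer(n, rho))   # conjugate radial kernel
    e_theta = np.exp(-1j * np.outer(m, theta))       # conjugate angular kernel
    d_area = (1.0 / R) * (2.0 * np.pi / T)
    return n, m, (e_rho @ g @ e_theta.T) * d_area

rng = np.random.default_rng(1)
R, T = 50, 72
g = rng.random((R, T))                           # synthetic weighted log-polar image

# Scaled/rotated image g_sr(rho, theta) = g(rho + ln sigma, theta + phi), Eq. (26),
# with shifts that are whole numbers of grid samples (assumed periodic in rho).
a, b = 7, 9
g_sr = np.roll(g, shift=(-a, -b), axis=(0, 1))   # g_sr[i, k] = g[i + a, k + b]
ln_sigma = a / R
phi = 2.0 * np.pi * b / T

n, m, M = lpm_moments(g, 4, 4)
_, _, M_sr = lpm_moments(g_sr, 4, 4)

# Eq. (28): the moments differ only by the factor exp(j*2*pi*n*ln(sigma) + j*m*phi).
phase = np.exp(2j * np.pi * n * ln_sigma)[:, None] * np.exp(1j * m * phi)[None, :]
phase_ok = np.allclose(M_sr, M * phase)
magnitude_ok = np.allclose(np.abs(M_sr), np.abs(M))   # Eq. (29)
print(phase_ok, magnitude_ok)
```

In this idealized setting the magnitudes agree to floating-point precision; with real images, interpolation onto the log-polar grid and content leaving the ρ range would make the agreement approximate.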

\( {P}_{nm} \) is a complete orthogonal basis function; hence, a 2D image can be reconstructed from \( {M}_{nm}^{LPM} \) and represented by the following formula:

$$ g\left(\rho, \theta \right)=\sum \limits_{n=-\infty}^{\infty}\sum \limits_{m=-\infty}^{\infty }{M}_{nm}^{LPM}{P}_{nm}\left(\rho, \theta \right)=\sum \limits_{n=-\infty}^{\infty}\sum \limits_{m=-\infty}^{\infty }{M}_{nm}^{LPM}{e}^{\tilde{j}2 n\pi \rho}{e}^{\tilde{j} m\theta} $$
(30)

If we maintain the constraints \( n\le {n}_{\mathrm{max}} \) and \( m\le {m}_{\mathrm{max}} \), an approximate version of the 2D image function, denoted as \( \tilde{g}\left(\rho, \theta \right) \), can be calculated as

$$ \tilde{g}\left(\rho, \theta \right)=\sum \limits_{n=1}^{n_{\mathrm{max}}}\sum \limits_{m=1}^{m_{\mathrm{max}}}{M}_{nm}^{LPM}{P}_{nm}\left(\rho, \theta \right)=\sum \limits_{n=1}^{n_{\mathrm{max}}}\sum \limits_{m=1}^{m_{\mathrm{max}}}{M}_{nm}^{LPM}{e}^{\tilde{j}2 n\pi \rho}{e}^{\tilde{j} m\theta} $$
(31)

Projection approach for the image translation invariance

Existing moment-based methods mainly achieve image translation invariance by calculating the image geometric moments or central moments [28, 43]. Their major drawbacks are that they are time-consuming and complicate the computation in image recognition (Fig. 14a). The main reason is that, if other orthogonal moments (e.g., ZMs, OFMs, and BFMs) are used to extract the image features in the recognition process, the geometric moments or central moments must be calculated additionally to achieve translation invariance. Considering these shortcomings, this study proposes a new translation invariant algorithm, referred to as the image projection-based method. The basic idea is to achieve translation invariance by separating the target image from the background image. The approach can be summarized as follows (for the main algorithm flow, see Fig. 14b):

  (1) If the original image is a color image, it is first converted to gray scale; otherwise, this step is skipped.

  (2) Otsu’s algorithm [45] is used to determine the global threshold of the gray-scale image, and the gray-scale image is then binarized according to this threshold.

  (3) A high-quality binary image is obtained via simple pre-processing of the binary image (e.g., denoising, filtering, etc.).

  (4) The projection of the binary image in the horizontal direction is calculated, the positions of the troughs of the projection are located, and the whole image is segmented at the trough points.

  (5) The projection operation in the vertical direction is carried out in the same way as in Step 4.

  (6) Finally, according to the segmentation positions of the binary image, the target image is separated from the background in the original image.

The experiments are performed on the cartoon cat color images (Fig. 15) selected from the Columbia University image library database [42]. Figure 15a shows the projection approach applied to the untranslated image, and Fig. 15b shows it applied to the translated image.
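The projection and segmentation steps (Steps 4 to 6) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the input is assumed to be already binarized (Steps 1 to 3 are omitted), a single target is assumed, and a "trough" is interpreted as a row or column whose projection does not exceed a hypothetical level `trough`.

```python
import numpy as np

def crop_by_projection(binary, trough=0):
    """Crop the target from a binary image using row and column projections."""
    row_proj = binary.sum(axis=1)    # one value per row
    col_proj = binary.sum(axis=0)    # one value per column
    rows = np.flatnonzero(row_proj > trough)
    cols = np.flatnonzero(col_proj > trough)
    if rows.size == 0 or cols.size == 0:
        raise ValueError("no foreground found above the trough level")
    # Segment at the first and last positions where the projections leave the trough.
    return binary[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Synthetic target placed at two different positions on an empty background.
target = np.array([[0, 1, 0],
                   [1, 1, 1],
                   [1, 0, 1]], dtype=np.uint8)

untranslated = np.zeros((20, 20), dtype=np.uint8)
untranslated[2:5, 3:6] = target

translated = np.zeros((20, 20), dtype=np.uint8)
translated[10:13, 12:15] = target

crop_a = crop_by_projection(untranslated)
crop_b = crop_by_projection(translated)
same_crop = np.array_equal(crop_a, crop_b)
print(same_crop)
```

Because both crops contain only the target, any moments computed afterwards no longer depend on where the target was located in the original frame, which is the sense in which the projection step provides translation invariance.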

Fig. 14

a Block diagram of the traditional translation invariant algorithm. b Block diagram of our approach

Fig. 15

a The projection approach for untranslated color image. b The projection approach for translated color image

Test of classification results for the RST invariance

This subsection presents a simulation study of the RST invariant image classification accuracy of the MEFMs under noise-free and noisy conditions and under smoothing-filter operations. The accuracy is compared with that of the classic radial orthogonal moments (e.g., ZMs, OFMs, BFMs, and EFMs) and of some recent orthogonal image moments (e.g., fractional order Legendre moments (Fr-LMs) [2017] and SOMs [2016]). Moments of order 10 (lower order) and order 60 (higher order) were adopted, and the magnitudes of the selected image moments were used as features for the image classification task. A k-nearest neighbor classifier was used for classification. To evaluate the classification results, the correct classification percentages (CCPs) are defined as follows:

$$ \mathrm{CCPs}=\frac{\mathrm{Number}\ \mathrm{of}\ \mathrm{correctly}\ \mathrm{classified}\ \mathrm{objects}}{\mathrm{The}\ \mathrm{total}\ \mathrm{number}\ \mathrm{of}\ \mathrm{classified}\ \mathrm{objects}}\times 100 $$
(32)
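The classification and scoring procedure can be sketched as below. This is an illustrative sketch only: the text does not state the value of k or the distance measure, so k = 1 and the Euclidean distance are assumptions, and the two-dimensional feature vectors are synthetic stand-ins for moment magnitudes.

```python
import numpy as np

def knn_predict(train_feats, train_labels, test_feats, k=1):
    """k-nearest neighbor classification with Euclidean distance (assumed)."""
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)            # (n_test, n_train)
    nearest = np.argsort(dists, axis=1)[:, :k]       # indices of the k closest samples
    votes = train_labels[nearest]                    # labels must be non-negative ints
    return np.array([np.bincount(v).argmax() for v in votes])

def ccp(true_labels, predicted_labels):
    """Correct classification percentage, Eq. (32)."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    return 100.0 * float(np.mean(true_labels == predicted_labels))

# Synthetic features standing in for moment magnitudes.
train_feats = np.array([[0.0, 0.0], [10.0, 10.0]])
train_labels = np.array([0, 1])
test_feats = np.array([[1.0, 0.0], [9.0, 9.0], [0.0, 1.0], [6.0, 6.0]])
test_labels = np.array([0, 1, 0, 0])

predicted = knn_predict(train_feats, train_labels, test_feats, k=1)
score = ccp(test_labels, predicted)
print(predicted, score)
```

In the experiments, each row of the feature matrices would hold the magnitudes of the selected moments of one image.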

Datasets

The image classification performance of the proposed methods was evaluated with three test datasets: D1, D2, and D3 (Fig. 16). D1 was produced by selecting pictures from a publicly available database, named Coil-100, from Columbia University (the size of each image was 128 × 128; see [42]) (Fig. 17). D2, including 20 butterfly images, was collected from the internet. Some of these images are shown in Fig. 18 and available in [46]. The Brodatz texture image database was used for D3, which included 112 texture images (the size of each image was 640 × 640). Figure 19 shows the typical 35 pictures in the D3 dataset.

Fig. 16

Samples of Coil-20 grayscale images from Columbia University

Fig. 17

Some typical color images of Coil-100 from Columbia University

Fig. 18

Some of the butterfly color images used in the image classification

Fig. 19

Some sample images in the Brodatz image database

Experiment 1

First, the training set of 200 (100 × 2) images was constructed by rotating each image in the D1 dataset through angles of 0° and 180°. Next, each image of the training set was arbitrarily translated with (Δx, Δy) ∈ [−45, 45], rotated by \( {\phi}_i=5i \), where the integer i varies from 0 to 35, and scaled with a scaling factor of \( \alpha =0.5+\left(2.5{\phi}_i\right)/360\in \left[0.5,3\right] \). We thus obtained a testing set of 7200 (36 × 200) images. The proposed approach and the ZM-, OFM-, BFM-, and EFM-based methods were used to implement the classification. Finally, each image of the testing set was corrupted with salt and pepper noise with densities varying from 5% to 20% in steps of 5%. The CCPs obtained via our method and via the ZM-, OFM-, and BFM-based methods are shown in Table 4. At both lower orders (10th-order moments) and higher orders (60th-order moments), our approach (MEFMs) performed better than the other classic orthogonal moment-based methods.
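The noise corruption step can be sketched as follows. This is a minimal sketch under assumptions that the text does not specify: corrupted pixels are split evenly between pepper (0) and salt (255), an 8-bit gray-level image is assumed, and the function name `add_salt_and_pepper` and the constant test image are hypothetical.

```python
import numpy as np

def add_salt_and_pepper(image, density, rng=None, low=0, high=255):
    """Set a fraction `density` of the pixels to `low` (pepper) or `high` (salt)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    n_corrupt = int(round(density * image.size))
    # Choose distinct pixel positions in the flattened image.
    idx = rng.choice(image.size, size=n_corrupt, replace=False)
    half = n_corrupt // 2
    noisy.flat[idx[:half]] = low      # pepper
    noisy.flat[idx[half:]] = high     # salt
    return noisy

rng = np.random.default_rng(0)
image = np.full((100, 100), 128, dtype=np.uint8)   # synthetic mid-gray image
densities = [0.05, 0.10, 0.15, 0.20]               # 5% to 20% in steps of 5%

noisy_images = {d: add_salt_and_pepper(image, d, rng) for d in densities}
changed = {d: int(np.count_nonzero(noisy_images[d] != 128)) for d in densities}
print(changed)
```

Each noisy image would then be passed through the same feature extraction and classification steps as the clean testing images.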

Table 4 Comparative study of the CCPs by various methods (including classic ZMs, OFMs, BFMs, and ours) for D1 dataset contaminated by salt and pepper noise

Experiment 2

Twenty color butterfly images in dataset D2 with a size of 640 × 480 were used as the training set. A testing set of 1440 (72 × 20) images was generated in the same manner as in Experiment 1. The CCPs were obtained via our proposed method (MEFMs) and recent image moments (e.g., Fr-LM-, SOM-, and EFM-based methods). Table 5 summarizes the classification results. For the texture image classification, the proposed image moments (MEFMs) had superior rotation and scaling invariance. Under simultaneous rotation and scaling attacks, the proposed MEFMs still maintained a higher classification accuracy than the recent image moments (i.e., Fr-LMs, SOMs, and EFMs).

Table 5 Comparison of the results of our approach with recently published results for the D2 dataset

Experiment 3

We further evaluated the image classification capability of the proposed method on a texture recognition task. In this experiment, the training set was formed in the same manner as in Experiment 1 by rotating each image in dataset D3 through angles of 0°, 45°, 90°, and 180°, giving 448 (112 × 4) training images. The testing set combined the RST effects, with translation (Δx, Δy) ∈ [−45, 45] and scaling factor α ∈ [0.5, 3], and included 4032 (36 × 112) images. For the testing set, the CCPs were obtained via our proposed method (MEFMs), EFMs, Fr-LMs, and SOMs. The testing set was divided into two groups: the first group was corrupted by salt and pepper noise with densities varying from 5% to 20% in steps of 5%, while the second was processed by a smoothing filter with different window sizes (Fig. 20 and Table 6). The CCPs of all these image moments tended to decrease as the density of the salt and pepper noise and the size of the smoothing window increased; however, the decrease was smallest for the proposed MEFMs. This result indicates that, of the four types of image moments, the proposed MEFMs exhibited the highest robustness to salt and pepper noise and to smoothing-filter operations.

Fig. 20

Comparative analysis of CCPs under different noise densities in experiment 3. a In lower-order moments. b In higher-order moments

Table 6 Comparative results of the proposed approach and other recently published methods for the D3 dataset processed by smoothing filter operations

Conclusions

This study introduced a new set of moments based on the modified exponent polynomials, called MEFMs. The main contributions of this study are as follows:

  (1) A new type of piecewise modified exponent polynomial, also known as the semi-orthogonal polynomial, was derived. The derived polynomial is a transformed version of the classical exponent polynomial.

  (2) To build a series of numerically stable image moments of different orders for image reconstruction and pattern recognition, a new method of time–frequency correspondence was proposed, which can improve the image reconstruction quality and the accuracy of image recognition.

  (3) We proposed a new method for RST invariant recognition and compared it with the traditional moment invariant-based method. Our approach is more practical and effective for geometric invariant recognition.

In future work, we will search for superior semi-orthogonal image moments for local feature extraction in image analysis, because non-moment-based methods (e.g., [47, 48]) can effectively extract the local features of an image.

Availability of data and materials

(1) The datasets generated in our experiments are available from the Coil-20 and Coil-100 image databases and the butterfly image database: http://www.cs.columbia.edu/CAVE/databases/ and http://cs.cqupt.edu.cn/info/1078/4189.htm.

(2) The datasets used or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

BFMs:

Bessel–Fourier moments

CCPs:

Correct classification percentages

EFMs:

Exponent-Fourier moments

Fr-LMs:

Fractional order Legendre moments

Fr-ZMs:

Fractional order Zernike moments

GHMs:

Gaussian–Hermite moments

LMs:

Legendre moments

MEFMs:

Modified exponent-Fourier moments

NaN:

Not a number

OFMs:

Orthogonal Fourier–Mellin moments

PCT:

Polar cosine transform

PHT:

Polar harmonic transform

PST:

Polar sine transform

RST:

Rotation, scaling, and translation

SNIRE:

Statistical-normalization image-reconstruction error

SOM:

Semi-orthogonal moments

ZMs:

Zernike moments

References

  1. 1.

    B. Honarvar, R. Paramesran, C.L. Lim, Image reconstruction from a complete set of geometric and complex moments [J]. Signal Processing 98(2), 224–232 (2014)

  2. 2.

    B. Yang, M. Dai, Image reconstruction from continuous Gaussian–Hermite moments implemented by discrete algorithm [J]. Pattern Recognition 45(4), 1602–1616 (2012)

  3. 3.

    Y.D. Qu, C.S. Cui, S.B. Chen, et al., A fast subpixel edge detection method using Sobel–Zernike moments operator [J]. Image & Vision Computing 23(1), 11–17 (2005)

  4. 4.

    S. Ghosal, R. Mehrotra, A moment-based unified approach for image feature detection [J]. IEEE Transactions on Image Processing 6(6), 781–793 (1997)

  5. 5.

    A. Hmimid, M. Sayyouri, H. Qjidaa, Fast computation of separable two-dimensional discrete invariant moments for image classification [J]. Pattern Recognition 48(2), 509–521 (2015)

  6. 6.

    Z. Shao, Y. Shang, Y. Zhang, et al., Robust watermarking using orthogonal Fourier–Mellin moments and chaotic map for double images [J]. Signal Processing 120, 522–531 (2016)

  7. 7.

    X.Y. Wang, Q.L. Shi, S.M. Wang, et al., A blind robust digital watermarking using invariant exponent moments [J]. AEUE-International Journal of Electronics and Communications 70(4), 416–426 (2016)

  8. 8.

    B. Xiao, G. Lu, Y. Zhang, et al., Lossless image compression based on integer discrete Tchebichef transform [J]. Neurocomputing 214(C), 587–593 (2016)

  9. 9.

    A. Khotanzad, Y.H. Hong, Invariant image recognition by Zernike moments [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence 12(5), 489–497 (1990)

  10. 10.

    M.S. Choi, W.Y. Kim, A novel two stage template matching method for rotation and illumination invariance [J]. Pattern Recognition 35(1), 119–129 (2002)

  11. 11.

    H. Zhang, H. Shu, G. Coatrieux, et al., Affine Legendre moment invariants for image watermarking robust to geometric distortions [J]. IEEE Transactions on Image Processing 20(8), 2189–2199 (2011)

  12. 12.

    H. Zhang, H. Shu, G.N. Han, et al., Blurred image recognition by Legendre moment invariants [J]. IEEE Transactions on Image Processing 19(3), 596–611 (2010)

  13. 13.

    A.W. Deng, C.H. Wei, C.Y. Gwo, Stable, fast computation of high-order Zernike moments using a recursive method [J]. Pattern Recognition 56(C), 16–25 (2016)

  14. 14.

    C. Singh, R. Upneja, Accurate calculation of high order pseudo-Zernike moments and their numerical stability [J]. Digital Signal Processing 27(1), 95–106 (2014)

  15. 15.

    R.R. Galigekere, W. Holdsworth, M.N.S. Swamy, et al., Moment patterns in the Radon space [J]. Optical Engineering 39(4), 1088–1097 (2000)

  16. 16.

    B. Xiao, J.T. Cui, H.X. Qin, et al., Moments and moment invariants in the Radon space [J]. Pattern Recognition 48(9), 2772–2784 (2015)

  17. 17.

    C. Singh, A. Aggarwal, A comparative performance analysis of DCT-based and Zernike moments-based image up-sampling techniques [J]. Optik-International Journal for Light and Electron Optics 127(4), 2158–2164 (2016)

  18. 18.

    Z. Shao, H. Shu, J. Wu, et al., Quaternion Bessel–Fourier moments and their invariant descriptors for object reconstruction and recognition [J]. Pattern Recognition 47(2), 603–611 (2014)

  19. 19.

    J.S. Guf, W.S. Jiang, The Haar wavelets operational matrix of integration [J]. International Journal of Systems Science 27(7), 623–628 (1996)

  20. 20.

    H.J.A. Ferrer, I.D. Verduzco, E.V. Martinez, Fourier and Walsh digital filtering algorithms for distance protection [J]. IEEE Transactions on Power Systems 11(1), 457–462 (1996)

  21. 21.

    M. Hu, Visual pattern recognition by moment invariants [J]. IRE Transactions on Information Theory 8(2), 179–187 (1962)

  22. 22.

    A. Abo-Zaid, O.R. Hinton, E. Horne, About moment normalization and complex moment descriptors [C]// Paper presented at the International Conference on Pattern Recognition. Berlin: Springer-Verlag Berlin Heidelberg, pp. 399–409 (1988)

  23. 23.

    J. Flusser, On the independence of rotation moment invariants [J]. Pattern Recognition 33(9), 1405–1410 (2000)

  24. 24.

    Z. Ping, H. Ren, J. Zou, et al., Generic orthogonal moments: Jacobi–Fourier moments for invariant image description [J]. Journal of Optoelectronics Laser 40(4), 1245–1254 (2007)

  25. 25.

    Z. Ping, R. Wu, Y. Sheng, Image description with Chebyshev-Fourier moments [J]. Journal of the Optical Society of America, Optics Image, Science, & Vision 19(9), 1748–1754 (2002)

  26. 26.

    H. Ren, A. Liu, J. Zou, et al., Character reconstruction with radial-harmonic-Fourier moments [C]// Paper presented at international conference on Fuzzy Systems and Knowledge Discovery (IEEE Computer Society, 2007), pp. 307–310

  27. 27.

    G. Gao, G. Jiang, Bessel–Fourier moment-based robust image zero-watermarking [J]. Multimedia Tools & Applications 74(3), 841–858 (2015)

  28. 28.

    B. Xiao, G.Y. Wang, W.S. Li, Radial shifted Legendre moments for image analysis and invariant image recognition [J]. Image & Vision Computing 32(12), 994–1006 (2014)

  29. 29.

    B. Fu, J. Zhou, Y. Li, et al., Image analysis by modified Legendre moments [J]. Pattern Recognition 40(2), 691–704 (2007)

  30.

    R. Mukundan, S.H. Ong, P.A. Lee, Image analysis by Tchebichef moments [J]. IEEE Transactions on Image Processing 10(9), 1357–1364 (2001)

  31.

    P.T. Yap, R. Paramesran, S.H. Ong, Image analysis by Krawtchouk moments [J]. IEEE Transactions on Image Processing 12(11), 1367–1377 (2003)

  32.

    P.T. Yap, R. Paramesran, S.H. Ong, Image analysis using Hahn moments [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence 29(11), 2057–2062 (2007)

  33.

    H. Zhu, H. Shu, J. Liang, et al., Image analysis by discrete orthogonal Racah moments [J]. Signal Processing 87(4), 687–708 (2007)

  34.

    B. Xiao, L. Li, Y. Li, et al., Image analysis by fractional-order orthogonal moments [J]. Information Sciences 382–383, 135–149 (2017)

  35.

    J.W. Yang, D.J. Jin, Z.D. Lu, Fractional order Zernike moment [J]. Journal of Computer-Aided Design & Computer Graphics 29(3), 480–484 (2017)

  36.

    B. Chen, M. Yu, Y.Q. Su, et al., Fractional quaternion Zernike moments for robust color image copy-move forgery detection [J]. IEEE Access 6(C), 56637–56646 (2018)

  37.

    B.J. Chen, X.M. Qi, X.M. Sun, et al., Quaternion pseudo-Zernike moments combining both of RGB information and depth information for color image splicing detection [J]. Journal of Visual Communication and Image Representation 49(C), 283–290 (2017)

  38.

    X. Wang, T. Yang, F. Guo, Image analysis by circularly semi-orthogonal moments [J]. Pattern Recognition 49(C), 226–236 (2016)

  39.

    H.T. Hu, Y.D. Zhang, et al., Orthogonal moments based on exponent functions: exponent-Fourier moments [J]. Pattern Recognition 47(8), 2596–2606 (2014)

  40.

    B. Xiao, J.F. Ma, X. Wang, Image analysis by Bessel–Fourier moments [J]. Pattern Recognition 43(8), 2620–2629 (2010)

  41.

    P.T. Yap, X. Jiang, A.C. Kot, Two-dimensional polar harmonic transforms for invariant image representation [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence 32(7), 1259–1270 (2010)

  42.

    Coil-20 and coil-100 image database, http://www.cs.columbia.edu/CAVE/databases. Accessed 30 June 2017

  43.

    B. Xiao, G. Lu, T. Zhao, et al., Rotation, scaling, and translation invariant texture recognition by Bessel-Fourier moments [J]. Pattern Recognition & Image Analysis 26(2), 302–308 (2016)

  44.

    Z.L. Ping, Y.J. Jiang, S.H. Zhou, et al., FFT algorithm of complex exponent moments and its application in image recognition (Proceedings of the SPIE International Society for Optical Engineering, Athens, 2014)

  45.

    N. Otsu, A threshold selection method from gray-level histograms [J]. IEEE Transactions on Systems, Man, and Cybernetics 9(1), 62–66 (1979)

  46.

    Butterfly images Database, http://cs.cqupt.edu.cn/info/1078/4189.htm. Accessed 15 May 2015

  47.

    J.W. Han, X. Ji, X.T. Hu, et al., Representing and retrieving video shots in human-centric brain imaging space [J]. IEEE Transactions on Image Processing 22(7), 2723–2736 (2013)

  48.

    T. Zhang, L. Guo, K.M. Li, et al., Predicting functional cortical ROIs via DTI-derived fiber shape models [J]. Cerebral Cortex 22(4), 854–864 (2012)

Acknowledgements

The authors would like to thank the anonymous referees for their valuable comments and suggestions.

Funding

This work was supported by the National Natural Science Foundation of China (Grant Nos. 61472298 and 61702403), the Fundamental Research Funds for the Central Universities (Grant Nos. JB170308 and JBF180301), the China Postdoctoral Science Foundation (Grant No. 2018M633473), the Basic Research Project of the Weinan Science and Technology Bureau (Grant No. ZDYF-JCYJ-17), and the project of the Shaanxi Provincial supported discipline (mathematics).

Author information

JC developed the idea for the study of modified exponent-Fourier moments and contributed the central idea of the manuscript; BH analyzed the properties of the proposed image moments, analyzed most of the data, and wrote the initial draft of the paper; the remaining authors contributed to refining the ideas, carrying out additional analyses, and finalizing the paper. All authors were involved in writing the manuscript. All authors read and approved the final manuscript.

Authors’ information

Bing He was born in 1982. He received the B.S. and M.S. degrees in Communication Engineering and Electrical Engineering from Northwestern Polytechnical University (NPU) and Shaanxi Normal University (SNNU), Xi’an, China, in 2006 and 2009, respectively. He is now an associate professor, is pursuing the Ph.D. degree at Xidian University, Xi’an, China, and is a member of ACM. His research interests include image processing, object recognition, and digital watermarking.

Jiang-Tao Cui was born in 1975. He received his B.S., M.S. and Ph.D. degrees in Computer Science and Technology from Xidian University, Xi’an, China in 1998, 2001, and 2005, respectively. He is currently a professor at the School of Computer Science and Technology in Xidian University, and also a member of IEEE. His research interests include image and video processing, pattern recognition, and high-dimensional indexing.

Bin Xiao was born in 1982. He received his B.S. and M.S. degrees in Electrical Engineering from Shaanxi Normal University, Xi’an, China, in 2004 and 2007, and received his Ph.D. degree in computer science from Xidian University, Xi’an, China. He is now working as a professor at Chongqing University of Posts and Telecommunications, Chongqing, China. His research interests include image processing, pattern recognition, and digital watermarking.

Yan-Guo Peng was born in 1986. He is currently a lecturer in Computer Architecture at Xidian University, Xi’an, China. He received his Ph.D. degree from Xidian University in 2016. His research interests include security issues in data management, privacy protection, and cloud security.

Correspondence to Jiangtao Cui.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Discretization calculation for scaling and rotation invariance of MEFMs

A digital image of size N × N is represented in Cartesian coordinates, so computing the \( {M}_{nm}^{LPM} \) of MEFMs requires converting the image between log-polar and Cartesian coordinates. Ping’s CEM computing method [44] is used here. We use \( {f}_{LP}\left(\rho, \theta \right) \) to denote an image in log-polar coordinates, \( {f}_c\left(x,y\right) \) to denote the image in Cartesian coordinates, and f[i, j] to denote the image at discrete coordinates. The log-polar coordinates (ρ, θ) can be converted to the Cartesian coordinates (x, y) using formula (33).

$$ x={e}^{\rho}\times \frac{N}{\sqrt{2}}\times \cos \theta; y={e}^{\rho}\times \frac{N}{\sqrt{2}}\times \sin \theta $$
(33)

The Cartesian coordinates (x, y) can be changed into discrete coordinates [i, j] with the following relationship:

$$ i=-\left\lceil y\right\rceil +\frac{N}{2}+1;j=\left\lceil x\right\rceil +\frac{N}{2} $$
(34)

The symbol \( \left\lceil \cdot \right\rceil \) denotes the smallest integer not less than its argument (x or y). With the help of Eqs. (33) and (34), the image function \( {f}_{LP}\left(\rho, \theta \right) \) in log-polar coordinates can be obtained as

$$ {f}_{LP}\left(\rho, \theta \right)=f\left[-\left\lceil {e}^{\rho}\times \frac{N}{\sqrt{2}}\times \sin \theta \right\rceil +\frac{N}{2}+1,\left\lceil {e}^{\rho}\times \frac{N}{\sqrt{2}}\times \cos \theta \right\rceil +\frac{N}{2}\right] $$
(35)
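
The mapping in Eqs. (33) and (34) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, N is assumed to be even, and the indices [i, j] are treated as 1-based, as Eq. (34) suggests.

```python
import math

def log_polar_to_cartesian(rho, theta, n):
    """Eq. (33): map log-polar (rho, theta) to Cartesian (x, y) for an n x n image."""
    radius = math.exp(rho) * n / math.sqrt(2)
    return radius * math.cos(theta), radius * math.sin(theta)

def cartesian_to_index(x, y, n):
    """Eq. (34): map Cartesian (x, y) to discrete indices [i, j] (assumed 1-based)."""
    i = -math.ceil(y) + n // 2 + 1  # row index; the y axis points upward
    j = math.ceil(x) + n // 2       # column index
    return i, j

# Example: half the maximum radius along the positive x axis of an 8 x 8 image.
x, y = log_polar_to_cartesian(math.log(0.5), 0.0, 8)
print(cartesian_to_index(x, y, 8))
```

Because the radius in Eq. (33) reaches N/√2 at ρ = 0, which exceeds N/2, some (ρ, θ) pairs map to indices outside the image. How such samples are handled is not stated here, so the sketch leaves bounds checking to the caller.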

We then let \( \varDelta \rho =\ln \frac{1}{K} \), \( \varDelta \theta =\frac{2\pi }{K} \), and \( {\rho}_u=\ln \frac{u}{K} \), u = 0, 1, …K − 1; \( {\theta}_v=\frac{2\pi v}{K} \), v = 0, 1, …K − 1.

Thus,

$$ {f}_{LP}\left[{\rho}_u,{\theta}_v\right]\approx f\left[-\left\lceil {e}^{\rho_u}\times \frac{N}{\sqrt{2}}\times \sin {\theta}_v\right\rceil +\frac{N}{2}+1,\left\lceil {e}^{\rho_u}\times \frac{N}{\sqrt{2}}\times \cos {\theta}_v\right\rceil +\frac{N}{2}\right] $$
(36)
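
As a rough sketch of the sampling in Eq. (36), the following Python function builds the K × K log-polar grid from a square image stored as a list of rows. The function name is hypothetical and two assumptions are made that the text does not state: samples whose indices fall outside the image are set to zero, and \( {e}^{\rho_u} \) is evaluated directly as u/K so that no logarithm of zero is needed at u = 0.

```python
import math

def sample_log_polar(image, k):
    """Sample a square image on the K x K grid of Eq. (36).

    image: list of N rows, each a list of N numbers (N assumed even).
    Returns f_lp with f_lp[u][v] approximating f_LP[rho_u, theta_v].
    """
    n = len(image)
    f_lp = [[0.0] * k for _ in range(k)]
    for u in range(k):
        # exp(rho_u) = u / K, so rho_0 = ln(0) never has to be evaluated.
        radius = (u / k) * n / math.sqrt(2)
        for v in range(k):
            theta = 2 * math.pi * v / k
            i = -math.ceil(radius * math.sin(theta)) + n // 2 + 1
            j = math.ceil(radius * math.cos(theta)) + n // 2
            # Assumption: samples outside the 1..N index range stay zero.
            if 1 <= i <= n and 1 <= j <= n:
                f_lp[u][v] = image[i - 1][j - 1]
    return f_lp

# Example: a 4 x 4 test image whose pixel value encodes its (row, column).
image = [[10 * r + c for c in range(4)] for r in range(4)]
grid = sample_log_polar(image, 4)
```

In this example every sample in the first row of `grid` comes from the same pixel, because the radius is zero at u = 0.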

In summary, integral Eq. (24) can be approximated as

$$ {M}_{nm}^{LPM}=\frac{1}{K^2}\sum \limits_{u=0}^{K-1}\sum \limits_{v=0}^{K-1}{f}_{LP}\left[{\rho}_u,{\theta}_v\right]w\left[u,v\right]{e}^{-\tilde{j}2 n\pi {\rho}_u}{e}^{-\tilde{j}m{\theta}_v}=\frac{1}{K^2}\sum \limits_{u=0}^{K-1}\sum \limits_{v=0}^{K-1}G\left[u,v\right]{e}^{-\tilde{j}2 n\pi \ln \frac{u}{K}}{e}^{-\tilde{j}2 m\pi \frac{v}{K}} $$
(37)

where \( G\left[u,v\right]={f}_{LP}\left[{\rho}_u,{\theta}_v\right]w\left[u,v\right] \) and \( w\left[u,v\right]=\left|{T}_n\left(\alpha, {\left[{\left(\frac{u-N/2}{N/2}\right)}^2+{\left(\frac{v-N/2}{N/2}\right)}^2\right]}^{\frac{1}{2}}\right)\right|{\left[{\left(\frac{u-N/2}{N/2}\right)}^2+{\left(\frac{v-N/2}{N/2}\right)}^2\right]}^{\frac{1}{2}} \).

Eq. (37) has the form of a 2D discrete Fourier transform of G[u, v]. Therefore, the scaling- and rotation-invariant MEFMs of an image can be computed with a 2D fast Fourier transform (FFT).
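
For reference, the double sum on the right-hand side of Eq. (37) can also be evaluated directly. The sketch below is illustrative only and is not the FFT-based method of [44]: the function name is hypothetical, G is passed in as a precomputed K × K list of rows, and the u = 0 term is skipped because ln(0/K) is undefined (an assumption, since the treatment of that term is not specified).

```python
import cmath
import math

def mefm_log_polar(g, n_order, m_order):
    """Directly evaluate the double sum of Eq. (37) for one pair (n, m).

    g: K x K list of rows holding G[u][v] (real or complex values).
    """
    k = len(g)
    total = 0j
    # Assumption: the u = 0 term is skipped because ln(0 / K) is undefined.
    for u in range(1, k):
        radial = cmath.exp(-2j * math.pi * n_order * math.log(u / k))
        for v in range(k):
            angular = cmath.exp(-2j * math.pi * m_order * v / k)
            total += g[u][v] * radial * angular
    return total / k**2

# Example: a constant G; only the m = 0 angular order is non-zero.
g = [[1.0] * 4 for _ in range(4)]
m00 = mefm_log_polar(g, 0, 0)
m01 = mefm_log_polar(g, 0, 1)
```

The inner sum over v is a length-K discrete Fourier transform of each row of G, which is the part that an FFT routine accelerates most directly. The direct sum costs O(K²) operations per moment, so it is mainly useful for checking a faster implementation on small inputs.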

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Keywords

  • Orthogonal moments
  • Modified exponent-Fourier moments
  • Image classification
  • Image reconstruction
  • Rotation-invariant