Absolute joint moments: a novel image similarity measure
© Kalinić et al.; licensee Springer. 2013
Received: 22 November 2012
Accepted: 3 April 2013
Published: 26 April 2013
In this paper, we propose a novel approach for estimating image similarity. This measure is of importance in assessing image correspondence or image alignment and plays an important role in image registration. Currently, this problem is approached rather one-dimensionally since most registration methods consider the problem as either mono- or multi-modal. This perspective leads to the selection of some form of either the correlation coefficient (CC) or mutual information (MI) as the image similarity measure (ISM). We propose a more generic framework for ISM construction, based on absolute joint moments, which can be considered as a generalization of CC. Within this framework, we propose a specific ISM that provides a different trade-off between MI and CC in terms of performance and computational cost for general registration problems. To illustrate this, we compared CC and MI with the proposed ISM and performed extensive experiments with regard to accuracy, robustness and speed. The evaluation demonstrated that the proposed measure, based on absolute joint moments, combines favourable properties of CC and MI with respect to speed and performance.
Image registration is an optimization process that utilizes a similarity measure to find the optimal alignment of two images. The registration accuracy depends on the selection of the optimization algorithm and geometric transformation as well as on the definition of the similarity measure. Therefore, it is essential to use a suitable similarity measure for a given problem. The selection of an image similarity measure, especially in the case of medical image registration, is usually reduced down to the question whether a multi- or mono-modal registration is required. This black-and-white perspective leads to well-known answers and results in the selection of correlation coefficient (CC) for mono-modal registration and mutual information (MI) for multi-modal registration.
However, the problem of registration can be approached from several perspectives, and often, the registration of the same two images can be performed using different methodological approaches. For example, Zitova et al. [1] distinguish not only a multi-modal approach but also a multi-view or multi-temporal approach. Similarly, Maintz et al. [2] classify registration not only from an intra-/inter-modality perspective but also from an intra-/inter-subject perspective. Perhaps for this reason, many other image similarity measures (ISMs) emerged over time (see Section 2).
In the following sections, we propose a wider framework for the construction of image similarity measures. Within this framework, we propose a specific ISM to show that, when constructed in this way, it combines beneficial properties of the two most used ones: CC and MI.
Whatever similarity measure is used for image registration, it has to satisfy one basic condition - at the exact location of the correct alignment of two images, the similarity (measure) has to be maximal. To find the maximum, the image registration incorporates an optimization algorithm, which iteratively calculates a similarity measure. The number of calculations may thus easily reach a few thousand, especially if the geometric transformation is non-linear. Therefore, the complexity of the similarity measure also plays an important role in registration methods. For this reason, in our experiments, we measure the performance of the similarity measure in terms of accuracy, robustness and speed.
The purpose of an ISM is to quantify the similarity between two images, usually referred to as source and target images. In this section, we will briefly discuss the properties of the two most prominent ISMs in order to compare them to the newly proposed one.
To set the notation, let us denote source and target images as S(x) and T(x), where x stands for the pixel coordinate vector. Further, the average value is denoted with a line above the function (e.g. $\bar{S}$), and p is used to denote the (joint) probability density function (e.g. $p_{TS}(x,y)$).
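For reference, the standard textbook forms of the two measures can be sketched in this notation as follows. They are assumed, not confirmed, to match Equations 1 and 2 of this paper. The intensity arguments t and s of the densities are symbols introduced here for illustration only.

```latex
\[
\mathrm{CC}(S,T) =
\frac{\sum_{x}\bigl(S(x)-\bar{S}\bigr)\bigl(T(x)-\bar{T}\bigr)}
     {\sqrt{\sum_{x}\bigl(S(x)-\bar{S}\bigr)^{2}\,\sum_{x}\bigl(T(x)-\bar{T}\bigr)^{2}}}
\]
\[
\mathrm{MI}(S,T) =
\sum_{t}\sum_{s} p_{TS}(t,s)\,\log\frac{p_{TS}(t,s)}{p_{T}(t)\,p_{S}(s)}
\]
```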
CC is an optimal choice if the pixel values between the images S(x) and T(x) are related by an affine function, even in the presence of a reasonable amount of noise. Since most mono-modal image registration problems assume this type of functional relationship between images, CC was considered a dedicated approach for this type of problem and is thus often the first choice.
All good properties, as well as the shortcomings, of MI and CC come directly from their definition. For example, CC is fast since it uses only summary statistics. The mean and standard deviation alone are sufficient to describe an affine functional relationship between pixel values, which makes CC a good and fast measure for registration of images with an affine relationship between their pixel values. Many image registration problems violate the assumption of an affine relationship between image pixel values, thus making CC a suboptimal choice for images with a more complex functional relationship between pixel values [3, 7]. On the other hand, the definition of MI includes much more statistical information about the relationship of the images S(x) and T(x). This property comes directly from the joint probability density function $p_{TS} = p(T,S)$, which fully characterizes the relationship between images T and S, without any assumptions on the type of functional relationship or any noise that may exist in the image acquisition process. Many of the disadvantages of MI come from the same source as its good properties: the joint probability density function (PDF). For example, since the joint PDF needs to be estimated/approximated, various problems can emerge, like sensitivity to sample size, number of histogram bins or interpolation [7, 8].
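The contrast between the two measures can be illustrated with a minimal sketch. It assumes paired pixel intensities given as flat lists scaled to [0, 1], the textbook Pearson form of CC and a plain histogram estimate of MI; the function names and the binning rule are illustrative choices, not the implementation used in the experiments.

```python
import math
from collections import Counter


def correlation_coefficient(s, t):
    """Pearson correlation coefficient of two equally long pixel lists."""
    n = len(s)
    mean_s = sum(s) / n
    mean_t = sum(t) / n
    cov = sum((a - mean_s) * (b - mean_t) for a, b in zip(s, t))
    var_s = sum((a - mean_s) ** 2 for a in s)
    var_t = sum((b - mean_t) ** 2 for b in t)
    return cov / math.sqrt(var_s * var_t)


def mutual_information(s, t, bins):
    """Histogram-based MI estimate (in nats); intensities assumed in [0, 1]."""
    n = len(s)

    def to_bin(value):
        # Assumption: equal-width bins, with the value 1.0 put in the last bin.
        return min(int(value * bins), bins - 1)

    joint = Counter((to_bin(a), to_bin(b)) for a, b in zip(s, t))
    hist_s = Counter(to_bin(a) for a in s)
    hist_t = Counter(to_bin(b) for b in t)
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        # p_ij / (p_i * p_j) written with raw counts.
        mi += p_ij * math.log(count * n / (hist_s[i] * hist_t[j]))
    return mi


# An intensity ramp, an affine copy of it and a symmetric non-linear copy.
ramp = [k / 255 for k in range(256)]
affine = [0.5 * v + 0.2 for v in ramp]
nonlinear = [(2 * v - 1) ** 2 for v in ramp]

print(correlation_coefficient(ramp, affine))      # close to 1
print(correlation_coefficient(ramp, nonlinear))   # close to 0
print(mutual_information(ramp, nonlinear, bins=8))  # clearly positive
```

For the affine copy, CC is essentially 1, whereas for the symmetric non-linear copy CC drops to about 0 although the second list is a deterministic function of the first. The histogram MI still detects that dependence, but its value changes with the `bins` argument.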
From this perspective, we can say that the main difference between CC and MI is whether the ISM incorporates (few) summary statistic values or all the statistical information in the form of a PDF. We can thus see that many measures proposed earlier in the literature are derivatives of these two most prominent ones. Roughly, we can classify them as either CC-based (e.g. [7, 9–16]) or MI-based (e.g. [17–23]); the latter are sometimes also referred to as information theory-based similarity measures (see [24–27]).
For this reason, we will primarily focus on CC and MI, investigate their properties and compare them to the ISM that we propose. In the following section, we will give some motivating examples to show that there is room for improvement besides the existing ISMs. Next, we aim to propose a framework for ISM construction which will utilize a chosen amount of statistical values instead of only a few (such as standard deviation, skewness, kurtosis). In this way, we aim to present a framework which will bridge the gap between CC and MI and select a measure from this framework to show that it incorporates the good sides of both of them. In a way, this paper can be seen as an extension of the idea presented in the paper by Kalinić et al. [28].
As will be shown in further sections, the performance of the MI may vary, depending on the number of bins selected to approximate the PDF. To distinguish one MI implementation from another, the number of bins is used as index (e.g. MI8 and MI256). The three examples shown in Figure 1 are given for each ISM implementation: CC, MI8 and MI256. The comparison between the images is done while applying three different geometric transformations: scaling, rotation and translation. The images which are to be aligned are constructed from the same template image selected from the test set (see Section 5.1) by simple noise addition, with $A_n/A_s = 0.5$ (for details about the noise degradation model, see Section 5.2). Since both images are constructed from the same template, the correct alignment is already known, i.e. the ISM should have a maximal value for unity scaling, zero rotation and zero translation.
In the first example, the image is consecutively translated by 1 pixel in the range of −30 to 30. In the second example, the image is consecutively rotated by 1° in the range of −30° to 30°. In the third example, the image is scaled by s=0.9+0.01·n where . Figure 1 shows the values for MI8, MI256 and CC calculated for the examples.
Notice that MI256 does not perform well in the first example, and none of the ISMs has the maximum at the correct location for both rotation and scaling. Therefore, an ISM defined in a different way might be able to produce a better result in these examples, but it then remains to be seen whether its better performance is specific to this particular image. Both questions will be addressed further on (see Section 7).
4 Absolute joint moments
In this section, we define source and target images as random variables and denote them as S and T instead of S(x) and T(x). The following notation is used: if $\mu_S = E[S]$, then $E[(S-\mu_S)^n]$ stands for the n-th central moment of the random variable S, or more generally, $E[(S-\mu_S)^n (T-\mu_T)^m]$ stands for the joint central moment of the order (m,n) of the random variables S and T.
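The moments defined above can be estimated from paired pixel values as in the following minimal sketch. It assumes flat lists of intensities and the 1/N (population) normalisation; the function names are illustrative and not taken from the paper.

```python
def joint_central_moment(s, t, n, m):
    """Sample estimate of E[(S - mu_S)^n (T - mu_T)^m] over paired pixels."""
    count = len(s)
    mu_s = sum(s) / count
    mu_t = sum(t) / count
    # Assumption: 1/N normalisation (population moment, no bias correction).
    return sum((a - mu_s) ** n * (b - mu_t) ** m for a, b in zip(s, t)) / count


def central_moment(s, n):
    """n-th central moment of S, obtained as the joint moment with m = 0."""
    return joint_central_moment(s, s, n, 0)


source = [0.0, 0.25, 0.5, 0.75, 1.0]
target = [0.1, 0.3, 0.5, 0.7, 0.9]

covariance = joint_central_moment(source, target, 1, 1)
sigma_s = central_moment(source, 2) ** 0.5
sigma_t = central_moment(target, 2) ** 0.5
print(covariance, covariance / (sigma_s * sigma_t))
```

The joint moment with n = m = 1 is the covariance, and dividing it by the product of the two standard deviations gives the correlation coefficient, which is 1 here because the toy target is an affine function of the source.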
4.1 Framework for constructing image similarity measures
where $\sigma_T$ and $\sigma_S$ denote standard deviations.
We refer to this as the absolute joint moment (AJM) framework for ISM construction. Notice that if we take only the first element of the sum and select $\omega_n = \omega_m = 1$, this reduces to the absolute covariance (numerator of CC from Equation 3).
4.2 Proposed image similarity measure
For the proposed image similarity measure, we use a specific selection of weights $\omega_n$ and $\omega_m$, which are computationally efficient and guarantee the convergence of the sum. Since this is just one of many possible selections of weights, we will denote the ISM as $\mathrm{AJM}_i$ to indicate that it is just one instance.
where N denotes the number of pixel pairs in $D_X$. In further sections, $\mathrm{AJM}_i$ will be the only ISM from this framework that is used and tested, so we will denote it simply as AJM.
We will show that this ISM, selected from the proposed framework, offers a different trade-off between speed and performance than MI and CC, which follows directly from the definition of AJM. To show this, in the following sections we investigate the properties of AJM and compare them with those of MI and CC with regard to robustness, accuracy and speed.
5 Experimental data
5.1 Data set
Images from publicly available databases were used to test the properties of the registration implemented using AJM as similarity measure. The test set was constructed so as to have as much diversity as possible. First, we used all 44 miscellaneous images from the SIPI database [29]. Since this database does not have medical images, all 19 medical images from the VIS database [30] were added to the set as well as 3 mammography images from the MIAS database [31]. From the MIAS set, only three images were selected since the variability of the images from this set is low. Finally, 34 images of different objects from the ALOI database [32] were added to the set, which made the total number of images in the testing set 100. The set constructed in this way contained images with different content, from natural to artificially constructed images. Both colour and greyscale images were represented in this set, but all images were converted to greyscale before the registration. All the images were coded with either 7 or 8 bits per pixel and the resolution of images ranged from 128 × 128 to 1,024 × 1,024. For the purpose of the paper, all images were converted to double-precision floating-point format and scaled to the interval [0,1].
5.2 Image degradation model
To evaluate the performance of the ISMs, we need a pair of images for each test image. Therefore, we introduced several degradation models that are applied to each image in order to simulate different effects that may occur during the image acquisition process, such as excessive noise, contrast changes or non-linear intensity distortion. The image degradation models are described in the following subsections and, as can be noticed, are inspired by the paper of Maes et al. [33].
5.2.1 Contrast inhomogeneity
with $(x_c, y_c)$ being the coordinates of the point around which the curve is positioned and $k_1$ being the distortion parameter.
To ensure that noise does not randomly affect the contrast, we selected a distribution with finite support. Thus, the uniform distribution seemed a natural choice as a noise degradation model. Notice that this simple degradation model becomes more complex when the sequential image degradation model is applied since that degradation affects the noise distribution as well. Additive uniform noise from the interval $[0, k_2 A_s]$ is superimposed on the original image. Here, $A_s$ stands for the amplitude of the signal and $k_2$ stands for the amplitude ratio between noise and signal ($A_n/A_s$).
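The noise model can be sketched as follows, assuming the image is a list of rows with intensities in [0, 1]. The function name, the `signal_amplitude` default and the optional `rng` argument are illustrative assumptions, and no renormalisation of the result is performed in this sketch.

```python
import random


def add_uniform_noise(image, k2, signal_amplitude=1.0, rng=None):
    """Superimpose additive uniform noise from [0, k2 * A_s] on every pixel."""
    rng = rng or random.Random()
    upper = k2 * signal_amplitude  # noise amplitude A_n = k2 * A_s
    return [[pixel + rng.uniform(0.0, upper) for pixel in row] for row in image]


image = [[0.2, 0.4], [0.6, 0.8]]
noisy = add_uniform_noise(image, k2=0.5, rng=random.Random(0))
print(noisy)
```

Passing a seeded `random.Random` instance makes the degradation reproducible, which is convenient when the same degraded image has to be compared under several ISMs.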
5.2.3 Non-linear intensity distortion
where $I_{xy}$ stands for the intensity level at position (x,y) and the remaining parameters are the roots of the polynomial that simulates the intensity distortion. After each distortion, the image is normalized to keep the original range of pixel values.
6 Experiments and results
In the first two experiments (Sections 6.1 and 6.2), the image pairs, between which the correct alignment is to be determined, are the original and the degraded image. Therefore, the gold standard is well known since the correct alignment is for unity scaling, and zero translation and rotation. To evaluate the performance of the ISM, an exhaustive search for the global maximum was done in order to assure that the suitability of the ISM, rather than the search strategy, is evaluated. Each ISM between image pairs is calculated for a progressive shift of 1 pixel in the interval [−100,100], for stepwise rotation of 1° in the interval [−180,179] and for scaling in steps of 0.01 to increase or decrease the scaling factor in the interval [0.5,2].
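The exhaustive search can be sketched for the translation case as below. This is a simplified illustration under stated assumptions: shifts are horizontal only, borders wrap around cyclically, CC stands in for an arbitrary ISM, and a known shift of 5 pixels is applied to the degraded image so that the result is non-trivial (in the setup described above, the expected answer is zero). All function names are hypothetical.

```python
import math
import random


def correlation_coefficient(s, t):
    """Pearson correlation coefficient of two equally long pixel lists."""
    n = len(s)
    mean_s, mean_t = sum(s) / n, sum(t) / n
    cov = sum((a - mean_s) * (b - mean_t) for a, b in zip(s, t))
    var_s = sum((a - mean_s) ** 2 for a in s)
    var_t = sum((b - mean_t) ** 2 for b in t)
    return cov / math.sqrt(var_s * var_t)


def shift_rows(image, dx):
    """Shift every row right by dx pixels (assumption: cyclic wrap-around)."""
    width = len(image[0])
    return [[row[(x - dx) % width] for x in range(width)] for row in image]


def flatten(image):
    return [value for row in image for value in row]


def exhaustive_translation_search(source, target, ism, max_shift):
    """Evaluate the ISM for every integer shift and return the best one."""
    flat_target = flatten(target)
    best_shift, best_value = None, -math.inf
    for dx in range(-max_shift, max_shift + 1):
        value = ism(flatten(shift_rows(source, dx)), flat_target)
        if value > best_value:
            best_shift, best_value = dx, value
    return best_shift, best_value


rng = random.Random(0)
original = [[rng.random() for _ in range(64)] for _ in range(16)]
# Degraded copy: shifted by 5 pixels, then corrupted by uniform noise.
degraded = [[v + rng.uniform(0.0, 0.5) for v in row]
            for row in shift_rows(original, 5)]

best_shift, best_value = exhaustive_translation_search(
    original, degraded, correlation_coefficient, max_shift=10)
print(best_shift, round(best_value, 3))
```

Because every candidate shift is evaluated, a failure to recover the known shift can be attributed to the ISM itself rather than to an optimizer stuck in a local maximum.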
6.1 Robustness test
Table 1 Parameter range for the robustness test (surviving parameter labels: $(x_c, y_c)$, $i_1 + i_3$)
As anticipated, this experiment showed that MI256 is not robust to noise and that CC is not robust to non-linear intensity distortions. For AJM, one can observe that it is affected by high non-linear intensity distortion. However, for a moderate distortion, it still performs satisfactorily. Therefore, AJM seems robust to noise and to moderate amounts of non-linear intensity distortion. AJM is also fairly robust to contrast inhomogeneity since its performance is comparable to that of the other ISMs. All of these conclusions hold for translation, rotation and scaling.
The results of the previous experiment could vary if a different error threshold ξ is selected. To estimate how these results would change for a different ξ, we performed the following test. Again, all three degradation models were applied and the images were aligned after three different transformations. However, the fixed parameters in this case are $k_1$, $k_2$ and $i_1$. The parameters are set to represent the largest degradation. The results are presented in Figure 5, where each row shows the graphs of the ISMs for a different distortion and each column corresponds to a different transformation used to achieve the correct image alignment. The graphs show how many images are aligned with an error lower than a certain amount and therefore give insight into the distribution of the ISM error for the image data set.
Figure 5 shows that the order of the ISMs for their relative performance would remain approximately the same no matter what error threshold (ξ) is selected. For all experiments (with the exception of the combination of rotation and contrast inhomogeneities), AJM is between CC and MI with regard to the overall number of correct alignments and sometimes even between the two different implementations of MI. This is to be expected from its theoretical properties.
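The curves described above can be computed as in the following minimal sketch. It assumes that each alignment yields one signed error value and that an image counts as aligned when the absolute error does not exceed the threshold; both the function name and the "not exceeding" convention are assumptions made for illustration, and the error values below are made up.

```python
def cumulative_alignment_counts(errors, thresholds):
    """For each threshold xi, count alignments with |error| <= xi."""
    return [sum(1 for e in errors if abs(e) <= xi) for xi in thresholds]


# Hypothetical alignment errors (in pixels) for five image pairs.
errors = [0, 0, -1, 3, 10]
thresholds = [0, 1, 2, 5]
print(cumulative_alignment_counts(errors, thresholds))
```

For these toy values, the counts are 2, 3, 3 and 4 for the four thresholds. Plotting such counts against ξ for each ISM gives one curve per measure, and the vertical order of the curves indicates the relative performance at every threshold.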
As expected, the experiments showed that no ISM could compete with the performance of MI for large non-linear intensity distortions. However, the use of higher order moments was helpful since AJM performs better than CC for non-linear intensity distortion. From the graphs, it is also clear that in a noiseless environment, MI256 performs better than MI8, but in a noisy environment, MI256 is not such a good choice.
6.2 Accuracy test
Table 2 Parameter range for the accuracy test (surviving parameter label: $(x_c, y_c)$)
As can be noticed in Table 2, the range of parameters is a bit larger in this test than in the previous one. Thus, higher contrast inhomogeneity is allowed as well as lower noise; similarly, non-linear image distortion is implemented as an n-th order polynomial, where n can range from 2 to 6. Notice that, according to the robustness test, increasing the range of these values does not favour AJM.
The average absolute error for translation, scaling and rotation
where $D_i$ is the scaling factor (deformation amount) for which the maximal ISM value is achieved. Index i stands for the measurement (image) number, and N denotes the total number of measurements (images). This is done so that the scaling error is symmetrical, i.e. it gives the same error for squeezing and stretching the image by the same factor. Also, it gives no error if the images are scaled by the same factor.
6.3 Speed test
Since CC, MI and AJM are implemented as defined in Equations 1, 2 and 6, respectively, we can notice that the computational complexity of all three measures is linear in the number of pixels, O(N). However, it is expected that CC will work faster than AJM since it has to calculate only a product and a ratio instead of a product of exponential functions. Similarly, we can expect that AJM is faster than MI since it does not require the histogram formation. To evaluate this, we measured the execution time of all three algorithms. For this purpose, the Matlab profiler was used.
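A comparable timing measurement can be sketched outside Matlab as follows. This is only a stand-in for the profiler used in the experiments: the helper name `best_time`, the choice of reporting the fastest of several runs and the simple covariance measure used to exercise the timer are all assumptions made for illustration.

```python
import time


def best_time(measure, s, t, repeats=5):
    """Smallest wall-clock time (seconds) of measure(s, t) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        measure(s, t)
        best = min(best, time.perf_counter() - start)
    return best


def covariance(s, t):
    """Stand-in measure used only to exercise the timer."""
    n = len(s)
    mean_s, mean_t = sum(s) / n, sum(t) / n
    return sum((a - mean_s) * (b - mean_t) for a, b in zip(s, t)) / n


pixels_s = [i / 9999 for i in range(10000)]
pixels_t = [1.0 - v for v in pixels_s]
print(f"covariance: {best_time(covariance, pixels_s, pixels_t):.6f} s")
```

Any callable that takes two pixel lists can be passed as `measure`, so the same helper can time several ISM implementations on identical data.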
Execution time measurements
7 Overall comparison
The total number of aligned image pairs for which AJM outperforms other ISMs (surviving column labels: MI8, MI256)
The experiments show that AJM is robust to noise, fairly robust to contrast inhomogeneities, more robust than CC and less robust than MI for non-linear intensity distortion. The robustness test also showed that MI256 is not very robust to noise. Overall, we can say that AJM is positioned between CC and MI in terms of robustness.
The accuracy test shows that AJM is less accurate than MI8 and overall better than MI256. This difference between different MI implementations also emphasizes how MI is affected by the number of bins of the histogram and the interpolation techniques, while AJM is not. Both MI and AJM outperform CC significantly, primarily due to the fact that CC cannot cope with non-linear intensity distortion. An additional strength of AJM is that it does not require a statistically significant number of pixels for the calculation of entropy, so it can be calculated for a smaller region compared to MI.
Performance of similarity measures (surviving row labels: non-linear intensity distortion, insensitive to number of bins, insensitive to interpolation)
As a general conclusion, the experiments have shown that the proposed ISM is able to determine the correspondence among images with complex relationships between the pixel values. It is also computationally more efficient than MI and does not have some of its inherent disadvantages. Therefore, $\mathrm{AJM}_i$, from the AJM framework for ISM construction, is complementary to CC and to the histogram-based approximation of MI, and the specific application will guide the optimal ISM choice.
HK received his PhD degree in Electrical Engineering from the University of Zagreb, Croatia. He is a research and teaching assistant at the University of Zagreb, Faculty of Electrical Engineering and Computing and participates in national and international research projects as a member of the Image Processing Group at the Department of Electronic Systems and Information Processing. His main field of interest is digital image processing and analysis. He is the author of five abstracts, seven conference papers and four journal papers. He is a program committee member or reviewer of four scientific conferences and one international journal.
SL received his BSc and MSc degrees in Electrical Engineering from the University of Zagreb, Zagreb, Croatia in 1985 and 1989, respectively, and his PhD degree in Electrical Engineering from the University of Cincinnati, Cincinnati, OH in 1994. From 1990 to 1994, he was a Fulbright Fellow and a Research Assistant with the University of Cincinnati. He was an assistant professor at the New Jersey Institute of Technology, Newark from 2001 to 2003. He has been a full professor at the University of Zagreb since 2006, where he was an assistant professor from 1996 to 2001 and an associate professor from 2001 to 2006. He was the Chair of the Department of Electronic Systems and Information Processing, Faculty of Electrical Engineering and Computing, University of Zagreb. He is a coauthor of more than 180 refereed journal and conference publications. He served as the Editor-in-Chief of the Journal of Computing and Information Technology. SL served as the IEEE Croatia section Chair from 2005 to 2008. He currently serves as the chapter’s Chair for IEEE Signal Processing Society.
BB is an ICREA research professor at the Department of Information and Communication Technologies of the Universitat Pompeu Fabra, Barcelona. While he graduated with a master’s degree in Electro-Mechanical Engineering Sciences, he obtained a PhD in Medical Sciences on advanced echocardiographic data acquisition and analysis for addressing clinical and pathophysiological questions. He became an associate professor at the Medical Faculty of the KU Leuven where he established a Cardiac Imaging Research Group. He extended his experience in a mixed research/hospital management position in St. George’s Hospital in London (UK) and initiated a new interdisciplinary Cardiology-Engineering research group in Zagreb (Croatia). Currently, he is performing multidisciplinary research in cardiovascular pathophysiology and image analysis/modelling, in collaboration with the major hospitals in Barcelona. His research interests are translational cardiovascular pathophysiology, focussing on assessing cardiac function and understanding and recognising the changes induced by disease and how treatment strategies can be used to modulate this remodelling. BB is the author of more than 130 peer-reviewed papers in international journals (both in medicine and engineering), more than 70 full proceedings papers in international conferences (mainly in biomedical engineering), over 360 abstracts at international conferences (mainly Medical) and more than 130 invited lectures.
This study was supported by the Ministry of Science, Education and Sports of the Republic of Croatia under grant 036-0362214-1989 and partially by the Subprograma de Proyectos de Investigación en Salud, Instituto de Salud Carlos III, Spain (FIS - PI11/01709).
1. Zitova B, Flusser J: Image registration methods: a survey. Image Vis. Comput 2003, 21(11):977-1000. 10.1016/S0262-8856(03)00137-9
2. Maintz J, Viergever M: A survey of medical image registration. Med. Image Anal 1998, 2: 1-36.
3. Sonka M, Fitzpatrick JM (Eds): Handbook of Medical Imaging, Volume 2. Medical Image Processing and Analysis. SPIE Press; 2000.
4. Viola P: Alignment by maximization of mutual information. PhD thesis, MIT. 1995. Co-supervisors: Lozano-Perez, Tomas and Atkeson, Christopher G
5. Collignon A, Maes F, Delaere D, Vandermeulen D, Suetens P, Marchal G: Automated multimodality image registration based on information theory. In Information Processing in Medical Imaging. Edited by: Bizais Y, Barillot C, Di Paola R. Kluwer, Dordrecht; 1995:263-274.
6. Viola P, Wells WM: Alignment by maximization of mutual information. Int. J. Comput. Vis 1997, 24: 137-154. 10.1023/A:1007958904918
7. Roche A, Malandain G, Pennec X, Ayache N: The correlation ratio as a new similarity measure for multimodal image registration. In MICCAI ’98, ed. by WM Wells, A Colchester, S Delp. Proceedings of the First International Conference on Medical Image Computing and Computer-Assisted Intervention. Lecture notes in computer science, vol. 1946 (Springer-Verlag, London, 1998), pp. 1115–1124
8. Pluim JPW, Maintz JBA, Viergever MA: Interpolation artefacts in mutual information-based image registration. Comput. Vis. Image Underst 2000, 77(9):211-232.
9. Arya KV, Gupta P, Kalra PK, Mitra P: Image registration using robust M-estimators. Pattern Recogn. Lett 2007, 28: 1957-1968. 10.1016/j.patrec.2007.05.006
10. Fitch A, Kadyrov A, Kittler J, Christmas W: Fast robust correlation. IEEE Trans. Image Process 2005, 14(8):1063-1073.
11. Foroosh H, Zerubia JB, Berthod M: Extension of phase correlation to subpixel registration. IEEE Trans. Image Process 2002, 11(3):188-200. 10.1109/83.988953
12. Kaneko S, Satoh Y, Igarashi S: Using selective correlation coefficient for robust image registration. Pattern Recognit 2003, 36(5):1165-1173. 10.1016/S0031-3203(02)00081-X
13. Keller Y, Averbuch A: A projection-based extension to phase correlation image alignment. Signal Process 2007, 87: 124-133. 10.1016/j.sigpro.2006.04.013
14. Periaswamy S, Farid H: Medical image registration with partial data. Med. Image Anal 2006, 10(3):452-464. 10.1016/j.media.2005.03.006
15. Yan H, Liu JG: Robust phase correlation based motion estimation and its applications. In Proceedings of the British Machine Vision Conference. Leeds; September 2008:1-4.
16. Hii AJH, Hann CE, Chase JG, Van Houten EEW: Fast normalized cross correlation for motion tracking using basis functions. Comput. Methods Prog. Biomed 2006, 82: 144-156. 10.1016/j.cmpb.2006.02.007
17. Maintz JBA, Meijering EHW, Viergever MA: General Multimodal Elastic Registration Based on Mutual Information. In Image Processing. SPIE, New York; 1998:144-154.
18. Luan H, Qi F, Xue Z, Chen L, Shen D: Multimodality image registration by maximization of quantitative-qualitative measure of mutual information. Pattern Recognit 2008, 41: 285-298. 10.1016/j.patcog.2007.04.002
19. Wang Y, Hu BG: Derivations of normalized mutual information in binary classifications. Fuzzy Syst. Knowl. Discov. Fourth Int. Conf 2009, 1: 155-163.
20. Cahill ND, Schnabel JA, Noble JA, Hawkes DJ: Overlap invariance of cumulative residual entropy measures for multimodal image alignment. In Medical Imaging 2009: Image Processing. Edited by: Pluim JPW, Dawant BM. SPIE, New York; 2009.
21. Loeckx D, Slagmolen P, Maes F, Vandermeulen D, Suetens P: Nonrigid image registration using conditional mutual information. Med Imaging. IEEE Trans 2010, 29: 19-29.
22. Thevenaz P, Unser M: Optimization of mutual information for multiresolution image registration. Image Process., IEEE Trans 2000, 9(12):2083-2099. 10.1109/83.887976
23. Warfield SK, Rexilius J, Huppi PS, Inder TE, Miller EG, Zientara GP, Jolesz FA, Kikinis R, Wells WM III: A binary entropy measure to assess nonrigid registration algorithms. In MICCAI ’01. Proceedings of the 4th International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer-Verlag, London, 2001), pp. 266–274
24. Crum WR, Hill DLG, Hawkes DJ: Information theoretic similarity measures in non-rigid registration. Inf Process, Med Imaging 2003, 18: 378-387.
25. Zachary J, Iyengar SS: Information theoretic similarity measures for content based image retrieval. J. Am. Soc. Inf. Sci. Technol 2001, 52: 856-857. 10.1002/asi.1139
26. Cazzanti L, Gupta M: Information-theoretic and set-theoretic similarity. IEEE International Symposium on Information Theory, Seattle July 2006 (IEEE, New York, 2006), pp. 1836–1840
27. Lin D: An information-theoretic definition of similarity. Proceedings of the 15th International Conference on Machine Learning (Morgan Kaufmann, San Francisco, 1998), pp. 296–304
28. Kalinić H, Lončarić S, Bijnens B: A novel image similarity measure for image registration. 7th International Symposium on Image and Signal Processing and Analysis, Dubrovnik, September 2011, ed. by D S vS Lončarić, G Ramponi (IEEE, New York, 2011), pp. 195–199
29. SIPI database. Accessed 22 April 2013. http://sipi.usc.edu/database
30. VIS database. Accessed 22 April 2013. http://vis-www.cs.umass.edu/~vislib/Medical/InfarctScan/images.html
31. MIAS database. Accessed 22 April 2013. http://peipa.essex.ac.uk/info/mias/
32. ALOI database. Accessed 22 April 2013. http://staff.science.uva.nl/~mark/aloi/aloi_grey_red4_view.tar
33. Maes F, Collignon A, Vandermeulen D, Marchal G, Suetens P: Multimodality image registration by maximization of mutual information. IEEE Trans. Med. Imaging 1997, 16: 187-198. 10.1109/42.563664
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.