- Open Access
Similarity measure for image resizing using SIFT feature
© Hua et al; licensee Springer. 2012
- Received: 7 July 2011
- Accepted: 23 April 2012
- Published: 23 April 2012
On the basis of the Scale Invariant Feature Transform (SIFT) feature, we study a distance measure for image resizing. We extract SIFT features from the original image and from the resized one, match the features between the two images, and calculate the distance between SIFT feature vectors to evaluate the degree of similarity between the original and the resized image. On the basis of this Euclidean distance measure, an effective image resizing algorithm combining Seam Carving with Scaling is proposed. We first resize an image using Seam Carving and calculate the similarity distance between the original image and its resized version. Before the salient objects and content are visibly damaged, we stop Seam Carving and hand the remaining resizing over to Scaling. Experiments show that our algorithm is able to avoid damage and distortion of image content and to preserve both the local structure and the global visual effect of the image.
- image resizing
- similarity measure
- SIFT feature
- Seam Carving
With the rapid development of multimedia techniques, different media networks, such as the internet, telecom, and digital TV networks, have become tightly interconnected. Media data need to be transferred among these networks and displayed at different resolutions or aspect ratios on various display devices. Image and video retargeting have therefore become an active research topic in computer vision and graphics.
Because they do not consider the structure and feature distribution of images, traditional content-oblivious methods have clear drawbacks. For example, if the aspect ratio is changed greatly, Scaling introduces obvious distortion, and Cropping, which only cuts pixels from the image periphery, is likely to discard important information distributed over the entire image. A sophisticated resizing algorithm should keep the salient and interesting regions intact as much as possible. Recently, content-aware methods such as Seam Carving, non-homogeneous warping, and patch transform were proposed; by considering important content, unimportant regions, image construction, or texture, they protect both the global visual effect and the local structures of the image. Although these methods resize images fairly well, some problems remain to be solved, such as image damage under large adjustment amounts, inefficient iterative or traversal computation, image distance measurement, and evaluation of the retargeting effect.
In the existing literature, the patch-based bidirectional similarity measure (BDS) and its improved version are widely used for evaluating the similarity between the original image and the retargeted one. However, owing to the computation of patch matching, this approach is quite inefficient. In addition, a user study by Rubinstein et al. demonstrated that the similarity magnitude given by BDS is in low agreement with human subjective perception. In this article, we investigate a distance measure for image resizing based on the Scale Invariant Feature Transform (SIFT) feature, and present an algorithm, combining Seam Carving with Scaling, which can be used to protect prominent and important objects efficiently. We extract SIFT features from a given image and its resized version, match the SIFT features of the given image to those of the result, and calculate the distance between SIFT features to evaluate the degree of similarity between the two images. On the basis of the SIFT feature distance measure, our image resizing algorithm first resizes the original image using Seam Carving step by step, and calculates the similarity distance between the original image and the resized one. Once the distance value exceeds a threshold, we abandon Seam Carving and hand the remaining resizing over to Scaling, so that both the local structures and the global visual effect of the image are preserved.
In summary, our main contributions are:
- We propose a distance measure between an image and its resized version based on the SIFT feature;
- We use the SIFT distance measure to assess the degree of distortion of resized images;
- We propose an effective image resizing algorithm, combining Seam Carving with Scaling, which uses the SIFT distance measure to preserve the salient information and the global visual effect.
The rest of this article is organized as follows: Section 2 introduces the background of image resizing. Section 3 shows the distance measure algorithms used in this article. In Section 4, we present an image resizing algorithm combining Seam Carving and Scaling. Section 5 describes the analysis and setting of the threshold. We compare the effects of our method with those of other algorithms and present some discussion in Section 6.
Up to now, various algorithms have been proposed for image and video retargeting, in which different aspects are taken into account for achieving desired results.
Algorithms based on salient and important information are a popular class of retargeting methods and can effectively preserve the visual consistency of important regions of an image. Zhang et al. employed shrinkability maps and a random walk model to improve the resizing efficiency with low storage requirements. Roberto et al. utilized a reduced linear model for image resizing, in which a combination of gradient information with visual saliency maps is used for evaluating the image energy. Because the visual saliency map involves information about color, intensity, and orientation, it does not perform well in some scenarios. Huang et al. proposed a framework for preserving the global structure in images and vector art. Their method formulates structure preservation as an optimization problem, and its accuracy relies on robust structure detection. Wang et al. presented a Scale-and-Stretch (SNS) warping method, which resizes an image by iteratively computing optimal local scaling factors for each local region and updating a warped image. In some cases, objects might be excessively distorted since the distortion is distributed over all spatial directions. Based on the conformal energy, Zhang et al. employed handles to describe the original image and minimized quadratic distortion energies to obtain a resized image, but their method cannot guarantee that edges are strictly preserved. Guo et al. constructed a mesh image representation and associated image saliency with the image mesh, then regarded the image structure as constraints for mesh parameterization. Owing to the emphasis on the relative scale of the salient object, nearby objects may be distorted.
Avidan and Shamir presented a greedy image resizing method called Seam Carving, which focuses on the unimportant regions and retains important content by removing or duplicating monotonic, one-pixel-wide, low-energy seams. However, if the low-gradient pixels in the required spatial direction have been exhausted, or interesting objects span the entire image, some interesting objects and important regions suffer from distortion, and the local structure and global layout might be destroyed. Building on Seam Carving, Rubinstein et al. improved it by using graph cut for image and video retargeting. By using a stream, a path several pixels wide, instead of a seam, Domingues et al. presented an improved algorithm called Stream Carving to increase the quality of the resized image. Mansfield et al. proposed a scene carving method that decomposes the image retargeting procedure into removing image content with minimal distortion and re-arranging known objects within the scene to maximize their visibility. Moreover, they introduced the visibility map for pixel removal, casting retargeting as a binary graph labeling problem to improve Seam Carving. Considering the distortion in both spatial and temporal dimensions, Grundmann et al. presented a discontinuous Seam Carving for video retargeting that processes the video frames sequentially and affords great flexibility. Dong et al. presented a resizing algorithm combining Seam Carving with Scaling. However, their algorithm needs to compute all possible combinations of the resizing amounts assigned to Seam Carving and Scaling, and then chooses the best ratio for resizing the image. Other similar approaches, which combine Seam Carving with Scaling, resize the image by using a modified energy function based on wavelet decomposition, by analyzing the cost of the next seam, or by using the importance value of the minimal seam.
By combining a region-of-interest-based technique with an extended Seam Carving, Kopf et al. proposed a video retargeting algorithm for reducing the distortion of straight lines and avoiding jitter in the target video. Chen and Luo proposed an approach for modeling dynamic visual attention to detect the focus of interest, defining visual cubes to determine a proper extent of salient regions for the global optimization. Their algorithm is able to keep the video's isotropic manipulation and the continuous dynamics of visual perception. In addition, patch-based methods [21–24] have also been presented for image retargeting or image summarization.
In general, seam removal introduces some artifacts and warping into the resized image. The more seams are removed, the heavier the distortion of the resulting image becomes. Ideally, an image resizing algorithm should check whether further seam removal will result in unacceptable distortion. A similarity or distance metric can therefore be used in image resizing to evaluate the retargeting effect. Similarity measurement between images is an important part of image analysis, broadly used for image retrieval, quality assessment, and visual tracking. For image summarization, Simakov et al. proposed a similarity measure that quantitatively captures the incompleteness and incoherence of the patches between the original image and the resized images. Rubinstein et al. provided a similarity measure between images termed bi-directional warping (BDW). It measures the similarity between every row (column) and then takes the maximum alignment error as the distance. Maalouf and Larabi defined a multi-scale bandelet-based perceptual similarity measure for image retargeting, which measures the geometric and perceptual similarities between two images to obtain a resulting image that contains as many of the geometric and perceptual features of the original image as possible.
In the literature [15, 21, 27], BDS is employed to calculate the distance between the retargeted image and the original, and to assess the resizing effect. This type of algorithm is time-consuming because of its iterative computation. This limitation becomes a bottleneck when applying the technique interactively (e.g., on portable devices). Besides, Rubinstein et al. compared a number of state-of-the-art retargeting methods and conducted a user study. Their subjective and objective analyses revealed that both BDS and BDW show low agreement with human perception, while image descriptors such as SIFT flow and Earth Mover's Distance are more suitable than patch-based distances for conveying local permissible changes.
The SIFT feature is a descriptor of a key point detected across multiple scale spaces of an image. Because it is invariant to image translation, scaling, and rotation, and supports robust matching under affine distortion, changes in illumination, and added noise, the SIFT feature is widely used for image matching and retrieval, pattern recognition, etc. The SIFT descriptor robustly captures structural properties of the image; it is more suitable than patch-based distances for conveying local permissible changes in content. SIFT feature points are distributed predominantly across regions where color and texture change. Hence, the SIFT feature vectors and their number change only mildly if only low-energy regions are carved out, and vary markedly once the salient objects get damaged. We can therefore use the SIFT feature to indicate the deformation of the resized image and its distance from the original.
We extract the SIFT features, using the SIFT algorithm by Lowe, from the original image and from its resized version. According to the vectors and the number of SIFT features, we calculate the distance of the resized image from the original image. There are two ways to do this.
3.1. The Euclidean distance between SIFT feature vectors
When the image dimension is reduced, the number of features in the resized image decreases. Calculating the Euclidean distances from the source features to the target ones reveals the difference between the two images clearly. As the adjustment amount grows, more source features become outliers and the distance values become larger. In particular, when the salient objects within the image get damaged, both the feature vectors and their number change heavily. Conversely, if the distance is computed from the target features to the source ones, features in the resized image can usually find appropriate matches in the original throughout the resizing process. The distance then changes only slightly and cannot sufficiently represent the degree of deformation in the resized image.
where m is the number of SIFT features of the original image S, n is the number of SIFT features of the resized image T, s_ik is the kth element of the ith feature vector of S, and t_jk is the kth element of the jth feature vector of T. Because D(S,T) is an average distance from all SIFT features of the original image to the target, it remains suitable when the number of detected feature points varies from image to image.
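For reference, a form of D(S,T) consistent with these definitions is sketched below. The nearest-neighbour minimum over j and the descriptor length d (128 for the standard SIFT descriptor) are assumptions inferred from the surrounding text, not a verbatim reproduction of the published equation.

```latex
D(S,T) = \frac{1}{m} \sum_{i=1}^{m} \min_{1 \le j \le n}
         \sqrt{\sum_{k=1}^{d} \left( s_{ik} - t_{jk} \right)^{2}}
```

Under this reading, the sum runs over the m source features only, which matches the source-to-target direction argued for above.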
3.2. The percentage of matched SIFT features
where Nm and Nt denote the number of matched SIFT feature pairs and the total number of SIFT features in the original image, respectively.
As with the Euclidean distance between SIFT features, the change in this percentage also expresses the degree of distortion. The smaller the percentage becomes, the larger the distortion within the resized image.
We notice that SIFT feature points are mainly located on prominent objects or edges (see Figure 1). Once prominent objects and important regions begin to be damaged, both the vectors and the number of SIFT features change markedly. However, compared with the Euclidean distance measure, it is difficult to find a consistent threshold for different images using the percentage of matched SIFT feature pairs. A possible reason lies in how matches are decided. For a SIFT feature point p_i in the original image, its matched point in the other image is chosen as follows: if the ratio of the distance to the closest neighbor over the distance to the second closest neighbor is less than 0.6, the closest feature point is taken as the matched point of p_i. Because of this hard decision, the change in percentage does not follow the change in SIFT features in a straightforward way. In this article, we therefore use the Euclidean distance between SIFT feature vectors as the similarity measure for image resizing.
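The two measures of this section can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: descriptors are assumed to be plain lists of floats, the distance is assumed to average the nearest-neighbour distance over the source features, the percentage is assumed to be 100·Nm/Nt, and all function names are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_nn_distance(source, target):
    """Mean, over source descriptors, of the distance to the closest target
    descriptor (assumed form of the source-to-target distance D(S,T))."""
    if not source or not target:
        raise ValueError("both descriptor lists must be non-empty")
    return sum(min(euclidean(s, t) for t in target) for s in source) / len(source)


def matched_percentage(source, target, ratio=0.6):
    """Percentage of source descriptors whose closest target descriptor passes
    the closest / second-closest ratio test (0.6 as quoted in the text)."""
    if not source or len(target) < 2:
        raise ValueError("need at least one source and two target descriptors")
    matched = 0
    for s in source:
        d1, d2 = sorted(euclidean(s, t) for t in target)[:2]
        if d1 < ratio * d2:
            matched += 1
    return 100.0 * matched / len(source)
```

The ratio test turns each feature into a yes/no vote, whereas the averaged distance changes continuously with the descriptors, which is one way to see why a single threshold is easier to find for the latter.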
(1) Extract the SIFT features from a given image;
(2) Resize the image by Seam Carving step by step, removing Δn seams at each step; extract the SIFT features from the resized image, and then calculate the Euclidean distance between the original image and the resized one;
(3) Check whether the distance value exceeds a threshold θ at the ith step. If it is less than θ, go to step (2) to continue; otherwise, go to step (4);
(4) Stop using Seam Carving and employ Scaling to resize the image of the (i - 1)th step directly to the final size.
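The control flow of these steps can be sketched as follows. This is a minimal sketch under stated assumptions: `carve`, `scale`, `distance`, and `size_of` are hypothetical caller-supplied callables standing in for a Seam Carving routine, a Scaling routine, the SIFT distance (including feature extraction), and a size query along the resized dimension; none of them is an API from the article.

```python
def single_direction_resize(image, target_size, theta, delta_n,
                            carve, scale, distance, size_of):
    """Carve delta_n seams per step while the distance to the original stays
    below theta, then scale the last accepted image to target_size.

    carve(img, k)        -> img with k seams removed       (hypothetical)
    scale(img, size)     -> img scaled to the given size   (hypothetical)
    distance(orig, img)  -> SIFT-based distance            (hypothetical)
    size_of(img)         -> current size along the axis    (hypothetical)
    """
    accepted = image  # result of step i-1
    while size_of(accepted) - delta_n >= target_size:
        candidate = carve(accepted, delta_n)
        if distance(image, candidate) >= theta:
            break  # step i reached the threshold: keep the step i-1 image
        accepted = candidate
    if size_of(accepted) != target_size:
        accepted = scale(accepted, target_size)  # Scaling finishes the job
    return accepted
```

Passing the operators in as callables keeps the sketch independent of any particular Seam Carving or SIFT library; the distance is always measured against the original image, not against the previous step.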
(1) Apply the single-directional resizing algorithm to a given image S in the width and height directions separately. Based on the Euclidean distance of the SIFT features, record only the resizing amounts L_w in width and L_h in height reached at the (i - 1)th step, at which the distance is still less than the threshold θ;
(2) By Seam Carving, remove L_w vertical seams and L_h horizontal seams with minimal energy along their optimal paths to obtain the resized image T;
(3) Scale the image T to the final size to obtain the desired image.
Following the above steps, we can resize the original image to the preferred size.
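A compact sketch of this bi-directional procedure is given below, assuming the image is being reduced in both directions. The callables `carve_w`, `carve_h`, `scale`, `distance`, and `size_of` are hypothetical placeholders for the actual operators, and the helper name `safe_seam_count` is introduced only for illustration.

```python
def safe_seam_count(image, theta, delta_n, max_seams, carve, distance):
    """Largest multiple of delta_n seams (at most max_seams) whose removal
    keeps distance(original, carved) below theta."""
    removed = 0
    while removed + delta_n <= max_seams:
        candidate = carve(image, removed + delta_n)
        if distance(image, candidate) >= theta:
            break
        removed += delta_n
    return removed


def bidirectional_resize(image, target_w, target_h, theta, delta_n,
                         carve_w, carve_h, scale, distance, size_of):
    """Find L_w and L_h independently on the original image, carve both
    amounts, then scale the result T to the final size."""
    width, height = size_of(image)
    l_w = safe_seam_count(image, theta, delta_n, width - target_w,
                          carve_w, distance)
    l_h = safe_seam_count(image, theta, delta_n, height - target_h,
                          carve_h, distance)
    carved = carve_h(carve_w(image, l_w), l_h)  # image T
    return scale(carved, target_w, target_h)
```

Both seam counts are measured on the original image S, so the order in which the two directions are probed does not affect L_w or L_h.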
The threshold θ is a significant parameter. Different values yield different retargeting effects. To obtain a suitable threshold, we perform image resizing and conduct a statistical analysis on the RetargetMe benchmark, which involves 80 images. First, we fix the aspect ratio of each image and adjust its horizontal dimension to 200, then adopt the L2 norm of the gradient for evaluating the image energy.
For all images in the benchmark, image resizing is implemented by Seam Carving, and at each step we carve out five seams to facilitate the estimation of the degree of distortion. If fewer than five seams are removed per step, the increment of distortion at each step is not obvious and it is difficult to find the crucial step; conversely, if many more than five are removed, the slope of the segment corresponding to the onset of salient object damage is weakened. The Euclidean distance values described in Section 3 are then obtained. We compute the differences in distance value between adjacent computation points of seam removal and find the maximum of these differences. This maximum is associated with two distance values. We focus on the larger of the two, since the salient object may begin to get damaged at this distance value, Dcritical. We obtain Dcritical for each image in the benchmark.
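The selection of Dcritical described above can be sketched as follows. This is an illustrative sketch, assuming the distances are listed in the order in which they were measured (one value per group of removed seams) and that the "difference" is the signed increase between adjacent measurements; the function name is hypothetical.

```python
def critical_distance(distances):
    """Return (index, value) of the larger endpoint of the biggest jump
    between adjacent distance measurements."""
    if len(distances) < 2:
        raise ValueError("need at least two measurements")
    # Signed increase between each pair of adjacent measurements.
    diffs = [b - a for a, b in zip(distances, distances[1:])]
    i = max(range(len(diffs)), key=diffs.__getitem__)
    # The jump lies between measurements i and i+1; keep the larger one.
    j = i if distances[i] > distances[i + 1] else i + 1
    return j, distances[j]
```

With five seams per step and the first measurement taken after the first step, index j would correspond to 5·(j+1) removed seams.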
Table 1 The Dcritical values corresponding to the maximum difference, with the removal seams associated with Dcritical (table body not reproduced)
From the data in Table 1 we calculate the mean μ = 0.204173 and the standard deviation σ = 0.054302. If we choose the 60% area under the bell curve between (μ-xσ) and (μ+xσ), so that the area from (μ-xσ) to +∞ is 80%, then x = 0.84 is obtained from the standard normal distribution table. The lower limit is then (μ-xσ) = 0.15856. This means that if the threshold θ is set to 0.15856, then according to the probability distribution 80% of the images would avoid being damaged. Hence, in this article, we select θ = 0.159 for a 200 × 133 image. Following the same approach, we obtain θ = 0.133 for a 500 × 332 image.
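The arithmetic behind this threshold can be checked with a short script. It is a sketch that assumes, as the text does, that the Dcritical values are approximately normally distributed; the helper name `lower_threshold` is hypothetical, and `statistics.NormalDist` from the Python standard library replaces the table lookup.

```python
from statistics import NormalDist


def lower_threshold(mu, sigma, x):
    """Lower limit mu - x*sigma of the chosen interval."""
    return mu - x * sigma


mu, sigma = 0.204173, 0.054302   # mean and standard deviation quoted in the text

x_table = 0.84                   # value read from the normal table in the text
theta = lower_threshold(mu, sigma, x_table)

# Quantile of the standard normal distribution at 0.80, which is what the
# table lookup approximates.
x_exact = NormalDist().inv_cdf(0.80)

print(round(theta, 5), round(x_exact, 4))
```

With the tabulated x = 0.84 the lower limit rounds to the θ = 0.159 used for 200 × 133 images; using the unrounded quantile shifts the limit only in the fourth decimal place.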
We find that content in the height (width) direction suffers more damage under the W-H (H-W) algorithm. A possible reason is that the structures of the image have been changed after a single-directional resizing. In this scenario, the threshold θ is no longer suitable for the amount to be removed in the remaining direction. However, the proposed algorithm generally achieves a better global visual effect, as shown in Figure 6f.
According to the SIFT algorithm, the extracted vectors and the number of SIFT features of a given image vary with its scale. Consequently, the same image at different scales requires a different threshold. For images with various resolutions, two approaches can be taken: (1) Resize the image using the threshold corresponding to its resolution. Through experiments and statistical analysis, a suitable threshold for that scale can be selected. (2) Normalize the input to a uniform scale, resize the image with the threshold for that uniform scale, and then return to the original scale.
In this article, we study a distance measure between an image and its resized version based on the SIFT feature vector. Based on this distance measure, we propose a rapid image resizing algorithm combining Seam Carving with Scaling. The algorithm can avoid disorder and distortion of image content and preserve both the important regions and the global visual effect of the original image.
In future work, we will further investigate an adaptive combined resizing algorithm that chooses the best scheme for different scenarios and searches for an adaptive threshold for various resolutions. Moreover, we will investigate an evaluation method that shows high agreement with subjective perception for assessing the resizing effect of different algorithms.
- Rubinstein M, Gutierrez D, Sorkine O, Shamir A: A comparative study of image retargeting. ACM Trans Graph 2010, 29(6):Article number 160.
- Zhang YF, Hu SM, Martin RR: Shrinkability maps for content-aware video resizing. Comput Graph Forum 2008, 27(7):1797-1804. 10.1111/j.1467-8659.2008.01325.x
- Roberto G, Ardizzone E, Pirrone R: Real-time content-aware image resizing using reduced linear model. Proceedings of 2010 IEEE 17th International Conference on Image Processing, Hong Kong 2010, 2813-2816.
- Itti L, Koch C, Niebur E: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans Pattern Anal Mach Intell 1998, 20(11):1254-1259. 10.1109/34.730558
- Huang QX, Mech R, Carr N: Optimizing structure preserving embedded deformation for resizing images and vector art. Comput Graph Forum 2009, 28(7):1887-1896. 10.1111/j.1467-8659.2009.01567.x
- Wang YS, Tai CL, Sorkine O, Lee TY: Optimized scale-and-stretch for image resizing. ACM Trans Graph 2008, 27(5):Article number 118.
- Zhang GX, Cheng MM, Hu SM, Martin RR: A shape-preserving approach to image resizing. Comput Graph Forum 2009, 28(7):1897-1906. 10.1111/j.1467-8659.2009.01568.x
- Guo Y, Liu F, Shi J, Zhou ZH, Gleicher M: Image retargeting using mesh parameterization. IEEE Trans Multimed 2009, 11(5):856-867.
- Avidan S, Shamir A: Seam carving for content-aware image resizing. ACM Trans Graph 2007, 26(3):10-18. 10.1145/1276377.1276390
- Rubinstein M, Shamir A, Avidan S: Improved seam carving for video retargeting. ACM Trans Graph 2008, 27(3):Article number 16.
- Domingues D, Alahi A, Vandergheynst P: Stream carving: an adaptive seam carving algorithm. Proceedings of 2010 IEEE 17th International Conference on Image Processing, Hong Kong 2010, 901-904.
- Mansfield A, Gehler P, Van Gool L, Rother C: Scene carving: scene consistent image retargeting. 11th European Conference on Computer Vision (ECCV), Heraklion, Crete, Greece 2010, 143-156.
- Mansfield A, Gehler P, Van Gool L, Rother C: Visibility maps for improving seam carving. European Conference on Computer Vision (ECCV), Heraklion, Crete, Greece 2010.
- Grundmann M, Kwatra V, Han M, Essa I: Discontinuous seam-carving for video retargeting. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA 2010, 569-576.
- Dong WM, Zhou N, Paul JC, Zhang XP: Optimized image resizing using seam carving and scaling. ACM Trans Graph 2009, 28(5):Article number 125.
- Han JW, Choi KS, Wang TS, Cheon SH, Ko SJ: Improved seam carving using a modified energy function based on wavelet decomposition. The 13th IEEE International Symposium on Consumer Electronics (ISCE2009), Kyoto, Japan 2009, 38-41.
- Kopf S, Kiess J, Lemelson H, Effelsberg W: FSCAV: fast seam carving for size adaptation of videos. MM'09: Proceedings of the Seventeen ACM International Conference on Multimedia, New York, NY, USA 2009, 321-330.
- Hwang DS, Chien SY: Content-aware image resizing using perceptual seam carving with human attention model. IEEE International Conference on Multimedia and Expo (ICME2008), Hannover, Germany 2008, 1029-1032.
- Kopf S, Haenselmann T, Kiess J, Guthier B, Effelsberg W: Algorithms for video retargeting. Multimed Tools Appl 2011, 51(2):819-861. 10.1007/s11042-010-0717-6
- Chen DY, Luo YS: Content-aware video resizing based on salient visual cubes. J Visual Commun Image Represent 2011, 22(3):226-236. 10.1016/j.jvcir.2010.12.003
- Simakov D, Caspi Y, Shechtman E, Irani M: Summarizing visual data using bidirectional similarity. IEEE Conference on Computer Vision and Pattern Recognition 2008 (CVPR 2008), Anchorage, AK, USA 2008, 1-8.
- Cho TS, Butman M, Avidan S, Freeman WT: The patch transform and its applications to image editing. IEEE Conference on Computer Vision and Pattern Recognition 2008 (CVPR 2008), Anchorage, AK, USA 2008, 1-8.
- Pritch Y, Kav-Venaki E, Peleg S: Shiftmap image editing. ICCV'09: IEEE 12th International Conference on Computer Vision 2009, Kyoto, Japan 2009, 151-158.
- Barnes C, Shechtman E, Finkelstein A, Goldman DB: Patchmatch: a randomized correspondence algorithm for structural image editing. ACM Trans Graph 2009, 28(3):Article number 24.
- Rubinstein M, Shamir A, Avidan S: Multi-operator media retargeting. ACM Trans Graph 2009, 28(3):Article number 23.
- Maalouf A, Larabi MC: Image retargeting using a bandelet-based similarity measure. Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing, Dallas, TX, USA 2010, 942-945.
- Hua SG, Li XX, Zhong Q: Similarity criterion for image resizing. EURASIP J Adv Signal Process 2011, 2011: 27. 10.1186/1687-6180-2011-27
- Liu C, Yuen J, Torralba A, Sivic J, Freeman WT: SIFT flow: dense correspondence across different scenes. Paper presented at European Conference on Computer Vision (ECCV) 2008, 3: 28-42.
- Pele O, Werman M: Fast and robust earth mover's distances. IEEE 12th International Conference on Computer Vision, Kyoto, Japan 2009, 460-467.
- Lowe DG: Distinctive image features from scale-invariant keypoints. Int J Comput Vis 2004, 60(2):91-110.
- RetargetMe - A Benchmark for Image Retargeting 2010. [http://people.csail.mit.edu/mrub/retargetme]
- Wolf L, Guttmann M, Cohen-Or D: Non-homogeneous content-driven video-retargeting. ICCV 2007: IEEE 11th International Conference on Computer Vision 2007, Rio de Janeiro, Brazil 2007, 1-6.
- Krähenbühl P, Lang M, Hornung A, Gross M: A system for retargeting of streaming video. ACM Transactions on Graphics (TOG) - Proceedings of ACM SIGGRAPH Asia 2009, 28(5).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.