Image Segmentation Method Using Thresholds Automatically Determined from Picture Contents
© Y. B. Chen and O. T.-C. Chen. 2009
- Received: 1 June 2008
- Accepted: 28 January 2009
- Published: 22 April 2009
Image segmentation has become an indispensable task in many image and video applications. This work develops an image segmentation method based on a modified edge-following scheme in which different thresholds are automatically determined for areas with varied contents in a picture, thus yielding suitable segmentation results in different areas. First, the iterative threshold selection technique is modified to calculate the initial-point threshold of the whole image or a particular block. Second, the quad-tree decomposition, which starts from the whole image, employs gray-level gradient characteristics of the currently processed block to decide whether to decompose it further. After the quad-tree decomposition, the initial-point threshold in each decomposed block is adopted to determine initial points. Additionally, the contour threshold is determined based on the histogram of gradients in each decomposed block. In particular, contour thresholds can eliminate inappropriate contours to increase the accuracy of the search and minimize the required searching time. Finally, the edge-following method is modified and then conducted based on initial points and contour thresholds to find contours precisely and rapidly. Using the Berkeley segmentation data set with realistic images, the proposed method is demonstrated to take the least computational time while achieving fairly good segmentation performance on various image types.
- Initial Point
- Segmentation Result
- Initial Contour
- Contour Point
- Texture Gradient
Image segmentation is an important signal processing tool that is widely employed in many applications including object detection, object-based coding [2–4], object tracking, image retrieval, and clinical organ or tissue identification. The methods used to accomplish segmentation in these applications can generally be classified as region-based and edge-based techniques. The region-based segmentation techniques such as semisupervised statistical region refinement, watershed, region growing, and Markov-random-field parameter estimation focus on grouping pixels into regions which have uniform properties like grayscale, texture, and so forth. The edge-based segmentation techniques such as the Canny edge detector, active contour, and edge following [14–16] emphasize detecting significant gray-level changes near object boundaries. Depending on how much user involvement they require, the above-mentioned methods can be further categorized as either supervised or unsupervised segmentation.
The advantage of the region-based segmentation is that the segmented results can have coherent regions, linking edges, no gaps from missing edge pixels, and so on. However, its drawback is that decisions about region memberships are often more difficult than those about edge detections. In the literature, the Semisupervised Statistical Region Refinement (SSRR) method developed by Nock and Nielsen segments an image with user-defined biases which indicate regions with distinctive subparts. SSRR is fairly accurate because the supervised segmentation is not easily influenced by noise, but it is highly time-consuming. The unsupervised DISCovering Objects in Video (DISCOV) technique developed by Liu and Chen can discover the major object of interest by an appearance model and a motion model. The watershed method, which is applicable to nonspecific image types, is also unsupervised [9, 17]. Implementations of the watershed method can be classified into rain falling and water immersion. Some recent watershed methods use a prior-information-based difference function instead of the more frequently used gradient function to improve the segmented results, and employ marker images as probes to explore the gradient space of an unknown image and thus determine the best-matched object. The advantage of the watershed method is that it can segment multiple objects with a single threshold setting. Its disadvantage is that different types of images need different thresholds. If the thresholds are not set correctly, then the objects are under-segmented or over-segmented. Additionally, slight changes in the threshold can significantly alter the segmentation results. In [21, 22], a systematic approach was demonstrated to analyze natural images by using a Binary Partition Tree (BPT) for the purposes of archiving and segmentation. BPTs are generated based on a region merging process which is uniquely specified by a region model, a merging order, and a merging criterion. By studying the evolution of region statistics, this unsupervised method highlights nodes which represent the boundary between salient details and provides a set of tree levels from which segmentations can be derived.
The edge-based segmentation can simplify the analysis by drastically reducing the number of pixels to be processed, while still preserving adequate object structures. Its drawback is that noise may result in erroneous edges. In the literature, the Canny edge detector employs a hysteresis threshold that adapts to the amount of noise in an image to eliminate streaking of edge contours, and the detector is optimized by three criteria: detection, localization, and single response. The standard deviation of the Gaussian function associated with the detector is determined by users. The Live Wire On the Fly (LWOF) method proposed by Falcao et al. helps the user to obtain an optimized route between two initial points. The user can follow the object contour and select enough adequate initial points to obtain an enclosed contour. The benefit of LWOF is that it adapts to any type of image. Even with very complex backgrounds, LWOF can enlist human assistance in determining the contour. However, LWOF is limited in that if a picture has multiple objects, each object needs to be segmented individually, and the supervised operation significantly increases the operating time. Another frequently adopted edge-based segmentation is the snake method first presented by Kass et al. In this method, after an initial contour is established, partial local energy minima are calculated to derive the correct contour. The flaw of the snake method is that the initial contour must be chosen manually. The operating time rises with the number of objects segmented. Moreover, if an object is located within another object, then the initial contours are also difficult to select. On the other hand, Yu proposed a supervised multiscale segmentation method in which every pixel becomes a node, and the likelihood of two nodes belonging together is interpreted by a weight attached to the edge linking these two pixel nodes. This approach turns image segmentation into a weighted graph partitioning problem that is solved by average cuts of normalized affinity. The above-mentioned supervised segmentation methods are suitable for detailed processing of the objects to be segmented with the user's assistance. In the unsupervised snake method, also named the active contour scheme, geodesic active contours and level sets were proposed to detect and track multiple moving objects in video sequences [26, 27]. However, the active contour scheme is generally applied to segmenting stand-alone objects within an image. For instance, an object located within a complicated background may not be easily segmented. Additionally, contours that are close together cannot be precisely segmented. In a related study, the Extended-Gradient Vector Flow (E-GVF) snake method proposed by Chuang and Lie improved upon the conventional snake method. The E-GVF snake method can automatically derive a set of seeds from the local gradient information surrounding each point, and thus can achieve unsupervised segmentation without manually specifying the initial contour. The noncontrast-based edge descriptor and the mathematical morphology method were developed by Kim and Park and by Gao et al., respectively, for unsupervised segmentation to assist object-based video coding [29, 30].
The conventional edge-following method is another edge-based segmentation approach that can be applied to nonspecific image types [14, 31]. The fundamental step of the edge-following method is to find the initial points of an object. With these initial points, the method then follows the contours of an object until it finds all points matching the criteria or it hits the boundary of the picture. The advantage of the conventional edge-following method is its simplicity, since it only has to compute the gradients of the eight points surrounding a contour point to obtain the next contour point. The search time for the next contour point is significantly reduced because many points within an object are never used. However, the limitation of the conventional edge-following method is that it is easily influenced by noise, causing it to follow the wrong edge. This wrong edge can form a wrong route and result in an invalid segmented area. Moreover, because initial points are manually selected by users, the segmentation accuracy may suffer from inconsistent selections made at different times. To overcome these drawbacks, the initial-point threshold calculated from the histogram of gradients in an entire image is adopted to locate the positions of initial points automatically. Additionally, contour thresholds are employed to eliminate inappropriate contours to increase the accuracy of the search and to minimize the required searching time. However, this method is limited in that the initial-point threshold and contour threshold remain unchanged throughout the whole image. Hence, optimized segmentations cannot always be attained in an image containing both complicated and smooth gradient areas. If the same initial-point threshold is employed throughout an image with areas having different characteristics, for example, one half of the image is smooth and the other half has major changes in gradients, then adequately segmented results can only be obtained from one side of the image, while the objects on the other side are not accurately segmented.
This work proposes a robust segmentation method that is suitable for nonspecific image types. Based on hierarchical segmentation under a quad-tree decomposition [32, 33], an image is adequately decomposed into many blocks and subblocks according to the image contents. The initial-point threshold in each block is determined by the modified iterative threshold selection technique and the initial-point threshold of its parent block. Additionally, the contour threshold is calculated based on the histogram of gradients in each block. Using these two thresholds, the modified edge-following scheme is developed to automatically and rapidly attain fairly good segmentation results. In the simulations, various types of images are segmented to compare the accuracy of the proposed method with that of the watershed, active contour, and other methods. For a fair comparison, the data set and benchmarks from the Computer Vision Group, University of California at Berkeley were used. Simulation results demonstrate that the proposed method is superior to the conventional methods to some extent. Because it avoids human intervention and reduces operating time, the proposed method is more robust and better suited to various image and video applications than the conventional segmentation methods.
2.1. Stage of Applying the Modified Iterative Threshold Selection Technique
The iterative threshold selection technique that was proposed by Ridler and Calvard to segment the foreground and background is modified to calculate the initial-point threshold of the whole image or of a particular block from the quad-tree decomposition, in order to identify initial points. The modified iterative threshold selection technique proceeds as follows.
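- (1) Set the iteration index to zero and initialize the threshold using MAX, a function that selects the maximum value.
- (2) The current threshold is adopted to classify all points in a decomposed block into initial and noninitial points. A point whose gradient value reaches the threshold is an initial point, while a point whose gradient value lies below it is a noninitial point. The average gradient value is computed within each of these two groups, using the numbers of initial and noninitial points, respectively.
- (3) The next threshold is obtained by rounding the weighted combination of the two group averages, where round rounds off its argument to the nearest integer. The two weighting values, ranging from 0 to 1, correspond to the initial and noninitial groups, respectively.
- (4) If the next threshold differs from the current one, then increment the iteration index and go to Step 2; otherwise, the current threshold is taken as the initial-point threshold.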
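The iteration described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `initial_point_threshold`, the starting threshold (half of the maximum gradient), the convention that a gradient equal to the threshold counts as an initial point, and the assumption that the two weighting values sum to one are all introduced here for the example.

```python
import numpy as np

def initial_point_threshold(grad, w_init=0.5, w_non=None, max_iter=100):
    """Weighted variant of iterative (Ridler-Calvard style) threshold selection.

    grad   : gradient magnitudes of one block (any array-like)
    w_init : weighting value of the initial-point group, in [0, 1]
    w_non  : weighting value of the noninitial group (assumed 1 - w_init)
    """
    grad = np.asarray(grad, dtype=float).ravel()
    if w_non is None:
        w_non = 1.0 - w_init              # assumption: weights sum to one
    t = int(round(grad.max() / 2.0))      # assumption: start at half the maximum
    for _ in range(max_iter):             # guard against oscillation
        initial = grad[grad >= t]         # points at or above the threshold
        non_initial = grad[grad < t]
        mu_i = initial.mean() if initial.size else 0.0
        mu_n = non_initial.mean() if non_initial.size else 0.0
        # Python's round() resolves exact .5 ties to the nearest even integer
        t_next = int(round(w_init * mu_i + w_non * mu_n))
        if t_next == t:                   # converged: threshold is stable
            break
        t = t_next
    return t
```

With equal weights the function behaves like the classical two-class iterative threshold; lowering `w_init` pulls the result toward the noninitial group average, which admits more initial points.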
2.2. Stage of the Quad-Tree Decomposition Process
During the quad-tree decomposition process, the weighting value of the initial group can be set to a value smaller than 0.5 at the first decomposition level, which lowers the initial-point threshold so that initial points can still be obtained from low-contrast areas. Additionally, this weighting value is increased with the decomposition level. For the smallest decomposed block at the last decomposition level, it can be set to a value larger than or equal to 0.5, which raises the initial-point threshold to avoid undesired initial points. Notably, the initial-point thresholds of blocks with drastic gray-level changes rise, whereas the initial-point thresholds of blocks with smooth gray-level changes fall. This approach to determining the initial-point threshold obtains adequate initial points based on the complexity of the image contents.
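A quad-tree decomposition with a level-dependent weighting value might be organized as in the sketch below. The split criterion (standard deviation of the block's gradients above a hypothetical `split_std`), the linear weight schedule between `w_first` and `w_last`, and all function and parameter names are assumptions made for illustration; the text only states that gradient characteristics drive the decomposition and that the weighting value grows with the decomposition level.

```python
import numpy as np

def quadtree_blocks(grad, min_size=8, split_std=10.0, max_level=3,
                    w_first=0.3, w_last=0.5):
    """Return leaf blocks as (row, col, height, width, level, weight) tuples."""
    grad = np.asarray(grad, dtype=float)
    leaves = []

    def weight(level):
        # Hypothetical linear schedule: the weight rises with the level
        if max_level == 0:
            return w_last
        return w_first + (w_last - w_first) * level / max_level

    def split(r, c, h, w, level):
        block = grad[r:r + h, c:c + w]
        can_split = (level < max_level
                     and h // 2 >= min_size and w // 2 >= min_size)
        # Assumed criterion: keep splitting while gradients vary strongly
        if can_split and block.std() > split_std:
            h2, w2 = h // 2, w // 2
            split(r, c, h2, w2, level + 1)
            split(r, c + w2, h2, w - w2, level + 1)
            split(r + h2, c, h - h2, w2, level + 1)
            split(r + h2, c + w2, h - h2, w - w2, level + 1)
        else:
            leaves.append((r, c, h, w, level, weight(level)))

    split(0, 0, grad.shape[0], grad.shape[1], 0)
    return leaves
```

Each leaf carries its own weighting value, so a per-block threshold routine can be called with a smaller weight for shallow blocks and a larger one for deep blocks.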
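- (1) Take a point from a decomposed block.
- (2) If the gradient value of the point reaches the initial-point threshold of that block, then the point is labeled as an initial point and its maximum-gradient direction is recorded.
- (3) Repeat step 2 for all points in the block.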
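The labeling loop above can be illustrated with a short sketch. The gradient measure used here (the largest absolute gray-level difference to any of the eight neighbors), the numbering of the compass directions in `OFFSETS`, and the function name `label_initial_points` are assumptions for the example and may differ from the definitions in the original equations.

```python
import numpy as np

# Eight compass directions as (row, col) offsets; this numbering is an assumption
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def label_initial_points(gray, block, threshold):
    """Return {(row, col): direction_index} for initial points inside a block.

    gray      : 2-D gray-level image
    block     : (row, col, height, width) of a decomposed block
    threshold : initial-point threshold of that block
    """
    gray = np.asarray(gray, dtype=float)
    rows, cols = gray.shape
    r0, c0, h, w = block
    points = {}
    for r in range(r0, r0 + h):
        for c in range(c0, c0 + w):
            best_val, best_dir = -1.0, None
            for d, (dr, dc) in enumerate(OFFSETS):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    diff = abs(gray[rr, cc] - gray[r, c])
                    if diff > best_val:        # first strongest direction wins
                        best_val, best_dir = diff, d
            # Label the point and record its maximum-gradient direction
            if best_dir is not None and best_val >= threshold:
                points[(r, c)] = best_dir
    return points
```

The recorded direction index is what a later contour search would use to choose a starting direction perpendicular to the strongest gradient.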
2.3. Stage of Determining the Contour Threshold Tc
2.4. Stage of Applying the Modified Edge-Following Method
- (1) Select an initial point and its recorded maximum-gradient direction. This initial point is taken as the first contour point, and the edge-following direction is set perpendicular to the maximum-gradient direction. Here, the direction is a value denoting one of the eight compass directions as shown in Figure 3.
- (2) Initialize the contour-point index. The searching procedure begins from the initial point and its edge-following direction.
- (3) First, to reduce computational time, the search is restricted to only three directions by setting the number of directions needed to three. The direction of the next point thus has three possible values: the current direction and the two compass directions adjacent to it. The next contour point could therefore appear at one of three predicted contour points, as shown in Figure 7(a). Each predicted contour point has a left-sided point and a right-sided point; the line formed by these two points is perpendicular to the line between the current contour point and the predicted contour point, with the direction deviation revealed in Figure 7(b).
- (5) Two measures that interpret the relationships among the predicted point and its left-sided and right-sided points are used to obtain the next probable contour point.
Equations (7) and (8) are used to determine the next contour point. The first term represents the gradient between the predicted point and its left-sided or right-sided point. The second term helps prevent (7) or (8) from finding wrong contours due to noise interference. If the difference in the second term is too large, then a wrong contour point may be found.
- (6) Select the largest value given by (7) or (8). If this value reaches the contour threshold Tc, then the correct direction has been found; go to step 8. Here, Tc comes from the decomposed block to which the predicted contour point belongs.
- (7) If only three directions have been searched, then the previously searched direction may have deviated from the correct path; set the number of directions to seven to obtain the seven neighboring points for direction searching, and go to step 5. Otherwise, stop the search procedure and go to step 10.
- (9) The searching procedure is finished when the new contour point is in the same position as any of the previously searched contour points or has gone beyond the four boundaries of the image. If neither condition is true, then increment the contour-point index and return to step 3 to discover the next contour point.
- (10) If the opposite direction has not yet been searched, reverse the initial edge-following direction and go to step 2 to search for the contour points in the opposite direction.
- (11) Go to step 1 for another initial point that has not been searched. When all initial points have been processed, the procedure of the modified edge-following method ends.
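The control flow of this search (three directions first, seven directions as a fallback, then the opposite direction) can be sketched as below. This is a deliberately simplified illustration under stated assumptions: candidates are scored by a precomputed gradient-magnitude map instead of the left-sided and right-sided point measures of (7) and (8), a single scalar `tc` stands in for the per-block contour threshold, ties prefer the straight-ahead direction, and the direction numbering and all function names are hypothetical.

```python
import numpy as np

# Eight compass directions as (row, col) offsets; this numbering is an assumption
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]

def best_step(grad, point, direction, spread):
    """Strongest in-image neighbor among directions d-spread .. d+spread."""
    rows, cols = grad.shape
    best = None
    # Visit the straight-ahead direction first, then widen symmetrically
    for k in sorted(range(-spread, spread + 1), key=abs):
        d = (direction + k) % 8
        r, c = point[0] + OFFSETS[d][0], point[1] + OFFSETS[d][1]
        if 0 <= r < rows and 0 <= c < cols:
            if best is None or grad[r, c] > best[0]:
                best = (grad[r, c], (r, c), d)
    return best

def follow_one_way(grad, start, direction, tc, visited):
    """Follow the contour from start until no candidate reaches tc."""
    path, point = [], start
    while True:
        step = best_step(grad, point, direction, 1)      # three directions
        if step is None or step[0] < tc:
            step = best_step(grad, point, direction, 3)  # widen to seven
        if step is None or step[0] < tc:
            break                                        # no valid contour point
        _, nxt, direction = step
        if nxt in visited:                               # closed or revisited contour
            break
        visited.add(nxt)
        path.append(nxt)
        point = nxt
    return path

def follow_edge(grad, start, direction, tc):
    """Trace a contour through start in both directions."""
    grad = np.asarray(grad, dtype=float)
    visited = {start}
    forward = follow_one_way(grad, start, direction, tc, visited)
    backward = follow_one_way(grad, start, (direction + 4) % 8, tc, visited)
    return backward[::-1] + [start] + forward
```

On a straight edge the loop never leaves the three-direction branch, which reflects why most of the search only needs three candidate evaluations; the seven-direction fallback is only exercised at sharp turns.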
where α is set to 0.5 in our simulations.
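If α here is the precision-recall weighting of the F-measure as defined in the Berkeley segmentation benchmark, F = PR / (αR + (1 − α)P), then the score can be computed as in the following sketch. The function name `f_measure` and the zero-denominator convention are assumptions added for the example.

```python
def f_measure(precision, recall, alpha=0.5):
    """F-measure in the form F = P*R / (alpha*R + (1 - alpha)*P)."""
    denom = alpha * recall + (1.0 - alpha) * precision
    # Assumed convention: report 0 when both precision and recall are 0
    return 0.0 if denom == 0 else precision * recall / denom
```

With α = 0.5 this reduces to the harmonic mean of precision and recall, 2PR / (P + R), so neither quantity is favored.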
Segmentation results of the LWOF, E-GVF, watershed and proposed methods.
Numbers of segmented objects.
F-measures and computational time of the LWOF, snake, watershed and proposed methods.
Noise levels reported: SNR = 18.87 dB, 12.77 dB, and 9.14 dB; processing time in seconds.
The above experimental results demonstrate that the proposed method performs better than the other methods. For the blurry objects resulting from the out-of-focus shot in Figure 9, the proposed method can accurately segment all objects without incurring the over-segmentation and under-segmentation that the watershed method exhibits in Figures 9(d) and 9(e), respectively. Figure 10 reveals that both the proposed and watershed methods are capable of fully segmenting objects inside another object and overlapping objects, whereas the E-GVF snake method cannot be applied to these pictures. In Figure 10, which contains many individual objects, the proposed method segments more objects than the watershed method. In the simulation results shown in Figure 11, by considering the gray-level changes of the left and right neighboring points during the contour-searching process, the proposed method reduces noise interference and outperforms both the E-GVF snake and watershed methods on noisy images.
F-measures and computational time (processing time in seconds) for the noisy images processed by the proposed and watershed methods.
In practical applications, ground truths are not available. The conventional methods, BG, TG, and B/TG, which need the ground truths to determine the best-matched thresholds or parameters, may not obtain good segmentation results when no ground truth is available. However, the proposed robust segmentation method needs neither ground truths nor iterative operations to determine the segmentation results, and is therefore well suited to various real-time image and video segmentation applications in which no ground truth is available.
This work proposes an automatically determined threshold mechanism to perform robust segmentation. Different initial-point thresholds are determined and assigned to areas with drastic and smooth changes in gray-level values. The contour thresholds are generated by analyzing the decomposed blocks, thus preventing the search from following a wrong path and saving computational time. The contour search process also considers the gradients of the left and right neighboring points of every predicted contour point, in order to lower the possibility of the method being affected by neighboring noise. Additionally, most of the searching process requires only the computation of the gradients in three directions, thus minimizing the searching time. The proposed method can segment objects inside another object and objects that are close to each other, which the E-GVF snake method cannot do. The proposed method also solves a problem encountered by the watershed method, in which the results may change significantly as the threshold values differ. The proposed method can significantly reduce noise interference, which easily affects the conventional edge-following method. In handling blurry objects from an out-of-focus shot, the proposed method can also segment the required objects. Finally, the benchmark from the Computer Vision Group, University of California at Berkeley was used to demonstrate that the proposed method takes the least computational time among the compared methods while obtaining robust and good segmentation performance. Therefore, the proposed method can be widely and effectively employed in various segmentation applications.
Valuable discussions with Professor Tsuhan Chen, Carnegie Mellon University, Pittsburgh, USA, are highly appreciated. Additionally, the authors would like to thank the National Science Council, Taiwan, for financially supporting this research under Contract nos. NSC 95-2221-E-270-015 and NSC 95-2221-E-194-032. Professor W. N. Lie, National Chung Cheng University, Chiayi, Taiwan, is appreciated for his valuable suggestion. Dr. C. H. Chuang, Institute of Statistical Science, Academia Sinica, Taipei, Taiwan, is thanked for kindly providing the software programs of the snake and watershed methods.
- Liu D, Chen T: DISCOV: a framework for discovering objects in video. IEEE Transactions on Multimedia 2008, 10(2):200-208.
- Pan J, Gu C, Sun MT: An MPEG-4 virtual video conferencing system with robust video object segmentation. Proceedings of Workshop and Exhibition on MPEG-4, June 2001, San Jose, Calif, USA 45-48.
- Yang J-F, Hao S-S, Chung P-C, Huang C-L: Color object segmentation with eigen-based fuzzy C-means. Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '00), May 2000, Geneva, Switzerland 5: 25-28.
- Chien S-Y, Huang Y-W, Hsieh B-Y, Ma S-Y, Chen L-G: Fast video segmentation algorithm with shadow cancellation, global motion compensation, and adaptive threshold techniques. IEEE Transactions on Multimedia 2004, 6(5):732-748. 10.1109/TMM.2004.834868
- Zhou JY, Ong EP, Ko CC: Video object segmentation and tracking for content-based video coding. Proceedings of IEEE International Conference on Multimedia and Expo (ICME '00), July 2000, New York, NY, USA 3: 1555-1558.
- Chiang C-C, Hung Y-P, Lee GC: A learning state-space model for image retrieval. EURASIP Journal on Advances in Signal Processing 2007, 2007:-10.
- Chen YB, Chen OT-C, Chang HT, Chien JT: An automatic medical-assistance diagnosis system applicable on X-ray images. Proceedings of the 44th IEEE Midwest Symposium on Circuits and Systems (MWSCAS '01), August 2001, Dayton, Ohio, USA 2: 910-914.
- Nock R, Nielsen F: Semi-supervised statistical region refinement for color image segmentation. Pattern Recognition 2005, 38(6):835-846. 10.1016/j.patcog.2004.11.009
- Vincent L, Soille P: Watersheds in digital spaces: an efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis and Machine Intelligence 1991, 13(6):583-598. 10.1109/34.87344
- Adams R, Bischof L: Seeded region growing. IEEE Transactions on Pattern Analysis and Machine Intelligence 1994, 16(6):641-647. 10.1109/34.295913
- Kim DH, Yun ID, Lee SU: New MRF parameter estimation technique for texture image segmentation using hierarchical GMRF model based on random spatial interaction and mean field theory. Proceedings of the 18th International Conference on Pattern Recognition (ICPR '06), August 2006, Hong Kong 2: 365-368.
- Canny J: Computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 1986, 8(6):679-698.
- Bogdanova I, Bresson X, Thiran J-P, Vandergheynst P: Scale space analysis and active contours for omnidirectional images. IEEE Transactions on Image Processing 2007, 16(7):1888-1901.
- Pitas I: Digital Image Processing Schemes and Application. John Wiley & Sons, New York, NY, USA; 2000.
- Chen YB, Chen OT-C: Robust fully-automatic segmentation based on modified edge-following technique. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03), April 2003, Hong Kong 3: 333-336.
- Sonka M, Hlavac V, Boyle R: Image Processing, Analysis, and Machine Vision. 2nd edition. Brooks/Cole, New York, NY, USA; 1998.
- Chien S-Y, Huang Y-W, Chen L-G: Predictive watershed: a fast watershed algorithm for video segmentation. IEEE Transactions on Circuits and Systems for Video Technology 2003, 13(5):453-461. 10.1109/TCSVT.2003.811605
- Kuo CJ, Odeh SF, Huang MC: Image segmentation with improved watershed algorithm and its FPGA implementation. Proceedings of the IEEE International Symposium on Circuits and Systems (ISCAS '01), May 2001, Sydney, Australia 2: 753-756.
- Grau V, Mewes AUJ, Alcañiz M, Kikinis R, Warfield SK: Improved watershed transform for medical image segmentation using prior information. IEEE Transactions on Medical Imaging 2004, 23(4):447-458. 10.1109/TMI.2004.824224
- Hu Y, Nagao T: A matching method based on marker-controlled watershed segmentation. Proceedings of the International Conference on Image Processing (ICIP '04), October 2004, Singapore 1: 283-286.
- Salembier P, Garrido L: Binary partition tree as an efficient representation for image processing, segmentation, and information retrieval. IEEE Transactions on Image Processing 2000, 9(4):561-576. 10.1109/83.841934
- Lu H, Woods JC, Ghanbari M: Binary partition tree for semantic object extraction and image segmentation. IEEE Transactions on Circuits and Systems for Video Technology 2007, 17(3):378-383.
- Falcao AX, Udupa JK, Miyazawa FK: An ultra-fast user-steered image segmentation paradigm: live wire on the fly. IEEE Transactions on Medical Imaging 2000, 19(1):55-62. 10.1109/42.832960
- Kass M, Witkin A, Terzopoulos D: Snakes: active contour models. International Journal of Computer Vision 1988, 1(4):321-331. 10.1007/BF00133570
- Yu SX: Segmentation using multiscale cues. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '04), June-July 2004, Washington, DC, USA 1: 247-254.
- Paragios N, Deriche R: Geodesic active contours and level sets for the detection and tracking of moving objects. IEEE Transactions on Pattern Analysis and Machine Intelligence 2000, 22(3):266-280. 10.1109/34.841758
- Mukherjee DP, Ray N, Acton ST: Level set analysis for leukocyte detection and tracking. IEEE Transactions on Image Processing 2004, 13(4):562-572. 10.1109/TIP.2003.819858
- Chuang C-H, Lie W-N: A downstream algorithm based on extended gradient vector flow field for object segmentation. IEEE Transactions on Image Processing 2004, 13(10):1379-1392. 10.1109/TIP.2004.834663
- Kim B-G, Park D-J: Novel noncontrast-based edge descriptor for image segmentation. IEEE Transactions on Circuits and Systems for Video Technology 2006, 16(9):1086-1095.
- Gao H, Siu W-C, Hou C-H: Improved techniques for automatic image segmentation. IEEE Transactions on Circuits and Systems for Video Technology 2001, 11(12):1273-1280. 10.1109/76.974681
- Chen YB, Chen OT-C: Semi-automatic image segmentation using dynamic direction prediction. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '02), May 2002, Orlando, Fla, USA 4: 3369-3372.
- Tierny J, Vandeborre J-P, Daoudi M: Topology driven 3D mesh hierarchical segmentation. Proceedings IEEE International Conference on Shape Modeling and Applications (SMI '07), June 2007, Lyon, France 215-220.
- Smith JR, Chang S-F: Quad-tree segmentation for texture-based image query. Proceedings of the 2nd Annual ACM Multimedia Conference, October 1994, San Francisco, Calif, USA 279-286.
- Martin D, Fowlkes C, Tal D, Malik J: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. Proceedings of the 8th IEEE International Conference on Computer Vision, July 2001, Vancouver, Canada 2: 416-423.
- Ridler TW, Calvard S: Picture thresholding using an iterative selection method. IEEE Transactions on Systems, Man, and Cybernetics 1978, 8(8):630-632.
- van Rijsbergen C: Information Retrieval. 2nd edition. Department of Computer Science, University of Glasgow, Glasgow, UK; 1979.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.