Texture Classification for 3D Urban Map
© Hiroyuki Inatsuka et al. 2009
Received: 27 February 2008
Accepted: 23 February 2009
Published: 7 June 2009
This paper proposes a method to control texture resolution for rendering large-scale 3D urban maps. Since texture data in 3D maps generally tend to be far larger than geometry information such as vertices and triangles, reducing the texture data by exploiting LOD (Level of Detail) is the more effective way to decrease the overall data size. For this purpose, we propose a new method to control the resolution of the textures. In our method we classify the textures into four classes based on their salient features. The appropriate texture resolutions are decided based on the classification results, their rendered sizes on a display, and their level of importance. We verify the validity of our texture classification algorithm by applying it to large-scale 3D urban map rendering.
Three-dimensional urban maps (3D maps) have a variety of applications such as navigation systems, disaster management, and urban planning simulations. As the technology for acquiring 3D range data and city models from the real world has advanced [1–4], more sophisticated photo-realistic visualization of 3D maps is becoming available.
In general, a 3D map consists of geometry information (3D meshes) and texture images. Since the amount of data is huge, its size often becomes a problem when it is handled on devices such as PCs, car-navigation systems, and portable devices. Therefore, reducing the data volume is important for making 3D map applications user friendly.
Many methods for LOD control of general 3D models have been proposed so far [5–7]. Moreover, there exist visualization techniques [7–12] for terrain data that express ground surfaces with geographical features such as mountainous districts. These conventional methods assume that a single model is locally smooth and consists of a topological manifold containing a large number of meshes, whereas a city model consists chiefly of buildings. Since the data in a 3D map mostly consist of many small, already simplified meshes (such as buildings made of a few cuboids), these simplification methods cannot be applied to it. Although some methods for modeling and rendering 3D maps have already been proposed [13–15], few methods address the reduction of texture data for 3D maps. In a 3D map, the texture data tend to account for a much larger share of the total data than geometrical information such as vertices and triangles. For example, in the 3D map we use for a simulation (Figure 10), the total size of the textures is 461 MBytes in JPEG format, while the geometry information occupies only 15 MBytes in gzipped VRML. This means that LOD control of the textures can reduce the overall data size more effectively than LOD control of the geometry. For this purpose, we propose a technique for controlling the resolution of the textures. Wang et al. [16] propose a method that detects repeated content in textures to reduce storage and required memory. In this paper we focus on urban map rendering from ground level (as shown in Figure 11) rather than from a bird's-eye view. Such systems can be used in car-navigation and human-navigation systems.
In the following section, we give an overview of our rendering system, which takes the level of texture importance into account. In Section 3, we propose our classification method based on the K-Nearest Neighbor approach. We then show experimental results that verify the validity of our classification method in Section 4. In Section 5 we summarize the strengths of our method.
2. Proposed Method
Our 3D map is composed of simplified polygonal meshes (as shown in Figure 3) and textures mapped on the meshes. The textures are made of photographs of real scenes.
To reduce rendering cost, an efficient strategy is to load and render only the data visible from the user's viewpoint. To implement this, we introduce representative viewpoints (RVs), discrete points in the 3D map that serve as reference points for rendering. In our rendering system, we spread RVs all over the map in advance. Using the Z-buffer algorithm, the objects visible from each RV are determined. When rendering, we first find the RV closest to the user's current viewpoint, look up the objects visible from that RV, and render only those objects.
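The run-time lookup described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 2D ground-plane coordinates, and the sample data are all assumptions, and a linear search stands in for whatever spatial index a real system would use.

```python
import math

def closest_rv(viewpoint, rvs):
    """Return the index of the RV nearest to the viewpoint (ground-plane distance)."""
    return min(range(len(rvs)), key=lambda i: math.dist(viewpoint, rvs[i]))

def objects_to_render(viewpoint, rvs, visible_lists):
    """Look up the precomputed visible-object list of the closest RV."""
    return visible_lists[closest_rv(viewpoint, rvs)]

# Hypothetical data: three RVs and the object indices visible from each of them.
rvs = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
visible_lists = [[1, 2], [2, 3, 5], [4]]

# A user standing near the second RV only needs that RV's objects.
print(objects_to_render((8.0, 1.0), rvs, visible_lists))
```

Because the visible-object lists are computed in advance, the per-frame work reduces to one nearest-point query and one list lookup.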
Note that in our setting, different types of objects are saved separately in different image files; for example, no image includes a billboard and a window simultaneously. Since our classification operates on images directly, the algorithm would fail for images that simultaneously contain objects belonging to different classes.
2.2. Selection of Representative Viewpoints
In this step we select the representative viewpoints (RVs) on the 3D map. In our framework, we suppose that a user walks on a street of the 3D map. Only rendering from ground level is considered. Strictly, making the points equally spaced in the map requires defining them based on the geometrical form of the terrain data; however, since we assume that the terrain associated with the map is flat, we adopt the following simple procedure to reduce computational complexity.
Next, from the candidates on the lattice points in Figure 2, the points lying in areas through which the user can actually walk in the 3D map are selected as the RVs. This is done simply by excluding the candidate points in high places such as the tops of buildings.
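A minimal sketch of this selection, under stated assumptions: candidates are laid on a regular lattice over flat terrain, and a hypothetical callback `height_at(x, y)` returns the height of the topmost surface at a point. The cutoff `max_height` and the toy scene are illustrative values, not taken from the paper.

```python
def select_rvs(x_range, y_range, spacing, height_at, max_height):
    """Lay candidate points on a regular lattice and keep only the low ones.

    height_at(x, y) is a hypothetical callback giving the height of the
    topmost surface at (x, y); max_height is an assumed cutoff.
    """
    rvs = []
    nx = int((x_range[1] - x_range[0]) // spacing) + 1
    ny = int((y_range[1] - y_range[0]) // spacing) + 1
    for i in range(nx):
        for j in range(ny):
            x = x_range[0] + i * spacing
            y = y_range[0] + j * spacing
            # Candidates on rooftops and other high places are excluded.
            if height_at(x, y) <= max_height:
                rvs.append((x, y))
    return rvs

# Toy scene: one 30 m tall building occupying 10 <= x, y <= 20 on flat ground.
def toy_height(x, y):
    return 30.0 if 10 <= x <= 20 and 10 <= y <= 20 else 0.0

rvs = select_rvs((0, 30), (0, 30), 10, toy_height, max_height=1.0)
print(len(rvs), "RVs kept out of 16 lattice candidates")
```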
2.3. Determination of Visible Objects
The set of objects visible from each RV is determined by using the Z-buffer algorithm [17]. The accuracy of the determination depends on the resolution of the Z-buffer. Increasing the resolution increases both the accuracy and the computational cost, and vice versa. We determined the resolution of the buffer by trial and error. Note that this processing needs to be done only once per 3D map, as preprocessing. For example, Figure 3 depicts the objects judged as "visible" from the RV denoted by the sphere in the middle. The determination is done for all the RVs, and we store the indices of the visible objects in a list.
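The idea can be illustrated with a small software item buffer, in which every pixel keeps the index of the nearest object drawn into it. This is a simplified sketch rather than the authors' code: it assumes each object has already been projected to the RV's image plane as an axis-aligned rectangle with a single depth value, which a real renderer would replace by triangle rasterization.

```python
def visible_objects(rects, width, height):
    """Item-buffer variant of the Z-buffer test.

    rects: list of (object_id, x0, y0, x1, y1, depth) in pixel coordinates,
    already projected to the RV's image plane (an assumption of this sketch).
    Returns the sorted ids that own at least one pixel.
    """
    depth_buf = [[float("inf")] * width for _ in range(height)]
    id_buf = [[None] * width for _ in range(height)]
    for obj_id, x0, y0, x1, y1, depth in rects:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                if depth < depth_buf[y][x]:  # the nearer fragment wins
                    depth_buf[y][x] = depth
                    id_buf[y][x] = obj_id
    return sorted({i for row in id_buf for i in row if i is not None})

# Object 1 lies entirely behind object 0; object 2 is only partly covered.
scene = [
    (0, 0, 0, 8, 8, 5.0),
    (1, 2, 2, 6, 6, 9.0),
    (2, 6, 6, 12, 12, 9.0),
]
print(visible_objects(scene, 16, 16))
```

A coarser buffer makes this loop cheaper but can miss objects that cover less than one pixel, which is the accuracy and cost trade-off mentioned above.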
2.4. Texture Resolution
The distance is measured between the barycenter of the surface on which the texture is mapped and the RV. When the surface is slanted with respect to the RV, the texture appears smaller than when it is seen from the front. To take this into account, we first consider a virtual plane that is perpendicular to the line connecting the center of the plane and the RV. We then project each vertex of the surface on which the texture is mapped onto this plane and calculate the area of the projected surface. Sometimes one texture is repeatedly mapped on a large surface. The number of repeats can be found from the texture coordinates. Finally, the size of the texture is estimated from the distance, the projected area, and the number of repeats.
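The geometric part of this estimate, the distance and the projected area, can be sketched as below. The sketch assumes a planar polygon and uses the mean of its vertices as the barycenter; how the two quantities and the repeat count are combined into the final size is not reproduced here. All function names are illustrative.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def projected_area_and_distance(vertices, rv):
    """Project a planar polygon onto the plane through its barycenter that is
    perpendicular to the line to the RV; return (projected area, distance)."""
    n = len(vertices)
    # Assumption: the barycenter is approximated by the mean of the vertices.
    center = tuple(sum(v[k] for v in vertices) / n for k in range(3))
    to_rv = sub(rv, center)
    dist = norm(to_rv)
    view = tuple(c / dist for c in to_rv)
    # Drop the component of each vertex offset along the viewing direction.
    proj = []
    for v in vertices:
        d = sub(v, center)
        t = dot(d, view)
        proj.append(tuple(d[k] - t * view[k] for k in range(3)))
    # Vector-area formula for a planar polygon.
    total = (0.0, 0.0, 0.0)
    for i in range(n):
        c = cross(proj[i], proj[(i + 1) % n])
        total = (total[0] + c[0], total[1] + c[1], total[2] + c[2])
    return 0.5 * norm(total), dist

# A 2 x 2 square seen head-on and then at 45 degrees.
square = [(-1, -1, 0), (1, -1, 0), (1, 1, 0), (-1, 1, 0)]
print(projected_area_and_distance(square, (0, 0, 10)))
print(projected_area_and_distance(square, (10, 0, 10)))
```

Seen at 45 degrees, the square's projected area shrinks by a factor of cos 45°, which is the foreshortening effect the virtual plane is meant to capture.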
The threshold is used for saliency detection and controls the weight on images with a smooth background. Of course, the two thresholds affect the performance of the algorithm; for example, a smaller threshold value puts more weight on images with a flat background. However, the final result is not sensitive to the thresholds. We choose these thresholds by trial and error.
Next we consider the colors of the texture. In this method, a texture composed of a wide variety of colors is considered a "less important" texture, since in many applications, including navigation systems, overly complex information seldom plays an important role and is even unrecognizable from a distance. To evaluate the color complexity, we use the variance of the RGB histograms. Note that, since each bin of the histogram holds a frequency, a low variance indicates that the texture contains a wide variety of colors, whereas a texture with a small number of colors yields a high variance.
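The relation between color variety and histogram variance can be checked with a short sketch. The bin count, the normalization of the histogram, and the averaging over the three channels are assumptions made for illustration; the paper does not specify them here.

```python
from statistics import pvariance

def histogram_variance(pixels, bins=16):
    """Mean over R, G, and B of the variance of the normalized channel histogram.

    pixels: iterable of (r, g, b) tuples with 8-bit values. The bin count and
    the averaging over channels are assumptions of this sketch.
    """
    pixels = list(pixels)
    variances = []
    for ch in range(3):
        hist = [0] * bins
        for p in pixels:
            hist[p[ch] * bins // 256] += 1
        freq = [h / len(pixels) for h in hist]
        variances.append(pvariance(freq))
    return sum(variances) / 3

# A single-color texture versus one whose values spread over every bin.
flat = [(200, 30, 30)] * 256
varied = [(i, (i * 7) % 256, 255 - i) for i in range(256)]
print(histogram_variance(flat), histogram_variance(varied))
```

The single-color texture concentrates all pixels in one bin per channel and so has the larger variance, while the evenly spread texture has a variance of zero.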
3. Texture Classification
In the previous section, we defined a value that is obtained from the display size and the features of the texture. In our previous report [18], one of multiple resolutions is assigned to each texture according to this value when actually rendering the 3D map. In the system of [18], we prepare four levels of textures, which are simply created by reducing the resolution of an original texture to 1/2, 1/4, and 1/8. However, with this method, salient textures such as road traffic signs and unremarkable textures like the walls of buildings are treated similarly. To address this problem, we introduce a new LOD control based on texture classification. By classifying the textures into several classes and then changing the reduction ratio according to the class, we can control the resolution more reasonably based on image features. With this method, the resolution of an image in one class can be made lower than that of an image in another class even if it has a larger value in (3).
3.1. Definition of Class
In the end, we define the following four classes.
Class 1: walls with soft edges.
Class 2: texture with some clearly outlined objects with smooth background.
Class 3: walls with sharp edges.
Class 4: others.
In our framework, we select the resolution based on these classes in addition to the levels based on (3). We consider the case where four levels of textures are prepared, with reduction ratios of 1, 1/2, 1/4, and 1/8, and every texture is labeled with one of Levels 1 to 4. Then we classify all of the original textures into Classes 1 to 4. For the textures in Class 2, the original images are allocated to Levels 1 and 2 and the images shrunk to a half are allocated to the other levels, while for Class 1 the images shrunk to 1/8 are used for all the levels. With this strategy, LOD control of the textures becomes more efficient than when only (3) is taken into account.
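The allocation rule can be written as a small lookup table. The entries for Classes 1 and 2 follow the text above; the text does not give the allocation for Classes 3 and 4, so this sketch falls back to the plain level-based ratios for them, which is purely an assumption.

```python
from fractions import Fraction

# Level -> reduction ratio when only the level from (3) is used.
BASE_RATIO = {1: Fraction(1), 2: Fraction(1, 2), 3: Fraction(1, 4), 4: Fraction(1, 8)}

# Per-class allocations stated in the text. Classes 3 and 4 are not specified
# there, so they fall back to BASE_RATIO (an assumption of this sketch).
CLASS_RATIO = {
    1: {level: Fraction(1, 8) for level in BASE_RATIO},
    2: {1: Fraction(1), 2: Fraction(1), 3: Fraction(1, 2), 4: Fraction(1, 2)},
}

def reduction_ratio(texture_class, level):
    """Return the reduction ratio for a texture of the given class and level."""
    return CLASS_RATIO.get(texture_class, BASE_RATIO)[level]

for cls in (1, 2, 3, 4):
    print(cls, [str(reduction_ratio(cls, lvl)) for lvl in (1, 2, 3, 4)])
```

With this table, a Class 1 wall at Level 1 is stored at 1/8 resolution, while a Class 2 sign at Level 4 is still kept at 1/2.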
3.2. Classification Method
The K-Nearest Neighbor (K-NN) algorithm is used to automatically classify a large number of textures. In general, the selection of feature vectors greatly affects the accuracy of the classification. Here we introduce two feature vectors based on colors and edges.
The first feature is the color moment. Color moments have been widely used in conventional image retrieval systems [20, 21]. The first-order moment (mean) roughly measures the appearance of images. Although it has proved effective in retrieval systems, in our application measuring color differences does not improve the classification; for example, blue and red signboards should be treated equally. On the other hand, the second moment (variance) of the color measures the fineness of the texture and can discriminate between simple and complex textures. The third (skewness) and fourth (kurtosis) moments may also be applied, but these high-order moments are often sensitive to small changes and may degrade the performance. For these reasons, we adopt the second-order moment as the color feature.
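A minimal sketch of the color feature and the K-NN vote is given below. It assumes that the second-order moment is taken per RGB channel and that Euclidean distance is used, and it omits the edge feature entirely; the training vectors and labels are hypothetical.

```python
import math
from collections import Counter
from statistics import pvariance

def color_variance_feature(pixels):
    """Second-order color moment: the variance of each of R, G, and B."""
    return tuple(pvariance([p[ch] for p in pixels]) for ch in range(3))

def knn_classify(query, training, k=3):
    """Majority vote among the k training samples nearest to the query.

    training: list of (feature_vector, class_label) pairs.
    """
    nearest = sorted(training, key=lambda t: math.dist(query, t[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: low-variance walls (Class 1) and
# high-variance textures (Class 4).
training = [
    ((10.0, 12.0, 9.0), 1),
    ((15.0, 8.0, 11.0), 1),
    ((20.0, 25.0, 18.0), 1),
    ((900.0, 850.0, 950.0), 4),
    ((1000.0, 990.0, 1010.0), 4),
]

wall_pixels = [(100, 100, 100), (104, 100, 98), (96, 100, 102)]
feature = color_variance_feature(wall_pixels)
print(feature, knn_classify(feature, training))
```

In a fuller version, the color feature would be concatenated with an edge feature before the distance is computed, and the scales of the two would need to be balanced.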
When the user walks through the 3D map, the current position and its closest RV are found, and then all the textures, at the appropriate resolutions, that belong to nine RVs (the closest RV and its eight neighboring RVs) are loaded. The neighboring RVs are loaded as well so that loading cost can be reduced and scenes change smoothly when the user moves into the area of another RV.
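The nine-RV loading scheme might be sketched as follows, assuming the RVs are indexed by their lattice cell and that the user's position lies in a walkable area, so that the nearest lattice point is itself an RV. The names `rv_grid` and `textures_by_rv` are hypothetical, and fewer than nine cells are returned at the map border or next to excluded lattice points.

```python
def rvs_to_load(position, origin, spacing, rv_grid):
    """Return the lattice cell of the closest RV and its existing neighbours.

    rv_grid: set of (i, j) lattice indices that were kept as RVs.
    """
    ci = round((position[0] - origin[0]) / spacing)
    cj = round((position[1] - origin[1]) / spacing)
    return [(ci + di, cj + dj)
            for di in (-1, 0, 1) for dj in (-1, 0, 1)
            if (ci + di, cj + dj) in rv_grid]

def textures_to_load(cells, textures_by_rv):
    """Union of the texture ids attached to the given RV cells."""
    needed = set()
    for cell in cells:
        needed |= textures_by_rv.get(cell, set())
    return needed

# A 4 x 4 lattice with one cell excluded (for example, a rooftop).
rv_grid = {(i, j) for i in range(4) for j in range(4)} - {(1, 1)}
textures_by_rv = {(2, 2): {"wall_a", "sign_b"}, (3, 2): {"wall_a", "wall_c"}}

cells = rvs_to_load((19.0, 21.0), (0.0, 0.0), 10.0, rv_grid)
print(len(cells), sorted(textures_to_load(cells, textures_by_rv)))
```

Since adjacent RVs share most of their neighbourhoods, moving to the next RV only requires loading the textures of the newly entered cells.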
4. Experimental Results
4.1. Precision of Classification
Precision of features on colors.
Var. Hist.: the variance of the color histogram.
Hist(512): the color histogram quantized to 512 bins.
CCV: color coherence vector proposed in [23].
Moment: proposed feature vector.
Precision of features on edges.
Sharp edges: the cost in (2) that we use to determine the texture level.
Wavelet coeffs.: quantized wavelet coefficients (high pass outputs of the dyadic wavelet).
Directionality: the quantized direction of the edges that is obtained by Sobel filter.
Anisotropic diffusion: proposed feature vector.
Precision of two features.
Moment + Wavelet
Hist(512) + Wavelet
Moment + Anisotropic diffusion
4.2. Data Size and Quality of 3D Map
Reduction ratio of texture resolution.
In this paper, we proposed a method that controls texture resolutions based on their features. By allocating low resolutions to visually unimportant textures, we reduce the size of the data loaded for rendering without much degradation of quality.
The authors are grateful for the support of a Grant-in-Aid for Young Scientists (#14750305) of the Japan Society for the Promotion of Science, funding from MEXT via the Kitakyushu innovative cluster project, and the Kitakyushu IT Open Laboratory.
1. Haala N, Brenner C, Anders K: 3D urban GIS from laser altimeter and 2D map data. Proceedings of the ISPRS Commission III Symposium on Object Recognition and Scene Classification from Multispectral and Multisensor Pixels, July 1998, Columbus, Ohio, USA 339-346.
2. Haala N, Peter M, Kremer J, Hunter G: Mobile LiDAR mapping for 3D point cloud collection in urban areas: a performance test. Proceedings of the 21st International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS '08), July 2008, Beijing, China 37, part B5, Commission 5: 1119ff.
3. Cornelis N, Cornelis K, Van Gool L: Fast compact city modeling for navigation pre-visualization. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '06), June 2006, New York, NY, USA 2: 1339-1344.
4. Pollefeys M, Nistér D, Frahm J-M, et al.: Detailed real-time urban 3D reconstruction from video. International Journal of Computer Vision 2008,78(2-3):143-167. 10.1007/s11263-007-0086-4
5. Hoppe H: Progressive meshes. Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '96), August 1996, New Orleans, La, USA 99-108.
6. Hoppe H: View-dependent refinement of progressive meshes. Proceedings of the 24th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '97), August 1997, Los Angeles, Calif, USA 189-198.
7. Luebke D, Reddy M, Cohen JD, Varshney A, Watson B, Huebner R: Level of Detail for 3D Graphics. Morgan Kaufmann, San Francisco, Calif, USA; 2003.
8. Pajarola R: Large scale terrain visualization using the restricted quadtree triangulation. Proceedings of the IEEE Visualization Conference (Vis '98), October 1998, Research Triangle Park, NC, USA 19-26.
9. Losasso F, Hoppe H: Geometry clipmaps: terrain rendering using nested regular grids. Proceedings of the 31st International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '04), August 2004, Los Angeles, Calif, USA 769-776.
10. Hoppe H: Smooth view-dependent level-of-detail control and its application to terrain rendering. Proceedings of the IEEE Visualization Conference (Vis '98), October 1998, Research Triangle Park, NC, USA 35-42.
11. Cignoni P, Ganovelli F, Gobbetti E, Marton F, Ponchio F, Scopigno R: BDAM - batched dynamic adaptive meshes for high performance terrain visualization. Computer Graphics Forum 2003,22(3):505-514. 10.1111/1467-8659.00698
12. Cignoni P, Ganovelli F, Gobbetti E, Marton F, Ponchio F, Scopigno R: Interactive out-of-core visualization of very large landscapes on commodity graphics platform. Proceedings of the 2nd International Conference on Virtual Storytelling (ICVS '03), November 2003, Toulouse, France 21-29.
13. Döllner J, Buchholz H: Continuous level-of-detail modeling of buildings in 3D city models. Proceedings of the 13th ACM International Workshop on Geographic Information Systems (GIS '05), November 2005, Bremen, Germany 173-181.
14. Hu J, You S, Neumann U: Approaches to large-scale urban modeling. IEEE Computer Graphics and Applications 2003,23(6):62-69. 10.1109/MCG.2003.1242383
15. Takase Y, Yano K, Nakaya T, et al.: Visualization of historical city Kyoto by applying VR and web3D-GIS technologies. Proceedings of the 7th International Symposium on Virtual Reality, Archaeology and Cultural Heritage (VAST '06), October-November 2006, Nicosia, Cyprus.
16. Wang H, Wexler Y, Ofek E, Hoppe H: Factoring repeated content within and among images. ACM Transactions on Graphics 2008,27(3):1-10.
17. Foley JD, van Dam A, Feiner SK, Hughes JF: Computer Graphics: Principles and Practice. Addison-Wesley, Reading, Mass, USA; 1995.
18. Inatsuka H, Uchino M, Okuda M: Level of detail control for texture on 3D maps. Proceedings of 11th International Conference on Parallel and Distributed Systems Workshops (ICPADS '05), July 2005, Fukuoka, Japan 2: 206-209.
19. Wandell BA: Foundations of Vision. Sinauer Associates, Sunderland, Mass, USA; 1995.
20. Long F, Zhang HJ, Feng DD: Fundamentals of content-based image retrieval. In Multimedia Information Retrieval and Management. Edited by: Feng D, Siu W, Zhang H. Springer, Berlin, Germany; 2003:1-26.
21. Flickner M, Sawhney H, Niblack W, et al.: Query by image and video content: the QBIC system. Computer 1995,28(9):23-32. 10.1109/2.410146
22. Perona P, Malik J: Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence 1990,12(7):629-639. 10.1109/34.56205
23. Pass G, Zabih R, Miller J: Comparing images using color coherence vectors. Proceedings of the 4th ACM International Multimedia Conference, November 1996, Boston, Mass, USA 65-73.
24. Zhang R, Zhang ZM: A clustering based approach to efficient image retrieval. Proceedings of the 14th International Conference on Tools with Artificial Intelligence (ICTAI '02), November 2002, Washington, DC, USA 339-346.
25. Goodrum AA: Image information retrieval: an overview of current research. Informing Science 2000,3(2):63-67.
26. Daly S: The visible difference predictor: an algorithm for the assessment of image fidelity. In Digital Image and Human Vision. Edited by: Watson AB. MIT Press, Cambridge, Mass, USA; 1993:179-206.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.