Fast pose estimation for texture-less objects based on B-Rep model


Widespread industrial products, which are usually texture-less, are mainly represented with 3D boundary representation (B-Rep) models for design and manufacturing; hence, pose estimation of texture-less objects based on the B-Rep model deserves study for industrial inspection. Because surfaces are crucial both to the construction of a B-Rep model and to the recognition of real objects, the edges of the visible surfaces in each aspect view of the B-Rep model are first computed, and the edges in a search image containing real B-Rep objects are extracted with a modified Hough algorithm. Secondly, the two edge sets are converted into a metric space for comparison, where each edge is expressed as a tetrad of edge length, angle of the middle point, angle of the perpendicular axis, and length of the perpendicular axis. The pose of a real B-Rep object in a search image is then estimated by comparing the edge set of every aspect view with the edge set of the search image using a bipartite graph matching algorithm. An experiment was conducted with products from the national design reservoir (NDR), and it verified the effectiveness of the B-Rep-based pose estimation approach for texture-less objects.

1 Introduction

Pose estimation of industrial products, which are typically metallic and texture-less, has attracted considerable attention in recent years because of its industrial and augmented reality applications. Matching 3D object features with 2D image features to determine the existence and pose of an object in a scene is the key process in a variety of vision tasks related to object recognition [1,2,3]. The 3D representation of the object is therefore the first problem to solve; for industrial objects, a 3D representation already exists before manufacturing. The 3D mesh model is currently dominant for visualization and display in general; however, industrial products are extensively designed and manufactured according to the boundary representation (B-Rep) model. A B-Rep model consists directly of geometric features such as surfaces and edges, and it generally follows the Standard for Exchange of Product Model Data (STEP) so that it can be shared and exchanged among heterogeneous design and manufacturing platforms.

From a neuropsychological point of view, surfaces are the primary factors of 3D object recognition. 3D shapes are spatial configurations of surface fragments encoded by IT neurons [4]. The ecological psychologist Gibson held that the composition and layout of the perceived surfaces constitute what they afford [5]. Gestalt psychologists try to settle how local discontinuities in motion or depth are evaluated with respect to object boundaries and surfaces, which hints that surfaces and their boundaries are the enhanced and ultimate cognitive elements of objects [6]. Similarly, Marr believed that 3D shape representation describes surface geometry [1]. The surface-based representation of a 3D object is the intermediate stage between the image-based representation and the 3D shape representation [7]. The B-Rep model is surface centered and is usually converted into an adjacent attributed surface graph (AAG) for recognition and analysis, while the edges that make up the surface boundary are the most crucial visual attributes of surfaces. Meanwhile, edges are the most fundamental image features; for instance, they are the first features located, by Gabor filters, in current deep learning mechanisms.

In this paper, we first set up a metric space for edges in order to compare the edges from the aspect views of a B-Rep model with those from the search image, and we propose an algorithm for estimating the pose of a texture-less object in the search image based on the B-Rep model. Because surfaces are the crucial visual and functional features according to ecological psychology and affordance theory [5], and because surfaces are also the core elements of the B-Rep model, surfaces are first extracted from the neutral STEP file of the B-Rep model, and the edges of the surface boundaries are then collected to constitute the feature set of the B-Rep model. Secondly, the B-Rep model is projected to generate a number of aspect views that make up an aspect graph, and the edge set of each aspect view, which corresponds to certain pose parameters, is computed for later matching. In the same way, the edges in the search image containing the real B-Rep object are detected and merged according to the contiguity and continuity rules, so that the search image is characterized by an edge set. The object pose can then be estimated by matching the bipartite graph formed by the edge set of the search image and the edge set of one of the aspect views of the B-Rep model.

In practice, some of the same problems as in [8] need to be solved. The camera is placed on a virtual sphere of constant radius centered at the center of the B-Rep model. Translation invariance and scale invariance are ensured by normalizing the two edge sets in the metric space: the origin of the coordinate system is at the model center, and all B-Rep models lie within the unit sphere. Rotation invariance follows from the edge-pairwise comparison in the bipartite graph matching, because the best bipartite matching is unique regardless of edge orientations.

Another problem is that the object may be off-center, that is, it does not appear in the center of the search image. Since this problem is not related to the aspect views, it can be solved by partially matching the aspect views with the search image. This partial matching also addresses the partial occlusion problem.

One main contribution of this paper is that the aspect graph of a B-Rep model provides more accurate and faster alignment references, because edges are inherent and direct geometric features of the B-Rep model. Edge comparison is kept simple by converting the two edge sets into a four-dimensional metric space. Moreover, bipartite graph matching compares the two edge sets more completely and accurately than template matching.

The rest of the paper is organized as follows. Section 2 gives the related literature. In Section 3, the metric space for edge comparison and bipartite graph matching algorithm are set up. The surface attributes of B-Rep model and the characteristics of projected surface edges in aspect views are analyzed in Section 4. In Section 5, the edges in the search image containing the real B-Rep object are detected and simplified. The object pose estimation based on the edge bipartite graph matching algorithm is described in Section 6. The experiment results are discussed in Section 7.

2 Related work

Approaches to 3D object recognition in a single image have been extensively studied. Reference [2] gave an overview of object recognition covering passive and active approaches, and suggested that intriguing evidence from neuroscience motivates a radical rethinking of the solution to 3D object recognition. It indicated that detectors paid more attention to shape properties than to color or texture properties, for example, local shape features, medial axes or skeletons, Fourier descriptors, and edge direction histograms. In addition, chains of k connected, approximately straight boundary segments were computed from the edges in outdoor images, as they simulate certain characteristics of the human visual system [9]. Coarse and refined object recognition were performed with SIFT features of interest points in images [10]. Other common shape features in CBIR systems included bounding ellipses, curvature scale-space, elastic models, and edge direction histograms [11]. Fergus et al. detected all curves with the Canny edge operator and split each curve into independent segments at its bi-tangent points to obtain the feature vector of the curve [12]. The singularities, or shocks, in the medial axes of shape outlines were used to segment the skeleton of an object into a tree-like structure called a shock graph [13]. Other shape detectors include a log-polar histogram of points on the object boundary [14] and the orientations and principal curvatures of visible patches [15]. Unlike the above, part-based approaches provide high-level volumetric parts such as generalized cylinders and super-quadrics to reduce the search space [16, 17]. Although methods based on descriptors of feature points can decrease run-time computational complexity, they are not suitable for shiny metal surfaces [18].

Nonetheless, the aforementioned methods were not specifically devised for the detection of texture-less objects. Current texture-less object detectors mainly rely on edge/gradient-based template matching [19, 20], BOLD features [21], gradient orientations [22], lines [23], and curves [24]. Some other texture-less detectors also use depth information from RGB-D data [25, 26]. In combination with these detectors, the search space of aspect views is reduced by using prior knowledge [27] or a scale-space hierarchical model [8]. A purely edge-based method was presented for real-time scalable detection of texture-less objects in 2D images [28]. A regularized, auto-context regression framework iteratively reduces the uncertainty in object coordinates and detects multiple objects in a single RGB image [29]. In [30], 3D objects were detected and their poses were estimated from color images only.

In this paper, the edges are selected directly from the STEP file of the B-Rep model and then converted into the edge sets of the aspect views in the metric space, which serve as accurate references. Meanwhile, the edge set of the search image is extracted by using a modified Hough transformation. The two edge sets are then compared with a bipartite graph matching algorithm to estimate the object pose.

3 Metric space

We set up a metric space for edges to compare the edge sets from the aspect views of the B-Rep model and from the search image. Four properties of an edge, called an edge tetrad, are used to evaluate the similarity of two edges (see Fig. 1); together they uniquely determine a line section. In this four-dimensional metric space, the similarity distance satisfies the triangle inequality. An edge tetrad is (l, alpha, theta, r), where l is the edge length, alpha is the angle of the middle point, theta is the angle of the perpendicular axis to the edge, and r is the length of the perpendicular axis.

Fig. 1 An edge metric space
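As an illustration of the edge tetrad, the following minimal sketch computes (l, alpha, theta, r) from the two endpoints of a line section. The function name `edge_tetrad` and the angle convention (angles measured from the x-axis about the origin of the normalized coordinate system, with the perpendicular dropped from the origin onto the line carrying the edge) are assumptions made for this example; the text does not fix them.

```python
import math

def edge_tetrad(p, q):
    """Return (l, alpha, theta, r) for the line section from p to q.

    Assumed convention (not fixed by the text): all angles are measured
    from the x-axis about the origin of the coordinate system.
    """
    (x1, y1), (x2, y2) = p, q
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("degenerate edge: both endpoints coincide")
    # alpha: polar angle of the middle point of the edge.
    mx, my = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    alpha = math.atan2(my, mx)
    # Foot of the perpendicular dropped from the origin onto the carrier line.
    t = -(x1 * dx + y1 * dy) / (length * length)
    fx, fy = x1 + t * dx, y1 + t * dy
    r = math.hypot(fx, fy)
    if r > 1e-12:
        theta = math.atan2(fy, fx)
    else:
        # The line passes through the origin: use the direction of its normal.
        theta = math.atan2(dx, -dy)
    return (length, alpha, theta, r)
```

For example, the vertical section from (1, -1) to (1, 1) has length 2, a middle point on the x-axis, and a perpendicular of length 1 along the x-axis, so its tetrad is (2, 0, 0, 1).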

Meanwhile, the bipartite graph matching algorithm, which is itself rotation invariant, is used in the tetrad metric space to compare the two edge sets from the aspect views of the B-Rep model and the search image. The two sets also need to be normalized beforehand in the metric space for translation and scale invariance. To simplify the calculation, the curved edges of surface boundaries in the B-Rep model are segmented and approximated by a series of line sections. When the surfaces of the 3D B-Rep model are projected into 2D views, each aspect view consists of line sections represented by tetrads. Likewise, the edge tetrads of the search image are extracted with image processing algorithms such as the combination of the Canny operator and the Hough transformation.
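The normalization step can be sketched as follows. The text does not specify the exact normalization, so this example assumes one plausible choice: the centroid of all edge endpoints is moved to the origin, and the coordinates are scaled so that the farthest endpoint lies on the unit circle. The name `normalize_edges` is hypothetical.

```python
import math

def normalize_edges(edges):
    """Translate and scale a list of edges ((x1, y1), (x2, y2)).

    Assumption for illustration: the centroid of all endpoints becomes the
    origin and the farthest endpoint ends up at distance 1 from it.
    """
    points = [pt for edge in edges for pt in edge]
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radius = max(math.hypot(x - cx, y - cy) for x, y in points)
    if radius == 0.0:
        raise ValueError("all endpoints coincide; cannot normalize")

    def norm(pt):
        return ((pt[0] - cx) / radius, (pt[1] - cy) / radius)

    return [(norm(p), norm(q)) for p, q in edges]
```

Under this assumption, an edge set and a shifted, uniformly scaled copy of it normalize to the same coordinates, which is the property the comparison relies on.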

During the bipartite graph matching, the distance matrix A needs to be calculated in advance. Each entry of the matrix is the distance between a pair of edge tetrads, one from an aspect view and the other from the search image.

$$ {\displaystyle \begin{array}{l}A=\left[\mathrm{cr}\left({s}_i^{\prime },{s}_j^{\prime \prime}\right)\right],\\ {}\mathrm{cr}\left({s}_i^{\prime },{s}_j^{\prime \prime}\right)=D=\sqrt{\sum_k{\left({x}_k-{y}_k\right)}^2},\quad {x}_k,{y}_k\in \left(l,\alpha, \theta, r\right).\end{array}} $$
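The distance matrix above can be sketched in a few lines. The function names are hypothetical, and the tetrads are assumed to be plain 4-tuples (l, alpha, theta, r) that have already been normalized. As in the formula, angle differences are taken directly, without any wrap-around at 2π.

```python
import math

def tetrad_distance(a, b):
    """Euclidean distance between two edge tetrads (l, alpha, theta, r)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(view_tetrads, image_tetrads):
    """A[i][j] is the distance between view edge i and image edge j."""
    return [[tetrad_distance(s, t) for t in image_tetrads]
            for s in view_tetrads]
```

With one view edge (2, 0, 0, 1) and two image edges (2, 0, 0, 1) and (2, 0, 0, 4), the matrix has a single row with the entries 0 and 3.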

This measure is robust against occlusion and clutter because, if some features are missing in either the aspect view of the B-Rep model or the search image, the affected edges are matched at random with idle elements or noise edges, which only contributes to a larger distance sum overall. To make the estimation robust, the matched edge pairs in the bipartite graph are divided into two groups: one group, called the nearest edge pairs, contains the pairs with the smallest distances, and the other contains the pairs with larger distances. The group with the smallest distances gives the more precise correspondences between the aspect view of the B-Rep model and the search image, which helps to pick out the edges that are not affected by occlusion or clutter, as detailed in Section 6.

4 Method—the edge set of an aspect view of B-Rep model

B-Rep models are designed and stored in proprietary file formats on different CAD platforms, and these files are generally converted into STEP files for transfer, sharing, exchange, and display in heterogeneous CAD environments. In a STEP file, a B-Rep model is organized hierarchically: solid, topological shell, topological advanced face, geometrical surface or polygon, topological loop, topological oriented edge, topological edge curve, geometrical curve or line, topological vertex point, and Cartesian point. The surfaces of industrial products carry a variety of functional semantics and reflect design intent, so they are naturally the core elements of the B-Rep model, which is transformed into an adjacent attributed surface graph for easy analysis. Surface attributes are mainly expressed by the boundary edges. The surfaces in a B-Rep model are divided into three types: free surfaces, elementary surfaces, and polygons. Non-uniform rational B-spline (NURBS) is the only mathematical approach used to define free surfaces in STEP. Elementary surfaces include conical surfaces, spheres, tori, and so on. A polygon is a planar loop closed by curves or lines, in which the curves are NURBS curves or elementary curves.

The boundary of any surface type is closed by curves and lines interconnected at their endpoints. A curve is further segmented into sections at inflection points or extreme curvature points, such as points of maximum curvature and zero curvature [31], and each curve section is approximated by a straight line, as shown in Fig. 2. After the B-Rep model is projected into 2D aspect views, with visibility decided from the angle between the projection direction and the surface normal, the line sections of the visible surface boundaries are collected as an edge set and converted into the metric space for later comparison.

Fig. 2 Segment edges of a surface boundary

An aspect graph is a series of projection views of a 3D object taken in certain directions from a virtual sphere centered at the model center. It contains the stable views of the model and the transitions from one stable view to another. An aspect view consists of the surfaces visible along the projection direction. The aspect graph is converted off-line into a reference that stores the pose parameters of the B-Rep model, as shown in Fig. 3.

Fig. 3 An aspect graph

Suppose \( {s}_i^{\prime } \) is the projection of a visible edge \( {s}_i \) of the B-Rep model and M is the projection transformation matrix; then \( {s}_i^{\prime }={Ms}_i \).

An aspect graph is produced at evenly spaced intervals of longitude and latitude on the view sphere. Because the edges from the aspect views and the search image are normalized, the radial degree of freedom need not be considered. A higher aspect density helps to obtain a more refined pose, but it increases the computational complexity of matching. Although we reduce the aspect density with hierarchical views [8], the intervals of the longitude and latitude angles are not optimized here.
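A minimal sketch of the view sampling and projection is given below. It assumes an orthographic projection onto the plane perpendicular to the view direction, which is only one possible choice for the matrix M; the function names and the exclusion of the two poles from the latitude sampling are likewise assumptions made for this example.

```python
import math

def view_directions(n_longitude, n_latitude):
    """Evenly spaced (longitude, latitude) samples on the view sphere.

    Returns a list of (lon, lat, unit_direction); the poles are excluded.
    """
    views = []
    for i in range(n_longitude):
        lon = 2.0 * math.pi * i / n_longitude
        for j in range(1, n_latitude + 1):
            lat = -math.pi / 2.0 + math.pi * j / (n_latitude + 1)
            direction = (math.cos(lat) * math.cos(lon),
                         math.cos(lat) * math.sin(lon),
                         math.sin(lat))
            views.append((lon, lat, direction))
    return views

def project_point(point, lon, lat):
    """Orthographic projection of a 3D point for the view (lon, lat).

    The image axes are the 'right' and 'up' vectors that are orthogonal to
    the view direction and to each other.
    """
    x, y, z = point
    u = -x * math.sin(lon) + y * math.cos(lon)
    v = (-x * math.sin(lat) * math.cos(lon)
         - y * math.sin(lat) * math.sin(lon)
         + z * math.cos(lat))
    return (u, v)
```

Projecting both endpoints of a model edge with `project_point` yields the 2D line section for that aspect view; points along the view direction itself collapse to the image origin.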

Compared with a mesh model, it is difficult to remove precisely all hidden lines of the large surfaces in a B-Rep model. To reduce the complexity of the projection computation, we simply determine projective visibility from the surface normal, which is computed from three random points on each surface boundary.
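The visibility test can be illustrated with a simple back-face check. This sketch assumes that the three boundary points are ordered counter-clockwise when seen from outside the solid, so that the cross product points outward, and that `view_direction` points from the model center toward the camera; both are assumptions of the example, and the function names are hypothetical.

```python
def surface_normal(p0, p1, p2):
    """Unnormalized normal of the plane through three boundary points."""
    ax, ay, az = p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2]
    bx, by, bz = p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2]
    normal = (ay * bz - az * by, az * bx - ax * bz, ax * by - ay * bx)
    if normal == (0.0, 0.0, 0.0):
        raise ValueError("the three points are collinear")
    return normal

def is_visible(p0, p1, p2, view_direction):
    """True if the surface faces the camera (positive dot product)."""
    nx, ny, nz = surface_normal(p0, p1, p2)
    dx, dy, dz = view_direction
    return nx * dx + ny * dy + nz * dz > 0.0
```

This check only discards surfaces that face away from the camera; it does not detect surfaces hidden behind other surfaces, which matches the simplification described above.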

5 The edge set of a search image

Edges in an image show a prominent change of gray values in certain directions, and they carry abundant information at low computational cost. Surface boundaries, as essential visual features, are closed by straight lines, convex curves, and concave curves. The projection of a straight line section is again a line section or degenerates to a point, and the projection of a convex or concave curve keeps its shape or degenerates to a straight line in the 2D projection space. Therefore, it is crucial to detect lines and curves in the search image in order to match the edges of the aspect view of the B-Rep model. When short convex and concave curves are approximated by line sections by extending the line width (the number of pixels across the line), both line edges and curve edges in the search image can be detected with the generalized Hough algorithm.

Sometimes a single line is detected as disconnected pieces although the endpoints are very close; in other cases, several endpoints are densely clustered, and these two types of points overlap in some regions. According to the contiguity and continuity rules, such points need to be merged into one point by nearest neighbor clustering in the metric space.
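The endpoint merging can be sketched with a simple greedy nearest neighbor clustering. The exact clustering rule and the threshold value are not given in the text, so the function `merge_endpoints` and its `threshold` parameter are assumptions for illustration: each point joins the nearest existing cluster if the cluster center is within the threshold, and otherwise starts a new cluster.

```python
import math

def merge_endpoints(points, threshold):
    """Greedy nearest neighbor clustering of 2D endpoints.

    Returns (centers, labels), where labels[i] is the index of the cluster
    that points[i] was merged into and centers are the cluster means.
    """
    clusters = []  # each entry: [sum_x, sum_y, count]
    labels = []
    for x, y in points:
        best, best_dist = None, threshold
        for k, (sx, sy, n) in enumerate(clusters):
            dist = math.hypot(x - sx / n, y - sy / n)
            if dist <= best_dist:
                best, best_dist = k, dist
        if best is None:
            clusters.append([x, y, 1])
            labels.append(len(clusters) - 1)
        else:
            clusters[best][0] += x
            clusters[best][1] += y
            clusters[best][2] += 1
            labels.append(best)
    centers = [(sx / n, sy / n) for sx, sy, n in clusters]
    return centers, labels
```

Because the clustering is greedy, the result can depend on the order of the points; this is acceptable for a sketch but may matter for densely packed endpoints.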

In the more frequent case that camera calibration parameters such as the focal length are unknown, image normalization is used to eliminate the scale difference and reduce the projection distortion. Because the B-Rep object is usually not in the center of the image, we crop the image by shifting and scaling and then normalize each of the nine resulting sections. The nine sections are matched separately against a moving aspect view of the B-Rep model, and the sum of the match distances is taken as the final comparison result. The pose accuracy depends on the number of sections and their combination; here, just nine cases are listed in Fig. 4. The nine sections are produced gradually by shifting toward the center.

Fig. 4 Cropping images for adjusting off-center

6 The pose estimation by matching the bipartite graph

Suppose the edge set of an aspect view of the B-Rep model is \( {S}^{\prime }=\left\{{s}_1^{\prime },{s}_2^{\prime },\dots, {s}_{i-1}^{\prime },{s}_i^{\prime },\dots, {s}_n^{\prime}\right\} \), where \( {s}_i^{\prime } \) is the projection of edge \( {s}_i \) of the B-Rep model, and the edge set of the search image possibly containing the B-Rep object is \( {S}^{{\prime\prime} }=\left\{{s}_1^{\prime \prime },{s}_2^{\prime \prime },\dots, {s}_{j-1}^{\prime \prime },{s}_j^{\prime \prime },\dots, {s}_m^{\prime \prime}\right\},m\ge n \). The Euclidean distances between \( {s}_i^{\prime } \) and \( {s}_j^{\prime \prime } \) form the correlation matrix A of \( {S}^{\prime } \) and \( {S}^{{\prime\prime} } \), \( A=\left[\mathrm{cr}\left({s}_i^{\prime },{s}_j^{\prime \prime}\right)\right] \), which is used for matching in the next step.

The two sets from an aspect view and the search image form an incomplete bipartite graph, and the mapping between them can be optimized by a bipartite graph matching algorithm, so that the minimum sum of distances between the two edge sets indicates the best match. To simplify the calculation, empty elements ϕ are first inserted into the smaller set so that the two sets have an equal number of elements; computing the minimum distance sum of edge pairs is then an optimal bipartite graph matching problem, as shown in Fig. 5. The element number m determines the computational complexity: the Kuhn-Munkres algorithm [32], a popular optimal bipartite graph matching algorithm, runs in O(m³) time in its standard implementation.

Fig. 5 Bipartite graph matching
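The padding and matching steps can be sketched as follows. The code uses a widely known compact formulation of the Kuhn-Munkres (Hungarian) algorithm based on row and column potentials; it is an illustration and not necessarily the implementation used in the experiments. The helper names `pad_square` and `kuhn_munkres` are hypothetical, and the default empty distance equal to the matrix maximum follows the initial choice described in the text.

```python
def pad_square(cost, empty_distance=None):
    """Pad a rectangular distance matrix with empty elements to make it square."""
    n, m = len(cost), len(cost[0])
    if empty_distance is None:
        empty_distance = max(max(row) for row in cost)
    size = max(n, m)
    padded = [list(row) + [empty_distance] * (size - m) for row in cost]
    padded += [[empty_distance] * size for _ in range(size - n)]
    return padded

def kuhn_munkres(cost):
    """Minimum-cost assignment; returns (total, assignment).

    assignment[i] is the column matched to row i. Requires rows <= columns.
    """
    n, m = len(cost), len(cost[0])
    if n > m:
        raise ValueError("expected at most as many rows as columns")
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials
    v = [0.0] * (m + 1)      # column potentials
    p = [0] * (m + 1)        # p[j]: row currently matched to column j
    way = [0] * (m + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0 != 0:       # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return total, assignment
```

A typical use is `total, assignment = kuhn_munkres(pad_square(A))`, where `A` is the distance matrix between the view edges and the image edges; pairs that involve a padded row or column correspond to matches with the empty element.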

In these cases, the pose estimation of B-Rep object in the search image can be realized, and the algorithm is as follows:

  1. Extract the surfaces and the edges from the STEP file of the B-Rep model in the design database.

  2. Build an aspect graph by projecting the B-Rep model, and compute the edge set of each aspect view based on the surfaces visible along the projection direction.

  3. Normalize the edge sets of the aspect views and convert them into the metric space.

  4. Segment the search image and extract its edge set with the Canny and Hough detectors.

  5. Normalize the edge set of the search image and convert it into the metric space.

  6. Compute the distance matrix between the two sets in the metric space, and then compute the minimum distance sum and the matched edges with a bipartite graph matching algorithm, i.e., Kuhn-Munkres.

  7. Divide the matched edges into two groups.

  8. Estimate the pose according to the minimum distance sum and the ratio of the nearest edge pairs.
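The last two steps can be sketched as follows. The text does not state how the distance sum and the ratio of nearest edge pairs are combined, so this example assumes one simple rule: views are ranked first by a higher ratio of nearest edge pairs and then by a lower distance sum. The threshold `d_min`, the function names, and the input layout (a dictionary from pose parameters to the distances of the matched pairs, assumed non-empty) are all assumptions for illustration.

```python
def split_matches(pair_distances, d_min):
    """Divide matched pair distances into nearest pairs and the rest."""
    nearest = [d for d in pair_distances if d <= d_min]
    others = [d for d in pair_distances if d > d_min]
    return nearest, others

def score_view(pair_distances, d_min):
    """Return (distance sum, ratio of nearest edge pairs) for one view."""
    nearest, _ = split_matches(pair_distances, d_min)
    return sum(pair_distances), len(nearest) / len(pair_distances)

def estimate_pose(matches_by_pose, d_min):
    """Pick the pose whose view has the best score.

    Assumed ranking: higher nearest-pair ratio first, then lower sum.
    """
    def rank(item):
        total, ratio = score_view(item[1], d_min)
        return (-ratio, total)

    return min(matches_by_pose.items(), key=rank)[0]
```

For instance, with `d_min = 0.5`, a view whose matched distances are 0.1, 0.2, and 3.0 has two nearest pairs out of three and is preferred over a view with distances 0.1, 2.0, and 3.0.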

The matching accuracy is affected by several parameters. The number of edges in an aspect view is usually smaller than that in the search image, and it is difficult to specify the distance between an inserted empty element and an edge. Initially, the empty distance is set equal to the maximum value in the distance matrix; however, if the specified distance is too high, some matches with the minimum distance may be overridden. Occlusion poses a similar problem to the empty distance, because some inherent points are lost in the search image. Under occlusion, the range around the minimum distance that defines the nearest edge pairs needs to be evaluated. Likewise, the threshold of the aforementioned nearest neighbor clustering needs to be specified from experience so that endpoints are merged well.

In view of these conditions, a learning mechanism is proposed to determine these parameters. The aspect graphs and the search images of a training set are taken as input, the specified distances are produced as output, and the parameters are adjusted iteratively. The training set includes simple geometric shapes such as cubes and cylinders, as shown in Fig. 6.

Fig. 6 The learning mechanism of pose estimation

In the learning mechanism, the parameters are adjusted by using regular 3D models such as boxes and cylinders.

7 Results and discussion

7.1 Accuracy

In general, more complicated objects have more edges; consequently, the algorithm needs more runtime. The online runtime is reduced by parallel computation, in which different groups of aspect views are distributed over different threads, and by preparing all aspect views offline.

The approach is evaluated in various cases: differently shaped objects (rounded and sharp brackets), different densities of aspect graphs, different focal lengths, and different parameters before and after learning. Suppose the longitude angle is σ, the latitude angle is τ, the estimated pose is ep, and the true pose is tp; the pose error rate er is defined to measure the accuracy of pose estimation:

$$ {\displaystyle \begin{array}{l} ep=\left({\sigma}^{\prime },{\tau}^{\prime}\right), tp=\left(\sigma, \tau \right),\\ {} er=\left(1+\left( ep\cdot tp\right)/\left(\left| ep\right|\left| tp\right|\right)\right)/2.\end{array}} $$

Ten different poses of each object are randomly selected to compute the error rates.
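The formula above can be evaluated directly, as in the following minimal sketch. The function name is hypothetical, and the poses are assumed to be pairs (σ, τ) treated as 2D vectors, exactly as the formula is printed. With the sign as printed, the value is 1 when the two angle pairs are parallel and 0 when they are antiparallel.

```python
import math

def pose_error_rate(ep, tp):
    """Evaluate er = (1 + (ep . tp) / (|ep| |tp|)) / 2 as printed above."""
    dot = ep[0] * tp[0] + ep[1] * tp[1]
    norm = math.hypot(ep[0], ep[1]) * math.hypot(tp[0], tp[1])
    if norm == 0.0:
        raise ValueError("a pose with both angles equal to zero has no direction")
    return (1.0 + dot / norm) / 2.0
```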

  1) Rounded and sharp brackets

    Because curves are segmented into line sections in the B-Rep model, the aspect views used as references were little affected by rounded shapes. However, the edge set of the search image fluctuated greatly because of the curved edges, even though the generalized Hough transformation had been tuned well, as shown in Fig. 7. In contrast, the edges of the B-Rep model are ready-made and determined by its intrinsic properties, so they are constant in any projection direction.

  2) The focal length

Fig. 7 Error rates about rounded and sharp shapes

It can be deduced that the focal length difference Δd causes a deviation in the normalized aspect views according to the formula:

$$ {\displaystyle \begin{array}{l}{x}^{\prime }= xd/\left(z+d\right),{y}^{\prime }= yd/\left(z+d\right),\\ {}z=0,\dot{x^{\prime }}= xz,\dot{y^{\prime }}= yz.\end{array}} $$

Thus the shape distortion is not related to the change of focal length.

  3) Different densities of aspect graphs

The intervals of the longitude and latitude angles are split evenly here. The higher the density, the longer the runtime but the lower the pose error rate, as seen in Fig. 8. The values also differ according to the complexity of the object, such as cubes and brackets.

  4) Before and after learning

Fig. 8 Accuracy and runtime as to the aspect density

Without occlusion, adjusting parameters such as dmax and dmin, which are the thresholds of the maximum and minimum distance between the edges of the two edge sets, reduces the error rate, for instance, by at least 15% at a density of 24.

7.2 Robustness

The robustness to occlusion and clutter is inspected in Fig. 9, and the brackets could be correctly found. A rounded object and a sharp object are picked according to five sequences in which the objects were randomly occluded by 0, 10%, 20%, 30%, 40%, and 50%, respectively. For each sequence, the images were randomly added with some other shapes such as pipes and boxes.

Fig. 9 Error rates about occlusion and clutter

Figure 9 shows the pose error rates with respect to the amount of clutter and occlusion. It can be seen that the error rate of rounded objects increases with high occlusion, and the rate of sharp objects is not significantly influenced by the amount of clutter and occlusion.

The parameters need to be adjusted, including the coefficient of the empty distance based on the maximum distance, the coefficient of non-occlusion distance based on the minimum distance, and the gap in Canny and Hough detectors. The parameters before and after learning affect the error rate in Fig. 10.

Fig. 10 Error rates affected by the learning mechanism

7.3 Limitations

B-Rep model provides our approach with the precise reference to determine the pose, which simultaneously reduces the extracting runtime from the projecting views of 3D model, and the edge comparison can be well expressed in the metric space toward either the aspect views of B-Rep model or the search image.

The major limitation of our approach is that the algorithm for extracting edges from the search image is poorly suited to rounded shapes, because arcs are replaced with lines; as a result, the accuracy for such objects is worse than for objects with sharp edges.

8 Conclusions

Texture-less industrial products are represented as B-Rep models in design and manufacturing. In this paper, the edges and surfaces of the B-Rep model are selected from STEP files, the aspect views of the B-Rep model are obtained by projection transformation, and the edge sets of the aspect views are represented in the metric space as accurate references. Similarly, the edges are detected in the search image containing the B-Rep object by the Hough transformation, and the resulting edge set is also represented in the metric space. The pose of a texture-less industrial object is estimated by matching the bipartite graph of the two edge sets from the search image and the aspect views. The experiments confirm that our approach is feasible, though some problems remain to be solved. The accuracy of pose estimation needs to be improved by optimizing the above extracting and matching algorithms; in addition, an industrial object benchmark based on the B-Rep model needs to be established in order to check the effectiveness and efficiency of the related algorithms.



AAG: Adjacent attributed surface graph

B-Rep: Boundary representation

NDR: National design reservoir

NURBS: Non-uniform rational B-spline

STEP: Standard for exchange of product model data


  1. D. Marr, in Vision. A computational investigation into the human representation and processing of visual information (MIT Press, Cambridge, 1994), pp. 107–111

  2. A. Andreopoulos, J.K. Tsotsos, 50 years of object recognition: directions forward. Comput. Vis. Image Underst. 117, 827–891 (2013)

  3. L. G. Roberts, “Machine perception of three-dimensional solids,” Ph.D. Dissertation, Dept. Elect. Eng., Ma. Inst. Tech., MA, USA. (1963).

  4. Y. Yamane, E.T. Carlson, K.C. Bowman, Z. Wang, C.E. Connor, A neural code for three-dimensional object shape in macaque inferotemporal cortex. Nature Neuroscience 11(11), 1352–1360 (2008)

  5. J.J. Gibson, An ecological approach to visual perception. The American Journal of Psychology 102(4), 443–476 (1979)

  6. I. Kovács, Gestalten of today: early processing of visual contours and surfaces. Behavioural Brain Research 82(1), 1–11 (1996)

  7. H. Sakata, K.I. Tsutsui, M. Taira, Toward an understanding of the neural processing for 3D shape perception. Neuropsychologia 43(2), 151–161 (2005)

  8. M. Ulrich, C. Wiedemann, C. Steger, Combining scale-space and similarity-based aspect graphs for fast 3D object recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(10), 1902–1914 (2012)

  9. V. Ferrari, L. Fevrier, F. Jurie, C. Schmid, Groups of adjacent contour segments for object detection. IEEE Transactions on Pattern Analysis and Machine Intelligence 30(1), 36–51 (2007)

  10. J. Ma, T.H. Chung, J. Burdick, A probabilistic framework for object search with 6-DOF pose estimation. International Journal of Robotics Research 30(10), 1209–1228 (2011)

  11. R. Datta, D. Joshi, J. Li, J.Z. Wang, Image retrieval: ideas, influences, and trends of the new age. ACM Computing Surveys 40(2), 5–60 (2008)

  12. R. Fergus, P. Perona, and A. Zisserman, “A visual category filter for Google images,” in : Proc. ECCV, Amsterdam, pp. 242–256 (2004).

  13. K. Siddiqi, A. Shokoufandeh, S.J. Dickinson, S.W. Zucker, Shock graphs and shape matching. International Journal of Computer Vision 35(1), 13–32 (1999)

  14. S.J. Belongie, J. Malik, J. Puzicha, Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(4), 509–522 (2010)

  15. T.J. Fan, Describing and recognizing 3-D objects using surface properties (Springer, New York, 1990), pp. 55–72

  16. I. Biederman, Recognition-by-components: a theory of human image understanding. Psychological Review 94(2), 115–147 (Apr. 1987)

  17. R. Nevatia, T.O. Binford, Description and recognition of curved objects. Artificial Intelligence 8(1), 77–98 (1977)

  18. V. Lepetit, Key point recognition using randomized trees. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(9), 1465–1479 (2006)

  19. J. Chan, J. A. Lee, and K. M. Qian, “BORDER: An oriented rectangle approach to texture-less object recognition,” in Proc. CVPR. pp. 2855–2863 (2016).

  20. E. Munoz, Y. Konishi, V. Murino, and A. D. Bue. “Fast 6D pose estimation for texture-less objects from a single RGB image,” in Proc. ICRA, Stockholm, Sweden pp. 5623–5630 (2016).

  21. F. Tombari, A. Franchi, and L. D. Stefano, “BOLD features to detect texture-less objects,” in Proc. ICCV, Sydney, Australia, 2013, pp. 1265–1272

  22. S. Hinterstoisser, C. Cagniart, P.S. Ilic, N.N. Sturm, P. Fua, V. Lepetit, Gradient response maps for real-time detection of textureless objects. IEEE Transactions on Pattern Analysis and Machine Intelligence 34(5), 876–888 (2012)

  23. P. David and D. DeMenthon, Object recognition in high clutter images using line features, in Proc. 10th Computer Vision, Beijing, China, pp. 1581–1588 (2005).

  24. B. Vijayakumar, D. Kriegman, J. Ponce, Invariant-based recognition of complex curved 3D objects from image contours. Computer Vision and Image Understanding 72(3), 287–303 (1998)

  25. S. Li, S. Koo, D. Lee, Real-time and model-free object tracking using particle filter with joint color-spatial descriptor, in Proc. IROS, pp. 6079–6085 (2015)

  26. C. Choi, H.I. Christensen, 3D pose estimation of daily objects using an RGB-D camera, in Proc. IROS, Vilamoura, Portugal, pp. 3342–3349 (2012)

  27. C.V. Bank, D.M. Gavrila, C. Wohler, A visual quality inspection system based on a hierarchical 3D pose estimation algorithm, in Proc. DAGM, Magdeburg, Germany, pp. 179–186 (2003)

  28. T. Hodaň et al., Efficient texture-less object detection for augmented reality guidance, in 2015 IEEE International Symposium on Mixed and Augmented Reality Workshops (ISMARW) (2015)

  29. E. Brachmann et al., Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image, in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016)

  30. M. Rad, V. Lepetit, BB8: a scalable, accurate, robust to partial occlusion method for predicting the 3D poses of challenging objects without using depth, in International Conference on Computer Vision (2017)

  31. H. Asada, M. Brady, The curvature primal sketch. IEEE Transactions on Pattern Analysis and Machine Intelligence 8(1), 2–14 (1986)

  32. C. Gary, A first course in graph theory (Dover Publications, New York, 2012), pp. 65–70



Acknowledgements

The authors thank the editor and anonymous reviewers for their helpful comments and valuable suggestions. We would also like to acknowledge all our team members, especially Yan Wei.

About the authors

Jihua Wang was born in Yantai, Shandong Province, China, in 1966. He received the B.E. degree in engine engineering from Tsinghua University, Beijing, in 1990, the M.E. degree in management science from Shandong University, Jinan, in 2005, and the Ph.D. degree in management science from Shandong Normal University, Jinan, in 2009.

From 1990 to 2002, he was a senior engineer with China National Truck Corp. Since 2010, he has been a Professor with the College of Information Science and Engineering, Shandong Normal University. He is the author of more than 30 articles and more than 5 inventions. His research interests include visual computation, computer graphics, shape recognition, CAD, intelligent design, and design ontology.

Wei Yan was born in Qufu, Shandong, China, in 1984. She received the Ph.D. degree in computer science from the University of Strasbourg, France, in 2014. Since 2014, she has been an Assistant Professor with the School of Information Science and Engineering, Shandong Normal University. She has published 27 papers, including 8 peer-reviewed journal papers and 19 peer-reviewed conference papers. Her research interests include knowledge engineering, semantic similarity, ontology modeling and inference, and inventive design.


Funding

This research was supported by the National Natural Science Foundation of China (61472233) and the Natural Science Foundation of Shandong Province (ZR2014FM018).

Availability of data and materials

The data are available from the authors on request.

Author information

Authors and Affiliations



Contributions

All authors took part in the discussion of the work described in this paper. The first author contributed more to this work.

Corresponding author

Correspondence to Jihua Wang.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Wang, J., Yan, W. Fast pose estimation for texture-less objects based on B-Rep model. J Image Video Proc. 2018, 117 (2018).
