
Metal stamping character recognition algorithm based on multi-directional illumination image fusion enhancement technology


Automatic identification of metal stamping characters (MSCs) plays an important role in industrial automation. To improve the accuracy and stability of segmentation and recognition for MSCs, an algorithm based on multi-directional illumination image fusion technology is proposed. First, four grayscale images are taken with four bar-shaped directional light sources from different directions. Second, based on the difference in surface grayscale characteristics between the stamped depression regions and the flat regions under the different illumination directions, the image background is extracted and eliminated. Third, the images are fused by difference processing on the two pairs of images with opposite illumination directions. Fourth, mean filtering, binarization, and morphological closing operations are performed on the fused image to locate and segment the character string in the image, and the characters are normalized by correcting the skew of the segmented character string. Finally, histogram of oriented gradient features and a backpropagation neural network algorithm are employed to identify the normalized characters. Experimental results show that the algorithm can effectively eliminate the interference of factors such as oil stains, rust, oxide, shot-blasting pits, and different background colors and enhance the contrast between MSCs and background. The resulting character recognition rate can reach 99.6%.


Characters are one of the main means of identifying, recording, and storing information. Metal stamping characters (MSCs) are widely used to identify industrial products because they are hard to alter and are permanently preserved. High-quality automated character recognition on industrial products is highly desirable in their manufacture and periodic inspection, where inspection is performed at various production stages. The earlier approach of relying on human inspectors misses a considerable number of defects because humans are poorly suited to such simple and repetitive tasks. Automated vision inspection is a good alternative that reduces human workload and labor costs while improving inspection accuracy and throughput. Unfortunately, the appearance of MSCs changes over time and varies through the manufacturing process flow. For example, spray painting and similar processing cause the color of the characters to vary from process to process. Because the color of an MSC is generally similar to that of its background, the contrast between the two regions is very low. In addition, annealing, incineration, and other processes, as well as long-term stacking or service, produce oxide scale or rust on the metal surface that further reduces the contrast between the characters and the background; such characters are sometimes hard to distinguish even with the human eye. Finally, hydraulic oil stains, shot blasting, welding slag spatter, electrostatically adsorbed iron powder, and other artifacts of manufacturing and service also obscure the stamped characters and reduce image quality. Traditional optical character recognition (OCR) techniques, such as text recognition [1,2,3], license plate recognition [4,5,6], and ID card identification [7, 8], have been developed for decades and have achieved great success in commercial systems. However, most of them are designed for high-quality images and often fail to produce good results when applied to MSCs. To keep pace with the development and adoption of intelligent manufacturing technology, achieving OCR for MSCs is particularly important.

Studies on OCR technology mainly focus on two aspects: character segmentation and character recognition. Scene OCR with deep learning algorithms is one of the most important research areas in computer vision and has been studied for many years with a range of successful applications. There is a large body of research on text recognition in different scenarios, such as printed documents or manuscripts [9, 10]; however, research on OCR technology for MSCs or for industrial field applications is rare in the current literature. Character segmentation, which is the precondition of character recognition, heavily influences the accuracy of character recognition. Hence, digital image processing technology is used to enhance image quality and thereby improve the accuracy and stability of character segmentation. In contrast, recognition algorithms based on different feature descriptors have been investigated in many studies. In stamping character segmentation, Li et al. first classified the labeled image into several planes from the darkest to the brightest [11]. The printed text on the label is extracted from the binarized image of the darkest plane. Then, the block of printed text is determined using connected component analysis and removed. Finally, the pressed characters are extracted. The character segmentation success rate of their algorithm can reach 93.4%. The main reason for segmentation failure is the uneven distribution of the image grayscale caused by deformation of the image. Gao et al. extracted two binary images of characters stamped on mechanical parts using a histogram of oriented gradient (HOG)-based local dynamic threshold algorithm and the Otsu method [12]. The two results were fused to obtain a better binary image. Although the quality of the binarized image obtained by this algorithm is clearly improved, the reported results are still not satisfactory for images with strong background interference. 
Danijela et al. proposed a combination of a threshold adjustment control loop and image data merging methods [13]. Two-dimensional entropy feedback control was used to enhance image quality and improve the accuracy of character segmentation. In stamping character recognition, Li et al. first used a Gabor filter to directly extract the local stroke features of the convex character image in the horizontal, vertical, and left and right diagonal directions and constructed a Gabor feature space with rotation and scale invariance based on the total energy and invariance of the Gabor filter output to improve the accuracy rate of character recognition [14]. The accuracy rate of this algorithm can reach 97.83% (based on a single character image after successful segmentation). In addition, some studies highlight the contrast between characters and backgrounds by obtaining the depth information of the characters. Quan et al. used sinusoidal grating projection and phase-shifting techniques to conduct the three-dimensional reconstruction of characters on a shadow wall and then obtain the depth information of stamped characters to complete the character structure and recognition [15]. However, this method needs to design complex system markings and requires a large amount of calculation, which is not ideal in the industrial field in practice. Chen et al. used the simplified photometric technology to obtain the normal vector of each point in the sample and then used a graph-based clustering method to segment imprinting characters [16]. The algorithm makes full use of the three-dimensional surface features of imprinting characters, but it has specific requirements regarding the material of object and cannot adapt to the different reflection models of different materials. Similar photometric stereo methods are also proposed by Ikehata et al. 
[17] and Tsiotsios and Davison [18]; these methods are useful complements to existing background-removal techniques and are especially useful when no template is available. However, for the task of OCR, estimating the surface topology is not the final goal, and it is not necessary to reconstruct the surface contour; a similar image acquisition strategy therefore needs to be studied.

Multi-directional illumination technology is a kind of image processing method that obtains the projection image of the target object under different light sources from fixed points and then approximates the three-dimensional structure of the target surface through image fusion technology. Although many studies have focused on multi-directional illumination-based surface detection techniques, most efforts have been put into algorithm development to recognize surface textures or segment defects. The effect of scratched or embossed character detection and classification on metallic surfaces under multiple illuminant directions has been less discussed. Liao et al. utilized a metal surface under 16 different lighting directions for image fusion processing to enhance image contrast and detect and classify surface defects [19]. León et al. captured and fused 32 images of a metal surface shot from different illumination directions to calculate the pit area for automatic monitoring of surface quality [20]. Racky et al. obtained object images under the illumination of two different directional light sources and enhanced the contrast of embossed characters using morphological techniques on the shadows formed in different directions to achieve effective character segmentation [21]. Leung and Malik provided a unified model to construct a vocabulary of prototype tiny surface patches with associated local geometric and photometric properties extracted from images under directional lights, and they studied a large collection of images of different materials to build a three-dimensional texton vocabulary [22]. They then used the vocabulary to characterize any material.

In this study, an MSC recognition algorithm based on multi-directional illumination is proposed. The algorithm analyzes the difference in the gray value of a point under different illumination directions and uses this difference to fuse the images, enhancing the contrast between the stamped character and its background. Then, the fused image is preprocessed, and single character images are segmented and normalized. HOG features are used as the feature descriptor for a normalized single character. Finally, a backpropagation (BP) neural network is trained on the extracted features and used to recognize MSCs to verify the effectiveness of the algorithm. Our system was implemented on Windows 7 with 16 GB of RAM and a 64-bit Intel 3.40-GHz CPU. The software was developed in C/C++ using Visual Studio 2013 on Windows, with OpenCV (version 2.4.9), a library aimed mainly at real-time computer vision, installed in the Visual Studio environment.

Image acquisition system and method analysis

To obtain images of a target object under illumination from different directions, the image acquisition system shown in Fig. 1 was designed. Here, Fig. 1a is the designed light source system, which is composed of four symmetrically distributed bar light sources called L1, L2, L3, and L4. To distinguish each light source, it is identified by its azimuth angle. In the system layout, each light source was set as a positive light source, as shown in Fig. 1b, where h is the vertical distance between the light source and target and α is the irradiation direction of the light source. An industrial camera (acA 1600-20gm, Basler, Germany) was used. The output image of the camera is an 813 × 618 grayscale image. After the system has been set up, four sample images of the target object under different lighting directions can be obtained by controlling the acquisition sequence of the camera.

Fig. 1 a–g Image acquisition system and image samples obtained under multiple illumination directions

Because the radiation intensity of light is inversely proportional to the square of the radiation distance, differences in the distance between the light source and individual points on the target surface strongly influence the intensity received at each point. Under a close light source at a low angle, as in Fig. 1b, this produces a sampled image with an uneven brightness distribution. Figure 1d–g shows the brightness contours of a target object illuminated from different directions. In each figure, the color change from red to blue represents the decrease in brightness along the illumination direction. The brightness maps show that the surface area near the light source is significantly brighter than the area far from it. Moreover, there are highlight areas on the reflective face of the recessed area of each stamped character, as the partial enlarged detail in Fig. 1d shows. This is because the light-facing face of the recess reflects significantly more light toward the camera than the flat object surface or the backlit face of the recess. When the lighting direction changes, the brightness in the recessed area of the stamped character also changes, as shown in the enlarged regions of Fig. 1d, f. Because the brightness in a grayscale image is given directly by the gray value of each pixel, the gray values in each image follow the same rules. If the background information of each grayscale image can be removed and the high-luminance areas in the recessed areas that appear at the various angular positions, Φ = (0°, 90°, 180°, 270°), are extracted, the four images can be fused, as shown in Fig. 1c, to obtain a contrast-enhanced image for character recognition.

Algorithms and experiments

Image fusion

To fuse the images and enhance the contrast between characters and background, it is necessary to retain the useful information in the recessed areas caused by the stamped characters and eliminate the information in the image background. The proposed method extracts and eliminates the background information using the following steps. First, a linear interpolation algorithm is used to reduce the size of the original image. Second, the reduced image is blurred using a median filter to remove the MSCs from the surface, giving an approximate background image in which the overall brightness distribution of the image is preserved. Third, linear interpolation is used again to enlarge the blurred image back to the original size, and the result is taken as the background of the original image. Finally, with the reference gray value of the image set to 128, the background is eliminated by subtracting the gray values of the background image from those of the original image. This can be described as follows:

$$ {I}_B\left(i,j\right)=\mathrm{Scale}\left(\mathrm{MedianBlur}\left(\mathrm{Scale}\left(I\left(i,j\right),\frac{1}{n}\right)\right),n\right), \tag{1} $$
$$ {I}_E\left(i,j\right)=128+\left(I\left(i,j\right)-{I}_B\left(i,j\right)\right), \tag{2} $$

where Scale(∙) represents the linear interpolation operator of the image, in which n is the magnification factor and 1/n is the minification factor. The value of n is associated with the width of a stamping character. Operator MedianBlur(∙) is the median blur filter for the image, I(i, j) is the original image, IB(i, j) is the grayscale background image, IE(i, j) is the grayscale image after background homogenization, and i and j are the pixel coordinates of the image.
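The effect of Eq. (2) can be illustrated on a single scan line. The following is a minimal sketch, not the authors' implementation: for brevity it replaces the shrink, median blur, and enlarge chain of Eq. (1) with a direct wide median filter along the line, and the function names, window size, and synthetic gray values are all assumptions made for illustration.

```python
from statistics import median

def estimate_background(line, half_window):
    """Approximate the slowly varying background with a wide 1-D median filter.

    Stand-in (assumption) for the resize -> median blur -> resize chain of Eq. (1).
    """
    n = len(line)
    background = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        background.append(median(line[lo:hi]))
    return background

def eliminate_background(line, background, reference=128):
    """Eq. (2): I_E = 128 + (I - I_B), clipped to the 8-bit range."""
    return [min(255, max(0, round(reference + v - b)))
            for v, b in zip(line, background)]

# Synthetic scan line: brightness falls off away from the light source,
# with one stamped groove around positions 20 and 21.
scan = [200 - 2 * i for i in range(40)]
scan[20] -= 60  # backlit face of the recess (darker)
scan[21] += 40  # reflective face of the recess (brighter)

bg = estimate_background(scan, half_window=5)
flat = eliminate_background(scan, bg)
print(flat[5], flat[20], flat[21])
```

In the flat region the output sits at the reference value of 128, while the backlit and reflective faces of the groove fall below and above it, which is the behavior the scan-line comparison is meant to show.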

Figure 2a shows the captured image of a target object at Φ = 180°. The approximate background image processed according to Eq. (1) is shown in Fig. 2b. Here, image zoom factor n was set to 11 according to the width of the sample’s stamping character, and the filtering kernel of the median filter algorithm was a 5 × 5 matrix. The line-scan method is used to analyze the variation of the gray values of the pixels in Fig. 2. Without loss of generality, the scan line S in Fig. 2a is taken as an example, and its corresponding scan line in the approximate background image (Fig. 2b) is S′. The behaviors of the pixel gray values of the original and approximated background image along this scan line are plotted as C and C′, respectively, as shown in Fig. 2c.

Fig. 2 a–c Result of image background extraction and comparison of scan lines from the original and background images

The curves in Fig. 2c show that because the light source is on the left side of the object (Φ = 180°), the gray value of the image progressively decreases from left to right. However, the surface recesses caused by the stamped characters in the original image (Fig. 2a) produce sudden changes in the grayscale level. The gray value at the backlit face of a recess drops sharply, as indicated by the hollow circles in Fig. 2c, while the gray level at the reflective face increases to different degrees, as indicated by the solid circles in Fig. 2c. In the approximate background image (Fig. 2b), the grayscale curve C′ follows the overall change in gray level smoothly because the effects of the MSCs have been eliminated. This verifies the feasibility of the image background extraction algorithm presented above.

Because the image background has been extracted, it can be subtracted from the original image using Eq. (2). The result and new pixel gray values for scan line S in Fig. 2a are shown in Fig. 3a. As this image shows, the fluctuations of the pixel gray values in the recessed areas are significantly stronger than those in the flat region image after the background has been eliminated, which verifies the effectiveness of the image background elimination algorithm. The image fusion is then implemented using the following equations.

$$ {\Delta}_1\left(i,j\right)=\mathrm{abs}\left({I}_E^0\left(i,j\right)-{I}_E^{180}\left(i,j\right)\right), \tag{3} $$
$$ {\Delta}_2\left(i,j\right)=\mathrm{abs}\left({I}_E^{90}\left(i,j\right)-{I}_E^{270}\left(i,j\right)\right), \tag{4} $$
$$ {I}_F\left(i,j\right)=128-\left({\Delta}_1\left(i,j\right)+{\Delta}_2\left(i,j\right)\right), \tag{5} $$

where, \( {I}_E^0\left(i,j\right),{I}_E^{90}\left(i,j\right),{I}_E^{180}\left(i,j\right), \) and \( {I}_E^{270}\left(i,j\right) \) are background-eliminated images for Φ = 0°, 90°, 180°, and 270°, respectively. Further, Δ1(i, j) is the difference between images \( {I}_E^0\left(i,j\right) \) and \( {I}_E^{180}\left(i,j\right), \) Δ2(i, j) is the difference between \( {I}_E^{90}\left(i,j\right) \) and \( {I}_E^{270}\left(i,j\right), \) and IF(i, j) is the image fusion result.
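The fusion step of Eqs. (3)–(5) is a per-pixel operation and can be sketched as follows. This is an illustrative sketch on nested lists rather than the authors' OpenCV code; the function name `fuse` and the clipping of negative results to 0 are assumptions, and the tiny 1 × 2 images are made-up values.

```python
def fuse(e0, e90, e180, e270, reference=128):
    """Fuse four background-eliminated images following Eqs. (3)-(5).

    Each argument is a 2-D list of gray values with identical dimensions.
    Results below 0 are clipped to 0 (an assumption; the equations do not say).
    """
    fused = []
    for r0, r90, r180, r270 in zip(e0, e90, e180, e270):
        row = []
        for p0, p90, p180, p270 in zip(r0, r90, r180, r270):
            delta1 = abs(p0 - p180)   # Eq. (3): opposite pair 0 / 180 degrees
            delta2 = abs(p90 - p270)  # Eq. (4): opposite pair 90 / 270 degrees
            row.append(max(0, reference - (delta1 + delta2)))  # Eq. (5)
        fused.append(row)
    return fused

# First pixel: flat background (128 in every direction).
# Second pixel: recess that is bright at 0 degrees and dark at 180 degrees.
e0 = [[128, 170]]
e180 = [[128, 90]]
e90 = [[128, 128]]
e270 = [[128, 128]]

result = fuse(e0, e90, e180, e270)
print(result)
```

Because both differences enter through an absolute value, a recess always pushes the fused value below 128 regardless of which face is lit, which is why the fused characters appear uniformly dark on a gray background.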

Fig. 3 a, b Image gray values at a typical scan line

To verify the effectiveness of the image fusion algorithm, first, the background elimination processing was performed on the images obtained from Φ = 0°, 90°, 180°, and 270°, respectively. Then, the four images were fused using Eqs. (3)–(5), and the fusion result is shown in Fig. 3b. Comparing the images with and without fusion, it can be seen that in the stamped recessed areas of the fused image, there are unidirectional and stable fluctuations of the pixel grayscale value in the image whether on the reflecting surface or the backlit surface. This means that the contrast between the MSCs and the background is clearly enhanced, which is useful for locating the recessed areas and segmenting the MSCs.

Character segmentation and recognition

Character segmentation

The character segmentation process is shown in Fig. 4. First, the mean filtering algorithm is used to improve the smoothness of the fused image, then the Otsu method is adopted for binarization. The result is shown in Fig. 5a. In this figure, the red dashed circles indicate some connected components with small areas, which are usually noise in the image, while the green solid circle indicates a hollow space inside the character. These kinds of artifacts affect the accuracy of character segmentation. To eliminate their influence, the method proposed in this paper uses the connected component labeling method to traverse all of the connected components in the binary image to eliminate connected components generated by noise by determining their area [23, 24]. Then, the hollow parts of the binary image are filled with a morphological closing algorithm [25]. The result is shown in Fig. 5b. To accurately locate the character in the image, a horizontal projection function is used to draw the horizontal grayscale projection image of Fig. 5b. Its projection result is shown in Fig. 5c. The grayscale projection image generally consists of one or more independent projection peaks. Using the start and end points of each peak, Fig. 5b can be divided into corresponding sub-images, and then the target area can be located according to the number, length, width, aspect ratio, and area features of the connected component in the sub-images. The location results are shown in Fig. 5d.
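Two of the steps above, Otsu binarization and the horizontal projection used to find text bands, can be sketched in a few lines. This is a simplified illustration under stated assumptions: it omits the mean filter, the area-based noise removal, and the morphological closing, it treats pixels at or below the threshold as character pixels (characters are dark in the fused image), and the helper names and the 6 × 4 test image are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the gray level that maximizes the between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # no pixels in the dark class yet
        w1 = total - w0
        if w1 == 0:
            break             # no pixels left in the bright class
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def row_bands(binary):
    """Split a binary image into (first_row, last_row) bands of non-empty rows."""
    profile = [sum(row) for row in binary]  # horizontal projection
    bands, start = [], None
    for y, value in enumerate(profile):
        if value > 0 and start is None:
            start = y
        elif value == 0 and start is not None:
            bands.append((start, y - 1))
            start = None
    if start is not None:
        bands.append((start, len(profile) - 1))
    return bands

fused = [
    [128, 128, 128, 128],
    [128,  48,  48, 128],
    [128,  48, 128, 128],
    [128, 128, 128, 128],
    [128,  48,  48,  48],
    [128, 128, 128, 128],
]
t = otsu_threshold([p for row in fused for p in row])
binary = [[1 if p <= t else 0 for p in row] for row in fused]
bands = row_bands(binary)
print(t, bands)
```

Each band returned by `row_bands` corresponds to one projection peak; in the full pipeline the bands would then be filtered by the connected-component features listed above before a band is accepted as the character string.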

Fig. 4 Flow chart for character segmentation

Fig. 5 a–d Character string location and segmentation

Figure 6a is the segmentation result of a single line of characters from Fig. 5d. Because of the uncertainty in the stamping process (such as manual feeding and mold deformation), the MSCs inevitably have a certain amount of skew. Skew correction for the characters is necessary for improving the character recognition rate. To correct the skew, the connected components in Fig. 6a are marked, and the coordinates of the weighted center point of each connected component are calculated, as indicated by the orange dots in Fig. 6b. Then, the least squares method is used to fit these points to a straight line, as indicated by the yellow line in Fig. 6b. Using the slope of the straight line, the angle of the character string in the initial image is calculated, and then an affine transformation is used to correct the rotation, as shown in Fig. 6c. The connected component labeling algorithm is used again to separate single characters. The character segmentation results are shown in Fig. 6d.
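The skew estimate can be sketched as a least squares line fit through the component centers followed by a rotation. The sketch below is an assumption-laden illustration: it takes the center points as given (plain coordinates rather than weighted centers), rotates points instead of warping an image, and uses made-up coordinates and hypothetical function names.

```python
import math

def fit_skew_angle(centers):
    """Least squares line fit through (x, y) centers; returns the angle in degrees."""
    n = len(centers)
    mean_x = sum(x for x, _ in centers) / n
    mean_y = sum(y for _, y in centers) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in centers)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in centers)
    # slope = sxy / sxx; atan2 also tolerates sxx == 0
    return math.degrees(math.atan2(sxy, sxx))

def rotate_points(points, angle_deg, center):
    """Rotate points about `center` by `angle_deg` (the rotation part of an affine transform)."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    cx, cy = center
    return [((x - cx) * c - (y - cy) * s + cx,
             (x - cx) * s + (y - cy) * c + cy) for x, y in points]

# Hypothetical character centers lying on the line y = 0.1 * x + 5.
centers = [(10.0, 6.0), (30.0, 8.0), (50.0, 10.0), (70.0, 12.0)]
angle = fit_skew_angle(centers)
corrected = rotate_points(centers, -angle, center=(40.0, 9.0))
print(round(angle, 2), [round(y, 6) for _, y in corrected])
```

After rotating by the negative of the fitted angle, all centers share the same y coordinate, which is the condition the skew correction aims for before single characters are separated.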

Fig. 6 a–d Processes and results for single character segmentation

Character recognition

The HOG feature is a reliable method for capturing the gradient information on the borders of character strokes in an image as well as the shapes of character strokes that are crucial for text recognition, so it is adopted to describe the characteristics of a single character image [26]. In the method proposed in this paper, an M × N input image was taken, as shown in Fig. 7a as an example. To extract the HOG feature, the block shown in Fig. 7b with a size of mb × mb is used to traverse the input image in the horizontal and vertical directions with a step size of ms. The block is divided into four cells of equal size mc × mc. Because the gradient values in each cell have been calculated, the spatial histogram is calculated according to the gradient direction and amplitude by dividing the gradient direction into nine bins, as shown in Fig. 7b. Hereafter, the histogram features in four cells were merged into a 36-dimensional block feature. By traversing the image, a feature matrix composed of all the block features is obtained, as shown in Fig. 7c. A feature vector is then obtained by concatenating each row and column of the feature matrix, as shown in Fig. 7d. The feature vector describes the features of the entire input image. The dimension of the feature vector is determined by ms, mc, and mb, which have a substantial influence on the character recognition accuracy. This influence is discussed below.
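To make the dependence of the feature dimension on the block size and step size concrete, the sketch below counts the blocks visited by the sliding window and computes a nine-bin orientation histogram for one cell. It is an illustration only: the 32 × 32 image, 16-pixel block, and 8-pixel step are hypothetical values (not those of Table 1), and the histogram uses unsigned gradients with hard bin assignment, a simplification of common HOG implementations.

```python
import math

def hog_dimension(width, height, block, stride, bins=9, cells_per_block=4):
    """Feature length = number of block positions x 4 cells x 9 bins."""
    blocks_x = (width - block) // stride + 1
    blocks_y = (height - block) // stride + 1
    return blocks_x * blocks_y * cells_per_block * bins

def cell_histogram(cell, bins=9):
    """Magnitude-weighted histogram of gradient directions (0-180 degrees) in one cell."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]  # central differences
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            direction = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[min(bins - 1, int(direction // bin_width))] += magnitude
    return hist

d = hog_dimension(width=32, height=32, block=16, stride=8)

# A cell containing a vertical stroke edge: the gradient is purely horizontal.
edge_cell = [[0, 0, 255, 255] for _ in range(4)]
hist = cell_histogram(edge_cell)
print(d, hist)
```

With these assumed sizes the window visits 3 × 3 block positions, so the vector has 9 × 36 = 324 entries; halving the step size would roughly quadruple the number of blocks, which is the trade-off between dimension and runtime that the parameter comparison examines.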

Fig. 7 a–d HOG feature extraction process

HOG is a non-linear feature. To identify characters, a three-layer BP neural network (an input layer, a hidden layer, and an output layer, as shown in Fig. 8) was used as the classifier for character recognition [27, 28]. Here, x = [x1, x2, …, xd] is the input to the neural network, and a = [a1, a2, …, aq] is the output of the hidden layer and also the input to the next layer. y = [y1, y2, …, yn]T is the output of the neural network. \( {\boldsymbol{w}}_h^{\left[1\right]}={\left[{w}_{1h}^{\left[1\right]},\kern0.5em \dots \kern0.5em ,\kern0.5em {w}_{ih}^{\left[1\right]},\kern0.5em \dots \kern0.5em ,\kern0.5em {w}_{dh}^{\left[1\right]}\right]}^{\mathrm{T}} \) is the vector of connection weights between the neurons of the input layer and the hth neuron of the hidden layer. \( {\boldsymbol{w}}_j^{\left[2\right]}={\left[{w}_{1j}^{\left[2\right]},\kern0.5em {w}_{2j}^{\left[2\right]},\kern0.5em \dots \kern0.5em ,\kern0.5em {w}_{hj}^{\left[2\right]},\kern0.5em \dots \kern0.5em ,\kern0.5em {w}_{qj}^{\left[2\right]}\right]}^{\mathrm{T}} \) is the vector of weights between the neurons of the hidden layer and the jth neuron of the output layer.
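The forward pass implied by this notation can be sketched as two weighted sums, each followed by an activation. The sketch assumes a sigmoid activation and omits bias terms, since only connection weights appear in the notation; the function names and the tiny weight matrices are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w1, w2):
    """Forward pass of a three-layer network without bias terms (assumption).

    w1[h] holds the d weights feeding hidden neuron h;
    w2[j] holds the q weights feeding output neuron j.
    """
    a = [sigmoid(sum(w * xi for w, xi in zip(w_h, x))) for w_h in w1]
    y = [sigmoid(sum(w * ah for w, ah in zip(w_j, a))) for w_j in w2]
    return a, y

x = [0.2, 0.7, 0.1]              # d = 3 input features
w1 = [[0.0, 0.0, 0.0],           # q = 2 hidden neurons
      [0.0, 0.0, 0.0]]
w2 = [[0.0, 0.0],                # 2 output neurons
      [0.0, 0.0]]

hidden, output = forward(x, w1, w2)
print(hidden, output)
```

With all weights at zero every neuron outputs 0.5; training adjusts the weights so that the output neuron matching the character class moves toward 1 and the others toward 0.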

Fig. 8 BP neural network model

The BP algorithm is a supervised learning method that solves for the weights by gradient descent; training stops when the error function falls below a given tolerance, which fixes the structure of the BP model. The sigmoid function is employed as the activation function: \( f(x)=\frac{1}{1+{e}^{-x}} \). The number of input layer neurons is the dimension d of the extracted HOG feature, the number of output layer neurons is the number of classes l, and the number of hidden layer neurons q was calculated using the following empirical equation [29].

$$ q=\frac{dl+0.5l\left({l}^2+d\right)-1}{d+l} \tag{6} $$
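As a worked example of the empirical equation, the snippet below evaluates q for the 37 character classes used in this work and a feature dimension of d = 324. The value of d is a hypothetical choice for illustration, not one of the configurations in Table 1, and rounding q to the nearest integer is likewise an assumption.

```python
def hidden_neurons(d, l):
    """Empirical hidden-layer size: q = (d*l + 0.5*l*(l**2 + d) - 1) / (d + l)."""
    return (d * l + 0.5 * l * (l ** 2 + d) - 1) / (d + l)

d = 324  # hypothetical HOG feature dimension
l = 37   # number of character classes
q = hidden_neurons(d, l)
print(q, round(q))
```

For these inputs the numerator is 11988 + 31320.5 − 1 = 43307.5 and the denominator is 361, giving q ≈ 119.97, that is, about 120 hidden neurons.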

Results and discussion

Effect of imaging parameters

The distance h between the light source and the target, the illumination direction α of the light source, and the light intensity IL all affect the imaging quality and thus the fusion result. According to the theory of light radiation intensity, the geometric parameters h and α constrain each other, so this paper only discusses the influence of α and IL. Figure 9 shows the fusion results of three different samples at IL = 35 cd for α at 70°, 60°, 50°, 40°, 30°, and 20°, respectively. The results show that a change in α has little effect on the image fusion result of the target characters. Figure 10 shows the fusion results as the light intensity changes from 10 to 55 cd at the two extreme illumination directions of the proposed system, α = 70° and α = 20°. A comparison shows that, when the light intensity is set to a low level, such as from 10 to 25 cd, the character image has low brightness and there are many hollow areas in the characters. As the light intensity increases, such as from 30 to 45 cd, the hollow areas gradually become smaller and fewer. As the light intensity increases further, such as to IL = 55 cd, the edges of the characters are blurred because excessive light intensity reduces the difference between the brightness of the character's edge and that of the image background. In the final system, α is set to 30° and IL is set between 35 and 45 cd.

Fig. 9 a–c Image fusion results for different samples at different illumination directions

Fig. 10 a, b Image fusion results for different samples at different illumination intensities at α = 20° and 70°

Adaptability analysis

Because of the complexity of the industrial environment, MSCs are inevitably subject to various disturbances and contamination. These include the bumps, scratches, oil stains, and handprints left by manual handling during transport; rust due to long-term storage; large amounts of surface oxide caused by annealing or pits on the metal surface caused by shot-blasting; and changes in the color of the characters or the background, which can occur frequently during painting or spraying. To study the adaptability of the algorithm, experiments on the recognition of MSCs under different conditions were carried out in this study. Figure 11 shows color pictures of samples under different initial conditions, while Fig. 12 presents the results obtained after fusion enhancement and binarization using the algorithm proposed in this paper. As can be seen from Fig. 12a–c, the fusion process effectively suppresses the slight disturbances shown in Fig. 11a–c. In addition, for characters that are difficult to identify with the naked eye because of large areas of rust (Fig. 11d), large-scale oxide disturbance (Fig. 11e), and pit disturbance (Fig. 11f), some of the existing OCR methods did not obtain acceptable results. In contrast, the algorithm proposed in this paper obtained better results (see Fig. 12d–f). The proposed algorithm is also robust against color changes in the image. For example, for samples with different colors, as shown in Fig. 11g, h, the proposed algorithm demonstrates good adaptability. Experiments have shown that all of the sample images in Fig. 11 can be accurately recognized by the proposed algorithm.

Fig. 11 a–h Color photos of samples with different conditions that can interfere with character recognition

Fig. 12 a–h Image enhancement results

Feature recognition analysis

The proposed vision system was applied to the regular inspection and manufacturing lines of civilian liquefied petroleum gas cylinders. The MSC number of a cylinder consists of the 10 digits 0 to 9, the 26 letters “a” to “z”, and 1 special character “-”, so the character samples were divided into 37 categories. Each category had 75 characters obtained by segmenting the captured images, so the character sample set had a total of 2775 characters. All character samples were divided into training and test sets. The training set contained 1850 samples, and the test set contained 925 samples. Experiments were carried out on the four groups of HOG features with different parameters listed in Table 1. The recognition results and runtime for a single hidden layer neural network are shown in Table 2. These results show that the recognition rate of the proposed method is suitable for industrial online application, and the recognition rates of HOG1 and HOG3 are better than those of HOG2 and HOG4 because they have higher dimensions d, which means that the HOG feature contains more information for feature discrimination.
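The sample counts above can be reproduced with a small split routine. The text gives only the totals, so the per-class split of 50 training and 25 test samples used here is an assumption that happens to match 1850 and 925; the function name, the placeholder sample identifiers, and the random seed are likewise hypothetical.

```python
import random

def stratified_split(samples_by_class, train_per_class, seed=0):
    """Shuffle each class separately and take a fixed number of training samples."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = samples[:]
        rng.shuffle(shuffled)
        train += [(s, label) for s in shuffled[:train_per_class]]
        test += [(s, label) for s in shuffled[train_per_class:]]
    return train, test

# 10 digits + 26 letters + "-" = 37 classes, 75 samples each.
labels = ([str(digit) for digit in range(10)]
          + [chr(code) for code in range(ord("a"), ord("z") + 1)]
          + ["-"])
samples_by_class = {lab: [f"{lab}_{k}" for k in range(75)] for lab in labels}

train, test = stratified_split(samples_by_class, train_per_class=50)
print(len(labels), len(train), len(test))
```

Splitting within each class keeps every character equally represented in both sets, which matters here because some classes may be visually similar and a skewed split could hide their confusion.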

Table 1 HOG features with different parameters
Table 2 Comparison of recognition performance under different feature conditions with BP neural network

To further demonstrate the performance of the proposed algorithm, it is compared with existing algorithms for metal stamping character recognition. The other four algorithms operate on a single original image, whereas the proposed algorithm is based on multi-image fusion. As Table 3 shows, the proposed method obtains a superior character recognition accuracy of 99.6%.

Table 3 Recognition rates of different algorithms


In this study, an algorithm based on multi-directional lighting image fusion technology was proposed to enhance the contrast between MSCs and their background to improve the accuracy and stability of MSC segmentation and recognition. To evaluate the performance of the proposed model, the performance of the algorithm under different lighting conditions, image disturbances, and algorithm parameters was evaluated.

The obtained results have revealed that light intensity has a greater effect on the image fusion result than light direction. In fact, a light intensity value in the range of 35–45 cd is suitable for industrial application in the proposed system. The obtained results also revealed that the algorithm is robust to the interference of oil stains, rust, oxide, shot-blasting pits, and different background colors and can significantly enhance the contrast between MSCs and the background. Therefore, the proposed method can greatly reduce the dependence on the character segmentation and recognition algorithm. All of these results show that the proposed method is an effective method for identifying MSCs.

However, MSCs with different character depths and widths have different lighting requirements, and multiline MSCs make character segmentation more difficult. We will therefore further improve the robustness and adaptability of the proposed algorithm; these issues will be explored in a future study.





HOG: Histogram of oriented gradients

MSC: Metal stamping character

OCR: Optical character recognition


  1. A. Tonazzini, S. Vezzosi, L. Bedini, Analysis and recognition of highly degraded printed characters. Int. J. Doc. Anal. Recognit. 6(4), 236–247 (2003)

  2. K. Elagouni, C. Garcia, Text recognition in multimedia documents: a study of two neural-based OCRs using and avoiding character segmentation. Int. J. Doc. Anal. Recognit. 17(1), 19–31 (2014)

  3. R.F. Moghaddam, M. Cheriet, AdOtsu: an adaptive and parameterless generalization of Otsu’s method for document image binarization. Pattern Recogn. 45(6), 2419–2431 (2012)

  4. H. Caner, H.S. Gecim, A.Z. Alkar, Efficient embedded neural network based license plate recognition system. IEEE Trans. Veh. Technol. 57(5), 2675–2683 (2008)

  5. R. Panahi, I. Gholampour, Accurate detection and recognition of dirty vehicle plate numbers for high-speed applications. IEEE Trans. Intell. Transp. Syst. 18(4), 767–779 (2017)

  6. Y. Yuan, W. Zou, Y. Zhao, et al., A robust and efficient approach to license plate detection. IEEE Trans. Image Process. 26(3), 1102–1114 (2017)

  7. M. Ryan, N. Hanafiah, An examination of character recognition on ID card using template matching approach. Procedia Comput. Sci. 59, 520–529 (2015)

  8. N.G. Wang, X.W. Zhu, J. Zhang, Research of ID card recognition algorithm based on neural network pattern recognition, in Proceedings of International Conference on Mechatronics, Electronic, Industrial and Control Engineering (Atlantis, 2015), pp. 964–967

  9. B. Su, S. Lu, C.L. Tan, Robust document image binarization technique for degraded document images. IEEE Trans. Image Process. 22(4), 1408–1417 (2013)

  10. R. Hedjam, R.F. Moghaddam, M. Cheriet, A spatially adaptive statistical method for the binarization of historical manuscripts and degraded document images. Pattern Recogn. 44(9), 2184–2196 (2011)

  11. J.M. Li, C.H. Lu, X.Y. Li, A novel segmentation method for characters pressed on label based on layered images. J. Optoelectron. Laser 19(6), 818–822 (2008)

  12. C. Gao, Y.X. Chang, Y.C. Guo, Study on binarization algorithm for the mechanical workpiece digital recognition. Opto-Electron. Eng. 37(6), 1–5 (2010)

  13. D. Ristic, S.K. Vuppala, A. Graser, Feedback control for improvement of image processing: an application of recognition of characters on metallic surfaces, in IEEE International Conference on Computer Vision Systems (ICVS’06) (2006), pp. 39–45

  14. J.M. Li, C.H. Lu, G.P. Li, Novel feature extraction method for raised or indented characters based on Gabor transform. J. Syst. Simul. 20(8), 2133–3136 (2008)

  15. C. Quan, X.Y. He, C.F. Wang, et al., Shape measurement of small objects using LCD fringe projection with phase shifting. Opt. Commun. 189(1), 21–29 (2001)

  16. C. Wei, C.H. Lu, Z. Shen, Segmentation of embossed characters pressed on metallic label based on surface normal texture. Int. J. Adv. Comput. Technol. 4(19), 332–340 (2012)

  17. S. Ikehata, D. Wipf, Y. Matsushita, et al., Robust photometric stereo using sparse regression, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2012), pp. 318–325

  18. C. Tsiotsios, A.J. Davison, Near-lighting photometric stereo for unknown scene distance and medium attenuation. Image Vis. Comput. 57, 44–57 (2017)

  19. Y. Liao, X. Weng, C.W. Swonger, et al., Defect detection and classification of machined surfaces under multiple illuminant directions. Int. Soc. Opt. Photonics 7798, 1–16 (2010)

  20. F.P. León, Model-based inspection of shot-peened surfaces using fusion techniques. Machine Vision and Three-Dimensional Imaging Systems for Inspection and Metrology. Int. Soc. Opt. Photonics 4189, 41–53 (2001)

  21. J. Racky, M. Pandit, Active illumination for the segmentation of surface deformations, in Proceedings of International Conference on Image Processing (Los Angeles, 1999), pp. 41–45

  22. T. Leung, J. Malik, Representing and recognizing the visual appearance of materials using three-dimensional textons. Int. J. Comput. Vis. 43(1), 29–44 (2001)

  23. K. Appiah, A. Hunter, P. Dickinson, et al., Accelerated hardware video object segmentation: from foreground detection to connected components labelling. Comput. Vis. Image Underst. 114(11), 1282–1291 (2010)

  24. L.S. Yeong, L.M. Ang, K.P. Seng, Efficient connected component labelling using multiple-bank memory storage, in IEEE International Conference on Computer Science and Information Technology (ICCSIT), vol. 9 (2010), pp. 75–79

  25. R.M. Haralick, S.R. Sternberg, X. Zhuang, Image analysis using mathematical morphology. IEEE Trans. Pattern Anal. Mach. Intell. 4, 532–550 (1987)

  26. N. Dalal, B. Triggs, Histograms of oriented gradients for human detection. IEEE Comput. Soc. Conf. Comp. Vis. Pattern Recognit. (CVPR) 1, 886–893 (2005)

  27. D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back-propagating errors. Nature 323(6088), 533 (1986)

  28. Z. Zhang, C. Wang, The research of vehicle plate recognition technical based on BP neural network. AASRI Procedia 1, 74–81 (2012)

  29. L.N. Fan, X.W. Han, G.Y. Zhang, Image Processing and Pattern Recognition (Science Press, Beijing, 2007)

  30. Q. Wang, Study on Segmentation and Recognition Technology of Low Quality Pressed Characters (Shandong University, Jinan, 2015)

  31. G.P. Li, Study on Image Acquisition and Recognition for Protuberant Character on Label Based on Moire Technology (Shandong University, Jinan, 2007)


The authors thank the editor and anonymous reviewers for their helpful comments and valuable suggestions.

About the authors

Zhong Xiang received the B.A.Eng. and Dr. Eng. degrees from Zhejiang University, Hangzhou, China, in 2005 and 2010, respectively. He is currently an associate professor with the Faculty of Mechanical Engineering and Automation, Zhejiang Sci-Tech University, Hangzhou, China. His research interests include fiber composite reinforcement, and inspection and manufacturing equipment design and development for different pressure vessels.

Zhaolin You received the B.A.Eng. degree from Zhejiang Sci-Tech University, Hangzhou, China, in 2017. He is currently working toward the M.S. degree in the Faculty of Mechanical Engineering and Automation, Zhejiang Sci-Tech University, Hangzhou, China. His research interests include image processing, pattern recognition, and computer graphics.

Miao Qian received the B.A.Eng. degree from Jilin University, Jilin, China, in 2008, and the Ph.D. degree from Zhejiang University, Zhejiang, China, in 2015. He is currently an associate professor with the Faculty of Mechanical Engineering and Automation, Zhejiang Sci-Tech University, Hangzhou, China. His research interests include image processing, pattern recognition, and computer graphics.

Jianfeng Zhang received the B.A.Eng. degree from Yancheng Institute of Technology, Jiangsu, China, in 2015, and the M.S. degree from Zhejiang Sci-Tech University, Hangzhou, China, in 2018. His research interests include image processing, pattern recognition, and computer graphics.

Xudong Hu received the B.A.Eng. and M.S.Eng. degrees from Zhejiang Sci-Tech University, Hangzhou, China, and the Ph.D. degree from Zhejiang University, Hangzhou, China. He is currently a professor with the Faculty of Mechanical Engineering and Automation, Zhejiang Sci-Tech University, Hangzhou, China. His research interests include image processing, pattern recognition, computer graphics, fiber composite reinforcement, and inspection and manufacturing equipment design.


This work was supported by the National Natural Science Foundation of China [grant numbers U1609205 and 51605443], the Public Welfare Technology Application Projects of Zhejiang Province [grant number 2017C31053], and the 521 Talent Project of Zhejiang Sci-Tech University.

Availability of data and materials

Please contact the author for data requests.


All authors took part in the discussion of the work described in this paper. ZX conceived the idea, developed the method, and conducted the experiment. ZY, MQ, JZ, and XH were involved in the extensive discussions and evaluations, and all authors read and approved the final manuscript.

Corresponding author

Correspondence to Miao Qian.

Ethics declarations

Competing interests

The authors declare that they have no competing interests. We confirm that the content of the manuscript has not been published or submitted for publication elsewhere.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Xiang, Z., You, Z., Qian, M. et al. Metal stamping character recognition algorithm based on multi-directional illumination image fusion enhancement technology. J Image Video Proc. 2018, 80 (2018).


  • Metal stamping characters (MSCs)
  • Multi-directional illumination
  • Image fusion
  • Character segmentation
  • Character recognition