Foreign object debris material recognition based on convolutional neural networks
© The Author(s). 2018
Received: 6 April 2017
Accepted: 15 March 2018
Published: 3 April 2018
The material attributes of foreign object debris (FOD) are the most crucial factors to understand the level of damage sustained by an aircraft. However, the prevalent FOD detection systems lack an effective method for automatic material recognition. This paper proposes a novel FOD material recognition approach based on both transfer learning and a mainstream deep convolutional neural network (D-CNN) model. To this end, we create an FOD image dataset consisting of images from the runways of Shanghai Hongqiao International Airport and the campus of our research institute. We optimize the architecture of the D-CNN by considering the characteristics of the material distribution of the FOD. The results show that the proposed approach can improve the accuracy of material recognition by 39.6% over the state-of-the-art method. The work here will help enhance the intelligence capability of future FOD detection systems and encourage other practical applications of material recognition technology.
To reduce or eliminate FOD damage, several companies have developed FOD detection systems, such as the Tarsier system by QinetiQ, FODetect by Xsight, and iFerret by Stratech. All of these systems use a camera to photograph suspected FOD, and human experts then verify the photographs. These systems have been commercially deployed in a few airports but have not achieved large-scale global usage. One main reason for this limited deployment is that the final FOD verification step relies exclusively on recognition by a human expert, which has two disadvantages. First, reliable verification requires a capable and experienced official, which incurs additional cost for the airport authority. For example, the Vancouver Airport filled this position with an employee from its FOD vendor. Second, human recognition is not completely trustworthy because people inevitably become fatigued from time to time.
Han et al. [5, 6] worked on FOD object recognition using a support vector machine (SVM) and random forest. FOD object recognition aims to identify what a FOD item is. Unfortunately, FOD varies widely: it can be any object, of any color and any size. By contrast, over 60% of FOD items are made of metal. Therefore, recognizing the material of FOD has much greater practical significance than object recognition.
Material recognition is a fundamental problem in computer vision. In contrast with the several decades of object recognition research, material recognition has only begun receiving attention in recent years. It is a flourishing and challenging field. The approaches to material recognition can be broadly categorized as hand-crafted or automatic feature extraction. Hand-crafted approaches can be further divided into surface reflectance [7–11], 3D texture [12–19], and feature fusion [20–22] approaches. Automatic feature extraction approaches refer to those that involve acquiring image features using a deep convolutional neural network (D-CNN) [23–26].
However, past research results remain inadequate to meet the demands of FOD material recognition. First, there is no specific FOD dataset for the task because of the unique airport environment. Although Bell et al. used more than 300 million image patches for training, the images were acquired mainly in indoor environments, where lighting conditions differ considerably from those of the locations where FOD appears. The results were hence quite poor when these 300 million image patches were used for training and FOD images were used for testing (see Section 4 for details). Second, a high recognition rate is necessary for metal. Metallic objects are far more harmful than objects made of other materials, and metal constitutes 60% of FOD. However, according to prior results [19, 21], the recognition rate was quite low for metallic objects.
This paper proposes a novel FOD material recognition approach based on transfer learning and a mainstream deep convolutional neural network (D-CNN) model. This paper describes an FOD image dataset consisting of images taken on the runways of Shanghai Hongqiao International Airport and the campus of our research institute. The dataset consists of 3470 images divided into three categories by material: metal, concrete, and plastic. The proposed approach is optimized to recognize metal because metal poses a high risk: it causes severe damage to aircraft and occurs frequently at airports.
This research will help improve the intelligence capability, ease of use, and user experience of FOD detection systems. It will also encourage more applications of material recognition systems, especially in security and manufacturing, such as construction site management [27, 28].
The rest of this paper is organized as follows: Section 2 introduces related work, and our approach is described in Section 3. Section 4 presents a discussion of the experiment results, and Section 5 summarizes our conclusion and plan for future work.
2 Related work
Material recognition, a fundamental problem in computer vision, has a wide range of applications. For example, an autonomous vehicle or a mobile robot can decide whether the terrain ahead is asphalt, gravel, ice, or grass. A cleaning robot can distinguish among wood, tile, and carpet. The approaches to material recognition are broadly divided into two categories according to feature extraction methods: hand-crafted features and automatic features. Hand-crafted approaches can be further divided into surface reflectance-based, 3D texture-based, and feature fusion-based approaches. Automatic feature extraction approaches refer to those acquiring image features through a D-CNN.
The most popular formalization for modeling surface reflectance is the bidirectional reflectance distribution function (BRDF). This function defines the amount of light reflected at a given point on a surface for any combination of incidence and reflection angles. BRDF models are either parametric [29, 30] or empirical [7–9, 11]. Parametric BRDF models cannot capture a broad set of real-world reflectance properties. In contrast, empirical BRDF models always require prior knowledge, such as illumination conditions, geometry, and surface material properties. Such prior knowledge cannot be expected to be available for real-world images. Zhang et al. introduced an empirical model based on a reflectance disk and reflectance hashing. The reflectance disk, a measurement of the surface property, was built using a customized camera apparatus. Gaussian low-pass filters, Laplacian filters, and gradient filters were applied to the reflectance disk. Textons, referring to fundamental micro-structures in natural images, were computed by k-means clustering on the output of the filter banks. Texton boosting and reflectance hashing were then employed for feature selection and image classification. This approach is not feasible for real-world images, as reflectance disks are generated by a customized apparatus in a laboratory environment. Moreover, different surface materials may exhibit similar reflectance phenomena: for example, plastic, glass, and wax are all translucent. Therefore, achieving material recognition using surface reflectance properties alone appears to be difficult.
Three-dimensional texture refers to surface roughness that can be resolved by the human eye or a camera. Such texture-based approaches follow the feature extraction-and-classification routine. Various researchers used a number of descriptors to extract the local features of an image. For example, some studies [12–17] used the maximum response, one study used sorted random projections, and another study applied a kernel descriptor for this purpose. These feature vectors were then fed into a classifier, usually SVM, latent Dirichlet allocation (LDA), or nearest neighbor. These approaches were designed to obtain salient results on CUReT [15–17], KTH-TIPS [12–14], and FMD. However, these datasets are inappropriate for FOD material recognition tasks. The images from the CUReT and KTH-TIPS datasets were captured using a customized apparatus in an ideal laboratory environment. These images not only differ in appearance from real-world images but also cannot be obtained in everyday settings. The FMD dataset is composed of real-world images from the website Flickr. However, the FMD dataset has three drawbacks with regard to FOD recognition: (1) few samples resemble FOD; (2) few photos were taken outdoors; and (3) images of metal, concrete, and plastic materials were not deliberately collected.
Sharan et al. [20–22] combined reflectance and 3D texture into new fused features as input to an LDA or an SVM classifier. They chose four groups of features, namely, color and texture (e.g., Color, Jet, and SIFT), micro-texture (e.g., Micro-Jet and Micro-SIFT), shape (e.g., curvature), and reflectance (Edge-Slice and Edge-Ribbon). Like the previous work, this research was performed on the FMD dataset, which makes it unsuitable for FOD material recognition tasks.
Since Hinton’s monumental work in 2006, deep learning has received considerable attention in both academia and industry because of its superior performance over other machine learning methods. He et al. used a 152-layer D-CNN to obtain a 3.57% error rate on the ILSVRC2015 dataset. The result was better than the error of 5.1% incurred by humans. Researchers have attempted to apply D-CNNs to automatically extract image features to achieve material recognition. Cimpoi et al. [23, 24] proposed the Fisher-vector CNN by improving the pooling layer in the D-CNN. They reported a considerable improvement over the work by Sharan et al. [21, 22]. Bell et al. proposed a new dataset, the Material-in-context Database (MINC), for material recognition based on ImageNet. They achieved an impressive recognition accuracy of 85.2% by utilizing the AlexNet and GoogLeNet models. Zhang et al. assumed that the features for object recognition could be helpful for material recognition to some extent and integrated features learned from the ILSVRC2012 and the MINC. The results were state-of-the-art, as expected. However, the MINC dataset was built using images taken from indoor environments, which are unsuitable for FOD material recognition.
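For reference, the BRDF described above is commonly written in the following standard radiometric form. The symbols are introduced here for illustration and are not taken from the original text: $\omega_i$ and $\omega_r$ denote the incident and reflected directions, $L_i$ and $L_r$ the incident and reflected radiance, $E_i$ the irradiance, and $\theta_i$ the angle between $\omega_i$ and the surface normal.

```latex
f_r(\omega_i, \omega_r)
  = \frac{\mathrm{d}L_r(\omega_r)}{\mathrm{d}E_i(\omega_i)}
  = \frac{\mathrm{d}L_r(\omega_r)}{L_i(\omega_i)\,\cos\theta_i\,\mathrm{d}\omega_i}
```

Because $f_r$ depends on both directions at once, measuring it densely generally calls for controlled illumination and viewing geometry, which is consistent with the laboratory apparatus mentioned above.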
3.1 Dataset construction
The Columbia–Utrecht Reflectance and Texture Database (CUReT) [15, 17], the KTH-TIPS, the Flickr Material Database (FMD), and the Material-in-context Database (MINC) are open datasets for material recognition. CUReT consists of 61 textures imaged under 205 diverse illumination and angle conditions. KTH-TIPS has 11 material categories, each category with four samples imaged under various conditions. Images in the FMD dataset are from the Flickr website. This dataset has 1000 images of 10 material categories. MINC has approximately 300 million image patches tailored from ImageNet. As stated in Section 2, these four datasets are unsuitable for FOD material recognition.
We choose metal, plastic, and concrete as three typical FOD materials to construct the dataset. According to FAA’s AC 150/5220-24, these materials appear most frequently on runways and taxiways. Furthermore, metallic FOD constitutes approximately 60% of all FOD. Metal and plastic may exhibit similarly intense reflectance phenomena under strong light in outdoor environments. To complicate things further, these images are taken on runways or taxiways, which are made of concrete. The concrete background of metal or plastic images poses a tremendous challenge to distinguishing these two materials from the background. This is similar to detecting Uyghur language text in complex backgrounds. Therefore, careful treatment is imperative to recognize metal, plastic, and concrete correctly.
Table 1 Statistics of the FOD dataset (table body not recovered; row labels: campus road training; airport runway and campus road testing)
The FOD dataset introduced in this paper was different in three aspects from previous datasets. First, there was a significant extent of intra-class variations for each material category. Images belonging to the same material category usually had completely different shapes or even identities. Second, all images had concrete as the background, emulating the circumstances in airports. Third, all images were captured in outdoor environments.
3.2 Choice of the D-CNN model
A D-CNN, an extremely efficient and automatic feature-learning approach, transforms an original input to a higher-level and more abstract representation using non-linear models. A D-CNN is composed of multiple convolution layers, pooling layers, fully connected layers, and classification layers. The network parameters are optimized through the back-propagation algorithm.
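As a rough illustration of how the convolution and pooling layers described above change the spatial size of a feature map, the sketch below applies the usual output-size formula, floor((n + 2p - k) / s) + 1, layer by layer. The function names and the example layer stack are hypothetical and chosen only for illustration; they are not the configuration used in this paper. Note also that some frameworks round pooling sizes up rather than down, so this is a sketch under the floor convention.

```python
def conv_output_size(size, kernel, stride, padding=0):
    """Spatial size after a convolution or pooling layer (floor convention)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(input_size, layers):
    """Return the feature-map size after each (kernel, stride, padding) layer."""
    sizes = [input_size]
    for kernel, stride, padding in layers:
        sizes.append(conv_output_size(sizes[-1], kernel, stride, padding))
    return sizes


if __name__ == "__main__":
    # Hypothetical stack: conv 11x11/4, pool 3x3/2, conv 5x5/1 (pad 2), pool 3x3/2.
    example_layers = [(11, 4, 0), (3, 2, 0), (5, 1, 2), (3, 2, 0)]
    print(trace_sizes(227, example_layers))
```

The same helper covers pooling settings written as "kernel/stride", such as a 2×2/2 max pooling or a 7×7/1 average pooling.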
D-CNNs have a broad set of applications in image classification, object recognition, and detection. Glasssix trained a CNN with an improved ResNet34 layer and obtained 99.83% accuracy on the well-known LFW face recognition database. Considering the scale, context, sampling, and deep combined convolutional networks, the BDTA team won the championship of the ILSVRC2017 object detection task. The Subbmission4 model provided by BDTA can detect 85 object categories and achieved a 0.73 mean average precision on DET task 1a (object detection with provided training data). Chen et al. provided an effective CNN named Dual Path Networks for object localization and object classification, which obtained a 6.2% localization error rate and a 3.4% classification error rate on the ILSVRC2017 object localization task. Yan et al. provided a supervised hash coding with deep neural network for environment perception of intelligent vehicles, and the proposed method can markedly improve the search accuracy.
Detailed descriptions of AlexNet, VGG-16, and GoogLeNet (table body not recovered; row labels: input (RGB image); convolution (kernel size/stride); max. pool 2×2/2; average pool 7×7/1)
As network depth increases, the recognition accuracies of VGG and GoogLeNet on outdoor metal images (Fig. 8) decrease. We therefore conjecture that ResNet, which is deeper still, may also have a low accuracy rate for FOD images from an outdoor environment. Thus, we did not perform experiments using ResNet on the FOD material dataset.
3.3 Transfer learning
Transfer learning is applied in this paper to avoid the overfitting problem. Transfer learning is literally defined as the transfer of knowledge learned in one domain to another domain. The technique is especially useful for D-CNN models because they require a large amount of human-labeled training data [43, 44]. Without sufficient training data, D-CNN models tend to overfit. Transfer learning reduces the need and effort to collect, clean, and label a large amount of data.
In this paper, the parameters of the improved AlexNet model are initialized by those trained from MINC. The model is then trained further by fine-tuning the weights of all layers on the FOD dataset discussed in Section 3.1. It has been observed that the earlier layers of a D-CNN capture more generic features (e.g., edge detectors or color blob detectors) that are reusable for many tasks. In contrast, later layers of the D-CNN contain details more specific to the original dataset, e.g., MINC. The weights of later layers should therefore be optimized more than those of earlier layers with the help of the new dataset, e.g., the FOD dataset. This deliberate choice of weight initialization is equivalent to shortening the distance from the starting point to the optimum, which helps avoid the overfitting problem.
Transfer learning has been applied to a wide range of tasks. In recognition, Reyes et al. pre-trained a CNN using 1.8 million images and used a fine-tuning strategy to transfer the learned recognition capabilities from general domains to the specific challenge of the Plant Identification task. Bell et al. trained all of their CNNs for material recognition by fine-tuning the network starting from the weights obtained on 1.2 million images from ImageNet (ILSVRC 2012). In object detection, OverFeat, the winner of the localization task of ILSVRC2013, also used transfer learning. Google DeepMind used transfer learning to solve complex sequences of tasks.
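The initialization-then-fine-tuning idea can be sketched in a few lines. This is a minimal, framework-free illustration rather than the authors' implementation: the layer names, the toy weight vectors, and the per-layer learning-rate multipliers are all hypothetical. The multipliers only illustrate one common way of letting later layers move more than earlier ones; the paper does not state that per-layer rates were used.

```python
import random


def init_from_pretrained(pretrained, new_layer_sizes, seed=0):
    """Copy pretrained weights; randomly initialise layers that are new."""
    rng = random.Random(seed)
    weights = {name: list(values) for name, values in pretrained.items()}
    for name, size in new_layer_sizes.items():
        # Small Gaussian initialisation for layers absent from the source model.
        weights[name] = [rng.gauss(0.0, 0.01) for _ in range(size)]
    return weights


def fine_tune_step(weights, grads, base_lr, lr_mult):
    """One plain gradient step with a per-layer learning-rate multiplier."""
    updated = {}
    for name, values in weights.items():
        step = base_lr * lr_mult.get(name, 1.0)
        updated[name] = [w - step * g for w, g in zip(values, grads[name])]
    return updated


if __name__ == "__main__":
    # Toy "pretrained" model with two layers; "fc9" is the newly added layer.
    source = {"conv1": [1.0, 2.0], "fc8": [0.5]}
    model = init_from_pretrained(source, {"fc9": 3})
    grads = {name: [1.0] * len(values) for name, values in model.items()}
    model = fine_tune_step(model, grads, base_lr=0.001,
                           lr_mult={"conv1": 1.0, "fc8": 1.0, "fc9": 10.0})
    print(model["conv1"], model["fc9"])
```

Starting from copied weights means only the new layer begins far from a useful solution, which is the intuition behind the shorter distance to the optimum described above.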
3.4 Improved D-CNN based on AlexNet
Inspired by transfer learning, an improved D-CNN based on AlexNet is described in this section. The improved D-CNN model has an additional fully connected layer appended to the model as the last layer. It shares the first eight layers with AlexNet; hence, the model consists of five convolution layers and four fully connected layers. The ninth layer has three neuronal nodes, indicating that the network has three material tag outputs. We use softmax loss as the classifier. The detailed network structure is shown in Table 2. The experiments were conducted on the Caffe framework with an Nvidia Tesla K20 GPU card. Caffe is a highly effective framework for deep learning, in the same way that Yan et al.’s framework [50–53] is for HEVC coding units. Using the FOD training dataset, we fine-tuned the improved D-CNN starting from weights pre-trained on MINC. Our implementation for FOD material recognition follows the practice in Krizhevsky’s and He’s papers [32, 35]. During training, the inputs to the improved D-CNN were fixed-size 224 × 224 RGB images. The batch size was set to 256, the momentum was set to 0.9, the weight decay (the L2 penalty multiplier) was set to 0.5, and the learning rate was set to 0.001. In total, the learning rate was decreased three times, and the learning was stopped after 20,000 iterations. The FOD testing dataset was used for the FOD material recognition test after the fine-tuning stage. All of our experiments based on the above hyperparameters achieved state-of-the-art results.
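The training ingredients named above (a three-way softmax loss, momentum, weight decay, and a learning rate that is lowered in steps) can be sketched as follows. This is a minimal pure-Python illustration, not the Caffe solver configuration used in the experiments. The update uses the common momentum-SGD form with the L2 penalty added to the gradient. The step size of 5000 iterations and the decay factor of 0.1 are assumptions chosen only because they yield three decreases within 20,000 iterations; the source does not give these values.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss(logits, label):
    """Cross-entropy loss of the softmax output for the true class index."""
    return -math.log(softmax(logits)[label])


def step_lr(base_lr, iteration, step_size, gamma=0.1):
    """Step schedule: multiply the rate by gamma every step_size iterations."""
    return base_lr * gamma ** (iteration // step_size)


def sgd_momentum_step(weights, grads, velocity, lr, momentum=0.9, weight_decay=0.0):
    """One momentum-SGD update with the L2 penalty folded into the gradient."""
    new_velocity = [momentum * v - lr * (g + weight_decay * w)
                    for w, g, v in zip(weights, grads, velocity)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


if __name__ == "__main__":
    # Three output scores, one per material class (order assumed for illustration).
    print(softmax_loss([2.0, 0.5, 0.1], label=0))
    # Assumed schedule: base rate 0.001, lowered at 5000, 10000, and 15000.
    print([step_lr(0.001, it, 5000) for it in (0, 5000, 10000, 15000)])
```

With these assumed schedule values, the rate changes exactly three times before training stops at 20,000 iterations, matching the description above in shape if not necessarily in detail.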
4 Experimental results and discussion
4.1 Improved D-CNN based on AlexNet
Outdoor metal image test results for AlexNet, VGG-16, and GoogLeNet trained on the MINC dataset
4.2 Results of the improved model
In this section, we compared the performance of the three D-CNN models. The first was AlexNet with parameters trained by MINC, abbreviated as AM. The second was AlexNet with parameters trained by both the MINC and FOD datasets—this model was called AMF. The third model was the improved model shown in Table 2 with parameters trained by the MINC and FOD datasets, abbreviated as IAMF. All experiments were conducted within the framework of Caffe.
The dataset description is given in Table 1. To ensure that the testing dataset was comparable to practical situations, the items in the FOD testing dataset had to meet the following criteria. All samples in the FOD testing dataset were collected at the airport or on the institutional campus. The testing samples did not overlap with those used for training. Furthermore, samples had various appearances within the same material category. Please refer to Fig. 3 for the testing samples.
Confusion matrix of the IAMF
We found that the ability to discriminate material was based on the degree of neuronal excitation. For example, a neuron might have been excited for metal but not for plastic. To judge the discrimination ability of the D-CNN model, we observed the distributions of the different neurons’ excitations. The higher the value of a certain neuron’s excitation compared to others, the stronger the ability to discriminate a certain material. For example, according to the red circles in Fig. 10, the values of Neuron 3 were higher than the values of Neuron 1 and Neuron 2 for concrete images (image ID 341–440). Thus, the AMF model had stronger ability to discriminate concrete. According to the green circles in Fig. 10, compared with the AM model and the AMF model, Neuron 1 of the IAMF model had higher excitation values than the other neurons for metal images (image ID 1–105). As a result, the IAMF model had stronger discrimination ability for metal than the other models. In addition, the IAMF model had a more concentrated neuron excitation distribution, indicating that the IAMF model had a more stable discrimination ability for FOD material. Therefore, the discrimination abilities of the three D-CNN models for FOD gradually increased from the AM model to the IAMF model. The result also confirmed the effectiveness of the IAMF model.
FOD material recognition is a challenging and significant task that must be performed to ensure airport safety. General material recognition datasets are not applicable to FOD material recognition. Therefore, a new FOD dataset was constructed in this study. The FOD dataset was different from previous material recognition datasets in that all training and testing samples were collected in outdoor environments, e.g., on a runway, on a taxiway, or on campus. We compared the performances of three well-known D-CNN models on the new dataset. The results were far from acceptable, especially for the recognition of metal, which accounts for 60% of all FOD. An improved D-CNN model was then introduced and compared with AlexNet. The new model achieved a 39.6% improvement over AlexNet in terms of the recognition of metal FOD.
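The excitation-based reasoning above can be made concrete with a small sketch: the predicted material is the output neuron with the highest excitation, a confusion matrix tallies predictions against true labels, and the margin between the target neuron and the strongest competitor gives a rough measure of how clearly a sample is discriminated. The helper names and the toy excitation values are hypothetical. The class order (Neuron 1 for metal, Neuron 3 for concrete) follows the text, while assigning Neuron 2 to plastic is an inference.

```python
CLASSES = ["metal", "plastic", "concrete"]  # Neuron 1, 2, 3 (Neuron 2 = plastic is assumed)


def predict(excitations):
    """Index of the most excited output neuron."""
    return max(range(len(excitations)), key=lambda i: excitations[i])


def confusion_matrix(true_labels, outputs, n_classes=3):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for label, excitations in zip(true_labels, outputs):
        matrix[label][predict(excitations)] += 1
    return matrix


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(matrix)]


def excitation_margin(excitations, target):
    """Target neuron's excitation minus the strongest competing neuron's."""
    others = [e for i, e in enumerate(excitations) if i != target]
    return excitations[target] - max(others)


if __name__ == "__main__":
    # Toy excitations for four images; the true labels index into CLASSES.
    outputs = [[0.9, 0.05, 0.05], [0.2, 0.1, 0.7], [0.1, 0.8, 0.1], [0.1, 0.2, 0.7]]
    labels = [0, 0, 1, 2]
    matrix = confusion_matrix(labels, outputs)
    for name, row in zip(CLASSES, matrix):
        print(name, row)
    print(per_class_recall(matrix))
```

In this toy data, the second metal image is predicted as concrete, the kind of error that a concrete background could plausibly encourage.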
We also inferred that concrete backgrounds can adversely affect the FOD material recognition performance, leading to the misclassification of metal or plastic as concrete. Therefore, our future work will investigate possible approaches to introduce image segmentation to distinguish metal and plastic from concrete. Other technologies, such as radar or infrared imaging, may be required for better recognition results.
This work was supported by the National Natural Science Foundation of China (No. 61170155).
This work was also supported by the Science & Technology Commission of Shanghai Municipality (No. 15DZ1100502).
Availability of data and materials
The FOD material dataset can be downloaded from Google Drive.
HYX proposed the framework of this work and drafted the manuscript. ZQH designed the proposed algorithm and performed all the experiments. SLF and YCF offered useful suggestions and helped in modifying the manuscript. ZH helped in drafting the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
- M. J. O'Donnell, “Airport foreign object debris (FOD) detection equipment,” FAA, AC 150/5220-24 (2009).
- WIKI, Foreign object damage. https://en.wikipedia.org/wiki/Foreign_object_damage. Accessed Nov 2016.
- CAAC Airport Division, et al., “FOD Prevention Manual,” (2009)
- H Zhang et al., The current status and inspiration of FOD industry in international civil aviation. Civ. Aviat. Manage. 295, 58–61 (2015)
- Z Han et al., A novel FOD classification system based on visual features, In International Conference on Image and Graphics, LNCS Vol. 9217 (2015), pp. 288–296. https://doi.org/10.1007/978-3-319-21978-3_26
- Z Han, Y Fang, H Xu, Fusion of low-level feature for FOD classification, In 10th International Conference on Communications and Networking in China (2015), pp. 465–469. https://doi.org/10.1109/CHINACOM.2015.7497985
- M Jehle, C Sommer, B Jähne, Learning of optimal illumination for material classification, In 32nd Annual Symposium of the German Association for Pattern Recognition, LNCS Vol. 6376 (2010), pp. 563–572. https://doi.org/10.1007/978-3-642-15986-2_57
- C Liu, J Gu, Discriminative illumination: per-pixel classification of raw materials based on optimal projections of spectral BRDF. IEEE Trans. Pattern Anal. Mach. Intell. 36(1), 86–98 (2014). https://doi.org/10.1109/TPAMI.2013.110
- C Liu, G Yang, J Gu, Learning discriminative illumination and filters for raw material classification with optimal projections of bidirectional texture functions, In Computer Vision and Pattern Recognition (2013), pp. 1430–1437. https://doi.org/10.1109/CVPR.2013.188
- H Zhang, K Dana, K Nishino, Reflectance hashing for material recognition, In Computer Vision and Pattern Recognition (2015), pp. 3071–3080. https://doi.org/10.1109/CVPR.2015.7298926
- J Filip, P Somol, Materials classification using sparse gray-scale bidirectional reflectance measurements, International Conference on Computer Analysis of Images and Patterns, LNCS. Vol. 9257 (2015), pp. 289–299. https://doi.org/10.1007/978-3-319-23117-4
- E Hayman, B Caputo, M Fritz, JO Eklundh, On the significance of real-world conditions for material classification, In European Conference on Computer Vision, LNCS. Vol. 3024 (2004), pp. 253–266. https://doi.org/10.1007/978-3-540-24673-2_21
- B Caputo, E Hayman, M Fritz, JO Eklundh, Classifying materials in the real world. Image Vis. Comput. 28(1), 150–163 (2010). https://doi.org/10.1016/j.imavis.2009.05.005
- B Caputo, E Hayman, P Mallikarjuna, Class-specific material categorization, In 10th IEEE International Conference on Computer Vision, Proc. IEEE Int. Conf. Comput. Vision II (2005), pp. 1597–1604. https://doi.org/10.1109/ICCV.2005.54
- M Varma, A Zisserman, Classifying images of materials: achieving viewpoint and illumination independence, In European Conference on Computer Vision, LNCS Vol. 2352 (2002), pp. 255–271. https://doi.org/10.1007/3-540-47977-5_17
- M Varma, A Zisserman, A statistical approach to material classification using image patch exemplars. IEEE Trans. Pattern Anal. Mach. Intell. 31(11), 2032–2047 (2009). https://doi.org/10.1109/TPAMI.2008.182
- M Varma, A Zisserman, A statistical approach to texture classification from single images. Int. J. Comput. Vis. 62(1-2), 61–81 (2005). https://doi.org/10.1023/B:VISI.0000046589.39864.ee
- L Liu, PW Fieguth, D Hu, Y Wei, G Kuang, Fusing sorted random projections for robust texture and material classification. IEEE Trans. Circuits. Syst. Vid. Technol. 25(3), 482–496 (2015). https://doi.org/10.1109/TCSVT.2014.2359098
- D Hu, L Bo, Toward robust material recognition for everyday objects, In British Machine Vision Conference, Proc. BMVC (2011), pp. 1–11. https://doi.org/10.5244/C.25.48
- C Liu, L Sharan, EH Adelson, R Rosenholtz, Exploring features in a Bayesian framework for material recognition, In 2010 IEEE Conference on Computer Vision and Pattern Recognition, Proc. CVPR (2010), pp. 239–246. https://doi.org/10.1109/CVPR.2010.5540207
- L Sharan, C Liu, R Rosenholtz, EH Adelson, Recognizing materials using perceptually inspired features. Int. J. Comput. Vis. 103(3), 348–371 (2013). https://doi.org/10.1007/s11263-013-0609-0
- L Sharan, R Rosenholtz, EH Adelson, Accuracy and speed of material categorization in real-world images. J. Vis. 14(9), 1–24 (2014). https://doi.org/10.1167/14.9.12
- M Cimpoi et al., Describing textures in the wild, In 2014 IEEE Conference on Computer Vision and Pattern Recognition, Proc. CVPR (2014), pp. 3606–3613. https://doi.org/10.1109/CVPR.2014.461
- M Cimpoi, S Maji, A Vedaldi, Deep filter banks for texture recognition and segmentation, In 2015 IEEE Conference on Computer Vision and Pattern Recognition, Proc. CVPR (2015), pp. 3828–3836. https://doi.org/10.1109/CVPR.2015.7299007
- S Bell, P Upchurch, N Snavely, K Bala, Material recognition in the wild with the materials in context database, In 2015 IEEE Conference on Computer Vision and Pattern Recognition, Proc. CVPR (2015), pp. 3479–3487. https://doi.org/10.1109/CVPR.2015.7298970
- Y. Zhang et al., “Integrating deep features for material recognition,” (2015) [arXiv:1511.06522].
- H Son, C Kim, N Hwang, C Kim, Y Kang, Classification of major construction materials in construction environments using ensemble classifiers. Adv. Eng. Inform. 28(1), 1–10 (2014). https://doi.org/10.1016/j.aei.2013.10.001
- A Dimitrov, M Golparvar-Fard, Vision-based material recognition for automated monitoring of construction progress and generating building information modeling from unordered site image collections. Adv. Eng. Inform. 28(1), 37–49 (2014). https://doi.org/10.1016/j.aei.2013.11.002
- JJ Koenderink et al., Bidirectional reflection distribution function of thoroughly pitted surfaces. Int. J. Comput. Vis. 31(2-3), 129–144 (1999). https://doi.org/10.1023/A:1008061730969
- M Oren, Generalization of the Lambertian model and implications for machine vision. Int. J. Comput. Vis. 14(3), 227–251 (1995). https://doi.org/10.1007/BF01679684
- GE Hinton et al., A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006). https://doi.org/10.1162/neco.2006.18.7.1527
- K. He et al., “Deep residual learning for image recognition,” (2015) [arXiv:1512.03385].
- O Russakovsky et al., ImageNet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015). https://doi.org/10.1007/s11263-015-0816-y
- J Deng et al., ImageNet: a large-scale hierarchical image database, In 2009 IEEE Conference on Computer Vision and Pattern Recognition, Proc. CVPR (2009), pp. 248–255. https://doi.org/10.1109/CVPR.2009.5206848
- A Krizhevsky, I Sutskever, GE Hinton, ImageNet classification with deep convolutional neural networks, In 26th Annual Conference on Neural Information Processing Systems, Adv. Neural Inf. Proces. Syst (2012), pp. 1097–1105
- C Szegedy et al., Going deeper with convolutions, In 2015 IEEE Conference on Computer Vision and Pattern Recognition, Proc. CVPR (2015), pp. 1–9. https://doi.org/10.1109/CVPR.2015.7298594
- C Yan et al., Effective Uyghur language text detection in complex background images for traffic prompt identification. IEEE Trans. Intell. Transport. Syst. 19(1), 220–229 (2018). https://doi.org/10.1109/TITS.2017.2749977
- Labeled Faces in the Wild. http://vis-www.cs.umass.edu/lfw/results.html#glasssix. Accessed Sept 2017.
- Large Scale Visual Recognition Challenge 2017 (ILSVRC2017), Task 1a: Object detection with provided training data. http://image-net.org/challenges/LSVRC/2017/results. Accessed Sept 2017.
- Y. Chen, J. Li, H. Xiao et al., “Dual path networks,” (2017) [arXiv:1707.01629].
- C Yan et al., Supervised hash coding with deep neural network for environment perception of intelligent vehicles, IEEE Transactions on Intelligent Transportation Systems (2018). https://doi.org/10.1109/TITS.2017.2749965
- K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” (2014) [arXiv:1409.1556].
- L Shao, F Zhu, X Li, Transfer learning for visual categorization: a survey. IEEE Trans. Neural Netw. Learn. Syst. 26(5), 1019–1034 (2015)
- F Zhuang et al., Survey on transfer learning research. J. Software 26(1), 26–39 (2015). https://doi.org/10.13328/j.cnki.jos.004631
- Y Sun, X Wang, X Tang, Deeply learned face representations are sparse, selective, and robust, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Proc. CVPR (2015), pp. 2892–2900
- A. K. Reyes, et al. “Fine-tuning deep convolutional networks for plant recognition,” CLEF (Working Notes), 2015.
- P. Sermanet, et al. “Overfeat: Integrated recognition, localization and detection using convolutional networks,” [arXiv:1312.6229] (2013).
- V Mnih et al., Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
- Y Jia et al., Caffe: convolutional architecture for fast feature embedding, In 2014 ACM Conference on Multimedia, Proc. ACM Conf. Multimedia (2014), pp. 675–678. https://doi.org/10.1145/2647868.2654889
- C Yan, Y Zhang, J Xu, et al., A highly parallel framework for HEVC coding unit partitioning tree decision on many-core processors. IEEE Signal. Process. Letters 21(5), 573–576 (2014)
- C Yan et al., Efficient parallel framework for HEVC motion estimation on many-core processors. IEEE Trans. Circuits. Syst. Vid. Technol. 24(12), 2077–2089 (2014)
- C Yan et al., Parallel deblocking filter for HEVC on many-core processor. Electron. Lett. 50(5), 367–368 (2014)
- C Yan et al., Efficient parallel HEVC intra-prediction on many-core processor. Electron. Lett. 50(11), 805–806 (2014)