An overview of touchless 2D fingerprint recognition

Touchless fingerprint recognition represents a rapidly growing field of research which has been studied for more than a decade. Through a touchless acquisition process, many issues of touch-based systems are circumvented, e.g., the presence of latent fingerprints or distortions caused by pressing fingers on a sensor surface. However, touchless fingerprint recognition systems reveal new challenges. In particular, a reliable detection and focusing of a presented finger as well as an appropriate preprocessing of the acquired finger image represent the most crucial tasks. Also, further issues, e.g., interoperability between touchless and touch-based fingerprints or presentation attack detection, are currently investigated by different research groups. Many works have been proposed so far to put touchless fingerprint recognition into practice. Published approaches range from self-identification scenarios with commodity devices, e.g., smartphones, to high-performance on-the-move deployments paving the way for new fingerprint recognition application scenarios. This work summarizes the state of the art in the field of touchless 2D fingerprint recognition at each stage of the recognition process. Additionally, technical considerations and trade-offs of the presented methods are discussed along with open issues and challenges. An overview of available research resources completes the work.


Introduction
Fingerprints, i.e., ridge and valley patterns on the tip of a human finger, are one of the most important biometric characteristics due to their known uniqueness and persistence properties [1,2]. Automated touch-based fingerprint recognition has been a topic of research for several decades [3]. Nowadays, large-scale touch-based fingerprint recognition systems are not only used worldwide by law enforcement and forensic agencies, but they are also deployed in the mobile market and in nation-wide applications [2,4]. However, the touch-based fingerprint capturing process suffers from distinct problems, e.g., signals of low contrast caused by dirt or humidity on the sensor plate, latent fingerprints of previous users, or distortions due to elastic deformation of the finger caused by the pressure which is put on the sensor plate [5]. In addition, an inconvenient acquisition process and hygienic concerns may lower the user acceptability of touch-based fingerprint systems and hence, limit their deployment.
To tackle these shortcomings of touch-based fingerprint recognition systems, the first touchless (also referred to as contactless) fingerprint recognition scheme was proposed by Song et al. in 2004 [6]. Since then, a constantly growing number of contributions related to this topic have been published each year by numerous research laboratories working in the field of biometrics, as illustrated in Fig. 1. Conceptual advantages like a less constrained acquisition process pave the way for new applications and improve usability and hence user acceptance. Further, finger images acquired by a touchless sensor exhibit no deformation and comprise no latent fingerprints. These major advantages motivated a large number of works published in recent years.
This work aims at providing a comprehensive overview of published scientific literature in the field of touchless fingerprint recognition. It is not intended to re-evaluate proposed approaches, as implementations of many works are not publicly available and re-implementations might lack important optimizations or require specific sensor hardware. Moreover, for technical details of surveyed approaches, the reader is referred to the corresponding publications. Where possible, results of published works are presented in a comparative manner. If authors provided a single result in the publication text (e.g., in the abstract or summary), those values are taken directly. Otherwise, a representative result is chosen in good faith from the presented plots and tables.
While touchless fingerprint recognition technologies have been investigated for some years, the corresponding literature is dispersed across different publication media and overview works mostly focus on specific process modules. Parziale and Chen [7] elaborated on the differences of 2D and 3D acquisition technologies, processing strategies, and quality aspects. Further, the authors gave an overview on presentation attack detection (PAD) schemes. Khalil and Wan [8] reviewed state-of-the-art algorithms along the preprocessing pipeline and addressed PAD. Even though their work highlights some important issues in the field, it lacks a comprehensive discussion of current approaches. Labati et al. [5] conducted a comparative overview of 2D versus 3D touchless fingerprint recognition and addressed the processing of touchless fingerprints to touch-based equivalent fingerprints using unwrapping algorithms. Moreover, the authors provided a high-level discussion of different feature extraction and comparison subsystems. A brief survey of mobile touchless fingerprint recognition using smartphones as capturing devices has been presented by Malhotra et al. [9]. Mil'shtein and Pillai [10] presented a short comparative review of touchless and touch-based schemes as well as a selective summary of state-of-the-art touchless acquisition techniques. In addition, the authors briefly discussed challenges of touchless recognition. Labati et al. [11] provided a more elaborate overview of the whole recognition pipeline which is completed by a discussion of liveness detection algorithms, nonidealities of current approaches, and a performance summary. As previously mentioned, the published overview papers are mostly restricted to particular subsets of the topic, i.e., subsystems of a touchless fingerprint recognition system.
Since the existing surveys are either not comprehensive or outdated, this work aims at providing a more complete overview of the state of the art of touchless 2D fingerprint recognition. The first part is structured according to the pipeline of a touchless fingerprint recognition system. It provides the reader with a brief overview of the main processing steps, as well as a detailed summary of proposed approaches. In a second part, an in-depth discussion of issues and challenges is provided. Furthermore, available research resources are described in detail. This summary primarily addresses biometric researchers and practitioners aiming to gain an overview of the current state of the art of the topic.
Apart from the standardized terms and definitions [12], the following taxonomy will be used throughout this work:
• Finger image or finger photo refers to an image acquired using a touchless capture device, e.g., smartphone camera, which contains one or more fingers of a subject.
• Fingerprint image refers to a finger image cropped to an area representing a fingerprint, i.e., fingertips.
• Fingerprint refers to a preprocessed touchless fingerprint image or a fingerprint captured by a touch-based sensor.
Furthermore, a distinction is made between the capturing of a finger image without any preprocessing and the acquisition of a fingerprint image which includes an enhancement by some preprocessing algorithms. It should be noted that the ISO/IEC 2382 Part 37 standard suggests the usage of the term capturing process [12].
The general biometric workflow of a touchless fingerprint recognition system is sketched in Fig. 2. The first part of this work is structured accordingly: Section 2 describes different finger image capturing approaches. In Section 3, the processing steps which are necessary to achieve a high-quality biometric sample are described. Section 4 highlights touchless quality assessment followed by a summary of feature extraction and comparison approaches in Section 5 and Section 6. The second part discusses different issues and challenges in Section 7. An overview on touchless biometric databases is further given in Section 8. Section 9 finally draws a conclusion.

Capturing process
During a touchless capturing process, one or more fingers are presented to an optical capturing device. These devices can either be prototypical hardware designs assembled by the researchers or general purpose devices which are adapted to the special needs of touchless fingerprint recognition. The National Institute of Standards and Technology (NIST) [13] published a guidance document for the evaluation of touchless fingerprint capturing. The document accurately defines requirements for the assembly of touchless fingerprint capturing devices with respect to different application scenarios. Figure 3 depicts impressions of a fingerprint captured with a touch-based fingerprint sensor (Fig. 3a) and the corresponding finger image acquired using a touchless device (Fig. 3b). It is observable that the touch-based fingerprint can be directly used for feature extraction whereas the corresponding touchless fingerprint image requires further preprocessing.

Prototypical hardware design
Many prototypical hardware designs rely on elaborate capturing technologies adopted from other research areas to obtain finger images of high quality. Table 1 lists the most relevant works categorized by approach and ordered by the year of publication. All listed approaches focus on overcoming known challenges of touchless fingerprint capturing like unconstrained environmental influences, the lack of deformations, or focusing issues. Several authors combine a box-like setup with LEDs to achieve a predictable illumination and to exclude environmental influences [15,16,18]. LED arrangements around the finger lead to a homogeneous contrast on the fingerprint area. Colored illumination can also emphasize the fingerprint characteristics and hence lead to improved results [16].
The majority of capturing setups used finger guidance in the form of circular holes [16] or fixed finger placements [20]. Tsai et al. [17] presented a more unconstrained approach which works without a box and finger guidance. The authors used a strong illumination combined with a small distance between the lens and the fingertip to minimize the influence of ambient light. A variable-focus liquid lens was able to acquire high-quality finger images of moving fingers.
To overcome the issue of fingerprint distortions, Palma et al. [20] and Mil'shtein et al. [14] presented capturing devices using rotating line scan cameras. The acquired finger image slices were merged into a nail-to-nail rolled fingerprint image. This impression has significantly fewer distortions than a touch-based fingerprint. Alternatively, Wang et al. [15] suggested a setup of three cameras arranged around the fingertip to acquire finger photos of different orientations which are stitched together. A continuous image analysis assessed if the finger was positioned properly and enabled a convenient capturing of high-quality finger images.
Mil'shtein et al. [14] and Ramachandra et al. [18] showed the possibility of combining the capturing of fingerprints and finger veins in multi-modal devices. Ramachandra et al. [18] used low-cost equipment such as an industrial camera with a monochrome sensor. Weissenfeld et al. [19] introduced a mobile hand-held device which captured face and finger images using a single sensor.

General purpose devices
In contrast to elaborate hardware setups, many research groups use general purpose devices to capture finger images. The most relevant approaches are summarized in Table 2, sorted by type of recording device. First experiments on general purpose devices were conducted by Lee et al. [21] who used the camera of a mobile phone with an external LED light to acquire finger images. Hiew et al. [36] also used an external illumination along with a semi-professional camera in a box setup. In both schemes, the finger images were acquired completely manually.
Several early works investigated the applicability of webcams for finger image acquisition. Their major advantages are an affordable price and easy connectivity to a computer [24,26,27]. All contributions used a manual capturing process and no additional illumination. Additionally, Piuri and Scotti [24] conducted an experiment with external illumination but were not able to achieve significant performance benefits. Nevertheless, the authors reported accurate results in a touchless to touch-based interoperability scenario. It is worth noting that despite the rather low image quality of webcams, a biometric recognition scenario could be established with such devices [26] using level-0 features. Level-0 features typically refer to local texture patterns like line structures or dominant local orientations.
Nowadays, smartphones are most often used for capturing because they are widely available, have high-quality cameras, and can provide immediate user feedback. Here, the most promising settings are to keep the auto-focus activated and to use the macro mode if available. Additionally, the flash should be enabled [29,37]. External extensions like additional lights and macro lenses are considered beneficial by Sagiroglu et al. [38].
Several authors suggested using on-screen finger guidance for a high user convenience and an easier fingerprint processing workflow [29,33,35]. Here, the camera view presented on the screen is combined with a line representing the finger contour. Modern smartphones are able to process and qualify video streams in order to select the frame which contains a finger image of high quality [30,39]. In this way, a convenient automatic capturing comparable to the approach of Wang et al. [15] can be established. Moreover, Carney et al. [33] and Weissenfeld et al. [40] proposed the capturing of a whole slap hand in one image which makes the capturing of up to four fingerprints more convenient. Several works considered finger image capturing under different environmental influences [32,41,42,43]. The authors concluded that the capturing itself is not limited by different light situations or indoor and outdoor environments. Nevertheless, varying backgrounds might have a major influence on further processing.
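The selection of a high-quality frame from a video stream can be illustrated with a simple focus measure. The following is a minimal sketch, not the procedure of any cited work: it assumes grayscale frames given as 2D arrays and uses the Tenengrad measure (mean squared Sobel gradient magnitude) as the only quality criterion. The function names `sobel_gradients`, `tenengrad`, and `select_sharpest_frame` are hypothetical.

```python
import numpy as np

def sobel_gradients(img):
    """Return horizontal and vertical 3x3 Sobel responses of a 2D image."""
    p = np.pad(img.astype(float), 1, mode="edge")
    # Sobel kernels written out with shifted views of the padded image.
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return gx, gy

def tenengrad(img):
    """Tenengrad focus measure: mean squared gradient magnitude."""
    gx, gy = sobel_gradients(img)
    return float(np.mean(gx ** 2 + gy ** 2))

def select_sharpest_frame(frames):
    """Return the index of the frame with the highest focus score, and all scores."""
    scores = [tenengrad(f) for f in frames]
    return int(np.argmax(scores)), scores
```

In practice, the score would likely be computed on the detected finger region only, since a sharp background can otherwise dominate the measure.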
Due to the huge variety of smartphones, several works investigated interoperability between different models [28,34,41]. It is observable that there are no huge performance differences between particular models of the same generation. Deb et al. [34] also showed that fingerprint images acquired by low-cost smartphones could be compared to touch-based fingerprints. The tested commercial apps showed a practical biometric performance.
A nail-to-nail rolled equivalent touchless finger image is a desirable goal to achieve a large region of interest (ROI). Alkhathami et al. [31] proposed a nail-to-nail rolled finger image by mosaicking three images acquired sequentially with one smartphone. During the capturing, the subject was asked to perform a virtual rolling of the finger. All three images were stitched together to form a larger fingerprint.
Level-3 characteristics, i.e., sweat pores, on touchless image data were first analyzed by Genovese et al. [23]. The authors used an off-the-shelf camera and a green LED illumination. In a constrained setup with a fixed distance between finger and sensor, the authors captured accurate finger images with a resolution of ≈3800 ppi, which is sufficient for extracting level-3 features.

Preprocessing pipeline
The captured image data differs fundamentally between touchless and touch-based acquisition devices. Most touch-based schemes produce a gray-scale image in which the ridge skin area touching the scanner's surface is shown in black (or dark gray values) while valley and background area is white (or light gray values). In general, these samples are used directly for feature extraction without extensive preprocessing. The majority of touchless finger image acquisition schemes deliver color images which require a comprehensive preprocessing prior to the extraction of features. Basic challenges are a low ridge-valley contrast, a blurred ROI, and a displaced, rotated, or pitched finger. Further, principally different appearances, e.g., the lack of skin deformation, cause incompatibilities. The image processing pipeline has to be developed dependent on the selected device and the observed environmental circumstances during the capturing. For an example finger image, a touchless preprocessing pipeline is illustrated in Fig. 4. In recent years, touchless finger image preprocessing evolved to a heterogeneous topic of research with many different approaches and contributors. Unfortunately, the field lacks a harmonized vocabulary in order to compare different approaches. To get a clear understanding of the preprocessing steps, we define frequently used terms as follows:
1. Finger detection: in the initial step, one or more fingers are detected (or segmented), e.g., based on color or shape analysis.
In 2012, Khalil and Wan [8] presented a survey on the special topic of preprocessing finger images acquired with mobile phones. The authors highlighted the relevance of this field of research and summarized the differences between the touchless and the touch-based domain.
Elaborate preprocessing workflows have to be developed especially for commodity devices in order to compensate for the limited capabilities of built-in cameras and environmental side effects. The following subsections summarize proposed approaches for each processing stage. Table 3 additionally highlights fundamental challenges of processing touchless finger images and lists suggested methods to overcome these challenges.

Finger detection and segmentation
Unconstrained capturing systems, which do not have a finger guidance based on dedicated hardware or an on-screen guidance, require a finger detection. Such an algorithm detects the position and orientation of the finger and forms the basis for an automatic capturing system. The image is then segmented and cut to the area containing the fingerprint. Four different approaches can be distinguished, whereas in practice implementations often apply a combination of them:
• Sharpness: Sharpness-based approaches exploit the difference between the focused sharp finger area and the blurred background. This effect is most suitable on images acquired with a very small finger-to-sensor distance and a wide open aperture. The early work of Lee et al. [49] presented a fixed focus real-time scheme, which selected the best focused and oriented image out of a series. The authors investigated the suitability of general purpose focus measuring algorithms. Their experiment showed that the Variance-Modified-Laplacian of Gaussian (VMLOG) algorithm is best suited for the touchless fingerprint capturing device they used. The authors also compared a finger-moving method with a fixed lens to a lens-moving method with a fixed distance between sensor and finger. They concluded that the former method is preferable, which is questionable from today's perspective. A subsequent work by the same authors [21] compared three segmentation approaches. One of them was sharpness-based and used the Tenengrad method [50] in the frequency domain.
Here, a Sobel operator was used to calculate the horizontal and vertical gradients in the image. A certain threshold was established to separate the sharp foreground from the background area. Lee et al. [51] aimed at selecting the best focused image out of a video stream. The authors proposed an algorithm based on a Gaussian filter to segment the sharp regions of an image which corresponded to the finger region.
• Shape: The shape of a finger is highly similar across all finger position codes (i.e., the various finger instances from thumb to little finger), which enables a detection via shape. Jonietz et al. [52] proposed a combination of shape- and color-based finger detection using edge pairing. The authors applied machine learning-based algorithms to the binarized image in the LUV color model. They also used Histogram of Oriented Gradients (HOG) features with rich feature descriptors as a baseline and compared their results against it.
• Contrast and color: Especially if a certain illumination is used, a determination based on contrast or color is an efficient mechanism for finger detection. Based on findings of Hiew et al. [53] for the segmentation in skin and background area, an analysis of the YCbCr color space represents the most promising approach. The result is a binary image with a separation between finger image area and background. The above approach is widely adopted, modified to meet different prerequisites, and further investigated by many authors [37,46,54,55]. Ravi and Sivanath [27] showed that extending the Cr component with information of the HSV and nRGB color space enables a precise isolation of a finger. The authors used a certain threshold for every color channel and merged the results. Wang et al. [44] presented comprehensive research on different finger illuminations and color models. For this purpose, the authors captured images with green, red, and blue illumination and compared the YCbCr color model with YIQ and HSV. Alternatively, other color models such as CMYK (magenta channel) [9] and CIELAB [39] were also investigated. This approach was adopted in many other preprocessing workflows similar to [37,46,55]. Because of prerequisites during the capturing process, most approaches considered only the largest segmented area as fingerprint [37,55]. The color-based segmentation is often combined with an adaptive thresholding, e.g., based on Otsu image thresholding [9,44,46,53]. Hiew et al. [53] also determined the mean and covariance on the CbCr channels to improve the segmentation accuracy. Another approach by Lee et al. [21] exploited skin color properties with the help of guided machine learning. This approach was shown to yield competitive results but is more complex compared to others. As a second scheme, the authors suggested a region growing approach. Using an initial seed and a similarity measure with a certain threshold, the tested pixels were added to the seed.
This approach is also suitable for ROI extraction. With mean shift segmentation, Ramachandra et al. [41] proposed another contrast-based approach. The algorithm filters the input image in the spatial domain and segments it by fusing the convergence points in homogeneous regions. With this elaborate approach, the authors were able to achieve accurate results in challenging environments. Priesnitz et al. [56] presented a deep learning-based semantic segmentation scheme for the hand area as well as fingertips. The authors used a general purpose hand gesture dataset to test their algorithm against a color-based baseline segmentation algorithm. The proposed method showed accurate results especially in challenging environmental conditions. It should be critically noted that none of the discussed approaches conducted a wider analysis on different skin color types, e.g., as defined in [57].
• Image depth information: Jonietz and Jivet [58] presented a segmentation approach using the information of a depth sensor combined with an RGB image captured by smartphones. The authors were able to extract the slap hand from a busy background and proposed further processing. Exploiting the images' depth information, the system worked especially well in the presence of objects of similar color, e.g., when two hands were placed on top of each other.
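The color-based segmentation described in the list above can be illustrated with a short sketch. It is not the implementation of any cited work: it assumes an RGB image with 8-bit channels, the full-range ITU-R BT.601 conversion for the Cr channel, and a plain Otsu threshold on Cr; the function names are hypothetical, and post-processing such as keeping only the largest connected region is omitted.

```python
import numpy as np

def rgb_to_cr(rgb):
    """Cr chroma channel (BT.601, full range) of an RGB uint8 image."""
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    b = rgb[..., 2].astype(float)
    return 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b

def otsu_threshold(values, bins=256):
    """Otsu's method: the threshold that maximizes the between-class variance."""
    hist, edges = np.histogram(values, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)              # probability of the lower class
    mu = np.cumsum(p * centers)    # cumulative mean
    mu_t = mu[-1]                  # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * w0 - mu) ** 2 / (w0 * (1.0 - w0))
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0, neginf=0.0)
    # The upper edge of the best bin separates the two classes.
    return edges[np.argmax(sigma_b) + 1]

def segment_finger(rgb):
    """Binary mask: True where Cr is at or above the Otsu threshold (skin candidate)."""
    cr = rgb_to_cr(rgb)
    return cr >= otsu_threshold(cr.ravel())
```

The assumption that skin has a higher Cr value than the background holds for many scenes but not for reddish backgrounds, which is one reason why the surveyed works combine several color channels or add shape and sharpness cues.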

ROI extraction, orientation estimation, and core point detection
Once a finger is detected, the ROI has to be extracted, which includes the normalization to a proper width, height, and resolution. This preprocessing stage assumes an extracted finger image as input. It should be noted that, especially in more constrained setups, finger detection and ROI extraction are done in one step [41].
In their work, Piuri and Scotti [24] simplified the color-based segmentation approach of Lee et al. [21] for ROI extraction. The authors combined this approach with a frequency estimation map. Moreover, they used a Gaussian probability density function and performed a region growing in order to extract the ROI. A comparable approach by Hiew et al. [53] exploited the ridge line characteristics of the fingertip. Here, the segmented finger was divided into non-overlapping blocks. If a ridge line characteristic was observable within a block, it was added to the ROI. Ramachandra et al. [41] showed that in constrained setups an ROI extraction based on finger geometry properties is also possible. The authors computed the ROI statically by detecting characteristic points like the fingertip and discontinuities.
Since most feature extractors are not invariant to rotation, all finger images must have the same orientation. Lee et al. [51] presented a rolling and pitching estimation by calculating the distance between the core point and the border of the fingertip. Lee et al. [21] estimated the orientation by iteratively applying a robust regression method. The scheme used the Sobel operator on sub-blocks of the input image to compute the orientation of the local gradients. A simple technique on segmented finger images is to approximate a tangent along the border between finger and background and rotate the image to a predefined orientation [29]. In contrast to the aforementioned contributions, Ramachandra et al. [41] proposed a preprocessing pipeline without a rotation stage in combination with a rotation-invariant feature extractor. Sisodia et al. [55] also introduced an approach which rotates minutiae features. Here, a minutia which is above a predefined correlation threshold had to be determined in the probe and reference images. Together with the core points of both images, a rotation angle was computed. Regarding an application to large-scale databases, the performance of this approach is questionable.
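The estimation of the finger orientation on a segmented image can be sketched as follows. Instead of the tangent approximation or the regression schemes mentioned above, this sketch swaps in a different, simpler technique: the principal axis of the binary finger mask obtained from its second-order central moments. The function name `finger_axis_angle` and the angle convention are assumptions made for illustration.

```python
import numpy as np

def finger_axis_angle(mask):
    """Angle in degrees of the principal axis of a binary mask.

    The angle is measured from the image x-axis (columns) towards the
    y-axis (rows, pointing down) and lies in [-90, 90].
    """
    ys, xs = np.nonzero(mask)
    x = xs - xs.mean()
    y = ys - ys.mean()
    # Second-order central moments of the foreground pixels.
    mu20 = np.mean(x * x)
    mu02 = np.mean(y * y)
    mu11 = np.mean(x * y)
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return float(np.degrees(theta))
```

Rotating the image by the difference between this angle and the desired orientation (e.g., 90 degrees for an upright finger) would then align all samples; the rotation itself is left to an image library.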
Many comparison algorithms require a core point or a Principal Singular Point (PSP) as reference point. Several works used the ridge line orientation and curvature for detection of the core point [53,55]. Labati et al. [47] suggested a rather complex approach which estimates all singular points from the global ridge structure using computational intelligence classification techniques. Lee et al. [51] used the Poincaré index from the touch-based domain described in [59] to roughly determine the core point.
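The Poincaré index mentioned above can be sketched for a single position of a precomputed ridge orientation field. This is a minimal illustration under stated assumptions, not the method of [51] or [59]: the orientation field `theta` is assumed to be given in radians and defined modulo π, the closed path is the 8-neighbourhood of the tested position, and with the chosen traversal direction a value near +1/2 indicates a core-like and a value near -1/2 a delta-like singular point.

```python
import numpy as np

def poincare_index(theta, i, j):
    """Poincaré index at the interior position (i, j) of an orientation field."""
    # Closed path around (i, j), given as (row, column) offsets.
    ring = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    angles = [theta[i + di, j + dj] for di, dj in ring]
    total = 0.0
    for k in range(8):
        d = angles[(k + 1) % 8] - angles[k]
        # Orientations are only defined modulo pi: wrap into [-pi/2, pi/2).
        d = (d + np.pi / 2) % np.pi - np.pi / 2
        total += d
    return total / (2.0 * np.pi)
```

On real data, the orientation field is usually smoothed first and the index is evaluated for every block, with candidate positions filtered afterwards, which matches the "rough" core point determination described above.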

Fingerprint image enhancement
After the extraction of the ROI, ridge line characteristics have to be further emphasized to extract features accurately. Simple approaches only adapt fingerprint images with kernel-based operations in the spatial domain [53], whereas more elaborate algorithms exploit combinations of different filters in the frequency domain [24].
Finger image enhancement should result in a fingerprint image which has a homogeneous illumination. A normalization using mean and variance filters [53] or histogram enhancements like Contrast Limited Adaptive Histogram Equalization (CLAHE) [46,60] were found to be well-suited for this task. Malhotra et al. [9] also suggested the analysis of Local Binary Patterns (LBP) on the ridge-valley contrast for enhancement. Moreover, Wasnik et al. [39] suggested a Frangi filter which searches for tubular structures.
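A normalization based on mean and variance can be sketched as follows. The sketch is an assumption-laden illustration rather than the filter of [53]: each non-overlapping block of a grayscale image is shifted and scaled to a hypothetical target mean and variance, which compensates slowly varying illumination. The block size and target values are arbitrary example parameters.

```python
import numpy as np

def normalize_mean_variance(img, target_mean=128.0, target_var=1000.0):
    """Shift and scale an image region to a prescribed mean and variance."""
    img = img.astype(float)
    mean, var = img.mean(), img.var()
    if var == 0:
        # A flat region carries no ridge information.
        return np.full(img.shape, target_mean)
    return target_mean + (img - mean) * np.sqrt(target_var / var)

def normalize_blocks(img, block=16, target_mean=128.0, target_var=1000.0):
    """Apply the normalization independently to non-overlapping blocks."""
    out = np.empty(img.shape, dtype=float)
    for r in range(0, img.shape[0], block):
        for c in range(0, img.shape[1], block):
            out[r:r + block, c:c + block] = normalize_mean_variance(
                img[r:r + block, c:c + block], target_mean, target_var)
    return out
```

Independent blocks can produce visible seams at block borders; CLAHE, as cited above, addresses this by interpolating between neighbouring regions.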
An important issue is the reduction of blur in the source image. To ensure this, Piuri and Scotti [24] proposed a combination of the Lucy-Richardson and the Wiener filter. In addition, they suggested a blind deconvolution method to enhance images which could not be handled by the algorithms proposed previously.
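To illustrate the deblurring step, the following sketch shows only the Wiener part in its simplest frequency-domain form. It is not the combined scheme of [24]: the point spread function (PSF) is assumed to be known, the blur is assumed to be a circular convolution, and the noise-to-signal ratio is approximated by a constant `k`. The helper names `psf_to_otf` and `wiener_deconvolve` are hypothetical.

```python
import numpy as np

def psf_to_otf(psf, shape):
    """Zero-pad a small PSF to the image shape and centre it at the origin."""
    pad = np.zeros(shape, dtype=float)
    ph, pw = psf.shape
    pad[:ph, :pw] = psf
    pad = np.roll(pad, (-(ph // 2), -(pw // 2)), axis=(0, 1))
    return np.fft.fft2(pad)

def wiener_deconvolve(blurred, psf, k=1e-2):
    """Wiener filter with a constant noise-to-signal ratio k."""
    H = psf_to_otf(psf, blurred.shape)
    G = np.fft.fft2(blurred.astype(float))
    F = np.conj(H) / (np.abs(H) ** 2 + k) * G
    return np.real(np.fft.ifft2(F))
```

A larger `k` suppresses noise amplification at the cost of a less sharp result. When the PSF is unknown, as is typical for defocused finger images, it has to be estimated, which is where blind deconvolution comes in.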
Liu et al. [60] combined noise removal, illumination correction, and histogram equalization in the spatial domain with a ridge line frequency estimation based on Gabor filters. Additionally, a context-based correction is suggested to emphasize the ridge line structure on low-reliability areas. This approach compares blocks (patches) of the fingerprint with a dictionary and substitutes these blocks with more accurate data. Birajadar et al. [37] also exploited phase congruency processing in the frequency domain. The authors used the monogenic extension of a real 2D log-Gabor isotropic wavelet for the enhancement. A later work of the same authors [35] confirmed that the algorithm also works on a large-scale data set captured in an unconstrained environment. Similar work based on the aforementioned scheme was presented by Sagiroglu et al. [38].
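The role of Gabor filters in ridge enhancement can be illustrated with a kernel that responds strongly to a ridge pattern of matching frequency and orientation. This is a generic even-symmetric Gabor kernel with an isotropic Gaussian envelope, given as a sketch only; it is neither the filter bank of [60] nor the log-Gabor wavelet of [37], and all parameter values in the usage lines are assumptions.

```python
import numpy as np

def gabor_kernel(size, theta, freq, sigma):
    """Even-symmetric Gabor kernel.

    size  : odd kernel width in pixels
    theta : direction of the wave normal in radians (perpendicular to ridges)
    freq  : ridge frequency in cycles per pixel
    sigma : standard deviation of the Gaussian envelope in pixels
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Coordinate along the wave normal.
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-0.5 * (x ** 2 + y ** 2) / sigma ** 2)
    return envelope * np.cos(2.0 * np.pi * freq * xr)

# Response of matched and mismatched kernels on a synthetic ridge patch.
_, xs = np.mgrid[-8:9, -8:9].astype(float)
patch = np.cos(2.0 * np.pi * 0.1 * xs)          # vertical ridges, period 10 px
matched = float(np.sum(gabor_kernel(17, 0.0, 0.1, 4.0) * patch))
mismatched = float(np.sum(gabor_kernel(17, np.pi / 2, 0.1, 4.0) * patch))
```

In an enhancement pipeline, the local ridge orientation and frequency would first be estimated per block and the matching kernel then applied to that block.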

Further preprocessing
Special capturing schemes or feature extractors require additional preprocessing steps. Image mosaicking or image fusion describes the composition of two or more images into one larger finger image. In the best case, the fused image exhibits a larger ROI and a better image quality. Mosaicking techniques became essential in use cases where a large-sized sensor is not available but a rolled finger should be captured. In the works of Choi et al. [61] and Liu et al. [62], the authors showed common use cases of mosaicking touchless images. Three (virtual) images were stitched together by using adaptations of the well-known iterative closest point algorithm. Using a very constrained capturing setup, Choi et al. [61] performed a static stitching without any correspondence measurement. The second approach by Liu et al. [62], which is also used by Alkhathami et al. [31] on a mobile device, extracts Scale-Invariant Feature Transform (SIFT) features from preprocessed images and searches for correspondences between them. Finally, the images are stitched along a border line and post-processed.
To reach the aim of touchless-to-touch image interoperability, Salum et al. [63] proposed further enhancement of touchless image data. At first, the authors added different randomly chosen ellipses to the original image. Secondly, a contour enhancement by a horizontal and vertical fading is added to the image.
Additionally, several works showed that ridge thinning and skeletonizing approaches from the touch-based domain are also applicable to touchless image data to improve the biometric performance [25,27,55].

Quality control
In comparison to touch-based fingerprint recognition systems, touchless schemes contain more critical steps during acquisition and processing which could reduce the system performance. For this reason, an elaborate quality assurance is particularly essential for touchless samples. Several works showed that direct application of touch-based fingerprint quality assessment leads to inaccurate results [64,65,66]. In contrast, Priesnitz et al. [67] demonstrated that the touch-based quality assessment tool NFIQ2.0 is also applicable for touchless samples. The authors concluded that the predictive power highly depends on an adequate preprocessing. Figure 5a depicts a finger image example of high quality in comparison to three finger images of low quality due to acquisition issues. In Fig. 5b, the ROI contains a highlight caused by an overpowered flash light which leads to a low ridge-valley contrast while the contrast on the whole finger is rather high. A wrong focus position results in a blurry ROI from which no details are extractable as shown in Fig. 5c. From a roll pose rotated sample depicted in Fig. 5d, features are extractable but not comparable with an unrotated presentation.
For the purpose of quality assessment, different authors suggested dividing the fingerprint area into blocks. Subsequently, a certain quality assessment algorithm is applied to each of the blocks to either merge the results of each block to one score or to consider only areas above a certain threshold for feature extraction [7,42,66,68]. Parziale and Chen [7] proposed a coherence-based quality measurement. This approach measures the strength of the dominant direction in a local region. For this purpose, the authors applied a normalized coherence estimation on local gradients of the gray level intensity. Moreover, the covariance matrix of the gradient vectors was determined, which represents the clarity of the ridge line structure.
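A block-wise coherence measure of this kind can be sketched as follows. The sketch is a generic formulation and not necessarily identical to the measure of [7]: for each block, the 2x2 covariance matrix of the gray level gradients is accumulated, and the coherence is the normalized difference of its eigenvalues, which is close to 1 for a clear dominant ridge direction and close to 0 for isotropic texture. The block size is an assumed parameter.

```python
import numpy as np

def block_coherence(img, block=16):
    """Gradient coherence in [0, 1] for each non-overlapping block."""
    gy, gx = np.gradient(img.astype(float))
    gxx, gyy, gxy = gx * gx, gy * gy, gx * gy
    rows, cols = img.shape[0] // block, img.shape[1] // block
    coh = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * block, (r + 1) * block),
                  slice(c * block, (c + 1) * block))
            a, b, d = gxx[sl].sum(), gyy[sl].sum(), gxy[sl].sum()
            if a + b > 0:
                # (lambda1 - lambda2) / (lambda1 + lambda2) of [[a, d], [d, b]]
                coh[r, c] = np.sqrt((a - b) ** 2 + 4.0 * d ** 2) / (a + b)
    return coh
```

The per-block values can either be averaged into a single score or thresholded to mask out low-quality areas before feature extraction, mirroring the two strategies described above.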
Li et al. [42,65] introduced a quality assessment algorithm for finger images acquired with smartphones. The authors used different metrics in the spatial and frequency domain which resulted in a feature vector. A Support Vector Machine (SVM) was trained to separate high-quality blocks from those with low quality.
Yang et al. [66] presented another quality control scheme for samples captured in unconstrained environments. The input fingerprint was not previously segmented or processed. The algorithm used the amplitude-frequency and ridge line orientation in the Fourier domain as distinguishing quality features. Each block received its own quality value, so only high-quality blocks were considered for feature extraction. The authors concluded that the proposed algorithm works accurately on the majority of tested samples but also provided finger images where it fails. The same authors extended their approach by using an SVM [68]. Li et al. [69] further extended the number of employed quality features by additionally using a local clarity score and frequency domain analysis.
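The idea of scoring a block in the Fourier domain can be illustrated with a strongly simplified sketch. It is not the algorithm of [66]: the function names and the assumed ridge period range (`min_period`, `max_period`, in pixels) are hypothetical. The score is the fraction of non-DC spectral energy that falls into the frequency ring where ridge patterns are expected.

```python
import numpy as np

def ring_energy_ratio(block, min_period=5.0, max_period=15.0):
    """Fraction of spectral energy inside the assumed ridge frequency ring."""
    blk = np.asarray(block, dtype=np.float64)
    blk = blk - blk.mean()  # remove the DC component
    power = np.abs(np.fft.fft2(blk)) ** 2
    total = power.sum()
    if total <= 1e-12:
        return 0.0  # flat block, no ridge structure at all
    fy = np.fft.fftfreq(blk.shape[0])[:, None]
    fx = np.fft.fftfreq(blk.shape[1])[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)  # radial frequency in cycles per pixel
    ring = (radius >= 1.0 / max_period) & (radius <= 1.0 / min_period)
    return float(power[ring].sum() / total)

def block_quality_map(image, block=32):
    """Apply ring_energy_ratio to every non-overlapping block of the image."""
    img = np.asarray(image, dtype=np.float64)
    rows, cols = img.shape[0] // block, img.shape[1] // block
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            out[r, c] = ring_energy_ratio(
                img[r * block:(r + 1) * block, c * block:(c + 1) * block])
    return out
```

The period range depends on the image resolution, so it would need to be adapted after any rescaling of the finger image.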
Lee et al. [51] proposed an effective early stage quality estimation method. The scheme is based on gradient distribution which shows the characteristics of the repeatable line patterns of the fingerprint and therefore its quality. For a first stage quality estimation, this scheme showed good performance relative to its computational effort. Another contribution by Noh et al. [16] proposed a comparable quality assessment and ridge frequency estimation and benchmarked its performance.
Labati et al. [64] compared their implementation of a neural network classification system with a k-Nearest-Neighbor (kNN) classifier, a linear/quadratic discriminant classifier, and NFIQ1.0 [70]. The authors used a rather constrained data set and were able to show that their own approach performs significantly better than the NFIQ1.0 algorithm. A later work of the same authors showed the computational performance of the system in a practical approach [71]. Zaghetto et al. [45] treated rotational deviations on mosaicked fingerprints captured in a multi-view environment as a measure of quality. A four-layered neural network was proposed which classifies the input dataset into rotated or un-rotated.

Feature extraction
The feature extraction from touchless captured fingerprint samples is performed similarly to touch-based scenarios. Several works showed that established feature extractors can be applied to touchless image data, as shown in Fig. 6. When using touch-based algorithms, it is important to notice that an extractor which performs considerably well on touchless and touch-based samples does not necessarily lead to interoperability between them. Touchless developments range from simple texture feature extraction with out-of-the-box algorithms to dedicated fingerprint feature extractors.
Fig. 6 Feature extraction. Minutiae points extracted from the touch-based fingerprint (a) and a touchless fingerprint (b). The feature extraction was performed with FingerNet [72]. Please note that due to the different capturing process, the touchless fingerprint image is mirrored
Some works in the touchless domain used the well-established Verifinger SDK to evaluate the performance of their processing pipeline [37,73] or benchmarked their approaches against it. Moreover, many works used the NIST standardized MINDTCT [74] algorithm for feature extraction on processed images [18,24,41,63]. Similarly, Yang et al. [66] used this feature extractor for quality estimation. It should be noted that Verifinger requires a fingerprint scaled to 500 DPI in order to work properly. A DPI normalization as described in Section 7.4 is usually not performed but could influence the number of features extracted. Han et al. [73] investigated the compatibility of photographed finger images with the Verifinger feature extractor. The authors showed that it is possible to extract features with some manual preprocessing in the form of a ROI extraction. It should be noted that Verifinger does perform additional internal preprocessing which improves the overall accuracy. Sisodia et al. [55] presented a simple feature extraction technique using kernel operations which represent common minutiae characteristics. The work of Ravi et al. [27] described an extraction and classification of minutiae comparable to [55] using the counting number algorithm. On the preprocessed binary image, it counts the number of white pixels around the center point and estimates the corresponding minutia type.
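Neighborhood-based minutia typing of this kind can be sketched with the classical crossing-number rule, which is used here as a stand-in and may differ in detail from the procedure in [27]. The sketch assumes a one-pixel-wide binary skeleton with ridge pixels set to 1; the function names are hypothetical. A crossing number of 1 marks a ridge ending and a crossing number of 3 marks a bifurcation.

```python
import numpy as np

# The 8 neighbours of a pixel, listed in circular order.
_OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def crossing_number(skeleton, r, c):
    """Half the number of 0/1 transitions around pixel (r, c)."""
    p = [int(skeleton[r + dr, c + dc]) for dr, dc in _OFFSETS]
    return sum(abs(p[i] - p[(i + 1) % 8]) for i in range(8)) // 2

def classify_minutiae(skeleton):
    """Return (endings, bifurcations) as lists of (row, col) positions."""
    sk = (np.asarray(skeleton) > 0).astype(np.uint8)
    endings, bifurcations = [], []
    # Border pixels are skipped because their neighbourhood is incomplete.
    for r in range(1, sk.shape[0] - 1):
        for c in range(1, sk.shape[1] - 1):
            if sk[r, c] == 0:
                continue
            cn = crossing_number(sk, r, c)
            if cn == 1:
                endings.append((r, c))
            elif cn == 3:
                bifurcations.append((r, c))
    return endings, bifurcations
```

In practice, spurious minutiae near the border of the ROI and on short spurs would still have to be filtered out after this step.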
Another work by Wang et al. [75] applied a sliding window on normalized images. It used local gradient codings and LBP for feature extraction. The authors analyzed different block sizes to extract the texture features. Similarly, general purpose texture descriptors have been employed in [76].
Hiew et al. [77] transferred an approach based on a block-wise Gabor-filter from the touch-based domain to touchless data. Here, the magnitude was converted to a scalar number which represents the feature point. In addition, a PCA was performed to compress the feature vector and a projection into its normalized eigenspace was applied to each Gabor feature vector. Ramachandra et al. [18] used Spectral Minutiae Representation (SMR) on minutiae extracted with MINDTCT to achieve a fixed length feature vector.
With ScatNet, Sankaran et al. [32] and Malhotra et al. [9] proposed a novel feature extractor. Group-invariant scattering networks [78] refer to a filter bank of wavelets that produce a representation which was shown to be stable to local affine transformations. The authors extended the approach with an additional wavelet-modulus transformation for high frequency components. A low-pass filter-based convolution concatenated the wavelet responses of an arbitrary number of filters which led to more discriminative features. The authors compared their ScatNet approach to a minutia-based baseline using VeriFinger SDK [79] and Minutiae Cylinder Code (MCC) [80] for feature extraction and reported that ScatNet performed slightly better.
Yin et al. [81] proposed a distortion-free feature representation using the ridge count itself as feature. In addition to single minutiae, pairs of minutiae were also considered as features. The authors used a genetic algorithm to solve the combinatorial optimization problem. To improve effectiveness and accuracy, a minutia-pair expanding algorithm was suggested. To perform comparisons on these feature vectors, a similarity metric was defined. On two benchmark databases, the authors were able to perform better than the established touch-based feature extractors. It should be critically noted that in their test setup the algorithm had a high overall runtime.
Kumar and Zhou [26] suggested a feature extraction based on level-0 features, such as local texture patterns. The evaluation included various combinations of approaches, e.g., Localized Radon Transformation (LRT), and revealed remarkably good performance. In a more recent work, Vyas and Kumar [82] suggested an improved scheme using minutiae comparison.
Genovese et al. [23] proposed a combination of image processing algorithms and machine learning for extracting level-3 features (sweat pores). The authors extracted the green channel from an RGB image and applied different gamma transformations on it. A simple image processing followed by an extraction of connected components identified candidates for sweat pores. A CNN distinguished whether a candidate point is a sweat pore or not. Building upon this work, Labati et al. [83] presented a comparative study on level-3 feature extraction. Two CNNs were trained to detect sweat pores on preprocessed touchless, touch-based, and latent fingerprints. The first CNN determined possible sweat pores in the images whereas the second one detected falsely selected pores. Compared to the touch-based results, the touchless recognition performance turned out to be inferior which was caused by variable illumination situations and pore reflection.

Comparison
In the final comparison stage, touchless and touch-based fingerprint recognition systems operate in a similar way. Figure 7 shows a comparison of a single fingerprint captured from a touchless and a touch-based capturing device. Similar to the feature extraction stage, many works applied comparison methods of the touch-based domain, e.g., the NIST bozorth3 [74] comparator [41,63,84,85]. The NIST also evaluated the impact of fingerprint samples captured by touchless devices on different fingerprint recognition algorithms [86]. Lindoso et al. [87] introduced the first comparator dedicated to touchless fingerprint recognition in 2007. The authors proposed a zero mean normalized cross correlation approach. This method was directly applied to the gray levels of the input image. In the first step, a coarse alignment estimated the way the images were shifted and rotated to fit to the template. In the second step, fingerprint regions were selected based on quality and compared to each other based on the gray level in a final step.
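The zero mean normalized cross correlation (ZNCC) score at the core of such a gray level comparison can be written down compactly. The sketch below only covers the score for two already aligned, equally sized regions; the coarse alignment and the quality-based region selection of [87] are not reproduced, and the function name `zncc` is an assumption.

```python
import numpy as np

def zncc(region_a, region_b):
    """Zero mean normalized cross correlation of two equally sized regions.

    Returns a value in [-1, 1]; 0.0 is returned if a region is constant.
    """
    a = np.asarray(region_a, dtype=np.float64).ravel()
    b = np.asarray(region_b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("regions must have the same number of pixels")
    a = a - a.mean()  # subtracting the mean removes brightness offsets
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom <= 1e-12:
        return 0.0
    # Dividing by the norms removes contrast (gain) differences.
    return float((a * b).sum() / denom)
```

Because the score is invariant to brightness offset and contrast gain, it tolerates moderate illumination differences between two captures, but it remains sensitive to residual shift and rotation.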
Stein et al. [29] suggested a simple comparison of all minutiae to each other based on the Modified Hausdorff Distance (MHD) and orientation. Kumar and Zhou [26] compared level-0 features by using a normalized Hamming distance for an image texture comparison. The authors concluded that localized fingerprint sub-regions are more robust to rotations and partial distortions.
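As a point of reference for the distance used in [29], the Modified Hausdorff Distance between two sets of minutia positions can be sketched as follows. Only the positional part is shown; the minutia orientation, which [29] also takes into account, is omitted, and the function name is hypothetical. Both point sets are assumed to be already aligned.

```python
import numpy as np

def modified_hausdorff(points_a, points_b):
    """Modified Hausdorff Distance between two sets of 2D points.

    For each point the distance to its nearest neighbour in the other set is
    taken; these distances are averaged per direction and the larger of the
    two directed averages is returned.
    """
    a = np.asarray(points_a, dtype=np.float64).reshape(-1, 2)
    b = np.asarray(points_b, dtype=np.float64).reshape(-1, 2)
    if len(a) == 0 or len(b) == 0:
        raise ValueError("both point sets must be non-empty")
    # Pairwise Euclidean distances, shape (len(a), len(b)).
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    a_to_b = dist.min(axis=1).mean()
    b_to_a = dist.min(axis=0).mean()
    return float(max(a_to_b, b_to_a))
```

A smaller distance indicates more similar minutia constellations; averaging instead of taking the maximum nearest-neighbour distance makes the measure less sensitive to single spurious minutiae.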
Labati et al. [88] presented an approach using neural networks to detect a pair of mated minutiae between two samples. A list of local features around each minutia of the corresponding sample was established. This information was incorporated during the training of the neural network. It then decided if the candidates were referring to the same minutia or not. Also, the work includes analyses on comparing more than one fingerprint view. Sankaran et al. [32] and Malhotra et al. [9] suggested combinations of conventional and machine learning techniques. At first, the conventional algorithm computed the L1 distance between each two ScatNet features resulting in a comparison score. Secondly, the approach relied on a supervised binary classifier which learned whether an image pair is a match or not. Building upon their work in [9], Malhotra et al. [89] showed that their algorithm can be adapted to also work on highly unconstrained data.
Lin and Kumar [90] proposed a comparison framework based on a multi-Siamese CNN for touchless to touch-based fingerprint comparison. Three sub-CNNs were trained on fingerprint minutiae, respective ridge maps, and specific regions of ridge maps. The authors generated deep fingerprint representations which were concatenated. This approach appeared to be more robust for cross-domain comparisons. They were able to outperform other CNN-based approaches. A later work by Tan and Kumar [91] especially focused on pose invariant feature matching.
To exploit the properties of their introduced features optimally, Yin et al. [81] defined a comparison metric using a number of corresponding minutiae and the global topological similarity.

Issues and challenges
In the past years, many works on the topic of touchless fingerprint recognition have been published. Nevertheless, there are still some unsolved issues. The following subsections set out the most relevant challenges related to the touchless recognition process and provide starting points for further research.

Biometric performance
The most important measurement criterion for any biometric system is the recognition performance. Table 4 highlights outstanding touchless fingerprint recognition workflows with their achieved recognition performance. So far, touchless 2D fingerprint schemes yield an inferior recognition accuracy compared to touch-based ones. Practical performance rates are only achieved by more sophisticated touchless approaches, e.g., based on 3D fingerprints captured by systems which utilize special acquisition devices and comprehensive preprocessing [92]. Up to now, mobile approaches using a commodity device are not able to achieve competitive results. Along the touchless fingerprint recognition pipeline, different stages should be considered to achieve a good biometric performance:
• Acquisition: A homogeneously illuminated, noise-free finger image should be acquired. High-quality camera equipment and a predictable illumination are a good precondition for a proper finger image.
• Preprocessing: Accurately segmented and rotated fingerprint images yield meaningful comparison scores. At this point, user instructions or a fingerprint guidance during the capturing process can help to increase accuracy.
• Quality assessment: A dedicated quality assessment which is integrated in the preprocessing pipeline is crucial to consider only samples of high quality.
• Feature extraction and comparison: A specific touchless feature extraction which is adapted to the considered dataset reveals results comparable to touch-based schemes.
Also, it can be observed that some aspects of this research area have been extensively researched, while others deserve more attention. For example, several well-functioning segmentation algorithms have been proposed whereas only little research has been conducted on dedicated touchless feature extraction.

Environmental influences
Touchless fingerprint capturing and processing has to deal with different environmental influences. Environmental influences or comparison between different sensor types may lower the performance, as discussed in the following subsections. According to Malhotra et al. [9], challenging environmental situations are:
• Uncontrolled background
• Varying illumination
• Finger position
• Impurities on the finger surface
Further technical challenges can be summarized as:
• Varying camera setup (especially on smartphones)
• Noisy fingerprint impression due to low contrast
Especially on mobile devices, environmental influences have a high impact on the biometric recognition accuracy as showcased by Malhotra et al. [9]. Fingerprint detection and segmentation algorithms have to be robust against a huge variety of environmental conditions ranging from very dark environments to ones with bright sunlight. Especially color-based segmentation reveals deficits on scenes with a background which contains color similar to skin color. Developers working on mobile setups should be aware of the fact that an acquisition in every environmental situation is hardly feasible. Preprocessing and quality assurance algorithms should be able to assess the situation as precisely as possible and to decide whether a fingerprint capturing is feasible. An appropriate user feedback is expected to be helpful in such cases. In prototypical hardware setups, environmental influences play a minor role. Most devices have a hood and homogeneous background which ensures a predictable illumination situation, whereas others require a laboratory environment to work properly [16].
Setups designed for usage under different environmental influences could also benefit from the use of depth information on an image, as suggested by Jonitz and Jivet [58]. The additional depth information helps algorithms to segment the finger and gives a hint on the distance between finger and sensor.

Usability and acceptability
One of the main advantages of touchless fingerprint acquisition is seen in a higher usability compared to touch-based schemes. Touch-based fingerprint capturing suffers from hygienic issues in case various participants are touching the sensor surface. Touch-based schemes also require a certain orientation and pressure of the finger and generally need more time for the capturing process. As discussed in Section 2, touchless capturing devices show different levels of usability. In general, a higher usability can be achieved by:
1. Sensor-to-finger distance: A freely chosen distance between the finger and sensor during the presentation of the finger is desirable.
2. Pose angle: An unconstrained orientation during the presentation of the finger leads to a more convenient system.
3. Fourprint capturing: Most touchless devices can directly capture up to four fingers in one acquisition process. Preprocessing is then able to accurately separate the fingerprint areas into fingerprint images.
4. Integrated quality assessment: An integrated quality measure ensures that the capturing process is finished as soon as one high-quality template of one or more fingers is captured.
5. Fast capturing process: The time needed to present the fingers accurately should be as short as possible. Processing steps should be applied subsequent to acquisition wherever it is feasible.
6. Easy-to-understand user feedback: An integrated user feedback helps to present the fingers smoothly.
Points 1-4 address an unconstrained acquisition process which is highly desirable for enhanced usability. Nevertheless, a more unconstrained capturing also requires more robust finger detection algorithms and especially an elaborate quality assessment to avoid the capturing of low-quality samples. These usability goals can only be achieved with a large amount of processing power. Today, no mobile capturing setup satisfies all of these requirements. The majority of commodity devices for capturing focus on a rather unconstrained capturing (e.g., [33]) whereas prototypical hardware setups focus more on recognition accuracy [16].
In a comprehensive study, Furman et al. [93] evaluated the usability of three stationary touchless recognition products. The authors came to the conclusion that touchless capturing requires a dedicated instruction.

Touchless-to-touch-based sensor interoperability
Interoperability between touch-based and touchless sensors is a desirable objective in many cases, e.g., to avoid re-enrolment of subjects already registered with the system in case of sensor exchange or to enable cross-matching between fingerprint databases captured through touchless and touch-based sensors. A fundamental difference between touch-based and touchless fingerprints is that touchless fingerprints are mirrored along the vertical axis. The majority of touchless sensors also capture color finger images whereas touch-based sensors capture grayscale fingerprints. Further, touchless fingerprints contain no deformations due to pressing the finger onto a surface. Some differences, e.g., mirroring, color-to-grayscale conversion or inverted background and foreground, can be corrected in a straightforward manner without a loss of accuracy. Other differences require elaborate approximation approaches, e.g., the aspect ratio or deformation estimation [94]. An accurate and robust scheme for correcting deformations on touchless 2D fingerprint images has not yet been established. One important factor which may cause biometric performance drops in interoperability scenarios is the DPI alignment for touchless data. For touch-based sensors, the spatial dot density is an important metric for acquisition devices to align the data samples to a certain size and resolution. ISO/IEC compliant fingerprints need to exhibit 500 DPI which nowadays is a minimum requirement for commercial products [95]. Touchless devices such as digital cameras feature no DPI value because the acquired image is not bound to a physical scale. Nonetheless, it is mandatory to normalize touchless fingerprints to the same size and resolution in order to achieve an accurate performance. Fingerprint images can be normalized by cropping the image area and rescaling it to a certain height and width.
By knowing the sensor's resolution and focal length and by approximating the distance between finger and sensor via the autofocus and the finger's width, the DPI of the finger area can be approximated to an almost constant value [33,61]. Wild et al. [96] proposed a comparative test of their resolution estimation scheme on different smartphones. The authors were able to achieve accurate comparison scores in an interoperability scenario.
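The geometric reasoning behind such a DPI approximation can be sketched under a thin-lens camera model. This is an illustration under stated assumptions and not the method of [33,61] or [96]: all function and parameter names are hypothetical, lens distortion is ignored, and the finger is treated as a flat object parallel to the sensor. The second helper shows the alternative route via an assumed physical finger width.

```python
def estimate_dpi(focal_length_mm, pixel_pitch_mm, distance_mm):
    """Approximate the sampling density on the finger surface in DPI.

    Thin-lens model: magnification = f / (d - f), where d is the distance
    between lens and finger (e.g., reported by the autofocus).
    """
    if focal_length_mm <= 0 or pixel_pitch_mm <= 0:
        raise ValueError("focal length and pixel pitch must be positive")
    if distance_mm <= focal_length_mm:
        raise ValueError("finger must be farther away than the focal length")
    magnification = focal_length_mm / (distance_mm - focal_length_mm)
    pixels_per_mm = magnification / pixel_pitch_mm
    return 25.4 * pixels_per_mm  # 25.4 mm per inch

def dpi_from_finger_width(finger_width_px, assumed_finger_width_mm):
    """Alternative estimate from the measured finger width in pixels.

    The physical finger width is an assumption and varies between subjects.
    """
    return 25.4 * finger_width_px / assumed_finger_width_mm

def scale_factor_to_target(estimated_dpi, target_dpi=500.0):
    """Factor by which the image must be resized to reach the target DPI."""
    return target_dpi / estimated_dpi
```

For example, a focal length of 4 mm, a pixel pitch of 0.001 mm and a finger distance of 104 mm give a magnification of 0.04 and hence roughly 1016 DPI, so the image would be downscaled by a factor of about 0.49 to reach 500 DPI.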
Another important issue is the ridge frequency estimation on touchless data. The ridge frequency of a fingerprint refers to the number of ridges which are present within a window of defined size. Due to the touchless acquisition, there is no deformation resulting from pressing the finger onto the sensor surface. For 2D fingerprint images, this means that the ridge frequency increases towards the borders, in contrast to touch-based fingerprints where it stays almost constant. Moreover, blurred border areas flatten the peaks which hampers correct feature detection. Thin plate splines are a suitable tool to correct these deformations in general which also has a positive effect on the ridge frequency and interoperability [16,48]. In a first approach, the algorithm of Noh et al. [16] searched for corresponding points in touchless and touch-based samples and minimized an energy function. This approach showed accurate results but is hardly practically implementable because one touchless and one touch-based sample are needed. Lin et al. [48] went one step further and formulated a deformation correction model based on robust thin plate splines. Different models were trained to meet the individual finger shape. During the comparison, different deformation correction models were automatically selected. A comparable method was also suggested by Dabouei et al. [97]. The NIST also conducted a comprehensive study on interoperability issues in application scenarios where touchless and touch-based fingerprints are compared [98].
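To make the notion of ridge frequency concrete, the following sketch estimates it from a one-dimensional gray level profile taken perpendicular to the ridges. It is a generic spectral estimate and not one of the cited methods; the function names are hypothetical. Applying it to windows near the center and near the border of a touchless finger image would expose the frequency increase described above.

```python
import numpy as np

def ridge_frequency(profile):
    """Dominant ridge frequency of a 1D profile in cycles per pixel."""
    p = np.asarray(profile, dtype=np.float64)
    if p.size < 2:
        raise ValueError("profile must contain at least two samples")
    p = p - p.mean()  # remove the DC component
    spectrum = np.abs(np.fft.rfft(p))
    spectrum[0] = 0.0  # ignore any residual DC energy
    k = int(np.argmax(spectrum))  # index of the strongest frequency bin
    return k / p.size

def ridges_in_window(profile):
    """Number of ridges within the window covered by the profile."""
    return ridge_frequency(profile) * len(profile)
```

The frequency resolution is limited to one cycle per window length, so short windows near the finger border only give a coarse estimate.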

Presentation attack detection
Reliable Presentation Attack Detection (PAD), i.e., anti-spoofing, modules are vital to enhance the security of fingerprint recognition systems. PAD represents a well-studied field of research for touch-based fingerprint recognition systems [99]. Specialized hardware-based skin detection methods which are reported to reliably detect diverse Presentation Attack Instruments (PAI) species, e.g., gummy fingers, are already integrated in many commercial touch-based fingerprint capturing devices. In contrast, in a touchless fingerprint recognition system, PAD turns out to be more challenging. Up until now, only a few approaches to PAD in touchless fingerprint acquisition have been proposed.
Moon et al. [100] proposed a PAD method based on wavelet analysis of the fingertip surface texture. Wang et al. [15] presented a PAD algorithm which exploits the differences between bona fide presentations and attack presentations in band-selective Fourier spectra. In addition, reflection detection was implemented to detect fake finger materials. A video-based PAD method based on the detection of sweat pores was presented by Parziale and Chen [7]. The idea of PAD for touchless fingerprint acquisition using texture descriptors in conjunction with neural network-based classifiers was proposed by Alkhathami et al. [31]. Moreover, a detection of finger veins can be employed for PAD in a touchless fingerprint recognition system. An approach for PAD with a setup based on smartphones was presented by Stein et al. [30]. They used a video-based acquisition and showed that it is possible to detect presentation attacks by analyzing different video frames. A further work by Overgaard et al. [101] tried to exploit Eulerian Video Magnification (EVM) for liveness detection. The method emphasized the heartbeat-related color variations of genuine fingers. However, the authors raised several concerns that this approach might not be put into practice.
Taneja et al. [102] created a large publicly available spoofed fingerphoto database. The database contains print-out attacks, photo attacks, and non-spoofed finger images captured with two different smartphones.

Biometric template protection
Due to the strong and permanent link between individuals and their fingerprints, exposure of enrolled fingerprint templates to adversaries can seriously compromise biometric system security and user privacy, e.g., stolen fingerprints could be used to create artifacts in order to launch presentation attacks. Numerous techniques have been proposed for fingerprint-based biometric template protection over the last 20 years [103,104]. In addition, the ISO/IEC standard for the protection of biometric information [105] provides guidance for protection under requirements of confidentiality, integrity, and renewability/revocability during storage and transfer and for secure and privacy-compliant management and processing of biometric information.
While originally designed and evaluated on touch-based fingerprint databases, concepts for biometric cryptosystems, e.g., the fuzzy vault scheme [106,107] or the fuzzy commitment scheme [108,109], and cancelable biometrics, e.g., Cartesian, radial or functional transformations [110,111], could be adapted to touchless fingerprints, too. Depending on the employed scheme, feature type transformations of fingerprint templates might be required [112]. Due to this reason, almost no research has been conducted to design particular template protection schemes for touchless fingerprints. Most notably, Hiew et al. [77] proposed the use of multiple random projections to achieve a cancelable  [114] presented an algorithm which directly encrypts fingerprint images using a novel memristive chaotic system. Malhotra et al. [115] addressed the issue of fingerprint template protection in selfie images on social media platforms.

Multi-biometrics
Multi-biometric systems have been found to significantly improve the accuracy and reliability of biometric systems [116]. With the possibility of a slap hand acquisition, the fusion of biometric information obtained from four fingers can be employed to improve biometric performance, especially in unconstrained environments. Deb et al. [34] demonstrated the potential of fusing information of four fingers acquired through two slap hand acquisition devices. Noh et al. [117] proposed a score-level fusion of three fingers acquired by a touchless sensor to achieve higher recognition accuracy. Carney et al. [33] performed a score-level fusion of two, four, and eight fingers. They were able to achieve significant performance gains due to the fusion. Moreover, biometric information obtained from touchless fingerprints could be fused with different biometric characteristics. Improvement in biometric performance as a result of biometric fusion should be weighed against the associated overhead involved, such as additional sensing cost, i.e., it is preferred to combine biometric characteristics that can be acquired in a single presentation [118]. Mil'shtein et al. [14] and Ramachandra et al. [18] suggested a fusion of finger vein patterns with touchless fingerprints.
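A score-level fusion over several fingers can be sketched as follows. The cited works are not reproduced here: min-max normalization followed by a (weighted) sum rule is assumed as one common choice, and the function names as well as the score bounds are hypothetical.

```python
import numpy as np

def min_max_normalize(scores, lowest, highest):
    """Map raw comparison scores to [0, 1] using known score bounds."""
    if highest <= lowest:
        raise ValueError("highest must be larger than lowest")
    s = (np.asarray(scores, dtype=np.float64) - lowest) / (highest - lowest)
    return np.clip(s, 0.0, 1.0)

def fuse_scores(normalized_scores, weights=None):
    """Sum-rule fusion: (weighted) mean of the per-finger scores."""
    s = np.asarray(normalized_scores, dtype=np.float64)
    if s.size == 0:
        raise ValueError("at least one score is required")
    return float(np.average(s, weights=weights))
```

As a usage example, four per-finger scores of a slap acquisition could be normalized with the bounds of the employed comparator and fused with `fuse_scores`; weights could reflect per-finger sample quality.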

Research resources
Databases comprising touchless fingerprint image data are vital for the development of improved processing modules. An overview of databases available for research purposes and their properties is given in Table 5.
The Hong Kong Polytechnic University established several databases for different purposes. So far, the most comprehensive touchless-to-touch fingerprint database has been established by Kumar [120]. It consists of 1800 touchless 2D finger images and the corresponding touch-based fingerprints acquired from 300 subjects. A multimodal database [121] features 6264 2D finger images with corresponding vein images of 156 subjects; for each subject, 6 samples of the index and middle fingers are provided as texture and vein images. Another database containing low-resolution finger surface images acquired by a low-cost webcam was established in [122]. The database contains 1466 images from 156 subjects captured in two sessions.
The IIITD SmartPhone Fingerphoto Database v1 (ISPFDv1) [32] is a smartphone finger photo database which consists of 4096 finger photo images from 128 subjects. The database is acquired using a smartphone camera with varying background and illumination. Per subject, 8 images of both the right index and middle finger are taken. The illumination is categorized into indoor and outdoor whereas the background is separated into a white one and a busy one. Every category contains two fingers in two lighting and background situations. In summary, 4096 images were taken and additionally acquired with a touch-based device to estimate the cross-sensor comparison performance. A follow-up database ISPFDv2 [89] was captured using two smartphones and one touch-based device. It includes more than 17,000 touchless and 2432 touch-based samples of 304 fingers. A further extension by presentation attacks is proposed by the same institution [102]. The authors captured 128 presentation attacks using optical devices and printers. The Social-Media Posted Finger-selfie (SMPF) database [102] provides 1000 images downloaded from social media platforms which contain fingers. This database could be used for research on template protection schemes.

Table 5 footnote: The database also contains plain 2D finger images and for this reason is also suitable for 2D fingerprint research
Chopra et al. [123] collected another smartphone-based database. The UNconstrained FIngerphoTo (UNFIT) database contains 3450 samples of 115 subjects, captured using multiple smartphones with different resolutions. The samples are captured considering different challenges, such as background, illumination, mis-focusing and multi-finger presentations. This database is well-suited for research on finger detection and quality aspects but inappropriate for biometric performance testing.
The IIT Bombay Touchless and Touch-Based Fingerprint Database [35] consists of 800 touchless and 800 touch-based fingerprint images of 200 subjects. The touchless samples are captured using a smartphone with the developed Android app and are cropped to an image size of 170 × 260. The touch-based fingerprints of the same 200 subjects have an image size of 260 × 330. The database aims to help researchers compare the performance of touchless and touch-based fingerprint biometric systems.
The first smartphone spoofing attack database by Taneja et al. [102] contains 4096 bona fide finger images and 8192 spoofing attacks. The bona fide images are taken from the ISPFDv1 database. From the dataset, the authors created 2048 print attacks (printouts which were again photographed) and 6144 photo attacks. The photo attacks are taken from the screens of an iPad, a smartphone, and a laptop. The authors used the same devices as in the ISPFDv1 database.
The semi-public cross-sensor GUC100 database [124] contains samples from five touch-based sensors and one touchless sensor (TST Bird3). During the database establishment, 100 subjects presented their 10 fingers to all 6 devices. This was repeated 12 times to obtain natural variance. All in all, approximately 72,000 images were collected.

Conclusions
In this work, the state-of-the-art in the constantly evolving field of touchless fingerprint recognition is summarized and discussed. This research field features a broad spectrum of different acquisition systems from high-end setups to low-cost devices. Subsequently, different preprocessing approaches have to be applied to the acquired image data. It can be observed that a general endeavor of the summarized research is to achieve interoperability between touchless and touch-based fingerprint recognition systems. In general, touchless schemes reveal improved usability and high user acceptance whereas biometric performance remains a challenge, especially on mobile off-the-shelf devices. Concepts for further research topics related to touchless fingerprint recognition, e.g., PAD or biometric template protection, have already been presented in the literature. Building upon these concepts, first stationary and mobile commercial touchless fingerprint recognition systems have been introduced. However, more work is yet to be done in order to achieve robust, interoperable, secure, privacy preserving, and user-friendly systems.