Three-dimensional face recognition under expression variation

Wang, Xueqiao; Ruan, Qiuqi; Jin, Yi; An, Gaoyun

doi:10.1186/1687-5281-2014-51

Published: 25 November 2014

EURASIP Journal on Image and Video Processing volume 2014, Article number: 51 (2014)

Abstract

In this paper, we introduce a fully automatic framework for 3D face recognition under expression variation. For 3D data preprocessing, an improved nose detection method is presented, which also corrects small pose variations. A new facial expression processing method based on sparse representation is then proposed. Because facial expression is the biggest obstacle to 3D face recognition, removing it enhances the recognition rate. Next, the facial representation, which is based on the dual-tree complex wavelet transform (DT-CWT), is extracted from depth images. It contains information from the whole face and from six subregions. Recognition is achieved by linear discriminant analysis (LDA) and a nearest neighbor classifier. We have performed different experiments on the Face Recognition Grand Challenge (FRGC) database and the Bosphorus database. The framework achieves a verification rate of 98.86% at a 0.1% false acceptance rate (FAR) in the all vs. all experiment on FRGC, and a verification rate of 95.03% on nearly frontal faces with expression changes and occlusions in the Bosphorus database.

1 Introduction

3D face recognition is a continuously developing subject with many challenging issues [1–3]. In recent years, many new 3D face recognition methods evaluated on the Face Recognition Grand Challenge (FRGC) v2 data have achieved good performance.

A regional matching scheme was first proposed by Faltemier et al. [4]. In their paper, each 3D face image was divided into 28 patches, and fusing the results from independently matched regions achieved good performance. Wang et al. [5] extracted Gabor, LBP, and Haar features from the depth image; the most discriminative local features were then selected by boosting and trained as weak classifiers, which were assembled into three collective strong classifiers. Mian et al. [25] extracted the spherical face representation (SFR) of the 3D facial data and the scale invariant feature transform (SIFT) descriptor of the 2D data to train a rejection classifier. The remaining faces were verified using a region-based matching approach that was robust to facial expression. Berretti et al. [7] proposed an approach that used a graph to capture the geometrical information of the 3D facial surface, so that the relevant information among neighboring points could be encoded into a compact representation. 3D weighted walkthrough (3DWW) descriptors were proposed to describe the mutual spatial displacement between pairwise arcs of points of the corresponding stripes. Zhang et al. [8] proposed a novel resolution invariant local feature for 3D face recognition. Six different scale invariant similarity measures were fused at the score level, which increased the robustness against expression variation.

The accuracy of 3D face recognition can be significantly degraded by large facial expression variations. Alyuz et al. [9] proposed an expression-resistant 3D face recognition method based on regional registration. In recent years, many methods have dealt with facial expression before recognition. Kakadiaris et al. [10] first fitted an elastically adapted deformable model and then mapped the 3D geometry information onto a 2D regular grid, thus combining the descriptiveness of 3D data with the computational efficiency of 2D data. A multistage fully automatic alignment algorithm and advanced wavelet analysis were used for recognition. Drira et al. [11] represented facial surfaces by radial curves emanating from the nose tip and used elastic shape analysis of these curves to develop a Riemannian framework for analyzing the shapes of full facial surfaces. Their method relied on nose tips that were already provided. Mohammadzade et al. [12] presented a new iterative method that can deal with 3D faces with an open mouth. Their experiments showed that combining the normal vectors with the point coordinates improves the recognition performance; their method achieved a verification rate of 99.6% at a false acceptance rate (FAR) of 0.1% in the all versus all experiment. Amberg et al. [13] described an expression-invariant method for face recognition that fits an identity/expression separated 3D Morphable Model to shape data. The expression model greatly improved recognition. Their method took approximately 40 to 90 s per query.

Our method is a fully automatic method for 3D face recognition. Its framework is presented in Figure 1. For data preprocessing, an improved nose detection method is proposed, which also corrects small pose variations of the face. The face region (the face without hair and ears) is then obtained using a sphere centered at the nose tip. After the face region is found, the facial expression is removed using a new method based on sparse representation. Finally, the depth image is constructed. In the training stage, we use all 943 faces in FRGC v1. First, we extract the four-level magnitude subimages of each training face using DT-CWT. We then vectorize the six magnitude subimages into one large vector of dimension 384, utilize linear discriminant analysis (LDA) [14] to learn the subspace of the training faces, and record the transformation matrix. Second, the four-level magnitude subimages of the six subregions are extracted using DT-CWT and vectorized into one large vector of dimension 2,304, and LDA [14] is again used to learn a transformation matrix. Finally, we obtain the two features of every gallery face using DT-CWT and the corresponding transformation matrices, which establishes two LDA subspaces. In the testing stage, we obtain the two features of every probe face using DT-CWT and the two transformation matrices. Cosine distance is used to establish two similarity matrices. At the end, the two similarity matrices are fused, and the nearest neighbor classifier is used to complete the recognition process.

The main contributions of this work can be summarized as follows:

● The first contribution is an improved nose detection method that iteratively corrects small pose variations of the face. The proposed nose detection algorithm is simple, and its success rate is 99.95% on the FRGC database.

● The second contribution is a new 3D facial expression processing method based on sparse representation. Li et al. [15] applied sparse representation to 3D face recognition, but they used it in the recognition stage. In this paper, sparse representation is used for facial expression processing. The objective of sparse representation is to represent a probe using the minimum number of elements of the gallery dataset. Because the first task of our expression processing is to find the minimum number of expressional components in the dictionary (a person makes only one expression at a time), the objective of sparse representation is naturally well suited to finding the expressional deformation from the dataset. The method is a learning method that extracts the testing face’s neutral component from a dictionary of neutral and expressional spaces, and it takes only 14.91 s to remove one facial expression (on an Intel(R) Core(TM) i3-2120 CPU with 2 GB of RAM). The proposed method is simple and comparatively fast.

The paper is organized as follows: Section 2 describes the data preprocessing methods, including the improved nose tip detection method. The 3D facial expression processing method is presented in Section 3. In Section 4, the framework of our 3D face recognition method is given. Experimental results are given in Section 5, and the conclusions are drawn in Section 6.

2 3D data preprocessing

Firstly, a 3 × 3 Gaussian filter is used to remove spikes and noise, and then the range data are subsampled at a 1:4 ratio.

Some 3D faces in the FRGC database contain information about the ears, while in other faces the ears are hidden by hair. For consistency, we use only the face region for recognition. We now introduce the face region extraction method.

2.1 Nose detection

The nose is the center of a 3D face, so nose detection is important for facial region extraction. The block diagram of the proposed procedure for nose detection is presented in Figure 2.

In this paper, the first step of nose tip detection is finding the central stripe. Details are presented in our earlier work [16].

We use the face with ID 02463d453 in FRGC v1 as the standard face and manually locate the nose tip on its central stripe. We then find the nose tips of other faces using an automatic iterative algorithm. Suppose that A is the central stripe of the standard face (ID 02463d453) and B is the central stripe of the face whose nose tip needs to be found. The method is as follows:

(1) Align stripe A to stripe B using the ICP [17] method and record the transformation matrix M₂.
(2) Use M₂ to find point p, which is the transformed nose tip of the standard face.
(3) Crop a sphere (radius = 37 mm) centered at point p. The highest point in the sphere is taken as the nose tip of B. This step is described in our previous work [16].
(4) Crop a sphere (radius = 90 mm) centered at the nose tip and align it to the standard face. Calculate the transformed nose tip p₁.
(5) Crop a sphere (radius = 25 mm) centered at point p₁. The highest point in the sphere is taken as the new nose tip p₂.
(6) If ||p₂ − p₁|| < 2 mm, p₂ is the nose tip; otherwise, go back to step (4).
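
Steps (4) to (6) can be sketched as a short loop. The following is only an illustrative sketch, not the authors' implementation: it assumes the scan is an N × 3 NumPy array in millimetres, that "highest" means the largest z value, and that the ICP alignment is supplied by the caller as a hypothetical function `align_to_standard` returning a rotation matrix and a translation vector. The names `highest_point_in_sphere` and `refine_nose_tip` are also made up for this example.

```python
import numpy as np

def highest_point_in_sphere(points, center, radius):
    """Return the point with the largest z among those within `radius` of `center`."""
    inside = points[np.linalg.norm(points - center, axis=1) <= radius]
    if inside.shape[0] == 0:
        raise ValueError("no points inside the search sphere")
    return inside[np.argmax(inside[:, 2])]

def refine_nose_tip(points, tip, align_to_standard,
                    crop_radius=90.0, search_radius=25.0, tol=2.0, max_iter=20):
    """Repeat steps (4)-(6) until the nose tip moves by less than `tol` mm."""
    for _ in range(max_iter):
        # Step (4): crop a 90 mm sphere around the current tip and align it.
        face = points[np.linalg.norm(points - tip, axis=1) <= crop_radius]
        R, t = align_to_standard(face)   # hypothetical ICP wrapper (assumption)
        points = points @ R.T + t        # apply the rigid transform to the scan
        p1 = R @ tip + t                 # transformed nose tip
        # Step (5): the highest point within 25 mm of p1 is the new tip.
        p2 = highest_point_in_sphere(points, p1, search_radius)
        # Step (6): stop once the tip is stable.
        if np.linalg.norm(p2 - p1) < tol:
            return p2, points
        tip = p2
    raise RuntimeError("nose tip did not converge")
```

The loop applies each estimated transform to the whole scan so that the pose correction accumulates across iterations; the `max_iter` guard is an addition for safety and is not part of the procedure described above.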

2.2 Face region

Once the nose tip is found, the region cropped in the last step of nose detection is used as the face region. All the faces with excessive head rotation, hair artifacts, and large expressions were successfully segmented by the proposed nose detection algorithm. Some examples are presented in Figure 2.

3 3D facial expression processing method based on sparse representation

Facial expression is one of the biggest obstacles to 3D face recognition because a 3D face carries limited information, and some of that information can easily be changed by facial expression. In this section, we introduce a new expression processing method, based on sparse representation, for removing facial expression. We expect our method to establish the correspondence between an open mouth and the estimated neutral component.

3.1 Brief introduction of sparse representation

In this paper, we use L1-regularized least squares regression [6, 18] to estimate the coefficients of our model. L1 regularization is known to produce sparse coefficients and can be robust to irrelevant features. In general, the problem can be formulated as:

\hat{x} = \arg\min_{x} \left\{ \| y - A x \|_{2}^{2} + γ \| x \|_{1} \right\}

(1)

where y is the test sample, x is the sparse representation on dictionary A, and γ is a scalar constant (we use γ = 5,000 in this paper). The feature-sign search method [6] is adopted to solve Equation 1.
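
To make the objective in Equation 1 concrete, the sketch below minimizes it with the iterative shrinkage-thresholding algorithm (ISTA). This is a deliberately simpler solver than the feature-sign search method used in the paper; it is swapped in here only because it fits in a few lines. The function names and the iteration count are assumptions made for this example.

```python
import numpy as np

def soft_threshold(v, tau):
    """Elementwise soft-thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def l1_least_squares(A, y, gamma, n_iter=500):
    """Minimize ||y - A x||_2^2 + gamma * ||x||_1 with ISTA."""
    # The gradient of the quadratic term is 2 A^T (A x - y); its Lipschitz
    # constant is 2 * sigma_max(A)^2, which fixes a safe step size.
    lipschitz = 2.0 * np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ x - y)
        x = soft_threshold(x - grad / lipschitz, gamma / lipschitz)
    return x
```

With an identity dictionary the minimizer has a closed form (each entry of y shrunk toward zero by γ/2), which gives a quick sanity check of the solver. ISTA converges slowly on large, correlated dictionaries, so it should be read as an illustration of the objective rather than a practical replacement for feature-sign search.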

3.2 Facial expression processing method

First, we use triangle-based linear interpolation to fit a surface Z = f (X, Y) of size 128 × 128. We also use triangle-based linear interpolation to fit a second surface of size 384 × 384, from which we build the depth image used for feature extraction in Section 4.

We consider a face (F_face = Z) as the sum of a reference face (F_reference), a neutral component (ΔF_Neutral), and an expressional component (ΔF_Expression). In this paper, we use the face with ID F0001_NE00 of the BU-3DFE dataset as the reference.

F_{face} = F_{reference} + Δ F_{Neutral} + Δ F_{Expression}

(2)

The goal of this section is to recover F_Neutral:

F_{Neutral} = F_{reference} + Δ F_{Neutral}

(3)

In this paper, we use sparse representation to estimate the testing face’s ΔF_Neutral and ΔF_Expression from a neutral space and an expressional space, respectively, because we want to explain the expression with the minimum number of expressional components from the dictionary and to express ΔF_Neutral as a linear combination of elements of the neutral space. First, the dictionary A = [A₁, A₂] must be established, where A₁ is the neutral space and A₂ is the expressional space. To build the neutral space, the reference face is subtracted from each of 275 neutral faces (each person’s first face in FRGC v1), and each difference is vectorized into a large vector. This gives $A_{1} = [A_{1}^{1}, A_{1}^{2}, \dots, A_{1}^{275}]$, where $A_{1}^{i} = F_{neutral}^{i} - F_{reference} = (Δ z_{1}^{1}, Δ z_{1}^{2}, \dots, Δ z_{1}^{n})^{T}$. To build the expressional space, the corresponding neutral face is subtracted from each of 460 expressional faces (23 expressional faces for each of the first 10 men and the first 10 women in the BU-3DFE dataset), and each difference is vectorized into a large vector. This gives $A_{2} = [A_{2}^{1}, A_{2}^{2}, \dots, A_{2}^{460}]$, where $A_{2}^{i} = F_{expression}^{i} - F_{neutral} = (Δ z_{2}^{1}, Δ z_{2}^{2}, \dots, Δ z_{2}^{n})^{T}$. The reference face and the first person’s expressional faces are shown in Figure 3.

In the testing stage, the reference face is subtracted from the testing face, and the sparse representation of the result over the dictionary A is obtained from Equation 1: $\hat{x} = \arg\min_{x} \left\{ \left\| y - [A_{1}, A_{2}] \left[\begin{array}{c} x_{1} \\ x_{2} \end{array}\right] \right\|_{2}^{2} + γ \left\| \begin{array}{c} x_{1} \\ x_{2} \end{array} \right\|_{1} \right\}$, where $y = F_{test} - F_{reference} = (Δ z_{test}^{1}, Δ z_{test}^{2}, \dots, Δ z_{test}^{n})^{T}$. Because the neutral components of neutral faces are highly correlated, this method can find neutral components similar to that of the testing face. We then reconstruct the testing face’s neutral component using A₁ and the sparse vector ${\hat{x}}_{1}$ :

Δ {\hat{F}}_{Neutral} = A_{1} {\hat{x}}_{1}

(4)

So, ${\hat{F}}_{Neutral}$ is equal to the sum of F_reference and $A_{1} {\hat{x}}_{1}$ :

\begin{array}{rl} {\hat{F}}_{Neutral} & = F_{reference} + Δ {\hat{F}}_{Neutral} \\ & = F_{reference} + A_{1} {\hat{x}}_{1} \end{array}

(5)
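
The dictionary construction and Equations 4 and 5 can be illustrated with a few lines of NumPy. This is a minimal sketch under stated assumptions: faces are already vectorized into 1-D arrays of equal length, the sparse vector has been computed elsewhere by solving Equation 1, and the helper names `build_dictionary` and `reconstruct_neutral` are invented for this example. The iterative refinement of Figure 4 is not covered.

```python
import numpy as np

def build_dictionary(neutral_faces, expressive_faces, expressive_neutrals, reference):
    """Return (A1, A2) with one column per training face.

    A1 columns: neutral face minus the reference face.
    A2 columns: expressional face minus the same subject's neutral face.
    """
    A1 = np.stack([f - reference for f in neutral_faces], axis=1)
    A2 = np.stack([e - n for e, n in zip(expressive_faces, expressive_neutrals)], axis=1)
    return A1, A2

def reconstruct_neutral(reference, A1, x_hat):
    """Equations 4 and 5: keep only the coefficients that belong to A1."""
    x1 = x_hat[:A1.shape[1]]        # x_hat is ordered as [x1, x2]
    return reference + A1 @ x1      # F_reference + A1 x1
```

Dropping the coefficients x₂ is what removes the expression: the expressional columns of A₂ absorb the deformation, and only the neutral part is added back to the reference face.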

However, ${\hat{F}}_{Neutral}$ is only an approximation, so the points of ${\hat{F}}_{Neutral}$ may not lie exactly on F_face. In this paper, we use an iterative method to find the neutral face. The method is presented in Figure 4.

The results of expression processing for ten different people are presented in the second row of Figure 5, and the error maps are shown in the third row. The maps show that our method preserves the rigid parts of the faces. Note that our method not only removes facial expression but also leaves neutral faces intact. Therefore, in the recognition stage, we do not need to determine whether the probe face is expressional. Some neutral faces are presented in Figure 6.

Finally, the expression-removed depth images are constructed using F_Neutral. The size of the depth image is 128 × 128.

4 3D face recognition using dual-tree complex wavelet feature

After the facial expression is removed, the 3D faces become very similar, so extracting discriminative features from each face is very important. In this paper, we utilize the dual-tree complex wavelet transform [19, 20] to extract the feature of the expression-removed face (the face image is 128 × 128) and the features of six subregions (the six regions are extracted from the 384 × 384 face image, and each region is 128 × 128). The six feature points are shown in Figure 7A. We use a simple procedure to find these points. First, we manually define the six points on a standard face. Then, for each gallery and probe face, six 9 × 9 neighborhoods centered at the same positions as on the standard face are located. Finally, the shape index value [21] is used to refine the six feature points: points 2 and 5 are moved to the local maxima, while points 1, 3, 4, and 6 are moved to the local minima. The six 128 × 128 subregions are then defined so that their centroids are the six refined feature points.
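
The landmark refinement step amounts to a local extremum search in a 9 × 9 window. The sketch below assumes that a shape index map has already been computed for the 384 × 384 image (the shape index computation itself is not shown) and that landmarks are given as (row, column) pixel positions; the function name `refine_landmark` is hypothetical.

```python
import numpy as np

def refine_landmark(shape_index_map, row, col, use_max, half=4):
    """Move a landmark to the extremum of the shape index in a 9 x 9 window.

    use_max=True  -> local maximum (points 2 and 5 in the text)
    use_max=False -> local minimum (points 1, 3, 4, and 6)
    """
    r0, r1 = max(row - half, 0), min(row + half + 1, shape_index_map.shape[0])
    c0, c1 = max(col - half, 0), min(col + half + 1, shape_index_map.shape[1])
    window = shape_index_map[r0:r1, c0:c1]
    flat = np.argmax(window) if use_max else np.argmin(window)
    dr, dc = np.unravel_index(flat, window.shape)
    return int(r0 + dr), int(c0 + dc)
```

A 128 × 128 subregion can then be cropped around each refined position; how the crop is handled near the image border is not specified in the text and is left out here.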

In the training stage, we use all 943 faces in FRGC v1. First, we extract the four-level magnitude subimages of each training face. We then vectorize the six magnitude subimages into one large vector (of dimension 384), utilize LDA [14] to learn the discriminant subspace, and record the transformation matrix. Second, we extract the four-level magnitude subimages of the six subregions using DT-CWT, vectorize them into one large vector (of dimension 2,304), and again utilize LDA to learn the subspace. Finally, we obtain the two features of every gallery face using DT-CWT and the corresponding transformation matrices.
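
The two feature dimensions quoted above can be checked with simple arithmetic. The check below rests on an assumption that the text does not state explicitly: that only the six oriented subbands of the fourth decomposition level are kept, and that each DT-CWT level halves the resolution, so a 128 × 128 image yields 8 × 8 subbands at level 4.

```python
def dtcwt_feature_length(image_size=128, level=4, orientations=6):
    """Length of the vector formed by the oriented subbands at one level."""
    side = image_size // (2 ** level)      # 128 -> 8 at level 4
    return orientations * side * side      # six 8 x 8 magnitude subimages

whole_face_dim = dtcwt_feature_length()        # one 128 x 128 face image
subregion_dim = 6 * dtcwt_feature_length()     # six 128 x 128 subregions
print(whole_face_dim, subregion_dim)           # 384 2304
```

Under that reading, 6 × 8 × 8 = 384 for the whole face and 6 × 384 = 2,304 for the six subregions, which matches the dimensions given in the text.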

In the testing stage, we obtain the two features of every probe face using DT-CWT and the two transformation matrices. Cosine distance is used to establish the similarity matrices S₁ and S₂. We then normalize them using Equation 6.

S_{r c}^{'} = \frac{S_{r c} - min (S_{r})}{max (S_{r}) - min (S_{r})}

(6)

In Equation 6, S_rc represents an element of similarity matrix S₁ or S₂ (at row r and column c), S_r denotes the elements of that matrix at row r, and $S_{r c}^{'}$ denotes the normalized S_rc. The final similarity matrix is then obtained by a simple sum rule, S = S₁ + S₂. Recognition is achieved by the nearest neighbor classifier.
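
The score-level fusion can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes probes are rows and gallery faces are columns, that larger values mean more similar faces, that the sum rule is applied to the row-normalized matrices of Equation 6, and that the nearest neighbor is therefore the gallery entry with the largest fused score. All function names are made up for the example.

```python
import numpy as np

def cosine_similarity_matrix(probe, gallery):
    """Cosine similarity between every probe row and every gallery row."""
    p = probe / np.linalg.norm(probe, axis=1, keepdims=True)
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    return p @ g.T

def normalize_rows(S):
    """Equation 6: min-max normalization of each row (rows must not be constant)."""
    lo = S.min(axis=1, keepdims=True)
    hi = S.max(axis=1, keepdims=True)
    return (S - lo) / (hi - lo)

def fuse_and_classify(S1, S2):
    """Sum rule on the normalized matrices, then pick the best gallery index."""
    S = normalize_rows(S1) + normalize_rows(S2)
    return S, np.argmax(S, axis=1)
```

Row-wise normalization puts the whole-face scores and the subregion scores on the same [0, 1] scale for each probe, so neither feature dominates the sum simply because its raw scores span a wider range.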

5 Results and analysis

We perform our experiments on the Bosphorus database [22] and the FRGC [23] 3D face database.

The Bosphorus database consists of 105 subjects in various poses, expressions, and occlusion conditions. Eighteen subjects have beard/moustache and short facial hair is available for 15 subjects. The majority of the subjects are aged between 25 and 35 years. There are 60 men and 45 women in total, and most of the subjects are Caucasian. Also, 27 professional actors/actresses are incorporated in the database. Up to 54 face scans are available per subject, but 34 of these subjects have 31 scans. Thus, the number of total face scans is 4,652.

FRGC v1 contains 943 3D faces, while FRGC v2 contains 4,007 3D faces of 466 persons. The images were acquired with a Minolta Vivid 910 scanner, which uses triangulation with a laser stripe projector to build a 3D model of the face. The 3D faces are available in the form of four matrices, each of size 640 × 480. The data consist of frontal views. Some of the subjects have facial hair, but none of them is wearing glasses. Each 2D face corresponds to its respective 3D face. In FRGC v2, 57% of the subjects are male and 43% are female. The database was collected from 2003 to 2004. In order to evaluate the robustness of our method against expression variations, we classified the 1,648 faces with expression as the non-neutral dataset (411 persons) and the 2,359 neutral faces as the neutral dataset (422 persons). The numbers of persons in the neutral and non-neutral datasets are not equal because some people in FRGC v2 have only one face. In the rest of the paper, ‘N’ stands for neutral, ‘E’ for non-neutral, and ‘A’ for all.

5.1 Experiments on Bosphorus database

First, to evaluate the performance of the nose tip detection method, we test our method on the Bosphorus database. The data preprocessing results for the first person are presented in Figure 8. The figure shows that our method can handle expressional faces and posed faces whose rotation angle is less than 30°, but it cannot find the nose tip of faces with large rotation angles (±45° and ±90°), because most of the face is missing.

To further confirm the effectiveness of the proposed expression processing approach, we perform experiments on nearly frontal faces (poses of less than 30°) with expression changes and occlusions. We compare the original faces and the expression-removed faces using the leave-one-out method. We extract the DT-CWT feature and then use LDA for recognition. FRGC v1 is used to train the LDA subspace. The receiver operating characteristic (ROC) curves of the experiment are presented in Figure 9. The figure shows that the expression-removed faces performed better than the original faces.

5.2 Experiments on FRGC

5.2.1 Comparison with original mouths

Dealing with open mouths has been a serious problem in 3D face recognition, and a number of researchers have been working on it. We expect that correctly establishing the correspondence between an open mouth and the estimated neutral component can greatly improve 3D face recognition.

In the first set of experiments, we test our algorithm on the mouth area of FRGC v2. Following the experimental protocol, the gallery set contains the first neutral face of each subject, and the remaining faces make up the probe set. We compare the expression-removed mouths with the original mouths using the PCA method. The recognition rate is 52.95% with the original mouths and 69.5% with the expression-removed mouths. This indicates that the expression-removed mouths contain more identity information than the original mouths.

5.2.2 Comparison with original faces

Next, to evaluate the performance of the expression processing method, we compare the expression-removed faces with the original faces using the Gabor feature [24] and the DT-CWT feature of the whole depth image. We carried out four experiments: neutral vs. neutral, neutral vs. non-neutral, all vs. all, and ROCIII. In the all vs. all experiment, every image of FRGC v2 is matched with all the remaining images, which results in 16,052,042 combinations. Similarly, in the neutral vs. neutral experiment, every image of the neutral dataset is matched with all the remaining images, which results in 5,562,522 combinations. In the neutral vs. non-neutral experiment, the gallery images come from the neutral dataset and the probe entries come from the non-neutral dataset. In the ROCIII experiment, the gallery images come from the Fall 2003 semester, while the probe entries come from the Spring 2004 semester.
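
The combination counts quoted above follow from counting ordered pairs, that is, n × (n − 1) matches for a set of n images. The short check below reproduces them from the dataset sizes given in Section 5; the helper name `ordered_pairs` is only illustrative.

```python
n_neutral = 2359          # neutral dataset
n_non_neutral = 1648      # non-neutral dataset
n_all = n_neutral + n_non_neutral

def ordered_pairs(n):
    """Each of n images matched against the n - 1 remaining images."""
    return n * (n - 1)

print(n_all, ordered_pairs(n_all), ordered_pairs(n_neutral))
# 4007 16052042 5562522
```

The counts treat (a, b) and (b, a) as separate matches, which is why they are twice the number of unordered pairs.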

Table 1 shows that the expression-removed faces performed better than the original faces. The N vs. E experiment shows that the expression-removed faces are more useful for face recognition: they achieved an 11.7% higher recognition rate and a 12.29% higher verification rate at 0.001 FAR than the original faces. The results also show that the DT-CWT feature is more effective than the Gabor feature for 3D face recognition.

Table 1 Facial expression-removed faces compared with original faces using Gabor feature and DT-CWT feature of whole depth image


5.2.3 ROC and CMC of our method

In this section, we consider two scenarios: identification and verification. The same four experiments as in Section 5.2.2 were performed. The cumulative match characteristic (CMC) curves of the four experiments are presented in Figure 10, and the ROC curves in Figure 11. Each figure shows the performance of the feature extracted from the whole face using DT-CWT and LDA, the feature extracted from the six subregions using DT-CWT and LDA, and the fusion of the two. As can be seen, combining the whole-face DT-CWT feature and the six subregions’ feature improves the recognition performance further.
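
For readers unfamiliar with the verification metric, the sketch below shows one common way to read a verification rate at a fixed FAR off two lists of scores. It is a generic illustration rather than the evaluation code used in the paper, and it assumes that scores are similarities (higher means more alike), that `genuine` holds same-person comparisons, and that `impostor` holds different-person comparisons. The function name is hypothetical.

```python
import numpy as np

def verification_rate_at_far(genuine, impostor, far=0.001):
    """Return (verification rate, threshold) at the requested false acceptance rate."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    # Number of impostor scores that may be accepted without exceeding the FAR.
    allowed = int(np.floor(far * impostor.size + 1e-9))
    allowed = min(allowed, impostor.size - 1)
    # Accept a comparison when its score is strictly above this threshold.
    threshold = np.sort(impostor)[::-1][allowed]
    return float(np.mean(genuine > threshold)), float(threshold)
```

Sweeping `far` over a range of values and plotting the resulting verification rates gives an ROC curve of the kind referred to above.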

5.2.4 Comparisons with other methods

Here, we compare our method, using the fusion results, with state-of-the-art methods. Table 2 shows the verification results of state-of-the-art methods on the FRGC database as reported in the literature.

Table 2 Verification rate comparison with the state-of-the-art methods at 0.001 FAR


The verification rate of our method is also shown in Table 2. Its performance in the A vs. A and ROCIII experiments is slightly lower than, but still close to, the best.

6 Conclusions

We presented an automatic method for 3D face recognition. We used an improved nose detection method to correct the pose of the face, and showed that the proposed method can correct posed faces whose rotation angle is less than 30°.

We also proposed a 3D facial expression processing method based on sparse representation. It extracts the neutral component from a dictionary that combines neutral and expressional spaces, and it enhances the recognition rate. Our method can deal with open mouths and grins. We showed that the neutral faces estimated from expressional faces are similar to those estimated from the corresponding neutral faces.

Then, a facial representation containing the whole-face feature and the six subregions’ features was extracted by DT-CWT. Combining holistic and local features represents a 3D face more effectively for recognition. Finally, LDA was used to enhance the recognition accuracy.

References

[1] Zhong C, Sun Z, Tan T: Robust 3D face recognition using learned visual codebook. In Proceedings of IEEE Conference on Pattern Recognition. Minneapolis; 2007:17-22.
[2] Zhong C, Sun Z, Tan T: Learning efficient codes for 3D face recognition. In Proceedings of 15th IEEE International Conference on Image Processing. San Diego; 2008:1928-1931. 12–15 Oct
[3] Chang KI, Bowyer KW, Flynn PJ: An evaluation of multi-modal 2D+3D face biometrics. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27(4):619-624.
[4] Faltemier TC, Bowyer KW, Flynn PJ: A region ensemble for 3-D face recognition. IEEE Trans. Inf. Forensics Secur. 2008, 3(1):62-73.
[5] Wang Y, Liu J, Tang X: Robust 3D face recognition by local shape difference boosting. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32(10):1858-1870.
[6] Lee H, Battle A, Raina R, Ng AY: Efficient sparse coding algorithms. Adv. Neural. Inf. Proc. Syst. 2007, 19: 801.
[7] Berretti S, Bimbo AD, Pala P: 3D face recognition using iso-geodesic stripes. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 32(12):2162-2177.
[8] Zhang G, Wang Y: Robust 3D face recognition based on resolution invariant features. Pattern Recognit. Lett. 2011, 32(7):1009-1019. 10.1016/j.patrec.2011.02.004
[9] Alyuz N, Gökberk B, Akarun L: Regional registration for expression resistant 3-D face recognition. IEEE Trans. Inf. Forensics Secur. 2010, 5(3):425-440.
[10] Kakadiaris IA, Passalis G, Toderici G, Murtuza MN, Lu Y, Karampatziakis N, Theoharis T: Three-dimensional face recognition in the presence of facial expressions: an annotated deformable model approach. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29(4):640-649.
[11] Drira H, Amor BB, Srivastava A, Daoudi M, Slama R: 3D face recognition under expressions, occlusions, and pose variations. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35(9):2270-2283.
[12] Mohammadzade H, Hatzinakos D: Iterative closest normal point for 3D face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35(2):381-397.
[13] Amberg B, Knothe R, Vetter T: Expression invariant 3D face recognition with a Morphable Model. In International Conference on Automatic Face & Gesture Recognition. Amsterdam; 2008:1-6. 17–19 Sept
[14] Belhumeur PN, Hespanha JP, Kriegman DJ: Eigenfaces vs. Fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19(7):711-720. 10.1109/34.598228
[15] Li X, Jia T, Zhang H: Expression-insensitive 3D face recognition using sparse representation. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Miami; 2009:2575-2582. 20–25 June
[16] Wang X, Ruan Q, Jin Y, An G: Expression robust three-dimensional face recognition based on Gaussian filter and dual-tree complex wavelet transform. J. Intell. Fuzzy Syst. 2014, 26: 193-201.
[17] Besl PJ, McKay ND: A method for registration of 3-D shapes. IEEE Trans. Pattern Anal. Mach. Intell. 1992, 14(2):239-256. 10.1109/34.121791
[18] Donoho D: For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 2006, 59(6):797-829. 10.1002/cpa.20132
[19] Selesnick IW, Baraniuk RG, Kingsbury NG: The dual-tree complex wavelet transform. IEEE Signal Proc. Mag. 2005, 22(6):123-151.
[20] Liu C, Dai D: Face recognition using dual-tree complex wavelet features. IEEE Trans. Image Process. 2009, 18(11):2593-2599.
[21] Koenderink JJ, van Doorn AJ: Surface shape and curvature scales. Image Vision Comput. 1992, 10(8):557-565. 10.1016/0262-8856(92)90076-F
[22] Savran A, Alyüz N, Dibeklioğlu H, Çeliktutan O, Gökberk B, Sankur B, Akarun L: Bosphorus Database for 3D face analysis. Workshop on Biometrics and Identity Management 2008, 47-56.
[23] Phillips PJ, Flynn P, Scruggs T, Bowyer KW, Chang J, Hoffman K, Marques J, Min J, Worek W: Overview of the Face Recognition Grand Challenge. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 1. San Diego; 2005:947-954. 20–25 June
[24] Jones JP, Palmer LA: An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. J. Neurophysiol. 1987, 58(6):1233-1258.
[25] Mian AS, Bennamoun M, Owens R: An efficient multimodal 2D-3D hybrid approach to automatic face recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29(11):1927-1943.
[26] Maurer T, Guigonis D, Maslov I, Pesenti B, Tsaregorodtsev A, West D, Medioni G: Performance of Geometrix ActiveID™ 3D face recognition engine on the FRGC data. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05). San Diego; 2005:154. 20–25 June
[27] Cook J, Cox M, Chandran V, Sridharan S: Robust 3D face recognition from expression categorisation, ICB 2007. LNCS 2007, 4642: 271-280.

Acknowledgements

This work was supported partly by the National Natural Science Foundation of China (61172128), the National Key Basic Research Program of China (2012CB316304), the New Century Excellent Talents in University (NCET-12-0768), the Fundamental Research Funds for the Central Universities (2013JBM020, 2013JBZ003), the Program for Innovative Research Team in the University of Ministry of Education of China (IRT201206), the Beijing Higher Education Young Elite Teacher Project (YETP0544), the National Natural Science Foundation of China (61403024), and the Research Fund for the Doctoral Program of Higher Education of China (20120009110008, 20120009120009).

Author information

Authors and Affiliations

Beijing Key Laboratory of Advanced Information Science and Network Technology, Institution of Information Science, Beijing Jiaotong University, Beijing, 100044, China
Xueqiao Wang, Qiuqi Ruan, Yi Jin & Gaoyun An

Corresponding author

Correspondence to Xueqiao Wang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Wang, X., Ruan, Q., Jin, Y. et al. Three-dimensional face recognition under expression variation. J Image Video Proc 2014, 51 (2014). https://doi.org/10.1186/1687-5281-2014-51

Received: 01 April 2014
Accepted: 06 November 2014
Published: 25 November 2014
DOI: https://doi.org/10.1186/1687-5281-2014-51
