Automatic prediction of age, gender, and nationality in offline handwriting
EURASIP Journal on Image and Video Processing volume 2014, Article number: 10 (2014)
The classification of handwriting into different categories, such as age, gender, and nationality, has several applications. In forensics, handwriting classification helps investigators focus on a certain category of writers. However, only a few studies have been carried out in this field. Classification of handwriting into a demographic category is generally performed in two steps: feature extraction and classification. The performance of a system depends mainly on the feature extraction step because characterizing features makes it possible to distinguish between writers. In this study, we propose several geometric features to characterize handwritings and use these features to classify handwritings with regard to age, gender, and nationality. Features are combined using random forests and kernel discriminant analysis. Classification rates are reported on the QUWI dataset, reaching 74.05% for gender prediction, 55.76% for age range prediction, and 53.66% for nationality prediction when all writers produce the same handwritten text and 73.59% for gender prediction, 60.62% for age range prediction, and 47.98% for nationality prediction when each writer produces different handwritten text.
Handwritings can be classified into many categories including gender, age, handedness, and nationality. This type of classification has several applications. For example, in the forensic domain, handwriting classification can help the investigators to focus on a certain category of suspects. Additionally, processing each category separately leads to improved results in writer identification and verification applications.
There are only a few studies in the literature that investigate the automatic detection of gender, age, and handedness from handwritings. Bandi et al.  proposed a system that classifies the handwritings into demographic categories using the ‘macro-features’ introduced in . These features focus on measures such as pen pressure, writing movement, stroke formation, and word proportion. The authors reported classification accuracies of 77.5%, 86.6%, and 74.4% for gender, age, and handedness classification, respectively. However, in this study, all the writers had to produce the same letter.
Unfortunately, this is not always the case in real forensic casework. Moreover, the dataset used in that study is not publicly available.
Liwicki et al.  also performed the classification of gender and handedness in the online mode (which means that the temporal information about the handwriting is available). The authors used a set of 29 features extracted from the online information and its offline representation and applied support vector machines and Gaussian mixture models to perform the classification. The authors reported a performance of 67.06% for gender classification and 84.66% for handedness classification. In a recent study , the authors reported separately the performance of the offline mode, the online mode and their combination. The performance reported for the offline mode was 55.39%, which is slightly better than chance.
In this paper, we propose a new method for the detection of the age range, gender, and nationality of the writer of a handwritten document. A set of novel features is proposed and described, including directions, curvatures, tortuosities, chain codes, and edge-based directional features. These features are combined using several classifiers, including random forests and kernel discriminant analysis. This method is evaluated using the QUWI database, which is the only available public dataset containing annotations regarding gender, age range, and nationality.
The remainder of this paper is organized as follows: Sections 2 and 3 give a detailed description of our feature extraction and classification methods. Section 4 presents the dataset used in this study and the detailed results. Section 5 concludes this work and outlines some perspectives.
Our method consists of two main steps: feature extraction and classification. These two steps are illustrated in Figure 1.
2 Feature extraction
In this step, the characterizing features are extracted from the handwriting. To make the system pen independent, images are first binarized using the Otsu thresholding algorithm . The following subsections describe the features considered in this study. These features do not correspond to a single value, but are defined by a probability distribution function (PDF) extracted from the handwriting images to characterize the writer's individuality [6, 7]. The PDF describes the relative likelihood for a certain feature to take on a given value.
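As an illustration only, the binarization and PDF steps can be sketched as follows. The sketch assumes 8-bit grayscale values stored as nested lists and dark ink on a light background; the function names `otsu_threshold`, `binarize`, and `normalized_histogram` are hypothetical and do not come from the paper.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes the between-class variance (Otsu)."""
    hist = [0] * levels
    for row in pixels:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of the gray levels of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Assumption: ink is dark, so foreground = pixels at or below the threshold."""
    return [[v <= threshold for v in row] for row in pixels]


def normalized_histogram(values, bins, lo, hi):
    """Discrete PDF: fraction of the values that fall in each of `bins` equal bins."""
    counts = [0] * bins
    for v in values:
        k = int((v - lo) / (hi - lo) * bins)
        counts[min(max(k, 0), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins
```

The same `normalized_histogram` helper could serve for any of the features below, since each of them is described as a vector of probabilities of fixed size.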
Note that all these developed features or their equivalents are used by forensic document examiners as well as graphologists in order to distinguish between different categories of writers .
2.1 Direction feature (f1)
This method has been used in writer identification [7, 9], and its implementation closely resembles the one proposed by Matas et al. . First, we compute the Zhang skeleton of the binarized image. This skeleton is well known for not producing parasitic branches unlike most skeletonization algorithms . The skeleton is then segmented at its junction pixels. Then, we traverse the pixels of the obtained segments of the skeleton using the predefined order favoring the four-connectivity neighbors as shown in Figure 2a. A result of such an ordering is shown in Figure 2b. For each pixel p, we consider the 2⋅N + 1 neighboring pixels centered at position p. A linear regression of these pixels gives a good estimation of the tangent at the pixel p (Figure 2c). The value of N has empirically been set to 5 pixels throughout this paper.
The PDF of the resulting directions is computed as a vector of probabilities for which the size has been empirically set to 10. It is worth noting that this is the first time that such a method of computing directions has been proposed for categorization applications.
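The tangent estimation and its 10-bin PDF can be sketched as below. This is a simplified illustration, not the authors' implementation: it assumes the skeleton segment is already available as an ordered list of (x, y) pixels, and it replaces the ordinary linear regression with the principal-axis orientation of the 2⋅N + 1 points (an orthogonal fit), so that vertical strokes do not produce an infinite slope. The names `tangent_angles` and `direction_pdf` are hypothetical.

```python
import math


def tangent_angles(path, n=5):
    """Tangent direction in [0, pi) at each interior pixel of an ordered path.

    path: list of (x, y) skeleton pixels in traversal order.
    For each pixel, the 2*n + 1 pixels centered on it are fitted with a line;
    the orientation of the principal axis of their scatter is returned.
    """
    angles = []
    for k in range(n, len(path) - n):
        window = path[k - n:k + n + 1]
        mx = sum(x for x, _ in window) / len(window)
        my = sum(y for _, y in window) / len(window)
        sxx = sum((x - mx) ** 2 for x, _ in window)
        syy = sum((y - my) ** 2 for _, y in window)
        sxy = sum((x - mx) * (y - my) for x, y in window)
        theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
        angles.append(theta % math.pi)
    return angles


def direction_pdf(angles, bins=10):
    """Vector of probabilities over `bins` equal angular bins covering [0, pi)."""
    counts = [0] * bins
    for a in angles:
        counts[min(int(a / math.pi * bins), bins - 1)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * bins
```

With N = 5, a path of L pixels yields L - 10 tangent estimates, because the first and last five pixels do not have a full window.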
2.2 Curvature feature (f2)
In forensic document examination, curvature is commonly accepted as a characterizing feature [7, 8]. We have adapted this method to handwriting as follows: for each pixel p belonging to the contour, we consider a neighboring window of size t. Inside this window, we compute the number of pixels n1 and the number of pixels n2 that belong to the background and foreground, respectively (see Figure 3a). The difference n1 - n2 is positive at the points where the contour is convex and negative at the points where it is concave, and is therefore a good indicator of the local curvature of the contour. We therefore estimate the curvature as $C = (n_1 - n_2)/(n_1 + n_2)$. The value C is illustrated in Figure 3b on a binary shape for which t has been empirically set to 5. The PDF of curvatures is computed in a vector with a size empirically set to 100. This way of computing curvatures is also novel in the field of offline writer identification and categorization, and to the best of our knowledge, it has never been used before.
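The counting step can be sketched as follows. Several details are assumptions made for illustration: the curvature is taken as the normalized difference (n1 - n2)/(n1 + n2), the window of size t is read as a (2t + 1) × (2t + 1) square centered on the pixel, contour pixels are foreground pixels with a 4-connected background neighbor, and pixels outside the image count as background. The function names are hypothetical.

```python
def contour_pixels(fg):
    """Foreground pixels that have at least one 4-connected background neighbor."""
    h, w = len(fg), len(fg[0])

    def is_fg(r, c):
        return 0 <= r < h and 0 <= c < w and fg[r][c]

    neighbors = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [(r, c) for r in range(h) for c in range(w)
            if fg[r][c] and not all(is_fg(r + dr, c + dc) for dr, dc in neighbors)]


def curvature_values(fg, t=5):
    """Assumed estimate C = (n1 - n2) / (n1 + n2) at every contour pixel."""
    h, w = len(fg), len(fg[0])
    values = []
    for r, c in contour_pixels(fg):
        n1 = n2 = 0  # n1: background count, n2: foreground count
        for rr in range(r - t, r + t + 1):
            for cc in range(c - t, c + t + 1):
                if 0 <= rr < h and 0 <= cc < w and fg[rr][cc]:
                    n2 += 1
                else:
                    n1 += 1  # pixels outside the image count as background
        values.append((n1 - n2) / (n1 + n2))
    return values
```

Under these assumptions C always lies in [-1, 1], so the 100-bin PDF mentioned above can be built by histogramming the returned values over that interval.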
2.3 Tortuosity feature (f3)
This feature makes it possible to distinguish between fast writers who produce smooth handwriting and slow writers who produce ‘tortuous’/twisted handwriting. To estimate tortuosity, for each pixel p of the text, we determine the longest line segment that traverses p and is completely included inside the foreground (Figure 4a). An example of estimated tortuosities is shown in Figure 4b.
The PDF of the angles of the longest traversing segments is produced in a vector with the size set to 10, as mentioned previously.
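A brute-force sketch of this search is given below. It is an approximation under stated assumptions: candidate directions are discretized into `n_angles` values over [0, π), the segment is followed in unit steps with nearest-pixel rounding, and ties are resolved in favor of the smallest angle. None of these choices is specified in the text, and the function names are hypothetical.

```python
import math


def longest_segment(fg, r, c, n_angles=36):
    """Angle in [0, pi) and length (in unit steps) of the longest foreground
    run through pixel (r, c), searched over n_angles discrete directions."""
    h, w = len(fg), len(fg[0])
    best_angle, best_len = 0.0, 0
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        dr, dc = -math.sin(theta), math.cos(theta)  # image rows grow downward
        length = 1                                  # the pixel itself
        for sign in (1, -1):                        # walk both ways from (r, c)
            step = 1
            while True:
                rr = int(round(r + sign * step * dr))
                cc = int(round(c + sign * step * dc))
                if not (0 <= rr < h and 0 <= cc < w and fg[rr][cc]):
                    break
                length += 1
                step += 1
        if length > best_len:
            best_angle, best_len = theta, length
    return best_angle, best_len


def tortuosity_pdf(fg, bins=10, n_angles=36):
    """PDF of the longest-segment angles over all foreground pixels."""
    counts = [0] * bins
    for r, row in enumerate(fg):
        for c, is_ink in enumerate(row):
            if is_ink:
                angle, _ = longest_segment(fg, r, c, n_angles)
                counts[min(int(angle / math.pi * bins), bins - 1)] += 1
    total = sum(counts)
    return [v / total for v in counts] if total else [0.0] * bins
```

The cost grows with the number of foreground pixels times the number of directions times the stroke length, so a practical implementation would likely need something faster than this direct search.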
2.4 Chain code features (f4 to f7)
Chain codes are generated by scanning the contour of the text and assigning a number to each pixel according to its location with respect to the previous pixel. Figure 5 shows a contour and its corresponding chain code.
Chain codes have been applied to writer identification in . These features make it possible to characterize a detailed distribution of curvatures in the handwriting. Chain codes might be applied at different orders:
f4: The PDF of i patterns in the chain code list such that i ∈ {0, 1, …, 7}. This PDF has a size of 8.
f5: The PDF of (i, j) patterns in the chain code list such that i, j ∈ {0, 1, …, 7}. This PDF has a size of 64.
Similarly, f6 and f7 correspond to the PDF of (i, j, k) and (i, j, k, l) in the chain code list with sizes of 512 and 4,096, respectively. Not all successions of chain code patterns can be obtained. For example, the chain code pattern (1, 5) is not a possible succession, and therefore its corresponding distribution in the PDF will always be nil.
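The chain code features of orders 1 to 4 can be sketched as pattern counts over the code sequence. The direction numbering used here (0 = east, increasing counter-clockwise, with image y growing downward) is an assumed Freeman convention; the numbering of Figure 5 may differ. The names `FREEMAN`, `chain_code`, and `ngram_pdf` are hypothetical.

```python
from collections import Counter

# Assumed Freeman convention: 0 = east, counter-clockwise, y grows downward.
FREEMAN = {(1, 0): 0, (1, -1): 1, (0, -1): 2, (-1, -1): 3,
           (-1, 0): 4, (-1, 1): 5, (0, 1): 6, (1, 1): 7}


def chain_code(points):
    """Chain code of an ordered contour given as 8-connected (x, y) pixels."""
    return [FREEMAN[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(points, points[1:])]


def ngram_pdf(codes, order):
    """PDF of length-`order` patterns; the vector has 8 ** order entries."""
    counts = Counter(tuple(codes[i:i + order])
                     for i in range(len(codes) - order + 1))
    total = sum(counts.values())
    pdf = [0.0] * (8 ** order)
    for pattern, count in counts.items():
        index = 0
        for code in pattern:       # read the pattern as a base-8 number
            index = index * 8 + code
        pdf[index] = count / total
    return pdf
```

Calling `ngram_pdf(codes, k)` for k = 1, 2, 3, 4 gives vectors of sizes 8, 64, 512, and 4,096, matching the sizes stated for f4 to f7.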
2.5 Edge-based directional features (f8 to f26)
Initially introduced in , these features provide a detailed distribution of directions and can also be applied at several sizes by positioning a window centered at each contour pixel and counting the occurrences of each direction, as shown in Figure 6a. These features have been computed from size 1 (f8, which has a PDF size of 4) to size 10 (f17, which has a PDF size of 40). We have also extended these features to include not only the contour of the moving window but also the whole window (Figure 6b) . This feature has been computed from size 2 (f18, which has a PDF size of 12) to size 10 (f26, which has a PDF size of 220).
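The PDF sizes quoted above follow a simple pattern, which the short check below reproduces. The interpretation is an assumption inferred only from the stated sizes: a window of size s contributes 4⋅s directions on its contour, and the filled variant accumulates the contours of all windows up to size s. The function names and the `FEATURE_SIZES` table are hypothetical.

```python
def contour_pdf_size(size):
    """Assumed size of the edge-direction PDF for a window contour of `size`."""
    return 4 * size


def filled_pdf_size(size):
    """Assumed size for the filled window: contours of sizes 1..size combined."""
    return sum(4 * k for k in range(1, size + 1))  # equals 2 * size * (size + 1)


# Sizes of all 26 feature vectors as stated in Section 2.
FEATURE_SIZES = {"f1": 10, "f2": 100, "f3": 10,
                 "f4": 8, "f5": 64, "f6": 512, "f7": 4096}
for s in range(1, 11):                  # f8..f17: contour windows of size 1..10
    FEATURE_SIZES["f%d" % (7 + s)] = contour_pdf_size(s)
for s in range(2, 11):                  # f18..f26: filled windows of size 2..10
    FEATURE_SIZES["f%d" % (16 + s)] = filled_pdf_size(s)

TOTAL_PREDICTORS = sum(FEATURE_SIZES.values())
```

Under this reading, concatenating f1 to f26 gives the total number of individual predictors handed to a classifier when all features are combined.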
3 Classification
In this step, the features previously presented are used to decide which category each handwriting belongs to. When performing the classification, each element of the feature vectors will be used as a separate input for the classifier. (For example, f1 will be an input vector of 10 elements for the classifier.)
We have combined these features using a random forest classifier  and kernel discriminant analysis using spectral regression (SR-KDA). Descriptions of the random forest classifier and SR-KDA  are given below.
The use of these two classifiers is justified by their ability to train on large feature sets and to achieve high classification rates .
3.1 Random forest classifier
Random forests is an ensemble learning method for classification that operates by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes output by individual trees. Each decision tree is constructed as follows:
If the number of cases in the training set is N, sample n cases such that n < N at random but with replacement from the original data. This sample will be the training set for growing the tree.
If there are M input variables, a number m ≪ M is specified such that at each node, m variables are selected at random from the M and the best split on these m is used to split the node. The value of m is held constant while the forest is grown.
Each tree is grown to the largest extent possible. There is no pruning.
In our case, we built the random forest classifiers for the cases of age, gender, and nationality using the R random forest library .
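The three construction rules can be sketched in a few lines of plain Python. This is a toy illustration only and not the R library used in the study; the Gini impurity as split criterion, the handling of nodes where none of the sampled variables can split, and all function names are assumptions.

```python
import random
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def grow_tree(X, y, m, rng):
    """Grow an unpruned tree; only m randomly chosen variables are tried per node."""
    if len(set(y)) == 1:
        return y[0]                                  # pure leaf
    best = None
    for j in rng.sample(range(len(X[0])), m):
        for threshold in sorted(set(row[j] for row in X))[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= threshold]
            right = [i for i, row in enumerate(X) if row[j] > threshold]
            score = (len(left) * gini([y[i] for i in left]) +
                     len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, j, threshold, left, right)
    if best is None:         # sampled variables are constant here: majority leaf
        return Counter(y).most_common(1)[0][0]
    _, j, threshold, left, right = best
    return (j, threshold,
            grow_tree([X[i] for i in left], [y[i] for i in left], m, rng),
            grow_tree([X[i] for i in right], [y[i] for i in right], m, rng))


def predict_tree(tree, x):
    while isinstance(tree, tuple):                   # internal nodes are tuples
        j, threshold, left, right = tree
        tree = left if x[j] <= threshold else right
    return tree


def grow_forest(X, y, n_trees, n_cases, m, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(n_cases)]  # with replacement
        forest.append(grow_tree([X[i] for i in idx], [y[i] for i in idx], m, rng))
    return forest


def predict_forest(forest, x):
    """Output the mode of the classes predicted by the individual trees."""
    return Counter(predict_tree(t, x) for t in forest).most_common(1)[0][0]
```

Class labels are assumed to be strings or integers, since tuples are reserved for internal nodes in this sketch.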
3.2 Kernel discriminant analysis using spectral regression
Let x_i ∈ R^d, i = 1, …, m, be training vectors represented as an m × m kernel matrix K such that K(x_i, x_j) = ⟨Φ(x_i), Φ(x_j)⟩, where Φ(x_i) and Φ(x_j) are the embeddings of data items x_i and x_j. If ν denotes a projective function into the kernel feature space, then the objective function for KDA is :
$$\max_{\nu} D(\nu) = \frac{\nu^{T} C_b \nu}{\nu^{T} C_t \nu} \tag{1}$$
where C_b and C_t denote the between-class and total scatter matrices in the feature space, respectively. Equation 1 can be solved by the eigen-problem C_b ν = λ C_t ν. It is proved in  that Equation 1 is equivalent to:
$$\max_{\alpha} D(\alpha) = \frac{\alpha^{T} K W K \alpha}{\alpha^{T} K K \alpha} \tag{2}$$
where α = [α_1, α_2, …, α_m]^T is the eigenvector satisfying KWKα = λKKα.
W = (W_l), l = 1, …, n, is an (m × m) block diagonal matrix of labels arranged such that the upper block corresponds to positive examples and the lower one corresponds to negative examples of the class. Each eigenvector α yields a projection function ν in the feature space.
It is also shown in  that instead of solving the eigen-problem in KDA, the KDA projections can be obtained by the following two linear equations:
$$W\phi = \lambda\phi, \qquad (K + \delta I)\,\alpha = \phi \tag{3}$$
where ϕ is an eigenvector of W, I is the identity matrix, and δ > 0 is a regularization parameter. Eigenvectors ϕ are obtained directly from the Gram-Schmidt method. Because (K + δI) is positive definite, a Cholesky decomposition K + δI = R^T R is used to solve the linear equations in (3). Thus, for the resolution of the linear system of Equation 3, the system becomes:
$$R^{T}\theta = \phi, \qquad R\,\alpha = \theta \tag{4}$$
i.e., solve the system to first find vector θ and then find vector α. In summary, SR-KDA only needs to solve a set of regularized regression problems, and there is no eigenvector computation involved. This results in a significant improvement of computational complexity and allows large kernel matrices to be handled. After obtaining α, the decision function for the new data item is calculated from:
$$f(x) = \sum_{i=1}^{m} \alpha_i K(x, x_i) \tag{5}$$
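For the two-class case, the regularized solve can be sketched as follows. The sketch assumes that the linear system has the form (K + δI)α = ϕ, that it is solved through a Cholesky factor R with R^T R = K + δI, and that the decision value is the kernel expansion Σ α_i K(x, x_i). The RBF kernel, the values of δ and γ, and the use of a mean-centered class indicator as the response ϕ (the Gram-Schmidt step against the all-ones vector) are illustrative choices that the text does not specify. All function names are hypothetical.

```python
import math


def rbf_kernel(a, b, gamma=1.0):
    return math.exp(-gamma * sum((u - v) ** 2 for u, v in zip(a, b)))


def cholesky_upper(A):
    """Upper-triangular R with R^T R = A, for symmetric positive definite A."""
    n = len(A)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s = A[i][j] - sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = math.sqrt(s) if i == j else s / R[i][i]
    return R


def solve_spd(A, b):
    """Solve A x = b by first solving R^T theta = b, then R x = theta."""
    R = cholesky_upper(A)
    n = len(b)
    theta = [0.0] * n
    for i in range(n):                               # forward substitution
        theta[i] = (b[i] - sum(R[k][i] * theta[k] for k in range(i))) / R[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        x[i] = (theta[i] - sum(R[i][k] * x[k] for k in range(i + 1, n))) / R[i][i]
    return x


def fit_srkda_binary(X, is_positive, delta=0.01, gamma=1.0):
    m = len(X)
    indicator = [1.0 if flag else 0.0 for flag in is_positive]
    mean = sum(indicator) / m
    phi = [v - mean for v in indicator]              # orthogonal to the ones vector
    A = [[rbf_kernel(X[i], X[j], gamma) + (delta if i == j else 0.0)
          for j in range(m)] for i in range(m)]
    return solve_spd(A, phi)


def decision(alpha, X, x, gamma=1.0):
    return sum(a * rbf_kernel(x, xi, gamma) for a, xi in zip(alpha, X))
```

In this two-class sketch, a positive decision value points to the positive class and a negative value to the other class; a multi-class problem would need one response vector per additional class.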
The classification results of these classifiers for all the presented features on the QUWI dataset are shown in the next section.
In this section, we describe the QUWI handwriting database on which the experiments have been conducted. We also present the results obtained for each individual feature as well as their combination using random forests and kernel discriminant analysis. The results are then analyzed and discussed.
To the best of our knowledge, the only publicly available handwriting dataset annotated with respect to age, gender, and nationality is the QUWI dataset . This dataset contains handwritings of 1,017 writers in both English and Arabic. In each language, writers produced one text that is the same for all the writers and another text that is different for every writer. Moreover, writers in this dataset have different genders, age ranges, and nationalities. Because very few writers are left-handed (around fifty writers), this dataset is of limited use for handedness detection.
To perform the classification, 70% of this dataset has been used for training and 30% for testing as is often the case in data mining . We have computed the presented features on this dataset. As mentioned previously, each feature corresponds to a PDF of several values with each of them used as a separate predictor. These predictors were combined using a random forest classifier, which is well suited for this category of features , as well as the kernel discriminant analysis using spectral regression.
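A 70/30 partition can be sketched as below. Splitting at the level of writers, the use of a fixed random seed, and the rounding of the cut point are assumptions; the text only states the proportions. The name `split_writers` is hypothetical.

```python
import random


def split_writers(writer_ids, train_fraction=0.7, seed=0):
    """Shuffle the writer identifiers and cut them into training and test sets."""
    ids = list(writer_ids)
    random.Random(seed).shuffle(ids)
    cut = int(round(train_fraction * len(ids)))
    return ids[:cut], ids[cut:]
```

Splitting by writer rather than by page keeps all samples of one person on the same side of the partition, which avoids testing on writers already seen during training.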
Three classification tasks were defined for this dataset:
Gender classification. Note that a random classification would predict approximately 50%, as this is a two-class classification.
Age range classification. To avoid classes with very few samples, seven age ranges were defined: (1950 to 1965), (1966 to 1975), (1976 to 1985), (1986 to 1990), (1991 to 1995), (1996 to 2000), and (2001 to 2012). A random classification would therefore predict approximately 14%.
Nationality prediction. To avoid small classes, only writers of eight different nationalities were considered. Each of these classes has more than 30 writers. A random classification would only predict approximately 12%.
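The chance levels quoted for the three tasks follow from the number of classes, assuming balanced classes and a uniformly random guess. The sketch below also maps a year to one of the seven ranges listed above; treating those ranges as years of birth is an assumption, and the names `AGE_RANGES`, `age_class`, and `chance_level` are hypothetical.

```python
AGE_RANGES = [(1950, 1965), (1966, 1975), (1976, 1985), (1986, 1990),
              (1991, 1995), (1996, 2000), (2001, 2012)]


def age_class(year):
    """Index of the range containing `year`, or None if it falls outside all of them."""
    for k, (lo, hi) in enumerate(AGE_RANGES):
        if lo <= year <= hi:
            return k
    return None


def chance_level(n_classes):
    """Expected accuracy (in percent) of a uniform random guess."""
    return 100.0 / n_classes
```

For two, seven, and eight classes this gives 50%, about 14.3%, and 12.5%, which the text rounds to 50%, 14%, and 12%.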
Tables 1, 2, and 3 show the correct classification rates for each category of features using a random forest of 5,000 trees and kernel discriminant analysis for gender, age range, and nationality classification, respectively. The classification is performed for the Arabic and English languages separately in the first step and jointly in the second step. The results are reported for the case of similar texts written by all the writers and different texts for each writer. Figure 7 summarizes the best results for gender, age range, and nationality using the two classification methods.
4.3 Discussion and analysis
To test which feature combination is optimal for each classification problem, we plotted the average performance (for similar and different texts using random forest and KDA) for the proposed geometric features (f1 to f3), chain code features (f4 to f7), edge-based directional features (f8 to f17), and filled edge-based directional features (f18 to f26). The results are shown in Figure 8. It is important to note that the performances are seemingly very high for gender, low for age range, and even lower for nationality detection. This is due to the fact that gender prediction is a binary classification problem in which even a random prediction would score 50%, whereas age range and nationality detection are respectively seven- and eight-class classification problems in which a random classifier would only score 14% and 12%, respectively.
The results show that chain code-based features generally outperform the other features for predicting gender and nationality, which suggests that the detailed distribution of curvatures in the handwriting is of high importance in characterizing gender and nationality. Note as well that the proposed geometric features outperform the other features for predicting the age range, which suggests that directions, curvatures, and tortuosity are all essential for determining age through handwriting.
We also plotted the average performance of random forests and KDA classifiers when combining all the features (f1 to f26). The results are shown in Figure 9. Random forests are generally preferred for the prediction of age range and nationality, whereas KDA is preferred for the prediction of gender. This suggests that random forests may be preferable when predicting patterns with many classes, whereas KDA may be preferable for binary classification problems.
The average performance when combining all the features (f1 to f26) on the same and different texts is shown in Figure 10. Notice that handwritings of the same text yield slightly better results for the prediction of gender but not for the prediction of age range or nationality. This suggests that working on the same text rather than on different texts does not bring a consistent benefit in terms of classification results.
The average performance when combining all the features (f1 to f26) on Arabic and English texts is shown in Figure 11. Generally, Arabic handwritings yield better prediction results. This may be explained by the complexity of the Arabic script, which tends to help categorize writers.
Additionally, the combination of several features does not always yield better results. There are many cases in which one feature alone outperforms a combination of several features. Indeed, some features might be redundant or irrelevant and contain no useful information in which case they need to be removed for obtaining better performance.
The classification systems described here are promising; however, there remains considerable room for improvement in terms of new features and classification methods. Comparing the results obtained in this research with those of other researchers is difficult because of differences in experimental details, the actual handwriting used, the method of data collection, and the handling of cursive offline handwritten text. Compared with previous writer demographic identification research [1, 4], this work is the first to be implemented on offline cursive Arabic and English handwriting. This also means that it uses different sets of features and classification techniques. Unfortunately, neither of the datasets used in [1, 4] is publicly available. The dataset used in this research is available for research purposes.
Finally, for comparison purposes, the average correct gender classification results are over 73%, which exceeds the results reported in  for offline gender identification (55.39%) on a different dataset consisting of 200 writers. The results also compare well with the 77.5% reported in  on a smaller dataset (800 individuals who wrote the same letter). The authors of  also report an age range classification accuracy of 86.6%, which seemingly outperforms our 55%. However, the authors only included two age range categories (below 24 and above 45) and included only 650 individuals.
We have presented a method that uses several geometric features for the classification of age range, gender, and nationality of handwritings, which is applicable to both Arabic and English documents. This study is the first to report classification results for those subcategories on the QUWI dataset . The results are reported for both text-dependent and text-independent category classification.
Experiments show that chain code-based features generally outperform the other features for predicting gender and nationality, and that the proposed geometric features outperform the other features for predicting the age range. The results suggest that random forests are generally preferred for the prediction of age range and nationality, whereas KDA is preferred for the prediction of gender. We have also noticed that handwritings of the same text yield slightly better results for the prediction of gender but not for the prediction of age range or nationality. It has also been shown that experiments on Arabic handwritings generally attained better prediction results. Future work includes exploring ways of combining the proposed features and using other classifiers. The use of the proposed features for predicting the handedness of writers is also planned.
1. Bandi K, Srihari SN: Writer demographic identification using bagging and boosting. In Proceedings of the International Graphonomics Society Conference (IGS). Salerno, Italy; 2005:133-137. 26–29 June
2. Srihari S, Cha SH, Arora H, Lee S: Individuality of handwriting: a validation study. In 2001 Proceedings of the Sixth International Conference on Document Analysis and Recognition. Seattle; 2001:106-109. 10–13 September
3. Liwicki M, Schlapbach A, Loretan P, Bunke H: Automatic detection of gender and handedness from on-line handwriting. In Proceedings of the 13th Conference of the International Graphonomics Society. Melbourne; 2007:179-183. 11–14 November
4. Liwicki M, Schlapbach A, Bunke H: Automatic gender detection using on-line and off-line information. Pattern. Anal. Appl. 2011, 14: 87-92. 10.1007/s10044-010-0178-6
5. Otsu N: A threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 1979, 9(1):62-66.
6. Hassaine A, Al-Maadeed S, Alja’am J, Jaoua A, Bouridane A: The ICDAR2011 Arabic Writer Identification Contest. In Proceedings of the Eleventh International Conference on Document Analysis and Recognition. Beijing, China; 2011. 18–21 September
7. Hassaïne A, Al-Maadeed S, Bouridane A: A set of geometrical features for writer identification. In The 19th International Conference of Neural Information Processing Doha, Qatar. Berlin Heidelberg: Springer; 2012:584-591. 12–15 November
8. Koppenhaver K: Forensic Document Examination: principles and practice. New York: Humana Press; 2007.
9. Bulacu M, Schomaker L: Text-independent writer identification and verification using textural and allographic features. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29(4):701-717.
10. Matas J, Shao Z, Kittler J: Estimation of curvature and tangent direction by median filtered differencing. Lecture notes in computer science. vol 974. In The 8th International Conference on Image Analysis and Processing. Springer-Verlag, Berlin; 1995:83-88. 13–15 September
11. Zhang TY: A fast parallel algorithm for thinning digital patterns. Commun. ACM 1984, 27(3):236-239. 10.1145/357994.358023
12. Siddiqi I, Vincent N: Text independent writer recognition using redundant writing patterns with contour-based orientation and curvature features. Pattern Recogn. 2010, 43(11):3853-3865. 10.1016/j.patcog.2010.05.019
13. Breiman L: Random forests. Mach. Learn. 2001, 45: 5-32. 10.1023/A:1010933404324
14. Cai D, He X, Han J: Proceedings of the ICDM. Omaha, Nebraska; 2007. 28–31 October
15. Bock HH, Diday E: Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data. Heidelberg: Springer; 2000.
16. Liaw A, Wiener M: Classification and regression by randomForest. R News 2002, 2(3):18-22. http://CRAN.R-project.org/doc/Rnews
17. Mika S, Ratsch G, Weston J, Scholkopf B, Mullers KR: Fisher discriminant analysis with kernels. Proceedings of the 1999 IEEE Signal Processing Society Workshop, Madison. In Neural Networks for Signal Processing IX, 1999. Piscataway: IEEE; 1999:41-48. 23–25 August
18. Lin TY, Xie Y, Wasilewska A, Liau CJ: Data Mining: Foundations and Practice, vol. 118. Heidelberg: Springer; 2008.
19. Al-Ma’adeed S, Ayouby W, Hassaine A, Aljaam J: QUWI: an Arabic and English handwriting dataset for offline writer identification. In International Conference on Frontiers in Handwriting Recognition. Bari, Italy; 2012. 18–20 September
This work is supported by the Qatar National Research Fund through National Priority Research Program (NPRP) No. 09 – 864 – 1 – 128. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official views of the Qatar National Research Fund or Qatar University.
The authors declare that they have no competing interests.
- Writer demographic category classification
- Handwriting analysis
- Chain code
- Edge-based directional features
- Writer identification