1 Introduction

Face recognition systems are widely used in access control applications, including law enforcement. The ease of operation and the non-intrusive capture process have further increased their applicability to different authentication applications. However, the performance of face biometric systems is known to degrade under variations in expression, ageing, illumination, etc. The availability of large-scale face databases and high-performing deep learning techniques [5] has enabled a new generation of face recognition systems that are robust to these variations. Nevertheless, challenges such as beautification surgery [7], make-up [6], the gender discriminability problem for transgender subjects [8] and facial disguise [6] remain, because they cause prominent changes in both the texture and the geometric characteristics of the face, which significantly impact the performance of face recognition systems. In spite of the significant research conducted to improve face recognition under the covariates mentioned above, the availability of only small-scale databases (such as a limited number of samples, or a single sample, before and after surgery) makes it more challenging to achieve robust performance.

Fig. 1. Face images of the subjects before and after drug abuse [4]. Images compiled from www.facesofmeth.us

Recently, a new challenge to face recognition systems was identified: the prolonged consumption of illicit drugs can prominently change the physical structure of the face [3, 4]. Figure 1 shows examples of faces before and after illicit substance abuse, which caused noticeable changes in facial appearance. Furthermore, the availability of only a single sample before and a single sample after substance abuse, and the fact that such samples are captured in an unconstrained environment, make the problem even more challenging. Achieving good recognition performance on such data, with its strong intra-class variation, requires a robust face recognition scheme, the design of which remains an open challenge. The significant change in facial structure is caused by suppressed appetite, which leads to undernourishment and in turn causes the body to consume muscle tissue and facial fat [4]. This process results in a gaunt and hollowed-out appearance of the face, leading to extensive changes in facial geometry.

Face recognition under illicit drug abuse was first studied in [3, 4]. In [3], the performance of state-of-the-art face recognition systems, including two different commercial off-the-shelf systems, is presented. Evaluated on the substance (or drug) abuse database, the state-of-the-art schemes show degraded performance, with a recognition rate of 38\(\%\) (Rank 1) using Histogram of Oriented Gradients (HOG) [9]. Further, a new scheme to detect a drug abuse face, based on pairwise dictionary learning, is also introduced in [3]. In [4], the performance of state-of-the-art face verification is presented by evaluating eight different face verification algorithms, including one commercial system. This study also indicated degraded performance of the state-of-the-art face recognition algorithms. Recently, a relatively new feature extraction approach known as AutoScat, derived from the scattering wavelet, was proposed in [2]. The experimental results, reported on the same database used in [3], indicated a recognition rate of 32\(\%\) at Rank 1, which is lower than the result reported in [3] using HOG features on the same database. Taken together, the available works indicate that state-of-the-art face recognition algorithms perform poorly on drug abuse face databases. Thus, there is a need for newer algorithms that explicitly address the texture and geometric variations of the face before and after drug abuse to improve the biometric performance.

In this work, we present a novel scheme based on the collaborative representation of statistically independent filter responses computed on the face images before and after drug abuse. Given a face image, the proposed scheme first detects and segments the face region, which is then normalized to a size of 120 \(\times \) 120 pixels. In the next step, we subdivide the normalized face image into six non-overlapping patches such that three patches are obtained horizontally and the remaining three patches are obtained vertically, as illustrated in Fig. 3. The key motivation for the subdivision into non-overlapping regions is to obtain at least a few regions that have not physiologically changed to a great extent. We then process each of these patches to extract statistically independent texture features using multi-scale and multi-bit Binarized Statistical Image Features (BSIF) [10]. The multi-scale and multi-bit BSIF filter bank has 56 different filters (or kernels), with seven scales (\(5 \times 5\), \(7 \times 7\), \(9 \times 9\), \(11 \times 11\), \(13 \times 13\), \(15 \times 15\) and \(17 \times 17\)) and eight lengths (5, 6, 7, 8, 9, 10, 11 and 12 bits). The features obtained for each of these filters on the individual patches are processed independently and classified using a probabilistic Collaborative Representation Classifier (Pro-CRC) [11]. Based on the comparison scores generated using Pro-CRC, we obtain a ranked list of identities by sorting the comparison scores in descending order. Since six patches and 56 different BSIF filter kernels are used to extract the features independently, we have \(56 \times 6 = 336\) rank lists for each face image. We then perform rank-level fusion using a majority voting rule to obtain the final rank list used to identify the person.
Thus, it is our assertion that the use of non-overlapping face image patches, together with the features extracted independently using 56 different kernels from the BSIF filter bank, can handle the variations in both the texture and the geometric structure of the face and thereby improve face recognition after drug abuse. The main contributions of this work are listed below:

  • Presents a novel scheme based on the collaborative representation of statistically independent filters whose responses are computed using 56 different kernels (or filters) from BSIF filter banks on the drug abuse face images.

  • Presents extensive experiments that are carried out on the publicly available Illicit Drug Abuse Database (DAD) [4] comprised of face images of 100 subjects.

  • Reports an extensive comparative study that compares the performance of the proposed scheme with six different state-of-the-art techniques, including the commercial face recognition software from Neurotechnology, a transfer learning approach using a deep Convolutional Neural Network (CNN) and the recently proposed AutoScat features [2].

The rest of the paper is organized as follows: the proposed scheme is discussed in Sect. 2, the experimental results are presented in Sect. 3, and Sect. 4 draws the conclusion.

2 Proposed Scheme for Drug Abuse Face Recognition

Figure 2 shows the block diagram of the proposed illicit drug abuse face recognition framework, which consists of two main components: (1) the face detection and normalisation unit, and (2) the proposed scheme for feature extraction, classification and rank-level fusion to achieve improved face recognition.

Fig. 2. Block diagram of the proposed face recognition scheme for subjects who abuse illicit drugs.

2.1 Face Detection and Normalisation

Given an image I, the first step is to detect and normalize the face. Face detection is carried out using the Viola-Jones algorithm [1], chosen for its robustness and performance in real-time scenarios. Due to the unconstrained capture of the face images, face detection resulted in a few false detections, which are rectified using the technique described in [13]. In the next step, the face image is normalized to compensate for rotation using the affine transform, as described in [12]. The final normalized face image \(I_{N}\) has a size of \(120 \times 120\) pixels.

2.2 Proposed Scheme

Figure 3 illustrates the block diagram of the proposed scheme for robust face recognition of subjects who abuse drugs. The primary challenge with the faces of such subjects is to address the variations in textural features caused by randomly located moles and acne, and the non-uniform deformation of the face structure caused by the loss of facial muscle. Since these variations occur randomly in different parts of the face and differ across subjects, depending on the metabolic activity resulting from drug consumption, we are motivated to approach this problem using a patch-based paradigm. Given the normalised face image \(I_{N}\), we obtain six non-overlapping face image blocks \(I_{Bi} = \left\{ I_{B1}, I_{B2}, \ldots , I_{B6} \right\} \), taken both vertically and horizontally. Figure 3 illustrates the six blocks \( I_{Bi} \) obtained on the normalised face image \(I_{N}\). We then consider one block \( I_{Bi} \) at a time and extract features using 56 different filters (or kernels) from the BSIF filter bank [10].
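The block subdivision can be sketched as follows. This is a minimal illustration under one assumed reading of the text, namely that the six blocks are three horizontal strips plus three vertical strips of the \(120 \times 120\) image, so that blocks are non-overlapping within each direction. The function name `split_into_blocks` and the list-of-rows image representation are hypothetical and are not taken from the original implementation.

```python
def split_into_blocks(image, parts=3):
    """Split a 2-D image (list of rows) into horizontal and vertical strips.

    Assumed reading of the scheme: `parts` horizontal strips followed by
    `parts` vertical strips, i.e. six blocks for parts=3.
    """
    rows, cols = len(image), len(image[0])
    h, w = rows // parts, cols // parts
    # Horizontal strips: each spans the full width and h rows.
    horizontal = [image[k * h:(k + 1) * h] for k in range(parts)]
    # Vertical strips: each spans the full height and w columns.
    vertical = [[row[k * w:(k + 1) * w] for row in image] for k in range(parts)]
    return horizontal + vertical


# Synthetic stand-in for the normalised 120 x 120 face image I_N.
normalised = [[(r * 120 + c) % 256 for c in range(120)] for r in range(120)]
blocks = split_into_blocks(normalised)
print([(len(b), len(b[0])) for b in blocks])
```

Under this reading, each horizontal strip has 40 rows and 120 columns, and each vertical strip has 120 rows and 40 columns. A different layout of the six blocks would only change this function, not the later per-block processing.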

Fig. 3. Block diagram of the proposed scheme

In this work, we employed the open-source BSIF filter bank [10] to extract the features corresponding to each face image block \(I_{Bi} \). The BSIF filters are learned in an unsupervised manner using Independent Component Analysis (ICA) on a set of image patches extracted from thirteen different natural images. The natural images are first divided into 50000 image patches, which are mean-subtracted and reduced in dimensionality using Principal Component Analysis (PCA) before the filters are learned with ICA. The learned ICA basis vectors form filters that are statistically independent. Thus, depending on the size of the natural image patches and the number of top ICA basis vectors selected, one can learn banks of filters with different sizes and lengths (or bits). For instance, a BSIF filter of size \(7 \times 7\) with 10 bits (or length) indicates that the top 10 ICA basis vectors are selected after training on natural image patches of size \(7 \times 7\). In this work, we consider filters with seven different scales, namely \(5 \times 5\), \(7 \times 7\), \(9 \times 9\), \(11 \times 11\), \(13 \times 13\), \(15 \times 15\) and \(17 \times 17\), and eight different bits (or lengths), namely 5, 6, 7, 8, 9, 10, 11 and 12, to constitute a filter bank with \(7 \times 8 = 56\) different statistically independent filters (or kernels). Thus, given the \(i^{th}\) face image block \(I_{Bi}\) and the BSIF filter \(F_{b}^{s \times s}\), the response is computed as follows [10]:

$$\begin{aligned} r_i =\sum I_{Bi} *F_{b}^{s \times s} \end{aligned}$$
(1)

where \(I_{Bi}\) denotes the \(i^{th}\) face image block, \(*\) denotes the convolution operation, \(F_{b}^{s \times s}\) denotes the filter of size \(s \times s\) with \(s \in \left\{ 5,7,9,11,13,15,17\right\} \) and length (or bits) \(b \in \left\{ 5, 6, 7, 8, 9, 10, 11, 12\right\} \), and \(r_i\) is the response for the \(i^{th}\) face image block \(I_{Bi}\).

In the next step, the obtained response \(r_i\) is binarized to obtain the binary string as follows [10]:

$$\begin{aligned} b_i = {\left\{ \begin{array}{ll} 1, &{} \text {if}\ r_i > 0 \\ 0, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(2)

Finally, the BSIF-encoded features are obtained as the histogram of the pixels’ binary codes, which can effectively characterize the texture components in the \(i^{th}\) face image block \(I_{Bi} \).
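The three steps above (filter response, binarization and code histogram) can be sketched for a single block and a single filter setting as follows. This is only an illustration under stated assumptions: the filters are random zero-mean stand-ins rather than the learned ICA filters of [10], each of the \(b\) filters contributes one bit of the per-pixel code, the response is computed as a sliding-window sum without kernel flipping, and only positions where the filter fits inside the block are used. The names `make_standin_filters` and `bsif_histogram` are hypothetical.

```python
import random


def make_standin_filters(size, bits, seed=0):
    """Return `bits` random zero-mean size x size filters.

    Assumption: these only stand in for the learned ICA filters of the
    BSIF filter bank, which are not reproduced here.
    """
    rng = random.Random(seed)
    filters = []
    for _ in range(bits):
        f = [[rng.gauss(0.0, 1.0) for _ in range(size)] for _ in range(size)]
        mean = sum(map(sum, f)) / (size * size)
        filters.append([[v - mean for v in row] for row in f])
    return filters


def bsif_histogram(block, filters):
    """Histogram of per-pixel binary codes for one image block."""
    bits, size = len(filters), len(filters[0])
    rows, cols = len(block), len(block[0])
    hist = [0] * (2 ** bits)
    # Only positions where the filter fits entirely inside the block are used.
    for y in range(rows - size + 1):
        for x in range(cols - size + 1):
            code = 0
            for j, f in enumerate(filters):
                # Filter response at (y, x), cf. Eq. (1).
                r = sum(f[u][v] * block[y + u][x + v]
                        for u in range(size) for v in range(size))
                # Binarization, cf. Eq. (2): bit j is 1 for a positive response.
                if r > 0:
                    code |= 1 << j
            hist[code] += 1
    return hist


# The filter bank pairs every scale with every length: 7 x 8 = 56 settings.
scales = [5, 7, 9, 11, 13, 15, 17]
lengths = [5, 6, 7, 8, 9, 10, 11, 12]
bank = [(s, b) for s in scales for b in lengths]

rng = random.Random(1)
toy_block = [[rng.random() for _ in range(24)] for _ in range(16)]
hist = bsif_histogram(toy_block, make_standin_filters(size=7, bits=8))
print(len(bank), len(hist), sum(hist))
```

With \(b\) bits the histogram has \(2^{b}\) bins, so the feature length grows from 32 bins at 5 bits to 4096 bins at 12 bits. Running the function once per entry of `bank` would give the 56 feature representations for a block.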

The BSIF filter bank, comprised of independent filters with different sizes and lengths, provides a rich feature representation for the given face image block \(I_{Bi}\). Hence, it is our assertion that extracting features with independent BSIF filters and combining them at the rank level will further reduce the effect of these variations and improve the performance of face recognition after drug abuse.

Fig. 4. Illustration of the BSIF features extracted on a face image block using different scales with a fixed length (or bit) of 8: (a) before drug abuse, (b) after drug abuse

Figure 4 illustrates the features extracted at different scales with a fixed length of 8 bits for the \(i^{th}\) face image block before and after drug abuse. It can be observed that a larger scale describes coarser texture information than a smaller scale. Figure 5 illustrates the features extracted on the face image block \(I_{Bi}\) before and after drug abuse when the scale is fixed to \(7 \times 7\) and the length (or bits) is varied from 5 to 12. Here it can also be observed that different bit lengths provide different feature representations. Thus, the application of multi-scale and multi-length filters from the BSIF filter bank can provide a robust feature representation and improve performance on the drug abuse face recognition task.

Fig. 5. Illustration of the BSIF features extracted on a face image block using different lengths at a fixed scale of \(7 \times 7\): (a) before drug abuse, (b) after drug abuse

In the next step, we employ the probabilistic Collaborative Representation Classifier (Pro-CRC) [11] independently on the 56 different feature representations obtained using the BSIF filter bank. We chose Pro-CRC because of its robust performance even when only a small training set is available, a constraint that matches our application, in which one sample per subject is available for training. The primary idea of Pro-CRC is to jointly maximize the likelihood that a test sample belongs to each of the multiple subjects and finally to assign the test sample to the subject with the maximum likelihood. The features extracted from each of the 56 BSIF filters are classified independently using Pro-CRC to obtain the corresponding comparison scores. The comparison scores are then sorted in descending order to obtain the ranked list of possible identities. Since 56 filters are used independently, 56 different rank lists are obtained for the face image block \(I_{Bi}\): \(RL_{Bi}^{x} = \left\{ RL_{Bi}^{1}, RL_{Bi}^{2}, RL_{Bi}^{3}, \ldots , RL_{Bi}^{56}\right\} , \forall x = \left\{ 1,2,3,4,\ldots 56\right\} \). Finally, given the test face image \(I_{N}\), recognition is performed by combining the rank lists of the 56 filters from the six face image blocks \(I_{Bi}\) using majority voting as follows:

$$\begin{aligned} Fu_{RL} = MJ\left\{ RL_{B1}^{1:56}, RL_{B2}^{1:56}, RL_{B3}^{1:56}, RL_{B4}^{1:56}, RL_{B5}^{1:56}, RL_{B6}^{1:56}\right\} \end{aligned}$$
(3)

where MJ indicates majority voting, \(Fu_{RL}\) indicates the final fused rank list and \(\left\{ RL_{B1}^{1:56}, \ldots RL_{B6}^{1:56}\right\} \) represents the rank lists obtained for each face image block, corresponding to the 56 different BSIF filters.
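A small sketch of the rank-list construction and the fusion in Eq. (3) is given below. The text does not spell out how majority voting is extended from a single decision to a full rank list, so the rule used here is an assumption: at each step, every rank list votes for its best identity that has not yet been placed, and the identity with the most votes takes the next rank. The function names and toy scores are hypothetical.

```python
from collections import Counter


def rank_list(scores):
    """Sort identities by comparison score, highest first."""
    return sorted(scores, key=lambda ident: scores[ident], reverse=True)


def majority_vote_fusion(rank_lists):
    """Fuse rank lists over the same identities into one list.

    Assumed rule: at each step every list votes for its best identity that
    has not been placed yet; the identity with most votes takes the next
    rank (ties broken by identity order).
    """
    remaining = set(rank_lists[0])
    fused = []
    while remaining:
        votes = Counter(next(i for i in rl if i in remaining) for rl in rank_lists)
        winner = min(votes, key=lambda i: (-votes[i], i))
        fused.append(winner)
        remaining.remove(winner)
    return fused


# Toy comparison scores for three enrolled identities from three
# (block, filter) combinations; the full scheme would have 6 x 56 = 336.
toy_scores = [
    {"s1": 0.9, "s2": 0.4, "s3": 0.1},
    {"s1": 0.7, "s2": 0.2, "s3": 0.5},
    {"s1": 0.3, "s2": 0.8, "s3": 0.1},
]
lists = [rank_list(s) for s in toy_scores]
fused = majority_vote_fusion(lists)
print(fused)
```

In this toy case two of the three lists place `s1` first, so it takes Rank 1 in the fused list. Other rank-level fusion rules would fit the same interface.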

3 Experiments and Results

In this section, we present the performance of the proposed scheme on the publicly available Drug Abuse face Database (DAD) [4]. This database is similar to the Illicit Drug Abuse Face Database (IDAF) [3]; both databases are collected from the internet, mainly from the ‘Faces of Meth’ webpage [15]. Since IDAF is not publicly available, we use DAD for the experiments. DAD comprises (mostly) frontal face images captured from 101 subjects before and after illicit drug abuse. In this work, we selected the 100 subjects whose face images are of reasonable quality, giving \(100 \ subjects \times 2 \ samples \ = 200 \ images\). The results are presented in terms of the Recognition Rate (\(\%\)) at Rank-1 to Rank-5; the higher the Recognition Rate, the better the biometric performance.
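The Rank-k Recognition Rate can be computed from the fused rank lists as sketched below, using the usual definition that a probe counts as recognised at Rank-k when its true identity appears among the first k entries of its rank list. The function name `recognition_rates` and the toy data are hypothetical and do not reproduce any result from this work.

```python
def recognition_rates(rank_lists, true_ids, max_rank=5):
    """Percentage of probes whose true identity lies within rank k, k = 1..max_rank."""
    hits = [0] * max_rank
    for ranked, truth in zip(rank_lists, true_ids):
        top = ranked[:max_rank]
        if truth in top:
            # A hit at position p also counts for every rank larger than p.
            for k in range(top.index(truth), max_rank):
                hits[k] += 1
    return [100.0 * h / len(true_ids) for h in hits]


# Four toy probes over three enrolled identities.
toy_lists = [["a", "b", "c"], ["a", "b", "c"], ["c", "b", "a"], ["b", "c", "a"]]
toy_truth = ["a", "b", "c", "a"]
rates = recognition_rates(toy_lists, toy_truth, max_rank=3)
print(rates)
```

Plotting these cumulative values against the rank gives the Cumulative Match Characteristics curve.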

Table 1. Recognition performance of the proposed scheme on the DAD database

The performance of the proposed method is compared with six different face recognition systems. Five of them are based on Local Binary Patterns (LBP), Histogram of Oriented Gradients (HOG), Log-Gabor (LG), AutoScat features [2] and the commercial face recognition system from Neurotechnology. The sixth is a deep Convolutional Neural Network (CNN) used in a transfer learning paradigm. To this end, we use AlexNet [14], in which the last fully connected layer is retrained using the face images from the DAD database. Since the DAD database has only one sample for each subject, we carry out data augmentation using random crops to retrain the deep CNN. Finally, the features of the last layer are classified using a linear Support Vector Machine (SVM), and the performance is quantified using the Recognition Rate (\(\%\)). To the best of our knowledge, this is the first time a deep CNN has been applied to recognize faces affected by illicit drug abuse.

Fig. 6. Sample results demonstrating the performance of the proposed scheme with (a) correct Rank 1 recognition (b) incorrect Rank 1 recognition

Fig. 7. Recognition performance of the proposed scheme on DAD database

Table 1 presents the quantitative results of the proposed scheme along with the comparative methods in terms of Recognition Rate (\(\%\)) at Rank-1 and Rank-5. The results are also presented as Cumulative Match Characteristics (CMC) plots in Fig. 7. The key findings are as follows:

  • The proposed method yields the best performance, with a Rank-1 recognition rate of \(52\%\), outperforming the state-of-the-art face recognition techniques including AutoScat [2], the commercial face recognition system and the deep CNN.

  • The proposed scheme improves the Recognition Rate (\(\%\)) by 26\(\%\) at Rank-1 and by 10\(\%\) at Rank-5 when compared to the second-best performing technique, which is based on HOG features.

  • The CMC curves show the performance of the proposed scheme together with the state-of-the-art algorithms for ranks from 1 to 7. It can be observed that the proposed system consistently demonstrates higher performance across all ranks when compared with the six different state-of-the-art schemes.

  • Figure 6 illustrates example image pairs that are correctly (see Fig. 6(a)) and incorrectly (see Fig. 6(b)) recognised at Rank-1 using the proposed scheme. Based on the obtained results, the proposed system is capable of correctly identifying subjects with major physiological variations, occlusion (presence of spectacles) and slight expressions, along with varying image quality. However, most of the failed identification cases involve larger variations caused by rapid ageing of the appearance, in addition to the variations caused by the prolonged use of illicit drugs.

4 Conclusion

In this work, we address the face recognition problem under the variations caused by illicit drug abuse, which deform both the local and the global characteristics of face images. We proposed a novel framework based on block-wise image processing and the collaborative representation of statistically independent features extracted using 56 different filters (or kernels) from the BSIF filter bank. The proposed method addresses the variations due to drug abuse by dividing the face image into six non-overlapping image blocks, each of which is processed using the BSIF filter bank to obtain 56 different feature representations. Each of these 56 feature representations is then classified independently using the probabilistic Collaborative Representation Classifier (Pro-CRC) to obtain comparison scores. These scores are sorted in descending order to obtain a rank list over all user identities. Finally, the rank lists corresponding to the 56 different feature representations are combined using majority voting to obtain the final rank list used to identify the person. Extensive experiments are carried out on the publicly available Illicit Drug Abuse Database (DAD), comprised of faces from 100 substance users. The performance of the proposed scheme is compared with six different state-of-the-art systems, including a commercial face recognition system and transfer learning using a deep CNN. The experimental results demonstrate the improved performance of the proposed scheme in recognizing subjects who abuse drugs.