Problem Formulation and Proposed Approach

Pose variation leads to high variance in the coefficients estimated from 2D images. This high variance degrades the accuracy of the coefficients as a feature representation. Pose variation also causes reliable, discriminative features to undergo self-occlusion. The problem of face classification across poses naturally transforms into a regression problem.

Linear regression cannot accurately estimate non-linear changes. This paper proposes non-linear regression on images with reduced coefficients, via kernel ELM, to estimate the non-linear mapping from non-frontal face views to their frontal counterparts for effective face recognition.

The data set is divided into a training set and a testing set. In the training set as well as the testing set there are two matrices, $F = [f_1, f_2, \dots, f_N] \in \mathbb{R}^{D \times N}$ and $P = [x_{p1}, x_{p2}, \dots, x_{pN}] \in \mathbb{R}^{D \times N}$, where $f_i$, $i = 1, \dots, N$, are the N frontal faces, $x_{pi}$, $i = 1, \dots, N$, are the N pose faces, and $l_i \in \{1, \dots, C\}$ are the corresponding class labels. K is the kernel function defined on the training-set images, and K' is the kernel function defined on the training- and testing-set images $P_p$, which can be used to define the training kernel set.

Input: set of pose images $P_p = [x_{p1}, x_{p2}, \dots, x_{pN}] \in \mathbb{R}^{D \times N}$

Output: frontal pose

$$K * P_p = F_0 \in \mathbb{R}^{D \times N} \tag{9}$$

Here K denotes the output weights of the hidden layer. The kernel ELM architecture, a single-hidden-layer feedforward network, consists for the given training set of d input units fed from the pose face image set, a d-unit output layer producing the frontal face image, and the N×N kernel matrix K acting as the hidden-layer neurons.

Algorithm for pose normalization using kernel-ELM-based nonlinear regression with reduced coefficients

1. Given a training set containing N samples of different poses, including frontal ones: FTrain ($FTr_i$) and PoseTrain ($PTr_i$), $i = 1, 2, 3, \dots, N$, with m classes.

2. All N samples are pre-processed with a gamma correction function, which replaces each gray level $I$ with the new intensity $I^{\gamma}$, where $\gamma$ is a user-defined parameter. This step enhances the local dynamic range of the image in dark or shadowed regions while compressing it in bright regions.

3. After gamma correction, each sample image is transformed from image space to the transform domain by applying the DCT (Discrete Cosine Transform) blockwise, with a block size of 8×8.

4. Of the 64 coefficients in each block, the 25 low-frequency coefficients are retained.

5. The image space with reduced coefficients is recovered using the inverse DCT.

6. The training and testing kernel matrices are calculated from the N samples with reduced coefficients, instead of using an activation function g(x) for the hidden nodes. The kernel matrix

$$\mathbf{K} = HH^{T} : \quad \mathbf{K}_{i,j} = h(x_i) \cdot h(x_j) = K(x_i, x_j) \tag{10}$$

where H is the hidden-layer output matrix whose rows are $h(x_i)$, depends only on the input data $x_i$ and the number of input samples.

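7. The output function f(x) can now be expressed in terms of the kernel, in the standard regularized kernel ELM form,

$$f(x) = [K(x, x_1), \dots, K(x, x_N)] \, (\lambda I + \mathbf{K})^{-1} \, T \tag{11}$$

instead of in terms of the hidden-layer feature map,

$$f(x) = h(x) H^{T} (\lambda I + HH^{T})^{-1} T,$$

where T is the front pose. The parameter $\lambda$ controls the amount of shrinkage of the coefficients. The variance of the coefficients is described by the trace of the covariance matrix of K.

8. Compute the testing and training accuracy.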
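The steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names, the Gaussian (RBF) kernel, the default values of `gamma`, `lam` and `sigma`, and the choice of the top-left 5×5 square of each block as the 25 low-frequency coefficients are all assumptions. The sketch also stores one sample per row (N×D), whereas the text writes the data matrices as D×N, and it expects image sides that are multiples of 8.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis as an n x n matrix (rows are basis vectors)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def preprocess(img, gamma=0.2, block=8, keep=5):
    """Steps 2-5: gamma correction, blockwise DCT, coefficient reduction, inverse DCT.

    Assumes intensities in [0, 1] and image sides that are multiples of `block`.
    `keep=5` retains a 5 x 5 corner, i.e. 25 of the 64 coefficients per block
    (assumption: the low frequencies are taken as the top-left square).
    """
    img = np.asarray(img, dtype=float) ** gamma
    d = dct_matrix(block)
    mask = np.zeros((block, block))
    mask[:keep, :keep] = 1.0
    out = np.empty_like(img)
    for r in range(0, img.shape[0], block):
        for c in range(0, img.shape[1], block):
            coeffs = d @ img[r:r + block, c:c + block] @ d.T      # forward 2D DCT
            out[r:r + block, c:c + block] = d.T @ (coeffs * mask) @ d  # inverse 2D DCT
    return out

def rbf_kernel(a, b, sigma):
    """Gaussian kernel between the rows of a and the rows of b (assumed kernel)."""
    sq = (a ** 2).sum(1)[:, None] + (b ** 2).sum(1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def fit_kernel_elm(pose_train, front_train, lam=1e-3, sigma=1.0):
    """Steps 6-7: solve (lam * I + K) alpha = T for the regression coefficients."""
    k_train = rbf_kernel(pose_train, pose_train, sigma)
    return np.linalg.solve(k_train + lam * np.eye(len(pose_train)), front_train)

def predict_front(pose_test, pose_train, alpha, sigma=1.0):
    """Evaluate f(x) = [K(x, x_1), ..., K(x, x_N)] alpha for each test row."""
    return rbf_kernel(pose_test, pose_train, sigma) @ alpha

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for face images (16 x 16, intensities in [0, 1]).
    pose = np.stack([preprocess(rng.random((16, 16))).ravel() for _ in range(10)])
    front = np.stack([preprocess(rng.random((16, 16))).ravel() for _ in range(10)])
    alpha = fit_kernel_elm(pose, front, lam=1e-3, sigma=5.0)
    print(predict_front(pose[:2], pose, alpha, sigma=5.0).shape)  # (2, 256)
```

Solving the linear system with `np.linalg.solve` avoids forming the matrix inverse explicitly. A classifier for step 8 (for example, nearest neighbour on the predicted frontal views) would be applied to the output of `predict_front`; the text does not specify which classifier is used, so none is included here.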