• Stepwise Procedures In Discriminant Analysis

  • CHAPTER ONE
    • DEFINITION OF TERMS

      Discriminant Function

      This is a latent variable which is created as a linear combination of discriminating variables, such that

      Y = L1X1 + L2X2 + … + LpXp

      where the L’s are the discriminant coefficients and the X’s are the discriminating variables.
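
      For illustration only (a minimal sketch; the coefficients and case values below are invented), evaluating the discriminant function for one case is just the dot product of the coefficient vector and the case’s values on the discriminating variables:

          import numpy as np

          L = np.array([0.8, -0.4, 1.2])   # hypothetical discriminant coefficients L1, L2, L3
          x = np.array([2.5, 1.0, 3.2])    # one case's values on the discriminating variables X1, X2, X3
          Y = L @ x                        # Y = L1*X1 + L2*X2 + L3*X3, the case's discriminant score
          print(Y)                         # 0.8*2.5 - 0.4*1.0 + 1.2*3.2 = 5.44

      Applied to standardized data, the same computation yields the Z-score form of the discriminant score referred to below.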

      The eigenvalue: This is the ratio of between-groups to within-groups variance along a dimension (discriminant function), and it indicates how important that dimension is in classifying cases of the dependent variable. There is one eigenvalue for each discriminant function. With more than one discriminant function, the first eigenvalue is the largest and the most important in explanatory power, while the last eigenvalue is the smallest and the least important in explanatory power.

      Relative importance is assessed by the eigenvalues, since they reflect the percentage of variance explained in the dependent variable, cumulating to 100% across all functions. Eigenvalues are part of the default output in SPSS (Analyze, Classify, Discriminant).
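
      As a rough illustration outside SPSS (a sketch only, using an invented two-group, two-variable data set), the eigenvalues are those of W⁻¹B, where W and B are the within-groups and between-groups sums-of-squares-and-cross-products matrices:

          import numpy as np

          # hypothetical data: rows are cases, columns are the discriminating variables
          X = np.array([[2.0, 3.1], [2.4, 2.9], [1.8, 3.6], [2.2, 2.7],   # group 0
                        [4.1, 5.0], [3.8, 5.5], [4.4, 4.7], [4.0, 5.2]])  # group 1
          y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

          grand_mean = X.mean(axis=0)
          W = np.zeros((X.shape[1], X.shape[1]))  # within-groups SSCP matrix
          B = np.zeros_like(W)                    # between-groups SSCP matrix
          for g in np.unique(y):
              Xg = X[y == g]
              mg = Xg.mean(axis=0)
              W += (Xg - mg).T @ (Xg - mg)
              d = (mg - grand_mean).reshape(-1, 1)
              B += len(Xg) * (d @ d.T)

          # the discriminant eigenvalues are the eigenvalues of W^(-1) B;
          # with two groups only one is non-zero (one discriminant function)
          eigvals = np.sort(np.linalg.eigvals(np.linalg.inv(W) @ B).real)[::-1]
          print(eigvals)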

      The Discriminant Score

      This is the value obtained from applying a discriminant function formula to the data for a given case. For standardized data, the discriminant score is the Z-score.

      Cutoff

      When group sizes are equal, the mean of the two centroids is the cutoff for a two-group discriminant analysis; when the groups are unequal, the cutoff is the weighted mean. A case is classed as 0 if its discriminant score is less than or equal to the cutoff and as 1 if it is above it.
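
      As a sketch of this rule (scores invented; equal group sizes assumed, so the cutoff is simply the midpoint of the two group centroids):

          import numpy as np

          # hypothetical discriminant scores for two equally sized groups
          scores_0 = np.array([-1.2, -0.8, -1.0, -1.4])
          scores_1 = np.array([ 0.9,  1.1,  1.3,  0.7])

          # with equal group sizes the cutoff is the mean of the two centroids
          cutoff = (scores_0.mean() + scores_1.mean()) / 2.0

          new_score = 0.4
          predicted = 0 if new_score <= cutoff else 1   # classed as 0 at or below the cutoff, 1 above it
          print(cutoff, predicted)                      # cutoff = -0.05, so this case is classed as 1

      For unequal group sizes the midpoint would be replaced by the weighted mean described above.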

      The Relative Percentage

      This is equal to the eigenvalue of a function divided by the sum of the eigenvalues of all discriminant functions in the model. It is the percent of discriminating power for the model associated with a particular discriminant function, and it tells us how many functions are important. The ratio of eigenvalues indicates the relative discriminating power of the discriminant functions.
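
      A minimal numerical sketch (the two eigenvalues are invented, as if from a three-group analysis):

          import numpy as np

          eigvals = np.array([2.6, 0.4])            # hypothetical eigenvalues of two discriminant functions
          rel_pct = 100 * eigvals / eigvals.sum()   # percent of discriminating power per function
          print(rel_pct)                            # about 86.7% and 13.3%, cumulating to 100%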

      The Canonical Correlation, R

      This measures the association between the groups formed by the dependent and the given discriminant function. A large canonical correlation indicates a high correlation between the discriminant functions and the groups.

      An R of 1.0 shows that all of the variability in the discriminant scores can be accounted for by that dimension. The relative percentage and R need not agree. The canonical correlation, R, also shows how useful each function is in determining group differences.
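
      As a short sketch (reusing invented eigenvalues), each function’s canonical correlation can be recovered from its eigenvalue through the standard relation R = sqrt(λ / (1 + λ)):

          import numpy as np

          eigvals = np.array([2.6, 0.4])                  # hypothetical discriminant eigenvalues
          canonical_R = np.sqrt(eigvals / (1 + eigvals))  # R = sqrt(lambda / (1 + lambda))
          print(canonical_R)                              # about 0.85 and 0.53; values near 1 separate groups well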

      Mahalanobis Distances

      This is the distance between a case and the centroid of each group (of the dependent variable) in attribute space, the space defined by the discriminating variables. For a given case there is one Mahalanobis distance to each group, and the case is classified as belonging to the group with the smallest Mahalanobis distance: the closer the case is to a group centroid, the smaller the Mahalanobis distance. Mahalanobis distance is measured in terms of standard deviations from the centroid.
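
      A minimal NumPy sketch of this nearest-centroid rule (data invented; a pooled within-groups covariance matrix is assumed):

          import numpy as np

          # hypothetical two-variable data for two groups
          X0 = np.array([[2.0, 3.1], [2.4, 2.9], [1.8, 3.6], [2.2, 2.7]])
          X1 = np.array([[4.1, 5.0], [3.8, 5.5], [4.4, 4.7], [4.0, 5.2]])

          # pooled within-groups covariance matrix and the two group centroids
          S = ((len(X0) - 1) * np.cov(X0, rowvar=False) +
               (len(X1) - 1) * np.cov(X1, rowvar=False)) / (len(X0) + len(X1) - 2)
          S_inv = np.linalg.inv(S)
          centroids = [X0.mean(axis=0), X1.mean(axis=0)]

          def mahalanobis_sq(x, m):
              d = x - m
              return float(d @ S_inv @ d)   # squared distance, in standard-deviation units

          case = np.array([2.2, 3.0])
          d2 = [mahalanobis_sq(case, m) for m in centroids]
          print(d2, "-> assigned to group", int(np.argmin(d2)))  # smallest distance wins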

      The Classification Table

      This is a table in which the rows are the observed categories of the dependent and the columns are the predicted categories of the dependent. Under perfect prediction, all cases lie on the diagonal.
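
      A small sketch (labels invented) of building such a table with pandas, with observed categories in the rows and predicted categories in the columns:

          import pandas as pd

          observed  = pd.Series([0, 0, 0, 1, 1, 1, 1, 0], name="observed")
          predicted = pd.Series([0, 0, 1, 1, 1, 0, 1, 0], name="predicted")

          table = pd.crosstab(observed, predicted)   # rows: observed groups, columns: predicted groups
          print(table)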

      Hit Ratio

      This is the percentage of cases on the diagonal of a confusion matrix, that is, the percentage of correct classifications. The higher the hit ratio, the lower the misclassification error; the lower the hit ratio, the higher the error rate.
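
      As a minimal sketch (the counts are invented), the hit ratio is the diagonal of the classification table divided by the total number of cases:

          import numpy as np

          # hypothetical classification table: rows = observed groups, columns = predicted groups
          table = np.array([[38,  7],
                            [ 5, 50]])
          hit_ratio = table.trace() / table.sum()   # cases on the diagonal / all cases
          print(f"hit ratio: {hit_ratio:.1%}")      # 88.0% correct, so 12.0% misclassified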

      Tolerance

      This is the proportion of the variance in an independent variable that is not explained by the variables already in the model. A tolerance of zero means that the independent variable under consideration is a perfect linear combination of variables already in the model, while a tolerance of 1 implies that the predictor is completely independent of the other predictor variables already in the model. Most computer packages set the minimum tolerance at 0.01 as the default option.
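
      A rough sketch of the computation (values invented): regress the candidate predictor on the predictors already in the model and take one minus the resulting R²:

          import numpy as np

          # hypothetical predictors already in the model (columns) and a candidate predictor
          X_in  = np.array([[1.0, 2.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8], [5.0, 5.1]])
          x_new = np.array([2.2, 2.4, 3.9, 4.6, 6.0])

          # ordinary least-squares fit of the candidate on the included predictors (plus an intercept)
          A = np.column_stack([np.ones(len(x_new)), X_in])
          coef, *_ = np.linalg.lstsq(A, x_new, rcond=None)
          resid = x_new - A @ coef

          r_squared = 1 - resid.var() / x_new.var()
          tolerance = 1 - r_squared   # 0: perfect linear combination, 1: completely independent
          print(tolerance)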

