Factor Analysis In Research

Afza.Malik GDA

Nursing Research and Factor Analysis

What Is Factor Analysis, Types of Factor Analysis, Variable Characteristics, Considering the Variables, Mathematical Description of Analysis, Outcomes of Analysis.

What Is Factor Analysis

    Factor analysis is a multivariate technique for determining the underlying structure and dimensionality of a set of variables. By analyzing intercorrelations among variables, factor analysis shows which variables cluster together to form unidimensional constructs. It is useful in elucidating the underlying meaning of concepts. 

    However, it involves a higher degree of subjective interpretation than is common with most other statistical methods. In nursing research, factor analysis is commonly used for instrument development (Ferketich & Muller, 1990), theory development, and data reduction. 

    Specifically, factor analysis is used for identifying the number, nature, and importance of factors, comparing factor solutions for different groups, estimating scores on factors, and testing theories (Nunnally & Bernstein, 1994).

Types of Factor Analysis

    There are two major types of factor analysis: exploratory and confirmatory. In exploratory factor analysis, the data are described and summarized by grouping together related variables. The variables may or may not be selected with a particular purpose in mind. 

    Exploratory factor analysis is commonly used in the early stages of research, when it provides a method for consolidating variables and generating hypotheses about underlying processes that affect the clustering of the variables. 

    Confirmatory factor analysis is used in later stages of research for theory testing related to latent processes or to examine hypothesized differences in latent processes among groups of subjects. In confirmatory factor analysis, the variables are carefully and specifically selected to reveal underlying processes or associations.

Variable Characteristics

    The raw data should be at, or treatable as, the interval level of measurement, such as data obtained with Likert-type measures. Next, a number of assumptions relating to the sample, variables, and factors should be met. 

    First, the sample size must be sufficiently large to avoid erroneous interpretations of random differences in the magnitude of correlation coefficients. 

    As a rule of thumb, a minimum of five cases for each observed variable is recommended; however, Knapp and Brown (1995) reported that ratios as low as three subjects per variable may be acceptable. Others generally recommend a total of 100 to 200 cases (Nunnally & Bernstein, 1994).
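    As a hedged illustration, the sample-size rules of thumb above can be combined in a small helper function (the function name and defaults are assumptions for this sketch, not a published standard):

```python
# Illustrative combination of the cases-per-variable rule and an
# absolute minimum floor, as described in the text.
def minimum_sample_size(n_variables, cases_per_variable=5, floor=100):
    """Return the larger of the ratio-based minimum and the absolute floor."""
    return max(n_variables * cases_per_variable, floor)

# A 20-item instrument under the 5:1 rule, with a 100-case floor:
print(minimum_sample_size(20))  # 100
# A 30-item instrument under the same rule:
print(minimum_sample_size(30))  # 150
```

    Under these assumptions, the ratio rule only dominates once the instrument is long enough that five cases per variable exceeds the absolute minimum.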

    Second, the variables should be normally distributed, with no substantial evidence of skewness or kurtosis. Third, scatterplots should indicate that the associations between pairs of variables are linear. 

    Fourth, outliers among cases should be identified and their influence reduced either by transformation or by arbitrarily replacing the outlying value with a less extreme score. 

    Fifth, variables exhibiting multicollinearity or singularity should be deleted; these conditions can be detected by examining whether the determinant of the correlation matrix or the eigenvalues associated with some factors approach zero. In addition, a squared multiple correlation equal to 1 indicates singularity, and squared multiple correlations close to 1 indicate multicollinearity. 
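    A minimal sketch of these diagnostics, using simulated data in which one variable nearly duplicates another (the variable names, sample size, and noise level are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated data: x2 is nearly a copy of x0, creating multicollinearity.
x0 = rng.normal(size=200)
x1 = rng.normal(size=200)
x2 = x0 + rng.normal(scale=0.01, size=200)
X = np.column_stack([x0, x1, x2])

R = np.corrcoef(X, rowvar=False)
det = np.linalg.det(R)            # a determinant near zero signals trouble

# Squared multiple correlation of each variable with all the others,
# computed from the diagonal of the inverse correlation matrix.
R_inv = np.linalg.inv(R)
smc = 1.0 - 1.0 / np.diag(R_inv)

print(det)           # near zero because of the near-duplicate variable
print(smc.round(3))  # SMCs near 1 for x0 and x2, low for x1
```

    In practice the offending variable would be dropped or the redundant pair combined before extraction.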

    Sixth, outliers among variables, indicated by low squared multiple correlation with all other variables and low correlations with all important factors, suggest the need for cautious interpretation and possible elimination of the variables from the analysis. 

    Seventh, there should be adequate factorability within the correlation matrix, which is indicated by several sizable correlations between pairs of variables that exceed .30. Finally, screening is important for identifying outlying cases among the factors. 

    Such outliers can be identified by large Mahalanobis distances (estimated as chi-square values) measured from the location of the case in the space defined by the factors to the centroid of all cases in that space; when such outliers are present, factor analysis may not be appropriate.
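    This screening step can be sketched on simulated factor scores with one outlying case planted by hand (the sample size, number of factors, and the .999 cutoff level are illustrative assumptions):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
# Simulated factor scores for 100 cases on 3 factors, plus one planted outlier.
scores = rng.normal(size=(100, 3))
scores[0] = [6.0, -6.0, 6.0]

# Squared Mahalanobis distance of each case from the centroid.
centroid = scores.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(scores, rowvar=False))
diff = scores - centroid
d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)

# Compare against a chi-square critical value (df = number of factors).
cutoff = chi2.ppf(0.999, df=3)
outliers = np.where(d2 > cutoff)[0]
print(outliers)  # includes the planted case 0
```

    The chi-square reference distribution is what justifies reading the squared distances as significance-testable values, as the text notes.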

Considering the Variables 

    When planning for factor analysis, the first step is to identify a theoretical model that will guide the statistical model (Ferketich & Muller, 1990). The next step is to select the psychometric measurement model, either classic or neoclassic, that will reflect the nature of measurement error. 

    The classic model assumes that all measurement error is random and that all variance is unique to individual variables and not shared with other variables or factors. The neoclassic model recognizes both random and systematic measurement error, which may reflect common variance that is attributable to unmeasured or latent factors. 

    The selection of the classic or neoclassic model influences whether the researcher chooses principal-components analysis or common factor analysis (Ferketich & Muller, 1990).

Mathematical Description of Analysis

    Mathematically speaking, factor analysis generates factors that are linear combinations of variables. The first step in factor analysis is factor extraction, which involves the removal of as much variance as possible through the successive creation of linear combinations that are orthogonal (unrelated) to previously created combinations. 

    The principal-components method of extraction is widely used for analyzing all the variance in the variables. However, other methods of factor extraction, which analyze common factor variance (i.e., variance that is shared with other variables), include the principal-factors method, the alpha method, and the maximum likelihood method (Nunnally & Bernstein, 1994). 

    Various criteria have been used to determine how many factors account for a substantial amount of variance in the data set. One criterion is to accept only those factors with an eigenvalue equal to or greater than 1.0 (Guttman, 1954). 

    An eigenvalue is a standardized index of the amount of the variance extracted by each factor. Another approach is to use a scree test to identify sharp discontinuities in the eigenvalues for successive factors (Cattell, 1966).
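    The eigenvalue-greater-than-1.0 criterion can be sketched on simulated data with a known two-factor structure (the loadings, sample size, and noise level are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
# Simulated data: six variables driven by two latent factors plus noise.
f = rng.normal(size=(300, 2))
true_loadings = np.array([[.8, 0], [.7, 0], [.6, 0],
                          [0, .8], [0, .7], [0, .6]])
X = f @ true_loadings.T + rng.normal(scale=0.5, size=(300, 6))

# Eigenvalues of the correlation matrix, largest first.
R = np.corrcoef(X, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

# Kaiser-Guttman criterion: retain factors with eigenvalue >= 1.0.
n_retained = int(np.sum(eigenvalues >= 1.0))
print(eigenvalues.round(2))
print(n_retained)  # 2, matching the two factors that generated the data
```

    A scree test would instead plot these eigenvalues in order and look for the sharp drop after the second value.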

Outcomes of Analysis

    Factor extraction results in a factor matrix that shows the relationship between the original variables and the factors by means of factor loadings. The factor loadings, when squared, equal the variance in the variable accounted for by the factor. 

    For each variable, the sum of its squared loadings across all of the extracted factors represents the communality (shared variance) of that variable. The sum of a factor's squared loadings across all variables equals that factor's eigenvalue (Nunnally & Bernstein, 1994).
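    These two identities can be verified directly from a loading matrix (the loadings below are hypothetical values chosen for illustration):

```python
import numpy as np

# Hypothetical loadings for four variables (rows) on two factors (columns).
loadings = np.array([[.80, .10],
                     [.75, .05],
                     [.15, .70],
                     [.20, .65]])

# Communality of each variable: sum of its squared loadings across factors.
communalities = (loadings ** 2).sum(axis=1)

# Eigenvalue of each factor: sum of squared loadings across variables.
eigenvalues = (loadings ** 2).sum(axis=0)

print(communalities.round(3))  # first variable: 0.8^2 + 0.1^2 = 0.65
print(eigenvalues.round(3))
```

    Squaring a single loading likewise gives the variance in that variable accounted for by that factor, as stated above.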

    Because the initial factor matrix may be difficult to interpret, factor rotation is commonly used when more than one factor emerges. Factor rotation involves the movement of the reference axes within the factor space so that the variables align with a single factor (Nunnally & Bernstein, 1994). 

    Orthogonal rotation keeps the reference axes at right angles and results in factors that are uncorrelated. Orthogonal rotation is usually performed through a method known as varimax, but other methods (quartimax and equamax) are also available. Oblique rotation allows the reference axes to rotate into acute or oblique angles, thus resulting in correlated factors (Nunnally & Bernstein). 

    When oblique rotation is used, there are two resulting matrices: a pattern matrix that reveals partial regression coefficients between variables and factors, and a structure matrix that shows variable-to-factor correlations. Factors are interpreted by examining the pattern and magnitude of the factor loadings in the rotated factor matrix (orthogonal rotation) or pattern matrix (oblique rotation). 

    Ideally, there are one or more marker variables, variables with a very high loading on one and only one factor (Nunnally & Bernstein, 1994), that can help in the interpretation and naming of factors. Generally, factor loadings of .30 and higher are large enough to be meaningful (Nunnally & Bernstein). 

    Once a factor is interpreted and labeled, researchers usually determine factor scores, which are scores on the abstract dimension defined by the factor.
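    The extraction-rotation-scoring sequence can be sketched with scikit-learn, assuming a version (0.24 or later) whose FactorAnalysis estimator supports varimax rotation; the simulated two-factor data are an assumption for illustration:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
# Simulated data: six variables generated from two latent factors.
f = rng.normal(size=(300, 2))
true_loadings = np.array([[.8, 0], [.7, 0], [.6, 0],
                          [0, .8], [0, .7], [0, .6]])
X = f @ true_loadings.T + rng.normal(scale=0.5, size=(300, 6))

# Extract two factors with varimax (orthogonal) rotation.
fa = FactorAnalysis(n_components=2, rotation='varimax')
scores = fa.fit_transform(X)    # factor scores: one row per case
loadings = fa.components_.T     # rotated loadings: variables x factors

print(loadings.round(2))        # each variable loads mainly on one factor
print(scores.shape)             # (300, 2)
```

    The rotated loading matrix is what the researcher inspects for marker variables, and the score matrix supplies each case's position on the labeled dimensions.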

    Replication of factor solutions in subsequent analyses with different populations gives increased credibility to the findings. Comparisons between factor-analytic solutions can be made by visual inspection of the factor loadings or by using formal statistical procedures, such as the computation of Cattell's salient similarity index and the use of confirmatory factor analysis (Gorsuch, 1983).
