| EFA | |
|---|---|
| Name | Exploratory factor analysis |
| Abbreviation | EFA |
| Type | Multidisciplinary method |
| Introduced | Early 20th century |
| Related | Factor analysis, PCA, ICA, CFA |
Exploratory factor analysis
Exploratory factor analysis (commonly abbreviated as EFA) is a statistical technique for uncovering latent structure in multivariate data. It is used to identify underlying factors that explain correlations among observed variables and to guide scale construction, psychometric validation, and dimensionality reduction. EFA has been adopted across psychology, sociology, market research, genetics, and neuroscience, linking empirical measurement to theoretical constructs.
Exploratory factor analysis models observed covariances in terms of a smaller number of unobserved latent variables. It differs from confirmatory factor analysis (CFA) in emphasizing discovery rather than hypothesis testing, and it is often compared with principal component analysis (PCA) and independent component analysis (ICA). Commonly cited estimation approaches are maximum likelihood, weighted least squares, and principal axis factoring. Rotation families include varimax, oblimin, and promax, while criteria for factor retention often reference the Kaiser criterion, the scree test, and parallel analysis.
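The common factor model these techniques estimate can be written, in standard notation, as:

```latex
% p observed variables x, k < p latent factors f
x = \mu + \Lambda f + \varepsilon,
\qquad
\operatorname{Cov}(x) = \Sigma = \Lambda \Lambda^{\top} + \Psi ,
```

where \(\Lambda\) is the \(p \times k\) loading matrix, the factors \(f\) are taken to have identity covariance, and \(\Psi\) is the diagonal matrix of unique variances. Because replacing \(\Lambda\) by \(\Lambda R\) for any orthogonal \(R\) leaves \(\Lambda \Lambda^{\top}\) unchanged, the loadings are identified only up to rotation, which is why rotation criteria such as varimax are needed.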
EFA originated in early 20th‑century work on intelligence and psychometrics associated with Spearman and Thurstone, building on Pearson's correlation coefficient and later extended by theoretical contributions from Kaiser. Mid‑century expansions incorporated algorithms from numerical linear algebra, such as the singular value decomposition, developed in part at the psychometrics programs of the University of Chicago and the University of Minnesota. The rise of computing at IBM and of statistical software from SPSS, SAS Institute, and later R packages such as psych accelerated adoption. Contemporary methodological refinements integrate factor models with structural equation frameworks, exemplified by LISREL and Mplus, in work associated with Harvard University, Stanford University, and University College London.
EFA workflows begin with selection of observed indicators, drawn for example from studies presented at American Psychological Association conferences, market surveys by Nielsen Holdings, or gene expression panels from projects such as The Cancer Genome Atlas. Data preprocessing, often following standards endorsed by World Health Organization surveys, includes inspection of correlation matrices, Bartlett's test of sphericity, and sampling-adequacy measures such as the Kaiser–Meyer–Olkin (KMO) index. Estimation may use the expectation–maximization (EM) algorithm to handle missing data, and rotation is applied to improve interpretability in the spirit of Thurstone's simple structure. Applications include scale construction for clinical instruments used with American Psychiatric Association diagnostic tools, consumer segmentation in case studies for Procter & Gamble, factor discovery in genomics research published by the National Institutes of Health, and source separation in neuroimaging projects at Massachusetts General Hospital and the Human Connectome Project.
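A minimal end-to-end sketch of this workflow in Python, assuming scikit-learn is available (the article does not prescribe a particular implementation, and the data, loadings, and sample sizes below are purely illustrative):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)

# Simulate 300 observations of 6 indicators driven by 2 latent factors
# (illustrative loadings, not taken from any real instrument).
loadings = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                     [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
factors = rng.normal(size=(300, 2))
X = factors @ loadings.T + rng.normal(scale=0.5, size=(300, 6))

# Fit a 2-factor model; varimax rotation aids interpretability
# in the spirit of Thurstone's simple structure.
fa = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
est_loadings = fa.components_.T    # (6, 2): estimated loading matrix
uniquenesses = fa.noise_variance_  # (6,): estimated unique variances
```

After rotation, each column of the estimated loading matrix should load strongly on one block of indicators, mirroring the simulated two-factor structure.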
EFA is applied across domains: personality and intelligence assessment, as in the development of the Stanford–Binet test; organizational measurement used by McKinsey & Company; social attitudes research featured in Pew Research Center reports; and educational measurement in studies tied to OECD surveys. In biomedical research, EFA aids biomarker identification in studies funded by the Wellcome Trust and the European Research Council. In marketing, practitioners at Boston Consulting Group and Kantar Group employ EFA for brand positioning. In neuroscience, EFA complements independent component approaches in work at Johns Hopkins University and the University of Oxford.
Technical variants include constrained factor models such as bifactor models, associated with the Schmid–Leiman transformation, and hierarchical factor models developed in the psychometric tradition by Cattell. Bayesian factor analysis frameworks draw on methods from Gelman and Jeffreys and are implemented in tools like Stan. Sparse factor techniques connect to compressed sensing theory from Donoho and are used in high‑dimensional genomics, leveraging graphical-model methods from Meinshausen and Bühlmann. Extensions embed EFA within structural equation modeling via software such as LISREL and Mplus, and integrate with machine learning pipelines in libraries such as TensorFlow and scikit‑learn for feature extraction.
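The idea of sparse loadings can be illustrated with scikit-learn's SparsePCA, used here only as a stand-in for sparse factor techniques (it is an L1-penalized components model, not a full sparse factor analysis; the data and penalty value are illustrative assumptions):

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.default_rng(0)
# Illustrative 2-block loading structure as a sparse-recovery target.
loadings = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                     [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
X = rng.normal(size=(300, 2)) @ loadings.T \
    + rng.normal(scale=0.5, size=(300, 6))

# The L1 penalty (alpha) can drive small loadings to exactly zero,
# giving the sparse, interpretable structure the text describes.
spca = SparsePCA(n_components=2, alpha=1.0, random_state=0).fit(X)
sparse_loadings = spca.components_  # shape (2, 6)
```

Larger `alpha` values yield sparser component vectors, trading reconstruction accuracy for interpretability.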
Critiques of EFA focus on overinterpretation of rotated solutions and on the subjectivity of factor retention decisions, debated by Kaiser and Cattell among others. Limitations include sensitivity to sample size, noted in methodological reviews by Nunnally, and bias under violations of normality, highlighted in work associated with Mardia's kurtosis measures. High collinearity and factor indeterminacy can cause unstable loadings, a problem discussed by Anderson and Rubin. Misapplication is documented in psychometric audits by American Educational Research Association panels and in methodological commentaries in journals such as Psychometrika and the Journal of Educational Measurement.
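The subjectivity of retention decisions can be reduced with simulation-based criteria such as Horn's parallel analysis. A minimal sketch, with the simulation count and percentile as illustrative choices:

```python
import numpy as np

def parallel_analysis(X, n_sims=100, percentile=95, seed=0):
    """Retain factors whose correlation-matrix eigenvalues exceed those
    expected from uncorrelated random data of the same shape."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    obs = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    sims = np.empty((n_sims, p))
    for i in range(n_sims):
        R = rng.normal(size=(n, p))
        sims[i] = np.linalg.eigvalsh(np.corrcoef(R, rowvar=False))[::-1]
    thresh = np.percentile(sims, percentile, axis=0)
    n_keep = 0
    for o, t in zip(obs, thresh):  # count until the first failure
        if o <= t:
            break
        n_keep += 1
    return n_keep
```

On data simulated from two well-separated factors this typically retains two factors, whereas the Kaiser criterion can over-extract on the same data.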
Category:Statistical methods