Generated by GPT-5-mini

| Stata | |
|---|---|
| Name | Stata |
| Developer | StataCorp |
| Released | 1985 |
| Latest release | (varies) |
| Programming language | C, Mata |
| Operating system | Windows, macOS, Linux |
| License | Proprietary |
Stata is a general-purpose statistical software package used for data analysis, data management, and graphics. It is developed and distributed by StataCorp and is widely used in fields such as epidemiology, economics, sociology, political science, biostatistics, and demography. Researchers at institutions such as Harvard University, Stanford University, the University of Oxford, the London School of Economics, and the Massachusetts Institute of Technology frequently use it alongside packages such as R, Python, SAS, SPSS, and MATLAB.
Stata was first released in 1985 by Computing Resource Center, a California company later renamed StataCorp, with initial development led by William Gould, during a period when desktop computing from vendors like IBM and Apple Inc. was expanding. Early adopters included researchers at the World Bank, the International Monetary Fund, the United Nations, the Centers for Disease Control and Prevention, and the National Institutes of Health. Over successive releases it incorporated methods developed in landmark works by statisticians from Columbia University, Yale University, the University of Chicago, the University of Pennsylvania, and Johns Hopkins University. Stata’s development paralleled advances documented by scholars at the American Statistical Association, the Royal Statistical Society, and the Institute of Mathematical Statistics.
Stata provides both a command-line and a graphical user interface, used by practitioners at organizations such as the Bill & Melinda Gates Foundation, by authors publishing in The Lancet, Nature, and Science, and in university labs. Its capabilities include point-and-click dialogs, as seen in software from Microsoft Corporation, and scripting akin to systems created at Bell Labs. Its graphics facilities are comparable to those described in textbooks from Princeton University Press, Yale University Press, and Cambridge University Press. Output can be integrated into workflows built on LaTeX, Microsoft Word, Microsoft Excel, and Tableau.
Stata stores data in its own proprietary .dta file format and can exchange data with Microsoft Excel workbooks, CSV files (a format described by the Internet Engineering Task Force in RFC 4180), SAS, SPSS, and R. It handles panel data, time-series, and hierarchical structures commonly analyzed in studies by researchers at the National Bureau of Economic Research, the Organisation for Economic Co-operation and Development, the European Central Bank, and the Federal Reserve Board. Its data import and export routines are comparable to those in tools from Oracle Corporation and in IBM SPSS Statistics; integration workflows mirror practices in World Health Organization and UNICEF data projects.
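To illustrate the plain-text exchange path mentioned above, here is a minimal sketch in Python (standard library only) that writes a small panel dataset to CSV in long format, one row per unit and year, and reads it back. The variable names (`country`, `year`, `gdp`) and all values are made up for the example, and the helper functions are hypothetical. A file with this layout could then be loaded in Stata with `import delimited`.

```python
import csv
import io

# Hypothetical panel: two units observed over two years (long format).
rows = [
    {"country": "A", "year": 2020, "gdp": 1.5},
    {"country": "A", "year": 2021, "gdp": 1.7},
    {"country": "B", "year": 2020, "gdp": 2.0},
    {"country": "B", "year": 2021, "gdp": 2.4},
]

FIELDS = ["country", "year", "gdp"]


def to_csv_text(records):
    """Serialize records to CSV text with a header row and CRLF line endings."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\r\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def from_csv_text(text):
    """Parse CSV text back into records, restoring the numeric types."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"country": r["country"], "year": int(r["year"]), "gdp": float(r["gdp"])}
        for r in reader
    ]


csv_text = to_csv_text(rows)
round_trip = from_csv_text(csv_text)
```

Long format is used here because panel tools generally expect one observation per unit and time period; CSV itself carries no type information, which is why the reader converts `year` and `gdp` explicitly.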
Stata implements regression, generalized linear models, survival analysis, time-series methods, multilevel mixed models, and survey design analysis, methods used in research associated with the American Economic Association, the Royal Society, the European Commission, and the Pew Research Center. Its estimation commands are used in applied work by scholars at the National Institutes of Health, the Food and Agriculture Organization, the International Labour Organization, and think tanks such as the Brookings Institution and the RAND Corporation. Advanced procedures include maximum likelihood estimation and Bayesian methods informed by research from Stanford University, the University of Cambridge, and Princeton University; econometric modules reflect approaches published by authors in Econometrica and the Journal of Econometrics.
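As a rough illustration of what the simplest of these estimators computes, the sketch below fits a one-predictor least-squares line in plain Python using the textbook closed-form solution. It is not Stata's implementation, and the function name `ols_simple` and the data are invented for the example; in Stata the analogous fit would be requested with `regress y x`.

```python
from statistics import mean


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation; the slope is undefined")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
intercept, slope = ols_simple(x, y)
```

A statistical package additionally reports standard errors, test statistics, and fit measures, which this sketch leaves out.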
Stata’s scripting language and macro facilities support automation and the reproducible research workflows promoted by organizations such as the Center for Open Science and the Open Data Institute and by journals such as The American Statistician. Users write extensions as ado-files and share them through repositories maintained by universities such as the University of California, Duke University, the University of Michigan, and the University of Washington, and through community sites similar to GitHub and CRAN (Comprehensive R Archive Network). Integration points allow interoperability with Python, R, and database systems such as PostgreSQL, MySQL, and SQLite. Teaching materials by professors at Columbia University, the University of Chicago, Yale University, and Brown University frequently include automated do-file examples.
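One way to picture this kind of automation is a script that writes a do-file whose body uses a local macro and a `foreach` loop. The Python sketch below is only an illustration under stated assumptions: the function `build_do_file` is hypothetical, and the generated commands assume Stata's bundled `auto` example dataset and its variables `price`, `mpg`, and `weight`.

```python
def build_do_file(variables, dataset="auto"):
    """Return do-file text that summarizes each variable inside a loop."""
    if not variables:
        raise ValueError("at least one variable name is required")
    lines = [
        "* Illustrative do-file generated by a script",
        f"sysuse {dataset}, clear",
        # A local macro holds the variable list so it is defined in one place.
        f"local vars {' '.join(variables)}",
        "foreach v of local vars {",
        # Stata expands the loop macro written between ` and '.
        "    summarize `v'",
        "}",
    ]
    return "\n".join(lines) + "\n"


do_text = build_do_file(["price", "mpg", "weight"])
```

Saving `do_text` to a file such as `summaries.do` (a made-up name) would give a script that can be rerun unchanged, which is the property reproducible workflows rely on.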
Stata is distributed under proprietary licenses, with editions tailored to different scales of analysis and student, single-user, and multi-user licenses used by institutions such as the University of California, Los Angeles, the University of Texas, Ohio State University, and the University of Toronto. Platform support covers Microsoft Windows, macOS, and Linux distributions favored by research computing groups at Lawrence Berkeley National Laboratory and Argonne National Laboratory. Site licenses and other licensing arrangements are negotiated with procurement offices such as those at Stanford University and the University of Cambridge; training and certification programs are offered by private firms and by university continuing education departments, including Harvard Extension School and the London School of Hygiene & Tropical Medicine.
Category:Statistical software