
R (programming language)

Generated by DeepSeek V3.2
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Paradigm: Multi-paradigm (array, object-oriented, imperative, functional, procedural, reflective)
Designer: Ross Ihaka and Robert Gentleman
Developer: R Core Team
Released: August 1993
Latest release version: 4.4.1
Latest release date: 14 June 2024
Typing: Dynamic
License: GNU General Public License
Website: https://www.r-project.org

R is a free software environment and programming language designed for statistical computing and data visualization. Developed as an implementation of the S programming language, it has become a de facto standard in academic statistics and is widely used in industry for data analysis, machine learning, and bioinformatics. Its extensive package ecosystem, distributed mainly through the Comprehensive R Archive Network (CRAN), provides specialized tools for many scientific disciplines.

History

R was conceived in the early 1990s by statisticians Ross Ihaka and Robert Gentleman at the University of Auckland. Their work was inspired by the S programming language, developed at Bell Laboratories by John Chambers and colleagues. The first public version was released in 1993; the name is reportedly a play on both the shared first initial of its creators and the name of its predecessor. In 1997 the R Core Team was formed to guide development, and it continues to manage the project. Other milestones include the establishment of the Comprehensive R Archive Network (CRAN) and the founding of the R Foundation for Statistical Computing in 2003, which supports the project.

Features

R ships with a comprehensive set of built-in functions for linear regression, time-series analysis, and statistical hypothesis testing. Function arguments are evaluated lazily (only when their value is first needed), and functions are first-class values; together these support a functional programming style. A core strength is publication-quality data visualization, available through the base graphics system and the influential ggplot2 package. The language also provides several object-oriented programming systems, including S3 and S4, which are used to build complex statistical software.
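The lazy evaluation, first-class functions, and S3 dispatch mentioned above can be illustrated with a minimal sketch. The names `first_only`, `compose`, `describe`, and the `circle` class are hypothetical, chosen only for this illustration, and S4 is not shown.

```r
# Lazy evaluation: an argument is evaluated only if the function body uses it.
first_only <- function(a, b) {
  a  # `b` is never touched, so the expression passed for it never runs
}
result <- first_only(1, stop("this error is never raised"))

# First-class functions: functions can be passed as arguments and returned.
compose <- function(f, g) function(x) g(f(x))
add_one_then_double <- compose(function(x) x + 1, function(x) x * 2)
doubled <- add_one_then_double(3)

# S3: a generic dispatches on the class attribute of its first argument.
describe <- function(obj, ...) UseMethod("describe")
describe.default <- function(obj, ...) "an ordinary object"
describe.circle <- function(obj, ...) paste("a circle of radius", obj$radius)

shape <- structure(list(radius = 2), class = "circle")
label <- describe(shape)   # uses describe.circle
fallback <- describe(42)   # uses describe.default
```

Because `b` is never forced inside `first_only`, the call to `stop()` is never executed. In S3, a method is simply a function named `generic.class`, which is why defining `describe.circle` is enough for dispatch to find it.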

Syntax and examples

The syntax is similar to that of S and favors a vectorized style, in which operations apply to entire vectors or other data structures without explicit loops. A basic example is creating a vector and calculating its mean: `x <- c(1, 2, 3, 4, 5); mean(x)`. Statistical models are specified concisely with formulas, for example fitting a linear model with `lm(y ~ x1 + x2, data = mydata)`. Assignment is conventionally written with the `<-` operator, though `=` is also permitted. Control flow uses standard constructs such as `if`, `for`, and `while`, and user-defined functions are created with the `function` keyword.
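The constructs described above can be combined in a short sketch. The contents of `mydata` are made up for illustration (the response `y` is constructed exactly as `1 + 2 * x1 + 3 * x2`), and `classify` is a hypothetical helper.

```r
# Vectorized arithmetic: the operation applies to every element at once.
x <- c(1, 2, 3, 4, 5)
squares <- x^2
avg <- mean(x)

# The same sum of squares written with an explicit loop, for comparison.
total <- 0
for (value in x) {
  total <- total + value^2
}

# A user-defined function using if/else.
classify <- function(v) {
  if (v > avg) "above mean" else "not above mean"
}
groups <- vapply(x, classify, character(1))

# A linear model fitted with the formula interface on made-up data.
mydata <- data.frame(x1 = c(1, 2, 3, 4, 5, 6), x2 = c(2, 1, 4, 3, 6, 5))
mydata$y <- 1 + 2 * mydata$x1 + 3 * mydata$x2
fit <- lm(y ~ x1 + x2, data = mydata)
coefs <- coef(fit)  # intercept and slopes, in formula order
```

Since `y` is an exact linear function of `x1` and `x2` here, the fitted coefficients recover the intercept and slopes used to build it, up to floating-point error. Real data would of course include noise.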

Applications

R is extensively used in bioinformatics for analyzing DNA microarray and genome sequencing data, notably through packages from Bioconductor. In finance, it is applied to risk management, portfolio optimization, and econometrics. The social sciences use it for psychometrics and survey methodology, while fields such as epidemiology and pharmacology use it for clinical trial analysis. Its role in machine learning has grown significantly, supported by interfaces to frameworks such as TensorFlow and by packages implementing random forests and neural networks.

Implementation and development

The primary implementation is written in C, Fortran, and R itself, and is distributed as source code as well as precompiled binaries. Code is executed by an interpreter, and performance-critical sections can be moved into compiled code called through the `.C` or `.Call` interfaces. Development is overseen by the R Core Team, with contributions from a global community. The source code is kept in a Subversion repository, and new versions are tested on multiple platforms, including Microsoft Windows, macOS, and various Linux distributions.

Community and ecosystem

The community is supported by conferences such as useR! and posit::conf (formerly rstudio::conf). The Comprehensive R Archive Network hosts over 19,000 user-contributed packages, covering domains from spatial analysis to text mining. Commercial support and integrated development environments are provided by companies such as Posit (formerly RStudio) and Microsoft, which has offered R Tools for Visual Studio. Educational resources are abundant, including The R Journal, online courses from Coursera, and active forums such as Stack Overflow and the R-help mailing list.

Categories: Free statistical software; Programming languages created in 1993; Array programming languages