LLMpedia: The first transparent, open encyclopedia generated by LLMs

arXiv

Generated by DeepSeek V3.2
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Name: arXiv
Caption: A preprint repository for scientific research.
Type: Preprint repository
Language: English
Registration: Optional for reading, required for submission
Owner: Cornell University
Author: Paul Ginsparg
Launch date: August 1991
Current status: Active

arXiv is an open-access repository of electronic preprints, known as e-prints, that are approved for posting after moderation but are not peer reviewed. Established by physicist Paul Ginsparg in 1991, it has reshaped scholarly communication, particularly in physics, mathematics, computer science, and related disciplines. The service is operated by Cornell University and distributes research findings free of charge to a global audience, allowing results to circulate before formal journal publication.

History and development

The repository originated in August 1991 as an email and FTP service for the high-energy physics community, conceived by Paul Ginsparg while at Los Alamos National Laboratory. Its initial purpose was to streamline the distribution of preprints, which had traditionally been circulated by mail. The system quickly expanded to other areas of physics and, during the 1990s, to mathematics and computer science; quantitative biology and statistics were added in the 2000s. A major transition occurred in 2001, when stewardship moved to Cornell University, giving the service long-term institutional support. Over its history it has received grants from organizations such as the National Science Foundation, and its model has inspired later preprint servers such as bioRxiv.

Content and scope

The repository hosts scholarly articles across multiple scientific fields, organized into subject categories such as astrophysics, condensed matter physics, high-energy physics, mathematics, computer science, and quantitative biology. It primarily contains preprints, which are versions of academic papers posted before or during formal peer review at journals such as Physical Review Letters or Nature. The scope also includes some post-peer-review versions and conference materials, so the archive serves as a searchable record of current research. Scientists at institutions worldwide, including CERN, the Max Planck Society, and MIT, use it to keep up with new developments.
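The category scheme described above can be illustrated with a short sketch. It assumes the commonly seen code format in which an archive code is optionally followed by a dot and a subject class (for example "cs.LG" or "hep-th"). The table below is a small illustrative subset, not the full taxonomy, and the helper names split_category and describe are hypothetical.

```python
# Illustrative subset of archive codes; the real taxonomy is larger.
ARCHIVES = {
    "astro-ph": "Astrophysics",
    "cond-mat": "Condensed Matter",
    "hep-th": "High Energy Physics - Theory",
    "math": "Mathematics",
    "cs": "Computer Science",
    "q-bio": "Quantitative Biology",
}


def split_category(code):
    """Split a code such as 'cs.LG' into (archive, subject_class).

    The subject class is None when the code has no dot, as in 'hep-th'.
    """
    archive, _, subject = code.partition(".")
    return archive, (subject or None)


def describe(code):
    """Return a readable label for a category code (hypothetical helper)."""
    archive, subject = split_category(code)
    name = ARCHIVES.get(archive, "unknown archive")
    return f"{name} ({subject})" if subject else name
```

For instance, describe("math.AG") looks up the "math" archive and appends the subject class, while a code with no dot is treated as an archive on its own.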

Submission and moderation process

Authors, typically affiliated with academic or research institutions, submit manuscripts through a web interface and classify each one under the appropriate subject category. Submitters who are new to a subject area may first need an endorsement from an established arXiv author, a step that is separate from moderation. Volunteer moderators, who are subject experts and often senior scientists, then screen submissions for relevance, scholarly focus, and a minimal standard of quality, so that they are appropriate for the relevant academic community. This process is distinct from traditional peer review: it does not validate technical correctness or novelty, but filters out content that is clearly non-scientific or off-topic. The moderation system relies on the expertise of volunteers from university departments and research organizations to maintain the repository's standing.

Impact and reception

The service has had a profound impact on the culture of scientific communication, particularly in fields such as theoretical physics and mathematics, where it has become the primary venue for announcing new results. It has widened access to research, especially for scientists in developing countries or at institutions without expensive journal subscriptions. Its model of rapid dissemination has influenced the broader open access movement and is often credited with pushing traditional publishers to shorten their publication cycles. Some critics have raised concerns about the lack of formal peer review, but its widespread adoption by leading researchers and by institutions such as the Institute for Advanced Study reflects its central role in modern science.

Technical infrastructure and access

The platform is hosted primarily at Cornell University and has historically also been served through mirror sites in other countries, including Germany, India, and Japan. It provides several access methods, including a web interface and an API, and it supports automated metadata harvesting by other services, for example through the OAI-PMH protocol; FTP access dates from its early years. The entire corpus is freely accessible to anyone with an internet connection, with no fees for reading or downloading papers. Papers are held in formats such as TeX source and PDF, and the system offers search, RSS feeds, and integration with tools used by researchers at laboratories such as SLAC National Accelerator Laboratory.
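As a rough illustration of API access, the following sketch builds a query URL and extracts titles from an Atom response. It assumes the public query endpoint at export.arxiv.org and its search_query, start, and max_results parameters. The function names and the SAMPLE_FEED text are hypothetical stand-ins so that the example runs without network access; a real response contains many more fields per entry.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed endpoint of the public query API.
API_BASE = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical, heavily trimmed Atom feed used only for illustration.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry><title>Example title one</title></entry>
  <entry><title>Example
    title two</title></entry>
</feed>"""


def build_query_url(category, start=0, max_results=5):
    """Build a query URL listing papers in one subject category."""
    params = {
        "search_query": f"cat:{category}",
        "start": start,
        "max_results": max_results,
    }
    return f"{API_BASE}?{urllib.parse.urlencode(params)}"


def parse_titles(atom_xml):
    """Return entry titles from an Atom feed, collapsing whitespace."""
    root = ET.fromstring(atom_xml)
    titles = []
    for entry in root.findall("atom:entry", ATOM_NS):
        raw = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        titles.append(" ".join(raw.split()))
    return titles


if __name__ == "__main__":
    print(build_query_url("hep-th"))
    print(parse_titles(SAMPLE_FEED))
```

In practice the URL would be fetched with an HTTP client and the response body passed to parse_titles. Automated clients are generally expected to space out their requests rather than query in tight loops.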

Governance and funding

Stewardship and primary operational responsibility lie with Cornell University, where the service was long administered through the Cornell University Library. Strategic guidance is provided by multi-institutional advisory bodies comprising representatives from the research communities it serves. Financial support comes from a collaborative model involving annual contributions from a consortium of over 200 research institutions and libraries worldwide, including Harvard University, Stanford University, and the University of California, Berkeley. Additional funding has come from grants by the Simons Foundation and the Alfred P. Sloan Foundation, which support its sustainability as a community-driven resource.

Category:Digital libraries Category:Open access Category:Scientific communication