
NOMAD Repository

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Name: NOMAD Repository
Established: 2016
Type: Research data repository
Location: Berlin

The NOMAD Repository is an open research data platform for computational materials science that aggregates, curates, and disseminates atomistic simulation data. It serves as an archival and discovery service linking high-throughput calculations, software provenance, and metadata to support reproducible research in materials science, solid-state physics, and computational chemistry. The repository interoperates with international initiatives and research infrastructures to enable reuse of datasets by academic groups, national labs, and industrial partners.

Overview

The NOMAD Repository functions as a centralized archive connecting workflows, code, and datasets produced by projects using simulation engines such as VASP, Quantum ESPRESSO, ABINIT, WIEN2k, and GPAW. It integrates with data standards promoted by organizations such as the Research Data Alliance, the European Open Science Cloud, and ELIXIR. The platform supports the FAIR principles endorsed in CODATA and OECD policy documents, and its datasets can be cited in line with the data-citation practices of journals such as Nature, Science, and Physical Review Letters. Collaborations include research centers such as the Max Planck Society, Forschungszentrum Jülich, Lawrence Berkeley National Laboratory, and Argonne National Laboratory, and universities such as the Technical University of Munich, the University of Cambridge, and the Massachusetts Institute of Technology.

Data and Metadata

Datasets in the Repository include raw simulation outputs, parsed results, and aggregated meta-analyses, each linked to provenance from workflow managers such as AiiDA and FireWorks and from scripting toolkits such as the Atomic Simulation Environment. Metadata schemas draw on Dublin Core, on standards developed by NIST, and on ontology practices adapted for materials from projects such as the Gene Ontology (GO). Records include identifiers compatible with DOI registration agencies and link to author profiles in ORCID and to general-purpose repositories such as Zenodo and Figshare. The Repository supports annotations that reference external databases such as the experimental ICSD and the computational Materials Project, as well as computational resources from PRACE and XSEDE.
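As a rough illustration of the kind of record described above, the following Python sketch pairs a simulation entry with its provenance and identifiers and runs two basic sanity checks. The class name `EntryRecord`, all of its field names, and the example DOI are hypothetical and are not taken from the Repository's actual schema. The ORCID check uses the ISO 7064 MOD 11-2 checksum that ORCID iDs carry in their final character.

```python
from dataclasses import dataclass, field


def orcid_checksum_ok(orcid: str) -> bool:
    """Check the final character of an ORCID iD (ISO 7064 MOD 11-2)."""
    chars = orcid.replace("-", "")
    if len(chars) != 16 or not all(c in "0123456789" for c in chars[:15]):
        return False
    total = 0
    for ch in chars[:15]:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15].upper() == expected


@dataclass
class EntryRecord:
    # Hypothetical field names, chosen only for illustration.
    entry_id: str
    code_name: str         # simulation engine, e.g. "VASP"
    workflow_manager: str  # provenance source, e.g. "AiiDA"
    doi: str               # persistent identifier of the dataset
    author_orcids: list[str] = field(default_factory=list)

    def problems(self) -> list[str]:
        """Return human-readable issues; an empty list means none found."""
        issues = []
        if not self.doi.startswith("10."):
            issues.append("DOI should start with the '10.' prefix")
        for orcid in self.author_orcids:
            if not orcid_checksum_ok(orcid):
                issues.append(f"bad ORCID checksum: {orcid}")
        return issues


record = EntryRecord(
    entry_id="entry-001",
    code_name="VASP",
    workflow_manager="AiiDA",
    doi="10.1234/example",                   # placeholder DOI, not a real dataset
    author_orcids=["0000-0002-1825-0097"],   # ORCID's documented sample iD
)
print(record.problems())
```

Checks like these only catch malformed identifiers; confirming that a DOI or ORCID iD actually resolves would require a lookup against the respective registry.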

Infrastructure and Technology

The technical stack relies on scalable storage and compute orchestration, integrating services from cloud providers such as Amazon Web Services and Google Cloud Platform and from national compute centers, including infrastructures backed by the Deutsche Forschungsgemeinschaft. It uses database technologies and exposes APIs that follow conventions such as REST and GraphQL, together with semantic web standards from the W3C. Containerization and reproducibility are enabled through Docker and Singularity, while source-code provenance connects to GitHub and GitLab. Visualization and analytics integrate libraries and platforms such as Jupyter Notebook and Matplotlib, with TensorFlow for machine-learning workflows.

Access and Use

Access policies accommodate open access mandates from funders such as the European Commission (including its Horizon 2020 programme), the National Science Foundation, and the Wellcome Trust. Users discover datasets via a web portal and programmatic interfaces, with authentication tied to federated identity systems such as eduGAIN and ORCID. Licensing options include Creative Commons licenses and institutional agreements, which facilitate the citation practices promoted by Crossref. The platform provides tools for data mining, machine-learning model training, and linking to publications in journals such as Physical Review B and npj Computational Materials.
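Programmatic discovery of the kind described above typically means sending a structured query to an HTTP endpoint. The sketch below prepares such a request with Python's standard library without sending it. The base URL, the `entries/search` path, and the payload keys (`query`, `code_name`, `elements`, `page_size`) are all assumptions made for illustration; the Repository's real API may differ in every one of these details.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

# Hypothetical base URL; replace it with the address of a real deployment.
BASE_URL = "https://repository.example.org/api/"


def build_search_request(code_name: str, elements: list[str],
                         page_size: int = 10) -> Request:
    """Prepare (but do not send) a JSON POST request for a search endpoint."""
    payload = {
        # Hypothetical filter keys: simulation code and chemical elements.
        "query": {"code_name": code_name, "elements": sorted(elements)},
        "page_size": page_size,
    }
    return Request(
        urljoin(BASE_URL, "entries/search"),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )


request = build_search_request("VASP", ["Ti", "O"])
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Against a real service, the prepared request could be passed to `urllib.request.urlopen` and the response decoded with `json.load`. Keeping construction separate from sending makes the query easy to inspect and test offline.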

Governance and Funding

Governance involves academic consortia, national laboratories, and funding agencies including the European Commission, the German Research Foundation, and national ministries of science across participating countries. Advisory bodies include representatives from international projects such as the European Materials Modelling Council and industry partners from the semiconductor and battery sectors, represented by companies such as BASF and Siemens that collaborate in consortia. Funding sources combine competitive grants from Horizon Europe, infrastructure funding from agencies such as the Bundesministerium für Bildung und Forschung, and partner contributions from research institutes such as CERN, which provides data management expertise.

Impact and Applications

The Repository accelerates discovery in energy materials, catalysis, and electronic structure by enabling cross-study comparisons and large-scale data analytics used by researchers at Stanford University, Harvard University, ETH Zurich, and national labs including Oak Ridge National Laboratory. It supports machine-learning-driven materials design workflows that connect to industrial research at the Toyota Research Institute and BASF. Outputs include materials property predictions that inform projects at the National Renewable Energy Laboratory and computational studies relevant to standards from the International Organization for Standardization.

History and Development

The platform emerged from collaborations among European and international research groups during the 2010s, building on open data movements exemplified by initiatives at institutions such as the Max Planck Society and by projects influenced by policy papers from the European Commission. Early technical development linked groups using codes such as VASP and Quantum ESPRESSO and received funding from programs related to Horizon 2020 and from national research funders including the Deutsche Forschungsgemeinschaft. Subsequent phases expanded interoperability with infrastructures such as the European Open Science Cloud and partnerships with data-science groups at the University of Oxford and the University of California, Berkeley.

Category:Scientific data repositories