| Crystallography Open Database | |
|---|---|
| Name | Crystallography Open Database |
| Established | 2003 |
| Type | Open-access structural database |
| Discipline | Crystallography |
| Country | International |
The Crystallography Open Database (COD) is an open-access repository that aggregates crystal structure data for inorganic, metal-organic, and organic materials from peer-reviewed publications, institutional deposits, and community contributions. The project is an international collaboration among researchers, libraries, journals, and standards bodies that aims to make crystallographic information easier to find and reuse in computational materials science, solid-state chemistry, and structural biology. It supports interoperability with major archives, indexing services, and software ecosystems so that structural research can be reproduced more readily.
The database serves users ranging from academic researchers at the Massachusetts Institute of Technology, the University of Cambridge, and Stanford University to industrial teams at BASF, DuPont, and Siemens. It complements archives such as the Protein Data Bank, the Inorganic Crystal Structure Database, and the American Mineralogist Crystal Structure Database by focusing on open redistribution and machine-readable exchange. Contributors include depositors affiliated with organizations such as the Max Planck Society, Lawrence Berkeley National Laboratory, and the Chinese Academy of Sciences, while users integrate the data with platforms such as the Materials Project, AFLOW, and the Open Quantum Materials Database. Standards and identifiers are coordinated with authorities such as the International Union of Crystallography, CrossRef, and DOI registration agencies.
The initiative emerged in the early 2000s through collaborations among crystallographers at institutions including Vanderbilt University, the University of Barcelona, and the Academy of Sciences of the Czech Republic. Early milestones involved aligning deposition formats with efforts by the International Union of Crystallography and linking bibliographic metadata via CrossRef and ORCID to improve author attribution. Over time, the project integrated contributions from national facilities such as the European Synchrotron Radiation Facility, Diamond Light Source, and Brookhaven National Laboratory, and established workflows interoperable with repositories such as Zenodo and Figshare. Governance evolved with advisory input from stakeholders at universities, professional societies such as the Royal Society of Chemistry, and funding agencies including the European Commission and the National Science Foundation.
Coverage spans small-molecule organic crystals, inorganic frameworks, metal–organic frameworks (MOFs), minerals, and pharmaceutical polymorphs, drawing on publications from journals such as Acta Crystallographica, Journal of the American Chemical Society, Angewandte Chemie International Edition, Nature Materials, and Science. Each entry includes unit cell parameters, a space group assignment that follows the International Tables for Crystallography, atomic coordinates, and site occupancies, all recorded in conventions compatible with SHELX, OLEX2, and CRYSTALS. Metadata links to authors registered with ORCID, institutions indexed by GRID, and funding acknowledgments referencing programs such as Horizon 2020 and the U.S. Department of Energy. The collection supports searches by chemical composition, by topology (referencing the Reticular Chemistry Structure Resource), and by experimental provenance, for example beamlines at the Swiss Light Source and Argonne National Laboratory.
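To illustrate how the unit cell parameters in an entry can be used, the following is a minimal sketch that reads the six cell tags from a CIF fragment and computes the cell volume with the standard triclinic formula. The tag names come from the CIF core dictionary; the sample text, the function names, and the line-by-line parser are assumptions made for this example and do not handle the full CIF syntax (loops, multi-line values, quoting).

```python
import math
import re

# Hypothetical CIF fragment for illustration only; it is not copied from a database entry.
SAMPLE_CIF = """\
data_example
_cell_length_a     5.4307(2)
_cell_length_b     5.4307(2)
_cell_length_c     5.4307(2)
_cell_angle_alpha  90
_cell_angle_beta   90
_cell_angle_gamma  90
_symmetry_space_group_name_H-M 'F d -3 m'
"""

CELL_TAGS = (
    "_cell_length_a", "_cell_length_b", "_cell_length_c",
    "_cell_angle_alpha", "_cell_angle_beta", "_cell_angle_gamma",
)


def strip_uncertainty(value):
    """Convert a CIF number such as '5.4307(2)' to a float, dropping the standard uncertainty."""
    return float(re.sub(r"\(\d+\)$", "", value))


def read_cell(cif_text):
    """Collect the six cell parameters from simple 'tag value' lines."""
    cell = {}
    for line in cif_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] in CELL_TAGS:
            cell[parts[0]] = strip_uncertainty(parts[1].strip())
    missing = [tag for tag in CELL_TAGS if tag not in cell]
    if missing:
        raise ValueError("missing cell tags: " + ", ".join(missing))
    return cell


def cell_volume(cell):
    """Volume in cubic angstroms: V = abc * sqrt(1 - cos^2(a) - cos^2(b) - cos^2(g) + 2 cos(a) cos(b) cos(g))."""
    a, b, c = (cell[tag] for tag in CELL_TAGS[:3])
    ca, cb, cg = (math.cos(math.radians(cell[tag])) for tag in CELL_TAGS[3:])
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


if __name__ == "__main__":
    print(round(cell_volume(read_cell(SAMPLE_CIF)), 3))
```

For production use, a dedicated CIF parser is a safer choice than this line-based reader, since real files rely on loops and quoted strings that the sketch ignores.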
Data are distributed in machine-readable formats: primarily CIF (Crystallographic Information File), the standard maintained by the International Union of Crystallography, along with derived JSON and XML exports for integration with tools such as VESTA, Mercury (CCDC), and PyMOL. Programmatic access is available through RESTful APIs that can be called from Jupyter, MATLAB, and RStudio workflows, and command-line utilities support batch retrieval for high-throughput screening of the kind carried out by groups at Harvard University and the California Institute of Technology. Visualization and analysis pipelines connect with simulation codes including VASP, Quantum ESPRESSO, and LAMMPS for density functional theory and molecular dynamics studies.
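Batch retrieval of CIF files might be scripted along the following lines. This is a sketch under stated assumptions: the base URL, the `<id>.cif` path pattern, and the seven-digit identifier format are assumptions that should be checked against the database's own documentation, and the helper names `cif_url`, `fetch_cif`, and `batch_urls` are invented for this example.

```python
import re
import urllib.request

# Assumed base URL and path pattern; verify against the database documentation.
COD_BASE_URL = "https://www.crystallography.net/cod"


def cif_url(cod_id):
    """Build the download URL for one entry, assuming seven-digit numeric identifiers."""
    cod_id = str(cod_id)
    if not re.fullmatch(r"\d{7}", cod_id):
        raise ValueError("expected a seven-digit identifier, got %r" % cod_id)
    return "%s/%s.cif" % (COD_BASE_URL, cod_id)


def batch_urls(cod_ids):
    """Build URLs for many entries, failing early on any malformed identifier."""
    return [cif_url(cod_id) for cod_id in cod_ids]


def fetch_cif(cod_id, timeout=30.0):
    """Download one CIF as text. Performs a network request when called."""
    with urllib.request.urlopen(cif_url(cod_id), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    for url in batch_urls([1000000, 1000001]):
        print(url)
```

Validating identifiers before any request is made keeps a long batch from failing halfway through on a typo. A real high-throughput job would also add rate limiting and retries, which are omitted here.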
Curation combines automated validation routines with manual review by experts from institutions such as University College London, the University of Tokyo, and ETH Zurich. Automated checks compare each reported space group against the symmetry operations listed by the Bilbao Crystallographic Server and validate chemical connectivity against reference chemistry from PubChem and heuristics derived from the Cambridge Structural Database. Source records originate from deposits accompanying articles in journals such as CrystEngComm and from institutional submissions by centers such as the Paul Scherrer Institute and the National Institute of Standards and Technology. Errata and revised coordinates are tracked via CrossMark-like identifiers and linked to revision histories managed with provenance tools used by DataCite.
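One simple kind of automated check can be sketched as follows: testing whether the reported cell parameters satisfy the metric constraints of the stated crystal system. This is an illustrative example only, not the repository's actual validation pipeline. The function name, the tolerance, and the restriction to six crystal systems (with monoclinic assumed to use unique axis b) are choices made for this sketch.

```python
import math

# Metric constraints per crystal system. Monoclinic assumes unique axis b.
# Trigonal is omitted because it depends on the choice of hexagonal or rhombohedral axes.
REQUIRED = {
    "cubic": ("a = b", "b = c", "alpha = 90", "beta = 90", "gamma = 90"),
    "tetragonal": ("a = b", "alpha = 90", "beta = 90", "gamma = 90"),
    "orthorhombic": ("alpha = 90", "beta = 90", "gamma = 90"),
    "hexagonal": ("a = b", "alpha = 90", "beta = 90", "gamma = 120"),
    "monoclinic": ("alpha = 90", "gamma = 90"),
    "triclinic": (),
}


def lattice_violations(system, a, b, c, alpha, beta, gamma, tol=1e-3):
    """Return the constraints of `system` that the cell fails; an empty list means consistent."""
    key = system.lower()
    if key not in REQUIRED:
        raise ValueError("unsupported crystal system: %r" % system)

    def eq(x, y):
        return math.isclose(x, y, abs_tol=tol)

    checks = {
        "a = b": eq(a, b),
        "b = c": eq(b, c),
        "alpha = 90": eq(alpha, 90.0),
        "beta = 90": eq(beta, 90.0),
        "gamma = 90": eq(gamma, 90.0),
        "gamma = 120": eq(gamma, 120.0),
    }
    return [name for name in REQUIRED[key] if not checks[name]]


if __name__ == "__main__":
    # A cell reported as cubic whose c axis differs from a and b.
    print(lattice_violations("cubic", 5.43, 5.43, 6.00, 90, 90, 90))
```

The check is necessary but not sufficient: a cell can satisfy the metric constraints of a higher-symmetry system by coincidence, so a passing result does not confirm the space group, which still has to be tested against the symmetry operations themselves.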
The repository adopts open licensing practices compatible with the open-data mandates of funders such as the European Research Council and the Wellcome Trust, and supports redistribution under licenses analogous to Creative Commons terms where permitted by publishers such as Elsevier, Springer Nature, and Wiley. Depositors are encouraged to assert rights and provenance, and the metadata records publisher embargoes or copyright statements from outlets including the Royal Society of Chemistry and the American Chemical Society. The project liaises with institutional repositories at the University of California and with consortia such as SPARC to align policies on long-term preservation and permissible reuse.
Governance relies on an international advisory board with representatives from universities, national laboratories, and societies such as the International Union of Crystallography and the European Crystallographic Association. Community activities include workshops at conferences such as American Crystallographic Association meetings, tutorials at Gordon Research Conferences, and collaborative grants with initiatives such as the Materials Genome Initiative. Partnerships extend to software developers at the Cambridge Crystallographic Data Centre and to infrastructure organizations including OpenAIRE, with the aim of fostering interoperability, reproducibility, and broader adoption among materials and chemical research communities.