LLMpedia: The first transparent, open encyclopedia generated by LLMs

International Lattice Data Grid

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Article Genealogy
Parent: GridFTP Hop 5
Expansion Funnel: Extracted 89 → After dedup 0 → After NER 0 → Enqueued 0
International Lattice Data Grid
Name: International Lattice Data Grid
Formation: 2000s
Type: Research infrastructure
Region served: International


The International Lattice Data Grid was an international research infrastructure initiative created to coordinate and share large-scale computational datasets among lattice quantum chromodynamics collaborations and associated high-energy physics communities. It connected research centers, supercomputing facilities, and academic institutions to enable distributed storage, metadata curation, and reproducible analysis, drawing participation from projects associated with CERN, Brookhaven National Laboratory, Fermilab, KEK, and national laboratories across Europe, Asia, and the Americas.

Overview

The project provided a federated platform for sharing gauge configuration ensembles and analysis outputs produced by collaborations such as the Hadron Spectrum Collaboration, the RBC-UKQCD Collaboration, the MILC Collaboration, the ETM Collaboration, and CLS, while interfacing with infrastructure at the National Energy Research Scientific Computing Center (NERSC), Jülich Research Centre, RIKEN, and Oak Ridge National Laboratory. It emphasized interoperability with standards from the World Wide Web Consortium and the International Organization for Standardization, and with scientific data frameworks developed by groups including HEPData, Zenodo, and the Open Science Grid, to promote reuse among researchers affiliated with Princeton University, the University of Cambridge, the University of Edinburgh, the Massachusetts Institute of Technology, and the University of Tokyo.

History and Development

Origins trace to early-2000s meetings among lattice groups at conferences such as the International Conference on High Energy Physics and the International Symposium on Lattice Field Theory, and at workshops hosted by CERN and DESY. Early adopters included teams at Brookhaven National Laboratory and Fermilab, with architecture discussions informed by projects such as the European Grid Infrastructure and the Open Grid Forum. Funding and coordination involved agencies such as the National Science Foundation, the European Commission, and the Japan Society for the Promotion of Science, together with national ministries connected to Lawrence Berkeley National Laboratory and TRIUMF.

The initiative evolved through milestones comparable to the deployment phases of EGEE and to the development of software stacks such as the Globus Toolkit and GridFTP, with community governance shaped by steering committees that included representatives from IHEP, INFN, Swansea University, and the University of Southampton.

Architecture and Infrastructure

The architecture combined distributed storage nodes, metadata catalogs, and authentication and authorization mechanisms interoperable with services such as Kerberos, OAuth, and certificate authorities similar to those used by the European Middleware Initiative. Compute-level integration targeted supercomputing centers such as Argonne National Laboratory and the Leibniz Supercomputing Centre, and systems such as TSUBAME, with workflow orchestration inspired by systems such as HTCondor and SLURM. Data transfer employed high-throughput pathways over research networks including GÉANT, Internet2, and CANARIE to move large lattice ensembles between centers such as RAL and CINECA.
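As a rough sketch of the federated pattern described above, the snippet below models a metadata catalog that maps logical ensemble names to replicas held on distributed storage nodes, with a client preferring a local replica before initiating a transfer. All names, URLs, and the catalog API here are illustrative assumptions, not the actual ILDG middleware.

```python
# Hypothetical sketch: replica resolution in a federated data grid.
# A central metadata catalog maps logical file names (LFNs) to replicas
# on member storage nodes; a client picks a replica and transfers it.

from dataclasses import dataclass

@dataclass
class Replica:
    site: str   # storage node, e.g. a national-lab endpoint (illustrative)
    url: str    # transfer URL (gridftp://... in a GridFTP-era stack)

# Metadata catalog: logical file name -> replicas at member sites
CATALOG = {
    "ensemble/beta6.0/cfg_0001": [
        Replica("site-A", "gridftp://site-a.example/lat/cfg_0001"),
        Replica("site-B", "gridftp://site-b.example/lat/cfg_0001"),
    ],
}

def resolve(lfn: str) -> list[Replica]:
    """Look up all replicas of a logical file name in the catalog."""
    return CATALOG.get(lfn, [])

def choose_replica(replicas: list[Replica], preferred_site: str) -> Replica:
    """Prefer a replica at the requesting site; otherwise take the first."""
    for r in replicas:
        if r.site == preferred_site:
            return r
    return replicas[0]

replicas = resolve("ensemble/beta6.0/cfg_0001")
best = choose_replica(replicas, preferred_site="site-B")
print(best.url)
```

In a real deployment both the catalog lookup and the transfer would be remote service calls over the grid middleware; the sketch only captures the replica-selection logic.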

Metadata schemas drew on community standards and vocabularies overlapping with efforts at INSPIRE-HEP and registries at DOE Office of Science facilities, while persistent identifiers were coordinated in ways compatible with Digital Object Identifier (DOI) practices used by the American Physical Society and Institute of Physics Publishing.
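The metadata practices described above can be illustrated with a minimal Dublin Core-style record for a gauge ensemble. The Dublin Core element names and namespace are standard; the ensemble, collaboration name, and DOI below are placeholders, not real identifiers.

```python
# Sketch: a Dublin Core-style metadata record for a gauge ensemble,
# serialized as XML. Values are illustrative placeholders.

import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # standard Dublin Core namespace

def make_record(fields: dict) -> ET.Element:
    """Build a <record> element with one Dublin Core child per field."""
    root = ET.Element("record")
    for name, value in fields.items():
        el = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        el.text = value
    return root

record = make_record({
    "title": "Nf=2+1 gauge ensemble, beta=6.0, 32^3x64",  # illustrative
    "creator": "Example Lattice Collaboration",           # illustrative
    "type": "Dataset",
    "identifier": "doi:10.xxxx/example",                  # placeholder DOI
    "date": "2008-01-01",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

A production schema would add lattice-specific fields (action, lattice spacing, quark masses) on top of the generic Dublin Core elements shown here.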

Data Policies and Standards

The initiative promoted open-data principles aligned with policies from organizations such as the European Research Council, the Wellcome Trust, and the John Templeton Foundation where applicable, balancing openness with embargo models used by collaborations such as the RBC-UKQCD Collaboration and the MILC Collaboration. Citation and attribution guidelines referenced norms endorsed by the American Physical Society, IOP Publishing, and research frameworks of the Council of the European Union, and recommended metadata elements compatible with Dublin Core and standards advocated by the World Data System.

Access control procedures mirrored practices at CERN's experimental collaborations, with user authentication models influenced by federated-identity initiatives such as eduGAIN and authorization policies consistent with institutional review mechanisms at universities such as Harvard University and Oxford University.
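A minimal sketch of the combined embargo-and-membership model suggested above: a dataset becomes world-readable once its embargo lapses, and before then only members of the producing virtual organization (VO) may read it. The VO name, dataset id, dates, and policy logic are illustrative assumptions, not actual ILDG policy.

```python
# Sketch: embargo-aware access check based on VO membership.
# All identifiers and dates are illustrative.

from datetime import date

EMBARGOES = {
    # dataset id -> (producing VO, embargo end date)
    "ensemble/beta6.0": ("vo-example-lattice", date(2010, 1, 1)),
}

def can_read(dataset: str, user_vos: set, today: date) -> bool:
    """Allow access after the embargo, or to producing-VO members before it."""
    vo, embargo_end = EMBARGOES[dataset]
    if today >= embargo_end:      # embargo lapsed: open access
        return True
    return vo in user_vos         # still embargoed: members only

# A member of the producing VO can read during the embargo:
print(can_read("ensemble/beta6.0", {"vo-example-lattice"}, date(2009, 6, 1)))
```

In practice the VO membership would come from a certificate or federated-identity attribute rather than a plain set, but the policy decision reduces to the same check.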

Major Collaborations and Projects

Key participants included established lattice groups: MILC Collaboration, RBC-UKQCD Collaboration, Hadron Spectrum Collaboration, ALPHA Collaboration, ETM Collaboration, and national consortia in Europe, North America, and Asia. Linked projects included repository integrations with HEPData, data publication pilots with Zenodo, and interoperability tests with the Open Science Grid and European Grid Infrastructure. Workshops and code sprints involved institutions such as CERN, Brookhaven National Laboratory, KEK, TRIUMF, and universities including University of Maryland and University of Barcelona.

Impact and Applications

By enabling shared access to gauge configurations and propagator datasets, the grid accelerated calculations of hadron spectra, weak matrix elements, and quark-gluon plasma observables used in studies associated with Large Hadron Collider experiments and theoretical efforts tied to Quantum Chromodynamics phenomenology. Results produced via resources coordinated through the platform informed analyses at CMS, ATLAS, and heavy-ion studies by ALICE, and supported theoretical collaborations including work linked to Institute for Advanced Study researchers and faculty at Princeton University and Yale University.

The infrastructure reduced duplication of computing effort among groups at University of Tokyo, Tata Institute of Fundamental Research, and Peking University, and enabled reproducibility practices encouraged by funders like the National Institutes of Health and European Research Council.

Challenges and Future Directions

Challenges included sustaining funding models across agencies such as the National Science Foundation and the European Commission, ensuring long-term preservation with institutions like Portico, and coordinating governance among partners including CERN, the DOE, and national laboratories. Technical hurdles involved evolving standards for metadata interoperability, high-performance data transfer to centers such as NERSC and Oak Ridge National Laboratory, and integration with cloud providers relevant to research programs at Amazon Web Services and collaborations with Google Cloud Platform.

Future directions emphasized tighter integration with open-science infrastructures such as Scholarly Communication Alliance, expanded adoption of persistent identifiers via DataCite, and deeper collaboration with experimental programs at CERN and computing centers including Jülich Research Centre to support exascale-era lattice QCD calculations and cross-disciplinary reuse by astrophysics and nuclear physics communities.

Category:Scientific databases