| LISE3 | |
|---|---|
| Name | LISE3 |
| Developer | CERN / European Organization for Nuclear Research collaboration |
| Released | 2024 |
| Latest release | 3.0 |
| Programming language | C++ / Python / Rust |
| Operating system | Linux / Windows / macOS |
| License | MIT License / Apache License |
LISE3 is an advanced experimental system for large-scale data synthesis and inference developed by a consortium including CERN, the Massachusetts Institute of Technology, Stanford University, the University of Oxford, and industrial partners such as Google LLC, Microsoft, and NVIDIA. It integrates techniques from projects at OpenAI, DeepMind, IBM Research, Facebook AI Research, and national laboratories such as Los Alamos National Laboratory to address high-throughput analysis and generative modeling in scientific workflows. LISE3 sits at the intersection of large research initiatives such as the Large Hadron Collider, the Human Genome Project, and the Square Kilometre Array, and of multidisciplinary testbeds hosted at institutions such as Argonne National Laboratory and Lawrence Berkeley National Laboratory.
LISE3 combines scalable Hadoop-style storage patterns with model orchestration approaches seen in TensorFlow and PyTorch. The platform emphasizes modularity inspired by Kubernetes orchestration, reproducibility practices rooted in ReproZip and Docker, and collaborative governance models comparable to those of the W3C and the Linux Foundation. LISE3 targets domains ranging from analyses performed for the LIGO Scientific Collaboration to simulation-driven investigations by European Space Agency and National Aeronautics and Space Administration teams.
LISE3 originated from a research proposal co-sponsored by European Commission programs and National Science Foundation grants, and evolved through partnerships with laboratories such as CERN and Fermilab and with university groups at the University of California, Berkeley and Caltech. Early prototypes borrowed algorithms and evaluation pipelines from projects such as ImageNet and BERT and applied system-design lessons from Apache Spark and Dask. Milestones include demonstrations at conferences such as NeurIPS, ICML, CVPR, and AAAI, collaborations with consortia such as the Human Cell Atlas, and interoperability tests during workshops organized by IEEE and ACM.
LISE3 provides high-throughput data ingestion compatible with Squid-style proxies and with common storage backends used at European Grid Infrastructure sites. It supports model families influenced by Transformer and convolutional neural network architectures and by probabilistic frameworks such as Stan. Built-in connectors align with data services used by GenBank, the SIMBAD Astronomical Database, and Zenodo; visualization interfaces borrow conventions from Matplotlib and ParaView. The platform offers federated workflows that interoperate with compute schedulers such as the Slurm Workload Manager and HTCondor, and it integrates monitoring solutions inspired by Prometheus and Grafana.
The core of LISE3 uses a microservice topology orchestrated via Kubernetes, with storage layers employing patterns from Ceph and HDFS. Model serving components are informed by designs used in TensorFlow Serving and TorchServe, while pipeline composition borrows from Apache Airflow and Prefect. Hardware acceleration relies on NVIDIA GPUs and Intel accelerators together with their associated libraries, with experiments conducted on clusters at Oak Ridge National Laboratory and in cloud environments such as Amazon Web Services, Google Cloud Platform, and Microsoft Azure. Security integration follows practices aligned with standards published by ISO/IEC JTC 1/SC 27 and with identity federations such as InCommon.
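The Airflow- and Prefect-style pipeline composition mentioned above amounts to declaring stages as a directed acyclic graph and running each stage only after its dependencies. The following is a minimal sketch of that idea using only the Python standard library; the stage names, the `PIPELINE` mapping, and the `run` helper are hypothetical illustrations and are not taken from LISE3's actual API, which this article does not describe.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical stage graph: each key maps to the set of stages it depends on.
# These names are illustrative assumptions, not LISE3 identifiers.
PIPELINE = {
    "ingest": set(),
    "preprocess": {"ingest"},
    "train": {"preprocess"},
    "evaluate": {"train"},
    "serve": {"train", "evaluate"},
}


def execution_order(dag):
    """Return one valid order in which the stages can run."""
    return list(TopologicalSorter(dag).static_order())


def run(dag, handlers):
    """Run every stage after its dependencies, passing upstream results along."""
    results = {}
    for stage in execution_order(dag):
        upstream = {dep: results[dep] for dep in dag[stage]}
        results[stage] = handlers[stage](upstream)
    return results


if __name__ == "__main__":
    # Trivial handlers that only record which upstream stages they received.
    handlers = {name: (lambda up, n=name: f"{n}({sorted(up)})") for name in PIPELINE}
    for stage, output in run(PIPELINE, handlers).items():
        print(stage, "->", output)
```

A real orchestrator would add retries, scheduling, and distributed execution on top of this ordering step; the sketch only shows the dependency resolution that such tools share.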
LISE3 has been piloted in particle physics analyses tied to experiments such as ATLAS and CMS, used in genomics pipelines related to ENCODE and the 1000 Genomes Project, and applied in radio astronomy collaborations connected to LOFAR and MeerKAT. It supports environmental modeling initiatives associated with Intergovernmental Panel on Climate Change efforts and remote sensing projects coordinated with European Space Agency missions such as those of the Copernicus Programme. Deployments span academic clusters at the University of Cambridge and industrial research labs at Bell Labs and Siemens AG.
Benchmarks for LISE3 use workloads derived from datasets such as ImageNet and Common Crawl and from domain-specific corpora such as the Protein Data Bank and the Sloan Digital Sky Survey. Evaluation protocols reflect standards used in papers presented at NeurIPS and ICLR, comparing throughput and accuracy metrics against baselines from BERT, ResNet, and hybrid probabilistic systems originating from Stanford University and UC Berkeley. Scalability tests have been conducted on platforms including Fermilab clusters and commercial clouds, demonstrating improvements in end-to-end pipeline latency relative to configurations using Apache Spark alone.
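The throughput and end-to-end latency comparisons described above can be expressed with a few simple formulas. The sketch below is an assumption about how such metrics might be computed, not the benchmark harness actually used for LISE3; the function names and the choice of the median as the summary statistic are illustrative.

```python
import time
from statistics import median


def measure_latency(pipeline, batch, repeats=5):
    """Median wall-clock seconds for one end-to-end call of pipeline(batch)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        pipeline(batch)
        timings.append(time.perf_counter() - start)
    return median(timings)


def throughput(n_items, seconds):
    """Items processed per second."""
    return n_items / seconds


def relative_improvement(baseline_s, candidate_s):
    """Fraction by which the candidate latency is lower than the baseline."""
    return (baseline_s - candidate_s) / baseline_s


if __name__ == "__main__":
    batch = list(range(100_000))
    # Two stand-in callables; a real comparison would wrap the two pipelines.
    baseline = measure_latency(lambda b: sorted(b, reverse=True), batch)
    candidate = measure_latency(lambda b: sum(b), batch)
    print(f"baseline throughput: {throughput(len(batch), baseline):.0f} items/s")
    print(f"latency improvement: {relative_improvement(baseline, candidate):.1%}")
```

Using the median over several repeats reduces the influence of one-off stalls; a negative value from `relative_improvement` means the candidate was slower than the baseline.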
LISE3 incorporates access controls interoperable with identity providers such as ORCID, eduGAIN, and OAuth 2.0 federations employed by institutions such as MIT and Stanford University. Data protection practices mirror compliance patterns observed in General Data Protection Regulation implementations for research consortia and align with audit frameworks referenced by the National Institute of Standards and Technology. Privacy-preserving modules follow approaches from federated learning studies by groups at Google Research and OpenMined, while threat modeling draws on advisories issued by the CERT Coordination Center and on incident response playbooks used by the European Network and Information Security Agency.