Neurodata Without Borders

Name	Neurodata Without Borders
Formation	2015
Type	Nonprofit consortium
Region served	Global

Contents

Overview
History and development
Data model and standards
Software and tools
Adoption and impact
Governance and community
Challenges and future directions

Neurodata Without Borders is an initiative to create standardized formats and metadata conventions for neurophysiology data, with the aim of improving sharing, reproducibility, and analysis across laboratories and platforms. Founded to harmonize disparate recording modalities and experimental metadata, it connects software ecosystems and repository efforts so that research groups, funding agencies, and archives can exchange data in a common form.

Overview

Neurodata Without Borders provides a specification for storing electrophysiology, imaging, and behavioral datasets. The specification interfaces with projects and archives associated with the Allen Institute for Brain Science, the Human Connectome Project, the BRAIN Initiative, the European Research Council, and the National Institutes of Health. Its schema gives data producers, including laboratories at the Massachusetts Institute of Technology, Stanford University, Harvard University, Cold Spring Harbor Laboratory, and University College London, a common structure for depositing data in repositories such as Dryad, Zenodo, and OpenNeuro, and for supplying data to organizers of Kaggle-style challenges. It is used alongside Python, MATLAB, R, the HDF5 file format, and cloud providers such as Amazon Web Services and Google Cloud Platform to support deposition workflows and metadata extraction.

History and development

The initiative emerged from community discussions at meetings of the Society for Neuroscience, the Center for Open Science (COS), and the German Neuroinformatics Node (G-Node), and at workshops linked to Neuroinformatics conferences. Early contributors included researchers affiliated with Cornell University, Princeton University, the University of California, San Diego, and the University of Pennsylvania, who collaborated with infrastructure groups at Lawrence Berkeley National Laboratory and the National Institute of Mental Health. Funding and endorsement came from programs tied to National Science Foundation awards, collaborative grants from the Wellcome Trust, and partnerships with consortia such as the International Neuroinformatics Coordination Facility (INCF). Versioned releases were coordinated with community feedback cycles at venues such as hackathons held alongside Neural Information Processing Systems and the International Conference on Machine Learning.

Data model and standards

The specification defines a hierarchical layout built on the HDF5 format maintained by The HDF Group, with conventions inspired by projects such as the Brain Imaging Data Structure and Dublin Core. It prescribes metadata fields that capture the provenance information common to datasets curated at institutions such as the Salk Institute, the Max Planck Society, Kyoto University, and Karolinska Institutet. Schema elements reference ontologies and identifier systems, including those of the Gene Ontology Consortium, ORCID, the Digital Object Identifier (DOI) system, and the Neuroscience Information Framework Standard Ontology (NIFSTD). The standard supports modalities used in studies by Howard Hughes Medical Institute investigators, accommodating extracellular recordings, intracellular traces, calcium imaging, and behavioral annotations produced in labs at Johns Hopkins University and the University of California, Berkeley.
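The idea of a hierarchical layout with prescribed metadata can be illustrated with a minimal Python sketch. It assumes a plain nested dictionary as a stand-in for HDF5 groups and datasets; the group names, field names, and the list of required fields are hypothetical and are not taken from the actual specification.

```python
# Hypothetical required fields; the real specification defines its own set.
REQUIRED_METADATA = ("session_description", "session_start_time", "experimenter_orcid")


def iter_paths(node, prefix=""):
    """Yield (path, value) for every leaf in a nested-dict hierarchy."""
    for name, child in node.items():
        path = f"{prefix}/{name}"
        if isinstance(child, dict):
            # A dict plays the role of a group: descend into it.
            yield from iter_paths(child, path)
        else:
            # Anything else plays the role of a dataset or attribute.
            yield path, child


def missing_metadata(root):
    """Return required fields absent from the top-level 'general' group."""
    general = root.get("general", {})
    return [field for field in REQUIRED_METADATA if field not in general]


# Illustrative file content covering several modalities named in the text.
example_file = {
    "general": {
        "session_description": "head-fixed mouse, visual stimulation",
        "session_start_time": "2020-01-01T09:00:00+00:00",
    },
    "acquisition": {
        "extracellular": {"data": [0.1, 0.2, 0.15], "rate_hz": 30000.0},
        "calcium_imaging": {"data": [[1, 2], [3, 4]], "rate_hz": 30.0},
    },
    "behavior": {"lick_times_s": [1.2, 3.4]},
}

paths = [path for path, _ in iter_paths(example_file)]
missing = missing_metadata(example_file)

for path in paths:
    print(path)
print("missing metadata:", missing)
```

Addressing every value by a slash-separated path mirrors how HDF5 addresses objects within a file, and a check such as `missing_metadata` shows how a schema can make provenance fields mandatory rather than optional.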

Software and tools

A software ecosystem implements the specification through libraries and converters hosted on GitHub, which interoperate with tools such as SpikeInterface, Suite2p, CaImAn, Neo, and MNE. Bindings exist for Python, MATLAB, and Julia, enabling analysis pipelines to be deployed with Docker and Kubernetes and tested with continuous integration services of the kind used by projects at Red Hat and Canonical. Community-developed viewers and validators integrate with repositories such as Figshare and with archive services at the European Bioinformatics Institute, supporting deposition workflows compatible with mandates from funders such as the Bill & Melinda Gates Foundation and the Chan Zuckerberg Initiative.
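As a rough illustration of what a converter does, the following Python sketch reads a lab-specific table of spike times and regroups it into a per-unit structure. The CSV column names (`unit_id`, `spike_time_s`) and the function `spikes_csv_to_units` are hypothetical; real converters for this standard target its own schema and file format.

```python
import csv
import io
from collections import defaultdict


def spikes_csv_to_units(csv_text):
    """Group (unit_id, spike_time_s) rows into sorted spike-time lists per unit."""
    units = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # Convert the textual timestamp to seconds as a float.
        units[row["unit_id"]].append(float(row["spike_time_s"]))
    # Sort units and spike times so the output is deterministic.
    return {unit: sorted(times) for unit, times in sorted(units.items())}


# Illustrative input: rows arrive in arbitrary order, as they might in a raw export.
raw = "unit_id,spike_time_s\nu2,0.50\nu1,0.10\nu1,0.05\n"
converted = spikes_csv_to_units(raw)
print(converted)
```

Keeping the conversion in a single pure function makes it easy to test against small inputs before the result is handed to a writer for the target format.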

Adoption and impact

Adoption spans academic laboratories at Columbia University, the University of Oxford, the University of Toronto, and the University of Washington, as well as industry research groups at Google DeepMind, Meta Platforms, IBM Research, and Microsoft Research. The standard has been cited in consortium studies associated with the Alzheimer's Disease Neuroimaging Initiative, translational projects at Novartis, and multi-site collaborations funded by European Commission programs. Its influence is evident in data publication workflows at journals such as Nature Neuroscience, Neuron, and PLOS Biology, and in teaching materials from summer schools hosted by Cold Spring Harbor Laboratory and EMBL.

Governance and community

Governance follows a community-driven model, with steering contributors drawn from universities and research institutes including Yale University, Duke University, Imperial College London, and EPFL. Working groups coordinate specification updates in public forums on platforms such as GitHub, community mailing lists, and Slack. Oversight is informed by expert panels drawn from Society for Neuroscience committees, funders such as the National Institute of Neurological Disorders and Stroke, and standards organizations including W3C-adjacent initiatives. Outreach includes tutorials at conferences such as COSYNE and collaborative code sprints with groups such as OpenAI research partners.

Challenges and future directions

Key challenges include accommodating the high-volume modalities used at centers such as Janelia Research Campus and handling the privacy and consent constraints enforced by institutional review boards at the Mayo Clinic and Cleveland Clinic. Interoperability with archives such as DANDI and harmonization with metadata frameworks endorsed by advocates of the FAIR principles remain priorities. Future directions emphasize scalability for cloud-native analysis with partners such as Databricks, expanded ontology integration with NCBI, and enhanced provenance tracking to meet requirements from journals such as Science and funders including the Wellcome Trust and the European Research Council.
