ADHD-200 — LLMpedia

ADHD-200
Name	ADHD-200
Release year	2011
Domain	Neuroimaging, Psychiatry
Modalities	Resting-state fMRI, Structural MRI, Phenotypic data
Contributors	International ADHD-200 Consortium
Access	Publicly shared for research

Contents

Background and objectives
Dataset composition and acquisition
Preprocessing and quality control
Benchmarking and competitions
Applications and impact in research
Limitations and criticisms

The ADHD-200 initiative released a publicly shared multicenter neuroimaging dataset assembled to accelerate computational studies of attention-deficit/hyperactivity disorder (ADHD). It provided resting-state functional MRI, structural MRI, and standardized phenotypic measures collected across multiple sites to support benchmarking of machine learning, connectomics, and biomarker discovery methods. The project brought together investigators from academic institutions, clinical centers, and consortia to foster open science and reproducible methods in psychiatric neuroimaging.

Background and objectives

The initiative originated from collaborations among investigators affiliated with institutions such as Harvard University, Massachusetts General Hospital, the University of Michigan, Johns Hopkins University, and McGill University. It was motivated by earlier large-scale projects, including the Human Connectome Project, Enhancing Neuroimaging Genetics through Meta-Analysis (ENIGMA), and the Alzheimer's Disease Neuroimaging Initiative (ADNI). Its primary objectives were to provide a standardized resource for algorithm development; to compare analytic pipelines used by teams at Stanford University, Princeton University, University College London, Yale University, and the University of Oxford; and to stimulate participation from groups involved with the National Institutes of Health, the Wellcome Trust, and regional neuroinformatics centers. Further goals included fostering reproducible classification studies, enabling connectome-based predictive modeling, and creating a benchmark analogous to community challenges such as ImageNet and Kaggle competitions.

Dataset composition and acquisition

Data originated from eight acquisition sites: Peking University, Brown University (Bradley Hospital), the Kennedy Krieger Institute, the NeuroIMAGE sample, the New York University Child Study Center, Oregon Health & Science University, the University of Pittsburgh, and Washington University in St. Louis. Modalities comprised T1-weighted structural MRI and resting-state blood-oxygen-level-dependent (BOLD) functional MRI collected on scanners from manufacturers such as Siemens, GE Healthcare, and Philips. Phenotypic variables included age, sex, handedness, medication status, and clinical ratings from instruments linked to programs at the National Institute of Mental Health, clinical centers such as Boston Children's Hospital, and research groups at Columbia University and the University of Toronto. The cohort pooled data from subjects diagnosed at clinics affiliated with Mayo Clinic, Cleveland Clinic, and the University of California, Los Angeles, together with community samples recruited by groups tied to Boston Children's Hospital and regional research networks.

Preprocessing and quality control

Centralized preprocessing pipelines drew on software from projects at McGill University, on toolboxes such as FSL, AFNI, and SPM, and on libraries developed at the Laboratory for Computational Neuroimaging. Quality control procedures incorporated motion assessment, temporal signal-to-noise evaluation, and anatomical inspection, following practices adopted in studies from the Stanford School of Medicine and the University of Cambridge. Data curation teams used standards influenced by initiatives at OpenfMRI and infrastructure supported by Amazon Web Services collaborations with academic nodes such as the University of California, San Diego. Preprocessing choices (slice-timing correction, spatial normalization to templates used in Montreal Neurological Institute studies, and nuisance regression strategies) reflected methods debated in workshops at the Society for Neuroscience and at conferences hosted by the Organization for Human Brain Mapping.
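The two quantitative checks named above, temporal signal-to-noise and motion assessment, can be illustrated with a minimal sketch. This is not the consortium's pipeline: the function names, the 50 mm head radius, and the use of framewise displacement as the motion summary are assumptions chosen for illustration, and the article does not state which metrics or thresholds were actually applied.

```python
import math
import statistics


def temporal_snr(timeseries):
    """Temporal SNR of one voxel: mean over time divided by the
    (population) standard deviation over time."""
    mean = statistics.fmean(timeseries)
    sd = statistics.pstdev(timeseries)
    return math.inf if sd == 0 else mean / sd


def framewise_displacement(motion_params, head_radius_mm=50.0):
    """Framewise displacement for each volume.

    motion_params: one (tx, ty, tz, rx, ry, rz) tuple per volume,
    translations in mm and rotations in radians (assumed units).
    Rotations are converted to arc length on a sphere of radius
    head_radius_mm (an assumed, commonly used value).
    The first volume has no predecessor, so its value is 0.0.
    """
    fd = [0.0]
    for prev, curr in zip(motion_params, motion_params[1:]):
        translation = sum(abs(c - p) for p, c in zip(prev[:3], curr[:3]))
        rotation = sum(abs(c - p) * head_radius_mm
                       for p, c in zip(prev[3:], curr[3:]))
        fd.append(translation + rotation)
    return fd
```

In practice such values would be compared against a study-specific cutoff to flag noisy voxels or high-motion volumes; any cutoff is a design choice rather than something specified here.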

Benchmarking and competitions

The ADHD-200 initiative catalyzed community competitions and benchmarking efforts modeled after events involving ImageNet and hosted through venues connected to NeuroImage special issues. Participating teams from the Massachusetts Institute of Technology, the University of Pennsylvania, Carnegie Mellon University, the University of Toronto, and international groups submitted classification pipelines that were evaluated against held-out samples using metrics familiar to researchers at International Neuroinformatics Coordinating Facility events. Results were discussed at meetings of the Organization for Human Brain Mapping, reported in proceedings associated with NeurIPS and the International Conference on Machine Learning, and compared with approaches from groups affiliated with Google DeepMind and academic centers such as ETH Zurich.
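As a rough illustration of evaluating a classification pipeline against held-out samples, the sketch below computes accuracy, sensitivity, and specificity for binary predictions. The label coding (1 for ADHD, 0 for typically developing) and the function name are assumptions; the article does not specify the scoring rules used in the competition.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels.

    Labels are assumed to be coded 1 (ADHD) and 0 (control).
    A metric is NaN when its denominator would be zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
    }
```

Reporting sensitivity and specificity alongside accuracy matters when the diagnostic groups are unbalanced, because a classifier that always predicts the majority class can still reach a high accuracy.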

Applications and impact in research

Labs at Princeton University, Yale School of Medicine, the University of Cambridge, the University of Oxford, and the University of California, Berkeley used the dataset to develop machine learning classifiers, graph-theory analyses, and predictive models. The ADHD-200 resource supported studies integrating data harmonization approaches from Stanford Medicine teams, evaluations of connectome-based biomarkers pursued at Massachusetts General Hospital, and meta-analytic syntheses related to pediatric psychiatry programs at Columbia University. It influenced subsequent open-data efforts such as the Autism Brain Imaging Data Exchange (ABIDE) and informed methodology in consortia coordinated with the National Institute of Mental Health and funding agencies such as the National Science Foundation and the Medical Research Council.
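A common starting point for the connectome and graph-theory analyses mentioned above is a functional connectivity matrix built from region-of-interest (ROI) time series. The following sketch, assuming Pearson correlation as the connectivity measure and an arbitrary edge threshold of 0.5, shows the idea; the function names and threshold are illustrative and are not taken from any specific ADHD-200 study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def connectivity_matrix(roi_timeseries):
    """Square matrix of pairwise correlations, one row per ROI."""
    k = len(roi_timeseries)
    return [
        [1.0 if i == j else pearson(roi_timeseries[i], roi_timeseries[j])
         for j in range(k)]
        for i in range(k)
    ]


def node_degrees(matrix, threshold=0.5):
    """Number of other ROIs whose absolute correlation with each ROI
    meets or exceeds the (assumed) threshold."""
    return [
        sum(1 for j, r in enumerate(row) if j != i and abs(r) >= threshold)
        for i, row in enumerate(matrix)
    ]
```

The resulting matrix entries can serve as classifier features, while thresholded summaries such as node degree are the kind of graph measure compared between diagnostic groups.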

Limitations and criticisms

Critiques highlighted heterogeneity across acquisition protocols from sites including Peking University and NYU Langone Health, raising concerns similar to debates at the Organization for Human Brain Mapping and in commentaries from researchers at University College London and McGill University. Other limitations noted by teams at Harvard Medical School and Johns Hopkins University included modest phenotypic harmonization, variable motion artifacts (emphasized in work from the University of California, Los Angeles), and limited longitudinal follow-up compared with initiatives such as the Alzheimer's Disease Neuroimaging Initiative. Discussions at venues involving the Society for Neuroscience and publications tied to NeuroImage examined reproducibility and the potential confounds introduced by multi-site pooling.
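One way to probe whether a model has learned site-specific signal rather than diagnosis is leave-one-site-out validation, in which all subjects from one acquisition site are held out in turn. The sketch below is a generic illustration of that splitting scheme; the function name and the example site labels are hypothetical, and the article does not say that any particular study used this procedure.

```python
from collections import defaultdict


def leave_one_site_out(sites):
    """Yield (held_out_site, train_indices, test_indices) for each site.

    sites: one site label per subject, in subject order.
    Sites are visited in sorted order so the splits are deterministic.
    """
    by_site = defaultdict(list)
    for index, site in enumerate(sites):
        by_site[site].append(index)
    for site in sorted(by_site):
        test = by_site[site]
        train = [i for i, s in enumerate(sites) if s != site]
        yield site, train, test


# Example with hypothetical site labels for five subjects.
example_sites = ["Peking", "NYU", "Peking", "NYU", "Pittsburgh"]
example_splits = list(leave_one_site_out(example_sites))
```

A large drop in performance on held-out sites relative to within-site cross-validation would suggest that scanner or protocol differences contribute to the apparent classification accuracy.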

Category:Neuroimaging datasets