LLMpedia: The first transparent, open encyclopedia generated by LLMs

ClinVar

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
ClinVar
Name: ClinVar
Developer: National Center for Biotechnology Information; National Institutes of Health
Released: 2013
URL: ClinVar
Version: ongoing
License: public domain (NCBI)

ClinVar is a public archive that aggregates information about relationships between human genetic variation and reported phenotypes. It is maintained by the National Center for Biotechnology Information (NCBI) within the National Institutes of Health and interfaces with clinical laboratories, research consortia, and regulatory agencies. ClinVar supports translational genomics by sharing submissions from clinical laboratories, research groups, and expert panels, and by cross-referencing resources including dbSNP and RefSeq.

Introduction

ClinVar centralizes assertions about pathogenicity and clinical significance from submitters such as clinical laboratories, academic centers, and consortia including ClinGen. Its records are often interpreted alongside resources such as DECIPHER and population frequency data from the Exome Aggregation Consortium. It interoperates with reference resources such as dbGaP, GenBank, and UniProt and is used by guideline-producing organizations like the American College of Medical Genetics and Genomics and regulatory bodies including the Food and Drug Administration. ClinVar records can cite supporting evidence from the published literature, including journals such as Nature Genetics, The New England Journal of Medicine, and The Lancet.

History and development

ClinVar was launched in 2013 as part of a broader push by the National Institutes of Health to promote data sharing, following initiatives such as the Human Genome Project and policies promoted by the National Academies of Sciences, Engineering, and Medicine. Its development was coordinated with community efforts such as ClinGen and proceeded alongside population-scale projects including the 1000 Genomes Project and, later, the Exome Aggregation Consortium. Over time, ClinVar expanded in response to needs identified by clinical institutions such as Mayo Clinic and Johns Hopkins Hospital and by consortia such as the Global Alliance for Genomics and Health.

Database content and structure

ClinVar stores records that connect reported alleles to asserted clinical significance, phenotype descriptions, conditions described with vocabularies from organizations such as Orphanet, and supporting evidence such as peer-reviewed articles in journals like Cell or Science. Each record contains submitter metadata (laboratory or consortium) and a variant description mapped to reference sequences such as RefSeq and to genome assemblies from the Genome Reference Consortium. The archive tracks clinical significance categories influenced by the framework of the American College of Medical Genetics and Genomics and records a review status for each assertion, including review by expert panels such as those convened by ClinGen.
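The record structure described above can be illustrated with a minimal sketch. The `Submission` class and its field names are hypothetical and do not reflect ClinVar's actual schema, and the variant and submitter values are placeholders. The star ratings follow the review-status levels ClinVar displays, in simplified form; the exact wording of the status strings has varied between releases.

```python
from dataclasses import dataclass, field

# Review-status strings mapped to the star rating ClinVar displays for them.
# Simplified: wording has changed across releases, and unknown strings get 0.
REVIEW_STARS = {
    "practice guideline": 4,
    "reviewed by expert panel": 3,
    "criteria provided, multiple submitters, no conflicts": 2,
    "criteria provided, conflicting classifications": 1,
    "criteria provided, single submitter": 1,
    "no assertion criteria provided": 0,
}


@dataclass
class Submission:
    """One submitter's assertion about one variant (hypothetical schema)."""
    submitter: str          # laboratory or consortium name
    hgvs: str               # variant description in HGVS nomenclature
    condition: str          # asserted condition or phenotype
    classification: str     # e.g. "Pathogenic", "Uncertain significance"
    review_status: str      # one of the keys of REVIEW_STARS
    citations: list[str] = field(default_factory=list)  # e.g. PubMed IDs

    def stars(self) -> int:
        """Return the star rating for this submission's review status."""
        return REVIEW_STARS.get(self.review_status.strip().lower(), 0)


# Placeholder values only; this is not a real ClinVar record.
example = Submission(
    submitter="Example Diagnostic Lab",
    hgvs="NM_000000.0:c.100A>G",
    condition="Example condition",
    classification="Likely pathogenic",
    review_status="criteria provided, single submitter",
)
print(example.stars())
```

Keeping the review status as a string and deriving the star count from a lookup table means a record with an unrecognized status degrades to zero stars instead of raising an error.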

Submission and curation processes

Submitters include diagnostic laboratories such as GeneDx, research programs at institutions like the Broad Institute and the Wellcome Sanger Institute, and consortia such as ClinGen. Submission formats follow NCBI technical specifications and draw on standards from bodies like the Global Alliance for Genomics and Health and on the variant nomenclature of the Human Genome Variation Society. Curation occurs via automated checks and expert review; expert panels often include participants from academic centers such as Massachusetts General Hospital and regulatory stakeholders like the Food and Drug Administration. Discrepancies among submitters are flagged as conflicting classifications and can prompt resolution efforts involving reanalysis, literature review, and consensus building, similar to those used by organizations like the European Society of Human Genetics.
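How disagreement among submitters might be detected can be sketched as follows. This is a simplified illustration, not ClinVar's actual aggregation logic: it assumes classifications fall into three broad tiers (pathogenic or likely pathogenic, uncertain significance, benign or likely benign) and treats any disagreement across tiers as a conflict. The function name `aggregate` and the tier labels are hypothetical.

```python
# Map each germline classification term to a broad tier (assumed grouping).
TIER = {
    "pathogenic": "P/LP",
    "likely pathogenic": "P/LP",
    "uncertain significance": "VUS",
    "likely benign": "B/LB",
    "benign": "B/LB",
}


def aggregate(classifications):
    """Summarize a list of submitter classifications for one variant.

    Returns the shared tier if all recognized submissions agree,
    "conflicting" if they span more than one tier, and
    "not classified" if no recognized term is present.
    """
    tiers = set()
    for term in classifications:
        tier = TIER.get(term.strip().lower())
        if tier is not None:  # ignore terms outside the three tiers
            tiers.add(tier)
    if not tiers:
        return "not classified"
    if len(tiers) == 1:
        return next(iter(tiers))
    return "conflicting"


print(aggregate(["Pathogenic", "Likely pathogenic"]))
print(aggregate(["Pathogenic", "Uncertain significance"]))
```

Under this grouping, "Pathogenic" and "Likely pathogenic" from two laboratories count as agreement, while "Pathogenic" alongside "Uncertain significance" is reported as a conflict that would need review.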

Access, tools, and data formats

ClinVar data are accessible through the NCBI web interface, programmatically through the Entrez Programming Utilities (E-utilities), and as bulk FTP downloads in formats including XML, tab-delimited text, and Variant Call Format (VCF), with annotations referencing RefSeq and dbSNP. Visualization and analysis tools integrate ClinVar with platforms like the UCSC Genome Browser and Ensembl and with variant interpretation tools used at Stanford University and the Broad Institute. Data consumers include clinical laboratories, research groups from institutions such as Harvard Medical School, and bioinformatics tool developers who employ formats governed by bodies like the Global Alliance for Genomics and Health.
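The two programmatic routes can be sketched in a few lines. The first function builds an E-utilities `esearch` URL against the `clinvar` database without sending it; the second parses one data line of a ClinVar-style VCF file. The function names are hypothetical, the example line is invented for illustration and is not a real ClinVar record, and the INFO keys shown (`ALLELEID`, `CLNSIG`, `CLNREVSTAT`) are assumed to match those used in ClinVar's VCF releases.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=20):
    """Build (but do not send) an E-utilities search URL for ClinVar."""
    params = {"db": "clinvar", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def parse_vcf_line(line):
    """Parse one tab-separated VCF data line into a dictionary."""
    cols = line.rstrip("\n").split("\t")
    chrom, pos, variant_id, ref, alt = cols[:5]
    info = {}
    for item in cols[7].split(";"):
        key, sep, value = item.partition("=")
        info[key] = value if sep else True  # flag entries carry no value
    return {
        "chrom": chrom,
        "pos": int(pos),
        "id": variant_id,
        "ref": ref,
        "alt": alt,
        "info": info,
    }


# Invented example line; the values are placeholders, not real data.
EXAMPLE_LINE = (
    "1\t12345\t999999\tA\tG\t.\t.\t"
    "ALLELEID=111111;CLNSIG=Pathogenic;"
    "CLNREVSTAT=criteria_provided,_single_submitter"
)

record = parse_vcf_line(EXAMPLE_LINE)
print(esearch_url("BRCA1[gene]"))
print(record["pos"], record["info"]["CLNSIG"])
```

For anything beyond a quick look, an established VCF library is likely a better choice than hand parsing, since real files include header lines, multi-allelic records, and escaped characters that this sketch does not handle.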

Clinical and research applications

ClinVar supports clinical variant interpretation in diagnostic workflows at centers including Mayo Clinic, Children's Hospital of Philadelphia, and Johns Hopkins Hospital, and informs guideline development by organizations like the American College of Medical Genetics and Genomics. Researchers use ClinVar to prioritize variants in data from projects such as the 1000 Genomes Project and UK Biobank and from disease-focused groups like the International Cancer Genome Consortium. Pharmaceutical companies and regulatory agencies such as the Food and Drug Administration use ClinVar evidence when assessing genetic tests, pharmacogenomic markers, and actionable findings.

Limitations, controversies, and quality control

ClinVar faces challenges related to discordant submissions, variable provenance, and uneven coverage of populations, including those studied by projects like the 1000 Genomes Project and UK Biobank. Debates involve the interpretation frameworks promoted by groups like the American College of Medical Genetics and Genomics versus community-driven reclassification efforts led by laboratories such as GeneDx. Quality control combines automated validation, submitter-provided evidence, and expert-panel curation coordinated by ClinGen. It relies on cooperation with stakeholders, including academic centers like the Broad Institute and regulatory agencies such as the Food and Drug Administration, to reduce misclassification and improve transparency.

Category:Genetics databases