
Vertebrate Genomes Project

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Name: Vertebrate Genomes Project
Founded: 2016
Location: Global
Focus: Genome sequencing, biodiversity, conservation

The Vertebrate Genomes Project is an international consortium that produces high-quality, near-complete reference genomes for vertebrate species. The project coordinates researchers, sequencing centers, and museums to generate assemblies intended to support comparative genomics, conservation biology, and evolutionary studies. Partners include academic institutions, natural history museums, and funding agencies across North America, Europe, Asia, and Australia.

Overview

The consortium coordinates contributions from institutions such as the Broad Institute, the Wellcome Sanger Institute, the Smithsonian Institution, Harvard University, and the University of California, Berkeley to produce reference genomes for representatives across vertebrate diversity. Workflows integrate technologies developed by companies and centers including Pacific Biosciences, Oxford Nanopore Technologies, 10x Genomics, and the Baylor College of Medicine Human Genome Sequencing Center. Samples often come from museum collections, such as those of the American Museum of Natural History, the Natural History Museum, London, and the Australian Museum, as well as from field programs associated with universities such as the University of Cambridge and the University of Oxford. The project publishes data in coordination with repositories including the National Center for Biotechnology Information, the European Bioinformatics Institute, and the DNA Data Bank of Japan.

History and Organization

The initiative began in the mid-2010s with coordination among researchers affiliated with the University of California, Santa Cruz, Caltech, the University of Washington, and the Max Planck Society. Early organizational leadership involved scientists supported by grants from agencies such as the National Science Foundation, the Wellcome Trust, and the Gates Foundation. The consortium established steering and working groups drawing members from institutions such as Cold Spring Harbor Laboratory, the University of Chicago, Yale University, and Stanford University. Collaborative meetings and symposia have been held at venues including Cold Spring Harbor Laboratory and at conferences such as the International Congress on Vertebrate Morphology and the Genome Informatics conference.

Objectives and Scope

Primary goals include producing chromosome-scale, haplotype-resolved assemblies for representatives of all extant vertebrate orders, drawing on specimens held by institutions such as the Smithsonian National Museum of Natural History and the Canadian Museum of Nature. The scope spans mammals, birds, reptiles, amphibians, and fishes, engaging specialists from the American Ornithological Society, the Society for Conservation Biology, and taxon-focused groups such as the Herpetologists' League. Outputs are intended for researchers at laboratories such as those at Princeton University, Columbia University, and the University of Michigan, and for conservation programs at organizations such as the World Wildlife Fund and Conservation International.

Methodologies and Technologies

Protocols combine long-read sequencing on platforms from Pacific Biosciences and Oxford Nanopore Technologies with proximity-ligation (Hi-C) data from companies such as Dovetail Genomics and Arima Genomics to scaffold contigs to chromosome scale. Haplotype phasing uses approaches developed at institutions such as the Broad Institute and the McDonnell Genome Institute, sometimes integrating linked-read data from 10x Genomics and optical maps from Bionano Genomics. Bioinformatics pipelines employ tools and frameworks created by groups at the European Bioinformatics Institute, the Broad Institute, and the University of California, Santa Cruz, together with assemblers and scaffolders developed in labs including those at the Wellcome Sanger Institute and the Genome Institute at Washington University in St. Louis. Quality standards draw on benchmarks established by the Human Genome Project and by initiatives at the National Human Genome Research Institute.
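
Contiguity, one of the properties that long reads and scaffolding are meant to improve, is commonly summarized by the N50 statistic: the length L such that sequences of length L or greater together cover at least half of the assembly. The following is a minimal sketch of that calculation, not code from any consortium pipeline; the function name `n50` and the example lengths are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Hypothetical contig lengths in base pairs (illustrative values only).
contigs = [100, 80, 50, 30, 20, 10]
print(n50(contigs))  # prints 80: 100 + 80 = 180 covers at least half of 290
```

The same function applies to contigs or to scaffolds; comparing the two values before and after proximity-ligation scaffolding gives a rough sense of how much the scaffolding step improved contiguity.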

Major Achievements and Releases

The consortium has released high-contiguity genomes for taxa sampled with support from museums and field programs linked to the Field Museum, the American Museum of Natural History, and the Royal Ontario Museum. Representative releases have included assemblies that enabled comparative studies alongside data from the Human Genome Project, the 1000 Genomes Project, and Genome 10K. Results have been reported in publications with coauthors from institutions such as Duke University, the University of Copenhagen, the Max Planck Institute for Evolutionary Anthropology, and Monash University. Datasets have facilitated phylogenomic analyses of clades studied by researchers at the University of Helsinki, the University of São Paulo, and Peking University.

Data Access and Resources

All assemblies and raw data are deposited with archival repositories such as the Sequence Read Archive at the National Center for Biotechnology Information and the European Nucleotide Archive at the European Bioinformatics Institute, and are mirrored at the DNA Data Bank of Japan. Metadata are curated in collaboration with natural history collections such as the Museum of Natural History, Vienna and the National Museum of Natural History (France). Analytical resources and browsers that build on these outputs have been developed by teams at Ensembl and the UCSC Genome Browser at the University of California, Santa Cruz, along with community portals maintained by groups at the Broad Institute and the Wellcome Sanger Institute.

Impact and Applications

High-quality vertebrate assemblies have supported conservation decisions by agencies and NGOs such as the IUCN and BirdLife International and informed captive-breeding programs linked to institutions such as San Diego Zoo Global and the Zoological Society of London. Comparative genomic data have advanced research in evolutionary biology at the University of Chicago, the University of California, Santa Cruz, and Imperial College London, and have been used in studies intersecting with human biomedical research at the National Institutes of Health and translational efforts at the Broad Institute. The project’s outputs also underpin educational initiatives and public outreach coordinated with museums including the Smithsonian Institution and the Natural History Museum, London.

Category:Genomics projects