human genome — LLMpedia

human genome
Name	Human genome
Caption	Representation of human chromosomes
Organism	Homo sapiens
Size	~3.2 billion base pairs
Chromosomes	46 (22 pairs of autosomes, 2 sex chromosomes)
Discovery	Draft sequence announced 2000; finished assembly 2003

The human genome encodes the hereditary information of Homo sapiens and underpins research at institutions such as the National Institutes of Health, Wellcome Trust, Cold Spring Harbor Laboratory, Max Planck Society and Medical Research Council. Major milestones involved collaborations among teams at the Baylor College of Medicine, University of California, Santa Cruz, Broad Institute and Sanger Institute, as well as private firms such as Celera Genomics. Its sequence and interpretation have driven projects at the National Human Genome Research Institute, influenced policy debates in bodies including the United States Congress and the European Parliament, and informed clinical practice at hospitals such as Mayo Clinic and Johns Hopkins Hospital.

Overview and Definition

The human genome comprises the complete set of DNA in a typical Homo sapiens cell: the nuclear genome, organized into chromosomes in the nucleus, and the much smaller mitochondrial genome carried in the mitochondria. Both are studied at the Max Planck Institute for Evolutionary Anthropology, in Smithsonian Institution collections, and at clinical centers such as Massachusetts General Hospital. The Human Genome Project consortium, the International HapMap Project, and repositories such as GenBank and those of the European Molecular Biology Laboratory produced the reference assemblies and variation data used by research groups at the University of Cambridge and Yale University and by pharmaceutical companies such as GlaxoSmithKline and Pfizer.

Nuclear DNA is partitioned into 22 pairs of autosomes and one pair of sex chromosomes, an arrangement mapped by cytogeneticists, including members of professional bodies such as the American Society of Human Genetics, and visualized in karyotypes produced by labs at Harvard Medical School and the University of Oxford. Chromosomes contain euchromatin and heterochromatin regions characterized in studies at the Sanger Institute and the National Cancer Institute, while mitochondrial DNA was characterized by researchers affiliated with the University of Arizona and the University of Pennsylvania. Repetitive elements, centromeres and telomeres have been focal points for groups at the Broad Institute and the European Bioinformatics Institute.
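The chromosome count given above can be checked with a short sketch. The function name `diploid_karyotype` and the labeling scheme are illustrative assumptions, not part of any standard cytogenetics tool.

```python
# Illustrative sketch: chromosome labels in a typical diploid human nucleus.
# The names below are hypothetical and chosen only for this example.
AUTOSOME_PAIRS = 22


def diploid_karyotype(sex_chromosomes=("X", "X")):
    """Return one label per chromosome: two copies of each autosome
    (1 to 22) followed by the two sex chromosomes."""
    autosomes = [
        str(number)
        for number in range(1, AUTOSOME_PAIRS + 1)
        for _copy in range(2)
    ]
    return autosomes + list(sex_chromosomes)


karyotype_xx = diploid_karyotype(("X", "X"))
karyotype_xy = diploid_karyotype(("X", "Y"))
print(len(karyotype_xx), len(karyotype_xy))
```

Both karyotypes contain 44 autosomes plus 2 sex chromosomes, matching the total of 46 listed in the infobox. Mitochondrial DNA is not counted, since it lies outside the nucleus.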

Large-scale sequencing initiatives were pioneered by the publicly funded Human Genome Project and the private company Celera Genomics, with landmark announcements involving leaders from the White House and funders such as the Wellcome Trust and the U.S. Department of Energy. Technologies from firms like Illumina and Pacific Biosciences enabled phased improvements adopted by sequencing centers at the Broad Institute, Cold Spring Harbor Laboratory, and the Sanger Institute. The resulting reference assemblies have been curated by databases maintained by the National Center for Biotechnology Information, the European Nucleotide Archive and the Genome Reference Consortium.

Genetic variation across populations was cataloged in efforts such as the 1000 Genomes Project and the International HapMap Project, and in population studies led by universities including Stanford University, the University of Chicago and University College London. Research teams at institutes such as the Broad Institute and the Sanger Institute analyzed single-nucleotide polymorphisms, structural variants and copy-number variants to study ancestry and disease associations, informing clinical guidelines at the World Health Organization and national agencies including the Centers for Disease Control and Prevention. Population-genetic studies often draw on biobanks such as the UK Biobank and on regional initiatives connected to the All of Us Research Program.
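The variant classes named above can be told apart, to a first approximation, by comparing a reference allele with an alternate allele. The following is a minimal sketch under stated assumptions: the function `classify_variant` is hypothetical, and the 50-base-pair cutoff for structural variants is a commonly used convention rather than a fixed rule. Copy-number variants, a subclass of structural variants, cannot be identified from a single allele pair and are not handled here.

```python
# Illustrative sketch: rough variant classification from allele strings.
# SV_MIN_LENGTH is an assumed convention; projects may use other cutoffs.
SV_MIN_LENGTH = 50


def classify_variant(ref, alt):
    """Classify a variant from its reference and alternate alleles."""
    if not ref or not alt:
        raise ValueError("alleles must be non-empty strings")
    if ref == alt:
        raise ValueError("reference and alternate alleles are identical")
    if len(ref) == len(alt):
        # Same length: a substitution of one base (SNP) or several (MNP).
        return "SNP" if len(ref) == 1 else "MNP"
    # Different lengths: an insertion or deletion, graded by size.
    size_change = abs(len(ref) - len(alt))
    if size_change >= SV_MIN_LENGTH:
        return "structural variant"
    return "indel"


print(classify_variant("A", "G"))
print(classify_variant("A", "ATG"))
print(classify_variant("A", "A" + "T" * 60))
```

Real variant callers work from aligned reads and richer records, so this comparison only illustrates how the categories differ in size and kind.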

Annotation of genes, promoters, enhancers and non-coding elements has been advanced by consortia including the ENCODE Project and the FANTOM Consortium, with computational resources provided by the European Bioinformatics Institute, the National Center for Biotechnology Information and research groups at the University of California, Santa Barbara and Princeton University. Functional genomics experiments at labs affiliated with the Howard Hughes Medical Institute, the Salk Institute and the Whitehead Institute have used transcriptomics, epigenomics and proteomics to assign function to genomic regions. The resulting catalogs are integrated into resources hosted by the Sanger Institute and the Broad Institute.

Clinical translation of genomic findings is practiced at centers such as Mayo Clinic, Cleveland Clinic and Johns Hopkins Hospital, while regulatory, ethical and legal issues have been debated before bodies like the United States Congress, the European Parliament and committees convened by the National Academies of Sciences, Engineering, and Medicine. Intellectual property disputes involving firms like Celera Genomics and policy frameworks championed by organizations including the World Health Organization and the National Institutes of Health shaped access to data and standards for use in precision medicine initiatives launched by pharmaceutical companies such as Roche and Novartis.