LLMpedia: The first transparent, open encyclopedia generated by LLMs

DNA sequencing

Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.

Name: DNA sequencing
Caption: High-throughput sequencing workflow in a genomics laboratory
Introduced: 1977
Inventors: Frederick Sanger; Walter Gilbert; Allan Maxam
Type: Molecular biology technique
Purpose: Determination of nucleotide order in DNA

DNA sequencing is the laboratory process used to determine the precise order of nucleotides within a DNA molecule. Developed from foundational work in molecular genetics, it underpins large-scale efforts such as the Human Genome Project and the 1000 Genomes Project, as well as pathogen surveillance during outbreaks such as the Ebola virus epidemic in West Africa and the COVID-19 pandemic. Pioneers of the field include Frederick Sanger and Walter Gilbert, and major supporting organizations include the Wellcome Trust and the National Institutes of Health.

History

Early steps toward DNA sequencing grew out of Frederick Sanger's earlier work on protein sequencing. Maxam–Gilbert sequencing was developed at about the same time as the Sanger method, and the Sanger method went on to be used in the Human Genome Project, which in the United States was led by the Department of Energy and the National Institutes of Health. Key events also include the publication in 1977 of the first complete DNA genome, that of the bacteriophage Phi X 174, and the announcements of the human genome's completion by participants such as the Wellcome Trust and the DOE Joint Genome Institute. The rise of high-throughput platforms was driven by companies such as Illumina, Inc. and coincided with initiatives like the International HapMap Project, while private-sector efforts by Celera Genomics and figures such as Craig Venter accelerated commercialization and competition.

Methods and technologies

Classical methods include the Sanger method and the Maxam–Gilbert method; Sanger sequencing was later automated on capillary electrophoresis instruments from Applied Biosystems. Next-generation sequencing platforms comprise technologies developed by companies such as Illumina, Inc. and Roche with the 454 platform. Third-generation single-molecule approaches were advanced by firms like Pacific Biosciences and Oxford Nanopore Technologies, enabling long-read sequencing used in assemblies by groups such as the Broad Institute and the European Bioinformatics Institute. Laboratory workflows often integrate instruments from Thermo Fisher Scientific, sample preparation kits from specialty providers, and standards set by bodies such as the International Organization for Standardization in regulated environments like clinical laboratories accredited by the College of American Pathologists.
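
As a rough illustration of the read-out principle behind the Sanger method, the following Python sketch models an idealized chain-termination experiment: every possible fragment of a newly synthesized strand is grouped by its terminating base, and the sequence is recovered by ordering the fragments by length. The function names and the lane representation are hypothetical, and the model ignores real-world effects such as uneven termination or limited gel resolution.

```python
from typing import Dict, List


def chain_terminated_fragments(strand: str) -> Dict[str, List[int]]:
    """Group fragment lengths by terminating base (one idealized 'lane' per base).

    Toy assumption: termination happens exactly once at every position,
    so each prefix of the synthesized strand appears as one fragment.
    """
    lanes: Dict[str, List[int]] = {base: [] for base in "ACGT"}
    for length in range(1, len(strand) + 1):
        base = strand[length - 1]
        if base not in lanes:
            raise ValueError(f"unexpected base {base!r} at position {length}")
        lanes[base].append(length)
    return lanes


def read_sequence(lanes: Dict[str, List[int]]) -> str:
    """Recover the strand by reading terminating bases in order of fragment length."""
    base_by_length = {
        length: base for base, lengths in lanes.items() for length in lengths
    }
    return "".join(base_by_length[length] for length in sorted(base_by_length))


if __name__ == "__main__":
    lanes = chain_terminated_fragments("GATTACA")
    print(lanes)
    print(read_sequence(lanes))
```

Sorting by length stands in for electrophoretic separation, where shorter fragments migrate further; the recovered string is the synthesized strand, which is complementary to the template.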

Applications

Sequencing supports human genetics projects such as the Human Genome Project, population studies like the 1000 Genomes Project, and precision medicine efforts at institutions including Mayo Clinic and MD Anderson Cancer Center. Clinical uses include diagnostics for conditions cataloged by the American College of Medical Genetics and Genomics and tumor profiling at centers such as Dana-Farber Cancer Institute. Pathogen genomics informs public health responses by agencies like the Centers for Disease Control and Prevention during events including the 2009 swine flu pandemic and the COVID-19 pandemic. Conservation genomics projects by organizations such as the World Wide Fund for Nature use sequencing to study endangered species, while agrigenomics programs at the International Rice Research Institute and CIMMYT use sequencing for crop improvement.

Data analysis and bioinformatics

Raw sequence data are processed with tools and resources from institutions like the European Bioinformatics Institute and the National Center for Biotechnology Information. Common software and algorithms, originating in academic groups at the University of California, Berkeley, the Broad Institute, and the European Molecular Biology Laboratory, include aligners, assemblers, and variant callers. These are used in pipelines run on computing platforms such as Amazon Web Services and on supercomputers at Lawrence Berkeley National Laboratory. Reference datasets such as those from the Genome Reference Consortium and variant catalogs like dbSNP guide interpretation. Reads are commonly stored as FASTQ files and alignments as BAM files, with the BAM specification now maintained under the Global Alliance for Genomics and Health. Training and reproducibility efforts come from programs at Carnegie Mellon University, Massachusetts Institute of Technology, and community workshops at Cold Spring Harbor Laboratory.
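
To make the FASTQ format mentioned above more concrete, here is a minimal Python sketch of a reader for the common four-line-per-record layout. It assumes the Phred+33 quality encoding and unwrapped sequence lines; the names `FastqRecord`, `read_fastq`, and `error_probability` are illustrative and not taken from any particular toolkit. Production pipelines would normally rely on an established library rather than a hand-written parser.

```python
from io import StringIO
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: List[int]  # one Phred score per base


def read_fastq(handle: TextIO, offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with exactly four lines per record.

    Assumes Phred+33 encoding by default; stops at end of input or at the
    first empty header line.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, [ord(c) - offset for c in quality])


def error_probability(phred: int) -> float:
    """Convert a Phred score Q to an error probability, P = 10^(-Q/10)."""
    return 10 ** (-phred / 10)


if __name__ == "__main__":
    example = StringIO("@read1\nACGT\n+\nII5!\n")
    for record in read_fastq(example):
        mean_q = sum(record.qualities) / len(record.qualities)
        print(record.name, record.sequence, record.qualities, mean_q)
```

Each quality character encodes one Phred score, so a score of 20 corresponds to an estimated 1 in 100 chance that the base call is wrong.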

Accuracy, errors, and validation

Error modes differ by platform: substitution errors predominate in systems produced by Illumina, Inc.; reads from Oxford Nanopore Technologies and Pacific Biosciences platforms are prone to insertions and deletions (indels); and legacy instruments from Applied Biosystems show dye-terminator artifacts. Validation and proficiency testing are performed by clinical laboratories accredited by the College of American Pathologists and guided by regulatory agencies such as the Food and Drug Administration and the European Medicines Agency. Benchmarking datasets from the Genome in a Bottle consortium and projects at the National Institute of Standards and Technology provide standards for assessing sensitivity, specificity, and reproducibility for diagnostic assays implemented in hospitals like Johns Hopkins Hospital.
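
The kind of assessment described above can be sketched in Python by comparing a set of variant calls against a trusted truth set. This is a simplified illustration, not the procedure of any named consortium: variants are matched only when chromosome, position, and alleles agree exactly, and the function name `benchmark_calls` and the example variants are hypothetical. The sketch reports precision rather than specificity, because specificity needs a count of true negatives, which this toy comparison does not define.

```python
from typing import Dict, Set, Tuple

# (chromosome, position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def benchmark_calls(calls: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare called variants with a truth set using exact matching."""
    tp = len(calls & truth)   # called and present in the truth set
    fp = len(calls - truth)   # called but absent from the truth set
    fn = len(truth - calls)   # in the truth set but not called
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return {
        "true_positives": tp,
        "false_positives": fp,
        "false_negatives": fn,
        "sensitivity": sensitivity,
        "precision": precision,
    }


if __name__ == "__main__":
    truth = {
        ("chr1", 100, "A", "G"),
        ("chr1", 250, "C", "T"),
        ("chr2", 75, "G", "A"),
        ("chr2", 900, "T", "C"),
    }
    calls = {
        ("chr1", 100, "A", "G"),
        ("chr1", 250, "C", "T"),
        ("chr2", 75, "G", "A"),
        ("chr3", 10, "A", "T"),  # not in the truth set: a false positive
    }
    print(benchmark_calls(calls, truth))
```

Exact matching undercounts agreement when the same variant is written in different but equivalent ways, which is one reason dedicated benchmarking tools normalize variant representation before comparison.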

Ethical, legal, and social issues

Sequencing raises ethical, legal, and social issues addressed by entities such as the Presidential Commission for the Study of Bioethical Issues, the National Bioethics Advisory Commission, and legislatures crafting laws like the Genetic Information Nondiscrimination Act of 2008. Privacy concerns involve databases maintained by organizations such as Ancestry.com and 23andMe, prompting policy debate in forums including the European Parliament and courts like the Supreme Court of the United States. Equity and access are topics for global initiatives involving the World Health Organization, the Bill & Melinda Gates Foundation, and national health services such as the National Health Service in England, while patent disputes historically involved parties like Myriad Genetics and shaped jurisprudence in courts including the United States Court of Appeals for the Federal Circuit.

Category:Genetics