DDBJ — LLMpedia

DDBJ
Name	DNA Data Bank of Japan
Formation	1986
Location	Mishima, Shizuoka Prefecture
Leader title	Director

Contents

History
Organizational structure and governance
Data content and major databases
Submission and accession processes
Data access, retrieval and tools
Collaborations and international role

DDBJ is a public nucleotide sequence archive established in 1986 that collects, organizes and disseminates DNA and RNA sequence data generated by researchers and institutions. The archive operates in partnership with international repositories and national institutes to support research in molecular biology, genomics, bioinformatics and biotechnology. It provides accession numbers, submission services and online tools for retrieval, annotation and analysis used by universities, research centers, hospitals and industrial laboratories.

History

The database was created during a period of rapid expansion in sequence generation, alongside databases, institutions and programs such as GenBank, the European Molecular Biology Laboratory, the National Institutes of Health, the Wellcome Trust, Cold Spring Harbor Laboratory and, later, the Human Genome Project. Early milestones included participation in the International Nucleotide Sequence Database Collaboration and initiatives supported by bodies including the Ministry of Education, Culture, Sports, Science and Technology (Japan), the Japan Science and Technology Agency, the National Institute of Genetics and RIKEN. Major historical contributions include submissions from sequencing centers engaged in efforts similar to the Human Genome Project and the International HapMap Project, as well as deposits originating from consortia such as the 1000 Genomes Project, Genome 10K, the Earth BioGenome Project and the Global Ocean Sampling Expedition, and from national surveillance networks such as those linked to the National Institute of Infectious Diseases (Japan). The archive evolved alongside technologies pioneered at institutions such as the Sanger Centre, the Broad Institute, EMBL-EBI and the J. Craig Venter Institute.

Organizational structure and governance

The archive is administered within a Japanese research framework involving organizations such as the National Institute of Genetics, the Japan Agency for Marine-Earth Science and Technology, the RIKEN Center for Integrative Medical Sciences and the University of Tokyo, together with regional partners in Kashiwa, Chiba Prefecture. Governance aligns with international standards promoted by bodies such as the International Committee on Taxonomy of Viruses, the World Health Organization and the Global Alliance for Genomics and Health, with advisory input from consortia including the Genomic Standards Consortium, from partner archives at the European Bioinformatics Institute and the National Center for Biotechnology Information, and from funding agencies such as the Japan Society for the Promotion of Science. Leadership works with research funders such as MEXT, and with philanthropic organizations similar to the Wellcome Trust, to coordinate policy on data sharing, privacy and access. Institutional collaborations extend to universities including Keio University, Osaka University, Kyoto University and Waseda University, and to research hospitals such as Tokyo Medical and Dental University Hospital.

Data content and major databases

Holdings encompass nucleotide sequences from projects at centers such as the Sanger Institute and the Broad Institute, from work supported by funders such as the Biotechnology and Biological Sciences Research Council, from public health agencies such as the Centers for Disease Control and Prevention (United States), and from clinical laboratories such as Beth Israel Deaconess Medical Center and Johns Hopkins Hospital. Collections include whole genomes submitted by initiatives akin to the 1000 Genomes Project, metagenomes from expeditions comparable to the Tara Oceans expedition, transcriptomes produced in laboratories at Stanford University, the Massachusetts Institute of Technology and Harvard University, and viral genomes deposited during outbreaks monitored by the World Health Organization and the Centers for Disease Control and Prevention. Records follow an accession model similar to that of GenBank, use metadata standards such as Minimum Information about any (x) Sequence (MIxS), and carry taxonomic annotation based on frameworks such as NCBI Taxonomy; they are shared through International Nucleotide Sequence Database Collaboration exchanges. Specialized resources include draft and finished assemblies contributed by consortia such as the Human Microbiome Project, plant genome project partners at the International Rice Research Institute and vertebrate resources related to Genome 10K.

Submission and accession processes

Submitters range from teams at Harvard Medical School, Stanford School of Medicine, the University of California, Berkeley, and the University of Oxford, and national laboratories such as Los Alamos National Laboratory and Lawrence Berkeley National Laboratory, to clinical networks at the National Cancer Center Hospital and surveillance units tied to the National Institute of Infectious Diseases (Japan). Accessioning follows procedures comparable to those used by GenBank and EMBL-EBI, with identifiers assigned for sequences, features and assemblies. Submission pipelines accommodate formats used by projects such as the ENCODE Project and the 1000 Genomes Project, and by clinical sequencing initiatives at centers such as Mayo Clinic and Cleveland Clinic. Policies reflect consent and data protection frameworks influenced by rulings and guidance from entities such as the European Commission and the Personal Information Protection Commission (Japan), and by data sharing frameworks developed for multi-institutional studies by organizations such as the Global Alliance for Genomics and Health.
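As an illustration of the identifiers mentioned above, the following Python sketch classifies an accession string by its shape. The patterns are an assumption based on the commonly documented INSDC conventions (for example one letter plus five digits, or two letters plus six or eight digits, for nucleotide records); they are not taken from this article, and the function name `classify_accession` is hypothetical rather than part of any DDBJ tool.

```python
import re

# Assumed accession shapes, following commonly documented INSDC conventions.
# These patterns are illustrative and are not an official validation rule set.
PATTERNS = [
    # Nucleotide: 1 letter + 5 digits, 2 letters + 6 digits, or 2 letters + 8 digits
    ("nucleotide", re.compile(r"^(?:[A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})$")),
    # Protein: 3 letters + 5 or 7 digits
    ("protein", re.compile(r"^[A-Z]{3}(?:\d{5}|\d{7})$")),
    # Whole-genome shotgun: 4 letters + 8 or more digits, or 6 letters + 9 or more digits
    ("wgs", re.compile(r"^(?:[A-Z]{4}\d{8,}|[A-Z]{6}\d{9,})$")),
]


def classify_accession(accession: str) -> str:
    """Return 'nucleotide', 'protein', 'wgs' or 'unknown' for an accession string."""
    # Normalize case and drop an optional version suffix such as ".1".
    core = accession.strip().upper().split(".", 1)[0]
    for label, pattern in PATTERNS:
        if pattern.match(core):
            return label
    return "unknown"


if __name__ == "__main__":
    for example in ["AB000001.1", "U12345", "BAA12345", "BAAA01000001", "not-an-id"]:
        print(example, classify_accession(example))
```

A shape check of this kind only confirms that a string looks like an accession; it does not confirm that the record exists in the archive.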

Data access, retrieval and tools

Users access content via web portals and programmatic interfaces modeled after services at the European Bioinformatics Institute and NCBI, and after portals run by institutions such as EMBL, the Wellcome Trust Sanger Institute and national computing infrastructure providers such as the National Institute of Informatics (Japan). Tools support BLAST searches similar to those offered by the National Center for Biotechnology Information, alignment utilities of the kind developed by Heng Li, assembly software inspired by projects at the Broad Institute, and visualization frameworks used by resources such as the UCSC Genome Browser, Ensembl and platforms developed at the European Molecular Biology Laboratory. Data retrieval draws on open-source conventions from communities such as the Bioinformatics Open Source Conference, and large-scale sequence analysis can make use of cloud platforms akin to Amazon Web Services and Google Cloud. Training and user support connect to workshops held at universities such as the University of Tokyo and Kyoto University, and to conferences such as ISMB, RECOMB and the Gordon Research Conferences.
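To make the retrieval step more concrete, here is a minimal Python sketch of reading retrieved sequence records. It assumes the records have been downloaded in FASTA format, which this article does not specify, and the helper names `parse_fasta` and `gc_fraction` are hypothetical. No DDBJ endpoint is called; the input is an in-memory example.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a mapping of record id to sequence."""
    chunks: Dict[str, List[str]] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # The record id is the first whitespace-delimited token of the header.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            chunks[current_id] = []
        elif current_id is not None:
            # Sequence lines are concatenated until the next header.
            chunks[current_id].append(line)
    return {record_id: "".join(parts) for record_id, parts in chunks.items()}


def gc_fraction(sequence: str) -> float:
    """Return the fraction of G and C bases in a sequence (0.0 if empty)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    # Made-up example records, not real archive entries.
    text = ">seq1 example record\nACGT\nGGCC\n>seq2\nAATT\n"
    records = parse_fasta(text.splitlines())
    for record_id, seq in records.items():
        print(record_id, len(seq), gc_fraction(seq))
```

For production use, an established parser such as the one in Biopython would normally be preferred over a hand-written reader like this.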

Collaborations and international role

The archive is a founding partner in the International Nucleotide Sequence Database Collaboration, the global exchange framework that connects it with GenBank and the European Nucleotide Archive. It contributes to outbreak response networks that collaborate with the World Health Organization, the Centers for Disease Control and Prevention and regional public health agencies, and to biodiversity initiatives working with the United Nations Environment Programme, the Convention on Biological Diversity and projects such as the Earth BioGenome Project. Scientific partnerships span academic centers including Harvard University, the University of Cambridge, the Max Planck Society, CNRS and Stanford University, and extend to industrial collaborations with biotechnology firms such as Illumina, Thermo Fisher Scientific and Roche and to biobanking networks analogous to BBMRI-ERIC. Its role in standards development intersects with groups such as the Genomic Standards Consortium and the Global Alliance for Genomics and Health, and with international taxonomy committees such as the International Committee on Systematics of Prokaryotes.
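Because the three partner archives exchange records, the same identifier can be served by any of them, and the prefix is the usual hint to where it was issued. The Python sketch below assumes the widely used prefix conventions for BioProject, BioSample and read archive identifiers (for example `PRJDB`, `SAMD` and `DRR` for DDBJ). These conventions are not described in this article, and the function name `issuing_archive` is hypothetical.

```python
import re
from typing import Optional

# Assumed convention: D marks DDBJ, E marks ENA, and S or N marks NCBI.
ARCHIVE_BY_LETTER = {"D": "DDBJ", "E": "ENA", "S": "NCBI", "N": "NCBI"}

# Each pattern captures the single letter that identifies the issuing archive.
PREFIX_PATTERNS = [
    re.compile(r"^PRJ([DEN])[A-Z]\d+$"),   # BioProject, e.g. PRJDB1234
    re.compile(r"^SAM([DEN])[A-Z]?\d+$"),  # BioSample, e.g. SAMD00000001
    re.compile(r"^([DES])R[APRSXZ]\d+$"),  # Read archive, e.g. DRR000001
]


def issuing_archive(identifier: str) -> Optional[str]:
    """Guess the issuing archive from an identifier prefix, or return None."""
    value = identifier.strip().upper()
    for pattern in PREFIX_PATTERNS:
        match = pattern.match(value)
        if match:
            return ARCHIVE_BY_LETTER[match.group(1)]
    return None


if __name__ == "__main__":
    for example in ["DRR000001", "PRJEB1234", "SAMN00000001", "XYZ123"]:
        print(example, issuing_archive(example))
```

Prefixes for nucleotide sequence accessions are allocated in a more complex way and are deliberately left out of this sketch.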

Category:Biological databases