DNA Data Bank of Japan

Name	DNA Data Bank of Japan
Formed	1986
Headquarters	Mishima, Shizuoka
Jurisdiction	Japan
Parent agency	National Institute of Genetics

The DNA Data Bank of Japan (DDBJ) is a public nucleotide sequence database located in Mishima, Shizuoka, and operated by the National Institute of Genetics. It is one of the three primary international archival resources for nucleotide sequence data, alongside GenBank at the National Center for Biotechnology Information and the European Nucleotide Archive at the European Bioinformatics Institute. The database supports researchers at institutions such as the University of Tokyo, Kyoto University, Osaka University, and RIKEN, as well as international partners including the World Health Organization and the Wellcome Trust.

Overview

The DNA Data Bank of Japan provides archival storage, retrieval, and distribution of nucleotide sequence records contributed by laboratories at Harvard University, Stanford University, the Massachusetts Institute of Technology, Oxford University, Cambridge University, the Max Planck Society, Cold Spring Harbor Laboratory, the Scripps Research Institute, the California Institute of Technology, and national institutes such as the Chinese Academy of Sciences and the Korea Advanced Institute of Science and Technology. Its services follow standards set by the International Nucleotide Sequence Database Collaboration, through which it coordinates exchanges with the European Molecular Biology Laboratory and the National Institutes of Health. The repository underpins analyses by researchers involved with the Human Genome Project, the 1000 Genomes Project, the International HapMap Project, and consortia such as the ENCODE Project and the Genome Reference Consortium.

The database traces its origins to collaborations in the 1980s between scientists at Tokyo University of Agriculture and Technology and the Ministry of Education, Culture, Sports, Science and Technology, and was formalized in 1986 as a national repository. Early exchanges of sequence records took place with GenBank and the European Molecular Biology Laboratory under the framework of the International Nucleotide Sequence Database Collaboration, enabling joint curation during era-defining efforts such as the Human Genome Project and the sequencing of organisms including Escherichia coli, Saccharomyces cerevisiae, and Caenorhabditis elegans. Over time the database evolved alongside technological milestones set by innovators at Bell Labs, IBM Research, and companies such as Illumina and Oxford Nanopore Technologies, adapting to high-throughput sequencing contributions from projects such as The Cancer Genome Atlas and from national pathogen surveillance initiatives that followed outbreaks investigated by the Centers for Disease Control and Prevention and the European Centre for Disease Prevention and Control.

Operational leadership is provided through the National Institute of Genetics, with oversight connections to agencies including the Ministry of Health, Labour and Welfare and cooperative agreements with universities such as Hokkaido University and Tohoku University. Advisory relationships involve stakeholders from the Japan Science and Technology Agency, the Japan Agency for Medical Research and Development, and international organizations including the International Committee on Taxonomy of Viruses and the Global Alliance for Genomics and Health. Governance incorporates technical standards from bodies like the World Wide Web Consortium and collaborations that mirror initiatives at the European Bioinformatics Institute and the National Center for Biotechnology Information.

The repository accepts primary nucleotide sequences, assembled genomes, transcriptomes, and associated metadata for organisms ranging from model species such as Drosophila melanogaster and Mus musculus to agricultural taxa studied at institutions such as Ibaraki University and Hiroshima University. Services include sequence submission tools, annotation pipelines, accessioning compatible with GenBank and the European Nucleotide Archive, and distribution mechanisms supporting analysis platforms used at the Broad Institute, EMBL-EBI, and commercial entities such as Thermo Fisher Scientific. The database integrates with resources such as UniProt, RefSeq, and Gene Ontology, and with visualization tools developed in collaboration with groups at RIKEN and the Protein Data Bank community.

The database is a founding partner in the International Nucleotide Sequence Database Collaboration, exchanging data daily with GenBank and the European Nucleotide Archive. It participates in global pathogen genomics efforts with the World Health Organization and with regional networks, including public health initiatives of the Asia-Pacific Economic Cooperation, and collaborates on biodiversity projects linked to the Convention on Biological Diversity and the Global Biodiversity Information Facility. Joint projects and standardization efforts involve partners such as EMBL-EBI, the National Institutes of Health, the Wellcome Sanger Institute, Institut Pasteur, and university consortia at the University of California, Berkeley and ETH Zurich.

The repository supported submissions for landmark efforts including the Human Genome Project, the 1000 Genomes Project, and the International HapMap Project, as well as pathogen sequencing during outbreaks of SARS, MERS, and COVID-19. It has archived sequences used in studies published by researchers affiliated with Princeton University, Yale University, Columbia University, the University of Cambridge, and the University of Oxford. Its holdings have facilitated comparative genomics work on taxa studied at the Royal Botanic Gardens, Kew, the Smithsonian Institution, and the Natural History Museum, London, and have underpinned translational research at medical centers such as Keio University Hospital and Sapporo Medical University Hospital.

Archived sequences are openly accessible under accession identifiers consistent with INSDC conventions. Submitters include academic groups at the University of Tokyo Hospital, industrial laboratories at Takeda Pharmaceutical Company and Astellas Pharma, and public health agencies such as the National Institute of Infectious Diseases. Submission policies align with ethical frameworks advanced by organizations such as the Global Alliance for Genomics and Health and with legal considerations arising from international agreements such as the Nagoya Protocol. Data reuse is governed by deposit terms that mirror practices at GenBank and EMBL-EBI, with metadata standards harmonized with initiatives such as the Minimum Information about a Microarray Experiment and with community curation efforts involving groups at Kyushu University and Nagasaki University.
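
As an illustration of what accession identifiers shared across the INSDC databases look like, the following minimal Python sketch checks whether a string has the general shape of a nucleotide accession. It assumes the commonly documented formats (one letter plus five digits, or two letters plus six or eight digits, optionally followed by a version suffix such as ".1"). The function name and the sample identifiers are hypothetical, and the pattern is not an official or exhaustive specification; it does not confirm that a record actually exists.

```python
import re

# Assumed, simplified shapes for INSDC nucleotide accessions:
#   1 letter + 5 digits, 2 letters + 6 digits, or 2 letters + 8 digits,
#   optionally followed by ".<version>". Illustrative only, not exhaustive.
ACCESSION_RE = re.compile(
    r"(?:[A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})(?:\.\d+)?"
)


def looks_like_insdc_accession(text: str) -> bool:
    """Return True if `text` has the general shape of a nucleotide accession.

    This is a format check only (hypothetical helper); it does not query
    any database.
    """
    candidate = text.strip().upper()
    return ACCESSION_RE.fullmatch(candidate) is not None


if __name__ == "__main__":
    # Sample strings chosen for illustration, not taken from the article.
    for sample in ["AB000001", "U12345", "AB000001.1", "ABC123", ""]:
        print(repr(sample), looks_like_insdc_accession(sample))
```

A format check of this kind is only a first filter before a lookup; whether an identifier resolves to a record still depends on the database itself.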
