| CGG | |
|---|---|
| Name | CGG repeat |
| Type | Trinucleotide repeat |
| Location | 5' untranslated region (commonly) |
| Associated genes | Fragile X mental retardation 1 (FMR1), others |
| Clinical associations | Fragile X syndrome, Fragile X-associated tremor/ataxia syndrome, primary ovarian insufficiency |

CGG denotes a trinucleotide DNA sequence motif, composed of the nucleotides cytosine, guanine, and guanine, that occurs in repetitive arrays within genomes. Arrays of CGG repeats are notable for their propensity to form secondary structures, to alter transcriptional regulation of nearby loci, and to change in length from one generation to the next. Expanded CGG tracts are best known for their roles in human genetic disorders and for their use as tools in molecular biology and genomics research.
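
As an illustration of what a "repetitive array" means at the sequence level, the following minimal Python sketch scans a DNA string for its longest uninterrupted run of CGG units. The function name `longest_cgg_run` and the example sequence are hypothetical and chosen only for illustration; real analyses would also need to handle the reverse-complement strand (CCG) and sequencing errors, which this sketch ignores.

```python
import re
from typing import Tuple


def longest_cgg_run(sequence: str) -> Tuple[int, int]:
    """Return (start_index, repeat_count) of the longest uninterrupted CGG run.

    Returns (-1, 0) when the sequence contains no CGG unit.
    Only the given strand is scanned; the reverse complement is ignored.
    """
    seq = sequence.upper()
    best_start, best_count = -1, 0
    # (?:CGG)+ matches one or more consecutive CGG units with no interruption.
    for match in re.finditer(r"(?:CGG)+", seq):
        count = (match.end() - match.start()) // 3
        if count > best_count:
            best_start, best_count = match.start(), count
    return best_start, best_count


# Made-up example: a five-unit run, an interruption, then a two-unit run.
example = "ATGC" + "CGG" * 5 + "TTA" + "CGG" * 2
print(longest_cgg_run(example))  # (4, 5)
```

The regular expression only finds perfect repeats, so a single interrupting triplet splits one tract into two runs.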
CGG repeat tracts are often located in the 5' untranslated regions or promoter-proximal sequences of genes such as Fragile X mental retardation 1 (FMR1), where they can influence chromatin state and transcriptional activity, as they can at other repeat-associated genes. Repeats show size polymorphism across populations studied by groups at institutions including the National Institutes of Health and the Wellcome Sanger Institute, and by laboratories at Harvard University, Stanford University, and the University of California, San Francisco. Population genetics surveys in cohorts from Iceland, Japan, the United Kingdom, and the United States report variable allele frequency distributions. Molecular mechanisms implicated in repeat instability have been examined in model systems including Saccharomyces cerevisiae, Drosophila melanogaster, Mus musculus, and human cell lines derived at centers such as the Broad Institute.
At the molecular level, CGG repeats can adopt non-B DNA structures such as G-quadruplexes and hairpins, which were characterized by researchers at the Max Planck Society and by structural biology groups at the European Molecular Biology Laboratory. DNA replication and repair proteins, including MRE11, RAD51, FANCD2, and POLG, modulate repeat stability. Epigenetic regulators, including DNA methyltransferase 1, histone modifiers characterized by teams at Cold Spring Harbor Laboratory, and chromatin remodelers such as SWI/SNF, influence transcriptional silencing when repeats expand. Repeat-mediated RNA toxicity involves RNA-binding proteins, including FMRP partners and factors identified in studies from Johns Hopkins University and Columbia University.
Expanded CGG tracts in the 5' region of FMR1, first described by researchers affiliated with the University of California, Davis and Emory University, cause methylation-mediated silencing of the gene, leading to syndromes investigated clinically at the Mayo Clinic and the Cleveland Clinic. Full-mutation expansions are causative for neurological and developmental conditions characterized in cohorts from Mount Sinai Hospital and Boston Children's Hospital. Intermediate and premutation alleles are associated with late-onset neurodegenerative phenotypes, reported by teams at the Veterans Affairs Medical Center, and with primary ovarian insufficiency, described by gynecological endocrinology units at King's College Hospital. Large consortia, including the International Fragile X Consortium and research networks at European Union institutions, have described genotype–phenotype correlations and carried out natural history studies.
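
The allele categories named above can be made concrete with a small classifier. The sketch below is a minimal illustration, not a diagnostic tool. The numeric cutoffs (fewer than 45 repeats normal, 45 to 54 intermediate, 55 to 200 premutation, more than 200 full mutation) are commonly cited clinical conventions and are an assumption here, since the text above does not state them; individual laboratories and guidelines may draw the boundaries slightly differently. The function name `classify_fmr1_allele` is hypothetical.

```python
def classify_fmr1_allele(cgg_repeats: int) -> str:
    """Map a CGG repeat count to an allele category.

    Cutoffs are commonly cited conventions (an assumption, not taken
    from the article text) and may differ between laboratories.
    """
    if cgg_repeats < 0:
        raise ValueError("repeat count cannot be negative")
    if cgg_repeats < 45:
        return "normal"
    if cgg_repeats < 55:
        return "intermediate"
    if cgg_repeats <= 200:
        return "premutation"
    return "full mutation"


# Illustrative repeat counts, one per category.
for n in (30, 50, 90, 450):
    print(n, classify_fmr1_allele(n))
```

Repeat count alone does not determine phenotype: methylation status and mosaicism also matter, which is why sizing is typically paired with the methylation assays described in this article.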
Detection of CGG repeat length employs PCR-based assays refined by biotechnology firms such as Thermo Fisher Scientific, capillary electrophoresis platforms from Applied Biosystems, and Southern blot protocols optimized in clinical genetics laboratories at the Mayo Clinic and at academic centers such as the University of Cambridge. Triplet-primed PCR and long-read sequencing technologies commercialized by Pacific Biosciences and Oxford Nanopore Technologies enable detection of large expansions and of interruptions within the repeat tract, using methods developed at the Wellcome Sanger Institute. Methylation-sensitive assays and bisulfite sequencing, used by laboratories at the National Institutes of Health, assess epigenetic status. High-throughput screening pipelines at the Broad Institute integrate bioinformatic tools and databases such as those maintained by the Genome Reference Consortium.
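
Once a long read spanning the repeat has been obtained, sizing the tract and locating interruptions comes down to simple triplet bookkeeping. The sketch below assumes the tract has already been extracted from the read, is in frame, and is free of sequencing errors; these are strong simplifications. The function name `summarize_tract` is hypothetical, and the use of AGG as the interrupting triplet in the example is an assumption made for illustration rather than something stated in the text above.

```python
from typing import List, Tuple


def summarize_tract(tract: str) -> Tuple[int, List[Tuple[int, str]]]:
    """Return (total_triplets, interruptions) for an in-frame repeat tract.

    Each interruption is (1-based triplet position, triplet sequence)
    for every triplet that is not CGG.
    """
    seq = tract.upper()
    if len(seq) % 3 != 0:
        raise ValueError("tract length must be a multiple of 3")
    triplets = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    interruptions = [
        (position, triplet)
        for position, triplet in enumerate(triplets, start=1)
        if triplet != "CGG"
    ]
    return len(triplets), interruptions


# Made-up tract: three blocks of nine CGG units separated by single AGG triplets.
tract = "CGG" * 9 + "AGG" + "CGG" * 9 + "AGG" + "CGG" * 9
total, interruptions = summarize_tract(tract)
print(total, interruptions)  # 29 [(10, 'AGG'), (20, 'AGG')]
```

A production pipeline would instead align reads to a reference and tolerate indels, but the output shape (a total length plus a list of interruption positions) is the information such assays aim to report.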
Comparative genomics surveys led by groups at the European Molecular Biology Laboratory and the University of Tokyo show that CGG repeat loci vary across vertebrates and invertebrates, with conserved occurrences near orthologs of FMR1 in mammals including Homo sapiens, Mus musculus, Rattus norvegicus, and the primates studied at the Max Planck Institute for Evolutionary Anthropology. Population divergence and mutation rates have been modeled by evolutionary geneticists at Princeton University and the University of Chicago. Studies in plants and fungi, including work at Wageningen University and the University of Melbourne, document species-specific distributions and potential functional consequences for gene regulation.
CGG repeats are employed as tools in synthetic biology and nucleic acid engineering by research groups at the Massachusetts Institute of Technology and the California Institute of Technology, and by industrial labs at Genentech, to probe the effects of repetitive DNA on transcription and genome stability. G-quadruplex-targeting small molecules developed by medicinal chemistry teams at Novartis and their academic collaborators are being tested for their ability to modulate repeat-associated pathology. Reporter constructs containing CGG tracts have been used in screens at the Dana-Farber Cancer Institute to identify modifiers of repeat instability, and CRISPR-based approaches pioneered at MIT and the University of California, Berkeley are applied to model and edit expanded alleles.