IGV — LLMpedia

IGV
Name	IGV
Developer	Broad Institute
Released	2009
Programming language	Java
Operating system	Windows, macOS, Linux
License	Open source

Contents

Overview
Features and Functionality
File Formats and Data Sources
Usage and Interface
Development and Licensing
Applications and Case Studies

IGV (Integrative Genomics Viewer) is a high-performance desktop genome browser for interactive exploration of large-scale genomic data. It visualizes sequence alignments, variant calls, copy-number data, and functional genomics tracks alongside annotated reference sequences, so researchers can inspect raw reads, variant calls, and annotations in a unified view. The tool integrates with widely used genomic resources and analysis pipelines to support clinical genomics, cancer research, population genetics, and functional studies.

Overview

The browser was developed at the Broad Institute to address visualization needs arising from high-throughput sequencing platforms used at institutions such as the Broad Institute, Wellcome Sanger Institute, and National Institutes of Health, and in commercial laboratories such as those of Illumina and Thermo Fisher Scientific. It runs on desktop platforms that support Java, which gives it cross-platform coverage of Windows, macOS, and Linux. IGV displays reference genomes and annotations from projects and resources such as the Genome Reference Consortium, Ensembl, UCSC Genome Browser, and GENCODE, as well as data from large-scale efforts such as the 1000 Genomes Project and The Cancer Genome Atlas. The software is commonly used alongside aligners, variant callers, and related utilities such as BWA, Bowtie 2, STAR, SAMtools, GATK, and FreeBayes.

Features and Functionality

IGV displays aligned reads produced by aligners such as BWA, Bowtie 2, and STAR, as well as variant calls from callers such as GATK and FreeBayes. Key features include multiresolution tiling for efficient navigation of large genomes, color-coding of reads by strand or pair orientation (a convention used in pipelines at sequencing centers such as the Broad Institute), and interactive filtering by allele-frequency threshold, which is useful in clinical review work such as ClinVar curation. The browser provides visualization modes for structural variation similar to representations used in studies from The Cancer Genome Atlas, and copy-number plots compatible with outputs from tools such as Control-FREEC and CNVkit. Integration with genome annotation sets such as GENCODE, RefSeq, and Ensembl enables display of gene models, transcripts, and regulatory features referenced in publications from groups such as ENCODE and the Roadmap Epigenomics Consortium.
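The idea behind multiresolution tiling can be illustrated with a minimal sketch. This is not IGV's actual implementation: the function names `build_pyramid` and `pick_level`, the averaging scheme, and the zoom factor of 2 are all assumptions chosen for illustration. The sketch precomputes progressively coarser summaries of a per-base signal and, for a given zoom, picks the coarsest level whose bins are no wider than one screen pixel.

```python
def build_pyramid(values, levels=3, factor=2):
    """Build summary levels for a per-base signal.

    Level 0 holds the raw values; each further level averages `factor`
    adjacent bins of the previous one. (Hypothetical scheme, for illustration.)
    """
    pyramid = [list(values)]
    for _ in range(levels):
        prev = pyramid[-1]
        if len(prev) <= 1:
            break
        nxt = []
        for i in range(0, len(prev), factor):
            chunk = prev[i:i + factor]
            nxt.append(sum(chunk) / len(chunk))
        pyramid.append(nxt)
    return pyramid


def pick_level(pyramid, bases_per_pixel, factor=2):
    """Return the coarsest level whose bin width (factor**level bases)
    does not exceed the number of bases drawn per pixel."""
    level = 0
    while level + 1 < len(pyramid) and factor ** (level + 1) <= bases_per_pixel:
        level += 1
    return level


signal = [1, 3, 5, 7, 9, 11, 13, 15]
pyramid = build_pyramid(signal)
zoomed_out = pyramid[pick_level(pyramid, bases_per_pixel=4)]
```

When the view is zoomed out, the renderer reads only the short, coarse list instead of every base, which is why navigation stays responsive on large genomes.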

File Formats and Data Sources

IGV reads standard bioinformatics formats created by widely adopted tools and consortia. These include the alignment formats SAM and BAM, together with BAI index files. Variant data may be loaded from VCF (Variant Call Format), annotations from GTF/GFF3, and expression or signal tracks from BED, WIG, BigWig, and BigBed. The program can stream data from remote servers, including resources offered by projects such as the UCSC Genome Browser and cloud storage such as the Amazon Web Services buckets used by large initiatives like the 1000 Genomes Project and The Cancer Genome Atlas. IGV supports custom track hubs and integration with annotation hubs published by groups including Ensembl and the UCSC Genome Browser, enabling researchers to combine datasets from consortia such as GTEx and ENCODE.
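As a rough illustration of how these formats group together, the following sketch classifies a file by its extension and lists the conventional index file names that usually accompany a BAM file. The category labels and the helper names `classify` and `bam_index_candidates` are hypothetical and are not part of IGV; the extension table covers only the formats named above.

```python
from pathlib import PurePosixPath

# Hypothetical grouping of the formats named in the text.
TRACK_KINDS = {
    ".sam": "alignment", ".bam": "alignment",
    ".vcf": "variant",
    ".gtf": "annotation", ".gff3": "annotation",
    ".bed": "interval", ".bb": "interval", ".bigbed": "interval",
    ".wig": "signal", ".bw": "signal", ".bigwig": "signal",
}


def classify(path):
    """Guess the track kind from the file extension, ignoring a trailing .gz."""
    name = PurePosixPath(path).name.lower()
    if name.endswith(".gz"):
        name = name[:-3]
    return TRACK_KINDS.get(PurePosixPath(name).suffix, "unknown")


def bam_index_candidates(path):
    """Return the two common BAI index names for a BAM path:
    sample.bam.bai and sample.bai. Non-BAM paths yield an empty list."""
    p = str(path)
    if not p.lower().endswith(".bam"):
        return []
    return [p + ".bai", p[:-4] + ".bai"]
```

For example, `classify("calls.vcf.gz")` returns `"variant"`, and `bam_index_candidates("data/sample.bam")` returns `["data/sample.bam.bai", "data/sample.bai"]`. Checking for one of these index files before loading avoids a failed load, since indexed random access is what lets a browser fetch only the reads in the visible region.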

Usage and Interface

The user interface implements a multi-track, scroll-and-zoom paradigm, with context provided by gene models from sources such as GENCODE and RefSeq. Users commonly load local files produced by pipelines built on BWA and GATK, or access remote data hosted by projects such as the 1000 Genomes Project and The Cancer Genome Atlas. Session management allows views to be saved and shared, consistent with reproducible workflows used in labs at institutions such as the Broad Institute and Wellcome Sanger Institute. IGV supports scripting and batch operations through command-line options and API hooks that facilitate integration with workflow managers such as Snakemake, Nextflow, and Cromwell. Its visualization conventions for read pileups, variant evidence, and structural rearrangements mirror those used in clinical reports and in journals such as Nature, Science, and Cell.
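Batch operation can be sketched as follows. IGV accepts plain-text batch scripts made of commands such as `new`, `genome`, `load`, `snapshotDirectory`, `goto`, `snapshot`, and `exit`; the Python helper below, `build_batch_script`, is a hypothetical convenience for generating such a script from a workflow step, and the file names, region name, and locus are placeholders rather than real data.

```python
def build_batch_script(genome, tracks, loci, snapshot_dir):
    """Assemble an IGV batch script as a string.

    genome       -- genome identifier, e.g. "hg38"
    tracks       -- list of file paths or URLs to load
    loci         -- list of (snapshot_name, locus) pairs
    snapshot_dir -- directory where PNG snapshots are written
    """
    lines = ["new", f"genome {genome}"]
    lines += [f"load {track}" for track in tracks]
    lines.append(f"snapshotDirectory {snapshot_dir}")
    for name, locus in loci:
        lines.append(f"goto {locus}")
        lines.append(f"snapshot {name}.png")
    lines.append("exit")
    return "\n".join(lines) + "\n"


script = build_batch_script(
    genome="hg38",
    tracks=["sample.bam"],               # placeholder file name
    loci=[("region1", "chr1:1000-2000")],  # placeholder locus
    snapshot_dir="snapshots",
)
```

The resulting text can be written to a file and passed to IGV's batch option on the command line, which lets a workflow manager produce one snapshot per candidate locus without manual navigation. The exact launcher name and flag depend on the installed IGV version, so they are left out here.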

Development and Licensing

Primary development is led by teams at the Broad Institute, with contributions from academic groups and industry partners, including contributors familiar with GA4GH standards and the data-sharing policies of NIH programs. The software is implemented in Java and distributed under an open-source license that allows modification and redistribution, consistent with practices at organizations such as the Open Bioinformatics Foundation. Releases are managed through channels common to bioinformatics projects, such as GitHub, and binaries are provided for Windows, macOS, and Linux. Development follows community-driven feature requests and issue tracking, comparable to other open-source bioinformatics projects such as Bioconductor and Galaxy.

Applications and Case Studies

IGV is widely used in cancer genomics studies from initiatives like The Cancer Genome Atlas and cohort projects such as UK Biobank to validate somatic mutations and structural variants identified by callers like Mutect2 and Strelka. Clinical genetics laboratories referencing databases such as ClinVar, OMIM, and HGMD use IGV to inspect candidate variants for diagnostic reporting. Population-genetics groups analyzing datasets from the 1000 Genomes Project, HapMap Project, and gnomAD use IGV to examine haplotypes and local read evidence. Functional genomics consortia including ENCODE and the Roadmap Epigenomics Consortium employ the browser to visualize ChIP-seq, DNase-seq, and RNA-seq signal tracks. Case studies in publications from institutions like Broad Institute, Wellcome Sanger Institute, Dana-Farber Cancer Institute, and Stanford University demonstrate IGV’s role in supporting discovery and validation across translational, clinical, and basic research projects.
