Generated by GPT-5-mini

| LASTZ | |
|---|---|
| Name | LASTZ |
| Developer | Robert S. Harris, Miller Lab, Pennsylvania State University |
| Latest release | 1.04.00 |
| Operating system | Unix-like, macOS |
| License | MIT License |
LASTZ is a pairwise DNA sequence alignment program widely used for aligning genomic sequences such as chromosomes, scaffolds, and contigs. It was developed as a successor to BLASTZ to support whole-genome comparisons and synteny detection between assemblies, including those produced by large sequencing centers such as the Wellcome Trust Sanger Institute and the Broad Institute. LASTZ is optimized for long DNA sequences and is commonly integrated into pipelines alongside tools from the University of California, Santa Cruz, the National Center for Biotechnology Information, and the European Bioinformatics Institute.
LASTZ was created to address challenges in whole-genome alignment that arose as large sequencing efforts began producing chromosome-scale assemblies: the input sequences are very long, and related regions in different species can be substantially diverged. It is frequently used with resources such as the UCSC Genome Browser, Ensembl, and FlyBase, producing pairwise alignments that feed into comparative annotation.
LASTZ implements a seed-and-extend approach derived from BLASTZ and, before it, BLAST. Short seed matches between the two sequences are located first, each seed is extended without gaps into a high-scoring segment pair (HSP), and HSPs that score above a threshold are then extended into gapped alignments. Gaps are scored with affine penalties, meaning a fixed gap-open cost plus a per-base gap-extend cost. Features include handling of repeat-masked input (for example, sequences masked against libraries such as Repbase and Dfam), an optional chaining step, compatibility with the chain and net post-processing used by the UCSC Genome Browser, and options for handling ambiguous bases, which occur in assemblies built from Illumina, Pacific Biosciences, and Oxford Nanopore data. LASTZ also offers tunable nucleotide scoring matrices, comparable in role to the substitution scores used in FASTA and Smith–Waterman implementations.
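The seed-and-extend idea can be illustrated with a deliberately small sketch. This is not LASTZ's implementation: it assumes contiguous exact-match seeds rather than spaced seeds, toy match and mismatch scores rather than a nucleotide substitution matrix, and it stops at ungapped extension. The function names (`find_seeds`, `extend_one_way`, `extend_seed`) and the `xdrop` default are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Toy scores for illustration only; these are NOT LASTZ's default scores.
MATCH, MISMATCH = 1, -1


def find_seeds(target, query, k):
    """Return (target_pos, query_pos) pairs where a k-mer matches exactly."""
    index = defaultdict(list)
    for i in range(len(target) - k + 1):
        index[target[i:i + k]].append(i)
    hits = []
    for j in range(len(query) - k + 1):
        for i in index.get(query[j:j + k], []):
            hits.append((i, j))
    return hits


def extend_one_way(target, query, i, j, step, xdrop):
    """Extend without gaps from (i, j) in direction step (+1 or -1).

    Returns (best_score, best_length). Extension stops once the running
    score falls more than xdrop below the best score seen so far.
    """
    score = best = 0
    length = best_len = 0
    while 0 <= i < len(target) and 0 <= j < len(query):
        score += MATCH if target[i] == query[j] else MISMATCH
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > xdrop:
            break
        i += step
        j += step
    return best, best_len


def extend_seed(target, query, i, j, k, xdrop=3):
    """Turn a seed hit into an ungapped HSP.

    Returns (target_start, query_start, length, score).
    """
    r_score, r_len = extend_one_way(target, query, i + k, j + k, +1, xdrop)
    l_score, l_len = extend_one_way(target, query, i - 1, j - 1, -1, xdrop)
    score = k * MATCH + l_score + r_score
    return (i - l_len, j - l_len, k + l_len + r_len, score)


target = "GGGACGTACGTCCC"
query = "TTTACGTACGTAAA"
seeds = find_seeds(target, query, 6)
hsps = {extend_seed(target, query, i, j, 6) for i, j in seeds}
print(hsps)  # {(3, 3, 8, 8)}
```

In this example three overlapping seeds all extend to the same eight-base segment, which is why the results are collected in a set. A real aligner would go on to run gapped extension from such segments, which this sketch leaves out.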
LASTZ is distributed as a command-line tool, typically invoked on Unix-like systems such as Linux and macOS, on anything from a single workstation to a compute cluster. It takes a target sequence and a query sequence, usually as FASTA files, and is often run inside workflow systems such as Galaxy or Nextflow. Options include seed pattern specification (including spaced seeds of the kind popularized by PatternHunter), gap-open and gap-extend penalties, and choice of output format; input sequences are commonly drawn from archives such as GenBank, RefSeq, and ENA. Integration examples include running LASTZ before chaining with tools from UCSC, or feeding its pairwise output into multiple sequence alignment programs.
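As a rough illustration of a typical invocation, the sketch below assembles a LASTZ command line from Python without running it. The option names (`--seed`, `--gap`, `--format`, `--output`) and the values shown are given as they are commonly documented for LASTZ, but they should be treated as assumptions and checked against `lastz --help` for the installed version. The helper `build_lastz_command` and the file names are hypothetical.

```python
import shlex


def build_lastz_command(target, query, output, fmt="maf",
                        gap_open=400, gap_extend=30, seed="12of19"):
    """Assemble an argument list for a pairwise LASTZ run.

    Option names and values are assumptions based on commonly cited
    LASTZ usage; verify them against the installed version.
    """
    return [
        "lastz", target, query,
        f"--seed={seed}",                    # seed pattern
        f"--gap={gap_open},{gap_extend}",    # affine gap penalties
        f"--format={fmt}",                   # output format
        f"--output={output}",                # output file
    ]


cmd = build_lastz_command("target.fa", "query.fa", "out.maf")
print(shlex.join(cmd))
# To actually run it (requires lastz on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces, and makes it straightforward to hand the same arguments to a workflow system.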
Performance assessments of LASTZ have been reported in comparative studies alongside programs such as BLASTZ, BLAST, BWA, Bowtie, and minimap2. Benchmarks often measure throughput on hardware from vendors such as Intel, AMD, and IBM and on clusters managed with schedulers such as SLURM or SGE. LASTZ is reported to scale favorably for long, highly similar sequence pairs of the kind encountered in whole-genome efforts such as the Vertebrate Genomes Project, while dedicated short-read mappers such as BWA and Bowtie are better suited to high-volume short-read alignment. Comparative accuracy evaluations reference alignment validations performed by groups including the Genome Reference Consortium, the Mouse Genome Project, and the International HapMap Project.
LASTZ has been applied in a wide range of genomics projects, including whole-genome alignments of the human genome against other species, comparative assemblies in the Vertebrate Genomes Project, and synteny analyses for Drosophila research at FlyBase and for Arabidopsis research at The Arabidopsis Information Resource. It supports evolutionary studies, phylogenomic analyses, and structural variation detection workflows. LASTZ output is routinely incorporated into genome browsers and annotation platforms maintained by groups such as the UCSC Genome Browser Group and Ensembl, where it underpins comparative annotation tracks.
LASTZ was developed by Robert S. Harris in Webb Miller's laboratory at Pennsylvania State University, building on BLASTZ from the same group. The software is implemented in C and distributed under the MIT License, with source code hosted on GitHub, in line with the distribution model of many open-source bioinformatics projects. Community engagement and maintenance are coordinated through the project's developer and user community.
Category:Bioinformatics software