PAML — LLMpedia

PAML
Name	PAML
Author	Ziheng Yang
Released	1989
Latest release	4.10.0
Operating system	Cross-platform
License	Academic

Contents

Overview
Methods and Models
Implementation and Features
Applications
Performance and Validation
History and Development

PAML (Phylogenetic Analysis by Maximum Likelihood) is a software package for phylogenetic analysis using maximum likelihood methods. It is widely used in evolutionary biology, molecular systematics, and comparative genomics by researchers at institutions such as University of Oxford, University of Cambridge, Harvard University, Stanford University, Massachusetts Institute of Technology and University of California, Berkeley. PAML has influenced work connected to projects at the Wellcome Trust Sanger Institute, Broad Institute, and European Molecular Biology Laboratory, and analyses of datasets from GenBank, Ensembl, and UniProt.

Overview

PAML provides tools for estimating evolutionary parameters and testing hypotheses about molecular sequence evolution across taxa such as Homo sapiens, Mus musculus, Drosophila melanogaster, Saccharomyces cerevisiae, and Arabidopsis thaliana. The package supports analyses employed in studies published in journals such as Nature, Science, Proceedings of the National Academy of Sciences, Molecular Biology and Evolution, and Systematic Biology. It complements other phylogenetics software, including MEGA, RAxML, MrBayes, BEAST, IQ-TREE, PhyML and FastTree.

Methods and Models

PAML implements maximum likelihood estimation for substitution models introduced or used in work by researchers such as Joseph Felsenstein, Ziheng Yang, Masatoshi Nei, Walter Fitch, and Motoo Kimura. It includes nucleotide, codon, and amino-acid replacement models, building on the Jukes–Cantor model, Kimura 2-parameter model, Hasegawa–Kishino–Yano model, and General Time Reversible model, and on empirical matrices such as the Dayhoff, JTT, WAG, and LG matrices. PAML offers branch models, site models, and branch-site models, which are compared using likelihood ratio tests; these relate to methods developed by authors from groups at University College London, Yale University, Columbia University and University of Chicago. The package supports analysis of positive selection using approaches related to the work of Rasmus Nielsen, Martha Kaplan, and Douglas Futuyma, and to methods referenced alongside studies by Edward Blyth and Charles Darwin in historical context.
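The likelihood ratio tests mentioned above can be illustrated with a short sketch. It assumes two nested models whose maximized log-likelihoods have already been obtained, and that the test statistic is compared against a chi-square distribution with degrees of freedom equal to the difference in the number of free parameters. The function names and the log-likelihood values are hypothetical and are not taken from PAML or from any real analysis. For some tests, where a parameter sits on the boundary of its allowed range, the plain chi-square approximation may not be appropriate.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form base cases: df = 2 (exponential tail) and df = 1 (normal tail).
    if df % 2 == 0:
        q, k = math.exp(-half), 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1
    # Recurrence: Q(k + 2, x) = Q(k, x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += math.exp((k / 2.0) * math.log(half) - half - math.lgamma(k / 2.0 + 1.0))
        k += 2
    return q


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (statistic, p_value) for nested models given their log-likelihoods."""
    # Numerical noise can make the statistic slightly negative; clamp at zero.
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods for a null and an alternative model
# that differ by two free parameters.
statistic, p_value = likelihood_ratio_test(-1000.0, -997.0, df=2)
print(f"2*delta(lnL) = {statistic:.2f}, p = {p_value:.4f}")
```

With these made-up values the statistic is 6.0 and the p-value is just under 0.05, so the null model would be rejected at the 5% level.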

Implementation and Features

Implemented in C, PAML provides command-line executables driven by configurable control files, echoing practices used in projects at Los Alamos National Laboratory and Sandia National Laboratories. Features include parameter optimization with quasi-Newton algorithms in the family of the Davidon–Fletcher–Powell and Broyden–Fletcher–Goldfarb–Shanno (BFGS) methods, numerical differentiation techniques associated with work by Richard Brent, and bootstrapping approaches popularized in analyses at Cold Spring Harbor Laboratory. PAML reads sequence alignments in a PHYLIP-style format, so alignments produced by tools such as Clustal, MUSCLE, and MAFFT generally need to be exported or converted to that format. Trees are supplied in Newick format, which can also be read and displayed by viewers such as FigTree and Dendroscope. It interoperates with pipelines employing tools from Bioconductor, Galaxy, Nextflow, and Snakemake.
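As an illustration of the control-file workflow, the following sketch builds the text of a control file for the codeml program from a Python dictionary. The option names (seqfile, treefile, outfile, seqtype, model, NSsites, and so on) are the ones used in the sample control file shipped with codeml as best recalled here, and should be checked against the documentation for the installed version. The file names and the helper function are hypothetical, and the script only produces the text; it does not run PAML.

```python
def render_control_file(options):
    """Render option/value pairs as 'name = value' lines, one option per line."""
    lines = []
    for name, value in options.items():
        # A list or tuple is written as space-separated values on one line.
        if isinstance(value, (list, tuple)):
            value = " ".join(str(v) for v in value)
        lines.append(f"{name:>10} = {value}")
    return "\n".join(lines) + "\n"


# Assumed settings for a codon-based site-model analysis.
# All file names below are hypothetical placeholders.
site_model_options = {
    "seqfile": "example.phy",     # PHYLIP-style codon alignment
    "treefile": "example.nwk",    # Newick tree
    "outfile": "results.txt",     # main output file
    "seqtype": 1,                 # 1 = codon sequences
    "model": 0,                   # 0 = one omega ratio across branches
    "NSsites": [0, 1, 2],         # site models to fit in one run
    "fix_omega": 0,               # 0 = estimate omega
    "omega": 0.5,                 # starting value for omega
    "cleandata": 0,               # 0 = keep sites with gaps or ambiguity
}

control_text = render_control_file(site_model_options)
print(control_text)
```

The resulting text would typically be saved to a file such as codeml.ctl and passed to the codeml executable on the command line. Keeping the options in a dictionary makes it easy to generate one control file per gene in a Snakemake or Nextflow pipeline.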

Applications

Researchers apply PAML to infer selection in genes from studies involving taxa represented in efforts like the 1000 Genomes Project, Human Genome Project, ENCODE Project, 1000 Fungal Genomes Project, and comparative studies involving Neanderthal and Denisovan sequences. Use cases include detecting adaptive evolution in immune-related loci studied by investigators at the National Institutes of Health, tracing mitochondrial evolution investigated at Scripps Research, and assessing viral evolution in contexts such as outbreaks monitored by the Centers for Disease Control and Prevention, the World Health Organization and public health labs at Johns Hopkins University. PAML has been cited in conservation genetics work concerning species assessed by the International Union for Conservation of Nature and in evolutionary developmental biology investigations from groups at the Max Planck Society and Howard Hughes Medical Institute.

Performance and Validation

Performance evaluations commonly compare PAML to tools such as PhyML, RAxML, IQ-TREE, MrBayes, BEAST 2, and FastTree using benchmark datasets from TreeBASE, PANDIT, and curated alignments from HOMSTRAD. Validation uses simulated datasets generated with programs like Seq-Gen and INDELible, in which the true parameter values are known and can be compared with the estimates. It also draws on statistical frameworks developed by Ronald Fisher, Jerzy Neyman, and Egon Pearson, and on tests popularized in applied work by statistical genetics groups at the Broad Institute and Wellcome Trust Sanger Institute. Computational profiling often references high-performance computing centers such as XSEDE, PRACE and cluster systems at Argonne National Laboratory.
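The simulate-then-estimate idea behind such validation can be sketched in a few lines. The example below does not use Seq-Gen, INDELible, or PAML. As a stand-in, it simulates a pair of sequences under the Jukes–Cantor model at a known distance and checks that the standard Jukes–Cantor distance formula recovers a value close to it. The function names, the seed, and the chosen distance are all assumptions made for illustration.

```python
import math
import random

BASES = "ACGT"


def simulate_pair_jc69(n_sites, distance, rng):
    """Simulate two aligned sequences separated by `distance` expected
    substitutions per site under the Jukes-Cantor model."""
    # Probability that a site differs between the two sequences under JC69.
    p_diff = 0.75 * (1.0 - math.exp(-4.0 * distance / 3.0))
    seq_a, seq_b = [], []
    for _ in range(n_sites):
        a = rng.choice(BASES)
        if rng.random() < p_diff:
            # Each of the three other bases is equally likely.
            b = rng.choice([x for x in BASES if x != a])
        else:
            b = a
        seq_a.append(a)
        seq_b.append(b)
    return "".join(seq_a), "".join(seq_b)


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor distance estimate from the proportion of differing sites."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned and non-empty")
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        return math.inf  # saturated: the formula is undefined
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


rng = random.Random(1)  # fixed seed so the run is repeatable
true_distance = 0.3
seq_a, seq_b = simulate_pair_jc69(50_000, true_distance, rng)
estimate = jc69_distance(seq_a, seq_b)
print(f"true = {true_distance}, estimated = {estimate:.3f}")
```

Repeating the simulation many times with different seeds gives the sampling distribution of the estimate, which is how bias and variance are usually assessed in studies of this kind.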

History and Development

PAML originated with work by Ziheng Yang while affiliated with institutions including University College London and later developed with contributions from collaborators at University of Michigan, University of Sydney, Peking University, and Chinese Academy of Sciences. Development milestones parallel advances in molecular phylogenetics chronicled alongside initiatives at European Bioinformatics Institute and periods associated with conferences like the Society for Molecular Biology and Evolution annual meetings. The software’s evolution reflects trends in computational biology seen in projects led by Michael Waterman, Temple F. Smith, Eugene Koonin, and Carl Woese.

Category:Phylogenetics software