
Linux perf

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Linux perf
Name: perf
Developer: Linux kernel community
Released: 2009 (Linux 2.6.31)
Programming language: C
Operating system: Linux
License: GNU General Public License, version 2

Linux perf is a performance analysis tool maintained in the Linux kernel source tree that provides counting, sampling, tracing, and profiling capabilities for systems and applications. It is used by developers, systems engineers, and researchers working on compilers such as the GNU Compiler Collection, libraries such as glibc, and kernel subsystems to diagnose bottlenecks across CPU, memory, I/O, and scheduling layers. Originating from work by kernel developers, with contributions from organizations such as Red Hat, Intel, and Google, the tool complements other Linux tracing and observability facilities such as ftrace and eBPF.

Overview

perf presents a unified user-space front end to the kernel performance monitoring features exposed by the perf_events subsystem. It provides common access to hardware performance counters implemented by vendors such as Intel, AMD, and Arm, as well as to software events provided by kernel instrumentation. perf reads symbol tables and debug information, the same data consumed by debuggers like GDB, to connect runtime behavior with source and symbol information. Maintained in the Linux kernel tree (under tools/perf) and packaged by distributions including Debian, Ubuntu, Fedora, and SUSE Linux Enterprise, perf serves both interactive ad hoc profiling and automated performance regression workflows.

Features and Architecture

perf is built around the kernel's perf_events API, which exposes PMU (Performance Monitoring Unit) counters, tracepoints, software events, and dynamic probes. The tool implements features such as event multiplexing, sampling, call-graph capture using frame pointers or DWARF-based unwinding, per-thread and per-CPU aggregation, and event filtering. Components interact with subsystems such as cgroups, the CPU scheduler, and procfs for process metadata. Integration points include symbol resolution using build IDs from ELF binaries and source attribution via debug information emitted by compilers such as GCC and LLVM and read with tools such as GNU Binutils.
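Event multiplexing means a counter may occupy the PMU for only part of the measurement window. The perf_events interface can report how long an event was enabled and how long it was actually running, and a common estimate scales the raw count by the ratio of the two. The following is a minimal sketch of that arithmetic; the function names and the numbers in the usage note are hypothetical and are not part of perf.

```python
def scale_multiplexed_count(raw_count, time_enabled_ns, time_running_ns):
    """Estimate the count over the whole enabled window for a multiplexed event.

    Assumes the event rate while the counter was off the PMU matches the
    rate while it was on it, which is only an approximation.
    """
    if time_running_ns == 0:
        return None  # the event was never scheduled, so there is nothing to scale
    return raw_count * time_enabled_ns / time_running_ns


def running_fraction(time_enabled_ns, time_running_ns):
    """Fraction of the enabled window during which the counter really ran."""
    if time_enabled_ns == 0:
        return 0.0
    return time_running_ns / time_enabled_ns
```

For example, a counter that read 1000 while running for 100 ns of a 200 ns enabled window would be estimated at 2000.0, with a running fraction of 0.5. A low running fraction suggests the estimate should be treated with more caution.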

Common Commands and Usage

Typical workflows use perf record to capture samples into a data file, perf report to analyze the sampled data, and perf stat to collect aggregated counts. Users frequently combine perf top for live sampling with perf annotate, which maps samples onto disassembly (by default produced with objdump from GNU Binutils) and, where debug information is available, onto source lines. perf bench runs built-in microbenchmarks, which can support regression testing, and perf script dumps recorded events or passes them to scripting engines for custom processing. Access is governed by kernel interfaces and distribution policy: administrators can restrict or relax unprivileged use through the kernel.perf_event_paranoid sysctl, capabilities such as CAP_PERFMON on kernels that support it, and Linux Security Module policies.
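The record, report, and stat workflow can be sketched as a few helpers that assemble argument lists suitable for subprocess.run. This is an illustrative sketch rather than a recommended wrapper: the helper names, the default sampling frequency of 99 Hz, and the ./server target in the usage note are assumptions, while the flags used (-e, -F, --call-graph, -o, -i, --stdio) are standard perf options.

```python
import shlex


def perf_stat_cmd(target, events=("cycles", "instructions")):
    """Build a `perf stat` command that counts the given events for target."""
    return ["perf", "stat", "-e", ",".join(events), "--", *target]


def perf_record_cmd(target, freq_hz=99, call_graph="fp", output="perf.data"):
    """Build a `perf record` command that samples target with call graphs."""
    if call_graph not in ("fp", "dwarf", "lbr"):
        raise ValueError(f"unsupported call-graph mode: {call_graph}")
    return ["perf", "record", "-F", str(freq_hz),
            "--call-graph", call_graph, "-o", output, "--", *target]


def perf_report_cmd(input_file="perf.data"):
    """Build a `perf report` command that prints a text report to stdout."""
    return ["perf", "report", "-i", input_file, "--stdio"]


def as_shell(cmd):
    """Render an argument list as a shell-quoted string for logging."""
    return shlex.join(cmd)
```

Calling as_shell(perf_record_cmd(["./server", "--port", "8080"])) yields a command line that can be pasted into a terminal. Passing the list itself to subprocess.run avoids shell quoting issues altogether.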

Performance Events and Counters

perf exposes hardware events such as CPU cycles, instructions retired, cache references and misses, branch instructions, and branch misses, which are implemented by microarchitectures from Intel, AMD, and Arm. Software events include context switches, CPU migrations, page faults, and clock-based events such as cpu-clock and task-clock. The subsystem supports programmable (raw) events, event groups for correlated measurements, statistical sampling at either a fixed event period or a target frequency, and counter overflow handling. Vendors supply event encodings and libraries; for example, Intel publishes architecture-specific event documentation and tools that interoperate with perf.
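Counts such as these are usually combined into ratios like instructions per cycle or the branch miss rate. The sketch below parses the machine-readable output of perf stat -x, and derives those ratios. It assumes the field order value, unit, event name, followed by further fields, which can vary with perf version and options; event names may also carry modifiers (for example cycles:u) that this sketch does not normalize. The sample text is made up for illustration and is not real measurement data.

```python
import csv
import io

# Made-up sample in the assumed `perf stat -x,` layout.
SAMPLE = """\
1000000,,cycles,500000,100.00,,
1500000,,instructions,500000,100.00,1.50,insn per cycle
20000,,branches,500000,100.00,,
400,,branch-misses,500000,100.00,2.00,of all branches
<not supported>,,cache-misses,0,100.00,,
"""


def parse_perf_stat_csv(text):
    """Map event name -> count, using None for uncounted or unsupported events."""
    counts = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3 or row[0].startswith("#"):
            continue  # skip blank lines and comments
        value, event = row[0], row[2]
        try:
            counts[event] = float(value)
        except ValueError:
            counts[event] = None  # e.g. "<not counted>" or "<not supported>"
    return counts


def ratio(counts, numerator, denominator):
    """Return numerator/denominator, or None if either count is unusable."""
    num, den = counts.get(numerator), counts.get(denominator)
    if num is None or not den:
        return None
    return num / den
```

With the sample above, ratio(counts, "instructions", "cycles") gives the instructions-per-cycle figure and ratio(counts, "branch-misses", "branches") gives the branch miss rate, while ratios involving the unsupported cache-misses event come back as None.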

Tracepoints, Probes, and Scripting

Beyond counters, perf can record static kernel tracepoints and dynamic probes: kprobes for kernel instrumentation and uprobes for user-space instrumentation, typically defined with perf probe. These mechanisms share infrastructure with tracing frameworks such as ftrace and with eBPF, which can augment or replace perf in complex scenarios. perf script exports recorded events as text for further analysis and embeds Perl and Python scripting engines, so recorded data can be post-processed with Perl or Python handlers or piped to text tools such as awk in automated pipelines.
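As an illustration of post-processing with the Python scripting engine, the sketch below counts samples per command and symbol. It assumes the generic handler convention used by the sample scripts shipped with perf, in which perf script calls process_event with a dictionary per sample and trace_end once at the end; the dictionary keys used here ("comm" and "symbol") are taken from those samples and may differ between perf versions. The file name in the usage note is hypothetical.

```python
from collections import Counter

# (command name, symbol) -> number of samples seen
sample_counts = Counter()


def trace_begin():
    """Called once before the first event; start from a clean slate."""
    sample_counts.clear()


def process_event(param_dict):
    """Called for each sample; tally it by command and resolved symbol."""
    comm = param_dict.get("comm", "?")
    # Samples without symbol information carry no "symbol" key.
    symbol = param_dict.get("symbol", "[unknown]")
    sample_counts[(comm, symbol)] += 1


def trace_end():
    """Called once after the last event; print the ten hottest entries."""
    for (comm, symbol), count in sample_counts.most_common(10):
        print(f"{count:8d}  {comm:<16}  {symbol}")
```

Saved as, say, top_symbols.py, a script like this would be run against a recording with perf script -i perf.data -s top_symbols.py, provided the installed perf was built with Python scripting support.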

Use Cases and Examples

perf is applied for CPU hotspot analysis in large-scale services run by companies like Google and Netflix; diagnosing kernel lock contention on distributions from vendors such as Canonical and SUSE; tuning database workloads such as Oracle Database and PostgreSQL; and validating microarchitecture behavior on platforms from Arm and Intel. Example uses include profiling a web server compiled with GCC to reduce tail latency, analyzing page-fault storms in virtualized guests on KVM, and measuring branch misprediction costs on embedded devices using toolchains from Linaro.

Limitations and Alternatives

perf depends on kernel support for the perf_events API and on hardware PMUs; older kernels, restricted containers, or disabled counters limit functionality. Call-chain collection can be incomplete for binaries built without frame pointers or stripped of unwind information. When perf's overhead is too high or more complex dynamic analysis is needed, alternatives and complements include eBPF-based tools such as bcc and bpftrace, profilers like gprof and Google Performance Tools (gperftools), system-wide tracers like SystemTap, and vendor-specific profilers like Intel VTune. Choice among tools involves trade-offs in overhead, portability, observability depth, and integration with build and deployment pipelines managed by organizations such as Canonical, Red Hat, and cloud providers.
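One common source of restricted functionality is the kernel.perf_event_paranoid sysctl, which limits what users without extra privileges may measure. The sketch below reads the setting from procfs and lists the restrictions implied by each level, following the thresholds described in the kernel's sysctl documentation. The helper names are hypothetical, and some distributions patch in stricter levels that this sketch does not model.

```python
from pathlib import Path

PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def read_paranoid(path=PARANOID_PATH):
    """Return the current paranoid level, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None  # not Linux, procfs hidden (e.g. in a container), or bad data


def unprivileged_restrictions(level):
    """List what an unprivileged user is denied at the given level."""
    restrictions = []
    if level >= 0:
        restrictions.append("raw tracepoint and ftrace function tracepoint access")
    if level >= 1:
        restrictions.append("CPU-wide event access")
    if level >= 2:
        restrictions.append("kernel profiling")
    return restrictions
```

A level of -1 yields an empty list, while a level of 2 yields all three restrictions. A None result from read_paranoid is itself a hint that perf may be unavailable in the current environment.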
