Generated by DeepSeek V3.2

| Perf (Linux) | |
|---|---|
| Name | Perf |
| Developer | Linux kernel developers |
| Programming language | C |
| Operating system | Linux |
| Genre | Profiling, performance analysis |
| License | GNU General Public License, version 2 |
Perf is a performance analysis tool for Linux, built on the kernel's perf_events subsystem, that provides a unified interface to hardware and software performance counters. It supports profiling and tracing of both system and application behavior, which helps with optimization and debugging. The tool is developed in the Linux kernel source tree and is maintained by kernel developers including Ingo Molnar and Arnaldo Carvalho de Melo.
Perf emerged from the integration of the Performance Counters for Linux (PCL) project into the mainline Linux kernel in version 2.6.31 (2009), and it has largely superseded the older OProfile system. Its development was championed by Ingo Molnar and involved contributions from numerous Linux kernel developers and from organizations such as Red Hat and Intel. The tool relies on the Performance Monitoring Unit (PMU) found in modern CPUs, such as those from Intel and AMD, to collect low-level performance data. It is a core component of the performance analysis ecosystem on Linux, alongside tools like SystemTap and Ftrace.
The tool supports several kinds of performance monitoring. Hardware event counting uses the Performance Monitoring Unit to measure metrics such as CPU cycles and cache misses. Software event profiling covers kernel-maintained counters such as context switches and page faults. Dynamic tracing through Kprobes and Uprobes instruments kernel and user-space functions, while tracepoint analysis uses static instrumentation points already present in the kernel. Perf can also sample call graphs to show the execution paths that lead to a hot function, and it can measure a workload end to end for benchmarking. Together these features support statistical profiling for identifying performance bottlenecks.
Users typically invoke Perf from the command line using the `perf` executable, which is built from the Linux kernel source tree and distributed in packages such as linux-tools. Common operations include recording samples with `perf record` and generating reports from the recorded data with `perf report`. The tool can profile a command that it launches, attach to an existing process by process ID (`-p`), or monitor the entire system (`-a`). Analysts often use it to identify hotspots in code, such as within the GNU Compiler Collection or the Apache HTTP Server. Output can be visualized with tools like flame graphs for intuitive analysis of performance data.
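The record-then-report workflow can be sketched as a pair of small Python helpers that assemble the command lines. This is a minimal illustration rather than part of Perf itself: the function names `record_argv` and `report_argv` are hypothetical, and the requirement that a target be given explicitly is a choice made by the sketch. The flags used (`-o`, `-g`, `-a`, `-p`, `-i`, `--stdio`) are standard `perf record` and `perf report` options.

```python
import shlex


def record_argv(output="perf.data", pid=None, system_wide=False,
                call_graph=False, command=None):
    """Build an argv list for `perf record` (hypothetical helper)."""
    # The sketch insists on an explicit target so the intent is never ambiguous.
    if pid is None and not system_wide and not command:
        raise ValueError("give a command, a pid, or system_wide=True")
    argv = ["perf", "record", "-o", output]
    if call_graph:
        argv.append("-g")            # record call graphs with each sample
    if system_wide:
        argv.append("-a")            # sample all CPUs
    if pid is not None:
        argv += ["-p", str(pid)]     # attach to an existing process
    if command:
        argv += ["--", *command]     # everything after "--" is the workload
    return argv


def report_argv(input_path="perf.data"):
    """Build an argv list for a plain-text `perf report` (hypothetical helper)."""
    return ["perf", "report", "-i", input_path, "--stdio"]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch works
    # on machines where perf is not installed.
    print(shlex.join(record_argv(call_graph=True, command=["./a.out"])))
    print(shlex.join(report_argv()))
```

The returned lists can be passed directly to `subprocess.run` on a system where `perf` is installed and the user has permission to profile.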
Perf monitors numerous predefined hardware events, such as cpu-cycles and instructions. These are generic names that Perf maps to the architecture-specific counters of processors from vendors such as Intel and ARM. Software events cover core kernel activities like cpu-migrations and alignment-faults. Tracepoint events correspond to static instrumentation points in the kernel, for instance in the TCP stack or the Ext4 filesystem. Users can also define custom dynamic events, using the Kprobes infrastructure to trace specific kernel functions and Uprobes for user-space functions. The available events can be listed with the `perf list` command, which reveals the extensive instrumentation within the Linux kernel.
The `perf stat` subcommand runs a command and prints a summary of event counts for it, which is useful for quick benchmarking. `perf record` samples events and stores them in a data file, usually for later analysis with `perf report`. `perf annotate` displays source or assembly code annotated with the share of samples attributed to each instruction, aiding low-level optimization. `perf script` prints the recorded samples as text so that they can be processed by scripts, for example in Python. `perf top` shows a real-time view of system performance, similar to the Unix `top` command but based on sampled performance events. Other subcommands include `perf trace` for strace-like functionality and `perf probe` for creating dynamic tracepoints.
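As an illustration of post-processing `perf stat` results, the following Python sketch parses the machine-readable output produced with `perf stat -x,` and derives instructions per cycle. It assumes the documented field order in which each line begins with the counter value, the unit, and the event name; the exact trailing fields vary between Perf versions and are ignored here. The sample text is hand-written for the example, not captured from a real run, and the function names are hypothetical.

```python
def parse_perf_stat_csv(text, sep=","):
    """Return {event_name: count or None} from `perf stat -x,` style text."""
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(sep)
        if len(fields) < 3:
            continue
        value, event = fields[0], fields[2]
        try:
            counts[event] = float(value)
        except ValueError:
            # Placeholders such as "<not supported>" or "<not counted>"
            # appear in the value column when no number is available.
            counts[event] = None
    return counts


def instructions_per_cycle(counts):
    """Compute IPC, or None when either counter is missing or zero."""
    cycles = counts.get("cycles") or counts.get("cpu-cycles")
    instructions = counts.get("instructions")
    if not cycles or instructions is None:
        return None
    return instructions / cycles


# Hand-written sample in the assumed layout: value,unit,event,...
SAMPLE = """\
2000000,,cycles,1000000,100.00,,
3000000,,instructions,1000000,100.00,1.50,insn per cycle
<not supported>,,cache-misses,0,100.00,,
"""

if __name__ == "__main__":
    stats = parse_perf_stat_csv(SAMPLE)
    print(stats)
    print("IPC:", instructions_per_cycle(stats))
```

Note that `perf stat` writes its summary to standard error by default, so a wrapper that captures real output would need to read that stream or use the `-o` option to write to a file.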
Internally, Perf operates as a subsystem within the Linux kernel, interfacing directly with the Performance Monitoring Unit through architecture-specific code. User space requests monitoring through the `perf_event_open` system call, and the kernel represents each request as a perf event data structure. Because the PMU has a limited number of counters, the kernel schedules events onto the hardware and saves and restores per-task events across context switches. Sampling is often driven by interrupts generated when performance counters overflow. Data is relayed to user space via a memory-mapped ring buffer for efficiency. The tool's user-space components, maintained by developers like Arnaldo Carvalho de Melo, parse this data and provide the command-line interface. Tracepoint and event information is exposed through Debugfs and, on newer kernels, through the related tracefs filesystem.
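The ring-buffer hand-off can be modelled in a few lines of Python. This is a simplified user-space model and not kernel code: it assumes a little-endian record header of a 32-bit type, a 16-bit misc field and a 16-bit total size (the layout of `struct perf_event_header`), free-running head and tail offsets that are reduced modulo the buffer size, and a hypothetical `write_record` helper standing in for the kernel side. It leaves out the memory barriers and the metadata page that a real consumer must deal with.

```python
import struct

# Assumed little-endian header: type (u32), misc (u16), size (u16).
HEADER = struct.Struct("<IHH")
PERF_RECORD_SAMPLE = 9  # record type used for samples


def _copy_out(ring, pos, length):
    """Copy `length` bytes starting at free-running offset `pos`, wrapping."""
    size = len(ring)
    start = pos % size
    chunk = ring[start:start + length]
    if len(chunk) < length:
        # The record straddles the end of the buffer: take the rest
        # from the beginning.
        chunk += ring[:length - len(chunk)]
    return bytes(chunk)


def read_records(ring, tail, head):
    """Yield (type, misc, payload) for every record between tail and head."""
    while tail < head:
        rtype, misc, rsize = HEADER.unpack(_copy_out(ring, tail, HEADER.size))
        if rsize < HEADER.size:
            raise ValueError("corrupt record size")
        payload = _copy_out(ring, tail + HEADER.size, rsize - HEADER.size)
        yield rtype, misc, payload
        tail += rsize  # size covers header plus payload


def write_record(ring, head, rtype, payload, misc=0):
    """Stand-in for the producer; returns the new head.

    No overwrite protection: the caller must ensure the reader has
    already consumed the space being reused.
    """
    record = HEADER.pack(rtype, misc, HEADER.size + len(payload)) + payload
    size = len(ring)
    for i, byte in enumerate(record):
        ring[(head + i) % size] = byte
    return head + len(record)


if __name__ == "__main__":
    ring = bytearray(32)
    head = write_record(ring, 0, PERF_RECORD_SAMPLE, b"A" * 8)
    print(list(read_records(ring, 0, head)))
```

The reader advances by the size stored in each header, which is why a consumer can skip record types it does not understand.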
A developer might use `perf stat ./a.out` to count events while running a program compiled with the GNU Compiler Collection. The command `perf record -g -- ./httpd` could profile the Apache HTTP Server with call graphs to find slow functions. For kernel analysis, `perf record -e kmem:kmalloc -a` would trace allocation events system-wide within the Linux kernel's memory subsystem. Output from `perf script` can be fed into the FlameGraph scripts to create visualizations of CPU usage. These examples illustrate Perf's role in optimizing performance for complex systems, from web servers to database engines like MySQL.
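The step from `perf script` output to a flame graph usually goes through a "folded" form, with one line per unique stack and a sample count. The Python sketch below shows the idea under simplifying assumptions: each sample starts with an unindented header line whose first token is the command name, each stack frame is an indented line of the form address, symbol, and object, and frames are listed from the innermost function outward. Real `perf script` output depends on the Perf version and the selected fields, and symbols containing spaces would need more careful parsing. The function name `fold_stacks` and the sample text are hypothetical.

```python
from collections import Counter


def fold_stacks(text):
    """Collapse perf-script-like text into {"comm;root;...;leaf": samples}."""
    folded = Counter()
    comm, frames = None, []
    # The extra empty line guarantees that the final sample is flushed.
    for line in text.splitlines() + [""]:
        if line[:1].isspace() and line.strip():
            # Indented frame line: "<address> <symbol>[+offset] (<object>)".
            parts = line.split()
            if len(parts) >= 2:
                frames.append(parts[1].split("+")[0])
            continue
        # A blank line or a new header closes the current sample.
        if comm is not None and frames:
            # Frames arrive leaf-first; folded stacks are written root-first.
            folded[";".join([comm] + frames[::-1])] += 1
        comm, frames = None, []
        if line.strip():
            comm = line.split()[0]
    return folded


# Hand-written sample in the assumed layout, not captured from a real run.
SAMPLE = """\
httpd  1234 100.000001:     250000 cycles:
\t    7f00000010a0 leaf_fn+0x10 (/usr/sbin/httpd)
\t    7f0000001200 main+0x40 (/usr/sbin/httpd)

httpd  1234 100.000251:     250000 cycles:
\t    7f00000010a0 leaf_fn+0x18 (/usr/sbin/httpd)
\t    7f0000001200 main+0x40 (/usr/sbin/httpd)

httpd  1234 100.000501:     250000 cycles:
\t    7f0000001200 main+0x44 (/usr/sbin/httpd)
"""

if __name__ == "__main__":
    for stack, count in sorted(fold_stacks(SAMPLE).items()):
        print(stack, count)
```

Each output line pairs a semicolon-separated stack with the number of samples that ended in it, which is the kind of input flame graph generators consume. The sketch counts every sample once and does not weight samples by their period.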