| epoll (Linux) | |
|---|---|
| Name | epoll |
| Author | Davide Libenzi |
| Developer | Linux kernel |
| Release | 2.5.44 |
| Operating system | Linux |
| Genre | I/O notification facility |

epoll (Linux) is an I/O event notification facility introduced into the Linux kernel to efficiently monitor multiple file descriptors for readiness events. It was developed to address scalability limits found in earlier interfaces and is widely used in networking stacks, Nginx, Node.js, Redis, and other high-performance servers. epoll enables applications to react to readiness of sockets, pipes, and device files while minimizing system call overhead and context switches.
epoll provides an edge-triggered and level-triggered mechanism integrated into the Linux kernel's event handling subsystems. It complements traditional POSIX interfaces such as select and poll by allowing applications to register sets of file descriptors and receive notifications when they become ready for I/O. The facility interacts with kernel internals like the Linux scheduler and the file descriptor table and is exposed through a small set of syscalls that user-space libraries and servers can employ to build scalable event loops.
The epoll API centers on three operations: creating an epoll instance, controlling monitored descriptors, and waiting for events. These are exposed through the system calls epoll_create (or its successor epoll_create1), epoll_ctl, and epoll_wait. The design supports both level-triggered semantics, the default, where readiness is reported for as long as the condition persists, and edge-triggered semantics, selected per descriptor with the EPOLLET flag, where notifications occur only on state transitions. epoll_ctl takes an operation code (EPOLL_CTL_ADD, EPOLL_CTL_MOD, or EPOLL_CTL_DEL) and a struct epoll_event carrying the event mask and a user data field that is returned with each notification. An epoll instance is itself referred to by a file descriptor, so it is managed by the same descriptor layer as the files it monitors.
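
The three operations and the two triggering modes can be tried from Python, whose `select.epoll` object wraps the epoll system calls on Linux. The following is a minimal sketch, not a reference implementation: the function name `count_wakeups` and the choice of a pipe as the monitored descriptor are illustrative assumptions.

```python
import os
import select


def count_wakeups(edge_triggered, rounds=3):
    """Poll `rounds` times while one byte sits unread in a pipe.

    Returns the total number of events reported across all polls.
    """
    read_fd, write_fd = os.pipe()
    ep = select.epoll()                      # create the epoll instance
    try:
        mask = select.EPOLLIN
        if edge_triggered:
            mask |= select.EPOLLET           # request edge-triggered delivery
        ep.register(read_fd, mask)           # add read_fd to the interest list
        os.write(write_fd, b"x")             # read_fd becomes readable; we never read it
        # Each poll(0) is a nonblocking wait for events.
        wakeups = sum(len(ep.poll(0)) for _ in range(rounds))
        ep.unregister(read_fd)               # remove read_fd from the interest list
        return wakeups
    finally:
        ep.close()
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("level-triggered:", count_wakeups(edge_triggered=False))
    print("edge-triggered: ", count_wakeups(edge_triggered=True))
```

On Linux, the level-triggered call is expected to report the unread byte on every poll (3 events in total), while the edge-triggered call reports it only once, on the transition to readable.
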
Typical usage patterns embed epoll within event-driven servers and frameworks such as Nginx, HAProxy, Lighttpd, and libuv, the event library underlying Node.js. An application creates an epoll instance, registers sockets and timers (the latter through timerfd descriptors), and then enters an event loop calling epoll_wait to obtain a batch of ready descriptors. Event handlers then perform nonblocking I/O, often in conjunction with APIs like sendfile, libaio, or user-space networking stacks. Edge-triggered mode requires careful programming: a handler must keep reading or writing until the call fails with EAGAIN, because no further notification arrives for data that is already pending and an undrained connection can stall. This requirement has influenced designs in projects such as libevent and libev.
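
The draining requirement of edge-triggered mode can be sketched as follows, again using Python's `select.epoll`. The helper names `drain`, `run_once`, and `demo`, the 4096-byte read size, and the use of a socket pair in place of a network connection are assumptions made for illustration.

```python
import select
import socket


def drain(sock):
    """Read from a nonblocking socket until it would block or hits EOF."""
    chunks = []
    while True:
        try:
            data = sock.recv(4096)
        except BlockingIOError:       # EAGAIN: pending data fully consumed
            break
        if not data:                  # peer closed the connection
            break
        chunks.append(data)
    return b"".join(chunks)


def run_once(ep, socks, timeout=1.0):
    """One event-loop iteration: wait, then drain every readable socket."""
    received = {}
    for fd, events in ep.poll(timeout):
        if events & select.EPOLLIN:
            received[fd] = drain(socks[fd])
    return received


def demo():
    a, b = socket.socketpair()
    b.setblocking(False)              # required for the drain loop to terminate
    ep = select.epoll()
    try:
        ep.register(b.fileno(), select.EPOLLIN | select.EPOLLET)
        a.sendall(b"x" * 10000)       # needs several 4096-byte reads to consume
        return run_once(ep, {b.fileno(): b})[b.fileno()]
    finally:
        ep.close()
        a.close()
        b.close()


if __name__ == "__main__":
    print(len(demo()), "bytes drained after a single notification")
```

A handler that performed only one `recv` per notification would leave data in the socket buffer and, in edge-triggered mode, would not be notified again until new data arrived.
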
epoll was explicitly optimized to improve scalability across large numbers of file descriptors, addressing the O(n) scanning costs of select and poll, which must pass in and examine the entire descriptor set on every call. Because the interest set is registered with the kernel once and epoll_wait returns only descriptors that are ready, the cost of a wait scales with the number of ready descriptors rather than the number monitored. This reduces CPU overhead and memory bandwidth pressure in high-concurrency servers such as Apache HTTP Server running its event MPM and proxies such as Varnish Cache. In workloads with many TCP sockets or Unix domain socket IPC connections, most of them idle at any moment, this typically means less work per call than with the legacy interfaces, benefiting applications deployed on platforms like Red Hat Enterprise Linux and Debian.
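
The user-visible side of this behavior can be observed with a short experiment: register many descriptors, make a few of them readable, and check how many entries a single wait returns. This is a sketch under stated assumptions; the function name `ready_among` and the descriptor counts are illustrative, and the code shows only what is returned to user space, not the kernel-side cost.

```python
import select
import socket


def ready_among(total, active):
    """Register `total` sockets, make `active` of them readable,
    and return how many events one nonblocking wait reports."""
    pairs = [socket.socketpair() for _ in range(total)]
    ep = select.epoll()
    try:
        for _writer, reader in pairs:
            ep.register(reader.fileno(), select.EPOLLIN)
        for writer, _reader in pairs[:active]:
            writer.send(b"!")         # only these readers become ready
        return len(ep.poll(0))        # idle descriptors are not reported
    finally:
        ep.close()
        for writer, reader in pairs:
            writer.close()
            reader.close()


if __name__ == "__main__":
    print(ready_among(total=200, active=3), "of 200 descriptors reported")
```

The counts are kept small so that the example stays within common per-process descriptor limits; each socket pair consumes two descriptors.
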
Internally, epoll maintains an interest list, kept in a red-black tree of registered descriptors, and a ready list of descriptors with pending events, leveraging mechanisms in the Linux kernel such as wait queues and poll tables. The epoll instance is represented by a kernel object that holds references to registered file descriptor structures; readiness notifications are generated when underlying file operations invoke the poll hooks provided by device drivers or sockets. The implementation relies on kernel primitives like spinlocks and RCU to maintain concurrency safety and minimize lock contention on multicore systems. Successive kernel versions have adjusted internal data structures and wakeup paths to improve throughput under heavy load and to integrate with features like io_uring.
Despite its advantages, epoll has limitations: it is Linux-specific and thus not portable to FreeBSD, OpenBSD, or Windows without adaptation. Edge-triggered semantics can lead to programming pitfalls, since they require nonblocking I/O and careful loop draining. Alternative scalable notification APIs include kqueue on FreeBSD and DragonFly BSD, I/O completion ports (IOCP) on Microsoft Windows, and event ports on Solaris and illumos. Recent kernel developments such as io_uring offer asynchronous I/O models that can complement or replace epoll in certain workloads, providing shared submission and completion queues and reducing syscall overhead further.
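
Portable programs usually reach epoll through an abstraction layer that selects a backend per platform. As one illustration, Python's standard `selectors` module exposes `DefaultSelector`, which resolves to the epoll-based selector on Linux and to a kqueue-, poll-, or select-based one elsewhere. The function name `wait_readable` below is an assumption for the sketch.

```python
import selectors
import socket


def wait_readable(timeout=1.0):
    """Wait for one socket to become readable using the platform's
    default selector; return the backend name and the ready count."""
    sel = selectors.DefaultSelector()
    a, b = socket.socketpair()
    try:
        sel.register(b, selectors.EVENT_READ)
        a.send(b"ping")                    # makes b readable
        ready = sel.select(timeout)        # list of (key, events) pairs
        return type(sel).__name__, len(ready)
    finally:
        sel.close()
        a.close()
        b.close()


if __name__ == "__main__":
    backend, count = wait_readable()
    print(backend, "reported", count, "ready descriptor(s)")
```

The calling code is identical on every platform; only the backend class name differs. Such layers generally expose level-triggered behavior, so code that depends on edge-triggered semantics still has to use epoll directly.
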
epoll was merged into the Linux kernel mainline in the 2.5 development series to address growing needs for high-performance network servers during the early 2000s. Its adoption accelerated with event-driven architectures employed by projects like Nginx, Memcached, Lighttpd, and later Node.js, influencing server software design and operating system I/O APIs. Over successive Linux kernel releases, epoll received refinements for correctness, performance, and integration with other kernel subsystems, becoming a cornerstone technology in modern Linux-based infrastructure used by companies such as Facebook, Google, Twitter, and Amazon.