In this article, we’ll cover the final piece of the puzzle for high-performance network programming in C — epoll.
When we previously looked at select and poll, we noticed significant limitations. For example:
- select has a maximum file descriptor limit.
- poll removes that limit, but still requires iterating through the entire descriptor set, making it inefficient.
The chart below shows the performance difference between select, poll, and epoll across different numbers of file descriptors:
As you can see, select and poll behave almost identically since both need to scan the entire descriptor set. In contrast, epoll performs consistently, even when scaling from 10 to 10,000 file descriptors, demonstrating its scalability and efficiency.
Creating an epoll Instance #
Before working with events, we first create an epoll instance:
int epoll_create(int size);
int epoll_create1(int flags);
- A non-negative return value is a valid epoll instance file descriptor.
- A return value of -1 indicates an error.
In modern Linux (>= 2.6.8), the size parameter is ignored (but must still be > 0).
With epoll_create1, you can pass EPOLL_CLOEXEC to ensure the epoll file descriptor is automatically closed on exec(), which is useful for preventing child processes from inheriting it.
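As a minimal sketch, creating an instance might look like this (error handling kept deliberately simple):

#include <stdio.h>
#include <stdlib.h>
#include <sys/epoll.h>

int main(void) {
    /* Create an epoll instance; the fd is closed automatically on exec(). */
    int efd = epoll_create1(EPOLL_CLOEXEC);
    if (efd == -1) {
        perror("epoll_create1");
        exit(1);
    }
    /* ... register descriptors with epoll_ctl and wait with epoll_wait ... */
    return 0;
}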
Controlling epoll with epoll_ctl #
Once the instance is created, use epoll_ctl to add, modify, or remove monitored descriptors:
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
- EPOLL_CTL_ADD → Register a descriptor.
- EPOLL_CTL_MOD → Modify it.
- EPOLL_CTL_DEL → Remove it.
An epoll_event looks like this:
struct epoll_event {
uint32_t events; /* Event mask */
epoll_data_t data; /* User data */
};
Common event types:
- EPOLLIN → readable
- EPOLLOUT → writable
- EPOLLRDHUP → peer closed connection
- EPOLLHUP → hung up
- EPOLLET → edge-triggered mode (default is level-triggered)
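Putting epoll_ctl and the event mask together, here is a minimal sketch (assuming efd is an epoll instance and fd is a socket previously registered with EPOLL_CTL_ADD) of modifying and then removing a descriptor:

struct epoll_event ev;
ev.events = EPOLLOUT;   /* now interested in writability instead */
ev.data.fd = fd;
if (epoll_ctl(efd, EPOLL_CTL_MOD, fd, &ev) == -1)
    perror("epoll_ctl: EPOLL_CTL_MOD");

/* Stop monitoring fd entirely; since Linux 2.6.9 the event
   argument is ignored for EPOLL_CTL_DEL and may be NULL. */
if (epoll_ctl(efd, EPOLL_CTL_DEL, fd, NULL) == -1)
    perror("epoll_ctl: EPOLL_CTL_DEL");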
Waiting for Events with epoll_wait #
To process events:
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
- Returns >0 = number of ready events.
- Returns 0 = timeout.
- Returns -1 = error.
Unlike poll, where you must scan all descriptors, epoll_wait directly returns only the descriptors with events, a huge efficiency boost.
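To make the three return cases concrete, here is a minimal sketch; wait_once is a hypothetical helper name, and the 1000 ms timeout is an arbitrary choice for illustration:

#include <errno.h>
#include <stdio.h>
#include <sys/epoll.h>

/* Wait up to 1000 ms and map epoll_wait's three return cases. */
int wait_once(int efd, struct epoll_event *events, int maxevents) {
    int n = epoll_wait(efd, events, maxevents, 1000); /* timeout in ms */
    if (n == -1 && errno == EINTR)
        return 0;             /* interrupted by a signal: caller may retry */
    if (n == -1)
        perror("epoll_wait"); /* genuine error */
    return n;                 /* 0 = timeout, > 0 = number of ready events */
}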
Example: Echo Server with epoll #
Here’s a simplified echo server using epoll in edge-triggered mode:
event.data.fd = sock_fd;           /* remember which fd this event is for */
event.events = EPOLLIN | EPOLLET;  /* readable, edge-triggered */
if (epoll_ctl(efd, EPOLL_CTL_ADD, sock_fd, &event) == -1) {
    perror("epoll_ctl");
    exit(1);
}
And in the event loop:
read_num = epoll_wait(efd, events, MAX_EVENTS, -1);
for (i = 0; i < read_num; i++) {
    if (events[i].events & EPOLLIN) {
        int client_fd = events[i].data.fd;
        /* Edge-triggered mode: drain the socket until read() reports
           EAGAIN, otherwise unread data produces no further events. */
        for (;;) {
            ssize_t n = read(client_fd, buf, sizeof(buf));
            if (n > 0) {
                write(client_fd, buf, n);  /* echo back */
            } else if (n == 0) {
                close(client_fd);          /* peer closed the connection */
                break;
            } else {
                if (errno != EAGAIN && errno != EWOULDBLOCK)
                    close(client_fd);      /* real error */
                break;                     /* fully drained or failed */
            }
        }
    }
}
👉 Full code: GitHub repo
Edge-Triggered vs. Level-Triggered #
- Level-triggered (default): Event fires repeatedly as long as the condition holds (e.g., data remains readable).
- Edge-triggered: Event fires once when the state changes (e.g., new data arrives).
Edge-triggered is more efficient in high-throughput scenarios (like file uploads) since it reduces kernel notifications, but requires careful non-blocking reads/writes.
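Since edge-triggered descriptors must not block, a helper like the following (make_nonblocking is a hypothetical name, shown as a sketch) is typically applied to every socket before registering it:

#include <fcntl.h>

/* Set O_NONBLOCK on fd so reads/writes return EAGAIN instead of blocking. */
static int make_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}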
Summary #
- epoll scales far better than select or poll.
- It avoids unnecessary iteration by returning only active descriptors.
- Edge-triggered mode reduces kernel overhead in high-load environments.
That’s why epoll is the backbone of modern high-performance servers such as Nginx and Redis, and it powers the network poller in Go’s runtime on Linux.