Linux select() - Synchronous I/O Multiplexing

The select() system call enables a program to keep track of multiple file descriptors. The function waits until one or more file descriptors become ready for a specific class of I/O operation without blocking, allowing for synchronous I/O multiplexing.

In this article, learn about the role of Linux select() in synchronous I/O multiplexing.

Prerequisites

Linux system (this guide uses Ubuntu 20.04).
A text editor (this guide uses Vim).
A C compiler (this guide uses GCC).

What Does select Do in Linux?

The select() system call is one of the primary methods for synchronous I/O multiplexing. A single call waits on many file descriptors instead of blocking on each one in turn.

By default, a Linux application performs I/O operations (like read and write) on one file descriptor at a time, and each call blocks until that descriptor is ready. While the program is blocked on one descriptor, it cannot serve the others, which reduces performance. select() lets a program monitor multiple file descriptors at once and learn which of them are ready for I/O. The multiplexing is synchronous because the calling program still waits inside select() until at least one file descriptor is ready or the timeout expires.

Modern applications also use other syscalls like poll() or epoll(), which have fewer limitations than select().

Note: select() monitors only file descriptor numbers that are less than FD_SETSIZE (1024). This limit is too low for many contemporary applications. To track file descriptors numbered FD_SETSIZE or higher, use poll() or epoll().

As a system call, select() provides an interface between a particular process, app, or program and Linux, allowing the former to request a service from the Linux kernel. The select() method has practical use in socket programming because it allows a server to monitor multiple sockets for readiness for I/O operations.

The select() system call enables a program to monitor several file descriptors. The function waits for one or multiple descriptors to be ready for a particular type of I/O activity. A file descriptor is ready when the incoming I/O action doesn't block.
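As a rough illustration of monitoring several descriptors at once, the sketch below waits on two descriptors and reports which one became readable first. The helper name wait_for_either() and its return convention are assumptions made for this example, not part of the select() API.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_sec seconds for fd_a or fd_b
 * to become readable. Returns the ready descriptor (fd_a wins a tie),
 * or -1 on timeout or error. */
int wait_for_either(int fd_a, int fd_b, int timeout_sec)
{
    fd_set readfds;
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    int maxfd = fd_a > fd_b ? fd_a : fd_b;

    /* select() cannot handle descriptors numbered FD_SETSIZE or higher. */
    assert(fd_a >= 0 && fd_b >= 0 && maxfd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd_a, &readfds);
    FD_SET(fd_b, &readfds);

    /* nfds is the highest-numbered descriptor plus 1. */
    if (select(maxfd + 1, &readfds, NULL, NULL, &tv) <= 0)
        return -1;              /* 0 = timeout, -1 = error */

    return FD_ISSET(fd_a, &readfds) ? fd_a : fd_b;
}
```

A server would typically pass socket descriptors here. The same call works with pipes or terminals, which makes the helper easy to try without any network setup.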

Linux select Function Synopsis

To use the select() system call, a program must include the header that declares it. The following is the select() syntax:

int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);

The syntax consists of:

The nfds parameter. Specifies the range of file descriptors select() checks. Set it to the highest-numbered file descriptor in any of the sets, plus 1.
The fd_set readfds and fd_set writefds arguments. Specify which file descriptors select() tests for readiness for an I/O operation (read or write).
The fd_set exceptfds argument (called errorfds in POSIX). Specifies the file descriptors select() tests for pending exceptional conditions.
The struct timeval timeout parameter. Sets the maximum time select() waits for a file descriptor to become ready.

In practice, a C program includes the headers that declare select() and the macros for working with file descriptor sets. The synopsis below lists those headers together with the declarations they provide:

#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
void FD_CLR(int fd, fd_set *set);
int FD_ISSET(int fd, fd_set *set);
void FD_SET(int fd, fd_set *set);
void FD_ZERO(fd_set *set);

Linux select Example

The select() function is called from C programs. A program must include the headers listed above for the call to compile. However, a single source file is enough to test the function independently.

Take the following steps to understand how select() works in a practical example:

1. Create a new C source file in Vim called select.c with:

vim select.c

2. Copy and paste the code into the file.

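#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    fd_set rd;          /* set of descriptors to watch for reading */
    struct timeval tv;  /* maximum time select() may block */
    int err;

    FD_ZERO(&rd);       /* start with an empty set */
    FD_SET(0, &rd);     /* watch file descriptor 0 (standard input) */

    tv.tv_sec = 4;      /* wait up to 4 seconds */
    tv.tv_usec = 0;

    /* nfds is the highest descriptor in any set (0) plus 1 */
    err = select(1, &rd, NULL, NULL, &tv);

    if (err == 0) {             /* timeout */
        printf("select timeout!!!\n");
    } else if (err == -1) {     /* failure */
        perror("failure to select");
    } else {                    /* success */
        printf("data available\n");
    }
    return 0;
}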

The program includes the headers and defines a main function that builds the file descriptor set, sets the timeout, and calls select(). An if statement then checks the return value of select() to distinguish a timeout, a failure, and success.

Note: The example includes only the headers it needs. The prototypes shown in the synopsis come from those headers and do not need to be copied into the program.

3. Save and exit the Vim file.

4. Compile the code with the following:

gcc select.c

5. Since gcc prints nothing on success, verify the result with ls:

ls

The output shows an a.out executable created by compiling the select.c file.

6. Run the executable without input to test it:

./a.out

After four seconds with no input on standard input, the program prints the select timeout!!! message, showing that it works.

File Descriptor Sets

Three of the arguments in the select() syntax represent the file descriptor sets to be tested. These file descriptor sets are:

readfds. The select() function monitors this set to see if the file descriptors are ready for reading, which means a read operation doesn't block.
writefds. The system call tracks the file descriptors to check whether they are ready for writing, which happens when the write operation doesn't block.
exceptfds. select() watches the file descriptors in this set for exceptional conditions. Note that an exceptional condition is not necessarily an error. Out-of-band data arriving on a socket is one example.
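The read set gets most of the attention, so here is a small sketch that uses only the write set. The helper name is_writable_now() is hypothetical. It passes NULL for the other two sets and a zero timeout so that the call returns immediately.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a write to fd would not block right
 * now, 0 if it would block, and -1 on error. */
int is_writable_now(int fd)
{
    fd_set writefds;
    struct timeval tv = { .tv_sec = 0, .tv_usec = 0 };  /* zero timeout: poll */

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&writefds);
    FD_SET(fd, &writefds);

    /* Only the write set is passed; unused sets may be NULL. */
    return select(fd + 1, NULL, &writefds, NULL, &tv);
}
```

For example, the write end of a freshly created pipe should be reported as writable because its buffer is empty.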

Programs use four macros to manipulate the sets passed to select():

FD_ZERO() clears a set by removing all file descriptors.
FD_SET() adds a file descriptor to a set.
FD_CLR() deletes a file descriptor from a set.
FD_ISSET() tests whether a file descriptor is part of the set.
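The four macros can be tried without calling select() at all. The following sketch builds a set, removes one member, and counts what is left. The names count_fds() and demo_fd_macros() and the descriptor numbers 3 and 5 are arbitrary choices for illustration.

```c
#include <assert.h>
#include <sys/select.h>

/* Hypothetical helper: counts the descriptors in [0, nfds) that are
 * members of the set. */
int count_fds(fd_set *set, int nfds)
{
    int count = 0;

    assert(nfds >= 0 && nfds <= FD_SETSIZE);
    for (int fd = 0; fd < nfds; fd++) {
        if (FD_ISSET(fd, set))
            count++;
    }
    return count;
}

/* Builds a set holding 3 and 5, removes 3, and returns the member count. */
int demo_fd_macros(void)
{
    fd_set set;

    FD_ZERO(&set);      /* always clear before first use */
    FD_SET(3, &set);    /* add descriptor 3 */
    FD_SET(5, &set);    /* add descriptor 5 */
    FD_CLR(3, &set);    /* remove descriptor 3 again */

    return count_fds(&set, 6);  /* only descriptor 5 remains */
}
```

The macros only flip bits in the set. They do not check whether a descriptor number refers to an open file, which is why the example can use numbers that were never opened.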

Arguments

Apart from the read and write sets, select() takes the following arguments:

Argument	Description
`nfds`	Set this argument to the highest-numbered file descriptor in any of the sets, plus 1.
`timeout`	The timeout argument specifies the maximum interval that `select()` blocks while waiting for a file descriptor to become ready. The call stops blocking when a descriptor becomes ready, the timeout expires, or a signal handler interrupts it. If the argument is `NULL`, `select()` blocks indefinitely. If both fields of the `timeval` structure are `0`, `select()` returns immediately, which is useful for polling.
`exceptfds`	Called `errorfds` in POSIX. This `fd_set` argument lists the file descriptors to check for pending exceptional conditions. Since it is also an `fd_set`, programs manipulate it with the same macros as the other file descriptor sets.
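The three timeout modes described in the table can be folded into one wrapper. This is only a sketch: the helper name wait_readable_ms() and the convention that a negative value means "block indefinitely" are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper:
 *   timeout_ms <  0  blocks indefinitely (NULL timeout)
 *   timeout_ms == 0  polls and returns immediately
 *   timeout_ms >  0  waits at most that many milliseconds
 * Returns the select() result: 1 if readable, 0 on timeout, -1 on error. */
int wait_readable_ms(int fd, long timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    struct timeval *tvp = NULL;     /* NULL means no time limit */

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    if (timeout_ms >= 0) {
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;   /* ms to microseconds */
        tvp = &tv;
    }

    return select(fd + 1, &readfds, NULL, NULL, tvp);
}
```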

Linux select Errors

The Linux select() call returns errors in specific circumstances. Common errors include:

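EBADF means that one of the sets contains an invalid file descriptor, for example one that is already closed.
EINTR means that a signal was caught before any descriptor became ready or the timeout expired.
EINVAL means that nfds is negative or exceeds the RLIMIT_NOFILE resource limit, or that the timeout contains an invalid value.
ENOMEM means that the kernel was unable to allocate memory for internal tables.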
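A common way to deal with EINTR is to repeat the call, while other errors are passed back to the caller. The sketch below assumes that restarting with the full timeout after an interruption is acceptable. The helper name wait_readable_retry() is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: waits for fd to become readable and retries when
 * a signal interrupts the wait. Returns 1 if readable, 0 on timeout, and
 * -1 on any other error (errno is left set, for example to EBADF). */
int wait_readable_retry(int fd, int timeout_sec)
{
    assert(fd >= 0 && fd < FD_SETSIZE);

    for (;;) {
        fd_set readfds;
        /* Rebuilt on every pass: select() may modify both arguments. */
        struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
        int ready;

        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        ready = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ready == -1 && errno == EINTR)
            continue;           /* interrupted by a signal: try again */
        return ready;
    }
}
```

Passing a descriptor that has already been closed is a simple way to see the EBADF case.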

Linux select Bugs

Linux select() has specific behaviors that are not permitted by POSIX, some of which are fixed in poll() and epoll() system calls. The most common select() bugs are:

The file descriptor sets are value-result arguments: select() overwrites them to report which descriptors are ready, so a program must reinitialize every set before each call. This is a design error that is avoided in poll() and epoll().
POSIX allows an implementation to define an upper limit on the range of file descriptors in a file descriptor set. The Linux kernel imposes no fixed limit, but the GNU C Library (glibc) implementation makes fd_set a fixed-size type and defines FD_SETSIZE as 1024. This limitation is removed in poll() and epoll().
According to POSIX, select() checks all specified file descriptors in the file descriptor sets up to the limit nfds-1. However, select() ignores any file descriptor in these sets greater than the maximum file descriptor number currently open. According to POSIX, any such file descriptor specified in one set should result in the error EBADF.
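The select() function may report a socket file descriptor as ready for reading while the subsequent read still blocks. Using non-blocking sockets (O_NONBLOCK) avoids getting stuck in that case.
The select() call modifies timeout if a signal handler interrupts the call (the EINTR error return), which POSIX does not permit.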
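Because the sets and the timeout can both be overwritten, code that calls select() repeatedly usually keeps a master copy and re-arms both before every call. A minimal sketch of that pattern follows. The helper name poll_rounds() is made up for this example.

```c
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: polls fd `rounds` times and returns how many of
 * those polls reported the descriptor as readable. */
int poll_rounds(int fd, int rounds)
{
    fd_set master;      /* never passed to select(), so it stays intact */
    fd_set working;     /* scratch copy that select() overwrites */
    int hits = 0;

    assert(fd >= 0 && fd < FD_SETSIZE);

    FD_ZERO(&master);
    FD_SET(fd, &master);

    for (int i = 0; i < rounds; i++) {
        struct timeval tv = { .tv_sec = 0, .tv_usec = 0 };  /* re-armed each pass */

        working = master;   /* fd_set is a structure, so assignment copies it */
        if (select(fd + 1, &working, NULL, NULL, &tv) > 0 &&
            FD_ISSET(fd, &working))
            hits++;
    }
    return hits;
}
```

If the loop passed master directly, the first call that timed out would clear the set, and every later call would watch nothing.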

Linux select vs. pselect

The pselect() system call was standardized by POSIX and is supported by most popular Linux distributions. The function allows a program to wait until it catches a signal or a file descriptor becomes ready. The pselect() syntax is:

#include <sys/select.h>
#include <signal.h>
#include <time.h>
int pselect(int maxfdp1, fd_set *readset, fd_set *writeset, fd_set *exceptset, const struct timespec *timeout, const sigset_t *sigmask);

The select() and pselect() functions are nearly identical. However, certain differences exist:

select() uses the timeval structure (with seconds and microseconds), while pselect() uses the timespec structure (with seconds and nanoseconds).
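select() can update the timeout argument to show the remaining time. pselect() cannot change the argument.
pselect() has an additional argument, sigmask, a pointer to a signal mask. The argument allows the program to block the delivery of specific signals, test global variables set by the handlers of these now-blocked signals, and then call pselect(), which installs the given signal mask for the duration of the wait and restores the previous one on return. Because the mask switch and the wait happen atomically, a signal cannot slip in between the test and the wait.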
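The sequence described in the last point can be sketched as follows. The choice of SIGUSR1, the flag variable, the handler, and the helper name wait_fd_or_sigusr1() are all assumptions made for illustration. Installing the handler inside the helper only keeps the example self-contained.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigusr1 = 0;   /* set by the handler */

static void on_sigusr1(int signo)
{
    (void)signo;
    got_sigusr1 = 1;
}

/* Hypothetical helper: returns 1 if fd is readable, 0 on timeout, and -1
 * with errno set to EINTR if SIGUSR1 arrived before or during the wait. */
int wait_fd_or_sigusr1(int fd, long timeout_ms)
{
    fd_set readfds;
    sigset_t block, orig;
    struct sigaction sa;
    struct timespec ts;
    int ready, saved_errno;

    assert(fd >= 0 && fd < FD_SETSIZE && timeout_ms >= 0);
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (timeout_ms % 1000) * 1000000L;    /* ms to nanoseconds */

    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, NULL) == -1)
        return -1;

    /* 1. Block the signal so the flag cannot change during the test. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &orig) == -1)
        return -1;

    if (got_sigusr1) {
        /* 2. The signal already arrived: report it without waiting. */
        got_sigusr1 = 0;
        ready = -1;
        errno = EINTR;
    } else {
        /* 3. pselect() atomically installs the original mask and waits. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        ready = pselect(fd + 1, &readfds, NULL, NULL, &ts, &orig);
    }

    saved_errno = errno;
    sigprocmask(SIG_SETMASK, &orig, NULL);  /* restore the caller's mask */
    errno = saved_errno;
    return ready;
}
```

With plain select(), a signal arriving between the flag test and the call would go unnoticed until the timeout expired. That race is the reason pselect() takes the mask as an argument.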

Conclusion

After reading this article, you know how to use Linux select() in synchronous I/O multiplexing. Next, learn more about Linux sockets and how they work.
