Linux – How Are System Calls from Man 2 Invoked?

gcc linux system-calls

By system calls, I mean functions like man 2 brk, not the 0x80 interrupt.

If I understand this thread correctly, a compiled C program never DIRECTLY invokes system calls. It can only invoke library calls, which might be dynamically linked from glibc.

However, man 3 brk returns "No manual entry for brk in section 3". So I guess one of the following must be true for brk to work properly:

  1. My understanding above is wrong. Programs can invoke system calls without glibc support. But how is brk linked into the program then?
  2. There is indeed a glibc wrapper for the system call brk. Then which brk is included when I #include <unistd.h>? The glibc one or the system call one? If it is the glibc one, why is it not documented in man 3? Where can I find a complete list of available library calls?

Best Answer

For most of the system calls with man pages in section 2, the man pages actually describe the C library wrappers. The exceptions are usually mentioned explicitly, like gettid, which @Sergei Kurenkov refers to in their answer:

NOTES Glibc does not provide a wrapper for this system call; call it using syscall(2).
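As a minimal sketch of what that note means in practice, the system call can be reached through the generic syscall(2) function with the number SYS_gettid from <sys/syscall.h>. The helper name my_gettid is made up for this example and is not part of glibc. Note that the quoted man page text reflects the situation when it was written; glibc 2.30 and later do ship a gettid() wrapper.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_gettid */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper (not a glibc function): fetch the kernel thread ID
 * by going through the generic syscall(2) entry point. */
static pid_t my_gettid(void)
{
    return (pid_t)syscall(SYS_gettid);
}
```

In the main thread of a process, my_gettid() returns the same value as getpid(); in other threads the two differ.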

The same goes for pivot_root (which isn't that useful for general applications) and tgkill (which performs the low-level function of pthread_kill). Then there's readdir, where the actual system call is somewhat different from the library function:

DESCRIPTION This is not the function you are interested in. Look at readdir(3) for the POSIX conforming C library interface. This page documents the bare kernel system call interface, which is superseded by getdents(2).

Note that there has to be some sort of wrapper. Function calls are made using the C calling conventions, which differ from the calling convention of the kernel interface. Usual function calls are made with the call assembly instruction (or similar), kernel calls with syscall or int 0x80 (and that's not counting stuff like gettimeofday or getpid in the vdso). The compiler doesn't (need to) know which function calls map to an actual kernel call.
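To make the difference between the two conventions concrete, here is a rough sketch of what a wrapper for a zero-argument system call could look like on x86-64 Linux, using GCC-style inline assembly. The name raw_syscall0 is hypothetical. It assumes the x86-64 convention (call number in rax, result in rax, rcx and r11 clobbered by the syscall instruction); on other architectures the sketch simply falls back to syscall(2). This is an illustration, not how glibc actually implements its wrappers.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid and friends */
#include <unistd.h>        /* syscall(), getpid() */

/* Hypothetical helper: issue a system call that takes no arguments. */
static long raw_syscall0(long number)
{
#if defined(__x86_64__)
    long ret;
    /* The call number goes in rax; the kernel's result comes back in rax.
     * The syscall instruction itself overwrites rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "0"(number)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Assumption: on other architectures, defer to the libc helper. */
    return syscall(number);
#endif
}
```

Calling raw_syscall0(SYS_getpid) should give the same number as the ordinary getpid() library call, but the compiler sees only a plain C function either way.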

Even with the "usual" system calls, the C library wrapper acts slightly differently from the bare system call: the system calls return error codes as varying negative values (if you look at the Linux kernel code, you'll see a lot of returns like return -EPERM;). The C library wrapper turns all such return values into -1 and moves the actual error code to errno.
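The conversion described here can be sketched roughly as follows. The function wrap_kernel_result is a made-up name for illustration. The range check (-4095 to -1) mirrors the convention glibc uses on Linux to tell error codes apart from valid large return values; treat the exact bound as an assumption of this sketch.

```c
#include <errno.h>

/* Hypothetical helper mimicking what a libc wrapper does with the raw
 * value handed back by the kernel. */
static long wrap_kernel_result(long raw)
{
    /* Assumption: errors are reported as small negative numbers,
     * -4095 through -1, i.e. the negated errno value. */
    if (raw < 0 && raw >= -4095L) {
        errno = (int)-raw;   /* e.g. -EPERM becomes errno == EPERM */
        return -1;           /* all errors collapse to -1 */
    }
    return raw;              /* success values pass through unchanged */
}
```

So a raw kernel result of -EPERM reaches the caller as a return value of -1 with errno set to EPERM, while a successful result such as a file descriptor is returned as-is.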
