I would like to know what the difference is between a library call and a system call in Linux. Any pointers toward a good understanding of the concepts behind both would be greatly appreciated.
Linux – Difference Between a Library Call and a System Call
Related Solutions
Man pages are usually terse reference documents. Wikipedia is a better place to turn to for conceptual explanations.
Fork duplicates a process: it creates a child process which is almost identical to the parent process (the most obvious difference is that the new process has a different process ID). In particular, fork (conceptually) must copy all the parent process's memory.
As this is rather costly, vfork was invented to handle a common special case where the copy is not necessary. Often, the first thing the child process does is to load a new program image, so this is what happens:
    if (fork()) {
        /* parent process … */
    } else {
        /* child process (with a new copy of the process memory) */
        execve("/bin/sh", …);  /* discard the process memory */
    }
The execve call loads a new executable program, replacing the process's code and data memory with the code of the new executable and a fresh data memory. So the whole memory copy created by fork was all for nothing.
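The fork-then-exec pattern described above can be sketched as a small helper. This is only an illustration under a few assumptions: the name run_shell is made up for this example, and it uses execl (a C library front end to execve that takes the arguments as a list) instead of calling execve directly.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the original answer): run `cmd` through
 * /bin/sh in a child process and return the child's exit status,
 * or -1 if something went wrong. */
int run_shell(const char *cmd)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;                 /* fork failed, no child was created */

    if (pid == 0) {
        /* Child: the copied memory image is thrown away by the exec. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                /* reached only if execl itself failed */
    }

    /* Parent: wait for the child and decode its exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_shell("exit 7") from main should return 7, the exit status of the shell that replaced the child.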
Thus the vfork call was invented. It does not make a copy of the memory; the child runs in the parent's address space until it calls execve or _exit. Therefore vfork is cheap, but it's hard to use, since you have to make sure you don't modify any of the process's stack or heap space in the child process. On Linux the parent is suspended until the child calls execve or _exit, but on an implementation where the parent keeps executing, even reading could be a problem. For example, this code is broken on such an implementation (it may or may not work depending on whether the child or the parent gets a time slice first):
    if (vfork()) {
        /* parent process */
        cmd = NULL;  /* modify the only copy of cmd */
    } else {
        /* child process */
        execl("/bin/sh", "sh", "-c", cmd, (char*)NULL);  /* read the only copy of cmd */
    }
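For contrast, here is a minimal sketch of a vfork use that stays within the rules: the child does nothing except call an exec function or _exit, and the parent does not touch cmd until the child is done. The helper name run_shell_vfork is hypothetical, and the sketch assumes a Linux-like system where vfork is declared in unistd.h.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: same job as a fork+exec wrapper, but using vfork.
 * `cmd` must stay valid and unmodified until the child has called
 * execl or _exit. */
int run_shell_vfork(const char *cmd)
{
    pid_t pid = vfork();
    if (pid < 0)
        return -1;                 /* vfork failed */

    if (pid == 0) {
        /* Child: shares the parent's memory, so it only execs or exits. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                /* _exit, not exit: skip atexit handlers
                                      and stdio flushing */
    }

    /* Parent: collect the child's exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit rather than exit on failure because exit would run cleanup code against memory that still belongs to the parent.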
Since the invention of vfork, better optimizations have been invented. Most modern systems, including Linux, use a form of copy-on-write, where the pages in the process memory are not copied at the time of the fork call, but later when the parent or child first writes to the page. That is, each page starts out as shared, and remains shared until either process writes to that page; the process that writes gets a new physical page (with the same virtual address). Copy-on-write makes vfork mostly useless, since fork won't make any copy in the cases where vfork would be usable.
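A program cannot directly observe whether the kernel copied a page eagerly or lazily, but it can observe the guarantee that copy-on-write has to preserve: a write in the child never shows up in the parent. The following sketch, with a made-up function name, demonstrates that.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns 10 * (value seen by the parent after the
 * child wrote to it) + (value reported by the child), or -1 on error. */
int fork_isolation_demo(void)
{
    int value = 1;                 /* logically duplicated by fork */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        value = 2;                 /* affects only the child's copy */
        _exit(value);              /* report the child's view as exit status */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;

    /* The parent's copy is untouched by the child's write. */
    return 10 * value + WEXITSTATUS(status);
}
```

The result is 12: the parent still sees 1 while the child saw 2.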
Linux does retain vfork. The fork system call must still make a copy of the process's virtual memory table, even if it doesn't copy the actual memory; vfork doesn't even need to do this. The performance improvement is negligible in most applications.
There are in fact three gradations in system calls.
- Some system calls return immediately. “Immediately” means that the only thing they need is a little processor time. There's no hard limit to how long they can take (except in real-time systems), but these calls return as soon as they've been scheduled for long enough.
  These calls are usually called non-blocking. Examples of non-blocking calls are calls that just read a bit of system state, or make a simple change to system state, such as getpid, gettimeofday, getuid or setuid. Some system calls can be blocking or non-blocking depending on the circumstances; for example read never blocks if the file is a pipe or other type that supports non-blocking reads and the O_NONBLOCK flag is set.
- A few system calls can take a while to complete, but not forever. A typical example is sleep.
- Some system calls will not return until some external event happens. These calls are said to be blocking. For example, read called on a blocking file descriptor is blocking, and so is wait.
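The O_NONBLOCK case from the first item can be seen with an empty pipe: a normal read would wait for data, while a non-blocking read fails right away with EAGAIN. This is a rough sketch and the function name is invented for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if reading from an empty non-blocking
 * pipe fails immediately with EAGAIN, 0 if it behaved differently,
 * and -1 if the setup failed. */
int nonblocking_read_demo(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    /* Switch the read end to non-blocking mode. */
    int flags = fcntl(fds[0], F_GETFL);
    if (flags < 0 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The pipe is empty and its write end is still open, so a blocking
     * read would wait here. The non-blocking read returns at once. */
    char buf[16];
    ssize_t n = read(fds[0], buf, sizeof buf);
    int would_block = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(fds[0]);
    close(fds[1]);
    return would_block;
}
```

Without the fcntl call, the same read would sit in the third category and wait until another process wrote to the pipe or closed it.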
The distinction between “fast” and “slow” system calls is close to non-blocking vs. blocking, but this time from the point of view of the kernel implementer. A fast syscall is one that is known to be able to complete without blocking or waiting. When the kernel encounters a fast syscall, it knows it can execute the syscall immediately and keep the same process scheduled. (In some operating systems with non-preemptive multitasking, fast syscalls may be non-preemptive; this is not the case in normal unix systems.) On the other hand, a slow syscall potentially requires waiting for another task to complete, so the kernel must prepare to pause the calling process and run another task.
Some cases are a bit of a gray area. For example a disk read (read from a regular file) is normally considered non-blocking, because it's not waiting for another process; it's only waiting for the disk, which normally takes only a little time to answer, but won't take forever (so that's case 2 above). But from the kernel's perspective, the process has to wait for the disk driver to complete, so it's definitely a slow syscall.
Best Answer
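There's not really such a thing as a "library call". What people usually mean is a call to a function that lives in a shared library, which just means that the library is located and the function's address is resolved at runtime. The function itself runs in user space like any other code in your program.
System calls are requests to the kernel: the process switches into kernel mode and the kernel carries out the operation on its behalf. Many C library functions, such as getpid or read, are thin wrappers around a system call.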
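To make the distinction concrete, here is a small sketch that obtains the process ID in two ways: through the ordinary C library function getpid, and by asking for the system call by number with syscall(SYS_getpid). The function name getpid_paths_agree is made up for this example, and the sketch assumes Linux with glibc. Note that syscall is itself a library function; it just passes the request to the kernel without a dedicated wrapper.

```c
#define _GNU_SOURCE            /* glibc: make sure syscall() is declared */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if the library wrapper and the explicit
 * system call report the same process ID. */
int getpid_paths_agree(void)
{
    /* Ordinary function call into the C library, which in turn
     * performs the getpid system call. */
    pid_t via_wrapper = getpid();

    /* Generic entry point: name the system call by number. */
    pid_t via_syscall = (pid_t)syscall(SYS_getpid);

    return via_wrapper == via_syscall;
}
```

Both paths end up in the same kernel code, so they return the same value; the difference is only in how much user-space code sits in front of the kernel entry.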