Process Management – Why Fork is the Default Process Creation Mechanism

forkprocessprocess-management

The UNIX system call for process creation, fork(), creates a child process by copying the parent process. My understanding is that this is almost always followed by a call to exec() to replace the child process' memory space (including text segment). Copying the parent's memory space in fork() always seemed wasteful to me (although I realize the waste can be minimized by making the memory segments copy-on-write so only pointers are copied). Anyway, does anyone know why this duplication approach is required for process creation?

Best Answer

It's to simplify the interface. The alternative to fork and exec would be something like Windows' CreateProcess function. Notice how many parameters CreateProcess has, and many of them are structs with even more parameters. This is because everything you might want to control about the new process has to be passed to CreateProcess. In fact, CreateProcess doesn't have enough parameters, so Microsoft had to add CreateProcessAsUser and CreateProcessWithLogonW.

With the fork/exec model, you don't need all those parameters. Instead, certain attributes of the process are preserved across exec. This allows you to fork, then change whatever process attributes you want (using the same functions you'd use normally), and then exec. In Linux, fork has no parameters, and execve has only 3: the program to run, the command line to give it, and its environment. (There are other exec functions, but they're just wrappers around execve provided by the C library to simplify common use cases.)

If you want to start a process with a different current directory: fork, chdir, exec.

If you want to redirect stdin/stdout: fork, close/open files, exec.

If you want to switch users: fork, setuid, exec.

All these things can be combined as needed. If somebody comes up with a new kind of process attribute, you don't have to change fork and exec.

As larsks mentioned, most modern Unixes use copy-on-write, so fork doesn't involve significant overhead.