Pthreads and vfork

I am trying to check what really happens to a process's pthreads while one of them performs vfork.
The spec says that the parent "thread of control" is "suspended" until the child process calls exec* or _exit.
As I understand it, the consensus is that this means the whole parent process (that is, all of its pthreads) is suspended.
I'd like to confirm it using an experiment.
So far I have performed several experiments, all of which suggest that the other pthreads keep running. As I have no Linux experience, I suspect that my interpretation of these experiments is wrong, and learning the correct interpretation of these results could help me avoid further misconceptions.
So here are the experiments I did:

Experiment I

#include<unistd.h>
#include<pthread.h>
#include<signal.h>
#include<errno.h>
#include<cstring>
#include<string>
#include<iostream>
using namespace std;
void * job(void *x){
  int pid=vfork();
  if(-1==pid){
    cerr << "failed to fork: " << strerror(errno) << endl;
    _exit(-3);
  }
  if(!pid){
    cerr << "A" << endl;
    cerr << "B" << endl;
    if(-1 == execlp("/bin/ls","ls","repro.cpp",(char*)NULL)){
      cerr << "failed to exec : " << strerror(errno) << endl;
      _exit(-4);//serious problem, can not proceed
    }
  }
  return NULL;
}
int main(){
  signal(SIGPIPE,SIG_IGN);
  signal(SIGCHLD,SIG_IGN);
  const int thread_count = 4;
  pthread_t thread[thread_count];
  int err;
  for(size_t i=0;i<thread_count;++i){
    if((err = pthread_create(thread+i,NULL,job,NULL))){
      cerr << "failed to create pthread: " << strerror(err) << endl;
      return -7;
    }
  }
  for(size_t i=0;i<thread_count;++i){
    if((err = pthread_join(thread[i],NULL))){
      cerr << "failed to join pthread: " << strerror(err) << endl;
      return -17;
    }
  }
}

There are 4 pthreads, each of which performs vfork and then exec in the child.
Each child process performs two output operations, "A" and "B", between the vfork and the exec.
The theory suggests that the output should read ABABABAB… with nothing interleaved between an A and its B.
However, the output is a total mess, for example:

AAAA



BB
B

B

Experiment II

Suspecting that using the I/O library after vfork could be a bad idea, I've replaced the job() function with the following:

const int S = 10000000;
int t[S];
void * job(void *x){
  int pid=vfork();
  if(-1==pid){
    cerr << "failed to fork: " << strerror(errno) << endl;
    _exit(-3);
  }
  if(!pid){
    for(int i=0;i<S;++i){
      t[i]=i;
    }
    for(int i=0;i<S;++i){
      t[i]-=i;
    }
    for(int i=0;i<S;++i){
      if(t[i]){
        cout << "INCONSISTENT STATE OF t[" << i << "] = " << t[i] << " DETECTED" << endl;
      }
    }
    if(-1 == execlp("/bin/ls","ls","repro.cpp",(char*)NULL)){
      cerr << "failed to execlp : " << strerror(errno) << endl;
      _exit(-4);
    }
  }
  return NULL;
}

This time I perform two loops such that the second one undoes the results of the first one, so at the end the global array t[] should be back in its initial state (which, being a global array, is all zeros).
If entering the child process freezes the other pthreads, making them unable to call vfork until the current child finishes its loops, then the array should be all zeros at the end.
And I confirmed that when I use fork() instead of vfork(), the above code does not produce any output.
However, when I change fork() to vfork(), I get tons of inconsistencies reported to stdout.
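
If I understand the mechanism correctly, that is because with vfork() the child shares the parent's address space, so all the children and the parent are writing to the same t[], while with fork() each child works on its own copy-on-write copy. A stripped-down sketch of just that difference (not one of my actual experiments, and writing to a global in the child is itself presumably one of the dubious things to do after vfork):

#include<unistd.h>
#include<sys/wait.h>
#include<iostream>
using namespace std;
int shared_value = 0;
int main(){
  pid_t pid = vfork();                 // swap in fork() to compare
  if(!pid){
    shared_value = 42;                 // dubious after vfork(); only here to show the sharing
    _exit(0);
  }
  wait(NULL);
  cout << "shared_value = " << shared_value << endl; // 42 with vfork(), 0 with fork()
}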

Experiment III

One more experiment is described in https://unix.stackexchange.com/a/163761/88901. It involved calling sleep, but the results were the same when I replaced the sleep with a long for loop.
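
From memory (this is my paraphrase, not the code from that answer, and the sleeper/talker names are mine), the shape of that experiment is roughly the following: one thread vforks and its child just sleeps, while another thread keeps printing. If the whole process were suspended, the messages could only appear after the child's sleep finishes; the observation there (and in my for-loop variant) was that they keep appearing during it.

#include<unistd.h>
#include<pthread.h>
#include<iostream>
using namespace std;
void * sleeper(void *){
  pid_t pid = vfork();
  if(!pid){
    sleep(5);                          // again, technically not allowed in a vfork child
    _exit(0);
  }
  return NULL;
}
void * talker(void *){
  for(int i = 0; i < 5; ++i){
    cerr << "other thread still running" << endl;
    sleep(1);
  }
  return NULL;
}
int main(){
  pthread_t a, b;
  pthread_create(&a, NULL, sleeper, NULL);
  pthread_create(&b, NULL, talker, NULL);
  pthread_join(a, NULL);
  pthread_join(b, NULL);
}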

Best Answer

The Linux man page for vfork is quite specific:

vfork() differs from fork(2) in that the calling thread is suspended until the child terminates

It is not the whole process that is suspended, but only the calling thread. This behavior is not guaranteed by POSIX or other standards; other implementations may do different things (up to and including simply implementing vfork as a plain fork).

(Rich Felker also notes this behavior in vfork considered dangerous.)

Using fork in a multi-threaded program is hard enough to reason about already, and calling vfork is at least as bad. Your tests are full of undefined behavior: you're not even allowed to call a function (let alone do I/O) inside the vfork'd child, except for the exec-family functions and _exit (not even exit, and returning from the child causes mayhem).
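
For reference, about the only well-defined shape for a vfork'd child is the bare exec-or-_exit pattern, roughly like this (a minimal sketch; /bin/true is an arbitrary choice):

#include<unistd.h>

int main(){
  pid_t pid = vfork();
  if(pid == 0){
    // The child may only store vfork's return value, call an exec-family
    // function, or call _exit: no other writes, no I/O, no returning.
    execlp("/bin/true","true",(char*)nullptr);
    _exit(127);                        // reached only if the exec failed
  }
  // The parent thread resumes here once the child has exec'd or _exit'd.
  return 0;
}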

Here's an example adapted from yours which I believe is nearly free of undefined behavior, assuming a compiler/implementation that doesn't emit function calls for atomic reads and writes on ints. (The one remaining problem is the write to start after the vfork; that's not allowed.) Error handling is elided to keep it short.

#include<unistd.h>
#include<pthread.h>
#include<signal.h>
#include<errno.h>
#include<atomic>
#include<cstring>
#include<string>
#include<iostream>

std::atomic<int> start;
std::atomic<int> counter;
const int thread_count = 4;

void *vforker(void *){
  std::cout << "vforker starting\n";
  int pid=vfork();
  if(pid == 0){
    start = 1;                        // the one bit of UB mentioned above: a write in the vfork child
    while (counter < (thread_count-1))
      ;                               // busy-wait until the other threads have incremented counter
    execlp("/bin/date","date",(char*)nullptr);
  }
  std::cout << "vforker done\n";
  return nullptr;
}

void *job(void *){
  while (start == 0)
    ;
  counter++;
  return NULL;
}

int main(){
  signal(SIGPIPE,SIG_IGN);
  signal(SIGCHLD,SIG_IGN);
  pthread_t thread[thread_count];
  counter = 0;
  start   = 0;

  pthread_create(&(thread[0]), nullptr, vforker, nullptr);
  for(int i=1;i<thread_count;++i)
    pthread_create(&(thread[i]), nullptr, job, nullptr);

  for(int i=0;i<thread_count;++i)
    pthread_join(thread[i], nullptr);
}

The idea is this: the normal threads wait (busy-loop) for the atomic global variable start to be 1 before incrementing a global atomic counter. The thread that does a vfork sets start to 1 in the vfork child, then waits (busy-loop again) for the other threads to have incremented the counter.

If the other threads were suspended during vfork, no progress could ever be made: the suspended threads would never increment counter (they'd have been suspended before start was set to 1), so the vforker thread would be stuck in an infinite busy-wait. Since the program does in fact complete and run /bin/date on Linux, the other threads are evidently not suspended.