How could running strace be fixing an OpenGL issue?

opengl, segmentation fault, strace

Since a recent major upgrade to my distribution (PLD Linux), I have been having trouble with a whole slew of programs. As best I can tell, anything that touches OpenGL or PulseAudio segfaults. I'm using the proprietary nvidia drivers and a 3.2.x kernel. Xorg itself runs fine and I am able to run most programs; however, things like mplayer segfault, and no program produces any sound.

Once I figured out that it might be related to OpenGL, I started playing with glxgears as a test. Running it by itself segfaults instantly. Then I discovered that it runs fine under strace. The same is true for mplayer: running it on a test mp3 file segfaults instantly, while running strace mplayer plays the file just fine (although PulseAudio still dies and mplayer falls back to a dummy output device).

How could running something under strace keep it from segfaulting, and how should I continue debugging the situation?

Best Answer

I have observed that Nvidia's libGL.so attempts to detect whether the current process is being traced by opening /proc/self/status and looking for "TracerPid:". Different code paths are taken depending on whether the value of TracerPid is non-zero (i.e., whether the current process is being traced).
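You can look at the same field yourself. The following is a minimal sketch, not what libGL actually does internally: `tracer_pid` is a hypothetical helper name, and it assumes a Linux /proc filesystem and an awk on the PATH.

```shell
# Print the PID of the tracer attached to a process (0 means "not traced").
# With no argument it inspects the calling shell; pass a PID to inspect
# another process. tracer_pid is a hypothetical helper name.
tracer_pid() {
    # The second field of the "TracerPid:" line in /proc/<pid>/status
    awk '/^TracerPid:/ { print $2 }' "/proc/${1:-$$}/status"
}

tracer_pid
```

Calling it with the PID of a process that was started under strace should print the PID of the strace process instead of 0.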

Install sysdig and capture a trace of the offending process twice: once while stracing and once without strace. For example:

$ sysdig -w glxgears.scap proc.name=glxgears &
$ glxgears &
$ kill -TERM `pidof glxgears`
$ kill -TERM `pidof sysdig`
$ sysdig -w glxgears-strace.scap proc.name=glxgears &
$ strace glxgears &
$ kill -TERM `pidof glxgears`
$ kill -TERM `pidof sysdig`

Compare the textual output of the two traces to see where the execution flow diverges between the straced and non-straced runs of glxgears.
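One way to do the comparison is sketched below, assuming the two capture files produced by the commands above. `dump_trace` and `compare_dumps` are hypothetical helper names; the output format string is only a suggestion, chosen to leave out timestamps and event numbers, which differ on every line.

```shell
# Render a capture file as text, one event per line, without timestamps
# or event numbers so that the two runs line up more easily.
dump_trace() {
    sysdig -r "$1" -p '%evt.dir %evt.type %evt.args'
}

# Show a unified diff of two text dumps (diff exits non-zero if they differ).
compare_dumps() {
    diff -u "$1" "$2"
}

# Usage, with the capture files from the commands above:
#   dump_trace glxgears.scap        > plain.txt
#   dump_trace glxgears-strace.scap > traced.txt
#   compare_dumps plain.txt traced.txt | less
```

Expect some noise from values that legitimately change between runs, such as PIDs and addresses in the arguments. The first large block of differing events is usually the interesting part.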

strace "fixes" your OpenGL issue because libGL behaves differently depending on whether the process is being traced or debugged.
