With timeout from GNU coreutils, you can do:
- Get the process ID
- Run timeout 60 strace -p PID
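As a minimal sketch of what timeout does when the limit expires, the snippet below uses sleep as a stand-in for strace -p PID. That substitution is an assumption made only so the sketch runs without ptrace permission. The 1-second limit and the status variable are illustrative as well.

```shell
#!/bin/bash
# sleep stands in for "strace -p PID" (an assumption, so that the sketch
# runs without ptrace permission); timeout handles any command the same way.
timeout 1 sleep 5
status=$?

# GNU timeout exits with status 124 when it had to stop the command
# because the time limit expired.
echo "timeout exit status: $status"
```

By default timeout sends SIGTERM to the command once the limit is reached, so a status of 124 after the real timeout 60 strace -p PID would suggest the trace ended because the minute was up rather than because strace failed.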
Here is an example, test.sh:
#!/bin/bash
while :; do
echo "$$"
sleep 100
done
Run it:
$ ./test.sh
27121
Run strace with timeout:
% cuonglm at ~
% timeout 60 strace -p 27121
Process 27121 attached - interrupt to quit
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27311
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(-1, 0x7fff374b8598, WNOHANG, NULL) = -1 ECHILD (No child processes)
rt_sigreturn(0xffffffffffffffff) = 0
rt_sigaction(SIGINT, {0x45c4d0, [], SA_RESTORER, 0x7fcdc10e05c0}, {0x443910, [], SA_RESTORER, 0x7fcdc10e05c0}, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
write(1, "27121\n", 6) = 6
rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fcdc1a699d0) = 27328
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {0x443910, [], SA_RESTORER, 0x7fcdc10e05c0}, {0x45c4d0, [], SA_RESTORER, 0x7fcdc10e05c0}, 8) = 0
wait4(-1,
After 1 minute:
....
rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fcdc1a699d0) = 27328
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
rt_sigaction(SIGINT, {0x443910, [], SA_RESTORER, 0x7fcdc10e05c0}, {0x45c4d0, [], SA_RESTORER, 0x7fcdc10e05c0}, 8) = 0
wait4(-1, <unfinished ...>
Process 27121 detached
% cuonglm at ~
No, strace should not cause a program to crash, except in one somewhat unusual case: if the program has a bug that depends on the timing of execution or on runtime memory locations.
strace may trigger this kind of "heisenbug", but only very rarely, because such bugs are rare and this one would have to trigger only under strace or other instrumentation.
And when you find a heisenbug, that's often a good thing.
Regarding ptrace(), the syscall: that is just what strace uses internally, I think, so it's similar. Using ptrace() directly, one can simply do more than strace can.
Your example would be just this kind of bug: strace would change the timing of the steps that create a network connection. If that causes a problem, it was a "problem waiting to happen", because the timing of execution changes constantly; with strace, it changes just a little more. Anything else running on the system, such as another program starting up, could have changed the timing even more.
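To make the idea of a timing-dependent bug concrete, here is a hypothetical sketch that is not taken from the question. A reader assumes a writer always finishes within one second. The function name race, the file names, and the delays are all invented for illustration, and the slowdown is greatly exaggerated compared with what strace would normally add.

```shell
#!/bin/bash
# Hypothetical illustration of a timing-dependent bug: the reader assumes
# that the writer always finishes within 1 second.
tmp=$(mktemp -d)

race() {
    local writer_delay=$1 file="$tmp/ready.$1"
    ( sleep "$writer_delay"; echo done > "$file" ) &   # writer, possibly slowed down
    sleep 1                                            # reader's fixed assumption
    if [ -e "$file" ]; then echo ok; else echo missing; fi
    wait                                               # let the writer finish
}

normal=$(race 0.2)   # writer at its usual speed
slowed=$(race 2)     # writer delayed, e.g. by tracing or a busy system
echo "normal: $normal, slowed: $slowed"
rm -rf "$tmp"
```

The bug is in the reader's fixed one-second assumption, not in whatever slowed the writer down. Tracing only makes the existing problem visible.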
Best Answer
Use:
If you want to learn more about a command, it is best to read its documentation. The strace documentation is available in man format, so it's just a matter of running man strace. In there, you'll find a section about Filtering, which describes the syntax of the -e operands.