How to run strace on a user process for specific period of time, say 1 minute, without terminating the user process and without using Ctrl+C?
I want to create a script to automate running strace on a user process.
This is all perfectly normal. You aren't supposed to prevent the failing library lookups from happening.
execve("./hello", ["./hello"], [/* 62 vars */]) = 0
This is your program starting. Since it is dynamically linked, the first code to execute is from the dynamic loader.
brk(0) = 0x85a5000
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb774f000
The dynamic loader is allocating some heap space.
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
The dynamic loader checks whether there are dynamic libraries to preload. There aren't any.
open("/home/miguel/GNUstep/Library/Libraries/tls/i686/sse2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/home/miguel/GNUstep/Library/Libraries/tls/i686/sse2", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/home/miguel/GNUstep/Library/Libraries/tls/i686/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/home/miguel/GNUstep/Library/Libraries/tls/i686", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/home/miguel/GNUstep/Library/Libraries/tls/sse2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/home/miguel/GNUstep/Library/Libraries/tls/sse2", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/home/miguel/GNUstep/Library/Libraries/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/home/miguel/GNUstep/Library/Libraries/tls", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/home/miguel/GNUstep/Library/Libraries/i686/sse2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/home/miguel/GNUstep/Library/Libraries/i686/sse2", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/home/miguel/GNUstep/Library/Libraries/i686/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/home/miguel/GNUstep/Library/Libraries/i686", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/home/miguel/GNUstep/Library/Libraries/sse2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/home/miguel/GNUstep/Library/Libraries/sse2", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/home/miguel/GNUstep/Library/Libraries/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/home/miguel/GNUstep/Library/Libraries", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/usr/lib/tls/i686/sse2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/tls/i686/sse2", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/usr/lib/tls/i686/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/tls/i686", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/usr/lib/tls/sse2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/tls/sse2", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/usr/lib/tls/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/tls", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/usr/lib/i686/sse2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/i686/sse2", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/usr/lib/i686/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/i686", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/usr/lib/sse2/libc.so.6", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
stat64("/usr/lib/sse2", 0xbf8df160) = -1 ENOENT (No such file or directory)
open("/usr/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
The dynamic loader is looking for libc6, which is the standard library. It looks in several directories: first in the directories specified in the LD_LIBRARY_PATH environment variable, then in the directories listed in /etc/ld.so.conf. (See the manual for the full story.) In each directory, the loader checks several subdirectories first: it determines which hardware features are present (P6 instructions, SSE2), and looks for a version of the library binary which may use these extra features to run more efficiently; when it fails to find one that may use all the features, it looks for a more generic one. In the end, the library is found in a standard system directory, in a non-specialized version.
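The search order within one directory can be sketched as a small shell function. This is only an illustration of the order seen in the trace above: the function name `candidates` is made up, and the subdirectory list is copied from this particular trace (the real loader derives it from the hardware capabilities it detects, so it will differ on other machines and glibc versions).

```shell
#!/bin/sh
# candidates LIB DIR: hypothetical helper, not part of any real tool.
# Prints the paths the loader tried for LIB inside DIR, in the order
# observed in the trace above: most specialized subdirectory first,
# the plain directory last.
candidates() {
    lib=$1
    dir=$2
    # Subdirectory list copied from the trace (an assumption, not a fixed rule).
    for sub in tls/i686/sse2 tls/i686 tls/sse2 tls i686/sse2 i686 sse2 ""; do
        # ${sub:+$sub/} expands to "$sub/" only when sub is non-empty.
        echo "$dir/${sub:+$sub/}$lib"
    done
}
```

For example, `candidates libc.so.6 /usr/lib` lists the eight paths that appear in the `open` calls for /usr/lib above, ending with /usr/lib/libc.so.6, the one that finally succeeds.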
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300\177\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=2035943, ...}) = 0
mmap2(NULL, 1801892, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7573000
mmap2(0xb7724000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b0000) = 0xb7724000
mmap2(0xb7729000, 7844, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7729000
close(3) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7572000
set_thread_area({entry_number:-1, base_addr:0xb7572700, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0 (entry_number:6)
mprotect(0xb7724000, 12288, PROT_READ) = 0
mprotect(0xb7750000, 4096, PROT_READ) = 0
The standard library is loaded, then its initialization code runs.
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7571000
This is the result of executing the printf call.
write(1, "Hello World!\n", 13) = 13
exit_group(0) = ?
This is your program exiting, which includes flushing the stdout buffer.
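If the failed lookups make the trace hard to read, they can be filtered out after the fact rather than prevented. A minimal sketch, assuming a helper named `hide_failed_lookups` (a made-up name, not an strace feature) that reads strace output on standard input:

```shell
#!/bin/sh
# hide_failed_lookups: hypothetical helper name, not part of strace.
# Reads strace output on stdin and drops every call that failed
# with ENOENT, keeping all other lines unchanged.
hide_failed_lookups() {
    grep -v -- '= -1 ENOENT'
}
```

Since strace writes its output to standard error, a possible invocation is `strace ./hello 2>&1 | hide_failed_lookups`. Note that this also hides the failed check for /etc/ld.so.preload, and it mixes the program's own output into the same stream.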
I will answer for Linux only.
Surprisingly, in newer kernels, the ptrace system call, which is used by strace in order to actually perform the tracing, is allowed to trace the init process. The manual page says:
EPERM The specified process cannot be traced. This could be because the tracer has insufficient privileges (the required capability is CAP_SYS_PTRACE); unprivileged processes cannot trace processes that they cannot send signals to or those running set-user-ID/set-group-ID programs, for obvious reasons. Alternatively, the process may already be being traced, or (on kernels before 2.6.26) be init(8) (PID 1).
implying that, starting in version 2.6.26, you can trace init, although of course you must still be root in order to do so. The strace binary on my system allows me to trace init, and in fact I can even use gdb to attach to init and kill it. (When I did this, the system immediately came to a halt.)
ptrace cannot be used by a process to trace itself, so if strace did not check, it would nevertheless fail at tracing itself. The following program:
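#include <sys/ptrace.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Try to attach to ourselves; the kernel rejects this. */
    if (ptrace(PTRACE_ATTACH, getpid(), 0, 0) == -1) {
        perror(NULL);   /* print the message for the current errno */
        return 1;
    }
    return 0;
}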
prints Operation not permitted (i.e., the result is EPERM). The kernel performs this check in ptrace.c:
retval = -EPERM;
if (unlikely(task->flags & PF_KTHREAD))
        goto out;
if (same_thread_group(task, current)) // <-- this is the one
        goto out;
Now, it is possible for two strace processes to trace each other; the kernel will not prevent this, and you can observe the result yourself. For me, the last thing that the first strace process (PID = 5882) prints is:
ptrace(PTRACE_SEIZE, 5882, 0, 0x11
whereas the second strace process (PID = 5890) prints nothing at all. ps shows both processes in the state t, which, according to the proc(5) manual page, means trace-stopped.
This occurs because a tracee stops whenever it enters or exits a system call and whenever a signal is about to be delivered to it (other than SIGKILL).
Assume process 5882 is already tracing process 5890. Then we can deduce the following sequence of events:
1. Process 5890 enters the ptrace system call, attempting to trace process 5882. Process 5890 enters trace-stop.
2. Process 5882 receives SIGCHLD to inform it that its tracee, process 5890, has stopped. (A trace-stopped process appears as though it received the SIGTRAP signal.)
3. Process 5882 calls ptrace(PTRACE_SYSCALL, 5890, ...) to allow process 5890 to continue.
4. Process 5890 resumes and performs its ptrace(PTRACE_SEIZE, 5882, ...). When the latter returns, process 5890 enters trace-stop.
5. Process 5882 is sent SIGCHLD since its tracee has just stopped again. Since it is being traced, the receipt of the signal causes it to enter trace-stop.
Now both processes are stopped. The end.
As you can see from this example, the situation of two processes tracing each other does not create any inherent logical difficulties for the kernel, which is probably why the kernel code does not contain a check to prevent this situation from happening. It just happens to not be very useful for two processes to trace each other.
Best Answer
With timeout in GNU coreutils, you can do:
timeout 60 strace -p PID
Here is an example.
test.sh:
Run it:
Run strace with timeout:
After 1 minute:
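For scripting, the same idea can be wrapped in a function. The following is a minimal sketch, not the original poster's script: the name `trace_for`, its argument order, and the use of strace's `-o` option to write the trace to a file are choices made here for illustration. It relies on strace detaching from the target when timeout sends it SIGTERM, so the target keeps running.

```shell
#!/bin/sh
# trace_for PID SECONDS OUTFILE: hypothetical helper name.
# Attaches strace to PID for at most SECONDS seconds, writes the trace
# to OUTFILE, and leaves the traced process running.
trace_for() {
    pid=$1
    secs=$2
    out=$3
    # kill -0 sends no signal; it only checks that we may signal the process.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "cannot signal process $pid" >&2
        return 1
    fi
    # After $secs seconds timeout sends SIGTERM to strace, which detaches.
    timeout "$secs" strace -p "$pid" -o "$out"
    status=$?
    # timeout exits with 124 when the limit was reached: the expected case.
    [ "$status" -eq 124 ] && return 0
    return "$status"
}
```

A possible call is `trace_for 1234 60 /tmp/trace.txt`, where 1234 stands for the PID of the process to trace. Attaching to a process you did not start may require root or a relaxed ptrace policy, depending on the system.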