Understanding The Linux Kernel says that execve() calls do_execve(), which in turn
copies the file pathname, command-line arguments, and environment strings
into one or more newly allocated page frames. (Eventually, they are assigned to
the User Mode address space.)
Am I correct that after execve() returns successfully, the process invokes the _start routine of crt0.o?
According to APUE:
When a C program is executed by the kernel, by one of the exec functions, a special start-up routine is called before the main function is called. The executable program file specifies this routine as the starting address for the program; this is set up by the link editor when it is invoked by the C compiler. This start-up routine takes values from the kernel (the command-line arguments and the environment) and sets things up so that the main function is called as shown earlier.
Does the _start routine also copy the command-line arguments and the environment again?
What is the difference between do_execve() and _start, if both copy the command-line arguments and the environment? Isn't it wasteful to copy twice?
Thanks.
Best Answer
Not necessarily. When the execve system call returns, the process continues executing from whatever text/code address is the entry point of the binary (in ELF, that's the e_entry field from the header). _start is simply the usual name of the entry point routine on many (most?) Unix systems.

As for copying the arguments and the environment again: _start could do that, but usually it does no such thing. The only thing it has to do is rearrange them in a way that lets them be passed to a C function like main.

The problem is that you can't simply declare the entry point as a C function and get the arguments with va_arg, because e.g. on x86_64 the C calling convention expects the (first few) arguments to be passed in registers, and that's not how they're passed to _start (the kernel leaves them on the initial stack).

There are other things that
_start usually does before calling main; a very important one is running the static constructors, which is required by programs written in C++, but can be used by any program if it defines the correct section attributes in the ELF binary (with gcc, you can do that in a C program by defining a function with __attribute__((constructor))).

The standard startup code in a glibc-based system will also go through a function defined in the (dynamically linked)
libc.so, __libc_start_main(), which is very nice because you can override it from a preloaded dynamic library and add your own initialization code without having to modify the binary. Look here for an example.