FreeBSD vs Linux: performance of kernel calling conventions

conventions, freebsd, linux, performance, system-calls

From int80h.org, the FreeBSD Assembly Language Tutorial

[The Linux Calling] convention has a great disadvantage over the Unix way, at least as far as assembly language programming is concerned: Every time you make a kernel call you must push the registers, then pop them later. This makes your code bulkier and slower.

It goes on to say, about FreeBSD supporting both the Linux convention and the "Unix convention":

If you are coding specifically for FreeBSD, you should always use the Unix convention: It is faster, you can store global variables in registers, you do not have to brand the executable, and you do not impose the installation of the Linux emulation package on the target system.

It seems weird to me that the Linux way would be bulkier and slower. It seems as if there are two options:

  • Save just the registers you need to preserve which are either
    • those volatile registers that may be clobbered by the system call (to my knowledge ecx)
    • or, the registers needed to send the system call number and the appropriate arguments to the kernel to make the syscall (which may be eax, ebx, ecx, edx, esi, edi, ebp)
  • Save 100% of the arguments to the kernel on the stack.

It would seem like the FreeBSD one is the worst case scenario of the Linux convention. What am I missing? How is the FreeBSD convention (which they call the "Unix way") less bulky and faster?

Best Answer

This really boils down to the author’s opinion, in my opinion.

In the FreeBSD (“Unix”) convention, you push the arguments on the stack, specify the system call number in EAX, and invoke interrupt 0x80 (with an extra dword on the stack, because the kernel expects the interrupt to be issued from a separate function and so expects a return address sitting above the arguments).

In the Linux i386 convention, you place the arguments in the appropriate registers, and invoke interrupt 0x80.
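The two conventions described above can be sketched side by side with GCC-style inline assembly. This is only a minimal illustration, not code from int80h.org or any C library: `raw_write` is a hypothetical helper name, system call number 4 is assumed to be `write` on both i386 ABIs, error reporting is ignored, and on any other platform the function simply falls back to the libc `write` wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: write(2) issued "by hand"; error handling omitted. */
long raw_write(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
#if defined(__linux__) && defined(__i386__)
    /* Linux i386: call number in EAX, arguments in EBX, ECX, EDX. */
    long ret;
    __asm__ volatile ("int $0x80"
                      : "=a"(ret)
                      : "0"(4L), "b"(fd), "c"(buf), "d"(len)
                      : "memory");
    return ret;
#elif defined(__FreeBSD__) && defined(__i386__)
    /* FreeBSD i386: arguments pushed right to left, then one extra dword
     * standing in for the return address of the usual "kernel" stub. */
    long ret;
    __asm__ volatile ("pushl %4\n\t"          /* len */
                      "pushl %3\n\t"          /* buf */
                      "pushl %2\n\t"          /* fd  */
                      "pushl $0\n\t"          /* dummy return address slot */
                      "int $0x80\n\t"
                      "addl $16, %%esp"       /* caller cleans up the stack */
                      : "=a"(ret)
                      : "0"(4L), "r"(fd), "r"(buf), "r"(len)
                      : "edx", "memory", "cc");
    return ret;
#else
    /* Assumption: on every other target just use the libc wrapper. */
    return (long)write(fd, buf, len);
#endif
}
```

Calling `raw_write(1, "hi\n", 3)` from `main` should print `hi` on any of the three paths. Note that in the Linux branch the compiler, not the programmer, decides whether EBX, ECX, and EDX have to be saved around the call, which is exactly the bookkeeping an assembly programmer has to do by hand.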

The bulky/slow argument presumably comes from the fact that with the Linux convention, the caller needs to deal with its use of registers. If the system call needs arguments in registers that contain values the caller cares about, the caller needs to preserve them, which results in additional legwork; see this example from the C library. In this example, the system call needs values in EAX, EBX, EDX, EDI, and ESI; but the caller only cares about preserving EBX, EDI, and ESI, so it only pushes those to the stack. The general case is quite a bit more complex, but that’s also the result of dealing with a mixture of C and assembly language, and of trying to generate optimal code in all cases. When writing purely in assembly language, which is the point of the site you’re referring to, that wouldn’t be as much of an issue.

It seems to me that it’s six of one and half a dozen of the other: in the FreeBSD convention, you push to the stack in all cases; in the Linux convention, you push to the stack (or elsewhere) depending on what you’re doing around the call site. You could argue that the Linux convention enables faster code since you can perform all your calculations in registers... As Rob points out, however, on Linux the registers still end up being pushed (to build the struct pt_regs instance which is used to provide the arguments to the C functions which deal with the system calls), so the overall cost is greater on the Linux side than on the FreeBSD side.
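The pt_regs point can be pictured with a toy user-space model. This is not kernel code: `toy_regs`, `toy_dispatch`, and `toy_add3` are hypothetical names, and the struct only mimics the first few general-purpose slots (the real struct pt_regs has more fields). It merely shows that values passed in registers are stored to memory on entry and read back from there by the C handler.

```c
/* Simplified, hypothetical stand-in for the saved i386 register frame. */
struct toy_regs {
    long bx, cx, dx, si, di, bp, ax;
};

/* A handler written in C takes ordinary function arguments. */
typedef long (*toy_handler)(long, long, long);

/* Example handler, purely illustrative. */
long toy_add3(long a, long b, long c)
{
    return a + b + c;
}

/* The dispatcher reads the arguments back out of the saved frame, so the
 * register values have gone through memory even though the caller never
 * pushed them itself. */
long toy_dispatch(const struct toy_regs *regs, toy_handler handler)
{
    return handler(regs->bx, regs->cx, regs->dx);
}
```

In this model, passing arguments in registers does not avoid the stores; it only moves them from the caller to the entry path.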

In any case, arguing about performance when talking about stack or register-based code around a system call seems rather pedantic, given the cost of performing the system call itself. Any cycle saved is of course good in absolute terms, but the relative improvement will be tiny.
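One rough way to check that claim on your own machine is a micro-benchmark along the following lines. It is a sketch under assumptions: `ns_per_syscall` and `ns_per_spill` are hypothetical helper names, the libc is assumed to provide `syscall()` and `SYS_getpid` (as glibc and FreeBSD's libc do), and three volatile stores and loads stand in for pushing and popping three registers. The numbers it produces depend heavily on CPU, kernel, and mitigations, so none are quoted here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Monotonic clock reading in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average cost of one real kernel entry (getpid via the raw syscall path). */
double ns_per_syscall(long iters)
{
    assert(iters > 0);
    uint64_t start = now_ns();
    for (long i = 0; i < iters; i++)
        (void)syscall(SYS_getpid);
    return (double)(now_ns() - start) / (double)iters;
}

/* Average cost of spilling three values to the stack and reloading them,
 * a stand-in for push/pop of three registers around a call site. */
double ns_per_spill(long iters)
{
    assert(iters > 0);
    volatile long slot[3];
    long a = 1, b = 2, c = 3;
    uint64_t start = now_ns();
    for (long i = 0; i < iters; i++) {
        slot[0] = a; slot[1] = b; slot[2] = c;              /* "push" */
        a = slot[0] + 1; b = slot[1] + 1; c = slot[2] + 1;  /* "pop"  */
    }
    return (double)(now_ns() - start) / (double)iters;
}
```

Printing `ns_per_syscall(1000000)` next to `ns_per_spill(1000000)` from `main` gives a feel for the ratio between the two costs on a given system.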