Linux – Determine if a specific process is 32- or 64-Bit

64bit elf linux proc

Given a 2.6.x or newer Linux kernel and an existing userland that is capable of running both ELF32 and ELF64 binaries (i.e. well past the question "How do I know that my CPU supports 64bit operating systems under Linux?"), how can I determine if a given process (by PID) is running in 32- or 64-bit mode?

The naive solution would be to run:

file -L /proc/$PID/exe | grep -o 'ELF ..-bit [LM]SB'

but is that information exposed directly in /proc without relying on libmagic?

Best Answer

If you want to limit yourself to ELF detection, you can read the ELF header of /proc/$PID/exe yourself. It's quite trivial: if the 5th byte in the file is 1, it's a 32-bit binary. If it's 2, it's 64-bit. For added sanity checking:

If the first 5 bytes are 0x7f, "ELF", 1: it's a 32-bit ELF binary.
If the first 5 bytes are 0x7f, "ELF", 2: it's a 64-bit ELF binary.
Otherwise: it's inconclusive.
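
As a rough sketch of that check, assuming Python 3 is available (the names elf_class and process_elf_class are made up for this illustration), something like this reads the first five bytes and applies the three rules above:

```python
ELF_MAGIC = b"\x7fELF"


def elf_class(header: bytes) -> str:
    """Classify the leading bytes of a file as '32-bit', '64-bit' or 'inconclusive'."""
    # Need the 4-byte magic plus the class byte (the 5th byte of the file).
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return "inconclusive"
    # 1 means a 32-bit ELF file, 2 means 64-bit; anything else is not decided here.
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "inconclusive")


def process_elf_class(pid: int) -> str:
    """Hypothetical helper: apply elf_class() to /proc/<pid>/exe."""
    # Opening the exe link requires permission to inspect the target process.
    with open(f"/proc/{pid}/exe", "rb") as exe:
        return elf_class(exe.read(5))
```

Calling process_elf_class(os.getpid()) from a script (after import os) should report the class of the Python interpreter itself. Like the manual check, this only looks at the executable on disk.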

You could also use objdump, but that takes away your libmagic dependency and replaces it with a libbfd one (objdump is built on the BFD library).

Another way: you can also parse the /proc/$PID/auxv file. According to proc(5):

This contains the contents of the ELF interpreter information passed to the process at exec time. The format is one unsigned long ID plus one unsigned long value for each entry. The last entry contains two zeros.

The meanings of the unsigned long keys are in /usr/include/linux/auxvec.h. You want AT_PLATFORM, which is 15 (0x0f). Don't quote me on that, but it appears the value should be interpreted as a char * to get the string description of the platform. That pointer refers to the target process's address space, so the string itself is not stored in the auxv file.

You may find this StackOverflow question useful.

Yet another way: you can instruct the dynamic linker (man ld.so) to dump information about the executable. It prints the decoded AUXV structure to standard output. Warning: this is a hack, but it works.

LD_SHOW_AUXV=1 ldd /proc/$SOME_PID/exe | grep AT_PLATFORM | tail -1

This will show something like:

AT_PLATFORM:     x86_64

I tried it on a 32-bit binary and got i686 instead.

How this works: LD_SHOW_AUXV=1 instructs the Dynamic Linker to dump the decoded AUXV structure before running the executable. Unless you really like to make your life interesting, you want to avoid actually running said executable. One way to load and dynamically link it without actually calling its main() function is to run ldd(1) on it. The downside: LD_SHOW_AUXV is enabled by the shell, so you'll get dumps of the AUXV structures for: the subshell, ldd, and your target binary. So we grep for AT_PLATFORM, but only keep the last line.
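
If you would rather do the filtering in a script than with grep and tail, here is a minimal sketch in Python 3. The helper names are hypothetical, and it assumes ldd and the loader behave as described above. It keeps only the last AT_PLATFORM line, just as the pipeline does.

```python
import os
import subprocess


def last_at_platform(ld_output: str):
    """Return the value of the last 'AT_PLATFORM: ...' line, or None if absent."""
    platform = None
    for line in ld_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "AT_PLATFORM":
            platform = value.strip()  # later dumps overwrite earlier ones
    return platform


def platform_via_ldd(pid: int):
    """Hypothetical helper: run ldd on /proc/<pid>/exe with LD_SHOW_AUXV=1."""
    result = subprocess.run(
        ["ldd", f"/proc/{pid}/exe"],
        env={**os.environ, "LD_SHOW_AUXV": "1"},
        capture_output=True,
        text=True,
        check=False,  # ldd exits non-zero for static binaries; just return None then
    )
    return last_at_platform(result.stdout)
```

The parsing is kept separate from the subprocess call so it can be tried on captured output without running ldd at all.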

Parsing auxv: if you parse the auxv structure yourself (not relying on the dynamic loader), then there's a bit of a conundrum: the auxv structure follows the rules of the process it describes, so sizeof(unsigned long) will be 4 for 32-bit processes and 8 for 64-bit processes. We can make this work for us. For the format to work on 32-bit systems, all key codes must be 0xffffffff or less, so in a 64-bit process the most significant 32 bits of every key will be zero. Intel machines are little-endian, so these 32 bits follow the least significant ones in memory.

As such, all you need to do is:

1. Read 16 bytes from the `auxv` file.
2. Is this the end of the file?
3.     Then it's a 64-bit process.
4.     Done.
5. Is buf[4], buf[5], buf[6] or buf[7] non-zero?
6.     Then it's a 32-bit process.
7.     Done.
8. Go to 1.
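
The steps above could be sketched in Python 3 roughly as follows. The function names are invented for this example. Two details are assumptions on my part and not part of the steps: empty input is rejected instead of being reported as 64-bit, and a trailing chunk shorter than 16 bytes is treated as 32-bit, since 16-byte entries can never leave one. Like the steps, it assumes a little-endian machine.

```python
def auxv_word_size(auxv: bytes) -> int:
    """Return 32 or 64 by applying the 16-byte rule to raw auxv contents."""
    if not auxv:
        # Assumption: an unreadable or empty auxv tells us nothing.
        raise ValueError("empty auxv data")
    for offset in range(0, len(auxv), 16):
        chunk = auxv[offset:offset + 16]
        if len(chunk) < 16:
            # Assumption: only 8-byte (32-bit) entries can leave a short tail.
            return 32
        # In a 64-bit process, bytes 4..7 are the upper half of a key: always 0.
        # In a 32-bit process, they hold the value of an entry.
        if any(chunk[4:8]):
            return 32
    # Reached the end without seeing a non-zero upper half.
    return 64


def process_word_size(pid: int) -> int:
    """Hypothetical helper: apply auxv_word_size() to /proc/<pid>/auxv."""
    with open(f"/proc/{pid}/auxv", "rb") as f:
        return auxv_word_size(f.read())
```

Unlike the ELF header check, this looks at the running process and not at the file on disk.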

Parsing the maps file: this was suggested by Gilles, but didn't quite work. Here's a modified version that does. It relies on reading the /proc/$PID/maps file. If the file lists 64-bit addresses, the process is 64 bits. Otherwise, it's 32 bits. The problem lies in that the kernel will simplify the output by stripping leading zeroes from hex addresses in groups of 4, so the length hack can't quite work. awk to the rescue:

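# Assumes $pid holds the PID of the process to inspect.
if ! [ -e "/proc/$pid/maps" ]; then
    echo "No such process"
else
    # awk strings are 1-indexed; a start of 0 yields only 8 characters in some
    # awk implementations, so ask for characters 1 to 9 of the last start address.
    case $(awk <"/proc/$pid/maps" -- 'END { print substr($1, 1, 9); }') in
    *-) echo "32 bit process";;
    *[0-9A-Fa-f]) echo "64 bit process";;
    *) echo "Insufficient permissions.";;
    esac
fi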

This works by checking the starting address of the last memory map of the process. They're listed like 12345678-deadbeef. So, if the process is a 32-bit one, that address will be eight hex digits long, and the ninth will be a hyphen. If it's a 64-bit one, the highest address will be longer than that. The ninth character will be a hex digit.
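
The same test can be written without awk. Here is a small Python 3 sketch (maps_word_size is a made-up name) that looks at the start address of the last mapping and checks whether it is longer than eight hex digits, which is equivalent to asking whether the ninth character is a hyphen:

```python
def maps_word_size(maps_text: str):
    """Return 32 or 64 from the text of /proc/<pid>/maps, or None if it is empty."""
    lines = maps_text.splitlines()
    if not lines:
        # Nothing readable, for example because of insufficient permissions.
        return None
    # Each line starts with "start-end"; keep the start address of the last one.
    start = lines[-1].split("-", 1)[0]
    # Eight hex digits means the ninth character was the hyphen: 32-bit.
    return 64 if len(start) > 8 else 32
```

To use it on a live process, pass it the contents of the maps file, for example maps_word_size(open(f"/proc/{pid}/maps").read()).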

Be aware: all but the first and last methods need Linux kernel 2.6.0 or newer, since the auxv file wasn't there before.

Related Solutions

Linux – Unix/Linux Loader Process

A user generally encounters three types of ELF files: .o files, regular executables, and shared libraries. While all of these files serve different purposes, their internal structures are quite similar.

One universal concept among all different ELF file types (and also a.out and many other executable file formats) is the notion of a section. A section is a collection of information of a similar type. Each section represents a portion of the file. For example, executable code is always placed in a section known as .text; all data variables initialized by the user are placed in a section known as .data; and uninitialized data is placed in a section known as .bss.

Actually, one can devise an executable file format where everything is jumbled together (like MS-DOS). But dividing executables into sections has important advantages. For example, once you have loaded the executable portions of an executable into memory, these memory locations need not change. On modern machine architectures, the memory manager can mark portions of memory read-only, such that any attempt to modify a read-only memory location results in the program dying and dumping core. Thus, instead of merely saying that we do not expect a particular memory location to change, we can specify that any attempt to modify a read-only memory location is a fatal error indicating a bug in the application. That being said, typically you cannot individually set the read-only status for each byte of memory; instead you can individually set the protections of regions of memory known as pages. On the i386 architecture the page size is 4096 bytes, so you could indicate that addresses 0-4095 are read-only and addresses 4096 and up are writable, for example.

Given that we want all executable portions of an executable in read-only memory and all modifiable locations of memory (such as variables) in writable memory, it turns out to be most efficient to group all of the executable portions of an executable into one section of memory (the .text section), and all modifiable data areas together into another area of memory (henceforth known as the .data section).

A further distinction is made between data variables the user has initialized and data variables the user has not initialized. If the user has not specified the initial value of a variable, there is no sense wasting space in the executable file to store the value. Thus, initialized variables are grouped into the .data section, and uninitialized variables are grouped into the .bss section, which is special because it doesn't take up space in the file—it only tells how much space is needed for uninitialized variables.

When you ask the kernel to load and run an executable, it starts by looking at the image header for clues about how to load the image. It locates the .text section within the executable, loads it into the appropriate portions of memory, and marks these pages as read-only. It then locates the .data section in the executable and loads it into the user's address space, this time in read-write memory. Finally, it finds the location and size of the .bss section from the image header, and adds the appropriate pages of memory to the user's address space. Even though the user has not specified the initial values of variables placed in .bss, by convention the kernel will initialize all of this memory to zero.

So you see, it is actually the kernel that issues the orders to load the executable into memory. As a result, the text section is loaded into read-only memory and the data section is loaded into read-write memory.

Linux Proc – How to Determine Max User Process Value

Both values are correct, and they mean different things. /proc/sys/kernel/pid_max is the upper limit for PID values (one greater than the largest PID), while ulimit -u is the maximum number of processes available to a single user.

From man 5 proc:

/proc/sys/kernel/pid_max (since Linux 2.5.34)
              This  file  specifies the value at which PIDs wrap around (i.e.,
              the value in this file is one greater  than  the  maximum  PID).
              The  default  value  for  this  file, 32768, results in the same
              range of PIDs as on earlier kernels.  On 32-bit platforms, 32768
              is  the  maximum  value for pid_max.  On 64-bit systems, pid_max
              can be set to any value up to 2^22 (PID_MAX_LIMIT, approximately
              4 million).

From man bash:

ulimit [-HSTabcdefilmnpqrstuvx [limit]]
              .....
              -u     The maximum number of processes available to a single user
              .....
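
To look at both numbers from one place, a small Python 3 sketch could read pid_max from /proc and ask for the per-user process limit through the standard resource module (RLIMIT_NPROC is the limit that ulimit -u reports). The function names are made up for illustration.

```python
import resource


def parse_pid_max(text: str) -> int:
    """Parse the contents of /proc/sys/kernel/pid_max (a single decimal number)."""
    return int(text.strip())


def read_pid_max(path: str = "/proc/sys/kernel/pid_max") -> int:
    """Value at which PIDs wrap around: one greater than the largest PID."""
    with open(path) as f:
        return parse_pid_max(f.read())


def max_user_processes():
    """(soft, hard) per-user process limits; resource.RLIM_INFINITY means unlimited."""
    return resource.getrlimit(resource.RLIMIT_NPROC)
```

The two results are not expected to match: one bounds the PID numbers the kernel hands out, the other bounds how many processes one user may have at a time.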

Note

When a new process is created, it is assigned the next available number from the kernel's PID counter. When the counter reaches pid_max, the kernel restarts it at 300. From the Linux source code, in pid.c:

....
#define RESERVED_PIDS       300
....
static int alloc_pidmap(struct pid_namespace *pid_ns)
{
    int i, offset, max_scan, pid, last = pid_ns->last_pid;
    struct pidmap *map;

    pid = last + 1;
    if (pid >= pid_max)
        pid = RESERVED_PIDS;
