Shell – When are the built-in commands loaded to memory

shell-builtin

Let's say I type cd in my shell. Is cd loaded into memory at that moment? My intuition is that these built-in commands are pre-loaded into system memory after the kernel has been loaded, but someone insisted that they are loaded only when I actually invoke the command (press Enter in a shell). Could you please tell me if there is a reference that explains this?

Best Answer

Let's say I type cd in my shell. Is cd loaded into memory at that moment? My intuition is that these built-in commands are pre-loaded into system memory after the kernel has been loaded, but someone insisted that they are loaded only when I actually invoke the command...

In broad terms the other answers are correct -- the built-ins are loaded with the shell, and the stand-alones are loaded when invoked. However, a very stickler-ish, weasel-y "someone" could insist that it isn't that simple.
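To check which category a given command falls into, one option is to ask the shell itself. The sketch below is only an illustration: it assumes a POSIX sh on the PATH, and the helper name classify is made up for this example. It relies on command -v printing an absolute path for stand-alone utilities and the bare name for things the shell handles internally.

```python
import subprocess


def classify(command: str) -> str:
    """Ask a POSIX sh how it would resolve `command` (hypothetical helper)."""
    result = subprocess.run(
        # "$1" is the command name; the extra "sh" fills in $0.
        ["sh", "-c", 'command -v "$1"', "sh", command],
        capture_output=True,
        text=True,
    )
    resolved = result.stdout.strip()
    if result.returncode != 0 or not resolved:
        return "not found"
    # Stand-alone utilities resolve to an absolute path on disk.
    # Built-ins (and other shell-internal names such as keywords)
    # resolve to their bare name.
    return "external" if resolved.startswith("/") else "builtin"


if __name__ == "__main__":
    for name in ("cd", "ls"):
        print(name, "->", classify(name))
```

A name reported as "builtin" here is code inside the shell's own executable; a name reported as "external" is a separate file that gets its own process and its own mapping when invoked.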

This discussion is somewhat about how the OS works, and different OSes work in different ways, but I think in general the following is probably true for all contemporary *nixes.

First, "loaded into memory" is an ambiguous phrase; what we really mean is "has its virtual address space mapped". This is significant because "virtual address space" refers to stuff that may need to be placed into memory but initially is not: mostly what is actually loaded into memory is the map itself -- and the map is not the territory. The "territory" would be the executable on disk (or in the disk cache) and, in fact, most of that is probably not loaded into the process's memory when you invoke an executable.

Also, much of "the territory" is references to other territories (shared libraries), and again, just because they have been referred to does not mean they are really loaded either. They don't get loaded until they are actually used, and then only the pieces of them that actually need to be loaded in order for whatever "the use" is to succeed.

For example, here's a snippet of top output on Linux for a bash instance:

VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                  
113m 3672 1796 S  0.0  0.1   0:00.07 bash   

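The 113 MB VIRT is the size of the virtual address space, i.e. the map. RES is the amount of RAM the process actually occupies; top reports RES and SHR in KiB by default, so that is 3672 KiB, about 3.6 MB, or roughly 3% of what is mapped. Of that, 1796 KiB (about 1.8 MB) is SHR, the part that can be shared with other processes, which includes the shared territory mentioned above. Yet my /bin/bash on disk is 930 kB, and the basic libc it links to (a shared lib) is twice as big again: together roughly 2.8 MB of file contents, more than all the shareable pages actually resident, so clearly not all of either file has been loaded.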
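The same two figures can be read without top. The following is a rough sketch, assuming Linux with /proc mounted; the function name memory_summary is invented for this example. It pulls VmSize (what top shows as VIRT) and VmRSS (what top shows as RES) out of /proc/<pid>/status, where both are reported in kB.

```python
from pathlib import Path


def memory_summary(pid="self"):
    """Return {'VmSize': kB, 'VmRSS': kB} for a process.

    Returns None when /proc/<pid>/status is unavailable (e.g. not Linux).
    """
    status = Path(f"/proc/{pid}/status")
    if not status.exists():
        return None
    fields = {}
    for line in status.read_text().splitlines():
        key, _, value = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            # Lines look like "VmRSS:     3672 kB"; keep only the number.
            fields[key] = int(value.split()[0])
    return fields


if __name__ == "__main__":
    summary = memory_summary()
    if summary is None:
        print("no /proc here; this sketch is Linux-only")
    else:
        print(f"mapped:   {summary.get('VmSize')} kB")
        print(f"resident: {summary.get('VmRSS')} kB")
```

Passing the PID of a running shell instead of "self" should show the same pattern as the top snippet: the mapped size is far larger than the resident size.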

That shell isn't doing anything right now. Let's say I invoke a built-in command, which we said earlier was already "loaded into memory" along with the rest of the shell. The CPU starts executing whatever code is involved at some point in the map, and when it reaches a page that hasn't really been loaded, a page fault occurs and the kernel loads that page -- from an executable image on disk (or the disk cache) -- even though in a more casual sense, that executable (be it the shell, a stand-alone tool, or a shared library) was already "loaded into memory".

This is called demand paging.
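
The effect can be observed from user space. The sketch below is an illustration under stated assumptions, not a measurement of bash itself: it assumes Linux with /proc mounted, the names resident_kib and touch_pages are made up here, and it uses an anonymous mapping rather than an executable image, so pages are zero-filled on first touch instead of read from disk. The mechanism of "mapped first, made resident only when touched" is the same.

```python
import mmap
from pathlib import Path

PAGE = mmap.PAGESIZE


def resident_kib():
    """Resident set size of this process in KiB, or None without /proc."""
    statm = Path("/proc/self/statm")
    if not statm.exists():
        return None
    # Second field of statm is the resident size, counted in pages.
    resident_pages = int(statm.read_text().split()[1])
    return resident_pages * PAGE // 1024


def touch_pages(size=64 * 1024 * 1024):
    """Map `size` bytes, then touch every page; return RSS before and after."""
    region = mmap.mmap(-1, size)      # mapped, but no page touched yet
    after_map = resident_kib()
    for offset in range(0, size, PAGE):
        region[offset] = 1            # first write to a page triggers a fault
    after_touch = resident_kib()
    region.close()
    return after_map, after_touch


if __name__ == "__main__":
    before, after = touch_pages()
    if before is None:
        print("no /proc here; this sketch is Linux-only")
    else:
        print(f"resident after mapping:  {before} KiB")
        print(f"resident after touching: {after} KiB")
```

On a typical Linux system the resident size should barely move when the region is mapped and then grow by roughly the size of the region once every page has been touched.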