Say I log into a shell on a unix system and begin tapping away commands. I initially begin in my user's home directory ~. I might from there cd down to the directory Documents.
The command to change working directory here is intuitively very simple to understand: the parent node has a list of child nodes that it can access, and presumably it uses an (optimised) variant of a search to locate a child node with the name the user entered, and the working directory is then "altered" to match this (correct me if I'm wrong there). It may even be simpler: the shell "naively" tries to access the directory exactly as per the user's wishes, and when the file system returns some type of error, the shell displays a response accordingly.
What I am interested in however, is how the same process works when I navigate up a directory, i.e. to a parent, or a parent's parent.
Given my unknown, presumably "blind" location of Documents, one of possibly many directories in the entire file system tree with that name, how does Unix determine where I should be placed next? Does it make a reference to pwd and examine that? If yes, how does pwd track the current navigational state?
Best Answer
The other answers are oversimplifications, each presenting only parts of the story, and are wrong on a couple of points.
There are two ways in which the working directory is tracked:
For every process, the kernel keeps, in the data structure that represents that process, references to the vnodes of the working directory and the root directory. The former reference is set by the chdir() and fchdir() system calls, the latter by chroot(). One can see them indirectly in /proc on Linux operating systems or via the fstat command on FreeBSD and the like.
When pathname resolution operates, it begins at one or the other of those referenced vnodes, according to whether the path is relative or absolute. (There is a family of …at() system calls that allow pathname resolution to begin at the vnode referenced by an open (directory) file descriptor as a third option.)
In microkernel Unices the data structure is in application space, but the principle of holding open references to these directories remains the same.
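A rough way to see that the kernel holds a reference to the directory itself, rather than to its name, is the following Python sketch. It is only an illustration under assumptions: the helper name demo_reference_survives_rename and the file names are made up here, and it assumes a Unix-like system where os.open() accepts dir_fd (Python's interface to the …at() family). It changes into a directory, renames that directory, and then resolves one name relative to the working directory and another relative to an open directory descriptor.

```python
import os
import shutil
import tempfile


def demo_reference_survives_rename():
    """Return (name_before, name_after, created) for a directory renamed under us."""
    saved = os.open(".", os.O_RDONLY)            # descriptor for the original working directory
    base = os.path.realpath(tempfile.mkdtemp())  # scratch area, removed again at the end
    old = os.path.join(base, "old")
    new = os.path.join(base, "new")
    os.mkdir(old)
    try:
        os.chdir(old)                            # chdir(): the kernel now references this directory
        before = os.path.basename(os.getcwd())
        os.rename(old, new)                      # rename it while it is our working directory
        after = os.path.basename(os.getcwd())    # same directory, reported under its new name

        # A relative name resolved starting from the working directory ...
        with open("via_cwd.txt", "w"):
            pass
        # ... and one resolved starting from an open directory descriptor.
        dirfd = os.open(new, os.O_RDONLY)
        try:
            fd = os.open("via_dirfd.txt", os.O_WRONLY | os.O_CREAT, 0o600, dir_fd=dirfd)
            os.close(fd)
        finally:
            os.close(dirfd)
        created = sorted(os.listdir(new))
        return before, after, created
    finally:
        os.fchdir(saved)                         # fchdir(): go back by descriptor, not by name
        os.close(saved)
        shutil.rmtree(base)
```

Both files end up in the renamed directory, because neither lookup depends on the name the directory had when the process changed into it.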
Within the shell, the name of the working directory is additionally tracked in an internal string variable, which the shell updates whenever it has cause to call chdir(). If one changes to a relative pathname, it manipulates the string to append that name. If one changes to an absolute pathname, it replaces the string with the new name. In both cases, it adjusts the string to remove . and .. components and to chase down symbolic links replacing them with their linked-to names. (Here is the Z shell's code for that, for example.)
The name in the internal string variable is tracked by a shell variable named PWD (or cwd in the C shells). This is conventionally exported as an environment variable (named PWD) to programs spawned by the shell.
These two methods of tracking things are revealed by the
-P and -L options to the cd and pwd shell built-in commands, and by the differences between the shells' built-in pwd commands and both the /bin/pwd command and the built-in pwd commands of things like (amongst others) VIM and NeoVIM.
Obtaining the "logical" working directory is a matter of looking at the PWD shell variable (or environment variable if one is not the shell program); whereas obtaining the "physical" working directory is a matter of calling the getcwd() library function.
The operation of the
/bin/pwd program when the -L option is used is somewhat subtle. It cannot trust the value of the PWD environment variable that it has inherited. After all, it need not have been invoked by a shell and intervening programs may not have implemented the shell's mechanism of making the PWD environment variable always track the name of the working directory. Or someone may have set PWD by hand.
So what it does is (as the POSIX standard says) check that the name given in PWD and the name . yield the same thing, as can be seen with a system call trace. It only calls getcwd() if it detects a mismatch; and it can be fooled by setting PWD to a string that does indeed name the same directory, but by a different route.
The
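getcwd() library function is a subject in its own right. But to précis: originally it built up a pathname by repeatedly working its way back up towards the root via the .. directory. It stopped when it reached a loop where .. was the same as its working directory or when there was an error trying to open the next .. up. This would be a lot of system calls under the covers. On FreeBSD and some other operating systems it is nowadays a true system call, with the traversal up to the root done inside the kernel.
However, note that even on FreeBSD and those other operating systems the kernel does not keep track of the working directory with a string.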
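The two ways of obtaining the name can be sketched in Python. This is an illustration under assumptions, not the code of any real pwd or C library: physical_pwd() mimics the old library approach of climbing through .. and searching each parent for the matching entry, and logical_pwd() mimics the check described above for /bin/pwd -L, trusting PWD only if it is an absolute name without . or .. components that refers to the same directory as the name . does. All function names are invented here.

```python
import os


def _ident(st):
    """Identify a file by its device and inode numbers."""
    return (st.st_dev, st.st_ino)


def physical_pwd():
    """Build the working directory's name by climbing through '..' (old-style getcwd)."""
    names = []
    here = "."
    cur = os.stat(here)
    while True:
        up = os.path.join(here, "..")
        parent = os.stat(up)
        if _ident(parent) == _ident(cur):
            break                                  # '..' is the same directory: this is the root
        found = None
        with os.scandir(up) as entries:            # look for our own entry in the parent
            for entry in entries:
                try:
                    if _ident(os.lstat(entry.path)) == _ident(cur):
                        found = entry.name
                        break
                except OSError:
                    continue                       # entry vanished or is inaccessible: skip it
        if found is None:
            raise FileNotFoundError("no entry for the directory in its parent")
        names.append(found)
        here, cur = up, parent
    return "/" + "/".join(reversed(names))


def logical_pwd(environ=os.environ):
    """Return PWD if it passes the check, otherwise fall back to the physical name."""
    pwd = environ.get("PWD", "")
    parts = pwd.split("/")
    if pwd.startswith("/") and "." not in parts and ".." not in parts:
        try:
            if _ident(os.stat(pwd)) == _ident(os.stat(".")):
                return pwd                         # PWD names the working directory: trust it
        except OSError:
            pass                                   # PWD does not resolve: treat as a mismatch
    return physical_pwd()
```

After changing into a directory through a symbolic link, logical_pwd() given a PWD that names the link returns the link's name, while physical_pwd() returns the linked-to name. This also shows how the check can be satisfied by any string that reaches the same directory by a different route.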
Navigating to .. is again a subject in its own right. Another précis: Although directories conventionally (albeit, as already alluded to, this is not required) contain an actual .. in the directory data structure on disc, the kernel tracks the parent directory of each directory vnode itself and can thus navigate to the .. vnode of any working directory. This is somewhat complicated by the mountpoint and changed root mechanisms, which are beyond the scope of this answer.
Aside
Windows NT in fact does a similar thing. There is a single working directory per process, set by the SetCurrentDirectory() API call and tracked per process by the kernel via an (internal) open file handle to that directory; and there is a set of environment variables that Win32 programs (not just the command interpreters, but all Win32 programs) use to track the names of multiple working directories (one per drive), appending to or overwriting them whenever they change directory.
Conventionally, unlike the case with Unix and Linux operating systems, Win32 programs do not display these environment variables to users. One can sometimes see them in Unix-like subsystems running on Windows NT, though, as well as by using the command interpreters' SET commands in a particular way.
Further reading
"pwd". The Open Group Base Specifications Issue 7. IEEE 1003.1:2008. The Open Group. 2016.