Ubuntu – Why is the current directory in the ls command identified as linked to itself?


In the book "Learning the UNIX operating system", there is a section: "3.1.8 Listing Files", that describes the ls command.

In the paragraph on ls -l it describes the columns of the output of this command.

The second column of the ls -l output contains a single number. The book describes this number as "The number of files and directories linked to this one" (that is, linked to the file or directory named in the last column of the same row).

I tried this command and compared the output with the actual number of files and directories in the current directory.

ls -l
drwxr-xr-x   6 azbc  staff    192 Sep  7 16:09 test

In the directory test, I have 2 subdirectories, 1 file, 1 hidden file, an entry for the current directory, and an entry for the parent directory: 6 files and directories altogether.

 ls -a -F
 ./                .hidden_file.txt  dir_2/
 ../               dir_1/            file_1.sh

It seems logical to me to identify all files and directories (including hidden files and directories) as linked to the current directory.
It also seems logical to identify the parent directory as linked to the current directory.

But why is the current directory identified as linked to itself ?

The ls -la command for the test directory gives the following output (the -F option appends a / to directory names and a * to executables).

 ls -la -F
 total 0
 drwxr-xr-x   6 azbc  staff   192 Sep  7 16:09 ./
 drwxr-xr-x+ ?? azbc  staff    ?? Sep  7 16:06 ../
 -rw-r--r--   1 azbc  staff     0 Sep  7 16:09 .hidden_file.txt
 drwxr-xr-x   2 azbc  staff    64 Sep  7 16:06 dir_1/
 drwxr-xr-x   2 azbc  staff    64 Sep  7 16:06 dir_2/
 -rwx--x--x   1 azbc  staff     0 Sep  7 16:06 file_1.sh*

A file itself is identified with only one link.
Is the file linked to itself ? Or is it linked to the directory it is in ?

In the listing of a directory, the directory itself appears as an entry, so it seems logical to count it as a link.

However in the listing of a file itself there is only the file itself represented in the listing.

 ls -la -F file_1.sh
 -rwx--x--x  1 azbc  staff  0 Sep  7 16:06 file_1.sh

That makes it logical to say that the file is linked to itself.

However it seems more logical to me to say that the file is linked to the directory it is in.

This seems inconsistent.

Or is the link count merely a count of the files and directories present in the listing output of the command, and not an identification of the real links to the file or directory in the file system ?

Edit: as reply to @George Udosen, on:

"Now to try and answer your query in the comment:

'What is here being listed as a link ?
Is the file itself listed ? Or is the directory that contains the file being listed ?'"

If I list the directory test :

 ls -la -F test
 ...
 drwxr-xr-x   2 azbc  staff    64 Sep  7 16:06 dir_1/
 ...
 -rwx--x--x   1 azbc  staff     0 Sep  7 16:06 file_1.sh*

it identifies the directory dir_1 with 2 links !

If I then list that directory test/dir_1

 ls -la -F test/dir_1
 total 0
 drwxr-xr-x  2 azbc  staff   64 Sep  7 16:06 ./
 drwxr-xr-x  9 azbc  staff  288 Sep  7 21:37 ../

Hey, indeed !! it lists 2 entries !

The file file_1.sh* was identified with 1 link.
If I list the file file_1.sh

 ls -la -F test/file_1.sh
 -rwx--x--x  1 azbc  staff  0 Sep  7 16:06 test/file_1.sh*

Ho !! it lists indeed 1 entry !! , namely file_1.sh itself ! and again identifies that file with 1 entry.

By the way from this can I conclude that every entry listed having 1 link
is a file and not a directory ? Ho, this seems not to be the case as symbolic links are also listed as having 1 link / 1 entry.

Best Answer

I recommend you read "What are directories, if everything on Linux is a file?" for more in-depth knowledge of directory structure, history, and terminology, including how directories work and what they are made of (inode, dirent structure, etc.), although it's not required for this question.

What are dot '.' and dot-dot '..' directories ?

Looking at the "format of directories" manual page from the 1971 edition of the UNIX Programmer's Manual, we see that . and .. were already there:

By convention, the first two entries in each directory are for "." and "..". The first is an entry for the directory itself.

As for their significance, an explanation can be found in Panos's answer. Ken Thompson explained how .. came about in a 1989 interview:

Every time we made a directory, by convention we put it in another directory called directory - directory, which was dd. Its name was dd and that all the users directories and in fact most other directories, users maintain their own directory systems, had pointers back to dd, and dd got shortened into "dot-dot," and dd was for directory-directory. It was the place back to where you could to get to all the other directories in the system to maintain this spaghetti bowl

Naturally, as you can guess, . stands for d, short for directory. The . entry shares the same inode number as the directory's actual name. This still doesn't explain why the directory . is linked to itself, but I have a couple of ideas.
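The shared inode number is easy to check. Below is a minimal sketch that assumes a POSIX shell with mktemp and awk available; the directory names demo and sub are made up for illustration. It compares the inode of a directory's name, of its own . entry, and of the .. entry inside a subdirectory.

```shell
# Scratch area; the names demo and sub are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/demo/sub"

# ls -di prints the inode number followed by the name; keep the number only.
ino_name=$(ls -di "$work/demo" | awk '{print $1}')
ino_dot=$(ls -di "$work/demo/." | awk '{print $1}')
ino_dotdot=$(ls -di "$work/demo/sub/.." | awk '{print $1}')

echo "demo        -> inode $ino_name"
echo "demo/.      -> inode $ino_dot"
echo "demo/sub/.. -> inode $ino_dotdot"

rm -rf "$work"
```

All three lines should report the same inode number: they are three names for one directory, which is exactly what the link count is counting.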

0. Unix Philosophy:

In the 1996 book "UNIX Internals: The New Frontiers" by Uresh Vahalia, in Chapter 8, page 222 it is stated:

Unix supports the notion of a current working directory for each process, maintained as part of the process state. This allows users to refer to files by their relative pathnames, which are interpreted relative to the current directory.

Considering that a directory is just a special file, we need a consistent relative filename to refer to the directory itself. That is the special filename ., which evolved from d, short for directory.

1. Technical advantages

The main advantage I can think of is that it simplifies inode lookup, and thus access to metadata, for the system. Since the directory already has a . entry with the same inode, there's no need to query via the full path. The same goes for programming. Consider a very simple implementation of ls. There I use the getcwd() function to obtain the current working directory path, and then pass it to opendir(). Or I could throw away getcwd() and just use opendir(".") directly. In the days of the PDP-11, when memory was measured in kilobytes, saving on syscall overhead would have been crucial.
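The same shortcut can be seen from the shell. This is only a shell analogue of the getcwd()/opendir() comparison, not the C implementation mentioned above, and the file names are made up: one listing asks for the absolute path first, the other lets the kernel resolve . on its own.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
touch a.txt b.txt

# Route 1: obtain the absolute path first, then list it.
via_path=$(ls -a "$(pwd)")

# Route 2: skip the path lookup and name the directory as "." directly.
via_dot=$(ls -a .)

if [ "$via_path" = "$via_dot" ]; then
    echo "both routes list the same entries"
fi

cd / && rm -rf "$work"
```

Both routes reach the same directory; the second one simply never needs to know what the directory is called.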

2. User convenience:

Consider the following example:

mv ../filename.txt .

In the presentation by Hendrik Jan Thomassen, it is mentioned that the original Unix commands were short because the keys on old terminals were hard to press, so typing commands all day long took real physical effort. If you are deep in a directory tree, retyping the full path of the current working directory would be tedious. Of course, mv could have been implemented on the assumption that mv <file> implies the current working directory as the destination. I can only guess as to why mv <original> <new> prevailed, perhaps due to the influence of other programming languages of the day.
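To see the mv example in context, here is a small sketch in a throwaway tree. The nested directories a/b/c are hypothetical; only filename.txt comes from the example above.

```shell
work=$(mktemp -d)
mkdir -p "$work/a/b/c"
touch "$work/a/b/filename.txt"
cd "$work/a/b/c" || exit 1

# Source is named relative to the parent (..), destination is "here" (.).
mv ../filename.txt .

moved=$(ls)          # what the current directory holds now
left_behind=$(ls ..) # what is left in the parent

echo "current directory: $moved"
echo "parent directory:  $left_behind"

cd / && rm -rf "$work"
```

Neither path in the mv command mentions where the tree actually lives, which is the convenience being described.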

3. Improving over MULTICS:

Note: I've never worked on MULTICS myself, so this is based on reading online sources only.

According to the 1986 MULTICS manual on Pathnames:

A relative pathname may begin with one or more less-than characters ("<").

The > character is used on MULTICS as the path separator (like / on Linux). Arguably this may look confusing. Thus, ./ when referencing a command is arguably clearer: we're referencing a filename that is located in the current working directory.

This may be beneficial for other commands too. It's well known how to create a file on Unix/Linux: touch ./file. On MULTICS, according to swenson.org, this is done via the an (or add_name) command:

cd foo
r 18:03 0.041 1

an foo bar
r 18:03 0.077 3

ls foo

Directories = 1.

sma  foo
       bar

r 18:03 0.065 0

On a side note, there's an obvious similarity when it comes to .. : navigating up one directory is done via cwd <<.

4. Referencing executables

If you run scripts on a daily basis, you know the ./script.sh syntax well. The reason for it is simple: the shell looks for executable files in the directories listed in the PATH variable, so when you provide ./ it doesn't have to search anywhere. The magic of the PATH variable is what lets you type echo instead of /bin/echo or other very lengthy paths. Now let's say script.sh is not in your PATH, but it is in your current working directory. What do you do now ? Type /very/long/path/to/the/executable/this/typing/gets/exhausting/on/PDP-11/finally/script.sh ? That would throw away the whole concept of Unix simplicity ! So, going back to the Unix philosophy, ./ also aligns with the principle of elegant, simple design.
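The difference between a bare name and a ./ name can be shown with a throwaway script. This is a sketch under a few assumptions: the script name gr_demo_script.sh is made up, the temporary directory allows execution, and the restricted PATH used in the subshell (/usr/bin:/bin) contains neither . nor an empty element.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical script, named so that it will not collide with anything in PATH.
printf '#!/bin/sh\necho hello from the script\n' > gr_demo_script.sh
chmod +x gr_demo_script.sh

# Bare name: the shell searches the PATH directories only and finds nothing.
if ( PATH=/usr/bin:/bin; gr_demo_script.sh ) 2>/dev/null; then
    bare_status=0
else
    bare_status=$?
fi

# With ./ the name contains a slash, so no PATH search takes place.
dot_out=$(./gr_demo_script.sh)

echo "bare name exit status: $bare_status"
echo "./ output: $dot_out"

cd / && rm -rf "$work"
```

An exit status of 127 is the shell's way of saying "command not found", while the ./ form runs the file sitting in the current directory.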

Of course, some folks want to add . to PATH, but this is actually a very bad practice, so don't do that.

Side note: the one special case where .. and . point to the same inode is the root directory / (inode 2), which makes sense since it is the highest point in the directory tree. Of course, .. being NULL could also work, but it's more elegant to make it point to / itself.
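The root special case can be checked without creating anything. A minimal sketch, assuming only ls and awk; the actual inode number printed depends on the filesystem, so none is claimed here.

```shell
# Inode of the root directory's own entry and of its parent entry.
root_dot=$(ls -di /. | awk '{print $1}')
root_dotdot=$(ls -di /.. | awk '{print $1}')

echo "/.  -> inode $root_dot"
echo "/.. -> inode $root_dotdot"
```

Both lines should show the same number: at the top of the tree, "up" leads back to the same place.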


Note on Link Count and Directory Hardlinks

As Gilles properly pointed out (and as referenced by George Udosen), the link count for a directory starts at 2 (the directory's entry in its parent, plus its own . entry), and each additional link comes from the .. entry of a subdirectory:

# new directory has link count of 2
$ stat --format=%h .
2
# Adding subdirectories increases link count
$ mkdir subdir1
$ stat --format=%h .
3
$ mkdir subdir2
$ stat --format=%h .
4
# Adding files doesn't make difference
$ cp /etc/passwd passwd.copy
$ stat --format=%h .
4
# Count of links for root
$ stat --format=%h /
25
# find lists / itself plus its subdirectories: one less than the link count
$ find / -maxdepth 1 -type d | wc -l
24

Intuitively, it makes sense that the links of a directory are its subdirectories only, since hard links are of the same type as the original file. Except these aren't exactly hard links in the usual sense: a hard link creates a filename that points to the same data. By that definition, a hard link to a directory would contain the same data, i.e. the same listing of files. This could lead to loops in the filesystem, or to lots of orphaned files if all hard links to a directory were removed. For that reason, users are not allowed to create hard links to directories. To use Gilles's phrasing from another question (which I recommend you read): "...[i]n fact, many filesystems do have hard links on directories, but only in a very disciplined way..."; those are the special cases of the . and .. entries.
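For regular files, the link count is simply the number of names that point to the same inode, which is why the file in the question shows 1. The sketch below assumes a Linux system with ln, ls and awk; links is a hypothetical helper, and second_name.sh and dir_1_alias are made-up names.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Hypothetical helper: print the link count (second column of ls -l).
links() { ls -ld "$1" | awk '{print $2}'; }

touch file_1.sh
before=$(links file_1.sh)       # a single name so far

ln file_1.sh second_name.sh     # add a second name for the same inode
after=$(links file_1.sh)

rm second_name.sh               # remove that name again
removed=$(links file_1.sh)

# Asking for a hard link to a directory is refused.
mkdir dir_1
if ln dir_1 dir_1_alias 2>/dev/null; then
    dir_link=allowed
else
    dir_link=refused
fi

echo "$before $after $removed $dir_link"
cd / && rm -rf "$work"
```

On a typical Linux filesystem this prints 1 2 1 refused: the file is not "linked to itself", it just has one name, and each extra name adds one to the count.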

Now the question becomes: what is actually meant by "links" in the context of directories ? TL;DR: the directory structure is a tree, and the link count reflects the number of child directories of each node, plus 2 (so each leaf, i.e. a directory without subdirectories, has only 2 links). In particular, ext3 and ext4 use HTree and xfs uses B+ tree.
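The "2 plus the number of subdirectories" rule can be compared against what a filesystem actually reports. This is a sketch, not a guarantee: it assumes find supports -mindepth and -maxdepth, the directory and file names are borrowed from the question, and some filesystems do not follow the rule (btrfs reports 1 for every directory, and the macOS listing in the question appears to count every entry).

```shell
work=$(mktemp -d)
mkdir "$work/dir_1" "$work/dir_2" "$work/dir_3"
touch "$work/file_1.sh" "$work/.hidden_file.txt"

# Link count as reported by the filesystem (second column of ls -l).
links=$(ls -ld "$work" | awk '{print $2}')

# Immediate subdirectories only; files are ignored.
subdirs=$(find "$work" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')
expected=$((subdirs + 2))

echo "reported link count: $links"
echo "subdirectories:      $subdirs"
echo "2 + subdirectories:  $expected"

rm -rf "$work"
```

On ext4 and similar filesystems the first and last lines should agree; where they differ, the filesystem is using a different convention for directory link counts.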


Conclusion

In the end, the reason why . is linked to itself is simply that it's good design. The original authors of Unix may have been working under the technological constraints of their time, but they were some of the most brilliant minds of the day, or "Wizards" as they're often called, and they did things for a reason.