TL;DR
- A file is an object on which you can perform some or all of the basic operations (open, read, write, close) and whose metadata is stored in an inode.
- File descriptors are references to those objects.
- An open file description (yes, the "open" part is important) records how a file, referred to by at least one file descriptor, has been opened.
File As Abstraction
Let's consult the POSIX.1-2017 definitions, section 3.164, to see how File is defined:
An object that can be written to, or read from, or both. A file has certain attributes, including access permissions and type. File types include regular file, character special file, block special file, FIFO special file, symbolic link, socket, and directory. Other types of files may be supported by the implementation.
So a file is anything we can read from, write to, or both, which also has metadata. Everyone, go home, case closed!
Well, not so fast. Such a definition leaves a whole lot of room for related concepts, and there are obviously differences between regular files and, say, pipes. "Everything is a file" is itself a concept and design pattern rather than a literal statement. Based on that pattern, file types such as directories, pipes, device files, in-memory files, and sockets can all be manipulated in a consistent manner via a common set of system calls such as open(), openat(), write(), and, in the case of sockets, recv() and send(). Take USB as an analogy: you have many different devices, but they all connect to the same USB port (never mind that there are actually multiple connector types from A to C, but you get the idea).
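As a rough illustration of that consistency, here is a minimal sketch in Python, whose os module is a thin wrapper over the system calls. The helper names roundtrip and demo are made up for this example. It pushes the same bytes through a pipe, a Unix socket pair, and a regular file using nothing but write and read on file descriptors.

```python
import os
import socket
import tempfile


def roundtrip(write_fd, read_fd, payload):
    """Write payload to one descriptor and read it back from another."""
    os.write(write_fd, payload)
    return os.read(read_fd, len(payload))


def demo():
    results = {}

    # Pipe: one descriptor per end.
    r, w = os.pipe()
    results["pipe"] = roundtrip(w, r, b"hello")
    os.close(r)
    os.close(w)

    # Unix socket pair: plain read/write works, no send()/recv() needed.
    a, b = socket.socketpair()
    results["socket"] = roundtrip(a.fileno(), b.fileno(), b"hello")
    a.close()
    b.close()

    # Regular file: opened a second time so the reader starts at offset 0.
    writer, path = tempfile.mkstemp()
    reader = os.open(path, os.O_RDONLY)
    results["regular"] = roundtrip(writer, reader, b"hello")
    os.close(reader)
    os.close(writer)
    os.unlink(path)

    return results


if __name__ == "__main__":
    print(demo())
```

The roundtrip helper never learns what kind of file it is talking to, which is the whole point of the pattern.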
Of course, for that to work there has to be a consistent way of referring to the actual data, and that is the File Descriptor:
A per-process unique, non-negative integer used to identify an open file for the purpose of file access. The value of a newly-created file descriptor is from zero to {OPEN_MAX}-1.
As such, we can write() to STDOUT via file descriptor 1 in the same fashion as we would write to a regular file /home/user/foobar.txt. When you open() a file, you get a file descriptor, and you can use the same write() function to write to that file. That's the whole point the original Unix creators tried to address: minimalist and consistent behavior. When you run command > /home/user/foobar.txt, the shell opens foobar.txt, makes a copy of the resulting file descriptor, and installs it as command's file descriptor 1 (its STDOUT). To be more precise, it does something like dup2(3,1) (assuming open() returned descriptor 3) and then execve() to run the command. Regardless of that, command still uses the same write() syscall on file descriptor 1 as if nothing had happened.
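The redirection steps can be sketched roughly as follows. This is an approximation of what a shell does, not any particular shell's implementation; run_redirected and demo are hypothetical names, and the child command is a small Python one-liner chosen only so the example is self-contained.

```python
import os
import sys
import tempfile


def run_redirected(argv, path):
    """Run argv with stdout sent to path, roughly what `argv > path` does."""
    pid = os.fork()
    if pid == 0:
        # Child process: rewire descriptor 1, then become the command.
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            if fd != 1:
                os.dup2(fd, 1)  # descriptor 1 now refers to the file
                os.close(fd)    # the extra descriptor is no longer needed
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # reached only if open/dup2/exec failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        # The child just prints to its stdout; it never sees the file name.
        code = run_redirected([sys.executable, "-c", "print('hi')"], path)
        with open(path, "rb") as f:
            return code, f.read()
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(demo())
```

The child's output ends up in the file even though the child only ever writes to descriptor 1.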
Of course, when most users think of a file, they think of a regular file on a disk filesystem. This is more consistent with the Regular File definition, section 3.323:
A file that is a randomly accessible sequence of bytes, with no further structure imposed by the system.
By contrast, we have Sockets:
A file of a particular type that is used as a communications endpoint for process-to-process communication as described in the System Interfaces volume of POSIX.1-2017.
Regardless of the type, the actions we can take on different file types are conceptually the same: open, read, write, close.
All Files Have Inodes
What you should have noticed in the file definition is that a file has "certain attributes", which are stored in inodes. In fact, on Linux specifically, we can refer to the first line of the inode(7) manual:
Each file has an inode containing metadata about the file. An application can retrieve this metadata using stat(2) (or related calls)
Boom. Clear and direct. We're mostly familiar with inodes as the bridge between blocks of data on disk and filenames stored in directories (because that's what directories are: lists of filenames and corresponding inodes). Even in the kernel's virtual filesystems, such as pipefs and sockfs, we can find inodes. Take for instance this code snippet:
/* Builds the "pipe:[<inode number>]" name reported for a pipe. */
static char *pipefs_dname(struct dentry *dentry, char *buffer, int buflen)
{
	return dynamic_dname(dentry, buffer, buflen, "pipe:[%lu]",
				dentry->d_inode->i_ino);
}
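The same inode numbers are visible from user space. The sketch below assumes a Unix-like system; describe and demo are names invented for this example. It calls fstat() on a pipe and on a socket and reports the file type and inode number for each.

```python
import os
import socket
import stat


def describe(fd):
    """Return (file type, inode number) for an open descriptor via fstat()."""
    st = os.fstat(fd)
    if stat.S_ISFIFO(st.st_mode):
        kind = "fifo"
    elif stat.S_ISSOCK(st.st_mode):
        kind = "socket"
    elif stat.S_ISREG(st.st_mode):
        kind = "regular"
    else:
        kind = "other"
    return kind, st.st_ino


def demo():
    r, w = os.pipe()
    a, b = socket.socketpair()
    try:
        return {"pipe": describe(r), "socket": describe(a.fileno())}
    finally:
        os.close(r)
        os.close(w)
        a.close()
        b.close()


if __name__ == "__main__":
    print(demo())
```

On Linux, the inode number reported for the pipe should match the number shown in the pipe:[...] link target under /proc/self/fd for the same descriptor.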
Open File Description
Now that you're thoroughly confused, Linux/Unix introduces something known as the Open File Description, and to keep the explanation simple: it's another abstraction. In the words of Stephane Chazelas,
It's more about the record of how the file was opened more than the file itself.
And it's consistent with POSIX definition:
A record of how a process or group of processes is accessing a file. Each file descriptor refers to exactly one open file description, but an open file description can be referred to by more than one file descriptor. The file offset, file status, and file access modes are attributes of an open file description.
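A small experiment makes the distinction concrete. This is a sketch with a made-up demo function: a descriptor created with dup() refers to the same open file description and therefore shares its offset, while a second open() of the same path creates a new open file description with an offset of its own.

```python
import os
import tempfile


def demo():
    fd, path = tempfile.mkstemp()
    os.write(fd, b"abcdef")
    os.lseek(fd, 0, os.SEEK_SET)

    dup_fd = os.dup(fd)                    # same open file description
    other_fd = os.open(path, os.O_RDONLY)  # a new open file description

    os.read(fd, 2)  # moves the offset stored in fd's open file description

    offsets = {
        "dup": os.lseek(dup_fd, 0, os.SEEK_CUR),         # follows fd
        "reopened": os.lseek(other_fd, 0, os.SEEK_CUR),  # unaffected
    }

    for d in (fd, dup_fd, other_fd):
        os.close(d)
    os.unlink(path)
    return offsets


if __name__ == "__main__":
    print(demo())  # {'dup': 2, 'reopened': 0}
```

Reading through one descriptor moved the offset seen through its duplicate, but not the offset of the independently opened descriptor, even though all three refer to the same file.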
Now if we also look at the book Understanding the Linux Kernel, the authors state:
Linux implements BSD sockets as files that belong to the sockfs special filesystem... More precisely, for every new BSD socket, the kernel creates a new inode in the sockfs special filesystem.
Remembering that sockets are also referenced by file descriptors, and that the kernel therefore keeps an open file description for each of them, we can conclude that sockets are files all right.
to be continued . . .maybe
Best Answer
If you could call mknod arbitrarily, then you could create device files owned and accessible by you for any device. The device files give you unlimited access to the corresponding devices; therefore, any user could access devices arbitrarily.
For instance, suppose /dev/sda1 holds a file system to which you have no access (say, it is mounted to /secret). Over here, /dev/sda1 is block special 8,1, so if you could call mknod, e.g. mknod ~/my_sda1 b 8 1, then you could access anything on /dev/sda1 through your own device file for /dev/sda1, regardless of any filesystem restrictions on /dev/sda1. (You get the device as a flat file without any structure, so you would need to know what to do with it, but there are libraries for accessing block device files.)
Likewise, if you could create your own copy of /dev/mem or /dev/kmem, then you could examine anything in main memory; if you could create your own copy of /dev/tty* or /dev/pts/*, then you could record any keyboard input, and so on.
Therefore, mknod in the hands of ordinary users is harmful and thus its use must be restricted.
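The type letter and the two numbers given to mknod are what identify the device, not the file's name or location. As a rough sketch (device_id is a hypothetical helper, and the exact numbers are system-specific), the same triple can be read back from any existing device file with stat():

```python
import os
import stat


def device_id(path):
    """Return (kind, major, minor) for a device file, or None otherwise."""
    st = os.stat(path)
    if stat.S_ISBLK(st.st_mode):
        kind = "b"  # block special
    elif stat.S_ISCHR(st.st_mode):
        kind = "c"  # character special
    else:
        return None
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    print(device_id("/dev/null"))
    print(device_id("/"))
```

On Linux, /dev/null typically reports as character special 1,3. Any other device file created with the same type and numbers would reach the same driver, which is why creating such files is restricted.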
N.B. This is why the nodev mount option is crucial for removable media, for otherwise you could bring in your own device files on a prepared medium.