What characterizes a file in Linux/Unix?
A file can be one of many types: regular file, directory, symlink, device file, socket, pipe, FIFO, and possibly others that I have missed. For example, a symlink:
$ sudo file /proc/22277/fd/23
/proc/22277/fd/23: broken symbolic link to socket:[7540288]
a socket:
$ sudo ls -l /run/user/1001/systemd/notify
srwxrwxr-x 1 testme testme 0 Feb 6 16:41 /run/user/1001/systemd/notify
-
Is a file characterized as something with an inode (an inode in some filesystem, whether in memory or on a secondary storage device)? Do files of every file type have inodes? (I guess yes to both questions.)
-
"Linux's Internet domain socket, transport protocols (TCP/UDP)'s socket and port" seems to say that something with an open file description is a file. Does something with an open file description necessarily have an inode?
"Open file description" is a much better term than "file"; you can't define "file". Network sockets and Unix domain sockets are all open file descriptions. A Unix domain socket might or might not be associated with something on disk (many conditions can affect this). A network socket is never associated with anything on disk.
Thanks.
Best Answer
TL;DR
File As Abstraction
Let's consult the POSIX 2017 definitions, section 3.164, to see how File is defined:
So a file is anything we can read from, write to, or both, and which also has metadata. Everyone, go home; case closed!
Well, not so fast. Such a definition leaves a whole lot of room for related concepts, and there are obviously differences between regular files and, say, pipes. "Everything is a file" is itself a concept and design pattern rather than a literal statement. Based on that pattern, file types such as directories, pipes, device files, in-memory files, and sockets can all be manipulated in a consistent manner via a set of system calls such as open(), openat(), write(), and, in the case of sockets, recv() and send(). Take USB as an analogy: you have many different devices, but they all connect to exactly the same USB port (never mind that there are actually multiple USB port types, from A to C, but you get the idea).
Of course, for that to work there has to be a consistent interface or reference to the actual data, and that's the File Descriptor:
As such, we can write() to STDOUT via file descriptor 1 in the same fashion as we would write to a regular file /home/user/foobar.txt. When you open() a file, you get a file descriptor, and you can use the same write() function to write to that file. That's the whole point the original Unix creators tried to address: minimalist and consistent behavior. When you do command > /home/user/foobar.txt, the shell will make a copy of the file descriptor that refers to foobar.txt and pass it as command's file descriptor 1 (the command's STDOUT); to be more precise, it will do dup2(3,1) (if the file happened to be opened as descriptor 3) and then execve() the command. But regardless of that, command will still use the same write syscall on file descriptor 1 as if nothing happened.
Of course, when most users think of a file, they think of a regular file on a disk filesystem. This is more consistent with the Regular File definition, section 3.323:
By contrast, we have Sockets:
Regardless of the type, the actions we can take on different file types are conceptually the same: open, read, write, close.
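To make that point concrete, here is a minimal Python sketch (not part of the original answer) that pushes the same bytes through a regular file, a pipe, and a Unix socket pair using only os.write() and os.read() on raw file descriptors. The helper names write_then_read and demo, and the temporary file name, are illustrative assumptions.

```python
import os
import socket
import tempfile


def write_then_read(write_fd, read_fd, payload):
    """Write payload to one descriptor and read it back from another."""
    os.write(write_fd, payload)
    return os.read(read_fd, len(payload))


def demo():
    payload = b"hello\n"
    results = {}

    # Regular file: one descriptor for writing, a second one for reading.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foobar.txt")
        wfd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        rfd = os.open(path, os.O_RDONLY)
        try:
            results["regular file"] = write_then_read(wfd, rfd, payload)
        finally:
            os.close(wfd)
            os.close(rfd)

    # Pipe: os.pipe() returns (read end, write end).
    rfd, wfd = os.pipe()
    try:
        results["pipe"] = write_then_read(wfd, rfd, payload)
    finally:
        os.close(rfd)
        os.close(wfd)

    # Connected socket pair: plain read/write work here too,
    # without calling send()/recv().
    left, right = socket.socketpair()
    try:
        results["socket"] = write_then_read(left.fileno(), right.fileno(), payload)
    finally:
        left.close()
        right.close()

    return results


if __name__ == "__main__":
    for kind, data in demo().items():
        print(kind, data)
```

The helper never checks what kind of object sits behind a descriptor; only the setup code differs between the three cases.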
All Files Have Inodes
What you should have noticed in the file definition is that a file has "certain attributes", which are stored in inodes. In fact, on Linux specifically, we can refer to the first line of the inode(7) manual:
Boom. Clear and direct. We're mostly familiar with inodes as the bridge between blocks of data on disk and filenames stored in directories (because that's what directories are: lists of filenames and corresponding inodes). Even in the kernel's virtual filesystems, such as pipefs and sockfs, we can find inodes. Take for instance this code snippet:
Open File Description
Now that you're thoroughly confused, Linux/Unix introduces something known as an Open File Description, and to keep the explanation simple: it's another abstraction. In the words of Stephane Chazelas,
And it's consistent with the POSIX definition:
Now, if we also look at the book Understanding the Linux Kernel, the author states
Remembering that sockets are also referenced by file descriptors, and that there will therefore be an open file description in the kernel for each open socket, we can conclude that sockets are files all right.
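One way to check this from user space is to call fstat() on a socket or pipe descriptor and look at the reported file type and inode number. The following is a small Python sketch under the assumption of a Linux system; the function names describe_fd and demo are hypothetical, and the /proc/self/fd lookup is Linux-specific (it is skipped where /proc is unavailable).

```python
import os
import socket
import stat


def describe_fd(fd):
    """Return (file type, inode number, /proc link target or None) for fd."""
    st = os.fstat(fd)
    if stat.S_ISSOCK(st.st_mode):
        kind = "socket"
    elif stat.S_ISFIFO(st.st_mode):
        kind = "fifo"
    elif stat.S_ISREG(st.st_mode):
        kind = "regular"
    else:
        kind = "other"

    # On Linux, /proc/self/fd/<n> is a symlink describing the open file,
    # e.g. "socket:[<inode>]" for sockets.
    try:
        link = os.readlink(f"/proc/self/fd/{fd}")
    except OSError:
        link = None  # no /proc here (assumption: non-Linux system)

    return kind, st.st_ino, link


def demo():
    left, right = socket.socketpair()
    read_end, write_end = os.pipe()
    try:
        return {
            "socket": describe_fd(left.fileno()),
            "pipe": describe_fd(read_end),
        }
    finally:
        left.close()
        right.close()
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    for name, (kind, inode, link) in demo().items():
        print(name, kind, inode, link)
```

On Linux, the number inside the socket:[...] link target is the same inode number that fstat() reports, which is also the form shown by the file command in the question.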
To be continued... maybe.