Linux – How does Linux read “real files” and “virtual files”?

filesystems, linux, proc

I have found out that in Linux there are "real files" and "virtual files". Real files reside on the hard disk, while virtual files are just data that the kernel presents as files.

For example, the files in the /proc directory are virtual files.

I want to understand how a function like read() knows how to read a real file and how to read a virtual file. I have created the following diagram to show my understanding of this subject; please correct me if I am wrong:

[Diagram: image not available]

Best Answer

In the VFS layer, all files are virtual (VFS was actually invented by SunOS engineers to tie together the UFS (disk-based) and NFS (network-based) filesystems).
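From user space, this shows up as a single interface for every kind of file. Below is a minimal sketch, assuming a Linux or other POSIX system; `read_head` is a hypothetical helper name introduced here for illustration, not something from the kernel or from the question.

```c
#include <fcntl.h>
#include <unistd.h>

/* Read up to cap-1 bytes from the start of `path` into `buf` and
 * NUL-terminate the result. Returns the number of bytes read,
 * or -1 on error. The code is identical for disk-backed files,
 * /proc entries and device nodes: the caller never says which kind
 * of file it expects. */
ssize_t read_head(const char *path, char *buf, size_t cap)
{
    if (cap == 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    ssize_t n = read(fd, buf, cap - 1);
    close(fd);
    if (n < 0)
        return -1;

    buf[n] = '\0';
    return n;
}
```

On a typical Linux system, calling `read_head("/proc/version", buf, sizeof buf)` and then `read_head` on an ordinary file on disk runs exactly the same application code; which kernel routine actually produces the bytes is decided inside VFS.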

Each open file has a table of functions, f_op, that provides implementations of the common routines (some of them may be generic), and each inode has an attached address_space object that also has a table of C functions (a_ops) containing the necessary implementations. The sequence is this:

  1. sys_read(): The application initiates the read with a system call
  2. The call is passed to the top layer of the VFS stack (vfs_read())
  3. The call is passed to the filesystem driver using file->f_op->read(), or the generic do_sync_read() or new_sync_read()
  4. If the file is opened in direct I/O mode, the appropriate function (a_ops->direct_IO(), which is ext4_direct_IO() for ext4) is called and the data is returned
  5. If the page is found in the page cache (find_get_page()), the data is returned
  6. If the page is not found in the page cache, it is read from the filesystem using a_ops->readpage(), which is implemented by ext4_readpage() in the ext4 driver
  7. The VFS stack creates a block I/O request using submit_bio()
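The dispatch in the steps above can be imitated in plain user-space C. This is only a toy sketch and not kernel code: every `toy_` name is hypothetical, the "page cache" is a single buffer per file, and each read starts at offset 0 for brevity. It shows only the two ideas the list relies on: a per-file table of function pointers, and a cache lookup before going to the backing store.

```c
#include <stddef.h>
#include <string.h>

struct toy_file;

/* Analogue of f_op: the per-file table of operations. */
struct toy_file_ops {
    long (*read)(struct toy_file *f, char *buf, size_t len);
};

struct toy_file {
    const struct toy_file_ops *f_op;
    const char *backing;  /* stands in for data on disk */
    char page[64];        /* one-entry toy page cache */
    int page_cached;      /* nonzero once `page` is filled */
    int disk_reads;       /* how often the "disk" was touched */
};

/* Stand-in for a_ops->readpage(): fill the cache from "disk". */
static void toy_readpage(struct toy_file *f)
{
    strncpy(f->page, f->backing, sizeof f->page - 1);
    f->page[sizeof f->page - 1] = '\0';
    f->page_cached = 1;
    f->disk_reads++;
}

/* Disk-backed read: consult the cache first, read the page on a miss. */
static long toy_disk_read(struct toy_file *f, char *buf, size_t len)
{
    if (!f->page_cached)
        toy_readpage(f);
    size_t n = strlen(f->page);
    if (n > len)
        n = len;
    memcpy(buf, f->page, n);
    return (long)n;
}

/* proc-like read: no backing store, the text is produced on demand. */
static long toy_proc_read(struct toy_file *f, char *buf, size_t len)
{
    static const char text[] = "generated on demand\n";
    (void)f;
    size_t n = sizeof text - 1;
    if (n > len)
        n = len;
    memcpy(buf, text, n);
    return (long)n;
}

const struct toy_file_ops toy_disk_ops = { .read = toy_disk_read };
const struct toy_file_ops toy_proc_ops = { .read = toy_proc_read };

/* Analogue of vfs_read(): it does not know what kind of file it has,
 * it just calls through the table attached to the file. */
long toy_vfs_read(struct toy_file *f, char *buf, size_t len)
{
    if (!f || !f->f_op || !f->f_op->read)
        return -1;
    return f->f_op->read(f, buf, len);
}
```

A `struct toy_file` initialised with `.f_op = &toy_disk_ops` and some `backing` string touches the "disk" only on the first `toy_vfs_read()` call, while one initialised with `&toy_proc_ops` never does. The caller's code is the same in both cases, which is the point of the indirection.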

From http://myaut.github.io/dtrace-stap-book/kernel/fs.html. It is a bit outdated, as the VFS stack was refactored somewhat after I wrote this.
