Why Linux Uses File Extensions for Default Programs


I have a text file named abc.text whose contents are: Hi I'm a text file.

If I double-click this file, it opens in the gedit editor.

Whereas, if I rename the file to abc.html (without changing any of its contents) then by default it opens in Chrome.

This sort of behavior is acceptable on a Windows machine, since Windows uses file extensions to identify file types. But as far as I've read, Linux doesn't need file extensions.

So why does changing a file's extension in Linux change the default program that opens it?

Best Answer

Linux doesn't use file extensions to decide how to open a file, but Linux uses file extensions to decide how to open a file.

The problem here is that “Linux” can designate different parts of the operating system, and “opening a file” can mean different things too.

A difference between Linux and Windows is how they treat application files vs data files. On Windows, the line between the two is blurred; there are a few types of executable files, and they are determined by their extension (.exe, .bat, etc.), but in most contexts you can “execute” any file (e.g. by clicking in Explorer), and this executes the executable that is associated with that file type, where the file type is entirely determined by the extension (so executing a .doc file might start c:\Program Files\something or other\winword.exe, executing a .py file might start a Python interpreter, etc.).

On Linux, there is a notion of an executable file which is independent of the file name. Executables generally have no extension, because they're meant to be typed by the user. The type of the file is irrelevant: all the user wants to do is execute the file. The kernel determines how to execute the file from the file contents: it knows some file types natively, and the shebang mechanism allows a file to declare any other executable file¹ as its interpreter.
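To make the "decided from the file contents" point concrete, here is a minimal Python sketch of that kind of check. It is only an illustration, not what the kernel actually runs: the function names `classify_header` and `classify_file` are made up for this example, and it only looks at two well-known markers, the ELF magic number and a leading `#!`.

```python
ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF binary


def classify_header(header: bytes) -> str:
    """Classify a file from its leading bytes; the file name is never consulted."""
    if header.startswith(ELF_MAGIC):
        return "native ELF binary"
    if header.startswith(b"#!"):
        # The rest of the first line names the interpreter (and optional argument).
        first_line = header[2:].split(b"\n", 1)[0].strip()
        if not first_line:
            return "not recognized"
        interpreter = first_line.split(None, 1)[0].decode(errors="replace")
        return f"script for {interpreter}"
    return "not recognized"


def classify_file(path: str) -> str:
    """Read just the start of the file and classify it (hypothetical helper)."""
    with open(path, "rb") as f:
        return classify_header(f.read(256))


if __name__ == "__main__":
    print(classify_header(b"#!/bin/sh\necho hi\n"))   # script for /bin/sh
    print(classify_header(b"Hi I'm a text file."))     # not recognized
```

Renaming a file never changes what this check returns, which is why executables can do without an extension.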

On the other hand, data files usually do have an extension that indicates the type of data. The general idea here is that the type of data is not synonymous with the application used to open the file. You may want to view a PDF in Okular, or in Evince, or in Xpdf, or in Acroread, or in Mupdf, etc.

However, many tools do allow opening a data file without explicitly specifying which application to use. These tools almost exclusively base their decision on the file extension. The file extension and the file's content are the only information that these tools have at their disposal: Linux does not store any meta information regarding the file format. So when you click on a .pdf file in a file manager (or when you run the .pdf file on a suitably-configured zsh command line, etc.), the file manager consults a database to find the preferred application for .pdf files. This database may be structured in two sections, one that associates extensions with MIME types (/etc/mime.types, ~/.local/share/mime) and one that associates MIME types with applications (/etc/mailcap, ~/.local/share/applications), but even so the starting point is the extension. While it would often be possible to figure out the application from the file content, this would be slower, and not always possible (many formats look just like text files, a .jar is a type of .zip, etc.).
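The two-stage lookup can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any particular file manager is implemented: the two dictionaries stand in for the real database files, their entries (including the application names) are illustrative and borrowed from the question, and `default_app` is a hypothetical helper.

```python
import os.path

# Stage 1: extension -> MIME type (illustrative entries, not a real database).
EXTENSION_TO_MIME = {
    ".text": "text/plain",
    ".txt": "text/plain",
    ".html": "text/html",
    ".pdf": "application/pdf",
}

# Stage 2: MIME type -> preferred application (again, assumed for the example).
MIME_TO_APP = {
    "text/plain": "gedit",
    "text/html": "google-chrome",
    "application/pdf": "evince",
}


def default_app(filename):
    """Pick an application from the file name alone; the contents are never read."""
    extension = os.path.splitext(filename)[1].lower()
    mime_type = EXTENSION_TO_MIME.get(extension)
    if mime_type is None:
        return None  # unknown or missing extension: no default application
    return MIME_TO_APP.get(mime_type)


if __name__ == "__main__":
    print(default_app("abc.text"))  # gedit
    print(default_app("abc.html"))  # google-chrome
```

Because the lookup starts from the name, renaming abc.text to abc.html changes the answer even though the contents are identical, which is the behavior described in the question.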

Linux doesn't need file extensions, and it doesn't use them to determine how to run an executable file, but it does use them to determine which program to use to open a data file.

¹ Traditionally, that file has to be a native executable: a shebang executable can't point to another shebang executable, which avoids potentially unending recursion. Modern Linux kernels relax this and allow a small number of nesting levels.
