Cat Command Termination – Difference Between Using Ctrl+D and Ctrl+C to Terminate Cat Command

I have the following two test files:

test1 test2

Both of them are blank. Now I issue the following commands:

$ cat > test1

Enter

This is a test file

Enter

Ctrl + D

$ cat > test2

Enter

This is another test file

Enter

^C

Ctrl + C

Now I check the contents of the two files

$ cat test1
This is a test file
$ cat test2
This is another test file
$

So is there any real difference between these two ways of terminating cat, given that both appear to produce the same outcome?

Best Answer

When the cat command is running, the terminal is in canonical input mode. This means, in short, that the terminal's line discipline is handling line editing, and is responding to all of the special characters configured for the terminal (viewable and settable with the stty command).
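As a rough illustration of the stty remark above, the following sketch lists which keys are currently configured as the INTR, QUIT and EOF special characters. The function name show_special_chars is made up for this example, and the parsing assumes the `name = value;` layout that common stty implementations print for `stty -a`.

```shell
# Hypothetical helper (not from the original answer): list the terminal's
# INTR, QUIT and EOF special characters as reported by stty.
show_special_chars() {
  if [ -t 0 ]; then
    # stty -a separates settings with ';', so put each one on its own line
    # and keep only the three special characters discussed here.
    stty -a | tr ';' '\n' | grep -E '^[[:space:]]*(intr|quit|eof) ='
  else
    # stty needs a terminal on standard input to have anything to report.
    echo 'standard input is not a terminal'
  fi
}

show_special_chars
```

On a terminal with default settings this typically reports ^C, ^\ and ^D. Running something like `stty eof '^E'` would move the EOF action to Ctrl+E for that terminal, which is why the key and the special character are worth keeping apart.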

The cat command is simply read()ing from its standard input until a read() call returns zero bytes read, the POSIX convention for hitting end of file.

Terminals do not really have an "end". But there is a circumstance where read() of a terminal device returns zero bytes. When the line discipline receives the "EOF" special character, whatever that happens to be configured as at the time, it causes read() to return with whatever is in the editing buffer at that point. If the editing buffer was empty, that returns zero bytes read from read(), causing cat to exit.

cat also exits in response to signals whose default actions are to terminate the process. The line discipline also generates signals in response to special characters. The "INTR" and "QUIT" special characters cause the INT and QUIT signals to be sent to the foreground process (group), which will be/contain the cat process. The default action of these signals is to terminate the cat process.

Which leads to the observable differences:

Ctrl+D only has this action when it is the "EOF" special character. This is usually the case, but it is not necessarily the case. Similarly, Ctrl+C only has its action when it is the "INTR" special character.
Ctrl+D will not cause cat to terminate when the line is not in fact empty at the time. An interrupt generated by Ctrl+C will, though.
A naïve implementation of cat in the C language will block buffer standard output if it finds it directed at a file, as in the question. In theory, this could lead to buffered and not yet output lines being lost if cat is terminated by SIGINT.
In practice, the BSD and GNU C libraries implement a buffering mode that is not described in the C or C++ language standards. Standard output when redirected to file or pipe is smart buffered. It is block buffered; except that whenever the C library finds itself about to read() the beginning of a new line from any file descriptor that is open to a terminal device, it flushes standard output. (The BSD and GNU C libraries do not quite implement the same semantics and do more than this, strictly speaking, but this behaviour is a common subset.) Thus an interrupt signal will not cause lost buffered output when cat is built on top of such a C library.
Of course, if cat is part of a command pipeline, some other process could be buffering the data, downstream of cat before those data reach an output file. So again when the line discipline sends SIGINT, which (by default) terminates all of the processes in the pipeline, input data buffered and not yet written will be lost; whereas terminating cat normally with the "EOF" special character will cause the pipeline to terminate normally, with all of the data passing to the downstream process before it receives an EOF indication from its read() of its standard input.

Note that this bears very little relationship to what happens when your interactive shell is reading a line of input from you. When your shell is waiting for input, the terminal is in non-canonical input mode, in which mode the line discipline does not do any special handling of special characters. How your shell treats Ctrl+D and Ctrl+C is entirely up to the input editing library that your shell uses (libedit, readline, or ZLE) and how that editing library has been configured (with key bindings and suchlike).

Produced after using cat on an image

It may be useful to explain how files work at the lowest level: a file is just a sequence of bytes, that is, numbers. It is completely up to programs and users to decide what the numbers "mean." If we want to store text, then it's probably a good idea to use the numbers as a code, where each number is assigned to a character. That's what ASCII and Unicode do. If we want to display text, then it's probably a good idea to build a device or write a program that can take these numbers and display a bitmap of the glyph for the corresponding ASCII/Unicode character. That's what terminals and terminal emulators do.

Of course, for graphics, we probably want the numbers to represent pixels and their colors. Then we'll need a program that goes through the file, reads all the bytes, and renders the picture accordingly. A terminal emulator expects the bytes to be ASCII/Unicode codes, so it will treat the same chunk of bytes (or file) very differently.

Shell – Cat all files in a folder including filename by using a for loop

for f in *; do
  printf '%s\n' "$f"
  paste /dev/null - < "$f"
done

For each file in the directory, this prints the file name followed by the file's content, with each line preceded by a TAB character.
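The loop above assumes every entry in the directory is a readable regular file. A slightly more defensive sketch, wrapped in a function whose name (cat_with_names) and optional directory argument are invented for this example, could look like this:

```shell
# Hypothetical helper: print each regular file's name, then its content
# with every line preceded by one TAB character.
cat_with_names() {
  dir=${1:-.}                      # default to the current directory
  for f in "$dir"/*; do
    [ -f "$f" ] || continue        # skip directories and an unmatched glob
    printf '%s\n' "${f##*/}"       # file name without the directory part
    paste /dev/null - < "$f"       # empty first field, TAB, then the line
  done
}
```

Calling `cat_with_names` with no argument lists the current directory; passing a path such as `cat_with_names /some/dir` lists that directory instead. Hidden files are still skipped, because `*` does not match names starting with a dot.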

Same with GNU awk:

gawk 'BEGINFILE{print FILENAME};{print "\t" $0}' ./*

Or to avoid printing the name of empty files (this one not GNU specific):

awk 'FNR==1 {print FILENAME}; {print "\t" $0}' ./*
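To see the effect on empty files, here is a small sketch that runs the portable awk variant in a scratch directory. The directory and file names are invented for the example, and it assumes a `mktemp` that supports `-d`.

```shell
# Scratch directory with one empty and one non-empty file (names are made up).
demo_dir=$(mktemp -d)
: > "$demo_dir/empty.txt"
printf 'first\nsecond\n' > "$demo_dir/notes.txt"

# FNR==1 only fires when a file yields a first record, so empty.txt,
# which has no records at all, never gets its name printed.
listing=$(cd "$demo_dir" && awk 'FNR==1 {print FILENAME}; {print "\t" $0}' ./*)
printf '%s\n' "$listing"

rm -r "$demo_dir"
```

Only ./notes.txt and its two TAB-indented lines are printed; the BEGINFILE version would also have printed ./empty.txt.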

Or with GNU sed:

sed -s '1F;s/^/\t/' ./*
