On an older RHEL system I've got, /bin/cat does not loop for cat x >> x. Instead, cat gives the error message "cat: x: input file is output file". I can fool /bin/cat by doing this: cat < x >> x. When I try your code above, I get the "looping" you describe. I also wrote a system call based "cat":
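#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

int
main(int ac, char **av)
{
    char buf[4096];
    int fd;
    ssize_t cc;

    if (ac < 2)
        return 1;
    /* Opened read-only: this descriptor gets its own offset, starting at 0. */
    fd = open(av[1], O_RDONLY);
    if (fd < 0)
        return 1;
    /* Copy until read() reports end-of-file (0) or an error (-1). */
    while ((cc = read(fd, buf, sizeof(buf))) > 0)
        write(1, buf, cc);
    close(fd);
    return 0;
}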
This loops, too. The only buffering here (unlike for stdio-based "mycat") is what goes on in the kernel.
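I think what's happening is that file descriptor 3 (the result of open(av[1])) has an offset into the file of 0. File descriptor 1 (stdout) has an offset of 3, because the ">>" causes the invoking shell to do an lseek() to the end of the file on the file descriptor before handing it off to the cat child process. (Modern shells typically open the file with O_APPEND instead, which makes every write() land at the current end of the file; the effect on this argument is the same.)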
Doing a read() of any sort, whether into a stdio buffer or a plain char buf[], advances the position of file descriptor 3. Doing a write() advances the position of file descriptor 1. Those two offsets are different numbers. Because of the ">>", file descriptor 1 always has an offset greater than or equal to the offset of file descriptor 3. Each write() also extends the file, so the read() on file descriptor 3 never reaches end-of-file. So any "cat-like" program will loop, unless it does some internal buffering. It's possible, maybe even likely, that a stdio implementation of a FILE * (which is the type of the symbols stdout and f in your code) includes its own buffer. fread() may actually do a read() system call to fill the internal buffer of f. This may or may not change anything in the insides of stdout. Calling fwrite() on stdout may or may not change anything inside of f. So a stdio-based "cat" might not loop. Or it might. Hard to say without reading through a lot of ugly, ugly libc code.
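To make the two-offsets argument concrete, here is a minimal sketch (not taken from any real cat) that opens one file twice, once read-only and once for appending, copies a single buffer, and reports both offsets. The function name offsets_after_copy and its parameters are made up for illustration, and it assumes the output descriptor is opened with O_APPEND, as a typical modern shell does for ">>".

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the same file twice, the way "cat < x >> x"
 * ends up with two descriptors for x, copy one buffer, and report where
 * each offset sits afterwards.  Returns 0 on success, -1 on error.
 */
int
offsets_after_copy(const char *path, off_t *rd_off, off_t *wr_off)
{
    char buf[4096];
    ssize_t cc;
    int in, out, rc = -1;

    assert(rd_off != NULL && wr_off != NULL);
    in = open(path, O_RDONLY);              /* plays the role of fd 3 */
    out = open(path, O_WRONLY | O_APPEND);  /* plays the role of fd 1 */
    if (in >= 0 && out >= 0) {
        cc = read(in, buf, sizeof(buf));            /* moves only "in" */
        if (cc >= 0 && write(out, buf, cc) == cc) { /* moves only "out" */
            *rd_off = lseek(in, 0, SEEK_CUR);
            *wr_off = lseek(out, 0, SEEK_CUR);
            rc = 0;
        }
    }
    if (in >= 0)
        close(in);
    if (out >= 0)
        close(out);
    return rc;
}
```

For a 3-byte file this should leave the read offset at 3 and the write offset at 6: the file has grown to 6 bytes, so the next read() finds 3 more bytes instead of end-of-file, and the same thing happens on every later round.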
I did an strace on the RHEL cat: it just does a succession of read() and write() system calls. But a cat doesn't have to work this way. It would be possible to mmap() the input file, then do write(1, mapped_address, input_file_size). The kernel would do all the work. Or you could do a sendfile() system call between the input and output file descriptors on Linux systems. Old SunOS 4.x systems were rumored to do the memory mapping trick, but I don't know if anyone has ever done a sendfile-based cat. In either case the "looping" wouldn't happen, as both write() and sendfile() require a length-to-transfer parameter.
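The memory mapping idea could look roughly like the sketch below. It is only an illustration under some assumptions: the input is a regular file (mmap() will not work on a pipe or terminal), and the name copy_fixed_length is invented here, not taken from SunOS or any other cat. The point is that the length is sampled once with fstat(), so bytes appended during the copy are never read back.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy in_fd to out_fd using mmap() + write().
 * The input size is fixed up front, so appending to the same file
 * cannot make the copy run forever.
 * Returns the number of bytes written, or -1 on error.
 */
ssize_t
copy_fixed_length(int in_fd, int out_fd)
{
    struct stat st;
    void *p;
    ssize_t total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (fstat(in_fd, &st) < 0)
        return -1;
    if (st.st_size == 0)
        return 0;                   /* mmap() rejects a zero length */
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, in_fd, 0);
    if (p == MAP_FAILED)
        return -1;
    /* write() may be partial, so keep going until the fixed length is done. */
    while (total < st.st_size) {
        ssize_t n = write(out_fd, (char *)p + total,
                          (size_t)(st.st_size - total));
        if (n <= 0) {
            total = -1;
            break;
        }
        total += n;
    }
    munmap(p, (size_t)st.st_size);
    return total;
}
```

Called with a read-only descriptor and an append descriptor for the same 3-byte file, this should write exactly 3 bytes and stop, leaving a 6-byte file.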
Yes, n > 1 is an explicit requirement:
A correctly-formed brace expansion must contain unquoted opening and closing braces, and at least one unquoted comma or a valid sequence expression. Any incorrectly formed brace expansion is left unchanged.
As for the why: historical reasons, to some extent (though it was copied from csh originally, which has the other behaviour). There are commands that take {} as a literal argument (find, parallel, and others with more complex arguments), and also other uses of {} in the shell language. Because brace expansions are only processed when written literally (and not from variables), there's really no motivation to support degenerate expansions, and some reasons not to. For example, in Bash echo {a} prints {a} unchanged, while echo {a,b} prints a b.
The less-than symbol (<) opens the file and attaches it to the standard input device handle of some application/program. But you haven't given the shell any application to attach the input to.
Example
These 2 examples do essentially the same thing but get their input in 2 slightly different manners.
opens file: cat blah.txt
opens STDIN: cat < blah.txt
Peeking behind the curtain
You can use strace to see what's going on.
When we read from a file
When we read from STDIN (identified as 0)
In the first example we can see that cat opened the file blah.txt and read from it. In the second we can see that cat reads the contents of the file blah.txt via the STDIN file descriptor, identified as descriptor number 0.
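What the shell does for the "<" form can be sketched in C as follows. This is a simplified illustration, not how any particular shell is implemented, and the name run_with_stdin is made up for this example. The child opens the file, makes it descriptor 0 with dup2(), and then execs the command, which is why the command itself never has to open the file. The assert() mirrors the point above: there has to be a command to attach the input to.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: roughly what a shell does for "cmd < path".
 * Returns the command's exit status, or -1 if it could not be
 * started or did not exit normally.
 */
int
run_with_stdin(const char *path, char *const argv[])
{
    int status;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);    /* need a command */
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the redirection happens before the command starts. */
        int fd = open(path, O_RDONLY);
        if (fd < 0 || dup2(fd, STDIN_FILENO) < 0)
            _exit(127);
        if (fd != STDIN_FILENO)
            close(fd);
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling it with an argv of { "cat", NULL } and the path blah.txt should behave like cat < blah.txt: cat only ever reads descriptor 0.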