When ls is executed, it parses various options. It also detects whether its output is a tty by calling isatty().
ls.c:
    case LS_LS:
      /* This is for the `ls' program. */
      if (isatty (STDOUT_FILENO))
        {
          format = many_per_line;
          /* See description of qmark_funny_chars, above. */
          qmark_funny_chars = true;
        }
      else
        {
          format = one_per_line;
          qmark_funny_chars = false;
        }
      break;
...
  /* disable -l */
  if (format == long_format)
    format = (isatty (STDOUT_FILENO) ? many_per_line : one_per_line);
etc.
If you want, you can compile a simple test:
isawhat.c
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    if (isatty(STDOUT_FILENO)) {
        fprintf(stdout, "Word by word my world.\n");
    } else {
        fprintf(stdout, "HELP! Stranger handling my words!!\n");
    }
    fprintf(stderr, "Bye bye.\n");
    return 0;
}
Compile with:
gcc -o isawhat isawhat.c
Then run it, e.g.:
$ ./isawhat | sed 's/word/world/'
Because stdout is now a pipe rather than a terminal, the else branch runs.
Width is measured in columns, where one column is one character. It starts out at 80; ls then checks whether the environment variable COLUMNS is set and holds a valid integer that is not larger than SIZE_MAX (which is architecture dependent; your terminal will never be that wide, at least not yet).
Try e.g. echo $COLUMNS. It most probably reflects the number of columns available in the window. As the window gets resized, this value gets updated. It most probably also gets reset by various commands.
One way to set it a bit more firmly is with stty, e.g. stty columns 60. Use stty -a to view all settings (see man stty). A fun piece of software.
If compiled in, ls also queries the number of columns with ioctl() (window size detection). By passing the file descriptor for stdout to ioctl together with the request TIOCGWINSZ, the structure winsize gets filled with the number of columns.
This can also be demonstrated with a simple C program. Compile it, run it, and resize the window; the output should update. Press Ctrl+C to quit.
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <signal.h>

/* Cleared by the signal handler to end the main loop. */
static volatile sig_atomic_t run;

void sig_handler(int sig) {
    switch (sig) {
    case SIGINT:
    case SIGTERM:
        run = 0;
        break;
    }
}

/* Install sig_handler, but keep ignoring signals that were already ignored. */
void sig_trap(int sig) {
    if ((signal(sig, sig_handler)) == SIG_IGN)
        signal(sig, SIG_IGN);
}

int main(void)
{
    struct winsize ws;

    /* SIGSTOP cannot be caught, so only SIGINT and SIGTERM are trapped. */
    sig_trap(SIGINT);
    sig_trap(SIGTERM);
    run = 1;
    while (run) {
        /* Ask the terminal driver for the current window size. */
        if (ioctl(STDOUT_FILENO, TIOCGWINSZ, &ws) != -1) {
            fprintf(stdout, "\r %s: %3d, %s: %d\r",
                    "Columns", ws.ws_col,
                    "Rows", ws.ws_row
            );
            fflush(stdout);
        }
        usleep(5000);
    }
    fprintf(stdout, "\n");
    return 0;
}
Best Answer
The first command you mention, find . -type f -exec wc -l {} +, really says "run wc -l on as many files as possible, until all of them have been processed". This can run wc multiple times!
On the other hand, find . -type f -exec cat {} + | wc -l can run cat several times, but will only run wc once. (In more detail, this is because in this case cat is called by find, which can and does decide to run it however many times it wants, whereas the part after the pipe character, wc -l, is beyond the reach of find, and is therefore run by your shell, just once.)
You say that the first command "yields 394968", but it really does not; on my system its output ends with:
Yet, by adding grep total, one can see that wc was really run twice:
And, indeed, 1590407 plus 258450 is 1848857, which agrees with the second command.
An explanation of why wc was run more than once in the find -exec wc + version of the command is vaguely hinted at by the find man page:
Note how this says "much less than ..." rather than "only once". The documentation for xargs hints that its option --max-chars is set automatically if not set by the user:
This limits how many filenames can be passed to a single call to wc, explaining why, for large numbers of files, several calls to wc will occur, each operating on a partition of the input.