There is no newline, so wc -l is correct. Instead, you want to count the number of line starts. One way to do it:
$ diff -y --suppress-common-lines a b | grep '^' | wc -l
1
The first command you mention, find . -type f -exec wc -l {} +, really says "run wc -l on as many files as possible, until all of them have been processed". This can run wc multiple times!
On the other hand, find . -type f -exec cat {} + | wc -l can run cat several times, but will only run wc once. (In more detail, this is because in this case cat is called by find, which can and does decide to run it however many times it wants, whereas the part after the pipe character, wc -l, is beyond the reach of find, and is therefore run by your shell, just once.)
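To see the batching directly, here is a small sketch that replaces wc with a tiny sh -c script printing how many file names each invocation received. The helper name count_batches is made up for illustration.

```shell
# Print one line per command invocation that find makes, showing how
# many file names were handed to that invocation.
count_batches() {
    dir=${1:-.}
    # "sh" after the script becomes $0, so $# counts only the file names.
    find "$dir" -type f -exec sh -c 'echo "batch of $# files"' sh {} +
}
```

On a small tree this should print a single line. On a tree large enough to exceed the command-line limit it should print several, one per invocation.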
You say that the first command "yields 394968", but it really does
not; on my system its output ends with:
(Many more lines elided...)
23 ./po/Makefile.win
64 ./po/README
1 ./VERSION-NICK
97 ./README
258450 total
Yet, by adding grep total, one can see that wc was really run twice:
$ find . -type f -exec wc -l {} + | grep total
1590407 total
258450 total
And, indeed, 1590407 plus 258450 is 1848857, which agrees with the second command.
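If you want the -exec wc form to give one grand total anyway, one option is to add up the per-file counts yourself and ignore the per-batch "total" lines. This is only a sketch: sum_line_counts is a hypothetical helper, and it assumes no file name contains a newline.

```shell
# Sum the per-file counts printed by wc -l, skipping the "total" line
# that each separate wc invocation appends when given several files.
sum_line_counts() {
    find "${1:-.}" -type f -exec wc -l {} + |
        awk '$2 != "total" { sum += $1 } END { print sum + 0 }'
}
```

Summing the per-file lines rather than the "total" lines also covers a batch that happens to contain a single file, for which wc prints no total line at all.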
An explanation of why wc was run more than once in the find -exec wc + version of the command is vaguely hinted at by the find man page:
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end;
the total number of invocations of the command
will be much less than the number of
matched files. The command line is built in much the same way
that xargs builds its command lines.
Note how this says "much less than ..." rather than "only once". The documentation for xargs hints that its option --max-chars is set automatically if not set by the user:
--max-chars=max-chars
-s max-chars
Use at most max-chars characters per command line, including the
command and initial-arguments and the terminating nulls at the
ends of the argument strings.
The largest allowed value is system-dependent,
and is calculated as the argument length limit
for exec, less the size of your environment, less 2048 bytes of
headroom. If this value is more than 128KiB, 128Kib is used as
the default value; otherwise, the default value is the maximum.
This limits how many filenames can be passed to a single call to wc, explaining why, for large numbers of files, several calls to wc will occur, each operating on a partition of the input.
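The system-wide argument length limit mentioned in the man page can be queried with getconf ARG_MAX. The sketch below turns it into a very rough upper bound on how many names fit in one command line. The helper name and the default average length of 50 bytes are assumptions for illustration, and the estimate ignores the environment size and the headroom described above, so real batches will be smaller.

```shell
# Rough upper bound on file names per command line: the exec argument
# limit divided by an assumed average name length plus its terminating
# null byte. Environment size and headroom are deliberately ignored.
max_names_estimate() {
    avg_len=${1:-50}
    limit=$(getconf ARG_MAX)
    echo $(( limit / (avg_len + 1) ))
}
```

For example, max_names_estimate 30 gives the estimate for names averaging 30 bytes.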
Best Answer
When ls is executed it parses various options. It also detects whether its output is a tty by calling isatty(). From ls.c:
code
...
code
etc.
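The same kind of check can be observed from the shell, where test -t 1 succeeds only if file descriptor 1 (stdout) is a terminal. This is a shell analogue of the isatty() call, not code from ls, and stdout_kind is a made-up name.

```shell
# Report whether standard output is attached to a terminal.
stdout_kind() {
    if [ -t 1 ]; then
        echo "stdout is a tty"
    else
        echo "stdout is not a tty"
    fi
}
```

Run directly in a terminal, stdout_kind should report a tty. Run as stdout_kind | cat, stdout is a pipe, so it reports that it is not.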
If you want you can compile a simple test:
isawhat.c
Compile by:
Then e.g.:
Width is measured in columns. One column is one character. It starts out at 80, then ls checks whether the environment variable COLUMNS is set and holds a valid integer that is not larger than SIZE_MAX (which is architecture dependent; your terminal will never be that wide, at least not yet).
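The fallback logic just described can be sketched in shell. This mirrors only the two steps in the paragraph above (default of 80, overridden by a valid COLUMNS). The digits-only parsing and the greater-than-zero check are assumptions of this sketch, not a transcription of ls.c, and line_width is a hypothetical name.

```shell
# Pick a line width: start from 80 and let COLUMNS override it when it
# holds a plain positive integer (assumption: digits only, > 0).
line_width() {
    width=80                      # built-in default
    case ${COLUMNS:-} in
        '' | *[!0-9]*)            # unset, empty, or not all digits: keep 80
            ;;
        *)
            if [ "$COLUMNS" -gt 0 ]; then
                width=$COLUMNS
            fi
            ;;
    esac
    echo "$width"
}
```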
Try e.g. echo $COLUMNS. It most probably reflects the number of columns available in the window. As the window gets resized, this gets updated. It most probably also gets reset by various commands.
One way to set it a bit harder is with stty, e.g. stty columns 60. Use stty -a to view all settings (man stty). A fun piece of software.
If compiled in, ls also queries for the number of columns with ioctl() (window size detection). By passing the file descriptor for stdout to ioctl together with the request TIOCGWINSZ, the structure winsize gets filled in with the number of columns.
This can also be demonstrated by a simple piece of C code:
Compile, run and resize window. Should update. Ctrl+C to quit.
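A rough shell counterpart, for comparison: stty size prints the rows and columns of the terminal on its standard input, and is supported by the GNU and BSD stty implementations, though it is not required by POSIX. The helper term_columns is a made-up name, and the WINCH trap in the comment assumes a shell such as bash that knows that signal name.

```shell
# Print the column count of the terminal on standard input, or fail
# quietly (non-zero status, no output) when stdin is not a terminal.
term_columns() {
    size=$(stty size 2>/dev/null) || return 1
    echo "${size#* }"             # "rows columns" -> keep the columns part
}

# To watch resizes interactively (Ctrl+C to quit), something like:
#   trap 'term_columns' WINCH
#   while :; do sleep 1; done
```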