Shell – the best way to pipe the output of a command through a pager if (and only if) it is too long

Tags: command-line, less, shell, zsh

I'd like to be able to wrap a command so that, if its output doesn't fit in the terminal, it is automatically piped through a pager.

Right now I'm using the following shell function (in zsh, under Arch Linux):

export LESS="-R"

RET="$($@)"
RET_LINES="$(echo "${RET}" | wc -l)"

if [[ $RET_LINES -ge $LINES ]]; then
  echo "${RET}" | ${PAGER:="less"}
else
  echo "${RET}"
fi

but I'm not really satisfied with it. Is there a better way (in terms of robustness and overhead) to achieve what I want? I'm open to zsh-specific code, too, if it does the job well.


Update: Since I asked this question I have found an answer that provides a somewhat better (if more complicated) solution: it buffers at most $LINES lines and only then pipes the output to less, instead of caching all of it first. Sadly that's not really satisfying either, because neither solution accounts for long, wrapped lines. For example, if the code above is stored in a function called pager_wrap, then

pager_wrap echo {1..10000}

prints one very long line to stdout instead of piping it through a pager: wc -l counts logical lines, so it reports 1 even though the line wraps across hundreds of screen rows.

Best Answer

I’ve got a solution that’s written for POSIX shell compliance, but I’ve tested it only in bash, so I don’t know for sure whether it’s portable.  And I don’t know zsh, so I have made no attempt to make it zsh-friendly.  You pipe your command into it; passing a command as argument(s) to another command is a bad design*.

Of course any solution to this problem needs to know how many rows and columns the terminal has.  In the code below, I’ve assumed that you can rely on the LINES and COLUMNS environment variables (which less looks at).  More reliable methods are:

  • use rows="${LINES:=$(tput lines)}" and cols="${COLUMNS:=$(tput cols)}", as suggested by A.P., or
  • look at the output from stty size.  Note that this command must have the terminal as its standard input, so, if it's in a script, and you're piping into the script, you'll have to say stty size <&1 (which works only as long as standard output is still the terminal) or, more robustly, stty size < /dev/tty.  Capturing its output is even more complicated; a sketch combining these approaches follows this list.
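
Here is a minimal sketch (my addition, not part of the original answer) combining these methods: try stty first, then fall back to the LINES/COLUMNS variables, and finally to tput.

# Determine the terminal size: stty first, then LINES/COLUMNS, then tput.
if size=$(stty size < /dev/tty 2>/dev/null); then
      rows=${size% *}       # stty prints "rows cols" on one line
      cols=${size#* }
else
      rows="${LINES:-$(tput lines)}"
      cols="${COLUMNS:-$(tput cols)}"
fi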

The secret ingredient: the fold command will break long lines the way the screen will, so the script can handle long lines correctly.
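
For example, on an 80-column terminal a single 200-character line occupies three screen rows, and fold | wc -l reports exactly that:

# One 200-character line, folded at 80 columns, counts as 3 screen rows:
printf '%0200d\n' 0 | fold -w 80 | wc -l    # prints 3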

#!/bin/sh
buffer=$(mktemp)
trap 'rm -f "$buffer"' EXIT             # clean up the temp file on any exit…
trap 'exit 1' HUP INT TERM              # …including an exit caused by a signal
rows="$LINES"
cols="$COLUMNS"
while true
do
      IFS= read -r some_data
      e=$?        # 1 if EOF, 0 if normal, successful read.
      printf "%s" "$some_data" >> "$buffer"
      if [ "$e" = 0 ]
      then
            printf "\n" >> "$buffer"
      fi
      # Count the screen rows the buffer occupies, including wrapped lines.
      if [ $(fold -w"$cols" "$buffer" | wc -l) -lt "$rows" ]
      then
            if [ "$e" != 0 ]
            then
                  cat "$buffer"         # EOF and it fits on one screen: just print it.
            else
                  continue              # Still fits; keep reading.
            fi
      else
            if [ "$e" != 0 ]
            then
                  "${PAGER:="less"}" < "$buffer"
                  # The above is equivalent to
                  # cat "$buffer"   | "${PAGER:="less"}"
                  # … but that’s a UUOC.
            else
                  # Send the buffer, then the rest of standard input, to the pager.
                  cat "$buffer" - | "${PAGER:="less"}"
            fi
      fi
      break
done

To use this:

  • Put the above into a file; let’s assume you call it mypager.
  • (Optionally) put it into a directory that’s in your search path; e.g., $HOME/bin.
  • Make it executable by typing chmod +x mypager.
  • Use it in commands like ps ax | mypager or ls -la | mypager.
    If you skipped the second step (putting the script into a directory that’s in your search path), you’ll have to do ps ax | path_to_mypager/mypager, where path_to_mypager can be a relative path like “.”.

* Why is passing a command as argument(s) to another command a bad design?

I. Aesthetics / Conformance to Traditions / Unix Philosophy

Unix has a philosophy of Do One Thing and Do It Well.  For example, if a program is going to display data in a certain way (as pagers do), then it shouldn’t also be invoking the mechanism that produces the data.  That’s what pipes are for.

Not many Unix programs execute user-specified commands or programs.  Let’s look at some that do:

  • The shell, as in sh -c "command"
    Well, running user-specified commands is the shell’s job; it’s the One Thing that the shell does.  (Of course I am not saying that the shell is a simple program.)
  • env, nice, nohup, setsid, su, and sudo.  These programs have something in common — they all exist to run a program with a modified execution environment1.  They have to work the way they do, because Unix generally doesn’t allow you to change the execution environment of another process; you have to change your own process, and then fork and/or exec.
    _______
    1 I’m using the phrase execution environment in the broad sense, referring not only to environment variables, but also process attributes such as “nice” value, UID and GIDs, process group, session ID, controlling terminal, open files, working directory, umask value, ulimits, signal dispositions, alarm timer, etc.
  • Programs that allow a “shell escape”.  The only example that springs to mind is vi/vim, although I’m pretty sure that there are others.  These are historical artifacts.  They predate window systems and even job control; if you were editing a file, and you wanted to do something else (like look at a directory listing), you would have had to save your file and exit from the editor to get back to your shell.  Nowadays you can switch to another window, or use Ctrl+Z (or type :suspend) to get back to your shell while keeping your editor alive, so shell escapes are, arguably, obsolete.

I’m not counting programs that execute other (hard-coded) programs so as to leverage their capabilities rather than duplicate them; for example, some programs may execute diff or sort.  (There are tales that early versions of spell used sort -u to get a list of the words used in a document, and then diff, or perhaps comm, to compare that list to the dictionary word list and identify which words from the document were not in the dictionary.)

II. Timing Issues

The way your script is written, the RET="$($@)" line doesn’t complete until the invoked command completes.  Therefore, your script cannot begin to display data until the command that generates it has completed.  Probably the simplest way to fix that is to make the data-generating command separate from the data-displaying program (although there are other ways).  The difference is easy to see with a deliberately slow command, as sketched below.
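
A sketch (slow_lines is a made-up helper; pager_wrap is the question’s function and mypager the script above):

# A made-up helper that emits 200 lines slowly.  (Note: sleep 0.1 is a
# GNU/BSD extension; POSIX sleep accepts only whole seconds.)
slow_lines() { i=1; while [ "$i" -le 200 ]; do echo "line $i"; i=$((i+1)); sleep 0.1; done; }

# The question's wrapper displays nothing until all 200 lines exist (~20 s):
pager_wrap slow_lines

# Piped into mypager, the pager starts as soon as one screenful has accumulated:
slow_lines | mypager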

III. Command History

  1. Suppose you run some command with output processed by your display filter, and you look at the output, and decide that you want to save that output in a file.  If you had typed (as a hypothetical example)

    ps ax | mypager
    

    you can then type

    !:0-1 > myfile
    

    or press the up-arrow key and edit the line appropriately.  Now, if you had typed

    mypager "ps ax"
    

    you can still go back and edit that command into ps ax > myfile, but it’s not so straightforward.

  2. Or suppose you decide that you want to run ps uax next.  If you had typed ps ax | mypager, you could do

    !:0 u!:*
    

    Again, with mypager "ps ax", it’s still doable, but, arguably, harder.

  3. Also, look at the two commands: ps ax | mypager and mypager "ps ax".  Suppose you run a history listing an hour later.  It seems to me that you’d have to look at mypager "ps ax" a little bit harder to see what command is actually being executed.

IV. Complex Commands / Quoting

  1. echo {1..10000} is obviously just an example command; ps ax isn’t much better.  What if you want to do something just a little bit more realistic, like ps ax | grep oracle?  If you type

    mypager ps ax | grep oracle
    

    it will run mypager ps ax and pipe the output from that through grep oracle.  So, if the output from ps ax is 30 lines long, mypager will invoke less, even if the output from ps ax | grep oracle is only 3 lines.  There are probably examples that will fail in a more dramatic fashion.

    So you have to do what I was showing earlier:

    mypager "ps ax | grep oracle"
    

    But RET="$($@)" can’t handle that.  There are, of course, ways to handle things like that (eval, for instance; see the sketch at the end of this answer), but they are discouraged.

  2. What if the command line whose output you want to capture is even more complicated; e.g.,

    command1  "arg1"   |   command2  'arg2'  $'arg3'

    where the arguments contain messy combinations of space, tab, $, |, \, <, >, *, ;, &, [, ], (, ), `, and maybe even ' and ".  A command like that can be hard enough to type directly into the shell correctly.  Now imagine the nightmare of having to quote it to pass it as an argument to mypager.
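
To make that last point concrete, here is a sketch (a hypothetical variant I am adding for illustration, not code from the question) of a wrapper that accepts a whole pipeline as one string via eval, together with the double layer of quoting it forces on you:

# Hypothetical string-accepting variant of the question's wrapper.
# eval re-parses its argument, so pipes and redirections "work"; this is
# exactly the discouraged approach mentioned above.
pager_wrap_str() {
      RET="$(eval "$1")"
      RET_LINES="$(echo "${RET}" | wc -l)"
      if [ "$RET_LINES" -ge "$LINES" ]; then
            echo "${RET}" | "${PAGER:-less}"
      else
            echo "${RET}"
      fi
}

pager_wrap_str 'ps ax | grep oracle'              # simple case: works
pager_wrap_str 'grep "don'\''t panic" notes.txt'  # every quote must be escaped a second time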