How to Find Line with Least Characters in Shell

Tags: shell, text-processing, wc

I am writing a shell script, using any general UNIX commands. I have to retrieve the line that has the least characters (whitespace included). There can be up to around 20 lines.

I know I can use head -$L | tail -1 | wc -m to find the character count of line L. The problem is that the only approach I can think of with this is to write out a mess of if statements that compare the values by hand.

Example data:

seven/7
4for
8 eight?
five!

This should return 4for, since that line has the fewest characters.

In my case, if multiple lines have the shortest length, a single one should be returned. It does not matter which one is selected, as long as it is of the minimum length. But I don't see the harm in showing both ways for other users with other situations.

Best Answer

A Perl way. Note that if several lines share the shortest length, this approach prints only one of them:

perl -lne '$m//=$_; $m=$_ if length()<length($m); END{print $m if $.}' file 

Explanation

  • perl -lne : -n means "read the input file line by line", -l causes trailing newlines to be removed from each input line and a newline to be added to each print call; and -e is the script that will be applied to each line.
  • $m//=$_ : set $m to the current line ($_) unless $m is defined. The //= operator is available since Perl 5.10.0.
  • $m=$_ if length()<length($m) : if the current line is shorter than the current value of $m, save the current line ($_) as $m. Called without an argument, length() operates on $_. Because the comparison is strict, the first of several equally short lines is kept.
  • END{print $m if $.} : once all lines have been processed, print the current value of $m, the shortest line. The if $. ensures that this only happens when the line number ($.) is non-zero, that is, when at least one line was read, which avoids printing an empty line for empty input.
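
For readers who prefer to stay in the shell, the same single-pass idea can be sketched without Perl. This is a minimal sketch, not part of the original answer: the function name shortest_line is made up for illustration, and it assumes a POSIX sh where ${#line} gives the length of the string (some shells count bytes rather than characters for multibyte text).

```shell
#!/bin/sh
# shortest_line FILE  (hypothetical helper name)
# Print the first line of FILE that has the minimum length.
shortest_line() {
    min_line=
    seen=0
    # IFS= and -r keep leading/trailing whitespace and backslashes intact;
    # the || [ -n "$line" ] part also handles a last line with no newline.
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "$seen" -eq 0 ] || [ "${#line}" -lt "${#min_line}" ]; then
            min_line=$line
            seen=1
        fi
    done < "$1"
    # Print nothing at all for empty input.
    if [ "$seen" -eq 1 ]; then
        printf '%s\n' "$min_line"
    fi
}
```

Calling shortest_line file on the example data from the question prints 4for. A read loop like this is slow on large inputs, but for around 20 lines that should not matter.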

Alternatively, since your file is small enough to fit in memory, you can do:

perl -e '@K=sort{length($a) <=> length($b)}<>; print "$K[0]"' file 

Explanation

  • @K=sort{length($a) <=> length($b)}<> : in list context, <> returns a list whose elements are the lines of the file. The sort orders them by length, and the sorted lines are saved as array @K.
  • print "$K[0]" : print the first element of array @K: the shortest line.
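
The sort-by-length idea can also be approximated with standard tools by prefixing each line with its length, sorting numerically, and stripping the prefix again. This is a sketch under assumptions: the function name shortest_by_sort is hypothetical, and when several lines tie for the shortest length, which one comes first depends on how sort breaks ties.

```shell
#!/bin/sh
# shortest_by_sort FILE  (hypothetical helper name)
# Decorate each line with its length, sort by that number,
# keep the first result, then remove the length column again.
shortest_by_sort() {
    awk '{ print length($0) "\t" $0 }' "$1" |
        sort -n |
        head -n 1 |
        cut -f2-
}
```

Because cut -f2- keeps everything after the first tab, lines that themselves contain tabs come through unchanged.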

If you want to print all of the shortest lines, you can use:

perl -e '@K=sort{length($a) <=> length($b)}<>; 
         print grep {length($_)==length($K[0])}@K; ' file 
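
Printing every shortest line can likewise be sketched in a single awk pass that collects ties as it goes. The function name all_shortest is invented for this example, and the sketch assumes a POSIX awk; it reads the file once, so it would also work on piped input if the file argument were dropped.

```shell
#!/bin/sh
# all_shortest FILE  (hypothetical helper name)
# Print every line of FILE whose length equals the minimum, in input order.
all_shortest() {
    awk '{ n = length($0) }
         # First line, or a new strict minimum: restart the collection.
         NR == 1 || n < min { min = n; out = $0; next }
         # Same length as the current minimum: append to the collection.
         n == min { out = out "\n" $0 }
         END { if (NR) print out }' "$1"
}
```

On the example data this prints only 4for; on a file containing ab, xyz and cd it prints ab and cd.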