Given a single column file of numbers, call it f, the following awk code will return the maximum value
cat f | awk ' BEGIN {max = -inf}
{if ($1>max) max=$1}
END { print max }
'
The same approach to get the minimum doesn't produce anything
cat f | awk '
BEGIN {min = inf}
{if ($1<min) min=$1}
END {print min}
'
But if, instead of using inf, I start off with min = [some large number], and the number is large enough for what's in the file, then the revised code works.
Why doesn't inf work, and is there some way to make the min case work like the max case, without having to know what's in the file?
Best Answer
The actual task is best solved by initializing your max/min values not with an imaginary "smallest" or "greatest" number (which may not be implemented in the framework you are using, in this case awk), but with actual data. That way, the result is always guaranteed to be meaningful.

In your case, you can use the very first value you encounter (i.e. the entry in the first line) to initialize max and min, respectively, by adding a rule for the first record to your awk script. Then, if the first value is already the minimum, the subsequent test will not overwrite it, and in the end the correct result will be produced. The same holds for searches of the maximum value, so in combined searches you can initialize both variables in that same rule.

As for the reason why your approach with inf didn't work with awk whereas -inf seemed to, @steeldriver has provided a good explanation in a comment to your question, which I will also summarize for the sake of completeness:

- In awk, variables are "dynamically typed", i.e. everything can be a string or a number depending on use (but awk will "remember" what it was last used as and carry that information along into the next operation).
- When a variable is used in an arithmetic operation, awk will try to interpret the content of that variable as a number and perform the operation; if that succeeds, the variable is typed as numerical from then on.
- The name inf has no special meaning in awk (*), hence when used just so, it is an empty variable that will evaluate to 0 in an arithmetic expression such as -inf. Therefore, the "maximum search" with the max variable initialized to -inf works if your data is all positive, because -inf is simply 0 (and as such, the smallest non-negative number).
- Initializing min to inf, by contrast, will initialize the variable to the empty string, as no arithmetic operation is present that would warrant an automatic conversion of that empty string to a number.
- Therefore, in the later comparison, $1<min, the input, $1, is compared with a string value, which is why awk treats $1 as a string, too, and performs a lexicographical comparison rather than a numerical one.
- However, lexicographically, nothing is "smaller" than the empty string, and so min never gets assigned a new value. Therefore, in the END section, the statement print min prints the (still) empty string.
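
The behaviour described in the list above can be checked with a short experiment. This is only a sketch: the shell variable names (types, min_out, max_out) and the sample data are made up for illustration, and it assumes an awk in which inf is an ordinary unset variable, as the answer describes.

```shell
# "inf" is just an unset awk variable here: empty as a string, 0 as a number.
types=$(awk 'BEGIN {
    min = inf      # plain assignment, so min stays empty
    max = -inf     # unary minus yields a number that compares equal to 0
    printf "min=[%s] max_is_zero=%d", min, (max == 0)
}')
echo "$types"

# The minimum search from the question, on all-positive data:
# min is never updated, so only an empty line is printed.
min_out=$(printf '%s\n' 3 1 2 |
    awk 'BEGIN {min = inf} {if ($1<min) min=$1} END {print min}')
echo "min search printed: [$min_out]"

# The maximum search on all-negative data: no value is greater than 0,
# so max never leaves its starting value.
max_out=$(printf '%s\n' -3 -1 -2 |
    awk 'BEGIN {max = -inf}
         {if ($1>max) max=$1}
         END {if (max == 0) print "stuck at 0"; else print max}')
echo "max search: $max_out"
```

The last case illustrates why the max variant only appears to work: it relies on the data being positive, not on -inf being a real "smallest" number.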
(*) See Stephen Kitt's answer on how a string with content "inf" can actually have a meaning in awk.
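
For reference, the first-value initialization recommended at the top of this answer might look roughly like the following. It is a minimal sketch, not necessarily the exact rule the answer had in mind: the function name minmax is hypothetical, the NR == 1 guard is one possible way to write "a rule for the first record", and the "+ 0" is an added assumption that forces the seed values to be numeric.

```shell
# Hypothetical helper: print "min max" for the first column of the file in $1.
minmax() {
    awk '
        # Seed both variables from the first record; "+ 0" makes them numbers,
        # so the later comparisons against $1 are numeric.
        NR == 1  { min = $1 + 0; max = $1 + 0 }
        $1 < min { min = $1 + 0 }
        $1 > max { max = $1 + 0 }
        # Print nothing at all for an empty file.
        END      { if (NR > 0) print min, max }
    ' "$1"
}

# Usage with a throwaway sample file containing positive and negative values.
tmp=$(mktemp)
printf '%s\n' 7 -2 10 3 > "$tmp"
minmax "$tmp"
rm -f "$tmp"
```

Because the starting values come from the data itself, this works the same way for all-positive, all-negative, and mixed input, without any knowledge of what is in the file.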