Invoke date within an awk command to format output

awk, command-line, date

I have a csv file in the following format:

20171129,1
20171201,0.5
20171201,0.5
20171202,1.25
20171202,1.75

I use the following command to sum the second field over all lines that share the same date:

awk -F ',' '{a[$1] += $2} END{for (i in a) print "On " i, "you spend: "a[i] " hour(s)"}' < "file.csv"

The output I get looks like this:

On 20171129 you spend: 1 hour(s)
On 20171201 you spend: 1 hour(s)
On 20171202 you spend: 3 hour(s)

What I want now is to format the date the way this pipeline does:

awk -F ',' '{a[$1]} END{for (i in a) print i}' < "file.csv" \
 | date +"%a, %d.%m.%Y" -f -

# prints:
Wed, 29.11.2017
Fri, 01.12.2017
Sat, 02.12.2017

So my final result would look like:

On Wed, 29.11.2017 you spend: 1 hour(s)
On Fri, 01.12.2017 you spend: 1 hour(s)
On Sat, 02.12.2017 you spend: 3 hour(s)

Is it possible to invoke date within the awk command to format the output?

Best Answer

You could use gawk, which has the strftime and mktime functions (https://www.gnu.org/software/gawk/manual/html_node/Time-Functions.html).

gawk -F ',' '{a[$1] += $2} END{for (i in a) print "On " strftime("%a, %d.%m.%Y", mktime( substr(i,1,4) " " substr(i,5,2) " " substr(i,7,2) " 0 0 0" )) " you spend: "a[i] " hour(s)" }' files.csv

In more detail:

gawk -F ',' '
  {
    a[$1] += $2
  }
  END{
    for (i in a) {

      # mktime needs a date formatted like this: "2017 12 31 23 59 59"
      # strftime needs a Unix timestamp (produced by mktime)

      print "On " strftime("%a, %d.%m.%Y", mktime( substr(i,1,4) " " substr(i,5,2) " " substr(i,7,2) " 0 0 0" )) " you spend: "a[i] " hour(s)"

    }
  }' files.csv
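
One thing to watch with the gawk commands above: awk does not define the order in which `for (i in a)` visits the keys, so the dates may not come out chronologically. Here is a minimal sketch of one way to handle that, assuming gawk 4.0 or later (where PROCINFO["sorted_in"] is available). The function name report_sorted is made up for illustration, and LC_ALL=C is only there to keep the day names in English:

```shell
# Hypothetical helper (not from the original answer).
# Reads "YYYYMMDD,hours" lines on stdin; assumes gawk 4.0 or later.
report_sorted() {
  LC_ALL=C gawk -F ',' '
    { a[$1] += $2 }
    END {
      # Visit the keys as ascending strings, which is date order for YYYYMMDD
      PROCINFO["sorted_in"] = "@ind_str_asc"
      for (i in a) {
        ts = mktime(substr(i,1,4) " " substr(i,5,2) " " substr(i,7,2) " 0 0 0")
        print "On " strftime("%a, %d.%m.%Y", ts) " you spend: " a[i] " hour(s)"
      }
    }'
}

# Example call: report_sorted < files.csv
```

Sorting the keys as strings is enough here because YYYYMMDD strings sort in the same order as the dates they represent.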

With a plain (non-GNU) awk, you have to run the date command yourself and read its output with getline:

awk -F ',' '{a[$1] += $2} END{ for (i in a) { COMMAND = "date +\"%a, %d.%m.%Y\" -d " i ; if ( ( COMMAND | getline DATE ) > 0 ) { print "On " DATE " you spend: "a[i] " hour(s)" } ; close(COMMAND) } }' files.csv

In more detail:

awk -F ',' '
  {
    a[$1] += $2
  }
  END{
    for (i in a) {

      # Define command to call

      COMMAND = "date +\"%a, %d.%m.%Y\" -d " i

      # Call command and check that it prints something
      # We put the 1st line of text displayed by the command in DATE

      if ( ( COMMAND | getline DATE ) > 0 ) {
        # Print the result
        print "On " DATE " you spend: "a[i] " hour(s)"
      }

      # Close the command (important!)
      # Your child process is still here if you do not close it

      close(COMMAND)
    }
  }' files.csv 
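
The getline approach can also be wrapped in an awk function. The sketch below is not from the original answer: the names sum_hours and fmt are hypothetical, and it assumes GNU date (for -d). It adds two guards. Keys that are not all digits are never passed to the shell, and the raw key is printed when date produces no output.

```shell
# Hypothetical wrapper; reads "YYYYMMDD,hours" lines on stdin.
# Assumes GNU date (for -d); LC_ALL=C keeps the day names in English.
sum_hours() {
  LC_ALL=C awk -F ',' '
    # fmt() is an illustrative helper: format one YYYYMMDD key via date
    function fmt(d,    cmd, out) {
      if (d !~ /^[0-9]+$/) return d          # never pass odd input to the shell
      cmd = "date +\"%a, %d.%m.%Y\" -d " d
      if ((cmd | getline out) <= 0) out = d  # date printed nothing: keep raw key
      close(cmd)                             # reap the child process
      return out
    }
    { a[$1] += $2 }
    END {
      for (i in a) print "On " fmt(i) " you spend: " a[i] " hour(s)"
    }'
}

printf '%s\n' 20171201,0.5 20171201,0.5 | sum_hours
```

The digit check matters because the command string is built from file content and handed to a shell. Each distinct date still starts one date process, so this is fine for a handful of days but slow for very large inputs.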