I have a CSV file (in which the field separator is indeed a comma) with 8 columns and a few million rows. Here's a sample:
1000024447,38111220,201705,181359,0,12,1,3090
1064458324,38009543,201507,9,0,1,1,1298
1064458324,38009543,201508,9,0,2,1,90017
What's the fastest way to print the sum of all numbers in a given column, as well as the number of lines read? Can you explain what makes it faster?
Best Answer
GNU datamash
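A minimal sketch of how datamash could be applied to this problem, assuming GNU datamash is installed. The file name `sample.csv` and the wrapper function `datamash_sum_count` are hypothetical names introduced here for illustration; `-t,` sets the field separator to a comma, and `sum` and `count` are datamash's built-in operations.

```shell
# Hypothetical sample file holding the three rows from the question.
cat > sample.csv <<'EOF'
1000024447,38111220,201705,181359,0,12,1,3090
1064458324,38009543,201507,9,0,1,1,1298
1064458324,38009543,201508,9,0,2,1,90017
EOF

# Hypothetical helper: print the sum of column $1 and the number of
# values counted in that column, reading CSV data from standard input.
datamash_sum_count() {
    datamash -t, sum "$1" count "$1"
}
```

Calling `datamash_sum_count 3 < sample.csv` on the three sample rows should print `604720,3`, the sum of the third column followed by the line count, joined by the same comma separator.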
Some testing
So mawk and datamash appear to be the pick of the bunch.
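For comparison, here is a rough sketch of the awk approach, written against plain POSIX awk so that `mawk` can be substituted for `awk` where it is installed. The file name `sample.csv` and the function `awk_sum_count` are hypothetical. The explicit `printf` format is a precaution, on the assumption that some awk implementations may otherwise print large sums in scientific notation.

```shell
# Hypothetical sample file holding the three rows from the question.
cat > sample.csv <<'EOF'
1000024447,38111220,201705,181359,0,12,1,3090
1064458324,38009543,201507,9,0,1,1,1298
1064458324,38009543,201508,9,0,2,1,90017
EOF

# Hypothetical helper: sum column $1 and report the number of lines read.
# -F, splits fields on commas; NR holds the record count in the END block.
awk_sum_count() {
    awk -F, -v col="$1" '{ sum += $col } END { printf "%.0f %d\n", sum, NR }'
}

awk_sum_count 3 < sample.csv   # prints: 604720 3
```

Replacing `awk` with `mawk` inside the function is the only change needed to try the faster interpreter on the full file.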