Linux – How to split a CSV file into smaller files with a specified number of rows

awk, csv, linux, sed, unix

I have a CSV file (around 10,000 rows, each with 300 columns) stored on a Linux server. I want to split it into 500 CSV files of 20 records each, with every file carrying the same CSV header as the original.

Is there a Linux command that can do this?

Best Answer

For the sake of completeness, here are some minor improvements:

  • You could save the header once and reuse it for every piece
  • You could insert the header into the split files using sed, without temporary files

Like this:

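# Save the header line once
header=$(head -n 1 file.csv)
# Split everything after the header into 20-line pieces (xaa, xab, ...)
tail -n +2 file.csv | split -l 20
# Prepend the saved header to each piece, editing it in place
for file in x??; do
    sed -i -e 1i$'\\\n'"$header" "$file"
done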

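The $'\\\n' there uses bash's $'...' quoting and expands to a backslash followed by a NEWLINE character, which is the form sed's i (insert) command expects before its text. The sed expression means: insert $header before the 1st line of each file. Note that -i with no suffix argument is GNU sed syntax, and that the x?? pattern matches split's default output names, so run this in a directory that has no other three-character file names starting with x.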
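To see the approach end to end on something small, here is a minimal sketch that builds a throwaway CSV and splits it into pieces of 2 records instead of 20. The file name sample.csv, its contents, and the scratch directory are assumptions made up for illustration. It assumes bash and GNU sed.

```shell
#!/usr/bin/env bash
# Sketch: split a tiny sample CSV into 2-record pieces, each with the header.
set -eu

# Work in a scratch directory so the x?? glob matches only our pieces.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical sample data: one header line plus 5 records.
printf '%s\n' 'id,name' '1,a' '2,b' '3,c' '4,d' '5,e' > sample.csv

# Save the header once.
header=$(head -n 1 sample.csv)

# Split the records (everything after line 1) into xaa, xab, xac.
tail -n +2 sample.csv | split -l 2

# Insert the header before line 1 of every piece (GNU sed, in place).
for file in x??; do
    sed -i -e 1i$'\\\n'"$header" "$file"
done

head -n 1 x??   # every piece should start with: id,name
wc -l x??       # 3, 3 and 2 lines (header included)
```

The last piece is shorter because 5 records do not divide evenly by 2. The same happens with the real file if the number of records is not a multiple of 20.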
