Shell – Randomly draw a certain number of lines from a data file

linux, shell, text processing

I have a data list, like

12345
23456
67891
-20000
200
600
20
...

Assume the size of this data set (i.e. the number of lines in the file) is N. I want to randomly draw m lines from it. The output should be two files: one containing the m drawn lines, and the other containing the remaining N-m lines.

Is there a way to do that using a Linux command?

Best Answer

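This might not be the most efficient way, but it works:

shuf "$file" > tmp
head -n "$m" tmp > out1
tail -n "+$(( m + 1 ))" tmp > out2

Here $file is the input file and $m is the number of lines to draw: out1 receives the m randomly chosen lines and out2 the remaining N-m lines.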
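The same steps can be wrapped in a small function that uses a temporary file from mktemp instead of a fixed file named tmp. This is only a minimal sketch, assuming GNU coreutils (shuf, head, tail) is installed; the function name split_random and its argument order are hypothetical, not part of any standard tool.

```shell
# split_random FILE M OUT1 OUT2   (hypothetical helper, not a standard command)
# Writes M randomly chosen lines of FILE to OUT1 and the remaining lines to OUT2.
split_random() {
    local file=$1 m=$2 out1=$3 out2=$4
    local tmp
    tmp=$(mktemp) || return 1

    # Shuffle once, so that both outputs are cut from the same permutation.
    shuf "$file" > "$tmp" || { rm -f "$tmp"; return 1; }

    # The first M shuffled lines form the sample.
    head -n "$m" "$tmp" > "$out1"
    # Everything from line M+1 onward is the remainder.
    tail -n "+$(( m + 1 ))" "$tmp" > "$out2"

    rm -f "$tmp"
}
```

It could be called as, for example, split_random data.txt 100 sample.txt rest.txt. Note that both output files come out in shuffled order; the original line order is not preserved.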
