Linux – Why does ‘cat’ show this strange timing behaviour?

linux performance

I'm using cat to pipe different files into one big file. The number of different files varies, from two files up to ten, but the total size of all files is always the same (a couple of GB).

My problem: whenever I have a total of six files, the time it takes to concatenate them peaks (i.e., it is significantly longer than with five or seven files), and I have no idea why.

Does anyone have an idea?

The files (all the same size)

output
outputTEMP1
outputTEMP2
outputTEMP3
outputTEMP4
outputTEMP5

Command

cat outputTEMP* >> output && rm -f outputTEMP*

The machine is currently busy with other calculations, but I will update this question when new measurements are available.

Best Answer

One way to debug this problem is to use strace.

strace -tt -e trace=open,close -o /tmp/strace.cat.log cat apt.list authors.txt >/tmp/t.test
cat /tmp/strace.cat.log 

23:12:08.022588 open("apt.list", O_RDONLY|O_LARGEFILE) = 3
23:12:08.023451 close(3)                = 0
23:12:08.023717 open("authors.txt", O_RDONLY|O_LARGEFILE) = 3
23:12:08.025403 close(3)                = 0

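The -tt option prefixes each system call with a timestamp at microsecond resolution. -e trace=open,close restricts the log to the open and close system calls; try removing it and you will see a very noisy log file. On newer systems, where cat may open files through openat instead of open, add openat to the trace list.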
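
To turn such a log into per-file numbers, one option is a small awk helper. This is only a sketch: the function name open_close_durations is made up here, and it assumes a single-threaded trace in which each successful open (or openat) is followed by its own close, as in the log above. It also assumes the run does not cross midnight.

```shell
# Hypothetical helper (not part of strace): for each successful open/openat
# in a `strace -tt` log, print the file name and the seconds until the next close.
open_close_durations() {
    awk '
        # Convert an HH:MM:SS.ffffff timestamp to seconds since midnight.
        function secs(ts,    p) {
            split(ts, p, ":")
            return p[1] * 3600 + p[2] * 60 + p[3]
        }

        # Successful open: remember the quoted path and the start time.
        /open(at)?\(/ && !/ = -1 / {
            split($0, q, "\"")
            name = q[2]
            start = secs($1)
            next
        }

        # Next close after a remembered open: report the elapsed time.
        /close\(/ && name != "" {
            printf "%s %.6f\n", name, secs($1) - start
            name = ""
        }
    ' "$1"
}
```

Running open_close_durations /tmp/strace.cat.log would then print one line per opened file. Comparing those lines for the five-, six- and seven-file runs should show whether a single input file is slow or the whole run is.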
