Awk memory leak

awk, memory, osx

Based on this, I'm running the command:

< /dev/urandom hexdump -v -e '/1 "%u\n"' |
awk '{ split("0,2,4,5,7,9,11,12",a,",");
       for (i = 0; i < 1; i+= 0.0001)
         printf("%08X\n", 100*sin(1382*exp((a[$1 % 8]/12)*log(2))*i)) }' |
xxd -r -p |
sox -traw -r44100 -b16 -e unsigned-integer - -tcoreaudio

I notice that the memory used by awk continually grows while this command is running, for example consuming over 500MB of memory by the time 75MB of raw audio data has been played. All of the other commands in the pipeline maintain a constant amount of memory.

What is awk using this memory for and is there an alternative that does the intended stream processing using only a constant amount of memory?


In case the awk version matters:

$ awk --version
awk version 20070501

Here's the command I tested based on Thomas Dickey's answer:

< /dev/urandom hexdump -v -e '/1 "%u\n"' |
awk 'BEGIN { split("0,2,4,5,7,9,11,12",a,",") }
           { for (i = 0; i < 1; i+= 0.0001)
               printf("%08X\n", 100*sin(1382*exp((a[$1 % 8]/12)*log(2))*i)) }' |
xxd -r -p |
sox -traw -r44100 -b16 -e unsigned-integer - -tcoreaudio

Best Answer

This statement is odd:

split("0,2,4,5,7,9,11,12",a,",");

It repeatedly splits a constant string to create the array a, once for every input record. If you move that into a BEGIN section, the program should work the same, without rebuilding the a array for each input record.
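One way to check the "should work the same" claim is to feed both variants the same fixed input and compare what they print. The following is a minimal sketch, not the original test: the three input values and the coarser loop step (0.25 instead of 0.0001) are assumptions chosen only to keep the output short.

```shell
# Variant from the question: split() runs once per input record.
per_record='{ split("0,2,4,5,7,9,11,12",a,",");
  for (i = 0; i < 1; i += 0.25)
    printf("%08X\n", 100*sin(1382*exp((a[$1 % 8]/12)*log(2))*i)) }'

# Variant from the answer: split() runs once, in BEGIN.
in_begin='BEGIN { split("0,2,4,5,7,9,11,12",a,",") }
  { for (i = 0; i < 1; i += 0.25)
    printf("%08X\n", 100*sin(1382*exp((a[$1 % 8]/12)*log(2))*i)) }'

# Fixed stand-in for the random byte values that hexdump would produce.
input=$(printf '%s\n' 3 60 255)

out_per_record=$(printf '%s\n' "$input" | awk "$per_record")
out_in_begin=$(printf '%s\n' "$input" | awk "$in_begin")

if [ "$out_per_record" = "$out_in_begin" ]; then
  echo "identical output"
else
  echo "outputs differ"
fi
```

This only compares output. It says nothing about memory use, which has to be observed on the running process.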

Addressing comments: the for-loop and expression do not allocate memory in a simple manner. A quick comparison of mawk, gawk and awk shows that there is no problem with the first two, but /usr/bin/awk on OSX does leak rapidly. If Apple had a bug-reporting system, that would be the place to go.
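Since mawk and gawk did not show the leak in that comparison, one possible workaround is to run the same pipeline with one of them in place of /usr/bin/awk. The sketch below assumes at least one of them is installed and on PATH; `pick_awk` and `play_tones` are hypothetical helper names, and whether every implementation formats negative values under `%08X` the same way has not been checked here.

```shell
# Hypothetical helper: print the first awk implementation found on PATH,
# preferring the two that the answer reports as not leaking.
pick_awk() {
  for candidate in mawk gawk awk; do
    if command -v "$candidate" >/dev/null 2>&1; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

AWK=$(pick_awk) || { echo "no awk found" >&2; exit 1; }
echo "using: $AWK"

# The pipeline from the question, with only the awk binary swapped.
# Defined but not called here, since it plays audio until interrupted.
play_tones() {
  < /dev/urandom hexdump -v -e '/1 "%u\n"' |
  "$AWK" 'BEGIN { split("0,2,4,5,7,9,11,12",a,",") }
          { for (i = 0; i < 1; i += 0.0001)
              printf("%08X\n", 100*sin(1382*exp((a[$1 % 8]/12)*log(2))*i)) }' |
  xxd -r -p |
  sox -traw -r44100 -b16 -e unsigned-integer - -tcoreaudio
}
```

Calling `play_tones` starts playback. While it runs, the resident memory of the awk process can be watched from another terminal, for example with `ps -o rss= -p` followed by its process ID.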
