I was trying to determine the best performance for string fill like in:
str+="A"
#one per loop
I came up with this script for bash:
#!/bin/bash
bReport=false
nLimit=${1-3000}; #up to 25000
echo "nLimit='$nLimit'"
shopt -s expand_aliases
nStop=100000;fMaxWorkTime=1.0;
alias GetTime='date +"%s.%N"';
nTimeBegin="`GetTime`";
nDelayPart="`GetTime`";
strFinal="";
str="";
fPartWorkSleep="`bc <<< "scale=10;($fMaxWorkTime/$nStop)*$nLimit"`"
echo "fPartWorkSleep='$fPartWorkSleep'"
nCount=0;
while true;do
  str+="A";
  ((nCount++))&&:;
  if(((nCount%nLimit)==0)) || ((nCount==nStop));then
    strFinal+="$str";
    str="";
    if $bReport;then
      echo "`bc <<< "$(GetTime)-$nDelayPart"` #${#strFinal} #`bc <<< "$(GetTime)-$nTimeBegin"`";
      nDelayPart="`GetTime`";
    fi
    sleep $fPartWorkSleep # like doing some weighty thing based on the amount of data processed
  fi;
  if((nCount==nStop));then
    break;
  fi;
done;
echo "strFinal size ${#strFinal}"
echo "took `bc <<< "$(GetTime)-$nTimeBegin"`"
In bash, the best performance relative to size is when str is limited to between 3000 and 25000 characters (on my machine). After each part is filled, some weighty action can be performed with the value of str (the weight is relative to its size), and str must then be emptied.
So my question is: based on what I described, which shell has the best string fill performance? I am willing to use a shell other than bash, just for this kind of algorithm, if it proves to be faster.
P.S.: I had to use nCount because checks on the string size degraded performance.
Best Answer
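A minimal sketch of the kind of benchmark this answer describes. The list of shell names, the use of `time -p`, and the variable name `fill_script` are assumptions for illustration, not the original command; shells that are not installed are skipped.

```shell
# Hypothetical harness: the shell list and `time -p` are assumptions.
fill_script='
str="some string"
set -- "" ""                        # two null arguments to start
while [ "$#" -le 20000 ]; do        # loop until $20001 is set
    set -- "$@" "$@"                # double the argument list
done
IFS=A                               # first char of IFS joins "$*"
printf "%.100000s\n" "$str$*$*$*$*$*"
'

for sh in bash dash ksh zsh yash; do
    command -v "$sh" >/dev/null 2>&1 || continue   # skip missing shells
    printf '%s:\n' "$sh"
    # `time` is a keyword in bash/ksh/zsh and an external utility elsewhere;
    # its report goes to stderr.
    time -p "$sh" -c "$fill_script" >/dev/null
done
```

Only the time report is of interest here, so the generated string is sent to /dev/null.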
So this benches the various shells set to $sh in the for loop on how quickly they can generate a string of 100,000 characters. The first 11 of those 100,000 chars are "some string", as is first set to the value of $str, but the tail fill is 99,989 A chars.
The shells get the A chars in $*, which substitutes in the first character in the value of the special shell parameter $IFS as a concatenation delimiter between every positional parameter in the shell's argument array. Because all of the arguments are "" null, the only chars in $* are the delimiter chars.
The arguments are accrued at an exponential rate for each iteration of the while loop, which only breaks when the $20001 parameter has finally been ${set+}. Until then, basically the while loop does: ...and so on.
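One way such a loop could grow the argument list, assuming it doubles the list on every pass (the `[ "$#" -le 20000 ]` test and the variable `nargs` are stand-ins chosen for readability, not the original loop condition):

```shell
set -- "" ""                     # pass 0: 2 null arguments
while [ "$#" -le 20000 ]; do     # keep going until $20001 exists
    set -- "$@" "$@"             # 2 -> 4 -> 8 -> 16 -> ...
done
nargs=$#
echo "$nargs"                    # prints 32768 (2^15, first power of two above 20000)
```

With 32768 null arguments, "$*" holds 32767 delimiter characters, so five copies are more than enough to cover 100,000.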
After the while loop completes, $IFS is set to A and the special shell parameter $* is concatenated five times to the tail of $str. printf trims the resulting %s string argument to a maximum of 100000 bytes (the .100000 precision) before writing it out to its stdout.
One might use the same strategy like:
...which results in a total argument count of 40 - and so 39 delimiters...
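A small-scale sketch of that strategy. The counts of 5 and 8 are an assumption chosen only to reach 40 arguments, and `fill`, `nargs` and `old_ifs` are illustrative names:

```shell
str="some string"
set -- "" "" "" "" ""                            # 5 null arguments
set -- "$@" "$@" "$@" "$@" "$@" "$@" "$@" "$@"   # 5 * 8 = 40 arguments
nargs=$#

old_ifs=$IFS
IFS=A
fill="$*"                      # 40 null arguments joined by 39 "A" delimiters
IFS=$old_ifs                   # restore the previous field separator

printf '%s\n' "$str$fill"      # "some string" followed by 39 "A" chars
printf '%.20s\n' "$str$fill"   # precision trims the output to 20 bytes
```

The second printf shows the same trimming the benchmark relies on: the string is built longer than needed and the precision cuts it to size.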
And you can reuse the same arguments you've already set with a different $IFS for a different fill:
You can also fill in the null arguments with a printf format string rather than using $IFS:
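A hedged sketch of both variations, assuming 40 null arguments are already set; the variable names and the 'C%s' format are illustrative choices, not the original commands:

```shell
set -- "" "" "" "" ""
set -- "$@" "$@" "$@" "$@" "$@" "$@" "$@" "$@"   # 40 null arguments

old_ifs=$IFS
IFS=A; fill_a="$*"             # 39 "A" delimiters
IFS=B; fill_b="$*"             # same arguments, now 39 "B" delimiters
IFS=$old_ifs

# printf reuses its format once per argument, so each null
# argument contributes one "C": 40 chars rather than 39.
fill_c=$(printf 'C%s' "$@")

printf '%s\n%s\n%s\n' "$fill_a" "$fill_b" "$fill_c"
```

Note the off-by-one between the two approaches: $IFS fills the gaps between arguments, while the format string is emitted once per argument.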