Extract download progress from curl output

curl sed

I'm trying to extract the download progress of a file that's being retrieved by curl.

I tried this, but it doesn't work:

curl --progress-bar http://127.0.0.1/test.tar.bz2 -o test.tar.bz2 2>/dev/stdout | sed -r 's/[# ]//g;s/^/#/g'

The sed expression seems to be fine though:

$ echo '########                      10.2%' | sed -r 's/[# ]//g;s/^/#/g'
#10.2%

Can anyone please point out what I'm doing wrong?

Best Answer

The main issue is that sed works on lines, so it does nothing until it reads the first \n. curl redraws its progress bar with \r (carriage return) rather than \n, so that first newline doesn't arrive until the command has finished. You can get around this by replacing each \r with \n:

$ curl --progress-bar http://127.0.0.1/test.tar.bz2 -o test.tar.bz2 2>&1 | 
   tr $'\r' $'\n' | sed -r 's/[# ]+/#/g;'
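To see the effect of the tr step without a server to download from, here is a minimal sketch that fakes curl's output with printf. The function name fake_progress and the sample bar strings are made up for illustration; real curl output should pass through the same pipeline in the same way.

```shell
# Stand-in for curl's progress bar: updates are separated by \r, never \n.
fake_progress() {
    printf '####                 10.2%%\r########             45.0%%\r####################100.0%%'
}

# tr turns every \r into \n, so sed receives each update as its own line
# and collapses the run of '#' and spaces into a single '#'.
fake_progress | tr '\r' '\n' | sed -r 's/[# ]+/#/g'
# prints:
# #10.2%
# #45.0%
# #100.0%
```

Without the tr step, sed would see all three updates as one long line and print nothing until the input ended.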

This, however, runs into output buffering: because tr is writing to a pipe, its output is delivered in blocks, so sed now acts on groups of lines rather than on each update as it arrives. The final solution I hacked together was to redirect stderr to a file and then process that file:

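$ # Run curl in the background so the loop below can start while it downloads
$ curl --progress-bar http://127.0.0.1/test.tar.bz2 -o test.tar.bz2 2>er &
$ while :; do
    # Print the most recent update, then \r so the next one overwrites it
    echo -ne "$(tr $'\r' $'\n' < er | tail -n 1 | sed -r 's/^[# ]+/#/')\r"
    sleep 0.2   # avoid a busy loop
  done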

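The loop above repeatedly parses the error file (er, which holds curl's stderr), takes the latest update and prints it followed by \r, so the same terminal line is redrawn continuously. The loop never exits on its own, so you will need to break out of it manually (Ctrl-C).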
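The parsing step inside the loop can be pulled out into a small helper, which makes it easier to try on its own. This is only a sketch: latest_progress is a hypothetical name, and the log contents in the demo are fabricated rather than captured from curl.

```shell
# latest_progress FILE: print the most recent update from a file that
# holds curl's --progress-bar output (updates separated by \r).
latest_progress() {
    tr '\r' '\n' < "$1" | tail -n 1 | sed -r 's/^[# ]+/#/'
}

# Demo with a fabricated log instead of a real download.
log=$(mktemp)
printf '##   10.2%%\r######   61.7%%' > "$log"
latest_progress "$log"   # prints #61.7%
rm -f "$log"
```

In the loop above, the argument to echo could then be written as "$(latest_progress er)\r".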

Suggestion from an anonymous user: you can also put stdbuf -oL in front of tr and sed, which makes their output line-buffered and so avoids the buffering problem.
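A rough sketch of that suggestion, assuming GNU coreutils (which provides stdbuf) and GNU sed. The function name progress_lines is hypothetical, and the demo uses printf with sleep as a stand-in for curl so it runs without a network.

```shell
# progress_lines: read curl-style progress output on stdin and print one
# cleaned-up line per update, flushing after every line (-oL).
progress_lines() {
    stdbuf -oL tr '\r' '\n' | stdbuf -oL sed -r 's/[# ]+/#/g'
}

# Demo: emit one update per second; each should appear as it is produced
# rather than all at once when the input ends.
{
    printf '##      20.0%%\r'; sleep 1
    printf '#####   60.0%%\r'; sleep 1
    printf '########100.0%%\r'
} | progress_lines
```

With a real download, the call would look like: curl --progress-bar http://127.0.0.1/test.tar.bz2 -o test.tar.bz2 2>&1 | progress_lines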
