Get size of file except trailing zero bytes

aria2 disk-usage files

I want to get the size of a file that is being downloaded. Since the file is preallocated, using du -sd just returns its final, full size. I want to know how much has been downloaded, so I don't want those trailing zero bytes to count. How do I get this size?

This should be possible, since aria2c can easily resume its stopped downloads, and it doesn't seem to store the downloaded length in its control (session) files. I have written a script to read total_length from .aria2 control files. This is the total length though, not downloaded length. You can easily use that script and the technical specs to get any other property aria2 stores.

Update from comments:

As ilkkachu was hinting, BITFIELD in the .aria2 file seems to actually
be a map: each bit corresponds to a file chunk, 1 meaning "downloaded"
(0 meaning "not downloaded"). BITFIELD LENGTH gives you the number of
chunks (and the chunk size is likely just that of the file divided by
the chunk number). I'm pretty sure the download progress is given by
the ratio of 1s over the number of chunks in BITFIELD. Unfortunately,
AFAICT, the .aria2 file seems to be updated after some delay, or as
soon as the download is interrupted.

Best Answer

Considering just the issue of finding out how far along aria2 is on a download, there are a few choices.

As discussed in the comments, the information is in a bitmap in the control file (filename.aria2). It's documented in https://aria2.github.io/manual/en/html/technical-notes.html . Having a bitmap doesn't make much sense for an HTTP download, which goes linearly from the start, but I suppose it would make more sense for a BitTorrent download or such.

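Here's a hex dump of a control file for a particular download with the important fields marked (od -tx1 file.aria2). Note that od prints its offsets in octal, while the annotations below the dump use decimal byte offsets: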

0000000 00 01 00 00 00 00 00 00 00 00 00 10 00 00 00 00
                                      ^^^^^^^^^^^ ^^^^^^  
0000020 00 00 82 9d c0 00 00 00 00 00 00 00 00 00 00 00 
        ^^^^^^^^^^^^^^^^^                         ^^^^^^
0000040 01 06 ff ff ff ff ff ff ff ff ff ff ff ff ff ff
        ^^^^^ ^^^... 
0000060 ff ff ff ff ff ff ff ff ff fe 00 00 00 00 00 00


offset 10: 00 10 00 00 => piece length = 0x100000 = 1 MiB
offset 14: 00 00 00 00 
           82 9d c0 00 => file length = 0x829dc000 = 2191376384 (~ 2 GiB)
offset 30: 00 00 01 06 => size of bitmap = 0x0106 = 262 bytes, could fit 2096 pieces
offset 34: ff ff ...   => bitmap

Counting the set bits in the bitmap, that particular download was interrupted after at least 191 pieces of 1 MiB (200278016 bytes) had been downloaded, which pretty much matches the resulting file size I got, 201098200 bytes. (The actual file was bigger by just under 1 MiB; the records for in-flight pieces in the control file might account for that, but I didn't care. I didn't have pre-allocation on, just so that I could cross-check against the size on the filesystem.)
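As a rough illustration of the counting described above, here is a minimal Python sketch. It assumes the version 0x0001 control file layout from the technical notes (big-endian integers, an info hash length field before the piece length, and bitmap bits ordered most significant bit first, as the dump suggests). The function name parse_aria2_control and the returned keys are made up for this example, and in-flight pieces are ignored, so the result is a lower bound.

```python
import struct


def parse_aria2_control(data: bytes) -> dict:
    """Estimate download progress from the raw bytes of a .aria2 control file.

    Assumes the version 0x0001 layout (all integers big-endian).
    """
    # VERSION (2 bytes), EXTENSION (4 bytes), INFO HASH LENGTH (4 bytes)
    version, _ext, hash_len = struct.unpack_from(">H4sI", data, 0)
    if version != 1:
        raise ValueError(f"unsupported control file version: {version}")

    # Skip the info hash (empty for the plain HTTP download in the dump).
    offset = 10 + hash_len

    # PIECE LENGTH (4), TOTAL LENGTH (8), UPLOAD LENGTH (8), BITFIELD LENGTH (4)
    piece_len, total_len, _upload_len, bitfield_len = struct.unpack_from(
        ">IQQI", data, offset
    )
    offset += 24
    if piece_len == 0:
        raise ValueError("piece length is zero")

    bitfield = data[offset:offset + bitfield_len]
    if len(bitfield) != bitfield_len:
        raise ValueError("control file is truncated")

    # One bit per piece; a set bit means the piece is complete.
    pieces_done = sum(bin(byte).count("1") for byte in bitfield)
    num_pieces = -(-total_len // piece_len)  # ceiling division

    bytes_done = pieces_done * piece_len
    if num_pieces > 0:
        last = num_pieces - 1
        # The last piece may be shorter than piece_len; correct for that
        # if it is marked as complete (bits are assumed MSB first).
        if bitfield[last // 8] & (0x80 >> (last % 8)):
            bytes_done -= num_pieces * piece_len - total_len

    return {
        "piece_length": piece_len,
        "total_length": total_len,
        "num_pieces": num_pieces,
        "pieces_done": pieces_done,
        "bytes_done": bytes_done,
    }
```

Reading the control file in binary mode and passing its contents to this function should, for the dump above, report 191 completed pieces and 200278016 bytes.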

By default aria2c saves the control file every 60 seconds, but we can use --auto-save-interval=<secs> to change that:

--auto-save-interval=<SEC>
       Save a control file(*.aria2) every SEC seconds.  If 0 is
       given, a control file is not saved during download. aria2
       saves  a  control  file  when  it stops regardless of the
       value.  The possible values are between 0 to 600. 
       Default: 60

Alternatively, I suppose you could use aria2c --log=<logfile> and fish the download progress out of the log. It seems, though, that the progress only shows up in the write cache entries of DEBUG-level messages, and with those enabled, the log is rather verbose.

Also, you could use --summary-interval=1 to print some progress output to stdout, possibly redirected to some log file (and perhaps with --show-console-readout=false to hide the live readout). Though it only seems to give rounded figures:

 *** Download Progress Summary as of Wed May 13 12:57:11 2020 ***
=================================================================
[#b56779 1.7GiB/2.0GiB(86%) CN:1 DL:105MiB ETA:2s]
FILE: /work/blah.iso
-----------------------------------------------------------------
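
If the summary output is redirected to a file, the progress line could be picked apart with a regular expression. The following is only a sketch: the pattern is inferred from the single sample line above, the function name parse_summary_line is made up for this example, and real aria2 output may look different in other situations (for instance when the total size is unknown).

```python
import re

# Pattern inferred from the sample line, e.g.
# [#b56779 1.7GiB/2.0GiB(86%) CN:1 DL:105MiB ETA:2s]
SUMMARY_RE = re.compile(
    r"\[#(?P<gid>[0-9a-f]+) "
    r"(?P<done>[\d.]+[KMGT]?i?B)/"
    r"(?P<total>[\d.]+[KMGT]?i?B)"
    r"\((?P<percent>\d+)%\)"
)


def parse_summary_line(line: str):
    """Return gid, done, total and percent from a progress line, or None."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "gid": match.group("gid"),
        "done": match.group("done"),        # rounded, human-readable figure
        "total": match.group("total"),
        "percent": int(match.group("percent")),
    }
```

Since the figures in the summary are already rounded, this gives no more precision than the console output itself; the control file bitmap is the better source for exact byte counts.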