Ubuntu – Ambiguity using “ls -l” and “file” commands on a qcow file

file size, qcow2

I have a qcow2 file on my filesystem and I am trying to find the size of that file.

When I run ls -l in the directory where the file is stored, I get 13041664, which means the file size is around 13 MB. When I run file <filename>, I get:

disk: QEMU QCOW Image (v2), has backing file (path /var/lib/nova/instances/_base/035db99541e92b5cca93bf18a997d626f), 21474836480 bytes

which, I presume, says the file size is around 21 GB.

Am I misinterpreting the command output, or is something else going on inside the filesystem (some kind of thin provisioning)?

UPDATE: When I run ls -l on /var/lib/nova/instances/_base/035db99541e92b5cca93bf18a997d626f, I get ls: cannot access /var/lib/nova/instances/_base/035db99541e92b5cca93bf18a997d626f: No such file or directory, and it's correct that I have no file there.

UPDATE 2: The output of qemu-img info <filename> is as follows:

image: disk
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 12M
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/035db99541e92b5cca93bf18a997d626f512b73d (actual path: /var/lib/nova/instances/_base/035db99541e92b5cca93bf18a997d626f512b73d)

Best Answer

From Qcow in Wikipedia:

One of the main characteristics of qcow disk images is that files with this format can grow as data is added. This allows for smaller file sizes than raw disk images, which allocate the whole image space to a file, even if parts of it are empty.

So the file really is 13 MB on disk, but it can grow up to 20 GB as data is written to it. Example:

$ qemu-img create -f qcow2 test.img 2G
Formatting 'test.img', fmt=qcow2 size=2147483648 encryption=off cluster_size=65536 lazy_refcounts=off 
$ ls -l test.img 
-rw-r--r-- 1 carvalho carvalho 197120 Jul 18 09:30 test.img
$ file test.img 
test.img: QEMU QCOW Image (v2), 2147483648 bytes

An empty qcow2 file was created. It can hold a 2 GB filesystem, but for now it occupies only 197 KB on disk.
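To see where the two numbers come from: ls -l reports the length of the image file itself, while file prints the virtual disk size stored in the qcow2 header. Below is a minimal sketch in Python that reads both. It assumes a version 2 (or later) image and the header layout from the qcow2 specification (all fields big-endian: magic at byte 0, version at 4, backing file offset at 8, backing file name length at 16, cluster bits at 20, virtual size at 24). The function name qcow2_sizes is my own, not part of any QEMU tool.

```python
import os
import struct

QCOW_MAGIC = b"QFI\xfb"  # first four bytes of every qcow image


def qcow2_sizes(path):
    """Return (file length as shown by ls -l, virtual size from the header)."""
    with open(path, "rb") as f:
        header = f.read(32)
    if len(header) < 32 or header[:4] != QCOW_MAGIC:
        raise ValueError("not a qcow image: %s" % path)
    # ">" means big-endian with no padding:
    # 4s magic, I version, Q backing file offset, I backing file name length,
    # I cluster bits, Q virtual disk size in bytes
    (_magic, _version, _backing_off, _backing_len,
     _cluster_bits, virtual_size) = struct.unpack(">4sIQIIQ", header)
    return os.path.getsize(path), virtual_size
```

Called on test.img from the example above, this should return the number that ls -l shows as the first element and the number that file shows as the second.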


From http://en.wikibooks.org/wiki/QEMU/Images:

The "cow" part of qcow2 is an acronym for copy on write, a neat little trick that allows you to set up an image once and use it many times without changing it. This is ideal for developing and testing software, which generally requires a known stable environment to start off with. You can create your known stable environment in one image, and then create several disposable copy-on-write images to work in.

To start a new disposable environment based on a known good image, invoke the qemu-img command with the option -o backing_file and tell it what image to base its copy on. When you run QEMU using the disposable environment, all writes to the virtual disc will go to this disposable image, not the base copy.

From the qemu-img manpage:

If the option backing_file is specified, then the image will record only the differences from backing_file. No size needs to be specified in this case. backing_file will never be modified unless you use the "commit" monitor command (or qemu-img commit).

Example:

$ qemu-img create -f qcow2 -o backing_file=test.img test01.img
Formatting 'test01.img', fmt=qcow2 size=2147483648 backing_file='test.img' encryption=off cluster_size=65536 lazy_refcounts=off 
$ file test01.img 
test01.img: QEMU QCOW Image (v2), has backing file (path test.img), 2147483648 bytes
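The copy-on-write idea behind this can be sketched as a toy model in Python. This is only an illustration of the principle and not the real qcow2 on-disk format; the class and method names are made up for the example. Reads fall through to the base image until a cluster has been written, and writes only ever touch the overlay.

```python
class ToyOverlay:
    """Toy copy-on-write overlay over a read-only base image."""

    def __init__(self, base_clusters):
        self.base = base_clusters  # backing image: a list of cluster contents
        self.own = {}              # cluster index -> data written to the overlay

    def read(self, index):
        # Use the overlay's copy if the cluster was written, else the base.
        return self.own.get(index, self.base[index])

    def write(self, index, data):
        # Writes never modify the base image.
        self.own[index] = data

    def allocated(self):
        # Number of clusters the overlay actually stores.
        return len(self.own)
```

A fresh overlay stores nothing of its own, which is why test01.img starts out tiny even though it presents a 2 GB disk.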

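In your case /var/lib/nova/instances/_base/035db99541e92b5cca93bf18a997d626f512b73d is the backing file. Note that the path printed by file in your question ends at ...626f while qemu-img info shows ...626f512b73d, so the file output appears to be truncated, which would explain why ls on the shorter path found nothing. I don't know what the expected behavior is if you try to use your qcow file without the backing file.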
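If you want to check which backing file an image points to without relying on the output of file, the name can be read straight from the qcow2 header. A minimal sketch, assuming the same header layout as in the qcow2 specification (backing file offset as a big-endian 64-bit value at byte 8, name length as a 32-bit value at byte 16) and assuming that relative names are resolved against the directory of the image, as QEMU does. The function names are my own.

```python
import os
import struct


def qcow2_backing_file(path):
    """Return the backing file name stored in the header, or None if absent."""
    with open(path, "rb") as f:
        header = f.read(20)
        if len(header) < 20 or header[:4] != b"QFI\xfb":
            raise ValueError("not a qcow image: %s" % path)
        # Q: offset of the backing file name, I: its length in bytes
        backing_off, backing_len = struct.unpack(">QI", header[8:20])
        if backing_off == 0:
            return None  # image has no backing file
        f.seek(backing_off)
        return f.read(backing_len).decode("utf-8", errors="replace")


def backing_file_exists(path):
    """Return True/False for images with a backing file, None otherwise."""
    name = qcow2_backing_file(path)
    if name is None:
        return None
    # os.path.join keeps an absolute name unchanged and resolves a relative
    # one against the directory that contains the image.
    image_dir = os.path.dirname(os.path.abspath(path))
    return os.path.exists(os.path.join(image_dir, name))
```

For the image in the question, qcow2_backing_file should return the full path that qemu-img info reports, and backing_file_exists tells you whether that file is actually present on the compute node.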


About /var/lib/nova/instances, from the OpenStack documentation:

This directory contains the libvirt KVM file-based disk images for the instances that are hosted on that compute node. If you are not running your cloud in a shared storage environment, this directory will be unique across all compute nodes.

/var/lib/nova/instances contains two types of directories.

The first is the _base directory. This contains all of the cached base images from glance for each unique image that has been launched on that compute node. Files ending in _20 (or a different number) are the ephemeral base images.

The other directories are titled instance-xxxxxxxx. These directories correspond to instances running on that compute node. The files inside are related to one of the files in the _base directory. They're essentially differential-based files containing only the changes made from the original _base directory.