
Documentation of QEMU Block Device Operations

QEMU Block Layer currently (as of QEMU 2.10) supports four major kinds of live block device jobs – stream, commit, mirror, and backup. These can be used to manipulate disk image chains to accomplish certain tasks, e.g.: live copy data from backing files into overlays; shorten long disk image chains by merging data from overlays into backing files; live synchronize data from a disk image chain (including current active disk) to another target image; and point-in-time (and incremental) backups of a block device.

To that end, I have recently written documentation (thanks to the QEMU Block Layer maintainers & developers for the reviews) of the usage of the following commands:

  • block-stream
  • block-commit
  • drive-mirror (and blockdev-mirror)
  • drive-backup (and blockdev-backup)

For each of the above block device jobs, QMP (QEMU Machine Protocol) invocation examples are documented.
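
For instance, a minimal QMP invocation of block-stream — which populates the active image with data from its backing file chain — looks roughly like this (a sketch only; "virtio0" is a placeholder for your drive ID, and the job signals completion with a BLOCK_JOB_COMPLETED event):

{ "execute": "block-stream",
  "arguments": { "device": "virtio0" } }
{ "return": {} }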

Here’s the source. And here’s the Sphinx-rendered HTML version.

This documentation can be handy in those (debugging) scenarios when it’s instructive to look at what is happening behind the scenes of QEMU. For example, live storage migration (without shared storage setup) is one of the most common use-cases that takes advantage of the QMP drive-mirror command and QEMU’s built-in Network Block Device (NBD) server. Here’s the QMP-level workflow for it — this is the flow libvirt internally implements (with some additional niceties).
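
Roughly, that QMP-level flow looks like the sketch below (host name, port, and device ID are made up; the destination QEMU is assumed to be started paused, with an empty target image of the right size already created; events and error handling are omitted):

# On the destination, export the (empty) target disk over NBD:
{ "execute": "nbd-server-start",
  "arguments": { "addr": { "type": "inet",
                           "data": { "host": "dst-host", "port": "49153" } } } }
{ "execute": "nbd-server-add",
  "arguments": { "device": "drive-virtio0", "writable": true } }

# On the source, mirror the active disk to that NBD export:
{ "execute": "drive-mirror",
  "arguments": { "device": "drive-virtio0", "sync": "full", "mode": "existing",
                 "target": "nbd:dst-host:49153:exportname=drive-virtio0" } }

# Once the BLOCK_JOB_READY event arrives, migrate the VM state (the regular
# "migrate" command); after migration completes, end the mirror job on the
# source and stop the NBD server on the destination:
{ "execute": "block-job-cancel", "arguments": { "device": "drive-virtio0" } }
{ "execute": "nbd-server-stop" }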


Creating rapid thin-provisioned guests using QEMU backing files

Provisioning virtual machines rapidly is highly desirable, especially when deploying a large number of them. With QEMU's backing file concept, we can instantiate several clones by creating a single base image and then sharing it (read-only) across multiple guests. Each guest then writes its changes to its own disk image.

To exemplify:

Initially, let's create a minimal Fedora 17 virtual guest (I used this script), and copy the resulting qcow2 disk image as base-f17.qcow2. So base-f17.qcow2 has Fedora 17 on it and is established as our base image. Let's look at its info:

$ qemu-img info base-f17.qcow2
image: base-f17.qcow2
file format: qcow2
virtual size: 5.0G (5368709120 bytes)
disk size: 5.0G
cluster_size: 65536
[root@localhost vmimages]# 

Now, let's make use of the above F17 base image and try to instantiate 2 more Fedora 17 virtual machines quickly. First, create a new qcow2 file (f17vm2-with-b.qcow2) using base-f17.qcow2 as its backing file:

$ qemu-img create -b /home/kashyap/vmimages/base-f17.qcow2 \
  -f qcow2 /home/kashyap/vmimages/f17vm2-with-b.qcow2
Formatting '/home/kashyap/vmimages/f17vm2-with-b.qcow2', fmt=qcow2 size=5368709120 backing_file='/home/kashyap/vmimages/base-f17.qcow2' encryption=off cluster_size=65536 lazy_refcounts=off 

And now, let's see some information about the just-created disk image. (Notice the 'backing file' attribute below pointing to our base image, base-f17.qcow2.)

$ qemu-img info /home/kashyap/vmimages/f17vm2-with-b.qcow2
image: /home/kashyap/vmimages/f17vm2-with-b.qcow2
file format: qcow2
virtual size: 5.0G (5368709120 bytes)
disk size: 196K
cluster_size: 65536
backing file: /home/kashyap/vmimages/base-f17.qcow2
[root@localhost vmimages]# 

Now, we're set — our 'f17vm2-with-b.qcow2' is ready to use. We can verify it in two ways:

  1. To quickly verify, we can invoke qemu-kvm directly (not recommended in production) — this will boot our new guest on stdio and provide a serial console (NOTE: base-f17.qcow2 had 'console=tty0 console=ttyS0,115200' on its kernel command line, so that it can provide a serial console) —
    $ qemu-kvm -enable-kvm -m 1024 f17vm2-with-b.qcow2 -nographic
    
                              GNU GRUB  version 2.00~beta4
    
     +--------------------------------------------------------------------------+
     |Fedora Linux                                                              | 
     |Advanced options for Fedora Linux                                         |
     |                                                                          |
     |                                                                          |
     |                                                                          |
     |                                                                          |
     |                                                                          |
     |                                                                          |
     |                                                                          |
     |                                                                          |
     |                                                                          |
     |                                                                          | 
     +--------------------------------------------------------------------------+
    
          Use the ^ and v keys to select which entry is highlighted.      
          Press enter to boot the selected OS, `e' to edit the commands      
          before booting or `c' for a command-line.      
                                                                                   
                                                                                   
    Loading Linux 3.3.4-5.fc17.x86_64 ...
    Loading initial ramdisk ...
    [    0.000000] Initializing cgroup subsys cpuset
    .
    .
    .
    (none) login: root
    Password: 
    Last login: Thu Oct  4 07:07:54 on ttyS0
    $ 
    
  2. The other, more traditional way (so that libvirt can track and manage the guest) is to copy a similar (F17) libvirt XML file, edit and update the name, UUID, disk path, and MAC address, then define it and start it via 'virsh' (see the example <disk> element after this list):
    $ virsh define f17vm2-with-b.xml
    $ virsh start f17vm2-with-b --console
    $  virsh list
     Id    Name                           State
    ----------------------------------------------------
 9     f17vm2-with-b                  running
    

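For reference, the relevant <disk> element in such a guest's libvirt XML would look roughly like the sketch below — only the source path differs from the base guest's definition (path and target device here are examples):

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/home/kashyap/vmimages/f17vm2-with-b.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>
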
Now, let's quickly check the disk image size of our new thin-provisioned guest. Notice that the size is quite thin (14M) — meaning only the delta from the original backing file will be written to this image.

$ ls -lash f17vm2-with-b.qcow2
14M -rw-r--r--. 1 root root 14M Oct  4 06:30 f17vm2-with-b.qcow2
$

To instantiate our 2nd F17 guest (say, f17vm3-with-b), again create a new qcow2 file (f17vm3-with-b.qcow2) with our base image base-f17.qcow2 as its backing file. Then check the info of the disk image using the 'qemu-img' tool.

#----------------------------------------------------------#
$ qemu-img create -b /home/kashyap/vmimages/base-f17.qcow2 \
  -f qcow2 /home/kashyap/vmimages/f17vm3-with-b.qcow2
Formatting '/home/kashyap/vmimages/f17vm3-with-b.qcow2', fmt=qcow2 size=5368709120 backing_file='/home/kashyap/vmimages/base-f17.qcow2' encryption=off cluster_size=65536 lazy_refcounts=off 
#----------------------------------------------------------#
$ qemu-img info /home/kashyap/vmimages/f17vm3-with-b.qcow2
image: /home/kashyap/vmimages/f17vm3-with-b.qcow2
file format: qcow2
virtual size: 5.0G (5368709120 bytes)
disk size: 196K
cluster_size: 65536
backing file: /home/kashyap/vmimages/base-f17.qcow2
$
#----------------------------------------------------------#

[it’s worth noting here that we’re pointing to the same base image, and multiple guests are using it as a backing file.]
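
As an aside, a sufficiently recent qemu-img can print the info for every image in a backing chain with one command, which is handy once several overlays share the same base image:

$ qemu-img info --backing-chain /home/kashyap/vmimages/f17vm3-with-b.qcow2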

Again check the disk image size of the thin-provisioned guest:

$ ls -lash f17vm3-with-b.qcow2
14M -rw-r--r--. 1 qemu qemu 14M Oct  4 07:18 f17vm3-with-b.qcow2

It goes without saying that the 2nd F17 guest also has its own XML file, defined with unique attributes just like the 1st F17 guest.

$ virsh list
 Id    Name                           State
----------------------------------------------------
 9     f17vm2-with-b                  running
 10    f17vm3-with-b                  running

For reference's sake, I've posted the XML file I used for the 'f17vm3-with-b' guest here.

To summarize, by sharing a single, common base-image, we can quickly deploy multiple thin-provisioned virtual machines.


                      .----------------------.
                      | base-image-f17.qcow2 |
                      |                      |
                      '----------------------'
                         /       |         \
                        /        |          \
                       /         |           \
                      /          |            \
         .-----------v--.  .-----v--------.  .-v------------.
         | f17vm2.qcow2 |  | f17vm3.qcow2 |  | f17vmN.qcow2 |
         |              |  |              |  |              |
         '--------------'  '--------------'  '--------------'
            


External (and Live) snapshots with libvirt

Previously, I posted about snapshots here, which briefly discussed the different types of snapshots. In this post, let's explore how external snapshots work. Just to quickly rehash: external snapshots are a type of snapshot where there's a base image (the original disk image), and its difference/delta (aka the snapshot image) is stored in a new QCOW2 file. Once the snapshot is taken, the original disk image will be in a 'read-only' state, which can be used as a backing file for other guests.

It’s worth mentioning here that:

  • The original disk image can be either in RAW or QCOW2 format. When a snapshot is taken, 'the difference' will be stored in a separate QCOW2 file.
  • The virtual machine has to be running, live. With live snapshots, no guest downtime is experienced when a snapshot is taken.
  • At this moment, external (live) snapshots work for 'disk-only' snapshots (and not VM state). Work for both disk and VM state (and also reverting to external disk snapshot state) is in progress upstream (slated for libvirt-0.10.2).

Before we go ahead, here's some version info. I'm testing on Fedora 17 (host), and the guest (named 'testvm') is running Fedora 18 (Test Compose):

$ rpm -q libvirt qemu-kvm ; uname -r
libvirt-0.10.1-3.fc17.x86_64
qemu-kvm-1.2-0.2.20120806git3e430569.fc17.x86_64
3.5.2-3.fc17.x86_64
$ 

External disk snapshots (live) using QCOW2 as the original image:
Let's see an illustration of external (live) disk-only snapshots. First, let's ensure the guest is running:

$ virsh list
 Id    Name                           State
----------------------------------------------------
 3     testvm                          running


$ 

Then, list all the block devices associated with the guest:

$ virsh domblklist testvm --details
Type       Device     Target     Source
------------------------------------------------
file       disk       vda        /export/vmimgs/testvm.qcow2

$ 

Next, let's create a disk-only snapshot of the guest this way, while the guest is running:

$ virsh snapshot-create-as testvm snap1-testvm "snap1 description" \
  --diskspec vda,file=/export/vmimgs/snap1-testvm.qcow2 \
  --disk-only --atomic

Some details of the flags used:
– Passing a '--diskspec' parameter adds the 'disk' elements to the snapshot XML file
– The '--disk-only' parameter takes a snapshot of only the disk
– '--atomic' ensures that the snapshot either runs completely or fails without making any changes

Let’s check the information about the just taken snapshot by running qemu-img:

$ qemu-img info /export/vmimgs/snap1-testvm.qcow2 
image: /export/vmimgs/snap1-testvm.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 2.5M
cluster_size: 65536
backing file: /export/vmimgs/testvm.qcow2
$ 

Apart from the above, I created 2 more snapshots (with just the same syntax as above) for illustration purposes. Now the snapshot tree looks like this:

$ virsh snapshot-list testvm --tree

snap1-testvm
  |
  +- snap2-testvm
      |
      +- snap3-testvm
        

$ 

For the above example image file chain [base <- snap1 <- snap2 <- snap3], it has to be read as: snap3 has snap2 as its backing file, snap2 has snap1 as its backing file, and snap1 has the base image as its backing file. We can see the backing file info from qemu-img:

#--------------------------------------------#
$ qemu-img info /export/vmimgs/snap3-testvm.qcow2
image: /export/vmimgs/snap3-testvm.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 129M
cluster_size: 65536
backing file: /export/vmimgs/snap2-testvm.qcow2
#--------------------------------------------#
$ qemu-img info /export/vmimgs/snap2-testvm.qcow2
image: /export/vmimgs/snap2-testvm.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 3.6M
cluster_size: 65536
backing file: /export/vmimgs/snap1-testvm.qcow2
#--------------------------------------------#
$ qemu-img info /export/vmimgs/snap1-testvm.qcow2
image: /export/vmimgs/snap1-testvm.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 2.5M
cluster_size: 65536
backing file: /export/vmimgs/testvm.qcow2
$
#--------------------------------------------#

Now, if we do not need snap2 any more and want to pull the data above snap1 (i.e. snap2's contents) into snap3, making snap1 snap3's backing file, we can do a virsh blockpull operation as below:

#--------------------------------------------#
$ virsh blockpull --domain testvm \
  --path /export/vmimgs/snap3-testvm.qcow2 \
  --base /export/vmimgs/snap1-testvm.qcow2 \
  --wait --verbose
Block Pull: [100 %]
Pull complete
#--------------------------------------------#

Where --path is the path to the image being modified (here, snap3), and --base is the backing file that should remain in the chain — data in the images above it is pulled into the top image. So from the above example, it's evident that we're pulling snap2's data into snap3, thus flattening the backing file chain so that snap1 becomes snap3's backing file, which can be verified by running qemu-img again:

$ qemu-img info /export/vmimgs/snap3-testvm.qcow2
image: /export/vmimgs/snap3-testvm.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 145M
cluster_size: 65536
backing file: /export/vmimgs/snap1-testvm.qcow2
$ 
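
Incidentally, if the blockpull had been started without --wait, its progress could be checked (or the job aborted) with virsh blockjob — a quick sketch, with illustrative output:

$ virsh blockjob testvm vda --info
Block Pull: [ 53 %]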

A couple of things to note here, after discussion with Eric Blake (thank you):

  • If we do a listing of the snapshot tree again (now that the 'snap2-testvm.qcow2' backing file is no longer in use):
$ virsh snapshot-list testvm --tree
snap1-testvm
  |
  +- snap2-testvm
      |
      +- snap3-testvm
$

one might wonder: why is snap3 still pointing to snap2? The thing to note here is that the above is the snapshot chain, which is independent of each virtual disk's backing file chain. So 'virsh snapshot-list' is still listing the information accurately as of the time of snapshot creation (and not what we've done after creating the snapshots). So, from the above snapshot tree, if we were to revert to snap1 or snap2 (when reverting to disk snapshots becomes available), it would still be possible to do that, meaning:

It's possible to go from this state:

base <- snap1 <- snap3 (with snap2's data pulled into snap3)

we can still revert to:

base <- snap1 (thus undoing the changes in snap2 & snap3)

External disk snapshots (live) using RAW as the original image:
With external disk snapshots, the backing file can be RAW as well (unlike 'internal snapshots', which only work with QCOW2 files, where the snapshots and deltas are all stored in a single QCOW2 file).

A quick illustration below. The commands are self-explanatory. Note the change (from RAW to QCOW2) in the block device associated with the guest, before and after taking the disk snapshot (when the virsh domblklist command was executed).

#-------------------------------------------------#
$ virsh list | grep f17btrfs2
 7     f17btrfs2                      running
$
#-------------------------------------------------#
$ qemu-img info /export/vmimgs/f17btrfs2.img
image: /export/vmimgs/f17btrfs2.img
file format: raw
virtual size: 20G (21474836480 bytes)
disk size: 1.5G
$ 
#-------------------------------------------------#
$ virsh domblklist f17btrfs2 --details
Type       Device     Target     Source
------------------------------------------------
file       disk       hda        /export/vmimgs/f17btrfs2.img

$ 
#-------------------------------------------------#
$ virsh snapshot-create-as f17btrfs2 snap1-f17btrfs2 \
  "snap1-f17btrfs2-description" \
  --diskspec hda,file=/export/vmimgs/snap1-f17btrfs2.qcow2 \
  --disk-only --atomic
Domain snapshot snap1-f17btrfs2 created
$ 
#-------------------------------------------------#
$ qemu-img info /export/vmimgs/snap1-f17btrfs2.qcow2
image: /export/vmimgs/snap1-f17btrfs2.qcow2
file format: qcow2
virtual size: 20G (21474836480 bytes)
disk size: 196K
cluster_size: 65536
backing file: /export/vmimgs/f17btrfs2.img
$ 
#-------------------------------------------------#
$ virsh domblklist f17btrfs2 --details
Type       Device     Target     Source
------------------------------------------------
file       disk       hda        /export/vmimgs/snap1-f17btrfs2.qcow2
$ 
#-------------------------------------------------#

Also note: all the snapshot XML files, where libvirt tracks the snapshot metadata, are located under /var/lib/libvirt/qemu/snapshots/$guestname (and the original libvirt XML file is located under /etc/libvirt/qemu/$guestname.xml).


Little more disk I/O perf. improvement with ‘fallocate’ing a qcow2 disk

[04-NOV-2015: Important update: since this change in upstream QEMU, which introduces the two new options preallocation=falloc and preallocation=full to qemu-img create, it is strongly recommended to use qemu-img create -f qcow2 -o preallocation=falloc […] to get the said performance benefits.]
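
For reference, with a qemu-img new enough to support it, the preallocation then happens in a single step (path and size below are only examples):

$ qemu-img create -f qcow2 -o preallocation=falloc /export/vmimgs/f16-test1.qcow2 8G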

Recently I've started using the 'preallocation=metadata' flag while creating qcow2 disk images to extract some decent I/O performance. Today, while discussing qcow2 disk image performance with Stefan Hajnoczi (thank you!) on IRC, I found that using fallocate — which preallocates all the blocks to a file — on a qcow2 disk image would improve disk I/O performance a little more, as all the blocks are allocated to the file ahead of time. (Just to note: fallocate comes with the standard Linux package 'util-linux-ng'.)

Let’s run a quick test to see the disk I/O performance improvement by preallocating all the space in a qcow2 disk.

Create the disk image with ‘preallocation=metadata’

 
$ qemu-img create -f qcow2 -o preallocation=metadata /export/vmimgs/f16-test1.qcow2 8G
Formatting '/export/vmimgs/f16-test1.qcow2', fmt=qcow2 size=8589934592 encryption=off cluster_size=65536 preallocation='metadata' 
 

Let's check the size of the image in bytes:


$ ls -l /export/vmimgs/f16-test1.qcow2
-rw-r--r--. 1 root root 8591507456 Dec  2 16:55 /export/vmimgs/f16-test1.qcow2

# Also, print the allocated file size in blocks
$ ls -lash /export/vmimgs/f16-test1.qcow2
1.4M -rw-r--r--. 1 root root 8.1G Dec  2 16:55 /export/vmimgs/f16-test1.qcow2
 

Run fallocate to preallocate space to the disk image:


$ fallocate -l 8591507456 /export/vmimgs/f16-test1.qcow2 
 

Now, re-run 'ls' to print the allocated file size in blocks. (Notice that the full disk size, 8G, is now allocated.)


$ ls -lash /export/vmimgs/f16-test1.qcow2
8.1G -rw-r--r--. 1 root root 8.1G Dec  2 16:55 /export/vmimgs/f16-test1.qcow2
$ 
 

Also, let's run 'qemu-img info' to get the disk size and virtual size.


$ qemu-img info f16-test1.qcow2 
image: f16-test1.qcow2
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 8.0G
cluster_size: 65536
$ 
 

As a simple test, I used the above disk image to create an @core-only Fedora 16 guest (on a Fedora 16 host) and clocked the timing — it took roughly 5 min 32 sec to finish. Previously, without fallocate'ing the disk image, the same F16 install took nearly 8 minutes. So there is a decent improvement here.

With this, Stefan noted, disk write speed inside the guest should also improve when blocks are written for the first time. And, due to less disk fragmentation — as all the space was preallocated in one operation — there should be fewer disk seeks during large read operations.
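
A crude way to observe the first-write effect Stefan mentions is to time a direct-I/O write inside the guest, once on a guest backed by a freshly created (non-preallocated) image and once on the fallocate'd one — a rough sketch, not a rigorous benchmark (file path and sizes are arbitrary):

$ dd if=/dev/zero of=/var/tmp/ddtest bs=1M count=1024 oflag=direct conv=fsync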


Snapshotting with libvirt for qcow2 images

Libvirt 0.9.6 was recently released. Take a look at the 0.9.5 changelog for a truckload of features, bug fixes, and cleanups (specifically snapshot related) from the libvirt team.

So, I grabbed the F14 SRPM from the libvirt FTP site, made a quick Fedora Koji scratch build of libvirt-0.9.6 for Fedora 15, and gave the snapshot features a whirl. Here it goes:

(Also noted below is some very useful discussion I had (on #virt, OFTC) with Eric Blake (upstream/Red Hat libvirt developer, very friendly guy) on snapshots. It was too informative not to capture.)

Context on types of snapshots
At the moment, snapshotting in KVM/QEMU/libvirt land is supported primarily for QCOW2 disk images. I briefly discussed Qcow2 previously here.

There are several different types of snapshots possible. Some idea on that:

Internal snapshot: a type of snapshot where a single QCOW2 file holds both the 'saved state' and the 'delta' since that saved point. Internal snapshots are very handy because all the snapshot info is captured in a single file, which is easy to copy/move around between machines.

External snapshot: here, the original qcow2 file will be in a 'read-only' saved state, and the new qcow2 file (generated once the snapshot is created) will hold the delta for the changes. So, all new changes will be written to this delta file. External snapshots are useful for performing backups. Also, an external snapshot creates a qcow2 file with the original file as its backing image, and the backing file can be read in parallel with the running qemu.

VM State: this saves the guest/domain state to a file. So, if you take a snapshot including VM state, we can then shut off that guest and use the freed-up memory for other purposes on the host or for other guests. Internally this calls the qemu monitor's 'savevm' command. Note that this only takes care of VM state (and not a disk snapshot). To try this out:

 
#------------------------------------------------
# Memory before saving the guest f15vm3
$ free -m
             total       used       free     shared    buffers     cached
Mem:         10024       5722       4301          0        164       4445
-/+ buffers/cache:       1112       8911
Swap:            0          0          0
#------------------------------------------------
$ virsh list
 Id Name                 State
----------------------------------
  5 f15guest             running
  6 f15vm3               running
#------------------------------------------------
# Save the guest f15vm3 to a file 'foof15vm3'
$ virsh save f15vm3 foof15vm3
Domain f15vm3 saved to foof15vm3
#------------------------------------------------
# Now, f15vm3 is gracefully saved/shutdown.
$ virsh list
 Id Name                 State
----------------------------------
  5 f15guest             running
#------------------------------------------------
# Notice the RAM being freed
$ free -m
             total       used       free     shared    buffers     cached
Mem:         10024       5418       4605          0        164       4493
-/+ buffers/cache:        760       9263
Swap:            0          0          0
#------------------------------------------------
# Let's restore the guest back from the file 'foof15vm3'
$ virsh restore foof15vm3
Domain restored from foof15vm3
#------------------------------------------------
# List the status. f15vm3 is up and running.
$ virsh list
 Id Name                 State
----------------------------------
  5 f15guest             running
  7 f15vm3               running
#------------------------------------------------

For brevity, let's try out internal disk snapshots, where all the snapshot info (like disk and VM state) is stored in a single qcow2 file.
Virsh (libvirt's shell interface to manage guests) has some neat options for snapshot support. So, I've got an F15 guest (qcow2 disk image).

Internal Disk Snapshots when the guest is online/running

For illustration purposes, let's use a Fedora 15 guest called 'f15guest'.

$ virsh list
 Id Name                 State
----------------------------------
  4 f15guest             running

$ 

For clarity, ensure there are no prior snapshot instances around.

$ virsh snapshot-list f15guest
 Name                 Creation Time             State
------------------------------------------------------------

$ 

Before creating a snapshot, we need to create a snapshot XML file with 2 simple elements (name and description) if you want a sensible name for the snapshot. Note that only these two fields are user-settable; the rest of the info will be filled in by libvirt.

$  cat /var/tmp/snap1-f15guest.xml
<domainsnapshot>
    <name>snap1-f15pki </name>
    <description>F15 system with dogtag pki packages </description>
</domainsnapshot>

$ 

Eric Blake noted that the domainsnapshot XML file is now optional for 'snapshot-create', if you don't need a description for the snapshot and are okay with libvirt generating the snapshot name for you. (More on this below.)
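
In other words, the same snapshot could have been created without any XML file by using 'snapshot-create-as' and passing the name and description on the command line — a sketch:

$ virsh snapshot-create-as f15guest snap1-f15pki "F15 system with dogtag pki packages"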

Now, I'm taking a snapshot while the guest is running live. Here, Eric noted that, especially when running live, the more RAM the guest has, and the more actively the guest is modifying that RAM, the longer it will take to create a snapshot. This guest was mostly idle.

$ virsh snapshot-create f15guest /var/tmp/snap1-f15guest.xml
Domain snapshot snap1-f15pki  created from '/var/tmp/snap1-f15guest.xml'
$ 

While the snapshot-creation is in progress on the live guest, the state of the guest will be ‘paused’.

$ virsh list
 Id Name                 State
----------------------------------
  4 f15guest             paused

$ 

Once the snapshot is created, list the snapshots of f15guest:

$ virsh snapshot-list f15guest
 Name                 Creation Time             State
------------------------------------------------------------
 snap1-f15pki         2011-10-04 19:04:00 +0530 running

$

Internal snapshot while the guest is offline

For fun, I created another snapshot, but after shutting down the guest. Now, the snapshot creation is just instantaneous.

$ virsh list
 Id Name                 State
----------------------------------

$ virsh snapshot-create f15guest
Domain snapshot 1317757628 created

List the snapshots of ‘f15guest’ using virsh.

$ virsh snapshot-list f15guest
 Name                 Creation Time             State
------------------------------------------------------------
 1317757628           2011-10-05 01:17:08 +0530 shutoff
 snap1-f15pki         2011-10-04 19:04:00 +0530 running

To see some information about the VM size and snapshots:

$ qemu-img info /export/vmimgs/f15guest.qcow2
image: /export/vmimgs/f15guest.qcow2
file format: qcow2
virtual size: 8.0G (8589934592 bytes)
disk size: 3.2G
cluster_size: 65536
Snapshot list:
ID        TAG                 VM SIZE                DATE       VM CLOCK
1         snap1-f15pki           1.7G 2011-10-04 19:04:00   32:06:34.974
2         1317757628                0 2011-10-05 01:17:08   00:00:00.000
$ 

To revert to a particular snapshot: virsh snapshot-revert domain snapshotname
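
For example, reverting this guest to the first snapshot taken above would look like the following (make sure the name matches what 'virsh snapshot-list' shows):

$ virsh snapshot-revert f15guest snap1-f15pki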

I also discussed with Eric in which cases virsh invokes QEMU's 'savevm' and 'qemu-img snapshot -c' commands while creating the different types of snapshots discussed above. Here is the outline:

  • it uses 'qemu-img snapshot -c' if the domain is offline and --disk-only was not specified
  • it uses qemu's 'savevm' if the domain is online and --disk-only was not specified
  • it uses qemu's 'snapshot_blkdev' if the domain is online and --disk-only is specified

(Note: --disk-only is an option to capture only the 'disk state' and not the VM state. This option is available with the virsh 'snapshot-create' or 'snapshot-create-as' commands.)

Thanks Eric for the detail.


Creating a Qcow2 virtual machine

QCOW2 is an interesting disk image format which supports features like internal and external snapshots, backing files, image compression, and encryption. However, its I/O performance is slow compared to the RAW format. Here are a couple of settings which can extract reasonable performance out of QCOW2 disk images.

Create a qcow2 disk image
First, let's create a qcow2 disk image using the 'qemu-img' tool:

$ /usr/bin/qemu-img create -f qcow2 -o preallocation=metadata /export/vmimgs/glacier.qcow2 8G

NOTE: At this point in time, the preallocation=metadata option is the best we can do to extract the maximum possible (near-RAW) I/O performance out of the QCOW2 format (hint from Kevin Wolf, QEMU/QCOW2 developer).

From the listing below, 970M is the allocated (used) size of the guest image, and 8.1G is the maximum size the image can grow to.


[root@moon tmp]# ls -lash /export/vmimgs/glacier.qcow2 
970M -rw-r--r--. 1 qemu qemu 8.1G Sep 24 23:45 /export/vmimgs/glacier.qcow2
[root@moon tmp]# 

Create the guest

# Create an unattended minimal guest install using a qcow2 disk image
virt-install --connect=qemu:///system \
    --network=bridge:br0 \
    --initrd-inject=/var/tmp/fed-minimal.ks \
    --extra-args="ks=file:/fed-minimal.ks console=tty0 console=ttyS0,115200" \
    --name=glacier \
    --disk path=/export/vmimgs/glacier.qcow2,format=qcow2 \
    --ram 2048 \
    --vcpus=2 \
    --check-cpu \
    --hvm \
    --location=http://download.fedora.redhat.com/pub/fedora/linux/releases/15/Fedora/x86_64/os/ \
    --nographics 

The above will create a minimal guest with a qcow2 disk image format. The content of the fed-minimal kickstart file is here.

Once the guest is created, ensure the cache='none' parameter is present in the 'disk' element of the guest's XML file (if not present, add it and redefine the XML; it looks like below). This is another aspect which can improve disk I/O performance.


[root@moon ~]# grep cache /etc/libvirt/qemu/glacier.xml -A 4 -B 1
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/export/vmimgs/glacier.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
[root@moon ~]# virsh define /etc/libvirt/qemu/glacier.xml
Domain glacier defined from /etc/libvirt/qemu/glacier.xml
[root@moon ~]# virsh start glacier
Domain glacier started

[root@moon ~]#
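
A small aside: instead of editing the file under /etc/libvirt/qemu/ directly and redefining it, the same change can be made with 'virsh edit', which opens the domain's definition in $EDITOR and redefines it on save:

[root@moon ~]# virsh edit glacier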

I’m still trying to wrap my head around the caching and preallocation mechanisms of the qcow2 format. Meanwhile, work on Qcow2 version-3 is in progress in upstream qemu.
