VMware snapshots are the foundation of every backup solution designed around VADP (VMware APIs for Data Protection). If a virtual disk can be snapshotted, it can be backed up. Plain and simple.
However, there are situations where this approach can eventually lead to problems. The best-known case, and if you are a regular on the Veeam forums you know it well, is Exchange 2010 virtual machines.
Exchange 2010 disk usage behaviour
The new underlying database format was designed by Microsoft to be light on the storage it runs on, and is well explained here. The technical reason is in this quote:
“By increasing the size of the I/O and reducing the frequency of read/writes in Exchange 2010, ESE is able to increase performance. In addition, ESE can increase performance by making the data in the database more sequential, which increases the likelihood that related data is in the same vicinity in the B-tree.
In Exchange, all data inside the database is stored in B-trees, and the B-trees are then divided into pages. In Exchange 2007 and earlier, the data stored in the B-trees isn’t contiguous. In fact, previous versions of Exchange performed random read/writes to the database. This means that related data may not be in the same vicinity on the hard disk. Non-contiguous data requires more passes to read and write to the hard disk.”
So, to sum it up, Exchange tries to use contiguous disk space as much as possible and writes data in larger blocks. This sounds like a design for physical installations, but it brings great benefits in virtualized environments too. Since RAM keeps getting cheaper, and the vRAM licensing limits went away with vSphere 5.1, you can give an Exchange 2010 server much more RAM to use as cache, lowering the load on the disks.
VMware has conducted some performance tests and the results can be read here.
In the end, from a production standpoint, this has been great news: we can run Exchange 2010 on slow SATA disks and still get great performance, so another “Tier 1” application can be virtualized without fear.
The problem with VADP backups
But what happens from a VADP standpoint? Incremental backups rely on the CBT (Changed Block Tracking) information made available by vSphere itself. The backup program reads which data blocks have changed since the previous backup and saves only those. But since Exchange 2010 works as described above, there are many more changed blocks, and backup data sets are larger than with previous Exchange versions.
This has two consequences. First, backup sets are bigger and use more backup storage space. Second, backups take longer to complete, so the snapshot stays open longer and committing it at the end of the backup takes much longer than with previous Exchange versions; sometimes this causes problems for the guest OS and the Exchange services. You can find reports from Exchange 2010 users who had to wait literally hours for a snapshot commit to complete.
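To make the changed-block idea concrete, here is a minimal sketch in Python. It is a toy model, not the vSphere API: the 64 KiB tracking block size, the function names and the write patterns are all assumptions chosen for illustration. It only shows that an incremental backup grows with the number of distinct blocks touched, not with the number of writes.

```python
KIB = 1024
BLOCK_SIZE = 64 * KIB  # assumed tracking granularity, for illustration only


def changed_blocks(writes, block_size=BLOCK_SIZE):
    """Return the set of block indices touched by (offset, length) writes."""
    blocks = set()
    for offset, length in writes:
        if length <= 0:
            continue  # a zero-length write changes nothing
        first = offset // block_size
        last = (offset + length - 1) // block_size
        blocks.update(range(first, last + 1))
    return blocks


def incremental_size(writes, block_size=BLOCK_SIZE):
    """Bytes an incremental backup would copy: every changed block, in full."""
    return len(changed_blocks(writes, block_size)) * block_size


# 100 small writes hitting the same region: only one block is marked changed.
hot_spot = [(0, 8 * KIB)] * 100

# 100 larger writes spread across the disk: 100 blocks are marked changed.
spread_out = [(i * BLOCK_SIZE, 32 * KIB) for i in range(100)]

print(incremental_size(hot_spot) // KIB, "KiB to back up")
print(incremental_size(spread_out) // KIB, "KiB to back up")
```

Both patterns issue the same number of writes, but the second one marks a hundred times more blocks as changed, and the backup has to copy each of them in full.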
Also, to make things worse, you placed the Exchange VM on a slow SATA datastore, as the suggested practices allow, so every disk operation involving it is affected by this slowness, backups and snapshot commits included.
In the end, if you look at the whole picture, a really slow SATA datastore is maybe not the perfect fit for Exchange 2010 after all…
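A rough back-of-the-envelope calculation shows why datastore speed matters so much here. The sketch below, in Python, simply divides the amount of data to move by a sustained throughput. The sizes and throughput figures are hypothetical, not measurements, and a real snapshot commit also competes with the VM's live I/O, so treat the result as an optimistic estimate.

```python
def transfer_hours(size_gib, throughput_mib_s):
    """Hours needed to move size_gib at a sustained throughput in MiB/s."""
    if throughput_mib_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = size_gib * 1024 / throughput_mib_s
    return seconds / 3600


# Hypothetical example: the same 200 GiB of changed data
# on a slow datastore versus a faster one.
for label, mib_s in [("slow SATA datastore", 30), ("faster datastore", 150)]:
    print(f"{label}: {transfer_hours(200, mib_s):.1f} hours")
```

With these made-up numbers the slow datastore needs about five times longer for the same amount of data, which is the gap between a commit you barely notice and one that lasts for hours.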