I’ve been a fan of PernixData since the day I first learned about them at Storage Field Day 3 more than a year ago. I’ve stayed in touch with them ever since, visited their offices in San Jose, and I was also accepted into their PernixPro program. Thanks to the program, I got the opportunity to test the new release of PernixData FVP 2.0 ahead of its general availability, and a couple of new enhancements seemed to me like a perfect fit for an integration with Veeam Backup & Replication. In fact, I also talked about these ideas during some joint presentations I did with Todd Mace and Frank Denneman.
First of all, the new version of FVP allows you to use RAM as a caching medium in addition to flash devices. I see many advantages in using RAM with, or instead of, flash. First, obviously, the speed: customers who are always looking for faster caching systems, instead of swapping out a flash device every year for a newer-generation model, can simply use the RAM inside any server, which is way faster than any flash solution on the market. Second, the cost: each hypervisor usually has some amount of free memory, so without incurring additional costs for new SSD or flash devices, a user can simply dedicate a portion of the server’s RAM to PernixData. Think also about a POC (proof of concept): instead of shutting down a hypervisor to install a flash device, the POC can be run using the local memory already available. Obviously, there are cons, like the volatile nature of memory (I wrote about this in the post Your next storage tier? Your volatile memory), but PernixData has carefully designed its technology to avoid data loss, with a feature called Distributed Fault Tolerant Memory (DFTM).
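To picture what a fault-tolerant RAM cache has to guarantee, here is a minimal conceptual sketch in Python. It is my own simplification, not PernixData’s actual DFTM implementation: the idea is simply that a write is acknowledged to the VM only after it sits both in the local RAM cache and in the RAM of a peer host, so the failure of a single host never loses an acknowledged write.

```python
# Conceptual sketch of a fault-tolerant write-back RAM cache.
# My own simplification, NOT PernixData's real DFTM code: a write is
# acknowledged only once the local copy AND a replica on a peer host
# are both in place, so losing one host never loses acknowledged data.

class HostRamCache:
    def __init__(self, name):
        self.name = name
        self.blocks = {}          # block address -> data held in RAM

    def store(self, address, data):
        self.blocks[address] = data

class FaultTolerantWriteBack:
    def __init__(self, local, peer):
        self.local = local        # cache on the host running the VM
        self.peer = peer          # cache on a remote host (the extra copy)

    def write(self, address, data):
        self.local.store(address, data)   # fast local RAM write
        self.peer.store(address, data)    # synchronous remote copy
        return "ACK"                      # only now the VM sees the write as done

    def destage_later(self, storage):
        # dirty blocks are flushed to the backing datastore asynchronously
        for address, data in self.local.blocks.items():
            storage[address] = data

# usage: two hosts, the second one holds the protecting copy
esx1, esx2 = HostRamCache("esx1"), HostRamCache("esx2")
cache = FaultTolerantWriteBack(local=esx1, peer=esx2)
print(cache.write(0x10, b"mail database page"))   # ACK after both copies exist
```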
Other great innovations in FVP 2.0 are really useful for SMB and mid-market environments: network compression now makes it possible to use 1G networks for cache replication instead of 10G, so small customers are not forced into expensive network upgrades. And finally, the feature that sparked the idea for these tests: NFS support.
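To give a rough idea of why compression makes a 1G link viable for replication traffic, here is a quick Python experiment of mine using plain zlib on a synthetic block. It has nothing to do with FVP’s real compression algorithm or its actual ratios; it just shows how much a compressible block can shrink before it crosses the wire.

```python
import zlib

# Rough illustration of why compressing replication traffic helps on 1 GbE.
# Plain zlib on a synthetic, fairly compressible 64 KB block; FVP's real
# algorithm and the ratios on real data will differ.
block = (b"Exchange log record " * 3277)[:65536]   # 64 KB of repetitive data
compressed = zlib.compress(block, level=1)         # fast compression level

ratio = len(compressed) / len(block)
print(f"original: {len(block)} bytes, compressed: {len(compressed)} bytes "
      f"({ratio:.0%} of the original size)")

# A 1 Gbit/s link moves roughly 120 MB/s: every byte saved here is
# replication bandwidth you do not need to buy as a 10G upgrade.
```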
Veeam patented an awesome technology a few years ago: vPower. vPower is the foundation of several Veeam Backup & Replication features like SureBackup and Instant VM Recovery. The basic concept is simple yet extremely powerful: a backup file can be published to the ESXi server as an NFS share, and the VM that needs to be restored can be registered directly on the hypervisor. Instead of waiting for the virtual disks to be copied back to the production storage, the VM can be powered on and used immediately.
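Veeam performs all of this automatically during Instant VM Recovery, but to make the concept concrete, this is roughly what the two underlying vSphere operations look like if you drive them yourself with pyVmomi. The hostnames, the export path and the VM name below are invented placeholders for a lab, not what Veeam really uses internally.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Sketch of the two vSphere operations behind the vPower concept, done by hand.
# Veeam does the equivalent steps for you; names and paths here are made up.
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

# find the ESXi host that will run the restored VM
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.HostSystem], True)
host = next(h for h in view.view if h.name == "esx1.lab.local")

# 1) mount the NFS share exported by the Veeam server as a datastore
spec = vim.host.NasVolume.Specification(
    remoteHost="veeam.lab.local",          # the Veeam backup server
    remotePath="/VeeamBackup_VEEAMSRV",    # example export path
    localPath="VeeamBackup_VEEAMSRV",
    accessMode="readWrite",
    type="NFS")
host.configManager.datastoreSystem.CreateNasDatastore(spec)

# 2) register the VM straight from the published backup, no data copy needed
datacenter = content.rootFolder.childEntity[0]
pool = host.parent.resourcePool
datacenter.vmFolder.RegisterVM_Task(
    path="[VeeamBackup_VEEAMSRV] win2012/win2012.vmx",
    asTemplate=False, pool=pool, host=host)

Disconnect(si)
```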
A Veeam backup file is deduplicated and compressed, so when it’s published via the vPower NFS technology, its performance obviously cannot match that of the production storage where the restored VM was running before. Still, it’s better to have a VM back in 5 minutes, even with lower performance, than to wait hours for the restore to complete. But now that PernixData supports NFS, what if we apply its caching to the Veeam vPower NFS datastore? With the nice addition of RAM support and network compression, even small customers can probably benefit from an even more performant Veeam vPower.
Setup
I treated my lab as a small customer with limited resources, so I first configured my three ESXi hosts with PernixData and assigned 4 GB of RAM per host to the cache. If needed, a customer can expand the RAM allocation at a later stage, since the value can be modified on a running system without any interruption. Start small, and grow later.
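Before deciding how much memory to donate to the cache, it’s worth checking how much each host actually has to spare. A few lines of pyVmomi are enough; the vCenter address and credentials below are just placeholders for my lab.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Quick check of how much RAM each ESXi host has free before carving out
# a slice for the cache. Connection details are lab placeholders.
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.HostSystem], True)
for host in view.view:
    total_mb = host.hardware.memorySize // (1024 * 1024)   # bytes -> MB
    used_mb = host.summary.quickStats.overallMemoryUsage   # already in MB
    print(f"{host.name}: {total_mb - used_mb} MB free of {total_mb} MB")
Disconnect(si)
```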
I then selected a simple Windows 2012 VM, created a backup of it, and prepared the system to run the first Instant VM Recovery. During the first execution the NFS share is published on the ESXi host, and it stays there for future needs:
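If you want to double check that the vPower NFS datastore is really mounted on the host, the same pyVmomi connection boilerplate can list it. Veeam usually names it VeeamBackup_ followed by the backup server name; again, the connection details are placeholders.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Verify that the vPower NFS datastore is mounted on the ESXi host.
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="password", sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.HostSystem], True)
host = next(h for h in view.view if h.name == "esx1.lab.local")

for ds in host.datastore:
    if ds.summary.type == "NFS" and ds.name.startswith("VeeamBackup_"):
        print(f"vPower NFS datastore found: {ds.name}, "
              f"capacity {ds.summary.capacity // (1024**3)} GB")
Disconnect(si)
```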
Performance comparison
Before activating PernixData, I prepared a test to measure the baseline performance of the test virtual machine, both on the production datastore and when running from Instant VM Recovery. Instead of running pure performance tests with FIO or IOmeter, I decided to run Microsoft JetStress, using the same configuration I explained in the post “My new I/O Test Virtual Machine”. This is a better way to simulate a production workload: JetStress simulates an Exchange Server, so the test can reproduce a scenario where a company is restoring its mail server and wants to offer its users immediate access to email by leveraging Instant VM Recovery. What performance degradation should users expect when using the restored VM instead of the production one?
First, I ran the tests on the VM on its production datastore, and I used PernixData itself to monitor performance (Frank Denneman wrote a great article titled Investigate your application performance by using FVP monitor capabilities explaining how to do this):
Once the VM had been backed up and restored using Instant VM Recovery, I repeated the same test, again using PernixData only to monitor the results. I ignored the write redirection option in vPower in order to test the worst possible scenario; remember that to use this feature you need an additional NFS datastore, since PernixData can only accelerate VMs whose disks all reside on the same type of volume. Since vPower NFS is, well, NFS, the additional volume for redirections also needs to be NFS. Without redirection, these are the results:
The results were way lower than those obtained from the production storage, which was totally expected, since vPower NFS runs the VM directly from the deduplicated and compressed backup file. But reduced IOPS is not the only problem:
This is the Veeam server virtual machine that is publishing the vPower NFS datastore. It has 2 vCPUs, and one of them is completely consumed by vPower, which is a single-threaded service. The high CPU usage on the Veeam repository can become an additional issue, especially if the recovery operation runs for a long time: during the entire period, almost one entire core is consumed by vPower, so other Veeam activities could be impacted as well.
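Since a single-threaded service can never use more than one core, on a 2 vCPU machine it shows up as a flat ~50% overall utilization. If you want to spot on your own Veeam server which process is pinning the core, a few lines of psutil will do; the name filter below matches how the vPower NFS service appears on my machine and may need adjusting on yours.

```python
import psutil

# Spot which process is pinning a core on the Veeam server. A single-threaded
# service tops out at 100% of one core, i.e. 50% of a 2 vCPU machine.
# The "NFS" name filter matches the vPower NFS service on my box; the exact
# process name may differ between versions, so adjust it if needed.
cores = psutil.cpu_count()
for proc in psutil.process_iter(["name"]):
    if proc.info["name"] and "NFS" in proc.info["name"]:
        per_core = proc.cpu_percent(interval=1.0)        # % of a single core
        print(f"{proc.info['name']}: {per_core:.0f}% of one core "
              f"({per_core / cores:.0f}% of the whole VM)")
```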
Time to add PernixData!
First of all, one important note about the scope of the acceleration: since the virtual machines on the vPower NFS datastore are dynamically published and removed, you can either manually enable acceleration on each newly published VM after Instant VM Recovery has started, or, as a better choice, enable acceleration per datastore within PernixData. This way, any VM published by Veeam vPower is automatically accelerated:
Once PernixData is enabled and ready to apply write-back caching to restored VMs, it’s time to run Instant VM Recovery again and repeat the JetStress test. Note that I selected Write Back + 1 remote node as the acceleration policy: with Instant VM Recovery you are probably running virtual machines in a production environment, so you want to protect the cache with an additional remote copy of every write. The results were pretty interesting:
Even if the source datastore is in reality a deduplicated and compressed backup file, PernixData was able to accelerate the I/O coming out of Veeam vPower NFS. Obviously, such a test cannot completely replicate a real production workload, so the numbers were still a bit below the original production ones; this is mainly because the hit ratio of the PernixData cache was 0 at the beginning, since it had never seen those blocks before and had to warm up. You can clearly see how the cache steadily improved during the test (the last part of the JetStress run is the final write to disk, so it does not create new blocks):
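The warm-up effect is easy to reproduce with a toy model: the sketch below replays a skewed block access pattern against a plain LRU cache in Python and prints the hit ratio climbing as the cache fills. It is a generic illustration of cache warm-up, nothing to do with FVP’s real algorithms.

```python
import random
from collections import OrderedDict

# Toy model of cache warm-up: a plain LRU cache fed with a skewed access
# pattern (a small "hot" working set is re-read often). The hit ratio starts
# at zero and climbs as the cache learns the working set.
random.seed(1)
cache, capacity = OrderedDict(), 1024            # cache holds 1024 blocks
hot_blocks = [random.randrange(100_000) for _ in range(800)]

hits = total = 0
for i in range(1, 50_001):
    # 80% of reads hit the hot working set, 20% are cold random blocks
    block = random.choice(hot_blocks) if random.random() < 0.8 \
            else random.randrange(100_000)
    total += 1
    if block in cache:
        hits += 1
        cache.move_to_end(block)                 # refresh LRU position
    else:
        cache[block] = True
        if len(cache) > capacity:
            cache.popitem(last=False)            # evict the least recently used
    if i % 10_000 == 0:
        print(f"after {i:>6} reads: hit ratio {hits / total:.0%}")
```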
In a production scenario, with real data moving around, you can expect an even better cache hit ratio, and thus even better overall performance. Overall, it’s interesting to see how much I/O PernixData was able to offload from the vPower NFS datastore, and how much latency, for example, was reduced.
Oh, and do you remember the heavy CPU load that running the vPower NFS created on the Veeam repository? Well, this is the same graph when the NFS share is accelerated by PernixData:
From a fixed 50% utilization down to 12% on average, with a peak of 23%! In the end, this means more CPU cycles for other Veeam activities even while an Instant VM Recovery is running.
Final notes
As you have seen in these examples, adding a caching solution to the Veeam NFS share can dramatically improve its performance. This can mean faster VMs when running in Instant VM Recovery, but also quicker SureBackup jobs, for example, so you can run more tests in the same timeframe and verify more VMs. Or, by reducing the overall I/O hitting a datastore, you can get quicker backup operations, since the I/O saved by PernixData can be used by the production storage to better serve Veeam during its backup activities. Now that PernixData supports NFS, you have even more options for your Veeam data protection.