In some previous articles I described Fusion-IO technology and how to install one of their cards in an ESXi server. In this post, as many of you were waiting for, it's time to run some performance tests and see what this product is able to do.
For my tests I used VMware I/O Analyzer, a ready-to-run appliance specifically designed for benchmarks. It can be installed in a few minutes on an ESXi server, and it can run several I/O tests on the storage it is deployed upon.
Some preliminary notes about my WhiteBox, which is not a real server…
Fusion-IO cards achieve incredibly high IOPS and incredibly low latency thanks to the complete bypass of the SCSI/SATA disk bus in favor of a direct PCIe connection, and also thanks to other essential elements like their drivers and the way they intercept system calls to the file system they host. To take full advantage of these cards, however, the other components of the system need to be good enough too; above all, the motherboard chipset. This is why my home whitebox did not reach the nominal values stated by Fusion-IO.
Another key component for obtaining the best performance is RAM. According to the tables in the Fusion-IO user guide, their drivers need a fair amount of system RAM based on the size of the Fusion-IO card you are going to use. If your average write block size is 4k, the driver will use 425 MB of RAM for every 80 GB of card capacity. With the IoDrive 640 I'm using, this translates into 3.32 GB of RAM.
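As a quick sanity check on that sizing rule, here is a minimal sketch (my own, not from the Fusion-IO guide) that turns the "425 MB per 80 GB at 4k block size" figure into a RAM estimate; the function name and structure are just for illustration:

```python
# Rough sketch of the RAM sizing rule above, assuming the "425 MB per
# 80 GB of card capacity at 4k average block size" figure from the
# Fusion-IO user guide. Function name is mine, for illustration only.

def fusionio_driver_ram_gb(card_size_gb, mb_per_80gb=425):
    """Estimate host RAM (in GB) consumed by the Fusion-IO driver."""
    return card_size_gb / 80 * mb_per_80gb / 1024

# IoDrive 640: 640 / 80 * 425 = 3400 MB, i.e. about 3.32 GB
print(f"{fusionio_driver_ram_gb(640):.2f} GB")  # -> 3.32 GB
```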
Keep these calculations in mind when you size a server that will host a Fusion-IO card: in this scenario, ESXi is no longer a small-footprint OS…
I used the pre-defined tests already available in the VMware appliance, and each of them was executed for 60 seconds. No other virtual machine was running during the tests.
I/O Analyzer tests on my WhiteBox
This first test used 512-byte blocks, 100% read and 0% random. It was aimed at reaching the highest possible IOPS value, and the Fusion-IO card did its job: 58716 IOPS throughout the test, with an outstanding latency of 0.10 milliseconds, that is, 103 microseconds!
The second test was exactly like the first, with only an increase in block size to 4k. You can see a drop in IOPS (almost halved) and a rise in latency up to 433 microseconds; we are still talking about values well below a single millisecond.
In the third and last test, I again used 4k blocks, but this time with a 50/50 balance between reads and writes and completely random access. IOPS dropped a bit further but still stayed high, and the latency was once again remarkable even with completely random access, with a final value of 513 microseconds.
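To put these IOPS numbers in perspective, the sketch below (again my own helper, not part of I/O Analyzer) converts IOPS at a given block size into raw throughput. It shows that the record-setting 512-byte run, for all its IOPS, moves well under 30 MB/s, which is exactly why small-block tests measure operation rate and latency rather than bandwidth:

```python
# Simple helper (mine, not part of I/O Analyzer) to convert an IOPS
# figure at a given block size into throughput in MB/s (base 2).

def iops_to_mbps(iops, block_size_bytes):
    return iops * block_size_bytes / (1024 * 1024)

# Test 1: 58716 IOPS at 512-byte blocks -> about 28.7 MB/s
print(f"{iops_to_mbps(58716, 512):.1f} MB/s")
```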
Conclusions
Fusion-IO has always stated that ultra-low latency, much more than raw IOPS, is the real goal of their products. The numbers I saw in my tests fully confirm this statement.