In a previous article, I wrote about Sandisk FlashSoft 3.1 for vSphere, a server-side caching solution, covering its features and how to install and configure it in a vSphere environment. In this article I will show the results of some performance tests I ran.
Test environment
For my tests I used my SkunkWorks Lab. On two of my three ESXi servers I installed two Fusion-IO ioDrive 320 GB cards, and then the FlashSoft software, release 3.1.3. All these results have been obtained by running the tests I described in my article “My new I/O Test Virtual Machine”, where I also reported the initial results for my lab in its “base” configuration.
One important warning about my results: my lab is built on HP Proliant G5 servers. Their PCIe bus is not particularly efficient (to say the least), so the Fusion-IO cards cannot run as fast as they are capable of: in these servers they usually only reach around 23-30k IOPS, while on more recent servers they can go above 60-70k, even though they are first-generation cards.
So, treat these numbers merely as a comparison against the performance of my lab without acceleration, useful to understand how much FlashSoft can improve things, and not as absolute results. To get meaningful absolute numbers I would need a newer lab, which I don’t have at the moment.
Results
FIO Max Bandwidth: 100% read, 100% sequential, Block size 1M, IO depth 32, 16 jobs
Labs: 194 IOPS, 199.57 MB/s, 2593 ms latency
Labs + FlashSoft: 741 IOPS, 759.62 MB/s, 684.40 ms latency
FIO Max Real I/O: 100% read, 100% sequential, Block Size 8k, IO depth 32, 16 jobs
Labs: 12717 IOPS, 101.73 MB/s, 40.16 ms latency
Labs + FlashSoft: 30278 IOPS, 242.23 MB/s, 16.9 ms latency
FIO Real Life Test: 80% read, 100% random, Block Size 8k, IO depth 32, 16 jobs
Labs: 2800 IOPS, 22.40 MB/s, 181 ms latency
Labs + FlashSoft: 15303 IOPS, 122.43 MB/s, 34.16 ms latency
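For reference, the three FIO workloads above map to a fio job file more or less like the one below. Take it just as a sketch reconstructed from the parameters listed here, not as the exact configuration of my I/O Test Virtual Machine: the libaio engine, the target device and the 300 seconds runtime are assumptions on my part.

[global]
# settings shared by the three workloads
ioengine=libaio
direct=1
iodepth=32
numjobs=16
time_based
runtime=300
group_reporting
# test disk sitting on the FlashSoft-accelerated datastore (assumption)
filename=/dev/sdb

# FIO Max Bandwidth: 100% sequential read, 1M blocks
[max-bandwidth]
rw=read
bs=1M

# FIO Max Real I/O: 100% sequential read, 8k blocks
[max-real-io]
stonewall
rw=read
bs=8k

# FIO Real Life Test: 80% read, 100% random, 8k blocks
[real-life]
stonewall
rw=randrw
rwmixread=80
bs=8k

The stonewall option simply makes each workload wait for the previous one to complete; alternatively, each section can be run on its own with fio’s --section option.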
JetStress: 2-hour performance test run
Labs: 1013.57 IOPS
Labs + FlashSoft: 1533.07 IOPS
This result needs an explanation. At first sight it seems only slightly better than the baseline, but you need to keep in mind how a caching solution works: FlashSoft improves its performance as time goes by, since it needs some time to identify hot blocks and thus optimize its cache content. JetStress tests run for 2 hours, and the first execution gave me only 1060 IOPS; in the second run, which is the one reported here, the improvement had grown to roughly 50%. Longer runs would likely give even better results, and a real production environment would benefit even more, since the cache keeps warming up as long as the workload runs. Also remember that FlashSoft, in its current release, does not accelerate writes, so the write portion of the JetStress workload is not improved at all.
HammerDB:
Labs: 17790 TPM and 7845 NOPM
Labs + FlashSoft: 31536 TPM and 12128 NOPM
Final notes
I’ve been really impressed by the performance of FlashSoft. In virtualized environments, where the vast majority of I/O consists of read operations, this product can be an effective way to improve I/O without having to upgrade or replace the existing storage. Even more than the expected increase in IOPS, I was really happy with the large reduction in latency: part of the credit certainly goes to the Fusion-IO cards, since low latency is one of their strengths, but FlashSoft made good use of it nonetheless. This tells me the software is really well optimized and efficient.
Now that these tests are over, I will wait for a future release in which FlashSoft may add support for write caching, so I can check for further improvements, especially in tests like JetStress and HammerDB, where about 50% of the I/O is made of writes.