I was invited to Storage Field Day 4 in San Jose, California. One of the vendors I met was GridStore. This is a review of the meeting I had.
Who is GridStore
GridStore is located in Mountain View, CA, and was founded in 2009. They have a precise focus for their product: their main customers are small and medium-sized companies that use physical Windows servers or Hyper-V systems. The idea behind their solution is to separate storage capacity from performance. The solution is divided into two components: scale-out storage, built from different hardware models, and a “Virtual Controller” that can be installed on any Windows system, whether it is a physical server, a virtual machine, or the hypervisor itself.
The Virtual Controller
From a design point of view, GridStore’s idea is really interesting: the controller of a typical storage system has been moved from the array itself into the server that needs to connect to the storage, thus “borrowing” compute power from the server in order to run. Since it can leverage the huge compute power of modern CPUs, and in all but a few situations CPUs are not the bottleneck of an infrastructure, this design choice makes sense. This holds even in shared environments like Hyper-V: the virtual controller is installed in Partition 0 of Hyper-V, so it does not have to compete with virtual machines for CPU resources; otherwise, an overloaded system would create problems for the vController.
A Virtual Controller (also called Server-side Virtual Controller Technology, or SVCT for short) is installed on each Hyper-V or physical server that needs access to storage. It comprises a driver, responsible for presenting volumes coming from the Grid as block devices, and several services. Multiple controllers can access the same “Grid” at the same time. They are also responsible for optimizing I/O before flushing it to the storage, with operations like sequencing random I/O, per-VM snapshot management, VM-level replication, and a workload prioritization mechanism configurable on three different levels.
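To give an idea of what “sequencing random I/O” can mean in practice, here is a minimal sketch of the general technique: buffered writes are sorted by offset and contiguous ones are merged before being flushed. This is only an illustration, not GridStore’s actual implementation; the `Write` type and the `sequence_writes` function are hypothetical names, and the sketch assumes the buffered writes do not overlap.

```python
from typing import List, Tuple

# Hypothetical representation of a buffered write:
# (byte offset on the volume, payload to write there)
Write = Tuple[int, bytes]


def sequence_writes(pending: List[Write]) -> List[Write]:
    """Sort buffered writes by offset and merge the contiguous ones.

    Assumes the pending writes do not overlap each other.
    """
    merged: List[Write] = []
    for offset, payload in sorted(pending, key=lambda w: w[0]):
        if merged:
            last_offset, last_payload = merged[-1]
            # The new write starts exactly where the previous one ends:
            # extend the previous write instead of issuing a new one.
            if last_offset + len(last_payload) == offset:
                merged[-1] = (last_offset, last_payload + payload)
                continue
        merged.append((offset, payload))
    return merged


if __name__ == "__main__":
    random_writes = [(8, b"cc"), (0, b"aaaa"), (4, b"bbbb"), (20, b"d")]
    for offset, payload in sequence_writes(random_writes):
        print(offset, len(payload))
```

In this example four scattered writes become two sequential ones, which is the kind of work that can comfortably run on the host CPU before data reaches the nodes.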
The Grid
Coming out of the server, the vController connects to several “Nodes” over the Ethernet network. Each Node (there are three different models, one of which also has a PCIe Flash card) is a 1RU system with 4 disks and a CPU, and it has no internal redundancy. The overall solution works thanks to a “Grid” architecture (that’s where the company name comes from) with a minimum of 3 nodes, up to the current limit of 250 nodes. Each node adds its processor and disk space to the infrastructure, so the complete solution has the same power as the sum of all nodes.
I/O and data are evenly distributed across all nodes, and automatically rebalanced each time a node is added. Redundancy is guaranteed by Reed-Solomon erasure coding rather than RAID or replicas, and the erasure code calculations are performed directly by the vController. The redundancy level is configurable, so the system can lose one or more complete elements (a disk, a network link, a complete node), and it can also be set for each published volume, called a vLUN by GridStore. The disk overhead is the same as a RAID solution, but without the additional performance problem created by parity calculations, and without the redundancy limit of, for example, RAID-5, which is an N+1 system. In fact, you can configure different redundancy schemes like 3+2 and so on. The largest loss you can afford is the one that leaves half of the resources plus one still available: with 11 nodes, for example, you can lose up to 5 of them.
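The arithmetic behind these redundancy schemes is easy to sketch. In a k+m scheme (for example 3+2), data is split into k data fragments plus m redundancy fragments, and any m of them can be lost. The code below is a back-of-the-envelope model, not GridStore’s code: `ErasureScheme` and `max_tolerated_failures` are hypothetical names, and the rule used for the maximum loss is my reading of the 11-node example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErasureScheme:
    """A k+m scheme: k data fragments plus m redundancy fragments."""
    data: int        # k
    redundancy: int  # m

    @property
    def total(self) -> int:
        return self.data + self.redundancy

    @property
    def tolerated_failures(self) -> int:
        # A Reed-Solomon style k+m code can rebuild the data from any
        # k surviving fragments, so up to m fragments can be lost.
        return self.redundancy

    @property
    def usable_fraction(self) -> float:
        # Share of raw capacity left for data after redundancy.
        return self.data / self.total


def max_tolerated_failures(nodes: int) -> int:
    """Assumed rule: more than half of the nodes must survive."""
    return (nodes - 1) // 2


if __name__ == "__main__":
    for scheme in (ErasureScheme(2, 1), ErasureScheme(3, 2)):
        print(f"{scheme.data}+{scheme.redundancy}: "
              f"tolerates {scheme.tolerated_failures} failures, "
              f"{scheme.usable_fraction:.0%} usable")
    print("11 nodes can lose up to", max_tolerated_failures(11))
```

A 2+1 scheme behaves like RAID-5 in terms of overhead, while 3+2 gives up more capacity in exchange for surviving two simultaneous failures.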
Such an amount of computing power in the nodes may seem excessive, since the Virtual Controller uses the CPU of the host where it is installed. One quad-core CPU and 32 GB of RAM in each node are not that much per se, but if you think about a cluster with several nodes, the overall computing power is pretty big. However, some services are delegated directly to the nodes instead of being executed by a vController. The main example is the replication rebuild after a failure. This is a savvy design choice: there is no need to wait for a new server to run the vController, because replication starts immediately after a crash.
Final notes
GridStore’s solution is really interesting. I can’t help repeating that storage limits can only be eliminated with new scale-out solutions, as I outlined in a previous article. GridStore has chosen a pretty unique approach, both from a design and a target market standpoint. To avoid the scalability problems of the usual communication protocols, they designed a proprietary driver that communicates with the storage in a native format, with no need for NFS or iSCSI. They also deliberately avoided being yet another storage system for VMware, and instead created one of the few storage systems optimized for Hyper-V. This does not mean GridStore is not evaluating other platforms; maybe in the future we will see support for VMware or even KVM/OpenStack…
Another thing I liked was the ease of use of the solution. Thinking about the typical target customer of GridStore, this is a winning feature. Overall, it is a great solution wherever you need to start small because of a limited budget, but where quick growth could outpace the initial storage and end up forcing a forklift upgrade. The starting price is 1 USD per raw GB for the capacity models, and 1.5 USD for the Hybrid one. That means the starter kit costs 12,000 dollars and has 12 TB of raw space, which becomes 8 TB with Erasure Code configured like a RAID 5.
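The starter kit numbers can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 1000 GB) and a 2+1 erasure code scheme as the RAID 5 equivalent; `starter_kit` is a hypothetical helper, and the resulting price per usable GB is my own calculation, not a figure from GridStore.

```python
def starter_kit(raw_tb: float, usd_per_raw_gb: float,
                data_fragments: int, redundancy_fragments: int):
    """Return (price in USD, usable TB, USD per usable GB)."""
    raw_gb = raw_tb * 1000  # decimal units assumed
    price = raw_gb * usd_per_raw_gb
    total_fragments = data_fragments + redundancy_fragments
    usable_tb = raw_tb * data_fragments / total_fragments
    usd_per_usable_gb = price / (usable_tb * 1000)
    return price, usable_tb, usd_per_usable_gb


if __name__ == "__main__":
    # 12 TB raw, 1 USD per raw GB, 2+1 scheme (RAID 5 like)
    price, usable_tb, per_gb = starter_kit(12, 1.0, 2, 1)
    print(f"price: {price:.0f} USD, usable: {usable_tb:.0f} TB, "
          f"{per_gb:.2f} USD per usable GB")
```

With these assumptions, the capacity model works out to about 1.5 USD per usable GB once redundancy is taken into account.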
And even if you are working in VMware environments, GridStore can still be a good fit for some scenarios. For example, it could be a good solution for backups, since it can act as primary storage that scales without interruption or reconfiguration. In fact, GridStore and Veeam have agreed on an official partnership to offer GridStore storage as a primary backup repository.