For the second time in a row, I’ve been invited to attend as a delegate the 4th edition of Storage Field Day, which is going to take place next week in San Jose, California.
Again, the list of delegates and vendors is really huge, and I can’t wait to be there. Since I feel these are precious opportunities, I forced myself to gather as much information as possible about all the vendors, so as to arrive at the event well prepared.
In this article I’ve collected only public information, available on the vendors’ websites and in articles from other bloggers and journalists. Some of the ideas I formed could be wrong, but they are mostly a recap of what I’ve learned so far. If I’m wrong, Storage Field Day 4 will surely be the place to change my mind and go deeper into these technologies.
After writing this article, I realized there are two recurring trends. First, a major theme in this SFD edition will be scale-out storage, one of my favorites. Second, mark these words, because you will hear them really often in the future: “commodity hardware”.
Avere Systems
I like a solution that is well defined and targeted: it lets you quickly understand what it is and what benefits it can bring, and so quickly identify its potential customers. Avere Systems sells a scale-out storage system designed to optimize traditional NAS arrays, which natively do not have scale-out features.
Their system, called FXT, is available in several models (so in a way it also scales up; what about rip-and-replace problems?) and can be deployed as multiple nodes (up to 50) in front of a traditional NAS. Thanks to it, you can greatly increase the performance of a filer without the need to replace it. Think of an environment where the NAS is sized only for capacity, while you leave the management of performance to Avere.
It also lets you “virtualize” the managed storage thanks to some interesting features called FlashMove and FlashMirror: with them you can manage NAS arrays from several vendors at the same time, migrate data between them, or replicate them. It’s an ideal solution when, for example, you have to replace a NAS with a different model, whether from the same vendor or from another one.
Finally, I’m curious about the ability to replicate data to an external cloud storage. I haven’t found much information about this feature, so I will try to learn more during the meeting.
CleverSafe
I already met CleverSafe last April during Storage Field Day 3. They offer an object storage solution based on their patented “Dispersed Storage” technology. It is really interesting for cloud service providers and for all those companies needing a distributed storage able to scale to notable sizes. By using data protection algorithms like erasure coding, CleverSafe allows the creation of really large storage pools, completely redundant and replicated (even geographically), without the overprovisioning problems of the usual RAID systems, which would be totally uneconomical at that scale. For example, a system with 16 disks is still available even if 6 of them are lost.
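To get a feel for why erasure coding is so much more economical than replication at that scale, here is a minimal sketch in Python; the 10-of-16 split is my own assumption, chosen only to match the 16-disk example above:

```python
# Compare storage overhead and fault tolerance of a k-of-n erasure
# code against plain replication. The 10-of-16 split is illustrative.

def erasure_overhead(k: int, n: int) -> float:
    """Raw capacity consumed per byte of usable data with a k-of-n code."""
    return n / k

k, n = 10, 16               # any 10 of the 16 fragments rebuild the data
tolerated = n - k           # fragments (disks) that can be lost: 6

print(f"{k}-of-{n} erasure code: {erasure_overhead(k, n):.1f}x raw space, "
      f"survives {tolerated} lost disks")
print("3-way replication: 3.0x raw space, survives 2 lost copies")
```

With a comparable fault tolerance budget, replication would need several full copies of the data, which is exactly the overprovisioning problem CleverSafe avoids.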
Matt Simmons wrote a really detailed article back in April. We’ll see at this next meeting what news has come up since then.
CloudByte
CloudByte offers a software-only storage solution based on ZFS; in fact, the operating system is basically FreeBSD 9.0. Their software creates a scale-out storage system using commodity hardware: each server acts as a “Storage Controller”, managing its own local disks and sharing them with the other controllers in order to form a cluster. Multiple clusters can also be aggregated into a “site”.
The solution is specifically designed for multi-tenant environments, whether they are service providers or large companies with several internal business units. CloudByte focused their design on QoS management, that is, the ability to guarantee specific amounts of performance to different workloads, without competition from “noisy neighbours”. Thanks to their technology named TSM, admins can assign guaranteed IOPS, throughput, and latency to each workload, and change these values in real time. Storage can be exported to servers via FC, iSCSI, NFS, or even CIFS. Storage provisioning can be completely automated: by assigning a profile to a customer, the system automatically determines the best placement for the new resources. For further needs, all these activities can be executed via RESTful APIs; there are also two plugins, for vCenter and OpenStack Cinder.
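To give an idea of what QoS-aware provisioning could look like through such an API, here is a minimal sketch in Python. The endpoint, field names, and authentication scheme are all assumptions of mine for illustration, not CloudByte’s documented API:

```python
import requests  # third-party HTTP client (pip install requests)

# Hypothetical endpoint and payload: every name below is an assumption,
# not taken from CloudByte's actual API documentation.
API_URL = "https://cloudbyte.example.local/api/v1/volumes"

tenant_profile = {
    "tenant": "business-unit-a",
    "size_gb": 500,
    "protocol": "iscsi",         # could also be fc, nfs or cifs
    "qos": {                     # per-workload guarantees, TSM-style
        "iops": 2000,
        "throughput_mbps": 100,
        "latency_ms": 5,
    },
}

resp = requests.post(API_URL, json=tenant_profile,
                     headers={"Authorization": "Bearer <api-key>"})
resp.raise_for_status()
print("Volume placed on controller:", resp.json().get("controller"))
```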
Really interesting, for those who would like to try the solution, is the availability of a completely free version, capped at 4 TB of managed storage.
Coho Data
Coho Data, previously known as Convergent.IO, just came out of stealth mode by announcing its first product, called DataStream 1000. It’s a scale-out storage system with some features in common with other “last generation” solutions, but also many interesting differences.
First of all, the “common” features: it’s a scale-out storage system, it uses commodity hardware (at first look a MicroArray seems to be based on a SuperMicro 2U Twin), and inside there are a NAND flash PCIe card, some mechanical disks, and a network card with two 10G ports. On top of this hardware there is probably a Linux OS, running the software developed by Coho Data.
The differences are obviously elsewhere: as usual, this being a software solution (even if it’s sold as an appliance), the core is its code. But there is also a 10G switch, sold together with the storage modules and used to connect all the storage nodes to each other and to the servers consuming them. The switch is completely managed and configured in real time by the software, in order to optimize traffic and always guarantee both redundancy and performance.
From a storage perspective, it is basically an object store, even if they call it a “Bare Metal Object Store”, and the namespace is managed by all the available MicroArrays at the same time. By adding nodes, the overall space increases linearly and without any interruption, and data is rebalanced across all nodes. So, you can expand your storage based on your needs, without any disruptive upgrade. Storage is published as an NFS share, so it can be consumed by VMware vSphere and by physical servers. Each server can be associated with a profile, that is, a defined set of guaranteed space and I/O.
It’s a really interesting and innovative solution, and bundling storage and networking in the same product is for sure a peculiarity. The main target seems to me to be large companies and service providers needing a scale-out storage system to consolidate several workloads, both virtualized and physical, and a place for applications like big data. This is confirmed by the sizing of the solution: the starting kit is a single block (2 MicroArrays) and 1 switch, and this already means 39 TB and 180,000 IOPS, but for sure the solution makes even more sense if the customer plans to scale far beyond these numbers. Cormac Hogan wrote an excellent review on his blog; go check it out for further information.
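Since the published claim is linear scaling, a quick back-of-the-envelope projection is straightforward; the assumption that the starting-block figures hold perfectly at scale is of course an idealization of mine:

```python
# Project capacity and performance assuming perfectly linear scaling
# from the published starting block (2 MicroArrays + 1 switch).
BLOCK_TB = 39
BLOCK_IOPS = 180_000

for blocks in (1, 2, 4, 8):
    print(f"{blocks} block(s): {blocks * BLOCK_TB} TB, "
          f"{blocks * BLOCK_IOPS:,} IOPS")
```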
GridStore
GridStore is another completely scale-out solution, where the two main components of a storage system, controllers and disks, have been separated into two distinct elements. As in other solutions, the main goal is to defeat the infamous I/O blender and its consequences, by creating a system able to scale compute power and capacity independently. This is possible thanks to the two elements that form the GridStore solution.
First of all, there is a software controller that can be installed on any Windows OS, whether a physical server or a Hyper-V hypervisor. The controller, called Server-side Virtual Controller Technology (SVCT for short), publishes storage as a local SCSI resource to the system it’s installed on, and then connects via Ethernet to one or more “Nodes”. Each Node (there are three different models, one also with a PCIe flash card) is a 1RU building block with 4 disks and a CPU: its goal is to serve a defined amount of disk space and compute power to the overall Grid. In this way, storage can be expanded dynamically by adding Nodes to the Grid when you need more capacity, or by adding additional SVCT instances when you need more I/O.
I/O is evenly distributed among all the Nodes, and each of them receives the same amount of data from the controllers. Redundancy is guaranteed by erasure coding rather than by RAID or replicas, and the erasure code computation is done by the virtual controllers.
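To picture what the controllers are doing, here is a deliberately simplified sketch in Python: each write is split into k data fragments plus a single XOR parity fragment, one per Node. GridStore’s actual erasure code is certainly more sophisticated than single parity; this only illustrates the principle.

```python
# Simplified controller-side erasure coding: split each write into
# k data fragments plus one XOR parity fragment, spread one per Node.
# Single parity tolerates one lost fragment; real schemes tolerate more.

from functools import reduce

def encode(block: bytes, k: int) -> list:
    """Split a block into k equal data fragments + 1 parity fragment."""
    assert len(block) % k == 0, "pad the block to a multiple of k first"
    size = len(block) // k
    fragments = [block[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), fragments)
    return fragments + [parity]

def rebuild(fragments: list, lost: int) -> bytes:
    """Recover a single lost fragment by XOR-ing the surviving ones."""
    survivors = [f for i, f in enumerate(fragments) if i != lost]
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), survivors)

block = b"16 bytes of data"          # 16 bytes, divisible by k=4
frags = encode(block, k=4)           # 5 fragments, one per Node
assert rebuild(frags, lost=2) == frags[2]   # a lost Node is recoverable
```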
This solution seems to be really cheap, and well designed for environments that need to start really small but could face fast growth in the future. If I understood correctly, the SVCT technology is only available on Windows, but it could still possibly be used in VMware environments, for example as a backup landing area for software like Veeam.
Nimble Storage
Nimble Storage is one of the biggest players in the hybrid storage market (together with Tegile and Tintri). Their solution was first launched a few years ago and has been widely adopted by many customers, and the latest news is that they will be listed on the stock exchange in 2014. In a really competitive and aggressive market that has already seen its first victims, this is a great achievement, and a statement about both the quality of the solution and the ability to sell it.
Nimble offers a hybrid storage system (SSD + HDD) based on their patented CASL (Cache Accelerated Sequential Layout) technology. According to several analysts and user feedback, this technology is really efficient in managing tiering between the two disk levels: the SSDs collect all the incoming writes, align them, and finally write them sequentially to the mechanical disks. They also keep that incoming data around, so reads can be accelerated too. This approach takes advantage of the SSD’s performance in random I/O and optimizes the usage of the mechanical disks. CASL also uses real-time data compression and, above all, a complete solution for copies and snapshots via redirect-on-write, which makes them really fast and efficient.
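The core idea of turning random writes into sequential ones is easy to show in miniature. Here is a toy sketch of a write buffer that absorbs random-address writes and flushes them to disk as one ordered, sequential run; this is my own simplification for intuition, not Nimble’s actual code:

```python
# Toy model of CASL-style write coalescing: random writes land in a
# fast buffer and are flushed to spinning disk as one sorted run,
# trading many random seeks for a single sequential pass.

class CoalescingWriteBuffer:
    def __init__(self, flush_threshold: int):
        self.flush_threshold = flush_threshold
        self.pending = {}            # logical block address -> data

    def write(self, lba: int, data: bytes):
        self.pending[lba] = data     # absorbed at SSD speed
        if len(self.pending) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # One ordered pass instead of many random seeks on the HDD.
        for lba in sorted(self.pending):
            print(f"HDD sequential write at LBA {lba}")
        self.pending.clear()

buf = CoalescingWriteBuffer(flush_threshold=4)
for lba in (907, 13, 541, 112):      # a random incoming write pattern
    buf.write(lba, b"...")
```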
External connectivity is iSCSI (I’m not sure whether other protocols are supported), and there is an interesting capability to both scale up (by adding disk shelves to a storage node) and scale out, by joining several systems into the same cluster.
Overland Storage
I have no idea which of their several products Overland will present to us, or whether it will be a new upcoming solution. Overland Storage has different products in its portfolio, like traditional SAN systems, tape libraries, VTLs, and NAS. I don’t know this company very well, but at first sight an interesting product is SnapScale. It’s a scale-out NAS that offers CIFS, NFS, or HTTP connectivity, and also iSCSI at the same time. Each node is 2RU tall and, depending on the model, holds 8 to 144 TB of raw space. It can also replicate between different clusters, both locally and geographically. I didn’t find detailed information about the maximum number of nodes in a cluster, but if this product is going to be the topic of the meeting, I will try to get further information.
Oxygen Cloud
Oxygen Cloud is a file-sharing solution for corporate customers. It offers the usual features of this kind of solution, but also some designed specifically for the enterprise: encryption, access controls based on Active Directory or LDAP, support for RSA SecurID, remote wipe, and the possibility to choose where to store data, in a public cloud (Amazon) or on the private cloud solution offered by IBM. Oxygen is for sure the “black swan” in this series of meetings, and it will be interesting to meet them and find out more about them.
Proximal Data
Proximal Data offers a server-side caching solution called AutoCache. It’s designed for VMware ESXi and installs at the kernel level in order to transparently accelerate all kinds of virtual machines, without any need to modify the guest operating system. Acceleration is available only for read operations, and cache contents are replicated among all the servers of a vSphere cluster, so when a virtual machine is vMotioned there is no performance impact when it arrives on the destination host.
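Conceptually, a read cache like this sits between the VM and the datastore: serve from local flash on a hit, fall back to the array on a miss and populate the cache on the way back. Here is a toy read-through cache showing just that local path (the cluster-wide replication is out of scope); it is my own illustration, not Proximal Data’s implementation:

```python
# Toy read-through cache: the pattern behind host-side read acceleration.
# Reads hit local "flash" when possible; writes go straight to the array,
# since the acceleration described above is read-only.
from collections import OrderedDict

class ReadCache:
    def __init__(self, capacity: int, backend: dict):
        self.capacity = capacity
        self.backend = backend           # stands in for the datastore
        self.cache = OrderedDict()       # stands in for local flash

    def read(self, block: int) -> bytes:
        if block in self.cache:          # hit: served at flash speed
            self.cache.move_to_end(block)
            return self.cache[block]
        data = self.backend[block]       # miss: slow array read
        self.cache[block] = data         # populate for next time
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict least recently used
        return data

    def write(self, block: int, data: bytes):
        self.backend[block] = data       # write goes straight to the array
        self.cache.pop(block, None)      # keep the cache consistent

datastore = {7: b"cold data"}
cache = ReadCache(capacity=1024, backend=datastore)
assert cache.read(7) == b"cold data"     # miss, then cached
assert cache.read(7) == b"cold data"     # hit, served from "flash"
```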
An interesting design choice made by Proximal Data is that, by default, every VM and datastore on an ESXi server is accelerated once the software is installed; afterwards it is possible to configure exclusions. This is the opposite behaviour of similar solutions, which instead need to be explicitly configured and by default do not accelerate anything.
Virident
Virident is a maker of flash PCIe cards. It was recently acquired by HGST, a division of storage giant Western Digital, which in the same period also acquired STEC, another flash PCIe card vendor. I have no information about which technology will be the topic of our meeting; it could be their cards or their software solutions. I really hope it will be the latter.