Atlantis Computing is announcing today its new software storage solution, named USX, or Unified Software-defined Storage.
Forget the buzzword-based acronym: as it was described to me, this is a really awesome product.
Do you know Atlantis?
Atlantis Computing was founded around 4 years ago, and its main focus has always been in-memory storage.
By using the RAM inside servers, they created a solution that exposes memory (and other storage resources) as a repository to the hypervisors and scales across servers. They initially focused on VDI environments, probably because the volatile nature of in-memory storage sounded too risky for business-critical applications such as databases.
Over the past years, they validated the solution through many achievements, such as official certifications from both VMware and Citrix, and partnerships with HP, Dell, Cisco, IBM, VCE and NetApp. They now have more than 400 customers, with more than 500,000 VMs running on top of it, including some huge use cases like JP Morgan Chase running 100,000 desktops with Citrix and Atlantis ILIO, as the VDI solution is called.
The experience with ILIO let Atlantis develop some good technologies like inline deduplication, replication between nodes, and others. Now, with USX, Atlantis is bringing its technology to the general server market, while adding something new.
Here comes USX
USX is already available, and it sounds like a really promising technology. It is deployed on every ESXi server as a virtual storage appliance (VSA). It uses local memory as its first tier of storage, so you choose the size of that tier simply by configuring the memory size of the VSA. It then tiers data using any available local storage (PCIe Flash, SSD or HDD) as well as any backend storage mounted on the VMware server, regardless of whether it’s a block device or an NFS share. All the USX machines talk to each other and replicate data in order to protect their content.
The USX solution is software-only and offers many features; obviously, the most relevant one is the in-memory storage tier. I know many of you are already scared about using a volatile medium like RAM as a storage device, but it sounds like the smart guys at Atlantis have solved the data protection issues. Taking advantage of the extreme speed of memory (10 times faster and with even lower latency than the fastest Flash memory available), USX can process incoming data before saving it, which mainly means deduplication, compression and I/O sequencing. USX is basically an object store, and each chunk of data is saved twice inside the cluster. When a new block needs to be written, USX first calculates the metadata of the incoming block, then compares it against all the existing metadata, and finally saves the block only if it does not match any existing one. If the block is a duplicate instead, the block itself is not written: USX simply updates the metadata, recording the fact that this block is now referenced one more time by the hypervisor.
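The write path described above can be sketched in a few lines of Python. This is only an illustration of generic inline deduplication, not Atlantis code: the `DedupStore` class, the use of SHA-256 as block fingerprint and the reference counter are all my assumptions, since the briefing did not cover how USX actually computes or stores its metadata.

```python
import hashlib


class DedupStore:
    """Toy block store with inline deduplication (hypothetical, not USX code)."""

    def __init__(self):
        self.blocks = {}    # fingerprint -> block payload, stored only once
        self.refcount = {}  # fingerprint -> how many times the block is referenced

    def write(self, data: bytes) -> str:
        # Step 1: compute the metadata (here, a SHA-256 fingerprint) of the block.
        fingerprint = hashlib.sha256(data).hexdigest()
        # Step 2: compare it against the metadata already known.
        if fingerprint in self.blocks:
            # Duplicate: do not store the payload again, only update the metadata.
            self.refcount[fingerprint] += 1
        else:
            # Step 3: new content, store the payload and start counting references.
            self.blocks[fingerprint] = data
            self.refcount[fingerprint] = 1
        return fingerprint

    def read(self, fingerprint: str) -> bytes:
        return self.blocks[fingerprint]


store = DedupStore()
first = store.write(b"A" * 4096)
second = store.write(b"A" * 4096)  # same content as the first block
third = store.write(b"B" * 4096)   # different content
```

After these three writes the store holds only two payloads, and the first fingerprint has a reference count of two.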
The final result is a large data reduction, so only the smallest possible number of blocks is saved into memory. For additional resiliency, each VSA holds a complete copy of all the metadata, even though any given block is only saved in two places. Atlantis claims it obtained a 5x space reduction on average in its tests, and this is a good result, given the high price of RAM. Think for example of a modern x86 server with 128 GB of RAM: by assigning 24 GB to the VSA, you can have a memory tier with an effective capacity as big as 120 GB, as much as an SSD.
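The arithmetic behind that example is simple enough to write down. The function name and the variables below are mine, and the sketch assumes the whole VSA memory is usable for data, ignoring the space taken by metadata and by the appliance itself.

```python
def effective_capacity_gb(raw_gb, reduction_ratio):
    """Effective capacity of a tier, given its raw size and a data reduction ratio."""
    return raw_gb * reduction_ratio


server_ram_gb = 128   # total RAM of the example server
vsa_ram_gb = 24       # RAM assigned to the VSA
reduction_ratio = 5   # average reduction claimed by Atlantis

memory_tier_gb = effective_capacity_gb(vsa_ram_gb, reduction_ratio)
ram_left_for_vms_gb = server_ram_gb - vsa_ram_gb

print(f"Effective memory tier: {memory_tier_gb} GB")
print(f"RAM left for virtual machines: {ram_left_for_vms_gb} GB")
```

With a lower reduction ratio the memory tier shrinks accordingly, so the 5x figure is the number to verify in your own environment.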
Metadata are also saved to the lower tier, which is persistent, and replicated in real time to another VSA in the cluster. In this way, even if a VSA crashes and its volatile memory tier is lost, data are still available on another node. Data are spread across different nodes and, as in many object storage solutions, there is no quorum architecture.
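To show why two copies are enough to survive the crash of a single VSA, here is a minimal simulation. Everything in it is hypothetical: the `Cluster` class, the node names and above all the placement rule (a hash of the block ID picks two neighbouring nodes) are assumptions of mine, since Atlantis did not tell me how USX chooses where to place the replicas.

```python
import hashlib


class Cluster:
    """Toy cluster keeping every block on two different nodes (not USX code)."""

    def __init__(self, node_names):
        if len(node_names) < 2:
            raise ValueError("at least two nodes are needed to keep two copies")
        self.nodes = {name: {} for name in node_names}  # node -> {block_id: data}
        self.alive = set(node_names)

    def replica_nodes(self, block_id):
        # Assumed placement rule: hash the block ID to pick a node,
        # then use the next node in sorted order for the second copy.
        names = sorted(self.nodes)
        digest = hashlib.sha256(block_id.encode()).hexdigest()
        start = int(digest, 16) % len(names)
        return [names[start], names[(start + 1) % len(names)]]

    def write(self, block_id, data):
        for name in self.replica_nodes(block_id):
            self.nodes[name][block_id] = data

    def fail(self, name):
        # A crashed VSA loses the content of its volatile memory tier.
        self.alive.discard(name)
        self.nodes[name] = {}

    def read(self, block_id):
        for name in self.replica_nodes(block_id):
            if name in self.alive and block_id in self.nodes[name]:
                return self.nodes[name][block_id]
        raise KeyError(block_id)


cluster = Cluster(["vsa-1", "vsa-2", "vsa-3"])
cluster.write("block-42", b"payload")
primary = cluster.replica_nodes("block-42")[0]
cluster.fail(primary)                    # the node holding the first copy crashes
recovered = cluster.read("block-42")     # served by the surviving copy
```

Losing both nodes that hold a block would of course lose the block in this toy model; a real product also has to rebuild the missing copy after a failure, which is out of scope here.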
You are probably asking: how much can it scale? Atlantis did some internal tests and the biggest cluster was made of 256 hosts, but technically there is no limit. In reality, at some point the real-time replication between nodes is going to become the bottleneck of the solution by saturating the available bandwidth, so you had better stop at some point and start an additional cluster. This is also the reason why Atlantis officially requires a 10 Gbit/s network for the replication network in production environments, and at least 4 Gbit/s of overall bandwidth in lab/test environments.
A real modern storage
If the schema above looks to you like something “already seen somewhere”, don’t worry, you are right. Many features of USX are the same ones I saw in the latest storage solutions I described in previous months, and Atlantis has designed a truly modern storage solution. Let’s try to list and describe them:
– it’s a pure software solution: it can run on any x86 server, and the only limit is the hardware compatibility list of ESXi
– the back-end is an object storage
– data chunks are dispersed among nodes, and data protection is obtained via replication rather than RAID parity
– nodes are loosely coupled: there is no common element between nodes
– metadata are known by every node, there is no master/slave or active/passive mode, and each node participates in the cluster. This means the storage improves its performance as more nodes are added
– management is completely based on REST APIs, and the GUI is simply an easy interface to talk with the APIs. A nice feature is the possibility to see the REST version of every task you run in the GUI. In this way, you can quickly learn the syntax and have a command reference for developing your own commands.
The front-end deserves a dedicated paragraph. Storage is exposed to ESXi as an NFS share. But, instead of having one huge NFS share and letting the storage manage tiering and prioritization of workloads, you can take advantage of the several tiers available and design different volumes with different characteristics. Atlantis calls those volumes “Application Defined Storage Volumes”, and they can be summarized with this schema:
Depending on your needs, you can still have a pure in-memory volume like in their ILIO solution (2), and also replicate it to the shared storage for higher protection (3). Then, USX adds two more storage profiles, Hybrid and All-Flash. Once you have configured USX with all the storage you have available, you don’t have to care about data placement: when you create a new volume and choose which profile you want to use, USX manages everything for you, and you can also change the profile on the fly without any interruption.
It’s a 1.0 release, expect more in the future
As with all 1.0 releases, there are some missing features, and you can see some of them in the architecture schema above, where they are listed in grey. Right now, vSphere is the only supported platform, an obvious choice looking at its market share. Hyper-V is already planned, while KVM is “on the radar”. I would say it should not be so hard to support other hypervisors, since the VSA is probably a Linux server running the Atlantis software.
Replication between clusters is also not available at the moment. It is on the roadmap, and it will support both synchronous and asynchronous replication, as well as VMware SRM.
A nice addition is going to be the support for persistent memory, like Diablo Technologies ULLtraDIMM (or better, the commercial version sold by SanDisk). When support for “flash on DIMM” becomes available, USX will no longer need to flush metadata to persistent storage, so we can expect even better performance. Atlantis has already tested this technology using IBM X6 servers, the only machines supporting those memories at the moment, and they showed me some impressive results. I think this picture is worth a hundred words:
Final Notes
I’ve been briefed directly by Atlantis CTO Chetan Venkatesh, and I must say we had a really pleasant conversation on the phone. USX sounds like a really promising solution, and since he offered me some licenses for my lab, I can’t wait to test it. I see real value in the USX solution. First, it can use the existing storage resources inside a server and pool them across the Ethernet network to build a redundant solution. In these times of shrinking budgets, being able to reuse existing storage without buying new hardware is a plus. Even in a server bought some years ago, memory is faster than any SSD on the market, and modern servers may be short on disk bays, but they have plenty of DIMM slots; if you can pool that memory and protect it using some local disks and USX, you can create your own “No SAN” cluster.
Second, the in-memory storage: once you convince yourself the technology developed by Atlantis is able to protect your data even if it’s stored on a volatile medium, you can unleash an incredibly fast storage layer like the memory inside your servers, and thanks to deduplication and other data optimizations, get the most out of it.
A final note on licensing: I don’t know the prices, but USX is going to be licensed on effective capacity, regardless of the number of nodes or the mix of storage types you are going to use. This is another smart choice.