After the recent release of VMware VSAN, there has been a series of blog posts from some of my peers about the design considerations VSAN brings with it. If you read them in the order they were published, you can follow the conversation that is going on:
VSAN – The Unspoken Truth by Maish Saidel-Keesing
VSAN – The Unspoken Future by Christian Mohn
VSAN – The spoken reality by Duncan Epping
I totally agree with both Christian and Duncan, and in a certain way also with Maish, in thinking blade servers are not a good solution for VSAN. My opinion is even more radical, as I think blade servers have almost never been a good solution “at all”… This reminded me of an idea I have always had in mind (and have often applied in my datacenter designs): I don’t like blade servers. In this post I’m going to explain why, reason by reason.
Fair warning: there is nothing in favor of blade servers listed here.
Space savings? Sometimes
One of the biggest selling points of blade servers has always been the savings in rack space. Compared to a 1 RU (Rack Unit) server, a blade chassis can hold more than one server per U. For example, a Dell M1000e occupies 10 RU while holding up to 16 servers. In this way, a server uses 0.625 RU. You could say a 38% space saving is a lot, but there is a trick in this number: it only holds for a fully loaded chassis, and you don’t save any space at all until you install at least 11 servers, that is 68% of its capacity. Any number below this makes the rack usage comparable with rack servers, or even worse. For example, the SuperMicro FatTwin quoted in Christian’s post offers a space saving of 50%, much more than blade servers.
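If you want to play with the break-even point yourself, here is a minimal sketch of the arithmetic, using the M1000e figures quoted above:

```python
# Rack-space break-even for a blade chassis vs. 1U rack servers.
# Figures are the Dell M1000e ones quoted above: 10 RU chassis, up to 16 blades.
CHASSIS_RU = 10
CHASSIS_SLOTS = 16

def ru_per_server(installed_blades: int) -> float:
    """Rack units consumed per server when the chassis holds this many blades."""
    return CHASSIS_RU / installed_blades

for blades in (4, 10, 11, CHASSIS_SLOTS):
    print(blades, round(ru_per_server(blades), 3))
# 4  -> 2.5   RU per server (far worse than 1U rack servers)
# 10 -> 1.0   RU per server (break-even with 1U rack servers)
# 11 -> 0.909 RU per server (the saving finally starts)
# 16 -> 0.625 RU per server (the advertised ~38% saving)
```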
This means, from a design perspective, that with blade servers you need to size your infrastructure with the correct number of servers from the beginning, even if they are supposed to be added “as you go”. In the end, from a scalability perspective, growth is only effective if it comes in batches of more than 10 servers at a time.
Have you ever seen a completely filled chassis? I have only a few times; the vast majority of them are half empty, maybe because the initial requirement was only for a few servers and the planned growth never happened, or because there are some blade combinations you need to follow inside the chassis, and not all models can be inserted where you want. In all these scenarios, your blade system is using more RU than the same number of rack servers would.
Look at this picture a friend published on Twitter a few weeks ago: do you think 4 servers in 10 RU are saving space?
The chassis is a lock-in
Even if vendors always tell you the chassis is going to be supported for years, and that it can accommodate several generations of blade servers, are you sure this is really going to happen? I’ve seen many customers having to spend a lot of money at a certain point because they needed a new blade model, but the existing chassis was not able to run it.
So, instead of adding only one new blade, they had to add a completely new chassis. And often its price is much more than a single blade server.
Shared Backplane
This is by far the biggest complaint I’ve always had about blade servers. I know, I know: modern backplanes are redundant, completely passive, and it’s almost impossible for them to break. To me, that “almost” is enough to be afraid of them. When a backplane breaks, I suddenly lose a huge bunch of servers. If I’m a small company and I only ordered one chassis (like my friend’s company did), I have no other chassis to power up my servers. No matter how many blade servers you have, your single point of failure is no longer a single server but the entire chassis.
Datacenter-unfriendly elements
When it comes to server room design, rack footprint is only one of the elements you need to consider.
First, your servers are not the only component: if your central storage is going to use 4 racks, do you really care about a few more RU used by your servers? You can save waaaay more money by optimizing your storage infrastructure than your server infrastructure.
Then there is air cooling. Since a blade server has more or less the same internal components as a rack server (mainly the CPU and other internal chipsets), the power consumption is going to be the same, and so is the required cooling. But since you are concentrating many servers in a small number of RU, a blade chassis can become a “hot spot”, and your cooling system needs to take this into account. It’s not a bad thing per se; SuperMicro FatTwin systems have the same problem, maybe even more because of their internal disks, but you end up designing your datacenter specifically for blade servers. What if you want to install a new chassis in an area of your server room where the air conditioning is not enough?
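To make the hot-spot point concrete, here is a minimal sketch of the power density involved; the 400 W per server is purely a hypothetical figure I picked for illustration, not a measured value:

```python
# Hypothetical power-density comparison: same servers, different packaging.
# The 400 W per-server draw is an assumed, illustrative number.
WATTS_PER_SERVER = 400

def watts_per_ru(servers: int, rack_units: int) -> float:
    """Total power draw divided by the rack units it is concentrated into."""
    return servers * WATTS_PER_SERVER / rack_units

# 16 blades in a 10 RU chassis vs. 16 rack servers spread over 16 RU.
print(watts_per_ru(16, 10))  # 640.0 W/RU for the blade chassis
print(watts_per_ru(16, 16))  # 400.0 W/RU for the rack servers
```

The total power is identical in both cases; it’s the concentration per RU that creates the hot spot.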
Connection savings
Another selling point of blade servers is the savings on connections. You only need to connect a single chassis to the outside world with a few cables, thus saving on cabling. This is true, no doubt. But do we still need this? The savings on connections start from the assumption that not all servers are going to fully use the available bandwidth (be it Ethernet or Fibre Channel) at the same time, so you are basically overprovisioning those connections. But with new technologies like flash memory, it’s easy to saturate a 10G connection on a single server, so why wouldn’t this happen on a shared connection? The solution could be a bigger backbone, like 40G or 100G, but are we still saving money this way? Or is the price of a few 100G connections much more than two 10G connections per server?
Modern datacenters are embracing Ethernet connections for storage too, so the prices of Fibre Channel networks are not a problem, because they are simply ignored. And when it comes to Ethernet, 10G connections are becoming more and more common as their prices drop. Bypassing the blade interconnects means I have one less component in my data path that can break, and one less hop for my data. In some datacenters I’ve seen, VMware clusters are spread horizontally across racks, to limit the impact of PDU failures. With this kind of design, where even ToR (Top of Rack) switches are (maybe) less useful, servers are often directly connected to big central switches. And this makes the internal connections of a blade chassis less relevant…
What’s the point? As the price per network port falls, the complexity of some designs is becoming less relevant. In the past, a rack server with 4 gigabit connections consumed 4 Ethernet ports to get 4 Gb of total bandwidth. New servers with 10G Ethernet ports only consume 2 Ethernet ports to offer 20 Gb of total bandwidth. As this ports/bandwidth ratio improves by the day, the need for network concentrators like those inside the blade chassis is becoming less clear.
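A quick back-of-the-envelope sketch of that ports/bandwidth ratio, simply restating the two examples above:

```python
# Ports consumed vs. aggregate bandwidth for the two examples above.
def total_bandwidth_gbit(ports: int, gbit_per_port: int) -> int:
    """Aggregate uplink bandwidth of a server, in Gbit/s."""
    return ports * gbit_per_port

old = total_bandwidth_gbit(ports=4, gbit_per_port=1)    # 4 Gbit/s over 4 ports
new = total_bandwidth_gbit(ports=2, gbit_per_port=10)   # 20 Gbit/s over 2 ports

# Bandwidth delivered per switch port consumed:
print(old / 4)   # 1.0 Gbit/s per port
print(new / 2)   # 10.0 Gbit/s per port
```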
Convergence? Not on a blade
This brings me to my last point. Maish correctly stated that a blade server cannot be used for a converged infrastructure. Not only VSAN: other solutions like Nutanix, for example, use a totally different form factor, with more room for local disks and cards (and NO shared backplane, by the way).
But even before converged systems, blade servers already had problems accommodating anything other than CPU or RAM. Any additional PCI card in a blade server often requires a dedicated model (mezzanine or whatever), and you can’t simply buy a common PCI card and plug it into a blade. Think about a PCIe flash card like Fusion-io or Virident, or a GPU-accelerated card. Some would say: you can use the bigger blade models. But then, again, where is the space saving if I buy a “fat” blade server?
In conclusion, I think converged architectures are only exposing even further some of the problems that blade servers have always had.
I know it’s a radical position; feel free to disagree, but if you work for a blade server vendor, please state it before commenting.