Some weeks ago, Nimble Storage published on its blog an article titled It’s Time To Get Aggressive About Data Protection. The post is a nice summary and analysis of a survey they conducted among their customers. The topic, as you can guess from the title, is Data Protection. They interviewed 1600 of their customers of different sizes, from small and medium businesses to large enterprises, and the results were pretty interesting. If I had to summarize them in one sentence: a need for Next-Gen Data Protection is arising.
Increasingly demanding environments
The first outcome of the survey is really clear, and it is a strong message for all the players in IT: customers are facing more and more demanding requests from their stakeholders, and this in turn is forcing IT vendors to introduce different and more effective ways to guarantee the “always-on” business. In Data Protection, the two dogmas when it comes to service levels are RPO (Recovery Point Objective) and RTO (Recovery Time Objective), and you can translate them into simple language as: “we need to save data as fast and as frequently as possible, and when we need to restore it, we need to do it quickly, really quickly”.
I’m always doubtful when a customer says that more than 50% of workloads are “critical”. When it comes to a Disaster Recovery Plan, one of the most important steps is to clearly identify workloads and to give each of them a priority level. You simply cannot list all of them as critical, because it would be as if none of them were. Not all workloads are really “critical” when a DR situation occurs. Nonetheless, this is an interesting result: it means that at least the “perceived” criticality of those workloads is increasing.
6 is the new 24
The direct consequence, as Nimble correctly points out, is that the once common RPO and RTO values are no longer adequate in modern scenarios: the days of nightly backups and 24-hour RPOs are almost gone in any environment, and customers are requiring more frequent backups to lower their RPO values. This means one thing: the production infrastructure must be able to sustain backup or replication operations in the middle of the day, while the same workloads are serving users. Very few customers and applications require RPO and RTO close to zero, but even in small and medium companies, the new default value seems to be 6 hours, for both RPO and RTO.
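To make the "6 is the new 24" idea concrete, here is a minimal sketch of how a backup schedule could be checked against an RPO target. It assumes the worst-case data loss equals the longest gap between two consecutive completed backups; the function names and the sample dates are made up for illustration and do not come from the Nimble survey.

```python
from datetime import datetime, timedelta


def worst_case_rpo(backup_times):
    """Return the longest gap between consecutive backups.

    Assumption: a failure just before the next backup loses
    everything written since the previous one.
    """
    if len(backup_times) < 2:
        raise ValueError("need at least two backups to measure a gap")
    ordered = sorted(backup_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    return max(gaps)


def meets_rpo(backup_times, target):
    """True when the longest gap does not exceed the RPO target."""
    return worst_case_rpo(backup_times) <= target


# Hypothetical schedules over three days (the start date is arbitrary).
start = datetime(2014, 1, 1)
nightly = [start + timedelta(days=d, hours=23) for d in range(3)]
six_hourly = [start + timedelta(hours=h) for h in range(0, 72, 6)]

target = timedelta(hours=6)
print("nightly meets 6h RPO:", meets_rpo(nightly, target))
print("six-hourly meets 6h RPO:", meets_rpo(six_hourly, target))
```

The nightly schedule fails the 6-hour target simply because its gaps are 24 hours long, which is exactly why backups have to move into working hours.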
What does it mean? The biggest issue coming from these new business needs is that, even in environments that do not run 24/7, some of these backup operations are going to run during working hours. So the production environment, and mainly the storage infrastructure, needs to be able to sustain the load of workloads and backups at the same time. It’s no longer enough to size the storage for production needs alone: every designer also needs to take into account the additional I/O created by backup operations, or leverage new integrated functionalities between storage arrays and data protection solutions.
It’s interesting that more than a year ago (January 2013) I wrote a post titled It’s time for “Storage Aware” backups? talking about exactly this issue. My last sentence in the post was:
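As a rough illustration of this sizing problem, the sketch below adds the average I/O of a daytime backup to the production peak. Every number and name in it is an assumption of mine (the changed data per backup, the backup window, the 256 KiB I/O size, the 20% headroom); real arrays and backup tools behave in more complex ways, so treat it as back-of-the-envelope math only.

```python
def backup_iops(changed_gib, window_hours, io_size_kib=256):
    """Average read IOPS needed to move the changed data within the window.

    Assumption: the backup reads changed_gib of data in fixed-size I/Os,
    spread evenly across the whole window.
    """
    total_kib = changed_gib * 1024 * 1024
    io_count = total_kib / io_size_kib
    return io_count / (window_hours * 3600)


def required_iops(production_peak_iops, changed_gib, window_hours,
                  io_size_kib=256, headroom=0.2):
    """Production peak plus backup load, with a safety margin on top."""
    combined = production_peak_iops + backup_iops(
        changed_gib, window_hours, io_size_kib
    )
    return combined * (1 + headroom)


# Hypothetical case: 180 GiB changed since the last backup, 1-hour window,
# running while production peaks at 5000 IOPS.
extra = backup_iops(changed_gib=180, window_hours=1)
total = required_iops(5000, changed_gib=180, window_hours=1)
print(f"backup adds about {extra:.1f} IOPS, size for about {total:.0f} IOPS")
```

A storage-integrated backup (for example one based on array snapshots) would change this math considerably, which is the point of the integration discussed here.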
In the future, we would probably see a deeper integration between those two elements.
It seems that future has meanwhile become the present.
New backups require new repositories, and new tools
If a new style of data protection is required, it’s somewhat expected that customers are looking for new repositories to save their data into. In the past, tape was the king of backups, and its RTO and RPO capabilities were more than enough to fulfill business requirements for data protection.
In these new scenarios, tape can only be used as a long-term archival solution, and in this role it’s still the best option. But when it comes to primary backup targets, the only two viable solutions are disks and cloud services (again based on disks). Why? Simply because disk-based repositories are the only ones able to guarantee both the new and demanding RPO and RTO values. I see that backup appliances are in second position now, but they are expected to be overtaken by cloud services. To me, it’s a sign that customers are looking for a simpler approach to data protection: the primary target is still going to be a disk-based solution, since it’s the fastest one in terms of RTO, but when it comes to a secondary location for an additional backup copy, customers are probably looking for a solution that does not require up-front investments in hardware, and can instead be consumed “as a service” on a pay-as-you-go basis.
Maybe some Cloud Providers are using those tapes and backup appliances that customers tend to avoid, but the message is not about which solution is preferred; it’s more about how data protection services are going to be consumed.
After choosing the right repository for the new demanding SLAs, what tools will be used to achieve those goals?
This graphic is extremely interesting, and it tells us (both users and data protection vendors) what to look for in the near future. Backup Software is still going to be the main solution, especially in small and medium businesses. In large enterprises, however, we can expect high growth in Storage Replication solutions.
This means two things:
– storage vendors cannot offer solutions without this feature, and the feature itself will be perceived less and less as optional. Customers will ask for it more and more, and vendors will probably be forced to offer it as a standard feature rather than an optional one.
– backup software still plays a huge role, because it can offer granular backup and recovery options, whereas Storage Replication is somewhat limited by the fact that it works at the volume level and cannot identify single workloads. But at the same time, Backup Software needs to work in conjunction with the storage and its replication features. Again, a confirmation of the need for Backup and Storage integration.
Finally, an interesting note on Storage Snapshots: the demand for them is now at the same level as Replication, but in the future they will probably be less in demand. Hopefully, customers are finally understanding that snapshots are NOT backups, and that saving data inside the same storage it is supposed to protect is a stupid solution.
Final notes
I liked this research from Nimble Storage very much, and I was fascinated to see how some of the ideas I have been talking about for years are finally becoming common in the IT community.
I agree with many: backups were boooring. When we only had file-level backup software sending data to tapes, there wasn’t really a data protection story to tell. Today, with modern data protection solutions, demanding environments to protect, and several effective ways to meet data protection requirements, backups are probably interesting again.