In Part 1 of this series I introduced you to Ceph, the open-source scale-out object storage. In this Part 2, I’m going to explain the architecture of Ceph. It took me a while to fully understand how Ceph is built, what all its components are and what their functions are; so I’ll try to give you a really simple explanation, and you won’t have to spend as much time on it as I did, even if its principles are really simple.
Also available in this series:
Part 1: Introduction
Part 3: Design the nodes
Part 4: Deploy the nodes in the Lab
Part 5: Install Ceph in the lab
Part 6: Mount Ceph as a block device on Linux machines
Part 7: Add a node and expand the cluster storage
Part 8: Veeam clustered repository
Part 9: Failover scenarios during Veeam backups
Part 10: Upgrade the cluster
Ceph architecture for dummies (like me)
First of all, credit where credit is due. I had a hard time at the beginning reading all the documentation available on Ceph; many blog posts, and the mailing lists, usually assume you already know about Ceph, so many concepts are taken for granted. After some searching I found this video, and I learned A LOT about the basics of the Ceph architecture. This presentation was made, like the video I linked in Part 1, by Ross Turk from Inktank (now part of RedHat). You can find the original slides on SlideShare here, and this is the recorded video. It’s 37 minutes, and you should really watch it:
This video is simply “perfect” for understanding how Ceph works. I don’t really have to add anything. The guys at Inktank should really put this video on their homepage!
My use case, the block device cluster
As you learned from the video, there are many use cases for a Ceph storage cluster. Anyway, remember what we are trying to achieve from Part 1: to use Ceph as a general purpose storage server, mounted as a block device on one or more Linux machines, where you can drop whatever you have around in your datacenter; in my case, it’s going to be my Veeam Repository for all my backups.
Because of this, in the next chapters we will start designing the Lab around this specific use case: we will deal mainly with the Block Device option of Ceph, so if you are more interested in the Ceph File System or in “pure” Object Storage access to Ceph (librados), you’d better look at other resources. If you refer to the design below, we will work with the RBD (RADOS Block Device) technology of Ceph.
In order to create the cluster, I will use 6+1 different Linux servers (virtual machines in my case, running in my vSphere lab). Each one will run only one role of the Ceph RBD solution:
3 servers will act as OSD (Object Storage Daemon) machines, each holding multiple OSDs, one per disk. 3 additional servers will be my MON (Monitor) servers. Finally, not depicted in this design, there will be the Ceph Admin machine, used to control and configure all the other nodes. These roles can be mixed in different ways: for example, the same server can be both an OSD and a MON at the same time, and the admin console doesn’t need to run on a dedicated machine. But if you want to learn how Ceph really works, and above all learn the best practices to scale it later, in my opinion it’s better to start with the right design choices from the beginning.
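To make that layout concrete, here is a trivial Python sketch of the 6+1 inventory (the hostnames are my own invention for illustration, not something Ceph mandates):

```python
from collections import Counter

# Hypothetical hostnames for the 6+1 node lab: one role per server.
nodes = {
    "ceph-admin": "admin",  # admin machine, controls and configures the nodes
    "ceph-mon1":  "mon",    # Monitors: 3 of them, so the cluster keeps quorum
    "ceph-mon2":  "mon",
    "ceph-mon3":  "mon",
    "ceph-osd1":  "osd",    # OSD hosts: each runs one OSD daemon per data disk
    "ceph-osd2":  "osd",
    "ceph-osd3":  "osd",
}

roles = Counter(nodes.values())
print(roles["osd"], roles["mon"], roles["admin"])  # 3 3 1
```

In a bigger production cluster you would simply add more `osd` entries to this list; the monitors and the admin machine stay put, which is exactly why keeping the roles separate from day one makes scaling easier.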
The final result will be this one:
With RBD, each OSD will hold several 4 MB objects: any file or block entering the cluster is split into those 4 MB chunks, written to different OSDs of the cluster, and then replicated in order to guarantee redundancy. But from outside the cluster, the host mounting RBD sees a single block device. In my case this host will be my Veeam Repository, but you can connect any host you want to the RBD cluster, by simply carving an additional volume out of the cluster and presenting it to the host that will use it (more on this in the next chapters). The host will mount the RBD volume as a local device, thanks to the fact that Ceph is natively supported in any Linux kernel starting from 2.6.39.
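To visualize what happens to the data, here is a toy Python sketch of the idea: an image is split into 4 MB objects, and two copies of each object are spread over the three OSD hosts. This is my own simplification, not Ceph code: real Ceph places objects with the CRUSH algorithm, and the replica count and hostnames below are assumptions for illustration.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024          # RBD default object size: 4 MB
REPLICAS = 2                          # assumed replica count for this sketch
OSD_HOSTS = ["osd1", "osd2", "osd3"]  # hypothetical names for the 3 OSD servers

def split_into_objects(data):
    """Split incoming data into 4 MB objects, as RBD does with an image."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

def place_object(obj_id):
    """Toy deterministic placement: derive REPLICAS distinct hosts from the
    object id. Real Ceph uses CRUSH, which is far smarter than this."""
    h = int(hashlib.md5(str(obj_id).encode()).hexdigest(), 16)
    first = h % len(OSD_HOSTS)
    return [OSD_HOSTS[(first + r) % len(OSD_HOSTS)] for r in range(REPLICAS)]

# A 10 MB "image" becomes three objects: 4 MB + 4 MB + 2 MB.
image = bytes(10 * 1024 * 1024)
objects = split_into_objects(image)
print(len(objects))                      # 3
for i in range(len(objects)):
    copies = place_object(i)
    assert len(set(copies)) == REPLICAS  # each copy lands on a different host
```

The point of the sketch is the consumer’s view: the client writes one big image, while under the hood the cluster scatters many small objects, so losing a single disk never loses the only copy of anything.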
In the next chapter, I will describe in detail the specific design I have in my lab, and some sizing rules.
See you next time!