After the release of the latest version of Ceph, Luminous (v12.2.x), I read all the announcements and blog posts, and based on the list of interesting new features such as BlueStore (about which I blogged in my previous post here), I decided to upgrade the Ceph cluster running in my lab.
Preparation
My cluster currently runs the version called Jewel (v10.2.z). Reading the release notes, I found this scary disclaimer:
“There have been major changes since Kraken (v11.2.z) and Jewel (v10.2.z), and the upgrade process is non-trivial. Please read these release notes carefully.”
Well, I thought, this is going to be really interesting. Luckily, my cluster is a lab deployment, so I can easily lose it and rebuild it if something goes wrong. Nonetheless, I took the upgrade process as seriously as possible, simulating a production environment where the cluster itself HAS to stay up and running during the entire process. After all, it’s a scale-out system, isn’t it?
The upgrade process
There are many steps in the official upgrade guide that you need to follow carefully. I’m repeating them here, but with a different numbering and sequence, as I found this order works better. I have also expanded the text and instructions where they needed to be more verbose, so that they are easier to understand. The obvious recommendation is that, during the entire upgrade process, the cluster is healthy and no creation/edit operations are performed against its configuration.
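Before starting, it is worth confirming that the cluster is indeed healthy. A quick check (nothing specific to this upgrade, just the usual status commands) looks like this:

ceph health
ceph -s

Both should report HEALTH_OK before you proceed with any of the following steps.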
1 – Once logged into the Ceph cluster, you need to ensure that the sortbitwise flag is enabled. You can set it with the command:
# ceph osd set sortbitwise
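If you want to double-check that the flag is actually in place (a quick sanity check, not part of the official guide), the flags line of the OSD map shows it:

ceph osd dump | grep flags

The output should include sortbitwise among the listed flags.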
2 – Set the noout flag for the duration of the upgrade. This instructs Ceph not to rebalance the cluster; while optional, it is recommended so that each time a node is stopped, Ceph does not try to rebalance the cluster by replicating data to the other available nodes. The cluster will run in a partially degraded state during the upgrade, but this is acceptable. If your cluster runs critical applications, you may want to skip this setting and let Ceph rebalance its data during the upgrade itself. To set the flag:
# ceph osd set noout
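With noout set, ceph -s will report a warning similar to "noout flag(s) set" for the whole duration of the upgrade; this is expected and will disappear at step 10. A quick way to confirm it:

ceph -s | grep noout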
3a – For the upgrade process, you can choose either to upgrade each node manually or to use ceph-deploy. I went for the automated mode via ceph-deploy, but in case you want to do it manually, on CentOS you first need to edit the Ceph yum repo to target the Luminous release instead of the Jewel release I was using in my cluster; this requires a simple text replacement:
sed -i 's/jewel/luminous/' /etc/yum.repos.d/ceph.repo
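If you go down the manual path, after changing the repo you also need to refresh yum and update the Ceph packages on each node, one node at a time, and then restart the Ceph daemons on that node (monitors first, then OSDs, following the same order described below). A rough sketch of those commands, which I have not tested since I went with ceph-deploy, would be:

sudo yum clean all
sudo yum update ceph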
3b – Using ceph-deploy, you can upgrade the entire cluster automatically with a few commands. I have already talked about ceph-deploy and I use it extensively, as it’s cumbersome to properly manage even a small Ceph cluster manually, and sooner or later you will mistype some command and damage the cluster. You can also use automation tools like Ansible if you prefer, but I’m not using it for Ceph, so I don’t have any procedure to share. Assuming you are running ceph-deploy from an administration node, as in my lab, you first need to upgrade ceph-deploy itself on this node:
sudo yum install ceph-deploy python-pushy
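You can quickly verify that the new ceph-deploy is in place by checking its version:

ceph-deploy --version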
4 – Once ceph-deploy is upgraded, we first need to upgrade Ceph on this same machine. You will find many commands in the upgrade guide, but at first many of them will return the error:
no valid command found
This is because many new Ceph commands are only available in Luminous. So, we first upgrade the administration node (its name is ceph in my lab):
ceph-deploy install --release luminous ceph
5 – Then, we need to upgrade the monitors. To do so, we run:
ceph-deploy install --release luminous mon1 mon2 mon3
The tool will connect to each monitor, one by one, and upgrade the yum repos, the dependencies, and the Ceph components. There will be a lot of scrolling text, but the important part is that, at the end of each host, you should see these lines:
[mon1][INFO ] Running command: sudo ceph --version
[mon1][DEBUG ] ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
This means that the Ceph packages have been upgraded correctly to Luminous. We can check the versions of the cluster components at any time with one of the new commands, ceph versions:
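ceph versions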
You can see, for example, that one monitor has already been upgraded to Luminous, while the other two are in an unknown state. This is because each monitor needs to be restarted after the upgrade, as I explain in step 6 below.
6 – On each monitor node, we need to restart the service, by issuing the command:
systemctl restart ceph-mon.target
After the restart, the version check gives me a different result:
"mon": { "ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)": 3 },
7 – One new component introduced in Kraken is the Ceph manager (ceph-mgr). The manager runs alongside the monitor daemons to provide additional monitoring and interfaces to external monitoring and management systems. The ceph-mgr daemon was an optional component in the 11.x (Kraken) release, but since 12.x (Luminous) it is required for normal operations. That’s why my Jewel cluster had no manager, and why I first had to install it on each monitor node. To do so, we can run:
ceph-deploy mgr create mon1 mon2 mon3
If you encounter an error like:
RuntimeError: config file /etc/ceph/ceph.conf exists with different content; use --overwrite-conf to overwrite
It’s because the installer wants to write the ceph.conf file as part of the deployment, but the file is obviously already there, since this is an upgrade. I tested the suggested solution, and the file is rewritten with exactly the same parameters, so we can safely run:
ceph-deploy --overwrite-conf mgr create mon1 mon2 mon3
If we then run ceph versions again, we see the three new managers up and running, already at the latest 12.2 version.
We can also see this information if we run ceph -s:
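On Luminous, the output of ceph -s includes a services section where the manager daemons are listed next to the monitors; with my node names, the relevant line should look similar to this:

mgr: mon1(active), standbys: mon2, mon3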
8 – Time to upgrade the OSDs. We use the same command we used before to upgrade the monitor nodes, and we can again upgrade all our OSD nodes at once, using a single command:
ceph-deploy install --release luminous osd1 osd2 osd3 osd4
As explained before, you can use ceph versions or ceph -s to monitor the progress. On each node, once the upgrade is complete, you can restart the ceph-osd daemons with the command:
systemctl restart ceph-osd.target
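After each restart, it’s a good idea to wait until the OSDs of that node are back up and in before moving to the next one; a quick check is:

ceph osd stat

It should report all OSDs as up and in before you continue.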
At the end of the upgrade process, ceph versions should report the same Luminous 12.2 version for all the monitors, managers, and OSDs.
9 – The upgrade itself is complete, since every component is at version 12.2. Now we can disallow pre-Luminous OSDs and enable Luminous-only functionality:
ceph osd require-osd-release luminous
This means that only Luminous OSDs can now be added to the cluster.
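If you want to confirm that the new requirement has been recorded, it should now show up in the OSD map; a quick check like this should display it:

ceph osd dump | grep require_osd_release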
10 – As the last step, we can finally unset the noout flag so that the cluster can rebalance itself when needed:
ceph osd unset noout
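Once the flag is removed, the noout warning disappears and the cluster should go back to a healthy state, which you can confirm with:

ceph health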
Final notes
In the end, the upgrade process was not as difficult as the release notes suggested: I was able to upgrade my entire cluster, made of 7 nodes, in less than one hour. Note, however, that all my OSDs are still based on FileStore. In order to use BlueStore, I need to convert them first. This will be the topic of my next post.