One year ago, I published a series of 10 blog posts called My adventures with Ceph Storage. As I recently had to rebuild my Ceph cluster from scratch, I decided it was time to create a quick guide to build the cluster as fast as possible.
Ceph cluster
I again designed my cluster with one administrator machine, three monitor nodes and four OSD nodes. Each node was built on CentOS 7.0, with a disk layout as explained in Part 3 of the previous series. These are the machines I created:
ceph 10.10.51.200
mon1 10.10.51.201
mon2 10.10.51.202
mon3 10.10.51.203
osd1 10.10.51.211 (10.10.110.211)
osd2 10.10.51.212 (10.10.110.212)
osd3 10.10.51.213 (10.10.110.213)
osd4 10.10.51.214 (10.10.110.214)
Each OSD node has two network connections: the primary one to talk with all the other nodes, and a secondary one dedicated to storage replication.
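For reference, on CentOS 7 the replication interface is configured with a simple ifcfg file; here is a minimal sketch for osd1, assuming the second NIC shows up as ens224 (the interface name is an assumption, adjust it to your hardware):

# /etc/sysconfig/network-scripts/ifcfg-ens224 (hypothetical interface name)
DEVICE=ens224
BOOTPROTO=none
IPADDR=10.10.110.211
NETMASK=255.255.255.0
ONBOOT=yes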
For each node, the sequence of commands to be executed once the machine is up and running is:
useradd -d /home/cephuser -m cephuser
passwd cephuser
echo "cephuser ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephuser
chmod 0440 /etc/sudoers.d/cephuser
yum install -y ntp ntpdate ntp-doc
ntpdate 0.us.pool.ntp.org
hwclock --systohc
systemctl enable ntpd.service
systemctl start ntpd.service
yum install -y open-vm-tools ## If you run the nodes as virtual machines, otherwise remove this line
systemctl disable firewalld
systemctl stop firewalld
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
sed -i 's/Defaults requiretty/#Defaults requiretty/g' /etc/sudoers
yum -y update
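A few optional checks (my addition, not part of the original sequence) to confirm the node ended up in the expected state:

systemctl is-active ntpd              # should print "active"
systemctl is-enabled firewalld        # should print "disabled"
grep ^SELINUX= /etc/selinux/config    # should show SELINUX=disabled; the change applies at the next reboot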
On the admin node
On this machine, log in as cephuser. This is the user you will always use to run any Ceph operation. Run these commands:
ssh-keygen
ssh-copy-id cephuser@osd1
Repeat the ssh-copy-id command for every other node in the cluster.
vi ~/.ssh/config

Host osd1
    Hostname osd1
    User cephuser
Host osd2
    Hostname osd2
    User cephuser
Host osd3
    Hostname osd3
    User cephuser
Host osd4
    Hostname osd4
    User cephuser
Host mon1
    Hostname mon1
    User cephuser
Host mon2
    Hostname mon2
    User cephuser
Host mon3
    Hostname mon3
    User cephuser
chmod 440 ~/.ssh/config
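Not in the original post, but a quick loop to confirm passwordless SSH works from the admin node to every other node:

for host in mon1 mon2 mon3 osd1 osd2 osd3 osd4; do
  ssh $host hostname
done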
On the OSD nodes
Log in to each OSD node and prepare the disks with these commands:
parted -s /dev/sdc mklabel gpt mkpart primary xfs 0% 100%
mkfs.xfs /dev/sdc -f
parted -s /dev/sdd mklabel gpt mkpart primary xfs 0% 100%
mkfs.xfs /dev/sdd -f
parted -s /dev/sde mklabel gpt mkpart primary xfs 0% 100%
mkfs.xfs /dev/sde -f
parted -s /dev/sdb mklabel gpt mkpart primary 0% 33% mkpart primary 34% 66% mkpart primary 67% 100%
Repeat the same commands on each node.
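To double-check the layout before moving on (again, my addition), list the block devices on the node:

lsblk /dev/sdb /dev/sdc /dev/sdd /dev/sde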
Build the cluster
We will build the cluster from the admin node (the “ceph” machine) using the cephuser account. To keep keys and logs in a defined location, we create a dedicated folder:
mkdir ceph-deploy
cd ceph-deploy/
Remember to move into this folder each time you log in to the machine.
Install ceph-deploy and create the new cluster by defining the monitor nodes:
sudo rpm -Uhv http://download.ceph.com/rpm-jewel/el7/noarch/ceph-release-1-1.el7.noarch.rpm
sudo yum update -y && sudo yum install ceph-deploy -y
ceph-deploy new mon1 mon2 mon3
Then create the initial Ceph configuration:
vi ceph.conf
public network = 10.10.51.0/24
cluster network = 10.10.110.0/24

# Choose reasonable numbers for number of replicas and placement groups.
osd pool default size = 2 # Write an object 2 times
osd pool default min size = 1 # Allow writing 1 copy in a degraded state
osd pool default pg num = 256
osd pool default pgp num = 256

# Choose a reasonable crush leaf type
# 0 for a 1-node cluster.
# 1 for a multi node cluster in a single rack
# 2 for a multi node, multi chassis cluster with multiple hosts in a chassis
# 3 for a multi node cluster with hosts across racks, etc.
osd crush chooseleaf type = 1
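If you need to change these settings again after the cluster is deployed, ceph-deploy can push the updated ceph.conf to the nodes; a sketch (the --overwrite-conf flag replaces the copy already present on each node):

ceph-deploy --overwrite-conf config push mon1 mon2 mon3 osd1 osd2 osd3 osd4

With the configuration in place, install Ceph on all the nodes, create the initial monitors and gather the keys: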
ceph-deploy install ceph mon1 mon2 mon3 osd1 osd2 osd3 osd4
ceph-deploy mon create-initial
ceph-deploy gatherkeys mon1
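To verify the monitors have formed a quorum before continuing (an extra step of mine, not from the original guide), you can query a monitor through its admin socket from that monitor node; the monitor id is normally the short hostname:

ssh mon1
sudo ceph daemon mon.mon1 mon_status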
Then, create the OSD disks on each OSD node:
ceph-deploy disk zap osd1:sdc osd1:sdd osd1:sde
ceph-deploy osd create osd1:sdc:/dev/sdb1 osd1:sdd:/dev/sdb2 osd1:sde:/dev/sdb3
Repeat these two commands on each OSD node.
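If you prefer, the whole loop can be run once from the admin node; a sketch, assuming every OSD node has the same sdb/sdc/sdd/sde layout prepared above:

for node in osd1 osd2 osd3 osd4; do
  ceph-deploy disk zap ${node}:sdc ${node}:sdd ${node}:sde
  ceph-deploy osd create ${node}:sdc:/dev/sdb1 ${node}:sdd:/dev/sdb2 ${node}:sde:/dev/sdb3
done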
NOTE: if, after the creation of the cluster, the OSDs are all down and cannot be started, you have probably hit an issue I found with the latest ceph-deploy version. You can solve it by running:
sgdisk -t 1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
sgdisk -t 2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
sgdisk -t 3:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
on each OSD node and then rebooting it (these commands set the journal partitions to the GPT type GUID that Ceph’s udev rules expect, so the OSDs are activated at boot). There is also an issue with the systemd units of the Ceph monitor services, as they are not enabled by default. To solve it, go to each monitor and run:
systemctl enable ceph-mon.target
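To confirm the fix on a monitor (my addition; the ceph-mon@<id> instance name is normally the node’s short hostname), you can check the units, for example on mon1:

sudo systemctl is-enabled ceph-mon.target
sudo systemctl status ceph-mon@mon1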
Finally, deploy the management keys to all the nodes:
ceph-deploy admin ceph mon1 mon2 mon3 osd1 osd2 osd3 osd4
sudo chmod +r /etc/ceph/ceph.client.admin.keyring
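Note that the chmod only affects the machine you run it on; if you also want to run ceph commands directly from the other nodes, you can repeat it over SSH, for example:

for host in mon1 mon2 mon3 osd1 osd2 osd3 osd4; do
  ssh $host sudo chmod +r /etc/ceph/ceph.client.admin.keyring
done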
If everything worked without errors, you will see the new cluster with ceph -v and ceph -s:
ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)

    cluster cc15a4bb-b4f7-4f95-af17-5b3e796cb8d5
     health HEALTH_OK
     monmap e2: 3 mons at {mon1=10.10.51.201:6789/0,mon2=10.10.51.202:6789/0,mon3=10.10.51.203:6789/0}
            election epoch 614, quorum 0,2 mon1,mon3
     osdmap e112: 12 osds: 12 up, 12 in
            flags sortbitwise
      pgmap v321: 256 pgs, 1 pools, 0 bytes data, 0 objects
            430 MB used, 1198 GB / 1199 GB avail
                 256 active+clean
We have a working Ceph cluster using the latest version 10.2 (Jewel, as of May 2016).
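As a final smoke test (not part of the original walkthrough; the pool and object names below are just examples), you can create a small pool, store and read back an object, and remove the pool again:

ceph osd pool create testpool 128
echo "hello ceph" > /tmp/hello.txt
rados -p testpool put hello-object /tmp/hello.txt
rados -p testpool ls
ceph osd pool delete testpool testpool --yes-i-really-really-mean-it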