One year ago, I published a series of 10 blog posts called My adventures with Ceph Storage. As I recently had to rebuild my Ceph cluster from scratch, I decided it was time to create a quick guide to build the cluster as fast as possible.
Ceph cluster
I again designed my cluster with one administrator machine, three monitor nodes and four OSD nodes. Each node was built on CentOS 7.0, with a disk layout as explained in Part 3 of the previous series. These are the machines I created:
ceph 10.10.51.200
mon1 10.10.51.201
mon2 10.10.51.202
mon3 10.10.51.203
osd1 10.10.51.211 (10.10.110.211)
osd2 10.10.51.212 (10.10.110.212)
osd3 10.10.51.213 (10.10.110.213)
osd4 10.10.51.214 (10.10.110.214)
Each OSD node has two network connections: the primary one to talk with all the other nodes, and a secondary one dedicated to storage replication.
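For reference, on CentOS 7 the replication interface is configured with a simple ifcfg file; here is a minimal sketch for osd1, assuming the second NIC shows up as ens224 (the interface name is an assumption, adjust it to your hardware):

# /etc/sysconfig/network-scripts/ifcfg-ens224 (hypothetical interface name)
DEVICE=ens224
BOOTPROTO=none
IPADDR=10.10.110.211
NETMASK=255.255.255.0
ONBOOT=yes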
For each node, the sequence of commands to be executed once the machine is up and running is:
useradd -d /home/cephuser -m cephuser
passwd cephuser
echo "cephuser ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/cephuser
chmod 0440 /etc/sudoers.d/cephuser
yum install -y ntp ntpdate ntp-doc
ntpdate 0.us.pool.ntp.org
hwclock --systohc
systemctl enable ntpd.service
systemctl start ntpd.service
yum install -y open-vm-tools ## If you run the nodes as virtual machines, otherwise remove this line
systemctl disable firewalld
systemctl stop firewalld
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
sed -i 's/Defaults requiretty/#Defaults requiretty/g' /etc/sudoers
yum -y update
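A few optional checks (my addition, not part of the original sequence) to confirm the node ended up in the expected state:

systemctl is-active ntpd              # should print "active"
systemctl is-enabled firewalld        # should print "disabled"
grep ^SELINUX= /etc/selinux/config    # should show SELINUX=disabled; the change applies at the next reboot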
On the admin node
On this machine, log in as cephuser. This is the user you will always use to run any Ceph operation. Run these commands:
ssh-keygen
ssh-copy-id cephuser@osd1
Repeat the ssh-copy-id command for every other node in the cluster.
vi ~/.ssh/config

Host osd1
    Hostname osd1
    User cephuser
Host osd2
    Hostname osd2
    User cephuser
Host osd3
    Hostname osd3
    User cephuser
Host osd4
    Hostname osd4
    User cephuser
Host mon1
    Hostname mon1
    User cephuser
Host mon2
    Hostname mon2
    User cephuser
Host mon3
    Hostname mon3
    User cephuser
chmod 440 ~/.ssh/config
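Not in the original post, but a quick loop to confirm passwordless SSH works from the admin node to every other node:

for host in mon1 mon2 mon3 osd1 osd2 osd3 osd4; do
  ssh $host hostname
done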
On the OSD nodes
Log in to each OSD node and prepare the disks with these commands:
parted -s /dev/sdc mklabel gpt mkpart primary xfs 0% 100%
mkfs.xfs /dev/sdc -f
parted -s /dev/sdd mklabel gpt mkpart primary xfs 0% 100%
mkfs.xfs /dev/sdd -f
parted -s /dev/sde mklabel gpt mkpart primary xfs 0% 100%
mkfs.xfs /dev/sde -f
parted -s /dev/sdb mklabel gpt mkpart primary 0% 33% mkpart primary 34% 66% mkpart primary 67% 100%
Repeat the same commands on each node.
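To double-check the layout before moving on (again, my addition), list the block devices on the node:

lsblk /dev/sdb /dev/sdc /dev/sdd /dev/sde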
Build the cluster
We will build the cluster from the admin node (the “ceph” machine) using the cephuser account. To keep keys and logs in a defined location, we create a dedicated folder:
mkdir ceph-deploy
cd ceph-deploy/
Remember to move into this folder each time you log in to the machine.
Install ceph-deploy and create the new cluster by defining the monitor nodes:
sudo rpm -Uhv http://download.ceph.com/rpm-jewel/el7/noarch/ceph-release-1-1.el7.noarch.rpm
sudo yum update -y && sudo yum install ceph-deploy -y
ceph-deploy new mon1 mon2 mon3
Then create the initial Ceph configuration:
vi ceph.conf
public network = 10.10.51.0/24
cluster network = 10.10.110.0/24

# Choose reasonable numbers for number of replicas and placement groups.
osd pool default size = 2 # Write an object 2 times
osd pool default min size = 1 # Allow writing 1 copy in a degraded state
osd pool default pg num = 256
osd pool default pgp num = 256

# Choose a reasonable crush leaf type
# 0 for a 1-node cluster.
# 1 for a multi node cluster in a single rack
# 2 for a multi node, multi chassis cluster with multiple hosts in a chassis
# 3 for a multi node cluster with hosts across racks, etc.
osd crush chooseleaf type = 1
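If you need to change these settings again after the cluster is deployed, ceph-deploy can push the updated ceph.conf to the nodes; a sketch (the --overwrite-conf flag replaces the copy already present on each node):

ceph-deploy --overwrite-conf config push mon1 mon2 mon3 osd1 osd2 osd3 osd4

With the configuration in place, install Ceph on all the nodes, create the initial monitors and gather the keys: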
ceph-deploy install ceph mon1 mon2 mon3 osd1 osd2 osd3 osd4
ceph-deploy mon create-initial
ceph-deploy gatherkeys mon1
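To verify the monitors have formed a quorum before continuing (an extra step of mine, not from the original guide), you can query a monitor through its admin socket from that monitor node; the monitor id is normally the short hostname:

ssh mon1
sudo ceph daemon mon.mon1 mon_status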
Then, create the OSD disks on each OSD node:
ceph-deploy disk zap osd1:sdc osd1:sdd osd1:sde
ceph-deploy osd create osd1:sdc:/dev/sdb1 osd1:sdd:/dev/sdb2 osd1:sde:/dev/sdb3
Repeat these two commands on each OSD node.
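If you prefer, the whole loop can be run once from the admin node; a sketch, assuming every OSD node has the same sdb/sdc/sdd/sde layout prepared above:

for node in osd1 osd2 osd3 osd4; do
  ceph-deploy disk zap ${node}:sdc ${node}:sdd ${node}:sde
  ceph-deploy osd create ${node}:sdc:/dev/sdb1 ${node}:sdd:/dev/sdb2 ${node}:sde:/dev/sdb3
done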
NOTE: if, after the creation of the cluster, the OSDs are all down and cannot be started, you have probably hit an issue I found with the latest ceph-deploy version. You can solve it by running:
sgdisk -t 1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
sgdisk -t 2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
sgdisk -t 3:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
on each OSD node and then rebooting it (these commands set the journal partitions to the GPT type GUID that Ceph’s udev rules expect, so the OSDs are activated at boot). There is also an issue with the systemd units of the Ceph monitor services, as they are not enabled by default. To solve it, go to each monitor and run:
systemctl enable ceph-mon.target
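To confirm the fix on a monitor (my addition; the ceph-mon@<id> instance name is normally the node’s short hostname), you can check the units, for example on mon1:

sudo systemctl is-enabled ceph-mon.target
sudo systemctl status ceph-mon@mon1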
Finally, deploy the management keys to all the nodes:
ceph-deploy admin ceph mon1 mon2 mon3 osd1 osd2 osd3 osd4
sudo chmod +r /etc/ceph/ceph.client.admin.keyring
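Note that the chmod only affects the machine you run it on; if you also want to run ceph commands directly from the other nodes, you can repeat it over SSH, for example:

for host in mon1 mon2 mon3 osd1 osd2 osd3 osd4; do
  ssh $host sudo chmod +r /etc/ceph/ceph.client.admin.keyring
done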
If everything worked without errors, you will see the new cluster with ceph -v and ceph -s:
ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)

    cluster cc15a4bb-b4f7-4f95-af17-5b3e796cb8d5
     health HEALTH_OK
     monmap e2: 3 mons at {mon1=10.10.51.201:6789/0,mon2=10.10.51.202:6789/0,mon3=10.10.51.203:6789/0}
            election epoch 614, quorum 0,2 mon1,mon3
     osdmap e112: 12 osds: 12 up, 12 in
            flags sortbitwise
      pgmap v321: 256 pgs, 1 pools, 0 bytes data, 0 objects
            430 MB used, 1198 GB / 1199 GB avail
                 256 active+clean
We have a working Ceph cluster using the latest version 10.2 (Jewel, as of May 2016).
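As a final smoke test (not part of the original walkthrough; the pool and object names below are just examples), you can create a small pool, store and read back an object, and remove the pool again:

ceph osd pool create testpool 128
echo "hello ceph" > /tmp/hello.txt
rados -p testpool put hello-object /tmp/hello.txt
rados -p testpool ls
ceph osd pool delete testpool testpool --yes-i-really-really-mean-it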