In my previous post, I talked about BTRFS, a modern and exciting filesystem for Linux. In this new post, I'm going to give you a quick walkthrough of what you can do with it.
Create a new BTRFS filesystem
In my lab, I’ve created a CentOS 7 virtual machine with 4 disks:
[root@linux-repo ~]# lsscsi -s
[2:0:0:0] disk VMware Virtual disk 1.0 /dev/sda 17.1GB
[2:0:1:0] disk VMware Virtual disk 1.0 /dev/sdb 75.1GB
[2:0:2:0] disk VMware Virtual disk 1.0 /dev/sdc 75.1GB
[2:0:3:0] disk VMware Virtual disk 1.0 /dev/sdd 75.1GB
The first disk, sda, contains the operating system and was automatically formatted with ext4 by the CentOS installer. The other 3 disks are completely empty. One great feature of btrfs is that you can create btrfs file systems directly on unformatted hard drives, so you don't have to partition them in advance with tools like fdisk. To create a btrfs file system spanning /dev/sdb, /dev/sdc, and /dev/sdd, we simply run:
mkfs.btrfs /dev/sdb /dev/sdc /dev/sdd
Without any additional switches, the defaults are RAID0 for data and RAID1 for metadata; mkfs.btrfs also accepts the -d and -m options if you want to pick different profiles for data and metadata at creation time.
[root@linux-repo ~]# mkfs.btrfs /dev/sdb /dev/sdc /dev/sdd
Btrfs v3.16.2
See http://btrfs.wiki.kernel.org for more information.
Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
adding device /dev/sdc id 2
adding device /dev/sdd id 3
fs created label (null) on /dev/sdb
nodesize 16384 leafsize 16384 sectorsize 4096 size 210.00GiB
As you can see, the 3 disks, each with 70 GiB of space, were grouped to create a new filesystem with 210 GiB of space. As usual, the filesystem can be mounted manually with:
mount /dev/sdb /mnt/btrfs/
Note that we passed just one of the disks of the btrfs stripe to the mount command. It doesn't really matter which one we use: btrfs automatically reads the configuration of the entire tree, so the whole filesystem gets mounted and we can check its availability:
[root@linux-repo ~]# df -h /dev/sdb
Filesystem Size Used Avail Use% Mounted on
/dev/sdb 210G 18M 207G 1% /mnt/btrfs
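By the way, the mount can also be made persistent with a normal /etc/fstab entry. Any member device works, but referencing the filesystem UUID (the one btrfs fi show reports for this filesystem) is more robust, since device names can change between boots:

```
# /etc/fstab entry for the btrfs filesystem created above
UUID=421171a1-bb75-40a4-9e48-527be91dc143  /mnt/btrfs  btrfs  defaults  0 0
```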
and status:
# btrfs fi df /mnt/btrfs/
Data, RAID0: total=3.00GiB, used=1.00MiB
Data, single: total=8.00MiB, used=0.00
System, RAID1: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=1.00GiB, used=112.00KiB
Metadata, single: total=8.00MiB, used=0.00
GlobalReserve, single: total=16.00MiB, used=0.00
As you can see, data is configured as RAID0, and in fact df reports 210GB as the total space (3 times 70GB), while metadata is in RAID1. If we would rather have the 3 disks act as a RAID1 device instead of RAID0, we need to change the RAID profile. This can be done with the command:
[root@linux-repo ~]# btrfs fi balance start -dconvert=raid1 /mnt/btrfs
Done, had to relocate 2 out of 6 chunks
[root@linux-repo ~]# btrfs fi df /mnt/btrfs/
Data, RAID1: total=2.00GiB, used=512.00KiB
System, RAID1: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, RAID1: total=1.00GiB, used=112.00KiB
Metadata, single: total=8.00MiB, used=0.00
GlobalReserve, single: total=16.00MiB, used=0.00
Now data is configured as RAID1 too. And if we check the mount point:
[root@linux-repo ~]# df -h /dev/sdb
Filesystem Size Used Avail Use% Mounted on
/dev/sdb 105G 18M 70G 1% /mnt/btrfs
we can see that the size has become 105GB, half the original space. RAID1 is already applied and enforced, and the best thing is that we did the conversion without ever unmounting the filesystem! Among the balance options, dconvert converts the data (d) profile, and you can likewise use mconvert to change the protection level of the metadata.
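As a quick sanity check, the sizes df reports match simple arithmetic on the disks from the lsscsi output above:

```shell
# Usable data capacity in GiB for three 70 GiB disks
disks=3
size_gib=70
raid0=$((disks * size_gib))        # RAID0 stripes data across all disks
raid1=$((disks * size_gib / 2))    # RAID1 stores every chunk twice
echo "RAID0: ${raid0} GiB, RAID1: ${raid1} GiB"
```

which gives exactly the 210G we saw before the conversion and the 105G we see now.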
Add another device
Now, if at some point we run out of space, BTRFS can also expand the filesystem dynamically. First, though, we need a way to fill a filesystem for testing: I've found a page with a nice bash script that generates multiple files of random size, controlled by a few parameters at the top of the script. I'm copying its key lines here in case the page disappears in the future. Be careful with the MAXSIZE parameter: it's in bytes, so you want to use a large number, otherwise it will take ages to fill the disk. In my modified version, I'm using up to 100MB per file.
TOP=`pwd | tr -cd '/' | wc -c`
files=$(($RANDOM*$MAXFILES/32767))
size=$(($RANDOM*$MAXSIZE/32767))
head -c $size /dev/urandom > $f
depth=`pwd | tr -cd '/' | wc -c`
if [ $(($depth-$TOP)) -ge $MAXDEPTH ]
dirs=$(($RANDOM*$MAXDIRS/32767))
dirlist="$dirlist${dirlist:+ }$PWD/$d"
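Putting those pieces together, a minimal self-contained version of such a filler script could look like this; the parameter names follow the lines above, but the overall structure is my reconstruction, not the original code:

```shell
#!/bin/bash
# fillfs.sh - fill a directory tree with files of random size.
# Reconstruction sketch: parameter names match the excerpt above,
# the structure itself is an assumption, not the original script.
MAXDEPTH=${MAXDEPTH:-4}        # how many directory levels to descend
MAXFILES=${MAXFILES:-50}       # max number of files per directory
MAXDIRS=${MAXDIRS:-5}          # max number of subdirectories per directory
MAXSIZE=${MAXSIZE:-104857600}  # max file size in bytes (here up to 100 MB)

TOP=$(pwd | tr -cd '/' | wc -c)    # nesting depth of the starting directory

fill_dir() {
    # create a random number of files, each with a random size
    local files=$((RANDOM * MAXFILES / 32767)) i size
    for ((i = 1; i <= files; i++)); do
        size=$((RANDOM * MAXSIZE / 32767))
        head -c "$size" /dev/urandom > "file$i" || return  # stop when full
    done
    # stop recursing once we are MAXDEPTH levels below the start
    local depth
    depth=$(pwd | tr -cd '/' | wc -c)
    [ $((depth - TOP)) -ge "$MAXDEPTH" ] && return
    # descend into a random number of subdirectories
    local dirs=$((RANDOM * MAXDIRS / 32767)) d
    for ((d = 1; d <= dirs; d++)); do
        mkdir -p "dir$d" && (cd "dir$d" && fill_dir)
    done
}

# fill the directory passed as the first argument
if [ -n "$1" ]; then
    cd "$1" && fill_dir
fi
```

Pointed at the mount point, for example ./fillfs.sh /mnt/btrfs, each invocation adds another batch of random files and directories.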
We just need to save this script on the btrfs filesystem and run it multiple times until the filesystem is completely filled:
head: write error: No space left on device
head: write error
At this point, the filesystem is completely full:
[root@linux-repo btrfs]# df -h /dev/sdb
Filesystem Size Used Avail Use% Mounted on
/dev/sdb 105G 105G 64K 100% /mnt/btrfs
Now, to fix the problem and regain some free space, first we need to add a new device to the virtual machine:
[root@linux-repo ~]# lsscsi -s
[2:0:0:0] disk VMware Virtual disk 1.0 /dev/sda 17.1GB
[2:0:1:0] disk VMware Virtual disk 1.0 /dev/sdb 75.1GB
[2:0:2:0] disk VMware Virtual disk 1.0 /dev/sdc 75.1GB
[2:0:3:0] disk VMware Virtual disk 1.0 /dev/sdd 75.1GB
[2:0:4:0] disk VMware Virtual disk 1.0 /dev/sde 75.1GB
Then, we add /dev/sde to the btrfs filesystem:
[root@linux-repo ~]# btrfs device add /dev/sde /mnt/btrfs/
The disk is now part of the btrfs tree:
[root@linux-repo btrfs]# btrfs fi show
Label: none uuid: 421171a1-bb75-40a4-9e48-527be91dc143
Total devices 4 FS bytes used 104.14GiB
devid 1 size 70.00GiB used 70.00GiB path /dev/sdb
devid 2 size 70.00GiB used 70.00GiB path /dev/sdc
devid 3 size 70.00GiB used 70.00GiB path /dev/sdd
devid 4 size 70.00GiB used 6.00MiB path /dev/sde
Btrfs v3.16.2
And we immediately see the new size and available space:
[root@linux-repo btrfs]# df -h /dev/sdb
Filesystem Size Used Avail Use% Mounted on
/dev/sdb 140G 105G 5.9M 100% /mnt/btrfs
What we are still missing is a balanced RAID1 distribution of the chunks: the total size of the volume is indeed now 140G (4 disks of 70G each, halved by the RAID1 profile), but as you can see from the previous output, the first three disks are completely full while the new one is empty, and the effective free space is only 5.9M. BTRFS can rebalance the chunk distribution, and this can be done incrementally using the dusage option: a value of 5, for example, means btrfs will only try to relocate chunks with less than 5% usage:
btrfs fi balance start -dusage=5 /mnt/btrfs/
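The incremental passes are easy to script with a small loop; the mount point and the threshold steps here are my own choice, and the commands are only printed as a dry run (drop the echo to actually execute them):

```shell
# Dry-run sketch: print one balance pass per -dusage threshold,
# raising the threshold a step at a time
MNT=/mnt/btrfs
for usage in 5 10 25 50 75 100; do
    echo btrfs fi balance start -dusage=$usage $MNT
done
```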
You may want to increase the value at each run; you will see new chunks being rebalanced every time. If you can free up just a little bit of space, you can even go straight to 100. You can also monitor the progress with a command like this:
[root@linux-repo ~]# while :; do btrfs balance status -v /mnt/btrfs; sleep 60; done
Balance on '/mnt/btrfs' is running
10 out of about 77 chunks balanced (16 considered), 87% left
Dumping filters: flags 0x1, state 0x1, force is off
DATA (flags 0x2): balancing, usage=100
After some time, the final result will be like this:
[root@linux-repo btrfs]# btrfs balance start -v -dusage=100 /mnt/btrfs
Dumping filters: flags 0x1, state 0x0, force is off
DATA (flags 0x2): balancing, usage=100
Done, had to relocate 77 out of 109 chunks
[root@linux-repo btrfs]# btrfs fi show
Label: none uuid: 421171a1-bb75-40a4-9e48-527be91dc143
Total devices 4 FS bytes used 98.44GiB
devid 1 size 70.00GiB used 50.99GiB path /dev/sdb
devid 2 size 70.00GiB used 49.00GiB path /dev/sdc
devid 3 size 70.00GiB used 49.01GiB path /dev/sdd
devid 4 size 70.00GiB used 50.99GiB path /dev/sde
Btrfs v3.16.2
See? Now each disk is consuming the same amount of space. The balance can take some time, depending on the number of chunks that need to be relocated and the speed of the disks. And before you ask: yes, btrfs can do the rebalancing automatically if you have at least version 3.18; sadly, CentOS is still at version 3.16…
Final notes
These are just a few examples of what you can do with BTRFS, but the new filesystem and its tools allow for many more operations.
There are many tutorials all over the Internet, and the best way to learn more about BTRFS is to build a small virtual machine like I did to play with it.