
BtrFS as ongoing project

BtrFS is still an ongoing project for me, but whether it will become a production platform for me soon is the question. Playing with mirroring at the BtrFS level made me wonder even more, as it calculates storage usage a little differently. Normally with mirroring you see the storage you can allocate and what has been allocated; with BtrFS you see the total amount of storage available on all disks combined, as shown in the example below.

$ sudo btrfs filesystem df /mnt
Data, RAID1: total=5.98GB, used=5.36GB
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=256.00MB, used=6.01MB
$ df -h /mnt
Filesystem                Size  Used Avail Use% Mounted on
/dev/mapper/vg01-btrfsm1   16G   11G  4.8G  70% /mnt

I really like ZFS, but I really wonder if BtrFS can replace it. For now I see too many drawbacks in how BtrFS has been implemented and in how distributions may use it. Maybe when Debian 8 is in testing it will be a better time to give BtrFS another chance, but swap space and encrypted file systems are still problems that need to be tackled.

Switching from VirtualBox to KVM (maybe)

I have been a VirtualBox user for a long time, but since I'm now looking more closely at BtrFS I also took a closer look at what is in $HOME. VirtualBox hard disks and ISO images make up a large chunk of it, so maybe the time has come to look at a different solution. One of the plans is to move virtual machines to a dedicated machine instead of running some on my workstation when I need them. This would give me more options for longer experiments, as my personal data then doesn't have to share the same encrypted volume with the virtual machines.
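Moving an existing VirtualBox disk over to KVM can be sketched with qemu-img; the image name below is just an example, not one of my actual machines.

```shell
# Convert a VirtualBox VDI image to qcow2 for use with KVM/libvirt.
# qcow2 gives sparse storage and supports internal snapshots.
qemu-img convert -f vdi -O qcow2 debian6.vdi debian6.qcow2

# Inspect the result (format, virtual size, size on disk).
qemu-img info debian6.qcow2
```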

As VirtualBox is mainly a desktop solution, the other options are Xen and KVM for now. I picked KVM as it ships with RHEL 6 and has been part of the vanilla Linux kernel since 2007. There is also a nice (remote) management solution, and closer integration in GNOME 3.4 in the form of GNOME Boxes. So the time has come to give it a go, and first we create a line in /etc/fstab to mount the BtrFS subvolume.

LABEL=datavol	/var/lib/libvirt	btrfs	defaults,relatime,nodiratime,subvol=libvirt	0	0

Now we create the BtrFS subvolume and mount it. Afterwards we install all required software and make our user a member of the right group; note that one needs to log out and log in again for the group change to take effect. These rights are only needed when doing local maintenance.

$ sudo btrfs subvolume create /media/btrfs-datavol/libvirt
$ sudo mount /var/lib/libvirt
$ sudo apt-get install qemu-kvm virt-manager virt-viewer virtinst
$ sudo usermod -a -G libvirt <username>

The machine is now able to run virtual machines if it has a CPU with Intel VT or AMD-V technology, and the first tests with Debian 6.0, Solaris 11 and Windows 7 looked very promising. The management interface is very clean, and for people who have worked with Solaris Containers the command-line tool virsh is also a good option. One thing that seems to be missing is a storage snapshot option as in VirtualBox, but I doubt it is a real miss, as most images are on BtrFS and BtrFS supports snapshots at the subvolume level.
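Whether the CPU offers those extensions can be checked from userland: the vmx flag means Intel VT-x, svm means AMD-V.

```shell
# Count the CPU flags that indicate hardware virtualization support;
# one matching line per logical CPU, 0 means the kvm module is of no use.
grep -E -c '(vmx|svm)' /proc/cpuinfo
```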

For now KVM appears to be a good and free alternative to VirtualBox and VMware. It may need some more love in the future, but for now it deserves more testing from my side, together with SELinux for stronger separation of virtual machines. Maybe I can say goodbye to DKMS recompiling the VirtualBox modules with every release, and to the Qt toolkit as a dependency of VirtualBox, switching back to the default GTK toolkit on my machine.

No smooth transition in Debian

Bug report 638019 appears to be very straightforward, until the code finally hit Debian Testing last weekend. A simple relocation of a FIFO buffer from /dev to /run caused immediate trouble for machines with systemd, and a normal shutdown wasn't possible anymore. Bugs 657979 and 657990 are both results of the modification. Seeing the overview of affected files made me go back to the previous working release of the source package sysvinit with the following commands:

$ cd `xdg-user-dir DOWNLOAD`
$ wget http://snapshot.debian.org/archive/debian/20111223T034013Z/pool/main/s/sysvinit/bootlogd_2.88dsf-18_amd64.deb
$ wget http://snapshot.debian.org/archive/debian/20111223T034013Z/pool/main/s/sysvinit/initscripts_2.88dsf-18_amd64.deb
$ wget http://snapshot.debian.org/archive/debian/20111223T034013Z/pool/main/s/sysvinit/sysv-rc_2.88dsf-18_all.deb
$ wget http://snapshot.debian.org/archive/debian/20111223T034013Z/pool/main/s/sysvinit/sysvinit-utils_2.88dsf-18_amd64.deb
$ wget http://snapshot.debian.org/archive/debian/20111223T034013Z/pool/main/s/sysvinit/sysvinit_2.88dsf-18_amd64.deb
$ sudo dpkg -i bootlogd_2.88dsf-18_amd64.deb initscripts_2.88dsf-18_amd64.deb sysvinit_2.88dsf-18_amd64.deb sysvinit-utils_2.88dsf-18_amd64.deb sysv-rc_2.88dsf-18_all.deb

And as there is no solution for now, except a dependency change for systemd, the packages are being placed on hold, like the last time they broke systemd.

$ echo "bootlogd hold" | sudo dpkg --set-selections
$ echo "initscripts hold" | sudo dpkg --set-selections
$ echo "sysvinit hold" | sudo dpkg --set-selections
$ echo "sysvinit-utils hold" | sudo dpkg --set-selections
$ echo "sysv-rc hold" | sudo dpkg --set-selections
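The same five holds can be set in one loop, and the result verified straight from the dpkg selections database:

```shell
# Put all sysvinit-related packages on hold in one go.
for pkg in bootlogd initscripts sysvinit sysvinit-utils sysv-rc; do
    echo "$pkg hold" | sudo dpkg --set-selections
done

# List everything currently on hold to double-check.
dpkg --get-selections | awk '$2 == "hold" { print $1 }'
```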

It may sound strange to Linux people, but I really wished I had an alternative boot environment like Solaris has. Maybe this is a reason for me to invest more time in read-write snapshots within BtrFS.

BtrFS and readonly snapshots

In a previous posting I started with BtrFS, and as mentioned there, BtrFS supports snapshotting. With it you can create a point-in-time copy of a subvolume, and even create a clone that can be used as a new working subvolume. To start we need the real root of the BtrFS volume, which can always be identified as subvolid 0. This matters because the default volume to be mounted can be changed to a subvolume instead of the real root of the BtrFS volume. We start by updating /etc/fstab so we can mount the BtrFS volume.

LABEL=datavol	/home	btrfs	defaults,subvol=home	0	0
LABEL=datavol	/media/btrfs-datavol	btrfs	defaults,noauto,subvolid=0	0	0

As /media is a temporary file system, meaning it is recreated on every reboot, we need to create a mountpoint for the BtrFS volume before mounting it. After that we create two read-only snapshots with a small delay in between. As there is currently no naming convention for snapshots, I adopted the ZFS naming schema with the @-sign as separator between the subvolume name and the timestamp.
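The timestamp itself is generated with date; watch out for the easy mix-up that %m is the month and %M the minutes.

```shell
# Compose a ZFS-style snapshot name: <subvolume>@<YYYYmmdd-HHMMSS-zone>.
stamp=$(date "+%Y%m%d-%H%M%S-%Z")
echo "home@${stamp}"
```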

$ sudo mkdir -m 0755 /media/btrfs-datavol
$ sudo mount /media/btrfs-datavol
$ cd /media/btrfs-datavol
$ sudo btrfs subvolume snapshot -r home home\@`date "+%Y%m%d-%H%M%S-%Z"`
Create a readonly snapshot of 'home' in './home@20120121-084709-CET'
...
$ sudo btrfs subvolume snapshot -r home home\@`date "+%Y%m%d-%H%M%S-%Z"`
Create a readonly snapshot of 'home' in './home@20120121-084731-CET'
$ ls -l
total 0
drwxr-xr-x 1 root root 52 Nov 21  2010 home
drwxr-xr-x 1 root root 52 Nov 21  2010 home@20120121-084709-CET
drwxr-xr-x 1 root root 52 Nov 21  2010 home@20120121-084731-CET

We now have two read-only snapshots, so let's test whether they really are read-only subvolumes. Creating a new file shouldn't be possible.

$ sudo touch home@20120121-084709-CET/test.txt
touch: cannot touch `home@20120121-084709-CET/test.txt': Read-only file system

Creating snapshots is fun and handy for migrations or as an on-disk backup solution, but they do consume space, as the deltas between snapshots are kept on disk. This means changes stay on disk even after you remove files from the current subvolume; to actually free disk space you not only have to remove the data from the current subvolume, but also delete the older snapshots that still contain it.

$ sudo btrfs subvolume delete home@20120121-084709-CET
Delete subvolume '/media/btrfs-datavol/home@20120121-084709-CET'
$ ls -l
total 0
drwxr-xr-x 1 root root 52 Nov 21  2010 home
drwxr-xr-x 1 root root 52 Nov 21  2010 home@20120121-084731-CET

As a last step we unmount the BtrFS volume again. This is where ZFS and BtrFS differ too much for my taste: to create and access snapshots on ZFS, the zpool doesn't need to be mounted. Then again, with the first few releases of ZFS the zpool needed to be mounted as well, so there is still hope, as BtrFS is still under development.

$ sudo umount /media/btrfs-datavol

Seeing what is possible with BtrFS, something like Sun's TimeSlider becomes an option, as do Live Upgrades with rollbacks as Solaris 11 offers them. For that, though, BtrFS with read-write snapshots needs to be tested in the near future.

First steps with BtrFS

After using ZFS on Solaris, I missed its features on Linux, and with no chance of ZFS coming to Linux I had to make do with MD and LVM. At least until BtrFS became mature enough, and since Linux 3.0 that time has slowly come. With Linux 3.0 BtrFS supports autodefragmentation and scrubbing of volumes. The latter is maybe the most important feature of both ZFS and BtrFS, as it can be used to actively scan the data on disk for errors.

My first tests with BtrFS were in a virtual machine a long time ago, when the userland tools were still in development. Nowadays the command btrfs follows the path set by Sun Microsystems and basically combines what the commands zfs and zpool do for ZFS. But nothing compares to a test in the real world, so I broke a mirror and created a BtrFS volume with the name datavol:

$ sudo mkfs.btrfs -L 'datavol' /dev/sdb2

Now we can mount the volume and create a subvolume on it, which we are going to use as the new home volume for the users' home directories.

$ sudo mount /dev/sdb2 /mnt
$ sudo btrfs subvolume create /mnt/home
$ sudo umount /dev/sdb2

When updating /etc/fstab we can tell mount to use the volume label instead of a physical path to a device or some obscure UUID. You can also tell it which subvolume to mount.

LABEL=datavol	/home	btrfs	defaults,subvol=home	0	0

After unmounting and disabling the original volume for /home we can mount everything and copy all the data back, with rsync for example, to see how BtrFS behaves in the real world.

$ sudo mount -a

As hinted before, scrubbing is important as it verifies that all your data and metadata on disk are still correct. You can do a read-write test (the default) or a read-only test to see if all data can still be accessed. There is even an option to read the parts of the volume that are still unused. In the example below the subvolume for /home is being scrubbed, with success.

$ sudo btrfs scrub status /home
scrub status for afed6685-315d-4c4d-bac2-865388b28fd2
	scrub started at Sat Jan 17 15:11:58 2012, running for 106 seconds
	total bytes scrubbed: 5.77GB with 0 errors
...
$ sudo btrfs scrub status /mnt
scrub status for afed6685-315d-4c4d-bac2-865388b28fd2
	scrub started at Sat Jan 17 15:11:58 2012 and finished after 11125 seconds
	total bytes scrubbed: 792.82GB with 0 errors
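For the record, a scrub like the one reported above is started on any mounted path of the volume; a rough sketch:

```shell
# Start a scrub in the background on the volume backing /mnt;
# -B would keep it in the foreground, -r makes it a read-only check.
sudo btrfs scrub start /mnt

# A running scrub can also be cancelled and resumed later.
sudo btrfs scrub cancel /mnt
sudo btrfs scrub resume /mnt
```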

The first glances at BtrFS in the real world are a lot better with kernel 3.1 than they were back around kernel 2.6.30, and I'm slowly starting to think it is becoming ready to be included in RHEL 7 or Debian 8, for example, as the default storage solution, the same way ZFS became it in Solaris 11. But it is not all glory, as a lot of work still needs to be done.

The first issue is encryption, as the LUKS era ends with BtrFS: it is not smart to put LUKS between your disks and BtrFS, because you lose the advantage of balancing data between disks when you do mirroring, for example. Then again, LVM has the same issue, where you first need to set up software RAID with MD, with LUKS on top of it and LVM on top of that. For home directories EncFS may be an option, but it still leaves a lot of areas uncovered that LUKS would cover out of the box.
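To make the stacking concrete, a two-disk encrypted mirror would look roughly like this; device names are hypothetical, and luksFormat destroys whatever is on the partitions.

```shell
# One LUKS container per disk...
sudo cryptsetup luksFormat /dev/sdb2
sudo cryptsetup luksFormat /dev/sdc2
sudo cryptsetup luksOpen /dev/sdb2 crypt-sdb2
sudo cryptsetup luksOpen /dev/sdc2 crypt-sdc2

# ...and BtrFS RAID1 for data and metadata across the opened containers,
# so BtrFS no longer sees the raw disks and cannot balance them itself.
sudo mkfs.btrfs -L datavol -d raid1 -m raid1 \
    /dev/mapper/crypt-sdb2 /dev/mapper/crypt-sdc2
```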

The second issue is the integration of BtrFS in distributions and the handling of snapshots. For now you first need to mount the volume before you can make a snapshot of a subvolume, and the same goes for accessing a snapshot. Here I think ZFS still has the advantage with its .zfs directory, accessible to everyone who has access to the filesystem. But time will tell, and for now the first tests look great.