Adding disks by label in ZFS and making those labels stick around

When I started building my new file server, I decided to add the disks to the ZFS vdevs by label and not by device name, e.g.:

#glabel label l1 /dev/ada0
#glabel label l2 /dev/ada1

After a reboot, those labelled disks showed up as /dev/ada0 and /dev/ada1 again, and the labels disappeared from the /dev/label directory.

For the existing disks, I tried to offline each disk in turn and re-label it. That turned up a new problem: I could not replace the offlined /dev/adaX disks with their labelled counterparts, as zpool complained that the device “is part of active pool”.
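
For the first disk, the sequence I was attempting looked roughly like this, and it was the last command that failed:

#zpool offline zstore /dev/ada0
#glabel label l1 /dev/ada0
#zpool replace zstore /dev/ada0 label/l1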

After some further searching, I found out that I had to zero out the first and the last megabyte of the disk before labelling it and replacing it in the zpool:

#dd if=/dev/zero of=/dev/ada0 bs=1m count=1
#dmesg | grep ada0
<read the block count value from the dmesg output, subtract 2048 (2048 blocks of dd’s default 512 bytes = 1 MB) and provide the result to the seek switch below>
#dd if=/dev/zero of=/dev/ada0 seek=358746954
#glabel label l1 /dev/ada0
#zpool replace zstore /dev/ada0 label/l1
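
As an aside, the block count can also be taken from diskinfo(8) rather than dmesg; assuming its default output has the media size in sectors in the fourth column and you run this from a Bourne-style shell, something like the following computes the seek value in one go:

#dd if=/dev/zero of=/dev/ada0 seek=$((`diskinfo /dev/ada0 | awk '{print $4}'` - 2048))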

At this point zpool status was showing the labels again. However, after the next reboot the labels were gone once more and I was pretty frustrated. Back to the search engine.

On page 3 of some discussion of this matter, I noticed two additional steps that should fix the problem. After performing the steps above and re-labelling and replacing the disks, I issued:

#zpool export zstore
#zpool import -d /dev/label zstore

The -d switch instructs zpool to read the device references from the specified directory, and that is what makes the labels stick around.
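
A quick check after the import (and again after the next reboot) shows whether the pool really refers to the labels now:

#ls /dev/label
#zpool status zstore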

When I subsequently added new disks to the pool, I followed these steps to make the labels stick and to avoid re-labelling them later:

  1. Zero out the first and the last megabyte of each disk that will make up the new vdev, as shown in the example right after this list (especially important if the disk has been in use before and does not come straight from the factory)
  2. Label each disk with glabel
  3. #zpool add zstore raidz label/l5 label/l6 etc….
  4. #zpool export zstore
  5. #zpool import -d /dev/label zstore
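
For steps 1 and 2, the per-disk commands are the same as in the replacement procedure above; with a hypothetical new disk ada4 and label l5 they would look like this:

#dd if=/dev/zero of=/dev/ada4 bs=1m count=1
#dd if=/dev/zero of=/dev/ada4 seek=<block count of ada4 minus 2048>
#glabel label l5 /dev/ada4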

And the labels never disappeared again.

This same procedure can be applied to labelling your ZIL and L2ARC devices.
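
For example, with a hypothetical ada6 destined to be the log (ZIL) device and ada7 the cache (L2ARC) device:

#glabel label zil0 /dev/ada6
#glabel label cache0 /dev/ada7
#zpool add zstore log label/zil0
#zpool add zstore cache label/cache0

followed by the same export/import dance as above to make sure the pool keeps referring to the labels.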

Reporting correct space usage for Samba-shared ZFS volumes

ZFS is all the rage now and there are lots of tutorials and how-tos out there covering most topics. There is one issue, though, for which I could not find any ready solution: when sharing a ZFS volume over Samba, Windows reports an incorrect total volume size. More precisely, Windows always shows the same figure for both the total and the free space, and both values change as the volume gets used.
This is obviously not what we want. Some digging uncovered that Samba internally relies on the output of the df program, which reports incorrect values for ZFS file systems. More digging led to this page and to the smb.conf man page, which show that it is possible to override the space usage detection by creating a custom script and pointing the Samba server to it with the following entry in smb.conf:


[global]
dfree command = /usr/local/bin/dfree


The following shell script is where the magic lies (tested on FreeBSD):

#!/bin/sh

# Samba invokes the dfree script with the share's root as its working directory
CUR_PATH=`pwd`

# zfs get -Hp prints raw byte values; normalise them to 1024-byte blocks for Samba
let "USED = `zfs get -o value -Hp used $CUR_PATH` / 1024"
let "AVAIL = `zfs get -o value -Hp available $CUR_PATH` / 1024"

let "TOTAL = USED + AVAIL"

echo $TOTAL $AVAIL

And the following is a variation that works on Linux (courtesy of commenter nem):

#!/bin/bash

CUR_PATH=`pwd`

USED=$((`zfs get -o value -Hp used $CUR_PATH` / 1024))
AVAIL=$((`zfs get -o value -Hp available $CUR_PATH` / 1024))

TOTAL=$(($USED + $AVAIL))

echo $TOTAL $AVAIL
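
Don't forget to make the script executable and to restart Samba so that it picks up the new dfree command (the service name below is just an example and depends on how Samba was installed):

#chmod +x /usr/local/bin/dfree
#service samba restart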

Make sure to check the comments section, as several variations of this script are posted there, for example one that accounts for both ZFS and non-ZFS shares on the same system!
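
Just to give the idea, a minimal sketch along those lines could look like this (not the commenters' exact code; it simply falls back to df for anything that is not a ZFS dataset):

#!/bin/sh

# sketch: use zfs get for ZFS datasets, fall back to df otherwise
CUR_PATH=`pwd`

USED=`zfs get -o value -Hp used "$CUR_PATH" 2> /dev/null`
AVAIL=`zfs get -o value -Hp available "$CUR_PATH" 2> /dev/null`

if [ -n "$USED" ] && [ -n "$AVAIL" ]; then
        # ZFS: normalise the raw byte values to 1024-byte blocks
        echo $(($USED / 1024 + $AVAIL / 1024)) $(($AVAIL / 1024))
else
        # non-ZFS: df -k already reports 1024-byte blocks (total in $2, available in $4)
        df -k "$CUR_PATH" | awk 'NR == 2 { print $2, $4 }'
fi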

I could not use zpool list, as it reports the total size of the pool including the parity disks, so the total might end up greater than the real usable size.
zfs list could have been used if there was a way to display the information in bytes and not in a human-readable form of varying granularity.
The solution was to use zfs get and then normalise the values reported to Samba to 1024-byte blocks. (I tried adding the optional third output value of 1, i.e. the block size, as mentioned in the man pages, but Samba seemed to have trouble parsing really large byte values, so I ended up doing the normalisation in the script.)

Also, I could not rely on the $1 input parameter to the script, as it turned out to always be equal to ‘.’, which is usable for df but not for zfs. This ‘.’ led me to check the working directory of the invocation and, bingo, it turned out to be the root path of the requested volume, so I could simply get the value from pwd and pass it to zfs.
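
If you want to see for yourself what Samba passes in, a throwaway debug line like this at the top of the script does the trick (assuming the Samba user can write to /tmp):

echo "arg1: $1, cwd: `pwd`" >> /tmp/dfree-debug.log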