ZFS root

From SlackWiki

If you have heard about the ZFS filesystem and are curious to try it, you can do so on slackware64, thanks to the ZFSonLinux.org project!

This article shows an example of installing slackware64 14.0 onto a ZFS root filesystem (zfs-root) inside a QEMU virtual machine for testing purposes. It is recommended that you install qemu and learn how to use it to test a ZFS installation before deciding to use ZFS on real hardware. A slackbuild for qemu can be found on slackbuilds.org. If your processor supports Intel VT-x or AMD-V virtualization acceleration (try modprobe kvm-intel or modprobe kvm-amd), then use the slackbuild for qemu-kvm instead.
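As a quick check for the kvm case, you can also look for the relevant CPU flags in /proc/cpuinfo (a sketch; the echoed messages are illustrative and not from the original article):

```shell
# Quick check for hardware virtualization support:
# the vmx flag indicates Intel VT-x, svm indicates AMD-V.
if grep -q -E 'vmx|svm' /proc/cpuinfo; then
    echo "virtualization extensions present: use qemu-kvm"
else
    echo "no vmx/svm flag: plain qemu only"
fi
```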

This article does not go into detail on actually using ZFS or on which installation options and settings might be optimal; it simply presents an example of how to get started with ZFS on slackware.

At the time of writing, ZFSonLinux is at the "release candidate" (version 0.6.0-rc11) development stage, and many issues can occur. See the ZFSonLinux.org web site and its issue tracker for an idea of the current issues. There is also an IRC channel, #zfsonlinux on irc.freenode.net, for discussion of issues and help.

Although not required, it is helpful to keep a local mirror of slackware64 on your system. A local mirror makes it easier to upgrade to slackware64-current or to patches/* within a release. To help manage your own local slackware mirror and generate your own CD and DVD slackware installers, see http://alien.slackbook.org/blog/local-slackware-mirror/.

In this article, we use a slackware64-current DVD image file as the qemu cdrom device media for installing the slackware guest. This was tested with slackware64-current = slackware64 14.0rc4. You can obtain a slackware DVD image using a torrent, buy the DVD from slackware.com, or generate your own DVD image file using the mirror script. For a physical DVD, pass your DVD device (e.g., /dev/sr0) as the qemu cdrom device; for a DVD image file, pass the image file instead.

The ZFSonLinux packages are not included in slackware at this time. Two packages are involved: spl and zfs. The first, spl, is the "Solaris Porting Layer" and must be installed on the system before zfs can be compiled. The source code tarballs for both are at ZFSonLinux.org. At the time of writing, they are spl-0.6.0-rc11.tar.gz and zfs-0.6.0-rc11.tar.gz.

At this point, it is assumed you have qemu installed, have obtained a slackware64-current/14.0 DVD, and have downloaded the spl and zfs tarballs to your host system. If your host is not slackware64-current/14.0 (the same release the guest will be), the first step is to create a minimal development slackware64 guest in qemu. Its only purpose is to compile and package the spl and zfs slackware packages and to transfer copies of those packages back to your host, where they will be used later to install onto zfs during the slackware install process.

From this point onward, instructions become a bit terse and mostly limited to examples of the commands you might use to complete tasks. You are encouraged to examine what each command does by reading the command's man pages. For example, the command "man qemu" can show you most of what you need to know about running qemu on the command line.

So, here's an example of how you might build the spl and zfs packages, starting in your home directory:

(on host)
cd ~
mkdir qemu
cd qemu
mkdir slackdevel
cd slackdevel
mkdir packages
cp ~/spl-*.tar.gz packages
cp ~/zfs-*.tar.gz packages
qemu-img create disk1.raw 10G
ln -s ~/slackware64-current-install-dvd.iso dvd.iso
qemu-kvm \
 -option-rom 8xx_64.rom,bootindex=1 \
 -no-quit \
 -no-shutdown \
 -boot order=cd,menu=on \
 -m 2G \
 -cpu host \
 -smp sockets=1,cores=4 \
 -net nic,model=e1000 -net user \
 -cdrom dvd.iso \
 -drive format=raw,media=disk,cache=none,aio=native,if=scsi,index=0,file=disk1.raw \
 -virtfs local,path=./packages/,security_model=none,mount_tag=hostpackages

When you need to change the boot order between CD and disk, change the option to "order=dc", or press F12 at the boot prompt and select the CD/DVD boot option.

Note: Support for booting drives with if=scsi is not included in current versions of qemu. Booting from scsi requires the 8xx_64.rom option-rom, which can be downloaded from LSI.com. Unzip it, and then:

cp 8xx_64.rom /usr/share/qemu
# then run qemu-kvm with additional command line option: -option-rom 8xx_64.rom,bootindex=1

Qemu should boot into the install-dvd; now perform a very ordinary slackware64 install onto /dev/sda. Install the d package set and most others; the x, kde, and xfce package sets are not really needed.

First note: the examples here use tools from the gptfdisk package (gdisk, sgdisk, cgdisk) to partition the disk with the GUID partition table (GPT) format. GPT may be unfamiliar to those who have only used MBR partitions with fdisk. In the slackware installations this author has tested in qemu, GPT partitions have caused no problems with lilo or the other tools used in slackware. GPT disk labels support up to 128 partitions and disks larger than 2TB (the limit of MBR fdisk).

LILO installs into the first 440 bytes of /dev/sda, inside a GPT "protective MBR" that provides compatibility with MBR bootloaders like lilo. LILO uses LBA32 sector addressing and can see all 512-byte sectors within the first 2TB of a disk. As long as the /boot partition lies within the first 2TB, lilo can work on a disk larger than 2TB.

Unlike fdisk's common default of starting the first partition at sector 63, which is not aligned with the 4KiB sector size of Advanced Format hard disks, gdisk's default is to start partitions at 2048-sector (1MiB) boundaries, which are also 4KiB aligned. GPT with gdisk works on all disks and is compatible with Advanced Format hard disks larger than 2TB. More information about the GPT disk label format can be found on wikipedia.org.

For those who use grub instead of lilo: grub installation is different with GPT than with MBR. Grub requires a GPT partition of type grub-bios or 0xEF02 "BIOS Boot Partition" of about 8MiB where it can install its additional bootloader stages, because grub cannot embed itself in sectors 1-62 on GPT, where GPT keeps its partition tables. On MBR, sectors 1-62 were normally reserved space for bootloaders to keep a 2nd stage.
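The boot/root layout used later in this article could also be created non-interactively with sgdisk; this is only a sketch (the device name is an assumption, the second partition simply fills the rest of the disk rather than leaving recommended free space, and --zap-all destroys any existing partition table):

```shell
# Sketch: GPT with a 512MiB /boot (type 8300) and a ZFS root (type bf01).
# WARNING: destroys the existing partition table on /dev/sda.
sgdisk --zap-all /dev/sda
sgdisk -n 1:0:+512M -t 1:8300 -c 1:boot /dev/sda   # /boot, ext4 later
sgdisk -n 2:0:0     -t 2:bf01 -c 2:root /dev/sda   # Solaris ZFS root
sgdisk -p /dev/sda                                 # print the resulting table
```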

For this development guest, just create a simple one-partition system, /dev/sda1, and do a simple slackware install on it. Don't bother with swap or a separate /boot unless you want to:

(on install-dvd)
cgdisk /dev/sda
setup
reboot

When this simple development guest boots, login as root, and continue:

(on qemu guest)
cd ~
mkdir packages
mount hostpackages -t 9p -o trans=virtio packages
echo --prefix=/ --libexecdir=/usr/libexec --libdir=/lib64 --includedir=/usr/include --datarootdir=/usr/share > configure-options
mkdir src
cd src

mkdir install
tar xvzf ~/packages/spl-*.tar.gz
cd spl-*
./configure $( cat ~/configure-options )
make
make install DESTDIR=~/src/install
cd ~/src/install
makepkg ../spl-<version>_<kernel_version>-x86_64-<build_id>.txz
(e.g., makepkg ../spl-0.6.0rc11_3.2.28-x86_64-1root.txz)
cd ..
rm -r install
installpkg spl-0.6.0rc11_3.2.28-x86_64-1root.txz
depmod
cp spl-0.6.0rc11_3.2.28-x86_64-1root.txz ~/packages

mkdir install
tar xvzf ~/packages/zfs-*.tar.gz
cd zfs-*
./configure $( cat ~/configure-options )
make
make install DESTDIR=~/src/install
cd ~/src/install
makepkg ../zfs-<version>_<kernel_version>-x86_64-<build_id>.txz
(e.g., makepkg ../zfs-0.6.0rc11_3.2.28-x86_64-1root.txz)
cd ..
rm -r install
installpkg zfs-0.6.0rc11_3.2.28-x86_64-1root.txz
depmod
cp zfs-0.6.0rc11_3.2.28-x86_64-1root.txz ~/packages

At this point, the spl and zfs packages have been copied back to the host in the packages directory. Also, spl and zfs are installed in this simple development guest, so you could begin experimenting with zfs pools and datasets, although the root fs remains whatever you made it during setup (probably the default, ext4). This development guest isn't really needed anymore, so it can be deleted:

halt
(exit qemu: ctrl-alt-2 then enter quit at QEMU command prompt)

(on host)
rm disk1.raw dvd.iso
mv packages ..
cd ..
rmdir slackdevel

Now that we have the spl and zfs packages on the host, you can copy them into your local slackware64 mirror and then generate your own install DVD image that includes them. For example, change to the directory that contains the slackware README files, create an "addon" directory, and copy the spl and zfs slackware packages into it. Then generate the DVD image with a command such as ./mirror-slackware-current.sh -i from wherever you have that script configured to maintain your local mirror. Alternatively, we can continue to access the packages inside the installer using qemu's virtfs option. When you do a real installation onto real hardware, you will need to generate and burn your own DVD that includes the packages.

It is time to begin the installation of slackware64 on a ZFS root, assuming you have either regenerated dvd.iso with the zfs packages in addon/* or will access the zfs packages from within the installer using qemu's virtfs to the host's packages directory.

ZFS supports creating complex data storage configurations in data storage pools (similar to a LVM volume group) backed by one or more virtual devices (similar to physical disks and raid arrays) and containing one or more datasets (similar to mounted filesystems, logical volume devices, or snapshots). But here, we keep it simple and will install on just a single disk device.
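As a hedged illustration of these concepts, a pool, its backing vdev, and a dataset can be created and inspected like this (the pool name tank and the devices /dev/sdb and /dev/sdc are made up and not part of this install):

```shell
# Illustrative sketch only: names and devices are assumptions.
zpool create tank mirror /dev/sdb /dev/sdc   # pool backed by one mirror vdev
zfs create tank/home                         # a dataset inside the pool
zfs list -r tank                             # show the dataset hierarchy
zpool destroy tank                           # tear the example down again
```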

On a real ZFS installation, users will typically want to use the ZFS mirror or raidz virtual device (vdev) types for reliability against hard drive failures. For example, a good configuration might use a USB memory stick (and a backup stick) for /boot, plugged into an on-motherboard "Type A" USB connector, and four whole hard drives in a raidz2 vdev as the initial storage space in the root zpool. Later, to add more storage space, additional four-drive raidz2 vdevs of the same size could be added to the zpool.

It is recommended to add only the same vdev type (disk, mirror, or raidz1|2|3) and size to a zpool, since zfs stripes across the vdevs, and vdevs of different types or sizes do not stripe together optimally. After adding a vdev to your zpool, you might need to copy the new /etc/zfs/zpool.cache to your initrd-tree or initramfs-source (and rebuild/reinstall them) if you are using the cachefile to import your zpool.

Using mirror vdevs would be similar to a raid10. Using 4-disk raidz2 vdevs gives the same storage space as raid10 while remaining fully reliable against any double-disk failure; however, the raidz2 parity calculations result in higher CPU usage and/or lower performance. Because raidz vdevs cannot be grown by adding hard drives to them, and vdevs in a zpool are striped, it is important with ZFS to decide up front what type and size of vdev will be used in a zpool as the unit of add-on storage space; it could be difficult to alter this decision later.

Finally, on a real installation with raidz, a data scrub should be scheduled to actively search for "latent defects" on the hard drives, so that unknown read/write errors do not become known later during a disk swap/rebuild and result in the zpool becoming faulted (lost).
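A raidz2 configuration like the one described could be sketched as follows (the pool name and device names are assumptions, not part of this install):

```shell
# Sketch: four whole disks in a raidz2 vdev, grown later by adding
# an identical second raidz2 vdev (names/devices are assumptions).
zpool create tank raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde
zpool add    tank raidz2 /dev/sdf /dev/sdg /dev/sdh /dev/sdi
# Scheduled scrubs catch latent defects before a rebuild exposes them:
zpool scrub tank
```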

If you decide to test raidz in qemu, consider placing the qemu disk files on different physical disks on your host to avoid putting all the stress onto a single disk. Raidz qemu disks that all reside on one physical hard drive will cause poor performance and stress on that disk, due to the large number of seek operations needed to access the stripes/columns in the raidz. Again, we'll keep it simple in this wiki and make a root zpool holding just a single qemu disk, to avoid stressing the host hard drive. But if you really want to test raidz, you can: just make more qemu disks (with incremented index=X numbers) and make your zpool a raidz; there is no technical restriction on the kinds of vdevs in your root zpool.

Make "slackzfs" guest, and boot it into the slackware installer to begin installation:

(on host)
mkdir slackzfs
mv packages slackzfs/
cd slackzfs
qemu-img create disk1.raw 10G
ln -s ~/slackware64-current-install-dvd.iso dvd.iso
qemu-kvm \
 -option-rom 8xx_64.rom,bootindex=1 \
 -no-quit \
 -no-shutdown \
 -boot order=cd,menu=on \
 -m 2G \
 -cpu host \
 -smp sockets=1,cores=4 \
 -net nic,model=e1000 -net user \
 -cdrom dvd.iso \
 -drive format=raw,media=disk,cache=none,aio=native,if=scsi,index=0,file=disk1.raw \
 -virtfs local,path=./packages/,security_model=none,mount_tag=hostpackages

The install-dvd should boot, and then begin installation steps on the installer command line:

(on install-dvd)
# mount the install-dvd
mkdir /dvd
mount /dev/sr0 /dvd

# install coreutils, which provides the "hostid" command needed to load the spl module
installpkg /dvd/slackware64/a/coreutils-8.17-x86_64-1.txz

# install kmod package which includes "depmod" command and others
installpkg /dvd/slackware64/a/kmod-9-x86_64-2.txz
depmod

# install the full kernel modules so you can load 9p filesystem and use it if you want
# .. you might keep the zfs modules packages off the dvd and use a plain slack dvd
installpkg /dvd/slackware64/a/kernel-modules-*.txz
depmod

# install bash and libtermcap because udev rules in zfs use bash scripts
installpkg /dvd/slackware64/a/bash-*.txz
installpkg /dvd/slackware64/l/libtermcap-*.txz

# install spl and zfs that you have copied into the mirror before running the DVD creation
installpkg /dvd/addon/spl-0.6.0rc11_3.2.28-x86_64-1root.txz
depmod
installpkg /dvd/addon/zfs-0.6.0rc11_3.2.28-x86_64-1root.txz
depmod

#Or:
mkdir /packages
find /lib/modules/ | grep 9p
lsmod
# check that 9p modules load, mainly 9pnet_virtio
modprobe 9pnet_virtio
mount hostpackages -t 9p -o trans=virtio /packages
installpkg /packages/spl-0.6.0rc11_3.2.28-x86_64-1root.txz
depmod
installpkg /packages/zfs-0.6.0rc11_3.2.28-x86_64-1root.txz
depmod
modprobe zfs

Now the ZFS tools "zpool" and "zfs" should work in the live installer. In the installer there can be trouble viewing man pages, so install:

installpkg /dvd/slackware64/ap/man-*.txz
installpkg /dvd/slackware64/ap/groff-*.txz
installpkg /dvd/slackware64/a/less-*.txz
export MANPATH=/usr/man:/usr/share/man

The manual pages should be readable:

man zpool
man zfs

With the man pages readable, we are fully ready to continue setting up on ZFS.

(still on install-dvd)
ls -l /dev/sd*

Make sure the disk devices you expect are there. We'll use sda and make boot (/boot on /dev/sda1) and root (/ on /dev/sda2) partitions.

First, some background info: lilo does not support a ZFS filesystem as the filesystem containing /boot, which holds lilo's 2nd stage and the linux kernel files. The on-disk format of a ZFS filesystem is often a complex raid configuration; even if lilo understood ZFS, it would likely require /boot to reside in only certain simple ZFS configurations to be plainly readable. As things are, a linux root installed in a partition with a ZFS filesystem needs /boot on another partition with a filesystem that lilo understands, such as ext4.

In addition, because the spl and zfs modules are not built into the kernel, and the zfs startup requires some configuration to mount the zfs root, an initial ram/root filesystem (initrd) is needed. The initrd is a gzip-compressed cpio archive containing a small linux root operating system that is decompressed into a ramfs block device at boot and takes initial control of the system. The initrd contains an init script that configures devices in devtmpfs mounted at /dev. For example, when using mdadm, lvm, luks, losetup, and now zfs, this init script runs all the commands to create or start these devices (one of which is the root device), mounts the root device, and then transfers system control to the init script in the mounted root filesystem.

The ramfs-based root environments on the booted install-dvd or booted initrd image (in the rescue command line) are small slackware installations that exist only in ram and are temporary, but while inside them you can install, upgrade, and remove slackware packages using the regular package management tools; this gives you access to any kernel modules and programs you need inside these ramfs roots. Packages can be installed, upgraded, or removed permanently in the initrd image file to manage the programs and modules required during the initrd boot stage.

Also, from within these ramfs initial root environments, once the root device is mounted, chroot can be used to work on the mounted root system as if it were booted; this lets you fix problems occurring in the root fs init scripts that cause normal booting to fail, or install a kernel and/or modules, etc. Now on to partitioning:

cgdisk /dev/sda

Here, cgdisk looks just like cfdisk, but it makes GPT partitions.

Make a partition for /boot of about 512MiB, of whatever partition type you like (8300 "Linux filesystem" is fine), or make it type bf07 "Solaris Reserved 1" if you want ZFS to perhaps see the disk as if it had been given whole to ZFS. Make a second partition from the remaining space with partition type bf01 "Solaris ZFS". Currently, when ZFSonLinux is given a whole disk device like /dev/sdb to use, it automatically GPT-partitions it with a bf01 partition #1 covering the whole disk except for a small 8MiB type bf07 partition #9 at the end of the disk (/dev/sdb1 and /dev/sdb9). The small 8MiB type bf07 partition #9 is most likely reserved for use by bootloaders, like lilo, to keep /boot or other boot-related files. For example, you might have /dev/sda that is entirely non-ZFS for booting and /dev/sdb passed whole to ZFS, which partitions sdb automatically; but here we will just use one disk and make the partitions ourselves. Let's use type 8300 for /boot to be sure nothing gets confused:

We'll continue with this partitioning:
Part. #    Size        Partition Type           Partition Name
           4.0 MiB     free space
   1       512.0 MiB   Linux filesystem         boot
           4.0 MiB     free space
   2       9.4 GiB     Solaris /usr & Mac ZFS   root
           96.0 MiB    free space

The free space is left as a recommendation. It has potential uses: data recovery techniques, changing the boot configuration, accommodating a replacement disk that is slightly smaller, and avoiding the end of the disk, where GPT and other disk labels may need to claim space to hold metadata. The "partition name" can be blank or any arbitrary name you'd like to give.

If you are going to test a ZFS mirror or raidz configuration, then you can repeat this same partitioning on each disk. For example, for a four-disk raidz2 with /dev/sd[abcd] you might do:

sgdisk -R /dev/sdb /dev/sda
sgdisk -G /dev/sdb
sgdisk -R /dev/sdc /dev/sda
sgdisk -G /dev/sdc
sgdisk -R /dev/sdd /dev/sda
sgdisk -G /dev/sdd

Then, you could use the boot partitions /dev/sd[abcd]1 in a mdadm raid1 as your boot device:

mdadm --create /dev/md0 -l 1 -n 4 /dev/sd[abcd]1

In this configuration, /dev/md0 is your 4-way mirror /boot device and is used as explained in the slackware README_RAID.TXT. The /dev/sd[abcd]2 would be given to ZFS for the raidz2 (instead of to mdadm raid). This is only an example and there are many possible configurations, including using whole un-partitioned disks; however, we will continue in this wiki with a simpler configuration with /dev/sda1 as a non-raid /boot device and /dev/sda2 in a non-raidz/non-mirror zpool.

Create a zpool called zfs-root:

zpool create -m none zfs-root /dev/sda2

A zpool created like this, the normal way, already contains a zfs filesystem. There is normally no entry in /dev for this device, but it can still be referenced as a mountable device, similar to how devtmpfs, proc, and sysfs are kernel-internal mountable filesystems. The -m option told zpool not to mount it automatically anywhere; without -m, the default is to mount it automatically at /zfs-root. These kinds of details are explained better elsewhere. For this situation, we do not want it automounted.

Because a zpool itself is a usable zfs fs, it has a full set of zfs "properties" for setting filesystem options, somewhat like tune2fs. The "mountpoint=<some place to mount>" property is one of them. We want it set to the special value "none":

zfs get mountpoint zfs-root
zfs set mountpoint=none zfs-root

You can also use "legacy" for the mountpoint and the management of mounting and unmounting the pool will be expected to be done using the regular "mount" and "umount" commands:

zfs set mountpoint=legacy zfs-root

Later in this wiki, it is shown how to mount using "zfs mount" with mountpoint=none, and with the regular "mount" command for mountpoint=legacy. Using mountpoint=legacy is probably the preferred setting for zfs-root, because you can use the normal mount and umount commands and need not change the slackware /etc/rc.d/* scripts much, if at all. The normal slackware scripts handle mounting and remounting the root filesystem with the normal mount command.

On creation, the zpool zfs-root is "imported" and has a zpool status that can be viewed:

zpool status zfs-root

zpool export zfs-root
zpool status zfs-root
# Now there is no status.

zpool import zfs-root
# Now it is back.

When a zpool is imported, its current configuration is cached in the file /etc/zfs/zpool.cache. When a zpool is exported, its entries are removed from the cache file, and the cache file is deleted if it becomes empty. As long as the zfs kernel modules see this cache file with valid cache data inside it, ZFS considers the zpools described there to be valid, imported zpools. If the zpool.cache file is present when the zfs modules are loaded, there is no further need for zpool import and export commands at system startup and shutdown. With root on zfs, it is not possible to export the pool that root is on, but this is not a problem: at shutdown, the safe procedure is to set the zfs-root property readonly=on or remount read-only and sync disks.
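As a sketch of how a saved cache file can be used explicitly (the file path matches the copy made below; -N leaves datasets unmounted):

```shell
# Sketch: import every pool recorded in a saved cache file,
# without mounting its datasets (-N).
zpool import -c /etc/zfs/zpool.cache-initrd -a -N
```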

To avoid having to import the zfs-root pool at startup, you can keep a copy of the zpool.cache file to later copy onto the initrd image; it will be read at module load time, reactivating the pool without the need to issue a zpool import command:

cp /etc/zfs/zpool.cache /etc/zfs/zpool.cache-initrd

A warning about the zpool.cache file: having /etc/zfs/zpool.cache present at zfs module load/init time can be unsafe if the device names in /dev have changed since the last boot. Especially dangerous is the case where two hard drives have swapped names, e.g. /dev/sda and /dev/sdb. It is safer not to have a zpool.cache file on the system at boot and module load/init time, guarding against the possibility that the cache file is no longer correct. This wiki will try to show how to use zpool.cache to import your pool if the regular zpool import command always seems to require forcing, and also how to import normally without a zpool.cache file.
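One hedged way to sidestep the renamed-device problem is to import by scanning a directory of stable device names rather than trusting cached paths:

```shell
# Sketch: search /dev/disk/by-id (names stable across device
# reordering) for pool members instead of relying on zpool.cache.
zpool import -d /dev/disk/by-id zfs-root
```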

A zfs filesystem will not normally mount over existing files; the mountpoint is expected to be an empty directory. Remove any junk files inside the installer's /mnt:

rm /mnt/README

The directory /mnt is where the slackware setup expects you to mount your root for installation.

Controlling whether a zfs fs is to be mounted read-only or read-write is best managed by setting the readonly property while it is not yet mounted:

zfs set readonly=on zfs-root
zfs set readonly=off zfs-root

If you are using mountpoint=legacy, then the ro and rw options to the mount command work as normal and override whatever the readonly property is set to; the readonly property serves as a default.

Slackware expects root to be mounted readonly when /etc/rc.d/rc.S runs during normal startup, so later this is used to satisfy slackware's expectation to avoid rc.S complaining.

The zfs-root can be mounted and unmounted in two different ways:

zfs set mountpoint=/mnt zfs-root
zfs set mountpoint=none zfs-root
# or
mount zfs-root -t zfs /mnt
umount /mnt

If it is already mounted and you want to change it to mount somewhere else, it is best to transition it to unmounted first, then mount it in the new place.

The mount options -o ro, -o rw, -o remount,ro, and -o remount,rw can also be used and will override the readonly=on|off property without altering the property value. This can be confusing until you are familiar with how the two methods of mounting interact. When mountpoint=legacy is set, zfs expects the regular mount and umount commands to be used to mount and unmount the dataset with the assistance of the mount.zfs helper. For now, we will continue with mountpoint=none, but using legacy is also a good option.

Now let's continue and mount to /mnt using the current readonly=off setting (rw mount):

zfs set mountpoint=/mnt zfs-root

This method of mounting will show the most detailed mount options in /etc/mtab. They are the mount options that zfs thinks are proper, and they are probably what should be kept in /etc/fstab for zfs-root until we know better.

cat /etc/mtab >> /etc/fstab

Now edit /etc/fstab to make sure it looks right, and remove the "rw" mount option; we want to let it use a default there, which will usually come from how the readonly= property is set. Unmount again:

zfs set mountpoint=none zfs-root

Mount again, now using all options from /etc/fstab:

mount zfs-root

It should be mounted at /mnt, and touch /mnt/t should let you create the file t; then remove it with rm /mnt/t. For the installation, we want to make sure it is mounted read-write. After installation, it should mount read-only inside the initrd.

Sometimes there can be problems using the mount command, and then you can try it with the -i option:

 mount -i zfs-root
 mount -i zfs-root -t zfs /mnt

but you should not have to use -i. When -i makes it work, mount performs an "internal-only" mount and does not use any mount.<fstype> helper that might be on the system. You can use mountpoint=legacy to avoid this small problem if you prefer to use the regular mount and umount commands on your zfs-root.

Setup the /boot device:

mkfs.ext4 /dev/sda1
mkdir /mnt/boot
mount /dev/sda1 /mnt/boot

We have both root and boot mounted where the slackware installer expects them. Unmount the dvd so setup can find it normally:

umount /dev/sr0
setup

In the setup, skip straight to TARGET selection for the root device; setup does not see or understand the zfs-root, so it is not listed. This does not matter, because we have already mounted it. Just skip down to a slot that says to continue, and carry on with setup. Because TARGET is skipped, you may not see the ADDITIONAL prompt (if you do, skip it too), where you would normally give /dev/sda1 to be mounted as /boot. This is also okay, because we have already mounted /mnt/boot. Because we did not give TARGET the root device or say which device /boot is on, the lilo installation will fail, or you can skip it. Otherwise, this installation process is normal.

# Finish installation and exit setup, but do NOT reboot!
# There is still a lot to do before rebooting.
umount /dev/sr0

At this point, it is assumed that the setup has completed installing into /mnt. Congrats! You've installed slackware into a ZFS root!

The next steps are to prepare to use "chroot /mnt" and finish up the configuration on the installed root system and then be able to reboot onto the completed installation. Before doing chroot, copy some files from the setup environment into /mnt.

Copy the current installer /etc/fstab for later use inside the initrd. This fstab is correct for use inside the initrd, except remove the "rw" option (if not done already).

cp /etc/fstab /mnt/etc/fstab.initrd

The fstab.initrd will look like:

 proc /proc proc defaults 0 0
 zfs-root /mnt zfs defaults,atime,dev,exec,suid,xattr,nomand,zfsutil 0 0

In the initrd, we want zfs-root to mount to /mnt.

Create /mnt/etc/fstab:

cat /etc/mtab >> /mnt/etc/fstab
vi /mnt/etc/fstab

Using vi started above, edit /mnt/etc/fstab and make it have lines such as:

 zfs-root / zfs defaults,atime,dev,exec,suid,xattr,nomand,zfsutil 0 0
 /dev/sda1 /boot ext4 defaults 0 2

These two lines in /mnt/etc/fstab are how the mounts should look when root is booted. Although the fstab fs_passno field is 0, indicating that root (/) should not be fsck'd, this field is ignored for / during system startup, because rc.S explicitly runs "fsck /", which does not look at options inside fstab. Fsck doesn't work on a zfs fs, and we will deal with that problem in a moment (see below).

Mount some special mounts before chroot:

mount -t devtmpfs none /mnt/dev
mount -t proc none /mnt/proc
mount -t sysfs none /mnt/sys

Make sure /boot is mounted:

mount /dev/sda1 /mnt/boot

If you had mounted a 9p fs share to access spl and zfs packages, then unmount that now: umount /packages

Ready, now:

chroot /mnt

At this point, you should be on the installed system, and now try to configure mkinitrd and lilo. First, switch to use the generic kernel:

cd /boot
rm System.map config vmlinuz
ln -s System.map-gen* System.map
ln -s config-gen* config
ln -s vmlinuz-gen* vmlinuz

We installed spl and zfs inside the live ramfs install-dvd environment, but we have not yet installed them into the root system:

mount /dev/sr0 /mnt/cdrom
depmod
installpkg /mnt/cdrom/addon/spl-*.txz
depmod
installpkg /mnt/cdrom/addon/zfs-*.txz
depmod

If instead, you used a 9p share to access the spl and zfs packages, then repeat that procedure here to install them onto the root system.

Set up lilo: edit /etc/lilo.conf so it looks like:

  lba32
  append = " vt.default_utf8=1"
  boot = /dev/sda
  message = /boot/boot_message.txt
  prompt
  timeout = 1200
  change-rules
   reset
  vga = normal
  image = /boot/vmlinuz
   label = vmlinuz
   initrd = /boot/initrd.gz
   root = /dev/ram0
   read-write
# alternatively, the root and read-write lines
# can be this one line of kernel parameters:
#  addappend = " root=/dev/ram0 rw "

A note about initrd: The /boot/initrd.gz is a gzipped cpio archive containing a minimal BusyBox-based slackware root system, sourced from /boot/initrd-tree. The commands "mkinitrd" and "mkinitrd -F" generate /boot/initrd-tree and /boot/initrd.gz. When using an initrd, the initial root device is normally set to root=/dev/ram0 (a ramfs block device) and allowed read-write. The kernel refers to devices in terms of "major" and "minor" numbers, and the ram0 device is assigned block device major 1 and minor 0. This ram0 major/minor is written in hexadecimal as 0x0100 (256 in decimal). This hexadecimal number makes an appearance in /boot/initrd-tree/init, where it runs: "echo 0x0100 > /proc/sys/kernel/real-root-dev". Echoing a device code to real-root-dev is part of a deprecated initrd mechanism, but it is probably harmless to leave alone as slackware has it, unless you know otherwise. More information about how initrd works can be found in:

/usr/src/linux/Documentation/devices.txt (info about major:minor device numbers)
/usr/src/linux/Documentation/initrd.txt  (mostly obsolete info on old initrd)
/usr/src/linux/Documentation/filesystems/ramfs-rootfs-initramfs.txt (current info on new initrd, called initramfs)
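The 0x0100 device number mentioned above can be reproduced with a one-liner: for small device numbers, the kernel's old-style 16-bit encoding packs the major into the high byte and the minor into the low byte.

```shell
# /dev/ram0 has block major 1, minor 0;
# old-style 16-bit dev number = (major << 8) | minor = 0x0100.
major=1
minor=0
printf '0x%04x\n' $(( (major << 8) | minor ))   # prints 0x0100
```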

Advanced note about initrd: Slackware's mkinitrd package does not use the "pivot_root" command as documented in "initrd.txt", but instead uses "switch_root", provided by the BusyBox project. The switch_root command changes the effective root filesystem from the initrd (unpacked in rootfs at /) to the new root filesystem (zfs-root mounted at /mnt); it overmounts rootfs / with zfs-root /mnt. It works as follows: delete everything in rootfs / (one filesystem only) that held the initrd, to free up ram, then use mount, chroot, and chdir to make the effective root change. In concept (as a shell script), the command "exec switch_root /mnt $INIT $RUNLEVEL" is performed as follows:

 find / -xdev | xargs rm -rf
 cd "$1"
 shift
 mount --move . /
 exec chroot . "$@"

Here, typically INIT=/sbin/init and RUNLEVEL=3. The 'exec' causes switch_root to replace its parent process, which has PID=1. In system calls, switch_root looks similar to this:

chdir("/mnt");
/* <insert code to unlink/rmdir all initrd files/directories
 *  installed in rootfs / without crossing filesystems > */
mount(".", "/", NULL, MS_MOVE, NULL);
chroot(".");
chdir("/");
execl("/sbin/init", "/sbin/init", "3", (char*) NULL );

The kernel's rootfs / that contained the initrd is never actually unmounted (it can't be) because it is built into the kernel and always exists as at least a tiny ramfs or tmpfs, so you can continue to see it in /proc/mounts. When you give lilo root=/dev/ram0 (a ramfs device), the kernel will actually use the internal rootfs, which is an instance of tmpfs (if your kernel config includes tmpfs) or ramfs. A rootfs that is a tmpfs can by default use up to half of system RAM. If rootfs is a ramfs, it can use up to almost all of system RAM until the system panics. Typically, tmpfs is built into the kernel, so depending on how much RAM your computer has, the initrd-tree can probably be up to about half that size.

Quite a lot of notes... now to continue on...

Set up mkinitrd:

cp /etc/mkinitrd.conf.sample /etc/mkinitrd.conf
vi /etc/mkinitrd.conf

Edit it to look like this (other defaults should be ok):

 MODULE_LIST="sym53c8xx:ext4:e1000:zfs:9p:9pnet_virtio"
 ROOTDEV="zfs-root"
 ROOTFS="zfs"
#Module sym53c8xx is the driver for qemu's if=scsi drive device.
#Module ext4 for /boot.
#Module e1000 for eth0 net device.
#Module zfs for ZFSonLinux.
#Modules 9p and 9pnet_virtio for mounting qemu -virtfs mount_tag.

These 9p modules could be useful from the rescue command line inside the initrd to get needed files that are on the host.

All dependent modules will also be included automatically into /boot/initrd-tree, which is packaged as /boot/initrd.gz. For example, to see the dependent modules of zfs:

modprobe --show-depends zfs

All listed modules will also be in the initrd. Now, make a /boot/initrd-tree that we can begin to look at and edit:

mkinitrd

Create an empty /etc/mtab inside the initrd tree, or else zpool import fails (zfs expects /etc/mtab to exist already; it won't create it):

touch /boot/initrd-tree/etc/mtab

Copy in the fstab.initrd that was made earlier on the installer. This provides all the mount options for a simple "mount zfs-root" onto /mnt:

cp /etc/fstab.initrd /boot/initrd-tree/etc/fstab

More package installing! This time into an alternate root: the root of the initrd, /boot/initrd-tree. This is how you can put whatever packages you like into your initrd. These installations work independently of your normal installations. You can install, upgrade, and remove packages using an altroot:

If spl and zfs are on your customized dvd:

installpkg --root /boot/initrd-tree /mnt/cdrom/addon/spl-*.txz
installpkg --root /boot/initrd-tree /mnt/cdrom/addon/zfs-*.txz

Otherwise, repeat the same procedure: mount your 9p mount_tag and install them from there.

More regular packages from the dvd:

installpkg --root /boot/initrd-tree /mnt/cdrom/slackware64/a/kmod-*.txz
installpkg --root /boot/initrd-tree /mnt/cdrom/slackware64/a/coreutils-*.txz
installpkg --root /boot/initrd-tree /mnt/cdrom/slackware64/a/gptfdisk-*.txz

# A bash script lives in /boot/initrd-tree/lib/udev/, so bash is needed or udev rules have problems:
installpkg --root /boot/initrd-tree /mnt/cdrom/slackware64/a/bash-*.txz
# Bash needs this:
installpkg --root /boot/initrd-tree /mnt/cdrom/slackware64/l/libtermcap-*.txz

# Install net-tools to ensure that /etc/HOSTNAME, /etc/hosts, and potentially /etc/hostid are handled properly.
installpkg --root /boot/initrd-tree /mnt/cdrom/slackware64/n/net-tools-*.txz

When ZFS has to import a zpool while the initrd is running (this can be avoided by not exporting and keeping /etc/zfs/zpool.cache inside the initrd), ZFS will refuse to import a pool if it thinks the pool may be in use by another system. ZFS was designed for network-attached storage (NAS), where multiple hosts might have access to the same disks, but ZFS cannot support two hosts importing and using a pool at the same time (that requires a "cluster" fs). ZFS checks a value called "hostid" to detect this kind of conflict, and it can be a tricky issue with an initrd boot process. The value of hostid is returned by the gethostid() function or the hostid command. This function returns the hostid from /etc/hostid if that file exists and holds a valid id; otherwise, it derives a hostid from the hostname and the ip address assigned to that hostname in /etc/hosts or in DNS. If the hostid trying to import a pool is not the same as the hostid that imported it last time, ZFS will not import the pool unless "zpool import -f zfs-root" is used to force the import. Forcing is not recommended, so it is best to configure your initrd with the same /etc/HOSTNAME and /etc/hosts as on the root installation, ensuring that what your system looks like from the initrd matches the booted root system. The hostid can also be passed to the SPL module with the module parameter spl.spl_hostid=0xHHHHHHHH, where the Hs are a hexadecimal hostid value; this overrides any /etc/hostid or hostname/IP-based hostid value.
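For example (a sketch only; merge this with the append line shown earlier in lilo.conf, and substitute a real hex value for the Hs), the spl_hostid override would ride on the kernel parameter line:

```
addappend = " root=/dev/ram0 rw spl.spl_hostid=0xHHHHHHHH "
```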

Edit /boot/initrd-tree/init (this is the main system startup script on the initrd, responsible for configuring your block devices, mounting root at /mnt, and booting root):

Inside init, right near the top of the file after the PATH= line, add this:

/bin/hostname -F /etc/HOSTNAME
/bin/hostname $( /bin/hostname -s )
#/sbin/chhostid 13371400
# you can make your hostid whatever you like, but it didn't work right with ZFS, so it is commented out

This sets the name reported by the hostname command (the default is "darkstar") to what is contained in /etc/HOSTNAME. The next line shortens it to the short hostname. These lines help ZFS properly identify your host when importing pools.
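The effect of the second line can be illustrated without touching the live system (a sketch; the hostname is hypothetical, and the `${name%%.*}` expansion mimics what `hostname -s` reports):

```shell
# Hypothetical fully qualified name, as might be stored in /etc/HOSTNAME:
name=darkstar.example.net
# hostname -s keeps only the part before the first dot:
short=${name%%.*}
echo "$short"
```

This prints darkstar, the short name the hostname command would then report.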

Inside init, right after it runs the "mdadm" commands used for raid setup is where we want to set up ZFS, so add:

/sbin/zpool import zfs-root
/sbin/zfs set readonly=on zfs-root
mount zfs-root

This makes sure the default is a readonly mount; the mount then uses the defaults from /etc/fstab inside the initrd.

Inside init, close to the bottom, where it says "Switch to real root partition", comment out the one line that normally does the mount:

# mount -o ro -t $ROOTFS $ROOTDEV /mnt

We have already mounted zfs-root, and we don't pass any mount options here because a properly set /etc/fstab inside the initrd should supply the zfs-root mount options onto /mnt.

During a normal startup we want a readonly mount.

Note: this setup uses the "zpool import" and "zpool export" commands at startup and shutdown. Later, you can experiment with keeping /etc/zfs/zpool.cache in the initrd and commenting out the zpool import/export lines. Using zpool import/export is safer than using a zpool.cache, so we prefer to set up that way.

As mentioned, the zpool import command may sometimes fail with the message "pool may be in use from another system". This is nothing serious in this test situation. When it happens, zfs-root will not get mounted on /mnt, and the init script will detect this and drop you into a rescue command line, where you can manually do:

zpool import -f zfs-root
mount zfs-root
exit

This forces the import, overriding the conflict that zfs suspects. Next comes the mount. After this, exit to let the init script continue on to boot the system normally.

To avoid this problem, copy some files in (assuming you have first edited /etc/HOSTNAME to hold the hostname you like to see on your command prompts, and edited /etc/hosts to put that hostname on your loopback or another local ip address):

cp /etc/HOSTNAME /etc/hosts /boot/initrd-tree/etc

The value reported by the hostid command is what ZFS sees as your hostid (the SPL module actually tries to run the hostid command to get this value, or it uses the spl.spl_hostid value given to it). In the setup so far, the value is based on the ip assigned to your hostname. If your hostname is on the line in /etc/hosts for ip address 192.168.13.37, then your hostid will be the hex bytes C0.A8.0D.25 rearranged as A8C0250D (pairs of bytes are swapped). This is an ip-based hostid value, a 32-bit hexadecimal number. The hostid has to be the same on each boot, or import will need the -f option to force it. Simply setting the hostid with the sethostid() function, which writes an /etc/hostid file, doesn't seem to agree with ZFS (you'd have to force the import), so this setup assumes that you do not have, or have deleted, /etc/hostid.
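The byte rearrangement described above can be reproduced in shell (a sketch using the example address from the text):

```shell
# gethostid() turns ip a.b.c.d into hostid b.a.d.c (pairs of bytes swapped).
ip=192.168.13.37
IFS=. read -r a b c d <<EOF
$ip
EOF
printf '%02x%02x%02x%02x\n' "$b" "$a" "$d" "$c"
```

This prints a8c0250d, matching the A8C0250D value worked out above.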

You can sometimes see what has been happening to a zpool by using the command "zpool history -li zfs-root" to see a log of commands.

Once again, you can do away with the import and export commands by keeping /etc/zfs/zpool.cache inside your initrd. To create zpool.cache, while on the installer or in the rescue shell of the initrd (at lilo "boot: vmlinuz rescue"):

(on installer or initrd rescue shell)
zpool import zfs-root
zfs set readonly=off zfs-root
mount zfs-root
cp /etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache-initrd
cp /mnt/etc/zfs/zpool.cache-initrd /mnt/boot/initrd-tree/etc/zfs/zpool.cache
umount /mnt
zfs set readonly=on zfs-root
#mount zfs-root
#exit (to continue boot process onto rootfs mounted at /mnt)

Words of caution, again: this zpool.cache approach is not necessary, and some discussion suggests it is deprecated or discouraged. Although an /etc/hostid setup fails to import without the -f option, an ip-based (hosts|dns) hostid works on boot and imports without -f; that may be the more proper way to handle shutdown and startup, rather than using a cache file, which could become invalid and cause a problem that is difficult to understand. The contents and exact function of the cache file are not well understood by many people. As repeated all over this wiki, using zpool.cache is not recommended if you can avoid it, because it is potentially unsafe for your pool.

Save /boot/initrd-tree/init and quit vi.

Now, edit /etc/rc.d/rc.S and rc.6 so that remounting rw or ro is done by changing the readonly property:

In /etc/rc.d/rc.S:

# comment out the line
#    /sbin/mount -w -v -n -o remount /
#add the line
     /sbin/zfs set readonly=off zfs-root

If you are using mountpoint=legacy for zfs-root, you do not need to make the above change.

In /etc/rc.d/rc.6:

# before: echo "Unmounting local file systems."
# add lines:
  rm -f /etc/zfs/zpool.cache
  echo "Removed /etc/zfs/zpool.cache"

Remove /etc/zfs/zpool.cache on shutdown. This is recommended because, if the file is present when you import a pool, the old cache might be used to determine the device filenames in the pool, and those filenames may have changed since the last boot, especially the non-persistent /dev/sd* device filenames. It is safest to avoid any use of the zpool.cache file at boot in the initramfs, at zfs module load, and at zpool import of other pools.

It is also possible to prevent /etc/zfs/zpool.cache from ever being created by using the zfs module option 'spa_config_path=' to tell the zfs module not to use any zpool.cache file (a blank filename). To set this zfs module option, do the following:

echo options zfs spa_config_path= > /etc/modprobe.d/zfs.conf
cp -a /etc/modprobe.d /boot/initrd-tree/etc

If zfs is built into the kernel (see ZFS root (builtin)), then this zfs module option is set as a kernel parameter: zfs.spa_config_path=
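In that builtin case, the parameter goes on the kernel command line, e.g. in lilo.conf (a sketch; merge with your existing append parameters, following the earlier example):

```
addappend = " root=/dev/ram0 rw zfs.spa_config_path= "
```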

In /etc/rc.d/rc.6:

# comment out the line:
#    /bin/mount -v -n -o remount,ro /
# add the line:
     /sbin/zfs set readonly=on zfs-root

If you are using mountpoint=legacy for zfs-root, you do not need to make the above change.

In /etc/rc.d/rc.6, near the bottom just before poweroff|reboot, add the following block of lines:

########################
# Try to export zfs-root
/sbin/zpool export zfs-root
if [ $? -gt 0 ] ; then
 echo "ZFS export failed."
 echo "Notice: if root fs is busy, then this is normal fail."
else
 echo "ZFS export successful."
 echo "Warning: successful export is abnormal for root fs."
fi
/bin/sleep 5
########################

This attempts to export even though it cannot return success for the root pool. The attempt is still recorded in the zpool history, and even though it fails, the pool may actually be marked as exported (not confirmed).

You cannot run fsck on ZFS; it doesn't support that kind of filesystem check. The rc.S script will complain in many places about not being able to fsck the zfs-root. To disable all attempts to run fsck, we can simply create a file that acts as a flag telling /etc/rc.d/rc.S to skip fsck:

touch /etc/fastboot

This fastboot file is normally a run-once temporary file, deleted near the bottom of rc.S. Since we want it to always skip all fsck, edit /etc/rc.d/rc.S, find the "rm" of fastboot, and stop it from being removed!

An alternative to using /etc/fastboot is to make the file /sbin/fsck.zfs that always just "does nothing successfully":

#!/bin/sh
exec /bin/true

Or, it can be:

#!/bin/sh
exit 0

Or, just do:

ln -s /bin/true /sbin/fsck.zfs

This fsck helper has the advantage that you do not need to alter rc.S as much, or at all if you use the zfs property mountpoint=legacy and the regular mount command on your zfs-root. With these ideas, no slackware /etc/rc.d/* files really need editing, but it is up to you how you want to experiment. I prefer to use mountpoint=legacy, edit just the initrd init script as above, and edit rc.6 to remove /etc/zfs/zpool.cache and try to export.

We are getting close, and have finished all editing, so make /boot/initrd.gz and install lilo:

mkinitrd -F
lilo

We're finished! All that's left to do is clean up.

Exit the chroot.

exit

Try to umount everything:

umount /mnt/proc
umount /mnt/sys
umount /mnt/dev
umount /mnt/boot
umount /mnt/mnt/cdrom
umount /mnt/root/packages
umount /mnt
zpool export zfs-root

Note that export stops a pool and frees its devices, while import binds devices and starts a pool. It is somewhat like mdadm -A -s and mdadm -S -s, while /etc/zfs/zpool.cache is somewhat like mdadm -E -s > /etc/mdadm.conf.

# Remove the DVD or change boot order
reboot

When you reboot, if all is good, you are booting properly onto a slackware installed on a ZFS root! Congrats!

If things seem to go badly (which can happen when the import fails, as explained earlier), you will land on the initrd's rescue command line just before the part of init that mounts zfs-root and runs switch_root to boot up. In that case, commands like the following typically apply:

If you get dumped to rescue because zpool import will not work without -f (force), then:

 zpool import -f zfs-root
 zpool export    zfs-root
 zpool import    zfs-root
 mount zfs-root
 exit

This can happen if hostname, hostid, or uname changes.

If you get into a jam and need to boot into the initrd while skipping zpool import etc., use the lilo prompt (or boot the install dvd):

boot: vmlinuz rescue

In the rescue mode of the initrd, after you have mounted your zfs-root (with readonly=on) at /mnt, "exit" to continue booting.

An alternative rescue shell on the initrd can be started as follows:

boot: vmlinuz rdinit=/bin/sh

This puts you into a sh shell inside the initrd, and the normal /init script is not run at all. In this shell, you can fix problems and then attempt to start the system normally as follows:

exec /init

This runs the normal rdinit program, "exec"ing it so that it takes over PID 1 from your sh shell; PID 1 is required for the init program.

In other emergency command lines, you might need to run the lines at the end of init, starting from:

 udevadm info --cleanup-db
 udevadm control --exit

If you don't run those udevadm commands, two runs of udevd can fight each other and it is slow; you have to wait through two 2-minute timeouts (4 minutes). The last line in init is run like:

 exec switch_root /mnt /sbin/init 1

Warning: the switch_root command is somewhat dangerous, so read its manual page. From what I can tell, everything except the new root "/mnt" is deleted, so you are effectively giving a command that says what NOT to delete! Make sure nothing got mounted outside of "/mnt" before you run it. The deletes should not cross filesystems, though; they only delete the contents of the initramfs.

Also, from the rescue command line, or any time you have finished making changes and are about to reboot in some abnormal situation, first run:

zfs set readonly=on zfs-root

If you are using mountpoint=legacy for zfs-root, then you may want to use readonly=off and control ro|rw mounting with options to the mount command. With mountpoint=legacy and readonly=on used together, there can be a contradiction between how the kernel actually mounts zfs-root (ro, according to the readonly property) and how the userspace mount command and /etc/mtab report it (if you give -o rw). The kernel might actually mount ro even when you have told the mount command to mount rw: /etc/mtab can erroneously report rw while the kernel's /proc/mounts file reports the actual mount as ro. Again, the two methods of mounting a zfs fs, either the "legacy" mount command or zfs set mountpoint, can conflict with each other on the mode of mounting. If you create a zfs filesystem under zfs-root, like zfs create "zfs-root/movies", and then try to mount it, it may have inherited mountpoint=legacy and readonly=on from zfs-root, and it may mount ro even if you use the rw mount option; in this case, you have to run another command, mount -o remount,rw "zfs-root/movies", to make it actually switch to rw (kind of weird!).

A bug in unpatched grep 2.13 build 1, found in slackware 14.0rc1, causes problems when processing files stored on ZFS but rarely shows up on other filesystems; the bug mishandles sparse files on zfs and btrfs, resulting in chaos with slackpkg, compiling kernel modules, and possibly many other things. You must downgrade grep to 2.12 or use the patched grep 2.13 build 2 in slackware 14.0rc2 or later.


Upgrading:

From time to time you will want to upgrade the kernel to the new kernel in slackware64-current or in patches/. This also necessitates a rebuild and reinstall (or upgrade) of spl and zfs. Below is an example of upgrading both the kernel and spl/zfs, to give an idea of the process:

Using slackpkg configured against your local slackware mirror is a good option, but you can also configure it to use a remote mirror.

slackpkg upgrade-all

Assume slackpkg just upgraded your kernel, kernel-modules, kernel-headers, and kernel-source.

Note: the first time you try to run on kernel-generic, set up kernel-huge in lilo as the second choice. Huge has all the hard drive controllers built in, but generic has them all as modules. For example, run "modinfo mptsas" to see the module options (maybe you have an LSISAS1068) for your hard drive controller. Add a file /etc/modprobe.d/mptsas.conf to set the options you want with a line: options mptsas ... Do this for all the device driver modules for your hardware, while you are still on the huge kernel, then set up to reboot to generic. Include all the modules you need in the mkinitrd.conf MODULE_LIST. Check that /boot/initrd-tree/etc/modprobe.d/ has a copy of your customized module conf files, because it is on the initrd that the modules load and use the conf files. mkinitrd probably copies all /etc/modprobe.d/* files into the initrd-tree, but check it. Pay attention to /etc/modprobe.d/README and /lib/modprobe.d/*; files under /lib/modprobe.d/ may not be copied to the initrd-tree, but you might want to copy some of them in yourself to use them.


Check that the upgraded kernel-source is symlinked at /usr/src/linux (the zfs build looks here). SPL/ZFS modules will install to the kernel version in /lib/modules/</usr/src/linux kernel version>

cd /usr/local/src
tar xvzf ~/spl-*.tar.gz
tar xvzf ~/zfs-*.tar.gz

###### upgrade spl 1st
mkdir install
cd spl-*
./configure --prefix=/ --libexecdir=/usr/libexec --libdir=/lib64 --includedir=/usr/include --datarootdir=/usr/share
make
make install DESTDIR=/usr/local/src/install
cd ../install
makepkg ../spl-<version>rcX_<kernel version>-x86_64-Xroot.txz
cd ..
rm -r install
upgradepkg spl-<version>rcX_<kernel version>-x86_64-Xroot.txz
depmod <new kernel version>     (e.g. depmod 3.2.28 if .28 is the new kernel; you are currently running the old one, maybe .27)
# this builds the module dependencies in /lib/modules/<new kernel version>
###### upgrade zfs 2nd
mkdir install
cd zfs-*
./configure --prefix=/ --libexecdir=/usr/libexec --libdir=/lib64 --includedir=/usr/include --datarootdir=/usr/share
make
make install DESTDIR=/usr/local/src/install
cd ../install
makepkg ../zfs-<version>rcX_<kernel version>-x86_64-Xroot.txz
cd ..
rm -r install
upgradepkg zfs-<version>rcX_<kernel version>-x86_64-Xroot.txz
depmod 3.2.28
####### upgrade spl/zfs inside altroot /boot/initrd-tree
ROOT=/boot/initrd-tree upgradepkg spl-<version>rcX_<kernel version>-x86_64-Xroot.txz
ROOT=/boot/initrd-tree upgradepkg zfs-<version>rcX_<kernel version>-x86_64-Xroot.txz
# note: ROOT is an environment variable, not a command line option to upgradepkg!
# note: do not worry about depmod inside initrd-tree, mkinitrd -F actually installs the modules
# note: the upgrade here is really just to upgrade the command line tools zpool, zfs, etc.
####### rebuild /boot/initrd.gz and reinstall lilo
vi /etc/mkinitrd.conf
# temporarily set KERNEL_VERSION to new version, but you are still running on old version
mkinitrd -F
lilo
vi /etc/mkinitrd.conf
# you can set KERNEL_VERSION back to $(uname -r) now
reboot

After reboot you can remove any old /lib/modules/<old kernel version> directory.


More about /etc/hostid: if you want to experiment with changing your /etc/hostid file to see whether you can use it and still make zpool import work normally, here is a tool that might be helpful:

/* software product name: foobarz-chhostid.c
 * suggested binary name: chhostid
 * license              : BSD
 * license text:
Copyright (c) 2012, foobarz
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
    * Neither the name of the <organization> nor the
      names of its contributors may be used to endorse or promote products
      derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
*/

#define FOOBARZ_CHHOSTID_VERSION "1.0.5"
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <endian.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sysexits.h>
#include <limits.h> /* LONG_MIN, LONG_MAX */
/* note: if stdlib is included then sethostid fails even with _BSD_SOURCE defined,
 * so declare strtol here instead of including stdlib */
long strtol(const char *nptr, char **endptr, int base);

long swapbytes(long abcd) {
  long badc;
  unsigned char addr_a;
  unsigned char addr_b;
  unsigned char addr_c;
  unsigned char addr_d;
  addr_a = (unsigned char) (abcd >> 24);
  addr_b = (unsigned char) (abcd >> 16);
  addr_c = (unsigned char) (abcd >>  8);
  addr_d = (unsigned char) (abcd);
  /* mask to 32 bits to avoid sign extension into a 64-bit long */
  return badc = ( ((addr_b << 24) | (addr_a << 16) | (addr_d << 8) | addr_c) & 0xffffffffL );
}

int main(int argc, char* argv[])
{
 long newhostid;
 struct in_addr addr;
 char* invalidchar;

 /*********** check command line */
 if(!( ((argc==3) && ( (strcmp(argv[1],"-h")==0) ||
                       (strcmp(argv[1],"-i")==0) ||
                       (strcmp(argv[1],"-l")==0) )) ||
       ((argc==2) && ( (strcmp(argv[1],"-s")==0) ))
     )
   ) {
                 /*01234567890123456789012345678901234567890123456789012345678901234567890123456789*/
  fprintf(stderr, "foobarz-chhostid, version %s. License: BSD.\n", FOOBARZ_CHHOSTID_VERSION);
  fprintf(stderr, "foobarz-chhostid, (c) 2012, foobarz; all rights reserved.\n");
  fprintf(stderr, "descr: %s change system hostid (in /etc/hostid).\n", argv[0]);
  fprintf(stderr, "usage: %s -h <new hostid>\n"
                  "        Change system hostid to <new hostid>;\n"
                  "        <new hostid> must be 32bit 8-digit hex number xxxxxxxx, each x in 0-f|F.\n"
                  "        The ip corresponding to <new hostid> is also listed for reference only.\n", argv[0]);
  fprintf(stderr, "usage: %s -i <dotted ip address>\n"
                  "        Change system hostid to hostid corresponding to <dotted ip address>.\n", argv[0]);
  fprintf(stderr, "usage: %s -l <dotted ip address>\n"
                  "        List hostid corresponding to <dotted ip address>;\n"
                  "        no change to system hostid.\n", argv[0]);
  fprintf(stderr, "usage: %s -s\n"
                  "        List current system hostid and corresponding ip address;\n"
                  "        no change to system hostid.\n", argv[0]);
  return EX_USAGE;
 }

 /*********** -h <hostid> */
 if( strcmp(argv[1],"-h") == 0 ) {
  if(strlen(argv[2]) != 8) { fprintf(stderr, "new hostid must be 8-digit hex number xxxxxxxx, each x in 0-f|F\n"); return EX_USAGE; }

  newhostid = strtol(argv[2],&invalidchar,16);
  if( newhostid==LONG_MIN ) { perror("strtol (underflow)"); return EX_USAGE; }
  if( newhostid==LONG_MAX ) { perror("strtol (overflow)"); return EX_USAGE; }
  if(*invalidchar != '\0') { fprintf(stderr, "invalid hex number %c; must be in 0-f|F\n", *invalidchar); return EX_USAGE; }

  if( sethostid(newhostid) != 0 ) {
   perror("sethostid");
   if( errno == EPERM ) return EX_NOPERM;
   return EX_UNAVAILABLE;
  }
  printf("System hostid changed.\n");

  /* hostid is a byte-swapped ip like b.a.d.c */
  /* get ip by swapping to correct order a.b.c.d and save into in_addr struct */
  addr.s_addr = htonl(swapbytes(newhostid));
 }

 /*********** -i or -l, then find hostid for <dotted ip address> */
 if( (strcmp(argv[1],"-i") == 0) || (strcmp(argv[1],"-l") == 0) ) {
  if( inet_aton(argv[2], &addr) == 0 ) { fprintf(stderr, "inet_aton: invalid dotted ip address: %s\n", argv[2]); return EX_USAGE; }

  /* hostid is a byte-swapped ip like b.a.d.c */
  /* get hostid by swapping bytes of ip a.b.c.d */
  newhostid = swapbytes(ntohl(addr.s_addr));
 }

 /*********** -i <dotted ip address> */
 if( strcmp(argv[1],"-i") == 0 ) {
  if( sethostid(newhostid) != 0 ) {
   perror("sethostid");
   if( errno == EPERM ) return EX_NOPERM;
   return EX_UNAVAILABLE;
  }
  printf("System hostid changed.\n");
 }

 /*********** -l <dotted ip address> */
 if(strcmp(argv[1],"-l") == 0 ) {
   printf("Listing hostid for ip. (no changes)\n");
 }

 /*********** -s */
 if(strcmp(argv[1],"-s") == 0 ) {
   /* see if /etc/hostid exists and print info */
   if( access("/etc/hostid", F_OK) != 0 ) {
     printf("Notice: /etc/hostid does not exist.\n");
     perror("Reason");
     printf("System hostid derived from ip of hostname found in hosts file or dns.\n");
   } else {
     printf("Notice: /etc/hostid exists.\n"
            "System hostid is obtained from hostid file.\n");
   }

  errno=0;
  newhostid = gethostid();
  if(errno!=0) { perror("gethostid"); return EX_UNAVAILABLE; }

  /* hostid is a byte-swapped ip like b.a.d.c */
  /* get ip by swapping to correct order a.b.c.d and save into in_addr struct */
  addr.s_addr = htonl(swapbytes(newhostid));

  printf("Listing current system hostid. (no changes)\n");
 }

 /*********** print info and return */
 printf("ip dot: %s\n", inet_ntoa(addr) );
 printf("ip hex: %.8x\n", ntohl(addr.s_addr) );
 printf("hostid: %.8x\n", (unsigned int) newhostid );
 return EX_OK;
}

Now available: the "ZFS root (builtin)" wiki page (aka "ZFS root, part II"): http://slackwiki.com/ZFS_root_(builtin) . It is now possible to build the modules and a small initramfs into your kernel to get a fully-contained, bootable ZFS root kernel from a simple lilo entry. Good luck!
