dasj / sd-zfs

Compatibility between systemd and ZFS roots

License: MIT License

zfs zfsonlinux systemd sd-zfs initrd boot mkinitcpio systemd-unit pool snapshot

sd-zfs's Introduction

sd-zfs

This project is an attempt to allow using systemd in the initrd while also having a ZFS dataset as root. It is intended for mkinitcpio (used by Arch Linux) but should also work with other systems. You are expected to already have a root filesystem on a ZFS dataset.

Please note that legacy root mounts are not supported because they are legacy technology (as the name implies).

Functionality

  • Boot from any ZFS dataset. All subdatasets are mounted as well
  • Use bootfs to decide which dataset to boot from
  • Use zpool.cache for pool caching (can be overridden)
  • Included udev rules for importing by vdev
  • All pools are exported on shutdown
  • root is mounted read-only when rw is not set on the kernel command line
  • The output of the hostid command is used when /etc/hostid is not found
  • /etc/modprobe.d/{spl,zfs}.conf is included in the initrd

Snapshot booting

sd-zfs supports booting from ZFS snapshots. As snapshots are read-only and not bootable, they are automatically cloned to new datasets, which are then booted. All subdatasets will be checked for the same snapshot.

Example:

  • Boot with root=ZFS=tank/root@snap
  • The following datasets and snapshots exist:
tank/root
tank/root@snap
tank/root/etc
tank/root/etc@snap
  • When booting, the following datasets are created (they get deleted before creating if they exist):
tank/root_initrd_snap
tank/root_initrd_snap/etc
  • sd-zfs will boot from tank/root_initrd_snap
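
The cloning is roughly equivalent to running the following commands by hand. This is only an illustrative sketch using the names from the example above, not the exact invocation sd-zfs performs:

# remove leftovers from a previous snapshot boot, if any
zfs destroy -r tank/root_initrd_snap
# clone the snapshot of the root dataset and of every subdataset that has it
zfs clone tank/root@snap tank/root_initrd_snap
zfs clone tank/root/etc@snap tank/root_initrd_snap/etc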

Installation

Get mkinitcpio-sd-zfs from the AUR. Users without Arch should read the manual installation instructions at the bottom of this document. After installation, sd-zfs is not ready for use yet: you still need to configure it.

Configuration

Kernel parameters

sd-zfs supports multiple kernel parameters to select which dataset to boot from and to tune the boot process.

Which dataset to boot from

  • root=ZFS=somepool/somedataset - Use this dataset to boot from
  • root=ZFS=AUTO - Check all pools for the bootfs value. See the rpool option to narrow the search
  • rpool=somepool - Check only this pool for the bootfs value. This value may not contain slashes

The root option can be suffixed with @snap to boot from a snapshot named snap. See "Snapshot booting" for more information.

Other options

  • rootflags=flags - Use these flags when mounting the dataset
  • zfs_force=1 - Force import of the pools
  • zfs_ignorecache=1 - Ignore the pool cache file while booting
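
Putting these together, a complete kernel command line could look like one of the following lines (the pool, dataset, and mount flags are placeholders for illustration):

root=ZFS=tank/root rw rootflags=noatime
root=ZFS=AUTO rpool=tank rw zfs_force=1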

Bootfs

sd-zfs can use the bootfs value of your zpools. This is a property of ZFS pools which is intended to point to the root filesystem used for booting. You need to set it on one of your pools (the pool containing the root filesystem is recommended). If you set it to different values on multiple pools, the first one found will be used.

Check the bootfs value of all pools:

# zpool get -p bootfs
NAME   PROPERTY  VALUE   SOURCE
tank   bootfs    -       default

Set the bootfs value:

# zpool set bootfs=tank/root tank

This will make the system boot from the dataset "root" of the pool "tank".

The mountpoint value of the dataset needs to be /.

Custom module options

If you have any options for the SPL and ZFS modules, you can add them to /etc/modprobe.d/zfs.conf and /etc/modprobe.d/spl.conf. These files will be included in the initrd if they exist at initrd build time.
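
For example, a hypothetical /etc/modprobe.d/zfs.conf that limits the ARC to 2 GiB would contain the following line (the parameter value is only an illustration):

options zfs zfs_arc_max=2147483648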

mkinitcpio.conf

Add sd-zfs to the HOOKS array of /etc/mkinitcpio.conf. As it depends on the systemd hook, it needs to come after it. It has no dependencies other than systemd.
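
A possible HOOKS line is shown below; your other hooks may differ, the only requirement is that sd-zfs appears somewhere after systemd:

HOOKS=(base systemd autodetect modconf block keyboard sd-vconsole sd-zfs filesystems)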

Cache file

When booting the system, all devices are scanned to check for pools. Depending on the number of devices you have, it can be faster to cache the pools. This is accomplished by using the standard ZFS cachefile, which will be created at /etc/zfs/zpool.cache. If it exists during creation of the initrd, it will be included.
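
If you do not have a cachefile yet, it can be (re)generated by pointing the cachefile property of your pool at the standard location, for example:

# zpool set cachefile=/etc/zfs/zpool.cache tank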

Hostid

If /etc/hostid exists during build, it will be included in the initrd. It is highly recommended to use this file. More information is found in the Arch wiki. If the file is not found, the actual value of the hostid command will be written to the initrd.
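
On current OpenZFS versions the file can be created from the running system's hostid with zgenhostid, for example (check the Arch wiki for the method matching your ZFS version):

# zgenhostid $(hostid)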

Rebuilding initrd

After changing any of these mkinitcpio-related settings (apart from the kernel command line and the bootfs value), you need to rebuild your initrd. Assuming you have the default linux package, you can just run:

# mkinitcpio -p linux

If you use another kernel (like linux-lts), you need to adapt the command.
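
For example, with the linux-lts kernel the preset is named accordingly:

# mkinitcpio -p linux-lts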

How it works

Generating

When systemd starts, all generators are run, including a generator for ZFS units. This generator parses the kernel parameters and creates systemd services for importing the pools, as well as an override for sysroot.mount, which is responsible for mounting the root filesystem.
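
The net effect of the override is comparable to a hand-written mount unit for the selected dataset. The sketch below only illustrates that idea and is not the unit the generator actually emits:

[Mount]
What=tank/root
Where=/sysroot
Type=zfs
Options=zfsutil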

Importing

When systemd is running, the pools are imported (without actually mounting them).

There are two ways to import the pools:

By scan

If no cachefile exists, all devices in /dev/disk/by-id are scanned and all pools found are imported without actually mounting them. The import can be forced via a kernel parameter.

By cachefile

If the cachefile exists, all pools listed in the cachefile are imported without actually mounting them. This can be prevented via a kernel parameter.
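
Conceptually, the two import paths correspond to the following zpool commands (a sketch; the exact flags sd-zfs passes may differ). The -N flag imports without mounting, and zfs_force=1 adds -f to either variant:

# by scan: look for pool members below /dev/disk/by-id
zpool import -a -N -d /dev/disk/by-id
# by cachefile: import everything listed in the cache
zpool import -a -N -c /etc/zfs/zpool.cache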

Mounting

The systemd unit sysroot.mount is overridden so that it runs a custom command. This command figures out the bootfs (if ZFS=AUTO is used) and handles snapshot creation. It then mounts the correct dataset, including all subdatasets.
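
For non-legacy datasets this amounts to mounts of the following form (a sketch using the example dataset from above; the zfsutil option allows mounting datasets whose mountpoint property is not legacy):

mount -t zfs -o zfsutil tank/root /sysroot
mount -t zfs -o zfsutil tank/root/etc /sysroot/etc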

Switching root

systemd will now take care of killing all processes and switching root to /sysroot.

Shutdown

When shutting down, systemd pivots back to another ramdisk. All executables from /usr/lib/systemd/system-shutdown are run. One of them is provided by this package and is responsible for forcefully exporting all pools. To accomplish that, zpool is added to the ramdisk via a systemd unit.
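
The forced export boils down to the following command, run from the shutdown ramdisk (a sketch, not the exact invocation of the shipped executable):

zpool export -a -f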

Manual installation

See mkinitcpio-install/sd-zfs, which instructs mkinitcpio what to do. $BUILDROOT is the root of the initrd that is currently being built. You need to run make all and put the resulting files in the locations mentioned in the install file.

mkinitcpio-install/sd-zfs-shutdown is responsible for copying zpool to the ramdisk on shutdown. This is required for zfs-shutdown.

Warranty

Nope.

sd-zfs's People

Contributors

ajs124, bkus-goog, dasj, genofire, lnicola

sd-zfs's Issues

Zpool import first fails then succeeds after typing Ctrl + D

Hi, thanks for sharing your work. I am trying to create a minimal initrd. I configured the hooks as follows:

HOOKS=(base udev block systemd sd-plymouth autodetect modconf keyboard keymap sd-zfs)

I am using rEFInd:

menuentry "Arch Linux (ck-surface4)" {
    icon     /EFI/refind/icons/os_arch.png
    loader   vmlinuz-linux-ck-surface4
    options  "initrd=intel-ucode.img initrd=initramfs-linux-ck-surface4-minimal.img rw root=zfs:zroot/root/default zfs_wait=30"
    submenuentry "Boot using default initramfs" {
        initrd initramfs-linux-ck-surface4.img
    }
    submenuentry "Boot using fallback initramfs" {
        initrd initramfs-linux-ck-surface4-fallback.img
        add_options "break=postmount"
    }
    submenuentry "Boot to terminal" {
        add_options "systemd.unit=multi-user.target"
    }
}

When booting, the zpool import first fails. When I press Ctrl+D, it seems to retry and the system starts normally.
Any idea what I did wrong?

Filesystems not mounted

I tried sd-zfs, but my root fs ended up in /sysroot/pool/fs instead of /sysroot, so switching the root didn't work. I'm using non-legacy mounts.

Alternative support for native encryption via a systemd service

Native encryption is currently not supported.

PR #4 offers a C-based solution, but it hasn't been mainlined yet.

I'd like to suggest an alternative solution that could be immediately mainlined since it's a pure systemd-based solution: it only requires starting the service: systemctl daemon-reload; systemctl start --now zfs-askpassword@rpool-enc

The use of conditions means no password will be asked if the key is already available, so it should also be safe to add as a default

The lack of instance parameters (starting zfs-askpassword instead of the parameterized zfs-askpassword@rpool-enc) is managed via DefaultInstance=pool-enc, which can be used to hardcode a default name (here pool/enc, due to how systemd escaping works)

The ExecStart line can be modified to support other methods of providing the key, for example TPM2-based (unseal) or LUKS-based (by specifying the location of the keyfile on the filesystem)

Currently, the solution depends on 2 services: [email protected] and zfs-remount.service, due to my limited understanding of the systemd ordering sequence for sd-zfs.

However, the second service is only present to have zfs mount mount the non-legacy datasets at their expected mountpoints on the filesystem as a separate step; if deemed desirable, the same result could be achieved via:

ExecStartPost=+/bin/sh -c 'zfs mount -a'

Ideally, this service would be generated by sd-zfs, to correctly fill the default DefaultInstance=pool-enc based on the bpool parameter, and to automatically include the dependencies from systemd-ask-password in mkinitcpio.conf:

  • FILES should contain /etc/systemd/system/sysroot.mount.wants/[email protected] and /usr/lib/systemd/system/systemd-ask-password-console.service if rpool/enc is the natively encrypted bpool
  • BINARIES should contain systemd-ask-password, systemd-tty-ask-password-agent, and anything else needed for other methods (e.g. tpm2_unseal or LUKS binaries)
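
For reference, a minimal sketch of such a template unit is shown below. The unit name, the condition check, and the ordering are assumptions made for illustration and are not the submitter's actual files:

[Unit]
Description=Load the ZFS encryption key for %I
DefaultDependencies=no
Before=sysroot.mount

[Service]
Type=oneshot
RemainAfterExit=yes
# only prompt when the key is not loaded yet (assumed check)
ExecCondition=/bin/sh -c '[ "$(zfs get -H -o value keystatus %I)" = "unavailable" ]'
ExecStart=/bin/sh -c 'systemd-ask-password "Passphrase for %I:" | zfs load-key %I'

[Install]
# "pool-enc" unescapes to pool/enc, as described above
DefaultInstance=pool-enc
WantedBy=sysroot.mount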

Development status?

Based on the stale pull request and the lack of commits in this repo despite the owner being active on GitHub, it appears that sd-zfs is not being actively maintained. Is that correct?

This would be a shame because, as I understand the current state of ZFS on root for Linux, this is my only option for booting from a ZFS pool composed of a set of mirrored LUKS-encrypted drives. If development has indeed stalled, please indicate that in the README so that users directed here (e.g. from the Arch Wiki) can easily reach that conclusion. If there is a different project that has superseded this one, then you could point to that one. If there is no active development simply because everything is running swimmingly, then I suppose your indicating as much in response to this issue will document that.

Support mounting additional pools

To implement this, a new file called /etc/initcpio/sd-zfs-mount will be added. It will contain all datasets that should be mounted before switching root.

The generator will read this file if it wants to mount just one pool (root=tank/root will usually cause it to import only tank instead of importing all pools). If datasets from other pools need to be imported as well, it will add them to the import command.

The mounter reads the file again. It will mount the required datasets.

The structure of the file should look like this:

# Import tank/etc
tank/etc
# Import fish/usr
fish/usr

This would mount the two datasets at their specified mountpoint values. Whitespace at the beginning and end of a line should be ignored.

TODO

  • Add the file to the initrd
  • Allow generator to detect if multiple pools are wanted
  • Tell the mounter to mount the additional datasets

==> WARNING: Possibly missing '/bin/sh' for script: /usr/lib/udev/vdev_id

With the latest OS update on Manjaro, running mkinitcpio gives me:

==> Starting build: 5.4.228-1-MANJARO
-> Running build hook: [systemd]
-> Running build hook: [modconf]
-> Running build hook: [sd-vconsole]
-> Running build hook: [keyboard]
-> Running build hook: [sd-plymouth]
-> Running build hook: [sd-encrypt]
-> Running build hook: [block]
-> Running build hook: [sd-zfs]
==> WARNING: Possibly missing '/bin/sh' for script: /usr/lib/udev/vdev_id
-> Running build hook: [filesystems]
==> Generating module dependencies
==> Creating gzip-compressed initcpio image: /boot/initramfs-5.4-x86_64-fallback.img

Same error message with more recent kernels like 6.0.
The OS can boot.

Failed to mount /sysroot with mix of legacy and non-legacy mount points

I want to create a multiboot pool, so I have the following:

$ zfs list -r ssd/linux
NAME                 USED  AVAIL     REFER  MOUNTPOINT
ssd/linux           81,7G  79,1G       96K  none
ssd/linux/arch      21,1G  79,1G     19,5G  legacy
ssd/linux/arch/pkg  1,59G  79,1G     1,59G  /var/cache/pacman/pkg
ssd/linux/home      60,5G  79,1G     60,5G  /home

$ grep arch /etc/fstab                 
ssd/linux/arch  / zfs defaults 0 0

I think /home is mounted before arch root.

Cannot import root pool in 1.0.1

Probably the same issue reported on AUR.

I have two pools, one for the root fs and one for storage. When I drop to the shell on boot, if I run zpool import -a, the storage pool gets imported fine, while for the root pool I get an error saying that the host id has changed. I can import it with -f, but it fails again on the next boot.

Also, after the force import, I can boot with the standard initcpio with no issues (that is, it doesn't complain about the host id like I was afraid it might).

I think I've had issues in the past where I had one host id in the running system and another one in initcpio. Possibly related to archzfs/archzfs@0760a00.

Add the setting: Wait X seconds for all drives to appear before importing the pool.

https://github.com/zfsonlinux/zfs/wiki/Ubuntu-18.04-Root-on-ZFS

At the bottom of the page, see the section MPT2SAS:

Most LSI cards are perfectly compatible with ZoL. If your card has this glitch, try setting ZFS_INITRD_PRE_MOUNTROOT_SLEEP=X in /etc/default/zfs. The system will wait X seconds for all drives to appear before importing the pool.

Same from archzfs:
mkinitcpio -H zfs

To change the number of seconds to wait for ZFS devices to show up at boot:

    zfs_wait=30

Thanks.

systemd WorkingDirectory paths mounted too late

I have a systemd service with a custom WorkingDirectory pointing to a ZFS filesystem from another pool (i.e. not the root one). It fails on boot with EXIT_CHDIR because the path isn't accessible yet. I would expect the service to start:

Units with WorkingDirectory=, RootDirectory=, RootImage=, RuntimeDirectory=, StateDirectory=, CacheDirectory=, LogsDirectory= or ConfigurationDirectory= set automatically gain dependencies of type Requires= and After= on all mount units required to access the specified paths. This is equivalent to having them listed explicitly in RequiresMountsFor=.

However, things are much better now than in 1.0. I have other services that either require or are located on filesystems on the root pool. Previously, they didn't start, but systemd reported no error and showed them as "loaded, not running".

Failing to mount /sysroot on 1.0.3 release

The latest stable update of this package (1.0.3) rendered my system unbootable. I can't give a lot of detail, as the "emergency console" wasn't letting me activate it; it kept erroneously claiming my root was locked.

Downgrading to 1.0.2 allowed me to boot with ZFS again.

Use of mountpoint or org.zol:mountpoint

I am using Arch Linux on a ZFS root filesystem, booting with systemd and sd-zfs (mkinitcpio-sd-zfs). I am experimenting with multiple root filesystem datasets, either cloned from snapshots or created from scratch. I am unsure how to specify their mount points because they are all the same (/).

The README here says

The mountpoint value of the dataset needs to be /.

Setting the mountpoint property to the same value on multiple datasets seems to cause more problems than it solves.

I noticed that snapshot clones created by sd-zfs do not have the mountpoint property but they work fine. My own datasets don't.

I've done tests and know that the initramfs won't mount the root filesystem without a defined mount point. So I dug further and found another property called org.zol:mountpoint which those snapshot clones do have. I tried setting that on my own datasets and it works.

I've read over the source for sd-zfs and it sets this property when cloning and it uses it when mounting.

So, is this some undocumented feature or is it the right way to do it - I don't know. Googling hasn't helped with this.

Booting a dataset either explicitly as tank/ROOT/arch123 or with AUTO and the bootfs property works fine with this org.zol:mountpoint. In fact, they work without setting mountpoint at all. It also works on other distros as well as Arch.

You can set it easily with zfs set org.zol:mountpoint=/ tank/ROOT/arch123 and, unlike setting mountpoint, it doesn't cause ZFS to try mounting (or unmounting) something you don't want or need (and avoids the associated error messages).

So it works nicely but isn't documented anywhere that I could find. So I ask: is it OK to do it this way, or is there a better way?

If this is the correct approach then perhaps the README could be updated to reflect this.

systemd udev doesn't wait for HDDs, system can't boot (not even recovery)

With systemd udev, sd-zfs fails to find my boot devices and attempts to go into rescue mode, and it fails there with an inability to mount /sysroot and systemd complaining about not being able to do something with the password file. With the udev and zfs hooks, udev spins for a while waiting on my storage and then everything comes up cleanly. With systemd and sd-zfs, the udev startup appears to run in parallel with other items, and this breaks the process. The boot devices are SSDs, but primary storage is all HDDs set not to spin up unless told to. Unfortunately I don't have the resources to determine whether this would happen without sd-zfs.

Do you know of some way to force systemd to wait on udev before proceeding? The initramfs is a very truncated version and I don't know where to start looking.

[Issue] Snapshot booting does not seem to work

Hi :)

First of all: normal boot works perfectly - and that is with a really complex system config.

I'm successfully using

  • Manjaro
  • rEFInd bootloader
  • Unified kernel image, complete with the kernel cmdline baked in
  • LUKS container on a partition (can't use the whole disk because of multiboot with Windows)
  • ZFS in the LUKS container
  • Secure Boot enabled - enrolled my own keys with sbctl alongside the Microsoft and vendor (Framework 16) ones - signing my unified kernel images with sbctl - sbctl has a mkinitcpio hook.

Sadly, the snapshot booting of sd-zfs does not seem to work.

I'm creating an additional kernel UKI image with the cmdline modified for the ZFS snapshot name (I wrote my own pacman hook script for this).

Booting this custom kernel with the cmdline modified just for the snapshot gives me a whole lot of normal boot messages up to a point where the system does not seem to proceed - but with no error or "systemd - Waiting on Service xxx" messages.

Booting this custom kernel into the rescue target gets me a rescue shell where I can see that the snapshot dataset was successfully created, but not all needed ZFS datasets are mounted.

I then mount them manually, which works, and then try to continue booting with systemctl isolate multi-user.target. Sadly, I don't get any more boot messages and the laptop seems to be "frozen" - the cursor is still blinking but the computer does not respond to anything. I have to hard-poweroff it.

I really don't know how to debug this problem because I don't get any error or have any log I could check (at least that's what I think - I'm not certain regarding logs). Also, because of my really complicated setup, I don't know which information to include in this issue, because there would be so much. So please feel free to ask for anything.

Best regards
hasechris

Support for a separate non-legacy /var dataset?

It does not appear that sd-zfs currently supports a separate non-legacy /var dataset. It would be nice if it did, although I don't know what would be required. This is my list of mountable OS datasets:

rpool/home                             9.06G  27.6G  7.67G  /home
rpool/pac_cache                        2.03G  27.6G  2.03G  /var/cache/pacman
rpool/root/default                     4.66G  27.6G  4.55G  /
rpool/root/opt                         1.81G  27.6G  1.81G  /opt
rpool/root/var                         2.83G  27.6G   322M  /var
rpool/root/var/cache                   1.43G  27.6G  1.10G  /var/cache

As shown, it does not work with sd-zfs (nor without it); for this to work properly I must make /var (and its children) legacy mounts in /etc/fstab.
