ZFS: third time through

Fascinating: I had just finished a scrub on June 6. Then a drive started dying. Now I’ve attempted to replace it, but an adjacent drive also had errors. I feel like I’m in a bit of a pickle. The snapshot below had the error, so I deleted the snapshot and did a zpool clear tank, and the resilver automatically restarted:

tank/VMs/l_4548-f30m64r@0000-installed:/lf541-f30-64-sda.img
  pool: tank
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Jul  4 09:31:08 2020
	3.61T scanned out of 5.02T at 474M/s, 0h51m to go
	464G resilvered, 71.96% done
config:

	NAME                                                      STATE     READ WRITE CKSUM
	tank                                                      DEGRADED     0     0     0
	  mirror-0                                                DEGRADED     0     0     0
	    wwn-0x5000c500536ffa79                                ONLINE       0     0     0
	    replacing-1                                           DEGRADED     0     0     0
	      8130638507939855275                                 UNAVAIL      0     0     0  was /dev/disk/by-id/wwn-0x5000c5006e3d9d3f-part1
	      wwn-0x5000cca24cd7f690                              ONLINE       0     0     0
	  mirror-1                                                ONLINE       0     0     0
	    wwn-0x5000c5005226b37b                                ONLINE       0     0     0
	    wwn-0x5000c500522766b5                                ONLINE       0     0     0
	  mirror-3                                                ONLINE       0     0     0
	    wwn-0x5000cca223ca0714                                ONLINE       0     0     0
	    wwn-0x5000cca224cb6a3b                                ONLINE       0     0     0
	logs
	  mirror-2                                                ONLINE       0     0     0
	    nvme-SAMSUNG_MZVPW128HEGM-00000_S347NY0HB06043-part1  ONLINE       0     0     0
	    nvme-SAMSUNG_MZVPW128HEGM-00000_S347NY0HB06165-part1  ONLINE       0     0     0
	cache
	  nvme-eui.002538cb61020e02-part2                         ONLINE       0     0     0
	  nvme-eui.002538cb61020e7c-part2                         ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        <0xb081>:<0x2>

So at the bottom it has an unnamed file reference, and it is probably going to stick at 71.x% for the next two hours.
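
To keep an eye on that percentage without rereading the whole status screen, something like the following works as a minimal sketch. The function name resilver_percent is made up for illustration, and it assumes the "N resilvered, X% done" line format shown in the output above.

```shell
# resilver_percent: hypothetical helper, not part of ZFS.
# Reads `zpool status` output on stdin and prints the "% done" number
# from the "... resilvered, X% done" line, if there is one.
resilver_percent() {
  awk '/resilvered,/ && /% done/ {
         for (i = 1; i <= NF; i++)
           if ($i ~ /%$/) { sub(/%$/, "", $i); print $i }
       }'
}

# Usage, assuming a pool named tank as above:
#   zpool status tank | resilver_percent
```

Wrapping the usage line in watch or a sleep loop gives a one-line progress display.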

Adding Lots of Storage Pools to Libvirt

I do not enjoy having subdirectories involved for storing files in libvirt. The virt-manager interface is just way too brutal about how you manually add storage pools. After much ranting, I wrote a bash script to add these directories for my VM disk images.

#!/bin/bash

set -e
#set -x
# Names of the pools libvirt already knows about.
existing_pools=()

# add_this_dir ARRAY_NAME DIR
# Define, start, and autostart a dir-type pool named after DIR's basename.
function add_this_dir() {
    local -n xisting_pools="$1"
    if [ ! -d "$2" ]; then
      echo "add_this_dir: $2 is not a directory"
      return
    fi
    local shortname
    shortname=$(basename "$2")
    # Already listed under the full path: just make sure it is running.
    if [[ " ${xisting_pools[@]} " =~ " $2 " ]]; then
      echo "$2 already exists"
      sudo virsh pool-start "$shortname" &>/dev/null ||:
      sudo virsh pool-autostart "$shortname" &>/dev/null ||:
      return
    fi
    # Already listed under the short name: same treatment.
    if [[ " ${xisting_pools[@]} " =~ " $shortname " ]]; then
      echo "$shortname already exists"
      sudo virsh pool-start "$shortname" &>/dev/null ||:
      sudo virsh pool-autostart "$shortname" &>/dev/null ||:
      return
    fi
    sudo virsh pool-define-as --name "$shortname" --type dir --target "$2" --source-path "$2"
    sleep 1
    sudo virsh pool-start "$shortname" &>/dev/null ||:
    sudo virsh pool-autostart "$shortname" &>/dev/null ||:
    sleep 1
}

# Collect pool names from the first column, skipping header and rule lines.
while read -r L; do
  if [[ -z $L ]]; then continue; fi
  hunks=($L)
  existing_pools+=("${hunks[0]}")
done < <(sudo virsh pool-list --all | grep -v -e Autostart -e '----' )

echo "You have these existing pools defined: "
echo "%%${existing_pools[@]}%%"

# One pool per subdirectory of the VM and ISO trees.
while read -r D; do
  add_this_dir existing_pools "/tank/VMs/$D"
done < <(ls /tank/VMs)
while read -r D; do
  add_this_dir existing_pools "/tank/softlib/iso/$D"
done < <(ls /tank/softlib/iso)
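
The loop that collects existing pool names could also be done in one awk pass. This is just a sketch of that idea: pool_names is a made-up helper, and it assumes the usual virsh table layout of one header line, one dashed rule, then one row per pool.

```shell
# pool_names: hypothetical helper, not part of libvirt.
# Reads the table printed by `virsh pool-list --all` on stdin and
# prints one pool name per line (first column, header and rule skipped).
pool_names() {
  awk 'NR > 2 && NF { print $1 }'
}

# Usage (bash):
#   mapfile -t existing_pools < <(sudo virsh pool-list --all | pool_names)
```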

 

VirtualBox: boot from USB image

Projects like OPNsense.org provide you with an .img file that you would dd to a USB device to boot from. It is not obvious how to use that image from VirtualBox: you need to convert it into a VMDK file. Basically, the command I used was:

vboxmanage convertfromraw OPNsense-19.7-OpenSSL-serial-amd64.img /tank/VMs/4544-opnsense-19-freebsd/opensense-19.7-usb.vmdk --format VMDK

Then attach that VMDK file to your virtual SATA controller. When you boot, be really quick: hit F12 and choose option 2. That’s your USB device.
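
The attach step can be done from the command line too. Here is a rough sketch using VBoxManage storageattach; the wrapper names, the VM name "opnsense", the controller name "SATA", and port 1 are all assumptions for illustration, so check your own VM's settings first. With DRY_RUN set it only prints the command.

```shell
# run: print the command when DRY_RUN is set, otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# attach_vmdk VM CONTROLLER PORT FILE: hypothetical wrapper that attaches
# FILE as a hard disk on the given storage controller port.
attach_vmdk() {
  run VBoxManage storageattach "$1" --storagectl "$2" \
      --port "$3" --device 0 --type hdd --medium "$4"
}

# Usage (VM name, controller name, and port are assumptions):
#   attach_vmdk opnsense SATA 1 /tank/VMs/4544-opnsense-19-freebsd/opensense-19.7-usb.vmdk
```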

 

Ubuntu 18.04 Terminal Boot

Here is a series of commands to get Ubuntu 18.04 to boot into terminal mode, with various extras on how to get an automatic menu on boot up.

Skipping Graphical Boot

If you want to skip the graphical login screen, hit [Shift] or [Esc] early in boot to bring up the grub menu. Press e to edit the boot entry, then add this to the end of the line that starts with linux:
systemd.unit=multi-user.target
Then hit Ctrl-X.

Changing the Default Boot Target

Become root. In /lib/systemd/system, change the default.target symlink:

# rm default.target; ln -s multi-user.target default.target
# systemctl daemon-reload
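
To double-check where the symlink ended up, a tiny helper like this does the job. It is only a sketch, and default_target is a made-up name. For what it's worth, systemctl set-default multi-user.target is the stock way to get the same result; it writes its symlink under /etc/systemd/system instead.

```shell
# default_target [DIR]: print the unit that DIR/default.target points at.
# DIR defaults to /lib/systemd/system, the directory edited above.
default_target() {
  basename "$(readlink "${1:-/lib/systemd/system}/default.target")"
}

# Usage:
#   default_target          # multi-user.target after the change above
#   systemctl get-default   # what systemd itself resolves, for comparison
```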

Checking the Filesystem Every Boot

If you run the rm and ln commands above on one line, joined with a semicolon, you can still use tab-completion. Next, we go to /etc/default and update the grub settings:

# cd /etc/default
# vim grub
Change GRUB_CMDLINE_LINUX_DEFAULT to this value:
"fsck.mode=force fsck.repair=yes"

Run update-grub2:
# update-grub2
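
If you would rather not open vim for a one-line change, the edit can be scripted with GNU sed. A minimal sketch: set_grub_cmdline is a hypothetical helper, and it assumes the new value contains no |, &, or double-quote characters.

```shell
# set_grub_cmdline FILE VALUE: hypothetical helper that rewrites the
# GRUB_CMDLINE_LINUX_DEFAULT line in FILE in place (GNU sed -i).
# VALUE must not contain |, & or " characters.
set_grub_cmdline() {
  sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$2\"|" "$1"
}

# Usage (as root):
#   cp /etc/default/grub /etc/default/grub.bak
#   set_grub_cmdline /etc/default/grub "fsck.mode=force fsck.repair=yes"
#   update-grub2
```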

Reinforce this behavior by using tune2fs to make each file system run a check each boot. What file systems are you running?

# lsblk -o NAME,MOUNTPOINT # will produce output kinda like:
sda   
  sda1  /boot
  sda2  /
  sda3 [SWAP]
  sda4 /home

Running these commands will make sda1, sda2, and sda4 all run a check on every mount:

# tune2fs -c1 /dev/sda1
# tune2fs -c1 /dev/sda2
# tune2fs -c1 /dev/sda4
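
With more than a few partitions, the list of devices can be pulled from lsblk instead of typed by hand. This is a sketch under a couple of assumptions: ext_devices is a made-up helper, and the machine uses plain partitions whose lsblk NAME maps straight to /dev/NAME (no LVM or device-mapper volumes).

```shell
# ext_devices: hypothetical helper. Reads "NAME FSTYPE" pairs on stdin
# (the output of `lsblk -nro NAME,FSTYPE`) and prints /dev/NAME for every
# ext2/ext3/ext4 filesystem; tune2fs only applies to those.
ext_devices() {
  awk '$2 ~ /^ext[234]$/ { print "/dev/" $1 }'
}

# Usage (as root):
#   lsblk -nro NAME,FSTYPE | ext_devices | while read -r dev; do
#     tune2fs -c1 "$dev"
#   done
```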

Reboot:
# reboot

That shouldn’t take too long. You have a tty login now.

Creating an Automatic Menu

I’m disabling a few things:

systemctl disable snapd.service wpa_supplicant.service unattended-upgrades.service cups-browsed.service cups.service
systemctl daemon-reload

There will be lots of snaps you don’t want:

snap list --all | awk '/gnome|gtk/{print $1, $2}' | while read snapname snaprevision; do snap remove "$snapname" --revision="$snaprevision"; done
This didn't work well; maybe a plain snap remove "$snapname" is enough.

You are logged in on tty1 by default. (I don't know why tty0 exists.) Following this guide, create this directory:
# cd /etc/systemd/system
# mkdir getty@tty1.service.d
# cd getty@tty1.service.d
# vim override.conf

[Service]
ExecStart=
ExecStart=-/root/onboot.bash
StandardInput=tty
StandardOutput=tty

# vim /root/onboot.bash

#!/bin/bash
echo "This is a sound recorder appliance. Hit a key to start recording."
RECORDING=0
while true; do
  read -sn1 KEY
  if [[ $RECORDING = 0 ]]; then
    RECORDING=1
    echo "Now recording"
    /root/start-recording.bash
  else
    RECORDING=0
    echo "Recording stopped"
    /root/stop-recording.bash
  fi
done

 

# chmod +x /root/onboot.bash
# systemctl daemon-reload
# reboot

All you have to do then is record things with the start-recording.bash and stop-recording.bash scripts.
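
I haven't shown those two scripts, so here is one hypothetical shape they could take: start the recorder in the background, remember its PID in a file, and kill that PID to stop. Everything here is an assumption for illustration, including the function names, the PID file location, and the use of arecord from alsa-utils as the recorder.

```shell
# Hypothetical stand-ins for start-recording.bash and stop-recording.bash.
# PIDFILE remembers which process to stop later.
PIDFILE="${PIDFILE:-/run/recording.pid}"

# start_recording CMD [ARGS...]: run the recorder command in the
# background and save its PID, so the caller gets control back at once.
start_recording() {
  "$@" &
  echo "$!" > "$PIDFILE"
}

# stop_recording: signal the process that start_recording launched.
stop_recording() {
  [ -f "$PIDFILE" ] || return 0
  kill "$(cat "$PIDFILE")" 2>/dev/null
  rm -f "$PIDFILE"
}

# Usage, assuming arecord is the recorder:
#   start_recording arecord -q -f cd -t wav "/root/rec-$(date +%Y%m%d-%H%M%S).wav"
#   stop_recording
```

Running the recorder in the background matters because onboot.bash calls the start script in the foreground and then goes back to waiting for a key.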

ZFS Snapshot alias

Add this to your .bash_aliases for fun and profit:

function Snapshot () {
  local dst=""
  local atnam=""
  if [ -z "$1" ]; then
    dst=`df -l . | tail -1 |awk '{print $1}'`
  else
    dst="$1"
    if [[ $1 = *@* ]]; then
      atnam="${1##*@}"
      dst="${1%%@*}"
    fi
    dst=`df -l "$dst" | tail -1 |awk '{print $1}'`
  fi
  [ -z "$dst" ] && echo "wants file system name to snapshot" && return 1
  local NOW=`date +%Y%m%d-%H%M%S`
  [[ $dst = /* ]] && dst="${dst#/}"
  [[ $dst = */ ]] && dst="${dst%/}"
  [[ x$atnam = x ]] && atnam=$NOW
  sudo zfs snapshot "${dst}@${atnam}"
}
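
The name@label handling in that function leans on two bash parameter expansions. A small sketch of just that part, with a made-up helper name, shows what each one keeps:

```shell
# split_snapshot_arg ARG: show how an argument is split at the @ sign.
# Prints the part before the first @, then the part after the last @.
split_snapshot_arg() {
  echo "${1%%@*}"   # strip the longest suffix starting at an @
  echo "${1##*@}"   # strip the longest prefix ending at an @
}

# Usage:
#   split_snapshot_arg /tank/VMs@before-upgrade
```

So Snapshot /tank/VMs@before-upgrade snapshots whatever file system holds /tank/VMs with the label before-upgrade, while a bare Snapshot uses the current directory and a timestamp.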

 

Ubuntu 18.04 Netplan!

This was unexpected, but I think I’m coping well. These are my notes on configuring netplan networking on my Ubuntu 18.04 server.

  1. systemctl disable NetworkManager.service NetworkManager-wait-online.service
  2. systemctl mask NetworkManager-wait-online.service
  3. systemctl daemon-reload
  4. apt install bridge-utils -y
  5. edit /etc/udev/rules.d/70-net.rules
    SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{dev_id}=="0x0", ATTR{type}=="1", ATTR{address}=="c8:70:00:9f:d7:72", NAME="eth0"
    SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{dev_id}=="0x0", ATTR{type}=="1", ATTR{address}=="00:e2:ed:17:09:60", NAME="eth1"
    SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{dev_id}=="0x0", ATTR{type}=="1", ATTR{address}=="00:e2:ed:17:09:61", NAME="eth2"
    SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{dev_id}=="0x0", ATTR{type}=="1", ATTR{address}=="00:e2:ed:17:09:62", NAME="eth3"
    SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{dev_id}=="0x0", ATTR{type}=="1", ATTR{address}=="00:e2:ed:17:09:63", NAME="eth4"
  6. edit /etc/netplan/01-netcfg.yaml
    network:
      version: 2
      renderer: networkd
      ethernets:
        eth0:
          dhcp4: no
          dhcp6: no
        eth1:
          dhcp4: no
          dhcp6: no
        eth2:
          dhcp4: no
          dhcp6: no
        eth3:
          dhcp4: no
          dhcp6: no
        eth4:
          dhcp4: no
          dhcp6: no
      bridges:
        br0:
          dhcp4: yes
          dhcp6: no
          interfaces:
             - eth0
          routes:
             -  to: 192.168.100.0/24
                via: 192.168.45.3
                on-link: true
        br1:
          dhcp4: no
          dhcp6: no
          addresses: [10.45.0.1/24]
          interfaces:
             - eth1
        br2:
          dhcp4: no
          dhcp6: no
          addresses: [10.45.1.1/24]
          interfaces:
             - eth2
        br3:
          dhcp4: no
          dhcp6: no
          addresses: [10.45.2.1/24]
          interfaces:
             - eth3
        br4:
          dhcp4: no
          dhcp6: no
          addresses: [10.45.3.1/24]
          interfaces:
             - eth4
    
  7. sudo netplan generate
  8. sudo netplan apply
  9. reboot
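
The br1 through br4 stanzas in step 6 all follow one pattern: brN bridges ethN and gets 10.45.(N-1).1/24. If the list grows, a generator like this saves typing. It is only a sketch; bridge_stanzas is a made-up helper, and the indentation it prints still has to be lined up under bridges: by hand.

```shell
# bridge_stanzas FIRST LAST: hypothetical generator for the bridge
# pattern in step 6. Prints one netplan stanza per bridge number.
bridge_stanzas() {
  i="$1"
  while [ "$i" -le "$2" ]; do
    cat <<EOF
    br$i:
      dhcp4: no
      dhcp6: no
      addresses: [10.45.$((i - 1)).1/24]
      interfaces:
         - eth$i
EOF
    i=$((i + 1))
  done
}

# Usage:
#   bridge_stanzas 1 4    # paste the output under "bridges:"
```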

Without my eth1-eth4 devices plugged into a switch, rebooting takes forever.

ZFS Rebuild Script

I’ve rebuilt my zfs modules often enough that I’ve written a script to do a clean build that should avoid old kernel modules and old libraries.

#!/bin/bash
sudo find /lib/modules -depth -type d -iname "spl" -exec rm -rf {} \;
sudo find /lib/modules -depth -type d -iname "zfs" -exec rm -rf {} \;
sudo find /usr/local/src/ -depth -type d -a \( \
   -iname "spl-*" \
   -o -iname "zfs-*" \
   \) -exec rm -rf {} \;

sudo find /usr/local/lib/ -type f -a \( \
   -iname "libzfs*" \
   -o -iname "libzpool*" \
   -o -iname "libnvpair*" \
   \) -exec rm -f {} \;

cd spl
git reset --hard HEAD
git checkout master
git pull
git tag | tail -1 | xargs git checkout
./autogen.sh && ./configure && make -j13 && sudo make install
cd ../zfs
git reset --hard HEAD
git checkout master
git pull
git tag | tail -1 | xargs git checkout
./autogen.sh && ./configure && make -j13 && sudo make install

sudo update-initramfs -u
sudo update-grub2
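
One caveat with git tag | tail -1: git lists tags in plain alphabetical order, so zfs-0.8.4 sorts after zfs-0.8.10 and gets picked as the "latest". A possible fix, sketched here with a made-up helper name, is to sort by version with GNU sort -V. Skipping release candidates (tags containing -rc) is my assumption about what you want from a clean build.

```shell
# latest_tag: hypothetical helper. Reads tag names on stdin and prints
# the highest one in version order, ignoring release candidates (-rc).
latest_tag() {
  grep -v -e '-rc' | sort -V | tail -n 1
}

# Usage inside the spl or zfs checkout:
#   git tag | latest_tag | xargs git checkout
```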