December 28, 2017

Mirrored ZPOOL with ZFS Boot on Root FAIL … and Fix

matto@spock: cat /projects_share/docs/zfs_zpool-attachdrive.txt

iX SpecOps: MIRRORED ZPOOL ROOT/BOOT FIX

Description: SINGLE VDEV BOOT ZPOOL FOR PRODUCTION SYSTEM

The problem with Prometheus is that ada4 already had another pool on it, tank1, which I suspect is why the root zpool was named tank2. Nobody destroyed this root zpool and its bootcode when GPT partitioning and bootcode
were set up on the Intel ada0/1 pair. So whenever the Intel RAID broke, or was not discovered first, the system booted from the broken ada3s2 partition.

In fact, chances are that this occurred many times, and on nearly every device in the system, which is full of unmatched drives of various sizes with no apparent plan.

        zpool import                                     
           pool: tank1
             id: 13323130829716330915
          state: ONLINE
         status: The pool was last accessed by another system.
         action: The pool can be imported using its name or numeric identifier and
             the '-f' flag.
           see: http://illumos.org/msg/ZFS-8000-EY
         config:

                tank1       ONLINE
                  ada4p2    ONLINE

The Fix:

Attaching the device to the existing 30GB zpool, tank2, was the correct approach. Several variations of it were tried.

In addition, several methods were used to copy the existing partitions onto the second disk. These failed in the same way as attaching the drive directly to the existing zpool.

The error messages were of little help.

        root@prometheus#> gpart backup /dev/ada3 | \
          gpart restore /dev/ada4
          gpart: geom 'ada4': Operation not permitted

        root@prometheus#> gpart backup /dev/ada3
          GPT 152
          1   freebsd-boot       40      512
          2   freebsd-zfs      552 58306560
          3   freebsd-swap 58307112  4194304

        root@prometheus#> gpart backup /dev/ada3 | \
          gpart restore /dev/ada4
        : Operation not permitted
        root@prometheus#> zpool attach $POOL_NAME ada3s2 ada4s2
        : Operation not permitted  # AND SO ON AND SO FORTH

## UNTIL:

        root@prometheus#> sysctl kern.geom.debugflags=0x10
          kern.geom.debugflags: 0 -> 16

        root@prometheus#> zpool attach -f tank2 /dev/ada3p2 /dev/ada4p2

Make sure to wait until resilver is done before rebooting.

If you boot from pool ‘tank2’, you may need to update boot code on newly attached disk ‘/dev/ada4p2’.

     gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada4
     partcode written to ada4p1
     bootcode written to ada4
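
The whole fix can be sketched as one small script. The commands, pool name, and device names are the ones from the transcript; `DRY_RUN`, `run`, `boot_index`, and `attach_mirror` are hypothetical names introduced here for illustration. Setting kern.geom.debugflags to 0x10 is what geom(4) calls the "allow foot shooting" flag, so this sketch assumes you want it put back to 0 afterwards, which the original session does not show. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only. Pool and device names come from the transcript above;
# DRY_RUN, run, boot_index and attach_mirror are hypothetical names.

DRY_RUN=${DRY_RUN:-1}   # 1 = print commands, anything else = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reads `gpart backup` output on stdin and prints the index of the
# freebsd-boot partition (the value passed to `gpart bootcode -i`).
boot_index() {
    awk '$2 == "freebsd-boot" { print $1; exit }'
}

attach_mirror() {
    pool=$1; existing=$2; new=$3; disk=$4; bootidx=$5

    # 0x10 allows writes to providers that GEOM otherwise protects.
    run sysctl kern.geom.debugflags=0x10 || return 1

    run zpool attach -f "$pool" "$existing" "$new" &&
        run gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i "$bootidx" "$disk"
    rc=$?

    # Assumption: restore the flag to the previous value shown in the transcript (0).
    run sysctl kern.geom.debugflags=0
    return $rc
}

attach_mirror tank2 /dev/ada3p2 /dev/ada4p2 ada4 1
```

Run with `DRY_RUN=0` to execute the commands instead of printing them. The boot partition index can be taken from the new disk rather than hard-coded, for example `gpart backup ada4 | boot_index`.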
#################################### #### SUCCESS !
        zpool status tank2
        zsh: correct 'tank2' to 'tank' [nyae]? n
          pool: tank2
         state: ONLINE
        status: One or more devices is currently being resilvered.  The pool will
                continue to function, possibly in a degraded state.
        action: Wait for the resilver to complete.
          scan: resilver in progress since Thu Dec 14 19:25:29 2017
                31.2M scanned out of 23.1G at 201K/s, 33h26m to go
                30.2M resilvered, 0.13% done
        config:

                NAME        STATE     READ WRITE CKSUM
                tank2       ONLINE       0     0     0
                  mirror-0  ONLINE       0     0     0
                    ada3p2  ONLINE       0     0     0
                    ada4p2  ONLINE       0     0     0  (resilvering)
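
Since the pool must not be rebooted until the resilver finishes, it can help to poll for completion instead of watching by hand. This is a minimal sketch that keys off the "resilver in progress" text in the scan line shown above; the function names and the 60-second default interval are assumptions, not anything zpool provides.

```shell
#!/bin/sh
# Sketch only: resilver_in_progress and wait_for_resilver are hypothetical helpers.

# Reads `zpool status` output on stdin; exits 0 while a resilver is running.
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Polls `zpool status POOL` until the scan line no longer reports a running resilver.
wait_for_resilver() {
    pool=$1
    interval=${2:-60}   # assumed polling interval in seconds
    while zpool status "$pool" | resilver_in_progress; do
        sleep "$interval"
    done
}

# Usage on a live system (not run here):
#   wait_for_resilver tank2 && echo "resilver finished, safe to reboot"
```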