An Ode to Replacing a Disk in a ZFS Pool

A disk in my ZFS pool has just started to die - and, quite annoyingly, not in a consistent way.

This sombre news means that I need to replace a drive in my zpool. gulp, etc

First things first - which drive is dying? If things are bad enough, zpool status will be complaining about which drive is experiencing errors. But hopefully it won’t have got to that stage, and your smartmontools setup will have picked up an increase in drive errors early on. (hahaha it turned out that my smartmontools setup wasn’t working, and that’s why it took ZFS freaking out for me to be alerted - learn from my lesson, kids!)
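Assuming smartmontools is actually working (unlike mine was), a quick manual check looks something like this - /dev/sda is just a stand-in for whichever device you suspect:

sudo smartctl -a /dev/sda | grep -i -E 'reallocated|pending|uncorrectable'

Climbing counts for those attributes are a decent early warning that a drive is on its way out.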

Thankfully I have my ZFS pool set to use /dev/disk/by-id identifiers, rather than the rather arbitrary (and subject to change) /dev/sda approach. Present me is thankful to past me for doing this, as I don’t have to try and map those /dev/sda style identifiers to a drive model and serial number. Tough luck if you have to do this step.
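If you are stuck with the /dev/sdX names, my understanding is that you can usually switch an existing pool over by exporting it and re-importing it from the by-id directory - a sketch using my pool name, with the usual do-your-own-homework caveats:

sudo zpool export slowpool
sudo zpool import -d /dev/disk/by-id slowpool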

I’m even more thankful to past me as I’ve written the drive size, manufacturer and serial number on a label and stuck it on the hot-swap drive bay of my server. So all I have to do is go and find the drive and yank it out. And yes, I’ve not had to do anything so far in ZFS. I like ZFS.

After a bit of fumbling with screws, and a moment of terror as I think I’ve dropped one into the vent of a power supply, I have the new drive sitting in a caddy ready to be installed in the server. Time to take a photo of the new drive so I have a record of the serial and model number.

Once the new drive is physically installed, zpool status tells me that something is wrong with the pool. Which is to be expected, as I’ve just removed one of the drives.

  pool: slowpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or invalid. Sufficient replicas exist for the pool to continue functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: resilvered 5.86M in 00:00:10 with 0 errors on Mon Feb 10 19:48:00 2025
config:

	NAME                                            STATE     READ WRITE CKSUM
	slowpool                                        DEGRADED     0     0     0
	  mirror-0                                      DEGRADED     0     0     0
	    9764235106334271955                         UNAVAIL      0     0     0  was /dev/disk/by-id/scsi-SATA_ST4000VN008-2DR1_ZDHAMGT1-part1
	    scsi-SATA_WDC_WD40EFZX-68A_WD-WX12DA08KR45  ONLINE       0     0     0
	  mirror-1                                      ONLINE       0     0     0
	    scsi-SATA_WDC_WD40EFZX-68A_WD-WX12DA0R1V78  ONLINE       0     0     0
	    scsi-SATA_WDC_WD40EFZX-68A_WD-WX82DA1RHH08  ONLINE       0     0     0
	  mirror-2                                      ONLINE       0     0     0
	    scsi-SATA_ST4000VN008-2DR1_ZDHB6E35         ONLINE       0     0     0
	    scsi-SATA_ST4000VN008-2DR1_ZDHB880S         ONLINE       0     0     0
	  mirror-3                                      ONLINE       0     0     0
	    scsi-SATA_<redacted>                        ONLINE       0     0     0
	    scsi-SATA_<redacted>                        ONLINE       0     0     0

(<redacted> as those drives are still under warranty).

Before replacing the old drive I need to know the ID of the new one. A quick ls -alh /dev/disk/by-id/ and a scan for that serial number tells me that it’s called /dev/disk/by-id/scsi-SATA_ST16000NT001-3LV_REDACTED.
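If, like me, you took a photo of the label, piping that listing through grep gets you there faster - the serial here is a placeholder:

ls -alh /dev/disk/by-id/ | grep SERIALNUMBER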

Now I can tell ZFS to swap the missing disk (the UNAVAIL entry in the zpool status output above) for the new one.

sudo zpool replace slowpool /dev/disk/by-id/scsi-SATA_ST4000VN008-2DR1_ZDHAMGT1-part1 /dev/disk/by-id/scsi-SATA_ST16000NT001-3LV_REDACTED
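As an aside, zpool replace also accepts the numeric GUID that zpool status prints for a missing device (9764235106334271955 above), which should work just as well and saves typing out the old path:

sudo zpool replace slowpool 9764235106334271955 /dev/disk/by-id/scsi-SATA_ST16000NT001-3LV_REDACTED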

After a few seconds the disk is replaced and has begun resilvering as it heals the vdev.
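To keep an eye on the resilver without repeatedly poking at the terminal, something like this does the job - the 60-second interval is just my preference:

watch -n 60 zpool status slowpool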

As I write this the resilvering is projected to take over a day to complete. This is slightly unnerving, as there’s no redundancy on that mirror while I wait for it. But even so, the pool is still usable, even if degraded.

EDIT: yes, yes, yes, I know that those drive sizes don’t match. The original mirror was made of two 4TB drives, and I replaced one with a 16TB drive. Yes, I also know that this means I have many terabytes wasted. It was a choice between buying a drive of the same size or, knowing that I was going to upgrade one of the mirrors soon anyway, buying a larger one - spending more money now, but saving some in the future. Such is being a grown-up.
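A note for future me, though: when the other 4TB drive in that mirror is eventually replaced, the extra space only becomes usable if the pool’s autoexpand property is on (or after a manual zpool online -e on each disk) - a one-liner worth remembering:

sudo zpool set autoexpand=on slowpool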
