Archive
ZFS on Linux ‘insufficient replicas’ panic
I run a lovely little HP N54L MicroServer at home to keep all my important bits. It’s been a faithful companion for many years across two continents. I’m running Ubuntu LTS on it, booting off a small SSD but keeping years’ worth of backups across two ZFS mirrors.
I discovered this evening that the little PCIe card I was using for my boot drive had failed. There’s a spare SATA port on the motherboard I never bothered using (it’s only SATA II, the SSD is SATA III), so I just pulled out the old card and booted off the onboard controller. Imagine the horror when I got the following response to my zpool status after the first boot:
root@dumpy:~# zpool status
  pool: first
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://zfsonlinux.org/msg/ZFS-8000-5E
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        first       UNAVAIL      0     0     0  insufficient replicas
          mirror-0  UNAVAIL      0     0     0  insufficient replicas
            sda     UNAVAIL      0     0     0
            sdb     FAULTED      0     0     0  corrupted data

  pool: second
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://zfsonlinux.org/msg/ZFS-8000-5E
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        second      UNAVAIL      0     0     0  insufficient replicas
          mirror-0  UNAVAIL      0     0     0  insufficient replicas
            sdc     FAULTED      0     0     0  corrupted data
            sdd     FAULTED      0     0     0  corrupted data
The whole point of having two separate mirrors was that it should take something far more serious than the failure of an unrelated controller to corrupt them both!
After taking a deep breath I had a look at the data again, and at the rest of my system. /dev/sda was now my boot SSD, but ZFS thought it was part of an array. It looked like using the onboard port had shuffled the drive names around. ZFS caches this device information in /etc/zfs/zpool.cache to speed up mounting on boot, and moving the drives around had invalidated it.
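In hindsight, a useful sanity check before deleting anything is to dump the configuration ZFS cached at its last import and compare it with what the kernel currently calls each disk. This is just a rough sketch, not something I ran at the time, and the exact zdb output varies between releases:

# Show the pool configuration recorded in /etc/zfs/zpool.cache
zdb -C

# List the stable by-id names and the sdX devices they currently point to
ls -l /dev/disk/by-id/ | grep -v part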
So, I did the following (sketched as a single sequence after the list):
- rm /etc/zfs/zpool.cache
- Rebooted the machine (unloading and reloading the ZFS modules should also theoretically work)
- zpool import <my pools>
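Put together, the fix amounts to the handful of commands below. The module-unload line is the ‘theoretically’ path I didn’t test, and -d /dev/disk/by-id is an optional extra that asks ZFS to import using the stable serial-number names rather than whatever sdX happens to be that day:

# 1. Drop the stale cache so ZFS stops trusting the old device paths
rm /etc/zfs/zpool.cache

# 2. Reboot, or (untested by me) unload and reload the modules instead:
#    modprobe -r zfs && modprobe zfs

# 3. Re-scan the disks and import each pool by name, using by-id paths
zpool import -d /dev/disk/by-id first
zpool import -d /dev/disk/by-id second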
And all my bits were back in the correct order!
root@dumpy:~# zpool status
  pool: first
 state: ONLINE
  scan: scrub repaired 0 in 3h27m with 0 errors on Sun Dec 14 03:27:14 2014
config:

        NAME                                          STATE     READ WRITE CKSUM
        first                                         ONLINE       0     0     0
          mirror-0                                    ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WMAZA6447754  ONLINE       0     0     0
            ata-WDC_WD20EARX-00PASB0_WD-WMAZA6448154  ONLINE       0     0     0

errors: No known data errors

  pool: second
 state: ONLINE
  scan: scrub repaired 0 in 9h42m with 0 errors on Sun Dec 14 09:42:32 2014
config:

        NAME                                          STATE     READ WRITE CKSUM
        second                                        ONLINE       0     0     0
          mirror-0                                    ONLINE       0     0     0
            ata-WDC_WD20EARS-00J2GB0_WD-WCAYY0231617  ONLINE       0     0     0
            ata-WDC_WD20EARS-00J2GB0_WD-WCAYY0221030  ONLINE       0     0     0

errors: No known data errors
I initially created the system with an early 0.6.0 release candidate of ZFS on Linux, which is why it was doing something as silly as identifying drives by /dev/sd? in the first place. Now that I’m running the 0.6.3 release, I’m happy to see it using drive serial numbers instead.
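If you have an older pool that still shows plain /dev/sdX names in zpool status, the commonly recommended approach (again, just a sketch; make sure nothing is using the pool while you do it) is to export it and re-import it against the by-id directory:

# Export the pool, then re-import it using /dev/disk/by-id paths,
# which survive controller and port changes
zpool export first
zpool import -d /dev/disk/by-id first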
Hopefully this information will save someone from blowing away a valid mirror and having to restore from backups…