Just as I thought I was all cool for having a sixteen-drive NAS, opening it up today to try a new network card (which did not fit) left me with bad news on the next power-up.
> dmesg | grep ata | grep error:
[ 23.223221] ata13.00: error: { ABRT }
[ 23.234448] ata13.00: error: { ABRT }
[ 31.262674] ata13.00: error: { ABRT }
[ 31.275241] ata13.00: error: { ABRT }
[ 31.288012] ata13.00: error: { ABRT }
[ 39.073802] ata13.00: error: { ABRT }
[ 50.815339] ata13.00: error: { ABRT }
[ 50.827082] ata13.00: error: { ABRT }
[ 57.606645] ata13.00: error: { ABRT }
[ 69.616356] ata7.00: error: { ABRT }
[ 69.616451] ata13.00: error: { ABRT }
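With a longer log, a per-port tally makes it easier to see which ports are complaining. A minimal sketch, assuming the kernel keeps printing errors in the `ataN.MM: error: { ... }` form shown in the log; `count_ata_errors` is just a name made up for this helper, not an existing tool:

```shell
# Hypothetical helper: count "error:" lines per ATA port
# from dmesg-style text read on stdin.
count_ata_errors() {
    grep -oE 'ata[0-9]+\.[0-9]+: error:' |  # keep only the "ataN.MM: error:" part
        cut -d: -f1 |                       # drop everything after the port name
        sort | uniq -c | sort -rn           # count per port, busiest port first
}

# Usage: dmesg | count_ata_errors
```

Each output line is a count followed by the port name, so the noisiest port ends up on top.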
That’s two drives failing. TWO at the same time! ...and look at this:
> zpool status -v
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 4h43m with 0 errors on Sat Sep 6 00:13:22 2014
config:

NAME                                            STATE     READ WRITE CKSUM
tank                                            ONLINE       0     0     0
  raidz1-0                                      ONLINE       0     0     0
    ata-Hitachi_HTS547575A9E384_J2190059G9PDPC  ONLINE       0     0     0
    ata-Hitachi_HTS547575A9E384_J2190059G9SBBC  ONLINE       0     0     0
    ata-Hitachi_HTS547575A9E384_J2190059G6GMGC  ONLINE       0     0     0
    ata-Hitachi_HTS547575A9E384_J2190059G95REC  ONLINE       0     0     0
  raidz1-1                                      ONLINE       0     0     0
    ata-Hitachi_HTS547575A9E384_J2190059G9LH9C  ONLINE       0     0     0
    ata-Hitachi_HTS547575A9E384_J2190059G95JPC  ONLINE       0     0     0
    ata-Hitachi_HTS547575A9E384_J2190059G6LUDC  ONLINE       0     0     0
    ata-Hitachi_HTS547575A9E384_J2190059G5PXYC  ONLINE       0     0     0
  raidz1-2                                      ONLINE       0     0     0
    ata-TOSHIBA_MQ01ABD050_X3EJSVUOS            ONLINE       0     0     0
    ata-TOSHIBA_MQ01ABD050_X3EJSVUNS            ONLINE       0     0     0
    ata-TOSHIBA_MQ01ABD050_933PTT11T            ONLINE       0     0     0
    ata-TOSHIBA_MQ01ABD050_933PTT17T            ONLINE       0     0     0
  raidz1-3                                      ONLINE       0     0     0
    ata-TOSHIBA_MQ01ABD050_933PTT12T            ONLINE       0     0     0
    ata-TOSHIBA_MQ01ABD050_933PTT13T            ONLINE       0     0     2
    ata-TOSHIBA_MQ01ABD050_933PTT14T            ONLINE       0     0     2
    ata-TOSHIBA_MQ01ABD050_933PTT0ZT            ONLINE       0     0     0
logs
  ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW-part5   ONLINE       0     0     0
cache
  ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW-part6   ONLINE       0     0     0

errors: No known data errors
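With sixteen drives in that table it is easy to miss a stray non-zero counter. Here is a rough filter that prints only the rows whose READ, WRITE or CKSUM column is not a plain 0. It is a sketch that assumes the five-column layout of the status output for this pool; `zpool_error_rows` is a hypothetical helper name, not part of ZFS:

```shell
# Hypothetical helper: read `zpool status` text on stdin and print
# "name READ WRITE CKSUM" for every row with a non-zero error counter.
zpool_error_rows() {
    awk 'NF >= 5 &&
         $2 ~ /^(ONLINE|DEGRADED|FAULTED|UNAVAIL|OFFLINE|REMOVED)$/ &&
         ($3 != "0" || $4 != "0" || $5 != "0") { print $1, $3, $4, $5 }'
}

# Usage: zpool status tank | zpool_error_rows
```

The state check is there only to skip the header and the `status:`/`action:` text lines, which would otherwise look like rows with five or more fields.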
Two drives with checksum errors in the same raidz1 vdev (the RAID 5 equivalent). That’s going to be a very tricky replacement. I think I’m going to either replace one disk at a time and hope each resilver goes through, or maybe... add a PCI controller back in there, build another pool on it, and migrate the data from one pool to the other? That’ll be a wild trick.
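For the one-disk-at-a-time route, this is roughly the sequence I have in mind. Treat it as a sketch, not a tested procedure: the new disk names are placeholders I made up, the `run` and `replace_and_wait` helpers are hypothetical, and it defaults to a dry run that only prints the `zpool` commands. The wait loop assumes `zpool status` reports "resilver in progress" on its scan line while a resilver is running.

```shell
# Sketch only. POOL matches the pool above; everything else is a placeholder.
POOL=tank
DRY_RUN=${DRY_RUN:-1}   # 1 = just print the commands, anything else = run them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Replace one disk, then block until the resilver has finished,
# so the second disk is never pulled while the first is rebuilding.
replace_and_wait() {
    old=$1
    new=$2
    run zpool replace "$POOL" "$old" "$new"
    [ "$DRY_RUN" = 1 ] && return 0
    # Assumption: the scan line reads "resilver in progress" until it is done.
    while zpool status "$POOL" | grep -q 'resilver in progress'; do
        sleep 60
    done
}

# Usage (ata-NEW_DISK_1 and ata-NEW_DISK_2 are made-up names):
#   replace_and_wait ata-TOSHIBA_MQ01ABD050_933PTT13T ata-NEW_DISK_1
#   replace_and_wait ata-TOSHIBA_MQ01ABD050_933PTT14T ata-NEW_DISK_2
#   run zpool scrub "$POOL"
```

Waiting between the two replacements is the whole point: raidz1 only tolerates one missing disk per vdev, so the second swap should not start until the first resilver is complete.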
It will frack up my backups for a while, that’s for sure. Oh, and those Toshiba drives? That’s three Toshiba failures, zero Hitachi failures.