ZFS on Linux machine

Beavertail Cactus [Wikipedia]

Here is my ZFS on Linux story. Some of you might have seen these pictures when I started this project last year: I recycled an old Athlon system from work and put a 35 Watt AMD A2 processor with 8GB of 1600 RAM on an Asus mobo in it. I name my home systems after cacti, and after I installed Ubuntu 12.04 on it, I named this one Beavertail.bitratchet.net.

My previous experiences with storage systems involved them dying from heat, so I decided to avoid full-sized drives, stick with laptop drives, and boot off an SSD. The boot drive is a 128GB OCZ Vertex 2, partitioned with /boot, root, and two more partitions for ZIL and L2ARC. The bulk of the storage is a mix of 750GB Hitachi and 500GB Toshiba laptop hard drives, 16 in total. I have lost two drives in this system, which I would label “normal drive attrition.”


Half the drives are on the bottom, and half are on top. At work I have access to gobs of full-height card brackets and this is what I built drive cages out of.

To get all the drives wired up, I started with a bunch of 1x and 2x PCIe SATA expanders and used up all my mobo SATA ports, but by the time I got to about 12 drives, I only had a PCI slot left, so I had to use that. Looking at my disk utilization in iostat -Nx and dstat --disk-util, it was plain that a swath of drives was underperforming, and they were all connected to the slowest controller, the one in the PCI slot.
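For anyone who wants to spot that kind of outlier without eyeballing the terminal, here is a minimal Python sketch. It assumes you have captured the text of an `iostat -x` report yourself; the function names `parse_util` and `rank_by_util` are made up for illustration. It finds the `%util` column by name, since the column order differs between sysstat versions.

```python
def parse_util(iostat_text):
    """Return {device: %util} parsed from the text of an `iostat -x` report."""
    util = {}
    col = None  # index of the %util column once a device header is seen
    for line in iostat_text.splitlines():
        fields = line.split()
        if not fields:
            col = None  # a blank line ends the current device table
            continue
        if fields[0].startswith("Device"):
            # Header is "Device:" or "Device" depending on the sysstat version.
            col = fields.index("%util") if "%util" in fields else None
            continue
        if col is not None and len(fields) > col:
            try:
                util[fields[0]] = float(fields[col])
            except ValueError:
                pass  # skip rows that do not carry a numeric %util
    return util


def rank_by_util(util):
    """Sort (device, %util) pairs from busiest to least busy."""
    return sorted(util.items(), key=lambda kv: kv[1], reverse=True)
```

Drives that sit near the top of the ranking while moving less data than their neighbours are the ones worth tracing back to their controller.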

Supermicro HBA 8-port SAS controllers

I saved up and remedied that by purchasing two SuperMicro SAS HBAs with Marvell chipsets. They are only 3G SATA (equivalent) but they each control eight drives, and they do so consistently. They take 8x PCIe lanes, and that’s a great use for the two 16x PCIe slots on the mobo.

02:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev c3)
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

It took me a while to track down my network bandwidth issues. The problem was my motherboard: it has an onboard Realtek chipset. It would max out at 500Mbps download and 250Mbps upload…and very often wedge the system. I got a PCIe 1x Intel card and got a good clean 955Mbps both ways out of it with one iperf stream, and 985+Mbps with two iperf streams. To actually achieve this, I needed to put an Intel NIC in my workstation as well. (My switch is a 16-port unmanaged Zyxel.)

picture of drives

Eight drives on top

I am able to push close to full network capacity to Beavertail: the screenie below shows iftop displaying better than 880Mbps, and I saw it grab 910Mbps during this backup. Part of the success is having a Samsung 840EVO in my laptop, but striping the pool across four raidz1 vdevs leaves plenty of IO headroom.

screen capture of iftop

910Mbps transfer from laptop to NAS.
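To put those iftop and iperf figures in disk terms, here is a small arithmetic sketch. It assumes decimal units (1 MB = 8 megabits) and a nominal 1000 Mbps link; the helper names are illustrative only.

```python
def mbps_to_mbytes(mbps):
    """Convert megabits per second to megabytes per second (decimal units)."""
    return mbps / 8.0


def link_utilisation(mbps, link_mbps=1000.0):
    """Fraction of the nominal link rate that a measured rate represents."""
    return mbps / link_mbps


# Rates quoted in the post: iftop during the backup, and single-stream iperf.
for rate in (880, 910, 955):
    print(f"{rate} Mbps = {mbps_to_mbytes(rate):.2f} MB/s, "
          f"{link_utilisation(rate):.1%} of a 1000 Mbps link")
```

By this arithmetic the 910Mbps peak works out to 113.75 MB/s arriving at the pool over the network.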


Here are some other nerdy stats, mostly on how my drives are arranged:

 > zpool status -v
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 4h43m with 0 errors on Sat Sep  6 00:13:22 2014

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G9PDPC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G9SBBC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G6GMGC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G95REC  ONLINE       0     0     0
          raidz1-1                                      ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G9LH9C  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G95JPC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G6LUDC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G5PXYC  ONLINE       0     0     0
          raidz1-2                                      ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_X3EJSVUOS            ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_X3EJSVUNS            ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_933PTT11T            ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_933PTT17T            ONLINE       0     0     0
          raidz1-3                                      ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_933PTT12T            ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_933PTT13T            ONLINE       0     0     2
            ata-TOSHIBA_MQ01ABD050_933PTT14T            ONLINE       0     0     2
            ata-TOSHIBA_MQ01ABD050_933PTT0ZT            ONLINE       0     0     0
          ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW-part5   ONLINE       0     0     0
          ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW-part6   ONLINE       0     0     0

errors: No known data errors
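The status message above is triggered by the two Toshiba drives in raidz1-3 that each show 2 in the CKSUM column. With this many rows, nonzero counters are easy to miss, so here is a minimal Python sketch that picks them out of saved `zpool status` text. The function name `devices_with_errors` is hypothetical, and it assumes the NAME / STATE / READ / WRITE / CKSUM layout shown above. Counters are kept as strings because zpool can abbreviate large values.

```python
def devices_with_errors(status_text):
    """Return [(name, read, write, cksum)] for rows with any nonzero counter."""
    flagged = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_table = True  # the device table starts after this header
            continue
        if not in_table:
            continue
        if not fields or fields[0] == "errors:":
            in_table = False  # blank line or the errors summary ends the table
            continue
        if len(fields) < 5:
            continue  # rows without counters, such as section labels
        counters = fields[2:5]
        if any(c != "0" for c in counters):
            flagged.append((fields[0], *counters))
    return flagged
```

Run against the output above, it would report only the `933PTT13T` and `933PTT14T` drives, which are the candidates for `zpool clear` or `zpool replace`.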

And to finish up, this system has withstood a series of in-place Ubuntu upgrades. It is now running 14.04. My advice on upgrades and Linux kernels is this:

  • Do not rush to install new mainline kernels. You have to wait for the dkms and spl pieces to sync up with mainline and for the updates to go out through the ubuntu-zfs PPA.
  • If you do a dist-upgrade and your zpool does not return after the reboot, this is easily fixed by reinstalling ubuntu-zfs: apt-get install --reinstall ubuntu-zfs. This will re-link your kernel modules and you should be good to go.
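A quick way to tell whether a missing pool is a module problem is to check what the running kernel has loaded. The sketch below is only that, a sketch: it assumes the modules are named `spl` and `zfs` and that `/proc/modules` lists one loaded module per line with the name first. The function names are hypothetical.

```python
REQUIRED = ("spl", "zfs")  # assumed module names for ZFS on Linux


def missing_modules(proc_modules_text, required=REQUIRED):
    """Return the required modules that do not appear in /proc/modules text."""
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in loaded]


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's list of currently loaded modules."""
    with open(path) as fh:
        return fh.read()
```

Calling `missing_modules(read_proc_modules())` after a reboot returns an empty list when both modules are loaded. Anything else suggests the modules were not rebuilt for the new kernel, which is the case the reinstall above addresses.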

As I said, this has been working for four releases of Ubuntu for me, along with replacing controllers and drives. My only complaint is that sequences of small file operations tend to bring the speed down a lot (or did; I have not reproduced it on 14.04 yet). But for streaming large files, I get massive throughput…which is great for my large photo collection!