Ubuntu 15.10 and ZFS

Screenshot: root@cholla:~

Some quick thoughts on doing this for my workstation:

  1. I have six 2TB drives in a RAID 10 ZFS pool, and they would not import on 15.10, because 15.10 ships with (or tries to ship with) ZFS 0.6.4.2
  2. I decided on /boot, swap, and / on mdadm partitions for the OS install
  3. I needed to do the 15.10 server command-line install to get the mdadm RAID setup
  4. I'm glad I did not attempt ZFS-on-root for this release
  5. I set up three extra partitions on my two 120GB SSDs, using them for
    1. tank zil
    2. tank l2arc
    3. home pool (second pool named homer :-)
  6. Do not attempt to use the ubuntu/zfs-stable PPA anymore; 15.10 will not accept it, and it WILL mess with your recommended zfsutils-linux install.
  7. Somehow zfs-fuse ended up getting installed: somewhere in trying to install spl-dkms and zfs-dkms and uninstalling zfsutils-linux, apt-get chose it. Why?
  8. I purged zfsutils-linux, zfs-dkms, and spl-dkms, and did git clones of github.com/zfsonlinux/{spl,zfs}
  9. All of this required starting off with build-essential, autotools, automake, auto… and libuuid and … stuff. Not difficult to chase down.
  10. ./autogen.sh, then ./configure && make -j10 && make install for both spl and zfs
  11. updated /etc/rc.local to modprobe spl and zfs, then zpool import tank; zpool import homer; zfs mount tank; zfs mount homer (roughly like the sketch below)
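
A minimal sketch of what that rc.local ends up looking like, with my pool names and no error handling:

#!/bin/sh -e
# load the modules built from the zfsonlinux git trees
modprobe spl
modprobe zfs
# re-import both pools and mount their top-level datasets
zpool import tank
zpool import homer
zfs mount tank
zfs mount homer
exit 0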

I am able to reboot and import without pool version warnings.

Why did I move off 14.04.x? I really want to do video editing for kid videos and all the video packages for 14.04 are way ancient.

Also:

  1. get first server install working
  2. install lubuntu-desktop
  3. in /etc/default/grub, change the hidden setting to false
  4. also in /etc/default/grub, replace “splash quiet” with “nofb” (see the sketch after this list)
  5. once LXDE displays, I do an “apt-get install mate-desktop-*”, which seems to work just fine.
  6. Why? lubuntu-desktop flawlessly sets up Xorg dependencies and gives me a desktop the first time without messing around wondering why mate-desktop didn’t.
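
For reference, the grub edits in steps 3 and 4 amount to something like the following in /etc/default/grub; I'm assuming the stock Ubuntu variable names here, with GRUB_HIDDEN_TIMEOUT_QUIET as the “hidden” knob:

# /etc/default/grub (relevant lines only)
GRUB_HIDDEN_TIMEOUT_QUIET=false          # show boot messages instead of hiding them
GRUB_CMDLINE_LINUX_DEFAULT="nofb"        # was "quiet splash"

# then regenerate the grub config
sudo update-grub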

Merry Xmas!

ZFS on Linux machine

Beavertail Cactus [Wikipedia]

Here is my ZFS on Linux story, and some of you might have seen these pictures when I started this project last year: I recycled an old Athlon system from work and put a 35-watt AMD A2 processor with 8GB of 1600MHz RAM on an Asus mobo in it. I name my home systems after cacti, and after I installed Ubuntu 12.04 on it, I named this one Beavertail.bitratchet.net.

My previous experiences with storage systems involved them dying from heat, so I decided I would avoid full-sized drives and stick with laptop drives, booting off an SSD. I have the SSD partitioned with /boot, root, and two more partitions for ZIL and L2ARC. The bulk of the storage is a mix of 750GB Hitachi and 500GB Toshiba laptop hard drives, 16 in total. I have lost two drives in this system, which I would label “normal drive attrition.” The boot drive is a 128GB OCZ Vertex 2.


Half the drives are on the bottom, and half are on top. At work I have access to gobs of full-height card brackets and this is what I built drive cages out of.

To get all the drives wired up, I started with a bunch of 1x and 2x PCIe SATA expanders and used up all my mobo SATA ports, but by the time I got to about 12 drives I only had a PCI slot left, so I had to use that. Looking at my disk utilization in iostat -Nx and dstat --disk-util, it was plainly clear that I had a swath of underperforming drives, and they were all connected to the slowest controller, the one in the legacy PCI slot.

Supermicro HBA 8-port SAS controllers

I saved up and remedied that by purchasing two SuperMicro SAS HBAs with Marvell chipsets. They are only 3G SATA (equivalent), but they each control eight drives, and they do so consistently. They take 8x PCIe lanes, and that's a great use for the two 16x PCIe slots on the mobo.

02:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev c3)
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection

It took me a while to track down my network bandwidth issues. The problem was my motherboard: it has an onboard Realtek chipset. It would max out at 500Mbps download and 250Mbps upload…and very often wedge the system. I got a PCIe 1x Intel card and got a good clean 955Mbps both ways out of it with one iperf stream, and 985+Mbps with two iperf streams. To actually achieve this, I needed to put an Intel NIC in my workstation as well. (My switch is a 16-port unmanaged Zyxel.)
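
The measurement itself is nothing exotic, just plain iperf between the two boxes, roughly like this (-P runs parallel streams):

# on beavertail (the NAS)
iperf -s

# on the workstation: one stream, then two parallel streams
iperf -c beavertail
iperf -c beavertail -P 2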

Eight drives on top

I am able to push close to full network capacity to Beavertail, and the results speak for themselves: the screenie below shows iftop displaying better than 880Mbps, and I saw it grab 910Mbps during this backup. Part of the success is having a Samsung 840EVO in my laptop, but having a stripe of four raidz vdevs clearly allows plenty of IO headroom.

910Mbps transfer from laptop to NAS.

 

Here are some other nerdy stats, mostly on how my drives are arranged:

 > zpool status -v
  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-9P
  scan: scrub repaired 0 in 4h43m with 0 errors on Sat Sep  6 00:13:22 2014
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz1-0                                      ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G9PDPC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G9SBBC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G6GMGC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G95REC  ONLINE       0     0     0
          raidz1-1                                      ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G9LH9C  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G95JPC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G6LUDC  ONLINE       0     0     0
            ata-Hitachi_HTS547575A9E384_J2190059G5PXYC  ONLINE       0     0     0
          raidz1-2                                      ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_X3EJSVUOS            ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_X3EJSVUNS            ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_933PTT11T            ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_933PTT17T            ONLINE       0     0     0
          raidz1-3                                      ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_933PTT12T            ONLINE       0     0     0
            ata-TOSHIBA_MQ01ABD050_933PTT13T            ONLINE       0     0     2
            ata-TOSHIBA_MQ01ABD050_933PTT14T            ONLINE       0     0     2
            ata-TOSHIBA_MQ01ABD050_933PTT0ZT            ONLINE       0     0     0
        logs
          ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW-part5   ONLINE       0     0     0
        cache
          ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW-part6   ONLINE       0     0     0

errors: No known data errors

And to finish up, this system has withstood a series of in-place Ubuntu upgrades; it is now running 14.04. My advice on this, and on Linux kernels, is this:

  • Do not rush to install new mainline kernels; you have to wait for the spl and zfs DKMS packages to catch up with mainline and for updates to go out through the ubuntu-zfs PPA.
  • If you do a dist-upgrade and reboot, and your zpool does not come back, this is easily fixed by reinstalling ubuntu-zfs: apt-get install --reinstall ubuntu-zfs. This will rebuild and re-link your kernel modules and you should be good to go.

Like I said, this has been working for four releases of Ubuntu for me, along with replacing controllers and drives. My only complaint is that doing sequences of small file operations on it tends to bring the speed down a lot (or has; I have not recreated that on 14.04 yet). But for streaming large files, I get massive throughput…which is great for my large photo collection!

Happy Penguins at LinuxFest Northwest 2014 Party

I felt like I had a lot of camera issues at the start (the first of which was needing to go back home to retrieve all my SD cards, pppt). Indoor pictures are always difficult, so to make these look less like snapshots, I squirted a lot of Sriracha on them. Here are your spicy penguin pals:

PH7 Engine – is it really the fix?

This sounds very useful, especially if you consider that PHP is used in embedded environments like m0n0wall and pfSense.

My big question for larger installations is this: XCache already does a great job at bytecode caching, and the largest slowdown in the majority of PHP applications is the relational engine sitting underneath them. After much profiling of previous applications, I have always found the biggest benefit to application performance comes from taking a casual schema to a rigorous level. And if that is not fast enough, then you throw some memcached into the mix.

PH7 Engine.

Disk Performance Analysis on Linux

Really Into the Guts of Linux Filesystem Performance

It is possible to find out how hard your disks are running. Some you don't want to run as hard as others. For instance, if you wanted to save on write cycles for your SSD, you could move small, frequent writes into a RAM filesystem, which seems to be done by default on many systems these days (with /tmp getting mounted as tmpfs).
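
If your distro doesn't already do that, the tmpfs mount is a one-line /etc/fstab entry, roughly like this (the 2G cap is just an example):

# /etc/fstab -- keep small, frequent writes in RAM instead of on the SSD
tmpfs   /tmp   tmpfs   defaults,noatime,size=2G   0   0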

In the process of rebuilding my backup server, I left it going with a few large writes overnight, to check for some stability issues. Checking dmesg is always a good way to do that.

For instance, I see that drive 14 seems to be wonky:

Nov 16 22:44:13 beavertail kernel: [  692.590756] ata14.00: cmd 61/e0:18:20:0a:00/00:00:00:00:00/40 tag 3 ncq 114688 out
Nov 16 22:44:13 beavertail kernel: [  692.590756]          res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Nov 16 22:44:13 beavertail kernel: [  692.598694] ata14.00: status: { DRDY }
Nov 16 22:44:14 beavertail kernel: [  693.093364] ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Nov 16 22:44:14 beavertail kernel: [  693.118236] ata14.00: configured for UDMA/100
Nov 16 22:44:14 beavertail kernel: [  693.118255] ata14.00: device reported invalid CHS sector 0
Nov 16 22:44:14 beavertail kernel: [  693.118273] ata14: EH complete

Which drive is this? Oh, I see where it came up using sudo dmesg | grep 'ata14' |less:

[    1.061214] ata14: SATA max UDMA/133 abar m2048@0xfea10000 port 0xfea10280 irq 49
[    1.556891] ata14: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[    1.589329] ata14.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100

I cannot pass ata14 as an argument to programs, so I will see if I can match a serial number:

root@beavertail /dev/disk/by-id
 > find -type l  -a \( -name 'wwn-*' -o -name '*-part*' \) -prune -o -printf "%l\t%f\n" | sort
	.
../../sda	ata-Hitachi_HTS547575A9E384_J2190059G9PDPC
../../sdb	ata-Hitachi_HTS547575A9E384_J2190059G9SBBC
../../sdc	ata-Hitachi_HTS547575A9E384_J2190059G6GMGC
../../sdd	ata-Hitachi_HTS547575A9E384_J2190059G95REC
../../sde	ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW
../../sdf	ata-Hitachi_HTS547575A9E384_J2190059G9LH9C
../../sdg	ata-Hitachi_HTS547575A9E384_J2190059G95JPC
../../sdh	ata-Hitachi_HTS547575A9E384_J2190059G6LUDC
../../sdi	ata-Hitachi_HTS547575A9E384_J2190059G5PXYC
../../sdj	ata-TOSHIBA_MQ01ABD050_933PTT17T
../../sdk	ata-TOSHIBA_MQ01ABD050_X3EJSVUOS
../../sdl	ata-TOSHIBA_MQ01ABD050_X3EJSVUNS
../../sdm	ata-TOSHIBA_MQ01ABD050_933PTT11T
../../sdn	ata-TOSHIBA_MQ01ABD050_933PTT15T
../../sdo	ata-TOSHIBA_MQ01ABD050_933PTT12T
../../sdp	ata-TOSHIBA_MQ01ABD050_933PTT13T
../../sdq	ata-TOSHIBA_MQ01ABD050_933PTT14T
../../sdr	ata-TOSHIBA_MQ01ABD050_933PTT0ZT

Nope, so we search dmesg again, grepping for the ATA-8 port:

[    1.538182] ata4.00: ATA-8: Hitachi HTS547575A9E384, JE4OA60A, max UDMA/133
[    1.538213] ata2.00: ATA-8: Hitachi HTS547575A9E384, JE4OA60A, max UDMA/133
[    1.538244] ata3.00: ATA-8: Hitachi HTS547575A9E384, JE4OA60A, max UDMA/133
[    1.538565] ata5.00: ATA-8: Hitachi HTS547575A9E384, JE4OA60A, max UDMA/133
[    1.563034] ata6.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100
[    1.584336] ata12.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100
[    1.589329] ata14.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100
[    1.590477] ata11.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100
[    1.590713] ata13.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100
[    1.618162] ata15.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100
[    1.621146] ata18.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100
[    1.623070] ata16.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100
[    1.625311] ata17.00: ATA-8: TOSHIBA MQ01ABD050, AX001A, max UDMA/100
[    3.267833] ata7.00: ATA-8: Hitachi HTS547575A9E384, JE4OA60A, max UDMA/133
[    5.469619] ata8.00: ATA-8: Hitachi HTS547575A9E384, JE4OA60A, max UDMA/133
[    7.671669] ata9.00: ATA-8: Hitachi HTS547575A9E384, JE4OA60A, max UDMA/133
[    9.890091] ata10.00: ATA-8: Hitachi HTS547575A9E384, JE4OA60A, max UDMA/133

We begin to notice that I should have stuck with the Hitachi drives; they are going to be faster. Oh well, I'm doing the equivalent of RAID 50, striping across five raidz groups, and that right there is pretty good.

But where is this drive? Time to paw through /sys. Paw is just about right; I don't have much memorized about the /sys directory, but with ‘less’ and ‘ls’ I bet…there we go: ls -l /sys/class/block/ | grep ata14:

root@beavertail /var/log
 > ls -l /sys/class/block/ | grep ata14
lrwxrwxrwx 1 root root 0 Nov 16 22:32 sdn -> ../../devices/pci0000:00/0000:00:02.0/0000:01:00.0/ata14/host13/target13:0:0/13:0:0:0/block/sdn
lrwxrwxrwx 1 root root 0 Nov 17 06:43 sdn1 -> ../../devices/pci0000:00/0000:00:02.0/0000:01:00.0/ata14/host13/target13:0:0/13:0:0:0/block/sdn/sdn1
lrwxrwxrwx 1 root root 0 Nov 17 06:43 sdn9 -> ../../devices/pci0000:00/0000:00:02.0/0000:01:00.0/ata14/host13/target13:0:0/13:0:0:0/block/sdn/sdn9

Now I can use hdparm on this guy:

root@beavertail /var/log
 > hdparm -I /dev/sdn | head

/dev/sdn:

ATA device, with non-removable media
        Model Number:       TOSHIBA MQ01ABD050                      
        Serial Number:      933PTT15T
        Firmware Revision:  AX001A  
        Transport:          Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6

Now that I have the serial number for that guy, I will be able to find him.
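
If I wanted to map every drive at once, a little loop over /sys would do it. This is a sketch rather than anything I actually ran; it assumes libata-style device paths like the ones above and needs root for hdparm:

# map each sdX device to its ata port and serial number (run as root)
for d in /sys/class/block/sd?; do
    dev=$(basename "$d")
    port=$(readlink -f "$d" | grep -o 'ata[0-9]*' | head -1)
    serial=$(hdparm -I "/dev/$dev" 2>/dev/null | awk '/Serial Number/{print $3}')
    printf '%-5s %-6s %s\n' "$dev" "$port" "$serial"
done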

How is my new storage array doing in general? Well, iostat gives me some indication that the drives are being written to somewhat equally:

jreynolds@beavertail ~
 > iostat -N -x 2 2
Linux 3.11.0-12-generic (beavertail) 	11/17/2013 	_x86_64_	(2 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.52    0.00   12.75   46.82    0.00   39.91

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.01     0.00    0.06   59.85     0.28  5350.97   178.65     6.73  112.27   23.92  112.36  13.23  79.25
sdb               0.01     0.00    0.05   59.86     0.27  5350.94   178.63     6.69  111.66   25.89  111.74  13.22  79.19
sdc               0.01     0.00    0.07   67.59     0.33  6029.96   178.27     7.05  104.27   21.14  104.35  12.08  81.72
sdd               0.01     0.00    0.13   65.75     0.59  5953.08   180.73     7.11  107.97   35.70  108.12  12.46  82.08
sde               0.02     1.13    3.29  319.92    20.22 39454.49   244.27     1.49    4.60    2.80    4.62   0.52  16.79
sdf               0.01     0.00    0.12   70.63     0.54  5953.20   168.30     0.70    9.91   16.08    9.90   2.17  15.38
sdg               0.01     0.00    0.06   70.40     0.30  5953.25   169.00     0.76   10.84   18.69   10.83   2.31  16.28
sdi               0.01     0.00    0.14   71.55     0.63  6030.08   168.23     0.82   11.43   23.98   11.41   2.34  16.76
sdj               0.02     0.00    0.14   75.14     0.66  5497.69   146.09     0.21    2.75    9.36    2.74   0.94   7.07
sdk               0.02     0.00    0.03    0.00     0.21     0.04    18.43     0.00    3.04    2.78    8.32   2.40   0.01
sdm               0.04     0.00    0.08   68.44     0.50  5477.06   159.90     0.27    3.98    8.62    3.97   0.98   6.72
sdo               0.04     0.00    0.07   66.84     0.49  5477.08   163.72     0.29    4.35    8.03    4.34   1.04   6.94
sdn               0.02     0.00    0.08   64.80     0.43  5497.76   169.51     0.41    6.35    8.18    6.35   1.31   8.51
sdl               0.02     0.00    0.03    0.00     0.21     0.04    18.43     0.00    3.37    3.10    8.74   2.52   0.01
sdp               0.04     0.00    0.07   66.85     0.49  5477.05   163.71     0.29    4.34    8.13    4.34   1.04   6.93
sdq               0.04     0.00    0.08   67.91     0.53  5497.71   161.72     0.29    4.27    8.44    4.27   1.02   6.92
sdh               0.01     0.00    0.13   72.34     0.58  6030.12   166.44     0.72    9.90   16.88    9.89   2.12  15.38
sdr               0.04     0.00    0.07   66.92     0.48  5351.01   159.77     0.29    4.37    7.13    4.36   1.12   7.53

From that I can tell that /dev/sde is getting the shit kicked out of it with 6x the writes other drives are getting. What the hell? Let’s check zpool:

root@beavertail ~
 > zpool iostat -v 30
                                                         capacity     operations    bandwidth
pool                                                  alloc   free   read  write   read  write
----------------------------------------------------  -----  -----  -----  -----  -----  -----
tank                                                  2.15T  5.99T      0    608    204  58.8M
  raidz1                                               470G  1.57T      0    130     34  12.7M
    ata-Hitachi_HTS547575A9E384_J2190059G5PXYC-part1      -      -      0     73      0  6.32M
    ata-Hitachi_HTS547575A9E384_J2190059G6GMGC-part1      -      -      0     70      0  6.43M
    ata-Hitachi_HTS547575A9E384_J2190059G6LUDC-part1      -      -      0     73    273  6.32M
  raidz1                                               464G  1.58T      0    128     85  12.7M
    ata-Hitachi_HTS547575A9E384_J2190059G95JPC-part1      -      -      0     72      0  6.32M
    ata-Hitachi_HTS547575A9E384_J2190059G95REC-part1      -      -      0     68    136  6.42M
    ata-Hitachi_HTS547575A9E384_J2190059G9LH9C-part1      -      -      0     72    273  6.32M
  raidz1                                               414G   978G      0    116     68  11.0M
    ata-Hitachi_HTS547575A9E384_J2190059G9PDPC-part1      -      -      0     61    136  5.56M
    ata-Hitachi_HTS547575A9E384_J2190059G9SBBC-part1      -      -      0     60    136  5.54M
    ata-TOSHIBA_MQ01ABD050_933PTT0ZT                      -      -      0     66      0  5.45M
  raidz1                                               425G   967G      0    115      0  11.1M
    ata-TOSHIBA_MQ01ABD050_933PTT11T                      -      -      0     70      0  5.61M
    ata-TOSHIBA_MQ01ABD050_933PTT12T                      -      -      0     66      0  5.61M
    ata-TOSHIBA_MQ01ABD050_933PTT13T                      -      -      0     67      0  5.61M
  raidz1                                               426G   966G      0    117     17  11.2M
    ata-TOSHIBA_MQ01ABD050_933PTT14T                      -      -      0     69      0  5.66M
    ata-TOSHIBA_MQ01ABD050_933PTT15T                      -      -      0     66      0  5.66M
    ata-TOSHIBA_MQ01ABD050_933PTT17T                      -      -      0     78    136  5.66M
logs                                                      -      -      -      -      -      -
  ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW-part5          128K  1.98G      0      0      0      0
cache                                                     -      -      -      -      -      -
  ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW-part6         64.1G  14.0M      0    342  4.80K  41.9M
----------------------------------------------------  -----  -----  -----  -----  -----  -----

That guy is my level 2 (L2ARC) cache SSD partition on my OCZ drive. (The astute will see that I am only running one SSD in here.) Since this is a backup box, I doubt I actually need a cache on it.
It is handling something like 50% of the write activity of the whole pool for no good reason. I'll disable him later.
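
When I get around to disabling it, dropping an L2ARC device is painless; cache devices can be removed from a live pool (device name taken from the zpool status above):

# remove the L2ARC cache partition from the pool
zpool remove tank ata-OCZ-AGILITY4_OCZ-77Z13FI634825PNW-part6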

But (and this is quite easily seen using dstat) I have four drives just chugging away, doing more than their share. These guys are busier and I don't know why, unless it has to do with extra parity writes.

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00   64.50     0.00  5354.00   166.02     5.62   88.93    0.00   88.93  11.63  75.00
sdb               0.00     0.00    0.00   63.50     0.00  5416.00   170.58     6.55  108.13    0.00  108.13  12.94  82.20
sdc               0.00     0.00    0.50   73.00     2.00  7144.00   194.45     7.33  102.07   20.00  102.63  11.40  83.80
sdd               0.00     0.00    0.00   63.00     0.00  7044.00   223.62     7.67  126.29    0.00  126.29  13.43  84.60

What kind of gibberish am I looking at? Trim it down to the write-queue and the “awaiting” columns:

Device:          avgrq-sz avgqu-sz   await  w_await   %util
sda                166.02     5.62   88.93    88.93   75.00
sdb                170.58     6.55  108.13   108.13   82.20
sdc                194.45     7.33  102.07   102.63   83.80
sdd                223.62     7.67  126.29   126.29   84.60
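
For the record, pulling just those columns out of iostat is a quick awk filter; here is a sketch, with field numbers that assume the iostat -x column layout shown above:

# keep only Device, avgrq-sz, avgqu-sz, await, w_await and %util
iostat -N -x 2 2 | awk '/^Device|^sd/ {printf "%-10s %9s %9s %8s %8s %7s\n", $1, $8, $9, $10, $12, $14}'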

We see that the write-wait interval, the overall await, and the queue size on those devices are way higher than on the rest of the drives. However, I have five raidz vdevs, so why would four drives be so poorly conditioned? Either ZFS has decided that it needs to write more bytes there, or, I suspect, those four drives are all on a common controller that is slower.

Common controller? Let’s check that /sys/ directory again. Yeah, with variations of ls -l /sys/class/block/ | egrep 'sd[a-d] ' I definitely see those
drives share something on the PCI bus in common: address 14.

jreynolds@beavertail ~
 > ls -l /sys/class/block/ | egrep 'sd[a-d] '
lrwxrwxrwx 1 root root 0 Nov 16 22:32 sda -> ../../devices/pci0000:00/0000:00:14.4/0000:03:05.0/ata7/host1/target1:0:0/1:0:0:0/block/sda
lrwxrwxrwx 1 root root 0 Nov 16 22:32 sdb -> ../../devices/pci0000:00/0000:00:14.4/0000:03:05.0/ata8/host2/target2:0:0/2:0:0:0/block/sdb
lrwxrwxrwx 1 root root 0 Nov 16 22:32 sdc -> ../../devices/pci0000:00/0000:00:14.4/0000:03:05.0/ata9/host3/target3:0:0/3:0:0:0/block/sdc
lrwxrwxrwx 1 root root 0 Nov 16 22:32 sdd -> ../../devices/pci0000:00/0000:00:14.4/0000:03:05.0/ata10/host5/target5:0:0/5:0:0:0/block/sdd

However, using dmidecode, all I can tell at the moment is that I have four slots and their addresses do not match, so this is unlikely to be a motherboard-connected item. Is it a PCI-connected item?
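
The dmidecode check amounts to something like this; the Bus Address field is what I was hoping to match up, assuming this BIOS reports it at all:

# list the physical slots and, where the BIOS provides it, their PCI bus addresses
sudo dmidecode -t slot | egrep 'Designation|Bus Address'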

 > lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 12h Processor Root Complex
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Sumo [Radeon HD 6410D]
00:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 12h Processor Root Port
00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Family 12h Processor Root Port
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 40)
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 11)
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 11)
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 14)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 11)
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] FCH PCI Bridge (rev 40)
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
00:15.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Hudson PCI to PCI bridge (PCIE port 0)
00:15.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Hudson PCI to PCI bridge (PCIE port 1)
00:16.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
00:16.2 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 11)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 12h/14h Processor Function 0 (rev 43)
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 12h/14h Processor Function 1
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 12h/14h Processor Function 2
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 12h/14h Processor Function 3
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 12h/14h Processor Function 4
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 12h/14h Processor Function 6
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 12h/14h Processor Function 5
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Family 12h/14h Processor Function 7
01:00.0 SATA controller: Marvell Technology Group Ltd. Device 9235 (rev 10)
02:00.0 SATA controller: Marvell Technology Group Ltd. Device 9235 (rev 10)
03:05.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)
04:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)

DING! Found it: that Silicon Image SATA controller. It is likely the slow link in all this. I can see the match at 03:05.0:

03:05.0 RAID bus controller: Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller (rev 02)

Sorting through thousands of photos

Backups are great. Having terabytes of space for them is now completely necessary. Filling up those terabytes is …frustrating.

My photo collection is probably much like many photo enthusiasts’ — well into the hundreds of thousands of pictures. But why is it so? I make thumbnails (that's 2x the pictures); I keep a low-res and a full-res copy besides the thumbnails (3x); and I've started keeping both my RAW (DNG) and my full-res JPG files (4x).

And that does not even count the madness with the backups. What happens when you have copies of the same SD card on two computers? And then you discover them months later, and you don't have time to un-dup those few hundred? Having multiple photo editing tools doesn't necessarily make it easier; I use both Gimp and Darktable. I batch-create my thumbnails using a shell script that drives ImageMagick, of course. And then, what piece of backup software would ever warn you if you started backing up your photos in overlapping locations? I might have 6x the pictures I actually took!
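
That thumbnail script is nothing fancy, by the way; the heart of it is an ImageMagick loop along these lines (the geometry and the thumb/ destination here are illustrative, not my exact script):

# make -small.jpg thumbnails for every JPG in the current directory
mkdir -p thumb
for f in *.JPG; do
    convert "$f" -auto-orient -resize 200x200 "thumb/${f%.*}-small.jpg"
done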

Look at this listing of photos, for example:

val_memorial168.JPG;2484792;/tank/pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts/big
val_memorial168.JPG;2484792;/tank/pictures/Pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts/big
val_memorial168.JPG;2484792;/tank/pictures/tank/pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts/big
val_memorial168.JPG;2484792;/tank/pictures/tank/pictures/Pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts/big
val_memorial168-small.jpg;29530;/tank/pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts/thumb
val_memorial168-small.jpg;29530;/tank/pictures/Pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts/thumb
val_memorial168-small.jpg;29530;/tank/pictures/tank/pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts/thumb
val_memorial168-small.jpg;29530;/tank/pictures/tank/pictures/Pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts/thumb
val_memorial168-small.jpg;355505;/tank/pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts
val_memorial168-small.jpg;355505;/tank/pictures/Pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts
val_memorial168-small.jpg;355505;/tank/pictures/tank/pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts
val_memorial168-small.jpg;355505;/tank/pictures/tank/pictures/Pictures/9999-Source/2006/2006-05-13-memorial-tey-roberts

Shell scripting to the rescue, right? My tactics are roughly this:

  • sort the photos by name (not by path)
  • and by size (similarly sized photos are often dups)
  • generate md5 sums of the start of the photo (the start is likely going to be as indicative as the whole photo)
  • sort by all of the above
  • try not to use perl :-)

Does it take a fancy data structure to sort this information? No. The trick is to re-arrange it in lines of text so that the things you most want to sort by are on the left side of the line.

And to wit, I doth bust out said magic thusly:

#!/bin/bash
##  ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
##
##    Script intended to identify duplicately named
##    files and also files with likely identical contents.
##
##  ----- ----- ----- ----- ----- ----- ----- ----- ----- -----

PIC_DIR=/tank/pictures
VAR_DIR=/home/jreynolds/var
rm -f $VAR_DIR/*.txt
rm -f $VAR_DIR/*.a*
 F01_UNSORTED_NAMES="$VAR_DIR/unsorted.txt"
   F02_SORTED_NAMES="$VAR_DIR/sorted.txt"
    F03_WITH_HASHES="$VAR_DIR/hashes.txt"
  F04_SORTED_HASHES="$VAR_DIR/sorted-hashes.txt"

# dump every file as "name;size;path", one record per line
# | head -100  \
find $PIC_DIR -type f -printf "%f;%s;%h\n" \
> $F01_UNSORTED_NAMES
wc -l $F01_UNSORTED_NAMES

cat $F01_UNSORTED_NAMES | sort > $F02_SORTED_NAMES
line_ct=$(cat $F02_SORTED_NAMES|wc -l)
lines_per=$[ (line_ct / 8 ) + 1 ];
echo "This is the line count per file: $lines_per"

i=0
rm $F03_WITH_HASHES.*
# split the sorted list into 8 chunks so 8 hashing workers can run in parallel
split -l $lines_per $F02_SORTED_NAMES ${F02_SORTED_NAMES/.txt/}.
ls -l $VAR_DIR/*

for J in $VAR_DIR/sorted.a* ; do
   echo "== $J == started"
   bash -c "$HOME/bin/hash_list.sh $J" &
done
# wait until all the hash_list.sh workers have finished
n=8
while [ $n -gt 0 ]; do
   sleep 1
   n=$(pgrep -lf hash_list.sh | wc -l )
   echo -n "$n processing "
done

And the careful reader will wonder, wth is hash_list.sh? You'd better wonder. It reveals the only way one can incorporate exactly one efficient disk read per file into a program. Behold:

#!/bin/bash
[ -z "$1" ] && echo "please specify file list, bye." && exit 1

i=0
cat $1 | while read F ; do
   # split each "name;size;path" record into fields
   IFS=\; read -ra hunks <<< "$F"
   # hash only the first chunk of the file: exactly one short, efficient
   # disk read per file (the read size is a judgment call; the first 256K
   # is plenty to tell photos apart)
   sum=$(head -c 262144 "${hunks[2]}/${hunks[0]}" | md5sum | tr -d '\n')
   echo "${sum},${F}" >> $1.hmac
   i=$[ i + 1 ]
   [ $(( $i % 50 )) == 0 ] && echo -n "."
done
wc -l $1.hmac

cat $1.hmac | sort > $1.sorted
wc -l $1.sorted

Can you even do this on Windows? I'd be surprised if you could do it without at least Cygwin. However, this is all that is required: the bash programming environment. Could you do this on a Mac? Yeah. Could you do it on an iPad? Pppppt…you're joking. Could you do it on a Samsung Android tablet? Yeah, if you wanted to watch it…melt. You'd have to install the ssh client and terminal apps, of course. Enough!

Now see what I discover? Broad swaths of duplication waiting to be uncovered:

000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/0000-incoming/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/0000-incoming/in-2012-12-09/big
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/9999-Source/2011/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/9999-Source/2011/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/9999-Source/2011/2011-08-31b
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG~;2301225;/tank/pictures/9999-Source/2012/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/9999-Source/2012/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/9999-Source/2012/in-2012-12-09b/big
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/9999-Source/2013/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/9999-Source/2013/2011/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/home/Pictures/9999-Source/2011/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/home/Pictures/9999-Source/2013/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/home/Pictures/9999-Source/2013/2011/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/home/Pictures/9999-Source/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/Pictures/9999-Source/2011/2011-08-31b
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/Pictures/9999-Source/2011/ff/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/Pictures/9999-Source/2012/ff/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/Pictures/9999-Source/2012/ff/in-2012-12-09/big
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/Pictures/9999-Source/2012/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/Pictures/9999-Source/2012/in-2012-12-09b/big
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/Pictures/9999-Source/2012/zz/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/tank/pictures/9999-Source/2011/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/9999-Source/2011/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/9999-Source/2011/2011-08-31b
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/tank/pictures/9999-Source/2011/2011-08-31-c
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/tank/pictures/9999-Source/2011/2011-08-31-d
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/9999-Source/2011/2011-08-31-f
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG~;2301225;/tank/pictures/tank/pictures/9999-Source/2012/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/9999-Source/2012/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/tank/pictures/9999-Source/2012/in-2012-12-09b/big
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/tank/pictures/9999-Source/2013/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/tank/pictures/9999-Source/2013/2011/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/9999-Source/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/tank/pictures/home/Pictures/9999-Source/2013/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/home/Pictures/9999-Source/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/Pictures/9999-Source/2011/2011-08-31b
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/Pictures/9999-Source/2011/ff/2011-08-31
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/Pictures/9999-Source/2012/ff/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/tank/pictures/Pictures/9999-Source/2012/ff/in-2012-12-09/big
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/Pictures/9999-Source/2012/in-2012-12-09
000173549decf06ae5c858ea1eccfcca  -,imgp2355.jpg;2301225;/tank/pictures/tank/pictures/Pictures/9999-Source/2012/in-2012-12-09b/big
000173549decf06ae5c858ea1eccfcca  -,IMGP2355.JPG;2301225;/tank/pictures/tank/pictures/Pictures/9999-Source/2012/zz/in-2012-12-09

Make, classpaths and environment variables

Make, Ant, and probably any other build language (and/or toolkit, since Ant is not much in the way of a language) are tricky bastards. I've been maintaining a parallel set of build scripts for a Java and C++ project for a few years now, and this has been great practice in getting my Make chops snappier. Just today, I found I needed to build the -Xbootclasspath argument in my makefile. I ended up re-learning an important lesson that also applies to bash. Not expected, since Make and bash are not of common ancestry…but that lesson was “spaces.” One needs to put at least one space after Make's conditionals if, ifeq, or ifneq, just like one has to put a space after bash's if, while, and for.

  • Bash:
    if [ -z "$classpath" ] ; then
       classpath="./*jar"
    fi
  • Make:
    ifeq (,$(classpath))
       classpath="./*jar"
    endif
      Both those statements are checking for zero length strings.

However, I’m going to stop comparing the two. You just look at where I’ve put those spaces, yung’un.
My real goal was to evaluate a series of conditions (is there a JAVA6_HOME defined? Is there a $JAVA_HOME/../jdk6 directory? Is there a /usr/local/jdk6 directory?) and, if so, create a BOOTCLASSPATH variable:

This is the trick:

  4 comma:= ,
  5 colon:= :
  6 empty:=
  7 space:= $(empty) $(empty)
  ...
 56 ifneq (,$(JAVA6_HOME))
 57    ifneq (,$(wildcard  $(JAVA6_HOME)/.))
 58       JAVA6 = ${JAVA6_HOME}
 59    endif
 60 endif
 61 ifeq (,$(JAVA6))
 62    ifneq (,$(wildcard $(JAVA_HOME)/../jdk6/.))
 63       JAVA6 = $(JAVA_HOME)/../jdk6
 64    endif
 65 endif
 66 ifeq (,$(JAVA6))
 67    ifneq (,$(wildcard /usr/local/jdk6/.))
 68       JAVA6 = "/usr/local/jdk6"
 69    endif
 70 endif
 71 ifneq (,$(JAVA6))
 72    BOOTCLASSPATH := $(wildcard $(JAVA6)/lib/*.jar)
 73    JAVATARGET = -target 1.6 \
 74                -source 1.6 \
 75                -Xbootclasspath/p:$(subst $(space),$(colon),$(BOOTCLASSPATH))
 76    $(info BOOTCLASSPATH is $(BOOTCLASSPATH))                                                                     
 77    $(info JAVATARGET    is $(JAVATARGET))
 78 endif

And the quiz for the reader is, why is line 75 important?

Backups: Using `find` Across a Panoply of Directories

Linux Backups

I love using the find command. In DOS, find is like grep. In Linux, find is the most powerful recursive DOS dir /s or Linux ls -R command you could ever put your saddle on.

One of the things you can do with find is to avoid directories, using the -prune switch. Like so:

find /usr/local -type d -a \( -name jre1.6.0_38 -prune -o -type d -print \)

Yeah, put your bike helmet on if you keep reading. That spat out a ton of gunk. But was I lying? Well, grep for just the thing we should have pruned and see whether anything slipped through:

find /usr/local -type d -a \( -name jre1.6.0_38 -prune -o -type d -print \) | grep jre1.6

What if you have a series of subdirectories you want to include, but you cannot write enough -prune switches for them? This is a problem I frequently have. For instance, how do you exclude all your Firefox Cache directories, especially if you have multiple profiles? Great question.

I’d first use find to find all the directories I do want to back up:

find /home/jed -maxdepth 4 -type d > /tmp/dirlist

Then you grep out things you really don’t want:

egrep -i "/cache|/Trash" /tmp/dirlist > /tmp/avoid

Then parse that into a set of -path expressions for find to avoid:

cat /tmp/avoid | while read F ; do echo " -path $F -o " ; done > /tmp/avoid2 ;
echo "-path ./asdf" >> /tmp/avoid2

Now we can refresh our list of directories to descend:

find /home/jed -xdev -type d \( `cat /tmp/avoid2` \) -prune -o -print

If we want to turn that right into files, modify the last print statement to find files:

find /home/jed -xdev -type d \( `cat /tmp/avoid2` \) -prune -o -type f -print

Now if you want to find the files more recently created than your last backup in /home/backup/monday.tgz, try this:

find /home/jed -xdev -type d \( `cat /tmp/avoid2` \) -prune -o -type f -newer /home/backup/monday.tgz -print
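
And once that file list looks right, it can feed a backup tool directly. A sketch of the whole incremental pass, assuming GNU find and GNU tar (the archive name is just an example):

# archive only the new files, skipping the avoid-list directories entirely
find /home/jed -xdev -type d \( `cat /tmp/avoid2` \) -prune -o \
     -type f -newer /home/backup/monday.tgz -print0 |
  tar --null -T - -czf /home/backup/incremental-$(date +%Y%m%d).tgz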

Is that enough to make you cry? Chin up, think of all the disk space you’re saving, and how much faster a specific backup can occur. This means you can run backups every 15 minutes.