Backups: using tar and find

If you are familiar with zip files, think of them as the DOS-world counterpart of tar files (tar = Tape Archive). The tar utility was built for writing archives to tape, which makes it a natural fit for backups. A quick way to back up your home directory is:

cd /home ; tar -cvf home-jed.tar ./jed
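
Before trusting an archive, it helps to look inside it. Here is a minimal sketch of that check. It assumes a throwaway sandbox made with mktemp (the directory layout and file names are made up for illustration) so nothing in a real home directory is touched:

```shell
# Hypothetical sandbox standing in for /home; not a real home directory.
demo=$(mktemp -d)
mkdir -p "$demo/home/jed/notes"
echo "hello" > "$demo/home/jed/notes/todo.txt"

# Same shape as the command above: change into the parent, archive ./jed.
cd "$demo/home" && tar -cf "$demo/home-jed.tar" ./jed

# -t lists the archive members instead of extracting them.
listing=$(tar -tf "$demo/home-jed.tar")
echo "$listing"
```

Adding v to the listing flags (tar -tvf) also shows sizes and dates, which makes oversized cache files easy to spot.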

You might see that command grab a whole lot of stuff you don’t want to keep, including all your Firefox cache files and your Trash files. Also, that archive is uncompressed. Let’s get it compressed first, since that’s easy. The j flag runs the archive through bzip2:

tar -cvjf home-jed.tbz2 ./jed

Next, we can build a list of files we want to back up using find. Please don’t try to avoid the find command; once you begin to understand it, life in Linux really can improve. On our first try, we will pare the list down with fgrep (fixed-string grep, the same as grep -F) to exclude our Firefox Cache directory. We ask find for regular files only (-type f), because tar recurses into any directory named in its list, and that would drag the cache right back in.

cd /home
find jed -type f \
| fgrep -v '.default/Cache' \
> /tmp/jed.txt

And following that, avoiding our trash can:

find jed -type f \
| fgrep -v '.default/Cache' \
| fgrep -v '.local/share/Trash' \
> /tmp/jed.txt

Now think about why we want to use pipe operators in that second find command. Would it be easier as two commands both appending to /tmp/jed.txt? (Think about the overlap and duplication that results.)
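
If you want to see the answer rather than reason it out, here is a small experiment. It is only a sketch: the sandbox directory and the three file names are invented for the demo, and it uses grep -F in place of fgrep (same behavior; newer GNU grep releases print a warning for fgrep).

```shell
# Hypothetical sandbox with one cache file, one trash file, one real document.
demo=$(mktemp -d)
mkdir -p "$demo/jed/.mozilla/firefox/abc.default/Cache" \
         "$demo/jed/.local/share/Trash/files" \
         "$demo/jed/docs"
touch "$demo/jed/.mozilla/firefox/abc.default/Cache/blob" \
      "$demo/jed/.local/share/Trash/files/old.txt" \
      "$demo/jed/docs/report.txt"
cd "$demo"

# Chained filters: a line must survive BOTH tests to reach the file.
find jed -type f \
  | grep -F -v '.default/Cache' \
  | grep -F -v '.local/share/Trash' \
  > piped.txt

# Two separate runs appended to one file: each run filters only one thing,
# so whatever the first run drops, the second run puts back.
find jed -type f | grep -F -v '.default/Cache'      >  appended.txt
find jed -type f | grep -F -v '.local/share/Trash'  >> appended.txt

piped_count=$(grep -c '' piped.txt)        # number of lines in each list
appended_count=$(grep -c '' appended.txt)
dupes=$(sort appended.txt | uniq -d)       # lines that appear more than once

echo "piped: $piped_count  appended: $appended_count"
echo "duplicated: $dupes"
```

The piped list holds only the document. The appended list holds the document twice plus both files we were trying to leave out.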

To use that file to guide tar, we hand it over with -T. Run this from /home as well, since the list holds relative paths:

tar cvjf ./jed.tbz2 -T /tmp/jed.txt
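
It is worth confirming that the excluded files really stayed out. The sketch below does that in a scratch directory with made-up file names. It lists regular files only (-type f), since tar recurses into any directory named in the list, and it leaves out the compression flag just to keep the demo small.

```shell
# Hypothetical sandbox: one cache file we do not want, one document we do.
demo=$(mktemp -d)
mkdir -p "$demo/jed/.mozilla/firefox/abc.default/Cache" "$demo/jed/docs"
touch "$demo/jed/.mozilla/firefox/abc.default/Cache/blob" \
      "$demo/jed/docs/report.txt"
cd "$demo"

# Build the list: regular files only, minus the cache.
find jed -type f | grep -F -v '.default/Cache' > list.txt

# -T reads the member names from the list, one per line.
tar -cf jed.tar -T list.txt

# Count archive members that mention the cache; "|| true" keeps a zero
# count from looking like a failure, because grep exits 1 on no match.
cache_hits=$(tar -tf jed.tar | grep -c -F '.default/Cache' || true)
echo "cache entries in archive: $cache_hits"
```

If the count is anything other than zero, look for a directory name in the list file first.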

To make backups a regular habit, we need to make them pertinent and economical (of time and of space). We often do not want to back up ephemeral files that are byproducts of our work. If you program, you will have ready examples on your own drive: .a, .o, .out, and .class files often do not need to be kept if you regenerate them several times a day.
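
One way to leave those byproducts out is to negate a group of -name tests in find. This is a sketch under assumptions: the scratch tree and its file names are invented, and the pattern list is just the four extensions mentioned above.

```shell
# Hypothetical code tree with sources and their build byproducts.
demo=$(mktemp -d)
mkdir -p "$demo/code/src"
touch "$demo/code/src/main.c"    "$demo/code/src/main.o" \
      "$demo/code/src/libdemo.a" "$demo/code/src/a.out" \
      "$demo/code/src/Main.java" "$demo/code/src/Main.class"
cd "$demo"

# Keep regular files whose names match none of the byproduct patterns.
# The "!" negates the whole parenthesized group.
find code -type f \
  ! \( -name '*.a' -o -name '*.o' -o -name '*.out' -o -name '*.class' \) \
  > keep.txt

cat keep.txt
```

Only main.c and Main.java make it into keep.txt. The trade-off with a deny-list like this is that any new kind of byproduct gets backed up until you add a pattern for it.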

Consider the example below. With it we can back up the substantive slice of our code tree to another drive on our system. We avoid the ephemeral files. We also chose to back up our code separately from the rest of our home directory. By doing this we can schedule code tree backups every hour, and schedule our home tree backups just once a day.

#!/bin/bash
function CodeSnap() {
    # Timestamp for the archive and list names, e.g. 2013-05-01.1430
    local now=$(date +%Y-%m-%d.%H%M)
    local arcnom="/mnt/backup/code.$now.tbz2"
    local flist="/tmp/code.$now.txt"
    find ~/code -type f -a\
      \( -name '*.xml' \
      -o -name '*.java' \
      -o -name '*.properties' \
      -o -name '*.php' \
      -o -name '*.pl' \
      -o -name '*.conf' \
      -o -name '*.pm' \
      -o -name '*.c' \
      -o -name '*.h' \
      -o -name '*.sh' \
      -o -name '[Mm]ake*' \
      \) > "$flist"
    find ~/Documents -type f -a\
      \( -name '*.php' \
      -o -name '*.pl' \
      -o -name '*.conf' \
      -o -name '*.pm' \
      \) >> "$flist"

    tar vcjf "$arcnom" -T "$flist"
}
##
## Copyright (C) 2013, Jed Reynolds
## Free for non commercial use.
##
CodeSnap
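
Before pointing a script like this at a real tree, you can dry-run the find expression and read what it selects. The sketch below does that in a scratch directory. The file names are made up, and the pattern list is trimmed to four of the patterns from CodeSnap to keep it short.

```shell
# Hypothetical code tree: sources, build files, and compiled byproducts.
demo=$(mktemp -d)
mkdir -p "$demo/code/app"
touch "$demo/code/app/Main.java" "$demo/code/app/Main.class" \
      "$demo/code/app/util.c"    "$demo/code/app/util.o" \
      "$demo/code/app/Makefile"  "$demo/code/app/build.xml"

# Same expression shape as CodeSnap: regular files AND (any listed name).
selected=$(find "$demo/code" -type f -a \
  \( -name '*.xml' \
  -o -name '*.java' \
  -o -name '*.c' \
  -o -name '[Mm]ake*' \
  \))

echo "$selected"
```

Four files come back: Main.java, util.c, Makefile, and build.xml. Main.class and util.o are skipped without ever being named, because an allow-list ignores whatever it does not mention.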

Questions? I hope! You just saw a full-strength, professional-level bash script. If you don’t have questions, show me your script.
