Backups: one quick file backup alias

When you have a file you need to edit and you have the foresight to think, “whoa, make a copy before I destroy…” you often copy hulk.txt to hulk.txt.old. Using the minimum of keystrokes, that’s:

 cp hul[tab][tab] hul[tab][tab].old[enter]

Well, a week later, what do you rename your next .old file? .old2? No time to put this folder into revision control? Thought so. You can inspect the last-modified time of your file with stat. Experiment with this first:

echo `stat hulk.txt | awk '/Modify:/ {print $2}'`

(*snrk* did I just get you to use Awk? OMG!)

So how does that help? More precisely, you’re asking: how do I add that to a backup file name? There are many ways, and I will show you the one with the least typing: an in-place shell expansion.

cp hulk.txt .hulk.txt.`stat hulk.txt | awk '/Modify:/ {print $2}'`

STOP. What wee character did I just sneak into that filename? Hold on, first write it up in an alias so you can reuse it:

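alias bu="cp hulk.txt .hulk.txt.\`stat hulk.txt | awk '/Modify:/ {print \$2}'\`"

Right, the backslashes (or ‘hacks’ as I nickname them) keep the shell from evaluating the command as soon as the alias is defined. The \$2 needs one too, because inside double quotes the shell would otherwise replace $2 with nothing before awk ever saw it. The backticks are command substitution: the shell runs whatever sits between them and pastes the output into the command line, the same as writing $(…stuff…). Anyhow, now type bu and you can back up hulk.txt again. Now type ‘ls’ and see where your backup is.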
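
If the timing still feels like hand-waving, here is a small sketch you can paste into a scratch directory. The file notes.txt and the variables early and late are made up for the demo; the same rule applies to the text of an alias.

```shell
#!/bin/bash
# Sketch: when does a backtick substitution actually run?
cd "$(mktemp -d)" || exit 1
echo one > notes.txt

# Unescaped: the backticks run right now, while the string is built.
early="lines: `wc -l < notes.txt`"
# Escaped: the backticks are stored as plain text and run later.
late="echo lines: \`wc -l < notes.txt\`"

echo two >> notes.txt        # the file grows after both definitions

late_out=$(eval "$late")     # only now is wc run for the escaped version
echo "$early"
echo "$late_out"
```

The first line still reports the one-line file it saw at definition time, while the second reports the file as it is when the stored command finally runs.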

No file? And no error? Oh, right, the period before the name hides it (sneaky). This means the backup survives the next time we accidentally do an “rm *” (which often appears when you say “rm * .old”. Computer, stop, replay with magnification: rm__*__.old ), because * skips hidden files. To see it, you need a good-old-fashioned:

ls -a

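It’s hiding. Let’s finish up here by making it work for any file. An alias cannot take arguments like $1, so the proper way to write it is as a shell function (here $(…) does the same job as the backticks):

bu() { cp "$1" ".$1.$(stat "$1" | awk '/Modify:/ {print $2}')"; }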

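Can we do it without that crazy awk? Sure, if you don’t mind the stamp being the modification time in seconds since the epoch (that is what %Y means to GNU stat):

bu() { cp "$1" ".$1.$(stat --printf %Y "$1")"; }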
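
One catch with the dot trick: if the file lives in another directory, the dot has to go in front of the file name, not in front of the whole path. Here is a sketch of one way to handle that. It assumes GNU date, whose -r option prints a file's modification time; the name bu_path and the scratch directory are made up for the demo.

```shell
#!/bin/bash
# bu_path FILE: copy FILE to a hidden, date-stamped sibling in the same directory.
bu_path() {
    local dir base stamp
    dir=$(dirname -- "$1")
    base=$(basename -- "$1")
    # GNU date: -r FILE reports the file's last-modified time
    stamp=$(date -r "$1" +%Y-%m-%d) || return 1
    cp -p -- "$1" "$dir/.$base.$stamp"
}

# Try it on a scratch file.
work=$(mktemp -d)
echo smash > "$work/hulk.txt"
bu_path "$work/hulk.txt"
ls -a "$work"
```

The cp -p keeps the original timestamps on the copy, which is handy when you later want to know how old a backup really is.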

Now go make a backup…right now!

Backups: using tar and find

If you are familiar with zip files, tar files are their Unix cousin (tar = Tape Archive). The tar utility was built for storing backups, though unlike zip it does not compress anything unless you ask it to. A quick way to back up your home directory is:

cd /home ; tar -cvf home-jed.tar ./jed

You might see that command grab a whole lot of stuff you don’t want to keep, including all your Firefox cache files and your Trash files. Also, that archive is uncompressed. Let’s get it compressed as much as we can first. That’s easy:

tar -cvjf home-jed.tbz2 ./jed
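
As an aside, tar can also skip things on its own with --exclude. Here is a sketch on a scratch tree, with invented directory names standing in for a home directory. It leaves compression out so you can see just the exclusion; add the j as above to compress.

```shell
#!/bin/bash
# Build a small scratch tree standing in for /home/jed.
top=$(mktemp -d)
mkdir -p "$top/jed/docs" "$top/jed/Cache"
echo keep > "$top/jed/docs/notes.txt"
echo junk > "$top/jed/Cache/blob"

cd "$top" || exit 1
# --exclude takes a shell-style pattern; here it drops anything named Cache.
tar -cf home-jed.tar --exclude='Cache' ./jed
# -t lists what is in the archive without extracting it.
listing=$(tar -tf home-jed.tar)
echo "$listing"
```

This is fine for one or two patterns. Once the list of exceptions grows, building the file list with find gives you more control.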

Next, we can build a list of files we want to back up using find. Please don’t try to avoid the find command. Once you begin to understand it, life in Linux really can improve. On our first try, we will pare the list down with fgrep (fixed-string grep) to exclude our Firefox Cache directory.

cd /home
# -type f lists files only; a directory in the list would make tar grab everything under it
find jed -type f \
| fgrep -v '.default/Cache' \
> /tmp/jed.txt

And following that, avoiding our trash can:

find jed -type f \
| fgrep -v '.default/Cache' \
| fgrep -v '.local/share/Trash' \
> /tmp/jed.txt

Now think about why we want to use pipe operators in that second find command. Would it be easier as two commands both appending to /tmp/jed.txt? (Think about the overlap and duplication that results.)
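
If you would rather see it than think about it, here is a sketch on a scratch tree with made-up file names. It spells fgrep as grep -F, which is the same tool under its current name.

```shell
#!/bin/bash
top=$(mktemp -d)
cd "$top" || exit 1
mkdir -p jed/Cache jed/Trash jed/docs
touch jed/Cache/c1 jed/Trash/t1 jed/docs/d1

# Chained filters: a line has to survive both greps.
find jed -type f | grep -F -v 'Cache' | grep -F -v 'Trash' > piped.txt

# Two separate commands appending to one list.
find jed -type f | grep -F -v 'Cache'  > appended.txt
find jed -type f | grep -F -v 'Trash' >> appended.txt

echo "piped:";    cat piped.txt
echo "appended:"; sort appended.txt
```

Each grep on its own still lets through the lines the other one would have removed, so the appended list brings the Cache and Trash files back and lists the good file twice.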

If we wanted to use that file to guide tar, we change our tar command like so:

tar cvjf ./jed.tbz2 -T /tmp/jed.txt
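
One thing to watch with -T: if the list contains a directory name, tar archives everything under that directory, filtered or not. Here is a sketch of the difference on a scratch tree (file names invented, compression left out to keep it short):

```shell
#!/bin/bash
top=$(mktemp -d)
cd "$top" || exit 1
mkdir -p jed/Cache jed/docs
touch jed/Cache/c1 jed/docs/d1

# Files only: no directory entry can drag its contents back in.
find jed -type f | grep -F -v 'Cache' > list.txt
tar -cf files.tar -T list.txt

# For contrast: this list still holds the directory "jed" itself.
find jed | grep -F -v 'Cache' > list-with-dirs.txt
tar -cf dirs.tar -T list-with-dirs.txt

files_listing=$(tar -tf files.tar)
dirs_listing=$(tar -tf dirs.tar)
echo "$files_listing"
echo "---"
echo "$dirs_listing"
```

The second archive contains the Cache file even though grep removed it from the list, because the line "jed" tells tar to recurse. Listing files only avoids that. GNU tar also has a --no-recursion option for the same purpose.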

In order to make regular backups a regularity, we need to make them pertinent and economical (of time and of space). We often do not want to back up ephemeral files that are byproducts of our work. If you program, you will have ready examples on your own drive: compiled files such as .a, .o, .out, and .class often do not need to be kept if you rebuild them several times a day.

Consider the example below. With it we can back up the substantive slice of our code tree to another drive on our system. We avoid the ephemeral files. We also chose to back up our code separately from the rest of our home directory. By doing this we can schedule code tree backups every hour, and schedule our home tree backups just once a day.

#!/bin/bash
function CodeSnap() {
    local now=`date +%Y-%m-%d.%H%M`
    local arcnom="/mnt/backup/code.$now.tbz2"
    local flist="/tmp/code.$now.txt"
    find ~/code -type f -a\
      \( -name '*.xml' \
      -o -name '*.java' \
      -o -name '*.properties' \
      -o -name '*.php' \
      -o -name '*.pl' \
      -o -name '*.conf' \
      -o -name '*.pm' \
      -o -name '*.c' \
      -o -name '*.h' \
      -o -name '*sh' \
      -o -name '[Mm]ake*' \
      \) > $flist
    find ~/Documents -type f -a\
      \( -name '*.php' \
      -o -name '*.pl' \
      -o -name '*.conf' \
      -o -name '*.pm' \
      \) >> $flist

    tar vcjf $arcnom -T $flist
}
##
## Copyright (C) 2013, Jed Reynolds
## Free for non commercial use.
##
CodeSnap

Questions? I hope! You just saw a full strength, professional level bash script. If you don’t have questions, show me your script.
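
One question I would expect: what are the escaped parentheses in those find commands for? Here is a sketch on a scratch tree (the file names are invented) that runs the same kind of test with and without them:

```shell
#!/bin/bash
top=$(mktemp -d)
cd "$top" || exit 1
mkdir -p code/build.c        # a directory whose name happens to end in .c
touch code/main.c code/util.h code/main.o

# Grouped: -type f applies to every -name alternative.
grouped=$(find code -type f -a \( -name '*.c' -o -name '*.h' \) | sort)

# Ungrouped: -a binds tighter than -o, so find reads this as
# (-type f AND *.h) OR (*.c), and the directory build.c slips in.
ungrouped=$(find code -type f -a -name '*.h' -o -name '*.c' | sort)

echo "$grouped"
echo "---"
echo "$ungrouped"
```

Without the grouping, the -type f test only guards the first pattern. The backslashes are there because the shell would otherwise treat the parentheses as its own syntax.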

Backups: outline

Here are some basic programs and techniques I’ll be covering about backups.

  • tar
  • rsync
  • find
  • date
  • how to write “now” using date
  • how to find files newer than your last backup
  • all this will be done in bash
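
As a taste of the date and “newer than” items, here is a sketch that assumes GNU touch and find. The marker file name .last-backup and the other file names are made up for the demo.

```shell
#!/bin/bash
top=$(mktemp -d)
cd "$top" || exit 1

# "Now" as a sortable stamp, for example for an archive name.
now=$(date +%Y-%m-%d.%H%M)
echo "archive would be called code.$now.tbz2"

# A marker file records when the last backup ran.
touch -d '2001-01-01 00:00' old.txt
touch -d '2005-01-01 00:00' .last-backup
touch new.txt                 # modified after the marker

# -newer selects files modified more recently than the marker.
changed=$(find . -type f -newer .last-backup)
echo "$changed"
```

After a successful backup you touch the marker again, and the next run only picks up what changed since.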

Backups: Using rsync and find.

I’m doing a talk at LinuxFest Northwest on making Very Sexy Backup Scripts. This is because you are more empowered when you know your filesystem and a little bit of bash scripting on your sweet Linux system.

I’ll start with an out-of-order post showing rsync and find. If you don’t know WTF I’m talking about, do comment, but stick with me. This could save you hundreds or thousands of dollars by avoiding purchasing a separate piece of backup software or proprietary solution.

#!/bin/bash
##
## backup script, (C) 2013, Jed Reynolds
##
source ./bu-rsync.sh
export DEST="latitude"
EXCLUDES="XX.gvfs PP.cache PPCache PPTrash"
bu_steps home/jreynolds/ / $DEST/home-jreynolds JUST_FILES $EXCLUDES
bu_steps home/jreynolds/ / $DEST/home-jreynolds $EXCLUDES
bu_steps home/liam/ / $DEST/home-liam JUST_FILES $EXCLUDES
bu_steps home/liam/ / $DEST/home-liam $EXCLUDES
bu_steps home/Music / $MDEST SIZEONLY
bu_steps home/Pictures / $PDEST SIZEONLY

EXCLUDES=""
bu_steps home/candela/btbits/x64_btbits/client / $DEST/home-candela/client \
'XX*.class'
bu_steps home/candela/btbits/x64_btbits/server / $DEST/home-candela/server \
'XX*.o'
bu_steps home/candela/btbits/x64_btbits/tools / $DEST/home-candela/btbits/x64_btbits/tools \
'XX*.o'
bu_steps home/candela/btbits/x64_btbits/3plibs / $DEST/home-candela/3plibs \
'XX*.o'
bu_steps home/candela/btbits/x64_btbits/html / $DEST/home-candela/html
bu_steps etc / $DEST/etc
bu_steps usr/local / $DEST/usr-local \
SIZEONLY
##
## Free for non-commercial use. No warranty or support offered.
##

That was the config file. It won’t do anything without the library of functions, below:

#!/bin/bash
##
## backup script library, (C) 2013, Jed Reynolds
## Free for non-commercial use. No warranty or support offered.
##
BU_HOST=beavertail
SSH_OP="-i /home/jreynolds/.ssh/beavertail_dsa"
LSYNC="rsync "
export REALM=tank
RHOST="backup@$BU_HOST"
RSYNC="rsync --progress -rlpt --copy-unsafe-links "
Y=`date +%Y`
DEST="latitude"
PDEST="pictures"
MDEST="music"
SDEST="softlib"
VDEST="VMs"
RSYNC_PASSWORD="m........"
RSYNC_PASSWD="m........"
FLIST="/tmp/bu-list"
MK_SNAP="000-mksnap-000"
export RSYNC_PASSWORD RSYNC_PASSWD

function fail() {
    local msg=${1:-"unknown cause"}
    echo " -- $msg --"
    exit 1
}
function ping_gw() {
    local default_gw=$(ip r | grep default | cut -d ' ' -f 3)
    ping_host $default_gw
    return $?
}
function ping_host() {
    [ -z "$1" ] && echo "ping_host cannot ping no host, bye" && exit 1
    ping -q -w 1 -c 1 $1    > /dev/null && return 0 || return 1
}
# bu_steps <source dir> <dir to cd into> <remote dir> [SIZEONLY] [JUST_FILES] [XXpattern...] [PPname...]
function bu_steps() {
    ping_gw                 || fail "no gateway"
    ping_host $BU_HOST      || fail "no ping to $BU_HOST"
    local f_others=""
    local r_others=""
    local dir_a=$1;         shift
    local sit_on=$1;        shift
    local dir_b=$1;         shift
    [ "${dir_b:0:1}" != "/" ]       && dir_b="$dir_b"
    local xcld=""
    local prun=""
    local flist="/tmp/flist.txt"
    local RECT=".recent"
    while(( "$#" )); do
        [ "$1" == "SIZEONLY" ]      && r_others="$r_others --size-only "
        [ "$1" == "JUST_FILES" ]    && f_others="-maxdepth 1 -type f $f_others "
        [ "$1" == "JUST_FILES" ]    && RECT=".recent_files"
        # XX<pattern>: leave out files whose name matches the pattern
        if [[ $1 == XX* ]] ; then
            [ ! -z "$xcld" ]        && xcld="$xcld -o"
            xcld="$xcld ${1/XX/-name }"
        fi
        # PP<name>: do not descend into directories with this name
        if [[ $1 == PP* ]] ; then
            [ ! -z "$prun" ]        && prun="$prun -o"
            prun="$prun ${1/PP/-name }"
        fi
        shift;
    done
    # Pruning has to come before the other tests, with an explicit -print at the end
    [ ! -z "$prun" ]                && prun="( $prun ) -prune -o"
    [ ! -z "$xcld" ]                && xcld="-a ! ( $xcld )"

    local mk_remot=0
    $LSYNC ${RHOST}::${REALM}/${dir_b} || mk_remot=1
    if [ $mk_remot -eq 1 ] ; then
       ssh jreynolds@$BU_HOST "sudo mkdir -p /$REALM/$dir_b && sudo chmod 777 /$REALM/$dir_b " || fail "Could not make dir in /$REALM/$dir_b"
    fi
    cd $sit_on

    # Find the .recent file, or create a recent file of Jan 1, 1990
    if [ ! -f $dir_a/$RECT ] ; then
        echo "creating ${sit_on}${dir_a}/$RECT that will pull Everything!"
        touch -d "01 Jan 1990" ${sit_on}${dir_a}/$RECT || fail "permission denied in ${sit_on}${dir_a}, bye"
    fi
    cat /dev/null > $flist
    if [ ! -d ${sit_on}${dir_a} ] ; then
        echo "** ${sit_on}${dir_a} not found, skipping **"
        return 0
    else
        FND_CMD="find $dir_a $f_others $prun -type f -a -newer ${sit_on}${dir_a}/$RECT $xcld -print"
        echo "$FND_CMD      > $flist"
        $FND_CMD            > $flist || fail "unable to complete find command"
    fi
    local fil_ct=`wc -l $flist | cut -d' ' -f1`
    if [ $fil_ct -lt 1 ] ; then
        echo "Skipping $dir_b, no files found"
    else
        echo "== $RSYNC $r_others --files-from=$flist $sit_on $RHOST::$REALM/$dir_b/ =="
        $RSYNC $r_others -v --files-from=$flist $sit_on $RHOST::$REALM/$dir_b/ || fail "Rsync failed, bad paths?"
        touch_mksnap $dir_b
        touch $dir_a/$RECT
    fi
}
function check_nas_ready() {
    ssh $SSH_OP jreynolds@$BU_HOST "~/bin/mount_up"
}
function touch_mksnap() {
    ssh $SSH_OP jreynolds@$BU_HOST "sudo touch /$REALM/$1/$MK_SNAP"
}

[ `id -u` == "0" ]  || fail "Do not run but as root, bye."
ping_gw             || fail "no gateway"
ping_host $BU_HOST  || fail "no ping to $BU_HOST"
check_nas_ready     || fail "NAS not prepared, cannot continue."

[ -f $DEST/NOT-MOUNTED ]        && mount $DEST
[ -f $PDEST/NOT-MOUNTED ]       && mount $PDEST
[ -f $MDEST/NOT-MOUNTED ]       && mount $MDEST

[ -f $DEST/NOT-MOUNTED ]        && fail "Remote system not mounted to $DEST, bye."
[ -f $MDEST/NOT-MOUNTED ]       && fail "Remote system not mounted to $MDEST, bye."
[ -f $PDEST/NOT-MOUNTED ]       && fail "Remote system not mounted to $PDEST, bye."
#
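
The XX and PP prefixes in the config are this library's own convention: XX marks a file name pattern to leave out, PP a directory name to skip entirely. If you want to play with the idea on its own, here is a small sketch that uses bash arrays, which keep a pattern like *.o from being expanded by the shell before find sees it. The function name list_files and the scratch tree are made up; this is an illustration, not the library's code.

```shell
#!/bin/bash
# list_files DIR [XXpattern|PPname]...
list_files() {
    local dir=$1; shift
    local -a prune=() skip=()
    local arg
    for arg in "$@"; do
        case $arg in
            # PP<name>: collect directory names, joined with -o
            PP*) prune+=( ${prune[0]:+-o} -name "${arg#PP}" ) ;;
            # XX<pattern>: each one becomes its own "! -name pattern"
            XX*) skip+=( ! -name "${arg#XX}" ) ;;
        esac
    done
    if (( ${#prune[@]} )); then
        find "$dir" \( "${prune[@]}" \) -prune -o -type f "${skip[@]}" -print
    else
        find "$dir" -type f "${skip[@]}" -print
    fi
}

# Try it on a scratch tree.
top=$(mktemp -d)
mkdir -p "$top/src/Cache" "$top/src/lib"
touch "$top/src/a.c" "$top/src/a.o" "$top/src/lib/b.c" "$top/src/Cache/junk"
cd "$top" || exit 1
found=$(list_files src 'XX*.o' PPCache | sort)
echo "$found"
```

The object file and everything under Cache stay out of the list, and the remaining lines are ready to hand to tar -T or rsync --files-from.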

Recommendations for #Linux storage server?

I’ve got an embedded 800 MHz VIA system that I’m going to retrofit with a 2TB drive for catching backups. I used to run CentOS 3.5 on there but those days are long gone. I’d rather run a rolling distro…Debian…Arch. I will likely also retrofit it with a CF adapter for the boot device. Likely I will use it for a caching DNS server, and if I have resources left over, maybe some Squid caching for the household as well. What would you recommend?