Given enough time, it happens to us all. We clobber some important file we’ve been poring over for the last 4 hours, and *poof*, it’s all gone. Many companies set up snapshot copies of your home directory at regular intervals, so when this happens you can quickly go to your snapshots and pull your lost file out of the ether. While you may only use this occasionally, it’s like an insurance policy that’s really worth the regular cost.

I do a lot of work on my laptop, and until now I didn’t have the luxury of regular snapshots of my work. Using a program like dirvish, it’s easy to set up an automated snapshot of whatever files you want with minimal disk usage. Because dirvish uses hard links to store the incremental backups, files that haven’t changed since the previous snapshot take up no additional space; each new snapshot only consumes space for the files that changed. This way you can store many snapshots at a given interval without multiplying the disk usage. Here’s how I did it on my Macbook; other Unix systems should be nearly identical.
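
To see why hard-linked snapshots are cheap, here is a small illustration in plain shell. It doesn’t use dirvish at all, and the scratch directory and the snap1/snap2 names are made up for the demo. It only shows that two directory entries can point at one copy of the data on disk.

```shell
# Illustration only: mimic two snapshots sharing one unchanged file.
# The directory and file names here are invented for the demo.
demo=$(mktemp -d "${TMPDIR:-/tmp}/hardlink-demo.XXXXXX")
mkdir "$demo/snap1" "$demo/snap2"
echo "unchanged contents" > "$demo/snap1/notes.txt"

# Link the file into the second "snapshot" instead of copying it.
ln "$demo/snap1/notes.txt" "$demo/snap2/notes.txt"

# Both names should report the same inode number, meaning one copy on disk.
inode1=$(ls -i "$demo/snap1/notes.txt" | awk '{print $1}')
inode2=$(ls -i "$demo/snap2/notes.txt" | awk '{print $1}')
echo "snap1 inode: $inode1"
echo "snap2 inode: $inode2"
```

A file that changes between runs gets a fresh copy in the new snapshot, so only the changed files cost extra space.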

I’ll walk you through the steps I took to install and set up a regular snapshot of my working files using dirvish. This was done on a Macbook Pro running OS X 10.4.8 and dirvish 1.2.1, but there’s nothing really Mac specific here, so porting it to other Unix-type platforms should be trivial.

Installing

Go download dirvish.

dirvish requires some additional perl modules that may not be installed on your system. To install these modules run:

» perl -MCPAN -e 'install Time::ParseDate'
» perl -MCPAN -e 'install Time::Period'
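
If you want to confirm the modules are in place before going further, a quick check along these lines should do. The check_perl_module helper is my own name for illustration, not part of dirvish or CPAN; it simply asks perl to load the module and reports the result.

```shell
# Report whether a Perl module can be loaded by the perl on your PATH.
# check_perl_module is a hypothetical helper, not a dirvish or CPAN command.
check_perl_module() {
  if perl -M"$1" -e 1 2>/dev/null; then
    echo "$1: installed"
  else
    echo "$1: missing"
  fi
}

check_perl_module Time::ParseDate
check_perl_module Time::Period
```

Note that this tests whichever perl comes first on your PATH, which may not be the one the dirvish installer picks up.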

After this you should be able to extract the dirvish download like so:

» tar -zxf dirvish-1.2.1.tgz

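Then run the dirvish installer. It prompts for a few settings; in the transcript below, the text after a prompt (/usr/local, yes) is what I typed, and I accepted the default shown in parentheses everywhere else: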

» cd dirvish-1.2.1
» sudo ./install.sh
» perl to use (/opt/local/bin/perl)
» What installation prefix should be used? () /usr/local
» Directory to install executables? (/usr/local/sbin)
» Directory to install MANPAGES? (/usr/local/share/man)
» Configuration directory (/etc/dirvish)
»
» Perl executable to use is /opt/local/bin/perl
» Dirvish executables to be installed in /usr/local/sbin
» Dirvish manpages to be installed in /usr/local/share/man
» Dirvish will expect its configuration files in /etc/dirvish
»
» Is this correct? (no/yes/quit) yes
»
» Executables created.
»
» Install executables and manpages? (no/yes) yes
»
» installing /usr/local/sbin/dirvish
» installing /usr/local/sbin/dirvish-runall
» installing /usr/local/sbin/dirvish-expire
» installing /usr/local/sbin/dirvish-locate
» installing /usr/local/share/man/man8/dirvish.8
» installing /usr/local/share/man/man8/dirvish-runall.8
» installing /usr/local/share/man/man8/dirvish-expire.8
» installing /usr/local/share/man/man8/dirvish-locate.8
» installing /usr/local/share/man/man5/dirvish.conf.5
»
» Clean installation directory? (no/yes) yes
» Install directory cleaned.

Configuration

Your installation of dirvish should be complete; now we need to configure your backups. The configuration directory we set above was /etc/dirvish, so we’ll need to place the following in /etc/dirvish/master.conf:

bank:
      /Users/shire/snaps
image-default: %Y%m%d.%H%M
log: none
Dirvish: /usr/local/sbin/dirvish
index: none
expire-default: +2 hours
Runall:
  data
 
 

This configuration sets our backups to go into the “bank” /Users/shire/snaps. You’ll want to change this to suit your own setup, such as /Users//snaps or /Users/backup or whatever you like. The image-default option sets the naming format of the snapshots, in this case a date and time stamp (year, month, day, a dot, then hour and minute). I’ve turned off the log and the index; you can consult the dirvish documentation on how these work. Be sure you set the path to the dirvish executable with the “Dirvish” option. Leaving it out prevented the automated cron jobs from running and cost me some extra debug time, which I’ll try to save you. I keep my snapshots for a 2 hour period as denoted by expire-default; you’ll want to adjust this to your liking once you have this working. Finally, the Runall option tells dirvish to update the “data” vault (vault is just a fancy name for a specific grouping of backups), which we’ll set up now…
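
The image-default value looks like a strftime-style format string. Assuming dirvish expands it the same way the date command does (an assumption worth checking against the dirvish documentation), you can preview what a snapshot directory will be called:

```shell
# Preview a snapshot name for the image-default format used above.
# Assumption: dirvish expands %Y%m%d.%H%M the same way date(1) does.
image_name=$(date +%Y%m%d.%H%M)
echo "a snapshot taken now would be named: $image_name"
```

Because the name only goes down to the minute, two snapshots started within the same minute would ask for the same name, so keep that in mind if you shorten the interval.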

Go ahead and create your ~/snaps directory (or whatever you used). Within this, create a subdirectory called data, and within that a directory called dirvish. (Basically just use the command “mkdir -p ~/snaps/data/dirvish”.) We need to set up some specific configs for this data vault, so you’ll want to place the following config in the ~/snaps/data/dirvish/default.conf file:

client: shirebook.local
tree: /Users/shire/data
exclude:
  + /apc/
  + /apc/**
  + /juice/
  + /juice/**
  + /monome/
  + /monome/**
  + /vld/
  + /vld/**
  + /xdebug/
  + /xdebug/**
  - *
 
 

We need to set the “client” setting to your machine’s local name. You can determine this by running the “hostname” command. The tree option specifies the root directory we are backing up; the exclude patterns are relative to this, as are the paths inside each snapshot. Dirvish doesn’t have an include option, so we have to do some hackery here. The “exclude” option really just gets passed to rsync, which accepts a +/- notation for which files to include or exclude. We’ll use rsync’s wildcard patterns (they look a bit like regular expressions, but aren’t) for the directories we want to back up. Basically this is “+ /dir/” followed by another “+ /dir/**”. Do this for each directory and end it with a “- *” to exclude everything else. Obviously, if you want to back up every file under the tree, you can skip the include trick and just list the patterns to leave out, like “*.o” to exclude any file ending in “.o”. I’d encourage you to consult the dirvish documentation or the rsync man page to tweak this to your liking.
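
Typing out the pair of “+” lines for every directory gets tedious, so here is a small sketch that prints the block for you. The make_exclude_block function is a name I made up for illustration; it only generates text for you to paste into default.conf and doesn’t touch dirvish or rsync.

```shell
# Print an "exclude:" block that keeps only the named top-level directories.
# make_exclude_block is a hypothetical helper; the directory names passed in
# are examples, so substitute your own.
make_exclude_block() {
  echo "exclude:"
  for dir in "$@"; do
    printf '  + /%s/\n' "$dir"
    printf '  + /%s/**\n' "$dir"
  done
  # Everything not matched by a "+" line above is excluded.
  printf '  - *\n'
}

make_exclude_block apc juice
```

The order matters: the “+” lines have to come before the final “- *”, since the first matching pattern wins.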

Execution

We’re almost there! The vault has to be initialized so that an initial backup can be created. This might take a little while depending on how much data you are copying. Go into the ~/snaps/data directory and run the following command. The vault name in the command must match the vault name listed under Runall in master.conf, in this case “data”. Also make sure that the user running this has enough permissions to access all the files, or you might get some strange errors. In my case I have to run this as root.

» /usr/local/sbin/dirvish --vault data --init

After this completes you should have a fresh snapshot in the same directory. Your source files are kept in a tree subdirectory of each snapshot. Check it out and make sure it’s got the files you want.
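
When you actually need a file back, you’ll usually want the most recent snapshot. Here is a minimal sketch for finding it, assuming the layout described above (snapshot directories named by timestamp inside the vault, each with a tree subdirectory). The latest_snapshot function and the some/file path are hypothetical names for illustration.

```shell
# Print the newest snapshot directory in a vault.
# Assumes image names are timestamps like 20070115.1430, so that
# alphabetical order is also chronological order.
latest_snapshot() (
  latest=""
  for image in "$1"/[0-9]*; do
    # Skip anything that doesn't look like a snapshot.
    if [ -d "$image/tree" ]; then
      latest=$image
    fi
  done
  [ -n "$latest" ] && echo "$latest"
)

# Example restore (paths assume the bank and vault used in this post;
# some/file is a placeholder):
# cp -p "$(latest_snapshot ~/snaps/data)/tree/some/file" ~/data/some/file
```

The function prints nothing and returns a non-zero status if the vault has no snapshots yet.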

The regular incremental snapshots are created with the command:

» /usr/local/sbin/dirvish-runall

You can try running this by hand to see it create another snapshot. Old snapshots are removed according to the rules you specified in /etc/dirvish/master.conf (+2 hours in my example). This is done by running:

» /usr/local/sbin/dirvish-expire

So to make this automated we need to set up some cron jobs. This is pretty straightforward, and more information can be found by doing a simple “man cron” command. Let’s open your crontab for editing (make sure you do this as the user you want to run the dirvish command; in my case this needs to be root due to some permission issues):

» crontab -e

Then add the following, adjusting to meet your needs:

# ------------- minute (0 - 59)
# | ----------- hour (0 - 23)
# | | --------- day of month (1 - 31)
# | | | ------- month (1 - 12)
# | | | | ----- day of week (0 - 6) (Sunday=0)
# | | | | |
# * * * * * command to be executed
 
0 */2 * * * /usr/local/sbin/dirvish-expire --quiet
*/10  * * * * /usr/local/sbin/dirvish-runall --quiet
 
 

This runs the dirvish-expire command every 2 hours, and the dirvish-runall command every 10 minutes. I run dirvish-expire every 2 hours as this is the expiration time I specified in /etc/dirvish/master.conf; you should probably adjust this to match whatever you have there. Once you save this file, cron will automatically read it and start running the jobs as specified. If you don’t start seeing regular snapshots, try checking the user’s mailbox, as cron will send errors there (just run “mail” as the user).
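
If you change these intervals, it helps to have a rough idea of how many snapshots will be sitting on disk. The arithmetic below is a back-of-the-envelope estimate based on my settings, not something dirvish reports, and the variable names are mine.

```shell
# Rough estimate of snapshot counts for the schedule above.
snapshot_interval_min=10   # dirvish-runall every 10 minutes
expire_after_min=120       # expire-default: +2 hours
expire_run_every_min=120   # dirvish-expire every 2 hours

# Snapshots that have not reached their expiry time yet.
unexpired=$(( expire_after_min / snapshot_interval_min ))
# Expired snapshots can linger until the next dirvish-expire run.
worst_case=$(( (expire_after_min + expire_run_every_min) / snapshot_interval_min ))

echo "unexpired snapshots at any time: about $unexpired"
echo "on disk just before an expire run: up to about $worst_case"
```

With my settings that works out to about 12 unexpired snapshots, and up to about 24 on disk right before dirvish-expire cleans up.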

Fin!

If both of us did everything right, you should now have fresh snapshots of your data that you’ll hopefully never have to use. If you run into problems, keep in mind that the dirvish config files can be pretty syntax sensitive, so try playing with spacing and tabs. Post your problems here or email me with suggestions on how this can be made better!