Johan Ekenberg

Computer Programmer and Double Bass Player

Linux Time Machine With Rsync

Edit: The script in this article has been superseded by the Linux Time Machine project on GitHub.

Time Machine is Apple’s backup solution for Mac computers. It makes incremental backups and provides a GUI (the star field) for browsing and restoring data from arbitrary points in time. I use it for my Macs and it is nice, although the star field is limiting since it provides no terminal access.

OK, but my important computers run Linux, and I want something similar for them. I don't need the GUI, just incremental backups that are easy to access. The magic bullet is rsync with the --link-dest= option. It makes rsync create a hard link, instead of a new copy, for every file that is unchanged since the previous backup. A hard link takes virtually no space, so all the backups together take roughly the space of one full copy plus whatever changes between versions.
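To see why unchanged files cost almost nothing, here is a minimal sketch of what a hard link is. The function name demo_hard_link and the temporary directory are only for illustration and are not part of the backup script. It assumes the temporary directory lives on a filesystem that supports hard links.

```shell
#!/bin/sh
# Illustration only: a hard link is a second name for the same data on disk.
demo_hard_link() {
    dir=$(mktemp -d) || return 1

    echo "some data" > "$dir/original"
    ln "$dir/original" "$dir/linked"      # no -s: this is a hard link

    # ls -i prints the inode number first; both names should share it.
    inode_a=$(ls -i "$dir/original" | awk '{print $1}')
    inode_b=$(ls -i "$dir/linked" | awk '{print $1}')

    if [ "$inode_a" = "$inode_b" ]; then
        echo "same inode"
    else
        echo "different inodes"
    fi

    rm -rf "$dir"
}

demo_hard_link
```

Both names point at one inode, so the data is stored once. It is only freed when the last name is removed, which is why deleting one backup never damages another.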

Here’s how I did it, together with a small script to get you started.

You need

  • rsync. It is probably already installed on your Linux machine. If not, install it through your package manager.
  • Somewhere to put the backups. I use a NAS from QNAP, but it could be any kind of external storage. Please note that the target filesystem must support hard links. Microsoft FAT-32 does not.

Configuring the script

My backup storage is mounted over NFS:

$ grep qnap-backup /etc/fstab
qnap:/backup  /mnt/qnap-backup nfs auto,user 0 0

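so $BACKUP_MOUNTPOINT will be /mnt/qnap-backup. You could also use ssh to back up over the network. After setting up SSH keys for password-less login, $BACKUP_MOUNTPOINT would be something like backup-hostname:/path/to/backup. Note that the script's mkdir, directory check and symlink steps run locally, so they would need adapting for a remote target.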

$BACKUP_EXCLUDE is the path to a file specifying what to exclude from the backup. Basically it contains exclude-patterns, one per line. See man rsync for details. Mine contains entries like:

/home/johan/Videos/*
/tmp/*
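Before trusting an exclude file with a full backup, it can help to preview what rsync would transfer. The following is a minimal sketch, assuming rsync is installed. preview_backup is a hypothetical helper, not part of the script in this article. It runs rsync in dry-run mode, so nothing is copied.

```shell
#!/bin/sh
# preview_backup SRC EXCLUDE_FILE
# Hypothetical helper: list what rsync would copy from SRC, honouring the
# exclude file, without transferring anything (-n means dry run).
preview_backup() {
    src=$1
    exclude_file=$2
    scratch=$(mktemp -d) || return 1    # empty throwaway destination

    # --out-format='%n' prints one file name per line.
    rsync -a -n --out-format='%n' \
        --exclude-from="$exclude_file" "$src"/ "$scratch"/
    status=$?

    rmdir "$scratch"
    return $status
}
```

A pattern with a leading slash, such as /tmp/*, is anchored at the top of the transfer. In the backup script the source is /, so the pattern refers to the real /tmp. With preview_backup it refers to the tmp directory directly under SRC.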

Adjust $BACKUP_MOUNTPOINT and $BACKUP_EXCLUDE, then schedule the script to run (as root) periodically. I run mine every night at 3 am. This is what it looked like after a few nights (piano is my hostname):

piano:/mnt/qnap-backup$ ls -1
piano-backup-2013-02-20
piano-backup-2013-02-21
piano-backup-2013-02-22
piano-backup-2013-02-23
piano-backup-2013-02-24
piano-backup-2013-02-25
piano-backup-2013-02-26
piano-backup-current

The last entry is a symlink to the most recent backup. This is important, since every new backup is compared to that one, and only new or changed files are actually stored on the filesystem. Everything else is hard-linked, meaning that it takes virtually no extra space.
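Because unchanged files are hard links, a file shares its inode with the copy in the previous snapshot unless it was modified in between. Here is a rough sketch that uses this to list the snapshots in which a given file actually changed. file_versions and its arguments are hypothetical names, and it assumes the directory naming shown above.

```shell
#!/bin/sh
# file_versions BACKUP_ROOT BACKUP_NAME REL_PATH
# Hypothetical helper: print each snapshot whose copy of REL_PATH differs
# from the one in the previous snapshot (a new inode means new content).
# REL_PATH has no leading slash, e.g. etc/fstab.
file_versions() {
    backup_root=$1
    backup_name=$2
    rel_path=$3
    prev_inode=""

    # The date pattern skips the -current symlink; YYYY-MM-DD names
    # sort chronologically when the shell expands the glob.
    for snap in "$backup_root/$backup_name"-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
        f="$snap/$rel_path"
        [ -f "$f" ] || continue

        inode=$(ls -i "$f" | awk '{print $1}')
        if [ "$inode" != "$prev_inode" ]; then
            echo "$snap"
        fi
        prev_inode=$inode
    done
}
```

With the listing above, file_versions /mnt/qnap-backup piano-backup etc/fstab would print only the snapshots that hold a distinct version of /etc/fstab.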

do_incremental_backup.sh
#!/bin/sh

BACKUP_MOUNTPOINT="/mnt/qnap-backup"
BACKUP_NAME="`hostname`-backup"
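# A quoted ~ is not expanded by the shell, so use $HOME instead.
# Under sudo, $HOME may be root's home; an absolute path is safer.
BACKUP_EXCLUDE="$HOME/local/etc/backup_exclude.conf"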

current_user=`id -u`

if [ "$current_user" != "0" ]; then
    echo "You need to be root (sudo) to run system backups"
    exit 1
fi

VERBOSE_ARGS=""
if [ "x$1" = "x-v" ]; then
    VERBOSE_ARGS="--progress -v"
fi

TODAY=`date +"%Y-%m-%d"`

CURRENT_BACKUP=$BACKUP_MOUNTPOINT/${BACKUP_NAME}-current
NEW_BACKUP=$BACKUP_MOUNTPOINT/${BACKUP_NAME}-$TODAY

mkdir -p "$NEW_BACKUP"

if [ ! -d "$NEW_BACKUP" ]; then
    echo "No such directory: $NEW_BACKUP"
    exit 1
fi

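rsync $VERBOSE_ARGS -a --delete --relative --one-file-system \
	--numeric-ids --exclude-from="$BACKUP_EXCLUDE" \
	--link-dest="$CURRENT_BACKUP/" / "$NEW_BACKUP/"
status=$?
# Exit code 24 only means some files vanished during the transfer,
# which is normal on a live system. Anything else is a real failure.
if [ $status -ne 0 ] && [ $status -ne 24 ]; then
    echo "rsync failed with exit code $status; keeping the old symlink"
    exit $status
fi

# new symlink to current
[ -h "$CURRENT_BACKUP" ] && rm -f "$CURRENT_BACKUP"
ln -s "$NEW_BACKUP" "$CURRENT_BACKUP"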

Going further

This was just to get you started. The script should probably be extended to do things like:

  • Erase old backups over a certain age.
  • Check that the backup storage is not full.
  • Log success or failure somewhere.
  • Alert you if there is a problem.
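As a starting point for the first item, here is a minimal sketch that finds snapshots older than a cutoff date. list_old_backups is a hypothetical helper, and it assumes the directory naming used by the script above. It only prints candidates, so you can review the list before deleting anything.

```shell
#!/bin/sh
# list_old_backups BACKUP_ROOT BACKUP_NAME CUTOFF
# Hypothetical helper: print snapshot directories dated before CUTOFF
# (YYYY-MM-DD). It deletes nothing.
list_old_backups() {
    backup_root=$1
    backup_name=$2
    cutoff_num=$(printf '%s' "$3" | tr -d '-')    # 2013-02-25 -> 20130225

    # The date pattern skips the -current symlink.
    for snap in "$backup_root/$backup_name"-[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]; do
        [ -d "$snap" ] || continue

        # Strip everything up to the date, then compare as plain numbers.
        snap_date=${snap#"$backup_root/$backup_name-"}
        snap_num=$(printf '%s' "$snap_date" | tr -d '-')

        if [ "$snap_num" -lt "$cutoff_num" ]; then
            echo "$snap"
        fi
    done
}
```

With GNU date, the cutoff could be computed as date -d '30 days ago' +%Y-%m-%d. The -d option is GNU-specific, so that part is an assumption about your system. Removing an old snapshot with rm -rf is safe for the newer ones, because hard-linked data stays on disk until its last link is gone.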

Here is an excellent article explaining the rsync approach in more depth.
