(Yet Another rsync Rotator)


Warning: Yarr! Backup is alpha software. Don't expect anything to work as advertised (or to be implemented at all). Ironic as it sounds, it is a good idea to back up your data before using it, since it has not really been tested yet.


Yarr! Backup is a backup tool for POSIX-like systems (e.g. Linux). It takes some directories of data and copies them into a backup which it puts into the store.

The store is just a directory, either on the local machine or on a machine accessible via ssh or directly via the rsync protocol. A backup is simply a subdirectory of the store. Its name is the date and time the backup was taken. It contains a full copy of the data directories plus either the log of the backup process (which marks the backup as complete) or a tag file marking it as incomplete. Other files or directories in the store are allowed and will be ignored by Yarr! Backup.

Either the data or the store can live on a remote machine. If the name you specify is prefixed by hostname: or user@hostname:, the rsync underlying Yarr! Backup will log into the host via ssh to access the data or store there. If the name is prefixed by hostname::, a native connection to an rsync daemon will be used.
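
To make the three prefix forms concrete, here is a minimal sketch of how a name could be classified. It is not taken from yarr.py: the Location type and the parse_location function are hypothetical names, and the rule that a name whose first colon comes after a slash is local mirrors rsync's own convention. rsync also accepts user@hostname:: for daemon connections, so the sketch allows that as well.

```python
import re
from typing import NamedTuple, Optional


class Location(NamedTuple):
    """Hypothetical description of a data or store name."""
    transport: str            # "local", "ssh" or "daemon"
    user: Optional[str]
    host: Optional[str]
    path: str


# The host part must not contain "/", so "./odd:name" stays a local path.
_DAEMON = re.compile(r"^(?:(?P<user>[^@:/]+)@)?(?P<host>[^@:/]+)::(?P<path>.*)$")
_SSH = re.compile(r"^(?:(?P<user>[^@:/]+)@)?(?P<host>[^@:/]+):(?P<path>.*)$")


def parse_location(name: str) -> Location:
    """Classify a name as local, ssh (host:) or rsync daemon (host::)."""
    # Try the double colon first, otherwise "host::x" would look like ssh.
    match = _DAEMON.match(name)
    if match:
        return Location("daemon", match["user"], match["host"], match["path"])
    match = _SSH.match(name)
    if match:
        return Location("ssh", match["user"], match["host"], match["path"])
    return Location("local", None, None, name)
```

With this sketch, colin:/home/backup/thrashbarg is classified as an ssh location on host colin, while /home is local.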

The backups are kept according to a set of backup plans, i.e. daily backups, weekly backups etc. Yarr! Backup's prime functionality is to maintain these plans.

Features

Yarr! Backup has been written to back up computers that are switched on and off at odd times onto a continuously running file server. It is designed specifically to complement the more canonical rsnapshot tool, which is unsuited for this use case.

So Yarr! Backup does:

How to use

The command line I use to back up my desktop PC is:

yarr.py backup --store=colin:/home/backup/thrashbarg /home /var/space /usr/local /etc

As you can see, I back up the home directories (/home), some other directory (/var/space), manually installed software (/usr/local) and my settings (/etc). Everything is pushed onto my backup server colin, where it is stored under /home/backup/thrashbarg. (thrashbarg is the name of my desktop; the store for my notebook is colin:/home/backup/fenchurch. You get the idea.)

To be able to back up transparently onto colin, I had to set up passwordless ssh access for root (the user running Yarr! Backup) from thrashbarg to colin. Search the ssh manpage for authorized_keys for details.

The backup subcommand does two things. First, it creates a new backup containing a snapshot of the data. Files that have not changed since another recent backup are not stored anew but are hardlinked to the existing data, so this is quite storage efficient even though each backup represents a full snapshot. After the snapshot is taken, obsolete backups are expired: daily, weekly, monthly, … and recent backups are retained, while extra backups (e.g. a daily one more than a week old) are deleted.
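
The hardlinking step can be sketched with rsync's --link-dest option, which hardlinks unchanged files to copies in an existing directory instead of storing them again. This is an illustration under assumptions, not the command yarr.py actually runs: the function name build_rsync_command, the timestamp format of the backup names and the choice of --archive are all hypothetical.

```python
from typing import List, Sequence


def build_rsync_command(
    sources: Sequence[str],
    store: str,
    new_name: str,
    previous_names: Sequence[str] = (),
) -> List[str]:
    """Build an rsync call that snapshots `sources` into store/new_name.

    Each earlier backup in `previous_names` is offered to rsync as a
    --link-dest directory, so unchanged files become hardlinks.
    """
    cmd = ["rsync", "--archive"]
    for name in previous_names:
        # A relative --link-dest path is resolved against the destination
        # directory, so "../<name>/" points at a sibling backup in the store.
        cmd.append(f"--link-dest=../{name}/")
    cmd.extend(sources)
    cmd.append(f"{store}/{new_name}/")
    return cmd


# Hypothetical backup names; the real naming scheme is not shown here.
command = build_rsync_command(
    ["/home", "/etc"],
    "colin:/home/backup/thrashbarg",
    "2024-01-02T03-04-05",
    ["2024-01-01T03-04-05"],
)
```

The resulting list can be handed to subprocess.run. Building the command as a list rather than a string avoids shell quoting problems with unusual directory names.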

For a list of other subcommands and options, run yarr.py without parameters.

Backup plans

Backups are retained or expired according to a set of backup plans. This is similar in spirit to the old custom of reusing the tapes of daily backups after a week while keeping the weekly tape of, say, the Friday backup for a month. But as we don't use any tapes and put everything on the same hard disk, we don't need to rotate and reuse; we only throw away backups that are no longer needed. So I have to admit that the program's name, "yet another rsync rotator", is, strictly speaking, just plain wrong.

A Yarr! Backup plan has two parameters: its period (max age) and its frequency (min age, sort of). A backup is considered in plan when the time between it and the most recent backup is no longer than the period and it is no closer than the frequency to the next older in-plan backup. In other words, to be in plan a backup must be neither too old nor too close to its predecessor.

Backups that are not in any plan (or incomplete) and not recent are considered expired and get deleted.
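
The rule above can be sketched in a few lines of Python. This is one reading of the description, not the code from yarr.py: the names in_plan and expired are hypothetical, every backup is assumed to be complete, "recent" is assumed to mean only the newest backup, and a period of None stands for an indefinite period.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Set, Tuple

# (frequency, period); a period of None means "indefinite".
Plan = Tuple[timedelta, Optional[timedelta]]


def in_plan(
    backups: Iterable[datetime],
    frequency: timedelta,
    period: Optional[timedelta],
) -> Set[datetime]:
    """Return the backups that one plan retains."""
    ordered = sorted(backups)
    if not ordered:
        return set()
    newest = ordered[-1]
    kept: Set[datetime] = set()
    last_kept: Optional[datetime] = None
    # Walk from oldest to newest, so "the next older in-plan backup"
    # is always the one kept most recently.
    for stamp in ordered:
        if period is not None and newest - stamp > period:
            continue  # too old for this plan
        if last_kept is None or stamp - last_kept >= frequency:
            kept.add(stamp)
            last_kept = stamp
    return kept


def expired(backups: Iterable[datetime], plans: Iterable[Plan]) -> List[datetime]:
    """Return the backups that are in no plan and are not the newest one."""
    all_backups = set(backups)
    if not all_backups:
        return []
    retained = {max(all_backups)}  # assumption: only the newest is "recent"
    for frequency, period in plans:
        retained |= in_plan(all_backups, frequency, period)
    return sorted(all_backups - retained)
```

For example, with a single plan of frequency one day and period seven days, and backups taken every twelve hours over three days, the sketch retains the four backups that are a full day apart and expires the three in between.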

The current alpha version uses a fixed set of backup plans:

Plan       Frequency  Period
daily      1 day      7 days
weekly     7 days     31 days
monthly    30 days    94 days
quarterly  90 days    366 days
annual     365 days   indefinite

This means you can run Yarr! Backup as often as you want; no more than one backup per day will be retained. After a week no more than one per week is retained, after a month one per month, after a quarter one per quarter, and after a year one per year. The annual backups are kept until you delete them manually.