Saturday, August 1, 2009

Back up server file changes with Rsync

WEBMASTER UPDATE

By David Gewirtz

Traditional file synchronization programs work by matching the contents of two directories. When used for backup, they mirror the contents of the source directory to the destination directory. When used in synchronization mode, they compare the two directories and copy any files missing from one over to the other, and vice versa.
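
To make those two modes concrete, here's a minimal Python sketch. It isn't taken from any particular sync product; the function names (backup_copy, sync_missing) are made up for illustration, and it only handles the simple cases (no deletions, no conflict handling when both sides hold different versions of a file).

```python
import shutil
from pathlib import Path


def _relative_files(root: Path) -> set[Path]:
    """Return every file under root, as paths relative to root."""
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def backup_copy(source: Path, dest: Path) -> list[str]:
    """Backup mode: copy every file under source into dest, whole-file."""
    copied = []
    for rel in sorted(_relative_files(source)):
        target = dest / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source / rel, target)  # always copies the entire file
        copied.append(rel.as_posix())
    return copied


def sync_missing(a: Path, b: Path) -> None:
    """Synchronization mode: copy files found on only one side to the other."""
    files_a, files_b = _relative_files(a), _relative_files(b)
    for src_root, dst_root, missing in (
        (a, b, files_a - files_b),  # in a but not in b
        (b, a, files_b - files_a),  # in b but not in a
    ):
        for rel in missing:
            target = dst_root / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_root / rel, target)
```

Notice that both functions call shutil.copy2 on the whole file. That's the behavior that becomes a problem once the files get big and the link gets slow.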

The problem with traditional file synchronization programs is they typically work by moving or copying entire files. This is fine on a local network, especially one working at gigabit speeds. But if you're moving files from a remote server to a local machine, you might not want to move entire files, especially if the files are particularly large.

Here at ZATZ, for example, we have a number of constantly updated server files that are in excess of 20GB. Some of these are database files we can synchronize at the record level using MySQL, but others are wildly large data files that simply can't be synchronized based on their internal structure.

And yet, because our server farm is located in Illinois and our development lab is located here in Central Florida, we want to make sure we have up-to-date complete local copies of everything running on the server. There's just no way we can download multiple 20GB+ files to our local machines on a daily basis. We'd clog the pipes and it'd still take days to download a single file.

The solution, instead, is to download only what's changed in the files and reconstruct each file on the local side by merging in the changes. We're not alone in this requirement; in fact, a particularly valuable Linux utility exists to do just this.
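
To show the idea of sending only the changes, here's a deliberately simplified Python sketch. It is not Rsync's actual algorithm: Rsync uses a rolling checksum so it can match blocks at any offset, while this toy version only compares fixed-size blocks at the same position in both files. The names (block_hashes, make_delta, apply_delta) and the tiny block size are assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose, for illustration; real tools use much larger blocks


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Local side: hash each fixed-size block of the old copy."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def make_delta(new: bytes, old_hashes: list[str],
               block_size: int = BLOCK_SIZE) -> dict:
    """Server side: keep only the blocks whose hash differs from the old copy."""
    changed = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(old_hashes) or old_hashes[index] != digest:
            changed[index] = block
    return {"length": len(new), "blocks": changed}


def apply_delta(old: bytes, delta: dict,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Local side: rebuild the new file from the old copy plus changed blocks."""
    out = bytearray()
    for index, start in enumerate(range(0, delta["length"], block_size)):
        if index in delta["blocks"]:
            out += delta["blocks"][index]  # block that was sent over the wire
        else:
            out += old[start:start + block_size]  # unchanged, reuse local data
    return bytes(out)
```

With an old copy of b"AAAABBBBCCCCDDDD" and a new version of b"AAAAXXXXCCCCDDDDEE", make_delta keeps only two blocks (b"XXXX" and b"EE"), so 6 of the 18 bytes need to travel, and apply_delta rebuilds the new file from those plus the local copy. The weakness of this fixed-block approach is that inserting a single byte near the start of a file shifts every later block, which is exactly the case a rolling checksum is designed to handle.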

Rsync

Rsync was developed back in 1996 by Australians Andrew Tridgell and Paul Mackerras. Tridgell used Rsync as the subject of his Ph.D. thesis (thanks, Wikipedia!). As Tridgell described it, it can take a long time to get data transmitted from the rest of the world to Australia, and he set out to find a way to make it go faster.

Rsync is the result of his quest, and it's long been part of most Linux distros. But what if you want to set up Rsync under Windows, you're not a Linux command-line wiz, and you don't want to rebuild everything on your server? That's what this article's all about.

Backing up servers

We're going to go through a scenario you can use to back up your server. As it turns out, there's a really slick little utility that'll get you most of the way.

Called DeltaCopy, this free utility is really a front-end to the Cygwin version of Rsync. But what makes DeltaCopy nice is that it comes with just rsync.exe and the necessary Cygwin DLLs, and that means you don't need to download the massive Cygwin install.

The first thing to do is install DeltaCopy on your server. Just double-click the installer and tell DeltaCopy to install Rsync as a service, as shown in Figure A.