How to Use rsync to Back Up Files

Copy-pasting is a straightforward way to back up important files and directories, but rsync takes the process further. This page describes the basic usage of the rsync utility.

Created: July 14, 2020

Why Simple Copy-Pasting May Not Be Enough

Imagine a folder with important data and an empty external hard drive for backups. Creating a backup of the folder is easy, right? Right-click on the folder, select Copy, right-click on the empty hard drive window and select Paste. Wait until the transfer completes, and you are done. This is what many people in my family seem to do.

While this approach works, it is poorly suited to incremental backups. Imagine you change the contents of the important folder after it was copied to the backup hard drive, and now you want to back it up again. Copy-paste, and you will be greeted with a message telling you that a folder with the same name already exists on the destination hard drive. Usually, you can decide whether to skip files with the same names or overwrite them. The dialog does not tell you whether the content of the files differs, so you end up either rewriting every file or skipping files whose content may have changed since the previous backup.

If the cp command were used instead, existing files would be overwritten without any prompt.
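
To make the cp behavior concrete, here is a minimal sketch that runs entirely in a throwaway temporary directory. The directory and file names (src, backup, notes.txt) are made up for the illustration and are not part of my actual setup.

```shell
# Sandbox demo: cp silently replaces an existing file of the same name.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT                  # clean up the sandbox on exit

mkdir "$work/src" "$work/backup"
echo "version 1" > "$work/src/notes.txt"
cp "$work/src/notes.txt" "$work/backup/"    # first backup

echo "version 2" > "$work/src/notes.txt"    # the original changes
cp "$work/src/notes.txt" "$work/backup/"    # second backup: no prompt, no warning

result=$(cat "$work/backup/notes.txt")
echo "$result"
```

The backup now holds the second version and nothing indicated that a file was replaced. Running cp with the -i switch would ask before overwriting, but it still would not tell you whether the two files actually differ.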

Back Up Files With rsync

I will cover only the basic functionality of rsync here. For more options, try man rsync.

Read the man pages (man rsync) before using any commands you don't understand. Incorrect usage of any of the parameters could result in data loss!

First-time Backup

Let's say I want to back up a data directory to an external hard drive mounted at /mnt/BHD. I want to preserve permissions and other attributes, so I will use the -a switch. To avoid permission errors, I use sudo.

sudo rsync -a ./data /mnt/BHD/

Note that there is no forward slash after the source data directory path! This copies the directory itself, so the backup ends up in /mnt/BHD/data. The forward slash after the destination path is optional.

Future Backups

I prefer to perform the backup in three steps.

  1. First, copy only the new files that do not exist in the destination and skip those that exist. Note that now there is a forward slash after the data directory path, so rsync copies the contents of the directory rather than the directory itself. Also, the data directory is specified in the destination path (here the trailing slash is optional). The first command, with the -n (dry run) switch, only lists the files that would be copied without copying anything. The second command copies the files.

    sudo rsync -anv --ignore-existing ./data/ /mnt/BHD/data/
    sudo rsync -a --ignore-existing ./data/ /mnt/BHD/data/
  2. Second, copy all remaining files that exist in the destination but have changed since the last backup. rsync skips files whose size and modification time are unchanged, which is the biggest advantage over plain cp or copy-paste.

    sudo rsync -anv ./data/ /mnt/BHD/data/
    sudo rsync -a ./data/ /mnt/BHD/data/
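  3. Lastly, remove files from the destination that no longer exist in the data directory. Check the dry-run output carefully before running the second command, because --delete removes data from the backup.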

    sudo rsync -anv --delete ./data/ /mnt/BHD/data/
    sudo rsync -a --delete ./data/ /mnt/BHD/data/

Compare Contents of Two Directories

To quickly compare the contents of two directories, data and /mnt/BHD/data, use the following command. By default, rsync uses file sizes and modification times to decide whether a file has changed. Because of the -n switch nothing is copied or deleted; the -i switch prints an itemized list of every file that differs.

sudo rsync -anvi --delete ./data/ /mnt/BHD/data/

To compare the contents of the files (instead of their sizes and modification times), add the -c switch as in the following command. It calculates and compares checksums of the data, which reliably detects files that differ by even a single bit. It takes much longer, though, because the source and destination files must be read in full to determine whether they differ.

sudo rsync -anvci --delete ./data/ /mnt/BHD/data/

Bonus: Preserve More Attributes, Copy Sparse Files, Show Progress

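In addition to what -a already preserves, the following command keeps ACLs (-A), extended attributes (-X), and hard links (-H), handles sparse files efficiently (-S), and shows the overall transfer progress (--info=progress2).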

sudo rsync -aAXHS --delete --info=progress2 ./data/ /mnt/BHD/data/