Ubuntu-Server
Rsync

Overview of rsync

rsync (remote sync) is a command-line utility for copying and synchronizing files, either locally or between systems. It is efficient because it skips files that are already up to date and, over a network, sends only the parts of a file that differ between source and destination, which minimizes data transfer.

Basic rsync Command Structure

rsync [options] source destination

Common rsync Options

  1. -a – Archive mode (preserves permissions, timestamps, symbolic links, and more).
  2. -v – Verbose mode (shows more details during execution).
  3. -z – Compresses file data during transfer (useful over slow network links; adds CPU overhead for little gain on local copies).
  4. --delete – Deletes files at the destination that don’t exist in the source.
  5. --progress – Displays progress of files being transferred.
  6. --exclude and --include – Used to exclude/include specific files or directories.

1. Local Backup with rsync

To back up a local directory to another local directory:

rsync -av --progress /source_directory/ /backup_directory/
  • Explanation:
    • -a: Archive mode for preserving metadata.
    • --progress: Shows progress.
    • The trailing / on /source_directory/ means “copy the contents of this directory” rather than the directory itself.

Tip: Use a trailing slash on the source directory when you want its contents placed directly in the destination. Without it, rsync creates /backup_directory/source_directory/ and nests the files one level deeper.

2. Remote Backup with rsync Over SSH

To back up files to a remote server, you can use SSH with rsync to ensure a secure transfer:

rsync -avz --progress -e "ssh -p 22" /source_directory/ user@remote_host:/backup_directory/
  • Explanation:
    • -e "ssh -p 22": Runs the transfer over SSH on port 22. rsync already uses SSH by default for host:path destinations, so -e is only needed to pass SSH options such as a different port.
    • user@remote_host:/backup_directory/: Replace with your actual SSH username and server address.

Tip: Setting up SSH keys between machines avoids entering a password for every transfer. Run ssh-keygen to generate a key pair, then ssh-copy-id user@remote_host to install the public key on the server.

3. Scheduling Backups with Cron

To automate backups, you can set up a cron job. Use crontab -e to edit cron jobs, then add:

0 2 * * * rsync -avz --delete --exclude='*.log' /source_directory/ user@remote_host:/backup_directory/
  • Explanation:
    • 0 2 * * *: Runs daily at 2:00 AM.
    • --delete: Removes files from destination that don’t exist on the source (good for mirroring).

Tip: Use --log-file=/path/to/logfile to log the backup operation. It’s useful for tracking errors or confirming backup completion.

4. Excluding Files and Directories

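To exclude specific files or directories from a backup, use --exclude. Patterns are matched against paths inside the transfer, so a leading / anchors the pattern to the top of the source directory (here /source_directory/), not to the filesystem root: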

rsync -av --exclude='*.tmp' --exclude='/path/to/exclude' /source_directory/ /backup_directory/

Tip: For complex exclusions, create an “exclusion list” file and use --exclude-from:

rsync -av --exclude-from='/path/to/exclude_list.txt' /source_directory/ /backup_directory/

5. Using rsync with Bandwidth Limiting

To avoid saturating the link during a transfer, use the --bwlimit option. Without a suffix, the value is in units of 1024 bytes per second, so 500 means roughly 500 KB/s:

rsync -avz --bwlimit=500 /source_directory/ user@remote_host:/backup_directory/

6. Verifying File Integrity with --checksum

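By default, rsync compares modification time and size to decide whether a file has changed. With --checksum it compares a checksum of each file on both sides instead. This catches changes that leave size and timestamp untouched, but it is much slower because every file must be read in full: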

rsync -avz --checksum /source_directory/ /backup_directory/

7. Backup Over Slow Connections with rsync and SSH Compression

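If transferring large files over slow networks, compression can speed up the transfer. You can let SSH compress the stream instead of rsync, but do not enable both: compressing data twice costs CPU time without saving bandwidth.

rsync -av -e "ssh -C" /source_directory/ user@remote_host:/backup_directory/
  • Explanation:
    • -e "ssh -C": Enables SSH compression. -z is left out here because SSH already compresses the stream.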

Tip: rsync already transfers only new and changed files, so when you copy to the same destination repeatedly, the initial transfer of a large directory tree may take time but subsequent runs will be much faster.

8. Dry Run Mode

To see what rsync would do without making changes, use --dry-run:

rsync -avz --dry-run /source_directory/ user@remote_host:/backup_directory/

Tip: This option is great for testing your backup scripts before running them live.

9. Advanced Remote Backups with SSH Configuration Files

If you regularly back up to the same remote server, you can set up SSH configuration shortcuts:

  1. Edit the SSH config file ~/.ssh/config:

    Host backup_server
      HostName remote_host
      User username
      Port 22
      IdentityFile ~/.ssh/id_rsa
  2. Then run rsync with the shortcut:

    rsync -avz /source_directory/ backup_server:/backup_directory/

Practical Example: Full Command with All Options

Here’s a full example command for a remote backup with various options:

rsync -avz --progress --delete --exclude-from='/path/to/exclude_list.txt' --bwlimit=500 --log-file='/var/log/rsync_backup.log' -e "ssh -p 2222" /source_directory/ user@remote_host:/backup_directory/
  • Explanation:
    • Combines archive mode, compression, progress display, deletion of extra files, an exclusion list, a bandwidth limit, and logging, all over SSH on a custom port (2222).

Troubleshooting Tips

  1. Permissions Issues: Use sudo with rsync if permission errors occur, especially for system directories. Note that sudo only elevates the local side of a transfer.
  2. Network Issues: If using SSH, ensure the SSH server is running on the remote machine and that the firewall allows the SSH port.
  3. Directory Structure: Double-check directory paths and trailing slashes (/) to prevent nesting issues.
  4. Logging and Monitoring: Use --log-file to track transfer issues, and check the log or exit status of scheduled cron runs so failed backups are noticed.
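When a transfer fails, the exit status of rsync often narrows down the cause. The helper below is a hypothetical convenience function (describe_rsync_exit is not part of rsync) that maps a few of the codes documented in the EXIT VALUES section of the rsync man page to short hints; the hints themselves are rough guidance rather than a full diagnosis.

```shell
#!/bin/sh
# Sketch: translate common rsync exit codes into hints.
# describe_rsync_exit is a hypothetical helper, not an rsync feature.
describe_rsync_exit() {
    case "$1" in
        0)  echo "success" ;;
        1)  echo "syntax or usage error: check options and paths" ;;
        12) echo "error in rsync protocol data stream: often a dropped connection" ;;
        23) echo "partial transfer due to error: often a permissions problem" ;;
        24) echo "partial transfer: source files vanished during the run" ;;
        30) echo "timeout in data send/receive" ;;
        *)  echo "exit code $1: see the EXIT VALUES section of the rsync man page" ;;
    esac
}

# Typical use right after a transfer (left commented out):
# rsync -a /source_directory/ /backup_directory/
# describe_rsync_exit "$?"
describe_rsync_exit 23
```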

Best Practices

  1. Plan the Backup Schedule: Run heavy backups during off-hours.
  2. Automate with SSH Keys: Set up passwordless SSH for automated scripts.
  3. Use Checksums for Critical Backups: Enable --checksum for sensitive data to ensure accuracy.
  4. Log All Backups: Use --log-file and monitor logs regularly.
  5. Test Restores Regularly: Periodically test backup restores to ensure data integrity and completeness.

Using rsync with these techniques will give you efficient and reliable backups, with flexibility for local or remote scenarios.

Cheat Sheet

Command/Option | Description | Example
--- | --- | ---
rsync -a source/ destination/ | Archive mode (preserves permissions, timestamps, symlinks, etc.) for local sync. | rsync -a /data/ /backup/
rsync -av source/ destination/ | Archive and verbose mode to display detailed information. | rsync -av /data/ /backup/
rsync -avz source/ destination/ | Compresses data during transfer (recommended for network backups). | rsync -avz /data/ remote:/backup/
rsync -e ssh | Specifies ssh as the remote shell to securely sync files over the network. | rsync -avz -e ssh /data/ user@host:/backup/
rsync --dry-run | Runs a “dry run” to simulate changes without making them (useful for testing). | rsync -avz --dry-run /data/ /backup/
rsync --progress | Shows progress of each file transfer. | rsync -avz --progress /data/ /backup/
rsync --delete | Deletes files in the destination that don’t exist in the source (sync/mirror). | rsync -avz --delete /data/ /backup/
rsync --exclude='pattern' | Excludes files matching a pattern (e.g., specific file types or folders). | rsync -av --exclude='*.tmp' /data/ /backup/
rsync --exclude-from=file.txt | Excludes patterns listed in a text file (one pattern per line). | rsync -av --exclude-from='exclude_list.txt' /data/ /backup/
rsync --include='pattern' | Keeps files matching a pattern; combine with --exclude to copy only those files. | rsync -av --include='*/' --include='*.jpg' --exclude='*' /data/ /backup/
rsync --checksum | Compares files by checksum, not just size/date (slower, for critical data). | rsync -avz --checksum /data/ /backup/
rsync --bwlimit=KB | Limits bandwidth (in KB/s) for network transfers. | rsync -avz --bwlimit=500 /data/ remote:/backup/
rsync --log-file=logfile.log | Logs output to a specified file. | rsync -avz --log-file='/var/log/rsync.log' /data/ /backup/
rsync --archive or -a | Combines multiple options: -rlptgoD (recursive, links, perms, times, group, owner, devices and special files). | rsync -a /data/ /backup/
rsync -r | Recursive transfer for syncing directories. | rsync -r /data/ /backup/
rsync -P | Combines --partial (keeps partially transferred files so a rerun can resume) and --progress. | rsync -avzP /data/ /backup/
rsync -u | Skips files that are newer on the destination (does not overwrite newer files). | rsync -avzu /data/ /backup/
rsync -c | Short form of --checksum, useful for verifying file integrity. | rsync -avzc /data/ /backup/
rsync -n | Short form of --dry-run. | rsync -avn /data/ /backup/
rsync -p | Preserves permissions of files during transfer (already included in -a). | rsync -rvp /data/ /backup/
rsync -t | Preserves modification times for accurate file synchronization (already included in -a). | rsync -rvt /data/ /backup/
rsync -g | Preserves group ownership of files (for multi-user systems; already included in -a). | rsync -rvg /data/ /backup/
rsync -o | Preserves file ownership (requires root; already included in -a). | sudo rsync -rvo /data/ /backup/
rsync -l | Copies symbolic links as symbolic links (already included in -a). | rsync -rvl /data/ /backup/
rsync -D | Transfers device files and special files such as FIFOs and sockets (device files require root). | sudo rsync -rvD /data/ /backup/
rsync --times | Long form of -t. | rsync -rv --times /data/ /backup/
rsync --size-only | Skips files that match in size, ignoring timestamps (useful when timestamps were not preserved). | rsync -av --size-only /data/ /backup/
rsync --no-perms | Does not preserve permissions (turns off the -p implied by -a); useful when permissions mismatch across systems. | rsync -av --no-perms /data/ /backup/
rsync --link-dest=DIR | Hard-links unchanged files to their copies in DIR (for example, the previous backup) to save space. | rsync -av --link-dest=/prev_backup/ /data/ /backup/