Overview of rsync
rsync (Remote Sync) is a command-line utility for copying and synchronizing files locally and between systems. It is efficient because it skips files that are already up to date and, over a network, sends only the changed parts of files that differ, which minimizes data transfer.
Basic rsync Command Structure
rsync [options] source destination
Common rsync Options
- -a: Archive mode (preserves permissions, timestamps, symbolic links, and more).
- -v: Verbose mode (shows more details during execution).
- -z: Compresses file data during transfer (useful for network transfers).
- --delete: Deletes files at the destination that don’t exist in the source.
- --progress: Displays progress of files being transferred.
- --exclude and --include: Used to exclude/include specific files or directories.
1. Local Backup with rsync
To back up a local directory to another local directory:
rsync -av --progress /source_directory/ /backup_directory/
- Explanation:
  - -a: Archive mode for preserving metadata.
  - --progress: Shows progress.
  - The trailing / on /source_directory/ means “sync the contents of this directory” (recommended for consistent backups).
Tip: Always use a trailing slash on the source directory. Without it, rsync copies the directory itself into the destination, creating /backup_directory/source_directory/.
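To make the trailing-slash rule hard to forget in scripts, one option is a small wrapper that always appends the slash. This is a minimal sketch in POSIX shell; the function name sync_contents is hypothetical and not part of rsync.

```shell
# sync_contents SRC DEST (hypothetical helper, not an rsync feature)
# Copies the contents of SRC into DEST, never the SRC directory itself.
sync_contents() {
    src=${1%/}/   # drop one trailing slash if present, then append one
    rsync -a -- "$src" "$2"
}
```

With this helper, sync_contents /source_directory /backup_directory and sync_contents /source_directory/ /backup_directory both run rsync with /source_directory/ as the source.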
2. Remote Backup with rsync
Over SSH
To back up files to a remote server, run rsync over SSH so the transfer is encrypted:
rsync -avz --progress -e "ssh -p 22" /source_directory/ user@remote_host:/backup_directory/
- Explanation:
  - -e "ssh -p 22": Uses SSH on port 22. Change the port if SSH runs on a different one.
  - user@remote_host:/backup_directory/: Replace with your actual SSH username, server address, and destination path.
Tip: Setting up SSH keys between machines avoids entering a password for every transfer. Run ssh-keygen to generate a key pair and copy the public key to the server with ssh-copy-id.
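The key setup from the tip above could be scripted roughly as follows. This is a sketch under assumptions: the function name setup_backup_key and the default key path are hypothetical, and the key is created with an empty passphrase (-N '') so unattended jobs can use it, which is a trade-off you may not want on shared machines.

```shell
# setup_backup_key TARGET [KEYFILE] (hypothetical helper)
# TARGET is the SSH destination, e.g. user@remote_host.
setup_backup_key() {
    target=$1
    key=${2:-$HOME/.ssh/id_ed25519_backup}   # assumed key location
    # Generate the key pair only if it does not exist yet.
    [ -f "$key" ] || ssh-keygen -t ed25519 -f "$key" -N '' -C 'rsync backup' || return
    # Install the public key on the remote host (asks for the password once).
    ssh-copy-id -i "$key.pub" "$target"
}
```

After running setup_backup_key user@remote_host, rsync can be pointed at the key with -e "ssh -i ~/.ssh/id_ed25519_backup".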
3. Scheduling Backups with Cron
To automate backups, you can set up a cron job. Use crontab -e to edit your cron jobs, then add:
0 2 * * * rsync -avz --delete --exclude='*.log' /source_directory/ user@remote_host:/backup_directory/
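- Explanation:
  - 0 2 * * *: Runs daily at 2:00 AM.
  - --delete: Removes files from the destination that don’t exist on the source (good for mirroring).
  - --exclude='*.log': Skips files ending in .log.
Tip: Use --log-file=/path/to/logfile to log the backup operation. It’s useful for tracking errors or confirming backup completion. Because cron runs unattended, the remote transfer only works with passwordless (key-based) SSH.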
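Instead of placing the whole command in the crontab, cron can call a small wrapper that logs and reports failures. The sketch below is one possible shape, not a fixed recipe; the function name run_backup and the BACKUP_LOG variable are hypothetical, and the paths are the placeholders used in this article.

```shell
# run_backup (hypothetical wrapper for use from cron)
run_backup() {
    log=${BACKUP_LOG:-/tmp/rsync_backup.log}   # assumed log location
    if rsync -az --delete --exclude='*.log' --log-file="$log" \
        /source_directory/ user@remote_host:/backup_directory/
    then
        echo "backup ok"
    else
        status=$?   # exit code of the failed rsync
        echo "backup failed with exit code $status" >&2
        return "$status"
    fi
}
```

Cron mails anything written to standard error to the job's owner on many systems, so the failure message doubles as a simple alert.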
4. Excluding Files and Directories
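To exclude specific files or directories from a backup, use --exclude:
rsync -av --exclude='*.tmp' --exclude='/path/to/exclude' /source_directory/ /backup_directory/
Note: A pattern that starts with / is anchored to the root of the transfer (here /source_directory/), not to the filesystem root, so '/path/to/exclude' matches /source_directory/path/to/exclude.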
Tip: For complex exclusions, create an “exclusion list” file and use --exclude-from:
rsync -av --exclude-from='/path/to/exclude_list.txt' /source_directory/ /backup_directory/
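An exclusion list is a plain text file with one pattern per line; blank lines and lines starting with # are ignored. As an illustration, the following sketch writes a sample list. The function name write_exclude_list and the patterns themselves (including /cache/) are made up for the example.

```shell
# write_exclude_list FILE (hypothetical helper): create a sample exclusion list
write_exclude_list() {
    cat > "$1" <<'EOF'
# editor and temporary files, anywhere in the tree
*.tmp
*.swp
# a directory at the top of the transfer (leading slash anchors it)
/cache/
EOF
}
```

The resulting file can then be passed to rsync with --exclude-from, as in the command above.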
5. Using rsync with Bandwidth Limiting
To avoid consuming all available bandwidth during a transfer, use the --bwlimit option. The value is in KB/s, counted in units of 1024 bytes:
rsync -avz --bwlimit=500 /source_directory/ user@remote_host:/backup_directory/
6. Verifying File Integrity with --checksum
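By default, rsync compares file sizes and modification times to decide which files to transfer. With --checksum it compares a checksum of every file on both sides instead, which catches changes that leave size and timestamp untouched but is much slower: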
rsync -avz --checksum /source_directory/ /backup_directory/
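Combining --checksum with a dry run gives a way to audit an existing backup without changing it. The sketch below assumes that a clean comparison produces no itemized output; the function name verify_backup is hypothetical.

```shell
# verify_backup SRC DEST (hypothetical helper)
# Reports files whose content differs or that are missing at DEST.
verify_backup() {
    # -r recurse, -c compare by checksum, -n dry run, --itemize-changes list differences
    diff_list=$(rsync -rcn --itemize-changes -- "${1%/}/" "${2%/}/") || return
    if [ -z "$diff_list" ]; then
        echo "backup matches source"
    else
        printf '%s\n' "$diff_list"
        return 1
    fi
}
```

Because every file is read on both sides, this is best run occasionally rather than on every backup.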
7. Backup Over Slow Connections with rsync and SSH Compression
If you are transferring large files over a slow network, compression can speed things up. You can use rsync’s own -z or let SSH compress the stream with -C, but use one or the other: compressing already-compressed data costs CPU time without saving bandwidth. To rely on SSH compression:
rsync -av -e "ssh -C" /source_directory/ user@remote_host:/backup_directory/
- Explanation:
  - -e "ssh -C": Enables SSH compression (so -z is left out).
Tip: rsync already transfers only new and changed files. When you copy to the same destination frequently, the initial transfer of a large directory tree may take time, but subsequent runs will be much faster.
8. Dry Run Mode
To see what rsync would do without making changes, use --dry-run:
rsync -avz --dry-run /source_directory/ user@remote_host:/backup_directory/
Tip: This option is great for testing your backup scripts before running them live.
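A dry run is especially useful before a mirror with --delete. One possible safety check, sketched below, counts how many files a run would delete so a script can refuse to continue if the number looks wrong. The function name count_deletions is hypothetical; the sketch relies on rsync's itemized output marking deletions with a leading *deleting.

```shell
# count_deletions SRC DEST (hypothetical helper)
# Prints how many entries "rsync -a --delete" would remove from DEST.
count_deletions() {
    rsync -a --delete --dry-run --itemize-changes -- "$1" "$2" |
        awk '/^\*deleting/ { n++ } END { print n + 0 }'
}
```

A backup script could, for example, abort when the count exceeds a threshold you choose, since an unexpectedly large number often means the source path was wrong or empty.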
9. Advanced Remote Backups with SSH Configuration Files
If you regularly back up to the same remote server, you can set up SSH configuration shortcuts:
- Edit the SSH config file ~/.ssh/config:
Host backup_server
    HostName remote_host
    User username
    Port 22
    IdentityFile ~/.ssh/id_rsa
- Then run rsync with the shortcut:
rsync -avz /source_directory/ backup_server:/backup_directory/
Practical Example: Full Command with All Options
Here’s a full example command for a remote backup with various options:
rsync -avz --progress --delete --exclude-from='/path/to/exclude_list.txt' --bwlimit=500 --log-file='/var/log/rsync_backup.log' -e "ssh -p 2222" /source_directory/ user@remote_host:/backup_directory/
- Explanation:
- Combines archive mode, compression, deletion of extra files, exclusion list, bandwidth limit, and logging, all over SSH with a custom port.
Troubleshooting Tips
- Permissions Issues: Use sudo with rsync if permission errors occur, especially for system directories.
- Network Issues: If using SSH, ensure the SSH service is running on the remote machine and the firewall allows the SSH port.
- Directory Structure: Double-check directory paths and trailing slashes (/) to prevent nesting issues.
- Logging and Monitoring: Use logging to track transfer issues and cron for regular monitoring of backups.
Best Practices
- Plan the Backup Schedule: Run heavy backups during off-hours.
- Automate with SSH Keys: Set up passwordless SSH for automated scripts.
- Use Checksums for Critical Backups: Enable --checksum for sensitive data to ensure accuracy.
- Log All Backups: Use --log-file and monitor logs regularly.
- Test Restores Regularly: Periodically test backup restores to ensure data integrity and completeness.
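A restore is simply rsync run in the other direction. As a rough sketch of a restore test, the backup can be copied into a scratch directory and inspected there; the function name restore_test is hypothetical.

```shell
# restore_test BACKUP (hypothetical helper)
# Restores BACKUP into a fresh temporary directory and prints its location.
restore_test() {
    backup=$1
    scratch=$(mktemp -d) || return
    if rsync -a -- "${backup%/}/" "$scratch/"; then
        echo "restored into $scratch"
    else
        status=$?              # keep rsync's exit code
        rm -rf -- "$scratch"   # clean up the partial restore
        return "$status"
    fi
}
```

The restored tree can then be compared against the live data, for instance with diff -r, before the scratch directory is removed.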
Using rsync with these techniques will give you efficient and reliable backups, with flexibility for local or remote scenarios.
Cheat Sheet
Command/Option | Description | Example |
---|---|---|
rsync -a source/ destination/ | Archive mode (preserves permissions, timestamps, symlinks, etc.) for local sync. | rsync -a /data/ /backup/ |
rsync -av source/ destination/ | Archive and verbose mode to display detailed information. | rsync -av /data/ /backup/ |
rsync -avz source/ destination/ | Compresses data during transfer (recommended for network backups). | rsync -avz /data/ remote:/backup/ |
rsync -e ssh | Specifies ssh as the remote shell to securely sync files over the network. | rsync -avz -e ssh /data/ user@host:/backup/ |
rsync --dry-run | Runs a “dry run” to simulate changes without making them (useful for testing). | rsync -avz --dry-run /data/ /backup/ |
rsync --progress | Shows progress of each file transfer. | rsync -avz --progress /data/ /backup/ |
rsync --delete | Deletes files in the destination that don’t exist in the source (sync/mirror). | rsync -avz --delete /data/ /backup/ |
rsync --exclude='pattern' | Excludes files matching a pattern (e.g., specific file types or folders). | rsync -av --exclude='*.tmp' /data/ /backup/ |
rsync --exclude-from=file.txt | Excludes patterns listed in a text file (one pattern per line). | rsync -av --exclude-from='exclude_list.txt' /data/ /backup/ |
rsync --include='pattern' | Exempts matching files from later --exclude rules; pair it with --exclude to copy only matching files. | rsync -av --include='*/' --include='*.jpg' --exclude='*' /data/ /backup/ |
rsync --checksum | Checksums files to determine changes, not just size/date (slower, for critical data). | rsync -avz --checksum /data/ /backup/ |
rsync --bwlimit=KB | Limits bandwidth (in KB/s) for network transfers. | rsync -avz --bwlimit=500 /data/ remote:/backup/ |
rsync --log-file=logfile.log | Logs output to a specified file. | rsync -avz --log-file='/var/log/rsync.log' /data/ /backup/ |
rsync --archive or -a | Combines multiple options: -rlptgoD (recursive, links, perms, times, group, owner, devices and special files). | rsync -a /data/ /backup/ |
rsync -r | Recursive transfer for syncing directories. | rsync -r /data/ /backup/ |
rsync -P | Combines --partial (resumes partial transfers) and --progress . | rsync -avzP /data/ /backup/ |
rsync -u | Skips files that are newer on the destination (does not overwrite newer files). | rsync -avzu /data/ /backup/ |
rsync -c | Uses checksum for file comparison, useful for verifying file integrity. | rsync -avzc /data/ /backup/ |
rsync -n | Another way to specify a “dry run.” | rsync -avn /data/ /backup/ |
rsync -p | Preserves permissions of files during transfer. | rsync -avp /data/ /backup/ |
rsync -t | Preserves timestamps for accurate file synchronization. | rsync -avt /data/ /backup/ |
rsync -g | Preserves group ownership of files (for multi-user systems). | rsync -avg /data/ /backup/ |
rsync -o | Preserves file ownership (only run as root to avoid errors). | sudo rsync -avo /data/ /backup/ |
rsync -l | Copies symbolic links as symbolic links. | rsync -avl /data/ /backup/ |
rsync -D | Preserves device files and special files such as sockets and FIFOs (device files require root). | sudo rsync -aD /data/ /backup/ |
rsync --times or -t | Preserves modification times for files (same as -t in -a ). | rsync -av --times /data/ /backup/ |
rsync --size-only | Compares files by size only, ignoring timestamps (useful for same-size files with diff dates). | rsync -av --size-only /data/ /backup/ |
rsync --no-perms | Ignores permissions, useful when permissions mismatch across systems. | rsync -av --no-perms /data/ /backup/ |
rsync --link-dest=DIR | Hard-link to unchanged files in another directory to save space. | rsync -av --link-dest=/prev_backup/ /data/ /backup/ |
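
The --link-dest option in the last row is commonly used for dated snapshots in which unchanged files are hard links to the previous snapshot. The following is a minimal sketch of that pattern, assuming ROOT is an absolute path to an existing directory on a filesystem that supports hard links; the function name snapshot and the latest symlink convention are hypothetical choices for the example.

```shell
# snapshot SRC ROOT (hypothetical helper)
# Creates ROOT/<timestamp>/ and points ROOT/latest at it.
snapshot() {
    src=$1 root=$2
    stamp=$(date +%Y-%m-%d_%H%M%S)
    if [ -d "$root/latest" ]; then
        # Unchanged files become hard links into the previous snapshot.
        rsync -a --link-dest="$root/latest" -- "${src%/}/" "$root/$stamp/"
    else
        # First run: nothing to link against, so do a full copy.
        rsync -a -- "${src%/}/" "$root/$stamp/"
    fi || return
    # Repoint "latest" at the snapshot that was just made.
    ln -sfn "$stamp" "$root/latest"
}
```

Each snapshot looks like a full copy, but only new or changed files take up additional disk space.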