WordPress-Server-Ubuntu-24.04
6 Automatic Backups to S3 Bucket

Automated Backups to Amazon S3

In this chapter, we'll set up and automate backups for your WordPress site hosted at example.com. Regular backups are essential to ensure you can recover from issues like user error, corruption, or security breaches.

Two types of backups are recommended:

  1. Full System Backups: These are typically provided by your VPS provider (e.g., DigitalOcean, AWS, Linode) and restore the entire server. You'll rarely need one except to recover from a catastrophic failure.
  2. Site-Specific Backups: These backups save the database and files for individual sites. This is crucial for WordPress sites, as restoring a single site is often all that's needed.

For WordPress backups, capture everything: the database, the uploads directory, core files, plugins, and themes.

Create a Bash Script to Backup WordPress Files and Database

A weekly backup is sufficient for sites that aren't frequently updated. For e-commerce sites, consider running backups more often (e.g., every few hours) to limit potential data loss.

Step 1: Create a Backup Directory

First, create a directory to store your backups:

cd /var/www/example.com
mkdir backups

This creates a backups directory in your site's root folder, one level above the public directory where the site files are stored.
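
For reference, here is the resulting layout, assuming the setup used throughout this chapter where the WordPress files live in a public subdirectory:

/var/www/example.com/
├── backups/   <- backup archives are written here
└── public/    <- WordPress files (wp-config.php, wp-content, ...)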

Step 2: Write the Backup Script

Next, create a bash script to back up both the database and the site's files. Create a file called backup.sh:

nano backup.sh

Now, paste the following code into the backup.sh script and save it:

#!/bin/bash
 
# Get the current date and time to append to the backup files
NOW=$(date +%Y%m%d%H%M%S)
 
# Set filenames for database and file backups using the current timestamp
SQL_BACKUP=${NOW}_database.sql
FILES_BACKUP=${NOW}_files.tar.gz
 
# Extract database credentials from the wp-config.php file
DB_NAME=$(sed -n "s/define( *'DB_NAME', *'\([^']*\)'.*/\1/p" wp-config.php)
DB_USER=$(sed -n "s/define( *'DB_USER', *'\([^']*\)'.*/\1/p" wp-config.php)
DB_PASSWORD=$(sed -n "s/define( *'DB_PASSWORD', *'\([^']*\)'.*/\1/p" wp-config.php)
DB_HOST=$(sed -n "s/define( *'DB_HOST', *'\([^']*\)'.*/\1/p" wp-config.php)
 
 
# Backup the database using mysqldump (credentials are quoted in case they
# contain special characters; stderr is kept out of the dump file so warnings
# can't corrupt it)
mysqldump --add-drop-table -u"$DB_USER" -p"$DB_PASSWORD" -h"$DB_HOST" "$DB_NAME" > ../backups/$SQL_BACKUP
 
# Compress the database backup file
gzip ../backups/$SQL_BACKUP
 
# Archive and compress the public directory (site files)
tar -zcf ../backups/$FILES_BACKUP .
 
# Remove the backup files taken on this day one month ago
# (a rolling one-month window when the script runs daily)
rm -f ../backups/$(date +%Y%m%d --date='1 month ago')*.gz

Explanation of the Bash Script

Let's break down the regex and the sed command for each of these lines step by step:

General Structure of the Command

  • $(...): This is command substitution in bash. It executes the command inside and substitutes its output into the variable.
  • sed -n 's/.../.../p': The sed command is used to manipulate text. The -n option suppresses the automatic printing of input, and the s/.../.../ part is for search and replace. The p at the end prints the result if the substitution is successful.

Now let's look at the specific regex.

1. Extracting the Database Name (DB_NAME)

DB_NAME=$(sed -n "s/define( *'DB_NAME', *'\([^']*\)'.*/\1/p" wp-config.php)
  • define( *'DB_NAME', *':

    • define(: Matches the text define( literally.
    • *: The * applies to the space before it, so together they match zero or more spaces.
    • 'DB_NAME': Matches the string 'DB_NAME' literally.
    • ,: Matches the comma that separates the constant name from its value.
    • *: Matches zero or more spaces again (the same space-plus-* pattern).
    • ': Matches the opening single quote before the value of DB_NAME.
  • \([^']*\):

    • \( and \): Parentheses used for grouping. The contents inside the parentheses will be captured (and can be referenced later).
    • [^']*: Matches zero or more characters that are not a single quote ('). This captures the actual value of the database name.
  • '.*:

    • ': Matches the closing single quote after the value.
    • .*: Matches the rest of the line (zero or more characters), so the remainder of the define() statement is matched and discarded.
  • \1: Refers to the captured group (the database name).

  • p: Prints the substituted result (the database name) if a match is found.

2. Extracting the Database User (DB_USER)

DB_USER=$(sed -n "s/define( *'DB_USER', *'\([^']*\)'.*/\1/p" wp-config.php)

This is almost identical to the previous one, except it's looking for the DB_USER instead of DB_NAME:

  • 'DB_USER': Matches the string 'DB_USER'.
  • The rest follows the same pattern, extracting the value of the DB_USER constant.

3. Extracting the Database Password (DB_PASSWORD)

DB_PASSWORD=$(sed -n "s/define( *'DB_PASSWORD', *'\([^']*\)'.*/\1/p" wp-config.php)

This line follows the same pattern:

  • 'DB_PASSWORD': Matches the string 'DB_PASSWORD', and the regex captures the value assigned to this constant.

4. Extracting the Database Host (DB_HOST)

DB_HOST=$(sed -n "s/define( *'DB_HOST', *'\([^']*\)'.*/\1/p" wp-config.php)

Again, this line is similar to the previous ones:

  • 'DB_HOST': Matches the string 'DB_HOST', and the regex extracts the value of this constant.

Summary of Key Regex Elements

  • define( *'CONSTANT_NAME': Matches the definition of the constant CONSTANT_NAME, allowing any number of spaces between elements.
  • '\([^']*\)': Captures the value between single quotes for the constant.
  • \1: Refers to the captured value (the content inside the parentheses).
  • p: Prints the result of the substitution.

Each line extracts the value for DB_NAME, DB_USER, DB_PASSWORD, and DB_HOST from the wp-config.php file, using regular expressions to capture the information from the corresponding define() statements in the file.
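
To see the extraction in action, run the sed command directly against your wp-config.php. Assuming a typical line such as define( 'DB_NAME', 'wordpress_db' );, it prints only the captured value:

cd /var/www/example.com/public
sed -n "s/define( *'DB_NAME', *'\([^']*\)'.*/\1/p" wp-config.php
# Output: wordpress_db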

Step 3: Make the Script Executable

To ensure the script can be executed, run the following command:

chmod u+x backup.sh
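
Before scheduling the script, give it a test run. It uses paths relative to the public directory, so run it from there (the paths below assume the layout used in this chapter):

cd /var/www/example.com/public
../backup.sh

# Both archives should now exist
ls -lh ../backups/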

Step 4: Schedule the Backup with Cron

Now that the script is ready, you can schedule it to run automatically. To do this, open your crontab file:

crontab -e

Add the following line to schedule the script to run every Sunday at 5:00 AM:

0 5 * * 0 cd /var/www/example.com/public/; /var/www/example.com/backup.sh >/dev/null 2>&1

This cron job changes to the public directory and then runs the backup.sh script.

To run the backup daily at 5:00 AM, you can update the cron schedule:

0 5 * * * cd /var/www/example.com/public/; /var/www/example.com/backup.sh >/dev/null 2>&1
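
For reference, the five schedule fields are minute, hour, day of month, month, and day of week:

# ┌───────── minute (0)
# │ ┌─────── hour (5)
# │ │ ┌───── day of month (* = any)
# │ │ │ ┌─── month (* = any)
# │ │ │ │ ┌─ day of week (0 = Sunday, * = any)
# 0 5 * * *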

Step 5: Upload Backups to Amazon S3

Backups stored on the same server as your site can be lost if the server fails. To mitigate this, we’ll upload the backups to an Amazon S3 bucket.

1. Create an S3 Bucket

Log into AWS and create a new S3 bucket.

2. Set Up an IAM User with S3 Access

Follow these steps to create an IAM user with access to S3:

  1. Go to IAM and create a new user.
  2. Assign the AmazonS3FullAccess policy to the user (a tighter alternative is sketched after this list).
  3. Generate an access key for the user.
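
AmazonS3FullAccess is the quickest option, but it grants access to every bucket in the account. If you prefer least privilege, a minimal custom policy like the following sketch is enough for the uploads in this chapter (the bucket name is a placeholder; note that verifying uploads with aws s3 ls additionally requires s3:ListBucket):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::your-s3-bucket-name/*"
        }
    ]
}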

3. Install AWS CLI

The following are the key installation steps for the AWS CLI on Ubuntu, along with some troubleshooting tips:

Quick Installation Steps

  1. Download the AWS CLI Installer

    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
  2. Unzip the Installer

    unzip awscliv2.zip
  3. Install the AWS CLI

    sudo ./aws/install

Verifying Installation

  1. Check AWS CLI Version

    aws --version

Optional: Verifying Integrity of the Download

  1. Install GnuPG

    sudo apt-get install gnupg
  2. Import AWS CLI Public Key

    Create a file aws-public-key.asc containing the AWS CLI team's PGP public key, which is published in the official AWS CLI user guide:

    -----BEGIN PGP PUBLIC KEY BLOCK-----
    [AWS CLI public key from the official documentation]
    -----END PGP PUBLIC KEY BLOCK-----

    Import it:

    gpg --import aws-public-key.asc
  3. Download and Verify Signature

    curl -o awscliv2.sig https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip.sig
    gpg --verify awscliv2.sig awscliv2.zip

Setting Up AWS CLI Autocomplete on Ubuntu

1. Install Required Packages

  1. Install Bash Completion

    sudo apt-get install bash-completion

2. Enable AWS CLI Autocomplete

  1. Add Autocomplete Script to .bashrc

    echo 'complete -C /usr/local/aws-cli/v2/current/bin/aws_completer aws' >> ~/.bashrc

    (Adjust the path if your AWS CLI is installed elsewhere.)

  2. Reload Bash Configuration

    source ~/.bashrc

3. Verify Autocomplete

  • Type aws <TAB> in your terminal to check if autocomplete is working.

This setup allows you to use autocomplete for AWS CLI commands, making command entry more efficient.
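
If completion doesn't kick in, check whether the completer is registered in your current shell:

complete -p aws
# Expected output (the path may differ on your system):
# complete -C /usr/local/aws-cli/v2/current/bin/aws_completer aws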

Troubleshooting

  • Ensure the Script Exists: Check if the install script is present in the extracted aws directory:

    ls -l ./aws/install
  • Check Permissions: Make sure the install script has execute permissions:

    chmod +x ./aws/install
  • Verify AWS CLI Path: If the aws command is not found, confirm where it was installed:

    which aws
    ls -l /usr/local/bin/aws

4. Configure AWS CLI

Configure AWS CLI with your access keys:

aws configure

You’ll be prompted to enter your AWS Access Key ID, AWS Secret Access Key, default region, and default output format.
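
The prompts look like this (the values are placeholders; use the access key generated for your IAM user and the region where you created the bucket):

AWS Access Key ID [None]: AKIAXXXXXXXXXXXXXXXX
AWS Secret Access Key [None]: ****************************************
Default region name [None]: us-east-1
Default output format [None]: json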

5. Modify the Backup Script to Upload Files to S3

Update the backup script to upload the backup files to your S3 bucket:

#!/bin/bash
 
# Get the current date and time
NOW=$(date +%Y%m%d%H%M%S)
 
# Define filenames for backups
SQL_BACKUP=${NOW}_database.sql
FILES_BACKUP=${NOW}_files.tar.gz
 
# Extract database credentials from wp-config.php
DB_NAME=$(sed -n "s/define( *'DB_NAME', *'\([^']*\)'.*/\1/p" wp-config.php)
DB_USER=$(sed -n "s/define( *'DB_USER', *'\([^']*\)'.*/\1/p" wp-config.php)
DB_PASSWORD=$(sed -n "s/define( *'DB_PASSWORD', *'\([^']*\)'.*/\1/p" wp-config.php)
DB_HOST=$(sed -n "s/define( *'DB_HOST', *'\([^']*\)'.*/\1/p" wp-config.php)
 
# Backup the database (credentials quoted; stderr kept out of the dump file)
mysqldump --add-drop-table -u"$DB_USER" -p"$DB_PASSWORD" -h"$DB_HOST" "$DB_NAME" > ../backups/$SQL_BACKUP
 
# Compress the database backup
gzip ../backups/$SQL_BACKUP
 
# Backup the site files (public directory)
tar -zcf ../backups/$FILES_BACKUP .
 
# Remove the backups taken on this day one month ago
rm -f ../backups/$(date +%Y%m%d --date='1 month ago')*.gz
 
# Upload backups to Amazon S3
aws s3 cp ../backups/$SQL_BACKUP.gz s3://your-s3-bucket-name/ --quiet --storage-class STANDARD_IA
aws s3 cp ../backups/$FILES_BACKUP s3://your-s3-bucket-name/ --quiet --storage-class STANDARD_IA

Explanation of Changes:

  1. S3 Upload Commands:
    • aws s3 cp ../backups/$SQL_BACKUP.gz s3://your-s3-bucket-name/ --quiet --storage-class STANDARD_IA: This uploads the compressed database backup to your S3 bucket.
    • aws s3 cp ../backups/$FILES_BACKUP s3://your-s3-bucket-name/ --quiet --storage-class STANDARD_IA: This uploads the compressed file backup to your S3 bucket.
    • --storage-class STANDARD_IA: Tells S3 to use the Standard-Infrequent Access class, which is cheaper for backups that are accessed infrequently.
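
After the script's next run (or a manual test run), confirm the uploads arrived:

aws s3 ls s3://your-s3-bucket-name/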

Automate S3 Uploads

Update your cron job to pass the S3 bucket name as an argument. Note that the script above still has the bucket name hardcoded; the argument takes effect with the refactored script in the next section, which reads it as $1:

0 5 * * * cd /var/www/example.com/public/; /var/www/example.com/backup.sh your-s3-bucket-name

Final Refactored Script

Now that we have the basics of the backup script in place, let’s improve it by making it more generic for use across multiple sites.

Here are a couple of enhancements:

  1. The script should accept the S3 bucket name as an argument.
  2. Ensure that the backups folder exists before proceeding.

Here's the updated version of the backup script:

#!/bin/bash
 
# Get the bucket name from an argument passed to the script
BUCKET_NAME=${1-''}
 
# Ensure the backups folder exists
if [ ! -d ../backups/ ]; then
    echo "This script requires a 'backups' folder 1 level up from your site files folder."
    exit 1
fi
 
NOW=$(date +%Y%m%d%H%M%S)
SQL_BACKUP=${NOW}_database.sql
FILES_BACKUP=${NOW}_files.tar.gz
 
DB_NAME=$(sed -n "s/define( *'DB_NAME', *'\([^']*\)'.*/\1/p" wp-config.php)
DB_USER=$(sed -n "s/define( *'DB_USER', *'\([^']*\)'.*/\1/p" wp-config.php)
DB_PASSWORD=$(sed -n "s/define( *'DB_PASSWORD', *'\([^']*\)'.*/\1/p" wp-config.php)
DB_HOST=$(sed -n "s/define( *'DB_HOST', *'\([^']*\)'.*/\1/p" wp-config.php)
 
# Backup the database (credentials quoted; stderr kept out of the dump file)
mysqldump --add-drop-table -u"$DB_USER" -p"$DB_PASSWORD" -h"$DB_HOST" "$DB_NAME" > ../backups/$SQL_BACKUP
 
# Compress the database dump file
gzip ../backups/$SQL_BACKUP
 
# Backup the entire public directory
tar -zcf ../backups/$FILES_BACKUP .
 
# Remove the backups taken on this day one month ago
rm -f ../backups/$(date +%Y%m%d --date='1 month ago')*.gz
 
# Copy the files to S3 if a bucket name was provided
if [ -n "$BUCKET_NAME" ]; then
    aws s3 cp ../backups/$SQL_BACKUP.gz s3://$BUCKET_NAME/ --quiet --storage-class STANDARD_IA
    aws s3 cp ../backups/$FILES_BACKUP s3://$BUCKET_NAME/ --quiet --storage-class STANDARD_IA
fi

Next, you could move the backup.sh script to a more central location on your server, such as /usr/local/bin, or, as in this example, a custom scripts directory:

mkdir -p /home/user/scripts
mv /var/www/example.com/backup.sh /home/user/scripts/

To automate backups using cron, specify the script path and optionally include the bucket name for S3 backups:

  • To copy backups to S3:

    0 5 * * * cd /var/www/example.com/public/; /home/user/scripts/backup.sh backups-example
  • Without copying to S3:

    0 5 * * * cd /var/www/example.com/public/; /home/user/scripts/backup.sh

Amazon S3 Storage Classes

In the above S3 commands, the --storage-class STANDARD_IA option is used, which is suitable for data that is infrequently accessed but still needs to be quickly available. It's cheaper than the STANDARD class.

Consider other storage classes based on how long you'll keep your backups:

  • Glacier Instant Retrieval (GLACIER_IR): For backups stored for 90+ days.
  • Glacier Deep Archive (DEEP_ARCHIVE): For backups stored for 180+ days, but note that retrieval could take up to 12 hours.

You can adjust the --storage-class option to change the storage class accordingly.
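
For example, to send the file archive straight to Glacier Instant Retrieval, swap the flag value in the script's upload line:

aws s3 cp ../backups/$FILES_BACKUP s3://$BUCKET_NAME/ --quiet --storage-class GLACIER_IR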

Configuring Amazon S3 Lifecycle Rules

You can set up S3 Lifecycle rules to automatically move backup files to a cheaper storage class or delete them after a specified period. For example, you could move backups to Glacier after 30 days and delete them after one year.

Steps:

  1. Open your S3 bucket in the AWS Management Console.
  2. Go to the Management tab and click Create lifecycle rule.
  3. Name your rule, and apply it to all objects in the bucket.
  4. Under Lifecycle rule actions, select Move current versions of objects between storage classes and Expire current versions of objects.
  5. Configure the transitions and expiration dates, such as:
    • Transition to Glacier Instant Retrieval after 30 days.
    • Expire (delete) objects after 365 days.

Finally, click Save to apply your lifecycle rules.
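
If you prefer the CLI, the same rule can be applied with the s3api commands. This is a sketch assuming the example bucket name used in this chapter; save the rule as lifecycle.json:

{
    "Rules": [
        {
            "ID": "backup-rotation",
            "Status": "Enabled",
            "Filter": { "Prefix": "" },
            "Transitions": [
                { "Days": 30, "StorageClass": "GLACIER_IR" }
            ],
            "Expiration": { "Days": 365 }
        }
    ]
}

Then apply it:

aws s3api put-bucket-lifecycle-configuration --bucket your-s3-bucket-name --lifecycle-configuration file://lifecycle.json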

By implementing these changes, you can easily back up your WordPress site to S3 and manage storage costs with lifecycle rules. To keep your server's disk space free, you can also offload media files directly to S3 using plugins like WP Offload Media.

In the next chapter, we'll focus on improving your server's security with Nginx configuration tweaks.