Automated Backups with CloudNativePG for PostgreSQL on Kubernetes


CloudNativePG makes it rather easy to set up and operate a highly available PostgreSQL cluster on Kubernetes. See this post for more details.

Now, while setting up a cluster is a good start, operating PostgreSQL obviously requires one additional component: backups.

The TL;DR can be found in the Summary at the end of this post.

There are a ton of backup solutions for PostgreSQL: pg_dump, pgBackRest, WAL-G, and pgBarman, to just mention four of the more popular options.

Now, if we look at how we deployed our cluster using CloudNativePG, it's clear that we could use any of these methods. Nevertheless, the CloudNativePG operator comes with first-class pgBarman support, which is no surprise, as both pgBarman and CloudNativePG come from the awesome people at EnterpriseDB.

pgBarman - Overview

pgBarman, or Barman for short, was developed to implement backup and disaster recovery for PostgreSQL databases. One of its main design goals is to maximize business continuity in case of a disaster.

Some of its more exciting features are:

  • Point in time recovery
  • Remote backup
  • WAL archiving and streaming
  • Synchronous WAL streaming (meaning zero data loss in case of failure)
  • Incremental and parallel backups
  • Backup catalog

This guide will show us some of these features in action.

Prerequisites

To follow this guide, you need:

  1. A running CloudNativePG cluster
  2. An Azure Blob Storage Container

Setting up Barman backups with CloudNativePG

First, we will add Barman to our cluster object and configure it to store its backups in an Azure Blob Storage container.

Setting up Barman with CloudNativePG is as simple as adding an additional backup configuration block to the PostgreSQL cluster object and creating a k8s secret for accessing the backup backend.

CloudNativePG currently supports Azure Blob Storage, Google Cloud Storage, and AWS S3 as backup backends. In this guide we'll use Azure Blob Storage, but the steps are very similar for the other backends; see the CloudNativePG documentation on object store backups for more details.
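For comparison, if you were backing up to AWS S3 instead, essentially only the credentials block would change. A minimal sketch (the bucket path and secret key names are placeholder assumptions, not values from this guide):

    backup:
      barmanObjectStore:
        destinationPath: s3://my-backup-bucket/postgres-backups # hypothetical bucket
        s3Credentials:
          accessKeyId:
            name: backup-creds
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: backup-creds
            key: ACCESS_SECRET_KEY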

First, let's create our backup backend secret - the secret used to access the cloud backend (Azure Blob Storage in our case). Please replace <base64-encoded-connection-string> with the base64-encoded connection string of your Azure Blob Storage account (or let kubectl handle the encoding, as sketched after the steps below). You can find the connection string in the Azure Portal on your storage account page under Access keys -> Connection string.

Azure Blob Storage Access Keys
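Alternatively, the Azure CLI can fetch the connection string as well. A sketch, assuming you are logged in with az and substitute the placeholder names:

    # Fetch the connection string for your storage account
    # (<account-name> and <resource-group> are placeholders)
    az storage account show-connection-string \
      --name <account-name> \
      --resource-group <resource-group> \
      --query connectionString -o tsv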

  1. Create the following secret and apply it to your cluster.

    apiVersion: v1
    kind: Secret
    metadata:
      name: backup-creds
      namespace: postgres-namespace
    data:
      AZURE_CONNECTION_STRING: <base64-encoded-connection-string>

  2. Next, update the Cluster k8s object. If we look at the Summary of how we set up our cluster, there we had our cluster manifest. To now add backups to the cluster, simply add the backup block shown below to the manifest (everything from backup: down to retentionPolicy:):

    superuserSecret:
      name: example-superuser
    backup:
      barmanObjectStore:
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING
        wal:
          compression: gzip
          maxParallel: 8
          encryption: AES256
        data:
          compression: gzip
          encryption: AES256
          immediateCheckpoint: false
          jobs: 2
      retentionPolicy: "30d"
    storage:
      pvcTemplate:

    Afterward, apply the manifest to your cluster.
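If you would rather not base64-encode the connection string by hand, kubectl can create the secret and handle the encoding for you. A sketch, assuming your kubeconfig points at the right cluster and that cluster.yaml is what you named your manifest:

    # Create the secret without manual base64 encoding
    kubectl create secret generic backup-creds \
      --namespace postgres-namespace \
      --from-literal=AZURE_CONNECTION_STRING='<your-connection-string>'

    # Apply the updated cluster manifest (the file name is an assumption)
    kubectl apply -f cluster.yaml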

The settings explained

  • destinationPath: Destination path of the Microsoft Azure Blob Storage container. Format: <http|https>://<account-name>.<service-name>.core.windows.net/<container>/<blob>. Note that <container> refers to the name of your Blob Storage container and <blob> to the name of the blob prefix inside the container. The blob will be created automatically, with the name you set here.
  • connectionString: Reference to the secret which stores the connection string
  • wal: Define the WAL archiving/recovery behavior:
    • maxParallel: Number of WAL files to be archived or restored in parallel.
    • compression: Whether to compress the WAL files. Options are: gzip, bzip2, snappy. Off by default.
    • encryption: Whether to encrypt the WAL files. Options are: AES256 or aws:kms. Leave empty to use the backup backend's storage policy.
  • data: Defines the data backup behavior:
    • immediateCheckpoint: If set to true, an immediate checkpoint will be used, meaning PostgreSQL will complete the checkpoint as soon as possible.
    • compression: Whether to compress the backups. Options are: gzip, bzip2, snappy. Off by default.
    • encryption: Whether to encrypt the data files. Options are: AES256 or aws:kms. Leave empty to use the backup backend's storage policy.
    • jobs: The number of parallel jobs to be used to upload the backup.
  • retentionPolicy: Defines when old backups should be deleted.

NOTE: That's actually all we need. Apply the cluster manifest and your CloudNativePG-operated PostgreSQL cluster is ready to take its first backup. WAL archiving, by the way, starts directly after these changes are applied.
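To verify that WAL archiving is actually running, inspect the cluster status. A sketch; the cnpg kubectl plugin is optional, and the ContinuousArchiving condition name reflects current CloudNativePG releases:

    # With the cnpg kubectl plugin installed
    kubectl cnpg status example-cluster -n postgres-namespace

    # Or with plain kubectl, check the archiving condition directly
    kubectl get cluster example-cluster -n postgres-namespace \
      -o jsonpath='{.status.conditions[?(@.type=="ContinuousArchiving")].status}'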

Executing backups

There are two ways to execute a backup.

  1. On-demand and
  2. Scheduled

While on-demand backups are fine if you need a backup immediately, e.g. before attempting complex maintenance, scheduled backups are what you use in everyday operation.

On-demand backups

For on-demand backups simply create a Backup object by creating the following yaml manifest and applying it:

apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: general-purpose-backup
  namespace: postgres-namespace
spec:
  cluster:
    name: example-cluster

Directly after applying the manifest, the operator will attempt to initiate the backup. You can check progress (and potential errors) by running kubectl describe backups -n postgres-namespace general-purpose-backup.
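If you are scripting backups, you can also block until the backup finishes instead of polling. A sketch, assuming kubectl 1.23 or newer (for jsonpath waits) and that your CloudNativePG version reports the phase as completed:

    # Wait up to 10 minutes for the backup to reach phase "completed"
    kubectl wait backup/general-purpose-backup -n postgres-namespace \
      --for=jsonpath='{.status.phase}'=completed --timeout=10m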

Scheduled backups

Scheduled backups are no more complex to set up. Create a yaml manifest and apply it:

apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: general-purpose-scheduled-backup
  namespace: postgres-namespace
spec:
  # Note that this cron dialect has 6 fields - an additional one for seconds
  schedule: "1 0 0 * * 0"
  # Set this to true if you want to suspend the backup for now
  suspend: false
  # Determines whether the first backup should be taken immediately
  immediate: true
  # Indicates which ownerReference should be put inside the created backup resources:
  # - none: no owner reference for created backup objects (same behavior as before the field was introduced)
  # - self: sets the ScheduledBackup object as owner of the backup
  # - cluster: sets the cluster as owner of the backup
  backupOwnerReference: self
  cluster:
    name: example-cluster

After applying the manifest, the backup will be scheduled as defined. If immediate is set to true, the backup will execute immediately.

Check the backup state by inspecting the Backup objects the schedule creates, as shown below.
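A ScheduledBackup does not run backups itself; it creates regular Backup objects, whose names are derived from the ScheduledBackup name plus a timestamp suffix. A sketch for finding and inspecting them (the name below is hypothetical, and the suffix format may vary between CloudNativePG versions):

    # List all Backup objects, including those created by the schedule
    kubectl get backups -n postgres-namespace

    # Inspect one of them (hypothetical generated name)
    kubectl describe backup general-purpose-scheduled-backup-20240107000001 -n postgres-namespace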

A note on how often to schedule backups

How often to schedule backups is determined mainly by how fast you need to recover after a disaster. Between two backups, Barman needs to replay from the WAL archive, which takes longer than restoring from a base backup. That being said, backup intervals shorter than once per week are rarely needed and simply lead to unnecessary load and costs.
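For reference, here is how the 6-field cron schedule used above decodes, with a couple of alternative intervals sketched as comments:

    # "1 0 0 * * 0" = seconds minutes hours day-of-month month day-of-week,
    # i.e. every Sunday at 00:00:01
    schedule: "1 0 0 * * 0"
    # schedule: "0 0 3 * * *"   # daily at 03:00:00 - usually more than needed
    # schedule: "0 0 0 1 * *"   # monthly, on the 1st at midnight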

Recovery

While taking backups is nice - we need to be able to recover a cluster from backup in case the unthinkable happens.

To recover a cluster from a backup, we bootstrap a new cluster by referencing the backup data. This means we can't recover a backup into an existing cluster.

Recover from an existing Backup

If there is a Backup object inside the same namespace that you want your cluster to recover into, simply add the following snippet to the spec section of your cluster yaml manifest:

  bootstrap:
    recovery:
      backup:
        name: general-purpose-scheduled-backup

Applying the manifest will create a new cluster and recover it from the data referenced by the backup.
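For context, here is what a complete minimal recovery cluster manifest could look like. This is a sketch rather than a drop-in file: the cluster name, instance count, and storage size are assumptions you should adjust to your environment:

    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: example-cluster-restored # hypothetical name
      namespace: postgres-namespace
    spec:
      instances: 3   # assumption - match your original topology
      superuserSecret:
        name: example-superuser
      bootstrap:
        recovery:
          backup:
            name: general-purpose-scheduled-backup
      storage:
        size: 10Gi   # assumption - size to your data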

Recover from backup object storage

If there is NO Backup object inside the same namespace that you want your cluster to recover into, add the following externalClusters configuration to the spec section of your cluster manifest. Replace <your-previous-cluster-name> with the name your cluster previously had. If you do not recall the name, you can browse the Azure Blob Storage path (which is https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose in our example) in the Azure Portal. The first subdirectory found in this folder is the name of the server.

  bootstrap:
    recovery:
      source: clusterBackup

  externalClusters:
    - name: clusterBackup
      barmanObjectStore:
        serverName: "<your-previous-cluster-name>"
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING
        wal:
          maxParallel: 8
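If you prefer the CLI over the portal, you can also list the blob prefixes to find the old server name. A sketch, assuming the Azure CLI and the raw (not base64-encoded) connection string in an environment variable:

    # List entries under the "generalpurpose" blob prefix;
    # the server name is the second path segment
    az storage blob list \
      --container-name postgres-backups \
      --prefix generalpurpose/ \
      --connection-string "$AZURE_CONNECTION_STRING" \
      --query "[].name" -o tsv | cut -d/ -f2 | sort -u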

Point in time recovery

Point-in-time recovery is the process of not replaying all the WALs up to the latest one, but only up to a certain point in time. This comes in handy if you messed up your database and want to restore the database state of, e.g., yesterday.

Compare with the chapter Setting up Barman backups with CloudNativePG, as the backup configuration section needs to match the externalClusters configuration during recovery.

While this process is rather complex in the background, CloudNativePG as well as Barman help us tremendously. We again simply need to define how we want to bootstrap a new cluster and, as with normal recovery, can choose to recover either from a Backup object or from a backup object store. As the process is similar to the chapters above, we are only going to demonstrate point-in-time recovery from the object store.

Add the following snippet to the spec section of your cluster yaml manifest:

  bootstrap:
    recovery:
      source: clusterBackup
      recoveryTarget:
        targetTime: "2020-11-26 15:22:00.00000+00"

  externalClusters:
    - name: clusterBackup
      barmanObjectStore:
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING
        wal:
          maxParallel: 8

As you can see, it is essentially the same configuration as for normal backup recovery, just with the additional targetTime setting.

Important to note for recovery

  1. You need to bootstrap a fresh cluster
  2. Use a different blob configuration for your recovery source and for the backups of the new cluster. E.g., if the old cluster you want to recover from used a <blob> name of postgres-backup, use a different blob name in the backup section of your new cluster (see the sketch after this list). You can reuse the same container, though; just use a different blob name.
  3. The operator does NOT attempt to back up (and recover) the underlying secrets. Make sure to back them up with your regular k8s backups.
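Here is a sketch of point 2: the new cluster reads from the old blob path for recovery while writing its own backups to a new one (the blob name generalpurpose-restored is made up for illustration):

    backup:
      barmanObjectStore:
        # New blob name for this cluster's own backups
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose-restored
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING

    externalClusters:
      - name: clusterBackup
        barmanObjectStore:
          # Old blob name: the read-only recovery source
          destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose
          azureCredentials:
            connectionString:
              name: backup-creds
              key: AZURE_CONNECTION_STRING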

Summary

Adding backups to a CloudNativePG operated, highly available PostgreSQL cluster is rather easy.

  1. Add the following sections to your Cluster yaml manifest and apply

    superuserSecret:
      name: example-superuser
    backup:
      barmanObjectStore:
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING
        wal:
          compression: gzip
          maxParallel: 8
          encryption: AES256
        data:
          compression: gzip
          encryption: AES256
          immediateCheckpoint: false
          jobs: 2
      retentionPolicy: "30d"
    storage:
      pvcTemplate:
  2. Add a scheduled backup by creating a yaml manifest and applying it:

    apiVersion: postgresql.cnpg.io/v1
    kind: ScheduledBackup
    metadata:
      name: general-purpose-scheduled-backup
      namespace: postgres-namespace
    spec:
      # Note that this cron dialect has 6 fields - an additional one for seconds
      schedule: "1 0 0 * * 0"
      # Set this to true if you want to suspend the backup for now
      suspend: false
      # Determines whether the first backup should be taken immediately
      immediate: true
      # Indicates which ownerReference should be put inside the created backup resources:
      # - none: no owner reference for created backup objects (same behavior as before the field was introduced)
      # - self: sets the ScheduledBackup object as owner of the backup
      # - cluster: sets the cluster as owner of the backup
      backupOwnerReference: self
      cluster:
        name: example-cluster

  3. (Optional) Run an on-demand backup by creating a yaml manifest and applying it:

    apiVersion: postgresql.cnpg.io/v1
    kind: Backup
    metadata:
      name: general-purpose-backup
      namespace: postgres-namespace
    spec:
      cluster:
        name: example-cluster

To check the status of your backups, simply run: kubectl describe backups -n postgres-namespace <name-of-your-backup>.
