Automated Backups with CloudNativePG for PostgreSQL on Kubernetes


CloudNativePG makes it rather easy to set up and operate a highly available PostgreSQL cluster on Kubernetes. See this post for more details.

Now, while setting up a cluster is a good start, operating PostgreSQL obviously requires one additional component: backups.

The TL;DR can be found in the Summary at the end of this post.

There are a ton of backup solutions for PostgreSQL: pg_dump, pgBackRest, WAL-G, and pgBarman, to just mention four of the more popular options.

Now, if we look at how we deployed our cluster using CloudNativePG, it's clear that we could use any of these methods. Nevertheless, the CloudNativePG operator comes with first-class pgBarman support, which is no surprise, as both pgBarman and CloudNativePG come from the awesome people at EnterpriseDB.

pgBarman - Overview

pgBarman, or Barman for short, was developed to implement backup and disaster recovery for PostgreSQL databases. One of its main design goals is to maximize business continuity in case of a disaster.

Some of its more exciting features are:

  • Point in time recovery
  • Remote backup
  • WAL archiving and streaming
  • Synchronous WAL streaming (meaning zero data loss in case of failure)
  • Incremental and parallel backups
  • Backup catalog

This guide will show us some of these features in action.

Prerequisites

To follow this guide, you need:

  1. A running CloudNativePG cluster
  2. An Azure Blob Storage Container

Setting up Barman backups with CloudNativePG

First, we will add Barman to our cluster object and configure it to store its backups in an Azure Blob Storage container.

Setting up Barman with CloudNativePG is as simple as adding an additional backup configuration block to the PostgreSQL cluster object and creating a k8s secret for accessing the backup backend.

CloudNativePG currently supports Azure Blob Storage, Google Cloud Storage, and AWS S3 as backup backends. In this guide we'll use Azure Blob Storage, but the steps are very similar for the other backends; see the CloudNativePG documentation on object store backups for more details.
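For comparison, if you were backing up to AWS S3 instead, essentially only the credentials block would change. A minimal sketch (the bucket path and secret key names are placeholder assumptions, not values from this guide):

    backup:
      barmanObjectStore:
        destinationPath: s3://my-backup-bucket/postgres-backups # hypothetical bucket
        s3Credentials:
          accessKeyId:
            name: backup-creds
            key: ACCESS_KEY_ID
          secretAccessKey:
            name: backup-creds
            key: ACCESS_SECRET_KEY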

First, let's create our backup backend secret - the secret used to access the cloud backend (Azure Blob Storage in our case). Please replace <base64-encoded-connection-string> with the base64-encoded connection string of your Azure Blob Storage account (or let kubectl handle the encoding, as sketched after the steps below). You can find the connection string in the Azure Portal on your storage account page under Access keys -> Connection string.

Azure Blob Storage Access Keys
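Alternatively, the Azure CLI can fetch the connection string as well. A sketch, assuming you are logged in with az and substitute the placeholder names:

    # Fetch the connection string for your storage account
    # (<account-name> and <resource-group> are placeholders)
    az storage account show-connection-string \
      --name <account-name> \
      --resource-group <resource-group> \
      --query connectionString -o tsv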

  1. Create the following secret and apply it to your cluster.

    apiVersion: v1
    kind: Secret
    metadata:
      name: backup-creds
      namespace: postgres-namespace
    data:
      AZURE_CONNECTION_STRING: <base64-encoded-connection-string>

  2. Next, update the Cluster k8s object. If we look at the Summary of how we set up our cluster, there we had our cluster manifest. To now add backups to the cluster, simply add the backup block shown below to the manifest (everything from backup: down to retentionPolicy:):

    superuserSecret:
      name: example-superuser
    backup:
      barmanObjectStore:
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING
        wal:
          compression: gzip
          maxParallel: 8
          encryption: AES256
        data:
          compression: gzip
          encryption: AES256
          immediateCheckpoint: false
          jobs: 2
      retentionPolicy: "30d"
    storage:
      pvcTemplate:

    Afterward, apply the manifest to your cluster.
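If you would rather not base64-encode the connection string by hand, kubectl can create the secret and handle the encoding for you. A sketch, assuming your kubeconfig points at the right cluster and that cluster.yaml is what you named your manifest:

    # Create the secret without manual base64 encoding
    kubectl create secret generic backup-creds \
      --namespace postgres-namespace \
      --from-literal=AZURE_CONNECTION_STRING='<your-connection-string>'

    # Apply the updated cluster manifest (the file name is an assumption)
    kubectl apply -f cluster.yaml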

The settings explained

  • destinationPath: Destination path of the Microsoft Azure Blob Storage container. Format: <http|https>://<account-name>.<service-name>.core.windows.net/<container>/<blob>. Note that <container> refers to the name of your Blob Storage container and <blob> to the name of the blob prefix inside the container. The blob will be created automatically, with the name you set here.
  • connectionString: Reference to the secret which stores the connection string
  • wal: Define the WAL archiving/recovery behavior:
    • maxParallel: Number of WAL files to be archived or restored in parallel.
    • compression: Whether to compress the WAL files. Options are: gzip, bzip2, snappy. Off by default.
    • encryption: Whether to encrypt the WAL files. Options are: AES256 or aws:kms. Leave empty to use the backup backend's storage policy.
  • data: Defines the data backup behavior:
    • immediateCheckpoint: If set to true, an immediate checkpoint will be used, meaning PostgreSQL will complete the checkpoint as soon as possible.
    • compression: Whether to compress the backups. Options are: gzip, bzip2, snappy. Off by default.
    • encryption: Whether to encrypt the data files. Options are: AES256 or aws:kms. Leave empty to use the backup backend's storage policy.
    • jobs: The number of parallel jobs to be used to upload the backup.
  • retentionPolicy: Defines when old backups should be deleted.

NOTE: That's actually all we need. Apply the cluster manifest and your CloudNativePG-operated PostgreSQL cluster is ready to take its first backup. WAL archiving, by the way, starts directly after these changes are applied.
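To verify that WAL archiving is actually running, inspect the cluster status. A sketch; the cnpg kubectl plugin is optional, and the ContinuousArchiving condition name reflects current CloudNativePG releases:

    # With the cnpg kubectl plugin installed
    kubectl cnpg status example-cluster -n postgres-namespace

    # Or with plain kubectl, check the archiving condition directly
    kubectl get cluster example-cluster -n postgres-namespace \
      -o jsonpath='{.status.conditions[?(@.type=="ContinuousArchiving")].status}'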

Executing backups

There are two ways to execute a backup.

  1. On-demand and
  2. Scheduled

While on-demand backups are fine if you need a backup immediately, e.g. before attempting complex maintenance, scheduled backups are what you use in everyday operation.

On-demand backups

For on-demand backups simply create a Backup object by creating the following yaml manifest and applying it:

apiVersion: postgresql.cnpg.io/v1
kind: Backup
metadata:
  name: general-purpose-backup
  namespace: postgres-namespace
spec:
  cluster:
    name: example-cluster

Directly after applying the manifest, the operator will attempt to initiate the backup. You can check progress (and potential errors) by running kubectl describe backups -n postgres-namespace general-purpose-backup.
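If you are scripting backups, you can also block until the backup finishes instead of polling. A sketch, assuming kubectl 1.23 or newer (for jsonpath waits) and that your CloudNativePG version reports the phase as completed:

    # Wait up to 10 minutes for the backup to reach phase "completed"
    kubectl wait backup/general-purpose-backup -n postgres-namespace \
      --for=jsonpath='{.status.phase}'=completed --timeout=10m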

Scheduled backups

Scheduled backups are no more complex to set up. Create a yaml manifest and apply it:

apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: general-purpose-scheduled-backup
  namespace: postgres-namespace
spec:
  # Note that this cron dialect has 6 fields - an additional one for seconds
  schedule: "1 0 0 * * 0"
  # Set this to true if you want to suspend the backup for now
  suspend: false
  # Determines whether the first backup should be taken immediately
  immediate: true
  # Indicates which ownerReference should be put inside the created backup resources:
  # - none: no owner reference for created backup objects (same behavior as before the field was introduced)
  # - self: sets the ScheduledBackup object as owner of the backup
  # - cluster: sets the cluster as owner of the backup
  backupOwnerReference: self
  cluster:
    name: example-cluster

After applying the manifest, the backup will be scheduled as defined. If immediate is set to true, the backup will execute immediately.

Check the backup state by inspecting the Backup objects the schedule creates, as shown below.
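A ScheduledBackup does not run backups itself; it creates regular Backup objects, whose names are derived from the ScheduledBackup name plus a timestamp suffix. A sketch for finding and inspecting them (the name below is hypothetical, and the suffix format may vary between CloudNativePG versions):

    # List all Backup objects, including those created by the schedule
    kubectl get backups -n postgres-namespace

    # Inspect one of them (hypothetical generated name)
    kubectl describe backup general-purpose-scheduled-backup-20240107000001 -n postgres-namespace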

A note on how often to schedule backups

How often to schedule backups is determined mainly by how fast you need to recover after a disaster. Between two backups, Barman needs to replay from the WAL archive, which takes longer than restoring from a base backup. That being said, backup intervals shorter than once per week are rarely needed and simply lead to unnecessary load and costs.
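For reference, here is how the 6-field cron schedule used above decodes, with a couple of alternative intervals sketched as comments:

    # "1 0 0 * * 0" = seconds minutes hours day-of-month month day-of-week,
    # i.e. every Sunday at 00:00:01
    schedule: "1 0 0 * * 0"
    # schedule: "0 0 3 * * *"   # daily at 03:00:00 - usually more than needed
    # schedule: "0 0 0 1 * *"   # monthly, on the 1st at midnight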

Recovery

While taking backups is nice - we need to be able to recover a cluster from backup in case the unthinkable happens.

To recover a cluster from a backup, we bootstrap a new cluster by referencing the backup data. This means we can't recover a backup into an existing cluster.

Recover from an existing Backup

If there is a Backup object inside the same namespace that you want your cluster to recover into, simply add the following snippet to the spec section of your cluster yaml manifest:

  bootstrap:
    recovery:
      backup:
        name: general-purpose-scheduled-backup

Applying the manifest will create a new cluster and recover it from the data referenced by the backup.
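For context, here is what a complete minimal recovery cluster manifest could look like. This is a sketch rather than a drop-in file: the cluster name, instance count, and storage size are assumptions you should adjust to your environment:

    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: example-cluster-restored # hypothetical name
      namespace: postgres-namespace
    spec:
      instances: 3   # assumption - match your original topology
      superuserSecret:
        name: example-superuser
      bootstrap:
        recovery:
          backup:
            name: general-purpose-scheduled-backup
      storage:
        size: 10Gi   # assumption - size to your data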

Recover from backup object storage

If there is NO Backup object inside the same namespace that you want your cluster to recover into, add the following externalClusters configuration to the spec section of your cluster manifest. Replace <your-previous-cluster-name> with the name your cluster previously had. If you do not recall the name, you can browse the Azure Blob Storage path (which is https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose in our example) in the Azure Portal. The first subdirectory found in this folder is the name of the server.

  bootstrap:
    recovery:
      source: clusterBackup

  externalClusters:
    - name: clusterBackup
      barmanObjectStore:
        serverName: "<your-previous-cluster-name>"
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING
        wal:
          maxParallel: 8
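If you prefer the CLI over the portal, you can also list the blob prefixes to find the old server name. A sketch, assuming the Azure CLI and the raw (not base64-encoded) connection string in an environment variable:

    # List entries under the "generalpurpose" blob prefix;
    # the server name is the second path segment
    az storage blob list \
      --container-name postgres-backups \
      --prefix generalpurpose/ \
      --connection-string "$AZURE_CONNECTION_STRING" \
      --query "[].name" -o tsv | cut -d/ -f2 | sort -u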

Point in time recovery

Point-in-time recovery is the process of not replaying all the WALs up to the latest one, but only up to a certain point in time. This comes in handy if you messed up your database and want to restore the database state of, e.g., yesterday.

Compare with the chapter Setting up Barman backups with CloudNativePG, as the backup configuration section needs to match the externalClusters configuration during recovery.

While this process is rather complex in the background, CloudNativePG as well as Barman help us tremendously. We again simply need to define how we want to bootstrap a new cluster and, as with normal recovery, can choose to recover either from a Backup object or from a backup object store. As the process is similar to the chapters above, we are only going to demonstrate point-in-time recovery from the object store.

Add the following snippet to the spec section of your cluster yaml manifest:

  bootstrap:
    recovery:
      source: clusterBackup
      recoveryTarget:
        targetTime: "2020-11-26 15:22:00.00000+00"

  externalClusters:
    - name: clusterBackup
      barmanObjectStore:
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING
        wal:
          maxParallel: 8

As you can see, it is essentially the same configuration as for normal backup recovery, just with the additional targetTime setting.

Important to note for recovery

  1. You need to bootstrap a fresh cluster
  2. Use a different blob configuration for your recovery source and for the backups of the new cluster. E.g., if the old cluster you want to recover from used a <blob> name of postgres-backup, use a different blob name in the backup section of your new cluster (see the sketch after this list). You can reuse the same container, though; just use a different blob name.
  3. The operator does NOT attempt to back up (and recover) the underlying secrets. Make sure to back them up with your regular k8s backups.
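Here is a sketch of point 2: the new cluster reads from the old blob path for recovery while writing its own backups to a new one (the blob name generalpurpose-restored is made up for illustration):

    backup:
      barmanObjectStore:
        # New blob name for this cluster's own backups
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose-restored
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING

    externalClusters:
      - name: clusterBackup
        barmanObjectStore:
          # Old blob name: the read-only recovery source
          destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose
          azureCredentials:
            connectionString:
              name: backup-creds
              key: AZURE_CONNECTION_STRING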

Summary

Adding backups to a CloudNativePG operated, highly available PostgreSQL cluster is rather easy.

  1. Add the following sections to your Cluster yaml manifest and apply

    superuserSecret:
      name: example-superuser
    backup:
      barmanObjectStore:
        destinationPath: https://devopsandmorebackups.blob.core.windows.net/postgres-backups/generalpurpose # This is an Azure Blob Storage path
        azureCredentials:
          connectionString:
            name: backup-creds
            key: AZURE_CONNECTION_STRING
        wal:
          compression: gzip
          maxParallel: 8
          encryption: AES256
        data:
          compression: gzip
          encryption: AES256
          immediateCheckpoint: false
          jobs: 2
      retentionPolicy: "30d"
    storage:
      pvcTemplate:
  2. Add a scheduled backup by creating a yaml manifest and applying it:

    apiVersion: postgresql.cnpg.io/v1
    kind: ScheduledBackup
    metadata:
      name: general-purpose-scheduled-backup
      namespace: postgres-namespace
    spec:
      # Note that this cron dialect has 6 fields - an additional one for seconds
      schedule: "1 0 0 * * 0"
      # Set this to true if you want to suspend the backup for now
      suspend: false
      # Determines whether the first backup should be taken immediately
      immediate: true
      # Indicates which ownerReference should be put inside the created backup resources:
      # - none: no owner reference for created backup objects (same behavior as before the field was introduced)
      # - self: sets the ScheduledBackup object as owner of the backup
      # - cluster: sets the cluster as owner of the backup
      backupOwnerReference: self
      cluster:
        name: example-cluster

  3. (Optional) Run an on-demand backup by creating a yaml manifest and applying it:

    apiVersion: postgresql.cnpg.io/v1
    kind: Backup
    metadata:
      name: general-purpose-backup
      namespace: postgres-namespace
    spec:
      cluster:
        name: example-cluster

To check the status of your backups, simply run: kubectl describe backups -n postgres-namespace <name-of-your-backup>.
