Backup and Restore¶
Prerequisites¶
For Kind or Minikube¶
- Run the CSI driver deployment script before installing the documentdb-operator:

  ```bash
  ./operator/src/scripts/test-scripts/deploy-csi-driver.sh
  ```
- Validate storage and snapshot components:

  ```bash
  kubectl get storageclass
  kubectl get volumesnapshotclasses
  ```

  You should see something like:

  StorageClasses:

  ```
  NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
  csi-hostpath-sc      hostpath.csi.k8s.io     Delete          Immediate              true                   5d20h
  standard (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  5d20h
  ```

  VolumeSnapshotClasses:

  ```
  NAME                     DRIVER                DELETIONPOLICY   AGE
  csi-hostpath-snapclass   hostpath.csi.k8s.io   Delete           5d19h
  ```

  If `csi-hostpath-snapclass` isn't present, the deploy script didn't finish correctly. Re-run it.
- When creating a cluster, ensure you set the appropriate storage class:

  ```yaml
  apiVersion: documentdb.io/preview
  kind: DocumentDB
  metadata:
    name: my-cluster
    namespace: default
  spec:
    resource:
      storage:
        storageClass: csi-hostpath-sc # Specify your CSI storage class
    # ... other configuration
  ```
AKS¶
AKS already provides a CSI driver.
To allow the documentdb-operator to auto-create a default VolumeSnapshotClass, set spec.environment: aks in your DocumentDB spec:
```yaml
apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: my-cluster
  namespace: default
spec:
  environment: aks
  # ... other configuration
```
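After the operator reconciles the cluster, you can confirm that a default VolumeSnapshotClass backed by the Azure Disk CSI driver (`disk.csi.azure.com`) exists; the exact name of the operator-generated class may vary:

```bash
kubectl get volumesnapshotclasses
```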
Other Providers (EKS / GKE / Custom)¶
Support is emerging; you must manually ensure:
- A CSI driver that supports snapshots
- VolumeSnapshot CRDs installed
- A default VolumeSnapshotClass
Example manual snapshot class (adjust DRIVER accordingly):
```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: generic-snapclass
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: ebs.csi.aws.com # or pd.csi.storage.gke.io / other
deletionPolicy: Delete
```
Apply:
```bash
kubectl apply -f volumesnapshotclass.yaml
```
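To double-check that the class was registered and marked as the default, you can read the annotation back (using the `generic-snapclass` name from the example above):

```bash
kubectl get volumesnapshotclass generic-snapclass \
  -o jsonpath='{.metadata.annotations.snapshot\.storage\.kubernetes\.io/is-default-class}'
```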
On-Demand Backup¶
An on-demand backup creates a single backup of a DocumentDB cluster.
Creating an On-Demand Backup¶
Create a Backup resource:
```yaml
apiVersion: documentdb.io/preview
kind: Backup
metadata:
  name: my-backup
  namespace: default # Same namespace as the DocumentDB cluster
spec:
  cluster:
    name: my-documentdb-cluster # Must match the DocumentDB cluster name
  retentionDays: 30 # Optional: backup retention period in days
```
Apply the resource:
```bash
kubectl apply -f backup.yaml
```
Monitoring Backup Status¶
Check the backup status:

```bash
kubectl get backups -n default
```

View detailed backup information:

```bash
kubectl describe backup my-backup -n default
```
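To follow a backup interactively, you can watch the resource until its status settles; the exact status columns and phase names depend on the operator version, so treat the output as indicative:

```bash
kubectl get backup my-backup -n default -w
```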
Scheduled Backups¶
Scheduled backups automatically create backups at regular intervals using a cron schedule.
Creating a Scheduled Backup¶
Create a ScheduledBackup resource in a file named scheduledbackup.yaml:
```yaml
apiVersion: documentdb.io/preview
kind: ScheduledBackup
metadata:
  name: my-scheduled-backup
  namespace: default # Same namespace as the DocumentDB cluster
spec:
  cluster:
    name: my-documentdb-cluster # Must match the DocumentDB cluster name
  schedule: "0 2 * * *" # Cron expression: daily at 2:00 AM
  retentionDays: 30 # Optional: backup retention period in days
```
Apply the resource:
```bash
kubectl apply -f scheduledbackup.yaml
```
Cron Schedule Format¶
The schedule uses standard cron expression format. Common examples:
| Schedule | Meaning |
|---|---|
| `0 2 * * *` | Every day at 2:00 AM |
| `0 */6 * * *` | Every 6 hours |
| `0 0 * * 0` | Every Sunday at midnight |
| `*/15 * * * *` | Every 15 minutes |
| `0 2 1 * *` | First day of every month at 2:00 AM |
For more details, see cron expression format.
Monitoring Scheduled Backups¶
List all ScheduledBackups:

```bash
kubectl get scheduledbackups -n default
```

Check the generated backups:

```bash
kubectl get backups -n default
```
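If the schedule has been running for a while, sorting by creation time makes it easier to spot the most recent runs:

```bash
kubectl get backups -n default --sort-by=.metadata.creationTimestamp
```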
Important Notes¶
- If a backup is currently running, the next backup will be queued and start after the current one completes
- The operator automatically creates `Backup` resources according to the schedule
- Failed backups do not prevent subsequent backups from being scheduled
- ScheduledBackups are automatically garbage collected when the source cluster is deleted
- Deleting a ScheduledBackup does NOT delete its created Backup objects; they remain until expiration
Restore from Backup¶
You can restore a backup into a new DocumentDB cluster.
List Available Backups¶
First, identify the backup you want to restore:
```bash
kubectl get backups -n default
```
Create a New Cluster with Backup Recovery¶
Create a new DocumentDB resource with recovery configuration:
```yaml
apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: my-restored-cluster
  namespace: default
spec:
  bootstrap:
    recovery:
      backup:
        name: my-backup # Reference the backup to restore from
  # ... other configuration
```
Apply the resource:
```bash
kubectl apply -f restore.yaml
```
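Once applied, you can watch the restored cluster come up; the `documentdb` resource name below assumes that is how the CRD is registered, and plain `kubectl get pods` works regardless:

```bash
kubectl get documentdb my-restored-cluster -n default
kubectl get pods -n default -w
```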
Backup Retention Policy¶
Backups do not live forever: each one is assigned an expiration time, and once that time passes, the operator deletes the backup automatically.
Where the Retention Value Comes From (priority order)¶
1. `Backup.spec.retentionDays` (per-backup override)
2. `ScheduledBackup.spec.retentionDays` (copied into each created Backup)
3. `DocumentDB.spec.backup.retentionDays` (cluster default)
4. Default (if none set): 30 days
How it's calculated¶
- Success: retention starts at `status.stoppedAt`
- Failure: retention starts at `metadata.creationTimestamp`
- Expiration = start + `retentionDays` × 24h (worked example below)
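For instance, a backup that completes with `status.stoppedAt: 2025-06-01T02:10:00Z` and `retentionDays: 7` expires at `2025-06-08T02:10:00Z`, after which the operator deletes it.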
Examples¶
Per-backup override:
```yaml
apiVersion: documentdb.io/preview
kind: Backup
metadata:
  name: monthly-audit
spec:
  cluster:
    name: prod-cluster
  retentionDays: 90
```
Scheduled backups (14‑day retention):
```yaml
apiVersion: documentdb.io/preview
kind: ScheduledBackup
metadata:
  name: nightly
spec:
  cluster:
    name: prod-cluster
  schedule: "0 2 * * *"
  retentionDays: 14
```
Cluster default (used when Backup doesn't set retention):
```yaml
apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: prod-cluster
spec:
  backup:
    retentionDays: 30
```
Important Notes¶
- Changing retention on a `ScheduledBackup` only affects new backups, not existing ones.
- Changing `DocumentDB.spec.backup.retentionDays` doesn't retroactively update existing backups.
- Failed backups still expire (the timer starts at creation).
- Deleting the cluster does NOT immediately delete its Backup objects; they still wait for expiration.
- There is no "keep forever" mode; export externally if you need permanent archival (see the sketch below).
PersistentVolume Retention and Recovery¶
The DocumentDB operator supports retaining PersistentVolumes (PVs) after cluster deletion, allowing you to recover data by creating a new cluster from the retained PV.
Configuring PV Retention¶
By default, PVs are retained when the DocumentDB cluster is deleted (policy is Retain). To delete the PV automatically, set the persistentVolumeReclaimPolicy to Delete:
```yaml
apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: my-cluster
  namespace: default
spec:
  resource:
    storage:
      pvcSize: 10Gi
      persistentVolumeReclaimPolicy: Retain # Default - keeps the PV after cluster deletion
      # persistentVolumeReclaimPolicy: Delete # Deletes the PV when the cluster is deleted
  # ... other configuration
```
How PV Retention Works¶
- When `persistentVolumeReclaimPolicy` is `Retain` (the default), the operator updates the underlying PersistentVolume's reclaim policy (you can verify this as shown after this list)
- When the DocumentDB cluster is deleted, the PVC is deleted but the PV is retained in a "Released" state
- The operator emits a warning event listing the retained PVs for visibility
- The retained PV contains all the database data and can be used to recover the cluster
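To verify that the operator patched the reclaim policy while the cluster is still running, you can look up the PV behind the cluster's PVC. The `cnpg.io/cluster` label is the one referenced later in this section; its value may differ from the DocumentDB name depending on how the operator names the underlying CNPG cluster.

```bash
# Resolve the PV bound to the cluster's PVC, then print its reclaim policy
PV=$(kubectl get pvc -n default -l cnpg.io/cluster=my-cluster \
  -o jsonpath='{.items[0].spec.volumeName}')
kubectl get pv "$PV" -o jsonpath='{.spec.persistentVolumeReclaimPolicy}'
```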
Recovering from a Retained PV¶
To restore a DocumentDB cluster from a retained PV:
Step 1: Identify the retained PV
```bash
# List PVs in Released state
kubectl get pv | grep Released

# Find PVCs that belonged to your deleted cluster (PVCs carry the cnpg.io/cluster label)
kubectl get pvc -n <namespace> -l cnpg.io/cluster=<old-cluster-name>

# For a given PVC, find the bound PV name
kubectl get pvc <pvc-name> -n <namespace> -o jsonpath='{.spec.volumeName}'
```
Step 2: Create a new DocumentDB cluster with PV recovery
The operator automatically handles all the complexity of PV recovery:
```yaml
apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: my-recovered-cluster
  namespace: default
spec:
  nodeCount: 1
  instancesPerNode: 1
  resource:
    storage:
      pvcSize: 10Gi
      storageClass: managed-csi
  exposeViaService:
    serviceType: ClusterIP
  bootstrap:
    recovery:
      persistentVolume:
        name: <pv-name> # Name of the retained PV to recover from
```
What the operator does automatically:
- Validates that the PV exists and is in the `Available` or `Released` state
- If the PV is `Released`, clears the `claimRef` to make it available for binding
- Creates a temporary PVC bound to the retained PV
- Uses the temporary PVC as a data source for CNPG to clone the data
- After the cluster becomes healthy, deletes the temporary PVC (you can watch this as shown below)
- The source PV is released back for manual cleanup or reuse
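Recovery can take a few minutes while the data is cloned into the new cluster's volumes. Watching the pods is usually enough; describing the DocumentDB resource (assuming the CRD is registered under the `documentdb` resource name and exposes status conditions) gives more detail:

```bash
kubectl get pods -n default -w
kubectl describe documentdb my-recovered-cluster -n default
```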
Cleaning Up Retained PVs¶
After successful recovery (or when you no longer need the data), delete the retained PV:
```bash
# View retention warning events
kubectl get events -n <namespace> --field-selector reason=PVsRetained

# Delete the retained PV when no longer needed
kubectl delete pv <pv-name>
```
Important Notes for PV Recovery¶
- The PV must exist and be in the `Available` or `Released` state
- The PV must have been created with `persistentVolumeReclaimPolicy: Retain`
- You cannot specify both `backup` and `persistentVolume` recovery at the same time
- PV recovery preserves all data, including users, roles, and collections
- After successful recovery, consider setting up regular backups for the new cluster (see the example below)
- The source PV remains after recovery and must be manually deleted when no longer needed
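As a starting point for that last recommendation, a nightly ScheduledBackup pointed at the recovered cluster could look like this (same shape as the earlier example; the name and retention are illustrative):

```yaml
apiVersion: documentdb.io/preview
kind: ScheduledBackup
metadata:
  name: my-recovered-cluster-nightly
  namespace: default
spec:
  cluster:
    name: my-recovered-cluster # The cluster created by the PV recovery above
  schedule: "0 2 * * *"        # Daily at 2:00 AM
  retentionDays: 30
```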