Backup and Restore¶
Overview¶
Backups protect your DocumentDB cluster against data loss from accidental deletion, corruption, or failed upgrades. A reliable backup strategy is the foundation of any production deployment — without it, recovery may be impossible.
The DocumentDB operator provides a snapshot-based backup system built on Kubernetes VolumeSnapshots. Each backup captures a snapshot of the primary instance's persistent volume, which can later be used to bootstrap a new DocumentDB cluster. Any writes that occurred after the snapshot and before a failure are not captured — these backups do not provide point-in-time recovery (PITR).
Key characteristics:
- VolumeSnapshot-based: backups use the CSI (Container Storage Interface) driver's snapshot capability, so they are fast and storage-efficient.
- Primary-only: the operator always targets the primary instance for backups.
- Namespace-scoped: Backup and ScheduledBackup resources must reside in the same namespace as the DocumentDB cluster.
- Retention-managed: the operator automatically deletes expired backups.
Prerequisites¶
Before creating backups, ensure your Kubernetes cluster has the required snapshot support.
Run the CSI driver deployment script before creating a backup:
Validate storage and snapshot components:
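A minimal sketch of such a check, assuming kubectl is already pointed at the target cluster. The helper name check_snapshot_support is hypothetical; the CRD names are the standard ones installed by the Kubernetes external-snapshotter project.

```shell
# Hypothetical helper: lists the objects that snapshot-based backups depend on.
check_snapshot_support() {
  # Storage classes available for the DocumentDB persistent volumes
  kubectl get storageclass
  # Snapshot classes; at least one should be listed
  kubectl get volumesnapshotclass
  # VolumeSnapshot CRDs; an error here means they are not installed
  kubectl get crd \
    volumesnapshots.snapshot.storage.k8s.io \
    volumesnapshotclasses.snapshot.storage.k8s.io \
    volumesnapshotcontents.snapshot.storage.k8s.io
}
```

Run check_snapshot_support after defining it, and read the three listings in order.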
You should see a VolumeSnapshotClass such as csi-hostpath-snapclass. If it's missing, re-run the deploy script.
When creating a DocumentDB cluster, specify the CSI storage class:
AKS provides a CSI driver out of the box. Set spec.environment: aks so the operator can auto-create a default VolumeSnapshotClass:
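As an illustration, the sketch below writes a DocumentDB manifest with spec.environment set to aks. The remaining fields simply mirror the restore example later on this page, and the file name documentdb-aks.yaml is a placeholder; your cluster may need additional fields.

```shell
# Write an example DocumentDB manifest for AKS to a local file.
cat > documentdb-aks.yaml <<'EOF'
apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: my-documentdb-cluster
  namespace: default
spec:
  environment: aks   # lets the operator auto-create a default VolumeSnapshotClass
  nodeCount: 1
  instancesPerNode: 1
  resource:
    storage:
      pvcSize: 10Gi
  exposeViaService:
    serviceType: ClusterIP
EOF
# Then apply it with: kubectl apply -f documentdb-aks.yaml
```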
Ensure the following are in place:
- A CSI driver that supports snapshots
- VolumeSnapshot CRDs installed
- A default VolumeSnapshotClass
Example for EKS:
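One way to satisfy the last requirement on EKS is a default VolumeSnapshotClass backed by the AWS EBS CSI driver. The sketch below assumes that driver (ebs.csi.aws.com) and the VolumeSnapshot CRDs are already installed; the class name ebs-snapclass and the file name are placeholders.

```shell
# Write a default VolumeSnapshotClass for the AWS EBS CSI driver to a local file.
cat > ebs-snapclass.yaml <<'EOF'
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapclass   # placeholder name
  annotations:
    # Marks this class as the default for its driver
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: ebs.csi.aws.com
deletionPolicy: Delete
EOF
# Then apply it with: kubectl apply -f ebs-snapclass.yaml
```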
Backup¶
An on-demand backup creates a single point-in-time backup of a DocumentDB cluster.
apiVersion: documentdb.io/preview
kind: Backup
metadata:
  name: my-backup
  namespace: default
spec:
  cluster:
    name: my-documentdb-cluster
  retentionDays: 30  # Optional: defaults to cluster setting or 30 days
For the full list of fields, see the Backup API Reference.
Scheduled backups automatically create Backup resources at regular intervals using a cron schedule.
apiVersion: documentdb.io/preview
kind: ScheduledBackup
metadata:
  name: nightly-backup
  namespace: default
spec:
  cluster:
    name: my-documentdb-cluster
  schedule: "0 2 * * *"  # Daily at 2:00 AM
  retentionDays: 14      # Optional
For the full list of fields, see the ScheduledBackup API Reference.
Cron schedule examples:
| Schedule | Meaning |
|---|---|
| 0 2 * * * | Every day at 2:00 AM |
| 0 */6 * * * | Every 6 hours |
| 0 0 * * 0 | Every Sunday at midnight |
| 0 2 1 * * | First day of every month at 2:00 AM |
For more details, see cron expression format.
Behavior notes:
- If a backup is still running when the next schedule triggers, the new backup is queued until the current one completes.
- Failed backups do not block future scheduled backups.
- Deleting a ScheduledBackup does not delete its previously created Backup objects.
Restore from Backup¶
You can restore a backup by creating a new DocumentDB cluster that references the backup.
Warning
In-place restore is not supported. You must create a new DocumentDB cluster to restore from a backup.
Step 1: Identify the Backup¶
List backups for your DocumentDB cluster and choose one in completed status:
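A possible way to list them with kubectl is sketched here. The fully qualified resource name backups.documentdb.io is an assumption derived from the Backup kind and the documentdb.io API group shown above, and the helper name list_backups is hypothetical.

```shell
# Hypothetical helper: list Backup resources in a namespace (defaults to "default").
# Assumes the CRD's plural resource name is "backups" in the documentdb.io group.
list_backups() {
  kubectl get backups.documentdb.io -n "${1:-default}"
}
```

Calling list_backups default prints the backups in the default namespace; note the name of one whose status is completed.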
Step 2: Create a New DocumentDB Cluster¶
apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: my-restored-cluster
  namespace: default
spec:
  nodeCount: 1
  instancesPerNode: 1
  resource:
    storage:
      pvcSize: 10Gi
  exposeViaService:
    serviceType: ClusterIP
  bootstrap:
    recovery:
      backup:
        name: my-backup  # Name of the backup to restore from
Step 3: Verify the Restore¶
# Wait for the DocumentDB cluster to become healthy
kubectl get documentdb my-restored-cluster -n default -w
Once the status shows Cluster in healthy state, connect and verify your data. See Connect with mongosh for connection instructions.
Restore Constraints¶
- You cannot restore to the original DocumentDB cluster name while the old resources exist. Delete any leftover resources first, or use a new name.
- The backup must be in completed status.
- The VolumeSnapshot referenced by the backup must still exist. If it was manually deleted, the backup cannot be used for recovery.
- You cannot specify both backup and persistentVolume in the same recovery spec.
For additional recovery options (including PV-based recovery), see Restore a Deleted DocumentDB Cluster.
Backup Retention Policy¶
Each backup receives an expiration time. After expiration, the operator deletes it automatically. You can define the retention period at multiple levels:
| Level | Field | Scope |
|---|---|---|
| Per-backup | Backup.spec.retentionDays | Overrides all other settings for a single backup |
| Per-schedule | ScheduledBackup.spec.retentionDays | Applied to all backups created by this schedule |
| Per-cluster | DocumentDB.spec.backup.retentionDays | Cluster-wide default for all backups |
| Default | — | 30 days (if nothing is set) |
The operator resolves retention in priority order: per-backup > per-schedule > per-cluster > default.
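The priority order can be illustrated with a small shell function. This is only a sketch of the rule as described here, not the operator's code; the function name resolve_retention is hypothetical, and an empty argument stands for "not set".

```shell
# resolve_retention PER_BACKUP PER_SCHEDULE PER_CLUSTER
# Prints the first non-empty value, falling back to the 30-day default.
resolve_retention() {
  local per_backup="${1:-}" per_schedule="${2:-}" per_cluster="${3:-}"
  echo "${per_backup:-${per_schedule:-${per_cluster:-30}}}"
}

resolve_retention "" 14 7    # prints 14 (the schedule value beats the cluster value)
resolve_retention "" "" ""   # prints 30 (nothing set, so the default applies)
```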
How Expiration Is Calculated¶
- Successful backups: retention starts at status.stoppedAt
- Failed backups: retention starts at metadata.creationTimestamp
- Expiration = start time + (retentionDays × 24 hours)
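As a worked example of the formula above, the following sketch computes an expiration timestamp from a start time and a retention period. It assumes GNU date (the -d option behaves differently on BSD and macOS), and the function name expiration_time is hypothetical.

```shell
# expiration_time START_TIME RETENTION_DAYS
# Prints START_TIME plus RETENTION_DAYS x 24 hours, in UTC.
expiration_time() {
  local start_epoch
  start_epoch="$(date -u -d "$1" +%s)" || return 1
  date -u -d "@$((start_epoch + $2 * 24 * 3600))" +%Y-%m-%dT%H:%M:%SZ
}

# A successful backup that stopped at 02:05 UTC on 1 January, kept for 14 days:
expiration_time "2025-01-01T02:05:00Z" 14   # prints 2025-01-15T02:05:00Z
```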
Important Retention Notes¶
- Changing retentionDays on a ScheduledBackup only affects new backups.
- Changing DocumentDB.spec.backup.retentionDays does not retroactively update existing backups.
- Failed backups still expire (timer starts at creation).
- Deleting the DocumentDB cluster does not immediately delete its Backup objects; they remain until they expire.
- There is no "keep forever" option. Export backups externally for permanent archival.