Backup and Restore

Overview

Backups protect your DocumentDB cluster against data loss from accidental deletion, corruption, or failed upgrades. A reliable backup strategy is the foundation of any production deployment — without it, recovery may be impossible.

The DocumentDB operator provides a snapshot-based backup system built on Kubernetes VolumeSnapshots. Each backup captures a snapshot of the primary instance's persistent volume, which can later be used to bootstrap a new DocumentDB cluster. Any writes that occurred after the snapshot and before a failure are not captured — these backups do not provide point-in-time recovery (PITR).

Key characteristics:

VolumeSnapshot-based — backups use the CSI (Container Storage Interface) driver's snapshot capability, so they are fast and storage-efficient.
Primary-only — the operator always targets the primary instance for backups.
Namespace-scoped — Backup and ScheduledBackup resources must reside in the same namespace as the DocumentDB cluster.
Retention-managed — expired backups are automatically deleted by the operator.

Prerequisites

Before creating backups, ensure your Kubernetes cluster has the required snapshot support.

Setup depends on your environment: Kind / Minikube, AKS, or EKS / GKE / Other.

On Kind or Minikube, run the CSI driver deployment script before creating a backup:

./operator/src/scripts/test-scripts/deploy-csi-driver.sh

Validate storage and snapshot components:

kubectl get storageclass
kubectl get volumesnapshotclasses

You should see a VolumeSnapshotClass such as csi-hostpath-snapclass. If it's missing, re-run the deploy script.

When creating a DocumentDB cluster, specify the CSI storage class:

apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: my-cluster
  namespace: default
spec:
  resource:
    storage:
      storageClass: csi-hostpath-sc

AKS provides a CSI driver out of the box. Set spec.environment: aks so the operator can auto-create a default VolumeSnapshotClass:

spec:
  environment: aks

On EKS, GKE, or other platforms, ensure the following are in place:

A CSI driver that supports snapshots
VolumeSnapshot CRDs installed
A default VolumeSnapshotClass

Example for EKS:

volume-snapshot-class.yaml

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: ebs-snapclass
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
driver: ebs.csi.aws.com
deletionPolicy: Delete
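
To confirm the snapshot prerequisites listed above before creating a backup, a check along these lines may help. This is a minimal sketch, not part of the operator: the function name `check_snapshot_prereqs` is hypothetical, and it assumes the CRD names shipped by the Kubernetes external-snapshotter project and a `kubectl` context that points at the target cluster.

```shell
# Hypothetical helper: verifies that the VolumeSnapshot CRDs are installed,
# then lists the VolumeSnapshotClasses available in the cluster.
check_snapshot_prereqs() {
  # CRD names assumed from the Kubernetes external-snapshotter project.
  for crd in \
    volumesnapshots.snapshot.storage.k8s.io \
    volumesnapshotclasses.snapshot.storage.k8s.io \
    volumesnapshotcontents.snapshot.storage.k8s.io; do
    if ! kubectl get crd "$crd" >/dev/null 2>&1; then
      echo "missing CRD: $crd"
      return 1
    fi
  done
  # Print the classes so you can confirm that one is marked as the default.
  kubectl get volumesnapshotclasses
}
```

Call `check_snapshot_prereqs` with no arguments. It stops at the first missing CRD and returns a non-zero status.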

Backup

You can create backups on demand or on a schedule.

An on-demand backup creates a single snapshot of a DocumentDB cluster as it exists when the backup runs.

backup.yaml

apiVersion: documentdb.io/preview
kind: Backup
metadata:
  name: my-backup
  namespace: default
spec:
  cluster:
    name: my-documentdb-cluster
  retentionDays: 30  # Optional: defaults to cluster setting or 30 days

kubectl apply -f backup.yaml

For the full list of fields, see the Backup API Reference.

Scheduled backups automatically create Backup resources at regular intervals using a cron schedule.

scheduledbackup.yaml

apiVersion: documentdb.io/preview
kind: ScheduledBackup
metadata:
  name: nightly-backup
  namespace: default
spec:
  cluster:
    name: my-documentdb-cluster
  schedule: "0 2 * * *"    # Daily at 2:00 AM
  retentionDays: 14         # Optional

kubectl apply -f scheduledbackup.yaml

For the full list of fields, see the ScheduledBackup API Reference.

Cron schedule examples:

Schedule	Meaning
`0 2 * * *`	Every day at 2:00 AM
`0 */6 * * *`	Every 6 hours
`0 0 * * 0`	Every Sunday at midnight
`0 2 1 * *`	First day of every month at 2:00 AM

For more details, see cron expression format.
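
A malformed schedule is easy to miss in a manifest. The sketch below checks only that an expression has five whitespace-separated fields, assuming the five-field format (minute, hour, day of month, month, day of week) that the examples in the table follow. The helper name `has_five_cron_fields` is hypothetical, and the check does not validate the contents of each field.

```shell
# Hypothetical helper: succeeds only if the expression has exactly five
# whitespace-separated fields. It does not validate the field values.
has_five_cron_fields() {
  set -f        # disable globbing so "*" is not expanded to file names
  set -- $1     # split the expression on whitespace into positional args
  set +f        # re-enable globbing
  [ "$#" -eq 5 ]
}

has_five_cron_fields "0 */6 * * *" && echo "ok"                # prints "ok"
has_five_cron_fields "0 /6 * *" || echo "wrong field count"    # prints "wrong field count"
```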

Behavior notes:

If a backup is still running when the next schedule triggers, the new backup is queued until the current one completes.
Failed backups do not block future scheduled backups.
Deleting a ScheduledBackup does not delete its previously created Backup objects.

Restore from Backup

You can restore a backup by creating a new DocumentDB cluster that references the backup.

Warning

In-place restore is not supported. You must create a new DocumentDB cluster to restore from a backup.

Step 1: Identify the Backup

List the backups in your DocumentDB cluster's namespace and choose one whose status is completed:

kubectl get backups -n <namespace>

Step 2: Create a New DocumentDB Cluster

restore.yaml

apiVersion: documentdb.io/preview
kind: DocumentDB
metadata:
  name: my-restored-cluster
  namespace: default
spec:
  nodeCount: 1
  instancesPerNode: 1
  resource:
    storage:
      pvcSize: 10Gi
  exposeViaService:
    serviceType: ClusterIP
  bootstrap:
    recovery:
      backup:
        name: my-backup  # Name of the backup to restore from

kubectl apply -f restore.yaml

Step 3: Verify the Restore

# Wait for the DocumentDB cluster to become healthy
kubectl get documentdb my-restored-cluster -n default -w

Once the status shows "Cluster in healthy state", connect and verify your data. See Connect with mongosh for connection instructions.

Restore Constraints

You cannot restore to the original DocumentDB cluster name while the old resources exist. Delete any leftover resources first, or use a new name.
The backup must be in completed status.
The VolumeSnapshot referenced by the backup must still exist — if it was manually deleted, the backup cannot be used for recovery.
You cannot specify both backup and persistentVolume in the same recovery spec.
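
Because a restore depends on the underlying VolumeSnapshot still existing, it can be worth listing the snapshots in the namespace before you apply the restore manifest. The sketch below is one way to do that. The helper name `list_volume_snapshots` is hypothetical, and how a Backup maps to its VolumeSnapshot name is not specified here, so match them by inspecting the output.

```shell
# Hypothetical helper: lists VolumeSnapshots in a namespace so you can confirm
# that the snapshot behind a backup still exists.
list_volume_snapshots() {
  namespace="${1:-default}"   # falls back to "default" when no namespace is given
  kubectl get volumesnapshots -n "$namespace"
}
```

For example, `list_volume_snapshots default` lists the snapshots in the `default` namespace.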

For additional recovery options (including PV-based recovery), see Restore a Deleted DocumentDB Cluster.

Backup Retention Policy

Each backup receives an expiration time. After expiration, the operator deletes it automatically. You can define the retention period at multiple levels:

Level	Field	Scope
Per-backup	`Backup.spec.retentionDays`	Overrides all other settings for a single backup
Per-schedule	`ScheduledBackup.spec.retentionDays`	Applied to all backups created by this schedule
Per-cluster	`DocumentDB.spec.backup.retentionDays`	Cluster-wide default for all backups
Default	—	30 days (if nothing is set)

The operator resolves retention in priority order: per-backup > per-schedule > per-cluster > default.
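
The priority order can be illustrated with a small shell function. This is a sketch of the rule as described above, not the operator's actual code. The function name `resolve_retention_days` is hypothetical, and an empty argument is assumed to mean "not set" at that level.

```shell
# Hypothetical helper: returns the first retention value that is set, checking
# per-backup, then per-schedule, then per-cluster, and finally the 30-day default.
resolve_retention_days() {
  for candidate in "$1" "$2" "$3"; do
    if [ -n "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  echo 30
}

resolve_retention_days 7 14 60     # prints 7  (per-backup wins)
resolve_retention_days "" 14 60    # prints 14 (per-schedule wins)
resolve_retention_days "" "" ""    # prints 30 (default)
```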

How Expiration Is Calculated

Successful backups: retention starts at status.stoppedAt
Failed backups: retention starts at metadata.creationTimestamp
Expiration = start time + (retentionDays × 24 hours)
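
As a worked example of the formula, the sketch below adds `retentionDays × 24 hours` to a start timestamp. It assumes GNU `date` (for `-d` and the `@epoch` syntax) and an RFC 3339 UTC timestamp like those Kubernetes reports. The helper name `backup_expiration` is hypothetical. Pass `status.stoppedAt` for a successful backup or `metadata.creationTimestamp` for a failed one.

```shell
# Hypothetical helper: prints the expiration time for a given start timestamp
# and retention period in days. Requires GNU date.
backup_expiration() {
  # Convert the start timestamp to seconds since the Unix epoch.
  start_epoch=$(date -u -d "$1" +%s) || return 1
  # Add retentionDays x 24 hours (86400 seconds per day) and format as UTC.
  date -u -d "@$((start_epoch + $2 * 86400))" +%Y-%m-%dT%H:%M:%SZ
}

backup_expiration "2024-01-01T02:00:00Z" 14   # prints 2024-01-15T02:00:00Z
```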

Important Retention Notes

Changing retentionDays on a ScheduledBackup only affects new backups.
Changing DocumentDB.spec.backup.retentionDays does not retroactively update existing backups.
Failed backups still expire (timer starts at creation).
Deleting the DocumentDB cluster does not immediately delete its Backup objects — they wait for expiration.
There is no "keep forever" option. Export backups externally for permanent archival.