DocumentDB Kubernetes Operator¶
The DocumentDB Kubernetes Operator is an open-source project for running and managing DocumentDB on Kubernetes. DocumentDB is the engine powering vCore-based Azure Cosmos DB for MongoDB. It is built on top of PostgreSQL and offers a native implementation of a document-oriented NoSQL database, enabling CRUD operations on BSON data types.
As part of a DocumentDB cluster installation, the operator deploys and manages one or more PostgreSQL instances, the DocumentDB Gateway, and other Kubernetes resources. PostgreSQL serves as the underlying storage engine, while the gateway lets you connect to the DocumentDB cluster using MongoDB-compatible drivers, APIs, and tools.
Note: This project is under active development and is not yet recommended for production use. We welcome your feedback and contributions!
Quickstart¶
This quickstart guide will walk you through the steps to install the operator, deploy a DocumentDB cluster, access it using mongosh, and perform basic operations.
Prerequisites¶
- Helm installed.
- kubectl installed.
- A local Kubernetes cluster tool such as minikube or kind installed. Any other Kubernetes cluster also works, but this quickstart does not require one.
- mongosh installed, to connect to the DocumentDB cluster.
Start a local Kubernetes cluster¶
If you are using minikube, use the following command:
minikube start
If you are using kind, use the following command:
kind create cluster
Install cert-manager¶
cert-manager is used to manage TLS certificates for the DocumentDB cluster.
If you already have cert-manager installed, you can skip this step.
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set installCRDs=true
Verify that cert-manager is installed correctly:
kubectl get pods -n cert-manager
Output:
NAME READY STATUS RESTARTS
cert-manager-6794b8d569-d7lwd 1/1 Running 0
cert-manager-cainjector-7f69cd69f7-pd9bc 1/1 Running 0
cert-manager-webhook-6cc5dccc4b-7jmrh 1/1 Running 0
Install documentdb-operator using the Helm chart¶
The DocumentDB operator uses the CloudNativePG operator behind the scenes and installs it in the cnpg-system namespace. These instructions assume that the CloudNativePG operator is not already installed in your cluster.
Use the following command to install the DocumentDB operator:
# Add the Helm repository
helm repo add documentdb https://documentdb.github.io/documentdb-kubernetes-operator
helm repo update
# Install the operator
helm install documentdb-operator documentdb/documentdb-operator \
  --namespace documentdb-operator \
  --create-namespace \
  --wait
This will install the operator in the documentdb-operator namespace. Verify that it is running:
kubectl get deployment -n documentdb-operator
Output:
NAME READY UP-TO-DATE AVAILABLE AGE
documentdb-operator 1/1 1 1 113s
You should also see the DocumentDB operator CRDs installed in the cluster:
kubectl get crd | grep documentdb
Output:
documentdbs.db.microsoft.com
Store DocumentDB credentials in K8s Secret¶
Before deploying the DocumentDB cluster, create a Kubernetes secret to store the DocumentDB credentials. The sidecar injector plugin will automatically inject these credentials as environment variables into the DocumentDB gateway container.
Create the secret with your desired username and password:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: documentdb-preview-ns
---
# DocumentDB Credentials Secret
#
# Login credentials:
# Username: k8s_secret_user
# Password: K8sSecret100
#
# Connect using mongosh (port-forward):
# mongosh 127.0.0.1:10260 -u k8s_secret_user -p K8sSecret100 --authenticationMechanism SCRAM-SHA-256 --tls --tlsAllowInvalidCertificates
#
# Connect using connection string (port-forward):
# mongosh "mongodb://k8s_secret_user:K8sSecret100@127.0.0.1:10260/?directConnection=true&authMechanism=SCRAM-SHA-256&tls=true&tlsAllowInvalidCertificates=true&replicaSet=rs0"
#
apiVersion: v1
kind: Secret
metadata:
  name: documentdb-credentials
  namespace: documentdb-preview-ns
type: Opaque
stringData:
  username: k8s_secret_user
  password: K8sSecret100
EOF
Verify the secret is created:
kubectl get secret documentdb-credentials -n documentdb-preview-ns
Output:
NAME TYPE DATA AGE
documentdb-credentials Opaque 2 10s
Note: By default, the operator expects a credentials secret named documentdb-credentials containing username and password keys. You can override the secret name by setting spec.documentDbCredentialSecret in your DocumentDB resource. Whatever name you configure (or the default) will be used by the sidecar injector to project the values as USERNAME and PASSWORD environment variables into the gateway sidecar container.
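To make the note above concrete, here is a minimal Python sketch of how values written under stringData end up base64-encoded under the Secret's data field, and how to decode them again (for example, after reading them with kubectl get secret ... -o jsonpath). The helper names encode_secret_data and decode_secret_data are hypothetical and are not part of the operator.

```python
import base64


def encode_secret_data(string_data: dict) -> dict:
    """Mimic how Kubernetes stores stringData: each value is base64-encoded under data."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in string_data.items()
    }


def decode_secret_data(data: dict) -> dict:
    """Decode the base64 values found in the data field of a Secret."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in data.items()
    }


# Credentials from the quickstart Secret.
string_data = {"username": "k8s_secret_user", "password": "K8sSecret100"}
encoded = encode_secret_data(string_data)
decoded = decode_secret_data(encoded)
print(sorted(decoded))  # the key names the operator expects by default
```

This is why the connection string reported by the operator pipes each Secret field through base64 -d before using it.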
Deploy a DocumentDB cluster¶
Create a single-node DocumentDB cluster:
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: documentdb-preview-ns
---
apiVersion: db.microsoft.com/preview
kind: DocumentDB
metadata:
  name: documentdb-preview
  namespace: documentdb-preview-ns
spec:
  nodeCount: 1
  instancesPerNode: 1
  documentDbCredentialSecret: documentdb-credentials
  resource:
    storage:
      pvcSize: 10Gi
  exposeViaService:
    serviceType: ClusterIP
EOF
Wait for the DocumentDB cluster to be fully initialized. Verify that it is running:
kubectl get pods -n documentdb-preview-ns
Output:
NAME READY STATUS RESTARTS AGE
documentdb-preview-1 2/2 Running 0 26m
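Waiting for initialization can also be scripted. The sketch below is one possible readiness check, assuming you feed it the parsed JSON from kubectl get pods -n documentdb-preview-ns -o json; the function name all_pods_ready is hypothetical. It treats a pod as ready when it is in the Running phase and every container reports ready, which corresponds to 2/2 in the READY column.

```python
import json


def all_pods_ready(pod_list: dict) -> bool:
    """Return True when at least one pod exists and every pod is Running
    with all of its containers reporting ready."""
    pods = pod_list.get("items", [])
    if not pods:
        return False
    for pod in pods:
        status = pod.get("status", {})
        if status.get("phase") != "Running":
            return False
        containers = status.get("containerStatuses", [])
        if not containers or not all(c.get("ready") for c in containers):
            return False
    return True


# Trimmed, illustrative stand-in for `kubectl get pods ... -o json` output.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "documentdb-preview-1"},
      "status": {
        "phase": "Running",
        "containerStatuses": [{"ready": true}, {"ready": true}]
      }
    }
  ]
}
""")
print(all_pods_ready(sample))
```

An empty pod list is reported as not ready, so the check does not pass before the operator has created any pods.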
You can also check the DocumentDB CRD instance:
kubectl get DocumentDB -n documentdb-preview-ns
Output:
NAME STATUS CONNECTION STRING
documentdb-preview Cluster in healthy state mongodb://$(kubectl get secret documentdb-credentials -n documentdb-preview-ns -o jsonpath='{.data.username}' | base64 -d):$(kubectl get secret documentdb-credentials -n documentdb-preview-ns -o jsonpath='{.data.password}' | base64 -d)@10.0.29.01:10260/?directConnection=true&authMechanism=SCRAM-SHA-256&tls=true&tlsAllowInvalidCertificates=true&replicaSet=rs0
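The connection string in the CONNECTION STRING column embeds kubectl commands that read the username and password from the Secret. As a sketch of the same assembly in Python, the hypothetical helper below builds the string from already-decoded credentials and percent-encodes them, which matters if a password contains characters such as @ or :. The query options are copied from the output above; the host is whatever address you connect to.

```python
from urllib.parse import quote_plus

# Query options copied from the connection string reported by the operator.
QUERY_OPTIONS = (
    "directConnection=true&authMechanism=SCRAM-SHA-256"
    "&tls=true&tlsAllowInvalidCertificates=true&replicaSet=rs0"
)


def build_connection_string(username: str, password: str, host: str, port: int = 10260) -> str:
    """Assemble a MongoDB connection string with percent-encoded credentials."""
    user = quote_plus(username)
    secret = quote_plus(password)
    return f"mongodb://{user}:{secret}@{host}:{port}/?{QUERY_OPTIONS}"


uri = build_connection_string("k8s_secret_user", "K8sSecret100", "127.0.0.1")
print(uri)
```

The quickstart credentials contain no reserved characters, so they pass through unchanged; a password such as p@ss:word would be emitted as p%40ss%3Aword.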
Connect to the DocumentDB cluster¶
Once you have deployed your DocumentDB cluster, you can connect using different methods depending on your service type. Choose the approach that best fits your deployment strategy:
Option 1: ClusterIP Service (Default - for local development)¶
The default deployment uses the ClusterIP service type. To connect from your local machine, use port forwarding:
Step 1: Set up port forwarding (keep this terminal open):
kubectl port-forward pod/documentdb-preview-1 10260:10260 -n documentdb-preview-ns
Step 2: In a new terminal, connect using mongosh:
# Traditional format (via port-forward)
mongosh 127.0.0.1:10260 -u k8s_secret_user -p K8sSecret100 --authenticationMechanism SCRAM-SHA-256 --tls --tlsAllowInvalidCertificates
# Or connection string format (via port-forward)
mongosh "mongodb://k8s_secret_user:K8sSecret100@127.0.0.1:10260/?directConnection=true&authMechanism=SCRAM-SHA-256&tls=true&tlsAllowInvalidCertificates=true&replicaSet=rs0"
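If mongosh cannot connect, it can help to first confirm that the port-forward is actually listening. The following is a minimal sketch using only the Python standard library; port_is_open is a hypothetical helper name, and 127.0.0.1:10260 is the address forwarded in Step 1. It only checks that a TCP connection can be opened, not that TLS or authentication succeed.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if port_is_open("127.0.0.1", 10260):
    print("port-forward is reachable")
else:
    print("nothing is listening on 127.0.0.1:10260; is kubectl port-forward running?")
```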
Option 2: LoadBalancer Service (For cloud deployments)¶
If you prefer direct external access (recommended for cloud environments such as Azure AKS), deploy with the LoadBalancer service type:
Step 1: Deploy DocumentDB with LoadBalancer service:
cat <<EOF | kubectl apply -f -
apiVersion: db.microsoft.com/preview
kind: DocumentDB
metadata:
  name: documentdb-preview
  namespace: documentdb-preview-ns
spec:
  nodeCount: 1
  instancesPerNode: 1
  documentDbCredentialSecret: documentdb-credentials
  resource:
    storage:
      pvcSize: 10Gi
  exposeViaService:
    serviceType: LoadBalancer
EOF
Step 2: Wait for the external IP to be assigned:
kubectl get services -n documentdb-preview-ns -w
You should see something like:
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
documentdb-service-documentdb-preview LoadBalancer 10.0.228.243 52.149.56.216 10260:30312/TCP 2m
Step 3: Connect directly using the external IP:
# Get the connection string with external IP automatically populated
kubectl get documentdb documentdb-preview -n documentdb-preview-ns -o jsonpath='{.status.connectionString}'
# Copy the output and use it directly with mongosh
mongosh "PASTE_CONNECTION_STRING_HERE"
Note: The LoadBalancer service type is supported in cloud environments (Azure AKS, AWS EKS, GCP GKE), as well as in local development with minikube and kind.
Work with Data¶
Once connected, execute the following commands to create a database and a collection, and insert some documents:
use testdb
db.createCollection("test_collection")
db.test_collection.insertMany([
{ name: "Alice", age: 30 },
{ name: "Bob", age: 25 },
{ name: "Charlie", age: 35 }
])
db.test_collection.find()
Output:
[direct: mongos] test> use testdb
switched to db testdb
[direct: mongos] testdb> db.createCollection("test_collection")
{ ok: 1 }
[direct: mongos] testdb> db.test_collection.insertMany([
... { name: "Alice", age: 30 },
... { name: "Bob", age: 25 },
... { name: "Charlie", age: 35 }
... ])
{
acknowledged: true,
insertedIds: {
'0': ObjectId('682c3b06491dc99ae02b3fed'),
'1': ObjectId('682c3b06491dc99ae02b3fee'),
'2': ObjectId('682c3b06491dc99ae02b3fef')
}
}
[direct: mongos] testdb> db.test_collection.find()
[
{ _id: ObjectId('682c3b06491dc99ae02b3fed'), name: 'Alice', age: 30 },
{ _id: ObjectId('682c3b06491dc99ae02b3fee'), name: 'Bob', age: 25 },
{
_id: ObjectId('682c3b06491dc99ae02b3fef'),
name: 'Charlie',
age: 35
}
]
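The same insert-and-read sequence can be sketched with PyMongo. This is a minimal illustration rather than the project's own code: it assumes pymongo is installed and that the port-forward to 127.0.0.1:10260 is active, and the names insert_and_list and connect_and_run are hypothetical. The client options mirror the mongosh flags used earlier.

```python
PEOPLE = [
    {"name": "Alice", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Charlie", "age": 35},
]


def insert_and_list(collection, documents=PEOPLE):
    """Insert the documents, then return the name of every document in the collection."""
    # Copy each document so the _id field PyMongo adds does not leak into PEOPLE.
    collection.insert_many([dict(doc) for doc in documents])
    return [doc["name"] for doc in collection.find()]


def connect_and_run():
    """Connect through the port-forward and run the helper (requires pymongo)."""
    from pymongo import MongoClient  # imported lazily so the helper stays importable

    client = MongoClient(
        "127.0.0.1",
        10260,
        username="k8s_secret_user",
        password="K8sSecret100",
        authMechanism="SCRAM-SHA-256",
        tls=True,
        tlsAllowInvalidCertificates=True,
        directConnection=True,
    )
    return insert_and_list(client["testdb"]["test_collection"])
```

Calling connect_and_run() returns the names of all documents currently in test_collection, so names inserted earlier through mongosh appear as well.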
Other options: Try the sample Python app¶
Connect to DocumentDB using a Python app¶
In addition to mongosh, you can use the sample Python program in the GitHub repository, which uses the PyMongo client, to execute operations on the DocumentDB instance. It inserts a sample document into a movies collection inside the sample_mflix database.
git clone https://github.com/documentdb/documentdb-kubernetes-operator
cd documentdb-kubernetes-operator/operator/src/scripts/test-scripts
pip3 install pymongo
python3 mongo-python-data-pusher.py
Output:
Inserted document ID: 682c54f9505b85fba77ed154
{'_id': ObjectId('682c54f9505b85fba77ed154'),
'cast': ['Olivia Colman', 'Emma Stone', 'Rachel Weisz'],
'directors': ['Yorgos Lanthimos'],
'genres': ['Drama', 'History'],
'rated': 'R',
'runtime': 121,
'title': 'The Favourite MongoDB Movie',
'type': 'movie',
'year': 2018}
You can verify this using the mongosh shell:
use sample_mflix
db.movies.find()
Output:
[direct: mongos] testdb> use sample_mflix
switched to db sample_mflix
[direct: mongos] sample_mflix>
[direct: mongos] sample_mflix> db.movies.find()
[
{
_id: ObjectId('682c54f9505b85fba77ed154'),
title: 'The Favourite MongoDB Movie',
genres: [ 'Drama', 'History' ],
runtime: 121,
rated: 'R',
year: 2018,
directors: [ 'Yorgos Lanthimos' ],
cast: [ 'Olivia Colman', 'Emma Stone', 'Rachel Weisz' ],
type: 'movie'
}
]
If you are using the Python program to connect to DocumentDB, make sure to update the script's host variable with the appropriate IP address for your service type (127.0.0.1 for ClusterIP with port-forward, or the external IP for a LoadBalancer service). Also update the default password in the script or, preferably, use environment variables to manage sensitive information such as passwords securely.
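One way to keep the password out of the script is to read the connection settings from environment variables. The sketch below is not the sample script itself: the variable names DOCUMENTDB_HOST, DOCUMENTDB_PORT, DOCUMENTDB_USERNAME, and DOCUMENTDB_PASSWORD are hypothetical, and the defaults assume the ClusterIP port-forward setup. The password deliberately has no default so that it never lives in source code.

```python
import os


def load_settings(env=os.environ) -> dict:
    """Read connection settings from environment variables (names are hypothetical)."""
    password = env.get("DOCUMENTDB_PASSWORD")
    if not password:
        raise RuntimeError("Set DOCUMENTDB_PASSWORD before running the script")
    return {
        # Use the external IP here when the service type is LoadBalancer.
        "host": env.get("DOCUMENTDB_HOST", "127.0.0.1"),
        "port": int(env.get("DOCUMENTDB_PORT", "10260")),
        "username": env.get("DOCUMENTDB_USERNAME", "k8s_secret_user"),
        "password": password,
    }


def client_kwargs(settings: dict) -> dict:
    """Combine the settings with the TLS and auth options used in this quickstart."""
    return {
        **settings,
        "authMechanism": "SCRAM-SHA-256",
        "tls": True,
        "tlsAllowInvalidCertificates": True,
        "directConnection": True,
    }
```

With PyMongo installed, the result can be passed straight to the client, for example MongoClient(**client_kwargs(load_settings())).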
Configuration and Advanced Topics¶
Now that you have a basic DocumentDB cluster running, you may want to explore advanced configuration options and operational guides:
Sidecar Injector Plugin Configuration¶
The DocumentDB operator uses a sidecar injector plugin to automatically inject the DocumentDB Gateway container into PostgreSQL pods. This plugin supports multiple configuration parameters including:
- Gateway Image Configuration: Customize which DocumentDB Gateway container image is used
- Pod Labels and Annotations: Add custom metadata to injected pods
For detailed information on configuring the sidecar injector plugin, see: Sidecar Injector Plugin Configuration
Multi-Cloud Deployment¶
The DocumentDB operator supports deployment across multiple cloud environments and Kubernetes distributions. For guidance on multi-cloud deployments, see: Multi-Cloud Deployment Guide
TLS Setup¶
For advanced TLS configuration and testing:
- TLS Setup Guide - Complete TLS configuration guide
- E2E Testing - Comprehensive testing procedures
Clean Up¶
Delete the DocumentDB cluster and other resources¶
kubectl delete DocumentDB documentdb-preview -n documentdb-preview-ns
The Pod should now be terminated:
kubectl get pods -n documentdb-preview-ns
Uninstall the DocumentDB operator:
helm uninstall documentdb-operator --namespace documentdb-operator
Output:
These resources were kept due to the resource policy:
[CustomResourceDefinition] poolers.postgresql.cnpg.io
[CustomResourceDefinition] publications.postgresql.cnpg.io
[CustomResourceDefinition] scheduledbackups.postgresql.cnpg.io
[CustomResourceDefinition] subscriptions.postgresql.cnpg.io
[CustomResourceDefinition] backups.postgresql.cnpg.io
[CustomResourceDefinition] clusterimagecatalogs.postgresql.cnpg.io
[CustomResourceDefinition] clusters.postgresql.cnpg.io
[CustomResourceDefinition] databases.postgresql.cnpg.io
[CustomResourceDefinition] imagecatalogs.postgresql.cnpg.io
release "documentdb-operator" uninstalled
Verify that the Pod is removed:
kubectl get pods -n documentdb-preview-ns
Delete the operator namespace and the CRDs:
kubectl delete namespace documentdb-operator
kubectl delete crd backups.postgresql.cnpg.io \
  clusterimagecatalogs.postgresql.cnpg.io \
  clusters.postgresql.cnpg.io \
  databases.postgresql.cnpg.io \
  imagecatalogs.postgresql.cnpg.io \
  poolers.postgresql.cnpg.io \
  publications.postgresql.cnpg.io \
  scheduledbackups.postgresql.cnpg.io \
  subscriptions.postgresql.cnpg.io \
  documentdbs.db.microsoft.com