Easy start for AI teams

Stop juggling databases for AI

Documents, vectors, and search in one Mongo API-compatible system.

Run locally with Docker, use familiar Mongo drivers and tools, and keep documents plus embeddings together — no extra sync layer needed.

Start locally in minutesMongo API-compatibleNative vector searchMIT licensed
Quick startVector search in minutes

Start locally with one Docker command

Spin up DocumentDB with Docker, then follow the four-step quick start below to connect with mongosh, create a vector index, and run your first vector search.

Docker
docker run -dt --name documentdb \
  -p 10260:10260 \
  ghcr.io/documentdb/documentdb/documentdb-local:latest \
  --username admin --password password

API

Mongo API-compatible

Use familiar Mongo drivers and tools.

Indexing

HNSW + IVFFlat

Choose the right vector algorithm for recall, scale, and cost.

Retrieval

Vector search

Filter by metadata and return similarity scores.

Why teams choose it

A simpler stack for AI

Compare a typical AI setup against what DocumentDB consolidates into one system.

Before

Typical AI stack

  • App database for documents and metadata
  • Separate vector database for embeddings
  • Extra data sync and failure modes between systems
  • More moving parts, more cost, more things to break
After

DocumentDB stack

  • BSON documents and embeddings stored together
  • Native vector search with filtering in one database
  • Geospatial, graph, and 40+ aggregation stages in one system
  • Simpler architecture with PostgreSQL-backed operations

Easy start

Go from zero to vector search in four steps

Copy, paste, and run. Each step works with mongosh, pymongo, or any standard Mongo client.

01

Docker

Run locally with Docker

One command to start DocumentDB on port 10260 with local credentials.

Docker
docker run -dt --name documentdb \
  -p 10260:10260 \
  ghcr.io/documentdb/documentdb/documentdb-local:latest \
  --username admin --password password
02

Shell

Connect with mongosh

Connect with mongosh, pymongo, the Node.js driver, or another standard Mongo client.

Shell
mongosh "mongodb://admin:password@localhost:10260/?tls=true&tlsAllowInvalidCertificates=true"
03

Index

Create a vector index

Create a native HNSW vector index with cosine similarity for embedding retrieval.

Index
db.runCommand({
  createIndexes: "documents",
  indexes: [{
    key: { embedding: "cosmosSearch" },
    name: "vector_idx",
    cosmosSearchOptions: {
      kind: "vector-hnsw",
      similarity: "COS",
      dimensions: 1536
    }
  }]
});
04

Vector search

Query by meaning

Retrieve semantically similar results with vector search and optional metadata filters.

Vector search
db.documents.aggregate([
  {
    $vectorSearch: {
      queryVector: [0.12, -0.04, 0.31, /* ... 1536 dims */],
      path: "embedding",
      limit: 10,
      numCandidates: 100,
      filter: { category: "technical" }
    }
  }
]);

Use cases

Built for the AI workloads that matter

Vector retrieval, document storage, and query power in one system — ready for the patterns teams are actually shipping.

RAG

RAG and grounded chat

Store chunks, metadata, and embeddings together so retrieval stays close to the source document.

  • Pre-filter by category, source, or recency before retrieval
  • Return source text alongside similarity scores
Search

Semantic search

Search products, knowledge bases, or support histories by meaning instead of exact keyword match.

  • Choose cosine, L2, or inner product for your domain
  • Mix vector results with metadata filters in one query
Agents

Agent memory

Store conversation state as BSON documents and retrieve past context with vector similarity.

  • Persist tool outputs, plans, and dialogue turns naturally
  • Recall relevant history without scanning entire threads
Graph + AI

Knowledge navigation

Find the best vector matches, then traverse related entities with `$graphLookup` in one query pipeline.

  • Recursive traversal with depth control
  • Useful when retrieval also needs relationship traversal

Open community

Built in the open, backed by a real ecosystem

MIT-licensed, publicly governed, with TSC members from five organizations.

3.2k+

GitHub stars

200+

Forks

11

TSC members

across 5 organizations

Microsoft
Amazon
Rippling
YugabyteDB
AB InBev

Ready to try it?

Go from zero to vector search in four commands. Open source, self-hosted, MIT-licensed.