# How to Deploy a Qdrant Vector Database

Qdrant is a fast, Rust-based vector database built for similarity search at scale. If your RAG or recommendation system has outgrown ad-hoc vector storage, a dedicated store like Qdrant pays off in performance and features (filtering, payloads, quantization). This guide walks through self-hosting Qdrant in production and covers the tuning that actually matters.

## Persistent storage is non-negotiable

Qdrant keeps your vectors and the search index on disk at /qdrant/storage. In a container, that path is ephemeral by default — lose it and you lose every embedding and have to re-index from scratch. The first rule of deploying Qdrant: attach persistent storage to that directory. Without it, a redeploy is a data-loss event.

## Secure it with an API key

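An open Qdrant instance lets anyone read, write, or delete your collections. Always set an API key. Qdrant reads it from service.api_key, and the QDRANT__SERVICE__API_KEY environment variable overrides that setting, which keeps the secret out of your repo:

```yaml
# config.yaml
# The API key is deliberately not set here: Qdrant does not expand ${VAR}
# placeholders in this file. Set the QDRANT__SERVICE__API_KEY environment
# variable at runtime instead; it maps to service.api_key.
storage:
  storage_path: /qdrant/storage
```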

Clients then send the key in an `api-key` header on every request. Combined with HTTPS from your platform, that's a reasonable baseline; for stricter setups Qdrant also supports read-only keys and TLS.
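To make the header concrete, here is a minimal sketch that builds an authenticated REST request with only the standard library. The helper name `qdrant_request` and the example URL are illustrative assumptions, not part of Qdrant or its client.

```python
import urllib.request


def qdrant_request(base_url: str, path: str, api_key: str) -> urllib.request.Request:
    """Build a GET request that carries the key in the `api-key` header."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        headers={"api-key": api_key},
        method="GET",
    )


# Usage (needs a reachable instance, so it is left commented out):
# with urllib.request.urlopen(qdrant_request("https://<app>", "/collections", key)) as resp:
#     print(resp.read().decode())
```

A request without the header, or with the wrong key, should be rejected by the server, which makes it a quick way to confirm the key is actually enforced after a deploy.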

## Containerize

Use the official image and point it at your config:

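```dockerfile
# Consider pinning a specific version tag in production instead of latest.
FROM qdrant/qdrant:latest
COPY config.yaml /qdrant/config/production.yaml
EXPOSE 6333 6334
# Do not bake the key into the image with ENV: ${QDRANT_API_KEY} would be
# resolved at build time. Set QDRANT__SERVICE__API_KEY at runtime instead.
```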

Qdrant exposes 6333 for the HTTP/REST API and 6334 for gRPC (faster for bulk operations).
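For bulk loads over gRPC, a common pattern is to send points in fixed-size batches. The sketch below assumes `client` is a `qdrant_client.QdrantClient` created with `prefer_grpc=True` (which targets port 6334 by default) and that your platform exposes that port. The helper names and the batch size of 256 are illustrative assumptions to tune for your data.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List


def batched(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


def bulk_upsert(client: Any, collection: str, points: Iterable[Any],
                batch_size: int = 256) -> int:
    """Upsert points batch by batch and return how many were sent.

    `client` is assumed to be a QdrantClient built with prefer_grpc=True;
    any object with a compatible upsert() method works.
    """
    sent = 0
    for chunk in batched(points, batch_size):
        client.upsert(collection_name=collection, points=chunk)
        sent += len(chunk)
    return sent
```

Batching keeps each request small enough to retry cheaply and avoids building one very large message for the whole dataset.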

## Deploy on PandaStack

1. Push a repo with the Dockerfile and config to GitHub.
2. Create a container app in the [dashboard](https://dashboard.pandastack.io) connected to the repo. It builds via rootless BuildKit and serves an HTTPS URL with automatic SSL.
3. Set QDRANT__SERVICE__API_KEY, the variable Qdrant reads for its API key, as an encrypted env var.
4. Attach persistent storage to /qdrant/storage so your index survives redeploys.
5. Pick a memory-aware tier: vector search performance is bound by how much of the index fits in RAM. An m1/m2 memory-optimized tier is the right family as your collection grows.

## Sizing intuition

Memory usage scales with vector count and dimensionality. A rough mental model:

| Collection size | Tier guidance |
| --- | --- |
| Tens of thousands of vectors | Small/compute tier is fine |
| Hundreds of thousands+ | Memory-optimized (m1/m2) |
| Millions | Enable quantization, more RAM, consider sharding |
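To put rough numbers on the guidance above, the sketch below applies the rule of thumb from Qdrant's capacity-planning guidance: vectors times dimensions times 4 bytes (float32), times about 1.5 for index and metadata overhead. The function names are illustrative, and payload size is not included, so treat the result as a floor rather than a precise figure.

```python
def estimate_ram_bytes(num_vectors: int, dimensions: int, overhead: float = 1.5) -> int:
    """Rough RAM needed to hold float32 vectors plus index overhead."""
    bytes_per_float32 = 4
    return int(num_vectors * dimensions * bytes_per_float32 * overhead)


def to_gib(num_bytes: int) -> float:
    """Convert bytes to gibibytes."""
    return num_bytes / 1024 ** 3


# One million 1536-dimensional vectors:
# estimate_ram_bytes(1_000_000, 1536) -> 9_216_000_000 bytes, about 8.58 GiB
```

At 100,000 vectors of the same dimensionality the estimate drops to under 1 GiB, which is why the small tiers in the table stop being enough somewhere in the hundreds of thousands.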

## Create a collection and insert vectors

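```python
import os

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance, FieldCondition, Filter, MatchValue, PointStruct, VectorParams,
)

# Read the key from the environment rather than hard-coding it.
client = QdrantClient(url="https://<app>", api_key=os.environ["QDRANT_API_KEY"])

# Create the collection only if it is missing; recreate_collection would
# drop any existing data on every run.
if not client.collection_exists("docs"):
    client.create_collection(
        collection_name="docs",
        vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
    )

# embedding and query_embedding are 1536-float lists from your embedding model.
client.upsert("docs", points=[
    PointStruct(id=1, vector=embedding,
                payload={"title": "Intro", "url": "/intro", "lang": "en"}),
])

# query_points replaces the deprecated search method in recent client versions.
hits = client.query_points(
    "docs", query=query_embedding, limit=5,
    query_filter=Filter(must=[FieldCondition(key="lang", match=MatchValue(value="en"))]),
).points
```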

The distance metric must match how your embeddings were trained — cosine for most modern embedding models. Getting this wrong silently degrades results.
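A small illustration of why the metric matters: with vectors that are not normalized, dot product and cosine similarity can rank the same candidates in opposite order. The vectors below are made-up toy values chosen only to show the effect.

```python
import math
from typing import Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product; sensitive to vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the two vectors after normalizing by their lengths."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


query = [1.0, 0.0]
aligned_short = [0.9, 0.1]  # nearly the same direction, small norm
off_axis_long = [3.0, 3.0]  # 45 degrees away, large norm

# Dot product favors the long vector, cosine favors the aligned one.
dot_prefers_long = dot(query, off_axis_long) > dot(query, aligned_short)
cosine_prefers_aligned = (
    cosine_similarity(query, aligned_short) > cosine_similarity(query, off_axis_long)
)
```

If your model emits unit-length vectors, the two metrics agree on ranking; the mismatch only bites when magnitudes vary.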

## Tuning that matters

- Quantization: scalar (int8) quantization stores each dimension in 1 byte instead of 4, and binary quantization compresses further, both at a small accuracy cost. It is essential for large collections on bounded RAM.
- HNSW parameters: m and ef_construct trade index build time and memory for recall. Defaults are sane; raise them only if recall is too low.
- Payload indexing: if you filter on metadata fields, create payload indexes so filters are fast.
- gRPC for bulk loads: use the gRPC port for large upserts; it's noticeably faster than REST.
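The first three knobs map onto fields in Qdrant's REST API. The sketch below only builds the JSON bodies, assuming they are sent with `PUT /collections/{name}` and `PUT /collections/{name}/index` respectively. The helper names are illustrative; `m=16` and `ef_construct=100` reflect the documented defaults as I understand them, and the `quantile` value is an example setting, not a recommendation.

```python
import json
from typing import Any, Dict


def collection_body(size: int, m: int = 16, ef_construct: int = 100) -> Dict[str, Any]:
    """Body for creating a cosine collection with int8 scalar quantization."""
    return {
        "vectors": {"size": size, "distance": "Cosine"},
        # Higher m / ef_construct: better recall, slower builds, more memory.
        "hnsw_config": {"m": m, "ef_construct": ef_construct},
        # Keep the quantized vectors in RAM for fast search.
        "quantization_config": {
            "scalar": {"type": "int8", "quantile": 0.99, "always_ram": True}
        },
    }


def payload_index_body(field_name: str, field_schema: str = "keyword") -> Dict[str, Any]:
    """Body for indexing a payload field that filters will match on."""
    return {"field_name": field_name, "field_schema": field_schema}


# Serialized form, ready to send as the request body:
docs_json = json.dumps(collection_body(1536))
```

A keyword index on a field such as `lang` suits exact-match filters; other field types use other schemas, so check the payload indexing docs before copying this for numeric or text fields.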

## Qdrant vs. pgvector: when to use which

| | pgvector on managed Postgres | Self-hosted Qdrant |
| --- | --- | --- |
| Ops overhead | None (it's your existing DB) | A service to run |
| Best for | Small/medium corpora, transactional needs | Large-scale, filter-heavy search |
| Features | Solid ANN + SQL | Advanced quantization, payloads, sharding |
| Backups | Managed DB backups | Your responsibility (snapshot storage) |

If you're just adding RAG to an app that already has a managed Postgres, start with pgvector. Reach for Qdrant when scale, advanced filtering, or quantization become real requirements.

## Operational notes

- Backups: Qdrant supports snapshots. Schedule them and copy each snapshot off the instance to durable storage.
- Cold starts: don't run a primary vector DB on free-tier scale-to-zero; an index reload on cold start is slow and you want it always available. Use a paid tier.
- Health: Qdrant exposes a health endpoint (/healthz); tail PandaStack's live logs to catch OOM kills, the classic sign you need a bigger memory tier or quantization.
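As a starting point for a scheduled backup job, this sketch builds the two REST calls involved: `POST /collections/{name}/snapshots` to create a snapshot, and a download URL for the file the response names. The helper names are illustrative assumptions; scheduling and the upload to durable storage are left to whatever tooling you already use.

```python
import json
import urllib.request
from urllib.parse import quote


def snapshot_create_request(base_url: str, collection: str, api_key: str) -> urllib.request.Request:
    """Build the POST request that asks Qdrant to create a collection snapshot."""
    url = f"{base_url.rstrip('/')}/collections/{quote(collection)}/snapshots"
    return urllib.request.Request(url=url, headers={"api-key": api_key}, method="POST")


def snapshot_name_from_response(body: bytes) -> str:
    """Extract the snapshot file name from the creation response."""
    return json.loads(body)["result"]["name"]


def snapshot_download_url(base_url: str, collection: str, snapshot_name: str) -> str:
    """URL to fetch a snapshot so it can be copied off the instance."""
    return (
        f"{base_url.rstrip('/')}/collections/{quote(collection)}"
        f"/snapshots/{quote(snapshot_name)}"
    )
```

The download request needs the same `api-key` header as the creation call. Restoring a snapshot into a fresh instance from time to time is the only way to know the backups are usable.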

## References

- [Qdrant documentation](https://qdrant.tech/documentation/)
- [Qdrant: Security](https://qdrant.tech/documentation/guides/security/)
- [Qdrant: Quantization](https://qdrant.tech/documentation/guides/quantization/)
- [Qdrant: Snapshots](https://qdrant.tech/documentation/concepts/snapshots/)

Self-hosted Qdrant gives you serious vector search once you nail persistence, security, and memory sizing. PandaStack provides HTTPS, encrypted keys, persistent storage, and memory-optimized tiers — and a managed pgvector Postgres if you decide you don't need a dedicated DB yet. Start at [dashboard.pandastack.io](https://dashboard.pandastack.io).
