Back to Blog
Tutorial12 min read2026-06-30

How to Use pgvector on a Managed Postgres for Embeddings

Add vector search to your app without a new service: enable pgvector on managed Postgres, choose HNSW vs IVFFlat indexes, and run fast similarity queries alongside your relational data.

Ajay Kumar
Ajay Kumar
Founder & DevOps, PandaStack

# How to Use pgvector on a Managed Postgres for Embeddings

The most underrated way to ship semantic search is to *not* add a vector database. If your app already has a Postgres, the pgvector extension turns it into a capable vector store — with transactional consistency, SQL filtering, and managed backups for free. This guide shows how to use pgvector on a managed Postgres for embeddings, end to end.

Why pgvector instead of a dedicated vector DB

For a huge share of real applications, a separate vector database is over-engineering:

  • One fewer service to deploy, secure, monitor, and back up.
  • Transactional consistency — your embeddings and the rows they describe live in the same database and update atomically.
  • SQL filtering — combine vector similarity with WHERE clauses on your existing columns in one query.
  • Backups included — your vectors are covered by the same managed backups as the rest of your data.

You outgrow pgvector at very large scale (many millions of vectors with demanding latency), but most projects never get there.

Enable the extension

On a managed Postgres that supports it:

CREATE EXTENSION IF NOT EXISTS vector;

Then define a table with a vector column whose dimension matches your embedding model (1536 for many OpenAI models, 768 for many open models):

CREATE TABLE documents (
  id          BIGSERIAL PRIMARY KEY,
  content     TEXT NOT NULL,
  metadata    JSONB DEFAULT '{}',
  embedding   VECTOR(1536)
);

The dimension is fixed per column — changing embedding models means a new column or table.

Insert embeddings

Compute embeddings in your app and insert them like any other value:

import os, psycopg
from openai import OpenAI

client = OpenAI()
conn = psycopg.connect(os.environ["DATABASE_URL"])

def embed(text: str):
    return client.embeddings.create(model="text-embedding-3-small", input=text).data[0].embedding

with conn.cursor() as cur:
    vec = embed("PandaStack auto-wires DATABASE_URL into your app")
    cur.execute(
        "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
        ("...", vec),
    )
conn.commit()

Query by similarity

pgvector adds distance operators: <-> (L2), <=> (cosine), <#> (inner product). Use the one matching your model — cosine for most modern embeddings:

SELECT id, content, 1 - (embedding <=> %s) AS similarity
FROM documents
ORDER BY embedding <=> %s
LIMIT 5;

The killer feature is combining this with normal SQL:

SELECT id, content
FROM documents
WHERE metadata->>'lang' = 'en'
  AND created_at > now() - interval '30 days'
ORDER BY embedding <=> %s
LIMIT 5;

No dedicated vector DB gives you ad-hoc relational filtering this naturally.

Indexing: HNSW vs. IVFFlat

Without an index, queries do an exact (and slow) scan. pgvector offers two approximate index types:

IndexBuild speedQuery speedRecallNotes
HNSWSlower to buildFastHighBest default for most workloads
IVFFlatFast to buildFastGood (needs tuning)Requires choosing lists; train after data loaded

For most apps, HNSW is the right choice:

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

Tune query-time recall vs. speed with hnsw.ef_search:

SET hnsw.ef_search = 100;  -- higher = better recall, slower

If you choose IVFFlat, pick lists roughly proportional to row count and remember the index must be built *after* you've loaded representative data:

CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

Deploy on PandaStack

  1. 1Provision a managed PostgreSQL (14.x or 16.x) in the [dashboard](https://dashboard.pandastack.io) and enable the vector extension.
  2. 2Deploy your app as a container app in the same project. PandaStack auto-wires DATABASE_URL into the app's environment, so the psycopg.connect(os.environ["DATABASE_URL"]) above just works — no connection string copy-paste.
  3. 3Set your embeddings provider key (OPENAI_API_KEY, etc.) as an env var.

That's it — semantic search with zero new infrastructure beyond the database you already wanted.

Free tier vs. paid

Free tierPro / Premium
DB storageSmall (dev/hobby)Larger
Connections50300 / 1000
Good forA few thousand chunksLarger corpora, more concurrency

Free-tier databases are dev/hobby sized — perfect for prototyping a RAG feature. As your corpus and traffic grow, move to Pro or Premium for more storage and connections.

Operational tips

  • Match dimension to model exactly, or inserts fail.
  • Index after bulk load for IVFFlat; HNSW can be built anytime but building on a large table takes a while.
  • Connection pooling: embeddings-heavy apps open many short connections — use a pooler and mind the connection limit on your plan.
  • Re-embedding: changing models means re-embedding into a new column; do it in a background job (a cronjob works well) and swap atomically.

References

  • [pgvector](https://github.com/pgvector/pgvector)
  • [pgvector: HNSW indexing](https://github.com/pgvector/pgvector#hnsw)
  • [PostgreSQL documentation](https://www.postgresql.org/docs/)
  • [OpenAI: Embeddings guide](https://platform.openai.com/docs/guides/embeddings)

pgvector is the pragmatic default for embeddings: one database, transactional consistency, SQL filters, and backups included. PandaStack's managed Postgres supports it and auto-wires DATABASE_URL into your app, so you can ship semantic search on the free tier. Start at [dashboard.pandastack.io](https://dashboard.pandastack.io).

Ready to deploy?

Start free on PandaStack.

Start free on PandaStack

More in Tutorial

Browse all Tutorial articles →

See also