# How to Use pgvector on a Managed Postgres for Embeddings
The most underrated way to ship semantic search is to *not* add a vector database. If your app already has a Postgres, the pgvector extension turns it into a capable vector store — with transactional consistency, SQL filtering, and managed backups for free. This guide shows how to use pgvector on a managed Postgres for embeddings, end to end.
Why pgvector instead of a dedicated vector DB
For a huge share of real applications, a separate vector database is over-engineering:
- One fewer service to deploy, secure, monitor, and back up.
- Transactional consistency — your embeddings and the rows they describe live in the same database and update atomically.
- SQL filtering — combine vector similarity with
WHEREclauses on your existing columns in one query. - Backups included — your vectors are covered by the same managed backups as the rest of your data.
You outgrow pgvector at very large scale (many millions of vectors with demanding latency), but most projects never get there.
Enable the extension
On a managed Postgres that supports it:
CREATE EXTENSION IF NOT EXISTS vector;Then define a table with a vector column whose dimension matches your embedding model (1536 for many OpenAI models, 768 for many open models):
CREATE TABLE documents (
id BIGSERIAL PRIMARY KEY,
content TEXT NOT NULL,
metadata JSONB DEFAULT '{}',
embedding VECTOR(1536)
);The dimension is fixed per column — changing embedding models means a new column or table.
Insert embeddings
Compute embeddings in your app and insert them like any other value:
import os, psycopg
from openai import OpenAI
client = OpenAI()
conn = psycopg.connect(os.environ["DATABASE_URL"])
def embed(text: str):
return client.embeddings.create(model="text-embedding-3-small", input=text).data[0].embedding
with conn.cursor() as cur:
vec = embed("PandaStack auto-wires DATABASE_URL into your app")
cur.execute(
"INSERT INTO documents (content, embedding) VALUES (%s, %s)",
("...", vec),
)
conn.commit()Query by similarity
pgvector adds distance operators: <-> (L2), <=> (cosine), <#> (inner product). Use the one matching your model — cosine for most modern embeddings:
SELECT id, content, 1 - (embedding <=> %s) AS similarity
FROM documents
ORDER BY embedding <=> %s
LIMIT 5;The killer feature is combining this with normal SQL:
SELECT id, content
FROM documents
WHERE metadata->>'lang' = 'en'
AND created_at > now() - interval '30 days'
ORDER BY embedding <=> %s
LIMIT 5;No dedicated vector DB gives you ad-hoc relational filtering this naturally.
Indexing: HNSW vs. IVFFlat
Without an index, queries do an exact (and slow) scan. pgvector offers two approximate index types:
| Index | Build speed | Query speed | Recall | Notes |
|---|---|---|---|---|
| HNSW | Slower to build | Fast | High | Best default for most workloads |
| IVFFlat | Fast to build | Fast | Good (needs tuning) | Requires choosing lists; train after data loaded |
For most apps, HNSW is the right choice:
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);Tune query-time recall vs. speed with hnsw.ef_search:
SET hnsw.ef_search = 100; -- higher = better recall, slowerIf you choose IVFFlat, pick lists roughly proportional to row count and remember the index must be built *after* you've loaded representative data:
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);Deploy on PandaStack
- 1Provision a managed PostgreSQL (14.x or 16.x) in the [dashboard](https://dashboard.pandastack.io) and enable the
vectorextension. - 2Deploy your app as a container app in the same project. PandaStack auto-wires
DATABASE_URLinto the app's environment, so thepsycopg.connect(os.environ["DATABASE_URL"])above just works — no connection string copy-paste. - 3Set your embeddings provider key (
OPENAI_API_KEY, etc.) as an env var.
That's it — semantic search with zero new infrastructure beyond the database you already wanted.
Free tier vs. paid
| Free tier | Pro / Premium | |
|---|---|---|
| DB storage | Small (dev/hobby) | Larger |
| Connections | 50 | 300 / 1000 |
| Good for | A few thousand chunks | Larger corpora, more concurrency |
Free-tier databases are dev/hobby sized — perfect for prototyping a RAG feature. As your corpus and traffic grow, move to Pro or Premium for more storage and connections.
Operational tips
- Match dimension to model exactly, or inserts fail.
- Index after bulk load for IVFFlat; HNSW can be built anytime but building on a large table takes a while.
- Connection pooling: embeddings-heavy apps open many short connections — use a pooler and mind the connection limit on your plan.
- Re-embedding: changing models means re-embedding into a new column; do it in a background job (a cronjob works well) and swap atomically.
References
- [pgvector](https://github.com/pgvector/pgvector)
- [pgvector: HNSW indexing](https://github.com/pgvector/pgvector#hnsw)
- [PostgreSQL documentation](https://www.postgresql.org/docs/)
- [OpenAI: Embeddings guide](https://platform.openai.com/docs/guides/embeddings)
pgvector is the pragmatic default for embeddings: one database, transactional consistency, SQL filters, and backups included. PandaStack's managed Postgres supports it and auto-wires DATABASE_URL into your app, so you can ship semantic search on the free tier. Start at [dashboard.pandastack.io](https://dashboard.pandastack.io).