# How to Use pgvector on a Managed Postgres for Embeddings

The most underrated way to ship semantic search is to *not* add a vector database. If your app already has a Postgres, the pgvector extension turns it into a capable vector store — with transactional consistency, SQL filtering, and managed backups for free. This guide shows how to use pgvector on a managed Postgres for embeddings, end to end.

Why pgvector instead of a dedicated vector DB

For a huge share of real applications, a separate vector database is over-engineering:

One fewer service to deploy, secure, monitor, and back up.
Transactional consistency — your embeddings and the rows they describe live in the same database and update atomically.
SQL filtering — combine vector similarity with WHERE clauses on your existing columns in one query.
Backups included — your vectors are covered by the same managed backups as the rest of your data.

You outgrow pgvector at very large scale (many millions of vectors with demanding latency), but most projects never get there.

Enable the extension

On a managed Postgres that supports it:

CREATE EXTENSION IF NOT EXISTS vector;

Then define a table with a vector column whose dimension matches your embedding model (1536 for many OpenAI models, 768 for many open models):

CREATE TABLE documents (
  id          BIGSERIAL PRIMARY KEY,
  content     TEXT NOT NULL,
  metadata    JSONB DEFAULT '{}',
  embedding   VECTOR(1536)
);

The dimension is fixed per column — changing embedding models means a new column or table.

Insert embeddings

Compute embeddings in your app and insert them like any other value:

import os, psycopg
from openai import OpenAI

client = OpenAI()
conn = psycopg.connect(os.environ["DATABASE_URL"])

def embed(text: str):
    return client.embeddings.create(model="text-embedding-3-small", input=text).data[0].embedding

with conn.cursor() as cur:
    vec = embed("PandaStack auto-wires DATABASE_URL into your app")
    cur.execute(
        "INSERT INTO documents (content, embedding) VALUES (%s, %s)",
        ("...", vec),
    )
conn.commit()

Query by similarity

pgvector adds distance operators: <-> (L2), <=> (cosine), <#> (inner product). Use the one matching your model — cosine for most modern embeddings:

SELECT id, content, 1 - (embedding <=> %s) AS similarity
FROM documents
ORDER BY embedding <=> %s
LIMIT 5;

The killer feature is combining this with normal SQL:

SELECT id, content
FROM documents
WHERE metadata->>'lang' = 'en'
  AND created_at > now() - interval '30 days'
ORDER BY embedding <=> %s
LIMIT 5;

No dedicated vector DB gives you ad-hoc relational filtering this naturally.

Indexing: HNSW vs. IVFFlat

Without an index, queries do an exact (and slow) scan. pgvector offers two approximate index types:

Index	Build speed	Query speed	Recall	Notes
HNSW	Slower to build	Fast	High	Best default for most workloads
IVFFlat	Fast to build	Fast	Good (needs tuning)	Requires choosing `lists`; train after data loaded

For most apps, HNSW is the right choice:

CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);

Tune query-time recall vs. speed with hnsw.ef_search:

SET hnsw.ef_search = 100;  -- higher = better recall, slower

If you choose IVFFlat, pick lists roughly proportional to row count and remember the index must be built *after* you've loaded representative data:

CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

Deploy on PandaStack

1Provision a managed PostgreSQL (14.x or 16.x) in the [dashboard](https://dashboard.pandastack.io) and enable the vector extension.
2Deploy your app as a container app in the same project. PandaStack auto-wires DATABASE_URL into the app's environment, so the psycopg.connect(os.environ["DATABASE_URL"]) above just works — no connection string copy-paste.
3Set your embeddings provider key (OPENAI_API_KEY, etc.) as an env var.

That's it — semantic search with zero new infrastructure beyond the database you already wanted.

Free tier vs. paid

	Free tier	Pro / Premium
DB storage	Small (dev/hobby)	Larger
Connections	50	300 / 1000
Good for	A few thousand chunks	Larger corpora, more concurrency

Free-tier databases are dev/hobby sized — perfect for prototyping a RAG feature. As your corpus and traffic grow, move to Pro or Premium for more storage and connections.

Operational tips

Match dimension to model exactly, or inserts fail.
Index after bulk load for IVFFlat; HNSW can be built anytime but building on a large table takes a while.
Connection pooling: embeddings-heavy apps open many short connections — use a pooler and mind the connection limit on your plan.
Re-embedding: changing models means re-embedding into a new column; do it in a background job (a cronjob works well) and swap atomically.

References

[pgvector](https://github.com/pgvector/pgvector)
[pgvector: HNSW indexing](https://github.com/pgvector/pgvector#hnsw)
[PostgreSQL documentation](https://www.postgresql.org/docs/)
[OpenAI: Embeddings guide](https://platform.openai.com/docs/guides/embeddings)

pgvector is the pragmatic default for embeddings: one database, transactional consistency, SQL filters, and backups included. PandaStack's managed Postgres supports it and auto-wires DATABASE_URL into your app, so you can ship semantic search on the free tier. Start at [dashboard.pandastack.io](https://dashboard.pandastack.io).

How to Use pgvector on a Managed Postgres for Embeddings

Why pgvector instead of a dedicated vector DB

Enable the extension

Insert embeddings

Query by similarity

Indexing: HNSW vs. IVFFlat

Deploy on PandaStack

Free tier vs. paid

Operational tips

References

Ready to deploy?

More in Tutorial

How to Deploy a Phoenix (Elixir) App to the Cloud

How to Deploy a Monorepo with Multiple Services

How to Deploy a Python RQ Background Worker

See also