What Is a Database Index?
An index is a separate data structure — typically a B-tree — that the database maintains alongside your table. Think of it like a book's index: instead of scanning every page (full table scan), the database jumps directly to the relevant rows.
Without an index on a million-row table, a simple lookup can take seconds. With one, the same query completes in milliseconds.
How a Full Table Scan Works
-- No index on email — PostgreSQL reads every row
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'alice@example.com';
-- Seq Scan on users (cost=0.00..18340.00 rows=1 width=120)
-- Actual time=45.231..45.232 rows=1 loops=1Now add an index:
CREATE INDEX idx_users_email ON users(email);
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'alice@example.com';
-- Index Scan using idx_users_email on users
-- Actual time=0.041..0.043 rows=1 loops=1From 45 ms to 0.04 ms — a 1000× improvement.
Types of Indexes
Single-Column Index
The most common type. Use it when you filter or sort by a single column frequently:
CREATE INDEX idx_orders_user_id ON orders(user_id);
CREATE INDEX idx_posts_created_at ON posts(created_at DESC);Composite (Multi-Column) Index
When queries filter on multiple columns together, a composite index is more efficient than two separate ones:
-- Query pattern: WHERE status = 'active' AND created_at > '2025-01-01'
CREATE INDEX idx_orders_status_created ON orders(status, created_at);Column order matters: put the highest-cardinality column (most distinct values) or the most frequently filtered column first.
Partial Index
Index only the rows you actually query. Saves space and improves write performance:
-- Only index published posts — drafts are never searched by users
CREATE INDEX idx_posts_published ON posts(created_at)
WHERE published = true;Unique Index
Enforces uniqueness and speeds up lookups simultaneously:
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);Full-Text Index
For keyword search across text columns:
CREATE INDEX idx_posts_fts ON posts
USING GIN (to_tsvector('english', title || ' ' || body));
-- Query
SELECT title FROM posts
WHERE to_tsvector('english', title || ' ' || body)
@@ to_tsquery('english', 'postgresql & performance');Finding Missing Indexes
Use EXPLAIN ANALYZE to spot sequential scans on large tables:
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 101 AND status = 'pending'
ORDER BY created_at DESC;Look for Seq Scan with high rows estimates — that's your signal to add an index.
PostgreSQL also tracks unused indexes:
SELECT relname AS table, indexrelname AS index, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY relname;Zero scans means the index is never used — consider dropping it.
The Cost of Over-Indexing
Every index you add has a write-time cost. Inserts, updates, and deletes must update all indexes on the affected table. A table with 15 indexes can be significantly slower to write than one with 3.
Rules of thumb:
- Index columns used in
WHERE,JOIN ON, andORDER BYclauses - Don't index low-cardinality columns (e.g., a boolean
activeflag on its own) - Drop indexes that are never used (
idx_scan = 0) - Monitor index bloat and run
REINDEXif needed
Index Maintenance
-- Rebuild a bloated index without locking the table
REINDEX INDEX CONCURRENTLY idx_orders_user_id;
-- View index sizes
SELECT indexname, pg_size_pretty(pg_relation_size(indexname::regclass)) AS size
FROM pg_indexes
WHERE tablename = 'orders'
ORDER BY pg_relation_size(indexname::regclass) DESC;Indexing in MongoDB
If you're using MongoDB (also available on PandaStack), the syntax is different but the concepts are the same:
# Single-field index
db.users.createIndex({ email: 1 })
# Compound index
db.orders.createIndex({ status: 1, createdAt: -1 })
# Partial index
db.posts.createIndex(
{ createdAt: -1 },
{ partialFilterExpression: { published: true } }
)
# Explain a query
db.users.find({ email: "alice@example.com" }).explain("executionStats")Managed Databases on PandaStack
[PandaStack](https://dashboard.pandastack.io) provides managed PostgreSQL, MySQL, Redis, and MongoDB — all with automated backups and connection pooling built in. You focus on schema design and indexes; PandaStack handles the rest.
Docs: [docs.pandastack.io](https://docs.pandastack.io).
Quick Reference
| Index Type | Use Case |
|---|---|
| Single-column | One-column WHERE / ORDER BY |
| Composite | Multi-column filter patterns |
| Partial | Subset of rows (e.g., active records) |
| Unique | Enforce uniqueness + fast lookup |
| Full-text (GIN) | Keyword / document search |
Index strategically, measure with EXPLAIN ANALYZE, and prune what you don't use.