What Is a Database Index?

An index is a separate data structure — typically a B-tree — that the database maintains alongside your table. Think of it like a book's index: instead of scanning every page (full table scan), the database jumps directly to the relevant rows.

Without an index on a million-row table, a simple lookup can take seconds. With one, the same query completes in milliseconds.

How a Full Table Scan Works

-- No index on email — PostgreSQL reads every row
EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'alice@example.com';
-- Seq Scan on users  (cost=0.00..18340.00 rows=1 width=120)
--                    Actual time=45.231..45.232 rows=1 loops=1

Now add an index:

CREATE INDEX idx_users_email ON users(email);

EXPLAIN ANALYZE
SELECT * FROM users WHERE email = 'alice@example.com';
-- Index Scan using idx_users_email on users
--   Actual time=0.041..0.043 rows=1 loops=1

From 45 ms to 0.04 ms — a 1000× improvement.

Types of Indexes

Single-Column Index

The most common type. Use it when you filter or sort by a single column frequently:

CREATE INDEX idx_orders_user_id   ON orders(user_id);
CREATE INDEX idx_posts_created_at ON posts(created_at DESC);

Composite (Multi-Column) Index

When queries filter on multiple columns together, a composite index is more efficient than two separate ones:

-- Query pattern: WHERE status = 'active' AND created_at > '2025-01-01'
CREATE INDEX idx_orders_status_created ON orders(status, created_at);

Column order matters: put the highest-cardinality column (most distinct values) or the most frequently filtered column first.

Partial Index

Index only the rows you actually query. Saves space and improves write performance:

-- Only index published posts — drafts are never searched by users
CREATE INDEX idx_posts_published ON posts(created_at)
WHERE published = true;

Unique Index

Enforces uniqueness and speeds up lookups simultaneously:

CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

Full-Text Index

For keyword search across text columns:

CREATE INDEX idx_posts_fts ON posts
USING GIN (to_tsvector('english', title || ' ' || body));

-- Query
SELECT title FROM posts
WHERE to_tsvector('english', title || ' ' || body)
      @@ to_tsquery('english', 'postgresql & performance');

Finding Missing Indexes

Use EXPLAIN ANALYZE to spot sequential scans on large tables:

EXPLAIN ANALYZE
SELECT * FROM orders
WHERE customer_id = 101 AND status = 'pending'
ORDER BY created_at DESC;

Look for Seq Scan with high rows estimates — that's your signal to add an index.

PostgreSQL also tracks unused indexes:

SELECT relname AS table, indexrelname AS index, idx_scan
FROM   pg_stat_user_indexes
WHERE  idx_scan = 0
ORDER  BY relname;

Zero scans means the index is never used — consider dropping it.

The Cost of Over-Indexing

Every index you add has a write-time cost. Inserts, updates, and deletes must update all indexes on the affected table. A table with 15 indexes can be significantly slower to write than one with 3.

Rules of thumb:

Index columns used in WHERE, JOIN ON, and ORDER BY clauses
Don't index low-cardinality columns (e.g., a boolean active flag on its own)
Drop indexes that are never used (idx_scan = 0)
Monitor index bloat and run REINDEX if needed

Index Maintenance

-- Rebuild a bloated index without locking the table
REINDEX INDEX CONCURRENTLY idx_orders_user_id;

-- View index sizes
SELECT indexname, pg_size_pretty(pg_relation_size(indexname::regclass)) AS size
FROM   pg_indexes
WHERE  tablename = 'orders'
ORDER  BY pg_relation_size(indexname::regclass) DESC;

Indexing in MongoDB

If you're using MongoDB (also available on PandaStack), the syntax is different but the concepts are the same:

# Single-field index
db.users.createIndex({ email: 1 })

# Compound index
db.orders.createIndex({ status: 1, createdAt: -1 })

# Partial index
db.posts.createIndex(
  { createdAt: -1 },
  { partialFilterExpression: { published: true } }
)

# Explain a query
db.users.find({ email: "alice@example.com" }).explain("executionStats")

Managed Databases on PandaStack

[PandaStack](https://dashboard.pandastack.io) provides managed PostgreSQL, MySQL, Redis, and MongoDB — all with automated backups and connection pooling built in. You focus on schema design and indexes; PandaStack handles the rest.

Docs: [docs.pandastack.io](https://docs.pandastack.io).

Quick Reference

Index Type	Use Case
Single-column	One-column `WHERE` / `ORDER BY`
Composite	Multi-column filter patterns
Partial	Subset of rows (e.g., active records)
Unique	Enforce uniqueness + fast lookup
Full-text (GIN)	Keyword / document search

Index strategically, measure with EXPLAIN ANALYZE, and prune what you don't use.

Database Indexing Explained: Speed Up Your Queries