FastAPI is fast to write and fast at runtime, but "works on my machine" and "runs in production" are two very different things. This guide walks through a production-grade deployment: the right ASGI server setup, a managed PostgreSQL database wired in safely, database migrations, and health checks.
Why FastAPI needs more than uvicorn main:app
The development command you see in every tutorial — uvicorn main:app --reload — is single-process, single-threaded, and reloads on file change. In production you want:
- Multiple workers to use all CPU cores.
- A process manager that restarts crashed workers.
- Graceful shutdown so in-flight requests finish during a deploy.
- No --reload (it watches the filesystem and leaks memory over time).
The canonical production setup is Gunicorn managing Uvicorn workers:
```shell
gunicorn app.main:app \
  --workers 4 \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000 \
  --timeout 60 \
  --graceful-timeout 30
```

A good rule of thumb for workers is (2 * CPU cores) + 1, but async, I/O-bound FastAPI apps often do well with just one worker per CPU core, since each worker handles many concurrent requests on its event loop. Measure under load before over-provisioning.
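The same flags and the worker heuristic can also live in a Gunicorn config file, which is plain Python. The following is a minimal sketch rather than a tuned configuration. The worker_count helper is a hypothetical name introduced here, and treating WEB_CONCURRENCY as an optional override is an assumption of this sketch; workers, worker_class, bind, timeout, and graceful_timeout are standard Gunicorn settings.

```python
# gunicorn.conf.py (sketch; load with: gunicorn -c gunicorn.conf.py app.main:app)
import os


def worker_count(cores: int, io_bound: bool = True) -> int:
    """Return a starting worker count for the given number of CPU cores.

    io_bound=True  -> one worker per core (async workers multiplex requests)
    io_bound=False -> the classic (2 * cores) + 1 rule of thumb
    """
    cores = max(1, cores)
    return cores if io_bound else (2 * cores) + 1


# os.cpu_count() can return None, so fall back to a single core.
_cores = os.cpu_count() or 1

# Assumption: WEB_CONCURRENCY, when set, overrides the computed default.
workers = int(os.environ.get("WEB_CONCURRENCY", worker_count(_cores)))
worker_class = "uvicorn.workers.UvicornWorker"
bind = "0.0.0.0:8000"
timeout = 60
graceful_timeout = 30
```

Note that os.cpu_count() reports the host's cores and does not account for container CPU limits, so on a small container it is usually safer to set the worker count explicitly.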
A production Dockerfile
Use a slim base image, install dependencies in a layer that caches well, and run as a non-root user.
```dockerfile
FROM python:3.12-slim

ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

WORKDIR /app

# Install dependencies first so this layer is cached until requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Run as a non-root user.
RUN useradd -m appuser
USER appuser

EXPOSE 8000

CMD ["gunicorn", "app.main:app", \
     "--workers", "4", \
     "--worker-class", "uvicorn.workers.UvicornWorker", \
     "--bind", "0.0.0.0:8000"]
```

Keep requirements.txt pinned (pip freeze or a lockfile via uv or pip-tools). Reproducible builds matter more than you think when something breaks at 2am.
Connecting a managed PostgreSQL database
Never ship a database inside your app container. Use a managed instance so backups, failover, and connection limits are handled for you. Read the connection string from an environment variable — never hardcode it.
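```python
import os

from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

# The injected URL is assumed to use the postgresql:// scheme; the async
# engine needs the asyncpg driver named explicitly.
DATABASE_URL = os.environ["DATABASE_URL"].replace(
    "postgresql://", "postgresql+asyncpg://", 1
)

engine = create_async_engine(
    DATABASE_URL,
    pool_size=5,         # connections kept open per worker process
    max_overflow=10,     # extra connections allowed under burst load
    pool_pre_ping=True,  # test connections before use and replace stale ones
)

SessionLocal = async_sessionmaker(engine, expire_on_commit=False)
```

Connection pooling matters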
Each Gunicorn worker has its own connection pool. With 4 workers and pool_size=5, that's up to 20 base connections, and up to 60 once max_overflow=10 is counted. Multiply that across replicas and you can exhaust your database's connection limit fast. Free-tier databases on PandaStack allow 50 connections; size your pool accordingly, or put PgBouncer in front for high-concurrency workloads.
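The arithmetic is easy to get wrong once replicas enter the picture, so it can help to write it down. This is a small sketch of the worst-case calculation; the function names and the reserved headroom of 5 connections are illustrative assumptions, not part of SQLAlchemy or PandaStack.

```python
def peak_connections(workers: int, pool_size: int, max_overflow: int, replicas: int = 1) -> int:
    """Worst case: every worker in every replica fills its pool and its overflow."""
    return replicas * workers * (pool_size + max_overflow)


def pool_budget_per_worker(db_limit: int, workers: int, replicas: int = 1, reserved: int = 5) -> int:
    """Connections each worker may use (pool_size + max_overflow) within db_limit.

    reserved keeps a few connections free for migrations and admin sessions.
    """
    return max(0, (db_limit - reserved) // (workers * replicas))


print(peak_connections(workers=4, pool_size=5, max_overflow=10))   # 4 * (5 + 10) = 60
print(pool_budget_per_worker(db_limit=50, workers=4))              # (50 - 5) // 4 = 11
print(pool_budget_per_worker(db_limit=50, workers=4, replicas=2))  # (50 - 5) // 8 = 5
```

With the settings used in this guide, a single 4-worker replica can already reach 60 connections at peak, which is above a 50-connection limit.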
Migrations with Alembic
Run migrations as a discrete step, not on app boot — concurrent workers racing to migrate is a classic foot-gun.
```shell
alembic upgrade head
```

A safe deploy order:
1. Deploy backward-compatible migrations (add columns, don't drop).
2. Deploy the new app code.
3. In a later release, clean up old columns once nothing references them.
This "expand/contract" pattern keeps deploys zero-downtime.
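To see why the expand step is safe, here is a self-contained sketch that plays out the first two steps. It uses Python's built-in sqlite3 in place of PostgreSQL and raw SQL in place of Alembic revisions purely so it runs anywhere; the users table, the email column, and both insert helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def old_code_insert(name):
    # Release N: only knows about the original columns.
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


def new_code_insert(name, email):
    # Release N+1: writes the new column as well.
    conn.execute("INSERT INTO users (name, email) VALUES (?, ?)", (name, email))


old_code_insert("ada")

# Step 1 (expand): add a nullable column. Old code keeps working unchanged.
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")
old_code_insert("grace")

# Step 2: deploy new code that uses the column.
new_code_insert("linus", "linus@example.com")

rows = conn.execute("SELECT name, email FROM users ORDER BY id").fetchall()
print(rows)
```

Rows written by the old code simply carry NULL in the new column, so both releases can run side by side during the rollout. Dropping or renaming a column in step 1 would break the old code instead, which is why the contract step waits for a later release.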
Health checks
Give your orchestrator a cheap endpoint to probe:
```python
from fastapi import FastAPI
from sqlalchemy import text

# SessionLocal is the async_sessionmaker created in the database setup above.

app = FastAPI()


@app.get("/healthz")
async def health():
    # Liveness: no dependencies, just proves the process is responding.
    return {"status": "ok"}


@app.get("/readyz")
async def ready():
    # Readiness: a trivial query confirms the database is reachable.
    async with SessionLocal() as session:
        await session.execute(text("SELECT 1"))
    return {"status": "ready"}
```

Keep /healthz dependency-free (liveness) and let /readyz check the database (readiness). A failing DB shouldn't make your orchestrator kill the pod; it should just stop routing traffic to it.
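If the query in /readyz fails, the exception surfaces as a generic 500, and a hung database can hold the probe open. One possible refinement, sketched here without any framework so it runs on its own, is to wrap the check in a timeout and map any failure to a 503. The readiness function, its check_db parameter, and the 2-second default are assumptions of this sketch.

```python
import asyncio


async def readiness(check_db, timeout_s: float = 2.0):
    """Return (status_code, body) for a readiness probe.

    check_db is any zero-argument coroutine function that raises on failure,
    for example one that runs SELECT 1 through the session factory.
    """
    try:
        await asyncio.wait_for(check_db(), timeout=timeout_s)
    except Exception as exc:  # includes timeouts and connection errors
        return 503, {"status": "unavailable", "reason": type(exc).__name__}
    return 200, {"status": "ready"}


async def _demo():
    async def db_ok():
        return None

    async def db_down():
        raise ConnectionError("database unreachable")

    async def db_slow():
        await asyncio.sleep(1)

    print(await readiness(db_ok))
    print(await readiness(db_down))
    print(await readiness(db_slow, timeout_s=0.05))


asyncio.run(_demo())
```

In a FastAPI route, the tuple would typically be turned into a JSONResponse with status_code set to the first element, so the orchestrator sees a clear "not ready" signal instead of a server error.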
Deploying on PandaStack
With a managed platform the workflow collapses to connecting your repo and pushing. PandaStack auto-detects Python, builds your Dockerfile (or uses buildpacks if you have none), and deploys.
1. Create a PostgreSQL database from the dashboard. PandaStack injects DATABASE_URL into your service automatically, so there is no copy-pasting of secrets.
2. Connect your Git repository as a container app.
3. Set your start command to the Gunicorn line above (or let the buildpack default handle it).
4. Add a cronjob for nightly tasks if you have them, and an edge function for lightweight callbacks.
Builds run in rootless BuildKit inside ephemeral Kubernetes Job pods, push to Artifact Registry, then deploy via Helm — so there's no Docker socket exposed and no shared build host. You get live build logs, automatic SSL on custom domains, and rollbacks from deploy history.
| Concern | Dev | Production |
|---|---|---|
| Server | uvicorn --reload | Gunicorn + Uvicorn workers |
| Database | SQLite / local PG | Managed PostgreSQL |
| Secrets | .env file | Injected env vars |
| Migrations | manual | release step (alembic upgrade head) |
| TLS | none | automatic SSL |
Common production mistakes
- Leaving --reload on: it's a memory leak and a security smell.
- Synchronous DB drivers in async routes: they block the event loop; use asyncpg.
- Running migrations on every worker boot: race conditions and partial schema states.
- Ignoring connection limits: the #1 cause of "works locally, dies under load."
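The event-loop point is easy to demonstrate with the standard library alone. In this sketch, time.sleep stands in for a synchronous driver call, and a background task counts how often the loop gets to run while the "query" is in flight. The function names are hypothetical, and the thread offload shown here is only a fallback for code that cannot move to an async driver such as asyncpg.

```python
import asyncio
import time


def blocking_query():
    """Stand-in for a synchronous driver call that takes 200 ms."""
    time.sleep(0.2)
    return "row"


async def count_ticks(stop: asyncio.Event) -> int:
    """Count how often the event loop gets to run while a request is in flight."""
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(0.01)
        ticks += 1
    return ticks


async def measure(offload: bool) -> int:
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # let the ticker start
    if offload:
        # Run the blocking call in a worker thread; the loop stays free.
        await asyncio.get_running_loop().run_in_executor(None, blocking_query)
    else:
        # Call it directly inside the coroutine; the loop is frozen meanwhile.
        blocking_query()
    stop.set()
    return await ticker


blocked = asyncio.run(measure(offload=False))
offloaded = asyncio.run(measure(offload=True))
print(f"ticks while blocking inline: {blocked}, ticks with offload: {offloaded}")
```

When the call runs inline, the ticker barely advances, which is what every other request on that worker experiences during a synchronous query.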