Stop using the Flask dev server
Flask's built-in server prints a warning for a reason: it is intended for local development and is not designed to be efficient, stable, or secure under production traffic. The standard production setup is Gunicorn, a battle-tested WSGI server, running your Flask app with multiple workers. Let's deploy that properly.
Step 1: Structure the app for a WSGI server
Gunicorn imports a WSGI callable. Expose your app via an application factory or a module-level app:
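# app.py
from flask import Flask, jsonify


def fetch_items():
    # Placeholder so the module runs; replace with your real data access.
    return []


def create_app():
    app = Flask(__name__)

    @app.get("/health")
    def health():
        return jsonify(status="ok")

    @app.get("/api/items")
    def items():
        return jsonify(items=fetch_items())

    return app


# Module-level callable that Gunicorn imports as "app:app"
app = create_app()

Step 2: Run with Gunicorn, tuned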
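For context, the "app:app" target handed to Gunicorn means module:variable, and the variable must be a WSGI callable: something that accepts environ and start_response and returns an iterable of bytes. A Flask instance implements this interface. The standard-library sketch below is not Flask code; application and call_wsgi are hypothetical names used only to illustrate the contract Gunicorn relies on.

```python
import json
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Hand-written WSGI callable that mimics the /health route."""
    if environ.get("PATH_INFO") == "/health":
        status, payload = "200 OK", {"status": "ok"}
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def call_wsgi(app, path):
    """Invoke a WSGI callable the way a server would, without a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_wsgi(application, "/health")
print(status, body.decode())
```

Because Flask's app object satisfies the same contract, call_wsgi(app, "/health") should also work against the module from Step 1.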
Worker count matters. A common starting point is (2 x CPU cores) + 1 workers, but in memory-constrained containers, fewer workers with a few threads each often work better:
gunicorn "app:app" \
  --bind 0.0.0.0:$PORT \
  --workers 3 \
  --threads 2 \
  --timeout 60 \
  --access-logfile - \
  --error-logfile -

Key flags:
- --bind 0.0.0.0:$PORT listens on the injected port, on all interfaces.
- --workers and --threads set concurrency. For I/O-bound apps, threads help; for CPU-bound apps, add workers.
- --timeout 60 restarts a worker that has been silent for more than 60 seconds (Gunicorn's default is 30).
- --access-logfile - and --error-logfile - send logs to stdout/stderr for the platform's live log view.
For async workloads, consider --worker-class gevent, but start with sync workers unless you have a reason.
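The same settings can live in a Python config file passed with --config, which keeps the command line short and lets the worker count be computed. The sketch below mirrors the command above; suggested_workers and the WORKER_CAP environment variable are hypothetical helpers rather than Gunicorn settings, and the fallback port 8000 is an assumption for local runs.

```python
# gunicorn.conf.py
import multiprocessing
import os


def suggested_workers(cpus, cap=None):
    """Apply the (2 x CPU) + 1 rule of thumb, optionally capped."""
    workers = 2 * max(cpus, 1) + 1
    return min(workers, cap) if cap else workers


# Hypothetical knob: cap the worker count in small containers.
_cap = int(os.environ.get("WORKER_CAP", "3"))

bind = f"0.0.0.0:{os.environ.get('PORT', '8000')}"
workers = suggested_workers(multiprocessing.cpu_count(), cap=_cap)
threads = 2
timeout = 60
accesslog = "-"  # access log to stdout
errorlog = "-"   # error log to stderr
```

It would be started with gunicorn --config gunicorn.conf.py "app:app". The cap is there because cpu_count() can report the host's cores rather than the container's CPU limit, which would otherwise inflate the worker count.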
Step 3: Slim, reproducible container
FROM python:3.12-slim
WORKDIR /app
ENV PYTHONUNBUFFERED=1 PYTHONDONTWRITEBYTECODE=1
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD gunicorn "app:app" --bind 0.0.0.0:$PORT --workers 3 --threads 2 --access-logfile - --error-logfile -

PYTHONUNBUFFERED=1 ensures logs flush immediately to stdout. Pin your dependencies in requirements.txt (or use a lockfile via pip-tools/poetry) for reproducible builds. On PandaStack you can alternatively let buildpacks detect Python.
Step 4: Wire a managed Postgres
Read DATABASE_URL from the environment — PandaStack injects it when you attach a managed PostgreSQL. With SQLAlchemy:
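import os
from sqlalchemy import create_engine

engine = create_engine(
    os.environ["DATABASE_URL"],
    pool_size=5,         # per worker process; keep the total below the tier connection limit
    pool_pre_ping=True,  # test connections on checkout and replace dead ones
)

Mind the connection math: each worker process has its own pool, so workers x pool_size must stay under your tier limit (50 on free, 300 Pro, 1000 Premium). Three workers x pool_size 5 = 15 pooled connections. SQLAlchemy's default max_overflow of 10 lets each pool open extra connections under load, so the worst case here is 3 x 15 = 45, still within the free tier; set max_overflow=0 if you need a hard cap.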
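The connection math can be sketched in a few lines, assuming SQLAlchemy's default QueuePool, where each worker process may open up to pool_size + max_overflow connections (max_overflow defaults to 10). The tier limits are the ones quoted in this step; max_connections and fits_tier are hypothetical helper names.

```python
TIER_LIMITS = {"free": 50, "pro": 300, "premium": 1000}


def max_connections(workers, pool_size, max_overflow=10):
    """Worst-case connections: every worker process fills its own pool."""
    return workers * (pool_size + max_overflow)


def fits_tier(tier, workers, pool_size, max_overflow=10):
    """True if the worst case stays strictly under the tier's limit."""
    return max_connections(workers, pool_size, max_overflow) < TIER_LIMITS[tier]


print(max_connections(3, 5, max_overflow=0))       # 15
print(max_connections(3, 5))                       # 45
print(fits_tier("free", workers=3, pool_size=5))   # True
print(fits_tier("free", workers=9, pool_size=5))   # False
```

Rerunning a check like this whenever --workers or pool_size changes helps keep the deployment inside the limit.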
Step 5: Run migrations as a release step
With Flask-Migrate / Alembic, run migrations once per deploy, not on worker boot:
flask db upgrade

Step 6: Deploy
Connect your repo and push:
git push origin main

The build runs in a rootless BuildKit K8s Job pod, the image goes to Artifact Registry, and Helm deploys it. Live logs stream from self-hosted Elasticsearch, automatic SSL covers your custom domain, and server-side metrics come without any client SDK.
Step 7: Config and secrets
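Keep all config in env vars: SECRET_KEY, DATABASE_URL, third-party keys. Never commit them. Keep debug off in production by leaving FLASK_DEBUG unset or set to 0; FLASK_ENV was deprecated in Flask 2.2 and removed in 2.3, so setting it has no effect on current versions.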
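One way to enforce this is to read the required settings at startup and fail fast when one is missing. A minimal sketch, assuming the variable names above; load_config is a hypothetical helper that accepts any mapping so it can be exercised without touching the real environment.

```python
import os


def load_config(env=os.environ):
    """Collect required settings from env vars, failing fast if any is absent."""
    required = ("SECRET_KEY", "DATABASE_URL")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    return {
        "SECRET_KEY": env["SECRET_KEY"],
        "DATABASE_URL": env["DATABASE_URL"],
        # Debug stays off unless FLASK_DEBUG is explicitly set to 1.
        "DEBUG": env.get("FLASK_DEBUG", "0") == "1",
    }


# Example with a throwaway mapping instead of the real environment.
config = load_config(
    {"SECRET_KEY": "dev-only", "DATABASE_URL": "postgresql://localhost/app"}
)
print(config["DEBUG"])  # False
```

Inside create_app() the result could be applied with app.config.from_mapping(load_config()), so a missing secret stops the deploy at boot instead of surfacing on the first request.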
Production checklist
- [ ] Gunicorn, not the Flask dev server
- [ ] Bind 0.0.0.0:$PORT
- [ ] Workers/threads tuned to tier; logs to stdout
- [ ] DATABASE_URL injected; pool sized under connection limit
- [ ] Migrations as a release step
- [ ] SECRET_KEY and secrets in env vars, debug off
References
- [Flask — Deploying to Production](https://flask.palletsprojects.com/en/latest/deploying/)
- [Gunicorn — Settings and worker types](https://docs.gunicorn.org/en/stable/settings.html)
- [SQLAlchemy — Engine and connection pooling](https://docs.sqlalchemy.org/en/20/core/pooling.html)
- [Flask-Migrate](https://flask-migrate.readthedocs.io/)
---
Gunicorn plus tuned workers turns Flask into a production-ready service. Push it to PandaStack's [free tier](https://dashboard.pandastack.io), attach a managed Postgres, and let DATABASE_URL wire itself in — no connection strings to copy around.