# How to Deploy a ComfyUI Image Pipeline

ComfyUI is a highly flexible node-based interface for Stable Diffusion and related image models. Most people run it locally on a GPU, but you can also deploy it as a backend service that your app calls through its API. This guide covers running ComfyUI headless in production and the practical realities of doing so.

## Set expectations: this is a heavy workload

Before anything else, be honest about the requirements. Diffusion models are large (multiple gigabytes per checkpoint) and generation is compute-intensive. ComfyUI runs *much* faster on a GPU; CPU-only generation works for tiny test images but is impractical for real output. Plan your deployment around:

- Model storage: checkpoints, LoRAs, VAEs, and ControlNet models add up to many gigabytes. They need persistent storage, not an ephemeral container filesystem.
- Compute: substantial CPU/RAM at minimum; GPU for usable speed.
- Cold loading: the first generation after start loads models into memory, which is slow.
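To size the persistent volume, it helps to measure what your models actually occupy. The following is a minimal sketch that totals file sizes per subfolder of a models directory. The function name is illustrative, and the directory layout (one subfolder per model type) is an assumption based on a default ComfyUI checkout.

```python
from pathlib import Path


def model_storage_gib(models_dir: str) -> dict[str, float]:
    """Return the total size in GiB of each top-level subfolder of models_dir."""
    totals: dict[str, float] = {}
    for sub in sorted(Path(models_dir).iterdir()):
        if not sub.is_dir():
            continue  # ignore loose files at the top level
        size = sum(f.stat().st_size for f in sub.rglob("*") if f.is_file())
        totals[sub.name] = size / 1024**3
    return totals


# Example usage (hypothetical path; point it at your own models directory):
#   for name, gib in model_storage_gib("/opt/ComfyUI/models").items():
#       print(f"{name:20s} {gib:8.2f} GiB")
```

Run it against the machine where you prototype, then add headroom for models you expect to add later.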

## Run ComfyUI headless with its API

ComfyUI exposes an HTTP/WebSocket API. The key flag is --listen, which binds the server to all interfaces so the container is reachable:

```bash
python main.py --listen 0.0.0.0 --port 8188
```

You drive it by POSTing a workflow (the JSON graph you'd build in the UI) to /prompt, then polling /history/{prompt_id} or listening on the WebSocket for completion. Export a workflow as API JSON from the UI (enable dev mode options) and treat that JSON as your pipeline definition checked into git.

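```python
import requests

def generate(prompt_workflow: dict) -> str:
    """Queue a workflow on ComfyUI and return its prompt_id."""
    r = requests.post(
        "http://comfyui:8188/prompt",
        json={"prompt": prompt_workflow},
        timeout=30,
    )
    r.raise_for_status()  # surface HTTP errors such as a rejected workflow
    prompt_id = r.json()["prompt_id"]
    # poll /history/{prompt_id} until images appear, then fetch via /view
    return prompt_id
```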
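The polling step mentioned in the comment above could look roughly like this. It is a sketch, not a definitive client: the response shape (an `outputs` map whose nodes carry an `images` list with `filename`, `subfolder`, and `type`) follows the official API examples, while the function names, the timeout values, and the `comfyui` hostname are assumptions. It uses only the standard library so it runs without extra dependencies; `requests` would work equally well.

```python
import json
import time
import urllib.request
from typing import Callable
from urllib.parse import urlencode

BASE_URL = "http://comfyui:8188"  # hypothetical hostname, as in the snippet above


def fetch_history_http(prompt_id: str) -> dict:
    """GET /history/{prompt_id} and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}/history/{prompt_id}", timeout=30) as resp:
        return json.load(resp)


def extract_images(history: dict, prompt_id: str) -> list[dict]:
    """Collect image descriptors from a /history/{prompt_id} response body."""
    entry = history.get(prompt_id)
    if entry is None:  # no entry yet: the job has not finished
        return []
    images: list[dict] = []
    for node_output in entry.get("outputs", {}).values():
        images.extend(node_output.get("images", []))
    return images


def view_url(image: dict, base_url: str = BASE_URL) -> str:
    """Build the /view URL for one image descriptor."""
    query = urlencode({
        "filename": image["filename"],
        "subfolder": image.get("subfolder", ""),
        "type": image.get("type", "output"),
    })
    return f"{base_url}/view?{query}"


def wait_for_images(
    prompt_id: str,
    fetch_history: Callable[[str], dict] = fetch_history_http,
    timeout: float = 300.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Poll until the job reports images, then return their /view URLs."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        images = extract_images(fetch_history(prompt_id), prompt_id)
        if images:
            return [view_url(img) for img in images]
        sleep(interval)
    raise TimeoutError(f"no images for prompt {prompt_id} after {timeout}s")
```

Note that this sketch treats "no images yet" and "finished without images" the same way, so a failed job only surfaces as a timeout. A production client would also inspect the status information in the history entry, or listen on the WebSocket instead of polling.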

## Containerize

The Dockerfile below installs ComfyUI and its dependencies. Keep models *out* of the image and mount or download them at runtime, so the image stays small and models stay shareable:

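```dockerfile
FROM python:3.11-slim
# git is only needed to fetch the ComfyUI source
RUN apt-get update && apt-get install -y --no-install-recommends git \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /opt
# Unpinned clone; consider checking out a specific tag or commit for reproducible builds
RUN git clone https://github.com/comfyanonymous/ComfyUI.git
WORKDIR /opt/ComfyUI
RUN pip install --no-cache-dir -r requirements.txt
EXPOSE 8188
# Bind to all interfaces so other containers can reach the API
CMD ["python", "main.py", "--listen", "0.0.0.0", "--port", "8188"]
```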

## Architecture: put a thin API in front

Don't expose raw ComfyUI to the internet. Front it with a small API service that:

- Validates and templates workflows (users pass parameters, not raw graphs).
- Authenticates requests.
- Queues jobs so concurrent requests don't overwhelm a single GPU.
- Returns job IDs and serves finished images.
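The first responsibility, templating, can be sketched as follows: the service keeps the exported API JSON as a template and only lets callers change a whitelisted set of node inputs. Everything specific here is an assumption for illustration. The node ids ("3", "6"), the parameter names, and the limits all depend on your own exported workflow and product requirements.

```python
import copy

# Hypothetical mapping from user-facing parameter to (node id, input name).
# The ids come from your exported API JSON and will differ per workflow.
PARAM_MAP = {
    "prompt": ("6", "text"),
    "seed": ("3", "seed"),
    "steps": ("3", "steps"),
}
# Hypothetical bounds for integer parameters; tune them to your needs.
INT_LIMITS = {"steps": (1, 50), "seed": (0, 2**32 - 1)}
MAX_PROMPT_CHARS = 1000


def render_workflow(template: dict, params: dict) -> dict:
    """Return a copy of template with validated user parameters applied."""
    workflow = copy.deepcopy(template)  # never mutate the shared template
    for name, value in params.items():
        if name not in PARAM_MAP:
            raise ValueError(f"unknown parameter: {name}")
        if name in INT_LIMITS:
            low, high = INT_LIMITS[name]
            # bool is a subclass of int, so reject it explicitly
            if isinstance(value, bool) or not isinstance(value, int):
                raise ValueError(f"{name} must be an integer")
            if not low <= value <= high:
                raise ValueError(f"{name} must be between {low} and {high}")
        elif not isinstance(value, str) or len(value) > MAX_PROMPT_CHARS:
            raise ValueError(f"{name} must be a string of at most {MAX_PROMPT_CHARS} chars")
        node_id, input_name = PARAM_MAP[name]
        workflow[node_id]["inputs"][input_name] = value
    return workflow
```

The template itself would typically be loaded once at startup with `json.load` from the workflow file checked into git, and the rendered result passed to `generate`.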

Client → API service (auth, queue) → ComfyUI backend → image output

This two-service split is the production-grade shape and keeps the powerful, unauthenticated ComfyUI internals private.
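The queueing responsibility of the API service can be sketched with a single worker thread, so that only one job reaches the backend at a time. This is an in-memory illustration under stated assumptions: the `JobQueue` class, its status strings, and the `max_pending` limit are all hypothetical, and a real deployment would persist job state in a datastore so it survives restarts.

```python
import queue
import threading
import uuid
from typing import Callable


class JobQueue:
    """Serialize generation jobs so only one runs on the backend at a time."""

    def __init__(self, run_job: Callable[[dict], object], max_pending: int = 100):
        self._run_job = run_job
        self._queue: queue.Queue = queue.Queue(maxsize=max_pending)
        self._jobs: dict[str, dict] = {}
        self._lock = threading.Lock()
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, workflow: dict) -> str:
        """Enqueue a job and return its id; raises queue.Full when saturated."""
        job_id = uuid.uuid4().hex
        self._set(job_id, status="queued", result=None)
        try:
            self._queue.put_nowait((job_id, workflow))
        except queue.Full:
            with self._lock:
                del self._jobs[job_id]
            raise
        return job_id

    def status(self, job_id: str) -> dict:
        """Return a snapshot of the job record (raises KeyError if unknown)."""
        with self._lock:
            return dict(self._jobs[job_id])

    def join(self) -> None:
        """Block until every queued job has been processed."""
        self._queue.join()

    def _set(self, job_id: str, **fields) -> None:
        with self._lock:
            self._jobs.setdefault(job_id, {}).update(fields)

    def _worker(self) -> None:
        while True:
            job_id, workflow = self._queue.get()
            self._set(job_id, status="running")
            try:
                self._set(job_id, status="done", result=self._run_job(workflow))
            except Exception as exc:  # record the failure instead of killing the worker
                self._set(job_id, status="failed", result=str(exc))
            finally:
                self._queue.task_done()
```

Here `run_job` would be whatever function submits a workflow to ComfyUI and waits for the result. Rejecting submissions when the queue is full gives the API a natural point to return a "try again later" response.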

## Deploy on PandaStack

1. Push your repo (the ComfyUI Dockerfile plus your API service) to GitHub.
2. Create a container app for the API service and one for the ComfyUI backend in the [dashboard](https://dashboard.pandastack.io). They build via rootless BuildKit and deploy via Helm.
3. Choose a compute tier sized to the workload. For CPU inference and orchestration, the c1/c2 compute-optimized or m1/m2 memory-optimized tiers (up to 8 CPU / 16 GB on C2-2XCompute) handle the API, queueing, and lighter models. Diffusion at scale ultimately wants a GPU, so size honestly for your model.
4. Use a managed datastore (Postgres/Redis) to track jobs and a real object store for generated images.

## Right-sizing guide

| Component | Tier family | Why |
| --- | --- | --- |
| API/queue service | Small shared compute | Light, I/O-bound |
| ComfyUI (CPU testing) | m1/m2 memory-optimized | Models are memory-hungry |
| Image output storage | Managed object storage | Don't keep images on the pod |

## Persisting models and outputs

The container filesystem is ephemeral — anything written there vanishes on redeploy. So:

- Models: download to a persistent volume on first boot, or add a startup step that downloads and caches them. Re-downloading multi-gigabyte checkpoints on every restart is painful.
- Outputs: write generated images to object storage and return URLs, rather than writing to the pod's local disk.
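The "download on first boot" approach for models can be sketched as a small idempotent helper that runs before ComfyUI starts. The function name, the `.part` suffix, and the example paths are assumptions for illustration. The sketch downloads to a temporary file and renames it into place, so an interrupted download never leaves a truncated checkpoint behind.

```python
import os
import shutil
import urllib.request
from pathlib import Path


def ensure_model(url: str, dest: str) -> bool:
    """Download url to dest unless dest already exists.

    Returns True if a download happened, False if the cached file was reused.
    """
    target = Path(dest)
    if target.exists():
        return False  # already on the persistent volume
    target.parent.mkdir(parents=True, exist_ok=True)
    partial = target.with_name(target.name + ".part")
    # Stream in 1 MiB chunks so multi-gigabyte files never sit fully in memory.
    with urllib.request.urlopen(url) as resp, open(partial, "wb") as out:
        shutil.copyfileobj(resp, out, length=1024 * 1024)
    os.replace(partial, target)  # atomic rename within the same filesystem
    return True


# Example usage (hypothetical URL and mount path):
#   ensure_model("https://example.com/model.safetensors",
#                "/models/checkpoints/model.safetensors")
```

A fuller version would also verify a checksum after download and handle authentication for gated model hosts.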

## Honest limitations

- ComfyUI is GPU-first; CPU-only deployments are for testing or very low volume.
- Scale-to-zero on the free tier means a model reload on every cold start, which is fine for experiments but not for a responsive service.
- Concurrency is bounded by your compute; a queue in front is essential, not optional.

Use the free tier to wire up the API contract and validate workflows, then move the generation backend to an appropriately sized paid tier.

## References

- [ComfyUI](https://github.com/comfyanonymous/ComfyUI)
- [ComfyUI API examples](https://github.com/comfyanonymous/ComfyUI/tree/master/script_examples)
- [Stability AI: Stable Diffusion models](https://stability.ai/stable-image)
- [Hugging Face: Diffusers](https://huggingface.co/docs/diffusers/index)

ComfyUI as a backend is powerful once you split orchestration from generation and persist your models. PandaStack lets you deploy both services, pick a compute-optimized tier, and queue jobs in an auto-wired datastore. Prototype the API on the free tier at [dashboard.pandastack.io](https://dashboard.pandastack.io).
