Auto Scaling

Scale automatically with demand

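Scale your app automatically based on CPU, memory, and traffic. PandaStack uses the Kubernetes Horizontal Pod Autoscaler (HPA), supports scale-to-zero on eligible workloads, and lets you pay only for what you use.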

replicas: 1 -> 4

cpu target: 70%

memory target: 75%

scale-to-zero on idle: enabled

✓ traffic spike absorbed with no manual action

# what you get

Use the Horizontal Pod Autoscaler to scale replicas up and down automatically as demand changes.

Set autoscaling rules around the resource signals that reflect real pressure on your workload, such as CPU and memory utilization.

Define the floor and ceiling so scaling stays predictable and aligned with your budget.

Eligible workloads can scale down to zero when no traffic is flowing, so you only pay for active usage.

Scaling events and rolling deploys work together without dropping healthy traffic.

Once policies are set, PandaStack handles replica changes automatically instead of relying on on-call reactions.
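To make the floor, ceiling, and scale-to-zero ideas above concrete, here is a minimal Python sketch. It is not PandaStack's API: `ScalingPolicy`, `bounded_replicas`, `max_hourly_cost`, and the per-replica price are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingPolicy:
    """Hypothetical policy object; field names are illustrative only."""
    min_replicas: int = 1
    max_replicas: int = 6
    scale_to_zero: bool = False

    def __post_init__(self) -> None:
        # Reject policies whose floor and ceiling cannot both be honored.
        if self.min_replicas < 1:
            raise ValueError("min_replicas must be at least 1")
        if self.max_replicas < self.min_replicas:
            raise ValueError("max_replicas must be >= min_replicas")


def bounded_replicas(policy: ScalingPolicy, wanted: int, idle: bool = False) -> int:
    """Clamp a requested replica count to the policy's floor and ceiling.

    Assumption: an idle workload with scale-to-zero enabled drops to 0;
    otherwise the floor always applies.
    """
    if idle and policy.scale_to_zero:
        return 0
    return max(policy.min_replicas, min(policy.max_replicas, wanted))


def max_hourly_cost(policy: ScalingPolicy, cost_per_replica_hour: float) -> float:
    """Worst-case hourly spend: the ceiling times a (made-up) per-replica price."""
    return policy.max_replicas * cost_per_replica_hour


policy = ScalingPolicy(min_replicas=1, max_replicas=6, scale_to_zero=True)
print(bounded_replicas(policy, wanted=9))             # 6, capped at the ceiling
print(bounded_replicas(policy, wanted=3, idle=True))  # 0, idle and scale-to-zero is on
print(max_hourly_cost(policy, 0.5))                   # 3.0
```

The point of the ceiling is that worst-case spend becomes a simple product you can check against a budget before any traffic arrives.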

# how it works

Service: api-prod
Autoscaling: enabled
Min replicas: 1
Max replicas: 6

targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 75

current CPU: 88%
scale event: 2 -> 4 replicas
status: completed

traffic: low
replicas: 4 -> 1
scale-to-zero: available on supported workloads
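The trace above can be read against the replica formula that the Kubernetes HPA documents: for each metric, desired = ceil(current replicas × current value / target value), with the largest proposal winning and the result kept within the min/max bounds. The Python sketch below illustrates that rule only; it is not PandaStack's implementation. The function name `desired_replicas` is hypothetical, the 0.1 tolerance is the Kubernetes default rather than a confirmed PandaStack setting, and the memory reading of 120% is invented because the trace shows CPU only.

```python
import math


def desired_replicas(current, readings, targets, min_replicas, max_replicas,
                     tolerance=0.1):
    """Sketch of the documented Kubernetes HPA replica calculation.

    readings and targets map a metric name to utilization percentages.
    Each metric proposes a replica count; the largest proposal wins.
    """
    proposals = []
    for name, target in targets.items():
        ratio = readings[name] / target
        if abs(ratio - 1.0) <= tolerance:
            # Close enough to target: this metric asks for no change.
            proposals.append(current)
        else:
            proposals.append(math.ceil(current * readings[name] / target))
    wanted = max(proposals)
    # Keep the result inside the configured floor and ceiling.
    return max(min_replicas, min(max_replicas, wanted))


targets = {"cpu": 70, "memory": 75}

# CPU at 88% comes from the trace; memory at 120% is an assumed value.
print(desired_replicas(2, {"cpu": 88, "memory": 120}, targets, 1, 6))  # 4

# Assumed low-traffic readings: both metrics propose a single replica.
print(desired_replicas(4, {"cpu": 10, "memory": 15}, targets, 1, 6))   # 1
```

Under this formula, CPU at 88% against a 70% target would propose 3 replicas from 2 on its own; a step to 4 is what you would see if a second metric, such as memory, asked for more. The page does not say which signal drove the event shown.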

# included with every setup

Kubernetes HPA

CPU-based scaling

Memory-based scaling

Min/max controls

Scale-to-zero support

Rolling updates

Works with web apps and APIs

No manual scaling chores

Let traffic decide your replica count

Set sensible limits once and let PandaStack scale services up for spikes and back down for quieter periods.