Auto Scaling

Scale automatically with demand

Scale your app automatically based on CPU and memory utilization as traffic changes. Built on the Kubernetes HPA, with scale-to-zero on supported workloads, so you pay only for what you use.

Start free

replicas: 1 -> 4

cpu target: 70%

memory target: 75%

scale-to-zero on idle: enabled

✓ traffic spike absorbed with no manual action

What you get

Kubernetes HPA

Use the Horizontal Pod Autoscaler to scale replicas up and down automatically as demand changes.

CPU + memory-based scaling

Set autoscaling rules on CPU and memory utilization, the resource signals that reflect pressure on your workload.

Min/max replica control

Define the floor and ceiling so scaling stays predictable and aligned with your budget.

Scale-to-zero on idle

Eligible workloads can scale down to zero replicas when no traffic is flowing, so you pay only for active usage.

Rolling updates during scale

Scaling and rolling deploys work together, so updates roll out without interrupting live traffic.

No manual intervention needed

Once policies are set, PandaStack handles replica changes automatically instead of relying on on-call reactions.
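
For readers who want to see how CPU and memory targets and min/max limits combine into a replica count, here is a minimal Python sketch of the rule described in the Kubernetes HPA documentation: for each metric, desired replicas = ceil(current replicas x current value / target value); the largest result wins and is then bounded by the min and max. The function name and sample numbers are illustrative assumptions, not PandaStack's implementation, and the sketch leaves out the HPA's tolerance band, stabilization windows, and scale-to-zero.

```python
import math


def desired_replicas(current_replicas, metrics, min_replicas, max_replicas):
    """Sketch of the HPA replica calculation (hypothetical helper).

    metrics maps a metric name to (current_utilization, target_utilization),
    both in percent.
    """
    proposals = []
    for current, target in metrics.values():
        # Core HPA rule: scale in proportion to how far the metric is from target.
        proposals.append(math.ceil(current_replicas * current / target))
    # With several metrics, the largest proposed replica count wins.
    wanted = max(proposals)
    # Keep the result between the configured floor and ceiling.
    return max(min_replicas, min(max_replicas, wanted))


# Illustrative numbers only: CPU is over target, memory is under target.
print(desired_replicas(4, {"cpu": (90, 70), "memory": (50, 75)}, 1, 6))
```

With these sample inputs CPU alone proposes 6 replicas and memory proposes 3, so the sketch settles on 6, which is also the configured ceiling.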

How it works

01

Choose a service and enable autoscaling

Service: api-prod
Autoscaling: enabled
Min replicas: 1
Max replicas: 6

02

Define scaling thresholds

targetCPUUtilizationPercentage: 70
targetMemoryUtilizationPercentage: 75

03

Handle spikes automatically

current CPU: 88%
scale event: 2 -> 4 replicas
status: completed

04

Scale back down when idle

traffic: low
replicas: 4 -> 1
scale-to-zero: available on supported workloads
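
The settings from steps 01 and 02 map closely onto a standard Kubernetes HorizontalPodAutoscaler object. The Python sketch below builds an equivalent autoscaling/v2 manifest for illustration. It assumes the service is backed by a Deployment with the same name (api-prod); the helper name build_hpa_manifest is hypothetical, and the object PandaStack actually creates may differ. Scale-to-zero is not modeled here.

```python
import json


def build_hpa_manifest(service, min_replicas, max_replicas, cpu_target, memory_target):
    """Return an autoscaling/v2 HorizontalPodAutoscaler manifest as a dict.

    Hypothetical helper: assumes a Deployment named after the service.
    """

    def resource_metric(name, target_percent):
        # Utilization targets are percentages of the pods' resource requests.
        return {
            "type": "Resource",
            "resource": {
                "name": name,
                "target": {"type": "Utilization", "averageUtilization": target_percent},
            },
        }

    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": service},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",  # assumption, see note above
                "name": service,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                resource_metric("cpu", cpu_target),
                resource_metric("memory", memory_target),
            ],
        },
    }


# Values taken from steps 01 and 02 above.
manifest = build_hpa_manifest("api-prod", 1, 6, 70, 75)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be compared against a cluster's own HPA objects to see which fields correspond to the minimum, maximum, and threshold settings.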

Included with every setup

Kubernetes HPA
CPU-based scaling
Memory-based scaling
Min/max controls
Scale-to-zero support
Rolling updates
Works with web apps and APIs
No manual scaling chores

Let traffic decide your replica count

Set sensible limits once and let PandaStack scale services up for spikes and back down for quieter periods.
