gRPC is a high-performance RPC framework built on HTTP/2 and Protocol Buffers. It's excellent for service-to-service communication (low latency, strong typing, streaming), but it has deployment requirements that trip up people who are used to REST. The big ones: it requires HTTP/2 end to end, its connection model defeats naive load balancers, and health checks use a dedicated gRPC protocol rather than a plain HTTP endpoint. This guide covers shipping a gRPC service correctly.

The HTTP/2 requirement

gRPC runs exclusively over HTTP/2. This is the most important deployment fact. Your ingress and load balancer must support HTTP/2 and forward it to your service; if any intermediary downgrades to HTTP/1.1, gRPC breaks. When choosing where to deploy, confirm that the ingress handles HTTP/2 (gRPC) traffic. Kong ingress supports gRPC over HTTP/2, so connections reach your service intact.
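
One quick way to confirm that a public endpoint offers HTTP/2 is to look at what it selects during TLS ALPN negotiation. The sketch below uses only Go's standard library; the function name negotiatesH2 and the 5-second timeout are assumptions made for illustration. It verifies only the client-facing hop, so an ingress could still downgrade traffic on the way to your service.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"os"
	"time"
)

// negotiatesH2 opens a TLS connection to addr, offers "h2" and "http/1.1"
// through ALPN, and reports whether the endpoint selected "h2".
// The name and the 5-second timeout are illustrative choices.
func negotiatesH2(addr string) (bool, error) {
	dialer := &net.Dialer{Timeout: 5 * time.Second}
	conn, err := tls.DialWithDialer(dialer, "tcp", addr, &tls.Config{
		NextProtos: []string{"h2", "http/1.1"},
	})
	if err != nil {
		return false, err
	}
	defer conn.Close()
	return conn.ConnectionState().NegotiatedProtocol == "h2", nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: h2check host:port")
		return
	}
	ok, err := negotiatesH2(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "dial failed:", err)
		os.Exit(1)
	}
	fmt.Println("endpoint negotiated h2:", ok)
}
```

Run it against your own host and port, for example your-service.pandastack.app:443. A false result means gRPC clients will not be able to connect through that endpoint.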

A minimal service

Starting from a .proto:

syntax = "proto3";
package greet;
service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}
message HelloRequest { string name = 1; }
message HelloReply { string message = 1; }

A Go server (the language ecosystem doesn't change the deployment principles):

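import (
    "context"
    "log"
    "net"
    "os"
    "google.golang.org/grpc"
    "google.golang.org/grpc/health"
    "google.golang.org/grpc/health/grpc_health_v1"
    "google.golang.org/grpc/reflection"
    pb "example.com/greet/greetpb" // hypothetical path to the code generated from the .proto
)

type server struct{ pb.UnimplementedGreeterServer }

func (s *server) SayHello(ctx context.Context, in *pb.HelloRequest) (*pb.HelloReply, error) {
    return &pb.HelloReply{Message: "Hello " + in.Name}, nil
}

func main() {
    port := os.Getenv("PORT")
    if port == "" { port = "50051" }
    lis, err := net.Listen("tcp", "0.0.0.0:"+port)
    if err != nil {
        log.Fatalf("listen: %v", err)
    }
    s := grpc.NewServer()
    pb.RegisterGreeterServer(s, &server{})
    grpc_health_v1.RegisterHealthServer(s, health.NewServer()) // health + reflection, explained below
    reflection.Register(s)
    log.Fatal(s.Serve(lis))
}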

Bind to 0.0.0.0 and read the port from the environment: the same container fundamentals as any other service.

Health checks and reflection

gRPC has its own health-checking protocol (grpc.health.v1.Health) rather than a plain HTTP /health endpoint. Register the standard health service so orchestrators and load balancers can probe liveness over gRPC. Use grpc-health-probe in your container health check, with -addr pointing at the port the server actually listens on:

HEALTHCHECK CMD ["/bin/grpc_health_probe", "-addr=localhost:50051"]

Server reflection lets tools like grpcurl introspect your service without the .proto file, which is invaluable for debugging in production. Register it as shown; you can disable it in locked-down environments if you prefer.
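
To make the health protocol's semantics concrete, here is a plain-Go model of them. It is a toy stand-in, not the grpc-go API: the Registry type and its methods are hypothetical names. It reflects two conventions of the protocol as I understand them: the empty service name stands for the server as a whole, and checking an unregistered service yields a not-found error rather than a status.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Status models the serving states a health check can report.
type Status int

const (
	Unknown Status = iota
	Serving
	NotServing
)

func (s Status) String() string {
	switch s {
	case Serving:
		return "SERVING"
	case NotServing:
		return "NOT_SERVING"
	default:
		return "UNKNOWN"
	}
}

// ErrNotFound models the error returned when a service was never registered.
var ErrNotFound = errors.New("service not registered")

// Registry is a hypothetical stand-in for a health server. It maps service
// names to statuses; the empty name "" represents the whole server.
type Registry struct {
	mu       sync.RWMutex
	statuses map[string]Status
}

// NewRegistry starts with the overall server marked as serving.
func NewRegistry() *Registry {
	return &Registry{statuses: map[string]Status{"": Serving}}
}

// Set records the status of one service, for example during shutdown.
func (r *Registry) Set(service string, s Status) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.statuses[service] = s
}

// Check returns the recorded status, or ErrNotFound for unknown services.
func (r *Registry) Check(service string) (Status, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	s, ok := r.statuses[service]
	if !ok {
		return Unknown, ErrNotFound
	}
	return s, nil
}

func main() {
	r := NewRegistry()
	r.Set("greet.Greeter", Serving)
	for _, name := range []string{"", "greet.Greeter", "greet.Missing"} {
		st, err := r.Check(name)
		fmt.Printf("%q -> %v (err: %v)\n", name, st, err)
	}
}
```

A probe that passes no service name is asking about the empty name, which is why registering the standard health service is enough for a basic container health check.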

TLS

gRPC strongly prefers TLS. In a typical deployment, TLS terminates at the ingress (the client uses a secure channel to the public endpoint), and traffic inside the cluster may be plaintext HTTP/2 or re-encrypted depending on your security posture. Clients connect with transport credentials:

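creds := credentials.NewTLS(&tls.Config{}) // empty config: verify the server against the system root CAs
conn, err := grpc.Dial("your-service.pandastack.app:443", grpc.WithTransportCredentials(creds))
if err != nil { log.Fatalf("dial: %v", err) }
defer conn.Close()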
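
If in-cluster traffic is re-encrypted with certificates issued by a private CA, the client needs a TLS config that trusts that CA instead of the system roots. The following is a minimal sketch using only the standard library. The function name tlsConfigFromPEM, the ca.pem file, and the TLS 1.2 minimum are assumptions for illustration, not requirements stated by gRPC.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
)

// tlsConfigFromPEM builds a client TLS config that trusts only the CA
// certificates found in caPEM. serverName is the name the server's
// certificate is expected to carry.
func tlsConfigFromPEM(caPEM []byte, serverName string) (*tls.Config, error) {
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificates found in PEM input")
	}
	return &tls.Config{
		RootCAs:    pool,
		ServerName: serverName,
		MinVersion: tls.VersionTLS12, // illustrative floor, adjust to your policy
	}, nil
}

func main() {
	if len(os.Args) < 3 {
		fmt.Println("usage: tlsconf ca.pem server-name")
		return
	}
	caPEM, err := os.ReadFile(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, "read CA:", err)
		os.Exit(1)
	}
	cfg, err := tlsConfigFromPEM(caPEM, os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, "build config:", err)
		os.Exit(1)
	}
	fmt.Println("TLS config ready for", cfg.ServerName)
}
```

With grpc-go, the resulting config would be passed to credentials.NewTLS in place of the empty tls.Config used for the public endpoint.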

The load-balancing gotcha

Here's what surprises REST veterans: gRPC uses long-lived HTTP/2 connections that multiplex many requests over a single connection. A connection-level (L4) load balancer pins a client to one backend for the life of the connection, so it can't spread individual RPCs across replicas. Your traffic piles onto whichever instances clients happened to connect to. Proper gRPC load balancing happens at L7 (request level) or via client-side load balancing. Behind an L7-aware ingress that understands HTTP/2, individual RPC streams can be balanced correctly. Keep this in mind when you scale out: confirm that balancing happens at the request level, not just the connection level.
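
The difference is easy to see in a small simulation. This sketch is a simplified model, not a real balancer: both functions use round-robin, and the replica and client counts are made-up numbers. It only contrasts choosing a backend once per connection with choosing one per RPC.

```go
package main

import "fmt"

// l4Distribute models a connection-level balancer: each client connection
// is assigned to a backend once (round-robin), and every RPC sent over
// that connection lands on the same backend.
func l4Distribute(backends, clients, rpcsPerClient int) []int {
	counts := make([]int, backends)
	for c := 0; c < clients; c++ {
		backend := c % backends // chosen once, when the connection opens
		counts[backend] += rpcsPerClient
	}
	return counts
}

// l7Distribute models a request-level balancer: every RPC is assigned
// independently (round-robin), whichever connection carried it.
func l7Distribute(backends, clients, rpcsPerClient int) []int {
	counts := make([]int, backends)
	next := 0
	for c := 0; c < clients; c++ {
		for i := 0; i < rpcsPerClient; i++ {
			counts[next%backends]++
			next++
		}
	}
	return counts
}

func main() {
	// Five replicas, two long-lived client connections, 1000 RPCs each.
	fmt.Println("L4:", l4Distribute(5, 2, 1000)) // L4: [1000 1000 0 0 0]
	fmt.Println("L7:", l7Distribute(5, 2, 1000)) // L7: [400 400 400 400 400]
}
```

With connection-level balancing, three of the five replicas receive no traffic at all, which is why adding replicas alone does not relieve the busy ones.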

Container build from protobufs

Compile your protos as part of the build. For Go, a multi-stage Dockerfile keeps the final image tiny:

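FROM golang:1.22 AS build
WORKDIR /src
COPY . .
# If the generated *.pb.go files are not committed, run protoc (or buf generate) here first.
RUN CGO_ENABLED=0 go build -o /server ./cmd/server
# Fetch the health probe so the final stage can copy it. The version below is
# an example; pin a release you have verified.
ARG GRPC_HEALTH_PROBE_VERSION=v0.4.13
RUN wget -qO /bin/grpc_health_probe \
    https://github.com/grpc-ecosystem/grpc-health-probe/releases/download/${GRPC_HEALTH_PROBE_VERSION}/grpc_health_probe-linux-amd64 \
    && chmod +x /bin/grpc_health_probe

FROM gcr.io/distroless/base-debian12
COPY --from=build /server /server
COPY --from=build /bin/grpc_health_probe /bin/grpc_health_probe
EXPOSE 50051
ENTRYPOINT ["/server"]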

Distroless keeps the attack surface minimal, which fits gRPC's typical use in internal service meshes.

Deploying on PandaStack

1. Connect your repo as a container app. The build runs in an ephemeral Job pod with rootless BuildKit and deploys via Helm behind Kong ingress.
2. Because Kong supports gRPC over HTTP/2, your service is reachable over a secure endpoint with automatic SSL.
3. Set the port via the environment, and register the gRPC health service so probes work.
4. Attach a managed database if your service is stateful; DATABASE_URL is injected automatically.
5. Tail live logs and use grpcurl against your endpoint (reflection enabled) to verify that methods are callable.

Requirement	Why it matters
HTTP/2 end to end	gRPC won't work over HTTP/1.1
gRPC health service	Probes aren't plain HTTP
L7 load balancing	L4 pins long-lived connections
TLS	Strongly preferred; terminate at ingress
Reflection	Debugging with grpcurl

Free tier note

A Go gRPC service compiled to a static binary is tiny and fast-starting, an excellent fit for the free tier, including scale-to-zero for low-traffic internal services (accept a small cold start). For high-throughput inter-service traffic, a compute-optimized tier and warm instances give consistent latency.

Verifying

grpcurl your-service.pandastack.app:443 list
grpcurl -d '{"name":"world"}' your-service.pandastack.app:443 greet.Greeter/SayHello
grpc_health_probe -addr=your-service.pandastack.app:443 -tls

References

gRPC documentation: https://grpc.io/docs/
gRPC health checking protocol: https://github.com/grpc/grpc/blob/master/doc/health-checking.md
grpc-health-probe: https://github.com/grpc-ecosystem/grpc-health-probe
gRPC load balancing: https://grpc.io/blog/grpc-load-balancing/
gRPC server reflection: https://github.com/grpc/grpc/blob/master/doc/server-reflection.md

gRPC deployment is mostly about respecting HTTP/2: an ingress that speaks it, request-level load balancing, and the gRPC health protocol. Deploy your service behind HTTP/2-capable ingress with automatic SSL on PandaStack's free tier: https://dashboard.pandastack.io
