Zero‑Downtime HAProxy: Layer 4/7 Load Balancing, Health Checks that Matter, Sticky Sessions that Behave, and Clean TLS Passthrough

So there I was, staring at a slow‑moving progress bar on a Friday release, knowing full well that one wrong reload could drop hundreds of users off a checkout page. You know that feeling, right? When traffic is hot, stakes are high, and the only thing between your app and a storm of angry messages is your load balancer behaving like a calm, well‑trained traffic cop. That night, HAProxy saved me. Not magically—just because I finally learned how to treat it with the respect it deserves: good health checks, predictable stickiness, and a deployment flow that never cuts connections mid‑sentence.

If you’ve ever wondered how to make HAProxy balance traffic at Layer 4 and Layer 7 without downtime, how to keep sessions sticky without making future you cry, or how to pass TLS straight through when you don’t want to terminate at the edge, you’re in the right place. Think of this as the friendly field guide I wish I’d had years ago: conversational, practical, and shaped by the kind of lessons you only learn when real users are on the line.

A Simple Mental Model for L4 vs L7 (And Why It Matters)

Let’s start with a friendly picture: a busy toll booth. Layer 4 is the guard who just checks the license plate and lets you through—it’s about connections, IPs, and ports. It’s fast, it’s lean, it doesn’t care what’s inside the car. Layer 7 is the guard who also asks where you’re going, peeks at the map, and directs you to the right lane—HTTP headers, paths, cookies, and more. It’s smarter, which sometimes means more work, but it lets you base routing decisions on things the app actually understands.

In my experience, I like to think in two passes. First: do we need to inspect content? If not—say we’re passing TLS straight to the app or handling TCP protocols like MySQL or Redis—Layer 4 shines. Second: do we need to shape behavior using HTTP—sticky sessions with cookies, path‑based routing, header checks, or smart health checks? That’s Layer 7 territory.
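
To make that concrete, here’s a minimal sketch of the pure L4 case for a TCP protocol; the ports, addresses, and the MySQL-replica framing are placeholder assumptions, not a prescription:

# L4 balancing for a plain TCP protocol (MySQL read replicas as an example)
frontend mysql_ro
  mode tcp
  bind :3306
  default_backend mysql_replicas

backend mysql_replicas
  mode tcp
  balance leastconn
  option tcp-check
  server db1 10.0.5.11:3306 check
  server db2 10.0.5.12:3306 check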

The trick is to combine them without stepping on your own toes. I’ve seen teams try to do everything at L7 and then wonder why their TLS passthrough got weird, or push everything to L4 and lose out on the really useful HTTP tools. We’ll mix and match in a way that keeps your setup understandable six months from now.

Where L4 fits perfectly

Any time you don’t want to terminate TLS at HAProxy—maybe compliance reasons, maybe you prefer end‑to‑end TLS into Nginx or your app server—Layer 4 is your friend. You can still sniff SNI for routing decisions without decrypting. You can still keep deployments graceful. But you won’t be doing header rewrites or cookie stickiness here, and that’s okay.

Where L7 earns its keep

When you need control. Things like cookie‑based stickiness, intelligent health checks that call /healthz, path routing, rate limiting, and even some upgrade niceties for WebSockets. This is where HAProxy becomes more than a switch—it becomes the smart middle layer that keeps your app honest during deploys and spikes.

Health Checks That Actually Tell the Truth

I’ve been burned by “green” nodes that weren’t actually healthy—like a backend that accepted TCP but returned 500s on a key endpoint. That’s the kind of thing that makes a good release go sour. The fix? Don’t just check if a port is open. Check what your users actually rely on.

For HTTP services, I almost always use an explicit health endpoint. App teams can keep it fast and deterministic, and we can test more than just “is the app running?” We can include basic dependencies—database connectivity, cache reachability, background queue status—without making it brittle. When it fails, I want to know quickly; when it recovers, I want it to rejoin gracefully.

# Layer 7 HTTP health checks
backend app_http
  mode http
  balance roundrobin
  option httpchk GET /healthz
  http-check expect status 200
  default-server inter 2s fastinter 1s downinter 500ms rise 3 fall 2
  server app1 10.0.0.11:8080 check
  server app2 10.0.0.12:8080 check

Notice the rise and fall behavior. It doesn’t flap on a single bad response, and it doesn’t wait forever to recover either. I also like to keep “fastinter” snappy: it shortens the check interval while a server is transitioning between states, so you get a verdict faster and a shaky node spends less time in limbo before it’s confirmed up or down.

For TLS passthrough or pure TCP protocols, I’ll switch to L4 checks. You can still get surprisingly nuanced with L4 by watching connection behavior, but it’s not a replacement for an actual HTTP probe.

# Layer 4 TCP health checks (simple and quick)
backend app_tls_passthrough
  mode tcp
  balance source
  option tcp-check
  default-server check inter 2s rise 2 fall 2
  server app1 10.0.0.21:443 check
  server app2 10.0.0.22:443 check

When teams ask me why their users still felt hiccups during deploys, the answer is often that health checks didn’t line up with reality. Your checks should mirror the path a real user takes—if the checkout dependency is down, your health check should say so.
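
If you’re on HAProxy 2.2 or newer, you can shape the probe to look more like a real request: send a proper Host header and expect a specific body string instead of just a status code. The hostname, path, and “ok” body below are assumptions about what your app exposes:

# A probe that looks more like a real user request (names are examples)
backend checkout_http
  mode http
  balance roundrobin
  option httpchk
  http-check send meth GET uri /healthz hdr Host shop.example.com
  http-check expect string ok
  server app1 10.0.0.11:8080 check
  server app2 10.0.0.12:8080 check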

Bonus: Weights, slowstart, and graceful rejoin

One trick I love is slowly increasing the weight of a server that just came back. It’s like easing back into traffic instead of flooring it onto the highway.

backend app_http
  mode http
  balance roundrobin
  option httpchk GET /healthz
  http-check expect status 200
  default-server inter 2s rise 3 fall 2 slowstart 30s
  server app1 10.0.0.11:8080 check weight 50
  server app2 10.0.0.12:8080 check weight 50

Slowstart helps smooth spikes right after a deploy when caches are cold and JITs are warming. Your users feel a whole lot less “thumpy.”

Sticky Sessions Without Regrets

Sticky sessions are one of those topics where what works on Tuesday blows up on Black Friday. I remember a retail client where we “solved” login consistency with IP‑based stickiness, only to discover later that mobile carrier NATs were pinning entire cities to a single backend. Oops. Lesson learned: choose your stickiness signal carefully and in context.

At Layer 7, you have the most control. Cookie‑based stickiness is the classic approach. HAProxy can insert a cookie and route requests to the same server automatically. Simple, predictable, and friendly to most apps.

# Cookie-based stickiness at L7
backend app_http
  mode http
  balance roundrobin
  cookie SRV insert indirect nocache httponly secure
  option httpchk GET /healthz
  http-check expect status 200
  server app1 10.0.0.11:8080 check cookie s1
  server app2 10.0.0.12:8080 check cookie s2

If you already manage your own session cookie, HAProxy can match it without issuing its own. You can also hash on headers, URLs, or user IDs, but keep an eye on hot keys. Consistent hashing helps when workloads can skew.

# URI hashing for cacheable endpoints
backend media_http
  mode http
  balance uri
  hash-type consistent
  server media1 10.0.1.11:80 check
  server media2 10.0.1.12:80 check
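
For the “match your own session cookie” case, a minimal sketch looks like this. The cookie name is just an example; prefix mode makes HAProxy tag the app’s existing cookie value instead of issuing a new cookie:

# Stickiness keyed on the app's existing session cookie (name is an example)
backend app_http
  mode http
  balance roundrobin
  cookie SESSIONID prefix nocache
  server app1 10.0.0.11:8080 check cookie s1
  server app2 10.0.0.12:8080 check cookie s2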

At Layer 4, you don’t get cookies. You’re mostly left with source IP hashing. It’s not bad, but you should be aware of where it can backfire—NATed egress, office proxies, and load tests from a single source all change the game. If you must stick at L4 in a complex environment, consider routing by SNI (per domain) or upstream PROXY protocol so your upstream Nginx can do smarter stickiness post‑TLS termination.

# L4 stickiness by source (use with care)
backend app_tls_passthrough
  mode tcp
  balance source
  server app1 10.0.0.21:443 check
  server app2 10.0.0.22:443 check

One more truth: sticky sessions are a trade‑off. They can stabilize login flows and long‑lived shopping carts, but they can also create uneven load when one server gets a popular cohort. Use them where you need them, and consider ways to reduce the need—centralized sessions, idempotent APIs, and moving state out of instance memory.

TLS Termination vs TLS Passthrough (And Why I Use Both)

This topic can spark long whiteboard sessions. Here’s the way I usually guide teams: terminate TLS at HAProxy when you want to do smart HTTP stuff—rate limits, cookie stickiness, header rewrites—and you’re comfortable managing certificates at the edge. Pass TLS through when you want end‑to‑end encryption to the app layer, or when your app demands specific TLS behaviors (ALPN quirks, client cert validation) handled upstream.

Terminating TLS at HAProxy

Terminating at the edge lets you use the full power of Layer 7. You can also offload CPU cost from app nodes and keep cert management centralized. With modern ciphers and ALPN settings, you can serve both HTTP/2 and HTTP/1.1 comfortably.

frontend https_in
  mode http
  bind :443 ssl crt /etc/haproxy/certs/ alpn h2,http/1.1
  http-response set-header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload"
  http-response del-header Server
  acl host_api hdr(host) -i api.example.com
  use_backend api_http if host_api
  default_backend app_http

When you manage certificates, think about automation and governance. I’m a big fan of using a proper pipeline for cert updates and reloads. If you’re weighing certificate types for e‑commerce or SaaS, or wondering when a wildcard or EV certificate makes sense, I wrote a warm guide that can help: choosing the right SSL certificate without the drama.
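
As a sketch of what that pipeline can look like, here’s a hypothetical renewal hook. It assumes a Let’s Encrypt-style directory layout and the cert path used in these examples; adjust both for your CA tooling:

#!/usr/bin/env bash
# Hypothetical cert renewal hook: domain and paths are examples.
set -euo pipefail
DOMAIN="example.com"
# HAProxy expects the full chain and private key in a single PEM file.
cat "/etc/letsencrypt/live/${DOMAIN}/fullchain.pem" \
    "/etc/letsencrypt/live/${DOMAIN}/privkey.pem" \
    > "/etc/haproxy/certs/${DOMAIN}.pem"
# Hand the new cert to HAProxy without dropping connections.
systemctl reload haproxy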

Passing TLS through without decryption

With TLS passthrough, you keep the handshake intact to the backend. HAProxy can still peek at SNI (without decrypting) and send traffic to the right cluster. This is lovely when your upstream Nginx or app gateway does mTLS or custom auth at the edge.

frontend tls_router
  mode tcp
  bind :443
  tcp-request inspect-delay 5s
  tcp-request content accept if { req_ssl_hello_type 1 }
  acl sni_app req.ssl_sni -i app.example.com
  acl sni_api req.ssl_sni -i api.example.com
  use_backend be_app_tls if sni_app
  use_backend be_api_tls if sni_api
  default_backend be_default_tls

backend be_app_tls
  mode tcp
  balance source
  server app1 10.0.0.21:443 check
  server app2 10.0.0.22:443 check

backend be_api_tls
  mode tcp
  balance source
  server api1 10.0.1.21:443 check
  server api2 10.0.1.22:443 check

If you’re exploring mutual TLS for internal services or admin paths, I’ve got a step‑by‑step that keeps it calm: why mTLS and how to set it up cleanly. You can terminate client certs at your upstream Nginx while HAProxy simply routes by SNI—no decryption needed at the balancer.

And if you’re juggling WebSockets or gRPC behind a CDN, stick around; I’ll share a practical note on timeouts in a minute. I also wrote a deep dive on keeping these protocols happy: WebSockets and gRPC behind Cloudflare, without the tears.

Zero Downtime: Reloads, Drains, and Rollouts That Don’t Drop Users

Here’s the thing about “zero downtime”: it’s not just one feature. It’s a little orchestra of careful behaviors—hitless reloads, connection draining, sticky logic that behaves during deploys, and health checks that lead traffic away at the right time.

Hitless (seamless) reloads

Modern HAProxy supports master‑worker mode and socket inheritance so you can reload configs without dropping ongoing connections. It feels like magic the first time it works in production. The secret sauce is letting the new process take over the listeners while the old process finishes existing connections.

# /etc/haproxy/haproxy.cfg
global
  master-worker
  stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
  tune.ssl.default-dh-param 2048
  nbthread 4

defaults
  log global
  option dontlognull
  timeout connect 5s
  timeout client  60s
  timeout server  60s
  timeout tunnel  2h  # helps websockets/grpc

Then use your init system’s reload, not a hard restart. With systemd, a simple reload command hands off sockets cleanly. If you want to go deeper, the official docs on configuration and runtime API are worth a quiet Sunday read: HAProxy documentation and downloads and seamless reloads explained.
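
On a systemd host, that usually boils down to one line; the unit name can vary by distro, so treat this as a sketch:

# Validate the new config, then hand over listeners without dropping connections
haproxy -c -f /etc/haproxy/haproxy.cfg && sudo systemctl reload haproxy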

Draining connections during deploys

Before you restart a backend node, mark it as draining so new traffic stops while existing sessions finish. This is the little habit that prevents 499s and abandoned carts. You can do it with the admin socket:

echo "set server app_http/app1 state drain" | socat stdio /run/haproxy/admin.sock
# After deploy and quick checks
echo "set server app_http/app1 state ready" | socat stdio /run/haproxy/admin.sock

Pair draining with the on-marked-down shutdown-sessions server option to be decisive when a check fails, and slowstart when a server returns. It’s like landing a plane with flaps out and a nice long runway.

backend app_http
  mode http
  option httpchk GET /healthz
  http-check expect status 200
  default-server inter 2s rise 3 fall 2 slowstart 30s on-marked-down shutdown-sessions

Blue/green in real life

I’ve had great luck with “two backends, one front door.” Prepare your new version on a separate backend, warm it up with shadow traffic or synthetic requests, then flip which backend the frontend uses. If the logs get weird, flip back. No panic. If your infra is orchestrated with Terraform and friends, this fits nicely into a zero‑downtime workflow—I shared how I wire DNS and deploys together here: automating DNS and zero‑downtime deploys.
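
Here’s a minimal sketch of that flip using a runtime map, so switching colors doesn’t even need a reload. The map path, the “active” key, and the backend names are all assumptions for illustration:

# /etc/haproxy/colors.map contains a single line: "active app_blue"
frontend https_in
  mode http
  bind :443 ssl crt /etc/haproxy/certs/ alpn h2,http/1.1
  use_backend %[str(active),map(/etc/haproxy/colors.map,app_blue)]

backend app_blue
  mode http
  option httpchk GET /healthz
  http-check expect status 200
  server blue1 10.0.0.11:8080 check

backend app_green
  mode http
  option httpchk GET /healthz
  http-check expect status 200
  server green1 10.0.0.31:8080 check

Flipping traffic is then a runtime command, and flipping back is the same command with the other color:

echo "set map /etc/haproxy/colors.map active app_green" | socat stdio /run/haproxy/admin.sock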

Real‑World Recipes: Timeouts, WebSockets, gRPC, and PROXY Protocol

If there’s one category that causes silent pain, it’s timeouts. Healthy values make everything feel “buttery.” Bad ones lead to mystery disconnects.

Timeouts that keep long connections alive

For WebSockets and gRPC, raise the tunnel timeout generously. I often set it to an hour or more depending on your use case. Keep‑alive is important too if you’re doing L7 termination. If you’re behind a CDN like Cloudflare, align upstream timeouts with their expectations or you’ll see odd drops.

defaults
  timeout client 60s
  timeout server 60s
  timeout tunnel 2h   # websockets/grpc
  option http-keep-alive

If your app uses HTTP/2 (gRPC does), make sure ALPN includes h2 when you terminate TLS. For passthrough, your upstream must advertise ALPN—HAProxy won’t alter it.
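
As a standalone sketch of a terminated gRPC path, something like this works in recent HAProxy versions; the names, addresses, and port are placeholders, and proto h2 on the server line is what keeps the backend connection on HTTP/2:

# Terminated TLS in front of gRPC backends (h2 end to end)
frontend grpc_in
  mode http
  bind :443 ssl crt /etc/haproxy/certs/ alpn h2,http/1.1
  default_backend grpc_backend

backend grpc_backend
  mode http
  balance roundrobin
  server grpc1 10.0.4.11:50051 check proto h2
  server grpc2 10.0.4.12:50051 check proto h2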

WebSockets headers and upgrades

When HAProxy terminates TLS and speaks HTTP, make sure upgrade requests pass through untouched. Here’s the gentle nudge I add so upgrade flows are happy and not re‑encoded strangely:

backend ws_app
  mode http
  option http-server-close
  timeout server 2h
  timeout connect 5s
  server ws1 10.0.2.11:8080 check

If this world is your daily driver, I’ve written a separate guide to keep WebSockets and gRPC happy behind Cloudflare, with a focus on no‑drama timeouts and upgrades.

PROXY protocol: carrying the real client IP

When you pass TLS through, your upstream Nginx or app gateway might need the client’s real IP. Enter PROXY protocol. HAProxy can add it, and your upstream can read it without losing TLS passthrough. Just ensure both ends agree or things get messy.

frontend tls_router
  mode tcp
  bind :443
  tcp-request inspect-delay 5s
  tcp-request content accept if { req_ssl_hello_type 1 }
  use_backend be_app_tls

backend be_app_tls
  mode tcp
  balance source
  server app1 10.0.0.21:443 check send-proxy-v2
  server app2 10.0.0.22:443 check send-proxy-v2

On Nginx, enable proxy_protocol on the listener and make sure your real IP extraction trusts only your HAProxy IPs. It’s one of those features that feels invisible when done right—and ruins your day when misaligned.

Observability and Runtime Tweaks Without the Panic

One of my guilty pleasures is peeking at HAProxy’s runtime stats and making small adjustments mid‑incident. Not in a cowboy way—more like a short screwdriver twist on a valve that was too tight.

Expose the admin socket safely (local only, strict permissions) and get comfortable querying it during spikes. You can see which backends are hot, which servers are draining, and adjust weights temporarily to give a tired node a break.

# Watch server state and sessions
echo "show servers state" | socat stdio /run/haproxy/admin.sock

# Nudge a hot server down a bit
echo "set server app_http/app2 weight 30" | socat stdio /run/haproxy/admin.sock

Stick tables are another unsung hero. They’re great for gentle per‑IP rate control on login endpoints or to prevent abuse. I don’t love aggressive blocking as a default, but a small amount of shaping keeps everyone else fast and happy.

frontend https_in
  mode http
  bind :443 ssl crt /etc/haproxy/certs/
  stick-table type ip size 200k expire 10m store http_req_rate(10s)
  http-request track-sc0 src
  acl abuse sc_http_req_rate(0) gt 50
  http-request deny if abuse
  default_backend app_http

With runtime control, you can turn the dial up or down without a redeploy. If your team likes infrastructure as code (mine does), document your “incident playbook” for these tweaks so you’re not inventing it under pressure. I’ve shared how I write calm, repeatable runbooks here: no‑drama DR plans and runbooks.

A Few Patterns I Lean On (So You Don’t Need to Learn Them the Hard Way)

Over the years, certain patterns just keep paying rent. Think of these less like rigid rules and more like habits that make production feel boring—in the best way.

First, give your health checks a path to fail early in deploy scripts. I like to have a “ready” flag that flips only after database migrations have run, caches have warmed, and the app’s background workers are attached. When /healthz says 200, mean it.

Second, drain before deploy, and rejoin slowly. If you’re running autoscaling, make sure nodes that scale down drain too. You’d be surprised how many times instance termination scripts forget to tell HAProxy, and half of your slow session commits just vanish.
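
Stitched together, a per-node deploy step can be as small as this; it reuses the backend and server names from earlier, and the deploy script itself is hypothetical:

# Drain, deploy, wait for an honest /healthz, then rejoin (slowstart eases it back in)
echo "set server app_http/app1 state drain" | socat stdio /run/haproxy/admin.sock
./deploy_app.sh app1   # hypothetical deploy step for this node
until curl -fsS http://10.0.0.11:8080/healthz > /dev/null; do sleep 2; done
echo "set server app_http/app1 state ready" | socat stdio /run/haproxy/admin.sock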

Third, decide upfront where state lives. If your sticky sessions exist to mask stateful behavior, ask whether that state should live elsewhere—Redis, a session store, or a token strategy. Sticky sessions are a tool, not a crutch. They work great; they just shouldn’t be the only reason users stay happy.

Fourth, practice reloads. I treat HAProxy reloads like fire drills—roll keys, renew certs, shuffle weights, flip blue/green—so when the real day comes, it’s not your first rodeo. If you use Let’s Encrypt, script cert updates and seamless reloads; the official docs and blogs like the HAProxy seamless reload guide are genuinely helpful: a detailed walkthrough of seamless reloads.

Finally, zoom out sometimes. HAProxy is the traffic cop, but it’s part of a little ecosystem: app servers, databases, and caches all carry their weight. If you enjoy database side tuning, I have a gentle guide on splitting reads and writes at the MySQL layer with a friendly tone: ProxySQL for read/write split and pooling without the drama. Different layer, same idea—move traffic smartly where it needs to go.

Putting It All Together: A Clean, Practical Layout

Let’s combine the pieces into a simple, production‑flavored layout. You’ll see edge TLS termination for the main app, a passthrough SNI router for a special subdomain that needs end‑to‑end TLS, cookie stickiness for session stability, and a playbook for graceful deploys.

global
  master-worker
  stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
  nbthread 4

defaults
  log global
  option httplog
  option dontlognull
  timeout connect 5s
  timeout client 60s
  timeout server 60s
  timeout tunnel 2h

# L7 HTTPS termination for main app
frontend https_in
  mode http
  bind :443 ssl crt /etc/haproxy/certs/ alpn h2,http/1.1
  http-response set-header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload"
  http-response del-header Server
  acl host_app hdr(host) -i www.example.com example.com
  acl host_ws hdr(host) -i ws.example.com
  use_backend ws_app if host_ws
  use_backend app_http if host_app
  default_backend app_http

backend app_http
  mode http
  balance roundrobin
  cookie SRV insert indirect nocache httponly secure
  option httpchk GET /healthz
  http-check expect status 200
  default-server inter 2s rise 3 fall 2 slowstart 30s on-marked-down shutdown-sessions
  server app1 10.0.0.11:8080 check cookie s1
  server app2 10.0.0.12:8080 check cookie s2

backend ws_app
  mode http
  option http-server-close
  timeout server 2h
  server ws1 10.0.2.11:8080 check

# L4 SNI-based TLS passthrough for a special domain
frontend tls_passthrough
  mode tcp
  bind :8443
  tcp-request inspect-delay 5s
  tcp-request content accept if { req_ssl_hello_type 1 }
  acl sni_secure req.ssl_sni -i secure.example.com
  use_backend be_secure_tls if sni_secure
  default_backend be_secure_tls

backend be_secure_tls
  mode tcp
  balance source
  default-server check inter 2s rise 2 fall 2
  server sec1 10.0.3.21:443 check send-proxy-v2
  server sec2 10.0.3.22:443 check send-proxy-v2

Deploy flow? Mark app1 drain, deploy, mark ready, then repeat for app2. If you rotate certs, reload HAProxy hitlessly. If you feature‑toggle a new path, use L7 routing rules to canary traffic by header or hostname without touching upstreams. Simple, readable, and future‑you will thank you.
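
The header canary is a small pair of lines in the frontend; the header name and canary backend here are hypothetical:

# Send only opt-in requests to the canary backend
acl is_canary req.hdr(X-Canary) -m found
use_backend app_canary if is_canary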

Little Pitfalls That Sneak In

Three gotchas I see a lot. First: mixed trust chains. If you terminate TLS at the edge and re‑encrypt to the backend, make sure your upstream trusts the right CAs or uses pinned certs. A mismatch here leads to head‑scratching 502s. Second: health checks that hit the homepage. That route changes during marketing campaigns; your checks should live on a boring, purpose‑built path. Third: sticky sessions during autoscaling. When a node disappears, some “sticky” users feel a blip. Keep the session store centralized or tolerate a fast re‑login flow for safety.
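
For that first gotcha, the fix on the HAProxy side is to be explicit about verification when you re-encrypt; the CA bundle path below is an assumption:

# Re-encrypt to the backend and verify it against your internal CA
backend app_reencrypt
  mode http
  server app1 10.0.0.11:8443 ssl verify required ca-file /etc/haproxy/ca/internal-ca.pem check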

And a bonus one: gRPC and HTTP/2 behind older proxies. If you terminate TLS, be sure your ALPN list includes h2 and that your upstream speaks HTTP/2. Otherwise, clients silently downgrade and everything feels slower.

A Quick Nod to Backups and Runbooks

You can do everything right and still have a bad day if you don’t have a playbook for when the unexpected shows up. I keep a small checklist nearby: how to drain, how to reload, how to rotate a cert, how to back out a routing rule. It’s the type of thing you don’t want to invent live. If infrastructure hygiene is your jam, I shared my favorite “first boot to ready” blueprint here: from blank VPS to reproducible services with cloud‑init + Ansible.

And for the tracing nerds (I say that with love, because I am one), HAProxy logs are only half the story. Correlate them with app traces for true insight. If you’re curious about a friendly, real‑world approach to telemetry, this guide is for you: tracing that feels like a conversation with your app.

Wrap‑Up: The Calm Path to “It Just Works”

When people ask me why I still reach for HAProxy, I tell them it’s because it lets me be boring in production. And boring is beautiful. With the right blend of Layer 4 and Layer 7, health checks that reflect reality, sticky sessions that are deliberate—not accidental—and a TLS strategy that suits each service, you can create a traffic layer that never steals the show.

If you take anything from this, let it be this handful of habits. Decide up front what you need from L4 versus L7. Make your health checks honest. Treat sticky sessions as a tool, not a crutch. And practice graceful rollouts—drain first, reload hitlessly, and let servers rejoin gently. Your users won’t notice, and that’s the point.

And if you’re the kind of person who enjoys connecting the dots, you might like pairing this with a cert strategy that won’t bite you later and some calm infra automation to glue it all together. Between a little planning and HAProxy’s quietly powerful features, zero downtime stops being a slogan and starts being your default.

Hope this was helpful! If you’ve got questions or want me to dig into your specific setup, send them my way—I love tuning these flows until they feel silky. See you in the next post.

Frequently Asked Questions

When should I terminate TLS at HAProxy, and when should I pass it through?

Great question! Use termination when you want L7 features like cookie stickiness, header routing, rate limiting, and smart HTTP health checks. Use passthrough when you need end-to-end TLS to upstream (like for mTLS or special ALPN needs) or when another gateway handles SSL logic. You can mix both: terminate on :443 for most traffic, and route passthrough on another port or frontend by SNI for the services that need it.

How do I deploy new versions without dropping users?

Drain before you deploy. Mark a server as drain via the admin socket, finish active sessions, release the new version, then mark it ready with slowstart so it warms up gracefully. Pair that with honest health checks (e.g., /healthz returning 200 only when dependencies are ready) and HAProxy’s hitless reload. Together, these steps make updates feel invisible to users.

Which sticky-session approach should I pick?

At L7, cookie-based stickiness is the most predictable and avoids NAT pitfalls. If you already have an app session cookie, HAProxy can key off that. For cacheable paths, consider consistent hashing on URI. At L4, you’re mostly limited to source IP hashing, which can be skewed by NAT or proxies. Use stickiness intentionally, and try to keep application state out of instance memory so you’re not locked in.