Load Balancers (L4 vs L7) Explained

A load balancer distributes incoming requests across multiple backend servers. It’s the difference between “one server handling 1000 users” and “ten servers each handling 100.” It’s also how you survive a server crashing — the load balancer just stops sending traffic to it.

The two main types

Layer 4 (Transport-level)

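Looks only at IP and TCP/UDP headers. Distributes connections based on the connection tuple (source IP, source port, destination IP, destination port). Doesn’t understand HTTP or any other application protocol, so every packet on a connection goes to the same backend and the payload is never inspected.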

  • Pros: very fast, low overhead, protocol-agnostic, works for anything (databases, raw TCP, SMTP)
  • Cons: can’t make routing decisions based on URL, headers, cookies
  • Examples: AWS NLB, HAProxy in TCP mode, Linux IPVS, F5 BIG-IP
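To make the tuple-based decision concrete, here is a minimal sketch of how an L4 balancer might map a connection to a backend. The backend addresses and the choice of SHA-256 are assumptions for illustration; real balancers typically use faster hashes plus connection tracking.

```python
import hashlib

# Hypothetical backend pool (addresses are made up for this example).
BACKENDS = ["10.0.0.11:5432", "10.0.0.12:5432", "10.0.0.13:5432"]

def pick_backend(src_ip, src_port, dst_ip, dst_port, backends=BACKENDS):
    """Map a connection's 4-tuple to one backend, deterministically."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the pool size.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

if __name__ == "__main__":
    print(pick_backend("203.0.113.7", 51000, "198.51.100.1", 5432))
```

Nothing here reads the payload, which is why the same approach works for databases, SMTP, or any other TCP protocol.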

Layer 7 (Application-level)

Understands HTTP. Routes based on URL path, headers, cookies, methods. Can rewrite headers, terminate TLS, return cached responses.

  • Pros: smart routing (e.g., “/api/* to backend pool A, /static/* to pool B”), TLS termination, header manipulation
  • Cons: slower (parses every request), per-request overhead, only works for protocols the balancer can parse (HTTP and things built on it, such as gRPC and WebSocket)
  • Examples: AWS ALB, nginx, HAProxy in HTTP mode, Envoy, Traefik, Caddy
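The path-based routing mentioned above can be sketched as a prefix lookup. The pool names and the `ROUTES` table are hypothetical; real L7 balancers express the same idea in their own config syntax.

```python
# Hypothetical routing table: first matching prefix wins.
ROUTES = [
    ("/api/", ["api-1:8080", "api-2:8080"]),
    ("/static/", ["static-1:8080"]),
]
DEFAULT_POOL = ["web-1:8080", "web-2:8080"]

def choose_pool(path):
    """Return the backend pool for a request path."""
    for prefix, pool in ROUTES:
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL

if __name__ == "__main__":
    for path in ("/api/users", "/static/app.css", "/"):
        print(path, "->", choose_pool(path))
```

An L4 balancer cannot do this, because the path only exists once the HTTP request has been parsed.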

Common load balancing algorithms

| Algorithm | How it picks | When to use |
|---|---|---|
| Round robin | Each server in turn | All servers identical, all requests similar cost |
| Weighted round robin | More requests to bigger servers | Mixed server sizes |
| Least connections | Server with fewest active connections | Variable request duration (websockets) |
| Least response time | Fastest-responding server | Heterogeneous servers, performance matters |
| Hash (source IP) | Same client → same server | When session persistence matters |
| Random with two choices | Pick 2 random, send to less loaded | Often best balance simplicity/performance |
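Three of these algorithms fit in a few lines each. This is a minimal sketch, assuming the balancer already tracks active connection counts in a dict (the `active` argument is a hypothetical stand-in for that bookkeeping).

```python
import itertools
import random

class RoundRobin:
    """Hand out servers in a fixed repeating order."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

def least_connections(active):
    """active maps server name -> current open connections."""
    return min(active, key=active.get)

def two_random_choices(active, rng=None):
    """Sample two distinct servers and keep the less loaded one (needs >= 2 servers)."""
    rng = rng or random
    first, second = rng.sample(list(active), 2)
    return first if active[first] <= active[second] else second

if __name__ == "__main__":
    rr = RoundRobin(["a", "b", "c"])
    print([rr.pick() for _ in range(4)])
    load = {"a": 5, "b": 2, "c": 9}
    print(least_connections(load), two_random_choices(load))
```

Two random choices never picks the single most loaded server, yet it only has to look at two counters per request, which is part of why it holds up well at scale.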

Health checks

Load balancers periodically probe each backend to see if it’s alive. Failed probe → remove from rotation until healthy again.

  • TCP check — can I open a connection? (L4)
  • HTTP check — does GET /health return 200? (L7)
  • Custom — does the response body match expected text?

Tune intervals carefully — too aggressive and you mark healthy servers down; too lax and clients hit dead servers for minutes.
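One common way to avoid flapping is to require several consecutive failures before marking a server down, and several consecutive successes before bringing it back. Here is a rough sketch of that idea; the `fall` and `rise` parameter names and their defaults are assumptions for this example, not taken from any particular product.

```python
import socket

def tcp_check(host, port, timeout=1.0):
    """L4-style probe: can a TCP connection be opened at all?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

class HealthState:
    """Flip state only after `fall` straight failures or `rise` straight successes."""
    def __init__(self, fall=3, rise=2):
        self.fall = fall
        self.rise = rise
        self.healthy = True
        self._streak = 0  # consecutive probes that disagree with the current state

    def record(self, probe_ok):
        if probe_ok == self.healthy:
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.rise if probe_ok else self.fall
            if self._streak >= threshold:
                self.healthy = probe_ok
                self._streak = 0
        return self.healthy

if __name__ == "__main__":
    state = HealthState()
    for result in (False, False, False, True, True):
        print(result, "->", state.record(result))
```

With a probe every few seconds, `fall` multiplied by the interval is roughly how long clients can keep hitting a dead server, so the two numbers are worth tuning together.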

Sticky sessions (session affinity)

“Once a user lands on server A, keep them on server A.” Implemented by:

  • Source IP hashing (simple, but many clients behind one NAT all land on the same server, and a client whose IP changes mid-session loses affinity)
  • Cookie injection (LB sets a cookie naming the chosen backend)
  • Application cookie awareness (LB reads YOUR session cookie)

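Best practice: design stateless backends so you don’t need stickiness. Stickiness is a workaround for backends that keep session state in local memory instead of a shared store, and it means a server failure still logs out everyone pinned to that server.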
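Cookie injection, the second option in the list above, might look roughly like this. The cookie name `lb_backend` and the backend ids are made up for the example, and a real balancer would usually sign or obfuscate the value.

```python
import random

COOKIE_NAME = "lb_backend"  # hypothetical cookie name

def route(cookies, healthy):
    """Return (backend id, cookies to set on the response).

    cookies: cookies sent by the client, as a dict.
    healthy: set of backend ids currently passing health checks (non-empty).
    """
    pinned = cookies.get(COOKIE_NAME)
    if pinned in healthy:
        return pinned, {}  # keep the client where it already is
    # New client, or its pinned backend is down: choose again and re-pin.
    chosen = random.choice(sorted(healthy))
    return chosen, {COOKIE_NAME: chosen}

if __name__ == "__main__":
    print(route({}, {"a", "b"}))
    print(route({"lb_backend": "a"}, {"a", "b"}))
    print(route({"lb_backend": "a"}, {"b"}))
```

The last call shows the catch: when the pinned server dies, the client is moved anyway, and whatever state lived only on that server is gone.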

TLS termination

Common pattern: the client connects to the load balancer over HTTPS; the load balancer decrypts the traffic and forwards plain HTTP to the backends.

  • Pros: backends don’t manage certificates, and there is one central place to renew them
  • Cons: traffic between the LB and the backends is unencrypted, so either re-encrypt that hop (TLS or mTLS) or keep it on a private network
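One side effect worth knowing: once the balancer terminates TLS, the backend sees a plain HTTP request coming from the balancer’s IP. Balancers conventionally pass the original details along in `X-Forwarded-For` and `X-Forwarded-Proto` headers. A minimal sketch, assuming header names arrive already normalized to this exact casing (the function name is hypothetical):

```python
def forwarded_headers(client_ip, original):
    """Build the headers sent to the backend after TLS was terminated at the LB."""
    headers = dict(original)  # copy so the caller's dict is not modified
    prior = headers.get("X-Forwarded-For")
    # Append this hop's client address to any existing chain.
    headers["X-Forwarded-For"] = f"{prior}, {client_ip}" if prior else client_ip
    # The client spoke HTTPS to the balancer, even though this hop is plain HTTP.
    headers["X-Forwarded-Proto"] = "https"
    return headers

if __name__ == "__main__":
    print(forwarded_headers("203.0.113.7", {"Host": "example.com"}))
```

Backends should only trust these headers when the request really came from the balancer, since any client can send them too.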

Active-active vs active-passive

  • Active-active — multiple LBs all serve traffic simultaneously, distribute via DNS round-robin or anycast
  • Active-passive — one LB serves all traffic; standby takes over if primary fails (via VRRP, keepalived)
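The active-passive takeover boils down to “the highest-priority node that is still alive serves traffic.” The sketch below is a heavy simplification of what VRRP-style election does; the node names and priority values are hypothetical, and real implementations also handle advertisements, timers, and tie-breaking.

```python
from typing import Optional

def elect_active(priorities, alive) -> Optional[str]:
    """Pick the live node with the highest priority, or None if all are down.

    priorities: dict of node name -> priority (higher wins).
    alive: set of node names currently responding.
    """
    candidates = [node for node in priorities if node in alive]
    if not candidates:
        return None
    return max(candidates, key=lambda node: priorities[node])

if __name__ == "__main__":
    prio = {"lb1": 150, "lb2": 100}
    print(elect_active(prio, {"lb1", "lb2"}))  # primary serves
    print(elect_active(prio, {"lb2"}))         # standby takes over
```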

Cloud load balancers

| Provider | L4 | L7 |
|---|---|---|
| AWS | NLB (Network Load Balancer) | ALB (Application Load Balancer) |
| GCP | TCP/UDP Load Balancer | HTTP(S) Load Balancer |
| Azure | Azure Load Balancer | Application Gateway |
| Cloudflare | Spectrum | Load Balancer (HTTP) |

What to learn next

Forward and reverse proxies — two opposite jobs sharing one name. Up next.
