Note

Istio Ambient Mesh: sidecarless and the end of the double hop

Ambient took Envoy out of every pod. Where that removes the double hop and the overhead — and where the sidecar is still justified.

A service mesh is sold as a way to get mTLS, resilience and observability without touching application code. For most of a decade the price was the sidecar: an Envoy proxy next to every pod, carrying all of its traffic. That is convenient while pods are counted in dozens. At scale the sidecar becomes a tax that everyone pays, even services that need nothing from the mesh but mTLS. Ambient mode, which went stable in Istio 1.24 in November 2024, takes Envoy out of the pod and rewrites those economics.

Where the sidecar model hits a ceiling

The sidecar's central problem is the double hop. A request from pod A to pod B passes through two full Envoys: A's outbound proxy and B's inbound proxy. Even when all you need is mTLS at L4, HTTP traffic is still parsed and proxied at L7 twice. That costs latency and CPU on every call.

Then comes the cost of ownership. An Envoy in every pod means tens to hundreds of megabytes of memory, depending on mesh size and configuration, multiplied by the replica count. The sidecar is tied to the application lifecycle: it has to be ready before the application container starts sending traffic (hence the holdApplicationUntilProxyStarts workaround, without which the first requests can be lost), it slows shutdown, and it fits batch jobs poorly, because a Job does not complete while the proxy container keeps running. Enabling injection requires restarting pods, and upgrading the data plane means rolling the entire fleet. And some workloads simply do not tolerate a sidecar.

Ambient: two layers instead of one proxy

Ambient splits what the sidecar held in a single Envoy into two independent layers. ztunnel is a per-node component (a DaemonSet, written in Rust) that takes on L4 and mTLS. It builds the mesh over HBONE tunnels (HTTP/2 CONNECT carried over mTLS) and encrypts traffic between nodes without a single proxy inside the pod. A waypoint is an Envoy deployed separately from the application pods and attached to a namespace, a service or a workload; it comes into play only when L7 is needed: header-based routing, traffic splitting, retries, L7 authorization.

The key idea is to pay for a layer by actual use. Want cheap zero-trust mTLS across the whole cluster? One ztunnel per node is enough, and Envoy appears nowhere. Want smart L7 on a specific service? You add a waypoint surgically, for that one service.
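The pay-per-use argument can be made concrete with back-of-the-envelope arithmetic. The sketch below is only an illustration: the function names are made up for this note, and every memory figure is a hypothetical placeholder, not a measurement. Substitute numbers from your own cluster.

```python
# All figures below are hypothetical placeholders, not measurements.

def sidecar_proxy_memory_mib(replicas: int, per_sidecar_mib: float) -> float:
    """Sidecar mode: one Envoy per pod replica."""
    return replicas * per_sidecar_mib


def ambient_proxy_memory_mib(nodes: int, per_ztunnel_mib: float,
                             waypoints: int = 0,
                             per_waypoint_mib: float = 0.0) -> float:
    """Ambient mode: one ztunnel per node, plus only the waypoints you deploy."""
    return nodes * per_ztunnel_mib + waypoints * per_waypoint_mib


if __name__ == "__main__":
    # Hypothetical cluster: 1000 pod replicas spread over 50 nodes,
    # with 4 waypoints for the few services that need L7.
    sidecar = sidecar_proxy_memory_mib(replicas=1000, per_sidecar_mib=100)
    ambient = ambient_proxy_memory_mib(nodes=50, per_ztunnel_mib=50,
                                       waypoints=4, per_waypoint_mib=150)
    print(f"sidecar: {sidecar:.0f} MiB, ambient: {ambient:.0f} MiB")
```

The point is the shape of the formula rather than the totals: the sidecar bill scales with replicas, the ambient bill with nodes plus the number of waypoints you chose to add.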

Where the double hop disappears

L4 traffic in ambient travels the path "pod → node's ztunnel → remote node's ztunnel → pod". A full Envoy takes no part: ztunnel is a purpose-built L4 proxy, an order of magnitude lighter than Envoy, and it keeps getting faster (across four releases its performance grew by roughly 75%). The sidecar's double proxying is simply absent here.

When L7 is genuinely needed, one waypoint is inserted on the path, not two sidecars at both ends. Instead of "always two heavy Envoys per call" you get "ztunnel for everyone, plus one waypoint where there is something to pay for". That is the end of the double hop as a mandatory charge.
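The three request paths can be compared by counting how many full Envoy proxies each one crosses. This is a toy model, not a trace of real traffic. The hop names are labels chosen for this note, and the ambient L7 path (source ztunnel, then waypoint, then destination ztunnel) reflects the usual description of ambient routing rather than anything stated above.

```python
from typing import Dict, List

# Components that are full Envoy proxies; ztunnel is deliberately absent.
ENVOY_BASED = {"sidecar", "waypoint"}

# Toy request paths from pod A to pod B in each mode.
PATHS: Dict[str, List[str]] = {
    "sidecar": ["pod A", "sidecar", "sidecar", "pod B"],
    "ambient L4": ["pod A", "ztunnel", "ztunnel", "pod B"],
    "ambient L7": ["pod A", "ztunnel", "waypoint", "ztunnel", "pod B"],
}


def envoy_traversals(path: List[str]) -> int:
    """Count the hops on a path that are full Envoy proxies."""
    return sum(1 for hop in path if hop in ENVOY_BASED)


if __name__ == "__main__":
    for mode, path in PATHS.items():
        print(f"{mode}: {envoy_traversals(path)}")
    # prints:
    # sidecar: 2
    # ambient L4: 0
    # ambient L7: 1
```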

An honest counterpoint

There is no free mesh; only the shape of the bill changes. ztunnel is shared per node, so the blast radius shifts from the pod to the node: its failure or overload hits every pod on the node, whereas a crashed sidecar took down a single pod. The ambient ecosystem is younger than the sidecar one: there are fewer battle-tested production stories, some advanced scenarios are still catching up, and debugging moves from a per-pod proxy to ztunnel and waypoint logs. Ambient also requires a compatible CNI and a reasonably recent Linux kernel. This is a deliberate architectural choice, not "turn it on and forget".
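The shift in blast radius is easy to state as code. The sketch below uses a hypothetical pod placement with made-up node and pod names, and assumes every pod on a node is enrolled in the mesh.

```python
from typing import Dict, List

# Hypothetical placement of pods on nodes; all names are made up.
PLACEMENT: Dict[str, List[str]] = {
    "node-1": ["cart-1", "cart-2", "search-1"],
    "node-2": ["search-2", "payments-1"],
}


def affected_by_sidecar_crash(pod: str) -> List[str]:
    """A crashed sidecar cuts off only the pod it belongs to."""
    return [pod]


def affected_by_ztunnel_failure(placement: Dict[str, List[str]],
                                node: str) -> List[str]:
    """A failed ztunnel cuts off every meshed pod scheduled on its node."""
    return list(placement.get(node, []))


if __name__ == "__main__":
    print(affected_by_sidecar_crash("cart-1"))
    print(affected_by_ztunnel_failure(PLACEMENT, "node-1"))
```

In practice this argues for treating ztunnel like any other node-critical DaemonSet: monitor it per node and spread replicas of important services across nodes.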

Where it pays off and where it does not

The decision rule is simple. Ambient is justified in three cases: when you want zero-trust mTLS cheaply and cluster-wide; when pods are large and densely packed, as on GPU nodes running LLM inference, where a sidecar per replica eats a noticeable share of memory (which is why ambient became a hot topic for AI workloads in 2026); and when the cluster holds workloads that do not accept a sidecar. The classic sidecar still fits where you need strict per-pod isolation or a feature ambient does not yet have. And if you need no L7 policy, no mTLS and no mesh at all, a plain CNI is enough; do not pay for capabilities you never use.
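The rule above can be written down as a small function. This is one possible encoding, not an official checklist. The parameter names are invented for this note, and real decisions will weigh more factors, such as memory pressure on dense nodes.

```python
def choose_data_plane(needs_mtls: bool,
                      needs_l7: bool,
                      needs_per_pod_isolation: bool = False,
                      needs_feature_missing_in_ambient: bool = False) -> str:
    """Return a suggested data plane for a workload or cluster."""
    # No mesh capability is needed at all: do not pay for one.
    if not (needs_mtls or needs_l7):
        return "plain CNI, no mesh"
    # Hard requirements that ambient cannot meet keep the sidecar in play.
    if needs_per_pod_isolation or needs_feature_missing_in_ambient:
        return "sidecar"
    # Otherwise ambient, with a waypoint only where L7 is needed.
    if needs_l7:
        return "ambient: ztunnel plus a waypoint for the services that need L7"
    return "ambient: ztunnel only"


if __name__ == "__main__":
    print(choose_data_plane(needs_mtls=True, needs_l7=False))
    print(choose_data_plane(needs_mtls=True, needs_l7=True))
    print(choose_data_plane(needs_mtls=True, needs_l7=True,
                            needs_per_pod_isolation=True))
    print(choose_data_plane(needs_mtls=False, needs_l7=False))
```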

Where to start

Ambient's strength is incrementality. It turns on per namespace, without re-injection or pod restarts. Start with L4: bring up ztunnel and get mTLS across the cluster almost for free. Add waypoints surgically — only to the services that genuinely need L7 routing or authorization. Do not migrate the whole mesh at once: ambient and sidecar coexist, and the right strategy is to move traffic in layers, not in a single leap.
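The two steps, enrolling a namespace at L4 and adding a waypoint for L7, come down to two small Kubernetes objects. The sketch below builds them as Python dictionaries and prints JSON, which kubectl accepts in place of YAML. The namespace name "shop" and the helper functions are hypothetical. The labels (istio.io/dataplane-mode, istio.io/use-waypoint, istio.io/waypoint-for), the istio-waypoint gateway class and the HBONE listener on port 15008 are written from memory of the Istio ambient documentation; verify them against the docs for your Istio version. The sketch also assumes the Kubernetes Gateway API CRDs are installed.

```python
import json
from typing import Optional


def ambient_namespace(name: str, use_waypoint: Optional[str] = None) -> dict:
    """Namespace enrolled in ambient mode; optionally routed via a waypoint."""
    labels = {"istio.io/dataplane-mode": "ambient"}
    if use_waypoint is not None:
        # Sends traffic for this namespace's services through the named waypoint.
        labels["istio.io/use-waypoint"] = use_waypoint
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": labels},
    }


def waypoint_gateway(namespace: str, name: str = "waypoint") -> dict:
    """Gateway API object that asks Istio to run a waypoint proxy."""
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "Gateway",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "labels": {"istio.io/waypoint-for": "service"},
        },
        "spec": {
            "gatewayClassName": "istio-waypoint",
            "listeners": [
                {"name": "mesh", "port": 15008, "protocol": "HBONE"},
            ],
        },
    }


if __name__ == "__main__":
    # Step 1: L4 only. Step 2: add a waypoint and point the namespace at it.
    print(json.dumps(ambient_namespace("shop"), indent=2))
    print(json.dumps(waypoint_gateway("shop"), indent=2))
    print(json.dumps(ambient_namespace("shop", use_waypoint="waypoint"), indent=2))
```

Each printed object can be saved to a file and applied with kubectl apply -f. On an existing namespace the same effect is usually achieved by labeling it in place rather than re-applying the whole object.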

© 2026 axyi.ru · CC BY 4.0