Production-ready Kubernetes Part 10 - Service Meshes: Power Tool or Operational Burden?
Understanding when sidecars, eBPF, or CNI-level networking make sense—and when a service mesh is just unnecessary complexity
3/26/2026
Service meshes are often presented as the “final step” in Kubernetes maturity.
Once you have:
- deployments
- observability
- networking
- security
…you “graduate” into a service mesh.
At least, that’s the narrative.
In reality, many teams introduce a service mesh and quickly find themselves dealing with:
- unexplained latency
- broken networking paths
- certificate rotation issues
- complex debugging workflows
All for features they barely use.
This is because a service mesh is not just a tool.
It is a fundamental change to how networking works in your cluster.
Before adopting one, you need to understand:
What problems it actually solves—and whether you truly have those problems.
In Kubernetes today, there are three dominant approaches:
- 1️⃣ Sidecar-based meshes
- 2️⃣ eBPF-powered meshes
- 3️⃣ CNI-level encryption (mesh-lite)
Each comes with different tradeoffs in:
- performance
- complexity
- operational burden
1️⃣ Sidecar-Based Meshes — The Classic Approach
Sidecar-based meshes are the most widely adopted model.
Tools like:
- Istio
- Linkerd
work by injecting a proxy container (sidecar) into every pod.
Example:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  template:
    spec:
      containers:
        - name: app
          image: my-api
        - name: envoy
          image: envoyproxy/envoy
```
All traffic entering and leaving the pod flows through the proxy.
What problem this solves
Sidecars give you Layer 7 (application-level) control:
- retries
- timeouts
- circuit breaking
- traffic splitting (canary releases)
- mTLS between services
Example: traffic split (canary)
```yaml
# conceptual example (Istio VirtualService)
http:
  - route:
      - destination:
          host: api-v1
        weight: 90
      - destination:
          host: api-v2
        weight: 10
```
This enables fine-grained traffic control without modifying application code.
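mTLS is handled the same way: declaratively, by the mesh. As a conceptual sketch, a mesh-wide Istio `PeerAuthentication` resource can require mTLS for all workloads (applying it in the root namespace, typically `istio-system`, makes it mesh-wide):

```yaml
# conceptual example (Istio PeerAuthentication)
# In the root namespace, STRICT mode enforces mTLS for all services.
apiVersion: security.istio.io/v1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
```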
The Tradeoffs — The “Sidecar Tax”
Every pod now has:
- an extra container
- additional CPU and memory usage
- extra network hops
This introduces:
- latency overhead (even if small, it compounds)
- increased resource consumption across the cluster
- more moving parts to debug
Operational complexity increases significantly:
- certificate management (mTLS)
- control plane upgrades
- sidecar injection issues
- version compatibility
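Sidecar injection issues, in particular, often trace back to namespace configuration. In Istio, for example, automatic injection is typically toggled with a namespace label (the namespace name here is illustrative):

```yaml
# conceptual example: enabling automatic sidecar injection in Istio
apiVersion: v1
kind: Namespace
metadata:
  name: payments          # illustrative namespace
  labels:
    istio-injection: enabled   # pods created here get the proxy injected
```

Forgetting this label (or mismatched injection versions after an upgrade) is a common source of "why is this pod not in the mesh?" debugging sessions.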
When it makes sense
Sidecar meshes are justified when you need:
- advanced traffic routing (canary, A/B testing)
- strict Zero Trust (mTLS everywhere)
- deep service-to-service observability
- platform-level control over networking behavior
When it’s overkill
Avoid sidecars if:
- your services are simple CRUD APIs
- you don’t use L7 routing features
- you don’t need per-request observability
- your team struggles with operational complexity already
In these cases, you are paying the sidecar tax without real benefits.
2️⃣ eBPF-Powered Meshes — The Kernel Approach
eBPF-based solutions (like Cilium) take a different approach.
Instead of injecting proxies into pods, they move networking logic into the Linux kernel.
What problem this solves
eBPF allows you to:
- intercept traffic at the kernel level
- apply policies without sidecars
- observe traffic with minimal overhead
This results in:
- lower latency
- reduced resource consumption
- simpler pod definitions (no sidecars)
How it works (simplified)
Instead of:
Pod → Sidecar → Network
You get:
Pod → Kernel (eBPF) → Network
No extra container is required.
Capabilities
Depending on the implementation, eBPF meshes can provide:
- network policies (L3/L4)
- some L7 visibility
- encryption (e.g., WireGuard)
- observability (flow-level metrics)
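As a sketch of what sidecar-free policy looks like, a Cilium network policy can combine L3/L4 rules with limited L7 matching (label and resource names here are illustrative):

```yaml
# conceptual example (CiliumNetworkPolicy)
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-frontend-to-api
spec:
  endpointSelector:
    matchLabels:
      app: api             # policy applies to pods with this label
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend  # only frontend pods may connect
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET   # L7 rule enforced without a sidecar in the pod
```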
Tradeoffs
The complexity shifts from application layer → kernel layer.
This introduces:
- steeper learning curve
- harder debugging (issues surface at the kernel level and require specialized tooling)
- dependency on specific kernel features
Also, L7 features may not be as rich or flexible as those of sidecar-based meshes.
When it makes sense
eBPF meshes are ideal when:
- performance is critical
- you want to avoid sidecar overhead
- you need strong networking + observability integration
- your team has platform expertise
When it’s not the right fit
Avoid if:
- your team lacks low-level networking expertise
- you rely heavily on L7 routing features
- you prefer simpler, more explicit architectures
3️⃣ CNI-Level Encryption — The Lightweight Alternative
Not every system needs a “full” service mesh.
Sometimes, the primary requirement is simply to encrypt traffic between nodes.
This can be achieved directly at the CNI level.
Example: enabling WireGuard in a CNI.
```yaml
# conceptual example (Cilium)
encryption:
  enabled: true
  type: wireguard
```
What problem this solves
- encryption in transit
- minimal overhead
- no sidecars
- no control plane complexity
What you don’t get
- no L7 routing
- no traffic shaping
- no retries/circuit breaking
- limited observability
Tradeoffs
This approach is:
- simple
- efficient
- limited
But that’s often exactly what many systems need.
When it makes sense
Use this approach when:
- you only need encryption
- your services are simple
- you want minimal operational overhead
- you already handle retries/timeouts in code
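When retries live in application code, they can be as small as a helper like the following: a minimal Python sketch (not tied to any framework) of retries with exponential backoff and jitter, which covers much of what a mesh's retry policy would otherwise provide:

```python
import random
import time


def call_with_retries(fn, max_attempts=3, base_delay=0.1):
    """Call fn(), retrying on exception with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Exponential backoff with jitter avoids synchronized retry storms.
            delay = base_delay * (2 ** (attempt - 1)) * (1 + random.random())
            time.sleep(delay)
```

In real services you would narrow the `except` clause to transient errors (timeouts, connection resets) so that permanent failures fail fast.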
When it’s not enough
Avoid if:
- you need traffic shaping or canary deployments
- you require deep observability at request level
- you need centralized networking control
Conclusion
Service meshes are powerful—but they are not free.
They introduce:
- operational overhead
- architectural complexity
- performance tradeoffs
The key question is not:
“Should we use a service mesh?”
But rather:
“What problem are we trying to solve?”
In many systems:
- sidecar meshes are overkill
- eBPF solutions are a better balance
- or no mesh at all is the right answer
A production-ready Kubernetes platform is not defined by the number of tools it uses.
It is defined by intentional architectural decisions.
Actionable Steps
Step 1 — Identify your real requirements
Do you actually need:
- mTLS everywhere?
- traffic splitting?
- per-request observability?
Or are these just “nice to have”?
Step 2 — Measure your current system
Before adding a mesh, evaluate:
- latency
- resource usage
- failure patterns
Don’t optimize problems you don’t have.
Step 3 — Start with the simplest solution
Prefer:
- CNI-level encryption
- application-level retries
before introducing a full mesh.
Step 4 — Evaluate operational cost
Consider:
- team expertise
- debugging complexity
- upgrade burden
A tool that your team cannot operate safely is a liability.
Step 5 — Introduce complexity incrementally
If you adopt a mesh:
- start small
- enable only required features
- avoid “turning everything on”
Related Posts
Production-ready Kubernetes Series:
- Part 1 - Observability Foundations
- Part 2 - Observability Stacks
- Part 3 - Availability - Graceful Termination
- Part 4 - Availability - Kubernetes Components
- Part 5 - Cost Optimization
- Part 6 - Alternatives - Tradeoff Analysis
- Part 7 - Security - Hardening
- Part 8 - Security - Secrets
- Part 9 - Networking - Resources
- Part 10 - Networking - Service Mesh
- Part 11 - Multi-region & Disaster Recovery