For a decade, the dominant Silicon Valley mantra has been: monoliths don't scale; microservices do. Netflix itself helped popularize this philosophy. Its migration from a monolithic Java app to a sprawling microservices ecosystem became the canonical case study taught at conferences and in blog posts worldwide.
But here's the twist: buried in the operational reality of running one of the most complex streaming platforms on Earth, Netflix accidentally revealed a counterintuitive truth — under certain conditions, monoliths actually scale better.
The Myth of Infinite Microservice Scalability
Microservices promise parallel scaling: each service can scale independently, teams can deploy without waiting on others, and failures can be contained. In theory, this makes systems more resilient and elastic.
In practice, the cracks show:
- Network Tax: Every service boundary introduces serialization, deserialization, network hops, retries, and timeouts.
- Operational Complexity: Each new service requires monitoring, CI/CD pipelines, alerting rules, and incident playbooks.
- Debugging Nightmares: Distributed traces may map hundreds of services for what used to be a single function call.
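The network tax compounds per hop. A back-of-the-envelope sketch of the arithmetic, using illustrative per-hop costs (the millisecond figures below are assumptions for illustration, not Netflix measurements):

```python
# Illustrative latency budget: every inter-service hop pays a fixed tax
# for serialization, a network round trip, and deserialization.
# These numbers are assumptions, not measured values.
SERIALIZE_MS = 0.3    # encode the request/response
NETWORK_MS = 1.0      # round trip within a data center
DESERIALIZE_MS = 0.3  # decode on the other side

def request_latency_ms(business_logic_ms: float, hops: int) -> float:
    """Total latency for one user request that crosses `hops`
    internal service boundaries."""
    per_hop_tax = SERIALIZE_MS + NETWORK_MS + DESERIALIZE_MS
    return business_logic_ms + hops * per_hop_tax

# The same 10 ms of business logic, three ways:
monolith = request_latency_ms(10.0, hops=0)   # in-process calls only
modest = request_latency_ms(10.0, hops=5)     # a shallow service graph
chatty = request_latency_ms(10.0, hops=30)    # a deep, chatty fan-out

print(monolith, modest, chatty)
```

With these assumed costs, the chatty design spends several times more wall-clock time on plumbing than on the business logic itself.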
Netflix, with its thousands of services, lives this reality every day.
The Surprise: A Monolithic Core That Scaled Better
Behind the curtain, not everything at Netflix is microservices. Certain workloads — recommendation engines, billing, playback authorization — have been consolidated into larger, monolithic systems. Why?
Because when traffic spikes (say, a hit show drops at midnight), these monolithic cores scale horizontally more efficiently than dozens of chatty microservices.
A single deployment artifact, a single runtime, a single scaling policy. Fewer moving parts mean fewer chances for cascading failure.
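One reason a single scaling policy helps: separate services must each be provisioned for their own individual peak, while a monolith pools capacity across workloads whose peaks rarely coincide. A minimal sketch of that capacity arithmetic, with hypothetical loads and an assumed 80% peak overlap:

```python
import math

# Hypothetical peak load (requests/sec) for three workloads that could
# live in one monolith or in three separate services.
peaks = {"recommendations": 900, "billing": 250, "playback_auth": 600}
INSTANCE_CAPACITY = 400  # req/sec one instance can serve (assumption)

# Microservices: each service is sized for its own peak independently,
# so rounding-up waste accrues per service.
per_service_instances = sum(
    math.ceil(p / INSTANCE_CAPACITY) for p in peaks.values()
)

# Monolith: one fleet sized for the combined peak. Because the three
# peaks don't all hit at the same moment, assume the combined peak is
# only 80% of the sum (integer math keeps the example exact).
combined_peak = sum(peaks.values()) * 4 // 5
monolith_instances = math.ceil(combined_peak / INSTANCE_CAPACITY)

print(per_service_instances, monolith_instances)
```

Under these assumptions the pooled fleet needs fewer instances for the same traffic; the effect grows as workloads get burstier and less correlated.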
Microservices vs. Monolith at Scale
Microservices:

[Client] -> [API Gateway]
                 |
           [Service A] -> [Service B]
                 |
           [Service C] -> [DB]

Monolith:

[Client] -> [Monolith Cluster]
                 |
     (Internal function calls)
                 |
               [DB]

Every hop in microservices adds latency and operational overhead. The monolith routes calls internally in memory: faster, cheaper, simpler.
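The difference between the two paths can be made concrete. A minimal sketch, using JSON encode/decode to stand in for the per-hop serialization a real RPC would pay (no actual network involved, and the function names are hypothetical):

```python
import json

def authorize_playback(user_id: int, title_id: int) -> dict:
    """The business logic itself, identical in both architectures."""
    return {"user_id": user_id, "title_id": title_id, "allowed": True}

# Monolith path: a plain in-process function call. No encoding at all.
def monolith_call(user_id: int, title_id: int) -> dict:
    return authorize_playback(user_id, title_id)

# Microservice path: every boundary crossing serializes the request,
# "sends" it, and deserializes on the other side. Even with the network
# removed from this sketch, the encode/decode work remains.
def microservice_call(user_id: int, title_id: int) -> dict:
    wire_request = json.dumps({"user_id": user_id, "title_id": title_id})
    decoded = json.loads(wire_request)       # downstream service decodes
    result = authorize_playback(**decoded)   # runs the same logic
    wire_response = json.dumps(result)       # downstream encodes reply
    return json.loads(wire_response)         # caller decodes reply

# Both return the same answer; only the overhead differs.
assert monolith_call(42, 7) == microservice_call(42, 7)
```

The business logic is byte-for-byte identical; the microservice path simply wraps it in translation work at every boundary.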
Benchmark Snapshot (Illustrative)
Netflix engineers once compared microservice-heavy and monolithic-heavy designs under peak load. In the simplified retelling, the monolith not only responded faster, but required fewer teams in the war room when outages occurred.
Why the Industry Missed This
- Narrative Momentum — Microservices became the default "modern" answer; engineers stopped questioning it.
- Org Fit Over Tech Fit — Microservices solved scaling of teams, not always scaling of systems.
- Conference Bias — Case studies showcased microservice success stories, while the quiet wins of monoliths went unpublished.
Netflix's reality shows both: microservices and monoliths, coexisting where each makes sense.
Monolith or Microservices?
Do you have one team or many teams?
|
+-- One team -> Prefer monolith: faster iteration, simpler ops.
+-- Many teams -> Microservices may reduce team friction.
Is your main bottleneck scaling traffic or scaling people?
|
+-- Traffic -> Monolith often wins (fewer network hops).
    +-- People -> Microservices may help (team autonomy).

The Deeper Lesson
Netflix never abandoned microservices — but their quiet reliance on monolithic cores proves that scale is context-dependent. The lesson isn't that one architecture is universally better. It's that blindly following hype can backfire.
Monoliths, when built carefully and deployed with modern tooling, can scale astonishingly well. Netflix's accidental proof is a reminder: sometimes the old ways still work — and sometimes, they work even better.