Why Multi-Cloud Failed
Multi-cloud had many theoretical benefits. Cost-efficient. Flexible. No vendor lock-in. Best-of-breed services. Increased bargaining power. Risk mitigation.
Of course, none of these came true. In fact, many of these predictions ended up working in the opposite direction. Here are some reasons why multi-cloud failed.
Cost inefficient. Data transfer costs rise roughly in this order: (1) inter-region, (2) inter-cloud, and (3) over the internet. This can make cross-cloud network fees extremely expensive compared to regional deployments.
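To make that ordering concrete, here is a minimal sketch of the arithmetic. The per-GB rates are placeholders chosen only to respect the ordering above; they are not any provider's actual prices, and the tier and function names are hypothetical.

```python
# Hypothetical per-GB transfer rates in USD. These are illustrative
# placeholders, NOT real prices from any cloud provider.
HYPOTHETICAL_RATE_PER_GB = {
    "same_region": 0.00,
    "inter_region": 0.02,
    "inter_cloud": 0.05,
    "public_internet": 0.09,
}


def monthly_transfer_cost(gb_per_day: float, tier: str, days: int = 30) -> float:
    """Return the transfer bill for moving gb_per_day over the given tier."""
    return gb_per_day * days * HYPOTHETICAL_RATE_PER_GB[tier]


if __name__ == "__main__":
    # Compare the same workload (1 TB/day) across each tier.
    for tier in HYPOTHETICAL_RATE_PER_GB:
        print(f"{tier:>16}: ${monthly_transfer_cost(1000, tier):,.2f}/month")
```

Whatever the real rates are, the shape is the same: the bill scales linearly with volume, so a workload that is free to run inside one region can become a significant line item once it straddles clouds.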
Slow data transfers. Cross-cloud transfers are not only more costly. Data sent over the public internet is also much slower than data sent over Google Cloud’s private backbone or AWS’s global network.
Security and compliance gaps. There’s no virtual private cloud (VPC) abstraction that spans clouds. Going over the internet means publicly accessible endpoints, which are sometimes easier to exploit than services on private or isolated cloud networks. How do you enforce consistent governance policies across clouds with different identity solutions?
Wide API surface means lack of interoperability. The S3 API isn’t just a CRUD wrapper around file storage. It’s a deep, hard-to-emulate API, and there are customers who depend on its esoteric features (Hyrum’s Law).
Any attempt at interoperability ended up as a least-common-denominator design. You have two choices. You can union all the features and end up with a complex API that still might be implementation-specific. Or you can intersect the APIs and end up with a small but potentially useless set of features that are common across clouds.
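The two choices map directly onto set operations. The sketch below uses made-up feature names for two hypothetical object stores; they are illustrative only and not an exact list for any real service.

```python
# Hypothetical feature sets for two object stores. The feature names are
# illustrative and are not a complete or exact list for any real service.
cloud_a = {"put", "get", "delete", "list", "versioning", "object_lock", "select_queries"}
cloud_b = {"put", "get", "delete", "list", "versioning", "retention_policies"}

# Choice 1: union. Everything is exposed, but some calls only work on one cloud.
union_api = cloud_a | cloud_b

# Choice 2: intersection. Everything works everywhere, but the API is small.
common_api = cloud_a & cloud_b

# Features in the union that at least one cloud cannot honor.
not_portable = union_api - common_api

if __name__ == "__main__":
    print("union:", sorted(union_api))
    print("intersection:", sorted(common_api))
    print("not portable:", sorted(not_portable))
```

Every feature in `not_portable` is either a source of implementation-specific behavior (if you chose the union) or something a customer has to give up (if you chose the intersection).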
Vendor lock-in. Over time, cloud providers moved up the stack. Serverless runtimes (AWS Lambda, Google Cloud Run) didn’t have true analogs across clouds. Specialized tooling requires different skills (an AWS certification won’t help you on Google Cloud).
Discounts. Large spend on a single cloud can be heavily discounted. Nobody pays the sticker price.
Theoretical but not practical failover support. AWS has never had a full global outage. Even individual regions have extremely high reliability across most services. The benefit of using multi-cloud for failover is rarely worth the effort.
Worse developer experience. Aggregating logs, deployments, and other data across clouds requires extra work. Even where tools for this exist, they add another ETL pipeline for developers to deal with.
But some cracks are beginning to emerge in the mono-cloud culture. Some ideas:
Infrastructure-as-code. It adds a more programmatic layer to cloud infrastructure. APIs are much easier to migrate than UIs. While it isn’t trivial to convert an AWS Terraform configuration to a Google Cloud Terraform configuration, it’s at least a little easier to reason about than it was before.
Framework-defined software. While cloud services aren’t fungible in every way, there are some common feature sets. By defining a smaller set of functionality (possibly across services) for a specific workflow, you might be able to replicate that set on multiple clouds.
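As a rough illustration of that idea, the sketch below defines the smallest storage interface a single workflow needs and writes the workflow against it. `BlobStore`, `InMemoryBlobStore`, and `save_report` are hypothetical names invented for this example; a real adapter would wrap one cloud's SDK behind the same two methods.

```python
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical minimal interface that a workflow needs from object storage."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore:
    """Stand-in backend for the sketch; a real adapter would wrap a cloud SDK."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def save_report(store: BlobStore, name: str, body: str) -> str:
    """Workflow code depends only on BlobStore, not on a specific cloud."""
    key = f"reports/{name}.txt"
    store.put(key, body.encode("utf-8"))
    return key


if __name__ == "__main__":
    store = InMemoryBlobStore()
    key = save_report(store, "q1", "hello")
    print(key, store.get(key))
```

The interface is deliberately tiny. The narrower the feature set a workflow commits to, the more likely each cloud can satisfy it without leaking provider-specific behavior.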
Standardized infrastructure like Kubernetes. Part of the value of Kubernetes is not the implementation but the standardization of infrastructure. It can act as a common deployment substrate for third-party SaaS applications. Standard infrastructure APIs open up new opportunities (see Kubernetes as a dev tool).