
This multi-part blog series is a backstory that deep-dives into the motivation behind building Kubernetes Operators for running databases. This is the eighth (and last!) installment that summarizes the whole series and the reasons for creating an Assisted and AI-Assisted Database Management product.
Written by: Boris Dali, Database Engineer @ Google (LinkedIn)
Disclaimer
First off, the disclaimer: this blog series is neither official Google documentation, nor is it an authoritative source. For the former, the docs on AlloyDB Omni in particular, please see here; for the latter, please feel free to reach out to the AlloyDB product team. Google Cloud official blogs are posted here. Medium hosts the Google Cloud Community here. This is neither. The opinions I express in this blog post series are my own and may not represent or agree with Google’s official position on the subject.
So what is it then? Well, if you think of this blog series as one engineer’s pseudo-random ramblings on a particular topic (“running databases in containers on K8s”, that is), you won’t be far off 🙂.
Blog Series Summary
If you missed part 1, part 2, part 3, part 4, part 5, part 6 and part 7 of this backstory and want to follow along, it may be more logical to start there (and yes, if you haven’t read it yet, apologies in advance, it’s not exactly a short read 🙂).
To recap, I started with the goal of exploring my reasoning behind the decision to invest in containers and K8s as a viable alternative to the more mainstream ways of running databases, but went down the rabbit hole of building the foundation first. It took me four installments in this blog series to define with sufficient (I hope) depth what those 13 expectations are. In the fifth installment I presented what I referred to as “a reality check” to see how cloud providers’ DBaaS systems fare against my expectations. TL;DR: they didn’t 😭.
This led me to the sixth installment in this series where I outlined the solution that in my mind closes the gap and satisfies many of my 13 expectations. That solution is what I referred to as the Assisted and AI-Assisted Database Management product. Portable, run-anywhere (including temporarily disconnected and even fully air-gapped), leaving a customer in the driver’s seat with “root” privileges and offering a declarative management interface, just like SQL, but in the form of YAML manifests that state the desired database intent.
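To make the intent-based idea a bit more concrete, here is a minimal sketch of what such a manifest could look like. The `Database` kind, the `dbops.example.com` API group and all of the field names below are hypothetical, invented purely for illustration; a real operator (El Carro, CNPG, etc.) defines its own schema.

```yaml
# Hypothetical sketch only: the API group, kind and fields are made up for illustration.
# The point is that the customer declares *what* they want and the operator works out *how*.
apiVersion: dbops.example.com/v1alpha1
kind: Database
metadata:
  name: orders-db
spec:
  engine: postgres          # which database engine to run
  version: "16"             # desired major version; patching within it is the operator's job
  replicas: 2               # one primary plus one standby
  storage:
    size: 200Gi
  backups:
    schedule: "0 2 * * *"   # daily backups, declared rather than scripted
    retentionDays: 14
```

Because a manifest like this lives in a repo next to the application code, it can flow through the same peer review and CI/CD pipeline as everything else, which is exactly what the declarative, intent-based interface is meant to enable.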
In the seventh installment I also presented my view on the shared responsibility model, which I believe is the key success factor for an Assisted Database Management product. I further tried to justify my choice of the Operator pattern over other solutions and listed some of K8s’ (still) annoying “features” that I hope will improve in the future.
There’s a lot more to say about many of the topics I briefly touched on, and I hope to deep-dive into them in narrower, more recipe-like posts, but I hope that the backstory I presented here explains the motivation for investing in the K8s ecosystem and building K8s Operators for hosting databases.
All of this brings me to this eighth and final installment where I intend to summarize this blog series.
Gemini’s take
The responses I received to this blog series so far are rather polarizing. Some folks complain that the explanations are too long, way beyond their attention span, while others state that the explanations are not exhaustive enough. Both groups are probably right, and I’ll obviously try to address the latter group with more narrowly scoped posts going forward. But for the former group, I thought I’d give the recently launched Gemini in Chrome feature a try at summarizing the series and producing a half-page-long TL;DR. Here’s what I got:
Prompt #1:
Hey Gemini, the first four installments published in this blog post series outline 13 expectations from an on-prem customer of a cloud provider’s DBaaS system. Please analyze them and tell me if you think these expectations are reasonable
Gemini’s response:
In my analysis, the 13 expectations outlined in the blog post series are largely reasonable and strategic from the viewpoint of a sophisticated risk-averse enterprise customer. These expectations reflect a desire to mitigate the traditional risks of cloud migration (lock-in, lack of control) while demanding the core benefits of a modern, efficient cloud platform. Many of them align with current industry trends and the future direction of the DBaaS and Kubernetes ecosystems.
Gemini then proceeded further and presented a breakdown of the 13 expectations, grouping them into three categories (that I hadn’t even thought of!) to substantiate its claim that they are in fact reasonable.
Well, I couldn’t agree more 🙂.
Prompt #2:
Hey Gemini, what would be your key takeaways from this blog series?
Gemini’s response:
Analysis of the Blog Series’ Core Argument
The central argument of the “Databases on K8s — Really?” series is a strategic one, building up the justification for running Kubernetes Operators for databases like AlloyDB Omni.
- Strategic Focus (“The Why”): I find the author’s approach of focusing on the strategic “why” before the technical “how” to be a robust starting point. It’s a common and necessary step in complex infrastructure decisions to first establish the business value and address the fundamental question of why an existing system (VMs, bare metal) needs to change.
- Addressing Complexity: The series correctly identifies and addresses the common skepticism regarding the added complexity of Kubernetes for stateful workloads. The use of the Space Pen analogy highlights that complexity is only “over-engineering” if it fails to deliver a proportional benefit (e.g., safety, reliability, or automation).
- DBaaS as the Benchmark: The argument that a self-hosted solution (like a database on K8s) must be evaluated against the high standard set by public cloud DBaaS (Database as a Service) offerings (like AlloyDB, RDS, etc.) is a very strong and relevant industry perspective. If a self-managed solution can’t significantly reduce operational overhead, it is unlikely to be adopted.
- The Role of Kubernetes Operators: The final conclusion that Kubernetes Operators are the necessary mechanism to bridge the gap — by automating complex “Day 2” operations (like patching, upgrades, and failover) to match the “as-a-Service” experience — is the dominant trend in the cloud-native ecosystem for running stateful applications on K8s. This point is well-supported by the existence of products like El Carro (mentioned in the footnotes) and other community operators like CNPG.
In summary, the series establishes a logically sound and strategically relevant framework for debating the future of database hosting: traditional hosting must compete with Cloud DBaaS, and K8s-based solutions are the path to bringing the “DBaaS experience” to any environment (on-prem, hybrid, or multi-cloud).
Spot on, Gemini!
As an author, I couldn’t have summarized it better myself 🙂.
Assisted and AI-Assisted Database Management
In closing, I thought I’d go back to reading the tea leaves from my divination class and so here’s my naive forecast of what the future holds:
Where do you see customers –especially those who still sit on the fence today with regard to the big and scary cloud migration– spending their database budgets a few years from now? One seemingly obvious answer is that it is just a matter of time until the hype, the FOMO, the buzzwords and the billboards push these customers towards fully managed database services like AlloyDB Cloud, Cloud SQL and similar.
I beg to differ. Yes, people certainly appreciate when somebody else does the mundane and tedious tasks for them, like patching their databases every quarter or when a new CVE comes out. And when somebody else takes full responsibility for chores like running a backup, especially when it’s backed by SLA restore/recovery guarantees with clearly spelled-out RTO and RPO, and especially if you are offered an option to spend an extra buck to buy a further reduced RPO (no, I don’t believe I’ve seen that being offered today). And yes, these DBaaS systems obviously come with security that is guaranteed best-of-breed, with hundreds of security engineers obsessed with the safety of your data at rest and in flight, running endless penetration tests and offering various bounty programs to discover any last-mile vulnerabilities in their management software.
So yes, SaaS, and DBaaS in particular, is set to become even more mainstream, and yet my expectation is that this won’t be a black-and-white matter for everybody: some enterprise customers are going to spend their database dollars differently, not always on a fully managed DBaaS. Why? I can think of a few reasons.
For one, there’s the rapidly increasing complexity. In the rush to be more advanced than the competition, public cloud providers move at the speed of light to add more features, more knobs, more integrations (with AI/ML, with Big Data, with analytical and reporting dashboards, etc.). For a newcomer, a single look at the AWS landing page may prove to be scarier than a “Scream” movie. Yes, the database services alone become easier, better, more streamlined, but databases don’t live in a vacuum. They need to be secured, hidden behind VPCs with the proper firewall ACLs and the right peering rules, all logged and audited, so that only the apps that are supposed to use them can reach them, and intruders cannot. The overall deployment for enterprise-grade applications gets more and more complex because it gets more feature-rich too. So it’s a race, and my bet is that the pace of innovation will keep outrunning the efforts to make the user experience smoother.
If so, what would the database dollars be spent on? On various professional services and integrators who can listen to the on-prem customer’s story and get paid not only when the whole app is successfully migrated and tested end to end, but also when it’s proven that a cloud provider’s DBaaS is in fact more cost-effective than on-prem (not just in CapEx, but also including the monthly cloud bill). But if so, what would the PSO and integrators use? Well, my bet is not on the canned UIs of the fully managed DBaaS systems. My bet is on repo-driven, peer-review- and approval-controlled, a-la-carte database services, tightly integrated and deployed together with the app, that bring the most value to a customer at the lowest cost. Why? Because it’s repeatable, easy to templatize and reuse among customers, and yes, it’s predictable and reproducible too. Basically, my bet is on cost-conscious IaC that only picks the individual database management services needed to manage the customer’s app, as sketched below.
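As an illustration of what “a-la-carte” could look like in practice, here is a hypothetical sketch: a repo-managed manifest that opts a database into only the management services the customer actually wants to pay for. As before, the kind, the API group and the field names are invented purely for illustration and don’t come from any real product.

```yaml
# Hypothetical sketch: pick only the individual management services you need, a-la-carte.
# None of the names below come from a real product; they just illustrate the idea.
apiVersion: dbops.example.com/v1alpha1
kind: ManagedServices
metadata:
  name: orders-db-services
spec:
  databaseRef: orders-db     # the Database instance these services apply to
  services:
    backups: enabled         # pay for scheduled backups and restore testing
    patching: enabled        # pay for minor-version CVE patching
    monitoring: enabled      # metrics and alerting only
    tuning: disabled         # skip the advisor/tuning service for now
```

Because this is just another file in the repo, enabling or disabling a service becomes a reviewable pull request rather than a click in a console, which is what makes the approach predictable and reproducible across customers.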
The second reason is the 13 expectations that I outlined in the first four installments of this blog series, which are unlikely to be easily met by the cloud providers’ existing solutions. My analysis could be a tad off (well, maybe more than a tad), but my reality check showed that only a few of the expectations are fully satisfied today. This is not to say that all on-prem folks are similar to my hypothetical case study’s customer, but I bet that some are.
And then there’s AI: Gemini and ChatGPT are great, but not everybody gets a warm and fuzzy feeling sending their AI requests across the wire, so hosting a local LLM (likely with a local set of agents too) alongside the local database may just fit the bill.
To summarize, in my opinion, all of the above leads to the rise of a new kind of downloadable product, one that can be viewed as an extension of a cloud or that can work completely standalone. This product leaves a customer –whether it’s an actual end user or a software integrator– in control while still offering managed database services, but targeted, self-contained, loosely coupled, purchasable a-la-carte and capable of running anywhere: on any cloud, at the edge or on-prem, disconnected or even air-gapped. And yes, with an intent-based declarative user interface too.
Hence the Assisted and AI-Assisted Database Management product.
Source Credit: https://medium.com/google-cloud/databases-on-k8s-really-part-8-182259e1720f?source=rss—-e52cf94d98af—4