The recent KubeCon + CloudNativeCon Europe event didn’t read like a celebration so much as an admission. AI is everywhere, but the systems underneath it are strained, not by models but by design. The data and machines that make AI are spread across clouds, edge sites and on-prem environments that never agreed on how to behave as one system. That is the core challenge driving the Kubernetes control plane for AI.
Recent research shows that the vast majority of AI initiatives fail to reach production, with most breakdowns caused by integration and operational execution challenges rather than model performance.
Paul Nashawaty, principal analyst at theCUBE Research, describes the structural realities at the heart of the problem: “AI is exposing a fundamental flaw in enterprise infrastructure; it was never designed to operate as a unified system. What KubeCon EU makes clear is that fragmentation across cloud, edge and on-prem is now the primary barrier to production AI.”
That fragmentation now has a name: sovereignty. Systems inherit policy, business or regional borders. Those boundaries constrain where data and workloads can run, forcing AI systems to operate across distributed environments rather than a single unified stack.
“Maybe the finance BU has their Llama model; maybe the accounting BU has an OpenAI model,” said Mike Barrett, vice president and general manager of Red Hat Hybrid Platforms at Red Hat Inc., who spoke with theCUBE, SiliconANGLE Media’s livestreaming studio, during KubeCon EU. “What’s the most cost-effective way to surface the intelligence that you want back? Because of that, [Red Hat’s enterprise customers] are looking for a horizontal platform.”
Red Hat, the poster child of Kubernetes in the enterprise, aims to rein in fragmentation with a Kubernetes control plane for AI workloads across all environments.
This feature is part of SiliconANGLE Media’s exploration of how enterprises are building the control plane for AI, with Red Hat playing a central role in shaping that approach. (* Disclosure below.)
Kubernetes goes beyond orchestration to solve AI challenges
Kubernetes was never designed for AI inference. It schedules containers. It doesn’t guarantee consistency across regions. That gap becomes visible when inference workloads move into production.
“These models are doing an amount of compute that’s hard to fathom, but when I talk to users of llm-d [an open-source, Red Hat-led Kubernetes-native inference framework hosted by CNCF, designed to scale distributed LLM workloads across clusters], they’re not just trying to build a state-of-the-art performance system, they’re also trying to do these day-two operations,” said Robert Shaw (pictured, left), director of engineering at Red Hat.
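Those day-two concerns start from a simple baseline: Kubernetes schedules an inference server like any other container. The sketch below shows that baseline as a plain Deployment running the open-source vLLM server. It is not llm-d’s actual interface, and the model name and GPU count are illustrative.

```yaml
# Minimal sketch: a plain Kubernetes Deployment serving an LLM with vLLM.
# This is the baseline that Kubernetes-native inference frameworks build on.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-inference
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-inference
  template:
    metadata:
      labels:
        app: vllm-inference
    spec:
      containers:
        - name: vllm
          image: vllm/vllm-openai:latest
          args: ["--model", "meta-llama/Llama-3.1-8B-Instruct"]  # hypothetical model
          ports:
            - containerPort: 8000  # vLLM's OpenAI-compatible API
          resources:
            limits:
              nvidia.com/gpu: "1"  # illustrative GPU request
```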
That “day-two” problem is where AI systems often break: not in training, but in runtime behavior, latency swings, resource contention and policy drift. Red Hat AI Enterprise seeks to operationalize and accelerate agentic AI and production inference with a unified, “metal-to-agent” solution, according to Jan Melen, governing board vice chair of the Cloud Native Computing Foundation, which hosts Kubernetes and runs the KubeCon + CloudNativeCon conferences.
“Cloud-native exists because of a global open-source collaboration model,” Melen said during the KubeCon EU keynote. “Hundreds of contributors from every region building shared infrastructure together.”
The implication is not subtle. AI is pushing systems built on global consistency into environments defined by fragmentation.
“Agentic AI isn’t a model problem; it’s a platform architecture problem,” said Rob Strechay, principal analyst at theCUBE Research. “The enterprises that win won’t pick better models; they’ll build better infrastructure to run them.”
Kubernetes becomes less about orchestration and more about enforcing behavioral consistency across fractured environments, Strechay pointed out.
Platform engineering makes Kubernetes the control plane for AI
While Kubernetes can unify control, it can’t assume every team can operate that control directly. Enterprise adoption collapses when complexity is exposed raw.
“What we realized is that AI is being developed by data scientists, and as part of that, they’re building their own infrastructure to run it on,” said Brian Stevens (pictured, right), senior vice president and chief technology officer for AI at Red Hat.
That gap between builders and operators is where platform engineering enters, according to Strechay.
“Fragmented tooling, skill gaps and operational complexity are becoming the real bottlenecks, driving a shift toward platform engineering and Kubernetes as a unifying control plane,” he explained.
The system stabilizes only when Kubernetes stops being exposed directly and becomes mediated through platforms that reduce friction. Red Hat OpenShift AI sits in that role, abstracting operational complexity into repeatable patterns, with model training, deployment, serving and inference for hybrid environments.
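What that abstraction looks like in practice: OpenShift AI’s serving layer builds on the upstream KServe project, where a data scientist declares a model and the platform handles routing, scaling and runtime wiring. A minimal sketch, assuming a cluster with KServe’s Hugging Face runtime installed; the service name, model ID and resources are illustrative.

```yaml
# Minimal KServe InferenceService sketch: the platform, not the
# data scientist, wires up the serving runtime behind this spec.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llama-chat  # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      args:
        - --model_id=meta-llama/Llama-3.1-8B-Instruct  # hypothetical model
      resources:
        limits:
          nvidia.com/gpu: "1"  # illustrative GPU request
```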
Virtual machines extend Kubernetes instead of resisting it
Enterprises don’t modernize everything at once. Billing systems and databases tend to stay where they are. Basically, risk keeps legacy systems alive.
Research shows that 84% of IT decision-makers report difficulty managing separate VM and container environments, with siloed tools and fragmented operations driving inefficiency across hybrid infrastructure. If those VMs stay outside Kubernetes, the system stays split. But what if virtualization is brought into Kubernetes?
“We think virtualization and containers shouldn’t live in silos; they should be on one platform, and KubeVirt makes that happen,” said Daniel Messer, senior manager of product management at Red Hat.
KubeVirt, a project now maturing at CNCF, extends Kubernetes into virtualization, allowing VMs and containers to share the same control plane.
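In concrete terms, KubeVirt lets a VM be declared through the same API machinery as a container workload. A minimal sketch; the workload name is hypothetical, and the Fedora container disk stands in for a real legacy image.

```yaml
# Minimal KubeVirt VirtualMachine sketch: a VM defined and scheduled
# through the same Kubernetes API that manages containers.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: legacy-billing-vm  # hypothetical legacy workload
spec:
  runStrategy: Always
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
        resources:
          requests:
            memory: 2Gi
      volumes:
        - name: rootdisk
          containerDisk:
            image: quay.io/containerdisks/fedora:latest  # stand-in image
```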
“Graduating for us makes it more obvious for people that [KubeVirt] is deeply embedded in the Kubernetes ecosystem [and] the CNCF ecosystem,” added Andrew Burden, KubeVirt maintainer.
The direction is consolidation of operational surfaces, not elimination of legacy systems.
Sovereignty creates distribution, not centralization
Sovereign AI often looks like a solution, but it also imposes constraints. Laws block data from moving across borders. Policy blocks centralization. Enterprises split workloads across clouds, on-prem and edge environments whether their architecture is ready or not, according to Gabriele Bartolini of EnterpriseDB, who reframed the underlying principle in a recent interview with The New Stack.
“True sovereignty starts with the database,” Bartolini said. “If your PostgreSQL isn’t portable across environments, you don’t really control your stack.”
And he warns against assuming managed convenience equals control: “Convenience is the cloud’s biggest shortcut, but convenience isn’t sovereignty. Real control means you can move your database anywhere and it behaves the same.”
Jan Melen’s keynote draws a hard line inside the sovereignty debate: “We should separate code sovereignty from deployment sovereignty. The code itself remains a global commons, shared, open, collaboratively developed.”
Deployment is where sovereignty bites. That’s where law and policy decide where workloads can actually run, and under what conditions. That split is what Kubernetes tries to operationalize: global code, local execution.
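One way Kubernetes expresses “global code, local execution” is through scheduling constraints. A minimal sketch using the standard topology.kubernetes.io/region node label; the region value and image are hypothetical. The same manifest can ship anywhere, while the affinity rule keeps the workload inside a sovereign boundary.

```yaml
# Minimal sketch: pinning a workload to a sovereign region via a
# required node-affinity rule on the standard region topology label.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-eu  # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: inference-eu
  template:
    metadata:
      labels:
        app: inference-eu
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: topology.kubernetes.io/region
                    operator: In
                    values: ["eu-central-1"]  # hypothetical sovereign region
      containers:
        - name: model-server
          image: registry.example.com/model-server:latest  # hypothetical image
```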
Ecosystems define whether the Kubernetes control plane for AI will work
No vendor can cover AI infrastructure alone. A Kubernetes control plane for AI only works if it spans systems instead of replacing them. That puts the burden on the ecosystem: the shared standards, APIs and upstream projects that let different tools operate as one system.
Nashawaty points to Red Hat’s role within that upstream layer: “Red Hat’s influence extends well beyond its commercial platform. The company has long been one of the most active contributors to the Cloud Native Computing Foundation ecosystem.”
That upstream work is not cosmetic. It’s what keeps Kubernetes consistent across vendors. Without it, every distribution drifts and the control plane fractures into competing implementations.
Aside from contributing to open-source projects, Red Hat is partnering with companies for scalable enterprise AI. Notably, Red Hat AI Factory with Nvidia focuses on building, deploying and scaling AI infrastructure using Red Hat OpenShift and Nvidia accelerated computing for high-performance AI workloads.
“When as many as 75% of enterprises report double-digit AI failure rates tied to fragmented systems, it’s clear the bottleneck has shifted to infrastructure,” said Nashawaty, underscoring the cost when upstream infrastructure coordination breaks down, especially when it comes to AI.
That failure rate is not about missing features. It reflects systems that can’t operate together. Ecosystems prevent Kubernetes from collapsing into another silo, the exact outcome it’s supposed to avoid.
Kubernetes becomes the production layer for AI
AI doesn’t collapse infrastructure in one place. It stresses every seam at once. Kubernetes becomes the layer that attempts to hold those seams together.
Stevens described the shift toward consolidating fragmented systems onto a single platform: “It’s a very powerful concept to reapply that, and also consolidate it on the same platform with fewer vendors and less attack surface for changes in different learning curves, which I think has been the power of Kubernetes all along.”
That consolidation only works if the ecosystem holds. Melen underscores what happens if it doesn’t: “If sovereignty results in fragmentation, we risk undermining the trillions of dollars of value that open source has already brought globally.”
The system doesn’t become simpler, but it becomes governable. Kubernetes isn’t the tool of choice because it is perfect. The industry chooses it because fragmentation leaves no alternative coordination layer that scales.
Red Hat’s bet is that abstraction through Kubernetes is the only viable way to keep AI operational across disparate worlds.
(* Disclosure: TheCUBE is a paid media partner for the KubeCon + CloudNativeCon Europe event. Neither Red Hat Inc., the primary sponsor of theCUBE’s event coverage, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)
Photo: SiliconANGLE