Kubernetes-native Target Operating Model (KTOM) for Modern Times

Vishal Anand
AI+ Enterprise Engineering
4 min read · May 21, 2021

--

Author: Vishal Anand, Thought Leader IT Architect, Sr. Inventor

Kubernetes-native can describe any object, resource, custom resource, application, middleware, workload, image, tool, interface, etc. that is designed to run on the Kubernetes platform and is declared through its own or a shared YAML manifest. That is the simplest definition.
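To make that definition concrete, even a simple application becomes "Kubernetes-native" the moment it is declared through a manifest and Kubernetes reconciles the cluster toward that declared state. A minimal sketch (the name and image below are illustrative, not from the original):

```yaml
# Minimal Deployment manifest: the application is described declaratively,
# and Kubernetes continuously reconciles the cluster toward this state.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-app            # illustrative name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: hello-app
  template:
    metadata:
      labels:
        app: hello-app
    spec:
      containers:
        - name: hello
          image: nginx:1.25  # illustrative image
          ports:
            - containerPort: 80
```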

Kubernetes-native technologies and environments are meant to provide true portability and interoperability.

So, what is a Kubernetes-native operating model?

I firmly believe it is fundamentally different from the traditional cloud-native operating model, although some aspects can overlap; how much varies with the conditions, use cases, or specific transformation journey. Kubernetes-native is a specialization of cloud-native.

Before I answer that question, think of a simple Kubernetes cluster with a bunch of worker or compute nodes running containerized workloads or applications.

Then, let us look at some scenarios.

Scenario 1: Think of an operating model where a node (without an expensive HA solution) goes down. This usually triggers a high-severity incident: a ticket is assigned to an administrator, and the node issue is resolved within the time bound set by the service-level agreement or objective. Further, imagine a database or an application going down due to an underlying compute-node problem and waiting for humans with the right expertise to intervene, resolve the issue, and bring it back up. Traditionally, the relevant events and alerts would fire, then a human help desk would get involved, then a human system admin, then a human database or application admin, and so on. As you may have realized, this model relies heavily on a combination of tooling, resources, process, skills, humans, responses, human behaviour, and timing.

Scenario 2: Now, let us look at another scenario. A worker node of a Kubernetes cluster that leverages underlying IaaS goes down. Assuming the IaaS is billed hourly (almost all cloud providers offer that), you are not going to spend time resolving the node-down issue unless the fix is quick. Simply throw the node out of the cluster and add a new one; Kubernetes is smart enough to schedule and reschedule the pods across the remaining nodes. Of course, I am assuming (rightly so) that the sizing and design are proper. You can even configure eviction behaviour, scheduling behaviour, and placement rules (for failure scenarios), along with auto-scaling, to take care of scaling and load balancing within the cluster.
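The eviction behaviour mentioned above can itself be tuned declaratively, per pod. A minimal sketch using Kubernetes taint-based evictions (the pod name, image, and timing are illustrative): when a node becomes unreachable, this pod's replicas are evicted and rescheduled after 60 seconds instead of the default 300.

```yaml
# Per-pod eviction tuning: tolerate an unreachable node for only 60 seconds,
# after which Kubernetes evicts the pod and reschedules it on a healthy node.
apiVersion: v1
kind: Pod
metadata:
  name: resilient-app        # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.25      # illustrative image
  tolerations:
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 60  # evict sooner than the 300-second default
```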

Scenario 3: Consider requirements where a certain set of tools or patches has to be installed on compute machines. Traditionally, they get installed either manually by humans or, at best, by imperative automation (i.e. automation with frequent human intervention, in simple terms).

Scenario 4: The same requirements as above (tool installation) and more, such as patching, upgrades, observability, auto-pilot, and runbook execution, get performed in a declarative, automated manner (human knowledge codified) using Kubernetes Operators: a few consistent, easy buttons to push.
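What such an Operator "easy button" looks like can be sketched with a custom resource. The API group, kind, and fields below are hypothetical, purely for illustration; a real Operator defines its own schema via a CustomResourceDefinition and ships a controller that reconciles it.

```yaml
# Hypothetical custom resource: the desired patch posture is declared once,
# and the Operator's controller codifies the runbook that achieves it.
apiVersion: ops.example.com/v1alpha1  # hypothetical API group/version
kind: NodePatchPolicy                 # hypothetical kind
metadata:
  name: monthly-security-patching
spec:
  schedule: "0 2 1 * *"  # desired patch window (cron syntax)
  patchLevel: security   # desired state, not a sequence of steps
  maxUnavailable: 1      # drain and patch one node at a time
```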

Scenarios 2 and 4 are great examples of a Kubernetes-native operating model.

Operating models and their definitions may broaden depending on the type of digital transformation journey your clients take. They can include new ways of working, simple declarative automation, immutability, culture, consumption, agility, adoption, process re-engineering, innovation, or simply a mindset change, and so on.

A Kubernetes-native operating model must enable innovation and speed, be insights-driven, and enable collaboration by design, blurring (or at least minimizing) the boundaries between Dev, Ops, Apps, and Sec (DevSecAppOps). It should leverage declarative automation, be fault-tolerant by design, use native continuous integration and delivery, and be codified at both the infrastructure and application levels. It should embrace resilience engineering that integrates human factors; leverage bots (or SuperBots for Kubernetes) and operators; use immutability where possible; provide interoperability and natively pluggable integrations with the external ecosystem; provide circuit breakers; adjust the blast radius; withstand chaos; leverage the API economy; not wait for humans; and provide observability along with consistency of services, users, and consumption. I must state the obvious: it must use (or integrate with, or extend) the Kubernetes API and other components (e.g. controllers, schedulers, YAML manifests).
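Several of these properties, such as declarative automation, not waiting for humans, and fault tolerance by design, come straight out of the core Kubernetes API. As one concrete sketch, a HorizontalPodAutoscaler declares the desired scaling behaviour and the control plane enforces it continuously without human intervention (the target Deployment name is illustrative):

```yaml
# Declarative autoscaling: the desired CPU target is declared once,
# and Kubernetes adjusts replica counts continuously to meet it.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hello-app  # illustrative target workload
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```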

The power of the operating model is further amplified when the Kubernetes-native platform provides declarative coexistence of VMs and containers.
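One project that enables this coexistence is KubeVirt, which lets a virtual machine be declared and reconciled with the same YAML-and-controller model as a pod. A minimal sketch, assuming KubeVirt is installed in the cluster (the VM name and disk image are illustrative):

```yaml
# A VM declared like any other Kubernetes resource: KubeVirt's controllers
# reconcile toward the desired state (running: true), just as with pods.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: legacy-app-vm  # illustrative name
spec:
  running: true        # desired state: the VM should be up
  template:
    spec:
      domain:
        devices:
          disks:
            - name: rootdisk
              disk:
                bus: virtio
        resources:
          requests:
            memory: 1Gi
      volumes:
        - name: rootdisk
          containerDisk:
            image: quay.io/containerdisks/fedora:latest  # illustrative image
```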

So, in a nutshell, when human architects design a Kubernetes-native solution, they must design a system where the heavy lifting is mostly (or always) done by Kubernetes and its containerized (or containerization-enabled) ecosystem. Container- and Kubernetes-based systems and services work well when they interface with like-minded modern technologies, modern processes, modern thinking, and appropriately skilled people; when this happens, the result is a Kubernetes-native operating model.

This is an era of digital transformation, and that transformation must complete its cycle gracefully. The Kubernetes-native operating model should be at its core.

I would also like to share how an enterprise-class Kubernetes-native operating model architecture should look (as shown below).

Kubernetes-native Target Operating Model (KTOM) Architecture by Vishal Anand

Disclaimer: The views and thoughts expressed here are entirely my own.
