Introducing Kite AI Agent: Conversational Operations for Kubernetes
date: Mar 6, 2026
updateDate:
slug: kite-agent
status: Published
tags: Kubernetes
summary:
type: Post
Managing Kubernetes clusters often involves a frustrating amount of context switching between a dashboard to visualize state and a terminal to actually get things done. Kite already simplifies this with a highly visual React and Go-based experience, but we wanted to take operational workflows a step further.
Today we are introducing the Kite AI Agent—a built-in, context-aware assistant powered by OpenAI and Anthropic models. This isn't just a chatbot that spits out generic Kubernetes documentation; it actively interacts with your cluster.

Why We Built It
Diagnosing a failing service usually looks like this:
- Notice a deployment is failing in the dashboard.
- Drill into the pods and find the crashing instance.
- Fetch the pod logs.
- Realize a ConfigMap is missing a crucial environment variable.
- Open your terminal, edit the YAML, and apply the change.
- Restart the deployment to pick up the new configuration.
The Kite AI Agent turns this multi-step process into a conversation. You can simply ask, "Why is the auth-service deployment crashing?" The agent will look at the deployment state, fetch the associated pod logs, identify the problem, and suggest a fix. If you tell it, "Add the missing API_URL to the ConfigMap and restart the deployment," it will generate the necessary patches and apply them.

What It Can Do
The agent is built on LLM tool-calling (function calling). We've equipped it with a robust set of tools that let it safely read and mutate cluster state through standard Kubernetes APIs:
1. Contextual Diagnostics
Instead of chaining together multiple kubectl get and describe commands, you can query your infrastructure in natural language:
- Cluster Overview: Ask for a summary of cluster health, node counts, and overall resource usage.
- Resource Queries: "Find all pods in the production namespace that are currently in CrashLoopBackOff."
- Log Analysis: "Fetch recent errors from the payment-worker pod and summarize them."
2. Active Remediation
The AI agent isn't strictly read-only. It can modify infrastructure directly, making it an excellent tool for rapid fixes and prototyping:
- Patching: "Scale the frontend deployment to 5 replicas" or "Change the image tag of the worker daemonset to v1.2.0."
- Creation & Updates: "Create a NodePort service exposing the Redis deployment on port 6379."
- Cleanup: "Delete all failed pods and completed jobs in the default namespace."
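Under the hood, a request like "scale the frontend deployment to 5 replicas" ultimately reduces to a small merge patch against the deployment spec. A minimal sketch of building that patch body (the helper name is ours, not Kite's):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// scalePatch builds the JSON merge-patch body that sets spec.replicas.
// An agent tool like patch_resource could send this to the Kubernetes
// API with content type application/merge-patch+json.
func scalePatch(replicas int32) ([]byte, error) {
	return json.Marshal(map[string]any{
		"spec": map[string]any{"replicas": replicas},
	})
}

func main() {
	p, _ := scalePatch(5)
	fmt.Println(string(p)) // {"spec":{"replicas":5}}
}
```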

Under the Hood
The Kite AI Agent runs entirely within our Go backend (pkg/ai). We built native integrations using the official Anthropic and OpenAI Go SDKs, giving you the flexibility to choose the model that best fits your workflow.

When you prompt the agent, it translates your intent into precise client-go API calls using dynamic clients (via unstructured types and discovery mapping). Tool calls like patch_resource or get_pod_logs map directly to core Kubernetes APIs.

Because giving an LLM access to your infrastructure requires strict guardrails, the agent relies heavily on Kite's existing Role-Based Access Control (RBAC) implementation. The agent operates strictly within the boundaries of the logged-in user's permissions—it cannot perform actions or access namespaces that the user is not authorized to see.

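A minimal sketch of how such an RBAC gate can work: each tool maps to the Kubernetes verb it needs, and a tool call is dispatched only if the user's resolved permissions include that verb. The mapping and helper below are illustrative, not Kite's actual implementation (only get_pod_logs and patch_resource are real tool names from the text):

```go
package main

import "fmt"

// toolVerbs maps each agent tool to the RBAC verb it requires.
// list_resources and delete_resource are hypothetical tool names.
var toolVerbs = map[string]string{
	"get_pod_logs":    "get",
	"list_resources":  "list",
	"patch_resource":  "patch",
	"delete_resource": "delete",
}

// allowed reports whether the logged-in user's granted verbs cover
// the verb a tool call requires. Unknown tools are always denied.
func allowed(tool string, grantedVerbs map[string]bool) bool {
	verb, ok := toolVerbs[tool]
	return ok && grantedVerbs[verb]
}

func main() {
	// A read-only user can fetch logs but cannot patch resources.
	readOnly := map[string]bool{"get": true, "list": true}
	fmt.Println(allowed("get_pod_logs", readOnly))   // true
	fmt.Println(allowed("patch_resource", readOnly)) // false
}
```

Denying by default on unknown tools keeps the gate safe as new tools are added.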
Getting Started
To try out the agent, pull the latest version of Kite, navigate to the AI configuration panel, and add your API key for OpenAI or Anthropic.
We are actively expanding the agent's toolbelt to handle more advanced operational workflows, including multi-cluster diagnostics and Prometheus query tools.