| Kubernetes controller-manager | |
|---|---|
| Name | Kubernetes controller-manager |
| Developer | Cloud Native Computing Foundation |
| Initial release | 2014 |
| Written in | Go |
| License | Apache-2.0 |
| Website | https://kubernetes.io |
Kubernetes controller-manager

The controller-manager is a control-plane component that runs a set of controllers which reconcile the cluster's observed state with its desired state. It collaborates with the kube-apiserver, etcd, and node-level agents to manage resources such as pods, replicas, and endpoints, forming an integral part of CNCF-hosted Kubernetes clusters. The component is implemented in Go and maintained by contributors from organizations including Google, Red Hat, Amazon Web Services, and Microsoft.
The controller-manager aggregates multiple controllers into a single binary and process to simplify lifecycle management and scheduling; it communicates with the kube-apiserver, watches resource objects (whose state the API server persists in etcd), and enqueues reconciliation work. The design reflects principles codified in works like the Twelve-Factor App and operational patterns advocated by Cloud Native Computing Foundation projects. Historically, the consolidation into a single manager drew on precedents from distributed-systems work at Google, including the Borg cluster manager and the Omega scheduling system.
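The watch-enqueue-reconcile pattern described above can be sketched in plain Go. This is a toy illustration, not the real controller-manager code: `syncHandler`, the `desired`/`actual` maps, and the event channel are all hypothetical stand-ins for what client-go informers and workqueues provide.

```go
package main

import "fmt"

// syncHandler is a toy reconcile function keyed by object name: it drives
// the actual state toward the desired state for one key (hypothetical).
func syncHandler(key string, desired, actual map[string]int) {
	if actual[key] != desired[key] {
		actual[key] = desired[key] // converge actual toward desired
	}
}

func main() {
	desired := map[string]int{"web": 3} // spec: 3 replicas
	actual := map[string]int{"web": 1}  // observed: 1 replica

	events := make(chan string, 1)
	events <- "web" // a watch event enqueues the object's key
	close(events)

	// The control loop drains the queue and reconciles each key.
	for key := range events {
		syncHandler(key, desired, actual)
	}
	fmt.Println(actual["web"]) // prints 3
}
```

The essential property is level-triggered behavior: the handler looks at current state each time it runs, so missed or duplicated events are harmless.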
The controller-manager runs controllers that implement control loops; its architecture includes leader election, shared informers, workqueues, and the client-go libraries. It uses client-go for API access and relies on shared informer factories for event distribution; leader election enables high availability and is typically implemented with Lease objects in the coordination.k8s.io API. The process interacts with other control-plane components such as the kube-scheduler and integrates with cloud-provider controllers developed by vendors including Google, Amazon Web Services, Microsoft Azure, and OpenStack-backed projects. Key components include the main binary, controller registration, and per-controller goroutine pools that follow concurrency patterns idiomatic to the Go runtime.
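A defining feature of the workqueues mentioned above is deduplication: many watch events for the same object collapse into a single pending work item. The following is a minimal sketch of that behavior in stdlib Go, assuming a hypothetical `Queue` type; the real implementation (client-go's workqueue package) additionally supports rate limiting and delayed re-adds.

```go
package main

import "fmt"

// Queue is a toy deduplicating work queue: adding a key that is already
// pending is a no-op, so bursts of events cost one reconcile.
type Queue struct {
	pending map[string]bool
	order   []string
}

func NewQueue() *Queue { return &Queue{pending: map[string]bool{}} }

// Add enqueues key unless it is already waiting to be processed.
func (q *Queue) Add(key string) {
	if q.pending[key] {
		return // coalesce duplicate events
	}
	q.pending[key] = true
	q.order = append(q.order, key)
}

// Get pops the oldest pending key; ok is false when the queue is empty.
func (q *Queue) Get() (key string, ok bool) {
	if len(q.order) == 0 {
		return "", false
	}
	key = q.order[0]
	q.order = q.order[1:]
	delete(q.pending, key)
	return key, true
}

func main() {
	q := NewQueue()
	q.Add("default/web")
	q.Add("default/web") // duplicate event, coalesced away
	q.Add("default/db")
	for {
		key, ok := q.Get()
		if !ok {
			break
		}
		fmt.Println("reconcile", key)
	}
}
```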
Built-in controllers perform tasks such as maintaining desired replica counts, handling node lifecycle, managing endpoints, and garbage collection. Examples include the ReplicaSet controller (influenced by earlier replication managers at Google), the Node controller, the Endpoints controller, the Service controller, the Namespace controller, the garbage collector, and cloud-provider controllers for load balancers and volumes. These controllers operate on API types defined in the Kubernetes APIs and follow the project's API stability and deprecation policies, shaped by standards discussions involving the Open Container Initiative and the Linux Foundation.
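To make the garbage-collection task above concrete, here is a toy sketch of orphan detection based on owner references. The `obj` type and `orphans` function are hypothetical simplifications: the real garbage collector builds a dependency graph from `metadata.ownerReferences` and applies foreground/background cascading deletion policies.

```go
package main

import "fmt"

// obj models an object with an optional owner reference, a toy version
// of metadata.ownerReferences (hypothetical type for illustration).
type obj struct {
	name  string
	owner string // "" means no owner
}

// orphans returns names of objects whose owner no longer exists; the
// garbage collector would then be responsible for deleting them.
func orphans(objs []obj) []string {
	exists := map[string]bool{}
	for _, o := range objs {
		exists[o.name] = true
	}
	var out []string
	for _, o := range objs {
		if o.owner != "" && !exists[o.owner] {
			out = append(out, o.name)
		}
	}
	return out
}

func main() {
	objs := []obj{
		{name: "rs-web", owner: "deploy-web"}, // owner deleted: orphan
		{name: "pod-a", owner: "rs-web"},      // owner still present
	}
	fmt.Println(orphans(objs)) // [rs-web]
}
```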
The controller-manager can be deployed as a static pod on control-plane hosts, as part of a managed service such as Google Kubernetes Engine or Amazon Elastic Kubernetes Service, or within self-hosted control planes. Configuration is provided via command-line flags, API-server RBAC bindings, and configuration files (e.g., ComponentConfig); common flags control leader-election lease durations, per-controller sync concurrency, and which controllers are enabled. Distribution and packaging methods reflect practices used by Debian, Red Hat, and cloud distributors; operators often manage the lifecycle with tools such as systemd, kubeadm, and GitOps tooling influenced by Flux and Argo CD.
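As an illustration of the flag-based configuration described above, an invocation might look like the following. This is an abbreviated, non-authoritative example; consult the kube-controller-manager reference for the full flag set and defaults, and note that the kubeconfig path shown is just a placeholder.

```shell
kube-controller-manager \
  --kubeconfig=/etc/kubernetes/controller-manager.conf \
  --leader-elect=true \
  --leader-elect-lease-duration=15s \
  --concurrent-deployment-syncs=5 \
  --controllers=*,-ttl   # enable all controllers except the TTL controller
```

The `--controllers` flag accepts `*` for the default set, with `-name` entries subtracting individual controllers.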
Custom controllers extend the controller-manager model via operators and controllers implemented with frameworks such as the Operator SDK, Kubebuilder, and client-go. The operator pattern, promoted by vendors including Red Hat through the Operator Framework, encapsulates application-specific automation using CustomResourceDefinitions that interact with the API server. Community projects such as the Prometheus Operator, Istio, and other CNCF projects illustrate this extensibility. Development workflows often draw on policy tooling such as the Open Policy Agent ecosystem.
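A CustomResourceDefinition is the extension point that gives a custom controller its own API type. The following is a minimal, hedged example; `backups.example.com` and its `schedule` field are entirely hypothetical, and a real CRD would usually carry a fuller OpenAPI schema and printer columns.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.example.com        # hypothetical resource for illustration
spec:
  group: example.com
  names:
    kind: Backup
    plural: backups
    singular: backup
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                schedule:
                  type: string    # e.g. a cron expression (hypothetical)
```

Once applied, an operator watches `Backup` objects exactly as built-in controllers watch Deployments or Nodes.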
The controller-manager requires careful RBAC configuration to grant least-privilege access to API resources; roles and role bindings define the permissions each controller needs to read and modify objects. Authentication typically uses client certificates or service account tokens provisioned by systems like kubeadm or cloud IAM services (e.g., AWS Identity and Access Management, Google Cloud IAM). Network-level hardening draws on best practices from CIS (Center for Internet Security) benchmarks and integrations with projects such as SPIFFE/SPIRE for workload identity. Audit logging, admission controllers (including Pod Security Admission, the successor to the deprecated PodSecurityPolicy), and admission webhooks are commonly applied to limit the attack surface.
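The per-controller permissions described above take the shape of ClusterRole/ClusterRoleBinding pairs. Kubernetes ships built-in roles named `system:controller:<name>` for its controllers; the sketch below is an illustrative, trimmed-down analogue for a hypothetical endpoints-style controller, not a copy of the shipped role.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-endpoints-controller   # hypothetical name
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: example-endpoints-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: example-endpoints-controller
subjects:
  - kind: ServiceAccount
    name: example-endpoints-controller # hypothetical service account
    namespace: kube-system
```

Scoping each controller to its own service account keeps a compromised controller from acting with the full privileges of the control plane.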
Operational troubleshooting uses logs, metrics, and tracing: logs from the controller-manager binary, metrics exposed via Prometheus instrumentation, and distributed traces that can integrate with systems like Jaeger. Common issues include leader-election flapping, API throttling against the kube-apiserver, informer cache staleness, and excessive reconciliation caused by misconfigured controllers or overly high concurrency. Tuning options include adjusting client-side QPS and burst limits, cautiously increasing informer resync periods, and scaling control-plane nodes or splitting controllers across processes, as practiced in large deployments at organizations such as Spotify and Pinterest. Debugging workflows often reference Kubernetes SIGs and operational guidance from cloud providers such as Google and Amazon Web Services.