The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.
Please note: This schedule is automatically displayed in Mountain Standard Time (UTC -7). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis.
Databricks manages over a thousand k8s clusters across three major cloud providers which run critical workloads in cloud regions around the world. This talk describes the system we built to upgrade nodes’ operating system, k8s version, and other configs monthly, supporting EKS, AKS, GKE, and self-managed k8s. Our system is built on k8s operators and performs zero-downtime blue-green rolling updates, respects contracts with services with features like PDBs, maintenance windows, deferred node draining, and custom workload handling plugins. It enables easy rollbacks, has good observability, and incurs minimal human operational cost. This has allowed us to patch vulnerabilities and release infrastructure changes quickly and reliably across the fleet. We will also share our lessons learned on building several operators that work together using the controller-runtime framework, designing the declarative interfaces between them, and achieving consistent behavior across three clouds.