Loading…
Attending this event?
In-person
November 12-15
Learn More and Register to Attend

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Mountain Standard Time (UTC -7). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis. 
Operations + Performance clear filter
arrow_back View All Dates
Wednesday, November 13
 

11:15am MST

ARM-Wrestling: Overcoming CPU Migration Challenges to Reduce Costs - Laurent Bernaille & Eric Mountain, Datadog
Wednesday November 13, 2024 11:15am - 11:50am MST
When you have a significant cloud footprint, you always look for performance improvements and cost reductions. So when ARM instances became commonly available on one of our providers, seemingly providing great performance at a lower cost, we had to take a closer look! In this talk, we will first describe the steps we took to make our clusters ARM-ready and a few interesting issues we encountered during our initial tests: from performance regressions due to compiler behaviors to subtle memory corruption bugs. We will then discuss new challenges, in particular how to achieve load-balancing and auto-scaling when running workloads on a mix of CPUs with different performances, and share our results. If migrating real workloads to ARM proved challenging, it was worth the effort and we now run more than 50% of our workloads on ARM.
Speakers
avatar for Laurent Bernaille

Laurent Bernaille

Principal Engineer, Datadog
Laurent Bernaille worked several years as a consultant specializing in cloud, containers, and automation and helped organizations migrate to the public cloud and adopt containers. He is now Principal Engineer at Datadog and works closely with infrastructure teams, which are responsible... Read More →
avatar for Eric Mountain

Eric Mountain

Staff Engineer, Datadog
Eric Mountain began working with Kubernetes in 2014 helping Amadeus migrate to container and cloud technology. Eric is now a Staff Engineer in Datadog’s Compute team providing large scale Kubernetes to our internal users.
Wednesday November 13, 2024 11:15am - 11:50am MST
Salt Palace | Level 1 | 155 BC
  Operations + Performance
  • Content Experience Level Any

12:10pm MST

Automated Multi-Cloud, Multi-Flavor Kubernetes Cluster Upgrades Using Operators - Ziyuan Chen, Databricks
Wednesday November 13, 2024 12:10pm - 12:45pm MST
Databricks manages over a thousand k8s clusters across three major cloud providers which run critical workloads in cloud regions around the world. This talk describes the system we built to upgrade nodes’ operating system, k8s version, and other configs monthly, supporting EKS, AKS, GKE, and self-managed k8s. Our system is built on k8s operators and performs zero-downtime blue-green rolling updates, respects contracts with services with features like PDBs, maintenance windows, deferred node draining, and custom workload handling plugins. It enables easy rollbacks, has good observability, and incurs minimal human operational cost. This has allowed us to patch vulnerabilities and release infrastructure changes quickly and reliably across the fleet. We will also share our lessons learned on building several operators that work together using the controller-runtime framework, designing the declarative interfaces between them, and achieving consistent behavior across three clouds.
Speakers
avatar for Ziyuan Chen

Ziyuan Chen

Software Engineer, Databricks
Ziyuan Chen is a software engineer at Databricks. He has worked on Databricks' cloud platform and OS infrastructure.
Wednesday November 13, 2024 12:10pm - 12:45pm MST
Salt Palace | Level 1 | 155 BC
  Operations + Performance

2:30pm MST

Does My K8s Application Need CPR? Performance Evaluation of a Multi-Cluster Workload Management App - Braulio Dumba & Ezra Silvera, IBM
Wednesday November 13, 2024 2:30pm - 3:05pm MST
KubeStellar (KS) is an open-source Kubernetes multi-cluster workload configuration management system that can be used to manage AI workloads in multi-cluster environments. Hence, understanding KS performance is crucial especially when managing resource intensive AI workloads. In this talk, we will present our experience in analyzing the performance metrics of KS across several dimensions of scalability (e.g., number of bindingPolicies, workload description spaces and number of managed remote clusters) and challenges that arise when conducting performance experiments in a multi-cluster environment. Our insights will demonstrate the utility of benchmarking the performance of a multi-cluster Kubernetes workload management application. Additionally, in this talk, we will demonstrate the usefulness of using several opensource tools such as clusterloader2, kube-burner & kwok to evaluate the performance of multi-cluster Kubernetes management applications.
Speakers
avatar for Ezra Silvera

Ezra Silvera

Senior Technical Staff Member, IBM
Ezra Silvera is a Senior Technical Staff Member at IBM Research. His interests include distributed systems, cloud management, and cloud infrastructure. Ezra is passionate about open-source technologies and has been involved in several notable open source projects such as Docker, KubeVirt... Read More →
avatar for Braulio Dumba

Braulio Dumba

Staff Research Scientist, IBM
Dr. Braulio Dumba is a Staff Research Scientist at IBM Research. In 2018, he joined IBM under the Hybrid Cloud organization. His current research is focus on edge computing and hybrid cloud computing. Dr. Dumba earned a Ph.D. in Computer Science from University of Minnesota, Twin... Read More →
Wednesday November 13, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 1 | 155 BC
  Operations + Performance

3:25pm MST

Global Payments: Setting New Standards for Reliability in Cloud Native Multi-Region Applications - Trey Caliva, Global Payments
Wednesday November 13, 2024 3:25pm - 4:00pm MST
As a multinational FinTech provider, processing over 32 billion card transactions for 816 million accounts, Global Payments requires globally available architectures with quick disaster recovery while maintaining subsecond latencies. In addition, these workloads require strict adherence to compliance standards. This session will explore the high-level architectural decisions implemented in a cloud-native redesign and cloud migration of a mission critical legacy .NET application. Key cloud native tools leveraged include Kubernetes on GCP, and the use of CockroachDB as a cloud native database solution. We will explore how leveraging these cloud native technologies achieved extreme fault tolerance in a multi-region deployment, setting new standards for performance and reliability.
Speakers
avatar for Trey Caliva

Trey Caliva

Principal Cloud Architect, Global Payments
Trey Caliva is an Architect and engineer with 10+ years of hands-on experience planning, developing, managing, and securing deployments in Google Cloud and AWS. He is currently Principal Cloud Architect at Global Payments, a Fortune 500 company and a member of the S&P 500 focused... Read More →
Wednesday November 13, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 1 | 155 BC
  Operations + Performance

4:30pm MST

Kubernetes at Scale: Practical Solutions for Enhanced CNI and Kubelet Performance - Henrique Santana, Amazon Web Services & Bruno Gabriel da Silva, Sysdig
Wednesday November 13, 2024 4:30pm - 5:05pm MST
In this session, we'll explore challenges faced in maintaining optimal performance for Container Network Interface (CNI) and Kubelet components in Kubernetes clusters. Based on recurring real-world scenarios, we will dive into troubleshooting and mitigations of issues such as IP address allocation delays, registry pull queries per second (QPS), disk throttling. These pose significant impacts on the performance, scalability and stability of Kubernetes clusters. Our discussion will revolve around practical strategies aimed at mitigating such challenges, leveraging multiple block storage volumes, adjusting instance types, tuning registryPullQPS settings, and exploring the benefits of prefix mode for faster IP address allocation. Additionally, we'll examine the role of warm IP pools, and the implications of WARM_ENI_TARGET settings on CNI performance, providing attendees with a comprehensive understanding on how to optimize CNI and Kubelet performance effectively.
Speakers
avatar for Bruno Gabriel da Silva

Bruno Gabriel da Silva

Sr. Solutions Engineer, Sysdig
I have been working as a Solutions Engineer for several years, with my passion for cloud-native technologies igniting around 2018. That year, I transitioned from a traditional IT Windows Sysadmin role to fully embracing DevOps, focusing entirely on Open Source and Cloud. My first... Read More →
avatar for Henrique Santana

Henrique Santana

Sr. Cloud Support Engineer, Amazon Web Service
I'm Containers Specialist with over 15 years of experience in infrastructure operations. Skilled at automating workflows and solving problems through user-centered design and emerging technologies. Currently focusing on containers and container orchestration. Adept at optimizing resource... Read More →
Wednesday November 13, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | 155 BC
  Operations + Performance

5:25pm MST

Misadventures in Large Scale Cluster Performance - Shane Corbett, AWS & Dima Ilchenko, Lacework
Wednesday November 13, 2024 5:25pm - 6:00pm MST
Join us for our follow up to one of the highest rated talks of kubecon 2022 (73,000 pods a day, lessons from misadventures in multi-tenant). We are on a new misadventure, asking the question what if some of the most popular advice about Kubernetes was just...wrong? We spent over two years pouring through 800 page linux kernel performance books, tweaking obscure control plane settings, and developing detailed custom monitoring dashboards so you don’t have to! Join us as we take you through real world findings that took months of research to fully understand, and provide evidence that some of the things we were convinced were best practices, were the very things holding us back the most.
Speakers
avatar for Dima Ilchenko

Dima Ilchenko

SRE, Lacework
Dima is a staff SRE on a Compute Platform Team focused on troubleshooting, observability and scalability of large-scale Kubernetes platform at Lacework. Lacework's unique features create unique challenges that push Kubenetes to its limits, offering Dima unique perspective into often... Read More →
avatar for Shane Corbett

Shane Corbett

Senior Kubernetes Specialist, AWS
Shane Corbett is a Senior Containers Specialist at AWS focused on helping customers with the finer points of Kubernetes large scale design and performance. When not pushing Kubernetes to extremes you will find Shane pursuing his lifelong obsession of exploring the edge of the extreme... Read More →
Wednesday November 13, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | 155 BC
  Operations + Performance
 

Share Modal

Share this link via

Or copy link

Filter sessions
Apply filters to sessions.
Filtered by Date - 
  • 🚨 Contribfest
  • 🪧 Poster Sessions
  • AI + ML
  • Breaks
  • ⚡ Lightning Talks
  • Cloud Native Experience
  • Cloud Native Novice
  • CNCF-hosted Co-located Events
  • Connectivity
  • Data Processing + Storage
  • Emerging + Advanced
  • Experiences
  • Keynote Sessions
  • Maintainer Track
  • Observability
  • Operations + Performance
  • Platform Engineering
  • Project Opportunties
  • Registration
  • SDLC
  • Security
  • Solutions Showcase
  • Sponsor-hosted Co-located Event
  • Tutorials