KubeCon + CloudNativeCon North America 2024: Full Schedule

In-person
November 12-15
Learn More and Register to Attend

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Mountain Standard Time (UTC -7). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis.

5:30pm MST

⚡ Lightning Talk: `Kubectl Debug` Lacks an `IDE` Option. Let’s Fix That! - Mario Loriedo, Red Hat

Tuesday November 12, 2024 5:30pm - 5:35pm MST

Hyatt Regency | Level 4 | Regency Ballroom B

Don't get me wrong. `kubectl debug` is one of my favorite `kubectl` commands. But probably because I like it so much, I am convinced it deserves more love! This talk will present a `kubectl debug` extension that starts an IDE in an ephemeral container for debugging purposes. This extension uses the DevWorkspace operator, which is capable of running lightweight cloud development environments, including the IDE, in containers. If you like debugging by adding breakpoints in an IDE rather than inspecting your application's logs, you should attend this talk.

Speakers

Mario Loriedo

Senior Principal Software Engineer, Red Hat

Mario is a Senior Principal Software Engineer at Red Hat. He works on Podman and on container-based developer tools. He has been a CNCF Ambassador and the tech lead of the Eclipse Che project. He has co-created the Devfile (a CNCF Sandbox Project). He has been a speaker at conferences... Read More →

#1 Mario Loriedo kubectl debug is missing an IDE option pdf

Tuesday November 12, 2024 5:30pm - 5:35pm MST
Hyatt Regency | Level 4 | Regency Ballroom B

⚡ Lightning Talks, Cloud Native Experience

Content Experience Level Any

5:45pm MST

⚡ Lightning Talk: Evaluating Scheduler Efficiency for AI/ML Jobs Using Custom Resource Metrics - Dmitry Shmulevich, NVIDIA

Tuesday November 12, 2024 5:45pm - 5:50pm MST

Hyatt Regency | Level 4 | Regency Ballroom B

Kubernetes deployments frequently utilize custom resources beyond just CPU and memory, such as GPUs, which are essential for AI/ML workloads. While the Metrics API offers insights into CPU and memory usage at both the pod and node levels, it does not provide similar information for custom resources. Although resource requests for custom resources are specified in the pod spec, there is no visibility into how efficiently these resources are utilized at the node and cluster levels. To address this gap, we developed a Prometheus Node Resource Exporter tailored to monitor custom resources. Our case study focuses on evaluating the efficiency of Kubernetes schedulers when handling a high volume of AI/ML jobs, using GPU occupancy on the nodes as the primary indicator. In this lightning talk, we will present a comparative analysis of several scheduling frameworks based on the metrics collected by our custom exporter.

Speakers

Dmitry Shmulevich

Software Engineer, NVIDIA

Dmitry is a software engineer at NVIDIA with over 25 years of experience in software development, specializing in cloud computing for the past eight years. Throughout his career, he has made significant contributions to various systems and projects across the cloud stack. He is also... Read More →

Tuesday November 12, 2024 5:45pm - 5:50pm MST
Hyatt Regency | Level 4 | Regency Ballroom B

⚡ Lightning Talks, Observability

Content Experience Level Any

5:50pm MST

⚡ Lightning Talk: Future-Proofing Kubernetes: Impact of Storage Version Migration and Meaning of Resource Version (RV) - Nilekh Chaudhari, Microsoft

Tuesday November 12, 2024 5:50pm - 5:55pm MST

Hyatt Regency | Level 4 | Regency Ballroom B

Kubernetes relies on API data being actively rewritten to support some maintenance activities related to at-rest storage. Two prominent examples are the versioned schema of stored resources (i.e., the preferred storage schema changing from v1 to v2 for a given resource) and encryption at rest (i.e., rewriting stale data based on a change in how the data should be encrypted). The simplest way to rewrite data is to issue no-op update requests via kubectl. This approach is problematic for any resource that can contain a large amount of data, such as Kubernetes secrets, and it is also impractical to perform without automation, as the number of resources that need migration is always growing. Storage Version Migration (SVM), which is now available as a built-in alpha API since Kubernetes v1.30, helps achieve this. However, the implementation of SVM has significant implications for the entire Kubernetes project and its ecosystem.

Speakers

Nilekh Chaudhari

Software Engineer, Microsoft

Nilekh is a Software Engineer at Microsoft, specializing in Kubernetes. He actively contributes to SIG Auth and SIG API Machinery and is a core maintainer of the Secrets Store CSI Driver, the Azure Provider for the Secrets Store CSI Driver, and the Gatekeeper Library project.

Tuesday November 12, 2024 5:50pm - 5:55pm MST
Hyatt Regency | Level 4 | Regency Ballroom B

⚡ Lightning Talks, Platform Engineering

Content Experience Level Any

6:05pm MST

⚡ Lightning Talk: Running Kind Clusters with GPU Support Using Nvkind - Evan Lezar, NVIDIA

Tuesday November 12, 2024 6:05pm - 6:10pm MST

Hyatt Regency | Level 4 | Regency Ballroom B

Kind is a powerful tool for running local Kubernetes clusters using Docker. It is particularly useful for testing, development, and CI/CD workflows, offering features like multi-node cluster support, easy configuration, and cross-platform compatibility. However, providing access to GPUs in Kind is not a very straightforward process. There is no standard way to inject GPUs into a Kind worker node, and even with a series of "hacks" to make it possible, post-processing is still needed to isolate different sets of GPUs to different nodes. In this lightning talk, we introduce nvkind – a wrapper around Kind that encapsulates the steps necessary to make GPUs available to Kind worker nodes. Ideally, GPU support would have been added to Kind directly, but many challenges exist to make this possible. This talk discusses those challenges, how we've overcome them with nvkind, and the steps needed to eventually support GPUs directly within Kind itself.

Speakers

Evan Lezar

Senior Systems Software Engineer, NVIDIA

Evan Lezar is a Senior Systems Software Engineer on the Cloud Native team at NVIDIA. His focus is making GPUs and other NVIDIA devices easily accessible from containerized environments. This includes driving development and adoption of the Container Device Interface (CDI).

KCNA24 elezar Running Kind Clusters with GPU Support Using nvkind pptx

Tuesday November 12, 2024 6:05pm - 6:10pm MST
Hyatt Regency | Level 4 | Regency Ballroom B

⚡ Lightning Talks, AI + ML

Content Experience Level Any

11:15am MST

The Future of DBaaS on Kubernetes - Melissa Logan, Constantia; Sergey Pronin, Percona; Deepthi Sigireddi, PlanetScale; Gabriele Bartolini, EDB

Wednesday November 13, 2024 11:15am - 11:50am MST

Salt Palace | Level 1 | Grand Ballroom A

Running Database-as-a-Service (DBaaS) in the cloud is a common practice for organizations, and more are seeking to offer DBaaS on Kubernetes. Benefits include cost efficiencies, as well as providing a faster, more scalable development environment. While it has many benefits, managing a DBaaS on Kubernetes can be challenging. In this panel, database experts from the Data on Kubernetes Community will discuss how to get started with Kubernetes and operators to run DBaaS, storage and security requirements, common patterns for deployment and Day 2 operations, how to leverage AI for DBaaS, and pitfalls to avoid. They will also share real world experiences from users running DBaaS on Kubernetes.

Speakers

Melissa Logan

CEO, Constantia

Melissa Logan has worked in tech for 24 years and is currently director of the Data on Kubernetes and Data Mesh Learning communities, and founder of Constantia.io - a tech community and communications company. Constantia works with data and open source companies to provide marketing... Read More →

Gabriele Bartolini

VP of Cloud Native, EDB

Gabriele, a co-founder of 2ndQuadrant and open-source advocate, has been instrumental in PostgreSQL's global growth. Focused on enhancing business continuity for large-scale databases, he has championed stateful workloads in cloud-native environments since 2019. As a co-founder and... Read More →

Deepthi Sigireddi

Software Engineer, PlanetScale

Deepthi is the Technical lead for Vitess, a CNCF graduated open source project. She also leads the Vitess engineering team at PlanetScale which offers a database service built on Vitess. She brings over 20 years of experience building scalable systems to this role. She enjoys speaking... Read More →

sergey pronin

Product guy, Percona

Sergey is a passionate technology “driver”. After graduation worked in various fields: internet service provider, financial sector and M&A business. Main focal points were infrastructure and products around it. At Percona as a Group Product Manager drives forward Kubernetes and... Read More →

Wednesday November 13, 2024 11:15am - 11:50am MST
Salt Palace | Level 1 | Grand Ballroom A

Data Processing + Storage

Content Experience Level Any

11:15am MST

ARM-Wrestling: Overcoming CPU Migration Challenges to Reduce Costs - Laurent Bernaille & Eric Mountain, Datadog

Wednesday November 13, 2024 11:15am - 11:50am MST

Salt Palace | Level 1 | 155 B

When you have a significant cloud footprint, you always look for performance improvements and cost reductions. So when ARM instances became commonly available on one of our providers, seemingly providing great performance at a lower cost, we had to take a closer look! In this talk, we will first describe the steps we took to make our clusters ARM-ready and a few interesting issues we encountered during our initial tests: from performance regressions due to compiler behaviors to subtle memory corruption bugs. We will then discuss new challenges, in particular how to achieve load-balancing and auto-scaling when running workloads on a mix of CPUs with different performances, and share our results. If migrating real workloads to ARM proved challenging, it was worth the effort and we now run more than 50% of our workloads on ARM.

Speakers

Laurent Bernaille

Principal Engineer, Datadog

Laurent Bernaille worked several years as a consultant specializing in cloud, containers, and automation and helped organizations migrate to the public cloud and adopt containers. He is now Principal Engineer at Datadog and works closely with infrastructure teams, which are responsible... Read More →

Eric Mountain

Staff Engineer, Datadog

Eric Mountain began working with Kubernetes in 2014 helping Amadeus migrate to container and cloud technology. Eric is now a Staff Engineer in Datadog’s Compute team providing large scale Kubernetes to our internal users.

KubeCon NA 2024 ARM Wrestling pdf

Wednesday November 13, 2024 11:15am - 11:50am MST
Salt Palace | Level 1 | 155 B

Operations + Performance

Content Experience Level Any

11:15am MST

AuthZEN: The “OpenID Connect” for Authorization - Omri Gazitt, Aserto

Wednesday November 13, 2024 11:15am - 11:50am MST

Salt Palace | Level 1 | 151 G

Today, the authorization world is fractured - each vendor supports its own APIs & protocols. But this is about to change. AuthZEN, a new OpenID Foundation working group, was created in late 2023 to establish authorization standards. OIDF is the home of OpenID Connect, the ubiquitous standard for federated login, and that’s where we’re setting our sights. In this talk, I'll describe the current state of cloud-native authorization, including the policy-as-code and policy-as-data approaches, and the various open source projects in each camp. I'll also share the progress we’ve made creating a single authorization API that works across both policy-as-code (OPA, Topaz) and policy-as-data (Zanzibar-style projects), present the API specs we've created so far, and show off the various interoperable implementations. With this foundation in place, engineering teams can be more confident in externalizing their authorization and picking a provider without being locked in to a proprietary API.

Speakers

Omri Gazitt

Co-founder & CEO, Aserto

Omri is the co-founder/CEO of Aserto, an authorization startup, and his third entrepreneurial venture. He's spent the majority of his 30-year career working on developer and infrastructure technology, most recently as the CPO of Puppet. Previously he was the VP and GM of HP's Cloud... Read More →

AuthZEN OIDC of Authorization pdf

Wednesday November 13, 2024 11:15am - 11:50am MST
Salt Palace | Level 1 | 151 G

Security

Content Experience Level Any

11:15am MST

GitOops... I Did It Again! Protecting Your GitOps System from Being Used for Privilege Escalation - Oreen Livni & Elad Pticha, Cycode

Wednesday November 13, 2024 11:15am - 11:50am MST

Salt Palace | Level 2 | 250 AD

From data theft to privilege escalation in the Kubernetes cluster, you don't want to be the one telling your boss that your GitOps system has been compromised. This talk covers the security of GitOps tools, highlighting common misconfiguration pitfalls and how to avoid them. We will share the story of CVE-2024-31989, a critical vulnerability we discovered in the popular tool Argo. When installed with the default configuration, this vulnerability allowed privilege escalation from any access point to the cluster (such as a webshell) to complete cluster takeover. We will discuss common insecure configurations like this and provide examples from popular open-source projects to explain how your organization can protect itself from these risks. Attendees will receive a guide and practical tools to protect their GitOps systems against such threats.

Speakers

Elad Pticha

Security Researcher, Cycode

Elad is a passionate security researcher with a focus on software supply chain and web application security. He dedicates his time to writing security research tools and finding vulnerabilities across a broad spectrum, from open-source projects and web applications to IoT devices... Read More →

Oreen Livni

Security Researcher, Cycode

Oreen Livni is a passionate security researcher specializing in application and supply chain security, Domain, and networking. With a focus on software supply chain vulnerabilities. Alongside his professional commitments, he immerses himself in art, gardening, and the world of surfing... Read More →

Wednesday November 13, 2024 11:15am - 11:50am MST
Salt Palace | Level 2 | 250 AD

Security

Content Experience Level Any

12:10pm MST

AI and ML: Let’s Talk About the Boring (yet Critical!) Operational Side - Rob Koch, Slalom Build & Milad Vafaeifard, Epam

Wednesday November 13, 2024 12:10pm - 12:45pm MST

Salt Palace | Level 2 | 255 B

As AI and ML become increasingly prevalent, it’s worth looking harder at the operational side of running these applications. We need a lot of compute and access to GPU workloads. We also need to be reliable, while providing rock-solid separation between datasets and training processes. And we need great observability in case things go wrong, and must be simple to operate. Let's build our ML applications on top of a service mesh instead of spending resources reimplementing the wheel – or, worse, the flat tire. Join us for a lively, informative, and entertaining look at how a service mesh can solve real-world issues with ML applications while making it simpler and faster to actually get things done in the world of ML. Rob Koch, Principal at Slalom Build, will demonstrate how you can use Linkerd together with multiple clusters to develop, debug, and deploy an ML application in Kubernetes (including IPv6 and GPUs), with special attention to multitenancy and scaling.

Speakers

Rob Koch

Principal, Slalom Build

A tech enthusiast who thrives on steering projects from their initial spark to successful fruition, Rob Koch is Principal at Slalom Build, AWS Hero, and Co-chair of the CNCF Deaf and Hard of Hearing Working Group. His expertise in architecting event-driven systems is firmly rooted... Read More →

Milad Vafaeifard

Lead Software Engineer, Epam Systems

Milad Vafaeifard, a Lead Software Engineer at EPAM Systems, has 9+ years of web design and development expertise. Deaf but undeterred, he is the creative force behind Sign Language Tech and an active contributor to a YouTube channel focused on tech content for the signing tech community... Read More →

Version 2 AI and ML the boring (yet critical) ops side pdf

Wednesday November 13, 2024 12:10pm - 12:45pm MST
Salt Palace | Level 2 | 255 B

AI + ML

Content Experience Level Any

12:10pm MST

Beyond 'Can You Mentor Me?' - Crafting the Contribution Ladder - Nitish Kumar, Akuity; Wenjia Zhang, Google; Jared Watts, Upbound; Carol Valencia, Elastic; Nabarun Pal, Broadcom

Wednesday November 13, 2024 12:10pm - 12:45pm MST

Salt Palace | Level 2 | 251 AD

Mentorship, a cornerstone of the community's success, offers a transformative path to growth and development. However, finding the right mentor and building a successful mentorship relationship can be challenging. This panel discussion brings together experienced mentors from diverse roles within the Kubernetes community including maintainers, tech leads, and committee members. The panel members will share their insights on how to get the most out of mentorship at different stages of your Kubernetes journey, as you climb the Contributor ladder. By the end of this panel, the audience will understand essential takeaways for effective mentorship at different contributor ladder marks. The project maintainers can take inspiration from how the Kubernetes project maintainers make use of various mentorship techniques such as Role Based Shadowing, Peer-to-Peer Learning, and Mentorship Cohorts that can help any project especially CNCF incubating projects stick new contributors to the project.

Speakers

Jared Watts

Founding Engineer, Upbound

Jared Watts is a Founding Engineer at Upbound, where he is working on advancing cloud-native computing by enabling anyone to build their own cloud platform. He is also a co-creator of the open source Crossplane (https://crossplane.io) and Rook (https://rook.io) projects. Prior to... Read More →

Wenjia Zhang

Engineering Manager, Google

Wenjia Zhang is an Engineer Manager at Google, working on Google Kubernetes Engine and Google Distributed Cloud. She is an active contributor for Kubernetes and etcd open source projects.

Nabarun Pal

Kubernetes Maintainer, Independent

Nabarun is a Principal Software Engineer at VMware by Broadcom, a maintainer of the Kubernetes project, elected Kubernetes Steering Committee member and a chair of Kubernetes SIG Contributor Experience. He is a Release Manager for Kubernetes and has been the Kubernetes 1.21 Release... Read More →

Nitish Kumar

Software Engineering Intern, Akuity

Nitish is a Software Engineer at Akuity and a CNCF Ambassador. In the past, Nitish has served as a Linux Foundation Mentee under the Kubernetes Release Engineering Team, where he built the OBS library that is used by the Kubernetes project to automate the process of managing release... Read More →

Carolina Valencia

Customer Architect, Elastic

Carol is a passionate software developer dedicated to implementing secure cloud-native practices. She actively contributes to CNCF projects and the Kubernetes community as an open-source contributor. She enjoys learning new technologies and creating material, some of which she shares... Read More →

Wednesday November 13, 2024 12:10pm - 12:45pm MST
Salt Palace | Level 2 | 251 AD

Cloud Native Novice

Content Experience Level Any

12:10pm MST

Breaking Free from Vulnerability Scanning Noise: Automated VEX Aggregation for Accuracy - Teppei Fukuda, Aqua Security Software Ltd.

Wednesday November 13, 2024 12:10pm - 12:45pm MST

Salt Palace | Level 1 | 151 G

Vulnerability scanners detect known vulnerabilities in software dependencies, but often produce inaccurate results (false-positives) due to their inability to automatically determine if a vulnerability is actually exploitable. Vulnerability Exploitability eXchange (VEX) is an industry-wide initiative that aims to address this issue, but the lack of standardized distribution hinders its effective utilization. This talk introduces VEX Hub, a central repository that automatically aggregates VEX documents published by open-source projects. VEX Hub’s unique architecture makes it easy and practical for software maintainers to start adopting VEX, while at the same time making it seamless for scanners and users to incorporate VEX in their workflow. The presentation showcases a practical use case of VEX Hub with Trivy, an open-source security scanner that popularizes VEX thanks to VEX Hub and delivers more accurate and actionable scanning results to its users.

Speakers

Teppei Fukuda

Open Source Engineer, Aqua Security Software Ltd.

Teppei Fukuda is the creator of Trivy and works at Aqua Security as an Open Source Software Engineer. He has a wealth of software engineering experience working on network and security. Away from the work, he is an avid manga enthusiast, dreaming of reading every comic book in the... Read More →

slides pdf

Wednesday November 13, 2024 12:10pm - 12:45pm MST
Salt Palace | Level 1 | 151 G

Security

Content Experience Level Any

2:30pm MST

Architecting the Future of AI: From Cloud-Native Orchestration to Advanced LLMOps - Lily (Xiaoxuan) Liu, Anyscale

Wednesday November 13, 2024 2:30pm - 3:05pm MST

Salt Palace | Level 2 | 255 B

With the groundbreaking release of ChatGPT, large language models (LLMs) have taken the world by storm: they have enabled new applications, have exacerbated GPU shortage, and raised new questions about their answers’ veracity. This talk delves into an AI stack, encompassing cloud-native orchestration, distributed computing, and advanced LLMOps. Key topics include: - Kubernetes: The foundational technology that seamlessly manages AI workloads across diverse cloud environments. - Ray: The versatile, open-source framework that streamlines the development and scaling of distributed applications. - vLLM: The cutting-edge, high-performance, and memory-efficient inference and serving engine designed specifically for large language models. Attendees will gain insights into the architecture and integration of these powerful tools, driving innovation and efficiency in the deployment of AI solutions.

Speakers

Lily Liu

Researcher at Anyscale and PhD student at UC Berkeley, AnyScale

Wednesday November 13, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 2 | 255 B

AI + ML

Content Experience Level Any

2:30pm MST

Choose Your Own Adventure: The Observability Odyssey - Whitney Lee, CNCF Ambassador & Viktor Farcic, Upbound

Wednesday November 13, 2024 2:30pm - 3:05pm MST

Salt Palace | Level 2 | 251 AD

Our hero, a running app in a secure K8s prod environment, knows they are destined for greater things! They're serving end users, but currently, they have no idea what is going on. Are apps scaling correctly? Are automated deployments successful? What just went wrong, and how can it be fixed? Hero is desperate to escape this fog by adding CNCF tools for metrics, traces, and progressive delivery. It is up to you, the audience, to guide our hero and help them grow from a lost and confused app to their final form⎯an app that knows their faults before their users do. In their fourth KubeCon 'Choose Your Own Adventure'-style talk, Whitney and Viktor will present choices that an anthropomorphized app must make as they add observability to their cluster, enabling the ability to answer arbitrary questions about their system. Throughout the presentation, the audience (YOU!) will vote to decide our hero's path! Can we navigate CNCF projects and add observability before the session time elapses?

Speakers

Viktor Farcic

Developer Advocate, Upbound

Viktor Farcic is a lead rapscallion at Upbound, a member of the CNCF Ambassadors, Google Developer Experts, CDF Ambassadors, and GitHub Stars groups, and a published author. He is a host of the YouTube channel DevOps Toolkit and a co-host of DevOps Paradox.

Whitney Lee

CNCF Ambassador

Whitney is a lovable goofball and a CNCF Ambassador who enjoys understanding and using tools in the cloud native landscape. Creative and driven, Whitney recently pivoted from an art-related career to one in tech. You can catch her lightboard streaming show ⚡️ Enlightning on her... Read More →

Wednesday November 13, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 2 | 251 AD

Cloud Native Novice

Content Experience Level Any

2:30pm MST

Secure by Design CI/CD: Practical Insights from Adobe and Autodesk - Vikram Sethi, Adobe Inc. & Jesse Sanford, Autodesk

Wednesday November 13, 2024 2:30pm - 3:05pm MST

Salt Palace | Level 2 | 250 AD

Worried that your CI/CD pipelines and developer workflows are insecure? Lost in security buzzwords like SBOMs, provenance, attestation, SLSA, OpenSSF, and more? Seeking a clear, actionable reference architecture to secure your pipeline? Whether you are just getting started on your Software Supply Chain Security journey, or are ready to take it to the next level navigating this diverse ecosystem is challenging. Join Vikram and Jesse as they present a reference architecture for secure-by-default CI/CD pipelines and show you effective security controls at every step. See firsthand how these industry giants safeguarded their pipelines while maintaining agility and innovation. This talk will showcase their work, and the work of the CNOE (Cloud Native Operational Excellence) group, which aims to build a paved path through this problem space by producing opinionated software collections or “CNOE stacks” that can be adapted to meet you where your technology is.

Speakers

Jesse Sanford

Software Architect, Autodesk

Jesse is a lifelong software engineer focused on site reliability and Infosec. Currently architecting the juncture of platform engineering and security/compliance for Autodesk's Developer Enablement team. He regularly contributes to open source and frequently speaks about his work... Read More →

Vikram Sethi

Principal Scientist, Adobe Inc.

Vikram is a Principal Scientist in the Developer Platforms organization at Adobe. Vikram has been architecting and building the Developer Experience for Adobe's Internal Developer Platform for the last few years. In the last year or so, Vikram has been working on rearchitecting Adobe's... Read More →

KCNA24 Secure by Design CI CD Practical Insights from Adobe and Autodesk pdf

Wednesday November 13, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 2 | 250 AD

SDLC

Content Experience Level Any

3:25pm MST

Optimizing Load Balancing and Autoscaling for Large Language Model (LLM) Inference on Kubernetes - David Gray, Red Hat

Wednesday November 13, 2024 3:25pm - 4:00pm MST

Salt Palace | Level 2 | 255 E

As generative AI language models improve, they are increasingly being integrated into business-critical applications. However, large language model (LLM) inference is a compute-intensive workload that often requires expensive GPU hardware. Making efficient use of these hardware resources in the public or private cloud is critical for managing costs and power usage. This talk introduces the KServe platform for deploying LLMs on Kubernetes and provides an overview of LLM inference performance concepts. Attendees will learn techniques to improve load balancing and autoscaling for LLM inference, such as leveraging KServe, Knative, and GPU operator features. Sharing test results, we will analyze the impact of these optimizations on key performance metrics, such as latency per token and tokens per second. This talk equips participants with strategies to maximize the efficiency of LLM inference deployments on Kubernetes, ultimately reducing costs and improving resource utilization.

Speakers

David Gray

Senior Software Engineer, Red Hat

David Gray is a Senior Software Engineer on the Performance and Scale team at Red Hat. His role involves analyzing and improving AI inference workloads on Kubernetes platforms. David is actively engaged in performance experimentation and analysis of running large language models in... Read More →

David Gray KCNA24 LLM Inference load balancing and autoscaling pdf

Wednesday November 13, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 2 | 255 E

AI + ML

Content Experience Level Any

3:25pm MST

Cash App's Journey Into a Multi-Cluster Ecosystem - Rachel Sheikh, Cash App

Wednesday November 13, 2024 3:25pm - 4:00pm MST

Salt Palace | Level 1 | Grand Ballroom H

Cash App's Compute team is responsible for the health and maintenance of the company's Kubernetes clusters, and the enablement of service owners to deploy their services into these clusters with confidence. Over the past year, we've made strides in improving our reliability and uptime, part of which involved introducing a paradigm around creating new Kubernetes clusters in our service ecosystem that allow us to seamlessly transition services in/out of to simplify cluster upgrades and provide us with guardrails against common outages. This talk intends to walk you through our experience introducing new Kubernetes clusters for our services at Cash App, migrating and splitting service traffic across clusters with zero downtime, and thinking through tooling adoption / creation to simplify cluster maintenance as our overhead scales.

Speakers

Rachel Sheikh

Engineer, Cash App

I'm a software engineer with a decade of experience building and scaling backend services across various industries. When I'm not working on clusters or writing Go, I'm probably watching pro League of Legends or taking pictures of my dog.

KubeCon 2024 talk pptx

Wednesday November 13, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 1 | Grand Ballroom H

Platform Engineering

Content Experience Level Any

4:30pm MST

DNS Deep Dive in Kubernetes with CoreDNS - Jingming Guo, Airbnb

Wednesday November 13, 2024 4:30pm - 5:05pm MST

Salt Palace | Level 2 | 251 AD

In the dynamic world of Kubernetes, efficient DNS resolution is critical for seamless application performance and scalability. CoreDNS, as the default DNS server for Kubernetes, offers flexible and high-performance DNS capabilities. This talk will delve into the lifecycle of a DNS request within a Kubernetes cluster using CoreDNS, offering insights into the flow of DNS traffic and enhancing your understanding of DNS requests and service discovery in Kubernetes—-key knowledge for effective debugging and issue resolution. Additionally, we will present a case study of Airbnb's successful integration of CoreDNS, highlighting the CoreDNS performance evaluation, our seamless migration approach, and scaling strategy. Finally, we will talk about the multi-cluster DNS resolution with CoreDNS. This section will demonstrate how multi-cluster DNS capabilities address the common challenges, discuss performance considerations and multi-cluster DNS limitations.

Speakers

Jingming Guo

Software Engineer, Airbnb

Jingming Guo, graduated from Northwestern University in 2017 and subsequently joined AWS EBS team. At AWS, Jingming led the development of Elastic Volume feature on the Block Express volume and led the EBS Server capacity increase release. In 2022, Jingming joined Airbnb and led the... Read More →

Wednesday November 13, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 2 | 251 AD

Cloud Native Novice

Content Experience Level Any

4:30pm MST

Building Resilience: Effective Backup and Disaster Recovery for Vector Databases on Kubernetes - Pavan Navarathna & Shwetha Subramanian, Veeam

Wednesday November 13, 2024 4:30pm - 5:05pm MST

Salt Palace | Level 1 | Grand Ballroom A

As generative AI revolutionizes industries, reliance on vector databases - crucial for managing and querying high-dimensional data - has skyrocketed. These databases are often deployed on Kubernetes for its scalability and orchestration capabilities. However, ensuring robust backup and disaster recovery for these stateful applications presents unique challenges. Join Pavan and Shwetha as they discuss the critical need for an effective data protection strategy for vector databases in Kubernetes environments, emphasizing its importance in maintaining data integrity and availability. Attendees will learn about the growing significance of vector databases driven by AI applications and the specific considerations for their reliable deployment and management in cloud-native settings. Through a practical demonstration, this session will introduce Kanister, a CNCF Sandbox project, showcasing how it simplifies the complex process of backing up and recovering vector databases on Kubernetes.

Speakers

Pavan Navarathna

Engineering Manager, Veeam

Pavan joined Kasten by Veeam in March 2018, where he leads the open-source efforts and manages a team of cloud-native engineers developing innovative solutions for data protection in Kubernetes. He has previously worked in data protection and networking at NetApp and Aryaka. Pavan... Read More →

Shwetha Subramanian

Software Engineer, Veeam Kasten, Veeam

Shwetha Subramanian is a 2+ year experienced software professional, armed with a Master’s in Computer Science (Machine Learning track) from Columbia University, currently working as an SWE in the Kasten team at Veeam. An inherently curious individual, she is on a journey of learning... Read More →

KCNA2024 pdf

Wednesday November 13, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom A

Data Processing + Storage

Content Experience Level Any

4:30pm MST

Watching the Watchers: How We Do Continuous Reliability at Grafana Labs - Nicole van der Hoeven, Grafana Labs

Wednesday November 13, 2024 4:30pm - 5:05pm MST

Salt Palace | Level 1 | Grand Ballroom B

Nothing is foolproof. Everything fails eventually. Observability tools help predict and lessen the impact of those failures, as the watchers of your software systems. But who watches the watchers? At Grafana Labs, we're not immune to production incidents. Just like any company, we still sometimes move too quickly. We run complex, microservices-based systems ourselves, so we have to eat our own dogfood on a daily basis. In this talk, I reveal: - how we solved a years-long mystery that cost us $100,000+ - how we got our internal Mimir clusters to reliably hold 1.3 billion time series for metrics - what we've had to do to scale our Loki clusters to handle 324 TB of logs a day - what our Grafana dashboards to monitor Grafana Cloud look like Sometimes, it's easier to learn from failures in observability than from successes. This talk is a confession of some of our worst sins as well as a realistic look under the hood at how we're improving the continuous reliability of our stack.

Speakers

Nicole van der Hoeven

Senior Developer Advocate, Grafana Labs

Nicole is a Senior Developer Advocate at Grafana Labs and a performance engineer with over a decade of experience in breaking software and learning to build it back up again. She has lived in the Philippines, the US, Australia, the Netherlands, and Portugal, helping teams all over... Read More →

Wednesday November 13, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom B

Observability

Content Experience Level Any

5:25pm MST

The OTTL Cookbook: A Collection of Solutions to Common Problems - Tyler Helmuth, Honeycomb & Evan Bradley, Dynatrace

Wednesday November 13, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 1 | Grand Ballroom B

Is your telemetry missing key attributes? Maybe there are details in your log bodies you’d rather have as attributes. It is common to find yourself in situations where your data doesn't look how you expect: it's too large, the wrong shape, or doesn't have everything you want. The OpenTelemetry Collector uses the OpenTelemetry Transformation Language (OTTL) to solve these problems. OTTL enables telemetry transformations based on any field of the payload, utilizing functions to execute the changes. In this session, Tyler and Evan will go over a brief intro to OTTL and then cover example after example of situations where you can use OTTL to solve processing problems in the Collector, like setting attributes, or defining an entire OTLP log record from a kubernetes event. Get ready with situations of your own, as we’ll save time at the end to try writing OTTL statements live on stage for your transformation or filtering issues so we can demonstrate how flexible OTTL truly is.

Speakers

Tyler Helmuth

Staff Software Engineer, Honeycomb

Tyler is a Sr. Software Engineer at Honeycomb with a passion for observability and helping users start their observability journey. He is a maintainer for the OpenTelemetry Collector and OTel Helm Charts, and an active contributor to other OTel repositories. While not its originator... Read More →

Evan Bradley

Senior Software Engineer, Dynatrace

Evan helps maintain the OpenTelemetry Collector, where he is also a primary contributor to the OpenTelemetry Transformation Language (OTTL) and the OpenTelemetry Agent Management Protocol (OpAMP) Collector components. Evan has a background in developing DevOps tooling and observability... Read More →

OTTL Cookbook pdf

examples yaml

Wednesday November 13, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom B

Observability

Content Experience Level Any

5:25pm MST

Workload Identity Federation – Stop Using Long-Lived Credentials - Benjamin Dronen, Ford Motor Company & Anjali Telang, Red Hat

Wednesday November 13, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 2 | 254 B

Workload identity federation is a somewhat daunting but extremely beneficial topic in Kubernetes security. In this session, we will share the lessons Ford Motor Company has learned through using workload identity federation with Google Cloud Platform, Microsoft Entra ID, and other platforms at scale from a wide variety of different workload types, how it has enhanced our security posture, improved developers’ lives, and reduced outages.

Speakers

Anjali Telang

Principal Product Manager, OpenShift Security and Identity, Red Hat

Anjali Telang is a Principal Product Manager for Security and Identity in OpenShift at RedHat. She is a security and cloud enthusiast with over 16 years of experience in cloud, security and networking. Prior to joining RedHat, she worked in various product and engineering roles at... Read More →

Benjamin Dronen

Kubernetes Platform Engineer, Ford Motor Company

Ben Dronen started at Ford Motor Company in 2022 as part of their Ford College Graduate rotational program. He currently holds a Kubernetes Platform Engineering position and focuses on bare metal Kubernetes deployments. Ben attended Andrews University in Southwest Michigan and holds... Read More →

Stop Using Long Lived Credentials Workload Identiy Federation.pptx pdf

Wednesday November 13, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 2 | 254 B

Security

Content Experience Level Any

6:00pm MST

🪧 Poster Session (PS08): Unveiling Anomalies: eBPF-Based Detection in High-Volume Encrypted Network Traffic - Ben Smith-Foley, Rensselaer Center for Open Source

Wednesday November 13, 2024 6:00pm - 8:00pm MST

Salt Palace | Level 1 | Halls A-C + 1-5 | Solutions Showcase

The increased use of encryption in network traffic presents a significant challenge for traditional network monitoring and security tools. As encrypting traffic becomes the norm, so does the need for advanced methods to detect malicious activities hidden within encrypted traffic. This poster will focus on how eBPF can be utilized to gain early observability into incoming packets by capturing and analyzing metadata before packets are fully processed, and how eBPF offers a unique vantage point for identifying anomalies in real-time. It will discuss methods to detect abnormal patterns, the design of the eBPF programs used, and the integration of these programs into a broader monitoring framework. The insights from this research have the potential to significantly enhance network security by providing a scalable and efficient solution for monitoring network traffic without compromising privacy. Attendees will gain an understanding of the practical applications of eBPF in network security.

Speakers

Ben Smith-Foley

University Student, Rensselaer Center for Open Source

Ben is a senior at Rensselaer Polytechnic Institute studying Computer Science with a concentration in Systems and Software. He is currently conducting undergraduate research in "Anomaly Detection in High-Volume Encrypted Network Traffic", helps lead the Rensselaer Center for Open... Read More →

Wednesday November 13, 2024 6:00pm - 8:00pm MST
Salt Palace | Level 1 | Halls A-C + 1-5 | Solutions Showcase

🪧 Poster Sessions, Security

Content Experience Level Any

6:30pm MST

CLBO: ClashLoopBackOff

Wednesday November 13, 2024 6:30pm - 8:00pm MST

Salt Palace | Level 1 | Halls A-C + 1-5 | Project Pavilion, Solutions Showcase

Come and watch a competition of two people using their technical ingenuity and creativity to solve a challenge put forth by the Scheduler (host). Time is limited and stakes are high, as this isn’t just a “live demo” for the masses. Over the course of twenty minutes, competitors will attempt to resolve a broken cluster, or deploy a service to production. At the end of the time, entries will be judged on four categories. Each category will be rated on Stability, Resiliency, Flexibility, and Observability.

Participants won’t know what challenge they’ll be given ahead of time but will be informed whether certain cloud resources or APIs will need to be enabled and available. Pre-creating any helpful scripts, code, or cloud resources is strictly prohibited. During the competition, the Scheduler will bounce between the participants’ screens, engage with the audience, and ask questions of the participants live.

Join us, root for our competitors, and feel free to engage live!

Wednesday November 13, 2024 6:30pm - 8:00pm MST
Salt Palace | Level 1 | Halls A-C + 1-5 | Project Pavilion, Solutions Showcase

Experiences, ClashLoopBackOff

Content Experience Level Any

11:00am MST

From Vectors to Pods: Integrating AI with Cloud Native - Rajas Kakodkar, Broadcom; Kevin Klues, NVIDIA; Joseph Sandoval, Adobe; Ricardo Rocha, CERN; Dawn Chen, Google

Thursday November 14, 2024 11:00am - 11:35am MST

Salt Palace | Level 2 | 255 E

The rise of AI is challenging long-standing assumptions about running cloud native workloads. AI demands hardware accelerators, vast data, efficient scheduling and exceptional scalability. Although Kubernetes remains the de facto choice, feedback from end users and collaboration with researchers and academia are essential to drive innovation, address gaps and integrate AI in cloud native. This panel features end users, AI infra researchers and leads of the CNCF AI and Kubernetes device management working groups focussed on: - Expanding beyond LLMs to explore AI for cloud native workload management, memory usage and debugging - Challenges with scheduling and scaling of AI workloads from the end user perspective - OSS Projects and innovation in AI and cloud native in the CNCF landscape - Improving resource utilisation and performance of AI workloads The next decade of Kubernetes will be shaped by AI. We don’t yet know what this will look like, come join us to discover it together.

Speakers

Dawn Chen

Principal Software Engineer, Google

Dawn Chen is a principal software engineer at Google. Dawn has worked on Kubernetes and Google Container Engine (GKE) before the project was founded. She has been one of tech leads in both Kubernetes and GKE. Prior to Kubernetes, she was the one of the tech leads for Google internal... Read More →

Ricardo Rocha

Lead Platforms Infrastructure, CERN

Ricardo leads the Platform Infrastructure team at CERN with a strong focus on cloud native deployments and machine learning. He has led for several years the internal effort to transition services and workloads to use cloud native technologies, as well as dissemination and training... Read More →

Kevin Klues

Distinguished Engineer, NVIDIA

Kevin Klues is a distinguished engineer on the NVIDIA Cloud Native team. Kevin has been involved in the design and implementation of a number of Kubernetes technologies, including the Topology Manager, the Kubernetes stack for Multi-Instance GPUs, and Dynamic Resource Allocation (DRA... Read More →

Joseph Sandoval

Principal Product Manager, Adobe

Joseph Sandoval, a seasoned tech expert with 25 years in various roles running distributed systems, infrastructure platforms and thrives on empowering developers to scale their applications. An advocate for OpenSource software, he harnesses its transformative power to champion change... Read More →

Rajas Kakodkar

Staff Software Engineer at Broadcom | Tech Lead CNCF TAG Runtime, Broadcom

Rajas is a staff software engineer at Broadcom and a tech lead of the CNCF Technical Advisory Group, Runtime. He is actively involved in the AI working group in the CNCF. He is a Kubernetes contributor and has been a maintainer of the Kube Proxy Next Gen Project. He has also served... Read More →

Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 2 | 255 E

AI + ML

Content Experience Level Any

11:00am MST

Shifting Gears: Leveraging CNCF Tools to Streamline Operations at Toyota Connected - Benson Phillips & Rob Heckel, Toyota Connected

Thursday November 14, 2024 11:00am - 11:35am MST

Salt Palace | Level 2 | 254 B

In the evolving landscape of cloud-native ecosystems, aligning teams and standardizing practices is crucial for operational excellence. At Toyota Connected, we faced significant challenges due to inconsistent practices and fragmented collaboration across departments. To address this, we adopted a suite of CNCF tools including ArgoCD, Backstage, Harbor, External Secrets Operator, and OpenCost. This session will delve into our journey of implementing these tools to unify our approach, streamline workflows, and enhance cross-team collaboration. Attendees will gain insights into the practical application of these tools, our successes and failures, and the substantial reduction in time to market achieved. By focusing on the integration of technical solutions and effective team practices, we aim to foster a cohesive and efficient cloud-native environment. This presentation provides actionable strategies for leveraging CNCF tools to drive innovation and excellence in your organization.

Speakers

Benson Phillips

Platform Architect, Toyota Connected

Software oriented, primarily working with cloud native computing. But my interests do not stop there as my love for technology is boundless.

Rob Heckel

Platform Architect, Toyota Connected North America

Rob has over 15 years in technology, specializing in open source and developer enablement. As a Platform Architect for Toyota Connected, he enhances DevOps, SDLC, and SRE practices. He has led the creation of an internal developer platform, streamlined tool integrations, and promoted... Read More →

KCNA24 Shifting Gears PDF pdf

Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 2 | 254 B

Cloud Native Experience

Content Experience Level Any

11:00am MST

Lessons Learned Adopting OpenTelemetry at Scale - Alex Arnell, Heroku / Salesforce

Thursday November 14, 2024 11:00am - 11:35am MST

Salt Palace | Level 1 | Grand Ballroom B

OpenTelemetry makes bold promises to unlock and unleash your observability, providing you with open standards, no vendor lock-in and interoperability with just about everything. You believe that your organization could really benefit from an uplift to modern observability. It would be easy to adopt if you were was starting out fresh, but let’s face it, most organizations have sprawling codebases and architectures. Decisions, infrastructure and often engineers that have been in place for decades. How do you even get started? This Heroku case study dives into our OpenTelemetry journey where you'll discover strategies on adoption, how to deal with internal resistance, and technical guidance on rolling out the change. Learn from our missteps and what we wished we had done differently. You’ll even see how a bit of luck can help drive adoption over the finish line. This session will equip you to navigate OpenTelemetry adoption in the most entrenched environments.

Speakers

Alex Arnell

Principal Engineer, Heroku / Salesforce

Alex Arnell is a Principal Engineer at Heroku / Salesforce with over two decades of software development experience. Alex has spent the last decade specializing in telemetry and observability systems. Alex is the lead engineer of the Telemetry team at Heroku, responsible for the collection... Read More →

Adopting OTel KCNA24.pptx pdf

Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 1 | Grand Ballroom B

Observability

Content Experience Level Any

11:00am MST

Engineering a Kubernetes Operator: Lessons Learned from Versions 1 to 5 - Andrew L'Ecuyer, Crunchy Data

Thursday November 14, 2024 11:00am - 11:35am MST

Salt Palace | Level 1 | Grand Ballroom H

Join me to uncover insights and hard-learned lessons from our journey through the first five versions of a Kubernetes Operator for Postgres. I will trace the development lifecycle from version 1 started in 2017 to version 5 now. Each version represents a milestone in addressing specific challenges, functionality, stability, and performance. We will discuss the architectural decisions, design patterns, and implementation strategies that shaped the evolution of the Operator. Key topics will include handling stateful applications, ensuring high availability, building for flexible deployment models, scalability, and managing rolling upgrades for both the Operator and underlying software. By the end of this session, participants will be equipped with practical knowledge and actionable strategies for engineering their own Kubernetes Operators, ready to accelerate their development process and avoid common pitfalls.

Speakers

Andrew L'Ecuyer

Sr. Director of Kubernetes Engineering, Crunchy Data

Andrew head’s up the Kubernetes Engineering Team at Crunchy Data. With a diverse background spanning both the public and private sectors, Andrew has played a key role in designing, building and integrating complex systems of all shapes and sizes. He holds degrees in both Computer... Read More →

Engineering A Kubernetes Operator FINAL pdf

Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 1 | Grand Ballroom H

Platform Engineering

Content Experience Level Any

11:00am MST

Yahoo’s Kubernetes Journey from on-Prem to Multi-Cloud at Scale - Nandhakumar Venkatachalam & Payal Patel, Yahoo

Thursday November 14, 2024 11:00am - 11:35am MST

Salt Palace | Level 2 | 251 AD

Yahoo is an early adopter of Kubernetes, operating 37 on-prem and 42 multi-cloud production clusters hosting 2700 applications. Our team offers a simple yet powerful interface for users to deploy applications onto our managed clusters. Since 2015, we have handled multiple complex upgrades, including Operating Systems and Kubernetes, upgrading from version 1.0.3 to 1.30.0. In 2023, Yahoo announced plans to migrate to both GCP and AWS cloud platforms. Leveraging extensive knowledge, our team successfully provisioned Kubernetes clusters in a multi-cloud environment within a short period. Our team faced numerous challenges during the cloud adoption process, including networking, security, cluster autoscaling, and cost. In this talk, we will share managing K8S in a multi-cloud and discuss the challenges faced and solutions found. Key topics include Shared VPC, IP Space for K8s, securely accessing private clusters, multi-tenant workload identity, and maintaining a user interface to K8S.

Speakers

Nandhakumar Venkatachalam

Sr Princ Production Engineer, Yahoo Inc

Nandhakumar Venkatachalam is a Senior Principal Production Engineer at Yahoo Inc. As a lead engineer responsible for operating the large-scale Kubernetes cluster, he has played a key architect role in building scalable cloud infrastructure. Nandha has been with Yahoo for over 17 years... Read More →

Payal Patel

Principal Software Development Engineer, Yahoo

Payal Patel is a Principal Software Development Engineer in the Cloud Infrastructure team at Yahoo. She is currently developing a hybrid cloud solution for Kubernetes clusters in AWS and GCP to set up the Kubernetes clusters at scale. Before that, she worked on managing the Kubernetes... Read More →

Yahoo’s Kubernetes Journey from On prem to Multi cloud at Scale pdf

Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 2 | 251 AD

Platform Engineering

Content Experience Level Any

11:55am MST

Democratizing AI Model Training on Kubernetes with Kubeflow TrainJob and JobSet - Andrey Velichkevich, Apple & Yuki Iwai, CyberAgent, Inc.

Thursday November 14, 2024 11:55am - 12:30pm MST

Salt Palace | Level 2 | 255 E

Running model training on Kubernetes is challenging due to the complexity of AI/ML models, large training datasets, and various distributed strategies like data and model parallelism. It is crucial to configure failure handling, success criteria, and gang-scheduling for large-scale distributed training to ensure fault tolerance and elasticity. This talk will introduce the new Kubeflow TrainJob API, which democratizes distributed training and LLM fine-tuning on Kubernetes. The speakers will demonstrate how TrainJob integrates with Kubernetes JobSet to ensure scalable and efficient AI model training with simplified Python experience for Data Scientists. Additionally, they will explain the innovative concept of reusable and extendable training runtimes within TrainJob. The speakers will highlight how these capabilities empower data scientists to rapidly iterate on their ML development, making Kubernetes more accessible and beneficial for the entire ML ecosystem.

Speakers

Andrey Velichkevich

Senior Software Engineer, Apple

Andrey Velichkevich is a Senior Software Engineer at Apple and is a key contributor to the Kubeflow open-source project. He is a member of Kubeflow Steering Committee and a co-chair of Kubeflow AutoML and Training WG. Additionally, Andrey is an active member of the CNCF WG AI. He... Read More →

Yuki Iwai

Software Engineer, CyberAgent, Inc.

Yuki is a Software Engineer at CyberAgent, Inc. He works on the internal platform for machine-learning applications and high-performance computing. He is currently a Technical Lead for Kubeflow WG AutoML / Training. He is also a Kubernetes WG Batch active member, Job API reviewer... Read More →

KubeCon NA 2024. Democratizing AI Model Training on Kubernetes with Kubeflow TrainJob and JobSet. Andrey Velichkevich and Yuki I pdf

Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 2 | 255 E

AI + ML

Content Experience Level Any

11:55am MST

Running Quantum-Safe Applications on Kubernetes - Paul Schweigert & Michael Maximilien, IBM Quantum

Thursday November 14, 2024 11:55am - 12:30pm MST

Salt Palace | Level 2 | 255 B

Quantum computers pose a unique threat to computer security, as the encryption standards we rely upon are vulnerable to powerful quantum computers. While those computers are still several years away, "harvest now, decrypt later" attacks put all data not protected using quantum-safe security at risk. So what can we do now to protect our applications? In this talk, Paul will demo how to deploy a quantum-safe application on Kubernetes. He'll provide a brief overview of quantum-safe cryptography and why it's needed, highlight key work being done in the open source community to migrate to quantum-safe cryptography, and conclude with a demo of how to build a quantum-safe cloud-native application. In particular, he'll show where and how to make changes to a Kubernetes environment to ensure users are protected by quantum-safe connections. At the conclusion of this session, listeners will have a set of practical steps they can take to help secure their applications in a post-quantum world.

Speakers

Michael Maximilien

Distinguished Engineer, IBM

Max is an IBM Distinguished Engineer and leader for the teams contributing to Open Quantum and Serverless. Max has held elected and leadership positions in Cloud Foundry and Knative OSS communities. Max's main expertise are in software engineering and distributed systems. Max published... Read More →

Paul Schweigert

Senior Software Engineer, IBM

Paul Schweigert works on quantum and serverless technologies at IBM. He has extensive experience in open source (Knative and Kubernetes in particular) and has spoken at numerous conferences. He has also led various platform engineering and data science teams. In a previous life, he... Read More →

running quantum safe applications pdf

Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 2 | 255 B

Emerging + Advanced

Content Experience Level Any

11:55am MST

Cognitive and Self-Adaptive System for Effective Distributed-Tracing in Applications - Mitul Tandon & Akash Gusain

Thursday November 14, 2024 11:55am - 12:30pm MST

Salt Palace | Level 1 | Grand Ballroom B

In response to challenges of limited trace capture in dynamic API tracing systems, the solution leverages Machine Learning and Cognitive approach for unbiased trace collection. Unlike existing implementations with a skewed distribution(~5%) towards normal traces, our self-adaptive system dynamically learns to prioritise and capture diverse traces, crucial for effective diagnosis of API failures and performance issues. This innovative approach significantly enhances the SREs ability to triage complex issues, leading to a game-changing reduction in Mean Time to Resolve (MTTR). The Adaptive Sampling approach analyses existing system traces and autonomously adjusts the sampling rate, eliminating manual configs. This ML-based solution outcome includes streamlined trace metric analysis, enhanced reliability work efficiency, and considerable infrastructure cost reduction through targeted trace collection, ultimately making a significant impact on operational effectiveness & reliability

Speakers

Akash Gusain

Software Engineer, Bito

Akash Gusain is a software engineer with over two years of experience in designing and deploying cloud-native applications. At VMware, he contributed to the development of scalable and robust cloud solutions, showcasing his ability to learn and adapt quickly to new technologies while... Read More →

Mitul Tandon

Software Engineer

A DevOps/SRE Engineer at VMware with 2+ years of experience with working on distributed systems and containerised applications.

KubeCon2024 pdf

Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Grand Ballroom B

Observability

Content Experience Level Any

11:55am MST

Evolving Reddit’s Infrastructure via Principled Platform Abstractions - Karan Thukral & Harvey Xia, Reddit

Thursday November 14, 2024 11:55am - 12:30pm MST

Salt Palace | Level 1 | Grand Ballroom H

Reddit’s approach to infrastructure management has grown organically over time, adapted to solve tactical, near term problems. We have now reached a point where the only way to scale infrastructure capabilities to a growing engineering organization is through platform abstractions offering self-service management of standardized infrastructure patterns. Beginning in 2021, a concerted effort was made to reimagine infrastructure as an internal platform that empowers both application and infrastructure engineers to build impactful and maintainable systems. We present a case study of Reddit’s ongoing journey in evolving its infrastructure management practices from inefficient, human-in-the-loop processes to efficient, self-service interfaces. By treating Kubernetes as a universal control plane and extending it with custom control processes fronted by well-designed interfaces, we are moving the organization towards this vision. This will cover the the many trade-offs and lessons learnt.

Speakers

Harvey Xia

Staff Engineer, Compute Infrastructure @ Reddit, Reddit

I'm a software engineer with experience across a variety of disciplines including backend engineering, data engineering, and most recently, infrastructure engineering. I specialize in building cloud native infrastructure platform features.

Karan Thukral

Senior Engineer, Compute Infrastructure @ Reddit, Reddit

Karan is a Senior Software Engineer at Reddit working on the Compute team to build an easy to use internal developer platform which is scalable and reliable. He has been working in this problem space since 2017 building both internal and external developer platforms including App... Read More →

Achilles Kubecon Talk 2024 pdf

Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Grand Ballroom H

Platform Engineering

Content Experience Level Any

2:30pm MST

What Istio Got Wrong: Learnings from the Last Seven Years of Service Mesh - Christian Posta & Louis Ryan, Solo.io

Thursday November 14, 2024 2:30pm - 3:05pm MST

Salt Palace | Level 2 | 254 B

Building complex systems often requires simplicity in components—a lesson the Istio project has learned throughout its seven(plus)-year journey. Although Istio offers a lot of powerful features for application networking, crucial for many organizations, the path to maturity and broader adoption was fraught with challenges. In this talk, we explore the key mistakes made during Istio's development, including its initially complex architecture, an overload of features, premature release of version 1.0, difficulties faced by contributors, and delays in joining the CNCF. We will discuss the impact of these mistakes, how these missteps were addressed, and how they have positioned Istio as a leader in the service mesh market. This presentation will detail how Istio's evolution reflects a shift towards simpler, more modular components that together offer effective solutions for managing APIs and service-to-service communication regardless of platform.

Speakers

Louis Ryan

CTO, Solo.io

Co-creator of Istio and gRPC

Christian Posta

Global Field CTO, Solo.io

Christian Posta (@christianposta) is Global Field CTO at Solo.io. He is the author of Istio in Action and many other books on cloud-native architecture. He's well known in the cloud-native community for being a speaker, blogger (https://blog.christianposta.com) and contributor to... Read More →

Thursday November 14, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 2 | 254 B

Cloud Native Experience

Content Experience Level Any

2:30pm MST

Tutorial: Live with Gateway API V1.2 - Flynn, Buoyant & Mike Morris, Microsoft

Thursday November 14, 2024 2:30pm - 4:00pm MST

Salt Palace | Level 1 | Grand Ballroom G

Gateway API v1.2 is here! We have GA support for service mesh! We have timeouts in HTTPRoutes! We have GRPCRoutes! And we still have precious few real-world walkthroughs of using Gateway API to get real things done… In this hands-on workshop hosted by Gateway API contributors and GAMMA co-leads, we’ll start with completely unconfigured clusters, walk through installing a demo app with your choice of ingress controller and service mesh (Envoy Gateway + Linkerd, or Istio), then dig into actually using Gateway API for routing, resilience, and progressive delivery with an application using HTTP and gRPC at the same time. You’ll walk away with practical, real-world knowledge about what Gateway API can do and how to use it, and portable skills you’ll be able to apply to the many projects implementing Gateway API!

Speakers

Flynn -

Tech Evangelist, Buoyant

Flynn is a tech evangelist at Buoyant, educating developers about Linkerd, Kubernetes, and cloud-native development in general. He has spent 40 years in software engineering (from the kernel up through distributed applications, with a common thread of communications and security throughout... Read More →

Mike Morris

Senior Product Manager, Microsoft

Mike is a product manager at Microsoft working on upstream open source projects with a focus on Istio service mesh, and a Gateway API for service mesh co-lead. He is interested in building healthy, sustainable communities and scalable distributed systems, and working collaboratively... Read More →

Gateway API Workshop pptx

Thursday November 14, 2024 2:30pm - 4:00pm MST
Salt Palace | Level 1 | Grand Ballroom G

Tutorials, SDLC (Software Development Lifecycle)

Content Experience Level Any

3:00pm MST

CLBO: ClashLoopBackOff | Attendee Edition

Thursday November 14, 2024 3:00pm - 4:00pm MST

Salt Palace | Level 1 | Halls A-C + 1-5 | Project Pavilion, Solutions Showcase

Thursday November 14, 2024 3:00pm - 4:00pm MST
Salt Palace | Level 1 | Halls A-C + 1-5 | Project Pavilion, Solutions Showcase

Experiences, ClashLoopBackOff

Content Experience Level Any

3:25pm MST

TLS and MTLS: Introduction to Modern Security - Andrew Davis, Independent & Sandeep Kanabar, Gen (formerly NortonLifeLock)

Thursday November 14, 2024 3:25pm - 4:00pm MST

Salt Palace | Level 2 | 251 AD

A constant presence in our lives for nearly 25 years, TLS is a cornerstone of modern security practice — especially in a zero-trust world. In cloud native, mTLS comes up every time service meshes get mentioned. Even so, both these technologies are still sources of endless questions. How do they work? How are they related? What problems do they solve – and which others do they not solve? How does it relate to end-user auth? What's all this stuff with certificates anyway? And why should you care about these things? Thankfully, answering these questions isn't that complex. Sandeep Kanabar, Lead Software Engineer at Gen, and Andrew Davis, a Cybersecurity Expert—both Deaf & Hard of Hearing WG members—will discuss what TLS and mTLS are, what they do, how they work, why they matter as standards, and what nearly 25 years of attacking them have to say about security. They'll use Linkerd as an example, but this talk will apply to any situation involving mTLS or TLS, no matter the implementation.

Speakers

Sandeep Kanabar

Lead Software Engineer, Gen (formerly NortonLifeLock)

Hailing from India, Sandeep is a passionate software engineer working at Gen (formerly NortonLifeLock). A frequent meetup speaker, Sandeep enjoys sharing his lessons learned from 15+ years in the tech space with the community. He's a staunch advocate for diversity and inclusion and... Read More →

Andrew Davis

Cybersecurity Specialist, Not Applicable

A passionate self-taught cybersecurity expert, Andrew Davis is a big believer in life-long learning. He has worked for various Fortune 500 companies, including DELL and Fidelity Investments. Deaf himself, Andrew is a strong advocate for accessibility. He's an active member of the... Read More →

Thursday November 14, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 2 | 251 AD

Cloud Native Novice

Content Experience Level Any

3:25pm MST

You're Overpaying for CI - Kyle Penfound, Dagger

Thursday November 14, 2024 3:25pm - 4:00pm MST

Salt Palace | Level 2 | 250 AD

In recent years, the computational power of developer workstations has surged dramatically. With so much compute available at every developer's fingertips, why do we continue to waste time and money with lengthy build times on sluggish CI compute? Some forward-thinking organizations are re-evaluating this approach, questioning the necessity of paying for CI compute when the developers' workstations, which are already more powerful and paid for, remain underutilized. In this technical session we will transition a fully functioning production CI system from cloud-based compute to local workstation compute. We will explore the intricacies of replicating the functionality of a modern CI system, leveraging the power of developer workstations, all using open source software.

Speakers

Kyle Penfound

Solutions Engineer, Dagger

Kyle is part of the ecosystem team at dagger.io working on the future of CICD. He has a background in DevOps and just loves giving demos!

Thursday November 14, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 2 | 250 AD

SDLC

Content Experience Level Any

3:25pm MST

It's Dangerous to Build It Alone, Take This. - Jeremy Rickard & Ashna Mehrotra, Microsoft

Thursday November 14, 2024 3:25pm - 4:00pm MST

Salt Palace | Level 1 | 151 G

You've got high and critical CVEs in open source software packages that are critical to your platform or business. Time is almost up to patch them, and the upstream project hasn't fixed things. If you don't patch, your accreditation might be at risk. You're going to have to do it yourself! But where do you start? Fork the projects? Can you just patch in place? In this session, you'll learn about tools and strategies that can help you respond to CVEs in your container images faster, starting with patching existing images in place with Copacetic and moving on to patching and building projects from scratch. We'll look at challenges to building and testing upstream projects using existing tools and learn from emerging practices in industry. We'll also talk about how to inform your teams to stop using bad images! After this session, you'll have best practices and tools at your disposal, understand some of the pitfalls of owning your entire open source software supply chain.

Speakers

Ashna Mehrotra

Software Engineer, Microsoft

Ashna Mehrotra is a software engineer on the Upstream Security team, working on cloud-native open source security projects at Microsoft.

Jeremy Rickard

Principal Software Engineer, Microsoft

Jeremy Rickard is a principal software engineer at Microsoft where he works on the Azure Container Upstream team. He is currently a co-chair for SIG Release and serves on both the CNCF and the Kubernetes Code of Conduct Committees. He was also the Kubernetes 1.20 Release Lead.

KubeCon North America It's Dangerous To Build It Alone pdf

Thursday November 14, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 1 | 151 G

Security

Content Experience Level Any

4:30pm MST

Which GPU Sharing Strategy Is Right for You? A Comprehensive Benchmark Study Using DRA - Kevin Klues & Yuan Chen, NVIDIA

Thursday November 14, 2024 4:30pm - 5:05pm MST

Salt Palace | Level 2 | 255 E

Dynamic Resource Allocation (DRA) is one of the most anticipated features to ever make its way into Kubernetes. It promises to revolutionize the way hardware devices are consumed and shared between workloads. In particular, DRA unlocks the ability to manage heterogeneous GPUs in a unified and configurable manner without the need for awkward solutions shoehorned on top of the existing device plugin API. In this talk, we use DRA to benchmark various GPU sharing strategies including Multi-Instance GPUs, Multi-Process Service (MPS), and CUDA Time-Slicing. As part of this, we provide guidance on the class of applications that can benefit from each strategy as well as how to combine different strategies in order to achieve optimal performance. The talk concludes with a discussion of potential challenges, future enhancements, and a live demo showcasing the use of each GPU sharing strategy with real-world applications.

Speakers

Kevin Klues

Distinguished Engineer, NVIDIA

Yuan Chen

Principal Software Engineer, NVIDIA

Yuan Chen is a Principal Software Engineer at NVIDIA, working on building NVIDIA GPU Cloud for AI. He served as a Staff Software Engineer at Apple from 2019 to 2024, where he contributed to the development of Apple's Kubernetes infrastructure. Yuan has been an active code contributor... Read More →

KCNA24 DRA Benchmarking pdf

Thursday November 14, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 2 | 255 E

AI + ML

Content Experience Level Any

4:30pm MST

The Maintainer Monologues - Sarah Christoff, Defense Unicorns; Jason Hall, Chainguard; Scott Rigby & Karen Chu, Independent; Ryan Nowak, Microsoft

Thursday November 14, 2024 4:30pm - 5:05pm MST

Salt Palace | Level 2 | 254 B

Are maintainers born? Or made? Made. They’re definitely made. Oftentimes it’s a combination of trial and error, luck, and lots of hard work. With a mixed group of first time and experienced maintainers, join us for a panel covering the origin stories and learnings of CNCF sandbox/incubating/graduated project maintainers. They’ll share their journeys as their projects evolved, and cover topics such as: - Project milestones (inception, MVP, & donation) - Learning the ecosystem - Blind spots - Navigating social dynamics (community building, getting more help, navigating challenges) - Work life balance / open source burnout With this knowledge, you’ll be better equipped to become the next open source contributor, maintainer, or creator of projects, ready to navigate the ecosystem.

Speakers

Karen Chu

OSS Community PM

Karen Chu is an OSS Community PM. Having participated in the cloud native community since 2015, she is a CNCF Ambassador, Helm community manager/maintainer, emeritus Kubernetes Code of Conduct Committee member, meet-up organizer, and conference organizer. She has also worked on The... Read More →

Sarah Christoff

Software Engineer, Defense Unicorns

Sarah is a software engineer at Defense Unicorns who loves making complex code more digestible. She is the self-proclaimed founder of the Leslie Lamport fan club. When she's not bugbusting, she is running her animal rescue and competing in triathlons. She believes code should be like... Read More →

Scott Rigby

Senior Cloud Solutions Architect, NASA / Navteca

Scott is an artist, engineer & dad, collaborating on a different kind of world. Into collective art, activism, therapy & open source nerdy stuff. Scott is a Cloud Native Ambassador, speaker, organizer of CNCF community events including the New York Kubernetes Meetup, and international... Read More →

Jason Hall

Principal Software Engineer, Chainguard

Jason is a hopeless container image tooling nerd, living in Brooklyn with his wife, two children and (most importantly) lots of pizza.

Ryan Nowak

Incubations Architect, Microsoft

Ryan is an architect working on open-source projects from the Azure CTO's office. He's passionate about designing software for humans, incubating risky ideas, releasing them in open-source so everyone can benefit. At Microsoft, he's had a 15+ year career building developer-centric... Read More →

Thursday November 14, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 2 | 254 B

Cloud Native Experience

Content Experience Level Any

4:30pm MST

Elevating Kubeflow Spark Operator's Future: Best Practices and Enhancements - Vara Bonthu, AWS & Chaoran Yu, Apple Inc

Thursday November 14, 2024 4:30pm - 5:05pm MST

Salt Palace | Level 1 | Grand Ballroom A

As Kubernetes becomes the leading platform for data processing, mastering the deployment and management of Apache Spark on it is crucial. In this presentation, you'll hear from the new maintainers of the Kubeflow Spark Operator project, who will provide an overview of scaling the Spark Operator on Kubernetes, emphasizing best practices to optimize performance and efficiency. Attendees will explore the migration of the Spark Operator repository from Google to Kubeflow, gaining insights into the roadmap and key takeaways. The session will cover strategies for achieving multi-tenancy, managing multiple Spark Operator instances for large-scale deployments, ensuring robust security, and performing seamless upgrades. Participants will learn advanced techniques to maximize their Spark on Kubernetes deployments, making their data processing pipelines more efficient, reliable, and secure. This talk is for Data, ML, DevOps, and MLOps pros to enhance their Spark on Kubernetes skills.

Speakers

Chaoran Yu

Software Engineer, Apple Inc

Chaoran Yu is a software engineer at Apple. He leads a team that builds and operates a large-scale batch analytics data platform to meet the demanding requirements of data scientists and engineers. His passion lies in delivering the best value to stakeholders through best-of-breed... Read More →

Vara

Principal OSS Specialist, AWS

Vara Bonthu is a dedicated technology professional and Worldwide Tech Leader for Data on EKS, specializing in assisting AWS customers ranging from strategic accounts to diverse organizations. He is passionate about open-source technologies, Data Analytics, AI/ML, and Kubernetes, and... Read More →

Elevating Kubeflow Spark Operator Best Practices and Enhancements pptx

Thursday November 14, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom A

Data Processing + Storage

Content Experience Level Any

4:30pm MST

Mastering OpenTelemetry Collector Configuration - Steve Flanders, Cisco

Thursday November 14, 2024 4:30pm - 5:05pm MST

Salt Palace | Level 1 | Grand Ballroom B

Configuring the OpenTelemetry Collector can be a daunting task for both novices and seasoned professionals alike. Yet, mastering this crucial aspect is essential for unlocking the full potential of your observability stack. In this session, you will embark on a journey to gain the knowledge and skills needed to conquer common OpenTelemetry Collector configuration challenges. This session will draw from real-world experiences and best practices and provide live demonstrations to navigate the intricacies of OpenTelemetry Collector configuration. Whether you are a novice looking to get started or a seasoned veteran seeking to level up your skills, this session promises to empower you with the knowledge and confidence needed to properly and efficiently configure the OpenTelemetry Collector.

Speakers

Steve Flanders

Senior Director of Engineering, Splunk

Steve Flanders is a Senior Director of Engineering at Splunk (acquired by Cisco) responsible for the Observability Platform team, which includes contributions to the OpenTelemetry project. He was previously the Head of Product at Omnition (acquired by Splunk). Prior to Omnition, he... Read More →

KubeCon NA 2024 Mastering OpenTelemetry Collector Configuration pdf

Thursday November 14, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom B

Observability

Content Experience Level Any

4:30pm MST

Tutorial: No Mess Rollouts with Gateway API: Leveraging Gateway API and Argo Rollouts for Progressive Delivery - Nina Polshakova & Lawrence Gadban, Solo.io

Thursday November 14, 2024 4:30pm - 6:00pm MST

Salt Palace | Level 1 | Grand Ballroom G

Modern application delivery has many pitfalls: version transitions, traffic management, quality assurance, performance monitoring, and rollbacks. If you encounter an upgrade issue, what can you do? Mirror traffic? Debug locally? Roll back? Argo Rollouts lets teams gradually and safely deploy new versions of applications. A standard Gateway API enables any provider to support Argo Rollouts without provider-specific code. Argo Rollouts monitors Prometheus metrics to verify performance and reverts if success criteria aren’t met. This hands-on lab guides you on integrating Argo Rollouts with applications using different Gateway API implementations. Using Argo and Gateway API resources (HTTPRoute), you’ll learn to adjust traffic weights and gradually direct more traffic to a new version. We will also explore challenges in route delegation and role-based access control within Gateway API and potential extensions to address gaps in traffic shaping, access control, and debugging rollouts.

Speakers

Lawrence Gadban

Software Engineer, Solo.io

Lawrence is a Field Engineer at Solo.io where he works with organizations of all sizes to architect, adopt, and operationalize components such as Envoy proxy, API gateways, and service mesh. Most recently, he has been working directly with several organizations at various stages of... Read More →

Nina Polshakova

Software Engineer, Solo.io

Nina is a software engineer working on multi-cluster Istio solutions on the Gloo Platform team at Solo.io. She is a CNCF Ambassador and has also been on several Kubernetes release teams. She led the Enhancements team for the 1.29 release and is the current lead for the Release Notes... Read More →

KCNA24 No Mess Rollouts pdf

Thursday November 14, 2024 4:30pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom G

Tutorials, Operations + Performance

Content Experience Level Any

5:25pm MST

Managing and Distributing AI Models Using OCI Standards and Harbor - Steven Zou & Steven Ren, VMware by Broadcom

Thursday November 14, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 2 | 255 E

Just as container images are vital to cloud-native technology, AI models are crucial to AI technology. Effectively, conveniently, and safely managing, maintaining, and distributing AI models is critical for supporting workflows like AI model training, inference, and application deployment. This presentation explores AI model management based on OCI standards and the Harbor project. Standardizing AI model structures and characteristics using OCI specifications and extension mechanisms like OCI Reference to link datasets and dependencies. When large models require efficient loading or privacy considerations, model replication or proxy with upstream repositories like Hugging Face becomes essential. Enhancing model distribution security through signing, vulnerability scanning, and policy-based governance is often necessary. Additionally, introducing acceleration mechanisms such as P2P can significantly improve the efficiency of large model loading.

Speakers

Steven Ren

Senior Manager, Broadcom

Steven Zou

Staff II Engineer, VMware by Broadcom

Steven Zou is a senior engineer with years of experience in cloud computing and cloud-native technology. He is currently working as a Staff II engineer at VMware, focusing on cloud-native and Kubernetes-related platform services. In addition, he is a core maintainer of the CNCF open-source... Read More →

Managing and Distributing AI Models Using OCI Standards and Harbor.pptx (2) pdf

Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 2 | 255 E

AI + ML

Content Experience Level Any

5:25pm MST

Navigating Failures in Pods with Devices: Challenges and Solutions - Sergey Kanzhelev, Google & Mrunal Patel, Red Hat

Thursday November 14, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 2 | 250 AD

Pods are no longer running with just CPU and Memory. We provision GPUs, network cards, request special placement of those devices and allocated memory. And the more efficient or effective you want your set up to be, the more complicated those device requirements are, the more chances you will hit an edge case Kubernetes has not accounted for yet. Come to the talk to learn from Node Maintainers about some of those shortcomings in Kubernetes. If you are only starting with AI/ML and devices, you will be interested to learn what to expect. If you have lots of experience, you may still learn new things. With the increased focus on AI/ML workloads, highlighting those scenarios is important. As Kubernetes plans to fix those problems, you can give feedback on what would work best for you.

Speakers

Sergey Kanzhelev

Staff Software Engineer, Google

Sergey Kanzhelev is a seasoned open source and cloud native maintainer working actively on Kubernetes. Sergey is serving as co-chair of SIG node. He is also one of the founders of OpenTelemetry. He is working on engineering aspect of software and its practical application. He is contributing... Read More →

Mrunal Patel

Distinguished Engineer, Red Hat

Mrunal Patel is a Senior Principal Software Engineer at Red Hat working on containers for Openshift. He is a maintainer of runc/libcontainer and the OCI runtime specification. He started the CRI-O runtime. He is a SIG-Node chair and tech lead.

KubeCon NA 2024 Navigating Failures in Pods With Devices Challenges and Solutions.pptx pdf

Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 2 | 250 AD

AI + ML

Content Experience Level Any

5:25pm MST

Engaging the KServe Community, The Impact of Integrating a Solutions with Standardized CNCF Projects - Adam Tetelman, NVIDIA; Taneem Ibrahim, Red Hat; Johnu George, Nutanix; Tessa Pham, Bloomberg; Andreea Munteanu, Canonical

Thursday November 14, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 1 | Grand Ballroom A

Building a new solution and contemplating whether or not the OSS path is right for you? Wondering where to get started with a large cloud initiative and where the pitfalls may lie? Curious to know all the benefits waiting if your organization embraces a rich CNCF ecosystem? In this talk we will discuss the trade-offs between building a product on a full OSS platform vs. a DIY approach. We will delve into the issues of working with internal stakeholders or partners to embrace an OSS community and will cover the benefits and scaling factors that come when embracing open standards. We will use the recent integration of NVIDIA NIM into KServe as a case study and talk through the trials and tribulations that paid off in a win-win-win situation for our solutions, the OSS projects, and our users. We will cover Kubeflow, Knative, Istio, KServe, and wg-serve as well as a network of companies building enterprise K8s platforms and enterprise AI applications on top of these foundations.

Speakers

Andreea Munteanu

AI Product Manager, Canonical

I lead AI at Canonical, the publisher of Ubuntu and a provider of open source security, support and services. With a background in data science across industries like retail and telecommunications, I help enterprises make data-driven decisions with AI. I am passionate about amplifying... Read More →

Tessa Pham

Senior Software Engineer, Bloomberg

Tessa Pham is a Senior Software Engineer on Bloomberg's Cloud Native Compute Services organization. She works on building an inference platform for Bloomberg’s Data Science Platform, used by engineers and data scientists for training, deploying and serving ML models. Tessa is a... Read More →

Johnu George

Technical Director, Nutanix, Nutanix

Johnu George is a Technical Director at Nutanix with a background in distributed systems and large-scale hybrid data pipelines. He is an active in open-source and has steered several industry collaborations on projects like Kubeflow, Apache Mnemonic and Knative. His research interests... Read More →

Adam Tetelman

Principal Product Architect, NVIDIA

Adam Tetelman is a principal architect at NVIDIA leading cloud native initiatives and CNCF engagements across the company; building inference platforms for NVIDIA AI Enterprise and DGX Cloud. He has degrees in computational robotics, computer & systems engineering, and cognitive science... Read More →

Taneem Ibrahim

Senior Engineering Manager, Red Hat

Taneem is an engineering leader at Red Hat where his organization is responsible for building and delivering Model Serving, Responsible AI, and Model Registry solution in OpenShift AI.

Engaging the KServe Community, The Impact of Integrating a Solutions with Standardized CNCF Projects.pptx pdf

Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom A

Cloud Native Experience

Content Experience Level Any

5:25pm MST

Pick My Project! Lessons Learned from Interviewing 20+ End Users for Cloud Native Case Studies - Shedrack Akintayo & Bill Mulligan, Isovalent at Cisco

Thursday November 14, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 2 | 254 B

Cloud native projects can promise the moon in their READMEs, but have you ever wondered what actually causes end users to adopt a project? Shedrack and Bill have interviewed over 20 companies in industries ranging from media to financial services about why they picked a project for their cloud native platform. In this talk, they will reveal what end users truly want when adopting cloud native technologies and what the forcing function was for each of them. You’ll hear firsthand accounts of the triumphs and tribulations faced by companies like Bloomberg, DigitalOcean, The New York Times, and more as well as the specific benefits these organizations are reaping, from enhanced security and observability to improved performance and cost savings. Additionally, they’ll teach other projects their process for creating impactful case studies. By the end, the audience will understand the real-world applications and advantages of cloud native technologies and why end users pick a project.

Speakers

Shedrack Akintayo

Technical Marketing Engineer, Isovalent at Cisco

Shedrack Akintayo is a software engineer and technical writer based in London with six years of experience spanning Web Engineering, DevOps, Technical Writing, and Developer Relations. Shedrack works as a Technical Marketing Engineer at Cisco, via the Isovalent acquisition. He actively... Read More →

Bill Mulligan

Community, Isovalent at Cisco

Bill Mulligan is a cloud native pollinator and community builder. He has given talks, written articles, and appeared on podcasts on a wide range of topics around cloud native. While at CNCF he restarted the Kubernetes Community Day program. He is currently at Isovalent growing the... Read More →

Pick My Project! pdf

Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 2 | 254 B

Cloud Native Experience

Content Experience Level Any

5:25pm MST

Why Serverless Is Trending Again - Matt Butcher, Fermyon & Jay Jenkins, Akamai

Thursday November 14, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 2 | 251 AD

The idea of serverless computing really took off in 2016. But after an apparent peak in 2019, it seemed to be on the decline. Yet things took an about face again in 2022. The idea of serverless functions not only regained lost ground, but even now it is hitting new levels of interest. Why? In this session, we first get very clear about what “serverless” means as a design pattern. Then we dive into what it is good for, and mention a few of the major successes of serverless computing. From there, we look into the present and future of serverless technology, particularly inside of Kubernetes. WebAssembly is the runtime technology that enables serverless in Kubernetes to outperform Amazon Lambda and other competitors.

Speakers

Jay Jenkins

CTO, Akamai

As an experienced technology leader with a background at Akamai, ByteDance and Google, I'm driven to help organizations maximize the benefits of Kubernetes and cloud-native technologies. My 20+ years in agile transformation across diverse industries have equipped me to guide teams... Read More →

Matt Butcher

CEO, Fermyon

Matt Butcher (CEO) is a founder of Fermyon. He is one of the original creators of Helm, Brigade, CNAB, OAM, Glide, and Krustlet. He has written or co-written many books, including "Learning Helm" and "Go in Practice." He is a co-creator of the "Illustrated Children’s Guide to Kubernetes... Read More →

Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 2 | 251 AD

Cloud Native Novice

Content Experience Level Any

5:25pm MST

One Gateway API to Rule Them All (and in the Cluster Configure Them) - Flynn, Buoyant

Thursday November 14, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 1 | 155 E

Ingress, egress, east-west, north-south… Kubernetes has always had a lot of different ways to talk about network traffic, each with its own concerns. For years, the possibility of unifying these kinds of configuration under a single API was a tantalizing but far-off possibility until Gateway API v0.8 took the first step of combining ingress and mesh configuration. Now Linkerd is stepping up to use Gateway API to handle egress as well. Join us for a hot-off-the-presses look into what egress policy covers and what people need from it, how we can make egress functionality work within Gateway API's existing model, and why Linkerd took this approach. We'll touch on the implementation and finish up with a live demo showing off a real-world example of egress management using Linkerd and Gateway API.

Speakers

Flynn -

Tech Evangelist, Buoyant

Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | 155 E

Connectivity

Content Experience Level Any

5:25pm MST

Now You See Me: Tame MTTR with Real-Time Anomaly Detection - Kruthika Prasanna Simha & Raj Bhensadadia, Apple Inc.

Thursday November 14, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 1 | Grand Ballroom B

Picture this! You are running an application on a Kubernetes cluster & you notice that your nodes have been restarting and your users are noticing that your application is unreachable. As an engineer, you want to identify these failures in real-time & differentiate these from known states, at scale. But we know, static thresholds fail for dynamic metrics! This session explores real-time anomaly detection for cloud-native systems. We'll show you how to reduce MTTR and mean time to analyse by proactively identifying abnormal application behavior using statistical & machine learning algorithms on time series data from Prometheus. Learn to pinpoint issues, identify missing instrumentation, and visualize anomalies using Grafana. This session equips you to achieve faster issue resolution and maintain optimal application health. We'll demo practical techniques for metrics selection, anomaly detection and proactive issue identification to manage your cloud-native applications.

Speakers

Raj

Machine Learning Engineer, Apple Inc.

Raj Bhensadadia, a machine learning engineer with a passion for leveraging ML technologies to enhance monitoring and analysis of large scale systems and ensure robustness and performance of infrastructure and services.

Kruthika Prasanna Simha

Senior Software Engineer, Apple

Kruthika is a software engineer at Apple specializing in building ML enabled observability solutions. She holds a Masters in Computer Engineering and has specialized in Machine Learning. In her free time, she likes to dabble with Jupyter Notebooks for running experiments with data... Read More →

Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom B

Observability

Content Experience Level Any

5:25pm MST

How Google Built a New Cloud on Top of Kubernetes - Jie Yu & Prashanth Venugopal, Google

Thursday November 14, 2024 5:25pm - 6:00pm MST

Salt Palace | Level 1 | Grand Ballroom H

“Build a new air-gapped cloud with open source technologies” – this is what a small team at Google was tasked with in late 2021. The team delivered a private cloud platform, complete with managed VMs, databases, AI services, and more. Moreover, it did so by leveraging a number of CNCF technologies, including Kubernetes, Istio, etc. We’ll share the potential of these technologies, as well as their limitations, by explaining how they were used to build a scalable, reliable, and secure cloud platform. We’ll discuss how to implement cloud tenancy concepts, enforce isolation among tenants, and how we built a cloud API leveraging k8s API machinery and service mesh. A key innovation in building the private cloud platform was the “Kubernetes Defined Networking” (KDN) stack we created: by leveraging existing k8s networking features (e.g. load balancer, etc.) along with a few key enhancements, we implemented most of the traditional cloud SDN concepts, like VPC, firewall, VM support, etc.

Speakers

prashanth venugopal

Kubernetes Networking Lead, Google

Prashanth has an almost two decades long career, across various networking market segments. In his current role as the lead architect of Google's Kubernetes networking stack, he helps drive the networking stack's evolution for Google Kubernetes Engine (for the Public Cloud Market... Read More →

Jie Yu

Principal Software Engineer, Google

Jie Yu is a currently a Principal Software Engineer at Google. Jie is currently working on Google Distributed Cloud, and is the leading architect for the product. Prior to Google, Jie was a Chief Architect at Mesosphere (D2IQ), and worked at Twitter. Jie joined Kubernetes community... Read More →

Kubecon SLC 2024 How Google Built a New Cloud on Top of Kubernetes pdf

Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom H

Platform Engineering

Content Experience Level Any

11:00am MST

Can You Put a Price Tag on Open Source? - Mario Fahlandt, Kubermatic & Bob Killen, CNCF

Friday November 15, 2024 11:00am - 11:35am MST

Salt Palace | Level 2 | 255 E

Earlier this year, the Harvard Business School released the paper titled “The Value of Open Source Software,” estimating the worldwide value of OSS at 8.8 trillion, and on average, it would cost companies at least 3.5x more to develop similar projects internally. Yet, many organizations and engineers struggle to understand or realize this kind of value from contributing to these projects. In this talk, Bob and Mario will discuss the many benefits individuals and companies can achieve by contributing to open source and guide you through the first steps to becoming a contributor. They will also cover how to develop a lightweight open source strategy and convince your organization that an open source first approach can yield great returns.

Speakers

Mario Fahlandt

Service Delivery Architect, Kubermatic

Mario is working as a Customer Delivery Architect @Kubermatic with the focus on planning and building concepts and architecture for Infrastructure in the cloud native world.He started the GDG Munich for Cloud and became a GDE in 2019. In the Kubernetes project he is involved in SIG-ContribEx... Read More →

Bob Killen

Senior Technical Program Manager, CNCF

Bob is a Program Manager at the Google Open Source Programs Office with a focus on Cloud Native computing. He serves the Kubernetes project as a Steering Committee member and chair of the Contributor Experience SIG. Bob comes from an academic background, spending 15 years at the University... Read More →

Can you put a price on open source KCNA24 pdf

Friday November 15, 2024 11:00am - 11:35am MST
Salt Palace | Level 2 | 255 E

Cloud Native Experience

Content Experience Level Any

11:00am MST

Open Source 2.0: The Maintainers' Perspective - William Morgan, Buoyant; Ashley Davis, Venafi; Deepthi Sigireddi, PlanetScale; Emily Omier, Emily Omier Consulting; Matt Bates, Cofide

Friday November 15, 2024 11:00am - 11:35am MST

Salt Palace | Level 2 | 255 B

Open source rules the world, and for a good reason: The code is generally better and more secure, bugs are fixed faster, and more. Virtually all modern applications run on it. But the landscape has changed since the early Linux days. Nights and weekends, volunteer-led projects are increasingly rare. Especially in the CNCF landscape, open source is maintained almost exclusively by companies that pursue a strategic goal, and they need a business justification for paying their engineers. So, who writes the code has changed, but the community's expectations — that it should be free — hasn't. While open source will remain free, the companies behind it must find ways to monetize it — whether through support, enterprise editions, or licensing models. Recent changes, including projects like Terraform, Flux, and Linkerd, highlight the need for a paradigm shift. Join this panel to hear from project maintainers why that is and the future they envision.

Speakers

Emily Omier

Consultant, Emily Omier Consulting

Emily Omier is a positioning consultant who helps open source startups accelerate revenue and community growth by clarifying the project and product's market category, unique value in the market and target user audience. She hosts The Business of Open Source podcast and writes a blog... Read More →

Matt Bates

Founder & CEO, Cofide

William Morgan

Linkerd Director, Buoyant CEO, Buoyant

William is a director on the Linkerd project and the co-founder and CEO of Buoyant, the creators of Linkerd. Prior to Buoyant, he was an infrastructure engineer at Twitter, a software engineer at Powerset, Microsoft, and Adap.tv, a research scientist at MITRE. He holds an MS in computer... Read More →

Ashley Davis

Staff Software Engineer, Venafi

As a teenager, Ash taught himself to program after wondering how exactly video games were made. That led to adventures trawling through open source codebases, sparking an interest in computers spanning from bare-metal machine code right up to scalable distributed platforms like Kubernetes... Read More →

Deepthi Sigireddi

Software Engineer, PlanetScale

Friday November 15, 2024 11:00am - 11:35am MST
Salt Palace | Level 2 | 255 B

Cloud Native Experience

Content Experience Level Any

11:00am MST

The State of Kubernetes Optimization and the Role of AI - James Wilson, nOps; Haoran Qiu, Microsoft; Katie Gamanji, Apple; Jasmine James, Square; Josh Cypher, Sonos

Friday November 15, 2024 11:00am - 11:35am MST

Salt Palace | Level 1 | 155 B

Featuring a diverse panel of experts, attendees will hear the latest in Kubernetes optimization. The session will encourage and engage attendees to challenge conventional wisdom and explore innovative approaches to optimization. Participants will leave with actionable knowledge and new perspectives they can apply to their own environments. Topics include: - Valuable insights into the current state of AI in optimization, highlighting both its potential and barriers to adoption - How and when AI can be used for real-time decision-making - Exploring the intersection of sustainability and optimization, emphasizing the importance of visibility in driving sustainable practices - The state of multidimensional pod autoscaling and potential to resolve conflicts between horizontal and vertical autoscaling - How new computing options and tools like Karpenter have the potential to disrupt the bin packing problem - How cloud-native projects can leverage new tools to track efficiencies

Speakers

Katie Gamanji

Sr Field Engineer, Apple

Katie is a cloud native leader and practitioner, currently in a Senior Field Engineer role at Apple and a TOC for CNCF. As a platform engineer, Katie contributed to Conde Nast and American Express platforms and at CNCF led the End User Community. Katie is the author of the Cloud Native... Read More →

Haoran Qiu

Research Engineer, Microsoft

Haoran Qiu is a Research Engineer at Microsoft Azure Systems Research. His research interests are in cloud efficiency, ML systems, and applying ML for cloud systems design and operation. Haoran was a recipient of ML and Systems Rising Star by MLCommons in 2023. Before joining Microsoft... Read More →

Jasmine James

Head of Development Infrastructure, Square

Jasmine James is an engineering leader at Square heading the Development Infrastructure for the Devices Platform overseeing CI Infrastructure, Developer Experience, and Test Rack teams aiming to streamline development and foster continuous feedback. She is passionate about diversity... Read More →

James Wilson

VP of Engineering, nOps

James has over two decades of experience in tech, with a strong focus in leading engineering teams in building cloud-based solutions. His expertise includes container orchestration, high-speed data transport, and cloud-native architectures. Currently, he leads the engineering team... Read More →

Josh Cypher

Senior DevOps Engineer, Sonos

Josh, a Senior DevOps Engineer at Sonos, has a diverse background in quality assurance and automation. Throughout his career, he has held roles such as tester, backend developer, automation engineer, engineering manager, and head of quality before specializing in DevOps and Kubernetes... Read More →

nOps KCNA24 Template.pptx pdf

Friday November 15, 2024 11:00am - 11:35am MST
Salt Palace | Level 1 | 155 B

Operations + Performance

Content Experience Level Any

11:00am MST

Platform Engineering in Financial Institutions: The Practitioner Panel - Paula Kennedy, Syntasso; Chris Plank, NatWest Bank; Suhail Patel, Monzo; Jinhong Brejnholt, Saxo Bank; Rachael Wonnacott, Fidelity International

Friday November 15, 2024 11:00am - 11:35am MST

Salt Palace | Level 1 | Grand Ballroom H

In the world of small and large financial institutions, platform engineering is a driver for shipping quickly, safely, and efficiently. This panel brings together seasoned practitioners from leading banks and financial institutions to share their firsthand platform experiences, successes, and challenges. - Discover how platform engineering can enhance developer experience, facilitate rapid innovation and drive efficiencies. - Delve into the complexities of navigating regulatory compliance, specifically when using open source technologies such as Kubernetes. - Learn from the experts' successes, setbacks and strategies (across technology and people), gaining actionable insights for successful implementation. Join us as we discuss the journey of adopting and deploying CNCF technologies at scale within the highly regulated financial sector. We’ll explore practical examples of both successes and incidents where things have gone wrong, providing the audience with valuable takeaways.

Speakers

Paula Kennedy

Chief Operating Officer, Syntasso

Paula is Co-Founder & Chief Operating Officer of Syntasso; previous roles include Senior Director at VMware Tanzu, Pivotal and Co-Founder & Chief Operating Officer of CloudCredo. With 20+ years experience in IT, Paula champions community, diversity and inclusion and has a range of... Read More →

Suhail Patel

Senior Staff Engineer, Monzo

Suhail is a Staff Engineer at Monzo focused on building the Core Platform. His role involves building and maintaining Monzo's infrastructure which spans over two thousand microservices and leverages key infrastructure components like Kubernetes, Cassandra, Etcd and more. He focuses... Read More →

Jinhong Brejnholt

Global Head of Cloud & Container Platforms, Saxo Bank

Jinhong is an accomplished cloud and platform architect, deeply committed to advancing DevSecOps practices and cloud-native technologies. She holds an MSc in Software Development and Technology and is certified as a Kubernetes application developer, administrator, and security specialist... Read More →

Chris Plank

Enterprise Architect & Joint Product Owner, NatWest Bank

Chris Plank is a Enterprise Architect working for NatWest Bank in Edinburgh, Scotland. He has been leading a Platform as a Product initiative within the Bank over the last year looking to radically change the Banks approach to provisioning and maintaining services. Outside of work... Read More →

Rachael Wonnacott

Technical Product Owner, Kubernetes Platform, Fidelity International

Rachael has spent the last decade focused on platform engineering. She places a conscious emphasis on improving flow and is on the quest to smooth the application lifecycle for developers in the enterprise. With a background in astrophysics, Rachael brings her scientific approach... Read More →

Friday November 15, 2024 11:00am - 11:35am MST
Salt Palace | Level 1 | Grand Ballroom H

Platform Engineering

Content Experience Level Any

11:00am MST

CLBO: ClashLoopBackOff | Attendee Edition

Friday November 15, 2024 11:00am - 12:00pm MST

Salt Palace | Level 1 | Halls A-C + 1-5 | Project Pavilion, Solutions Showcase

Friday November 15, 2024 11:00am - 12:00pm MST
Salt Palace | Level 1 | Halls A-C + 1-5 | Project Pavilion, Solutions Showcase

Experiences, ClashLoopBackOff

Content Experience Level Any

11:55am MST

Building Massive-Scale Generative AI Services with Kubernetes and Open Source - John McBride, OpenSauced

Friday November 15, 2024 11:55am - 12:30pm MST

Salt Palace | Level 2 | 250 AD

At OpenSauced, we power over 40,000 generative AI inferences every day, all through our in-house platform ontop of Kubernetes. The cost of doing this kind of at-scale AI inference with a third party provider API would be astronomic. Thankfully, using Kubernetes, the public cloud, and open-source technologies, we've been able to scale with relatively low costs and a lean stack. In this talk, John will walk through the journey of building a production grade generative AI system using open source technologies, open large language models, and Kubernetes. We'll also explore why we chose to build ontop of Kubernetes for our AI workloads over using a third party provider, and how we're running and managing our AI/ML clusters today. Additionally, we'll dive into the techniques we used to groom our Retrieval-Augmented-Generation pipelines for efficiency ontop of Kubernetes and other practical tips for deploying your own AI services at-scale.

Speakers

John McBride

Sr. Software Engineer, OpenSauced

John is a Sr. Software Engineer at OpenSauced where he also serves as Head of Infrastructure and AI engineer. He is the maintainer of spf13/cobra, the Go CLI bootstrapping library used throughout the CNCF landscape. In the past, he has worked on open source Kuberenetes platforms... Read More →

Kubecon 24 Building Massive Scale Generative AI Services with Kubernetes and Open Source John McBride, OpenSauced pdf

Friday November 15, 2024 11:55am - 12:30pm MST
Salt Palace | Level 2 | 250 AD

AI + ML

Content Experience Level Any

11:55am MST

Accessibility at KubeCon: Deaf Voices in Cloud Native - Alfonso Torres, DeafTech; Jay Jackson, CallRevu; Destiny O'Connor, Women Blessing Women; Travis Johnson, Convo Communications; Rob Koch, Slalom Build

Friday November 15, 2024 11:55am - 12:30pm MST

Salt Palace | Level 1 | Grand Ballroom B

Never met a deaf person at a conference? That is not surprising. While there are lots of deaf engineers, until recently, most conferences — and virtually any other community activity — haven't been accessible to deaf community members. But for KubeCon, that all changed exactly a year ago! During this discussion, deaf panelists from various countries will shed light on their unique experiences being deaf in tech and the impact that making KubeCon accessible has had on their lives and hopes for the future. Attendees will learn why the technology space is a great fit for deaf individuals, the benefits and opportunities deaf professionals bring to the table, and what it takes to be an accessible and welcoming community. Panelists will also debunk common misconceptions and empower *you* to take steps toward a more inclusive cloud native ecosystem.

Speakers

Rob Koch

Principal, Slalom Build

Alfonso Balderas Torres

Founder, DeafTech

I am founder of DeafTech, an initiative dedicated to support the Deaf community in their professional development in the field of technology, through technology events, programming courses and conferences. In addition, I am co-founder of WorldDeafTech, an organization focused on creating... Read More →

Destiny O'Connor

Co-Chair CNCF Deaf and Hard of Hearing WG, Web Developer, Women Blessing Women

As Co-Chair of the CNCF Deaf and Hard of Hearing Working Group, where I channel my passion for creating a more inclusive tech world for deaf and hard-of-hearing individuals. My mission is to educate the tech community about the unique challenges and experiences of being deaf in this... Read More →

Jay Jackson

Senior Software Engineer, CallRevu

Jay Jackson, a Senior Software Engineer at CallRevu, brings over 2 decades of experience in the tech industry. Jay has navigated this tech journey as a deaf individual, with American Sign Language (ASL) as his primary mode of communication and is passionate about exploring ways to... Read More →

Travis Johnson

Level 3 Engineer, Convo Communications

A Linux aficionado, Travis Johnson is a deaf Level 3 Engineer with 10+ years of experience in the VoIP industry, where he has gained deep knowledge of networking and scripting. A firm believer in lifetime learning, Travis continuously acquires new skills and certifications. Off work... Read More →

Friday November 15, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Grand Ballroom B

Cloud Native Experience, Diversity + Equity + Inclusion

Content Experience Level Any

11:55am MST

Kubernetes Upgrades: Less Pain, More Gain (and Maybe a Little Swearing) - Jago Macleod, Google

Friday November 15, 2024 11:55am - 12:30pm MST

Salt Palace | Level 1 | Grand Ballroom H

Kubernetes upgrades are a major pain point for many users, often due to the complexity of managing multiple, independently versioned components. This talk will delve into the strategies and best practices for minimizing disruption and maximizing success during Kubernetes upgrades. We'll explore: - Common pitfalls and challenges faced during upgrades - Practical tips for smoother, more reliable upgrade processes - The risks of relying solely on Long Term Support (LTS) versions - Improving upgrade reliability for all Kubernetes users, regardless of their chosen platform Led by the head of both OSS Kubernetes and GKE Release and Upgrades at Google, this talk will provide valuable insights and actionable advice for anyone looking to create a sustainable and successful upgrade strategy. Whether you're a seasoned Kubernetes veteran or just getting started, this session will equip you with the knowledge and tools to navigate the complex landscape of Kubernetes upgrades.

Speakers

Jago Macleod

Engineering Director, Google

Jago Macleod is an Engineering Director at Google, where he leads much of the Kubernetes and Google Kubernetes Engine (GKE) team, which gives him the opportunity to work with some of Google Cloud’s largest customers. Prior to working at Google, Jago helped make the smart homes that... Read More →

KubeCon NA 2024 Kubernetes Upgrades pptx

Friday November 15, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Grand Ballroom H

Platform Engineering

Content Experience Level Any

2:00pm MST

Gamifying Cloud Native: How to Design and Build an Educational Game for Your Project - Calum Murray, University of Toronto, Faculty of Applied Science and Engineering & Zainab Husain, OCAD University

Friday November 15, 2024 2:00pm - 2:35pm MST

Salt Palace | Level 1 | Grand Ballroom B

Have you ever struggled to explain what a Cloud Native project does? One of the challenges many cloud native projects face is that the abstractions they provide are not intuitive for new users. Since cloud technologies are often built on top of each other and use domain specific language, this problem compounds. Luckily, educational games can be made to help communicate these abstract concepts in a fun and engaging format! In this talk, we will explore how you can build an educational game for your project through the example of a game that the Knative community has built to teach Knative Eventing. We will walk through the steps other open source projects can follow to design their own educational game, including brainstorming strategies for deciding on key concepts and which metaphors/symbols to use to represent these concepts. These information design strategies can also be applied to create more understandable educational cloud native content in general!

Speakers

Zainab Husain

Knative UX Design Lead, OCAD University

Zainab Husain is a UX Design Researcher working at OCAD University. She completed her Masters in Engineering at the University of Toronto, focusing on Human Computer Interactions. Zainab is passionate about tools that improve collaboration between Engineers and Designers and is also... Read More →

Calum Murray

Knative Eventing Maintainer and UX Lead, University of Toronto, Canada

I'm a software engineer, and I love building cool things in open source. I like to seek out the most interesting and challenging problems which I think will have a large impact, and build creative solutions to them. I also like to share my passion for open source with others, and... Read More →

Knative Eventing Game pptx

Friday November 15, 2024 2:00pm - 2:35pm MST
Salt Palace | Level 1 | Grand Ballroom B

Cloud Native Experience

Content Experience Level Any

2:55pm MST

Tick, TAG, TOC - Keeping Cloud Native Running - Karena Angell, Red Hat; Rajas Kakodkar, Broadcom; Alex Chircop, Akamai; Ricardo Aravena, Snowflake; Marina Moore, Edera

Friday November 15, 2024 2:55pm - 3:30pm MST

Salt Palace | Level 1 | Grand Ballroom B

With only so many hours in the day, how does the cloud native community keep things running? Over 190 projects, thousands of contributors, and an array of groups all contribute to what we know as “cloud native” but there is more going on behind the scenes that keep the machine of cloud native running smoothly and driving the technical direction of the landscape. In this panel discussion, you’ll hear from Chairs and Technical Leads of Technical Advisory Group (TAG) Runtime, Storage, App Delivery and the chair of the CNCF Technical Oversight Committee (TOC) on - How they are defining the roadmap for the future - The glue and oil of collaboration between advisory, oversight, and projects’ health - How you can time your engagement with these groups to have an outsized impact! This is not a maintainer track session. While they are separate tracks for specific CNCF TAG and TOC activities, this is meant to be your backstage pass to see how the CNCF landscape gets shaped!

Speakers

Marina Moore

Edera

Marina Moore has a PhD from at NYU where she performed research into software supply chain security. This research focused on real-world application through open source contribution. She is an open source maintainer and active in open source communities through the CNCF and OpenSSF... Read More →

Alex Chircop

Chief Architect at Akamai, Akamai

Chief Architect at Akamai. Previously a founder and CTO of Ondat (formerly StoraeOS), building software defined solutions for cloud native environments. Alex is also a co-chair of the CNCF Storage TAG (previously SIG). Before embarking on the startup adventure he spent over 25 years... Read More →

Ricardo Aravena

Cloud Native Lead, Snowflake

Ricardo Aravena is a Cloud Infrastructure Lead helping automate systems with cloud native technologies at Snowflake. He's an open source enthusiast, co-chair of the CNCF TAG-Runtime, Lead for CNCF Cloud Native AI Working Group and a CNCF Ambassador. He has been working in tech for... Read More →

Karena Angell

Senior Principal Chief Architect, Red Hat

Karena Angell is a Senior Principal Chief Architect at Red Hat focusing on cloud native application workloads for Kubernetes, open source software projects, as well as solutions for the 'open' hybrid cloud.

Rajas Kakodkar

Staff Software Engineer at Broadcom | Tech Lead CNCF TAG Runtime, Broadcom

Friday November 15, 2024 2:55pm - 3:30pm MST
Salt Palace | Level 1 | Grand Ballroom B

Cloud Native Experience

Content Experience Level Any

2:55pm MST

The Key Value of etcd Over Custom Resources: Scalability - Jef Spaleta, Isovalent at Cisco

Friday November 15, 2024 2:55pm - 3:30pm MST

Salt Palace | Level 1 | 155 B

Cilium defaults to using Kubernetes Custom Resources to hold Cilium specific internal state, however when the cluster is large enough, the Kubernetes API becomes a bottleneck on performance. To scale a cluster to hundreds of nodes, Cilium can be configured to use a dedicated external etcd instance. This talk will discuss the details of what the external etcd looks like from an operator perspective, and explore why Cilium uses an external etcd for enhanced scalability. It will cover how to manage a cluster by bypassing the Kubernetes API and interacting only with the cluster's etcd key-value store - and also why it might be a bad idea. Get a taste of what's possible by bypassing the Kubernetes API and interacting with the etcd API directly, and learn why Cilium has an option to use a dedicated etcd deployment, not shared by the Kubernetes API, for holding Cilium state and the scalability benefits it can bring to your cluster.

Speakers

Jef Spaleta

Technical Community Advocate, Isovalent at Cisco

Jef Spaleta has more than a decade of experience in the technology industry; as software engineer, open source contributor, IoT hardware developer, operations, and most recently as a community advocate at Isovalent.

Friday November 15, 2024 2:55pm - 3:30pm MST
Salt Palace | Level 1 | 155 B

Operations + Performance

Content Experience Level Any

2:55pm MST

Modernization of Intuit Payroll Enterprise Using Event Driven Architecture - Hema Maarimuthu & Vigith Maurice, Intuit

Friday November 15, 2024 2:55pm - 3:30pm MST

Salt Palace | Level 1 | Grand Ballroom H

Intuit's Quickbooks Online Payroll Enterprise, a critical application serving over 2 million customers, processes over a million transactions and $34 billion in payroll taxes. We're modernizing with a heavy investment in event-driven architecture for effective handling of financial data. This major transition extends beyond just the payroll platform; it involves decomposing complex systems across Intuit products using event-driven architecture and a focus on availability, scalability, and security is crucial. To address challenges like autoscaling for high throughput, low latency, better operational excellence, and development productivity, we have built our modernized platform on Numaflow, an open-source, Kubernetes native, language-agnostic platform. In our presentation, we will share our journey of modernizing our stack using event-driven serverless architecture on Numaflow and highlight the advantages it has brought to our developers and technology infrastructure.

Speakers

Vigith Maurice

Principal Engineer, Intuit

Vigith is a co-creator of Numaproj and Principal Software Engineer for the Intuit Core Platform team in Mountain View, California. One of Vigith's current day-to-day focus areas is the various challenges in building scalable data and AIOps solutions for both batch and high-throughput... Read More →

Hema Maarimuthu

Principal Engineer, Intuit

Hema is a Principal Software Engineer for Intuit's Online Payroll Infrastructure team in Mountain View, California. Hema’s current work involves leading cross-functional teams, strategizing, and driving operational excellence initiatives. Her major accomplishments include successfully... Read More →

Friday November 15, 2024 2:55pm - 3:30pm MST
Salt Palace | Level 1 | Grand Ballroom H

Platform Engineering

Content Experience Level Any

4:00pm MST

Privacy in the Age of Big Compute - Sal Kimmich, Confidential Computing Consortium, Linux Foundation

Friday November 15, 2024 4:00pm - 4:35pm MST

Salt Palace | Level 1 | Grand Ballroom A

In the age of big compute, the definition of privacy has transformed as re-identification from anonymized datasets has become easier. This session explores the challenges and solutions in navigating privacy concerns in high-dimensional data environments. Attendees will learn about the risks of re-identification, the importance of unicity in data sets, and how Privacy Enhancing Technologies (PETs) and Confidential Computing can mitigate these risks. Discover how these advancements can help protect sensitive data, ensure compliance, and foster a more secure data ecosystem in cloud-native environments.

Speakers

Sal Kimmich

Technical Community Architect, Confidential Computing Consortium, Linux Foundation

Sal is an advocate for open source, passionate about helping engineers, ethical hackers, and digital enthusiasts navigate modern software development. With over a decade of experience building cloud-native machine learning pipelines in healthcare and tech for good sectors, Sal now... Read More →

Friday November 15, 2024 4:00pm - 4:35pm MST
Salt Palace | Level 1 | Grand Ballroom A

Data Processing + Storage

Content Experience Level Any

4:55pm MST

Goodbye etcd! Running Kubernetes on Distributed PostgreSQL - Denis Magda, Yugabyte

Friday November 15, 2024 4:55pm - 5:30pm MST

Salt Palace | Level 1 | Grand Ballroom A

Kubernetes once favored Etcd as a database for all cluster data. Back then, relational databases lacked the availability and scalability characteristics required by Kubernetes. However, as Etcd encountered challenges with various Kubernetes workloads, relational databases continued to evolve. This session is a practical guide for deploying fault-tolerant and scalable Kubernetes clusters on distributed PostgreSQL. We’ll begin with Kine, which integrates into the Kubernetes architecture, enabling relational databases for cluster metadata management. Then, we’ll use Kine to deploy Kubernetes on a single-server PostgreSQL instance. After that, we’ll migrate to a multi-node PostgreSQL instance, allowing Kubernetes to tolerate zone and region outages and scale to thousands of nodes on demand.

Speakers

Denis Magda

Head of DevRel, Yugabyte

Denis started his software engineering career at Sun Microsystems and Oracle, where he built JVM/JDK and led one of the Java development groups. After learning Java from the inside, he joined the world of distributed systems and databases, where he has remained ever since. His experience... Read More →

Friday November 15, 2024 4:55pm - 5:30pm MST
Salt Palace | Level 1 | Grand Ballroom A

Data Processing + Storage

Content Experience Level Any