Attending this event?
In-person
November 12-15
Learn More and Register to Attend

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Mountain Standard Time (UTC -7). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis. 
Tuesday, November 12
 

5:30pm MST

⚡ Lightning Talk: `Kubectl Debug` Lacks an `IDE` Option. Let’s Fix That! - Mario Loriedo, Red Hat
Tuesday November 12, 2024 5:30pm - 5:35pm MST
Don't get me wrong. `kubectl debug` is one of my favorite `kubectl` commands. But probably because I like it so much, I am convinced it deserves more love! This talk will present a `kubectl debug` extension that starts an IDE in an ephemeral container for debugging purposes. This extension uses the DevWorkspace operator, which is capable of running lightweight cloud development environments, including the IDE, in containers. If you like debugging by adding breakpoints in an IDE rather than inspecting your application's logs, you should attend this talk.
Speakers

Mario Loriedo

Senior Principal Software Engineer, Red Hat
Mario is a Senior Principal Software Engineer at Red Hat. He works on Podman and on container-based developer tools. He has been a CNCF Ambassador and the tech lead of the Eclipse Che project. He has co-created the Devfile (a CNCF Sandbox Project). He has been a speaker at conferences...
Tuesday November 12, 2024 5:30pm - 5:35pm MST
Hyatt Regency | Level 4 | Regency Ballroom BCD

5:45pm MST

⚡ Lightning Talk: Evaluating Scheduler Efficiency for AI/ML Jobs Using Custom Resource Metrics - Dmitry Shmulevich, NVIDIA
Tuesday November 12, 2024 5:45pm - 5:50pm MST
Kubernetes deployments frequently utilize custom resources beyond just CPU and memory, such as GPUs, which are essential for AI/ML workloads. While the Metrics API offers insights into CPU and memory usage at both the pod and node levels, it does not provide similar information for custom resources. Although resource requests for custom resources are specified in the pod spec, there is no visibility into how efficiently these resources are utilized at the node and cluster levels. To address this gap, we developed a Prometheus Node Resource Exporter tailored to monitor custom resources. Our case study focuses on evaluating the efficiency of Kubernetes schedulers when handling a high volume of AI/ML jobs, using GPU occupancy on the nodes as the primary indicator. In this lightning talk, we will present a comparative analysis of several scheduling frameworks based on the metrics collected by our custom exporter.
Speakers

Dmitry Shmulevich

Software Engineer, NVIDIA
Dmitry is a software engineer at NVIDIA with over 25 years of experience in software development, specializing in cloud computing for the past eight years. Throughout his career, he has made significant contributions to various systems and projects across the cloud stack. He is also...
Tuesday November 12, 2024 5:45pm - 5:50pm MST
Hyatt Regency | Level 4 | Regency Ballroom BCD
  ⚡ Lightning Talks, Observability
  • Content Experience Level Any

5:50pm MST

⚡ Lightning Talk: Future-Proofing Kubernetes: Impact of Storage Version Migration and Meaning of Resource Version (RV) - Nilekh Chaudhari, Microsoft
Tuesday November 12, 2024 5:50pm - 5:55pm MST
Kubernetes relies on API data being actively rewritten to support some maintenance activities related to at-rest storage. Two prominent examples are the versioned schema of stored resources (i.e., the preferred storage schema changing from v1 to v2 for a given resource) and encryption at rest (i.e., rewriting stale data based on a change in how the data should be encrypted). The simplest way to rewrite data is to issue no-op update requests via kubectl. This approach is problematic for any resource that can contain a large amount of data, such as Kubernetes secrets, and it is also impractical to perform without automation, as the number of resources that need migration is always growing. Storage Version Migration (SVM), available as a built-in alpha API since Kubernetes v1.30, helps automate this rewriting. However, the implementation of SVM has significant implications for the entire Kubernetes project and its ecosystem.
Speakers

Nilekh Chaudhari

Software Engineer, Microsoft
Nilekh is a Software Engineer at Microsoft, specializing in Kubernetes. He actively contributes to SIG Auth and SIG API Machinery and is a core maintainer of the Secrets Store CSI Driver, the Azure Provider for the Secrets Store CSI Driver, and the Gatekeeper Library project.
Tuesday November 12, 2024 5:50pm - 5:55pm MST
Hyatt Regency | Level 4 | Regency Ballroom BCD
  ⚡ Lightning Talks, Platform Engineering
  • Content Experience Level Any

6:10pm MST

⚡ Lightning Talk: Running Kind Clusters with GPU Support Using Nvkind - Evan Lezar, NVIDIA
Tuesday November 12, 2024 6:10pm - 6:15pm MST
Kind is a powerful tool for running local Kubernetes clusters using Docker. It is particularly useful for testing, development, and CI/CD workflows, offering features like multi-node cluster support, easy configuration, and cross-platform compatibility. However, providing access to GPUs in Kind is not straightforward. There is no standard way to inject GPUs into a Kind worker node, and even with a series of "hacks" to make it possible, post-processing is still needed to isolate different sets of GPUs to different nodes. In this lightning talk, we introduce nvkind – a wrapper around Kind that encapsulates the steps necessary to make GPUs available to Kind worker nodes. Ideally, GPU support would have been added to Kind directly, but many challenges stand in the way. This talk discusses those challenges, how we've overcome them with nvkind, and the steps needed to eventually support GPUs directly within Kind itself.
Speakers

Evan Lezar

Senior Systems Software Engineer, NVIDIA
Evan Lezar is a Senior Systems Software Engineer on the Cloud Native team at NVIDIA. His focus is making GPUs and other NVIDIA devices easily accessible from containerized environments. This includes driving development and adoption of the Container Device Interface (CDI).
Tuesday November 12, 2024 6:10pm - 6:15pm MST
Hyatt Regency | Level 4 | Regency Ballroom BCD
  ⚡ Lightning Talks, AI + ML
  • Content Experience Level Any
 
Wednesday, November 13
 

11:15am MST

The Future of DBaaS on Kubernetes - Melissa Logan, Constantia; Sergey Pronin, Percona; Deepthi Sigireddi, PlanetScale; Gabriele Bartolini, EDB
Wednesday November 13, 2024 11:15am - 11:50am MST
Running Database-as-a-Service (DBaaS) in the cloud is a common practice for organizations, and more are seeking to offer DBaaS on Kubernetes. Benefits include cost efficiencies and a faster, more scalable development environment. Despite these benefits, managing a DBaaS on Kubernetes can be challenging. In this panel, database experts from the Data on Kubernetes Community will discuss how to get started with Kubernetes and operators to run DBaaS, storage and security requirements, common patterns for deployment and Day 2 operations, how to leverage AI for DBaaS, and pitfalls to avoid. They will also share real-world experiences from users running DBaaS on Kubernetes.
Speakers

Melissa Logan

CEO, Constantia
Melissa Logan has worked in tech for 24 years and is currently director of the Data on Kubernetes and Data Mesh Learning communities, and founder of Constantia.io - a tech community and communications company. Constantia works with data and open source companies to provide marketing...

Gabriele Bartolini

VP of Cloud Native, EDB
Gabriele, a co-founder of 2ndQuadrant and open-source advocate, has been instrumental in PostgreSQL's global growth. Focused on enhancing business continuity for large-scale databases, he has championed stateful workloads in cloud-native environments since 2019. As a co-founder and...

Deepthi Sigireddi

Software Engineer, PlanetScale
Deepthi is the technical lead for Vitess, a CNCF graduated open source project. She also leads the Vitess engineering team at PlanetScale, which offers a database service built on Vitess. She brings over 20 years of experience building scalable systems to this role. She enjoys speaking...

Sergey Pronin

Product guy, Percona
Sergey is a passionate technology “driver”. After graduation, he worked in various fields: internet service provider, financial sector, and M&A business. His main focal points were infrastructure and the products around it. At Percona, as a Group Product Manager, he drives forward Kubernetes and...
Wednesday November 13, 2024 11:15am - 11:50am MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage
  • Content Experience Level Any

11:15am MST

ARM-Wrestling: Overcoming CPU Migration Challenges to Reduce Costs - Laurent Bernaille & Eric Mountain, Datadog
Wednesday November 13, 2024 11:15am - 11:50am MST
When you have a significant cloud footprint, you always look for performance improvements and cost reductions. So when ARM instances became commonly available on one of our providers, seemingly providing great performance at a lower cost, we had to take a closer look! In this talk, we will first describe the steps we took to make our clusters ARM-ready and a few interesting issues we encountered during our initial tests: from performance regressions due to compiler behaviors to subtle memory corruption bugs. We will then discuss new challenges, in particular how to achieve load-balancing and auto-scaling when running workloads on a mix of CPUs with different performance characteristics, and share our results. Although migrating real workloads to ARM proved challenging, it was worth the effort, and we now run more than 50% of our workloads on ARM.
Speakers

Laurent Bernaille

Principal Engineer, Datadog
Laurent Bernaille worked several years as a consultant specializing in cloud, containers, and automation and helped organizations migrate to the public cloud and adopt containers. He is now Principal Engineer at Datadog and works closely with infrastructure teams, which are responsible...

Eric Mountain

Staff Engineer, Datadog
Eric Mountain began working with Kubernetes in 2014, helping Amadeus migrate to container and cloud technology. Eric is now a Staff Engineer in Datadog’s Compute team, providing large-scale Kubernetes to internal users.
Wednesday November 13, 2024 11:15am - 11:50am MST
Salt Palace | Level 1 | 155 BC
  Operations + Performance
  • Content Experience Level Any

11:15am MST

AuthZEN: The “OpenID Connect” for Authorization - Omri Gazitt, Aserto
Wednesday November 13, 2024 11:15am - 11:50am MST
Today, the authorization world is fractured - each vendor supports its own APIs & protocols. But this is about to change. AuthZEN, a new OpenID Foundation working group, was created in late 2023 to establish authorization standards. OIDF is the home of OpenID Connect, the ubiquitous standard for federated login, and that’s where we’re setting our sights. In this talk, I'll describe the current state of cloud-native authorization, including the policy-as-code and policy-as-data approaches, and the various open source projects in each camp. I'll also share the progress we’ve made creating a single authorization API that works across both policy-as-code (OPA, Topaz) and policy-as-data (Zanzibar-style projects), present the API specs we've created so far, and show off the various interoperable implementations. With this foundation in place, engineering teams can be more confident in externalizing their authorization and picking a provider without being locked in to a proprietary API.
Speakers

Omri Gazitt

Co-founder & CEO, Aserto
Omri is the co-founder/CEO of Aserto, an authorization startup, and his third entrepreneurial venture. He's spent the majority of his 30-year career working on developer and infrastructure technology, most recently as the CPO of Puppet. Previously he was the VP and GM of HP's Cloud...
Wednesday November 13, 2024 11:15am - 11:50am MST
Salt Palace | Level 1 | 151
  Security
  • Content Experience Level Any

11:15am MST

GitOops... I Did It Again! Protecting Your GitOps System from Being Used for Privilege Escalation - Oreen Livni & Elad Pticha, Cycode
Wednesday November 13, 2024 11:15am - 11:50am MST
From data theft to privilege escalation in the Kubernetes cluster, you don't want to be the one telling your boss that your GitOps system has been compromised. This talk covers the security of GitOps tools, highlighting common misconfiguration pitfalls and how to avoid them. We will share the story of CVE-2024-31989, a critical vulnerability we discovered in the popular tool Argo CD. When the tool was installed with the default configuration, this vulnerability allowed privilege escalation from any access point to the cluster (such as a webshell) to complete cluster takeover. We will discuss common insecure configurations like this and provide examples from popular open-source projects to explain how your organization can protect itself from these risks. Attendees will receive a guide and practical tools to protect their GitOps systems against such threats.
Speakers

Elad Pticha

Security Researcher, Cycode
Elad is a passionate security researcher with a focus on software supply chain and web application security. He dedicates his time to writing security research tools and finding vulnerabilities across a broad spectrum, from open-source projects and web applications to IoT devices...

Oreen Livni

Security Researcher, Cycode
Oreen Livni is a passionate security researcher specializing in application and supply chain security, domain, and networking, with a focus on software supply chain vulnerabilities. Alongside his professional commitments, he immerses himself in art, gardening, and the world of surfing...
Wednesday November 13, 2024 11:15am - 11:50am MST
Salt Palace | Level 2 | 250
  Security
  • Content Experience Level Any

12:10pm MST

AI and ML: Let’s Talk About the Boring (yet Critical!) Operational Side - Rob Koch, Slalom Build & Milad Vafaeifard, Epam
Wednesday November 13, 2024 12:10pm - 12:45pm MST
As AI and ML become increasingly prevalent, it’s worth looking harder at the operational side of running these applications. They need a lot of compute and access to GPUs. They also need to be reliable while providing rock-solid separation between datasets and training processes. And they need great observability in case things go wrong, and must be simple to operate. Let's build our ML applications on top of a service mesh instead of spending resources reinventing the wheel – or, worse, the flat tire. Join us for a lively, informative, and entertaining look at how a service mesh can solve real-world issues with ML applications while making it simpler and faster to actually get things done in the world of ML. Rob Koch, Principal at Slalom Build, will demonstrate how you can use Linkerd together with multiple clusters to develop, debug, and deploy an ML application in Kubernetes (including IPv6 and GPUs), with special attention to multitenancy and scaling.
Speakers

Rob Koch

Principal, Slalom Build
A tech enthusiast who thrives on steering projects from their initial spark to successful fruition, Rob Koch is Principal at Slalom Build, AWS Hero, and Co-chair of the CNCF Deaf and Hard of Hearing Working Group. His expertise in architecting event-driven systems is firmly rooted...

Milad Vafaeifard

Lead Software Engineer, Epam
Milad Vafaeifard, a Lead Software Engineer at EPAM Systems, has 9+ years of web design and development expertise. Deaf but undeterred, he is the creative force behind Sign Language Tech and an active contributor to a YouTube channel focused on tech content for the signing tech community...
Wednesday November 13, 2024 12:10pm - 12:45pm MST
Salt Palace | Level 2 | 255 EF
  AI + ML
  • Content Experience Level Any

12:10pm MST

Beyond 'Can You Mentor Me?' - Crafting the Contribution Ladder - Nitish Kumar, Akuity; Wenjia Zhang, Google; Lucas Käldström, Upbound; Carol Valencia, Elastic; Nabarun Pal, Broadcom
Wednesday November 13, 2024 12:10pm - 12:45pm MST
Mentorship, a cornerstone of the community's success, offers a transformative path to growth and development. However, finding the right mentor and building a successful mentorship relationship can be challenging. This panel discussion brings together experienced mentors from diverse roles within the Kubernetes community, including maintainers, tech leads, and committee members. The panel members will share their insights on how to get the most out of mentorship at different stages of your Kubernetes journey, as you climb the contributor ladder. By the end of this panel, the audience will understand essential takeaways for effective mentorship at different rungs of the contributor ladder. Project maintainers can take inspiration from how the Kubernetes project maintainers use mentorship techniques such as Role Based Shadowing, Peer-to-Peer Learning, and Mentorship Cohorts, which can help any project, especially CNCF incubating projects, retain new contributors.
Speakers

Lucas Käldström

Senior Software Engineer, Upbound
Lucas is a Kubernetes and cloud native expert who has been serving the CNCF community in lead positions for 6 years. He was awarded Top CNCF Ambassador 2017 together with Sarah Novotny. Lucas was a co-lead for SIG Cluster Lifecycle, co-created kubeadm, Weave Ignite, and ported Kubernetes to...

Wenjia Zhang

Engineering Manager, Google
Wenjia Zhang is an Engineering Manager at Google, working on Google Kubernetes Engine and Google Distributed Cloud. She is an active contributor to the Kubernetes and etcd open source projects.

Nabarun Pal

Staff Engineer at VMware, Kubernetes Steering Committee and Maintainer, Broadcom
Nabarun is a Staff Software Engineer at VMware by Broadcom, a maintainer of the Kubernetes project, an elected Kubernetes Steering Committee member and a chair of Kubernetes SIG Contributor Experience. He is a Release Manager for Kubernetes and has been the Kubernetes 1.21 Release...

Nitish Kumar

Software Engineering Intern, Akuity
Nitish is a Software Engineer at Akuity and a CNCF Ambassador. In the past, Nitish has served as a Linux Foundation Mentee under the Kubernetes Release Engineering Team, where he built the OBS library that is used by the Kubernetes project to automate the process of managing release...

Carolina Valencia

Customer Architect, Elastic
Carol is a passionate software developer dedicated to implementing secure cloud-native practices. She actively contributes to CNCF projects and the Kubernetes community as an open-source contributor. She enjoys learning new technologies and creating material, some of which she shares...
Wednesday November 13, 2024 12:10pm - 12:45pm MST
Salt Palace | Level 2 | 251
  Cloud Native Novice
  • Content Experience Level Any

12:10pm MST

Breaking Free from Vulnerability Scanning Noise: Automated VEX Aggregation for Accuracy - Teppei Fukuda, Aqua Security Software Ltd.
Wednesday November 13, 2024 12:10pm - 12:45pm MST
Vulnerability scanners detect known vulnerabilities in software dependencies, but often produce inaccurate results (false positives) because they cannot automatically determine whether a vulnerability is actually exploitable. Vulnerability Exploitability eXchange (VEX) is an industry-wide initiative that aims to address this issue, but the lack of standardized distribution hinders its effective utilization. This talk introduces VEX Hub, a central repository that automatically aggregates VEX documents published by open-source projects. VEX Hub’s unique architecture makes it easy and practical for software maintainers to start adopting VEX, while at the same time making it seamless for scanners and users to incorporate VEX in their workflow. The presentation showcases a practical use case of VEX Hub with Trivy, an open-source security scanner that popularizes VEX thanks to VEX Hub and delivers more accurate and actionable scanning results to its users.
Speakers

Teppei Fukuda

Open Source Engineer, Aqua Security Software Ltd.
Teppei Fukuda is the creator of Trivy and works at Aqua Security as an Open Source Software Engineer. He has a wealth of software engineering experience working on network and security. Away from work, he is an avid manga enthusiast, dreaming of reading every comic book in the...
Wednesday November 13, 2024 12:10pm - 12:45pm MST
Salt Palace | Level 1 | 151
  Security
  • Content Experience Level Any

2:30pm MST

Architecting the Future of AI: From Cloud-Native Orchestration to Advanced LLMOps - Ion Stoica, Anyscale
Wednesday November 13, 2024 2:30pm - 3:05pm MST
With the groundbreaking release of ChatGPT, large language models (LLMs) have taken the world by storm: they have enabled new applications, exacerbated the GPU shortage, and raised new questions about the veracity of their answers. This talk delves into an AI stack, encompassing cloud-native orchestration, distributed computing, and advanced LLMOps. Key topics include:
- Kubernetes: The foundational technology that seamlessly manages AI workloads across diverse cloud environments.
- Ray: The versatile, open-source framework that streamlines the development and scaling of distributed applications.
- vLLM: The cutting-edge, high-performance, and memory-efficient inference and serving engine designed specifically for large language models.
Attendees will gain insights into the architecture and integration of these powerful tools, driving innovation and efficiency in the deployment of AI solutions.
Speakers

Ion Stoica

Co-founder, executive chairman & president, Anyscale
Ion Stoica is a Professor in the EECS Department at the University of California at Berkeley, and the Director of SkyLab. He is currently doing research on cloud computing and AI systems. Past work includes Ray, Apache Spark, Apache Mesos, Tachyon, Chord DHT, and Dynamic Packet State...
Wednesday November 13, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 2 | 255 EF
  AI + ML
  • Content Experience Level Any

2:30pm MST

Choose Your Own Adventure: The Observability Odyssey - Whitney Lee, CNCF Ambassador & Viktor Farcic, Upbound
Wednesday November 13, 2024 2:30pm - 3:05pm MST
Our hero, a running app in a secure K8s prod environment, knows they are destined for greater things! They’re serving end users, but currently, they have no idea what is going on. Are apps scaling correctly? Are automated deployments successful? What just went wrong, and how can it be fixed? Our hero is desperate to escape this fog by adding CNCF tools for logs, metrics, traces, and dashboards. It is up to you, the audience, to guide our hero and help them grow from a lost and confused app to their final form: an app that knows their faults before their users do. In their fourth KubeCon ‘Choose Your Own Adventure’-style talk, Whitney and Viktor will present choices that an anthropomorphized app must make as they add observability to their cluster, making it possible to answer meaningful questions about their system. Throughout the presentation, the audience (YOU!) will vote to decide our hero's path! Can we navigate CNCF projects and add observability before the session time elapses?
Speakers

Viktor Farcic

Developer Advocate, Upbound
Viktor Farcic is a lead rapscallion at Upbound, a member of the CNCF Ambassadors, Google Developer Experts, CDF Ambassadors, and GitHub Stars groups, and a published author. He is a host of the YouTube channel DevOps Toolkit and a co-host of DevOps Paradox.

Whitney Lee

CNCF Ambassador
Whitney is a lovable goofball and a CNCF Ambassador who enjoys understanding and using tools in the cloud native landscape. Creative and driven, Whitney recently pivoted from an art-related career to one in tech. You can catch her lightboard streaming show ⚡️ Enlightning on her...
Wednesday November 13, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 2 | 251
  Cloud Native Novice
  • Content Experience Level Any

2:30pm MST

Secure by Design CI/CD: Practical Insights from Adobe and Autodesk - Vikram Sethi, Adobe Inc. & Jesse Sanford, Autodesk
Wednesday November 13, 2024 2:30pm - 3:05pm MST
Worried that your CI/CD pipelines and developer workflows are insecure? Lost in security buzzwords like SBOMs, provenance, attestation, SLSA, OpenSSF, and more? Seeking a clear, actionable reference architecture to secure your pipeline? Whether you are just getting started on your Software Supply Chain Security journey or are ready to take it to the next level, navigating this diverse ecosystem is challenging. Join Vikram and Jesse as they present a reference architecture for secure-by-default CI/CD pipelines and show you effective security controls at every step. See firsthand how these industry giants safeguarded their pipelines while maintaining agility and innovation. This talk will showcase their work, and the work of the CNOE (Cloud Native Operational Excellence) group, which aims to build a paved path through this problem space by producing opinionated software collections or “CNOE stacks” that can be adapted to meet you where your technology is.
Speakers

Jesse Sanford

Software Architect, Autodesk
Jesse is a lifelong software engineer focused on site reliability and infosec. He is currently architecting the juncture of platform engineering and security/compliance for Autodesk's Developer Enablement team. He regularly contributes to open source and frequently speaks about his work...

Vikram Sethi

Principal Scientist, Adobe Inc.
Vikram is a Principal Scientist in the Developer Platforms organization at Adobe. Vikram has been architecting and building the Developer Experience for Adobe's Internal Developer Platform for the last few years. In the last year or so, Vikram has been working on rearchitecting Adobe's...
Wednesday November 13, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 2 | 250
  SDLC
  • Content Experience Level Any

3:25pm MST

Optimizing Load Balancing and Autoscaling for Large Language Model (LLM) Inference on Kubernetes - David Gray, Red Hat
Wednesday November 13, 2024 3:25pm - 4:00pm MST
As generative AI language models improve, they are increasingly being integrated into business-critical applications. However, large language model (LLM) inference is a compute-intensive workload that often requires expensive GPU hardware. Making efficient use of these hardware resources in the public or private cloud is critical for managing costs and power usage. This talk introduces the KServe platform for deploying LLMs on Kubernetes and provides an overview of LLM inference performance concepts. Attendees will learn techniques to improve load balancing and autoscaling for LLM inference, such as leveraging KServe, Knative, and GPU operator features. Sharing test results, we will analyze the impact of these optimizations on key performance metrics, such as latency per token and tokens per second. This talk equips participants with strategies to maximize the efficiency of LLM inference deployments on Kubernetes, ultimately reducing costs and improving resource utilization.
Speakers

David Gray

Senior Software Engineer, Red Hat
David Gray is a Senior Software Engineer on the Performance and Scale team at Red Hat. His role involves analyzing and improving AI inference workloads on Kubernetes platforms. David is actively engaged in performance experimentation and analysis of running large language models in...
Wednesday November 13, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 1 | Hall DE
  AI + ML
  • Content Experience Level Any

3:25pm MST

Cash App's Journey Into a Multi-Cluster Ecosystem - Rachel Sheikh, Cash App
Wednesday November 13, 2024 3:25pm - 4:00pm MST
Cash App's Compute team is responsible for the health and maintenance of the company's Kubernetes clusters, and for enabling service owners to deploy their services into these clusters with confidence. Over the past year, we've made strides in improving our reliability and uptime. Part of this involved introducing a paradigm for creating new Kubernetes clusters in our service ecosystem that we can seamlessly transition services into and out of, which simplifies cluster upgrades and gives us guardrails against common outages. This talk walks you through our experience introducing new Kubernetes clusters for our services at Cash App, migrating and splitting service traffic across clusters with zero downtime, and thinking through tooling adoption and creation to simplify cluster maintenance as our overhead scales.
Speakers

Rachel Sheikh

Ms., Cash App
I'm a software engineer with a decade of experience building and scaling backend services across various industries. When I'm not working on clusters or writing Go, I'm probably watching pro League of Legends or taking pictures of my dog.
Wednesday November 13, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 1 | Grand Ballroom BDF
  Platform Engineering
  • Content Experience Level Any

4:30pm MST

Museum of Weird Bugs: Our Favorites from 8 Years of Service Mesh Debugging - Tom Dean & Alen Haric, Buoyant
Wednesday November 13, 2024 4:30pm - 5:05pm MST
Over the past 8 years, we've fixed a lot of bugs in Linkerd. Many of these were straightforward, but some manifested in strange ways, or only showed up in unique situations, or otherwise surprised us. Some of them were just plain funny. In this talk, we will run through a couple of Linkerd's favorites: the most interesting, weird, and memorable bugs we've found and fixed in Linkerd. We describe how they originally manifested (usually in someone else's production system), how we went about tackling them (often by educating the reporter on how to construct a useful bug report), and the sometimes long and winding path to finally fixing them.
Speakers

Tom Dean

Field Engineer, Buoyant
Tom Dean started programming BASIC on Apple IIs over 40 years ago, and has been hooked on tech since then. A long-time user of Linux and Open Source, he has been expanding his Cloud, Cloud Native and adjacent subject matter knowledge to become a more well-rounded technologist, and...

Alen Haric

Solutions Architect, Buoyant
Wednesday November 13, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom BDF
  Cloud Native Experience
  • Content Experience Level Any

4:30pm MST

DNS Deep Dive in Kubernetes with CoreDNS - Jingming Guo, Airbnb
Wednesday November 13, 2024 4:30pm - 5:05pm MST
In the dynamic world of Kubernetes, efficient DNS resolution is critical for seamless application performance and scalability. CoreDNS, as the default DNS server for Kubernetes, offers flexible and high-performance DNS capabilities. This talk will delve into the lifecycle of a DNS request within a Kubernetes cluster using CoreDNS, offering insights into the flow of DNS traffic and enhancing your understanding of DNS requests and service discovery in Kubernetes, which is key knowledge for effective debugging and issue resolution. Additionally, we will present a case study of Airbnb's successful integration of CoreDNS, highlighting the CoreDNS performance evaluation, our seamless migration approach, and scaling strategy. Finally, we will talk about multi-cluster DNS resolution with CoreDNS. This section will demonstrate how multi-cluster DNS capabilities address common challenges, and will discuss performance considerations and multi-cluster DNS limitations.
Speakers

Jingming Guo

Software Engineer, Airbnb
Jingming Guo graduated from Northwestern University in 2017 and subsequently joined the AWS EBS team. At AWS, Jingming led the development of the Elastic Volume feature on the Block Express volume and led the EBS Server capacity increase release. In 2022, Jingming joined Airbnb and led the... Read More →
Wednesday November 13, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 2 | 251
  Cloud Native Novice
  • Content Experience Level Any

4:30pm MST

Building Resilience: Effective Backup and Disaster Recovery for Vector Databases on Kubernetes - Pavan Navarathna & Shwetha Subramanian, Veeam
Wednesday November 13, 2024 4:30pm - 5:05pm MST
As generative AI revolutionizes industries, reliance on vector databases - crucial for managing and querying high-dimensional data - has skyrocketed. These databases are often deployed on Kubernetes for its scalability and orchestration capabilities. However, ensuring robust backup and disaster recovery for these stateful applications presents unique challenges. Join Pavan and Shwetha as they discuss the critical need for an effective data protection strategy for vector databases in Kubernetes environments, emphasizing its importance in maintaining data integrity and availability. Attendees will learn about the growing significance of vector databases driven by AI applications and the specific considerations for their reliable deployment and management in cloud-native settings. Through a practical demonstration, this session will introduce Kanister, a CNCF Sandbox project, showcasing how it simplifies the complex process of backing up and recovering vector databases on Kubernetes.
Speakers

Pavan Navarathna

Engineering Manager, Veeam
Pavan joined Kasten by Veeam in March 2018, where he leads the open-source efforts and manages a team of cloud-native engineers developing innovative solutions for data protection in Kubernetes. He has previously worked in data protection and networking at NetApp and Aryaka. Pavan... Read More →

Shwetha Subramanian

Software Engineer, Kasten by Veeam, Veeam
Shwetha Subramanian is a software professional with 2+ years of experience, armed with a Master’s in Computer Science (Machine Learning track) from Columbia University, currently working as an SWE in the Kasten team at Veeam. An inherently curious individual, she is on a journey of learning... Read More →
Wednesday November 13, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage
  • Content Experience Level Any

4:30pm MST

Watching the Watchers: How We Do Continuous Reliability at Grafana Labs - Nicole van der Hoeven, Grafana Labs
Wednesday November 13, 2024 4:30pm - 5:05pm MST
Nothing is foolproof. Everything fails eventually. Observability tools help predict and lessen the impact of those failures, as the watchers of your software systems. But who watches the watchers? At Grafana Labs, we're not immune to production incidents. Just like any company, we still sometimes move too quickly. We run complex, microservices-based systems ourselves, so we have to eat our own dogfood on a daily basis. In this talk, I reveal:
- how we solved a years-long mystery that cost us $100,000+
- how we got our internal Mimir clusters to reliably hold 1.3 billion time series for metrics
- what we've had to do to scale our Loki clusters to handle 324 TB of logs a day
- what our Grafana dashboards to monitor Grafana Cloud look like
Sometimes, it's easier to learn from failures in observability than from successes. This talk is a confession of some of our worst sins as well as a realistic look under the hood at how we're improving the continuous reliability of our stack.
Speakers

Nicole van der Hoeven

Senior Developer Advocate, Grafana Labs
Nicole is a Senior Developer Advocate at Grafana Labs and a performance engineer with over a decade of experience in breaking software and learning to build it back up again. She has lived in the Philippines, the US, Australia, the Netherlands, and Portugal, helping teams all over... Read More →
Wednesday November 13, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom HJ
  Observability
  • Content Experience Level Any

5:25pm MST

The OTTL Cookbook: A Collection of Solutions to Common Problems - Tyler Helmuth, Honeycomb & Evan Bradley, Dynatrace
Wednesday November 13, 2024 5:25pm - 6:00pm MST
Is your telemetry missing key attributes? Maybe there are details in your log bodies you’d rather have as attributes. It is common to find yourself in situations where your data doesn't look how you expect: it's too large, the wrong shape, or doesn't have everything you want. The OpenTelemetry Collector uses the OpenTelemetry Transformation Language (OTTL) to solve these problems. OTTL enables telemetry transformations based on any field of the payload, utilizing functions to execute the changes. In this session, Tyler and Evan will go over a brief intro to OTTL and then cover example after example of situations where you can use OTTL to solve processing problems in the Collector, like setting attributes, or defining an entire OTLP log record from a Kubernetes event. Get ready with situations of your own, as we’ll save time at the end to try writing OTTL statements live on stage for your transformation or filtering issues so we can demonstrate how flexible OTTL truly is.
Speakers

Tyler Helmuth

Sr. Software Engineer, Honeycomb
Tyler is a Sr. Software Engineer at Honeycomb with a passion for observability and helping users start their observability journey. He is a maintainer for the OpenTelemetry Collector and OTel Helm Charts, and an active contributor to other OTel repositories. While not its originator... Read More →

Evan Bradley

Senior Software Engineer, Dynatrace
Evan helps maintain the OpenTelemetry Collector, where he is also a primary contributor to the OpenTelemetry Transformation Language (OTTL) and the OpenTelemetry Agent Management Protocol (OpAMP) Collector components. Evan has a background in developing DevOps tooling and observability... Read More →
Wednesday November 13, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom HJ
  Observability
  • Content Experience Level Any

5:25pm MST

Workload Identity Federation – Stop Using Long-Lived Credentials - Benjamin Dronen, Ford Motor Company & Kristen Newcomer, Red Hat
Wednesday November 13, 2024 5:25pm - 6:00pm MST
Workload identity federation is a somewhat daunting but extremely beneficial topic in Kubernetes security. In this session, we will share the lessons Ford Motor Company has learned through using workload identity federation with Google Cloud Platform, Microsoft Entra ID, and other platforms at scale from a wide variety of different workload types, how it has enhanced our security posture, improved developers’ lives, and reduced outages.
Speakers

Benjamin Dronen

Kubernetes Platform Engineer, Ford Motor Company
Ben Dronen started at Ford Motor Company in 2022 as part of their Ford College Graduate rotational program. He currently holds a Kubernetes Platform Engineering position and focuses on bare metal Kubernetes deployments. Ben attended Andrews University in Southwest Michigan and holds... Read More →
Wednesday November 13, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 2 | 255 BC
  Security
  • Content Experience Level Any

6:00pm MST

🪧 Poster Session: Unveiling Anomalies: eBPF-Based Detection in High-Volume Encrypted Network Traffic - Ben Smith-Foley, Rensselaer Center for Open Source
Wednesday November 13, 2024 6:00pm - 8:00pm MST
The increased use of encryption in network traffic presents a significant challenge for traditional network monitoring and security tools. As encrypting traffic becomes the norm, so does the need for advanced methods to detect malicious activities hidden within encrypted traffic. This poster will focus on how eBPF can be utilized to gain early observability into incoming packets by capturing and analyzing metadata before packets are fully processed, and how eBPF offers a unique vantage point for identifying anomalies in real-time. It will discuss methods to detect abnormal patterns, the design of the eBPF programs used, and the integration of these programs into a broader monitoring framework. The insights from this research have the potential to significantly enhance network security by providing a scalable and efficient solution for monitoring network traffic without compromising privacy. Attendees will gain an understanding of the practical applications of eBPF in network security.
Speakers

Ben Smith-Foley

University Student, Rensselaer Center for Open Source
Ben is a senior at Rensselaer Polytechnic Institute studying Computer Science with a concentration in Systems and Software. He is currently conducting undergraduate research in "Anomaly Detection in High-Volume Encrypted Network Traffic", helps lead the Rensselaer Center for Open... Read More →
Wednesday November 13, 2024 6:00pm - 8:00pm MST
Salt Palace | Level 1 | Halls A-C + 1-5 | Solutions Showcase
  🪧 Poster Sessions, Security
  • Content Experience Level Any
 
Thursday, November 14
 

11:00am MST

Shifting Gears: Leveraging CNCF Tools to Streamline Operations at Toyota Connected - Benson Phillips & Rob Heckel, Toyota Connected
Thursday November 14, 2024 11:00am - 11:35am MST
In the evolving landscape of cloud-native ecosystems, aligning teams and standardizing practices is crucial for operational excellence. At Toyota Connected, we faced significant challenges due to inconsistent practices and fragmented collaboration across departments. To address this, we adopted a suite of CNCF tools including ArgoCD, Backstage, Harbor, External Secrets Operator, and OpenCost. This session will delve into our journey of implementing these tools to unify our approach, streamline workflows, and enhance cross-team collaboration. Attendees will gain insights into the practical application of these tools, our successes and failures, and the substantial reduction in time to market achieved. By focusing on the integration of technical solutions and effective team practices, we aim to foster a cohesive and efficient cloud-native environment. This presentation provides actionable strategies for leveraging CNCF tools to drive innovation and excellence in your organization.
Speakers

Benson Phillips

Platform Architect, Toyota Connected
Software oriented, primarily working with cloud native computing. But my interests do not stop there as my love for technology is boundless.

Rob Heckel

Platform Architect, Toyota Connected North America
Rob has over 15 years in technology, specializing in open source and developer enablement. As a Platform Architect for Toyota Connected, he enhances DevOps, SDLC, and SRE practices. He has led the creation of an internal developer platform, streamlined tool integrations, and promoted... Read More →
Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 2 | 255 BC
  Cloud Native Experience
  • Content Experience Level Any

11:00am MST

Lessons Learned Adopting OpenTelemetry at Scale - Alex Arnell, Heroku / Salesforce
Thursday November 14, 2024 11:00am - 11:35am MST
OpenTelemetry makes bold promises to unlock and unleash your observability, providing you with open standards, no vendor lock-in and interoperability with just about everything. You believe that your organization could really benefit from an uplift to modern observability. It would be easy to adopt if you were starting out fresh, but let’s face it, most organizations have sprawling codebases and architectures, with decisions, infrastructure and often engineers that have been in place for decades. How do you even get started? This Heroku case study dives into our OpenTelemetry journey where you'll discover strategies on adoption, how to deal with internal resistance, and technical guidance on rolling out the change. Learn from our missteps and what we wished we had done differently. You’ll even see how a bit of luck can help drive adoption over the finish line. This session will equip you to navigate OpenTelemetry adoption in the most entrenched environments.
Speakers

Alex Arnell

Principal Engineer, Heroku / Salesforce
Alex Arnell is a Principal Engineer at Heroku / Salesforce with over two decades of software development experience. Alex has spent the last decade specializing in telemetry and observability systems. Alex is the lead engineer of the Telemetry team at Heroku, responsible for the collection... Read More →
Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 1 | Grand Ballroom HJ
  Observability
  • Content Experience Level Any

11:00am MST

Engineering a Kubernetes Operator: Lessons Learned from Versions 1 to 5 - Andrew L'Ecuyer, Crunchy Data
Thursday November 14, 2024 11:00am - 11:35am MST
Join me to uncover insights and hard-learned lessons from our journey through the first five versions of a Kubernetes Operator for Postgres. I will trace the development lifecycle from version 1, started in 2017, to version 5 today. Each version represents a milestone in addressing specific challenges, functionality, stability, and performance. We will discuss the architectural decisions, design patterns, and implementation strategies that shaped the evolution of the Operator. Key topics will include handling stateful applications, ensuring high availability, building for flexible deployment models, scalability, and managing rolling upgrades for both the Operator and underlying software. By the end of this session, participants will be equipped with practical knowledge and actionable strategies for engineering their own Kubernetes Operators, ready to accelerate their development process and avoid common pitfalls.
Speakers

Andrew L'Ecuyer

Sr. Director of Kubernetes Engineering, Crunchy Data
Andrew heads up the Kubernetes Engineering Team at Crunchy Data. With a diverse background spanning both the public and private sectors, Andrew has played a key role in designing, building and integrating complex systems of all shapes and sizes. He holds degrees in both Computer... Read More →
Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 1 | Grand Ballroom BDF
  Platform Engineering
  • Content Experience Level Any

11:00am MST

Yahoo’s Kubernetes Journey from on-Prem to Multi-Cloud at Scale - Nandhakumar Venkatachalam & Payal Patel, Yahoo
Thursday November 14, 2024 11:00am - 11:35am MST
Yahoo is an early adopter of Kubernetes, operating 37 on-prem and 42 multi-cloud production clusters hosting 2700 applications. Our team offers a simple yet powerful interface for users to deploy applications onto our managed clusters. Since 2015, we have handled multiple complex upgrades, including Operating Systems and Kubernetes, upgrading from version 1.0.3 to 1.30.0. In 2023, Yahoo announced plans to migrate to both GCP and AWS cloud platforms. Leveraging extensive knowledge, our team successfully provisioned Kubernetes clusters in a multi-cloud environment within a short period. Our team faced numerous challenges during the cloud adoption process, including networking, security, cluster autoscaling, and cost. In this talk, we will share our experience managing K8s in a multi-cloud environment and discuss the challenges faced and solutions found. Key topics include Shared VPC, IP Space for K8s, securely accessing private clusters, multi-tenant workload identity, and maintaining a user interface to K8s.
Speakers

Nandhakumar Venkatachalam

Sr Princ Production Engineer, Yahoo Inc
Nandhakumar Venkatachalam is a Senior Principal Production Engineer at Yahoo Inc. As a lead engineer responsible for operating the large-scale Kubernetes cluster, he has played a key architect role in building scalable cloud infrastructure. Nandha has been with Yahoo for over 17 years... Read More →

Payal Patel

Principal Software Development Engineer, Yahoo
Payal Patel is a Principal Software Development Engineer in the Cloud Infrastructure team at Yahoo. She is currently developing a hybrid cloud solution for Kubernetes clusters in AWS and GCP to set up the Kubernetes clusters at scale. Before that, she worked on managing the Kubernetes... Read More →
Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 2 | 251
  Platform Engineering
  • Content Experience Level Any

11:55am MST

Democratizing AI Model Training on Kubernetes with Kubeflow TrainJob and JobSet - Andrey Velichkevich, Apple & Yuki Iwai, CyberAgent, Inc.
Thursday November 14, 2024 11:55am - 12:30pm MST
Running model training on Kubernetes is challenging due to the complexity of AI/ML models, large training datasets, and various distributed strategies like data and model parallelism. It is crucial to configure failure handling, success criteria, and gang-scheduling for large-scale distributed training to ensure fault tolerance and elasticity. This talk will introduce the new Kubeflow TrainJob API, which democratizes distributed training and LLM fine-tuning on Kubernetes. The speakers will demonstrate how TrainJob integrates with Kubernetes JobSet to ensure scalable and efficient AI model training with a simplified Python experience for data scientists. Additionally, they will explain the innovative concept of reusable and extendable training runtimes within TrainJob. The speakers will highlight how these capabilities empower data scientists to rapidly iterate on their ML development, making Kubernetes more accessible and beneficial for the entire ML ecosystem.
Speakers

Andrey Velichkevich

Senior Software Engineer, Apple
Andrey Velichkevich is a Senior Software Engineer at Apple and is a key contributor to the Kubeflow open-source project. He is a member of the Kubeflow Steering Committee and a co-chair of Kubeflow AutoML and Training WG. Additionally, Andrey is an active member of the CNCF WG AI. He... Read More →

Yuki Iwai

Software Engineer, CyberAgent, Inc.
Yuki is a Software Engineer at CyberAgent, Inc. He works on the internal platform for machine-learning applications and high-performance computing. He is currently a Technical Lead for Kubeflow WG AutoML / Training. He is also a Kubernetes WG Batch active member and a Kubernetes... Read More →
Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Hall DE
  AI + ML
  • Content Experience Level Any

11:55am MST

Tick, TAG, TOC - Keeping Cloud Native Running - Karena Angell & Emily Fox, Red Hat; Rajas Kakodkar, Broadcom; Alex Chircop, Akamai; Ricardo Aravena, Truera
Thursday November 14, 2024 11:55am - 12:30pm MST
With only so many hours in the day, how does the cloud native community keep things running? Over 190 projects, thousands of contributors, and an array of groups all contribute to what we know as “cloud native” but there is more going on behind the scenes that keeps the machine of cloud native running smoothly and driving the technical direction of the landscape. In this panel discussion, you’ll hear from Chairs and Technical Leads of Technical Advisory Group (TAG) Runtime, Storage, App Delivery and the chair of the CNCF Technical Oversight Committee (TOC) on:
- How they are defining the roadmap for the future
- The glue and oil of collaboration between advisory, oversight, and projects’ health
- How you can time your engagement with these groups to have an outsized impact!
This is not a maintainer track session. While there are separate tracks for specific CNCF TAG and TOC activities, this is meant to be your backstage pass to see how the CNCF landscape gets shaped!
Speakers

Alex Chircop

Chief Product Architect at Akamai, Akamai
Chief Product Architect at Akamai. Previously a founder and CTO of Ondat (formerly StorageOS), building software defined solutions for cloud native environments. Alex is also a co-chair of the CNCF Storage TAG (previously SIG). Before embarking on the startup adventure he spent over... Read More →

Ricardo Aravena

Cloud Native Lead, Truera
Ricardo currently works at TruEra as a Cloud Infrastructure Lead helping automate everything with cloud native technologies. He's an open source enthusiast and co-chair of the CNCF TAG-Runtime. He has been working in tech for more than 20 years and comes from a diverse professional... Read More →

Karena Angell

Senior Principal Chief Architect, Red Hat
Karena Angell is a Senior Principal Chief Architect at Red Hat focusing on cloud native application workloads for Kubernetes, open source software projects, as well as solutions for the 'open' hybrid cloud.

Rajas Kakodkar

Senior Member of Technical Staff | Tech Lead TAG Runtime CNCF, Broadcom
Rajas is a senior member of technical staff at Broadcom and a tech lead of the CNCF Technical Advisory Group, Runtime. He is actively involved in the AI working group in the CNCF. He is a Kubernetes contributor and has been a maintainer of the Kube Proxy Next Gen Project. He has also... Read More →

Emily Fox

Emerging Technologies Security Lead, Red Hat
Emily Fox is a DevOps enthusiast, security unicorn, and advocate for Women in Technology. She promotes the cross-pollination of development and security practices. She has worked in security for over 14 years to drive a cultural change where security is unobstructive, natural, and... Read More →
Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 2 | 255 BC
  Cloud Native Experience
  • Content Experience Level Any

11:55am MST

Running Quantum-Safe Applications on Kubernetes - Paul Schweigert & Michael Maximilien, IBM Quantum
Thursday November 14, 2024 11:55am - 12:30pm MST
Quantum computers pose a unique threat to computer security, as the encryption standards we rely upon are vulnerable to powerful quantum computers. While those computers are still several years away, "harvest now, decrypt later" attacks put all data not protected using quantum-safe security at risk. So what can we do now to protect our applications? In this talk, Paul will demo how to deploy a quantum-safe application on Kubernetes. He'll provide a brief overview of quantum-safe cryptography and why it's needed, highlight key work being done in the open source community to migrate to quantum-safe cryptography, and conclude with a demo of how to build a quantum-safe cloud-native application. In particular, he'll show where and how to make changes to a Kubernetes environment to ensure users are protected by quantum-safe connections. At the conclusion of this session, listeners will have a set of practical steps they can take to help secure their applications in a post-quantum world.
Speakers

Michael Maximilien

Distinguished Engineer, IBM
My name is Michael Maximilien, better known as max or dr.max, and I am currently a Distinguished Engineer with IBM. I am the leader for IBM’s Open Source team contributing to all things Serverless and Platform-as-a-Service (PaaS). I have worked at various divisions of IBM. At... Read More →

Paul Schweigert

Senior Software Engineer, IBM
Paul Schweigert works on quantum and serverless technologies at IBM. He has extensive experience in open source (Knative and Kubernetes in particular) and has spoken at numerous conferences. He has also led various platform engineering and data science teams. In a previous life, he... Read More →
Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 2 | 255 EF
  Emerging + Advanced
  • Content Experience Level Any

11:55am MST

Cognitive and Self-Adaptive System for Effective Distributed-Tracing in Applications - Mitul Tandon & Akash Gusain, VMware; Susobhit Panigrahi, Broadcom
Thursday November 14, 2024 11:55am - 12:30pm MST
In response to the challenges of limited trace capture in dynamic API tracing systems, this solution leverages a machine learning and cognitive approach for unbiased trace collection. Unlike existing implementations with a skewed distribution (~5%) towards normal traces, our self-adaptive system dynamically learns to prioritise and capture diverse traces, crucial for effective diagnosis of API failures and performance issues. This innovative approach significantly enhances SREs' ability to triage complex issues, leading to a game-changing reduction in Mean Time to Resolve (MTTR). The Adaptive Sampling approach analyses existing system traces and autonomously adjusts the sampling rate, eliminating manual configuration. The outcomes of this ML-based solution include streamlined trace metric analysis, more efficient reliability work, and considerable infrastructure cost reduction through targeted trace collection, ultimately making a significant impact on operational effectiveness and reliability.
Speakers

Susobhit Panigrahi

Senior Software Engineer
As a Developer and DevOps Engineer at VMware, I specialize in developing scalable cloud software. My focus includes deploying and managing services with Kubernetes, Helm, and Istio. I'm keen to contribute to the open-source community, especially in Kubernetes and other CNCF projects... Read More →

Akash Gusain

Software Engineer, VMware
Akash Gusain is a Software Engineer at VMware with over two years of experience in building and deploying cloud-native applications. At VMware, Akash has contributed to the development of scalable and robust cloud solutions, demonstrating expertise in various technologies and fra... Read More →

Mitul Tandon

DevOps Engineer, VMware
A DevOps/SRE Engineer at VMware with 2+ years of experience working on distributed systems and containerised applications.
Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Grand Ballroom HJ
  Observability
  • Content Experience Level Any

11:55am MST

Evolving Reddit’s Infrastructure via Principled Platform Abstractions - Karan Thukral & Harvey Xia, Reddit
Thursday November 14, 2024 11:55am - 12:30pm MST
Reddit’s approach to infrastructure management has grown organically over time, adapted to solve tactical, near-term problems. We have now reached a point where the only way to scale infrastructure capabilities to a growing engineering organization is through platform abstractions offering self-service management of standardized infrastructure patterns. Beginning in 2021, a concerted effort was made to reimagine infrastructure as an internal platform that empowers both application and infrastructure engineers to build impactful and maintainable systems. We present a case study of Reddit’s ongoing journey in evolving its infrastructure management practices from inefficient, human-in-the-loop processes to efficient, self-service interfaces. By treating Kubernetes as a universal control plane and extending it with custom control processes fronted by well-designed interfaces, we are moving the organization towards this vision. This talk will cover the many trade-offs and lessons learnt.
Speakers

Harvey Xia

Staff Engineer, Compute Infrastructure @ Reddit, Reddit
I'm a software engineer with experience across a variety of disciplines including backend engineering, data engineering, and most recently, infrastructure engineering. I specialize in building cloud native infrastructure platform features.

Karan Thukral

Senior Engineer, Compute Infrastructure @ Reddit, Reddit
Karan is a Senior Software Engineer at Reddit working on the Compute team to build an easy to use internal developer platform which is scalable and reliable. He has been working in this problem space since 2017 building both internal and external developer platforms including App... Read More →
Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Grand Ballroom BDF
  Platform Engineering
  • Content Experience Level Any

2:30pm MST

What Istio Got Wrong: Learnings from the Last Seven Years of Service Mesh - Christian Posta & Louis Ryan, Solo.io
Thursday November 14, 2024 2:30pm - 3:05pm MST
Building complex systems often requires simplicity in components—a lesson the Istio project has learned throughout its seven(plus)-year journey. Although Istio offers a lot of powerful features for application networking, crucial for many organizations, the path to maturity and broader adoption was fraught with challenges. In this talk, we explore the key mistakes made during Istio's development, including its initially complex architecture, an overload of features, premature release of version 1.0, difficulties faced by contributors, and delays in joining the CNCF. We will discuss the impact of these mistakes, how these missteps were addressed, and how they have positioned Istio as a leader in the service mesh market. This presentation will detail how Istio's evolution reflects a shift towards simpler, more modular components that together offer effective solutions for managing APIs and service-to-service communication regardless of platform.
Speakers

Louis Ryan

CTO, Solo.io
Co-creator of Istio and gRPC

Christian Posta

Global Field CTO, Solo.io
Christian Posta (@christianposta) is Global Field CTO at Solo.io. He is the author of Istio in Action and many other books on cloud-native architecture. He's well known in the cloud-native community for being a speaker, blogger (https://blog.christianposta.com) and contributor to... Read More →
Thursday November 14, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 2 | 255 BC
  Cloud Native Experience
  • Content Experience Level Any

2:30pm MST

Tutorial: Live with Gateway API V1.2 - Flynn, Buoyant & Mike Morris, Microsoft
Thursday November 14, 2024 2:30pm - 4:00pm MST
Gateway API v1.2 is here! We have GA support for service mesh! We have timeouts in HTTPRoutes! We have GRPCRoutes! And we still have precious few real-world walkthroughs of using Gateway API to get real things done… In this hands-on workshop hosted by Gateway API contributors and GAMMA co-leads, we’ll start with completely unconfigured clusters, walk through installing a demo app with your choice of ingress controller and service mesh (Envoy Gateway + Linkerd, or Istio), then dig into actually using Gateway API for routing, resilience, and progressive delivery with an application using HTTP and gRPC at the same time. You’ll walk away with practical, real-world knowledge about what Gateway API can do and how to use it, and portable skills you’ll be able to apply to the many projects implementing Gateway API!
Speakers

Flynn -

Tech Evangelist, Buoyant
Flynn is a tech evangelist at Buoyant, educating developers about Linkerd, Kubernetes, and cloud-native development in general. He has spent 40 years in software engineering (from the kernel up through distributed applications, with a common thread of communications and security throughout... Read More →

Mike Morris

Senior Product Manager, Microsoft
Mike is a product manager at Microsoft working on upstream open source projects with a focus on Istio service mesh, and a Gateway API for service mesh co-lead. He is interested in building healthy, sustainable communities and scalable distributed systems, and working collaboratively... Read More →
Thursday November 14, 2024 2:30pm - 4:00pm MST
Salt Palace | Level 1 | Grand Ballroom ACE

3:25pm MST

TLS and MTLS: Introduction to Modern Security - Andrew Davis, Independent & Sandeep Kanabar, Gen (formerly NortonLifeLock)
Thursday November 14, 2024 3:25pm - 4:00pm MST
A constant presence in our lives for nearly 25 years, TLS is a cornerstone of modern security practice, especially in a zero-trust world. In cloud native, mTLS comes up every time service meshes get mentioned. Even so, both these technologies are still sources of endless questions. How do they work? How are they related? What problems do they solve, and which do they not solve? How do they relate to end-user auth? What's all this stuff with certificates anyway? And why should you care about these things? Thankfully, answering these questions isn't that complex. Sandeep Kanabar, Lead Software Engineer at Gen, and Andrew Davis, a Cybersecurity Expert (both Deaf & Hard of Hearing WG members), will discuss what TLS and mTLS are, what they do, how they work, why they matter as standards, and what nearly 25 years of attacking them have to say about security. They'll use Linkerd as an example, but this talk will apply to any situation involving mTLS or TLS, no matter the implementation.
Speakers

Sandeep Kanabar

Lead Software Engineer, Gen (formerly NortonLifeLock)
Hailing from India, Sandeep is a passionate software engineer working at Gen (formerly NortonLifeLock). A frequent meetup speaker, Sandeep enjoys sharing his lessons learned from 15+ years in the tech space with the community. He's a staunch advocate for diversity and inclusion and...

Andrew Davis

Cybersecurity Specialist, Independent
A passionate self-taught cybersecurity expert, Andrew Davis is a big believer in life-long learning. He has worked for various Fortune 500 companies, including DELL and Fidelity Investments. Deaf himself, Andrew is a strong advocate for accessibility. He's an active member of the...
Thursday November 14, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 2 | 251
  Cloud Native Novice
  • Content Experience Level Any

3:25pm MST

You're Overpaying for CI - Kyle Penfound, Dagger
Thursday November 14, 2024 3:25pm - 4:00pm MST
In recent years, the computational power of developer workstations has surged dramatically. With so much compute available at every developer's fingertips, why do we continue to waste time and money with lengthy build times on sluggish CI compute? Some forward-thinking organizations are re-evaluating this approach, questioning the necessity of paying for CI compute when the developers' workstations, which are already more powerful and paid for, remain underutilized. In this technical session we will transition a fully functioning production CI system from cloud-based compute to local workstation compute. We will explore the intricacies of replicating the functionality of a modern CI system, leveraging the power of developer workstations, all using open source software.
Speakers

Kyle Penfound

Solutions Engineer, Dagger
Kyle is part of the ecosystem team at dagger.io working on the future of CI/CD. He has a background in DevOps and just loves giving demos!
Thursday November 14, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 2 | 250
  SDLC
  • Content Experience Level Any

3:25pm MST

It's Dangerous to Build It Alone, Take This. - Jeremy Rickard & Ashna Mehrotra, Microsoft
Thursday November 14, 2024 3:25pm - 4:00pm MST
You've got high and critical CVEs in open source software packages that are critical to your platform or business. Time is almost up to patch them, and the upstream project hasn't fixed things. If you don't patch, your accreditation might be at risk. You're going to have to do it yourself! But where do you start? Fork the projects? Can you just patch in place? In this session, you'll learn about tools and strategies that can help you respond to CVEs in your container images faster, starting with patching existing images in place with Copacetic and moving on to patching and building projects from scratch. We'll look at challenges to building and testing upstream projects using existing tools and learn from emerging practices in industry. We'll also talk about how to inform your teams to stop using bad images! After this session, you'll have best practices and tools at your disposal, and you'll understand some of the pitfalls of owning your entire open source software supply chain.
Speakers

Ashna Mehrotra

Software Engineer, Microsoft
Ashna Mehrotra is a software engineer on the Upstream Security team, working on cloud-native open source security projects at Microsoft.

Jeremy Rickard

Principal Software Engineer, Microsoft
Jeremy Rickard is a principal software engineer at Microsoft where he works on the Azure Container Upstream team. He is currently a co-chair for SIG Release and serves on both the CNCF and the Kubernetes Code of Conduct Committees. He was also the Kubernetes 1.20 Release Lead.
Thursday November 14, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 1 | 151
  Security
  • Content Experience Level Any

4:30pm MST

Which GPU Sharing Strategy Is Right for You? A Comprehensive Benchmark Study Using DRA - Kevin Klues & Yuan Chen, NVIDIA
Thursday November 14, 2024 4:30pm - 5:05pm MST
Dynamic Resource Allocation (DRA) is one of the most anticipated features to ever make its way into Kubernetes. It promises to revolutionize the way hardware devices are consumed and shared between workloads. In particular, DRA unlocks the ability to manage heterogeneous GPUs in a unified and configurable manner without the need for awkward solutions shoehorned on top of the existing device plugin API. In this talk, we use DRA to benchmark various GPU sharing strategies including Multi-Instance GPUs, Multi-Process Service (MPS), and CUDA Time-Slicing. As part of this, we provide guidance on the class of applications that can benefit from each strategy as well as how to combine different strategies in order to achieve optimal performance. The talk concludes with a discussion of potential challenges, future enhancements, and a live demo showcasing the use of each GPU sharing strategy with real-world applications.
Speakers

Kevin Klues

Distinguished Engineer, NVIDIA
Kevin Klues is a distinguished engineer on the NVIDIA Cloud Native team. Kevin has been involved in the design and implementation of a number of Kubernetes technologies, including the Topology Manager, the Kubernetes stack for Multi-Instance GPUs, and Dynamic Resource Allocation (DRA...

Yuan Chen

Principal Software Engineer, NVIDIA
Yuan Chen is a Principal Software Engineer at NVIDIA, working on building NVIDIA GPU Cloud for AI. He served as a Staff Software Engineer at Apple from 2019 to 2024, where he contributed to the development of Apple's Kubernetes infrastructure. Yuan has been an active code contributor...
Thursday November 14, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Hall DE
  AI + ML
  • Content Experience Level Any

4:30pm MST

The Maintainer Monologues - Sarah Christoff, Defense Unicorns; Karen Chu, Fermyon; Jason Hall, Chainguard; Scott Rigby, Independent; Ryan Nowak, Microsoft
Thursday November 14, 2024 4:30pm - 5:05pm MST
Are maintainers born? Or made? Made. They’re definitely made. Oftentimes it’s a combination of trial and error, luck, and lots of hard work. With a mixed group of first time and experienced maintainers, join us for a panel covering the origin stories and learnings of CNCF sandbox/incubating/graduated project maintainers. They’ll share their journeys as their projects evolved, and cover topics such as:
- Project milestones (inception, MVP, & donation)
- Learning the ecosystem
- Blind spots
- Navigating social dynamics (community building, getting more help, navigating challenges)
- Work life balance / open source burnout
With this knowledge, you’ll be better equipped to become the next open source contributor, maintainer, or creator of projects, ready to navigate the ecosystem.
Speakers

Karen Chu

OSS Community PM
Karen Chu is an OSS Community PM. Having participated in the cloud native community since 2015, she is a CNCF Ambassador, Helm community manager/maintainer, emeritus Kubernetes Code of Conduct Committee member, meet-up organizer, and conference organizer. She has also worked on The...

Sarah Christoff

Software Engineer, Defense Unicorns
Sarah is a software engineer at Defense Unicorns who loves making complex code more digestible. She is the self-proclaimed founder of the Leslie Lamport fan club. When she's not bugbusting, she is running her animal rescue and competing in triathlons. She believes code should be like...

Scott Rigby

Senior Cloud Solutions Architect, NASA / Navteca
Scott is an artist, engineer & dad, collaborating on a different kind of world. Into collective art, activism, therapy & open source nerdy stuff. Scott is a Cloud Native Ambassador, speaker, organizer of CNCF community events including the New York Kubernetes Meetup, and international...

Jason Hall

Principal Software Engineer, Chainguard
Jason is a hopeless container image tooling nerd, living in Brooklyn with his wife, two children and (most importantly) lots of pizza.

Ryan Nowak

Incubations Architect, Microsoft
Ryan is an architect working on open-source projects from the Azure CTO's office. He's passionate about designing software for humans, incubating risky ideas, releasing them in open-source so everyone can benefit. At Microsoft, he's had a 15+ year career building developer-centric...
Thursday November 14, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 2 | 255 BC
  Cloud Native Experience
  • Content Experience Level Any

4:30pm MST

Elevating Kubeflow Spark Operator's Future: Best Practices and Enhancements - Vara Bonthu, AWS & Chaoran Yu, Apple Inc
Thursday November 14, 2024 4:30pm - 5:05pm MST
As Kubernetes becomes the leading platform for data processing, mastering the deployment and management of Apache Spark on it is crucial. In this presentation, you'll hear from the new maintainers of the Kubeflow Spark Operator project, who will provide an overview of scaling the Spark Operator on Kubernetes, emphasizing best practices to optimize performance and efficiency. Attendees will explore the migration of the Spark Operator repository from Google to Kubeflow, gaining insights into the roadmap and key takeaways. The session will cover strategies for achieving multi-tenancy, managing multiple Spark Operator instances for large-scale deployments, ensuring robust security, and performing seamless upgrades. Participants will learn advanced techniques to maximize their Spark on Kubernetes deployments, making their data processing pipelines more efficient, reliable, and secure. This talk is for Data, ML, DevOps, and MLOps pros to enhance their Spark on Kubernetes skills.
Speakers

Chaoran Yu

Software Engineer, Apple Inc
Chaoran Yu is a software engineer at Apple. He leads a team that builds and operates a large-scale batch analytics data platform to meet the demanding requirements of data scientists and engineers. His passion lies in delivering the best value to stakeholders through best-of-breed...

Vara

Principal OSS Specialist, AWS
Vara Bonthu is a dedicated technology professional and Worldwide Tech Leader for Data on EKS, specializing in assisting AWS customers ranging from strategic accounts to diverse organizations. He is passionate about open-source technologies, Data Analytics, AI/ML, and Kubernetes, and...
Thursday November 14, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage
  • Content Experience Level Any

4:30pm MST

Mastering OpenTelemetry Collector Configuration - Steve Flanders, Cisco
Thursday November 14, 2024 4:30pm - 5:05pm MST
Configuring the OpenTelemetry Collector can be a daunting task for both novices and seasoned professionals alike. Yet, mastering this crucial aspect is essential for unlocking the full potential of your observability stack. In this session, you will embark on a journey to gain the knowledge and skills needed to conquer common OpenTelemetry Collector configuration challenges. This session will draw from real-world experiences and best practices and provide live demonstrations to navigate the intricacies of OpenTelemetry Collector configuration. Whether you are a novice looking to get started or a seasoned veteran seeking to level up your skills, this session promises to empower you with the knowledge and confidence needed to properly and efficiently configure the OpenTelemetry Collector.
Speakers

Steve Flanders

Senior Director of Engineering, Cisco
Steve Flanders is a Senior Director of Engineering at Splunk (acquired by Cisco) responsible for the Observability Platform team, which includes contributions to the OpenTelemetry project. He was previously the Head of Product at Omnition (acquired by Splunk). Prior to Omnition, he...
Thursday November 14, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom HJ
  Observability
  • Content Experience Level Any

4:30pm MST

Tutorial: No Mess Rollouts with Gateway API: Leveraging Gateway API and Argo Rollouts for Progressive Delivery - Nina Polshakova & Lawrence Gadban, Solo.io
Thursday November 14, 2024 4:30pm - 6:00pm MST
Modern application delivery has many pitfalls: version transitions, traffic management, quality assurance, performance monitoring, and rollbacks. If you encounter an upgrade issue, what can you do? Mirror traffic? Debug locally? Roll back? Argo Rollouts lets teams gradually and safely deploy new versions of applications. A standard Gateway API enables any provider to support Argo Rollouts without provider-specific code. Argo Rollouts monitors Prometheus metrics to verify performance and reverts if success criteria aren’t met. This hands-on lab guides you on integrating Argo Rollouts with applications using different Gateway API implementations. Using Argo and Gateway API resources (HTTPRoute), you’ll learn to adjust traffic weights and gradually direct more traffic to a new version. We will also explore challenges in route delegation and role-based access control within Gateway API and potential extensions to address gaps in traffic shaping, access control, and debugging rollouts.
Speakers

Lawrence Gadban

Software Engineer, Solo.io
Lawrence is a Field Engineer at Solo.io where he works with organizations of all sizes to architect, adopt, and operationalize components such as Envoy proxy, API gateways, and service mesh. Most recently, he has been working directly with several organizations at various stages of...

Nina Polshakova

Software Engineer, Solo.io
Nina is a software engineer working on multi-cluster Istio solutions on the Gloo Platform team at Solo.io. She is a CNCF Ambassador and has also been on several Kubernetes release teams. She led the Enhancements team for the 1.29 release and is the current lead for the Release Notes...
Thursday November 14, 2024 4:30pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom ACE
  Tutorials, Operations + Performance
  • Content Experience Level Any

5:25pm MST

Managing and Distributing AI Models Using OCI Standards and Harbor - Steven Zou & Steven Ren, VMware by Broadcom
Thursday November 14, 2024 5:25pm - 6:00pm MST
Just as container images are vital to cloud-native technology, AI models are crucial to AI technology. Effectively, conveniently, and safely managing, maintaining, and distributing AI models is critical for supporting workflows like AI model training, inference, and application deployment. This presentation explores AI model management based on OCI standards and the Harbor project. It covers standardizing AI model structures and characteristics using OCI specifications and extension mechanisms like OCI Reference to link datasets and dependencies. When large models require efficient loading or raise privacy considerations, replicating or proxying upstream repositories like Hugging Face becomes essential. Enhancing model distribution security through signing, vulnerability scanning, and policy-based governance is often necessary. Additionally, introducing acceleration mechanisms such as P2P can significantly improve the efficiency of large model loading.
Speakers

Steven Ren

Senior Manager, Broadcom

Steven Zou

Staff II Engineer, VMware by Broadcom
Steven Zou is a senior engineer with years of experience in cloud computing and cloud-native technology. He is currently working as a Staff II engineer at VMware, focusing on cloud-native and Kubernetes-related platform services. In addition, he is a core maintainer of the CNCF open-source...
Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | Hall DE
  AI + ML
  • Content Experience Level Any

5:25pm MST

Navigating Failures in Pods with Devices: Challenges and Solutions - Sergey Kanzhelev, Google & Mrunal Patel, Red Hat
Thursday November 14, 2024 5:25pm - 6:00pm MST
Pods are no longer running with just CPU and Memory. We provision GPUs and network cards, and request special placement of those devices and allocated memory. And the more efficient or effective you want your setup to be, the more complicated those device requirements are, and the more likely you are to hit an edge case Kubernetes has not accounted for yet. Come to the talk to learn from Node Maintainers about some of those shortcomings in Kubernetes. If you are only starting with AI/ML and devices, you will be interested to learn what to expect. If you have lots of experience, you may still learn new things. With the increased focus on AI/ML workloads, highlighting those scenarios is important. As Kubernetes plans to fix those problems, you can give feedback on what would work best for you.
Speakers

Sergey Kanzhelev

Staff Software Engineer, Google
Sergey Kanzhelev is a seasoned open source and cloud native maintainer working actively on Kubernetes. Sergey is serving as co-chair of SIG Node. He is also one of the founders of OpenTelemetry. He is working on the engineering aspect of software and its practical application. He is contributing...

Mrunal Patel

Distinguished Engineer, Red Hat
Mrunal Patel is a Senior Principal Software Engineer at Red Hat working on containers for Openshift. He is a maintainer of runc/libcontainer and the OCI runtime specification. He started the CRI-O runtime. He is a SIG-Node chair and tech lead.
Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 2 | 250
  AI + ML
  • Content Experience Level Any

5:25pm MST

Engaging the KServe Community, The Impact of Integrating a Solution with Standardized CNCF Projects - Adam Tetelman, NVIDIA; Taneem Ibrahim, Red Hat; Johnu George, Nutanix; Tessa Pham, Bloomberg; Andreea Munteanu, Canonical
Thursday November 14, 2024 5:25pm - 6:00pm MST
Building a new solution and contemplating whether or not the OSS path is right for you? Wondering where to get started with a large cloud initiative and where the pitfalls may lie? Curious to know all the benefits waiting if your organization embraces a rich CNCF ecosystem? In this talk we will discuss the trade-offs between building a product on a full OSS platform vs. a DIY approach. We will delve into the issues of working with internal stakeholders or partners to embrace an OSS community and will cover the benefits and scaling factors that come when embracing open standards. We will use the recent integration of NVIDIA NIM into KServe as a case study and talk through the trials and tribulations that paid off in a win-win-win situation for our solutions, the OSS projects, and our users. We will cover Kubeflow, Knative, Istio, KServe, and wg-serve as well as a network of companies building enterprise K8s platforms and enterprise AI applications on top of these foundations.
Speakers

Andreea Munteanu

AI Product Manager, Canonical
I lead AI at Canonical, the publisher of Ubuntu and a provider of open source security, support and services. With a background in data science across industries like retail and telecommunications, I help enterprises make data-driven decisions with AI. I am passionate about amplifying...

Tessa Pham

Senior Software Engineer, Bloomberg
Tessa Pham is a Senior Software Engineer in Bloomberg's Cloud Native Compute Services organization. She works on building an inference platform for Bloomberg’s Data Science Platform, used by engineers and data scientists for training, deploying and serving ML models. Tessa is a...

Johnu George

Staff Engineer, Nutanix
Johnu George is a staff engineer at Nutanix with a background in distributed systems and large-scale hybrid data pipelines. He is active in open source and has steered several industry collaborations on projects like Kubeflow, Apache Mnemonic and Knative. His research interests...

Adam Tetelman

Principal Product Architect, NVIDIA
Adam Tetelman is a principal architect at NVIDIA leading cloud native initiatives and CNCF engagements across the company; building inference platforms for NVIDIA AI Enterprise and DGX Cloud. He has degrees in computational robotics, computer & systems engineering, and cognitive science...

Taneem Ibrahim

Senior Engineering Manager, Red Hat
Taneem is an engineering leader at Red Hat where his organization is responsible for building and delivering Model Serving, Responsible AI, and Model Registry solutions in OpenShift AI.
Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom GI
  Cloud Native Experience
  • Content Experience Level Any

5:25pm MST

Pick My Project! Lessons Learned from Interviewing 20+ End Users for Cloud Native Case Studies - Shedrack Akintayo & Bill Mulligan, Isovalent at Cisco
Thursday November 14, 2024 5:25pm - 6:00pm MST
Cloud native projects can promise the moon in their READMEs, but have you ever wondered what actually causes end users to adopt a project? Shedrack and Bill have interviewed over 20 companies in industries ranging from media to financial services about why they picked a project for their cloud native platform. In this talk, they will reveal what end users truly want when adopting cloud native technologies and what the forcing function was for each of them. You’ll hear firsthand accounts of the triumphs and tribulations faced by companies like Bloomberg, DigitalOcean, The New York Times, and more as well as the specific benefits these organizations are reaping, from enhanced security and observability to improved performance and cost savings. Additionally, they’ll teach other projects their process for creating impactful case studies. By the end, the audience will understand the real-world applications and advantages of cloud native technologies and why end users pick a project.
Speakers

Shedrack Akintayo

Technical Marketing Engineer, Isovalent at Cisco
Shedrack Akintayo is a software engineer and technical writer based in London with six years of experience spanning Web Engineering, DevOps, Technical Writing, and Developer Relations. Shedrack works as a Technical Marketing Engineer at Cisco, via the Isovalent acquisition. He actively...

Bill Mulligan

Community Pollinator, Isovalent at Cisco
Bill Mulligan is a cloud native pollinator and community builder. He has given talks, written articles, and appeared on podcasts on a wide range of topics around cloud native. While at CNCF he restarted the Kubernetes Community Day program. He is currently at Isovalent growing the...
Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 2 | 255 BC
  Cloud Native Experience
  • Content Experience Level Any

5:25pm MST

Why Serverless Is Trending Again - Matt Butcher, Fermyon
Thursday November 14, 2024 5:25pm - 6:00pm MST
The idea of serverless computing really took off in 2016. But after an apparent peak in 2019, it seemed to be on the decline. Yet things took an about-face again in 2022. The idea of serverless functions not only regained lost ground, but even now it is hitting new levels of interest. Why? In this session, we first get very clear about what “serverless” means as a design pattern. Then we dive into what it is good for, and mention a few of the major successes of serverless computing. From there, we look into the present and future of serverless technology, particularly inside of Kubernetes. WebAssembly is the runtime technology that enables serverless in Kubernetes to outperform AWS Lambda and other competitors.
Speakers

Matt Butcher

CEO, Fermyon
Matt Butcher (CEO) is a founder of Fermyon. He is one of the original creators of Helm, Brigade, CNAB, OAM, Glide, and Krustlet. He has written or co-written many books, including "Learning Helm" and "Go in Practice." He is a co-creator of the "Illustrated Children’s Guide to Kubernetes...
Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 2 | 251
  Cloud Native Novice
  • Content Experience Level Any

5:25pm MST

One Gateway API to Rule Them All (and in the Cluster Configure Them) - Flynn, Buoyant
Thursday November 14, 2024 5:25pm - 6:00pm MST
Ingress, egress, east-west, north-south… Kubernetes has always had a lot of different ways to talk about network traffic, each with its own concerns. For years, the possibility of unifying these kinds of configuration under a single API was a tantalizing but far-off possibility until Gateway API v0.8 took the first step of combining ingress and mesh configuration. Now Gateway API is taking the next step: bringing egress to the party. Join us for a look into how Linkerd is using these new egress capabilities to meet real user needs! We’ll start with a quick overview of what egress policy covers and what people need from it, how Gateway API makes egress work within its existing model, continue to cover how Linkerd implements it, and finish up with a live demo showing off a real-world example of egress management through the Gateway API. Welcome to the grand unified world!
Speakers

Flynn

Tech Evangelist, Buoyant
Flynn is a tech evangelist at Buoyant, educating developers about Linkerd, Kubernetes, and cloud-native development in general. He has spent 40 years in software engineering (from the kernel up through distributed applications, with a common thread of communications and security throughout...
Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | 155 EF
  Connectivity
  • Content Experience Level Any

5:25pm MST

Now You See Me: Tame MTTR with Real-Time Anomaly Detection - Kruthika Prasanna Simha & Raj Bhensadadia, Apple Inc.
Thursday November 14, 2024 5:25pm - 6:00pm MST
Picture this! You are running an application on a Kubernetes cluster & you notice that your nodes have been restarting and your users are noticing that your application is unreachable. As an engineer, you want to identify these failures in real-time & differentiate these from known states, at scale. But we know, static thresholds fail for dynamic metrics! This session explores real-time anomaly detection for cloud-native systems. We'll show you how to reduce MTTR and mean time to analyse by proactively identifying abnormal application behavior using statistical & machine learning algorithms on time series data from Prometheus. Learn to pinpoint issues, identify missing instrumentation, and visualize anomalies using Grafana. This session equips you to achieve faster issue resolution and maintain optimal application health. We'll demo practical techniques for metrics selection, anomaly detection and proactive issue identification to manage your cloud-native applications.
Speakers

Raj

Machine Learning Engineer, Apple Inc.
Raj Bhensadadia is a machine learning engineer with a passion for leveraging ML technologies to enhance monitoring and analysis of large-scale systems and ensure robustness and performance of infrastructure and services.

Kruthika Prasanna Simha

Software Engineer, Apple Inc.
Kruthika is a software engineer at Apple specializing in building ML-enabled observability solutions. She holds a Master's in Computer Engineering and has specialized in Machine Learning. In her free time, she likes to dabble with Jupyter Notebooks for running experiments with data...
Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom HJ
  Observability
  • Content Experience Level Any

5:25pm MST

How Google Built Its New Cloud on Top of Kubernetes - Saad Ali, Jie Yu & Prashanth Venugopal, Google
Thursday November 14, 2024 5:25pm - 6:00pm MST
“Build a new air-gapped cloud with open source technologies” – this is what a small team at Google was tasked with in late 2021. The team delivered a private cloud platform, complete with managed VMs, databases, AI services, and more. Moreover, it did so by leveraging a number of CNCF technologies, including Kubernetes, Istio, etc. We’ll share the potential of these technologies, as well as their limitations, by explaining how they were used to build a scalable, reliable, and secure cloud platform. We’ll discuss how to implement cloud tenancy concepts, enforce isolation among tenants, and how we built a cloud API leveraging k8s API machinery and service mesh. A key innovation in building the private cloud platform was the “Kubernetes Defined Networking” (KDN) stack we created: by leveraging existing k8s networking features (e.g. load balancer, etc.) along with a few key enhancements, we implemented most of the traditional cloud SDN concepts, like VPC, firewall, VM support, etc.
Speakers

Saad Ali

Senior Engineering Manager, Google
Saad Ali is a Senior Engineering Manager at Google. He works on Google Distributed Cloud and the open-source Kubernetes project. He led the development of the Kubernetes storage and volume subsystem. He serves as a lead of the Kubernetes Storage SIG, has served as a member of the CNCF...

Prashanth Venugopal

Kubernetes Networking Lead, Google
Prashanth has an almost two-decade-long career across various networking market segments. In his current role as the lead architect of Google's Kubernetes networking stack, he helps drive the networking stack's evolution for Google Kubernetes Engine (for the Public Cloud Market...

Jie Yu

Principal Software Engineer, Google
Jie Yu is currently a Principal Software Engineer at Google. Jie is working on Google Distributed Cloud, and is the leading architect for the product. Prior to Google, Jie was a Chief Architect at Mesosphere (D2IQ), and worked at Twitter. Jie joined Kubernetes community...
Thursday November 14, 2024 5:25pm - 6:00pm MST
Salt Palace | Level 1 | Grand Ballroom BDF
  Platform Engineering
  • Content Experience Level Any
 
Friday, November 15
 

11:00am MST

Open Source 2.0: The Maintainers' Perspective - William Morgan, Buoyant; Ashley Davis, Venafi; Deepthi Sigireddi, PlanetScale
Friday November 15, 2024 11:00am - 11:35am MST
Open source rules the world, and for a good reason: The code is generally better and more secure, bugs are fixed faster, and more. Virtually all modern applications run on it. But the landscape has changed since the early Linux days. Nights-and-weekends, volunteer-led projects are increasingly rare. Especially in the CNCF landscape, open source is maintained almost exclusively by companies that pursue a strategic goal, and they need a business justification for paying their engineers. So, who writes the code has changed, but the community's expectation that it should be free hasn't. While open source will remain free, the companies behind it must find ways to monetize it, whether through support, enterprise editions, or licensing models. Recent changes in projects like Terraform, Flux, and Linkerd highlight the need for a paradigm shift. Join this panel to hear from project maintainers why that is and the future they envision.
Speakers
William Morgan

Linkerd Director, Buoyant CEO, Buoyant
William is a director on the Linkerd project and the co-founder and CEO of Buoyant, the creators of Linkerd. Prior to Buoyant, he was an infrastructure engineer at Twitter, a software engineer at Powerset, Microsoft, and Adap.tv, and a research scientist at MITRE. He holds an MS in computer... Read More →
Ashley Davis

Staff Software Engineer, Venafi
As a teenager, Ash taught himself to program after wondering how exactly video games were made. That led to adventures trawling through open source codebases, sparking an interest in computers spanning from bare-metal machine code right up to scalable distributed platforms like Kubernetes... Read More →
Deepthi Sigireddi

Software Engineer, PlanetScale
Deepthi is the technical lead for Vitess, a CNCF graduated open source project. She also leads the Vitess engineering team at PlanetScale, which offers a database service built on Vitess. She brings over 20 years of experience building scalable systems to this role. She enjoys speaking... Read More →
Friday November 15, 2024 11:00am - 11:35am MST
Salt Palace | Level 2 | 255 EF
  Cloud Native Experience
  • Content Experience Level Any

11:00am MST

The State of Kubernetes Optimization and the Role of AI - James Wilson, nOps; Haoran Qiu, Microsoft; Katie Gamanji, Apple; Jasmine James, Square; Josh Cypher, Sonos
Friday November 15, 2024 11:00am - 11:35am MST
This session features a diverse panel of experts sharing the latest in Kubernetes optimization. The session will encourage attendees to challenge conventional wisdom and explore innovative approaches to optimization. Participants will leave with actionable knowledge and new perspectives they can apply to their own environments. Topics include:
- Valuable insights into the current state of AI in optimization, highlighting both its potential and barriers to adoption
- How and when AI can be used for real-time decision-making
- Exploring the intersection of sustainability and optimization, emphasizing the importance of visibility in driving sustainable practices
- The state of multidimensional pod autoscaling and its potential to resolve conflicts between horizontal and vertical autoscaling
- How new computing options and tools like Karpenter have the potential to disrupt the bin packing problem
- How cloud-native projects can leverage new tools to track efficiencies
Speakers
Katie Gamanji

Sr Field Engineer, Apple
Katie is a cloud native leader and practitioner, currently in a Senior Field Engineer role at Apple and a member of the CNCF TOC. As a platform engineer, Katie contributed to Conde Nast and American Express platforms and at CNCF led the End User Community. Katie is the author of the Cloud Native... Read More →
Haoran Qiu

Research SDE, Microsoft
Haoran Qiu is a Research Software Development Engineer at Microsoft Azure Systems Research. His research interests are in cloud efficiency, ML systems, and applying ML for cloud systems design and operation. Haoran was a recipient of ML and Systems Rising Star by MLCommons in 2023... Read More →
Jasmine James

Head of Development Infrastructure, Square
Jasmine James is an engineering leader at Square, heading Development Infrastructure for the Devices Platform and overseeing the CI Infrastructure, Developer Experience, and Test Rack teams, which aim to streamline development and foster continuous feedback. She is passionate about diversity... Read More →
James Wilson

VP of Engineering, nOps
James has over two decades of experience in tech, with a strong focus on leading engineering teams that build cloud-based solutions. His expertise includes container orchestration, high-speed data transport, and cloud-native architectures. Currently, he leads the engineering team... Read More →
Josh Cypher

Senior DevOps Engineer, Sonos
Josh, a Senior DevOps Engineer at Sonos, has a diverse background in quality assurance and automation. Throughout his career, he has held roles such as tester, backend developer, automation engineer, engineering manager, and head of quality before specializing in DevOps and Kubernetes... Read More →
Friday November 15, 2024 11:00am - 11:35am MST
Salt Palace | Level 1 | 155 BC
  Operations + Performance
  • Content Experience Level Any

11:00am MST

Platform Engineering in Financial Institutions: The Practitioner Panel - Paula Kennedy, Syntasso; Chris Plank, NatWest Bank; Suhail Patel, Monzo; Jinhong Brejnholt, Saxo Bank; Rachael Wonnacott, Fidelity International
Friday November 15, 2024 11:00am - 11:35am MST
In the world of small and large financial institutions, platform engineering is a driver for shipping quickly, safely, and efficiently. This panel brings together seasoned practitioners from leading banks and financial institutions to share their firsthand platform experiences, successes, and challenges.
- Discover how platform engineering can enhance developer experience, facilitate rapid innovation and drive efficiencies.
- Delve into the complexities of navigating regulatory compliance, specifically when using open source technologies such as Kubernetes.
- Learn from the experts' successes, setbacks and strategies (across technology and people), gaining actionable insights for successful implementation.
Join us as we discuss the journey of adopting and deploying CNCF technologies at scale within the highly regulated financial sector. We’ll explore practical examples of both successes and incidents where things have gone wrong, providing the audience with valuable takeaways.
Speakers
Paula Kennedy

Chief Operating Officer, Syntasso
Paula is Co-Founder & Chief Operating Officer of Syntasso; previous roles include Senior Director at VMware Tanzu, Pivotal and Co-Founder & Chief Operating Officer of CloudCredo. With 20+ years of experience in IT, Paula champions community, diversity and inclusion and has a range of... Read More →
Suhail Patel

Senior Staff Engineer, Monzo
Suhail is a Staff Engineer at Monzo focused on building the Core Platform. His role involves building and maintaining Monzo's infrastructure which spans over two thousand microservices and leverages key infrastructure components like Kubernetes, Cassandra, Etcd and more. He focuses... Read More →
Jinhong Brejnholt

Chief Cloud Architect, Saxo Bank
Jinhong is an accomplished cloud and platform architect, deeply committed to advancing DevSecOps practices and cloud-native technologies. She holds an MSc in Software Development and Technology and is certified as a Kubernetes application developer, administrator, and security specialist... Read More →
Chris Plank

Enterprise Architect & Joint Product Owner, NatWest Bank
Chris Plank is an Enterprise Architect working for NatWest Bank in Edinburgh, Scotland. He has been leading a Platform as a Product initiative within the Bank over the last year, looking to radically change the Bank's approach to provisioning and maintaining services. Outside of work... Read More →
Rachael Wonnacott

Technical Product Owner, Kubernetes Platform, Fidelity International
Rachael has spent the last decade focused on platform engineering. She places a conscious emphasis on improving flow and is on a quest to smooth the application lifecycle for developers in the enterprise. With a background in astrophysics, Rachael brings her scientific approach... Read More →
Friday November 15, 2024 11:00am - 11:35am MST
Salt Palace | Level 1 | Grand Ballroom BDF
  Platform Engineering
  • Content Experience Level Any

11:55am MST

Building Massive-Scale Generative AI Services with Kubernetes and Open Source - John McBride, OpenSauced
Friday November 15, 2024 11:55am - 12:30pm MST
At OpenSauced, we power over 40,000 generative AI inferences every day, all through our in-house platform on top of Kubernetes. The cost of doing this kind of at-scale AI inference with a third-party provider API would be astronomical. Thankfully, using Kubernetes, the public cloud, and open-source technologies, we've been able to scale with relatively low costs and a lean stack. In this talk, John will walk through the journey of building a production-grade generative AI system using open source technologies, open large language models, and Kubernetes. We'll also explore why we chose to build on top of Kubernetes for our AI workloads over using a third-party provider, and how we're running and managing our AI/ML clusters today. Additionally, we'll dive into the techniques we used to groom our Retrieval-Augmented-Generation pipelines for efficiency on top of Kubernetes and other practical tips for deploying your own AI services at scale.
Speakers
John McBride

Sr. Software Engineer, OpenSauced
John is a Sr. Software Engineer at OpenSauced where he also serves as Head of Infrastructure and AI engineer. He is the maintainer of spf13/cobra, the Go CLI bootstrapping library used throughout the CNCF landscape. In the past, he has worked on open source Kubernetes platforms... Read More →
Friday November 15, 2024 11:55am - 12:30pm MST
Salt Palace | Level 2 | 250
  AI + ML
  • Content Experience Level Any

11:55am MST

Accessibility at KubeCon: Deaf Voices in Cloud Native - Rob Koch, Slalom Build; Jay Jackson, CallRevu; Destiny O'Connor, Women Blessing Women; Anastasiia Gubska, BT Group; Travis Johnson, Convo Communications
Friday November 15, 2024 11:55am - 12:30pm MST
Never met a deaf person at a conference? That is not surprising. While there are lots of deaf engineers, until recently, most conferences — and virtually any other community activity — haven't been accessible to deaf community members. But for KubeCon, that all changed exactly a year ago! During this discussion, deaf panelists from various countries will shed light on their unique experiences being deaf in tech and the impact that making KubeCon accessible has had on their lives and hopes for the future. Attendees will learn why the technology space is a great fit for deaf individuals, the benefits and opportunities deaf professionals bring to the table, and what it takes to be an accessible and welcoming community. Panelists will also debunk common misconceptions and empower *you* to take steps toward a more inclusive cloud native ecosystem.
Speakers
Anastasiia Gubska

SRE/DevOps Engineer, BT Group
Anastasiia Gubska, a Deaf SRE/DevOps Engineer at BT Group, develops and implements best practices for software delivery at the UK-based multinational telecommunications company. Passionate about discovering new communities and embracing diverse cultures, Anastasiia is an active member... Read More →
Travis Johnson

Level 3 Engineer, Convo Communications
A Linux aficionado, Travis Johnson is a deaf Level 3 Engineer with 10+ years of experience in the VoIP industry, where he has gained deep knowledge of networking and scripting. A firm believer in lifetime learning, Travis continuously acquires new skills and certifications. Off work... Read More →
Rob Koch

Principal, Slalom Build
A tech enthusiast who thrives on steering projects from their initial spark to successful fruition, Rob Koch is Principal at Slalom Build, AWS Hero, and Co-chair of the CNCF Deaf and Hard of Hearing Working Group. His expertise in architecting event-driven systems is firmly rooted... Read More →
Destiny O'Connor

Co-Chair CNCF Deaf and Hard of Hearing WG, Web Developer, Women Blessing Women
As Co-Chair of the CNCF Deaf and Hard of Hearing Working Group, I channel my passion for creating a more inclusive tech world for deaf and hard-of-hearing individuals. My mission is to educate the tech community about the unique challenges and experiences of being deaf in this... Read More →
Jay Jackson

Senior Software Engineer, CallRevu
Jay Jackson, a Senior Software Engineer at CallRevu, brings over two decades of experience in the tech industry. Jay has navigated this tech journey as a deaf individual, with American Sign Language (ASL) as his primary mode of communication, and is passionate about exploring ways to... Read More →
Friday November 15, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Grand Ballroom HJ
  Cloud Native Experience
  • Content Experience Level Any

11:55am MST

Kubernetes Upgrades: Less Pain, More Gain (and Maybe a Little Swearing) - Jago Macleod, Google
Friday November 15, 2024 11:55am - 12:30pm MST
Kubernetes upgrades are a major pain point for many users, often due to the complexity of managing multiple, independently versioned components. This talk will delve into the strategies and best practices for minimizing disruption and maximizing success during Kubernetes upgrades. We'll explore:
- Common pitfalls and challenges faced during upgrades
- Practical tips for smoother, more reliable upgrade processes
- The risks of relying solely on Long Term Support (LTS) versions
- Improving upgrade reliability for all Kubernetes users, regardless of their chosen platform
Led by the head of both OSS Kubernetes and GKE Release and Upgrades at Google, this talk will provide valuable insights and actionable advice for anyone looking to create a sustainable and successful upgrade strategy. Whether you're a seasoned Kubernetes veteran or just getting started, this session will equip you with the knowledge and tools to navigate the complex landscape of Kubernetes upgrades.
Speakers
Jago Macleod

Engineering Director, Google
Jago Macleod is an Engineering Director at Google, where he leads much of the Kubernetes and Google Kubernetes Engine (GKE) team, which gives him the opportunity to work with some of Google Cloud’s largest customers. Prior to working at Google, Jago helped make the smart homes that... Read More →
Friday November 15, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Grand Ballroom BDF
  Platform Engineering
  • Content Experience Level Any

2:00pm MST

From Vectors to Pods: Integrating AI with Cloud Native - Rajas Kakodkar, Broadcom; Kevin Klues, NVIDIA; Joseph Sandoval, Adobe; Ricardo Rocha, CERN; Cathy Zhang, Intel
Friday November 15, 2024 2:00pm - 2:35pm MST
The rise of AI is challenging long-standing assumptions about running cloud native workloads. AI demands hardware accelerators, vast data, efficient scheduling and exceptional scalability. Although Kubernetes remains the de facto choice, feedback from end users and collaboration with researchers and academia are essential to drive innovation, address gaps and integrate AI in cloud native. This panel features end users, AI infra researchers and leads of the CNCF AI and Kubernetes device management working groups focussed on:
- Expanding beyond LLMs to explore AI for cloud native workload management, memory usage and debugging
- Challenges with scheduling and scaling of AI workloads from the end user perspective
- OSS projects and innovation in AI and cloud native in the CNCF landscape
- Improving resource utilisation and performance of AI workloads
The next decade of Kubernetes will be shaped by AI. We don’t yet know what this will look like; come join us to discover it together.
Speakers
Ricardo Rocha

Lead Platforms Infrastructure, CERN
Ricardo leads the Platform Infrastructure team at CERN with a strong focus on cloud native deployments and machine learning. For several years he has led the internal effort to transition services and workloads to use cloud native technologies, as well as dissemination and training... Read More →
Kevin Klues

Distinguished Engineer, NVIDIA
Kevin Klues is a distinguished engineer on the NVIDIA Cloud Native team. Kevin has been involved in the design and implementation of a number of Kubernetes technologies, including the Topology Manager, the Kubernetes stack for Multi-Instance GPUs, and Dynamic Resource Allocation (DRA... Read More →
Joseph Sandoval

Principal Product Manager, Adobe Inc.
Joseph Sandoval is a seasoned tech expert with 25 years in various roles running distributed systems and infrastructure platforms, and he thrives on empowering developers to scale their applications. An advocate for open source software, he harnesses its transformative power to champion change... Read More →
Cathy Zhang

Senior Principal Engineer, Intel
As a member of the CNCF TOC, Cathy has been sponsoring and guiding projects' applications for graduation/incubating, and reviewing/approving new sandbox projects. She has been a committee member for several KubeCons. Cathy is currently a Senior Principal Engineer at Intel, leading... Read More →
Rajas Kakodkar

Senior Member of Technical Staff | Tech Lead TAG Runtime CNCF, Broadcom
Rajas is a senior member of technical staff at Broadcom and a tech lead of the CNCF Technical Advisory Group, Runtime. He is actively involved in the AI working group in the CNCF. He is a Kubernetes contributor and has been a maintainer of the Kube Proxy Next Gen Project. He has also... Read More →
Friday November 15, 2024 2:00pm - 2:35pm MST
Salt Palace | Level 1 | Hall DE
  AI + ML
  • Content Experience Level Any

2:00pm MST

Can You Put a Price Tag on Open Source? - Mario Fahlandt, Kubermatic & Bob Killen, CNCF
Friday November 15, 2024 2:00pm - 2:35pm MST
Earlier this year, Harvard Business School released a paper titled “The Value of Open Source Software,” estimating the worldwide value of OSS at $8.8 trillion and finding that, on average, it would cost companies at least 3.5x more to develop similar projects internally. Yet, many organizations and engineers struggle to understand or realize this kind of value from contributing to these projects. In this talk, Bob and Mario will discuss the many benefits individuals and companies can achieve by contributing to open source and guide you through the first steps to becoming a contributor. They will also cover how to develop a lightweight open source strategy and convince your organization that an open source first approach can yield great returns.
Speakers
Mario Fahlandt

Service Delivery Architect, Kubermatic
Mario is working as a Customer Delivery Architect at Kubermatic with a focus on planning and building concepts and architecture for infrastructure in the cloud native world. He started the GDG Munich for Cloud and became a GDE in 2019. In the Kubernetes project he is involved in SIG-ContribEx... Read More →
Bob Killen

Senior Technical Program Manager, CNCF
Bob is a Program Manager at the Google Open Source Programs Office with a focus on Cloud Native computing. He serves the Kubernetes project as a Steering Committee member and chair of the Contributor Experience SIG. Bob comes from an academic background, spending 15 years at the University... Read More →
Friday November 15, 2024 2:00pm - 2:35pm MST
Salt Palace | Level 1 | Grand Ballroom HJ
  Cloud Native Experience
  • Content Experience Level Any

2:55pm MST

The Key Value of Etcd Over Custom Resources: Scalability - Jef Spaleta, Isovalent at Cisco
Friday November 15, 2024 2:55pm - 3:30pm MST
Cilium defaults to using Kubernetes Custom Resources to hold Cilium-specific internal state. However, when the cluster is large enough, the Kubernetes API becomes a performance bottleneck. To scale a cluster to hundreds of nodes, Cilium can be configured to use a dedicated external etcd instance. This talk will discuss the details of what the external etcd looks like from an operator perspective, and explore why Cilium uses an external etcd for enhanced scalability. It will cover how to manage a cluster by bypassing the Kubernetes API and interacting only with the cluster's etcd key-value store, and also why it might be a bad idea. Get a taste of what's possible by bypassing the Kubernetes API and interacting with the etcd API directly, and learn why Cilium has an option to use a dedicated etcd deployment, not shared by the Kubernetes API, for holding Cilium state and the scalability benefits it can bring to your cluster.
Speakers
Jef Spaleta

Technical Community Advocate, Isovalent at Cisco
Jef Spaleta has more than a decade of experience in the technology industry as a software engineer, open source contributor, and IoT hardware developer, in operations, and most recently as a community advocate at Isovalent.
Friday November 15, 2024 2:55pm - 3:30pm MST
Salt Palace | Level 1 | 155 BC
  Operations + Performance
  • Content Experience Level Any

2:55pm MST

Modernization of Intuit Payroll Enterprise Using Event Driven Architecture - Hema Maarimuthu & Vigith Maurice, Intuit
Friday November 15, 2024 2:55pm - 3:30pm MST
Intuit's QuickBooks Online Payroll Enterprise, a critical application serving over 2 million customers, processes over a million transactions and $34 billion in payroll taxes. We're modernizing with a heavy investment in event-driven architecture for effective handling of financial data. This major transition extends beyond just the payroll platform; it involves decomposing complex systems across Intuit products using event-driven architecture, and a focus on availability, scalability, and security is crucial. To address challenges like autoscaling for high throughput, low latency, better operational excellence, and development productivity, we have built our modernized platform on Numaflow, an open-source, Kubernetes-native, language-agnostic platform. In our presentation, we will share our journey of modernizing our stack using event-driven serverless architecture on Numaflow and highlight the advantages it has brought to our developers and technology infrastructure.
Speakers
Vigith Maurice

Principal Engineer, Intuit
Vigith is a co-creator of Numaproj and Principal Software Engineer for the Intuit Core Platform team in Mountain View, California. One of Vigith's current day-to-day focus areas is the various challenges in building scalable data and AIOps solutions for both batch and high-throughput... Read More →
Hema Maarimuthu

Principal Engineer, Intuit
Hema is a Principal Software Engineer for Intuit's Online Payroll Infrastructure team in Mountain View, California. Hema’s current work involves leading cross-functional teams, strategizing, and driving operational excellence initiatives. Her major accomplishments include successfully... Read More →
Friday November 15, 2024 2:55pm - 3:30pm MST
Salt Palace | Level 1 | Grand Ballroom BDF
  Platform Engineering
  • Content Experience Level Any

4:00pm MST

Gamifying Cloud Native: How to Design and Build an Educational Game for Your Project - Calum Murray, University of Toronto, Faculty of Applied Science and Engineering & Zainab Husain, OCAD University
Friday November 15, 2024 4:00pm - 4:35pm MST
Have you ever struggled to explain what a cloud native project does? One of the challenges many cloud native projects face is that the abstractions they provide are not intuitive for new users. Since cloud technologies are often built on top of each other and use domain-specific language, this problem compounds. Luckily, educational games can be made to help communicate these abstract concepts in a fun and engaging format! In this talk, we will explore how you can build an educational game for your project through the example of a game that the Knative community has built to teach Knative Eventing. We will walk through the steps other open source projects can follow to design their own educational game, including brainstorming strategies for deciding on key concepts and which metaphors/symbols to use to represent these concepts. These information design strategies can also be applied to create more understandable educational cloud native content in general!
Speakers
Zainab Husain

Knative UX Design Lead, OCAD University
Zainab Husain is a UX Design Researcher working at OCAD University. She completed her Master's in Engineering at the University of Toronto, focusing on Human-Computer Interaction. Zainab is passionate about tools that improve collaboration between Engineers and Designers and is also... Read More →
Calum Murray

Engineering Science Student, University of Toronto, Faculty of Applied Science and Engineering
I'm a software engineer, and I love building cool things in open source. I like to seek out the most interesting and challenging problems which I think will have a large impact, and build creative solutions to them. I also like to share my passion for open source with others, and... Read More →
Friday November 15, 2024 4:00pm - 4:35pm MST
Salt Palace | Level 1 | Grand Ballroom HJ
  Cloud Native Experience
  • Content Experience Level Any

4:00pm MST

Privacy in the Age of Big Compute - Sal Kimmich, Confidential Computing Consortium, Linux Foundation
Friday November 15, 2024 4:00pm - 4:35pm MST
In the age of big compute, the definition of privacy has transformed as re-identification from anonymized datasets has become easier. This session explores the challenges and solutions in navigating privacy concerns in high-dimensional data environments. Attendees will learn about the risks of re-identification, the importance of unicity in data sets, and how Privacy Enhancing Technologies (PETs) and Confidential Computing can mitigate these risks. Discover how these advancements can help protect sensitive data, ensure compliance, and foster a more secure data ecosystem in cloud-native environments.
Speakers
Sal Kimmich

Technical Community Architect, Confidential Computing Consortium, Linux Foundation
Sal is an advocate for open source, passionate about helping engineers, ethical hackers, and digital enthusiasts navigate modern software development. With over a decade of experience building cloud-native machine learning pipelines in healthcare and tech for good sectors, Sal now... Read More →
Friday November 15, 2024 4:00pm - 4:35pm MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage
  • Content Experience Level Any

4:55pm MST

Goodbye Etcd! Running Kubernetes on Distributed PostgreSQL - Denis Magda, Yugabyte
Friday November 15, 2024 4:55pm - 5:30pm MST
Kubernetes once favored Etcd as a database for all cluster data. Back then, relational databases lacked the availability and scalability characteristics required by Kubernetes. However, as Etcd encountered challenges with various Kubernetes workloads, relational databases continued to evolve. This session is a practical guide for deploying fault-tolerant and scalable Kubernetes clusters on distributed PostgreSQL. We’ll begin with Kine, which integrates into the Kubernetes architecture, enabling relational databases for cluster metadata management. Then, we’ll use Kine to deploy Kubernetes on a single-server PostgreSQL instance. After that, we’ll migrate to a multi-node PostgreSQL instance, allowing Kubernetes to tolerate zone and region outages and scale to thousands of nodes on demand.
Speakers
Denis Magda

Head of DevRel, Yugabyte
Denis started his software engineering career at Sun Microsystems and Oracle, where he built JVM/JDK and led one of the Java development groups. After learning Java from the inside, he joined the world of distributed systems and databases, where he has remained ever since. His experience... Read More →
Friday November 15, 2024 4:55pm - 5:30pm MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage
  • Content Experience Level Any
 
