KubeCon + CloudNativeCon North America 2024: Full Schedule

In-person
November 12-15

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: All times in this schedule are shown in Mountain Standard Time (UTC-7). The schedule is subject to change, and session seating is available on a first-come, first-served basis.

11:00am MST

Shopify’s Open Source Approach to Network Monitoring with eBPF, Vector and ClickHouse - Sebastian Rabenhorst & Matt Franklin, Shopify

Friday November 15, 2024 11:00am - 11:35am MST

Salt Palace | Level 1 | Grand Ballroom B

At Shopify, we’ve successfully implemented a scalable, open-source network monitoring solution for the cloud. In this talk, we will demonstrate how we built a network monitoring solution leveraging eBPF, Vector, ClickHouse, and Grafana. This solution enables us to monitor over 30 million network flow, DNS and other networking-related events per second at the container level for thousands of services across hundreds of Kubernetes clusters in the Shopify Cloud. We will also share the lessons we learned regarding these technologies and provide insights on how you can implement your own purely open-source monitoring solution capable of handling millions of events per second.

Speakers

Matt Franklin

Shopify

Sebastian Rabenhorst

Senior Production Engineer, Shopify

Sebastian is a Senior Production Engineer at Shopify mostly working on monitoring and logging solutions as part of the observability team.

Shopify’s Open Source Approach to Network Monitoring with eBPF, Vector and ClickHouse pdf

Observability

Content Experience Level: Beginner

Tutorial: OpenTelemetry Hands-on - Automatic and Manual Instrumentation for Java and Python Apps - Tobias Angerstein, Novatec Consulting GmbH & Tiffany Jernigan, Independent

Friday November 15, 2024 11:00am - 12:30pm MST

Salt Palace | Level 1 | Grand Ballroom G

In today's software landscape, and the cloud-native one in particular, observability has become a critical aspect of ensuring the performance, reliability, and security of applications. OpenTelemetry, an open-source observability framework and standard, provides a unified way to collect and export telemetry data from applications and services. This tutorial will guide participants through the process of using OpenTelemetry to instrument a simple application, collect metrics, traces, and logs, and send them to various backends for analysis. It covers the implementation and usage of OpenTelemetry in Python- and Java-based applications. The exercises include instrumenting a polyglot microservice application, comparing automatic and manual instrumentation, evaluating the collected traces, logs, and metrics, configuring a collector, and analysing the results in Jaeger and Prometheus. This tutorial is made for everyone seeking a pragmatic understanding of OpenTelemetry's immediate benefits.

Speakers

Tobias Angerstein

Senior Consultant Observability, Novatec Consulting GmbH

Tiffany Jernigan

Developer Advocate, www.tiffanyfay.dev

Tiffany is a CNCF Ambassador and a seasoned technologist and content creator in the Cloud Native space. She most recently was a senior developer advocate at VMware. She also formerly worked as a software developer and developer advocate at Amazon, Docker, and Intel. Prior to that...

OtelLabPresentation pdf

Tutorials, Observability

Content Experience Level: Beginner

11:55am MST

Introduction to Distributed ML Workloads with Ray on Kubernetes - Mofi Rahman & Abdel Sghiouar, Google

Friday November 15, 2024 11:55am - 12:30pm MST

Salt Palace | Level 2 | 255 B

The rapidly evolving landscape of Machine Learning and Large Language Models demands efficient, scalable ways to run distributed workloads to train, fine-tune and serve models. Ray is an Open Source framework that simplifies distributed machine learning, and Kubernetes streamlines deployment. In this introductory talk, we'll uncover how to combine Ray and Kubernetes for your ML projects. You will learn about:

- Basic Ray concepts (actors, tasks) and their relevance to ML
- Setting up a simple Ray cluster within Kubernetes
- Running your first distributed ML training job

Speakers

Abdel Sghiouar

Cloud Developer Advocate, Google

Abdel Sghiouar is a senior Cloud Developer Advocate at Google Cloud, a co-host of the Kubernetes Podcast by Google, and a CNCF Ambassador. His focus areas are GKE/Kubernetes, Service Mesh, and Serverless.

Mofi Rahman

Developer Relations Engineer, Google

Mofizur Rahman (@moficodes) is a Developer Advocate at Google. His favorite programming language these days is Go. He is a strong believer in the power of open source and the importance of giving back to the community. He is a self-proclaimed sticker-collecting addict and has collected...

Introduction to Distributed workload with Ray on Kubernetes pdf

Cloud Native Novice

Content Experience Level: Beginner

2:00pm MST

Seccomp and eBPF; What’s the Difference? Why Do I Need to Know? - Natalia Reka Ivanko & Duffie Cooley, Isovalent @ Cisco

Friday November 15, 2024 2:00pm - 2:35pm MST

Salt Palace | Level 1 | 151 G

Containers in Kubernetes share a common Linux kernel, so how can we limit access where it isn’t required and follow the principle of least privilege? Join Natalia and Duffie as they each explore different approaches to hardening your container security with Secure Computing (seccomp) and eBPF! The talk will begin with an overview and comparison of seccomp and eBPF and how they can both solve the same problem: limiting access to the Linux kernel that all containers share. This will be a fun talk, showing each solution with a live demo. You will leave this talk with a better understanding of how to limit what system calls a process can make and restrict your containers’ behavior to only access the files, binaries and external DNS names they need and nothing more. Which is the right solution for your environment? Come and learn about two commonly used technologies!

Speakers

Natalia Reka Ivanko

Sr. Product Manager, Isovalent, now part of Cisco

Natalia Ivanko is a Sr. Product Manager at Isovalent, now part of Cisco, leading Tetragon, an eBPF-based runtime security product. She was previously a Security Engineer with a strong background in Linux, Container and Cloud Security. Passionate about building things that...

Duffie Cooley

Field CTO, Isovalent @ Cisco

Duffie is Field CTO at Isovalent focused on helping enterprises find success with Cilium and modern security tooling. Duffie has been working with all things systems and networking for 20 years and remembers most of it. A student of perspective, Duffie is always interested in working...

Security

Content Experience Level: Beginner

2:55pm MST

Cloud-Native AI: Wasm in Portable, Secure AI/ML Workloads - Miley Fu, Second State

Friday November 15, 2024 2:55pm - 3:30pm MST

Salt Palace | Level 2 | 250 AD

In this talk, we present Wasm as a pioneering solution for running AI/ML workloads in cloud-native environments. Our focus is on demonstrating how Wasm (on the server) facilitates the execution of AI models, such as Llama3, Grok by X, and Mixtral, across diverse cloud and edge platforms without sacrificing performance. We will discuss the advantages of using Rust and WebAssembly in AI/ML workloads, highlighting aspects like portability, speed, and security. Real-world examples will illustrate the deployment of AI inference models using a Wasm runtime in Kubernetes environments, showcasing seamless orchestration and execution across varied devices. This session is aimed at cloud-native practitioners and AI/ML enthusiasts eager to explore innovative approaches in AI deployment.

Speakers

Miley Fu

DevRel, WasmEdge

Miley is a Developer Advocate with a passion for empowering developers to build and contribute to open source. With over 5 years of experience working on the WasmEdge runtime, a CNCF sandbox project, as a founding member, she has spoken at KubeCon, KCD Shenzhen, CloudDay Italy, DevRelCon, Open Source...

AI + ML

Content Experience Level: Beginner

Enabling Fault Tolerance for GPU Accelerated AI Workloads in Kubernetes - Arpit Singh & Abhijit Paithankar, NVIDIA

Friday November 15, 2024 2:55pm - 3:30pm MST

Salt Palace | Level 2 | 255 E

In K8s-based ML platforms, job failures from hardware errors such as GPU malfunctions, network disruptions, ECC errors, and OOM events pose significant challenges. These failures cause resource underutilization, wasted engineering time, and high operational costs, often requiring users to resubmit jobs. Current AI/ML frameworks lack adequate fault tolerance strategies, typically requiring manual intervention and causing delays before jobs can resume. This talk explores fault tolerance strategies including naive job restarts on failure, job restarts with hot spares, and job restarts by replacing faulty nodes. We discuss how to achieve fault propagation by leveraging node and pod conditions and address gaps in fault discovery and error propagation in the existing Kubernetes ecosystem. Our talk will also include ways to enhance components like the node-problem-detector and introduce new elements to close the gaps in fault detection, propagation, reaction, and remediation.

Speakers

Abhijit Paithankar

Tech Lead and Engineering Manager, NVIDIA

Abhijit Paithankar is the AI and HPC Systems Tech Lead and Engineering Manager at NVIDIA, focusing on advanced computing technologies. Previously, he co-founded Crave.IO and served as CTO, and held key roles at Nutanix and VMware, developing critical hypervisor and storage solutions...

Arpit Singh

Senior Software Engineer, NVIDIA

Arpit Singh specializes in AI infrastructure at Nvidia, enhancing deep learning applications. Besides being a Kubernetes contributor, Arpit has 10+ years of experience spanning Nvidia, Nutanix and Cisco. He holds multiple patents (2 granted, 4+ pending) and has dual master's degr...

Fault Tolerance AI workloads pdf

AI + ML

Content Experience Level: Beginner

Practical Supply Chain Security: Implementing SLSA Compliance from Build to Runtime - Enguerrand Allamel, Ledger

Friday November 15, 2024 2:55pm - 3:30pm MST

Salt Palace | Level 1 | 151 G

Securing the software supply chain can feel overwhelming, especially with dynamic frameworks like SLSA (Supply-chain Levels for Software Artifacts). This beginner-friendly session on software supply chain security explores practical strategies to secure your software from build to runtime.

We will utilize GitHub Actions, implement Cosign for seamless artifact signing without managing keys, and apply Kyverno for enforcing runtime policies. Additionally, you will learn how to use in-toto and Kubescape to verify and maintain artifact integrity effectively. To further bolster security, we will briefly explore integrating Hardware Security Modules (HSMs) into your workflow, providing a robust layer for key management.

By the end of this talk, you will have actionable insights and a clear understanding of how to achieve SLSA compliance within the CNCF ecosystem.

Speakers

Enguerrand Allamel

Staff Cloud Security Engineer, Ledger

Enguerrand is a Staff Cloud Security Engineer at Ledger with a background in Site Reliability Engineering. His focus areas include Software Supply Chain Security and Cloud Security.

Practical Supply Chain Security Implementing SLSA Compliance from Build to Runtime pdf

Security

Content Experience Level: Beginner

4:00pm MST

Best Practices for Deploying LLM Inference, RAG and Fine Tuning Pipelines on K8s - Meenakshi Kaushik & Shiva Krishna Merla, NVIDIA

Friday November 15, 2024 4:00pm - 4:35pm MST

Salt Palace | Level 2 | 250 AD

In this session, we'll cover best practices for deploying, scaling, and managing LLM inference pipelines on Kubernetes (K8s). We'll explore common patterns like inference, retrieval-augmented generation (RAG), and fine-tuning. Key challenges addressed include:

[1] Minimizing initial inference latency with model caching
[2] Optimizing GPU usage with efficient scheduling, multi-GPU/node handling, and auto-quantization
[3] Enhancing security and management with RBAC, monitoring, auto-scaling, and support for air-gapped clusters

We'll also demonstrate building customizable pipelines for inference, RAG, and fine-tuning, and managing them post-deployment. Solutions include [1] a lightweight standalone tool built using the operator pattern and [2] KServe, a robust open-source AI inference platform. This session will equip you to effectively manage LLM inference pipelines on K8s, improving performance, efficiency, and security.

Speakers

Meenakshi Kaushik

Product Management, NVIDIA

Meenakshi Kaushik leads product management for NIM Operator and KServe. Meenakshi is interested in the AI and ML space and is excited to see how the technology can enhance human well-being and productivity.

Shiva Krishna Merla

Senior Software Engineer, NVIDIA

Shiva Krishna Merla is a senior software engineer on the NVIDIA Cloud Native team where he works on GPU cloud infrastructure, orchestration and monitoring. He is focused on enabling GPU-accelerated DL and AI workloads in container orchestration systems such as Kubernetes and OpenShift...

best practices llm inference rag finetuning pdf

AI + ML

Content Experience Level: Beginner

Platform Engineering for Software Developers and Architects - Daniel Bryant, Syntasso

Friday November 15, 2024 4:00pm - 4:35pm MST

Salt Palace | Level 2 | 251 AD

Building on my KubeCon EU 2022 talk, "From Kubernetes to PaaS to... err, what's next", I'll introduce the topic of platform engineering through the lens of a software developer and architect. My primary goal is for developers to understand "what good looks like" with a successful platform build and help them understand how a platform can influence the SDLC (for better or worse!). Key takeaways from the session:

- Explore how platform architecture influences software architecture and vice versa
- Learn why the principles of coupling and cohesion apply to platform components (and configuration) in the same way as they do with software components
- Understand what to expect from an effective platform, including how applications are built, shipped, and run
- Learn about key platform metrics grounded in developer experience frameworks such as DORA, SPACE, and DevEx

Speakers

Daniel Bryant

Platform Engineer & Head of Product Marketing, Syntasso

Daniel Bryant is the head of product marketing at Syntasso. His technical expertise focuses on ‘DevOps’ tooling, cloud/container platforms, and microservice implementations. Daniel is a long-time coder, platform engineer, and Java Champion. He also writes for InfoQ, O’Reilly...

Cloud Native Novice

Content Experience Level: Beginner

Medical Research Computing Infrastructure on Hybrid Kubernetes - Jennings Zhang, Boston Children's Hospital

Friday November 15, 2024 4:00pm - 4:35pm MST

Salt Palace | Level 1 | Grand Ballroom H

Research computing is essential across biomedical research, especially in medical imaging and radiology where ML+AI are rapidly disrupting the field. But while the research frontier continues moving forward, the computing infrastructure of research and healthcare institutions tends to lag behind. At the Boston Children’s Hospital, we are closing the gap by developing the ChRIS Research Integration Service (ChRIS for short). ChRIS is an MIT-licensed platform for medical computation, enabling the use of research software in clinical practice while maximizing the utility of our hybrid-cloud resources. This talk will be a discussion of the cloud-native software ecosystem from the perspective of a medical researcher at a teaching hospital. We will consider the advantages of adopting cloud-native software and Kubernetes for research and healthcare institutions, as well as the challenges in doing so.

Speakers

Jennings Zhang

Research Developer, Boston Children's Hospital

Jennings is a neuroscience researcher and software developer at the Boston Children's Hospital. His work and interests are split between biological questions, e.g. human brain development, and all-things software development, especially containers and Rust.

KubeCon2024 ChRIS pdf

Platform Engineering

Content Experience Level: Beginner

Tutorial: Stop Kubernetes' Revolving Door: A Hands-on Tutorial to Secure a Kubernetes Cluster - Savitha Raghunathan & Rey Lejano, Red Hat; Mahé Tardy, Isovalent at Cisco

Friday November 15, 2024 4:00pm - 5:30pm MST

Salt Palace | Level 1 | Grand Ballroom G

Out-of-the-box, upstream Kubernetes is not secure by default. This tutorial will walk through the official/upstream Kubernetes Security Checklist to set up a cluster securely. The tutorial starts with an introduction to the critical security considerations for Kubernetes environments. Participants will then embark on a guided journey through practical exercises designed to implement security best practices within Kubernetes clusters. Attendees will gain firsthand experience in aspects such as authentication, authorization, network policies, pod security, and more, providing participants with a comprehensive understanding of Kubernetes security principles and how to implement them. This will equip them with the knowledge and skills to effectively secure their clusters. Whether you're new to Kubernetes security or seeking to enhance your expertise, this tutorial offers valuable insights and hands-on experience to strengthen your Kubernetes clusters against potential threats.

Speakers

Savitha Raghunathan

Senior Software Engineer, Red Hat

Savitha Raghunathan is a Senior Software Engineer at Red Hat, working on Container Migration and Application Modernization. She leads the K8s sig-security-docs sub-project, which aims to create security awareness through docs. As a maintainer of the Konveyor project, she leads the community...

Mahé Tardy

Software Engineer, Isovalent at Cisco

Mahé is a security engineer at Isovalent and an active contributor to Kubernetes SIG Security. He was previously working as a security researcher and loves working with Linux, security, and Kubernetes!

Rey Lejano

Solutions Architect @ Red Hat, CNCF Ambassador, K8s SIG Docs co-chair, SIG Security subproject lead, K8s v1.23 release lead, DevOps Institute Ambassador, Red Hat

Rey Lejano is a Solutions Architect at Red Hat and is the co-chair of Kubernetes SIG Docs. He contributes to Kubernetes SIG Security, Release, & Contributor Experience. He is a member of seven Kubernetes Release Teams including serving as the 1.23 Release Lead and 1.25 Emeritus Adviser...

Stop Kubernetes' Revolving Door A Hands On Tutorial to Secure a Kubernetes Cluster pdf

Tutorials, Security

Content Experience Level: Beginner

4:55pm MST

With Great Flexibility Comes Great Complexity: Inspect Your Gateway API Configuration - Mattia Lavacca, Kong & Gaurav Ghildiyal, Google

Friday November 15, 2024 4:55pm - 5:30pm MST

Salt Palace | Level 1 | 155 E

With its graduation, Gateway API has emerged as the new standard for managing L4 and L7 routing within Kubernetes. It brings a wider set of functionality and a flexibility never seen with the Ingress API, and it is implemented widely for both ingress and service mesh use cases. The trade-off of having such a powerful API is additional complexity: navigating the intricacies of Gateway API involves listing multiple resources, cross-referencing and understanding the relationships between them, and ensuring explicit authorization for all cross-namespace references - a formidable challenge. In this talk, Gaurav and Mattia will walk you through how to use gwctl, a command-line tool designed specifically for Gateway API (and part of the Gateway API project itself) that works seamlessly alongside kubectl. Together, we will easily navigate resources, wrangle policies, and track down trouble in your Gateway API configuration.

Speakers

Mattia Lavacca

Software Engineer, Kong

Software engineer at Kong, working on Kubernetes networking. I actively participate in the SIG-Network community, where I serve as a maintainer of the Gateway API. I work on key Kong projects related to networking in Kubernetes, such as the Ingress controller and the Gateway Oper...

Gaurav Ghildiyal

Software Engineer, Google

Gaurav is a Software Engineer at Google specializing in Kubernetes Networking. He is actively involved in the open-source Gateway API project, recently focusing on shepherding the development of gwctl, a command-line tool for Gateway API. Gaurav also actively contributes to other...

Inspect Your Gateway API Configuration pdf

Connectivity

Content Experience Level: Beginner