Name: Which GPU Sharing Strategy Is Right for You? A Comprehensive Benchmark Study Using DRA - Kevin Klues & Yuan Chen, NVIDIA
Start: 2024-11-14T16:30:00-0700
End: 2024-11-14T17:05:00-0700

Thursday November 14, 2024 4:30pm - 5:05pm MST

Salt Palace | Level 2 | 255 E

Dynamic Resource Allocation (DRA) is one of the most anticipated features to ever make its way into Kubernetes. It promises to revolutionize the way hardware devices are consumed and shared between workloads. In particular, DRA unlocks the ability to manage heterogeneous GPUs in a unified and configurable manner without the need for awkward solutions shoehorned on top of the existing device plugin API. In this talk, we use DRA to benchmark various GPU sharing strategies, including Multi-Instance GPU (MIG), Multi-Process Service (MPS), and CUDA time-slicing. As part of this, we provide guidance on the class of applications that can benefit from each strategy, as well as how to combine different strategies to achieve optimal performance. The talk concludes with a discussion of potential challenges, future enhancements, and a live demo showcasing the use of each GPU sharing strategy with real-world applications.

Speakers

Kevin Klues

Distinguished Engineer, NVIDIA

Kevin Klues is a distinguished engineer on the NVIDIA Cloud Native team. Kevin has been involved in the design and implementation of a number of Kubernetes technologies, including the Topology Manager, the Kubernetes stack for Multi-Instance GPUs, and Dynamic Resource Allocation (DRA...

Yuan Chen

Principal Software Engineer, NVIDIA

Yuan Chen is a Principal Software Engineer at NVIDIA, working on building NVIDIA GPU Cloud for AI. He served as a Staff Software Engineer at Apple from 2019 to 2024, where he contributed to the development of Apple's Kubernetes infrastructure. Yuan has been an active code contributor...

KCNA24 DRA Benchmarking pdf


AI + ML

Content Experience Level Any

KubeCon + CloudNativeCon North America 2024
