Loading…
Attending this event?
In-person
November 12-15
Learn More and Register to Attend

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Mountain Standard Time (UTC -7). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis. 
Friday November 15, 2024 11:55am - 12:30pm MST
At OpenSauced, we power over 40,000 generative AI inferences every day, all through our in-house platform ontop of Kubernetes. The cost of doing this kind of at-scale AI inference with a third party provider API would be astronomic. Thankfully, using Kubernetes, the public cloud, and open-source technologies, we've been able to scale with relatively low costs and a lean stack. In this talk, John will walk through the journey of building a production grade generative AI system using open source technologies, open large language models, and Kubernetes. We'll also explore why we chose to build ontop of Kubernetes for our AI workloads over using a third party provider, and how we're running and managing our AI/ML clusters today. Additionally, we'll dive into the techniques we used to groom our Retrieval-Augmented-Generation pipelines for efficiency ontop of Kubernetes and other practical tips for deploying your own AI services at-scale.
Speakers
avatar for John McBride

John McBride

Sr. Software Engineer, OpenSauced
John is a Sr. Software Engineer at OpenSauced where he also serves as Head of Infrastructure and AI engineer. He is the maintainer of spf13/cobra, the Go CLI bootstrapping library used throughout the CNCF landscape. In the past, he has worked on open source Kuberenetes platforms... Read More →
Friday November 15, 2024 11:55am - 12:30pm MST
Salt Palace | Level 2 | 250
  AI + ML
  • Content Experience Level Any

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

Share Modal

Share this link via

Or copy link