Loading…
Attending this event?
In-person
November 12-15
Learn More and Register to Attend

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon North America 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Mountain Standard Time (UTC -7). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis. 
Data Processing + Storage clear filter
arrow_back View All Dates
Thursday, November 14
 

11:00am MST

Cooperative Scheduling for Stateful Systems - Michael Youssef & Laxman Prabhu, LinkedIn
Thursday November 14, 2024 11:00am - 11:35am MST
At LinkedIn, we develop many stateful systems and run them on tens of thousands of machines in our datacenters. As we move LinkedIn’s infrastructure to Kubernetes, we quickly realized that StatefulSet was not going to be enough to support running critical stateful systems and satisfy the safety and durability goals of the teams developing stateful systems. We've built first-class support for running stateful workloads on bare metal where the stateful systems can coordinate with Kubernetes to stay available and ensure durability. With our design, we support planned/unplanned maintenance, swapping out hardware, and allow stateful systems to customize their rollout policies natively on Kubernetes. This talk covers: - Our LiStatefulSet API. - How we allow apps to customize safety checks and deployment policies via an ApplicationClusterManager, our pluggable policy engine. - The ApplicationClusterManager protocol that allows coordination of the lifecycle of workloads with Kubernetes.
Speakers
LP

Laxman Prabhu

Staff Software Engineer, Systems Infrastructure, LinkedIn
avatar for Michael Youssef

Michael Youssef

Staff Software Engineer, LinkedIn
Michael is a Staff Software Engineer at LinkedIn, currently making management and deployment of sharded systems a touch less painful on Kubernetes. In his free time he enjoys spending time with his cat, inhaling chocolate, and playing tennis.
Thursday November 14, 2024 11:00am - 11:35am MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage

11:55am MST

Database DevOps: CD for Stateful Applications - Stephen Atwell, Harness.io & Christopher Crow, Pure Storage
Thursday November 14, 2024 11:55am - 12:30pm MST
Running stateful applications on Kubernetes can provide many of the same advantages as stateless applications. In this talk, Stephen and Chris will share some thoughts on managing stateful applications as part of a CD Pipeline so that applications - and the application's data - can be versioned and deployed safely and repeatedly. This talk will discuss managing persistent data within kubernetes, as well as managing structural changes to a database as part of a CD process. With Kubernetes and liquibase, we can provide something better than before: A more testable, repeatable, and open way to deploy stateful applications. This talk features a practical demo of how CD tooling can empower users to automate data migrations within Kubernetes.
Speakers
avatar for Christopher Crow

Christopher Crow

Technical Marketing Engineer, Pure Storage
Chris Crow works as a cloud architect at Portworx. He has worked previously as an education, systems administrator. He is a lifelong open-source enthusiast.
avatar for Stephen Atwell

Stephen Atwell

Principal Product Manager, Harness.io
With over 26 years of technology experience, Stephen focuses on solving problems encountered in his previous roles. He has worn hats ranging from network administrator, to database administrator, to software engineer, to product manager. Outside of work, Stephen develops open source... Read More →
Thursday November 14, 2024 11:55am - 12:30pm MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage

2:30pm MST

Distributed Cache Empowers AI/ML Workloads on Kubernetes Cluster - Yuichiro Ueno & Toru Komatsu, Preferred Networks, Inc.
Thursday November 14, 2024 2:30pm - 3:05pm MST
Today, storage technologies play a fundamental role in the realm of AI/ML. Read performance is essential for swiftly moving datasets from storage to AI accelerators. However, the rapid enhancement of AI accelerators' performance often outpaces I/O, bottlenecks the training. Due to the scheduling of pods in Kubernetes across multiple nodes, utilizing node-local storage effectively presents a challenge. To address this, we introduce a distributed cache system built atop node-local storages, designed for AI/ML workloads. This cache system has been successfully deployed on our on-premise 1024+ GPUs Kubernetes cluster within a multi-tenancy environment. Throughout our two-year experience operating this cache system, we have overcome numerous hurdles across several components, including the I/O library, load balancers, and the storage backend. We will share the challenges and the solutions we implemented, leading to a system delivering 50+ GB/s throughput and less than 2ms latency.
Speakers
avatar for Toru Komatsu

Toru Komatsu

Engineer, Preferred Networks, Inc.
Toru is a machine learning platform engineer at Preferred Networks in Japan. He is the creator and lead developer of youki, an OCI Runtime in Rust, and a maintainer of the OCI Runtime Specification. Additionally, he serves as a reviewer for runwasi and is involved in developing a world that utilizes containers and Wasm. Additionally, he is a member of the Kubernetes org and is especially interested in... Read More →
avatar for Yuichiro Ueno

Yuichiro Ueno

Engineer, Preferred Networks, Inc.
He is currently a machine learning platform engineer at Preferred Networks in Japan. His research and engineering interests include a range of high-performance computing (distributed deep learning, networking/RDMA, and storage technologies), performance engineering, and Kubernete... Read More →
Thursday November 14, 2024 2:30pm - 3:05pm MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage

3:25pm MST

Elastic Data Streaming: Autoscaling Apache Kafka - Jakub Scholz, Red Hat
Thursday November 14, 2024 3:25pm - 4:00pm MST
Autoscaling is an important part of modern cloud-native architecture. It allows applications to handle a big load at peak times while helping to optimize costs and make deployments more green and sustainable at the same time. Apache Kafka is well known for its scalability. It can grow with your project from a small cluster up to hundreds of brokers. But it was not very elastic for a long time and using dynamic autoscaling with it was very hard. This talk will guide the attendees through the main challenges of auto-scaling Apache Kafka on Kubernetes. It will show how these challenges can be solved with the help of new features added recently in Strimzi and Apache Kafka projects such as auto-rebalancing, node pools, or tiered storage. And it will help the users get started with the auto-scaling of Apache Kafka.
Speakers
avatar for Jakub Scholz

Jakub Scholz

Senior Principal Software Engineer, Red Hat
Jakub works at Red Hat as Senior Principal Software Engineer. He has long-term experience with messaging and currently focuses mainly on Apache Kafka and its integration with Kubernetes. He is one of the maintainers of the Strimzi project which provides tooling for running Apache... Read More →
Thursday November 14, 2024 3:25pm - 4:00pm MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage

4:30pm MST

Elevating Kubeflow Spark Operator's Future: Best Practices and Enhancements - Vara Bonthu, AWS & Chaoran Yu, Apple Inc
Thursday November 14, 2024 4:30pm - 5:05pm MST
As Kubernetes becomes the leading platform for data processing, mastering the deployment and management of Apache Spark on it is crucial. In this presentation, you'll hear from the new maintainers of the Kubeflow Spark Operator project, who will provide an overview of scaling the Spark Operator on Kubernetes, emphasizing best practices to optimize performance and efficiency. Attendees will explore the migration of the Spark Operator repository from Google to Kubeflow, gaining insights into the roadmap and key takeaways. The session will cover strategies for achieving multi-tenancy, managing multiple Spark Operator instances for large-scale deployments, ensuring robust security, and performing seamless upgrades. Participants will learn advanced techniques to maximize their Spark on Kubernetes deployments, making their data processing pipelines more efficient, reliable, and secure. This talk is for Data, ML, DevOps, and MLOps pros to enhance their Spark on Kubernetes skills.
Speakers
avatar for Chaoran Yu

Chaoran Yu

Software Engineer, Apple Inc
Chaoran Yu is a software engineer at Apple. He leads a team that builds and operates a large-scale batch analytics data platform to meet the demanding requirements of data scientists and engineers. His passion lies in delivering the best value to stakeholders through best-of-breed... Read More →
avatar for Vara

Vara

Principal OSS Specialist, AWS
Vara Bonthu is a dedicated technology professional and Worldwide Tech Leader for Data on EKS, specializing in assisting AWS customers ranging from strategic accounts to diverse organizations. He is passionate about open-source technologies, Data Analytics, AI/ML, and Kubernetes, and... Read More →
Thursday November 14, 2024 4:30pm - 5:05pm MST
Salt Palace | Level 1 | Grand Ballroom GI
  Data Processing + Storage
  • Content Experience Level Any
 

Share Modal

Share this link via

Or copy link

Filter sessions
Apply filters to sessions.
Filtered by Date - 
  • 🚨 Contribfest
  • 🪧 Poster Sessions
  • AI + ML
  • Breaks
  • ⚡ Lightning Talks
  • Cloud Native Experience
  • Cloud Native Novice
  • CNCF-hosted Co-located Events
  • Connectivity
  • Data Processing + Storage
  • Emerging + Advanced
  • Experiences
  • Keynote Sessions
  • Maintainer Track
  • Observability
  • Operations + Performance
  • Platform Engineering
  • Project Opportunties
  • Registration
  • SDLC
  • Security
  • Solutions Showcase
  • Sponsor-hosted Co-located Event
  • Tutorials