Kubeflow is a unique workload designed for Kubernetes. The platform abstracts the underpinnings of Kubernetes by exposing a set of integrated capabilities to data scientists, developers, machine learning engineers, and operators. It is also unusual in the prerequisites it imposes for running a robust, cloud native, enterprise-ready machine learning platform.

Like any other mature application designed for Kubernetes, Kubeflow relies heavily on the storage layer to achieve high availability and deliver the expected performance.

There are many open source and commercially available storage engines for Kubernetes that can be used with Kubeflow. From Ceph with Rook to Red Hat's GlusterFS to good old NFS, customers can choose from a variety of options. But no single storage layer meets all the requirements of the Kubeflow platform and its diverse set of components, such as Notebook Servers, Pipelines, and KFServing.
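Whichever engine you choose, it is typically surfaced to Kubeflow through a Kubernetes StorageClass that dynamic provisioning can draw from. The sketch below shows what such a class might look like for an NFS-backed setup; the provisioner name, server, and export path are illustrative placeholders, and the exact `parameters` keys depend on the provisioner you actually deploy.

```yaml
# Hypothetical StorageClass for an NFS-backed provisioner.
# The provisioner name and parameters below are assumptions, not a
# specific product's configuration.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: kubeflow-storage
provisioner: example.com/nfs          # placeholder external provisioner
parameters:
  server: nfs.example.internal        # placeholder NFS server
  path: /exports/kubeflow             # placeholder export path
reclaimPolicy: Retain                 # keep data if a claim is deleted
allowVolumeExpansion: true            # lets claims grow later
```

Marking a class like this as the cluster default lets Kubeflow components request volumes without naming a class explicitly.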

When you use Kubeflow, you are expected to meet the storage requirements of both the platform itself and the ML jobs you run through Jupyter Notebooks, Pipelines, Katib, and KFServing. It's important to understand that the platform and those ML jobs have distinct storage requirements.
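On the job side, the most common request is a PersistentVolumeClaim backing a notebook's workspace. A minimal sketch, assuming a user namespace and the StorageClass name are placeholders of your own choosing:

```yaml
# Hypothetical workspace claim for a Jupyter Notebook server.
# Name, namespace, and storageClassName are illustrative assumptions.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: workspace-my-notebook
  namespace: kubeflow-user            # placeholder user namespace
spec:
  accessModes:
    - ReadWriteOnce                   # single-node access is enough for one notebook pod
  resources:
    requests:
      storage: 10Gi
  storageClassName: kubeflow-storage  # placeholder class from your cluster
```

Shared datasets consumed by multiple Pipelines steps would instead need a class whose backend supports `ReadWriteMany`, which is one reason a single engine rarely covers every component.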

Let’s take a closer look at the storage configuration of these two layers: the Kubeflow platform and the custom jobs that users run on it.


Choose the Right Storage Engine for Kubeflow and ML Workloads