As AI systems become ubiquitous across industries, the need to monitor them grows. Far more than traditional software, AI systems are highly sensitive to changes in their input data. Consequently, a new class of monitoring solutions has emerged at the data and functional level (rather than the infrastructure or application level). These solutions aim to detect the issues unique to AI systems, such as concept drift and bias.
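To make "data-level monitoring" concrete, here is a minimal, dependency-free sketch of the Population Stability Index (PSI), one common metric for detecting input drift between a training baseline and live traffic. The function name, bin count, and the 0.25 alert threshold are illustrative choices, not any vendor's implementation.

```python
import math

def population_stability_index(expected, actual, bins=10):
    """Compare two samples of a numeric feature by binning the baseline
    ('expected') range and measuring how the live ('actual') sample's
    mass shifts across those bins."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            idx = sum(v > e for e in edges)  # index of the bin v falls into
            counts[idx] += 1
        # Small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-4) for c in counts]

    e_frac, a_frac = bin_fractions(expected), bin_fractions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_frac, a_frac))

# Identical distributions give PSI of 0; a shifted distribution scores high.
baseline = [i / 100 for i in range(100)]
shifted = [0.5 + i / 200 for i in range(100)]
print(round(population_stability_index(baseline, baseline), 4))  # 0.0
print(population_stability_index(baseline, shifted) > 0.25)      # True: drift
```

A common rule of thumb treats PSI below 0.1 as stable and above 0.25 as significant drift worth alerting on; production systems typically compute this per feature on a schedule.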

The AI vendor landscape is now crowded with companies touting monitoring capabilities, ranging from best-of-breed standalone solutions to integrated AI lifecycle management suites. The latter treat monitoring as a secondary focus and offer more basic capabilities.

Adding to the hype, some of the major cloud providers have begun communicating that they, too, offer monitoring features for machine learning models deployed on their platforms. AWS and Azure, the largest and second-largest providers by market share, each announced dedicated features under their respective ML platforms: SageMaker Model Monitor (AWS) and Dataset Monitors (Azure). Google Cloud (GCP), so far, appears to offer only application-level monitoring for model serving and training jobs.

In this post we provide a general overview of the current offerings from the cloud providers (focusing on AWS and Azure) and discuss the gaps in these solutions (gaps that best-of-breed solutions generally cover well).

#monitoring #mlops #machine-learning #cloud-services #artificial-intelligence

Should You Use the ML Monitoring Solution Offered by Your Cloud Provider?