This is the third part of a four-part series. I suggest you read parts 1 and 2 first for a better understanding:

Part 1

Part 2

In this article, we will continue the exploration of production machine learning systems on GCP with a special focus on the design of high-performance ML systems.

Content

  • Efficiency at training
  • Fast input pipelines
  • Efficient inferencing

This is a high-level, descriptive series; there will be another series on implementing some of these standard concepts. If you would like to get fully hands-on before then, I suggest you start with the Advanced Machine Learning with TensorFlow on GCP course by the Google ML team.

Designing High-Performance ML Systems

A high-performance ML system can mean different things to different companies, depending on the project goal: a system that can handle a large dataset, one that does the job as fast as possible, one that can train for long periods of time, or one that achieves the best possible accuracy. All of these characteristics matter, but one key aspect is the time it takes to train a model. Assuming we wish to train a model until it attains a specific evaluation measure (e.g. accuracy), we can design a high-performance ML system from the infrastructure-performance perspective. When allocating or provisioning infrastructure for machine learning tasks, we should consider key factors such as time-to-train, budget, and inference (prediction) time.
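To make "time-to-train" concrete, here is a minimal, hypothetical sketch (not from the course material) of a harness that trains until a target metric is reached and reports wall-clock time-to-train. The `train_step` and `evaluate` callables are toy stand-ins for a real training loop and evaluation pass:

```python
import time

def train_to_target(train_step, evaluate, target_metric, max_steps=10_000):
    """Run training steps until the evaluation metric reaches a target,
    returning the final metric, the steps taken, and wall-clock time-to-train."""
    start = time.perf_counter()
    metric = evaluate()
    steps = 0
    while metric < target_metric and steps < max_steps:
        train_step()
        metric = evaluate()
        steps += 1
    elapsed = time.perf_counter() - start
    return metric, steps, elapsed

# Toy stand-ins: a "model" whose accuracy improves a little on each step.
state = {"accuracy": 0.50}

def train_step():
    state["accuracy"] = min(1.0, state["accuracy"] + 0.01)

def evaluate():
    return state["accuracy"]

metric, steps, elapsed = train_to_target(train_step, evaluate, target_metric=0.90)
print(f"Reached accuracy {metric:.2f} in {steps} steps ({elapsed:.4f}s)")
```

In a real system you would compare `elapsed` across infrastructure choices (machine types, accelerators, number of workers) against your budget, which is exactly the time-to-train vs. cost trade-off described above.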


Building Production Machine Learning Systems on Google Cloud Platform