A lot of people were interested in the architecture behind Cartoonizer. So Tejas and I (Niraj) have tried explaining the process we employed to make it work. Kudos to Algorithmia for powering our video inference pipeline. 😇

AI cartoonized image! Original photo by Duy Pham on Unsplash

In today’s fast-paced world, ML experts are expected to wear multiple hats across the ML workflow. One of the critical tasks in that workflow is serving models in production! This important piece of the pipeline tends to get overlooked, and when it is, the models never deliver value to customers.

The engineering discipline clearly can’t exist without the work of the (data) scientists — MLE (Machine Learning Engineering) is built on the work of data science — but the engineering is how the science gets applied to the world.

- Caleb Kaiser

This article explains our attempt to serve a computationally intensive GAN model in production not only inexpensively, but also in a way that scales horizontally.

ML Woes 😅

If you are familiar with hosting a REST API, you know it warrants these basic things:

  1. A fast prototype in Flask (see the sketch after this list)
  2. Setting up an environment
  • GCP or AWS instance
  • System dependencies as well as Python-specific dependencies (pip)
  • Proxy server
  • Multiple workers to scale horizontally
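To make step 1 concrete, here is a minimal sketch of what such a Flask prototype might look like. The `/cartoonize` route and the `cartoonize` function are hypothetical stand-ins for the actual model call, not our production code:

```python
import io

from flask import Flask, request, send_file
from PIL import Image

app = Flask(__name__)


def cartoonize(image):
    """Hypothetical stand-in for the GAN inference call."""
    return image


@app.route("/cartoonize", methods=["POST"])
def handle_cartoonize():
    # Read the uploaded image from the multipart form field "image"
    image = Image.open(request.files["image"].stream)
    result = cartoonize(image)

    # Stream the cartoonized image back as a JPEG response
    buf = io.BytesIO()
    result.convert("RGB").save(buf, format="JPEG")
    buf.seek(0)
    return send_file(buf, mimetype="image/jpeg")


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

Getting this far is the easy part; everything in step 2 is what makes it painful.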

For an ML engineer, the 2nd point is tedious, and it is less than satisfactory in terms of scalability and server costs. Gone are the days when the responsibility of maintaining servers rested on your shoulders! I am talking about outsourcing and automating the 2nd point completely. Enter Google Cloud Run!
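For illustration, the only Cloud Run-specific change a Flask prototype really needs is to listen on the port the platform injects through the `PORT` environment variable. A minimal sketch (the route here is just an assumed placeholder):

```python
import os

from flask import Flask

app = Flask(__name__)


@app.route("/")
def index():
    # Placeholder route; the real app would expose the model endpoint here
    return "Cartoonizer is up!"


if __name__ == "__main__":
    # Cloud Run injects the port to listen on via the PORT env var;
    # binding to 0.0.0.0 makes the server reachable from outside the container
    app.run(host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
```

Once the app honors this container contract, Cloud Run takes care of the rest: it spins container instances up and down with traffic, which is exactly the horizontal scaling and server maintenance we wanted to outsource.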
