Why does it matter?

Embracing Artificial Intelligence in a business is a journey that requires investment and persistence. Though the advantages of using AI may be plain for everyone to see, taking a model from training to production is an enormous challenge. The journey saddles application architects with multiple obstacles. They need to consider:

1. How to integrate the AI model (possibly built on a different tech stack) with the existing business application

2. How to operationalize it with seamless deployments while managing frequent upgrades to the AI model, and the versioning that comes with them

3. How to ensure high performance and scalability

Considering that all eyes are on the AI model to keep learning and getting better over time (self-learning), this is not an easy task for the designers by any means.

Zooming into the problems

Most teams start with experiments in the research labs, working to solve a business problem with a certain dataset. They typically use **Jupyter notebooks** running on Python. Once the proof-of-concept (POC) is ready and signed off, it has to be taken to production. This is where the real challenge starts and the overall picture begins to unfold.
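The hand-off from notebook to production usually begins with serializing the trained model as a deployable artifact. A minimal sketch, assuming scikit-learn and joblib; the iris dataset, model choice, and file name are illustrative, not part of the original workflow:

```python
# Minimal sketch of the POC hand-off: serialize the model trained in a
# notebook so it can be shipped to production as a versioned artifact.
# Dataset, model, and file name are illustrative assumptions.
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, y)

# Persist the fitted model; the version in the file name supports upgrades.
joblib.dump(model, "model-v1.joblib")
```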

Data problems

Data may come from a streaming pipeline, a database, or files in various formats, and it may be structured or unstructured. Analyzing and preparing it before feeding it to the AI models for training (or tuning) is an iterative process. If the quality of the data is poor, it can considerably degrade the performance or accuracy of the AI model. And since it is hard to predict how the AI model will perform in the real world on production systems, one needs a strategy to regularly evaluate both the quality of the data and the performance of the model running in production.
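One way to make that evaluation routine is a lightweight data-quality gate run on every incoming batch before training or scoring. A minimal sketch, assuming pandas; the file name, column names, and thresholds are illustrative assumptions:

```python
# Hedged sketch of a recurring data-quality gate run before each
# training or scoring cycle. Names and thresholds are assumptions.
import pandas as pd

def check_quality(df: pd.DataFrame, required_columns: list[str],
                  max_null_ratio: float = 0.05) -> list[str]:
    """Return a list of human-readable data-quality issues."""
    issues = []
    for col in required_columns:
        if col not in df.columns:
            issues.append(f"missing column: {col}")
            continue
        null_ratio = df[col].isna().mean()
        if null_ratio > max_null_ratio:
            issues.append(f"{col}: {null_ratio:.1%} nulls exceed threshold")
    duplicates = df.duplicated().sum()
    if duplicates:
        issues.append(f"{duplicates} duplicate rows")
    return issues

# Hypothetical batch; streaming or database sources would plug in here.
df = pd.read_csv("incoming_batch.csv")
problems = check_quality(df, required_columns=["customer_id", "amount"])
if problems:
    raise ValueError("data-quality gate failed: " + "; ".join(problems))
```

The same checks can run on production traffic, so a drop in data quality surfaces before it drags down the model's accuracy.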

The integration problems

The option most preferred by AI engineering teams is to train and develop the AI model in Python with C++ extensions, but there is a plethora of competing platforms and technologies for various purposes within data science. The legacy business application, meanwhile, could be running on a different tech stack (Java, for instance). This leaves the architects with the challenge of appropriately integrating the two (the AI software with the non-AI software), and it triggers a debate over whether to deploy the AI model separately or embedded within the business application.

There are options for having all the code execute within a single JVM, embedded in the business application. Many avoid this, however, because of the scalability constraints it brings. Separate deployment of AI models is also preferred over embedded mode because it lets train-heavy use cases tap the computing power of GPUs. Moreover, no one wants to train an AI model while the business application is serving its users, as training would put a heavy load on system resources. Running Python in embedded mode also partially restricts platform independence, since the scripts need to be pre-compiled for the architecture of the target environment. Embedded mode may suffice only for high-performance, inference-only use cases where scalability is not a concern.
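In the separate-deployment option, the Python model is wrapped in its own HTTP microservice that the business application calls like any other dependency. A minimal sketch, assuming FastAPI and the joblib artifact from earlier; the endpoint, payload shape, and file names are illustrative, not a prescribed stack:

```python
# Hedged sketch of serving the model as a standalone microservice.
# The Java business application calls POST /predict over HTTP, so the
# model can be scaled, versioned, and GPU-scheduled independently.
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model-v1.joblib")  # artifact from the training pipeline

class Features(BaseModel):
    values: list[float]  # one flat feature vector per request

@app.post("/predict")
def predict(features: Features) -> dict:
    prediction = model.predict([features.values])
    return {"prediction": prediction.tolist()[0]}
```

Run with, for example, `uvicorn service:app` (assuming the file is saved as `service.py`). Swapping in a `model-v2.joblib` behind the same endpoint keeps model upgrades invisible to the business application.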

