This presentation covers options for running TensorFlow model inference on WebAssembly. We will start with the unique challenges of deploying AI inference models in production, and how Rust + WebAssembly can help. Using a MobileNet image classification task as an example, we will discuss the pros and cons of the plain JavaScript approach, TensorFlow.js, pure Rust crates for TensorFlow compiled to Wasm, and WASI-like TensorFlow Wasm extensions that run on specialized inference chips. We will walk through the journey to a 60,000x performance gain (from 10 minutes down to 10 milliseconds) across these WebAssembly approaches. We will also discuss the future of WebAssembly-based AI on the edge cloud.
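The talk abstract does not name a specific crate, but as a taste of the "pure Rust compiled to Wasm" approach, here is a minimal sketch using the open-source tract-tensorflow crate (a pure-Rust TensorFlow inference engine that can target WebAssembly). The model file name, input shape (1x224x224x3), image file, and label offset are illustrative assumptions, not details from the talk.

```rust
// Sketch only: classify an image with a frozen MobileNet v2 graph using the
// pure-Rust `tract-tensorflow` crate (plus the `image` crate for decoding).
// Model path, input shape, and image file below are assumptions.
use tract_tensorflow::prelude::*;

fn main() -> TractResult<()> {
    // Load the frozen TensorFlow graph, declare the input tensor shape,
    // then optimize the graph and turn it into an executable plan.
    let model = tract_tensorflow::tensorflow()
        .model_for_path("mobilenet_v2_1.4_224_frozen.pb")?
        .with_input_fact(0, f32::fact([1, 224, 224, 3]).into())?
        .into_optimized()?
        .into_runnable()?;

    // Decode and resize the input image, normalizing pixels to [0, 1].
    let img = image::open("cat.jpg").unwrap().to_rgb8();
    let resized =
        image::imageops::resize(&img, 224, 224, image::imageops::FilterType::Triangle);
    let input: Tensor =
        tract_ndarray::Array4::from_shape_fn((1, 224, 224, 3), |(_, y, x, c)| {
            resized[(x as u32, y as u32)][c] as f32 / 255.0
        })
        .into();

    // Run inference and report the highest-scoring class index.
    let result = model.run(tvec!(input.into()))?;
    let best = result[0]
        .to_array_view::<f32>()?
        .iter()
        .cloned()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
    println!("best class (index, score): {:?}", best);
    Ok(())
}
```

Compiled to a wasm32 target, code like this runs entirely inside a WebAssembly sandbox; as the abstract suggests, the larger speedups in the talk come from WASI-like host extensions that hand the heavy tensor math off to native TensorFlow and specialized inference hardware instead of executing it all in Wasm.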

#tensorflow #webassembly

TensorFlow Inference on WebAssembly — from 10 Minutes to 10 Milliseconds