Faster and Lighter Model Inference with ONNX Runtime from Cloud to Client

ONNX Runtime is a high-performance inference and training engine for machine learning models. This show focuses on ONNX Runtime for model inference. ONNX Runtime has been widely adopted across Microsoft products, including Bing, Office 365, and Azure Cognitive Services, where it achieves an average 2.9x inference speedup. We are now glad to introduce ONNX Runtime quantization, which reduces model size and speeds up inference, and ONNX Runtime mobile, which reduces the size of the runtime itself. ONNX Runtime keeps evolving for both cloud-based and on-device inference.
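
To make the size reduction from INT8 quantization concrete, here is a minimal sketch of symmetric 8-bit linear quantization in plain Python. It is an illustration of the general technique only, not ONNX Runtime's actual implementation; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
import struct


def quantize_int8(values):
    """Symmetric linear quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [x * scale for x in q]


# Hypothetical weights, chosen only for illustration.
weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Storage cost: 4 bytes per float32 value versus 1 byte per int8 value.
fp32_bytes = len(struct.pack(f"{len(weights)}f", *weights))
int8_bytes = len(struct.pack(f"{len(q)}b", *q))

print("quantized:", q)
print("max round-trip error:", max(abs(a - b) for a, b in zip(weights, restored)))
print("bytes as float32:", fp32_bytes, "bytes as int8:", int8_bytes)
```

Each stored value shrinks from 4 bytes to 1, at the cost of a rounding error of at most half the scale per value. A real quantized model also stores the scale factors and typically quantizes per tensor or per channel, so the overall reduction depends on the model.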

Jump To:
[01:02] ONNX and ONNX Runtime overview https://aka.ms/AIShow/ONNXRuntimeGH
[02:26] Model operationalization with ONNX Runtime
[04:04] ONNX Runtime adoption
[05:07] ONNX Runtime INT8 quantization for model size reduction and inference speedup
[09:46] Demo of ONNX Runtime INT8 quantization
[16:00] ONNX Runtime mobile for runtime size reduction

Learn More:
ONNX Runtime https://aka.ms/AIShow/ONNXRuntimeGH
Faster and smaller quantized NLP with Hugging Face and ONNX Runtime https://aka.ms/AIShow/QuantizedNLP
ONNX Runtime for Mobile Platforms https://aka.ms/AIShow/RuntimeforMobilePlatforms
ONNX Runtime Inference on Azure Machine Learning https://aka.ms/AIShow/RuntimeInferenceonAML
