Overview
So, I have a working machine learning (ML) model that I want to move to the edge.
By the way, my ML model processes images for depth estimation to provide perception capabilities for an autonomous robot. The machine learning model used is based on Fast Depth from MIT. This is a U-Net architecture focused on speed. It uses a MobileNet encoder and a matching decoder with skip connections.
(Image by Author)
It was developed using Keras (PyTorch at first) on Python with a CUDA backend.
By moving to the edge, I mean I needed this running on a small CPU/GPU (Qualcomm 820) running an embedded Linux where the GPU (Adreno 530) can only be accessed via OpenCL (i.e., not CUDA.)
Caveat — If you’re on iOS or Android you’re already relatively home free. This article is for getting machine learning on the GPU on embedded Linux using OpenCL.
Travels to Find a Workable Solution
This is going to be easy. Hah! Turns out once you leave CUDA behind you’re in the wilderness…
There are really two broad approaches to deploying a model at the edge.
#machine-learning #edge-computing