Neural networks have expanded the breadth of tasks we can perform and revolutionized the accuracies at which we can achieve them. However, they are computationally expensive and thus make it difficult to perform tasks at volume and scale — a necessity for most real world applications. This article explores one of the various state of the art research that provides system level and architectural optimizations to scale these payloads into real-world applications. In particular, we will be exploring queries over video data, a medium that increasingly needs real-time responses and carries the challenge of large file sizes.

Let’s start with a scenario.

Your company provides traffic analytic services at city intersections and has recently trained a neural network to identify various traffic scenarios such as accidents, stopped cars, red-light violations, etc from security cameras. As your business grows, you deal with and process more footage from more security cameras.

Now, you’re approached by a city to help them track accidents at certain key intersections. What optimizations can we make to identify accidents more efficiently?

NoScope: Optimizing Neural Network Queries over Video at Scale — Kang et al.

NoScope introduces a system for querying videos that can significantly lower the cost of analysis via _inference-optimized model search. _This system leverages the reference neural network and input video to generate a cascade of models consisting of specialized neural networks and difference detectors. NoScope generates results at the same levels of accuracy, but at almost one-third the original cost.

Architecture

Image for post

NoScope Architecture

As shown in the upper half of the architecture diagram, NoScope applies the references model to a subset of the target video and generates labeled examples. In our presented scenario, NoScope would identify frames as “accident” or “no accident”.

Next, NoScope trains a specialized model that is much shallower than the reference model. Specialized models forego the generality of a reference model, but offer similar behavior on a specific task. Specialized models at one intersection are unlikely to generalize to other intersections or other queries, but a new specialized model can be generated at each intersection.

#video-analytics #deep-learning #machine-learning #research #deep learning

NoScope: Optimizing Neural Network Queries over Video at Scale — Kang et al.

Architecture

medium.com

NoScope: Scaling Neural Network Queries to the Real-World